January 28, 2020

Paper Group ANR 772

Biases for Emergent Communication in Multi-agent Reinforcement Learning


Title	Biases for Emergent Communication in Multi-agent Reinforcement Learning
Authors	Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel
Abstract	We study the problem of emergent communication, in which language arises because speakers and listeners must communicate information in order to solve tasks. In temporally extended reinforcement learning domains, it has proved hard to learn such communication without centralized training of agents, due in part to a difficult joint exploration problem. We introduce inductive biases for positive signalling and positive listening, which ease this problem. In a simple one-step environment, we demonstrate how these biases ease the learning problem. We also apply our methods to a more extended environment, showing that agents with these inductive biases achieve better performance, and analyse the resulting communication protocols.
Tasks	Multi-agent Reinforcement Learning
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05676v1
PDF	https://arxiv.org/pdf/1912.05676v1.pdf
PWC	https://paperswithcode.com/paper/biases-for-emergent-communication-in-multi-1
Repo
Framework

LOGAN: Unpaired Shape Transform in Latent Overcomplete Space


Title	LOGAN: Unpaired Shape Transform in Latent Overcomplete Space
Authors	Kangxue Yin, Zhiqin Chen, Hui Huang, Daniel Cohen-Or, Hao Zhang
Abstract	We introduce LOGAN, a deep neural network aimed at learning general-purpose shape transforms from unpaired domains. The network is trained on two sets of shapes, e.g., tables and chairs, while there is neither a pairing between shapes from the domains as supervision nor any point-wise correspondence between any shapes. Once trained, LOGAN takes a shape from one domain and transforms it into the other. Our network consists of an autoencoder to encode shapes from the two input domains into a common latent space, where the latent codes concatenate multi-scale shape features, resulting in an overcomplete representation. The translator is based on a generative adversarial network (GAN), operating in the latent space, where an adversarial loss enforces cross-domain translation while a feature preservation loss ensures that the right shape features are preserved for a natural shape transform. We conduct ablation studies to validate each of our key network designs and demonstrate superior capabilities in unpaired shape transforms on a variety of examples over baselines and state-of-the-art approaches. We show that LOGAN is able to learn what shape features to preserve during shape translation, either local or non-local, whether content or style, depending solely on the input domains for training.
Tasks
Published	2019-03-25
URL	https://arxiv.org/abs/1903.10170v3
PDF	https://arxiv.org/pdf/1903.10170v3.pdf
PWC	https://paperswithcode.com/paper/logan-unpaired-shape-transform-in-latent
Repo
Framework

Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification


Title	Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification
Authors	Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng
Abstract	High-performance spoofing countermeasure systems for automatic speaker verification (ASV) have been proposed in the ASVspoof 2019 challenge. However, the robustness of such systems under adversarial attacks has not been studied yet. In this paper, we investigate the vulnerability of spoofing countermeasures for ASV under both white-box and black-box adversarial attacks with the fast gradient sign method (FGSM) and the projected gradient descent (PGD) method. We implement high-performing countermeasure models from the ASVspoof 2019 challenge and conduct adversarial attacks on them. We compare the performance of black-box attacks across spoofing countermeasure models with different network architectures and different numbers of model parameters. The experimental results show that all implemented countermeasure models are vulnerable to FGSM and PGD attacks in the white-box scenario. The experimental results also show that the more dangerous black-box attacks are effective.
Tasks	Speaker Verification
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08716v1
PDF	https://arxiv.org/pdf/1910.08716v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-attacks-on-spoofing
Repo
Framework
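
The abstract above names FGSM and PGD but does not spell them out. The following is a minimal sketch of both attacks, assuming a toy logistic-regression classifier in place of the paper's neural countermeasure models; the names `model_score`, `fgsm` and `pgd`, the weights, and the step sizes are all illustrative assumptions, not the authors' setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_score(x, w, b):
    """Probability of class 1 under a toy logistic model (assumed stand-in)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, epsilon):
    """One fast-gradient-sign step that increases the cross-entropy loss.

    For logistic regression the loss gradient w.r.t. the input is (p - y) * w.
    """
    p = model_score(x, w, b)
    grad = [(p - y) * wi for wi in w]
    # (g > 0) - (g < 0) is the sign of g as an integer in {-1, 0, 1}.
    return [xi + epsilon * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def pgd(x, y, w, b, epsilon, alpha, steps):
    """Iterated FGSM steps of size alpha, projected back into the
    L-infinity ball of radius epsilon around the original input x."""
    x_adv = list(x)
    for _ in range(steps):
        x_adv = fgsm(x_adv, y, w, b, alpha)
        x_adv = [min(max(xa, xi - epsilon), xi + epsilon)
                 for xa, xi in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    w, b = [2.0, -1.0, 0.5], 0.0
    x = [0.5, -0.5, 0.2]
    print(model_score(x, w, b))
    print(model_score(fgsm(x, 1, w, b, 0.3), w, b))
    print(model_score(pgd(x, 1, w, b, 0.3, 0.1, 5), w, b))
```

Both attacks keep every coordinate within `epsilon` of the original input, so the perturbation size is bounded while the model's confidence in the true label drops.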

Human detection of machine manipulated media


Title	Human detection of machine manipulated media
Authors	Matthew Groh, Ziv Epstein, Nick Obradovich, Manuel Cebrian, Iyad Rahwan
Abstract	Recent advances in neural networks for content generation enable artificial intelligence (AI) models to generate high-quality media manipulations. Here we report on a randomized experiment designed to study the effect of exposure to media manipulations on over 15,000 individuals’ ability to discern machine-manipulated media. We engineer a neural network to plausibly and automatically remove objects from images, and we deploy this neural network online with a randomized experiment where participants can guess which image out of a pair of images has been manipulated. The system provides participants feedback on the accuracy of each guess. In the experiment, we randomize the order in which images are presented, allowing causal identification of the learning curve surrounding participants’ ability to detect fake content. We find sizable and robust evidence that individuals learn to detect fake content through exposure to manipulated media when provided iterative feedback on their detection attempts. Over a succession of only ten images, participants increase their rating accuracy by over ten percentage points. Our study provides initial evidence that human ability to detect fake, machine-generated content may increase alongside the prevalence of such media online.
Tasks	Causal Identification, Human Detection
Published	2019-07-06
URL	https://arxiv.org/abs/1907.05276v2
PDF	https://arxiv.org/pdf/1907.05276v2.pdf
PWC	https://paperswithcode.com/paper/human-detection-of-machine-manipulated-media
Repo
Framework

Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification


Title	Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification
Authors	Youngmoon Jung, Yeunju Choi, Hoirin Kim
Abstract	Voice activity detection (VAD), which classifies frames as speech or non-speech, is an important module in many speech applications including speaker verification. In this paper, we propose a novel method, called self-adaptive soft VAD, to incorporate a deep neural network (DNN)-based VAD into a deep speaker embedding system. The proposed method is a combination of the following two approaches. The first approach is soft VAD, which performs a soft selection of frame-level features extracted from a speaker feature extractor. The frame-level features are weighted by their corresponding speech posteriors estimated from the DNN-based VAD, and then aggregated to generate a speaker embedding. The second approach is self-adaptive VAD, which fine-tunes the pre-trained VAD on the speaker verification data to reduce the domain mismatch. Here, we introduce two unsupervised domain adaptation (DA) schemes, namely speech posterior-based DA (SP-DA) and joint learning-based DA (JL-DA). Experiments on a Korean speech database demonstrate that the verification performance is improved significantly in real-world environments by using self-adaptive soft VAD.
Tasks	Action Detection, Activity Detection, Domain Adaptation, Speaker Verification, Unsupervised Domain Adaptation
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11886v1
PDF	https://arxiv.org/pdf/1909.11886v1.pdf
PWC	https://paperswithcode.com/paper/self-adaptive-soft-voice-activity-detection
Repo
Framework
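
The soft VAD step described in the abstract above (frame-level features weighted by speech posteriors, then aggregated) can be sketched as a posterior-weighted mean. This is only an illustration: the normalisation by the sum of posteriors and the function name `soft_vad_pool` are assumptions, and the paper's actual pooling layer may differ.

```python
def soft_vad_pool(frames, speech_posteriors):
    """Aggregate frame-level feature vectors into one embedding.

    frames: list of equal-length feature vectors, one per frame.
    speech_posteriors: per-frame speech probabilities in [0, 1],
        assumed to come from a separate VAD model.
    """
    if len(frames) != len(speech_posteriors):
        raise ValueError("need one posterior per frame")
    total = sum(speech_posteriors)
    if total <= 0:
        raise ValueError("no frame has a positive speech posterior")
    dim = len(frames[0])
    # Frames that are likely non-speech contribute little to the result.
    return [
        sum(p * f[d] for f, p in zip(frames, speech_posteriors)) / total
        for d in range(dim)
    ]


if __name__ == "__main__":
    frames = [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]]
    posteriors = [1.0, 1.0, 0.0]  # the last frame is treated as non-speech
    print(soft_vad_pool(frames, posteriors))
```

Unlike a hard VAD, which drops frames below a threshold, this weighting lets uncertain frames contribute in proportion to their posterior.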

VAE-based Domain Adaptation for Speaker Verification


Title	VAE-based Domain Adaptation for Speaker Verification
Authors	Xueyi Wang, Lantian Li, Dong Wang
Abstract	Deep speaker embedding has achieved satisfactory performance in speaker verification. By forcing the neural model to discriminate the speakers in the training set, deep speaker embeddings (called ‘x-vectors’) can be derived from the hidden layers. Despite its good performance, the present embedding model is highly domain sensitive, which means that it often works well in domains whose acoustic condition matches that of the training data (in-domain), but degrades in mismatched domains (out-of-domain). In this paper, we present a domain adaptation approach based on a Variational Auto-Encoder (VAE). This model transforms x-vectors to a regularized latent space; within this latent space, a small amount of data from the target domain is sufficient to accomplish the adaptation. Our experiments demonstrate that, with this VAE-adaptation approach, speaker embeddings can be easily transformed to the target domain, leading to a noticeable performance improvement.
Tasks	Domain Adaptation, Speaker Verification
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10092v1
PDF	https://arxiv.org/pdf/1908.10092v1.pdf
PWC	https://paperswithcode.com/paper/vae-based-domain-adaptation-for-speaker
Repo
Framework

Optimizing quantum heuristics with meta-learning


Title	Optimizing quantum heuristics with meta-learning
Authors	Max Wilson, Sam Stromswold, Filip Wudarski, Stuart Hadfield, Norm M. Tubman, Eleanor Rieffel
Abstract	Variational quantum algorithms, a class of quantum heuristics, are promising candidates for the demonstration of useful quantum computation. Finding the best way to amplify the performance of these methods on hardware is an important task. Here, we evaluate the optimization of quantum heuristics with an existing class of techniques called ‘meta-learners’. We compare the performance of a meta-learner to Bayesian optimization, evolutionary strategies, L-BFGS-B and Nelder-Mead approaches, for two quantum heuristics (quantum alternating operator ansatz and variational quantum eigensolver), on three problems, in three simulation environments. We show that the meta-learner comes near to the global optima more frequently than all other optimizers we tested in a noisy parameter setting environment. We also find that the meta-learner is generally more resistant to noise, for example seeing a smaller reduction in performance in Noisy and Sampling environments, and performs better on average by a ‘gain’ metric than its closest comparable competitor L-BFGS-B. These results are an important indication that meta-learning and associated machine learning methods will be integral to the useful application of noisy near-term quantum computers.
Tasks	Meta-Learning
Published	2019-08-08
URL	https://arxiv.org/abs/1908.03185v1
PDF	https://arxiv.org/pdf/1908.03185v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-quantum-heuristics-with-meta
Repo
Framework

Evaluating Computational Language Models with Scaling Properties of Natural Language


Title	Evaluating Computational Language Models with Scaling Properties of Natural Language
Authors	Shuntaro Takahashi, Kumiko Tanaka-Ishii
Abstract	In this article, we evaluate computational models of natural language with respect to the universal statistical behaviors of natural language. Statistical mechanical analyses have revealed that natural language text is characterized by scaling properties, which quantify the global structure in the vocabulary population and the long memory of a text. We study whether five scaling properties (given by Zipf’s law, Heaps’ law, Ebeling’s method, Taylor’s law, and long-range correlation analysis) can serve for evaluation of computational models. Specifically, we test $n$-gram language models, a probabilistic context-free grammar (PCFG), language models based on Simon/Pitman-Yor processes, neural language models, and generative adversarial networks (GANs) for text generation. Our analysis reveals that language models based on recurrent neural networks (RNNs) with a gating mechanism (i.e., long short-term memory, LSTM; a gated recurrent unit, GRU; and quasi-recurrent neural networks, QRNNs) are the only computational models that can reproduce the long memory behavior of natural language. Furthermore, through comparison with recently proposed model-based evaluation methods, we find that the exponent of Taylor’s law is a good indicator of model quality.
Tasks	Text Generation
Published	2019-06-22
URL	https://arxiv.org/abs/1906.09379v1
PDF	https://arxiv.org/pdf/1906.09379v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-computational-language-models-with
Repo
Framework
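
Two of the scaling properties named in the abstract above, Zipf's law (rank versus frequency) and Heaps' law (vocabulary growth), are easy to measure on any token sequence. The sketch below is an illustration only, assuming whitespace tokenisation and an ordinary least-squares fit in log-log space; the function names are made up here and the paper's estimation procedure may differ.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Zipf-style (rank, frequency) pairs, most frequent word first."""
    counts = Counter(tokens)
    return [(rank, freq)
            for rank, (_, freq) in enumerate(counts.most_common(), start=1)]


def vocabulary_growth(tokens):
    """Heaps-style curve: vocabulary size after each token."""
    seen, growth = set(), []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x) for positive (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / var


if __name__ == "__main__":
    tokens = "a b a c a b".split()
    print(rank_frequency(tokens))
    print(vocabulary_growth(tokens))
    print(loglog_slope([(1, 8), (2, 4), (4, 2), (8, 1)]))
```

A slope near -1 on the rank-frequency pairs is the classic Zipf signature; running the same functions on model-generated text and on natural text gives a rough side-by-side comparison.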

LRS-DAG: Low Resource Supervised Domain Adaptation with Generalization Across Domains


Title	LRS-DAG: Low Resource Supervised Domain Adaptation with Generalization Across Domains
Authors	Rheeya Uppaal
Abstract	Current state-of-the-art methods in domain adaptation follow adversarial approaches, making training a challenge. Existing non-adversarial methods learn mappings between the source and target domains to achieve reasonable performance. However, even these methods do not focus on a key aspect: maintaining performance on the source domain, even after optimizing over the target domain. Additionally, there exist very few methods in low resource supervised domain adaptation. This work proposes a method, LRS-DAG, that aims to solve these current issues in the field. By adding a set of “encoder layers” which map the target domain to the source, and can be removed when dealing directly with the source data, the model learns to perform optimally on both domains. LRS-DAG is a new algorithm for low resource domain adaptation which maintains performance over the source domain, and it introduces a new metric for learning mappings between domains. We show that, in the case of FCNs, when transferring from MNIST to SVHN, LRS-DAG performs comparably to fine-tuning, with the advantage of maintaining performance over the source domain. LRS-DAG outperforms fine-tuning when transferring to a synthetic dataset similar to MNIST, which is a setting more representative of low resource supervised domain adaptation.
Tasks	Domain Adaptation
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06718v2
PDF	https://arxiv.org/pdf/1909.06718v2.pdf
PWC	https://paperswithcode.com/paper/lrs-dag-low-resource-supervised-domain
Repo
Framework

Sample Efficient Toeplitz Covariance Estimation


Title	Sample Efficient Toeplitz Covariance Estimation
Authors	Yonina C. Eldar, Jerry Li, Cameron Musco, Christopher Musco
Abstract	We study the sample complexity of estimating the covariance matrix $T$ of a distribution $\mathcal{D}$ over $d$-dimensional vectors, under the assumption that $T$ is Toeplitz. This assumption arises in many signal processing problems, where the covariance between any two measurements only depends on the time or distance between those measurements. We are interested in estimation strategies that may choose to view only a subset of entries in each vector sample $x \sim \mathcal{D}$, which often equates to reducing hardware and communication requirements in applications ranging from wireless signal processing to advanced imaging. Our goal is to minimize both 1) the number of vector samples drawn from $\mathcal{D}$ and 2) the number of entries accessed in each sample. We provide some of the first non-asymptotic bounds on these sample complexity measures that exploit $T$'s Toeplitz structure, and by doing so, significantly improve on results for generic covariance matrices. Our bounds follow from a novel analysis of classical and widely used estimation algorithms (along with some new variants), including methods based on selecting entries from each vector sample according to a so-called sparse ruler. In many cases, we pair our upper bounds with matching or nearly matching lower bounds. In addition to results that hold for any Toeplitz $T$, we further study the important setting when $T$ is close to low-rank, which is often the case in practice. We show that methods based on sparse rulers perform even better in this setting, with sample complexity scaling sublinearly in $d$. Motivated by this finding, we develop a new covariance estimation strategy that further improves on all existing methods in the low-rank case: when $T$ is rank-$k$ or nearly rank-$k$, it achieves sample complexity depending polynomially on $k$ and only logarithmically on $d$.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05643v5
PDF	https://arxiv.org/pdf/1905.05643v5.pdf
PWC	https://paperswithcode.com/paper/sample-efficient-toeplitz-covariance
Repo
Framework
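
The sparse-ruler idea in the abstract above can be illustrated with a small estimator: read only the entries of each sample at the ruler's marks, average the products for every lag, and fill a Toeplitz matrix from those averages. This is a rough sketch assuming real-valued, zero-mean samples; the function names and the example ruler are illustrative, and the paper's estimators and guarantees are more refined.

```python
def is_sparse_ruler(ruler, d):
    """True if every lag 0..d-1 is a difference of two marks in the ruler."""
    marks = sorted(ruler)
    lags = {j - i for i in marks for j in marks if j >= i}
    return lags >= set(range(d))


def estimate_toeplitz(samples, ruler, d):
    """Estimate a d x d Toeplitz covariance from entries at the ruler marks.

    samples: iterable of length-d vectors (assumed zero-mean, real-valued);
        only the coordinates listed in `ruler` are ever read.
    """
    if not is_sparse_ruler(ruler, d):
        raise ValueError("ruler does not cover all lags 0..d-1")
    marks = sorted(ruler)
    sums, counts = [0.0] * d, [0] * d
    for x in samples:
        for a, i in enumerate(marks):
            for j in marks[a:]:
                lag = j - i
                if lag < d:
                    sums[lag] += x[i] * x[j]
                    counts[lag] += 1
    first_row = [sums[s] / counts[s] for s in range(d)]
    # A symmetric Toeplitz matrix is determined by its first row.
    return [[first_row[abs(i - j)] for j in range(d)] for i in range(d)]


if __name__ == "__main__":
    ruler = [0, 1, 4, 6]  # four marks cover all lags 0..6
    sample = [1.0, 2.0, 0.0, 0.0, 3.0, 0.0, 4.0]
    T = estimate_toeplitz([sample], ruler, 7)
    print(T[0])
```

With the ruler `[0, 1, 4, 6]`, only four of the seven entries in each sample are accessed, which is the kind of per-sample saving the abstract refers to.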

PathRank: A Multi-Task Learning Framework to Rank Paths in Spatial Networks


Title	PathRank: A Multi-Task Learning Framework to Rank Paths in Spatial Networks
Authors	Sean Bin Yang, Bin Yang
Abstract	Modern navigation services often provide multiple paths connecting the same source and destination for users to select. Hence, ranking such paths becomes increasingly important, as it directly affects the service quality. We present PathRank, a data-driven framework for ranking paths based on historical trajectories using multi-task learning. If a trajectory used path P from source s to destination d, PathRank considers this as evidence that P is preferred over all other paths from s to d. Thus, a path that is similar to P should have a larger ranking score than a path that is dissimilar to P. Based on this intuition, PathRank models path ranking as a regression problem, where each path is associated with a ranking score. To enable PathRank, we first propose an effective method to generate a compact set of training data: for each trajectory, we generate a small set of diversified paths. Next, we propose a multi-task learning framework to solve the regression problem. In particular, a spatial network embedding is proposed to embed each vertex to a feature vector by considering both road network topology and spatial properties, such as distances and travel times. Since a path is represented by a sequence of vertices, which is now a sequence of feature vectors after embedding, a recurrent neural network is applied to model the sequence. The objective function is designed to consider errors on both ranking scores and spatial properties, making the framework a multi-task learning framework. Empirical studies on a substantial trajectory data set offer insight into the designed properties of the proposed framework and indicate that it is effective and practical.
Tasks	Multi-Task Learning, Network Embedding
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04028v1
PDF	https://arxiv.org/pdf/1907.04028v1.pdf
PWC	https://paperswithcode.com/paper/pathrank-a-multi-task-learning-framework-to
Repo
Framework
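
The intuition in the abstract above, that a candidate path similar to the trajectory's path P should get a larger ranking score, can be made concrete with a similarity measure. The abstract does not say which measure is used, so the sketch below assumes a length-weighted Jaccard similarity over edges; the function names, the toy road network, and the edge lengths are all hypothetical.

```python
def path_edges(path):
    """Consecutive vertex pairs of a path given as a vertex sequence."""
    return set(zip(path, path[1:]))


def ranking_score(candidate, trajectory_path, edge_length):
    """Length-weighted Jaccard similarity between two paths, in [0, 1].

    This similarity is an assumed choice of training label, not a detail
    taken from the abstract.
    """
    a, b = path_edges(candidate), path_edges(trajectory_path)
    shared = sum(edge_length[e] for e in a & b)
    union = sum(edge_length[e] for e in a | b)
    return shared / union if union else 0.0


if __name__ == "__main__":
    # Hypothetical edge lengths for a small directed road network.
    edge_length = {
        ("a", "b"): 1.0, ("b", "d"): 2.0, ("a", "c"): 1.5,
        ("c", "d"): 1.5, ("b", "c"): 1.0,
    }
    driven = ["a", "b", "d"]  # path P taken by the trajectory
    for cand in (["a", "b", "d"], ["a", "b", "c", "d"], ["a", "c", "d"]):
        print(cand, ranking_score(cand, driven, edge_length))
```

Scores computed this way could serve as regression targets for the diversified candidate paths generated for each trajectory.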

An ASP-based Approach for Attractor Enumeration in Synchronous and Asynchronous Boolean Networks


Title	An ASP-based Approach for Attractor Enumeration in Synchronous and Asynchronous Boolean Networks
Authors	Tarek Khaled, Belaïd Benhamou
Abstract	Boolean networks are conventionally used to represent and simulate gene regulatory networks. In the analysis of the dynamics of a Boolean network, the attractors are objects of special attention. In this work, we propose a novel approach based on Answer Set Programming (ASP) to express Boolean networks and simulate the dynamics of such networks. Our work focuses on the identification of the attractors; it relies on the exhaustive enumeration of all the attractors of synchronous and asynchronous Boolean networks. We applied and evaluated the proposed approach on real biological networks, and the obtained results indicate that this novel approach is promising.
Tasks
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08251v1
PDF	https://arxiv.org/pdf/1909.08251v1.pdf
PWC	https://paperswithcode.com/paper/an-asp-based-approach-for-attractor
Repo
Framework
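
To make the notion of an attractor concrete, the sketch below enumerates the attractors of a small synchronous Boolean network by brute force over all states. This is not the ASP encoding the abstract proposes, only a plain-Python illustration that scales to a handful of nodes; it also covers the synchronous case only, and the function name and example network are assumptions.

```python
from itertools import product


def synchronous_attractors(update_fns):
    """Return all attractors (cycles of states) of a synchronous network.

    update_fns: one function per node, each mapping the full current state
        (a tuple of bools) to that node's next value.
    """
    n = len(update_fns)

    def step(state):
        return tuple(f(state) for f in update_fns)

    attractors = set()
    for start in product((False, True), repeat=n):
        seen = set()
        state = start
        while state not in seen:  # follow the trajectory until it repeats
            seen.add(state)
            state = step(state)
        # `state` is the first repeated state, so it lies on a cycle.
        cycle = [state]
        nxt = step(state)
        while nxt != state:
            cycle.append(nxt)
            nxt = step(nxt)
        k = cycle.index(min(cycle))  # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


if __name__ == "__main__":
    # Two nodes that copy each other: x0' = x1, x1' = x0.
    fns = [lambda s: s[1], lambda s: s[0]]
    for attractor in sorted(synchronous_attractors(fns)):
        print(attractor)
```

The example network has two fixed points and one cycle of length two. Under asynchronous updates a state can have several successors, so attractors there are terminal strongly connected components and need a different search.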

Towards Universal Object Detection by Domain Attention


Title	Towards Universal Object Detection by Domain Attention
Authors	Xudong Wang, Zhaowei Cai, Dashan Gao, Nuno Vasconcelos
Abstract	Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention mechanism. In the proposed universal detector, all parameters and computations are shared across domains, and a single network processes all domains all the time. Experiments, on a newly established universal object detection benchmark of 11 diverse datasets, show that the proposed detector outperforms a bank of individual detectors, a multi-domain detector, and a baseline universal detector, with a 1.3x parameter increase over a single-domain baseline detector. The code and benchmark will be released at http://www.svcl.ucsd.edu/projects/universal-detection/.
Tasks	Object Detection
Published	2019-04-09
URL	https://arxiv.org/abs/1904.04402v4
PDF	https://arxiv.org/pdf/1904.04402v4.pdf
PWC	https://paperswithcode.com/paper/towards-universal-object-detection-by-domain
Repo
Framework

Encoder-Agnostic Adaptation for Conditional Language Generation


Title	Encoder-Agnostic Adaptation for Conditional Language Generation
Authors	Zachary M. Ziegler, Luke Melas-Kyriazi, Sebastian Gehrmann, Alexander M. Rush
Abstract	Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretrained transformer models are sensitive to large parameter changes during tuning. We therefore propose an adaptation that directly injects arbitrary conditioning into self attention, an approach we call pseudo self attention. Through experiments on four diverse conditional text generation tasks we show that this encoder-agnostic technique outperforms strong baselines, produces coherent generations, and is data efficient.
Tasks	Language Modelling, Text Generation
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06938v2
PDF	https://arxiv.org/pdf/1908.06938v2.pdf
PWC	https://paperswithcode.com/paper/encoder-agnostic-adaptation-for-conditional
Repo
Framework

On How Users Edit Computer-Generated Visual Stories


Title	On How Users Edit Computer-Generated Visual Stories
Authors	Ting-Yao Hsu, Yen-Chia Hsu, Ting-Hao ‘Kenneth’ Huang
Abstract	A significant body of research in Artificial Intelligence (AI) has focused on generating stories automatically, either based on prior story plots or input images. However, literature has little to say about how users would receive and use these stories. Given the quality of stories generated by modern AI algorithms, users will nearly inevitably have to edit these stories before putting them to real use. In this paper, we present the first analysis of how human users edit machine-generated stories. We obtained 962 short stories generated by one of the state-of-the-art visual storytelling models. For each story, we recruited five crowd workers from Amazon Mechanical Turk to edit it. Our analysis of these edits shows that, on average, users (i) slightly shortened machine-generated stories, (ii) increased lexical diversity in these stories, and (iii) often replaced nouns and their determiners/articles with pronouns. Our study provides a better understanding of how users receive and edit machine-generated stories, informing future researchers to create more usable and helpful story generation systems.
Tasks	Visual Storytelling
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08327v2
PDF	http://arxiv.org/pdf/1902.08327v2.pdf
PWC	https://paperswithcode.com/paper/on-how-users-edit-computer-generated-visual
Repo
Framework
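
Two of the findings in the abstract above, shorter stories and higher lexical diversity after editing, correspond to simple text statistics. The sketch below measures them with token counts and a type-token ratio; the choice of type-token ratio, the whitespace tokenisation, and the example sentences are assumptions made for illustration and are not taken from the paper.

```python
def type_token_ratio(text):
    """Distinct lowercase tokens divided by total tokens (0.0 for empty text)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def compare_edit(original, edited):
    """Change in length (in tokens) and in type-token ratio after an edit."""
    return {
        "length_change": len(edited.split()) - len(original.split()),
        "ttr_change": type_token_ratio(edited) - type_token_ratio(original),
    }


if __name__ == "__main__":
    # Invented example: the edit replaces a repeated noun phrase with a pronoun.
    machine = "the dog saw the dog"
    human = "the dog saw it"
    print(compare_edit(machine, human))
```

Here the edit removes one token and raises the type-token ratio, matching the direction of findings (i) and (ii) on a toy scale.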