Paper Group ANR 772
Biases for Emergent Communication in Multi-agent Reinforcement Learning
Title | Biases for Emergent Communication in Multi-agent Reinforcement Learning |
Authors | Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel |
Abstract | We study the problem of emergent communication, in which language arises because speakers and listeners must communicate information in order to solve tasks. In temporally extended reinforcement learning domains, it has proved hard to learn such communication without centralized training of agents, due in part to a difficult joint exploration problem. We introduce inductive biases for positive signalling and positive listening, which ease this problem. In a simple one-step environment, we demonstrate how these biases ease the learning problem. We also apply our methods to a more extended environment, showing that agents with these inductive biases achieve better performance, and analyse the resulting communication protocols. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05676v1 |
https://arxiv.org/pdf/1912.05676v1.pdf | |
PWC | https://paperswithcode.com/paper/biases-for-emergent-communication-in-multi-1 |
Repo | |
Framework | |
LOGAN: Unpaired Shape Transform in Latent Overcomplete Space
Title | LOGAN: Unpaired Shape Transform in Latent Overcomplete Space |
Authors | Kangxue Yin, Zhiqin Chen, Hui Huang, Daniel Cohen-Or, Hao Zhang |
Abstract | We introduce LOGAN, a deep neural network aimed at learning general-purpose shape transforms from unpaired domains. The network is trained on two sets of shapes, e.g., tables and chairs, while there is neither a pairing between shapes from the domains as supervision nor any point-wise correspondence between any shapes. Once trained, LOGAN takes a shape from one domain and transforms it into the other. Our network consists of an autoencoder to encode shapes from the two input domains into a common latent space, where the latent codes concatenate multi-scale shape features, resulting in an overcomplete representation. The translator is based on a generative adversarial network (GAN), operating in the latent space, where an adversarial loss enforces cross-domain translation while a feature preservation loss ensures that the right shape features are preserved for a natural shape transform. We conduct ablation studies to validate each of our key network designs and demonstrate superior capabilities in unpaired shape transforms on a variety of examples over baselines and state-of-the-art approaches. We show that LOGAN is able to learn what shape features to preserve during shape translation, either local or non-local, whether content or style, depending solely on the input domains for training. |
Tasks | |
Published | 2019-03-25 |
URL | https://arxiv.org/abs/1903.10170v3 |
https://arxiv.org/pdf/1903.10170v3.pdf | |
PWC | https://paperswithcode.com/paper/logan-unpaired-shape-transform-in-latent |
Repo | |
Framework | |
Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification
Title | Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification |
Authors | Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng |
Abstract | High-performance spoofing countermeasure systems for automatic speaker verification (ASV) have been proposed in the ASVspoof 2019 challenge. However, the robustness of such systems under adversarial attacks has not been studied yet. In this paper, we investigate the vulnerability of spoofing countermeasures for ASV under both white-box and black-box adversarial attacks with the fast gradient sign method (FGSM) and the projected gradient descent (PGD) method. We implement high-performing countermeasure models in the ASVspoof 2019 challenge and conduct adversarial attacks on them. We compare the performance of black-box attacks across spoofing countermeasure models with different network architectures and different numbers of model parameters. The experimental results show that all implemented countermeasure models are vulnerable to FGSM and PGD attacks under the scenario of white-box attack. The more dangerous black-box attacks also prove to be effective by the experimental results. |
Tasks | Speaker Verification |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08716v1 |
https://arxiv.org/pdf/1910.08716v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-attacks-on-spoofing |
Repo | |
Framework | |
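The FGSM attack named in the abstract above perturbs an input by a fixed step in the direction of the sign of the loss gradient. The following is a minimal sketch of that single step, assuming a toy logistic-regression classifier as a stand-in for a countermeasure model; the weights, input, and helper names (`loss_and_input_grad`, `fgsm`) are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad_x = [(p - y) * wi for wi in w]  # closed-form d(loss)/dx for this toy model
    return loss, grad_x


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each input coordinate by eps along the gradient sign."""
    _, grad_x = loss_and_input_grad(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]


# Hypothetical toy model and input (illustration only).
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, -0.3], 1
eps = 0.1

x_adv = fgsm(w, b, x, y, eps)
loss_clean, _ = loss_and_input_grad(w, b, x, y)
loss_adv, _ = loss_and_input_grad(w, b, x_adv, y)
print(loss_clean, loss_adv)
```

In this white-box setting the attacker reads the gradient directly from the model; PGD can be viewed as repeating such a step several times with a projection back into the allowed perturbation range.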
Human detection of machine manipulated media
Title | Human detection of machine manipulated media |
Authors | Matthew Groh, Ziv Epstein, Nick Obradovich, Manuel Cebrian, Iyad Rahwan |
Abstract | Recent advances in neural networks for content generation enable artificial intelligence (AI) models to generate high-quality media manipulations. Here we report on a randomized experiment designed to study the effect of exposure to media manipulations on over 15,000 individuals’ ability to discern machine-manipulated media. We engineer a neural network to plausibly and automatically remove objects from images, and we deploy this neural network online with a randomized experiment where participants can guess which image out of a pair of images has been manipulated. The system provides participants feedback on the accuracy of each guess. In the experiment, we randomize the order in which images are presented, allowing causal identification of the learning curve surrounding participants’ ability to detect fake content. We find sizable and robust evidence that individuals learn to detect fake content through exposure to manipulated media when provided iterative feedback on their detection attempts. Over a succession of only ten images, participants increase their rating accuracy by over ten percentage points. Our study provides initial evidence that human ability to detect fake, machine-generated content may increase alongside the prevalence of such media online. |
Tasks | Causal Identification, Human Detection |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.05276v2 |
https://arxiv.org/pdf/1907.05276v2.pdf | |
PWC | https://paperswithcode.com/paper/human-detection-of-machine-manipulated-media |
Repo | |
Framework | |
Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification
Title | Self-Adaptive Soft Voice Activity Detection using Deep Neural Networks for Robust Speaker Verification |
Authors | Youngmoon Jung, Yeunju Choi, Hoirin Kim |
Abstract | Voice activity detection (VAD), which classifies frames as speech or non-speech, is an important module in many speech applications including speaker verification. In this paper, we propose a novel method, called self-adaptive soft VAD, to incorporate a deep neural network (DNN)-based VAD into a deep speaker embedding system. The proposed method is a combination of the following two approaches. The first approach is soft VAD, which performs a soft selection of frame-level features extracted from a speaker feature extractor. The frame-level features are weighted by their corresponding speech posteriors estimated from the DNN-based VAD, and then aggregated to generate a speaker embedding. The second approach is self-adaptive VAD, which fine-tunes the pre-trained VAD on the speaker verification data to reduce the domain mismatch. Here, we introduce two unsupervised domain adaptation (DA) schemes, namely speech posterior-based DA (SP-DA) and joint learning-based DA (JL-DA). Experiments on a Korean speech database demonstrate that the verification performance is improved significantly in real-world environments by using self-adaptive soft VAD. |
Tasks | Action Detection, Activity Detection, Domain Adaptation, Speaker Verification, Unsupervised Domain Adaptation |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.11886v1 |
https://arxiv.org/pdf/1909.11886v1.pdf | |
PWC | https://paperswithcode.com/paper/self-adaptive-soft-voice-activity-detection |
Repo | |
Framework | |
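The soft VAD step described in the abstract above weights frame-level features by their speech posteriors before aggregating them. The sketch below shows one way to read that, assuming a posterior-weighted mean as the aggregation; the normalization by the posterior sum, the function name `soft_vad_pooling`, and the toy values are assumptions rather than details confirmed by the abstract.

```python
def soft_vad_pooling(frames, posteriors):
    """Aggregate frame-level features into one vector, weighting each frame
    by its speech posterior (assumed here to be a weighted mean)."""
    if len(frames) != len(posteriors):
        raise ValueError("need one posterior per frame")
    total = sum(posteriors)
    if total == 0:
        raise ValueError("all frames have zero speech posterior")
    dim = len(frames[0])
    return [
        sum(p * f[k] for p, f in zip(posteriors, frames)) / total
        for k in range(dim)
    ]


# Hypothetical frame-level features and VAD posteriors (illustration only).
frames = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]
posteriors = [0.5, 0.5, 0.0]  # the last frame is treated as non-speech

embedding = soft_vad_pooling(frames, posteriors)
print(embedding)
```

A hard VAD would drop the third frame outright; with soft weights, a frame with an intermediate posterior still contributes in proportion to how likely it is to be speech.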
VAE-based Domain Adaptation for Speaker Verification
Title | VAE-based Domain Adaptation for Speaker Verification |
Authors | Xueyi Wang, Lantian Li, Dong Wang |
Abstract | Deep speaker embedding has achieved satisfactory performance in speaker verification. By enforcing the neural model to discriminate the speakers in the training set, deep speaker embedding (called x-vectors) can be derived from the hidden layers. Despite its good performance, the present embedding model is highly domain sensitive, which means that it often works well in domains whose acoustic condition matches that of the training data (in-domain), but degrades in mismatched domains (out-of-domain). In this paper, we present a domain adaptation approach based on Variational Auto-Encoder (VAE). This model transforms x-vectors to a regularized latent space; within this latent space, a small amount of data from the target domain is sufficient to accomplish the adaptation. Our experiments demonstrated that by this VAE-adaptation approach, speaker embeddings can be easily transformed to the target domain, leading to noticeable performance improvement. |
Tasks | Domain Adaptation, Speaker Verification |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10092v1 |
https://arxiv.org/pdf/1908.10092v1.pdf | |
PWC | https://paperswithcode.com/paper/vae-based-domain-adaptation-for-speaker |
Repo | |
Framework | |
Optimizing quantum heuristics with meta-learning
Title | Optimizing quantum heuristics with meta-learning |
Authors | Max Wilson, Sam Stromswold, Filip Wudarski, Stuart Hadfield, Norm M. Tubman, Eleanor Rieffel |
Abstract | Variational quantum algorithms, a class of quantum heuristics, are promising candidates for the demonstration of useful quantum computation. Finding the best way to amplify the performance of these methods on hardware is an important task. Here, we evaluate the optimization of quantum heuristics with an existing class of techniques called 'meta-learners'. We compare the performance of a meta-learner to Bayesian optimization, evolutionary strategies, L-BFGS-B and Nelder-Mead approaches, for two quantum heuristics (quantum alternating operator ansatz and variational quantum eigensolver), on three problems, in three simulation environments. We show that the meta-learner comes near to the global optima more frequently than all other optimizers we tested in a noisy parameter setting environment. We also find that the meta-learner is generally more resistant to noise, for example seeing a smaller reduction in performance in Noisy and Sampling environments, and performs better on average by a 'gain' metric than its closest comparable competitor L-BFGS-B. These results are an important indication that meta-learning and associated machine learning methods will be integral to the useful application of noisy near-term quantum computers. |
Tasks | Meta-Learning |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03185v1 |
https://arxiv.org/pdf/1908.03185v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-quantum-heuristics-with-meta |
Repo | |
Framework | |
Evaluating Computational Language Models with Scaling Properties of Natural Language
Title | Evaluating Computational Language Models with Scaling Properties of Natural Language |
Authors | Shuntaro Takahashi, Kumiko Tanaka-Ishii |
Abstract | In this article, we evaluate computational models of natural language with respect to the universal statistical behaviors of natural language. Statistical mechanical analyses have revealed that natural language text is characterized by scaling properties, which quantify the global structure in the vocabulary population and the long memory of a text. We study whether five scaling properties (given by Zipf’s law, Heaps’ law, Ebeling’s method, Taylor’s law, and long-range correlation analysis) can serve for evaluation of computational models. Specifically, we test $n$-gram language models, a probabilistic context-free grammar (PCFG), language models based on Simon/Pitman-Yor processes, neural language models, and generative adversarial networks (GANs) for text generation. Our analysis reveals that language models based on recurrent neural networks (RNNs) with a gating mechanism (i.e., long short-term memory, LSTM; a gated recurrent unit, GRU; and quasi-recurrent neural networks, QRNNs) are the only computational models that can reproduce the long memory behavior of natural language. Furthermore, through comparison with recently proposed model-based evaluation methods, we find that the exponent of Taylor’s law is a good indicator of model quality. |
Tasks | Text Generation |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09379v1 |
https://arxiv.org/pdf/1906.09379v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-computational-language-models-with |
Repo | |
Framework | |
LRS-DAG: Low Resource Supervised Domain Adaptation with Generalization Across Domains
Title | LRS-DAG: Low Resource Supervised Domain Adaptation with Generalization Across Domains |
Authors | Rheeya Uppaal |
Abstract | Current state-of-the-art methods in Domain Adaptation follow adversarial approaches, making training a challenge. Existing non-adversarial methods learn mappings between the source and target domains to achieve reasonable performance. However, even these methods do not focus on a key aspect: maintaining performance on the source domain, even after optimizing over the target domain. Additionally, there exist very few methods in low resource supervised domain adaptation. This work proposes a method, LRS-DAG, that aims to solve these current issues in the field. By adding a set of “encoder layers” which map the target domain to the source, and can be removed when dealing directly with the source data, the model learns to perform optimally on both domains. LRS-DAG showcases its uniqueness by being a new algorithm for low resource domain adaptation which maintains performance over the source domain, with a new metric for learning mappings between domains being introduced. We show that, in the case of FCNs, when transferring from MNIST to SVHN, LRS-DAG performs comparably to fine-tuning, with the advantage of maintaining performance over the source domain. LRS-DAG outperforms fine-tuning when transferring to a synthetic dataset similar to MNIST, which is a setting more representative of low resource supervised domain adaptation. |
Tasks | Domain Adaptation |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.06718v2 |
https://arxiv.org/pdf/1909.06718v2.pdf | |
PWC | https://paperswithcode.com/paper/lrs-dag-low-resource-supervised-domain |
Repo | |
Framework | |
Sample Efficient Toeplitz Covariance Estimation
Title | Sample Efficient Toeplitz Covariance Estimation |
Authors | Yonina C. Eldar, Jerry Li, Cameron Musco, Christopher Musco |
Abstract | We study the sample complexity of estimating the covariance matrix $T$ of a distribution $\mathcal{D}$ over $d$-dimensional vectors, under the assumption that $T$ is Toeplitz. This assumption arises in many signal processing problems, where the covariance between any two measurements only depends on the time or distance between those measurements. We are interested in estimation strategies that may choose to view only a subset of entries in each vector sample $x \sim \mathcal{D}$, which often equates to reducing hardware and communication requirements in applications ranging from wireless signal processing to advanced imaging. Our goal is to minimize both 1) the number of vector samples drawn from $\mathcal{D}$ and 2) the number of entries accessed in each sample. We provide some of the first non-asymptotic bounds on these sample complexity measures that exploit $T$'s Toeplitz structure, and by doing so, significantly improve on results for generic covariance matrices. Our bounds follow from a novel analysis of classical and widely used estimation algorithms (along with some new variants), including methods based on selecting entries from each vector sample according to a so-called sparse ruler. In many cases, we pair our upper bounds with matching or nearly matching lower bounds. In addition to results that hold for any Toeplitz $T$, we further study the important setting when $T$ is close to low-rank, which is often the case in practice. We show that methods based on sparse rulers perform even better in this setting, with sample complexity scaling sublinearly in $d$. Motivated by this finding, we develop a new covariance estimation strategy that further improves on all existing methods in the low-rank case: when $T$ is rank-$k$ or nearly rank-$k$, it achieves sample complexity depending polynomially on $k$ and only logarithmically on $d$. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05643v5 |
https://arxiv.org/pdf/1905.05643v5.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-toeplitz-covariance |
Repo | |
Framework | |
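The sparse-ruler idea mentioned in the abstract above can be illustrated with a small example: because a Toeplitz covariance depends only on the distance between two entries, it is enough to read a subset of entries whose pairwise differences cover every distance from 0 to d-1. The sketch below assumes zero-mean samples and uses the ruler {0, 1, 4, 6} for d = 7; the function names and the synthetic moving-average data are hypothetical, and this plain averaging estimator is only one simple variant, not necessarily the one analysed in the paper.

```python
import random


def ruler_pairs(ruler):
    """Map each distance s to one pair (i, j) of ruler marks with j - i == s."""
    pairs = {}
    for i in ruler:
        for j in ruler:
            if j >= i:
                pairs.setdefault(j - i, (i, j))
    return pairs


def estimate_toeplitz(samples, ruler, d):
    """Estimate a d x d Toeplitz covariance while reading only the ruler
    entries of each (assumed zero-mean) sample."""
    pairs = ruler_pairs(ruler)
    if set(pairs) != set(range(d)):
        raise ValueError("ruler does not cover every distance 0..d-1")
    n = len(samples)
    t = []
    for s in range(d):
        i, j = pairs[s]
        t.append(sum(x[i] * x[j] for x in samples) / n)
    return [[t[abs(r - c)] for c in range(d)] for r in range(d)]


# Synthetic zero-mean data with a known Toeplitz covariance:
# x_k = e_k + 0.5 * e_{k+1} gives t_0 = 1.25, t_1 = 0.5, and t_s = 0 for s >= 2.
random.seed(0)
d, n = 7, 4000
samples = []
for _ in range(n):
    e = [random.gauss(0.0, 1.0) for _ in range(d + 1)]
    samples.append([e[k] + 0.5 * e[k + 1] for k in range(d)])

ruler = [0, 1, 4, 6]  # 4 of the 7 entries are read per sample
T_hat = estimate_toeplitz(samples, ruler, d)
print([round(v, 2) for v in T_hat[0]])
```

The trade-off the abstract refers to is visible here: fewer entries are read per sample, but each distance is estimated from a single pair, so more samples may be needed for the same accuracy.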
PathRank: A Multi-Task Learning Framework to Rank Paths in Spatial Networks
Title | PathRank: A Multi-Task Learning Framework to Rank Paths in Spatial Networks |
Authors | Sean Bin Yang, Bin Yang |
Abstract | Modern navigation services often provide multiple paths connecting the same source and destination for users to select. Hence, ranking such paths becomes increasingly important, which directly affects the service quality. We present PathRank, a data-driven framework for ranking paths based on historical trajectories using multi-task learning. If a trajectory used path P from source s to destination d, PathRank considers this as evidence that P is preferred over all other paths from s to d. Thus, a path that is similar to P should have a larger ranking score than a path that is dissimilar to P. Based on this intuition, PathRank models path ranking as a regression problem, where each path is associated with a ranking score. To enable PathRank, we first propose an effective method to generate a compact set of training data: for each trajectory, we generate a small set of diversified paths. Next, we propose a multi-task learning framework to solve the regression problem. In particular, a spatial network embedding is proposed to embed each vertex to a feature vector by considering both road network topology and spatial properties, such as distances and travel times. Since a path is represented by a sequence of vertices, which is now a sequence of feature vectors after embedding, a recurrent neural network is applied to model the sequence. The objective function is designed to consider errors on both ranking scores and spatial properties, making the framework a multi-task learning framework. Empirical studies on a substantial trajectory data set offer insight into the designed properties of the proposed framework and indicate that it is effective and practical. |
Tasks | Multi-Task Learning, Network Embedding |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04028v1 |
https://arxiv.org/pdf/1907.04028v1.pdf | |
PWC | https://paperswithcode.com/paper/pathrank-a-multi-task-learning-framework-to |
Repo | |
Framework | |
An ASP-based Approach for Attractor Enumeration in Synchronous and Asynchronous Boolean Networks
Title | An ASP-based Approach for Attractor Enumeration in Synchronous and Asynchronous Boolean Networks |
Authors | Tarek Khaled, Belaïd Benhamou |
Abstract | Boolean networks are conventionally used to represent and simulate gene regulatory networks. In the analysis of the dynamics of a Boolean network, the attractors are objects of special attention. In this work, we propose a novel approach based on Answer Set Programming (ASP) to express Boolean networks and simulate the dynamics of such networks. Our work focuses on the identification of the attractors; it relies on the exhaustive enumeration of all the attractors of synchronous and asynchronous Boolean networks. We applied and evaluated the proposed approach on real biological networks, and the obtained results indicate that this novel approach is promising. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08251v1 |
https://arxiv.org/pdf/1909.08251v1.pdf | |
PWC | https://paperswithcode.com/paper/an-asp-based-approach-for-attractor |
Repo | |
Framework | |
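To make the notion of an attractor in the abstract above concrete, the sketch below enumerates the attractors of a tiny synchronous Boolean network by brute-force simulation. This is deliberately not the paper's ASP encoding: it is a plain state-space search, assuming a hypothetical three-gene network, and it covers only the synchronous update scheme.

```python
from itertools import product


def synchronous_attractors(update_fns):
    """Return the attractors (fixed points and cycles) of a synchronous
    Boolean network, found by following every state until it repeats."""
    n = len(update_fns)

    def step(state):
        # Synchronous update: every node is updated from the same old state.
        return tuple(f(state) for f in update_fns)

    attractors = set()
    for start in product((False, True), repeat=n):
        seen = {}  # state -> position along the trajectory
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        # 'state' is the first repeated state; everything from its first
        # visit onwards forms the cycle the trajectory falls into.
        cycle = frozenset(s for s, k in seen.items() if k >= seen[state])
        attractors.add(cycle)
    return attractors


# Hypothetical network over nodes (a, b, c): a' = b, b' = a, c' = a AND b.
network = [
    lambda s: s[1],
    lambda s: s[0],
    lambda s: s[0] and s[1],
]

attractors = synchronous_attractors(network)
for att in sorted(attractors, key=len):
    print(sorted(att))
```

The search visits all 2^n states, so it is only practical for very small networks, which is the kind of limitation that motivates declarative encodings such as the ASP approach described above.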
Towards Universal Object Detection by Domain Attention
Title | Towards Universal Object Detection by Domain Attention |
Authors | Xudong Wang, Zhaowei Cai, Dashan Gao, Nuno Vasconcelos |
Abstract | Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention mechanism. In the proposed universal detector, all parameters and computations are shared across domains, and a single network processes all domains all the time. Experiments, on a newly established universal object detection benchmark of 11 diverse datasets, show that the proposed detector outperforms a bank of individual detectors, a multi-domain detector, and a baseline universal detector, with a 1.3x parameter increase over a single-domain baseline detector. The code and benchmark will be released at http://www.svcl.ucsd.edu/projects/universal-detection/. |
Tasks | Object Detection |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.04402v4 |
https://arxiv.org/pdf/1904.04402v4.pdf | |
PWC | https://paperswithcode.com/paper/towards-universal-object-detection-by-domain |
Repo | |
Framework | |
Encoder-Agnostic Adaptation for Conditional Language Generation
Title | Encoder-Agnostic Adaptation for Conditional Language Generation |
Authors | Zachary M. Ziegler, Luke Melas-Kyriazi, Sebastian Gehrmann, Alexander M. Rush |
Abstract | Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretrained transformer models are sensitive to large parameter changes during tuning. We therefore propose an adaptation that directly injects arbitrary conditioning into self attention, an approach we call pseudo self attention. Through experiments on four diverse conditional text generation tasks we show that this encoder-agnostic technique outperforms strong baselines, produces coherent generations, and is data efficient. |
Tasks | Language Modelling, Text Generation |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06938v2 |
https://arxiv.org/pdf/1908.06938v2.pdf | |
PWC | https://paperswithcode.com/paper/encoder-agnostic-adaptation-for-conditional |
Repo | |
Framework | |
On How Users Edit Computer-Generated Visual Stories
Title | On How Users Edit Computer-Generated Visual Stories |
Authors | Ting-Yao Hsu, Yen-Chia Hsu, Ting-Hao ‘Kenneth’ Huang |
Abstract | A significant body of research in Artificial Intelligence (AI) has focused on generating stories automatically, either based on prior story plots or input images. However, literature has little to say about how users would receive and use these stories. Given the quality of stories generated by modern AI algorithms, users will nearly inevitably have to edit these stories before putting them to real use. In this paper, we present the first analysis of how human users edit machine-generated stories. We obtained 962 short stories generated by one of the state-of-the-art visual storytelling models. For each story, we recruited five crowd workers from Amazon Mechanical Turk to edit it. Our analysis of these edits shows that, on average, users (i) slightly shortened machine-generated stories, (ii) increased lexical diversity in these stories, and (iii) often replaced nouns and their determiners/articles with pronouns. Our study provides a better understanding of how users receive and edit machine-generated stories, informing future researchers to create more usable and helpful story generation systems. |
Tasks | Visual Storytelling |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08327v2 |
http://arxiv.org/pdf/1902.08327v2.pdf | |
PWC | https://paperswithcode.com/paper/on-how-users-edit-computer-generated-visual |
Repo | |
Framework | |