Paper Group AWR 86
Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy. Exploiting locality in high-dimensional factorial hidden Markov models. Human-In-The-Loop Automatic Program Repair. Anomaly Detection for Industrial Control Systems Using Sequence-to-Sequence Neural Networks. Automatic Differentiable Monte Carlo: Theory and Application. Towa …
Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy
Title | Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy |
Authors | Zhengwei Wang, Qi She, Tomas E. Ward |
Abstract | Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most revolutionary applications are in computer vision, such as plausible image generation, image-to-image translation, facial attribute manipulation, and similar domains. Despite the significant success achieved in computer vision, applying GANs to real-world problems still poses significant challenges, three of which we focus on here: (1) high-quality image generation; (2) diverse image generation; and (3) stable training. Through an in-depth review of GAN-related research in the literature, we provide an account of the architecture-variants and loss-variants that have been proposed to handle these three challenges from two perspectives. We classify the most popular GANs into loss-variants and architecture-variants and discuss potential improvements focusing on these two aspects. While several reviews of GANs have been presented to date, none have focused on reviewing GAN-variants by how they handle the challenges mentioned above. In this paper, we review and critically discuss 7 architecture-variant GANs and 9 loss-variant GANs for remedying those three challenges. The objective of this review is to provide insight into where current GAN research concentrates its efforts on performance improvement. Code related to the GAN-variants studied in this work is summarized at https://github.com/sheqi/GAN_Review. |
Tasks | Image Generation, Image Inpainting, Image Quality Assessment, Image Quality Estimation, Image Super-Resolution, Image-to-Image Translation |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01529v3 |
https://arxiv.org/pdf/1906.01529v3.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-networks-a-survey-and |
Repo | https://github.com/SWKoreaBME/paper_review |
Framework | none |
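The loss-variants surveyed here all modify the original adversarial objective, so a minimal reference implementation of that objective helps ground the taxonomy. The sketch below is not from the paper or its repository; the toy generator/discriminator architectures and hyper-parameters are placeholder assumptions, and it shows only the standard non-saturating GAN update that the surveyed variants build on.

```python
import torch
import torch.nn as nn

# Toy generator/discriminator; real GAN variants replace these architectures.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))
D = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    z = torch.randn(real.size(0), 64)
    fake = G(z).detach()
    d_loss = bce(D(real), torch.ones(real.size(0), 1)) + \
             bce(D(fake), torch.zeros(real.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: non-saturating loss, push D(fake) toward 1.
    fake = G(torch.randn(real.size(0), 64))
    g_loss = bce(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

real_batch = torch.randn(32, 2)  # stand-in for real data
print(gan_step(real_batch))
```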
Exploiting locality in high-dimensional factorial hidden Markov models
Title | Exploiting locality in high-dimensional factorial hidden Markov models |
Authors | Lorenzo Rimella, Nick Whiteley |
Abstract | We propose algorithms for approximate filtering and smoothing in high-dimensional factorial hidden Markov models. The approximation involves discarding, in a principled way, likelihood factors according to a notion of locality in a factor graph associated with the emission distribution. This allows the exponential-in-dimension cost of exact filtering and smoothing to be avoided. We prove that the approximation accuracy, measured in a local total variation norm, is 'dimension-free' in the sense that as the overall dimension of the model increases the error bounds we derive do not necessarily degrade. A key step in the analysis is to quantify the error introduced by localizing the likelihood function in a Bayes' rule update. The factorial structure of the likelihood function which we exploit arises naturally when data have known spatial or network structure. We demonstrate the new algorithms on synthetic examples and a London Underground passenger flow problem, where the factor graph is effectively given by the train network. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01639v2 |
http://arxiv.org/pdf/1902.01639v2.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-locality-in-high-dimensional |
Repo | https://github.com/LorenzoRimella/GraphFilter-GraphSmoother |
Framework | none |
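The key analytical step of the paper is localizing the likelihood in a Bayes' rule update: a site's posterior marginal is computed using only likelihood factors within a neighbourhood of that site in the factor graph. The toy sketch below (not the authors' GraphFilter-GraphSmoother code) illustrates this on a chain of binary sites with pairwise likelihood factors; the chain topology, the radius parameter and the random factor values are illustrative assumptions.

```python
import itertools
import numpy as np

M = 10                                   # number of sites (chains)
rng = np.random.default_rng(0)
prior = rng.dirichlet([1, 1], size=M)    # product-form prior, one Bernoulli marginal per site
# One pairwise likelihood factor per edge (j, j+1); in the real model these would come
# from evaluating the emission density at the observed data.
factors = {(j, j + 1): rng.uniform(0.5, 1.5, size=(2, 2)) for j in range(M - 1)}

def localized_marginal(m, r):
    """Posterior marginal at site m, keeping only factors whose sites lie within
    distance r of m and summing the remaining local sites out by brute force."""
    window = [i for i in range(M) if abs(i - m) <= r]
    kept = {e: f for e, f in factors.items() if all(abs(i - m) <= r for i in e)}
    post = np.zeros(2)
    for assignment in itertools.product([0, 1], repeat=len(window)):
        x = dict(zip(window, assignment))
        w = np.prod([prior[i][x[i]] for i in window])
        w *= np.prod([f[x[a], x[b]] for (a, b), f in kept.items()])
        post[x[m]] += w
    return post / post.sum()

# Small radius = cheap, larger radius = closer to the exact Bayes update.
for r in (1, 2, M):
    print(r, localized_marginal(m=5, r=r))
```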
Human-In-The-Loop Automatic Program Repair
Title | Human-In-The-Loop Automatic Program Repair |
Authors | Marcel Böhme, Charaka Geethal, Van-Thuan Pham |
Abstract | We introduce Learn2fix, the first human-in-the-loop, semi-automatic repair technique for the setting where no bug oracle is available except the user who is reporting the bug. Our approach negotiates with the user the condition under which the bug is observed. Only when a budget of queries to the user is exhausted does it attempt to repair the bug. A query can be thought of as the following question: “When executing this alternative test input, the program produces the following output; is the bug observed?” Through systematic queries, Learn2fix trains an automatic bug oracle that becomes increasingly more accurate in predicting the user’s response. Our key challenge is to maximize the oracle’s accuracy in predicting which tests are bug-exposing given a small budget of queries. From the alternative tests that were labeled by the user, test-driven automatic repair produces the patch. Our experiments demonstrate that Learn2fix learns a sufficiently accurate automatic oracle with a reasonably low labeling effort (fewer than 20 queries). Given Learn2fix’s test suite, the GenProg test-driven repair tool produces a higher-quality patch (i.e., passing a larger proportion of validation tests) than when using the manual test suites provided with the repair benchmark. |
Tasks | |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07758v1 |
https://arxiv.org/pdf/1912.07758v1.pdf | |
PWC | https://paperswithcode.com/paper/human-in-the-loop-automatic-program-repair |
Repo | https://github.com/mboehme/learn2fix |
Framework | none |
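Learn2fix alternates between querying the user about alternative test inputs and training an automatic bug oracle on the labels gathered so far; the labeled tests then drive a repair tool such as GenProg. The sketch below is a simplified illustration of that loop, not the authors' implementation: the buggy toy program, the simulated user, the purely random query selection, and the decision-tree oracle are all assumptions made for brevity.

```python
import random
from sklearn.tree import DecisionTreeClassifier

def buggy_program(x):               # toy subject: wrong behaviour for x >= 50
    return x + 1 if x < 50 else x - 1

def user_observes_bug(x, output):   # stand-in for the human answering a query
    return output != x + 1

random.seed(0)
inputs, labels = [], []
QUERY_BUDGET = 20
for _ in range(QUERY_BUDGET):
    # Propose an alternative test input and ask the user whether the bug shows.
    # (The paper selects queries to be maximally informative; here they are random.)
    x = random.randint(0, 100)
    y = buggy_program(x)
    inputs.append([x]); labels.append(user_observes_bug(x, y))

# The labeled tests train an automatic oracle that predicts the user's answer.
oracle = DecisionTreeClassifier().fit(inputs, labels)
print("oracle says bug at x=80:", bool(oracle.predict([[80]])[0]))
print("oracle says bug at x=10:", bool(oracle.predict([[10]])[0]))
# The labeled failing/passing tests would then be handed to a test-driven
# repair tool such as GenProg to search for a patch.
```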
Anomaly Detection for Industrial Control Systems Using Sequence-to-Sequence Neural Networks
Title | Anomaly Detection for Industrial Control Systems Using Sequence-to-Sequence Neural Networks |
Authors | Jonguk Kim, Jeong-Han Yun, Hyoung Chun Kim |
Abstract | This study proposes an anomaly detection method for the operational data of industrial control systems (ICSs). Sequence-to-sequence neural networks were applied to model and predict ICS operational data and to capture its time-series characteristics. The proposed method requires only a normal dataset to learn an ICS’s normal state and detect outliers. The method was evaluated on the SWaT (Secure Water Treatment) dataset, detecting 29 out of 36 attacks. It also localizes attack points, detecting 25 out of 53 points. This study provides a detailed analysis of the false positives and false negatives in the experimental results. |
Tasks | Anomaly Detection, Time Series |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04831v1 |
https://arxiv.org/pdf/1911.04831v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-for-industrial-control |
Repo | https://github.com/jukworks/swat-seq2seq |
Framework | pytorch |
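The method trains a sequence-to-sequence model on normal operation only and flags windows whose prediction error is large. Below is a minimal PyTorch sketch of that idea, not the authors' swat-seq2seq code; the GRU encoder-decoder sizes, the window lengths, and the synthetic "sensor" signal are assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encode a window of past sensor readings, decode a prediction of the next window."""
    def __init__(self, n_sensors=4, hidden=32):
        super().__init__()
        self.enc = nn.GRU(n_sensors, hidden, batch_first=True)
        self.dec = nn.GRU(n_sensors, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_sensors)

    def forward(self, past, future_len):
        _, h = self.enc(past)
        step = past[:, -1:, :]               # start decoding from the last observation
        preds = []
        for _ in range(future_len):
            o, h = self.dec(step, h)
            step = self.out(o)
            preds.append(step)
        return torch.cat(preds, dim=1)

model = Seq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# Synthetic "normal" multivariate signal: (batch, time, sensors).
normal = torch.sin(torch.linspace(0, 50, 500)).unsqueeze(-1).repeat(1, 4).unsqueeze(0)

# Train on normal data only: predict the next 10 steps from the previous 20.
for _ in range(50):
    i = torch.randint(0, 470, (1,)).item()
    past, future = normal[:, i:i+20], normal[:, i+20:i+30]
    loss = nn.functional.mse_loss(model(past, 10), future)
    opt.zero_grad(); loss.backward(); opt.step()

# At test time, a large prediction error marks a window as anomalous.
score = nn.functional.mse_loss(model(normal[:, :20], 10), normal[:, 20:30]).item()
print("anomaly score on a normal window:", score)
```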
Automatic Differentiable Monte Carlo: Theory and Application
Title | Automatic Differentiable Monte Carlo: Theory and Application |
Authors | Shi-Xin Zhang, Zhou-Quan Wan, Hong Yao |
Abstract | Differentiable programming has emerged as a key programming paradigm empowering rapid developments of deep learning, yet its applications to important computational methods such as Monte Carlo remain largely unexplored. Here we present the general theory enabling infinite-order automatic differentiation on expectations computed by Monte Carlo with unnormalized probability distributions, which we call “automatic differentiable Monte Carlo” (ADMC). By implementing ADMC algorithms on computational graphs, one can also bring state-of-the-art machine learning frameworks and techniques to traditional Monte Carlo applications in statistics and physics. We illustrate the versatility of ADMC with several applications: fast search of phase transitions and accurately finding ground states of interacting many-body models in two dimensions. ADMC paves a promising way to innovate Monte Carlo in various aspects to achieve higher accuracy and efficiency, e.g., easing or solving the sign problem of quantum many-body models. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09117v1 |
https://arxiv.org/pdf/1911.09117v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-differentiable-monte-carlo-theory |
Repo | https://github.com/refraction-ray/admc |
Framework | tf |
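The core trick that makes a Monte Carlo expectation differentiable is to reweight each sample by p_theta(x) / stop_gradient(p_theta(x)): the weight equals 1 in value but carries the score-function gradient. The toy sketch below demonstrates this for E[x^2] under a Gaussian whose log standard deviation is the parameter; it uses PyTorch for brevity (the linked repo is TensorFlow), and the normalized Gaussian and the observable x^2 are simplifications of the paper's unnormalized setting.

```python
import math
import torch

theta = torch.tensor([0.5], requires_grad=True)     # log standard deviation
n = 200_000

# Draw samples without tracking gradients, as an ordinary Monte Carlo sampler would.
with torch.no_grad():
    x = torch.randn(n) * theta.exp()

# Attach gradients by reweighting each sample with p_theta(x) / detach(p_theta(x)).
log_p = -0.5 * (x / theta.exp()) ** 2 - theta - 0.5 * math.log(2 * math.pi)
w = (log_p - log_p.detach()).exp()          # equals 1 in value, carries d log p / d theta
estimate = (w * x ** 2).mean()              # differentiable estimate of E[x^2] = exp(2*theta)
estimate.backward()

print("MC estimate:", estimate.item(), "analytic:", (2 * theta).exp().item())
print("MC gradient:", theta.grad.item(), "analytic:", 2 * (2 * theta).exp().item())
```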
Toward Dialogue Modeling: A Semantic Annotation Scheme for Questions and Answers
Title | Toward Dialogue Modeling: A Semantic Annotation Scheme for Questions and Answers |
Authors | Maria-Andrea Cruz-Blandón, Gosse Minnema, Aria Nourbakhsh, Maria Boritchev, Maxime Amblard |
Abstract | The present study proposes an annotation scheme for classifying the content and discourse contribution of question-answer pairs. We propose detailed guidelines for using the scheme and apply them to dialogues in English, Spanish, and Dutch. Finally, we report on initial machine learning experiments for automatic annotation. |
Tasks | |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.09921v1 |
https://arxiv.org/pdf/1908.09921v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-dialogue-modeling-a-semantic-1 |
Repo | https://github.com/andrea08/question_answer_annotation |
Framework | none |
Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed
Title | Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed |
Authors | Alexandre Défossez, Nicolas Usunier, Léon Bottou, Francis Bach |
Abstract | We study the problem of source separation for music using deep learning with four known sources: drums, bass, vocals and other accompaniments. State-of-the-art approaches predict soft masks over mixture spectrograms, while methods working directly on the waveform lag behind, as measured on the standard MusDB benchmark. Our contribution is twofold. (i) We introduce a simple convolutional and recurrent model that outperforms the state-of-the-art waveform model, Wave-U-Net, by 1.6 points of SDR (signal-to-distortion ratio). (ii) We propose a new scheme to leverage unlabeled music. We train a first model to extract parts with at least one source silent in unlabeled tracks, for instance without bass. We remix this extract with a bass line taken from the supervised dataset to form a new weakly supervised training example. Combining our architecture and scheme, we show that waveform methods can play in the same ballpark as spectrogram ones. |
Tasks | Music Source Separation |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01174v1 |
https://arxiv.org/pdf/1909.01174v1.pdf | |
PWC | https://paperswithcode.com/paper/demucs-deep-extractor-for-music-sources-with |
Repo | https://github.com/facebookresearch/demucs |
Framework | pytorch |
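The unlabeled-data scheme takes an excerpt in which one source (say, bass) is detected as silent and remixes it with a bass stem from the supervised set, producing a new training pair whose bass target is known exactly while the remaining sources are only weakly supervised. The numpy sketch below illustrates that remixing step; it is not the facebookresearch/demucs code, and the synthetic stems and the crude energy-based silence check are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sr, seconds = 44100, 1
t = np.arange(sr * seconds) / sr

# Supervised stem: a bass line with a known ground-truth waveform.
supervised_bass = 0.3 * np.sin(2 * np.pi * 55 * t)

# Unlabeled track: pretend a first model found an excerpt where the bass is silent.
unlabeled_no_bass = 0.2 * np.sin(2 * np.pi * 440 * t) + 0.05 * rng.standard_normal(len(t))

def bass_is_silent(estimated_bass, threshold=0.01):
    """Crude silence check on the (estimated) bass energy of an excerpt."""
    return np.mean(estimated_bass ** 2) < threshold

estimated_bass_in_excerpt = np.zeros_like(t)   # stand-in for the first model's bass estimate
if bass_is_silent(estimated_bass_in_excerpt):
    # Remix: mixture = no-bass excerpt + known bass stem. The bass target is exact;
    # the other sources are only weakly supervised (their sum is known, not each stem).
    mixture = unlabeled_no_bass + supervised_bass
    training_example = {"mix": mixture, "bass_target": supervised_bass}
    print("created weakly supervised example:", training_example["mix"].shape)
```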
MASS: Masked Sequence to Sequence Pre-training for Language Generation
Title | MASS: Masked Sequence to Sequence Pre-training for Language Generation |
Authors | Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu |
Abstract | Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from a rich-resource pre-training task to low/zero-resource downstream tasks. Inspired by the success of BERT, we propose MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder based language generation tasks. MASS adopts the encoder-decoder framework to reconstruct a sentence fragment given the remaining part of the sentence: its encoder takes a sentence with a randomly masked fragment (several consecutive tokens) as input, and its decoder tries to predict this masked fragment. In this way, MASS can jointly train the encoder and decoder to develop the capability of representation extraction and language modeling. By further fine-tuning on a variety of zero/low-resource language generation tasks, including neural machine translation, text summarization and conversational response generation (3 tasks and 8 datasets in total), MASS achieves significant improvements over baselines without pre-training or with other pre-training methods. Notably, we achieve state-of-the-art accuracy (37.5 in terms of BLEU score) on unsupervised English-French translation, even beating the early attention-based supervised model. |
Tasks | Conversational Response Generation, Machine Translation, Text Generation, Text Summarization, Unsupervised Machine Translation |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02450v5 |
https://arxiv.org/pdf/1905.02450v5.pdf | |
PWC | https://paperswithcode.com/paper/mass-masked-sequence-to-sequence-pre-training |
Repo | https://github.com/microsoft/MASS |
Framework | pytorch |
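MASS masks one consecutive fragment of the sentence on the encoder side and trains the decoder to reconstruct exactly that fragment. The helper below is a minimal sketch of how such a training pair can be built; the mask token, the roughly 50% fragment length, and the shifted decoder input are simplifications, and the official microsoft/MASS repository implements this differently.

```python
import random

MASK = "[MASK]"

def make_mass_pair(tokens, frac=0.5, seed=0):
    """Mask a consecutive fragment for the encoder; the decoder must predict it."""
    random.seed(seed)
    k = max(1, int(len(tokens) * frac))            # fragment length (~50% in the paper)
    start = random.randint(0, len(tokens) - k)
    encoder_input = tokens[:start] + [MASK] * k + tokens[start + k:]
    decoder_target = tokens[start:start + k]
    # Decoder input is the target shifted right (teacher forcing); positions
    # outside the fragment are not predicted.
    decoder_input = [MASK] + decoder_target[:-1]
    return encoder_input, decoder_input, decoder_target

sentence = "masked sequence to sequence pre-training for language generation".split()
enc, dec_in, dec_out = make_mass_pair(sentence)
print(enc)
print(dec_in)
print(dec_out)
```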
MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines
Title | MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines |
Authors | Mihail Eric, Rahul Goel, Shachi Paul, Adarsh Kumar, Abhishek Sethi, Peter Ku, Anuj Kumar Goyal, Sanchit Agarwal, Shuyang Gao, Dilek Hakkani-Tur |
Abstract | MultiWOZ 2.0 (Budzianowski et al., 2018) is a recently released multi-domain dialogue dataset spanning 7 distinct domains and containing over 10,000 dialogues. Though immensely useful and one of the largest resources of its kind to date, MultiWOZ 2.0 has a few shortcomings. Firstly, there is substantial noise in the dialogue state annotations and dialogue utterances, which negatively impacts the performance of state-tracking models. Secondly, follow-up work (Lee et al., 2019) has augmented the original dataset with user dialogue acts. This leads to multiple co-existing versions of the same dataset with minor modifications. In this work we tackle these issues by introducing MultiWOZ 2.1. To fix the noisy state annotations, we use crowdsourced workers to re-annotate states and utterances based on the original utterances in the dataset. This correction process results in changes to over 32% of state annotations across 40% of the dialogue turns. In addition, we fix 146 dialogue utterances by canonicalizing slot values in the utterances to the values in the dataset ontology. To address the second problem, we combine the contributions of the follow-up works into MultiWOZ 2.1. Hence, our dataset also includes user dialogue acts as well as multiple slot descriptions per dialogue state slot. We then benchmark a number of state-of-the-art dialogue state tracking models on the MultiWOZ 2.1 dataset and report their joint state tracking performance on the corrected state annotations. We are publicly releasing MultiWOZ 2.1 to the community, hoping that this dataset resource will allow more effective models to be built for various dialogue subproblems in the future. |
Tasks | Dialogue State Tracking |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01669v4 |
https://arxiv.org/pdf/1907.01669v4.pdf | |
PWC | https://paperswithcode.com/paper/multiwoz-21-multi-domain-dialogue-state |
Repo | https://github.com/budzianowski/multiwoz |
Framework | pytorch |
Deep Adaptive Wavelet Network
Title | Deep Adaptive Wavelet Network |
Authors | Maria Ximena Bastidas Rodriguez, Adrien Gruson, Luisa F. Polania, Shin Fujieda, Flavio Prieto Ortiz, Kohei Takayama, Toshiya Hachisuka |
Abstract | Even though convolutional neural networks have become the method of choice in many fields of computer vision, they still lack interpretability and are usually designed manually in a cumbersome trial-and-error process. This paper aims to overcome those limitations by proposing a deep neural network that is designed in a systematic fashion and is interpretable, by integrating multiresolution analysis at the core of the network design. Using the lifting scheme, it is possible to generate a wavelet representation and design a network capable of learning wavelet coefficients in an end-to-end form. Compared to state-of-the-art architectures, the proposed model requires less hyper-parameter tuning and achieves competitive accuracy on image classification tasks. |
Tasks | Image Classification |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.05035v1 |
https://arxiv.org/pdf/1912.05035v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-adaptive-wavelet-network |
Repo | https://github.com/mxbastidasr/DAWN_WACV2020 |
Framework | pytorch |
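The lifting scheme splits a signal into even and odd samples, predicts the odd part from the even part, and updates the even part with the resulting detail coefficients; DAWN makes the predict and update operators learnable so wavelet coefficients are learned end to end. Below is a minimal 1-D sketch of one learnable lifting step; the real network operates in 2-D over multiple levels, and the convolution sizes here are assumptions.

```python
import torch
import torch.nn as nn

class LiftingStep(nn.Module):
    """One lifting step: detail = odd - P(even), approximation = even + U(detail)."""
    def __init__(self, channels=1, kernel=3):
        super().__init__()
        self.predict = nn.Conv1d(channels, channels, kernel, padding=kernel // 2)
        self.update = nn.Conv1d(channels, channels, kernel, padding=kernel // 2)

    def forward(self, x):                    # x: (batch, channels, length), length even
        even, odd = x[..., 0::2], x[..., 1::2]
        detail = odd - self.predict(even)    # learnable prediction of odd from even
        approx = even + self.update(detail)  # learnable update of the coarse signal
        return approx, detail                # wavelet-like coefficients, end-to-end trainable

signal = torch.sin(torch.linspace(0, 6.28, 64)).view(1, 1, 64)
approx, detail = LiftingStep()(signal)
print(approx.shape, detail.shape)            # both (1, 1, 32)
```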
Leveraging the Invariant Side of Generative Zero-Shot Learning
Title | Leveraging the Invariant Side of Generative Zero-Shot Learning |
Authors | Jingjing Li, Mengmeng Jin, Ke Lu, Zhengming Ding, Lei Zhu, Zi Huang |
Abstract | Conventional zero-shot learning (ZSL) methods generally learn an embedding, e.g., a visual-semantic mapping, to handle unseen visual samples in an indirect manner. In this paper, we take advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate unseen features from random noise conditioned on the semantic descriptions. Specifically, we train a conditional Wasserstein GAN in which the generator synthesizes fake unseen features from noise and the discriminator distinguishes the fake from the real via a minimax game. Considering that one semantic description can correspond to various synthesized visual samples, and that the semantic description, figuratively, is the soul of the generated features, we introduce soul samples as the invariant side of generative zero-shot learning in this paper. A soul sample is the meta-representation of one class. It captures the most semantically meaningful aspects of the samples in its category. We require that each generated sample (the varying side of generative ZSL) be close to at least one soul sample (the invariant side) that has the same class label. At the zero-shot recognition stage, we propose to use two classifiers, deployed in a cascade, to achieve a coarse-to-fine result. Experiments on five popular benchmarks verify that our proposed approach can outperform state-of-the-art methods with significant improvements. |
Tasks | Zero-Shot Learning |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.04092v1 |
http://arxiv.org/pdf/1904.04092v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-the-invariant-side-of-generative |
Repo | https://github.com/lijin118/LisGAN |
Framework | pytorch |
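The soul-sample regularizer requires every generated feature to lie close to at least one soul sample of its class, which is what anchors the generator to the semantics. The PyTorch sketch below shows such a loss term; it is not the lijin118/LisGAN code, and the per-class-mean soul samples and the squared nearest-distance form are illustrative assumptions.

```python
import torch

def soul_sample_loss(generated, labels, real_features, real_labels, n_souls_per_class=1):
    """Each generated feature should be close to at least one soul sample of its class."""
    loss = 0.0
    for c in labels.unique():
        class_real = real_features[real_labels == c]
        # Soul sample(s): here simply the mean feature of the class (a meta-representation).
        souls = class_real.mean(dim=0, keepdim=True).repeat(n_souls_per_class, 1)
        gen_c = generated[labels == c]
        # Distance from each generated sample to its nearest soul sample of the same class.
        d = torch.cdist(gen_c, souls).min(dim=1).values
        loss = loss + (d ** 2).mean()
    return loss / len(labels.unique())

real_features = torch.randn(100, 16)
real_labels = torch.randint(0, 5, (100,))
generated = torch.randn(20, 16, requires_grad=True)   # stand-in for generator outputs
labels = torch.randint(0, 5, (20,))
print(soul_sample_loss(generated, labels, real_features, real_labels))
```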
Efficient Forward Architecture Search
Title | Efficient Forward Architecture Search |
Authors | Hanzhang Hu, John Langford, Rich Caruana, Saurajit Mukherjee, Eric Horvitz, Debadeepta Dey |
Abstract | We propose a neural architecture search (NAS) algorithm, Petridish, to iteratively add shortcut connections to existing network layers. The added shortcut connections effectively perform gradient boosting on the augmented layers. The proposed algorithm is motivated by the feature selection algorithm forward stage-wise linear regression, since we consider NAS as a generalization of feature selection for regression, where NAS selects shortcuts among layers instead of selecting features. In order to reduce the number of trials over possible connection combinations, we jointly train all possible connections at each stage of growth while leveraging feature selection techniques to choose a subset of them. We experimentally show this process to be an efficient forward architecture search algorithm that can find competitive models using few GPU days in both the search space of repeatable network modules (cell-search) and the space of general networks (macro-search). Petridish is particularly well-suited for warm-starting from existing models, which is crucial for lifelong-learning scenarios. |
Tasks | Feature Selection, Neural Architecture Search |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13360v1 |
https://arxiv.org/pdf/1905.13360v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-forward-architecture-search |
Repo | https://github.com/microsoft/petridishnn |
Framework | tf |
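Petridish jointly trains many candidate shortcut connections and then applies feature-selection-style criteria to keep only a few of them when growing the network. The sketch below illustrates that idea with L1-regularized scalar gates on candidate shortcuts followed by a top-k selection; the gating mechanism, the regularization weight, and the toy regression task are assumptions, not the microsoft/petridishnn implementation.

```python
import torch
import torch.nn as nn

class CandidateShortcuts(nn.Module):
    """Jointly train several candidate shortcut connections, each behind a scalar gate."""
    def __init__(self, n_candidates=4, dim=16):
        super().__init__()
        self.ops = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_candidates)])
        self.gates = nn.Parameter(0.1 * torch.randn(n_candidates))

    def forward(self, x):
        return x + sum(g * op(x) for g, op in zip(self.gates, self.ops))

model = CandidateShortcuts()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, target = torch.randn(64, 16), torch.randn(64, 16)

for _ in range(200):
    # Task loss plus an L1 penalty on the gates, in the spirit of feature selection.
    loss = nn.functional.mse_loss(model(x), target) + 1e-3 * model.gates.abs().sum()
    opt.zero_grad(); loss.backward(); opt.step()

# Selection step: keep only the strongest shortcut(s) for the grown network.
keep = model.gates.abs().topk(1).indices
print("selected shortcut candidates:", keep.tolist())
```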
Task-Driven Modular Networks for Zero-Shot Compositional Learning
Title | Task-Driven Modular Networks for Zero-Shot Compositional Learning |
Authors | Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc’Aurelio Ranzato |
Abstract | One of the hallmarks of human intelligence is the ability to compose learned knowledge into novel concepts which can be recognized without a single training example. In contrast, current state-of-the-art methods require hundreds of training examples for each possible category to build reliable and accurate classifiers. To alleviate this striking difference in efficiency, we propose a task-driven modular architecture for compositional reasoning and sample efficient learning. Our architecture consists of a set of neural network modules, which are small fully connected layers operating in semantic concept space. These modules are configured through a gating function conditioned on the task to produce features representing the compatibility between the input image and the concept under consideration. This enables us to express tasks as a combination of sub-tasks and to generalize to unseen categories by reweighting a set of small modules. Furthermore, the network can be trained efficiently as it is fully differentiable and its modules operate on small sub-spaces. We focus our study on the problem of compositional zero-shot classification of object-attribute categories. We show in our experiments that current evaluation metrics are flawed as they only consider unseen object-attribute pairs. When extending the evaluation to the generalized setting which accounts also for pairs seen during training, we discover that naive baseline methods perform similarly or better than current approaches. However, our modular network is able to outperform all existing approaches on two widely-used benchmark datasets. |
Tasks | Zero-Shot Learning |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.05908v1 |
https://arxiv.org/pdf/1905.05908v1.pdf | |
PWC | https://paperswithcode.com/paper/task-driven-modular-networks-for-zero-shot |
Repo | https://github.com/facebookresearch/taskmodularnets |
Framework | pytorch |
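Each task (an attribute-object pair) configures a shared pool of small modules through a gating function, and the gated network scores the compatibility between an image feature and that pair. The snippet below is a heavily simplified sketch of that configuration step, not the facebookresearch/taskmodularnets code; the embedding sizes, module shapes, and softmax gating are assumptions.

```python
import torch
import torch.nn as nn

class TaskDrivenModules(nn.Module):
    def __init__(self, n_modules=8, feat_dim=32, task_dim=16):
        super().__init__()
        self.modules_pool = nn.ModuleList(
            [nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU()) for _ in range(n_modules)])
        self.gate = nn.Linear(task_dim, n_modules)       # gating conditioned on the task
        self.score = nn.Linear(feat_dim, 1)              # compatibility of image and task

    def forward(self, image_feat, task_embedding):
        weights = torch.softmax(self.gate(task_embedding), dim=-1)      # (batch, n_modules)
        outputs = torch.stack([m(image_feat) for m in self.modules_pool], dim=1)
        mixed = (weights.unsqueeze(-1) * outputs).sum(dim=1)            # reweighted modules
        return self.score(mixed).squeeze(-1)

net = TaskDrivenModules()
image_feat = torch.randn(4, 32)          # e.g. CNN features of 4 images
task_embedding = torch.randn(4, 16)      # e.g. embedding of an (attribute, object) pair
print(net(image_feat, task_embedding))   # compatibility scores, one per image-task pair
```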
Funnelling: A New Ensemble Method for Heterogeneous Transfer Learning and its Application to Cross-Lingual Text Classification
Title | Funnelling: A New Ensemble Method for Heterogeneous Transfer Learning and its Application to Cross-Lingual Text Classification |
Authors | Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani |
Abstract | Cross-lingual Text Classification (CLC) consists of automatically classifying, according to a common set C of classes, documents each written in one of a set of languages L, and doing so more accurately than when naively classifying each document via its corresponding language-specific classifier. In order to obtain an increase in classification accuracy for a given language, the system thus needs to also leverage the training examples written in the other languages. We tackle multilabel CLC via funnelling, a new ensemble learning method that we propose here. Funnelling consists of generating a two-tier classification system where all documents, irrespective of language, are classified by the same (2nd-tier) classifier. For this classifier, all documents are represented in a common, language-independent feature space consisting of the posterior probabilities generated by 1st-tier, language-dependent classifiers. This allows the classification of all test documents, of any language, to benefit from the information present in all training documents, of any language. We present substantial experiments, run on publicly available multilingual text collections, in which funnelling is shown to significantly outperform a number of state-of-the-art baselines. All code and datasets (in vector form) are made publicly available. |
Tasks | Text Classification, Transfer Learning |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11459v2 |
http://arxiv.org/pdf/1901.11459v2.pdf | |
PWC | https://paperswithcode.com/paper/funnelling-a-new-ensemble-method-for |
Repo | https://github.com/AlexMoreo/funnelling |
Framework | none |
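Funnelling trains one language-specific first-tier classifier per language, maps every document to the vector of posterior probabilities produced by its own first-tier classifier, and trains a single second-tier classifier on that shared space. The scikit-learn sketch below illustrates the two tiers on a toy single-label corpus (the paper's setting is multilabel); the corpora and classifier choices are assumptions, not the AlexMoreo/funnelling code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny toy corpora; the real setting uses large multilingual, multilabel collections.
corpora = {
    "en": (["cheap flights to rome", "parliament passes new law"], [0, 1]),
    "es": (["vuelos baratos a roma", "el parlamento aprueba la ley"], [0, 1]),
}

# First tier: one language-specific classifier per language, trained on its own features.
first_tier = {}
for lang, (docs, y) in corpora.items():
    vec = TfidfVectorizer().fit(docs)
    clf = LogisticRegression().fit(vec.transform(docs), y)
    first_tier[lang] = (vec, clf)

# Second tier: all documents, regardless of language, are represented by the posterior
# probabilities of their first-tier classifier and pooled into one training set.
X2, y2 = [], []
for lang, (docs, y) in corpora.items():
    vec, clf = first_tier[lang]
    X2.extend(clf.predict_proba(vec.transform(docs)))
    y2.extend(y)
second_tier = LogisticRegression().fit(X2, y2)

# A new Spanish document is funnelled through its language's first tier, then the shared tier.
vec, clf = first_tier["es"]
probs = clf.predict_proba(vec.transform(["vuelos a madrid"]))
print(second_tier.predict(probs))
```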
Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds
Title | Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds |
Authors | Peng Cao, Yilun Xu, Yuqing Kong, Yizhou Wang |
Abstract | Eliciting labels from crowds is a potential way to obtain large labeled datasets. Despite a variety of methods developed for learning from crowds, a key challenge remains unsolved: learning from crowds without knowing the information structure among the crowds a priori, when some members of the crowd make highly correlated mistakes and some label effortlessly (e.g., randomly). We propose an information-theoretic approach, Max-MIG, for joint learning from crowds, with a common assumption: the crowdsourced labels and the data are independent conditioned on the ground truth. Max-MIG simultaneously aggregates the crowdsourced labels and learns an accurate data classifier. Furthermore, we devise an accurate data-crowds forecaster that employs both the data and the crowdsourced labels to forecast the ground truth. To the best of our knowledge, this is the first algorithm that solves the aforementioned challenge of learning from crowds. In addition to the theoretical validation, we also empirically show that our algorithm achieves new state-of-the-art results in most settings, including on real-world data, and is the first algorithm that is robust to various information structures. Code is available at https://github.com/Newbeeer/Max-MIG |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13436v1 |
https://arxiv.org/pdf/1905.13436v1.pdf | |
PWC | https://paperswithcode.com/paper/max-mig-an-information-theoretic-approach-for-1 |
Repo | https://github.com/CsPsy/Max-MIG |
Framework | pytorch |