October 21, 2019

3003 words 15 mins read

Paper Group AWR 37

Joint Multilingual Supervision for Cross-lingual Entity Linking. Another Diversity-Promoting Objective Function for Neural Dialogue Generation. Importance of Search and Evaluation Strategies in Neural Dialogue Modeling. RISE: Randomized Input Sampling for Explanation of Black-box Models. A deep learning architecture to detect events in EEG signals …

Joint Multilingual Supervision for Cross-lingual Entity Linking

Title Joint Multilingual Supervision for Cross-lingual Entity Linking
Authors Shyam Upadhyay, Nitish Gupta, Dan Roth
Abstract Cross-lingual Entity Linking (XEL) aims to ground entity mentions written in any language to an English Knowledge Base (KB), such as Wikipedia. XEL for most languages is challenging, owing to the limited availability of resources for supervision. We address this challenge by developing the first XEL approach that combines supervision from multiple languages jointly. This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. Extensive evaluation on three benchmark datasets across 8 languages shows that our approach significantly improves over the current state-of-the-art. We also provide analyses in two limited-resource settings: (a) a zero-shot setting, when no supervision in the target language is available, and (b) a low-resource setting, when some supervision in the target language is available. Our analysis provides insights into the limitations of zero-shot XEL approaches in realistic scenarios, and shows the value of joint supervision in low-resource settings.
Tasks Cross-Lingual Entity Linking, Entity Linking
Published 2018-09-20
URL http://arxiv.org/abs/1809.07657v1
PDF http://arxiv.org/pdf/1809.07657v1.pdf
PWC https://paperswithcode.com/paper/joint-multilingual-supervision-for-cross
Repo https://github.com/shyamupa/xelms
Framework none
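
The heart of the approach is pooling supervision from several languages into a single linking model. A minimal sketch of that joint training stream, assuming per-language example lists grounded in one English KB (the `mixed_batches` helper and uniform shuffling are illustrative; the paper may weight or sample languages differently):

```python
import random

def mixed_batches(datasets, batch_size=32, seed=0):
    """Pool (mention, gold_entity) examples from several languages into
    one shuffled training stream for a single entity linking model.

    datasets: dict mapping language code -> list of training examples,
    all grounded in the same English KB. Uniform shuffling is an
    assumption, not the paper's exact sampling scheme.
    """
    rng = random.Random(seed)
    pool = [ex for examples in datasets.values() for ex in examples]
    rng.shuffle(pool)
    for i in range(0, len(pool), batch_size):
        yield pool[i:i + batch_size]
```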

Another Diversity-Promoting Objective Function for Neural Dialogue Generation

Title Another Diversity-Promoting Objective Function for Neural Dialogue Generation
Authors Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
Abstract Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity. The most likely reason for this problem is Maximum Likelihood Estimation (MLE) with Softmax Cross-Entropy (SCE) loss. MLE trains models to generate the most frequent responses from enormous generation candidates, although in actual dialogues there are various responses based on the context. In this paper, we propose a new objective function called Inverse Token Frequency (ITF) loss, which scales the loss down for frequent token classes and up for rare token classes. This function encourages the model to generate rare tokens rather than frequent tokens. It does not complicate the model, and its training is stable because we only replace the objective function. On the OpenSubtitles dialogue dataset, our loss model establishes a state-of-the-art DIST-1 (unigram diversity) score of 7.56 while maintaining a good BLEU-1 score. On a Japanese Twitter replies dataset, our loss model achieves a DIST-1 score comparable to the ground truth.
Tasks Dialogue Generation
Published 2018-11-20
URL http://arxiv.org/abs/1811.08100v2
PDF http://arxiv.org/pdf/1811.08100v2.pdf
PWC https://paperswithcode.com/paper/another-diversity-promoting-objective
Repo https://github.com/reppy4620/Dialog
Framework pytorch
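
The abstract pins down only the shape of the ITF loss: smaller loss for frequent token classes, larger for rare ones. A PyTorch sketch under that reading, where the per-class weight is a smoothed reciprocal of corpus counts (the exact scaling used in the paper may differ):

```python
import torch
import torch.nn.functional as F

def itf_weights(token_counts, smoothing=1.0):
    """Per-class weights proportional to inverse token frequency.

    token_counts: 1-D tensor of raw corpus counts per vocabulary id.
    A plain smoothed reciprocal is assumed here; normalization keeps
    the average weight at 1 so the loss scale stays comparable to SCE.
    """
    w = 1.0 / (token_counts.float() + smoothing)
    return w / w.mean()

def itf_loss(logits, targets, weights):
    """Softmax cross-entropy where each target class is rescaled by its
    ITF weight: rare tokens contribute more, frequent tokens less.

    logits: (batch, vocab), targets: (batch,), weights: (vocab,)
    """
    return F.cross_entropy(logits, targets, weight=weights)
```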

Importance of Search and Evaluation Strategies in Neural Dialogue Modeling

Title Importance of Search and Evaluation Strategies in Neural Dialogue Modeling
Authors Ilia Kulikov, Alexander H. Miller, Kyunghyun Cho, Jason Weston
Abstract We investigate the impact of search strategies in neural dialogue modeling. We first compare two standard search algorithms, greedy and beam search, as well as our newly proposed iterative beam search which produces a more diverse set of candidate responses. We evaluate these strategies in realistic full conversations with humans and propose a model-based Bayesian calibration to address annotator bias. These conversations are analyzed using two automatic metrics: log-probabilities assigned by the model and utterance diversity. Our experiments reveal that better search algorithms lead to higher rated conversations. However, finding the optimal selection mechanism to choose from a more diverse set of candidates is still an open question.
Tasks Calibration, Dialogue Generation
Published 2018-11-02
URL https://arxiv.org/abs/1811.00907v3
PDF https://arxiv.org/pdf/1811.00907v3.pdf
PWC https://paperswithcode.com/paper/importance-of-a-search-strategy-in-neural
Repo https://github.com/nyu-dl/dl4dial-bayesian-calibration
Framework pytorch
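
For context, the greedy and beam search baselines operate on the model's per-step distribution; plain beam search can be sketched as below. The proposed iterative beam search re-runs this procedure while excluding hypotheses explored in earlier iterations to diversify the candidate set, which is omitted here:

```python
def beam_search(step_fn, bos, eos, beam_size=5, max_len=20):
    """Plain beam search over a generic step function.

    step_fn(prefix) -> iterable of (token, log_prob) continuations.
    Returns hypotheses sorted by total log-probability. Length
    normalization and other practical tweaks are left out.
    """
    beams = [([bos], 0.0)]          # (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in step_fn(prefix):
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
            if len(beams) == beam_size:
                break
        if not beams:
            break
    return sorted(finished + beams, key=lambda c: c[1], reverse=True)
```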

RISE: Randomized Input Sampling for Explanation of Black-box Models

Title RISE: Randomized Input Sampling for Explanation of Black-box Models
Authors Vitali Petsiuk, Abir Das, Kate Saenko
Abstract Deep neural networks are being used increasingly to automate data analysis and decision making, yet their decision-making process is largely unclear and is difficult to explain to the end users. In this paper, we address the problem of Explainable AI for deep neural networks that take images as input and output a class probability. We propose an approach called RISE that generates an importance map indicating how salient each pixel is for the model’s prediction. In contrast to white-box approaches that estimate pixel importance using gradients or other internal network state, RISE works on black-box models. It estimates importance empirically by probing the model with randomly masked versions of the input image and obtaining the corresponding outputs. We compare our approach to state-of-the-art importance extraction methods using both an automatic deletion/insertion metric and a pointing metric based on human-annotated object segments. Extensive experiments on several benchmark datasets show that our approach matches or exceeds the performance of other methods, including white-box approaches. Project page: http://cs-people.bu.edu/vpetsiuk/rise/
Tasks Decision Making
Published 2018-06-19
URL http://arxiv.org/abs/1806.07421v3
PDF http://arxiv.org/pdf/1806.07421v3.pdf
PWC https://paperswithcode.com/paper/rise-randomized-input-sampling-for
Repo https://github.com/eclique/RISE
Framework pytorch
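
RISE's mechanism is fully black-box: sample random low-resolution binary masks, upsample them over the image, and average the masks weighted by the model's output probability. A compact NumPy sketch (mask upsampling is simplified here; the paper additionally uses bilinear smoothing and random shifts):

```python
import numpy as np

def rise_saliency(model, image, n_masks=4000, p=0.5, cell=7):
    """RISE-style saliency map for a black-box model.

    model(img) -> probability of the target class (scalar);
    image: (H, W, C) array in [0, 1].
    """
    H, W = image.shape[:2]
    rep_h, rep_w = H // cell + 1, W // cell + 1
    sal = np.zeros((H, W))
    for _ in range(n_masks):
        grid = (np.random.rand(cell, cell) < p).astype(np.float64)
        # nearest-neighbor upsampling of the low-res grid to image size
        mask = np.kron(grid, np.ones((rep_h, rep_w)))[:H, :W]
        sal += model(image * mask[..., None]) * mask
    return sal / (n_masks * p)   # Monte Carlo estimate of importance
```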

A deep learning architecture to detect events in EEG signals during sleep

Title A deep learning architecture to detect events in EEG signals during sleep
Authors Stanislas Chambon, Valentin Thorey, Pierrick J. Arnal, Emmanuel Mignot, Alexandre Gramfort
Abstract Electroencephalography (EEG) during sleep is used by clinicians to evaluate various neurological disorders. In sleep medicine, it is relevant to detect macro-events (>10 s) such as sleep stages, and micro-events (<2 s) such as spindles and K-complexes. Annotating such events requires a trained sleep expert, a time-consuming and tedious process with large inter-scorer variability. Automatic algorithms have been developed to detect various types of events, but these are event-specific. We propose a deep learning method that jointly predicts locations, durations and types of events in EEG time series. It relies on a convolutional neural network that builds a feature representation from raw EEG signals. Numerical experiments demonstrate the efficiency of this new approach on various event detection tasks compared to current state-of-the-art, event-specific algorithms.
Tasks EEG, Time Series
Published 2018-07-11
URL http://arxiv.org/abs/1807.05981v1
PDF http://arxiv.org/pdf/1807.05981v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-architecture-to-detect-events
Repo https://github.com/Dreem-Organization/dosed
Framework pytorch
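
The joint prediction of locations, durations and types maps naturally onto a detection-style network with two heads over a shared feature map. A toy sketch of that output structure, assuming an anchor/default-event design (layer sizes and the anchor head are illustrative, not the paper's exact architecture):

```python
import torch.nn as nn

class EventDetector(nn.Module):
    """Sketch: jointly predict event class and (center, duration)
    offsets at a fixed number of default locations in a raw EEG window.
    """
    def __init__(self, n_channels=1, n_classes=3, n_anchors=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(n_channels, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(n_anchors),
        )
        self.cls = nn.Conv1d(32, n_classes + 1, 1)  # +1 "no event" class
        self.loc = nn.Conv1d(32, 2, 1)              # (center, duration)

    def forward(self, x):                 # x: (batch, channels, time)
        h = self.backbone(x)              # (batch, 32, n_anchors)
        return self.cls(h), self.loc(h)   # scores and offsets per anchor
```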

Learning to Decompose and Disentangle Representations for Video Prediction

Title Learning to Decompose and Disentangle Representations for Video Prediction
Authors Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles
Abstract Our goal is to predict future video frames given a sequence of input frames. Despite large amounts of video data, this remains a challenging task because of the high-dimensionality of video frames. We address this challenge by proposing the Decompositional Disentangled Predictive Auto-Encoder (DDPAE), a framework that combines structured probabilistic models and deep networks to automatically (i) decompose the high-dimensional video that we aim to predict into components, and (ii) disentangle each component to have low-dimensional temporal dynamics that are easier to predict. Crucially, with an appropriately specified generative model of video frames, our DDPAE is able to learn both the latent decomposition and disentanglement without explicit supervision. For the Moving MNIST dataset, we show that DDPAE is able to recover the underlying components (individual digits) and disentanglement (appearance and location) as we would intuitively do. We further demonstrate that DDPAE can be applied to the Bouncing Balls dataset involving complex interactions between multiple objects to predict the video frame directly from the pixels and recover physical states without explicit supervision.
Tasks Predict Future Video Frames, Video Prediction
Published 2018-06-11
URL http://arxiv.org/abs/1806.04166v2
PDF http://arxiv.org/pdf/1806.04166v2.pdf
PWC https://paperswithcode.com/paper/learning-to-decompose-and-disentangle
Repo https://github.com/jthsieh/DDPAE-video-prediction
Framework pytorch

Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences

Title Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences
Authors Louis Lettry, Kenneth Vanhoey, Luc van Gool
Abstract Machine-learning-based Single Image Intrinsic Decomposition (SIID) methods decompose a captured scene into its albedo and shading images by using the knowledge of a large set of known and realistic ground-truth decompositions. Collecting and annotating such a dataset is an approach that cannot scale to sufficient variety and realism. We free ourselves from this limitation by training on unannotated images. Our method leverages the observation that two images of the same scene but with different lighting provide useful information on their intrinsic properties: by definition, albedo is invariant to lighting conditions, and cross-combining the estimated albedo of a first image with the estimated shading of a second one should lead back to the second one’s input image. We transcribe this relationship into a Siamese training scheme for a deep convolutional neural network that decomposes a single image into albedo and shading. The Siamese setting allows us to introduce a new loss function including such cross-combinations, and to train solely on (time-lapse) images, discarding the need for any ground-truth annotations. As a result, our method has the good properties of i) taking advantage of the time-varying information of image sequences in the (pre-computed) training step, ii) not requiring ground-truth data to train on, and iii) being able to decompose single images of unseen scenes at runtime. To demonstrate and evaluate our work, we additionally propose a new rendered dataset containing illumination-varying scenes and a set of quantitative metrics to evaluate SIID algorithms. Despite its unsupervised nature, our results compete with state-of-the-art methods, including supervised and non-data-driven methods.
Tasks
Published 2018-03-02
URL http://arxiv.org/abs/1803.00805v2
PDF http://arxiv.org/pdf/1803.00805v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-single-image-intrinsic
Repo https://github.com/kvanhoey/UnsupervisedIntrinsicDecomposition
Framework tf
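
The cross-combination idea translates directly into a loss: the albedo of one view multiplied by the shading of the other must reproduce the other view. A hedged sketch of the Siamese objective (term weights and the paper's additional regularizers are omitted):

```python
def cross_combination_loss(net, img1, img2):
    """Siamese SIID loss for two tensors imaging one scene under
    different lighting. net(img) -> (albedo, shading) torch tensors
    with img ≈ albedo * shading.
    """
    a1, s1 = net(img1)
    a2, s2 = net(img2)
    recon = (a1 * s1 - img1).pow(2).mean() + (a2 * s2 - img2).pow(2).mean()
    cross = (a1 * s2 - img2).pow(2).mean() + (a2 * s1 - img1).pow(2).mean()
    same_albedo = (a1 - a2).pow(2).mean()   # albedo is lighting-invariant
    return recon + cross + same_albedo
```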

End-to-end Audiovisual Speech Recognition

Title End-to-end Audiovisual Speech Recognition
Authors Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Feipeng Cai, Georgios Tzimiropoulos, Maja Pantic
Abstract Several end-to-end deep learning approaches have been presented recently which extract either audio or visual features from the input images or audio signals and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs). To the best of our knowledge, this is the first audiovisual fusion model which simultaneously learns to extract features directly from the image pixels and audio waveforms and performs within-context word recognition on a large publicly available dataset (LRW). The model consists of two streams, one for each modality, which extract features directly from mouth regions and raw waveforms. The temporal dynamics in each stream/modality are modeled by a 2-layer BGRU, and the fusion of multiple streams/modalities takes place via another 2-layer BGRU. A slight improvement in the classification rate over an end-to-end audio-only and MFCC-based model is reported under clean audio conditions and low levels of noise. In the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.
Tasks Speech Recognition
Published 2018-02-18
URL http://arxiv.org/abs/1802.06424v2
PDF http://arxiv.org/pdf/1802.06424v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-audiovisual-speech-recognition
Repo https://github.com/tstafylakis/Lipreading-ResNet
Framework pytorch
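
The fusion topology is spelled out in the abstract: one stream per modality, a 2-layer BGRU per stream, and another 2-layer BGRU for fusion. A skeletal PyTorch version, with the ResNet front-ends abstracted into per-frame feature inputs (feature sizes and the mean-pooled readout are illustrative assumptions):

```python
import torch
import torch.nn as nn

class AVFusion(nn.Module):
    def __init__(self, d_audio=40, d_video=256, d_hid=128, n_words=500):
        super().__init__()
        self.gru_a = nn.GRU(d_audio, d_hid, 2, bidirectional=True, batch_first=True)
        self.gru_v = nn.GRU(d_video, d_hid, 2, bidirectional=True, batch_first=True)
        # each bidirectional stream emits 2*d_hid features per step
        self.fuse = nn.GRU(4 * d_hid, d_hid, 2, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * d_hid, n_words)

    def forward(self, audio, video):   # (B, T, d_audio), (B, T, d_video)
        ha, _ = self.gru_a(audio)      # (B, T, 2*d_hid)
        hv, _ = self.gru_v(video)      # (B, T, 2*d_hid)
        hf, _ = self.fuse(torch.cat([ha, hv], dim=-1))
        return self.out(hf.mean(dim=1))   # word logits from pooled states
```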

Modeling urbanization patterns with generative adversarial networks

Title Modeling urbanization patterns with generative adversarial networks
Authors Adrian Albert, Emanuele Strano, Jasleen Kaur, Marta Gonzalez
Abstract In this study we propose a new method to simulate hyper-realistic urban patterns using Generative Adversarial Networks trained with a global urban land-use inventory. We generated a synthetic urban “universe” that qualitatively reproduces the complex spatial organization observed in global urban patterns, while being able to quantitatively recover certain key high-level urban spatial metrics.
Tasks
Published 2018-01-08
URL http://arxiv.org/abs/1801.02710v1
PDF http://arxiv.org/pdf/1801.02710v1.pdf
PWC https://paperswithcode.com/paper/modeling-urbanization-patterns-with
Repo https://github.com/adrianalbert/citygan
Framework pytorch

Real-time Air Pollution prediction model based on Spatiotemporal Big data

Title Real-time Air Pollution prediction model based on Spatiotemporal Big data
Authors V. Duc Le, Sang Kyun Cha
Abstract Air pollution is one of the greatest concerns for urban areas. Many countries have constructed monitoring stations to collect pollution values hourly. Recently, a research project in Daegu, Korea has pursued real-time air-quality monitoring via sensors installed on taxis running across the whole city. The collected data are huge (1-second intervals) and both spatial and temporal in form. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model that applies a Convolutional Neural Network (CNN) to the image-like spatial distribution of air pollution. To handle the temporal information in the data, we introduce a combination of a Long Short-Term Memory (LSTM) unit for time series data and a neural network model for other air pollution impact factors, such as weather conditions, to build a hybrid prediction model. This model is simple in architecture but still delivers good prediction ability.
Tasks Air Pollution Prediction, Time Series
Published 2018-04-05
URL http://arxiv.org/abs/1805.00432v3
PDF http://arxiv.org/pdf/1805.00432v3.pdf
PWC https://paperswithcode.com/paper/real-time-air-pollution-prediction-model
Repo https://github.com/vanduc103/air_analysis_v1
Framework tf
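
The hybrid model combines three encoders, one per data view, before a shared prediction head. A schematic PyTorch sketch (all sizes, and merging by concatenation, are assumptions):

```python
import torch
import torch.nn as nn

class HybridAQModel(nn.Module):
    """CNN for the image-like pollution grid, LSTM for the pollution
    time series, MLP for impact factors such as weather; features are
    concatenated and mapped to a single pollution value."""
    def __init__(self, t_feat=8, w_feat=6, d=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(8 * 16, d),
        )
        self.lstm = nn.LSTM(t_feat, d, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(w_feat, d), nn.ReLU())
        self.head = nn.Linear(3 * d, 1)

    def forward(self, grid_img, series, weather):
        s = self.cnn(grid_img)         # (B, 1, H, W)    -> (B, d)
        t, _ = self.lstm(series)       # (B, T, t_feat)  -> (B, T, d)
        w = self.mlp(weather)          # (B, w_feat)     -> (B, d)
        return self.head(torch.cat([s, t[:, -1], w], dim=-1))
```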

A Survey of Unsupervised Deep Domain Adaptation

Title A Survey of Unsupervised Deep Domain Adaptation
Authors Garrett Wilson, Diane J. Cook
Abstract Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. In response to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain, with the goal of performing well at test time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels. This survey compares these approaches by examining alternative methods, their unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
Tasks Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation
Published 2018-12-06
URL https://arxiv.org/abs/1812.02849v3
PDF https://arxiv.org/pdf/1812.02849v3.pdf
PWC https://paperswithcode.com/paper/adversarial-transfer-learning
Repo https://github.com/zhaoxin94/awsome-domain-adaptation
Framework pytorch
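
As a concrete taste of the adversarial family the survey covers, the gradient reversal layer from DANN (Ganin & Lempitsky, 2015) is a canonical single-source building block: identity in the forward pass, negated and scaled gradient in the backward pass, which pits the feature extractor against a domain classifier. This illustrates one surveyed technique, not a contribution of the survey itself:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; gradients are negated and scaled by `lam` on
    the way back, so upstream features are trained to *fool* a
    downstream domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```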

Hierarchical interpretations for neural network predictions

Title Hierarchical interpretations for neural network predictions
Authors Chandan Singh, W. James Murdoch, Bin Yu
Abstract Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN’s outputs. We also find that ACD’s hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.
Tasks Feature Importance, Interpretable Machine Learning
Published 2018-06-14
URL http://arxiv.org/abs/1806.05337v2
PDF http://arxiv.org/pdf/1806.05337v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-interpretations-for-neural
Repo https://github.com/csinva/hierarchical-dnn-interpretations
Framework pytorch

Backdrop: Stochastic Backpropagation

Title Backdrop: Stochastic Backpropagation
Authors Siavash Golkar, Kyle Cranmer
Abstract We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01337v1
PDF http://arxiv.org/pdf/1806.01337v1.pdf
PWC https://paperswithcode.com/paper/backdrop-stochastic-backpropagation
Repo https://github.com/dexgen/backdrop
Framework pytorch
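
The masking-layer mechanism maps directly onto a custom autograd function: exact identity in the forward pass, a random Bernoulli mask applied only to the backward gradient. Rescaling the surviving gradient by the keep probability, as below, preserves its expectation but is an assumption rather than the paper's stated choice:

```python
import torch

class BackdropMask(torch.autograd.Function):
    """Identity forward; element-wise random gradient masking backward."""
    @staticmethod
    def forward(ctx, x, keep_prob):
        ctx.keep_prob = keep_prob
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        mask = torch.rand_like(grad_output) < ctx.keep_prob
        return grad_output * mask / ctx.keep_prob, None

def backdrop(x, keep_prob=0.5):
    """Insert after any layer to get stochastic gradients at that scale."""
    return BackdropMask.apply(x, keep_prob)
```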

Regularization by Denoising: Clarifications and New Interpretations

Title Regularization by Denoising: Clarifications and New Interpretations
Authors Edward T. Reehorst, Philip Schniter
Abstract Regularization by Denoising (RED), as recently proposed by Romano, Elad, and Milanfar, is a powerful image-recovery framework that aims to minimize an explicit regularization objective constructed from a plug-in image-denoising function. Experimental evidence suggests that the RED algorithms are state-of-the-art. We claim, however, that explicit regularization does not explain the RED algorithms. In particular, we show that many of the expressions in the paper by Romano et al. hold only when the denoiser has a symmetric Jacobian, and we demonstrate that such symmetry does not occur with practical denoisers such as non-local means, BM3D, TNRD, and DnCNN. To explain the RED algorithms, we propose a new framework called Score-Matching by Denoising (SMD), which aims to match a “score” (i.e., the gradient of a log-prior). We then show tight connections between SMD, kernel density estimation, and constrained minimum mean-squared error denoising. Furthermore, we interpret the RED algorithms from Romano et al. and propose new algorithms with acceleration and convergence guarantees. Finally, we show that the RED algorithms seek a consensus equilibrium solution, which facilitates a comparison to plug-and-play ADMM.
Tasks Denoising, Density Estimation, Image Denoising
Published 2018-06-06
URL http://arxiv.org/abs/1806.02296v4
PDF http://arxiv.org/pdf/1806.02296v4.pdf
PWC https://paperswithcode.com/paper/regularization-by-denoising-clarifications
Repo https://github.com/edward-reehorst/On_RED
Framework none
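
The RED machinery reduces to iterations built on the regularizer gradient λ(x − f(x)), where f is the plug-in denoiser; as the abstract notes, this expression is only a true gradient when f has a symmetric Jacobian. A gradient-descent sketch for a linear forward model (step size and initialization are illustrative):

```python
import numpy as np

def red_gradient_descent(y, H, denoise, lam=0.1, mu=0.1, n_iter=200):
    """Minimize 0.5*||y - Hx||^2 + lam * RED(x) by gradient descent,
    using lam * (x - denoise(x)) as the regularizer gradient.

    y: (m,) measurements, H: (m, n) forward matrix,
    denoise: function mapping R^n -> R^n (e.g., a call into BM3D).
    """
    x = H.T @ y                                   # crude initialization
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y) + lam * (x - denoise(x))
        x = x - mu * grad
    return x
```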

Compressed Sensing with Deep Image Prior and Learned Regularization

Title Compressed Sensing with Deep Image Prior and Learned Regularization
Authors Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis
Abstract We propose a novel method for compressed sensing recovery using untrained deep generative models. Our method is based on the recently proposed Deep Image Prior (DIP), wherein the convolutional weights of the network are optimized to match the observed measurements. We show that this approach can be applied to solve any differentiable linear inverse problem, outperforming previous unlearned methods. Unlike various learned approaches based on generative models, our method does not require pre-training over large datasets. We further introduce a novel learned regularization technique, which incorporates prior information on the network weights. This reduces reconstruction error, especially for noisy measurements. Finally, we prove that single-layer DIP networks with constant fraction over-parameterization will perfectly fit any signal through gradient descent, despite being a non-convex problem. This theoretical result provides justification for early stopping.
Tasks
Published 2018-06-17
URL https://arxiv.org/abs/1806.06438v3
PDF https://arxiv.org/pdf/1806.06438v3.pdf
PWC https://paperswithcode.com/paper/compressed-sensing-with-deep-image-prior-and
Repo https://github.com/davevanveen/compsensing_dip
Framework pytorch
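
The recovery loop fits the weights of an untrained generator so that its output, pushed through the measurement operator, matches the observations; no pre-training is involved. A toy sketch with a small MLP standing in for the paper's convolutional DIP network, and with the proposed learned regularization term omitted:

```python
import torch
import torch.nn as nn

def dip_recover(A, y, n_pixels, steps=2000, lr=1e-3):
    """Compressed sensing with a deep image prior (sketch).

    A: (m, n_pixels) measurement matrix (torch tensor), y: (m,) tensor.
    Optimizes generator weights so A @ G(z) ≈ y, with z held fixed.
    Early stopping (fewer steps) acts as implicit regularization.
    """
    z = torch.randn(1, 64)                           # fixed random input
    G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                      nn.Linear(256, n_pixels), nn.Sigmoid())
    opt = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = G(z).squeeze(0)                          # candidate signal
        loss = ((A @ x - y) ** 2).sum()              # measurement misfit
        loss.backward()
        opt.step()
    return G(z).squeeze(0).detach()
```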