October 21, 2019

3003 words 15 mins read

Paper Group AWR 37

Joint Multilingual Supervision for Cross-lingual Entity Linking. Another Diversity-Promoting Objective Function for Neural Dialogue Generation. Importance of Search and Evaluation Strategies in Neural Dialogue Modeling. RISE: Randomized Input Sampling for Explanation of Black-box Models. A deep learning architecture to detect events in EEG signals …

Joint Multilingual Supervision for Cross-lingual Entity Linking

Title Joint Multilingual Supervision for Cross-lingual Entity Linking
Authors Shyam Upadhyay, Nitish Gupta, Dan Roth
Abstract Cross-lingual Entity Linking (XEL) aims to ground entity mentions written in any language to an English Knowledge Base (KB), such as Wikipedia. XEL for most languages is challenging, owing to the limited availability of resources for supervision. We address this challenge by developing the first XEL approach that combines supervision from multiple languages jointly. This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. Extensive evaluation on three benchmark datasets across 8 languages shows that our approach significantly improves over the current state-of-the-art. We also provide analyses in two limited-resource settings: (a) a zero-shot setting, when no supervision in the target language is available, and (b) a low-resource setting, when some supervision in the target language is available. Our analysis provides insights into the limitations of zero-shot XEL approaches in realistic scenarios, and shows the value of joint supervision in low-resource settings.
Tasks Cross-Lingual Entity Linking, Entity Linking
Published 2018-09-20
URL http://arxiv.org/abs/1809.07657v1
PDF http://arxiv.org/pdf/1809.07657v1.pdf
PWC https://paperswithcode.com/paper/joint-multilingual-supervision-for-cross
Repo https://github.com/shyamupa/xelms
Framework none
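
The heart of the approach is pooling supervision from several languages into a single linking model. A minimal sketch of that joint training stream, assuming per-language example lists grounded in one English KB (the `mixed_batches` helper and uniform shuffling are illustrative; the paper may weight or sample languages differently):

```python
import random

def mixed_batches(datasets, batch_size=32, seed=0):
    """Pool (mention, gold_entity) examples from several languages into
    one shuffled training stream for a single entity linking model.

    datasets: dict mapping language code -> list of training examples,
    all grounded in the same English KB. Uniform shuffling is an
    assumption, not the paper's exact sampling scheme.
    """
    rng = random.Random(seed)
    pool = [ex for examples in datasets.values() for ex in examples]
    rng.shuffle(pool)
    for i in range(0, len(pool), batch_size):
        yield pool[i:i + batch_size]
```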

Another Diversity-Promoting Objective Function for Neural Dialogue Generation

Title Another Diversity-Promoting Objective Function for Neural Dialogue Generation
Authors Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
Abstract Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity. The most likely reason for this problem is Maximum Likelihood Estimation (MLE) with Softmax Cross-Entropy (SCE) loss. MLE trains models to generate the most frequent responses from enormous generation candidates, although in actual dialogues there are various responses based on the context. In this paper, we propose a new objective function called Inverse Token Frequency (ITF) loss, which scales the loss down for frequent token classes and up for rare token classes. This function encourages the model to generate rare tokens rather than frequent tokens. It does not complicate the model, and its training is stable because we only replace the objective function. On the OpenSubtitles dialogue dataset, our loss model establishes a state-of-the-art DIST-1 (unigram diversity) score of 7.56 while maintaining a good BLEU-1 score. On a Japanese Twitter replies dataset, our loss model achieves a DIST-1 score comparable to the ground truth.
Tasks Dialogue Generation
Published 2018-11-20
URL http://arxiv.org/abs/1811.08100v2
PDF http://arxiv.org/pdf/1811.08100v2.pdf
PWC https://paperswithcode.com/paper/another-diversity-promoting-objective
Repo https://github.com/reppy4620/Dialog
Framework pytorch
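
The abstract pins down only the shape of the ITF loss: smaller loss for frequent token classes, larger for rare ones. A PyTorch sketch under that reading, where the per-class weight is a smoothed reciprocal of corpus counts (the exact scaling used in the paper may differ):

```python
import torch
import torch.nn.functional as F

def itf_weights(token_counts, smoothing=1.0):
    """Per-class weights proportional to inverse token frequency.

    token_counts: 1-D tensor of raw corpus counts per vocabulary id.
    A plain smoothed reciprocal is assumed here; normalization keeps
    the average weight at 1 so the loss scale stays comparable to SCE.
    """
    w = 1.0 / (token_counts.float() + smoothing)
    return w / w.mean()

def itf_loss(logits, targets, weights):
    """Softmax cross-entropy where each target class is rescaled by its
    ITF weight: rare tokens contribute more, frequent tokens less.

    logits: (batch, vocab), targets: (batch,), weights: (vocab,)
    """
    return F.cross_entropy(logits, targets, weight=weights)
```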

Importance of Search and Evaluation Strategies in Neural Dialogue Modeling

Title Importance of Search and Evaluation Strategies in Neural Dialogue Modeling
Authors Ilia Kulikov, Alexander H. Miller, Kyunghyun Cho, Jason Weston
Abstract We investigate the impact of search strategies in neural dialogue modeling. We first compare two standard search algorithms, greedy and beam search, as well as our newly proposed iterative beam search which produces a more diverse set of candidate responses. We evaluate these strategies in realistic full conversations with humans and propose a model-based Bayesian calibration to address annotator bias. These conversations are analyzed using two automatic metrics: log-probabilities assigned by the model and utterance diversity. Our experiments reveal that better search algorithms lead to higher rated conversations. However, finding the optimal selection mechanism to choose from a more diverse set of candidates is still an open question.
Tasks Calibration, Dialogue Generation
Published 2018-11-02
URL https://arxiv.org/abs/1811.00907v3
PDF https://arxiv.org/pdf/1811.00907v3.pdf
PWC https://paperswithcode.com/paper/importance-of-a-search-strategy-in-neural
Repo https://github.com/nyu-dl/dl4dial-bayesian-calibration
Framework pytorch
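
For context, the greedy and beam search baselines operate on the model's per-step distribution; plain beam search can be sketched as below. The proposed iterative beam search re-runs this procedure while excluding hypotheses explored in earlier iterations to diversify the candidate set, which is omitted here:

```python
def beam_search(step_fn, bos, eos, beam_size=5, max_len=20):
    """Plain beam search over a generic step function.

    step_fn(prefix) -> iterable of (token, log_prob) continuations.
    Returns hypotheses sorted by total log-probability. Length
    normalization and other practical tweaks are left out.
    """
    beams = [([bos], 0.0)]          # (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in step_fn(prefix):
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
            if len(beams) == beam_size:
                break
        if not beams:
            break
    return sorted(finished + beams, key=lambda c: c[1], reverse=True)
```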

RISE: Randomized Input Sampling for Explanation of Black-box Models

Title RISE: Randomized Input Sampling for Explanation of Black-box Models
Authors Vitali Petsiuk, Abir Das, Kate Saenko
Abstract Deep neural networks are being used increasingly to automate data analysis and decision making, yet their decision-making process is largely unclear and is difficult to explain to the end users. In this paper, we address the problem of Explainable AI for deep neural networks that take images as input and output a class probability. We propose an approach called RISE that generates an importance map indicating how salient each pixel is for the model’s prediction. In contrast to white-box approaches that estimate pixel importance using gradients or other internal network state, RISE works on black-box models. It estimates importance empirically by probing the model with randomly masked versions of the input image and obtaining the corresponding outputs. We compare our approach to state-of-the-art importance extraction methods using both an automatic deletion/insertion metric and a pointing metric based on human-annotated object segments. Extensive experiments on several benchmark datasets show that our approach matches or exceeds the performance of other methods, including white-box approaches. Project page: http://cs-people.bu.edu/vpetsiuk/rise/
Tasks Decision Making
Published 2018-06-19
URL http://arxiv.org/abs/1806.07421v3
PDF http://arxiv.org/pdf/1806.07421v3.pdf
PWC https://paperswithcode.com/paper/rise-randomized-input-sampling-for
Repo https://github.com/eclique/RISE
Framework pytorch
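
RISE's mechanism is fully black-box: sample random low-resolution binary masks, upsample them over the image, and average the masks weighted by the model's output probability. A compact NumPy sketch (mask upsampling is simplified here; the paper additionally uses bilinear smoothing and random shifts):

```python
import numpy as np

def rise_saliency(model, image, n_masks=4000, p=0.5, cell=7):
    """RISE-style saliency map for a black-box model.

    model(img) -> probability of the target class (scalar);
    image: (H, W, C) array in [0, 1].
    """
    H, W = image.shape[:2]
    rep_h, rep_w = H // cell + 1, W // cell + 1
    sal = np.zeros((H, W))
    for _ in range(n_masks):
        grid = (np.random.rand(cell, cell) < p).astype(np.float64)
        # nearest-neighbor upsampling of the low-res grid to image size
        mask = np.kron(grid, np.ones((rep_h, rep_w)))[:H, :W]
        sal += model(image * mask[..., None]) * mask
    return sal / (n_masks * p)   # Monte Carlo estimate of importance
```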

A deep learning architecture to detect events in EEG signals during sleep

Title A deep learning architecture to detect events in EEG signals during sleep
Authors Stanislas Chambon, Valentin Thorey, Pierrick J. Arnal, Emmanuel Mignot, Alexandre Gramfort
Abstract Electroencephalography (EEG) during sleep is used by clinicians to evaluate various neurological disorders. In sleep medicine, it is relevant to detect macro-events (>10 s) such as sleep stages, and micro-events (<2 s) such as spindles and K-complexes. Annotating such events requires a trained sleep expert, a time-consuming and tedious process with large inter-scorer variability. Automatic algorithms have been developed to detect various types of events, but these are event-specific. We propose a deep learning method that jointly predicts locations, durations and types of events in EEG time series. It relies on a convolutional neural network that builds a feature representation from raw EEG signals. Numerical experiments demonstrate the efficiency of this new approach on various event detection tasks compared to current state-of-the-art, event-specific algorithms.
Tasks EEG, Time Series
Published 2018-07-11
URL http://arxiv.org/abs/1807.05981v1
PDF http://arxiv.org/pdf/1807.05981v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-architecture-to-detect-events
Repo https://github.com/Dreem-Organization/dosed
Framework pytorch
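
The joint prediction of locations, durations and types maps naturally onto a detection-style network with two heads over a shared feature map. A toy sketch of that output structure, assuming an anchor/default-event design (layer sizes and the anchor head are illustrative, not the paper's exact architecture):

```python
import torch.nn as nn

class EventDetector(nn.Module):
    """Sketch: jointly predict event class and (center, duration)
    offsets at a fixed number of default locations in a raw EEG window.
    """
    def __init__(self, n_channels=1, n_classes=3, n_anchors=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(n_channels, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(n_anchors),
        )
        self.cls = nn.Conv1d(32, n_classes + 1, 1)  # +1 "no event" class
        self.loc = nn.Conv1d(32, 2, 1)              # (center, duration)

    def forward(self, x):                 # x: (batch, channels, time)
        h = self.backbone(x)              # (batch, 32, n_anchors)
        return self.cls(h), self.loc(h)   # scores and offsets per anchor
```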

Learning to Decompose and Disentangle Representations for Video Prediction

Title Learning to Decompose and Disentangle Representations for Video Prediction
Authors Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles
Abstract Our goal is to predict future video frames given a sequence of input frames. Despite large amounts of video data, this remains a challenging task because of the high-dimensionality of video frames. We address this challenge by proposing the Decompositional Disentangled Predictive Auto-Encoder (DDPAE), a framework that combines structured probabilistic models and deep networks to automatically (i) decompose the high-dimensional video that we aim to predict into components, and (ii) disentangle each component to have low-dimensional temporal dynamics that are easier to predict. Crucially, with an appropriately specified generative model of video frames, our DDPAE is able to learn both the latent decomposition and disentanglement without explicit supervision. For the Moving MNIST dataset, we show that DDPAE is able to recover the underlying components (individual digits) and disentanglement (appearance and location) as we would intuitively do. We further demonstrate that DDPAE can be applied to the Bouncing Balls dataset involving complex interactions between multiple objects to predict the video frame directly from the pixels and recover physical states without explicit supervision.
Tasks Predict Future Video Frames, Video Prediction
Published 2018-06-11
URL http://arxiv.org/abs/1806.04166v2
PDF http://arxiv.org/pdf/1806.04166v2.pdf
PWC https://paperswithcode.com/paper/learning-to-decompose-and-disentangle
Repo https://github.com/jthsieh/DDPAE-video-prediction
Framework pytorch

Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences

Title Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences
Authors Louis Lettry, Kenneth Vanhoey, Luc van Gool
Abstract Machine-learning-based Single Image Intrinsic Decomposition (SIID) methods decompose a captured scene into its albedo and shading images by using the knowledge of a large set of known and realistic ground-truth decompositions. Collecting and annotating such a dataset is an approach that cannot scale to sufficient variety and realism. We free ourselves from this limitation by training on unannotated images. Our method leverages the observation that two images of the same scene but with different lighting provide useful information on their intrinsic properties: by definition, albedo is invariant to lighting conditions, and cross-combining the estimated albedo of a first image with the estimated shading of a second one should lead back to the second one’s input image. We transcribe this relationship into a Siamese training scheme for a deep convolutional neural network that decomposes a single image into albedo and shading. The Siamese setting allows us to introduce a new loss function including such cross-combinations, and to train solely on (time-lapse) images, discarding the need for any ground-truth annotations. As a result, our method has the good properties of i) taking advantage of the time-varying information of image sequences in the (pre-computed) training step, ii) not requiring ground-truth data to train on, and iii) being able to decompose single images of unseen scenes at runtime. To demonstrate and evaluate our work, we additionally propose a new rendered dataset containing illumination-varying scenes and a set of quantitative metrics to evaluate SIID algorithms. Despite its unsupervised nature, our results compete with state-of-the-art methods, including supervised and non-data-driven methods.
Tasks
Published 2018-03-02
URL http://arxiv.org/abs/1803.00805v2
PDF http://arxiv.org/pdf/1803.00805v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-single-image-intrinsic
Repo https://github.com/kvanhoey/UnsupervisedIntrinsicDecomposition
Framework tf
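
The cross-combination idea translates directly into a loss: the albedo of one view multiplied by the shading of the other must reproduce the other view. A hedged sketch of the Siamese objective (term weights and the paper's additional regularizers are omitted):

```python
def cross_combination_loss(net, img1, img2):
    """Siamese SIID loss for two tensors imaging one scene under
    different lighting. net(img) -> (albedo, shading) torch tensors
    with img ≈ albedo * shading.
    """
    a1, s1 = net(img1)
    a2, s2 = net(img2)
    recon = (a1 * s1 - img1).pow(2).mean() + (a2 * s2 - img2).pow(2).mean()
    cross = (a1 * s2 - img2).pow(2).mean() + (a2 * s1 - img1).pow(2).mean()
    same_albedo = (a1 - a2).pow(2).mean()   # albedo is lighting-invariant
    return recon + cross + same_albedo
```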

End-to-end Audiovisual Speech Recognition

Title End-to-end Audiovisual Speech Recognition
Authors Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Feipeng Cai, Georgios Tzimiropoulos, Maja Pantic
Abstract Several end-to-end deep learning approaches have been presented recently which extract either audio or visual features from the input images or audio signals and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs). To the best of our knowledge, this is the first audiovisual fusion model which simultaneously learns to extract features directly from the image pixels and audio waveforms and performs within-context word recognition on a large publicly available dataset (LRW). The model consists of two streams, one for each modality, which extract features directly from mouth regions and raw waveforms. The temporal dynamics in each stream/modality are modeled by a 2-layer BGRU, and the fusion of multiple streams/modalities takes place via another 2-layer BGRU. A slight improvement in the classification rate over an end-to-end audio-only and MFCC-based model is reported under clean audio conditions and low levels of noise. In the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.
Tasks Speech Recognition
Published 2018-02-18
URL http://arxiv.org/abs/1802.06424v2
PDF http://arxiv.org/pdf/1802.06424v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-audiovisual-speech-recognition
Repo https://github.com/tstafylakis/Lipreading-ResNet
Framework pytorch
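
The fusion topology is spelled out in the abstract: one stream per modality, a 2-layer BGRU per stream, and another 2-layer BGRU for fusion. A skeletal PyTorch version, with the ResNet front-ends abstracted into per-frame feature inputs (feature sizes and the mean-pooled readout are illustrative assumptions):

```python
import torch
import torch.nn as nn

class AVFusion(nn.Module):
    def __init__(self, d_audio=40, d_video=256, d_hid=128, n_words=500):
        super().__init__()
        self.gru_a = nn.GRU(d_audio, d_hid, 2, bidirectional=True, batch_first=True)
        self.gru_v = nn.GRU(d_video, d_hid, 2, bidirectional=True, batch_first=True)
        # each bidirectional stream emits 2*d_hid features per step
        self.fuse = nn.GRU(4 * d_hid, d_hid, 2, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * d_hid, n_words)

    def forward(self, audio, video):   # (B, T, d_audio), (B, T, d_video)
        ha, _ = self.gru_a(audio)      # (B, T, 2*d_hid)
        hv, _ = self.gru_v(video)      # (B, T, 2*d_hid)
        hf, _ = self.fuse(torch.cat([ha, hv], dim=-1))
        return self.out(hf.mean(dim=1))   # word logits from pooled states
```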

Modeling urbanization patterns with generative adversarial networks

Title Modeling urbanization patterns with generative adversarial networks
Authors Adrian Albert, Emanuele Strano, Jasleen Kaur, Marta Gonzalez
Abstract In this study we propose a new method to simulate hyper-realistic urban patterns using Generative Adversarial Networks trained with a global urban land-use inventory. We generated a synthetic urban “universe” that qualitatively reproduces the complex spatial organization observed in global urban patterns, while being able to quantitatively recover certain key high-level urban spatial metrics.
Tasks
Published 2018-01-08
URL http://arxiv.org/abs/1801.02710v1
PDF http://arxiv.org/pdf/1801.02710v1.pdf
PWC https://paperswithcode.com/paper/modeling-urbanization-patterns-with
Repo https://github.com/adrianalbert/citygan
Framework pytorch

Real-time Air Pollution prediction model based on Spatiotemporal Big data

Title Real-time Air Pollution prediction model based on Spatiotemporal Big data
Authors V. Duc Le, Sang Kyun Cha
Abstract Air pollution is one of the greatest concerns for urban areas. Many countries have constructed monitoring stations to collect pollution values hourly. Recently, a research project in Daegu, Korea has pursued real-time air-quality monitoring via sensors installed on taxis running across the whole city. The collected data are huge (1-second intervals) and both spatial and temporal in form. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model that applies a Convolutional Neural Network (CNN) to the image-like spatial distribution of air pollution. To handle the temporal information in the data, we introduce a combination of a Long Short-Term Memory (LSTM) unit for time series data and a neural network model for other air pollution impact factors, such as weather conditions, to build a hybrid prediction model. This model is simple in architecture but still delivers good prediction ability.
Tasks Air Pollution Prediction, Time Series
Published 2018-04-05
URL http://arxiv.org/abs/1805.00432v3
PDF http://arxiv.org/pdf/1805.00432v3.pdf
PWC https://paperswithcode.com/paper/real-time-air-pollution-prediction-model
Repo https://github.com/vanduc103/air_analysis_v1
Framework tf
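
The hybrid model combines three encoders, one per data view, before a shared prediction head. A schematic PyTorch sketch (all sizes, and merging by concatenation, are assumptions):

```python
import torch
import torch.nn as nn

class HybridAQModel(nn.Module):
    """CNN for the image-like pollution grid, LSTM for the pollution
    time series, MLP for impact factors such as weather; features are
    concatenated and mapped to a single pollution value."""
    def __init__(self, t_feat=8, w_feat=6, d=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(8 * 16, d),
        )
        self.lstm = nn.LSTM(t_feat, d, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(w_feat, d), nn.ReLU())
        self.head = nn.Linear(3 * d, 1)

    def forward(self, grid_img, series, weather):
        s = self.cnn(grid_img)         # (B, 1, H, W)    -> (B, d)
        t, _ = self.lstm(series)       # (B, T, t_feat)  -> (B, T, d)
        w = self.mlp(weather)          # (B, w_feat)     -> (B, d)
        return self.head(torch.cat([s, t[:, -1], w], dim=-1))
```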

A Survey of Unsupervised Deep Domain Adaptation

Title A Survey of Unsupervised Deep Domain Adaptation
Authors Garrett Wilson, Diane J. Cook
Abstract Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. In response to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain, with the goal of performing well at test time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels. This survey compares these approaches by examining alternative methods, their unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
Tasks Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation
Published 2018-12-06
URL https://arxiv.org/abs/1812.02849v3
PDF https://arxiv.org/pdf/1812.02849v3.pdf
PWC https://paperswithcode.com/paper/adversarial-transfer-learning
Repo https://github.com/zhaoxin94/awsome-domain-adaptation
Framework pytorch
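
As a concrete taste of the adversarial family the survey covers, the gradient reversal layer from DANN (Ganin & Lempitsky, 2015) is a canonical single-source building block: identity in the forward pass, negated and scaled gradient in the backward pass, which pits the feature extractor against a domain classifier. This illustrates one surveyed technique, not a contribution of the survey itself:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; gradients are negated and scaled by `lam` on
    the way back, so upstream features are trained to *fool* a
    downstream domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```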

Hierarchical interpretations for neural network predictions

Title Hierarchical interpretations for neural network predictions
Authors Chandan Singh, W. James Murdoch, Bin Yu
Abstract Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN’s outputs. We also find that ACD’s hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.
Tasks Feature Importance, Interpretable Machine Learning
Published 2018-06-14
URL http://arxiv.org/abs/1806.05337v2
PDF http://arxiv.org/pdf/1806.05337v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-interpretations-for-neural
Repo https://github.com/csinva/hierarchical-dnn-interpretations
Framework pytorch

Backdrop: Stochastic Backpropagation

Title Backdrop: Stochastic Backpropagation
Authors Siavash Golkar, Kyle Cranmer
Abstract We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01337v1
PDF http://arxiv.org/pdf/1806.01337v1.pdf
PWC https://paperswithcode.com/paper/backdrop-stochastic-backpropagation
Repo https://github.com/dexgen/backdrop
Framework pytorch
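
The masking-layer mechanism maps directly onto a custom autograd function: exact identity in the forward pass, a random Bernoulli mask applied only to the backward gradient. Rescaling the surviving gradient by the keep probability, as below, preserves its expectation but is an assumption rather than the paper's stated choice:

```python
import torch

class BackdropMask(torch.autograd.Function):
    """Identity forward; element-wise random gradient masking backward."""
    @staticmethod
    def forward(ctx, x, keep_prob):
        ctx.keep_prob = keep_prob
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        mask = torch.rand_like(grad_output) < ctx.keep_prob
        return grad_output * mask / ctx.keep_prob, None

def backdrop(x, keep_prob=0.5):
    """Insert after any layer to get stochastic gradients at that scale."""
    return BackdropMask.apply(x, keep_prob)
```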

Regularization by Denoising: Clarifications and New Interpretations

Title Regularization by Denoising: Clarifications and New Interpretations
Authors Edward T. Reehorst, Philip Schniter
Abstract Regularization by Denoising (RED), as recently proposed by Romano, Elad, and Milanfar, is a powerful image-recovery framework that aims to minimize an explicit regularization objective constructed from a plug-in image-denoising function. Experimental evidence suggests that the RED algorithms are state-of-the-art. We claim, however, that explicit regularization does not explain the RED algorithms. In particular, we show that many of the expressions in the paper by Romano et al. hold only when the denoiser has a symmetric Jacobian, and we demonstrate that such symmetry does not occur with practical denoisers such as non-local means, BM3D, TNRD, and DnCNN. To explain the RED algorithms, we propose a new framework called Score-Matching by Denoising (SMD), which aims to match a “score” (i.e., the gradient of a log-prior). We then show tight connections between SMD, kernel density estimation, and constrained minimum mean-squared error denoising. Furthermore, we interpret the RED algorithms from Romano et al. and propose new algorithms with acceleration and convergence guarantees. Finally, we show that the RED algorithms seek a consensus equilibrium solution, which facilitates a comparison to plug-and-play ADMM.
Tasks Denoising, Density Estimation, Image Denoising
Published 2018-06-06
URL http://arxiv.org/abs/1806.02296v4
PDF http://arxiv.org/pdf/1806.02296v4.pdf
PWC https://paperswithcode.com/paper/regularization-by-denoising-clarifications
Repo https://github.com/edward-reehorst/On_RED
Framework none
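
The RED machinery reduces to iterations built on the regularizer gradient λ(x − f(x)), where f is the plug-in denoiser; as the abstract notes, this expression is only a true gradient when f has a symmetric Jacobian. A gradient-descent sketch for a linear forward model (step size and initialization are illustrative):

```python
import numpy as np

def red_gradient_descent(y, H, denoise, lam=0.1, mu=0.1, n_iter=200):
    """Minimize 0.5*||y - Hx||^2 + lam * RED(x) by gradient descent,
    using lam * (x - denoise(x)) as the regularizer gradient.

    y: (m,) measurements, H: (m, n) forward matrix,
    denoise: function mapping R^n -> R^n (e.g., a call into BM3D).
    """
    x = H.T @ y                                   # crude initialization
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y) + lam * (x - denoise(x))
        x = x - mu * grad
    return x
```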

Compressed Sensing with Deep Image Prior and Learned Regularization

Title Compressed Sensing with Deep Image Prior and Learned Regularization
Authors Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis
Abstract We propose a novel method for compressed sensing recovery using untrained deep generative models. Our method is based on the recently proposed Deep Image Prior (DIP), wherein the convolutional weights of the network are optimized to match the observed measurements. We show that this approach can be applied to solve any differentiable linear inverse problem, outperforming previous unlearned methods. Unlike various learned approaches based on generative models, our method does not require pre-training over large datasets. We further introduce a novel learned regularization technique, which incorporates prior information on the network weights. This reduces reconstruction error, especially for noisy measurements. Finally, we prove that single-layer DIP networks with constant fraction over-parameterization will perfectly fit any signal through gradient descent, despite being a non-convex problem. This theoretical result provides justification for early stopping.
Tasks
Published 2018-06-17
URL https://arxiv.org/abs/1806.06438v3
PDF https://arxiv.org/pdf/1806.06438v3.pdf
PWC https://paperswithcode.com/paper/compressed-sensing-with-deep-image-prior-and
Repo https://github.com/davevanveen/compsensing_dip
Framework pytorch
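
The recovery loop fits the weights of an untrained generator so that its output, pushed through the measurement operator, matches the observations; no pre-training is involved. A toy sketch with a small MLP standing in for the paper's convolutional DIP network, and with the proposed learned regularization term omitted:

```python
import torch
import torch.nn as nn

def dip_recover(A, y, n_pixels, steps=2000, lr=1e-3):
    """Compressed sensing with a deep image prior (sketch).

    A: (m, n_pixels) measurement matrix (torch tensor), y: (m,) tensor.
    Optimizes generator weights so A @ G(z) ≈ y, with z held fixed.
    Early stopping (fewer steps) acts as implicit regularization.
    """
    z = torch.randn(1, 64)                           # fixed random input
    G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                      nn.Linear(256, n_pixels), nn.Sigmoid())
    opt = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = G(z).squeeze(0)                          # candidate signal
        loss = ((A @ x - y) ** 2).sum()              # measurement misfit
        loss.backward()
        opt.step()
    return G(z).squeeze(0).detach()
```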