Paper Group AWR 37
Joint Multilingual Supervision for Cross-lingual Entity Linking
Title | Joint Multilingual Supervision for Cross-lingual Entity Linking |
Authors | Shyam Upadhyay, Nitish Gupta, Dan Roth |
Abstract | Cross-lingual Entity Linking (XEL) aims to ground entity mentions written in any language to an English Knowledge Base (KB), such as Wikipedia. XEL for most languages is challenging, owing to limited availability of resources as supervision. We address this challenge by developing the first XEL approach that combines supervision from multiple languages jointly. This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. Extensive evaluation on three benchmark datasets across 8 languages shows that our approach significantly improves over the current state-of-the-art. We also provide analyses in two limited-resource settings: (a) the zero-shot setting, where no supervision in the target language is available, and (b) the low-resource setting, where some supervision in the target language is available. Our analysis provides insights into the limitations of zero-shot XEL approaches in realistic scenarios, and shows the value of joint supervision in low-resource settings. |
Tasks | Cross-Lingual Entity Linking, Entity Linking |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07657v1 |
PDF | http://arxiv.org/pdf/1809.07657v1.pdf |
PWC | https://paperswithcode.com/paper/joint-multilingual-supervision-for-cross |
Repo | https://github.com/shyamupa/xelms |
Framework | none |
Another Diversity-Promoting Objective Function for Neural Dialogue Generation
Title | Another Diversity-Promoting Objective Function for Neural Dialogue Generation |
Authors | Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura |
Abstract | Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity. The most likely reason for this problem is Maximum Likelihood Estimation (MLE) with Softmax Cross-Entropy (SCE) loss: MLE trains models to generate the most frequent responses out of an enormous set of candidates, although actual dialogues contain a variety of context-dependent responses. In this paper, we propose a new objective function called the Inverse Token Frequency (ITF) loss, which scales the loss down for frequent token classes and up for rare token classes. This encourages the model to generate rare tokens rather than frequent tokens. It does not complicate the model, and training remains stable because we only replace the objective function. On the OpenSubtitles dialogue dataset, our model establishes a state-of-the-art DIST-1 (unigram diversity) score of 7.56 while maintaining a good BLEU-1 score. On a Japanese Twitter reply dataset, it achieves a DIST-1 score comparable to that of the ground truth. |
Tasks | Dialogue Generation |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08100v2 |
PDF | http://arxiv.org/pdf/1811.08100v2.pdf |
PWC | https://paperswithcode.com/paper/another-diversity-promoting-objective |
Repo | https://github.com/reppy4620/Dialog |
Framework | pytorch |
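The ITF loss described in the abstract can be approximated with a frequency-weighted cross-entropy. The sketch below is a minimal illustration, not the authors' exact formulation: token frequencies are counted from a toy corpus, the per-class weight is a smoothed inverse frequency, and the `itf_weights` helper, its normalization, and the padding convention are assumptions made for this example.

```python
import torch
import torch.nn as nn
from collections import Counter

def itf_weights(token_id_sequences, vocab_size, smoothing=1.0):
    """Per-class weights proportional to inverse token frequency (illustrative)."""
    counts = Counter(t for seq in token_id_sequences for t in seq)
    freqs = torch.tensor([counts.get(i, 0) + smoothing for i in range(vocab_size)],
                         dtype=torch.float)
    weights = 1.0 / freqs
    return weights / weights.mean()          # normalize so the average weight is 1

# Usage: plug the weights into a standard cross-entropy loss.
vocab_size = 10
corpus = [[1, 2, 2, 3], [2, 4, 5]]           # toy token-id sequences
weights = itf_weights(corpus, vocab_size)
criterion = nn.CrossEntropyLoss(weight=weights, ignore_index=0)   # 0 = padding id

logits = torch.randn(8, vocab_size)          # (tokens, vocab)
targets = torch.randint(1, vocab_size, (8,)) # gold token ids
loss = criterion(logits, targets)
```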
Importance of Search and Evaluation Strategies in Neural Dialogue Modeling
Title | Importance of Search and Evaluation Strategies in Neural Dialogue Modeling |
Authors | Ilia Kulikov, Alexander H. Miller, Kyunghyun Cho, Jason Weston |
Abstract | We investigate the impact of search strategies in neural dialogue modeling. We compare two standard search algorithms, greedy and beam search, as well as our newly proposed iterative beam search, which produces a more diverse set of candidate responses. We evaluate these strategies in realistic full conversations with humans and propose a model-based Bayesian calibration to address annotator bias. These conversations are analyzed using two automatic metrics: log-probabilities assigned by the model and utterance diversity. Our experiments reveal that better search algorithms lead to higher-rated conversations. However, finding the optimal selection mechanism to choose from a more diverse set of candidates is still an open question. |
Tasks | Calibration, Dialogue Generation |
Published | 2018-11-02 |
URL | https://arxiv.org/abs/1811.00907v3 |
PDF | https://arxiv.org/pdf/1811.00907v3.pdf |
PWC | https://paperswithcode.com/paper/importance-of-a-search-strategy-in-neural |
Repo | https://github.com/nyu-dl/dl4dial-bayesian-calibration |
Framework | pytorch |
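For context on the search strategies compared in the paper, here is a compact sketch of standard beam search over a step-wise model; the iterative beam search proposed in the paper additionally re-runs the search while excluding previously found hypotheses, which is not reproduced here. The `step_fn` interface and the toy model are assumptions for illustration.

```python
import math

def beam_search(step_fn, bos, eos, beam_size=4, max_len=20):
    """Standard beam search; step_fn(prefix) returns {token: log_prob} for the next token."""
    beams = [([bos], 0.0)]          # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:                      # hypothesis already complete
                finished.append((seq, score))
                continue
            for tok, logp in step_fn(seq).items():  # expand with every next token
                candidates.append((seq + [tok], score + logp))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]              # keep the top-scoring hypotheses
    finished.extend(b for b in beams if b[0][-1] == eos)
    return max(finished or beams, key=lambda c: c[1])

# Toy model: prefers token "a", then ends the sequence after a few steps.
def toy_step(prefix):
    if len(prefix) > 3:
        return {"</s>": math.log(0.9), "a": math.log(0.1)}
    return {"a": math.log(0.6), "b": math.log(0.3), "</s>": math.log(0.1)}

print(beam_search(toy_step, bos="<s>", eos="</s>"))
```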
RISE: Randomized Input Sampling for Explanation of Black-box Models
Title | RISE: Randomized Input Sampling for Explanation of Black-box Models |
Authors | Vitali Petsiuk, Abir Das, Kate Saenko |
Abstract | Deep neural networks are being used increasingly to automate data analysis and decision making, yet their decision-making process is largely unclear and is difficult to explain to the end users. In this paper, we address the problem of Explainable AI for deep neural networks that take images as input and output a class probability. We propose an approach called RISE that generates an importance map indicating how salient each pixel is for the model’s prediction. In contrast to white-box approaches that estimate pixel importance using gradients or other internal network state, RISE works on black-box models. It estimates importance empirically by probing the model with randomly masked versions of the input image and obtaining the corresponding outputs. We compare our approach to state-of-the-art importance extraction methods using both an automatic deletion/insertion metric and a pointing metric based on human-annotated object segments. Extensive experiments on several benchmark datasets show that our approach matches or exceeds the performance of other methods, including white-box approaches. Project page: http://cs-people.bu.edu/vpetsiuk/rise/ |
Tasks | Decision Making |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07421v3 |
PDF | http://arxiv.org/pdf/1806.07421v3.pdf |
PWC | https://paperswithcode.com/paper/rise-randomized-input-sampling-for |
Repo | https://github.com/eclique/RISE |
Framework | pytorch |
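The core of RISE is easy to sketch: probe the black-box model with randomly masked copies of the input and accumulate the masks weighted by the resulting class score. The version below is a simplified illustration; it uses nearest-neighbour upsampling of coarse binary masks, whereas the paper uses smoothly upsampled, randomly shifted masks. The `rise_saliency` helper, the toy model, and all hyper-parameters are assumptions for this example.

```python
import numpy as np

def rise_saliency(model, image, target_class, n_masks=1000, grid=7, p_keep=0.5, rng=None):
    """Black-box saliency: average of masks weighted by the score on the masked input."""
    rng = rng or np.random.default_rng(0)
    H, W = image.shape[:2]
    saliency = np.zeros((H, W))
    for _ in range(n_masks):
        # Low-resolution binary mask, upsampled to image size (nearest-neighbour for simplicity).
        coarse = (rng.random((grid, grid)) < p_keep).astype(float)
        cell = int(np.ceil(H / grid)), int(np.ceil(W / grid))
        mask = np.kron(coarse, np.ones(cell))[:H, :W]
        score = model(image * mask[..., None])[target_class]   # probe the black box
        saliency += score * mask
    return saliency / (n_masks * p_keep)

# Toy "model": class 0 score is the mean intensity of the top-left quadrant.
def toy_model(img):
    return np.array([img[:16, :16].mean(), img[16:, 16:].mean()])

img = np.random.rand(32, 32, 3)
sal = rise_saliency(toy_model, img, target_class=0, n_masks=200)
```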
A deep learning architecture to detect events in EEG signals during sleep
Title | A deep learning architecture to detect events in EEG signals during sleep |
Authors | Stanislas Chambon, Valentin Thorey, Pierrick J. Arnal, Emmanuel Mignot, Alexandre Gramfort |
Abstract | Electroencephalography (EEG) during sleep is used by clinicians to evaluate various neurological disorders. In sleep medicine, it is relevant to detect macro-events (> 10 s) such as sleep stages, and micro-events (< 2 s) such as spindles and K-complexes. Annotating such events requires a trained sleep expert, a time-consuming and tedious process with large inter-scorer variability. Automatic algorithms have been developed to detect various types of events, but these are event-specific. We propose a deep learning method that jointly predicts locations, durations and types of events in EEG time series. It relies on a convolutional neural network that builds a feature representation from raw EEG signals. Numerical experiments demonstrate the efficiency of this new approach on various event detection tasks compared to current state-of-the-art, event-specific algorithms. |
Tasks | EEG, Time Series |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.05981v1 |
PDF | http://arxiv.org/pdf/1807.05981v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-learning-architecture-to-detect-events |
Repo | https://github.com/Dreem-Organization/dosed |
Framework | pytorch |
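As a rough illustration of joint event detection on raw EEG, the sketch below follows a single-scale, SSD-style design: a small 1D convolutional encoder followed by a per-position classification head (event types plus a background class) and a (center, duration) regression head. This is a schematic stand-in, not the published architecture; the module name and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class EventDetector1D(nn.Module):
    """Per-position event classifier plus localization regressor over raw EEG."""
    def __init__(self, n_channels=1, n_event_types=2, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, hidden, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=7, stride=2, padding=3), nn.ReLU(),
        )
        # +1 class for "no event" (background).
        self.cls_head = nn.Conv1d(hidden, n_event_types + 1, kernel_size=1)
        self.loc_head = nn.Conv1d(hidden, 2, kernel_size=1)   # (center offset, log duration)

    def forward(self, x):                      # x: (batch, channels, time)
        feats = self.encoder(x)
        return self.cls_head(feats), self.loc_head(feats)

model = EventDetector1D(n_channels=1, n_event_types=2)
eeg = torch.randn(4, 1, 2048)                  # four windows of raw EEG
class_logits, localization = model(eeg)        # shapes (4, 3, 512) and (4, 2, 512)
```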
Learning to Decompose and Disentangle Representations for Video Prediction
Title | Learning to Decompose and Disentangle Representations for Video Prediction |
Authors | Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles |
Abstract | Our goal is to predict future video frames given a sequence of input frames. Despite large amounts of video data, this remains a challenging task because of the high-dimensionality of video frames. We address this challenge by proposing the Decompositional Disentangled Predictive Auto-Encoder (DDPAE), a framework that combines structured probabilistic models and deep networks to automatically (i) decompose the high-dimensional video that we aim to predict into components, and (ii) disentangle each component to have low-dimensional temporal dynamics that are easier to predict. Crucially, with an appropriately specified generative model of video frames, our DDPAE is able to learn both the latent decomposition and disentanglement without explicit supervision. For the Moving MNIST dataset, we show that DDPAE is able to recover the underlying components (individual digits) and disentanglement (appearance and location) as we would intuitively do. We further demonstrate that DDPAE can be applied to the Bouncing Balls dataset involving complex interactions between multiple objects to predict the video frame directly from the pixels and recover physical states without explicit supervision. |
Tasks | Predict Future Video Frames, Video Prediction |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.04166v2 |
PDF | http://arxiv.org/pdf/1806.04166v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-decompose-and-disentangle |
Repo | https://github.com/jthsieh/DDPAE-video-prediction |
Framework | pytorch |
Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences
Title | Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences |
Authors | Louis Lettry, Kenneth Vanhoey, Luc van Gool |
Abstract | Machine learning based Single Image Intrinsic Decomposition (SIID) methods decompose a captured scene into its albedo and shading images by using the knowledge of a large set of known and realistic ground truth decompositions. Collecting and annotating such a dataset is an approach that cannot scale to sufficient variety and realism. We free ourselves from this limitation by training on unannotated images. Our method leverages the observation that two images of the same scene but with different lighting provide useful information on their intrinsic properties: by definition, albedo is invariant to lighting conditions, and cross-combining the estimated albedo of a first image with the estimated shading of a second one should lead back to the second one’s input image. We transcribe this relationship into a siamese training scheme for a deep convolutional neural network that decomposes a single image into albedo and shading. The siamese setting allows us to introduce a new loss function including such cross-combinations, and to train solely on (time-lapse) images, discarding the need for any ground truth annotations. As a result, our method has the desirable properties of i) taking advantage of the time-varying information of image sequences in the (pre-computed) training step, ii) not requiring ground truth data to train on, and iii) being able to decompose single images of unseen scenes at runtime. To demonstrate and evaluate our work, we additionally propose a new rendered dataset containing illumination-varying scenes and a set of quantitative metrics to evaluate SIID algorithms. Despite its unsupervised nature, our results compete with state-of-the-art methods, including supervised and non-data-driven methods. |
Tasks | |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00805v2 |
PDF | http://arxiv.org/pdf/1803.00805v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-deep-single-image-intrinsic |
Repo | https://github.com/kvanhoey/UnsupervisedIntrinsicDecomposition |
Framework | tf |
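The cross-combination idea in the abstract translates directly into a siamese training loss: given two images of the same scene under different lighting, the albedo predicted from one combined with the shading predicted from the other should reconstruct the other image (assuming the usual albedo-times-shading image-formation model). The sketch below is illustrative only; the `TinyDecomposer` network, the extra albedo-consistency term, and the unweighted sum of losses are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class TinyDecomposer(nn.Module):
    """Placeholder network predicting (albedo, shading) from a single image."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.albedo_head = nn.Conv2d(16, 3, 3, padding=1)
        self.shading_head = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, img):
        h = self.backbone(img)
        return torch.sigmoid(self.albedo_head(h)), torch.sigmoid(self.shading_head(h))

def siamese_intrinsic_loss(net, img1, img2):
    """img1, img2: same scene under different lighting, so albedo should match."""
    a1, s1 = net(img1)
    a2, s2 = net(img2)
    recon = nn.functional.mse_loss(a1 * s1, img1) + nn.functional.mse_loss(a2 * s2, img2)
    cross = nn.functional.mse_loss(a1 * s2, img2) + nn.functional.mse_loss(a2 * s1, img1)
    albedo_consistency = nn.functional.mse_loss(a1, a2)
    return recon + cross + albedo_consistency

net = TinyDecomposer()
img1, img2 = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
loss = siamese_intrinsic_loss(net, img1, img2)
loss.backward()
```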
End-to-end Audiovisual Speech Recognition
Title | End-to-end Audiovisual Speech Recognition |
Authors | Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Feipeng Cai, Georgios Tzimiropoulos, Maja Pantic |
Abstract | Several end-to-end deep learning approaches have recently been presented which extract either audio or visual features from the input images or audio signals and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs). To the best of our knowledge, this is the first audiovisual fusion model which simultaneously learns to extract features directly from the image pixels and audio waveforms and performs within-context word recognition on a large publicly available dataset (LRW). The model consists of two streams, one for each modality, which extract features directly from mouth regions and raw waveforms. The temporal dynamics in each stream/modality are modeled by a 2-layer BGRU, and the fusion of multiple streams/modalities takes place via another 2-layer BGRU. A slight improvement in the classification rate over an end-to-end audio-only and MFCC-based model is reported in clean audio conditions and at low levels of noise. In the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models. |
Tasks | Speech Recognition |
Published | 2018-02-18 |
URL | http://arxiv.org/abs/1802.06424v2 |
PDF | http://arxiv.org/pdf/1802.06424v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-audiovisual-speech-recognition |
Repo | https://github.com/tstafylakis/Lipreading-ResNet |
Framework | pytorch |
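A schematic sketch of the two-stream structure described above: per-frame features for each modality are assumed as inputs (standing in for the ResNet front-ends), each stream is modeled by a 2-layer BiGRU, and a 2-layer BiGRU over the concatenated streams feeds a word classifier. All dimensions are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class TwoStreamAVSR(nn.Module):
    def __init__(self, audio_dim=40, video_dim=256, hidden=128, n_words=500):
        super().__init__()
        self.audio_gru = nn.GRU(audio_dim, hidden, num_layers=2,
                                bidirectional=True, batch_first=True)
        self.video_gru = nn.GRU(video_dim, hidden, num_layers=2,
                                bidirectional=True, batch_first=True)
        self.fusion_gru = nn.GRU(4 * hidden, hidden, num_layers=2,
                                 bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, n_words)

    def forward(self, audio_feats, video_feats):   # (batch, time, dim) each, time-aligned
        a, _ = self.audio_gru(audio_feats)
        v, _ = self.video_gru(video_feats)
        fused, _ = self.fusion_gru(torch.cat([a, v], dim=-1))
        return self.classifier(fused[:, -1])       # classify from the last time step

model = TwoStreamAVSR()
logits = model(torch.randn(2, 29, 40), torch.randn(2, 29, 256))   # (2, 500)
```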
Modeling urbanization patterns with generative adversarial networks
Title | Modeling urbanization patterns with generative adversarial networks |
Authors | Adrian Albert, Emanuele Strano, Jasleen Kaur, Marta Gonzalez |
Abstract | In this study we propose a new method to simulate hyper-realistic urban patterns using Generative Adversarial Networks trained with a global urban land-use inventory. We generated a synthetic urban “universe” that qualitatively reproduces the complex spatial organization observed in global urban patterns, while being able to quantitatively recover certain key high-level urban spatial metrics. |
Tasks | |
Published | 2018-01-08 |
URL | http://arxiv.org/abs/1801.02710v1 |
PDF | http://arxiv.org/pdf/1801.02710v1.pdf |
PWC | https://paperswithcode.com/paper/modeling-urbanization-patterns-with |
Repo | https://github.com/adrianalbert/citygan |
Framework | pytorch |
Real-time Air Pollution prediction model based on Spatiotemporal Big data
Title | Real-time Air Pollution prediction model based on Spatiotemporal Big data |
Authors | V. Duc Le, Sang Kyun Cha |
Abstract | Air pollution is one of the biggest concerns for urban areas. Many countries have constructed monitoring stations that collect pollution values hourly. Recently, research in Daegu, Korea has explored real-time air quality monitoring via sensors installed on taxis running across the whole city. The collected data is huge (1-second interval) and has both spatial and temporal structure. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model that applies a Convolutional Neural Network (CNN) to the image-like spatial distribution of air pollution. To capture the temporal information in the data, we combine a Long Short-Term Memory (LSTM) unit for the time series with a neural network for other air pollution impact factors, such as weather conditions, to build a hybrid prediction model. The model is simple in architecture but still achieves good prediction accuracy. |
Tasks | Air Pollution Prediction, Time Series |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1805.00432v3 |
PDF | http://arxiv.org/pdf/1805.00432v3.pdf |
PWC | https://paperswithcode.com/paper/real-time-air-pollution-prediction-model |
Repo | https://github.com/vanduc103/air_analysis_v1 |
Framework | tf |
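A hedged sketch of the hybrid model outlined in the abstract: a small CNN over the image-like spatial pollution grid, an LSTM over the recent pollution time series, and a dense branch for auxiliary factors such as weather, concatenated into a single prediction. The class name, shapes, and layer sizes are illustrative assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class HybridAirPollutionModel(nn.Module):
    def __init__(self, n_weather=5, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                          # spatial pollution "image"
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.lstm = nn.LSTM(1, hidden, batch_first=True)   # recent pollution time series
        self.weather = nn.Sequential(nn.Linear(n_weather, hidden), nn.ReLU())
        self.out = nn.Linear(16 * 4 * 4 + 2 * hidden, 1)

    def forward(self, grid_img, series, weather):
        spatial = self.cnn(grid_img)                       # (B, 16*4*4)
        _, (h, _) = self.lstm(series)                      # h: (1, B, hidden)
        temporal = h[-1]
        aux = self.weather(weather)
        return self.out(torch.cat([spatial, temporal, aux], dim=-1))

model = HybridAirPollutionModel()
pred = model(torch.randn(8, 1, 32, 32),    # spatial grid of pollution values
             torch.randn(8, 24, 1),        # 24-step pollution history
             torch.randn(8, 5))            # weather features
```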
A Survey of Unsupervised Deep Domain Adaptation
Title | A Survey of Unsupervised Deep Domain Adaptation |
Authors | Garrett Wilson, Diane J. Cook |
Abstract | Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. To address this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain, with the goal of performing well at test time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels. This survey compares these approaches by examining alternative methods, their unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02849v3 |
PDF | https://arxiv.org/pdf/1812.02849v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-transfer-learning |
Repo | https://github.com/zhaoxin94/awsome-domain-adaptation |
Framework | pytorch |
Hierarchical interpretations for neural network predictions
Title | Hierarchical interpretations for neural network predictions |
Authors | Chandan Singh, W. James Murdoch, Bin Yu |
Abstract | Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN’s outputs. We also find that ACD’s hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise. |
Tasks | Feature Importance, Interpretable Machine Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05337v2 |
PDF | http://arxiv.org/pdf/1806.05337v2.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-interpretations-for-neural |
Repo | https://github.com/csinva/hierarchical-dnn-interpretations |
Framework | pytorch |
Backdrop: Stochastic Backpropagation
Title | Backdrop: Stochastic Backpropagation |
Authors | Siavash Golkar, Kyle Cranmer |
Abstract | We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01337v1 |
PDF | http://arxiv.org/pdf/1806.01337v1.pdf |
PWC | https://paperswithcode.com/paper/backdrop-stochastic-backpropagation |
Repo | https://github.com/dexgen/backdrop |
Framework | pytorch |
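Backdrop is naturally expressed as a layer that is the identity on the forward pass and randomly masks the gradient on the backward pass. The sketch below implements this as a custom autograd function; masking each gradient entry independently (and rescaling by the keep probability) is an illustrative choice, since the paper applies masks at specific points and granularities in the network.

```python
import torch

class BackdropMask(torch.autograd.Function):
    """Identity forward; randomly zeroes (and rescales) parts of the backward gradient."""
    @staticmethod
    def forward(ctx, x, keep_prob=0.5):
        ctx.keep_prob = keep_prob
        return x.view_as(x)                 # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        mask = (torch.rand_like(grad_output) < ctx.keep_prob).float()
        # Rescale so the expected gradient magnitude is preserved; no grad for keep_prob.
        return grad_output * mask / ctx.keep_prob, None

x = torch.randn(4, 8, requires_grad=True)
y = BackdropMask.apply(x, 0.5)              # forward pass is unchanged
y.sum().backward()                          # roughly half the gradient entries are zeroed
print((x.grad == 0).float().mean())
```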
Regularization by Denoising: Clarifications and New Interpretations
Title | Regularization by Denoising: Clarifications and New Interpretations |
Authors | Edward T. Reehorst, Philip Schniter |
Abstract | Regularization by Denoising (RED), as recently proposed by Romano, Elad, and Milanfar, is a powerful image-recovery framework that aims to minimize an explicit regularization objective constructed from a plug-in image-denoising function. Experimental evidence suggests that the RED algorithms are state-of-the-art. We claim, however, that explicit regularization does not explain the RED algorithms. In particular, we show that many of the expressions in the paper by Romano et al. hold only when the denoiser has a symmetric Jacobian, and we demonstrate that such symmetry does not occur with practical denoisers such as non-local means, BM3D, TNRD, and DnCNN. To explain the RED algorithms, we propose a new framework called Score-Matching by Denoising (SMD), which aims to match a “score” (i.e., the gradient of a log-prior). We then show tight connections between SMD, kernel density estimation, and constrained minimum mean-squared error denoising. Furthermore, we interpret the RED algorithms from Romano et al. and propose new algorithms with acceleration and convergence guarantees. Finally, we show that the RED algorithms seek a consensus equilibrium solution, which facilitates a comparison to plug-and-play ADMM. |
Tasks | Denoising, Density Estimation, Image Denoising |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02296v4 |
PDF | http://arxiv.org/pdf/1806.02296v4.pdf |
PWC | https://paperswithcode.com/paper/regularization-by-denoising-clarifications |
Repo | https://github.com/edward-reehorst/On_RED |
Framework | none |
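For a linear inverse problem y = Ax + noise, the commonly used RED gradient step reads x_{k+1} = x_k - mu * (A^T (A x_k - y) + lambda * (x_k - f(x_k))), where f is the plug-in denoiser; as the paper emphasizes, interpreting x - f(x) as the gradient of an explicit regularizer is only valid when f has a symmetric Jacobian. The sketch below uses a toy symmetric box-filter denoiser purely for illustration; the function names, step size, and problem sizes are assumptions.

```python
import numpy as np

def box_denoiser(x):
    """Toy symmetric smoother (3-tap moving average with circular padding)."""
    return (np.roll(x, -1) + x + np.roll(x, 1)) / 3.0

def red_gradient_descent(A, y, denoiser, lam=0.5, mu=0.05, n_iters=200):
    """x_{k+1} = x_k - mu * (A^T (A x_k - y) + lam * (x_k - f(x_k)))."""
    x = A.T @ y                                    # simple initialization
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y) + lam * (x - denoiser(x))
        x = x - mu * grad
    return x

rng = np.random.default_rng(0)
n, m = 64, 32
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.convolve(rng.standard_normal(n), np.ones(5) / 5, mode="same")   # smooth signal
y = A @ x_true + 0.01 * rng.standard_normal(m)
x_hat = red_gradient_descent(A, y, box_denoiser)
```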
Compressed Sensing with Deep Image Prior and Learned Regularization
Title | Compressed Sensing with Deep Image Prior and Learned Regularization |
Authors | Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis |
Abstract | We propose a novel method for compressed sensing recovery using untrained deep generative models. Our method is based on the recently proposed Deep Image Prior (DIP), wherein the convolutional weights of the network are optimized to match the observed measurements. We show that this approach can be applied to solve any differentiable linear inverse problem, outperforming previous unlearned methods. Unlike various learned approaches based on generative models, our method does not require pre-training over large datasets. We further introduce a novel learned regularization technique, which incorporates prior information on the network weights. This reduces reconstruction error, especially for noisy measurements. Finally, we prove that single-layer DIP networks with constant fraction over-parameterization will perfectly fit any signal through gradient descent, despite being a non-convex problem. This theoretical result provides justification for early stopping. |
Tasks | |
Published | 2018-06-17 |
URL | https://arxiv.org/abs/1806.06438v3 |
PDF | https://arxiv.org/pdf/1806.06438v3.pdf |
PWC | https://paperswithcode.com/paper/compressed-sensing-with-deep-image-prior-and |
Repo | https://github.com/davevanveen/compsensing_dip |
Framework | pytorch |
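The CS-DIP recovery loop is short to sketch: the weights of an untrained generator are optimized so that the measurements of its output match the observed y, with a simple l2 weight penalty standing in here for the paper's learned regularization. The small fully connected generator, the fixed iteration count (a crude proxy for early stopping), and the hyper-parameters are illustrative assumptions rather than the authors' setup.

```python
import torch
import torch.nn as nn

def csdip_recover(A, y, out_dim, n_iters=500, weight_decay=1e-4, lr=1e-3):
    """Fit an untrained generator G_w(z) so that A @ G_w(z) matches the measurements y."""
    z = torch.randn(1, 64)                                 # fixed random latent input
    net = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                        nn.Linear(256, out_dim))           # stand-in for a conv generator
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        x_hat = net(z).squeeze(0)
        loss = ((A @ x_hat - y) ** 2).mean()
        # Simple l2 penalty on the weights, standing in for the learned regularizer.
        loss = loss + weight_decay * sum((w ** 2).sum() for w in net.parameters())
        loss.backward()
        opt.step()
    return x_hat.detach()

n, m = 128, 40
A = torch.randn(m, n) / m ** 0.5
x_true = torch.randn(n)
y = A @ x_true
x_rec = csdip_recover(A, y, out_dim=n)
```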