Paper Group ANR 760
Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders. Predicting Conversion of Mild Cognitive Impairments to Alzheimer’s Disease and Exploring Impact of Neuroimaging. On the Learning Dynamics of Deep Neural Networks. Exploring Graph-structured Passage Representation for Multi-hop Reading Comprehension with Graph Neural Networks …
Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders
Title | Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders |
Authors | Emad M. Grais, Dominic Ward, Mark D. Plumbley |
Abstract | Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used for training. In this work, we introduce a novel multi-channel, multi-resolution convolutional auto-encoder neural network that works on raw time-domain signals to determine appropriate multi-resolution features for separating the singing-voice from stereo music. Our experimental results show that the proposed method can achieve multi-channel audio source separation without the need for hand-crafted features or any pre- or post-processing. |
Tasks | |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00702v1 |
http://arxiv.org/pdf/1803.00702v1.pdf | |
PWC | https://paperswithcode.com/paper/raw-multi-channel-audio-source-separation |
Repo | |
Framework | |
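As a rough illustration of the idea in the abstract above, the following PyTorch sketch builds a small multi-resolution convolutional auto-encoder that maps raw stereo waveforms to a 2-channel source estimate. The branch kernel sizes, channel counts, and layer depth are assumptions for illustration, not the authors' architecture.

```python
# Hypothetical sketch of a multi-resolution convolutional auto-encoder for
# stereo (2-channel) raw audio: parallel 1-D conv branches with different
# kernel sizes extract features at several temporal resolutions; a decoder
# maps them back to a 2-channel waveform (e.g. the singing voice).
# Layer sizes are illustrative only.
import torch
import torch.nn as nn

class MultiResConvAE(nn.Module):
    def __init__(self, channels=2, hidden=32, kernel_sizes=(16, 64, 256)):
        super().__init__()
        # One encoder branch per temporal resolution (kernel size).
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(channels, hidden, k, stride=1, padding=k // 2),
                nn.ReLU(),
            )
            for k in kernel_sizes
        ])
        # Decoder fuses the concatenated multi-resolution features back
        # into a 2-channel time-domain estimate of the target source.
        self.decoder = nn.Sequential(
            nn.Conv1d(hidden * len(kernel_sizes), hidden, 5, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden, channels, 5, padding=2),
        )

    def forward(self, x):                      # x: (batch, 2, samples)
        n = x.shape[-1]
        feats = [b(x)[..., :n] for b in self.branches]  # crop to a common length
        return self.decoder(torch.cat(feats, dim=1))

mix = torch.randn(4, 2, 16384)                 # a batch of stereo excerpts
voice_estimate = MultiResConvAE()(mix)
print(voice_estimate.shape)                    # torch.Size([4, 2, 16384])
```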
Predicting Conversion of Mild Cognitive Impairments to Alzheimer’s Disease and Exploring Impact of Neuroimaging
Title | Predicting Conversion of Mild Cognitive Impairments to Alzheimer’s Disease and Exploring Impact of Neuroimaging |
Authors | Yaroslav Shmulev, Mikhail Belyaev |
Abstract | Nowadays, much scientific effort is concentrated on diagnosing Alzheimer’s Disease (AD) by applying deep learning methods to neuroimaging data. In 2017 alone, more than a hundred papers were published on AD diagnosis, whereas only a few works considered the problem of mild cognitive impairment (MCI) conversion to AD. However, conversion prediction is an important problem, since approximately 15% of patients with MCI convert to AD every year. In the current work, we focus on conversion prediction using brain Magnetic Resonance Imaging and clinical data, such as demographics, cognitive assessments, and genetic and biochemical markers. First, we applied state-of-the-art deep learning algorithms to the neuroimaging data and compared these results with two machine learning algorithms that we fit using the clinical data. The models trained on the clinical data outperform the deep learning algorithms applied to the MR images. To explore the impact of neuroimaging further, we trained a deep feed-forward embedding using similarity learning with the Histogram loss on all available MRIs and obtained a 64-dimensional vector representation of the neuroimaging data. Using the learned representation from the deep embedding increased the quality of prediction based on the neuroimaging. Finally, the current results on this dataset show that neuroimaging does affect conversion prediction but cannot noticeably increase the quality of the prediction. The best results for predicting MCI-to-AD conversion are obtained by an XGBoost model trained on the clinical and embedding data, with an accuracy of 0.76 ± 0.01 and an area under the ROC curve of 0.86 ± 0.01. |
Tasks | |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11228v1 |
http://arxiv.org/pdf/1807.11228v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-conversion-of-mild-cognitive |
Repo | |
Framework | |
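A minimal sketch of the final step described above: concatenating clinical features with a 64-dimensional MRI embedding and fitting an XGBoost classifier, then reporting accuracy and ROC AUC. The data below is synthetic and the hyperparameters are placeholder values, not the paper's configuration.

```python
# Illustrative sketch (not the authors' pipeline): concatenate clinical
# features with a learned 64-dimensional MRI embedding and train an XGBoost
# classifier for MCI-to-AD conversion, reporting accuracy and ROC AUC.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 600
clinical = rng.normal(size=(n, 20))        # demographics, cognitive scores, ...
mri_embedding = rng.normal(size=(n, 64))   # stand-in for the deep embedding
X = np.hstack([clinical, mri_embedding])
# Synthetic labels: 1 = converts to AD within the follow-up window.
y = (clinical[:, 0] + 0.5 * mri_embedding[:, 0]
     + rng.normal(scale=1.0, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_tr, y_tr)

proba = model.predict_proba(X_te)[:, 1]
print("accuracy:", accuracy_score(y_te, proba > 0.5))
print("ROC AUC :", roc_auc_score(y_te, proba))
```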
On the Learning Dynamics of Deep Neural Networks
Title | On the Learning Dynamics of Deep Neural Networks |
Authors | Remi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio |
Abstract | While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood. In this work, we study the case of binary classification and prove various properties of learning in such networks under strong assumptions such as linear separability of the data. Extending existing results from the linear case, we confirm empirical observations by proving that the classification error also follows a sigmoidal shape in nonlinear architectures. We show that given proper initialization, learning expounds parallel independent modes and that certain regions of parameter space might lead to failed training. We also demonstrate that input norm and features’ frequency in the dataset lead to distinct convergence speeds which might shed some light on the generalization capabilities of deep neural networks. We provide a comparison between the dynamics of learning with cross-entropy and hinge losses, which could prove useful to understand recent progress in the training of generative adversarial networks. Finally, we identify a phenomenon that we baptize *gradient starvation* where the most frequent features in a dataset prevent the learning of other less frequent but equally informative features. |
Tasks | |
Published | 2018-09-18 |
URL | https://arxiv.org/abs/1809.06848v2 |
https://arxiv.org/pdf/1809.06848v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-learning-dynamics-of-deep-neural |
Repo | |
Framework | |
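The toy script below is my own construction, not the paper's experiments: it only illustrates the frequency-driven difference in convergence speed mentioned in the abstract, by fitting a logistic model with cross-entropy to two mutually exclusive, equally informative features that appear with very different frequencies.

```python
# Toy numpy illustration of frequency-dependent convergence speed: the weight
# on a frequent feature grows much faster than the weight on a rarer but
# equally informative feature under plain cross-entropy gradient descent.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
freq = rng.random(n) < 0.9          # feature 1 active in ~90% of positives
rare = ~freq                        # feature 2 active in the remaining ~10%
X = np.stack([freq, rare], axis=1).astype(float)
y = np.ones(n)                      # either feature alone predicts the positive class
# add an equal number of all-zero negatives
X = np.vstack([X, np.zeros((n, 2))])
y = np.concatenate([y, np.zeros(n)])

w = np.zeros(2)
lr = 0.5
for step in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))            # sigmoid
    grad = X.T @ (p - y) / len(y)               # cross-entropy gradient
    w -= lr * grad
    if step in (100, 500, 1999):
        print(f"step {step:4d}  w_frequent={w[0]:.3f}  w_rare={w[1]:.3f}")
```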
Exploring Graph-structured Passage Representation for Multi-hop Reading Comprehension with Graph Neural Networks
Title | Exploring Graph-structured Passage Representation for Multi-hop Reading Comprehension with Graph Neural Networks |
Authors | Linfeng Song, Zhiguo Wang, Mo Yu, Yue Zhang, Radu Florian, Daniel Gildea |
Abstract | Multi-hop reading comprehension focuses on one type of factoid question, where a system needs to properly integrate multiple pieces of evidence to correctly answer a question. Previous work approximates global evidence with local coreference information, encoding coreference chains with DAG-styled GRU layers within a gated-attention reader. However, coreference is limited in providing information for rich inference. We introduce a new method for better connecting global evidence, which forms more complex graphs compared to DAGs. To perform evidence integration on our graphs, we investigate two recent graph neural networks, namely graph convolutional network (GCN) and graph recurrent network (GRN). Experiments on two standard datasets show that richer global information leads to better answers. Our method performs better than all published results on these datasets. |
Tasks | Multi-Hop Reading Comprehension, Reading Comprehension |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02040v1 |
http://arxiv.org/pdf/1809.02040v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-graph-structured-passage |
Repo | |
Framework | |
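For readers unfamiliar with the graph networks mentioned above, here is a minimal PyTorch sketch of a single GCN layer performing neighbour aggregation over an evidence graph; the graph, feature sizes, and number of layers are placeholders rather than the authors' exact model.

```python
# Minimal sketch of one graph convolutional network (GCN) layer.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # Symmetrically normalise the adjacency (with self-loops) and
        # aggregate neighbours: H' = relu(D^-1/2 (A + I) D^-1/2 H W).
        a = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a.sum(dim=1).clamp(min=1e-6).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
        return torch.relu(self.linear(a_norm @ h))

# Example: 6 evidence nodes (e.g. entity mentions) connected by
# coreference / co-occurrence edges.
h = torch.randn(6, 128)                       # node representations from an encoder
adj = torch.tensor([[0, 1, 0, 0, 1, 0],
                    [1, 0, 1, 0, 0, 0],
                    [0, 1, 0, 1, 0, 0],
                    [0, 0, 1, 0, 1, 1],
                    [1, 0, 0, 1, 0, 0],
                    [0, 0, 0, 1, 0, 0]], dtype=torch.float)
layer = GCNLayer(128, 128)
print(layer(h, adj).shape)                    # torch.Size([6, 128])
```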
Fusion of stereo and still monocular depth estimates in a self-supervised learning context
Title | Fusion of stereo and still monocular depth estimates in a self-supervised learning context |
Authors | Diogo Martins, Kevin van Hecke, Guido de Croon |
Abstract | We study how autonomous robots can learn by themselves to improve their depth estimation capability. In particular, we investigate a self-supervised learning setup in which stereo vision depth estimates serve as targets for a convolutional neural network (CNN) that transforms a single still image to a dense depth map. After training, the stereo and mono estimates are fused with a novel fusion method that preserves high confidence stereo estimates, while leveraging the CNN estimates in the low-confidence regions. The main contribution of the article is showing that the fused estimates lead to higher performance than the stereo vision estimates alone. Experiments are performed on the KITTI dataset, and on board a Parrot SLAMDunk, showing that even rather limited CNNs can help provide stereo vision equipped robots with more reliable depth maps for autonomous navigation. |
Tasks | Autonomous Navigation, Depth Estimation |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07512v1 |
http://arxiv.org/pdf/1803.07512v1.pdf | |
PWC | https://paperswithcode.com/paper/fusion-of-stereo-and-still-monocular-depth |
Repo | |
Framework | |
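A small numpy sketch of the kind of confidence-based fusion rule the abstract describes, keeping high-confidence stereo depth and falling back to the monocular CNN estimate elsewhere; the confidence measure, threshold, and blending band are illustrative assumptions, not the paper's fusion method.

```python
# Confidence-based fusion of stereo and monocular depth maps (illustrative).
import numpy as np

def fuse_depth(stereo_depth, mono_depth, stereo_confidence,
               threshold=0.7, soft_band=0.1):
    """Fuse per-pixel stereo and monocular depth maps.

    stereo_confidence is in [0, 1]; values above `threshold` keep the stereo
    estimate, values well below it use the CNN estimate, and a narrow band
    around the threshold is blended linearly to avoid seams.
    """
    w = np.clip((stereo_confidence - (threshold - soft_band)) / (2 * soft_band),
                0.0, 1.0)
    return w * stereo_depth + (1.0 - w) * mono_depth

rng = np.random.default_rng(0)
h, w_ = 4, 6
stereo = rng.uniform(1.0, 10.0, size=(h, w_))
mono = stereo + rng.normal(scale=0.5, size=(h, w_))   # noisier CNN prediction
conf = rng.uniform(0.0, 1.0, size=(h, w_))            # e.g. from a matching-cost ratio
print(fuse_depth(stereo, mono, conf))
```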
Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference
Title | Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference |
Authors | Reza Ghaeini, Xiaoli Z. Fern, Prasad Tadepalli |
Abstract | Deep learning models have achieved remarkable success in natural language inference (NLI) tasks. While these models are widely explored, they are hard to interpret and it is often unclear how and why they actually work. In this paper, we take a step toward explaining such deep learning based models through a case study on a popular neural model for NLI. In particular, we propose to interpret the intermediate layers of NLI models by visualizing the saliency of attention and LSTM gating signals. We present several examples for which our methods are able to reveal interesting insights and identify the critical information contributing to the model decisions. |
Tasks | Natural Language Inference |
Published | 2018-08-12 |
URL | http://arxiv.org/abs/1808.03894v1 |
http://arxiv.org/pdf/1808.03894v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-recurrent-and-attention-based |
Repo | |
Framework | |
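The sketch below shows one common way to compute gradient-based saliency over input tokens, in the spirit of the visualisations described above; the toy LSTM classifier is a stand-in, not the NLI model analysed in the paper.

```python
# Gradient-based token saliency: back-propagate the predicted class score to
# the word embeddings and take a norm per token.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, dim, n_classes = 100, 16, 3      # entailment / neutral / contradiction
embedding = nn.Embedding(vocab, dim)
encoder = nn.LSTM(dim, 32, batch_first=True)
classifier = nn.Linear(32, n_classes)

tokens = torch.tensor([[5, 17, 42, 8, 61]])           # one toy input sequence
emb = embedding(tokens)
emb.retain_grad()                                     # keep gradients on the embeddings
_, (h_n, _) = encoder(emb)
logits = classifier(h_n[-1])
predicted = logits.argmax(dim=-1)

logits[0, predicted.item()].backward()                # d(predicted score)/d(embeddings)
saliency = emb.grad.norm(dim=-1).squeeze(0)           # one saliency value per token
print(saliency)
```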
Coupled Fluid Density and Motion from Single Views
Title | Coupled Fluid Density and Motion from Single Views |
Authors | Marie-Lena Eckert, Wolfgang Heidrich, Nils Thuerey |
Abstract | We present a novel method to reconstruct a fluid’s 3D density and motion based on just a single sequence of images. This is rendered possible by using powerful physical priors for this strongly under-determined problem. More specifically, we propose a novel strategy to infer density updates strongly coupled to previous and current estimates of the flow motion. Additionally, we employ an accurate discretization and depth-based regularizers to compute stable solutions. Using only one view for the reconstruction reduces the complexity of the capturing setup drastically and could even allow for online video databases or smart-phone videos as inputs. The reconstructed 3D velocity can then be flexibly utilized, e.g., for re-simulation, domain modification or guiding purposes. We will demonstrate the capacity of our method with a series of synthetic test cases and the reconstruction of real smoke plumes captured with a Raspberry Pi camera. |
Tasks | |
Published | 2018-06-18 |
URL | http://arxiv.org/abs/1806.06613v1 |
http://arxiv.org/pdf/1806.06613v1.pdf | |
PWC | https://paperswithcode.com/paper/coupled-fluid-density-and-motion-from-single |
Repo | |
Framework | |
Compositional Verification for Autonomous Systems with Deep Learning Components
Title | Compositional Verification for Autonomous Systems with Deep Learning Components |
Authors | Corina S. Pasareanu, Divya Gopinath, Huafeng Yu |
Abstract | As autonomy becomes prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, there is an increased need to provide safety guarantees for such systems. The problem is difficult, as these are large, complex systems which operate in uncertain environments, requiring data-driven machine-learning components. However, learning techniques such as Deep Neural Networks, widely used today, are inherently unpredictable and lack the theoretical foundations to provide strong assurance guarantees. We present a compositional approach for the scalable, formal verification of autonomous systems that contain Deep Neural Network components. The approach uses assume-guarantee reasoning whereby *contracts*, encoding the input-output behavior of individual components, allow the designer to model and incorporate the behavior of the learning-enabled components working side-by-side with the other components. We illustrate the approach on an example taken from the autonomous vehicles domain. |
Tasks | Autonomous Vehicles, Recommendation Systems |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08303v1 |
http://arxiv.org/pdf/1810.08303v1.pdf | |
PWC | https://paperswithcode.com/paper/compositional-verification-for-autonomous |
Repo | |
Framework | |
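As a highly simplified, hypothetical rendering of the assume-guarantee idea (my own framing, not the paper's formalism), the following sketch encodes a contract as an assumption over a component's inputs plus a guarantee over its input-output behaviour, checked on sampled inputs for a toy perception component.

```python
# Toy assume-guarantee contract check for a learning-enabled component.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Contract:
    assume: Callable[[float], bool]              # predicate over the component's input
    guarantee: Callable[[float, float], bool]    # predicate over (input, output)

def satisfies(component: Callable[[float], float],
              contract: Contract, inputs: Iterable[float]) -> bool:
    """Check, on a finite set of test inputs, that whenever the assumption
    holds the component's output meets the guarantee."""
    return all(contract.guarantee(x, component(x))
               for x in inputs if contract.assume(x))

# Toy "perception" component: estimates distance to an obstacle (metres).
perception = lambda true_dist: true_dist * 0.95           # slightly biased DNN stand-in
perception_contract = Contract(
    assume=lambda d: 0.0 <= d <= 50.0,                     # operating range
    guarantee=lambda d, est: abs(est - d) <= 0.1 * d + 0.5,  # bounded error
)

inputs = [0.5 * i for i in range(101)]                     # 0 .. 50 m
print(satisfies(perception, perception_contract, inputs))  # True under this error bound
```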
An Analysis of the Accuracy of the P300 BCI
Title | An Analysis of the Accuracy of the P300 BCI |
Authors | Nitzan S. Artzi, Oren Shriki |
Abstract | The P300 Brain-Computer Interface (BCI) is a well-established communication channel for severely disabled people. The P300 event-related potential is mostly characterized by its amplitude or its area, which correlate with the spelling accuracy of the P300 speller. Here, we introduce a novel approach for estimating the efficiency of this BCI by considering the P300 signal-to-noise ratio (SNR), a parameter that estimates the spatial and temporal noise levels and has a significantly stronger correlation with spelling accuracy. Furthermore, we suggest a Gaussian noise model, which utilizes the P300 event-related potential SNR to predict spelling accuracy under various conditions for LDA-based classification. We demonstrate the utility of this analysis using real data and discuss its potential applications, such as speeding up the process of electrode selection. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1901.03299v1 |
http://arxiv.org/pdf/1901.03299v1.pdf | |
PWC | https://paperswithcode.com/paper/an-analysis-of-the-accuracy-of-the-p300-bci |
Repo | |
Framework | |
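An assumption-laden numpy sketch of one simple way to estimate an ERP signal-to-noise ratio from target epochs, roughly in the spirit of the SNR parameter discussed above; the synthetic P300 template and the specific estimator are illustrative, not the authors' exact definition.

```python
# Estimate an event-related-potential SNR as the power of the averaged target
# epoch divided by the across-epoch noise variance, on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
fs, n_epochs, n_samples = 250, 200, 200           # 0.8 s epochs at 250 Hz
t = np.arange(n_samples) / fs
p300 = 2.0 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))   # idealised P300 around 300 ms

target_epochs = p300 + rng.normal(scale=5.0, size=(n_epochs, n_samples))

erp = target_epochs.mean(axis=0)                  # averaged target response
noise = target_epochs - erp                       # residual single-trial noise
snr = erp.var() / noise.var()
print(f"estimated ERP SNR: {snr:.3f}")
```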
Structured Neural Topic Models for Reviews
Title | Structured Neural Topic Models for Reviews |
Authors | Babak Esmaeili, Hongyi Huang, Byron C. Wallace, Jan-Willem van de Meent |
Abstract | We present Variational Aspect-based Latent Topic Allocation (VALTA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects. Experimental evaluation on a large number of datasets demonstrates that aspects are interpretable, yield higher coherence scores than non-structured autoencoding topic model variants, and can be utilized to perform aspect-based comparison and genre discovery. |
Tasks | Topic Models |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.05035v2 |
http://arxiv.org/pdf/1812.05035v2.pdf | |
PWC | https://paperswithcode.com/paper/structured-neural-topic-models-for-reviews |
Repo | |
Framework | |
Proximal Online Gradient is Optimum for Dynamic Regret
Title | Proximal Online Gradient is Optimum for Dynamic Regret |
Authors | Yawei Zhao, Shuang Qiu, Ji Liu |
Abstract | In online learning, the dynamic regret metric chooses the reference (optimal) solution that may change over time, while the typical (static) regret metric assumes the reference solution to be constant over the whole time horizon. The dynamic regret metric is particularly interesting for applications such as online recommendation (since the customers’ preference always evolves over time). While the online gradient method has been shown to be optimal for the static regret metric, the optimal algorithm for the dynamic regret remains unknown. In this paper, we show that proximal online gradient (a general version of online gradient) is optimal for the dynamic regret, by showing that the proved lower bound matches an upper bound that slightly improves the existing upper bound. |
Tasks | |
Published | 2018-10-08 |
URL | https://arxiv.org/abs/1810.03594v6 |
https://arxiv.org/pdf/1810.03594v6.pdf | |
PWC | https://paperswithcode.com/paper/proximal-online-gradient-is-optimum-for |
Repo | |
Framework | |
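A minimal sketch of the proximal online gradient update itself, using an L1 regulariser whose proximal operator is soft-thresholding and a sequence of drifting quadratic losses as the dynamic environment; the losses, step size, and regularisation weight are toy choices, not the paper's setting.

```python
# Proximal online gradient descent with an L1 regulariser (soft-thresholding).
import numpy as np

def soft_threshold(v, tau):
    # prox of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
d, T, eta, lam = 5, 200, 0.1, 0.01
x = np.zeros(d)

for t in range(T):
    # The environment reveals a new quadratic loss f_t(x) = 0.5 * ||x - theta_t||^2
    # whose minimiser theta_t drifts over time (the "dynamic" comparator).
    theta_t = np.sin(0.05 * t) * np.ones(d) + 0.1 * rng.normal(size=d)
    grad = x - theta_t
    # Proximal step: gradient step on f_t, prox step on the regulariser.
    x = soft_threshold(x - eta * grad, eta * lam)

print("final iterate:", np.round(x, 3))
```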
Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition
Title | Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition |
Authors | Wei-Ning Hsu, James Glass |
Abstract | The performance of automatic speech recognition (ASR) systems can be significantly compromised by previously unseen conditions, which is typically due to a mismatch between training and testing distributions. In this paper, we address robustness by studying domain invariant features, such that domain information becomes transparent to ASR systems, resolving the mismatch problem. Specifically, we investigate a recent model, called the Factorized Hierarchical Variational Autoencoder (FHVAE). FHVAEs learn to factorize sequence-level and segment-level attributes into different latent variables without supervision. We argue that the set of latent variables that contain segment-level information is our desired domain invariant feature for ASR. Experiments are conducted on Aurora-4 and CHiME-4, which demonstrate 41% and 27% absolute word error rate reductions respectively on mismatched domains. |
Tasks | Speech Recognition |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02551v1 |
http://arxiv.org/pdf/1803.02551v1.pdf | |
PWC | https://paperswithcode.com/paper/extracting-domain-invariant-features-by |
Repo | |
Framework | |
Learning to Race through Coordinate Descent Bayesian Optimisation
Title | Learning to Race through Coordinate Descent Bayesian Optimisation |
Authors | Rafael Oliveira, Fernando H. M. Rocha, Lionel Ott, Vitor Guizilini, Fabio Ramos, Valdir Grassi Jr |
Abstract | In the automation of many kinds of processes, the observable outcome can often be described as the combined effect of an entire sequence of actions, or controls, applied throughout its execution. In these cases, strategies to optimise control policies for individual stages of the process might not be applicable, and instead the whole policy might have to be optimised at once. On the other hand, the cost to evaluate the policy’s performance might also be high, making it desirable that a solution be found with as few interactions as possible with the real system. We consider the problem of optimising control policies to allow a robot to complete a given race track within a minimum amount of time. We assume that the robot has no prior information about the track or its own dynamical model, just an initial valid driving example. Localisation is only applied to monitor the robot and to provide an indication of its position along the track’s centre axis. We propose a method for finding a policy that minimises the time per lap while keeping the vehicle on the track using a Bayesian optimisation (BO) approach over a reproducing kernel Hilbert space. We apply an algorithm to search more efficiently over high-dimensional policy-parameter spaces with BO, by iterating over each dimension individually, in a sequential coordinate descent-like scheme. Experiments demonstrate the performance of the algorithm against other methods in a simulated car racing environment. |
Tasks | Bayesian Optimisation, Car Racing |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06179v1 |
http://arxiv.org/pdf/1802.06179v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-race-through-coordinate-descent |
Repo | |
Framework | |
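A miniature, assumption-laden sketch of coordinate-descent Bayesian optimisation: each sweep optimises one policy-parameter coordinate at a time by fitting a Gaussian process to past evaluations and maximising expected improvement along that coordinate. The objective here is a synthetic stand-in for lap time, not the racing task.

```python
# Coordinate-descent Bayesian optimisation over a toy objective.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                       # stand-in for lap time (to minimise)
    return np.sum((x - 0.3) ** 2)

def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
dim, n_sweeps, n_candidates = 4, 3, 50
x = rng.uniform(0, 1, size=dim)         # initial valid policy parameters
X_hist, y_hist = [x.copy()], [objective(x)]

for _ in range(n_sweeps):
    for j in range(dim):                # optimise one coordinate at a time
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(np.array(X_hist), np.array(y_hist))
        candidates = np.tile(x, (n_candidates, 1))
        candidates[:, j] = np.linspace(0, 1, n_candidates)
        mu, sigma = gp.predict(candidates, return_std=True)
        x = candidates[np.argmax(expected_improvement(mu, sigma, min(y_hist)))]
        X_hist.append(x.copy())
        y_hist.append(objective(x))

print("best parameters:", np.round(X_hist[int(np.argmin(y_hist))], 3),
      "objective:", round(min(y_hist), 4))
```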
Presentation Attack Detection for Iris Recognition: An Assessment of the State of the Art
Title | Presentation Attack Detection for Iris Recognition: An Assessment of the State of the Art |
Authors | Adam Czajka, Kevin W. Bowyer |
Abstract | Iris recognition is increasingly used in large-scale applications. As a result, presentation attack detection for iris recognition takes on fundamental importance. This survey covers the diverse research literature on this topic. Different categories of presentation attack are described and placed in an application-relevant framework, and the state of the art in detecting each category of attack is summarized. One conclusion from this is that presentation attack detection for iris recognition is not yet a solved problem. Datasets available for research are described, research directions for the near- and medium-term future are outlined, and a short list of recommended readings is suggested. |
Tasks | Iris Recognition |
Published | 2018-03-31 |
URL | http://arxiv.org/abs/1804.00194v3 |
http://arxiv.org/pdf/1804.00194v3.pdf | |
PWC | https://paperswithcode.com/paper/presentation-attack-detection-for-iris |
Repo | |
Framework | |
Learning Selfie-Friendly Abstraction from Artistic Style Images
Title | Learning Selfie-Friendly Abstraction from Artistic Style Images |
Authors | Yicun Liu, Jimmy Ren, Jianbo Liu, Jiawei Zhang, Xiaohao Chen |
Abstract | Artistic style transfer can be thought of as a process that generates different versions of abstraction of the original image. However, most artistic style transfer operators are not optimized for human faces and thus mainly suffer from two undesirable effects when applied to selfies. First, the edges of human faces may unpleasantly deviate from the ones in the original image. Second, the skin color is far from faithful to the original one, which is usually problematic in producing quality selfies. In this paper, we take a different approach and formulate this abstraction process as a gradient domain learning problem. We aim to learn a type of abstraction which not only achieves the specified artistic style but also circumvents the two aforementioned drawbacks, making it highly applicable to selfie photography. We also show that our method can be directly generalized to videos with high inter-frame consistency. Our method is also robust to non-selfie images, and the generalization to various kinds of real-life scenes is discussed. We will make our code publicly available. |
Tasks | Style Transfer |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02085v2 |
http://arxiv.org/pdf/1805.02085v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-selfie-friendly-abstraction-from |
Repo | |
Framework | |
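A small PyTorch sketch of a gradient-domain loss in the general spirit of the formulation above; the specific loss terms, targets, and weighting are my assumptions rather than the authors' objective.

```python
# Toy gradient-domain loss: match a stylised target in the gradient domain
# while keeping the prediction's gradients close to the original's edges.
import torch
import torch.nn.functional as F

def image_gradients(img):
    """Finite-difference gradients of a (batch, channels, H, W) image."""
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def gradient_domain_loss(prediction, stylised_target, original, edge_weight=1.0):
    pdx, pdy = image_gradients(prediction)
    tdx, tdy = image_gradients(stylised_target)
    odx, ody = image_gradients(original)
    style_term = F.l1_loss(pdx, tdx) + F.l1_loss(pdy, tdy)   # follow the abstraction
    edge_term = F.l1_loss(pdx, odx) + F.l1_loss(pdy, ody)    # stay close to face edges
    return style_term + edge_weight * edge_term

pred = torch.rand(1, 3, 64, 64, requires_grad=True)
target = torch.rand(1, 3, 64, 64)
orig = torch.rand(1, 3, 64, 64)
loss = gradient_domain_loss(pred, target, orig)
loss.backward()
print(float(loss))
```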