January 28, 2020

Paper Group ANR 944

Universality Theorems for Generative Models. Joint Semantic Domain Alignment and Target Classifier Learning for Unsupervised Domain Adaptation. Guided Layer-wise Learning for Deep Models using Side Information. End-to-End Spoken Language Translation. Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling. Dynamical Sy …

Universality Theorems for Generative Models

Title Universality Theorems for Generative Models
Authors Valentin Khrulkov, Ivan Oseledets
Abstract Despite the fact that generative models are extremely successful in practice, the theory underlying this phenomenon is only starting to catch up with practice. In this work we address the question of the universality of generative models: is it true that neural networks can approximate any data manifold arbitrarily well? We provide a positive answer to this question and show that under mild assumptions on the activation function one can always find a feedforward neural network that maps the latent space onto a set located within the specified Hausdorff distance from the desired data manifold. We also prove similar theorems for the case of multiclass generative models and cycle generative models, trained to map samples from one manifold to another and vice versa.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.11520v1
PDF https://arxiv.org/pdf/1905.11520v1.pdf
PWC https://paperswithcode.com/paper/universality-theorems-for-generative-models
Repo
Framework
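
The abstract above measures approximation quality by the Hausdorff distance between the generated set and the data manifold. As a rough illustration only (not the authors' construction), the sketch below computes the Hausdorff distance between two finite point clouds. The circle data and the names `directed_hausdorff`, `hausdorff`, `manifold`, and `generated` are assumptions made for this example.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical data: samples of the unit circle stand in for a "data manifold";
# a radially scaled copy stands in for the outputs of a generator network.
manifold = [
    (math.cos(2 * math.pi * k / 100), math.sin(2 * math.pi * k / 100))
    for k in range(100)
]
generated = [(1.05 * x, 1.05 * y) for x, y in manifold]

distance = hausdorff(manifold, generated)
print(f"Hausdorff distance: {distance:.4f}")
```

In the universality setting, a small value of this quantity would mean every generated point lies near the manifold and every manifold point lies near some generated point.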

Joint Semantic Domain Alignment and Target Classifier Learning for Unsupervised Domain Adaptation

Title Joint Semantic Domain Alignment and Target Classifier Learning for Unsupervised Domain Adaptation
Authors Dong-Dong Chen, Yisen Wang, Jinfeng Yi, Zaiyi Chen, Zhi-Hua Zhou
Abstract Unsupervised domain adaptation aims to transfer the classifier learned from the source domain to the target domain in an unsupervised manner. With the help of target pseudo-labels, aligning class-level distributions and learning the classifier in the target domain are two widely used objectives. Existing methods often optimize these two objectives separately, so each suffers from neglect of the other. However, optimizing these two aspects together is not trivial. To alleviate the above issues, we propose a novel method that jointly optimizes semantic domain alignment and target classifier learning in a holistic way. The joint optimization mechanism can not only eliminate their weaknesses but also complement their strengths. The theoretical analysis also supports the joint optimization mechanism. Extensive experiments on benchmark datasets show that the proposed method yields the best performance in comparison with the state-of-the-art unsupervised domain adaptation methods.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-06-10
URL https://arxiv.org/abs/1906.04053v1
PDF https://arxiv.org/pdf/1906.04053v1.pdf
PWC https://paperswithcode.com/paper/joint-semantic-domain-alignment-and-target
Repo
Framework

Guided Layer-wise Learning for Deep Models using Side Information

Title Guided Layer-wise Learning for Deep Models using Side Information
Authors Pavel Sulimov, Elena Sukmanova, Roman Chereshnev, Attila Kertesz-Farkas
Abstract Training of deep models for classification tasks is hindered by local minima problems and vanishing gradients, while unsupervised layer-wise pretraining does not exploit information from class labels. Here, we propose a new regularization technique, called diversifying regularization (DR), which applies a penalty on hidden units at any layer if they obtain similar features for different types of data. For generative models, DR is defined as divergence over the variational posteriori distributions and included in the maximum likelihood estimation as a prior. Thus, DR includes class label information for greedy pretraining of deep belief networks, which results in a better weight initialization for fine-tuning methods. On the other hand, for discriminative training of deep neural networks, DR is defined as a distance over the features and included in the learning objective. With our experimental tests, we show that DR can help backpropagation cope with vanishing gradient problems and provide faster convergence and smaller generalization errors.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.02048v1
PDF https://arxiv.org/pdf/1911.02048v1.pdf
PWC https://paperswithcode.com/paper/guided-layer-wise-learning-for-deep-models
Repo
Framework

End-to-End Spoken Language Translation

Title End-to-End Spoken Language Translation
Authors Michelle Guo, Albert Haque, Prateek Verma
Abstract In this paper, we address the task of spoken language understanding. We present a method for translating spoken sentences from one language into spoken sentences in another language. Given spectrogram-spectrogram pairs, our model can be trained completely from scratch to translate unseen sentences. Our method consists of a pyramidal-bidirectional recurrent network combined with a convolutional network to output sentence-level spectrograms in the target language. Empirically, our model achieves competitive performance with state-of-the-art methods on multiple languages and can generalize to unseen speakers.
Tasks Spoken Language Understanding
Published 2019-04-23
URL http://arxiv.org/abs/1904.10760v1
PDF http://arxiv.org/pdf/1904.10760v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-spoken-language-translation
Repo
Framework

Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling

Title Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling
Authors Marco Dinarelli, Loïc Grobol
Abstract During the last couple of years, Recurrent Neural Networks (RNN) have reached state-of-the-art performances on most of the sequence modelling problems. In particular, the “sequence to sequence” model and the neural CRF have proved to be very effective in this domain. In this article, we propose a new RNN architecture for sequence labelling, leveraging gated recurrent layers to take arbitrarily long contexts into account, and using two decoders operating forward and backward. We compare several variants of the proposed solution and their performances to the state-of-the-art. Most of our results are better than the state-of-the-art or very close to it and thanks to the use of recent technologies, our architecture can scale on corpora larger than those used in this work.
Tasks
Published 2019-04-09
URL http://arxiv.org/abs/1904.04733v3
PDF http://arxiv.org/pdf/1904.04733v3.pdf
PWC https://paperswithcode.com/paper/seq2biseq-bidirectional-output-wise-recurrent
Repo
Framework

Dynamical System Inspired Adaptive Time Stepping Controller for Residual Network Families

Title Dynamical System Inspired Adaptive Time Stepping Controller for Residual Network Families
Authors Yibo Yang, Jianlong Wu, Hongyang Li, Xia Li, Tiancheng Shen, Zhouchen Lin
Abstract The correspondence between residual networks and dynamical systems motivates researchers to unravel the physics of ResNets with well-developed tools in numerical methods of ODE systems. The Runge-Kutta-Fehlberg method is an adaptive time stepping that renders a good trade-off between stability and efficiency. Can we also have an adaptive time stepping for ResNets to ensure both stability and performance? In this study, we analyze the effects of time stepping on the Euler method and ResNets. We establish a stability condition for ResNets with step sizes and weight parameters, and point out the effects of step sizes on the stability and performance. Inspired by our analyses, we develop an adaptive time stepping controller that is dependent on the parameters of the current step, and aware of previous steps. The controller is jointly optimized with the network training so that variable step sizes and evolution time can be adaptively adjusted. We conduct experiments on ImageNet and CIFAR to demonstrate the effectiveness. It is shown that our proposed method is able to improve both stability and accuracy without introducing additional overhead in the inference phase.
Tasks
Published 2019-11-23
URL https://arxiv.org/abs/1911.10305v1
PDF https://arxiv.org/pdf/1911.10305v1.pdf
PWC https://paperswithcode.com/paper/dynamical-system-inspired-adaptive-time
Repo
Framework
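
The abstract above ties stability to the step size of the Euler method. The sketch below illustrates that link on the standard linear test equation dx/dt = -lam * x, where forward Euler stays bounded exactly when |1 - step * lam| <= 1. This is a textbook illustration and an assumption of this example, not the stability condition the paper derives for ResNets. The names `euler_trajectory` and `is_stable` are hypothetical.

```python
def euler_trajectory(lam, step, n_steps, x0=1.0):
    """Forward Euler for dx/dt = -lam * x: x_{k+1} = x_k + step * (-lam * x_k)."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + step * (-lam * xs[-1]))
    return xs

def is_stable(lam, step):
    """Each Euler step multiplies x by (1 - step * lam); bounded iff |factor| <= 1."""
    return abs(1.0 - step * lam) <= 1.0

lam = 10.0
for step in (0.05, 0.3):
    final = euler_trajectory(lam, step, n_steps=20)[-1]
    print(f"step={step}: stable={is_stable(lam, step)}, |x_20|={abs(final):.3e}")
```

A residual block x + h * f(x) has the same form as one Euler step, which is why the choice of h matters for how activations grow or shrink with depth.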

Ordering-Based Causal Structure Learning in the Presence of Latent Variables

Title Ordering-Based Causal Structure Learning in the Presence of Latent Variables
Authors Daniel Irving Bernstein, Basil Saeed, Chandler Squires, Caroline Uhler
Abstract We consider the task of learning a causal graph in the presence of latent confounders given i.i.d. samples from the model. While current algorithms for causal structure discovery in the presence of latent confounders are constraint-based, we here propose a score-based approach. We prove that under assumptions weaker than faithfulness, any sparsest independence map (IMAP) of the distribution belongs to the Markov equivalence class of the true model. This motivates the Sparsest Poset formulation: posets can be mapped to minimal IMAPs of the true model such that the sparsest of these IMAPs is Markov equivalent to the true model. Motivated by this result, we propose a greedy algorithm over the space of posets for causal structure discovery in the presence of latent confounders and compare its performance to the current state-of-the-art algorithms FCI and FCI+ on synthetic data.
Tasks
Published 2019-10-20
URL https://arxiv.org/abs/1910.09014v2
PDF https://arxiv.org/pdf/1910.09014v2.pdf
PWC https://paperswithcode.com/paper/ordering-based-causal-structure-learning-in
Repo
Framework

Improving Transformer-based Speech Recognition Using Unsupervised Pre-training

Title Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Authors Dongwei Jiang, Xiaoning Lei, Wubo Li, Ne Luo, Yuxuan Hu, Wei Zou, Xiangang Li
Abstract Speech recognition technologies are gaining enormous popularity in various industrial applications. However, building a good speech recognition system usually requires large amounts of transcribed data, which is expensive to collect. To tackle this problem, an unsupervised pre-training method called Masked Predictive Coding is proposed, which can be applied for unsupervised pre-training with Transformer-based models. Experiments on HKUST show that using the same training data, we can achieve CER 23.3%, exceeding the best end-to-end model by over 0.2% absolute CER. With more pre-training data, we can further reduce the CER to 21.0%, or an 11.8% relative CER reduction over baseline.
Tasks Speech Recognition
Published 2019-10-22
URL https://arxiv.org/abs/1910.09932v3
PDF https://arxiv.org/pdf/1910.09932v3.pdf
PWC https://paperswithcode.com/paper/improving-transformer-based-speech
Repo
Framework
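
The abstract above reports results as character error rate (CER) and as relative CER reduction. A minimal sketch of how these two quantities are commonly computed follows, assuming CER is the character-level edit distance divided by the reference length. The function names and the toy strings are assumptions of this example, and the paper's scoring tool may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)

def relative_reduction(baseline, improved):
    """Relative error reduction, e.g. 0.25 means 25% fewer errors than the baseline."""
    return (baseline - improved) / baseline

print(cer("abcd", "abxd"))             # one substitution in four characters
print(relative_reduction(20.0, 15.0))  # hypothetical error rates, not the paper's
```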

Molecular activity prediction using graph convolutional deep neural network considering distance on a molecular graph

Title Molecular activity prediction using graph convolutional deep neural network considering distance on a molecular graph
Authors Masahito Ohue, Ryota Ii, Keisuke Yanagisawa, Yutaka Akiyama
Abstract Machine learning is often used in virtual screening to find compounds that are pharmacologically active on a target protein. The weave module is a type of graph convolutional deep neural network that uses not only features focusing on atoms alone (atom features) but also features focusing on atom pairs (pair features); thus, it can consider information of nonadjacent atoms. However, the correlation between the distance on the graph and the three-dimensional coordinate distance is uncertain. In this paper, we propose three improvements for modifying the weave module. First, the distances between ring atoms on the graph were modified to bring the distances on the graph closer to the coordinate distance. Second, different weight matrices were used depending on the distance on the graph in the convolution layers of the pair features. Finally, a weighted sum, by distance, was used when converting pair features to atom features. The experimental results show that the performance of the proposed method is slightly better than that of the weave module, and the improvement in the distance representation might be useful for compound activity prediction.
Tasks Activity Prediction
Published 2019-07-02
URL https://arxiv.org/abs/1907.01103v2
PDF https://arxiv.org/pdf/1907.01103v2.pdf
PWC https://paperswithcode.com/paper/molecular-activity-prediction-using-graph
Repo
Framework

Simulating Problem Difficulty in Arithmetic Cognition Through Dynamic Connectionist Models

Title Simulating Problem Difficulty in Arithmetic Cognition Through Dynamic Connectionist Models
Authors Sungjae Cho, Jaeseo Lim, Chris Hickey, Jung Ae Park, Byoung-Tak Zhang
Abstract The present study aims to investigate similarities between how humans and connectionist models experience difficulty in arithmetic problems. Problem difficulty was operationalized by the number of carries involved in solving a given problem. Problem difficulty was measured in humans by response time, and in models by computational steps. The present study found that both humans and connectionist models experience difficulty similarly when solving binary addition and subtraction. Specifically, both agents found difficulty to be strictly increasing with respect to the number of carries. Another notable similarity is that problem difficulty increases more steeply in subtraction than in addition, for both humans and connectionist models. Further investigation on two model hyperparameters — confidence threshold and hidden dimension — shows higher confidence thresholds cause the model to take more computational steps to arrive at the correct answer. Likewise, larger hidden dimensions cause the model to take more computational steps to correctly answer arithmetic problems; however, this effect by hidden dimensions is negligible.
Tasks
Published 2019-05-09
URL https://arxiv.org/abs/1905.03617v3
PDF https://arxiv.org/pdf/1905.03617v3.pdf
PWC https://paperswithcode.com/paper/190503617
Repo
Framework
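
The abstract above operationalizes problem difficulty as the number of carries in a binary arithmetic problem. The sketch below counts carries for binary addition of non-negative integers. The function name `count_carries` and the convention of counting the final carry out of the top bit are assumptions of this example; the paper may count carries (and borrows in subtraction) differently.

```python
def count_carries(a, b):
    """Number of carry operations when adding non-negative integers a and b in base 2."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    carries = 0
    carry = 0
    while a or b:
        total = (a & 1) + (b & 1) + carry  # sum of the current bit column
        carry = total >> 1                 # 1 if the column overflows, else 0
        carries += carry
        a >>= 1
        b >>= 1
    return carries

for x, y in [(0b101, 0b010), (0b001, 0b001), (0b111, 0b001)]:
    print(f"{x:03b} + {y:03b}: {count_carries(x, y)} carries")
```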

Facial Expressions Analysis Under Occlusions Based on Specificities of Facial Motion Propagation

Title Facial Expressions Analysis Under Occlusions Based on Specificities of Facial Motion Propagation
Authors Delphine Poux, Benjamin Allaert, Jose Mennesson, Nacim Ihaddadene, Ioan Marius Bilasco, Chaabane Djeraba
Abstract Although much progress has been made in the facial expression analysis field, facial occlusions are still challenging. The main innovation brought by this contribution consists in exploiting the specificities of facial movement propagation for recognizing expressions in the presence of significant occlusions. The movement induced by an expression extends beyond the movement epicenter. Thus, the movement occurring in an occluded region propagates towards neighboring visible regions. In the presence of occlusions, per expression, we compute the importance of each unoccluded facial region and we construct adapted facial frameworks that boost the performance of the per-expression binary classifiers. The output of each expression-dependent binary classifier is then aggregated and fed into a fusion process that aims at constructing, per occlusion, a unique model that recognizes all the facial expressions considered. The evaluations highlight the robustness of this approach in the presence of significant facial occlusions.
Tasks
Published 2019-04-30
URL http://arxiv.org/abs/1904.13154v1
PDF http://arxiv.org/pdf/1904.13154v1.pdf
PWC https://paperswithcode.com/paper/facial-expressions-analysis-under-occlusions
Repo
Framework

Image Inpainting by Adaptive Fusion of Variable Spline Interpolations

Title Image Inpainting by Adaptive Fusion of Variable Spline Interpolations
Authors Zahra Nabizadeh, Ghazale Ghorbanzade, Nader Karimi, Shadrokh Samavi
Abstract There are many methods for image enhancement. Image inpainting is one of them, which can be used in the reconstruction and restoration of scratched images or in editing images by adding or removing objects. Depending on the application, different algorithmic and learning methods have been proposed. In this paper, the focus is on applications that enhance old and historical scratched images. For this purpose, we propose an adaptive spline interpolation. In this method, a different number of neighbors in four directions is considered for each pixel in the lost block. In previous methods, predicting the lost pixels that lie on edges is the problem. To address this problem, we consider horizontal and vertical edge information. If the pixel is located on an edge, then we use the predicted value in that direction. In other situations, irrelevant predicted values are omitted, and the average of the remaining values is used as the value of the missing pixel. The method is evaluated using PSNR and SSIM metrics on the Kodak dataset. The results show improvement in PSNR and SSIM compared to similar procedures. Also, the run time of the proposed method is lower than that of the others.
Tasks Image Enhancement, Image Inpainting
Published 2019-11-03
URL https://arxiv.org/abs/1911.00825v1
PDF https://arxiv.org/pdf/1911.00825v1.pdf
PWC https://paperswithcode.com/paper/image-inpainting-by-adaptive-fusion-of
Repo
Framework
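
The abstract above evaluates inpainting quality with PSNR. For readers unfamiliar with the metric, here is a minimal sketch of the usual definition, PSNR = 10 * log10(MAX^2 / MSE), applied to flat lists of pixel intensities. The function name `psnr`, the 8-bit peak value of 255, and the toy pixel values are assumptions of this example; it is not the authors' evaluation code.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-pixel "images": every restored pixel is off by 10 intensity levels.
original = [100, 100, 100, 100]
inpainted = [110, 90, 110, 90]
print(f"PSNR: {psnr(original, inpainted):.2f} dB")
```

Higher PSNR means the restored pixels are closer to the reference; SSIM, the other metric named in the abstract, additionally compares local structure and is not covered here.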

On Sample Complexity of Projection-Free Primal-Dual Methods for Learning Mixture Policies in Markov Decision Processes

Title On Sample Complexity of Projection-Free Primal-Dual Methods for Learning Mixture Policies in Markov Decision Processes
Authors Masoud Badiei Khuzani, Varun Vasudevan, Hongyi Ren, Lei Xing
Abstract We study the problem of learning a policy for an infinite-horizon, discounted cost, Markov decision process (MDP) with a large number of states. We compute the actions of a policy that is nearly as good as a policy chosen by a suitable oracle from a given mixture policy class characterized by the convex hull of a set of known base policies. To learn the coefficients of the mixture model, we recast the problem as an approximate linear programming (ALP) formulation for MDPs, where the feature vectors correspond to the occupation measures of the base policies defined on the state-action space. We then propose a projection-free stochastic primal-dual method with the Bregman divergence to solve the characterized ALP. Furthermore, we analyze the probably approximately correct (PAC) sample complexity of the proposed stochastic algorithm, namely the number of queries required to achieve a near-optimal objective value. We also propose a modification of our proposed algorithm with the polytope constraint sampling for the smoothed ALP, where the restriction to lower bounding approximations is relaxed. In addition, we apply the proposed algorithms to a queuing problem, and compare their performance with a penalty function algorithm. The numerical results illustrate that the primal-dual method achieves better efficiency and lower variance across different trials compared to the penalty function method.
Tasks
Published 2019-03-15
URL https://arxiv.org/abs/1903.06727v3
PDF https://arxiv.org/pdf/1903.06727v3.pdf
PWC https://paperswithcode.com/paper/on-sample-complexity-of-projection-free
Repo
Framework

Variational Multi-Phase Segmentation using High-Dimensional Local Features

Title Variational Multi-Phase Segmentation using High-Dimensional Local Features
Authors Niklas Mevenkamp, Benjamin Berkels
Abstract We propose a novel method for multi-phase segmentation of images based on high-dimensional local feature vectors. While the method was developed for the segmentation of extremely noisy crystal images based on localized Fourier transforms, the resulting framework is not tied to specific feature descriptors. For instance, using local spectral histograms as features, it allows for robust texture segmentation. The segmentation itself is based on the multi-phase Mumford-Shah model. Initializing the high-dimensional mean features directly is computationally too demanding and ill-posed in practice. This is resolved by projecting the features onto a low-dimensional space using principal component analysis. The resulting objective functional is minimized using a convexification and the Chambolle-Pock algorithm. Numerical results are presented, illustrating that the algorithm is very competitive in texture segmentation with state-of-the-art performance on the Prague benchmark and provides new possibilities in crystal segmentation, being robust to extreme noise and requiring no prior knowledge of the crystal structure.
Tasks
Published 2019-02-26
URL http://arxiv.org/abs/1902.09863v1
PDF http://arxiv.org/pdf/1902.09863v1.pdf
PWC https://paperswithcode.com/paper/variational-multi-phase-segmentation-using
Repo
Framework

Deep Reflection Prior

Title Deep Reflection Prior
Authors Qingnan Fan, Yingda Yin, Dongdong Chen, Yujie Wang, Angelica Aviles-Rivero, Ruoteng Li, Carola-Bibiane Schönlieb, Dani Lischinski, Baoquan Chen
Abstract Reflections are very common phenomena in our daily photography, which distract people’s attention from the scene behind the glass. The problem of removing reflection artifacts is important but challenging due to its ill-posed nature. Recent learning-based approaches have demonstrated a significant improvement in removing reflections. However, these methods are limited as they require a large number of synthetic reflection/clean image pairs for supervision, at the risk of overfitting in the synthetic image domain. In this paper, we propose a learning-based approach that captures the reflection statistical prior for single image reflection removal. Our algorithm is driven by optimizing the target with joint constraints enhanced between multiple input images during the training stage, but is able to eliminate reflections from only a single input at evaluation time. Our framework allows predicting both background and reflection via a one-branch deep neural network, which is implemented by a controllable latent code that indicates either the background or reflection output. We demonstrate superior performance over the state-of-the-art methods on a large range of real-world images. We further provide insightful analysis behind the learned latent code, which may inspire more future work.
Tasks
Published 2019-12-08
URL https://arxiv.org/abs/1912.03623v1
PDF https://arxiv.org/pdf/1912.03623v1.pdf
PWC https://paperswithcode.com/paper/deep-reflection-prior
Repo
Framework