October 17, 2019

Paper Group ANR 729

Stochastic Image Deformation in Frequency Domain and Parameter Estimation using Moment Evolutions

Title Stochastic Image Deformation in Frequency Domain and Parameter Estimation using Moment Evolutions
Authors Line Kühnel, Alexis Arnaudon, Tom Fletcher, Stefan Sommer
Abstract Modelling deformation of anatomical objects observed in medical images can help describe disease progression patterns and variations in anatomy across populations. We apply a stochastic generalisation of the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework to model differences in the evolution of anatomical objects detected in populations of image data. The computational challenges that are prevalent even in the deterministic LDDMM setting are handled by extending the FLASH LDDMM representation to the stochastic setting keeping a finite discretisation of the infinite dimensional space of image deformations. In this computationally efficient setting, we perform estimation to infer parameters for noise correlations and local variability in datasets of images. Fundamental for the optimisation procedure is using the finite dimensional Fourier representation to derive approximations of the evolution of moments for the stochastic warps. Particularly, the first moment allows us to infer deformation mean trajectories. The second moment encodes variation around the mean, and thus provides information on the noise correlation. We show on simulated datasets of 2D MR brain images that the estimation algorithm can successfully recover parameters of the stochastic model.
Tasks
Published 2018-12-13
URL http://arxiv.org/abs/1812.05537v1
PDF http://arxiv.org/pdf/1812.05537v1.pdf
PWC https://paperswithcode.com/paper/stochastic-image-deformation-in-frequency
Repo
Framework

A neural network trained to predict future video frames mimics critical properties of biological neuronal responses and perception

Title A neural network trained to predict future video frames mimics critical properties of biological neuronal responses and perception
Authors William Lotter, Gabriel Kreiman, David Cox
Abstract While deep neural networks take loose inspiration from neuroscience, it is an open question how seriously to take the analogies between artificial deep networks and biological neuronal systems. Interestingly, recent work has shown that deep convolutional neural networks (CNNs) trained on large-scale image recognition tasks can serve as strikingly good models for predicting the responses of neurons in visual cortex to visual stimuli, suggesting that analogies between artificial and biological neural networks may be more than superficial. However, while CNNs capture key properties of the average responses of cortical neurons, they fail to explain other properties of these neurons. For one, CNNs typically require large quantities of labeled input data for training. Our own brains, in contrast, rarely have access to this kind of supervision, so to the extent that representations are similar between CNNs and brains, this similarity must arise via different training paths. In addition, neurons in visual cortex produce complex time-varying responses even to static inputs, and they dynamically tune themselves to temporal regularities in the visual environment. We argue that these differences are clues to fundamental differences between the computations performed in the brain and in deep networks. To begin to close the gap, here we study the emergent properties of a previously-described recurrent generative network that is trained to predict future video frames in a self-supervised manner. Remarkably, the model is able to capture a wide variety of seemingly disparate phenomena observed in visual cortex, ranging from single unit response dynamics to complex perceptual motion illusions. These results suggest potentially deep connections between recurrent predictive neural network models and the brain, providing new leads that can enrich both fields.
Tasks Predict Future Video Frames
Published 2018-05-28
URL http://arxiv.org/abs/1805.10734v2
PDF http://arxiv.org/pdf/1805.10734v2.pdf
PWC https://paperswithcode.com/paper/a-neural-network-trained-to-predict-future
Repo
Framework

Parsimonious HMMs for Offline Handwritten Chinese Text Recognition

Title Parsimonious HMMs for Offline Handwritten Chinese Text Recognition
Authors Wenchao Wang, Jun Du, Zi-Rui Wang
Abstract Recently, hidden Markov models (HMMs) have achieved promising results for offline handwritten Chinese text recognition. However, because the vocabulary of Chinese characters is large and each character is modeled by a uniform, fixed number of hidden states, the memory and computation demands are high. In this study, to address this issue, we present parsimonious HMMs via state tying, which can fully utilize the similarities among different Chinese characters. A two-step algorithm with a data-driven question set is adopted to generate the tied-state pool using the likelihood measure. The proposed parsimonious HMMs, with both Gaussian mixture models (GMMs) and deep neural networks (DNNs) as the emission distributions, not only lead to a compact model but also improve the recognition accuracy via data sharing for the tied states and reduced confusion among state classes. Tested on the ICDAR-2013 competition database, in the best configured case, the new parsimonious DNN-HMM yields a relative character error rate (CER) reduction of 6.2%, a 25% reduction in model size, and a 60% reduction in decoding time over the conventional DNN-HMM. In the compact setting of an average 1-state HMM, our parsimonious DNN-HMM significantly outperforms the conventional DNN-HMM, with a relative CER reduction of 35.5%.
Tasks Handwritten Chinese Text Recognition
Published 2018-08-13
URL http://arxiv.org/abs/1808.04138v1
PDF http://arxiv.org/pdf/1808.04138v1.pdf
PWC https://paperswithcode.com/paper/parsimonious-hmms-for-offline-handwritten
Repo
Framework
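
The abstract above reports relative rather than absolute CER reductions. As a reminder of the usual definition (the symbols are notation assumed here, and the numbers in the example are hypothetical, not taken from the paper):

$$
\text{relative CER reduction} = \frac{\mathrm{CER}_{\text{baseline}} - \mathrm{CER}_{\text{new}}}{\mathrm{CER}_{\text{baseline}}}
$$

For instance, a hypothetical baseline CER of 10.00% that drops to 9.38% gives $(10.00 - 9.38)/10.00 = 0.062$, that is, a 6.2% relative reduction but only a 0.62 point absolute reduction.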

DeepTravel: a Neural Network Based Travel Time Estimation Model with Auxiliary Supervision

Title DeepTravel: a Neural Network Based Travel Time Estimation Model with Auxiliary Supervision
Authors Hanyuan Zhang, Hao Wu, Weiwei Sun, Baihua Zheng
Abstract Estimating the travel time of a path is of great importance to smart urban mobility. Existing approaches are either based on estimating the time cost of each road segment which are not able to capture many cross-segment complex factors, or designed heuristically in a non-learning-based way which fail to utilize the existing abundant temporal labels of the data, i.e., the time stamp of each trajectory point. In this paper, we leverage on new development of deep neural networks and propose a novel auxiliary supervision model, namely DeepTravel, that can automatically and effectively extract different features, as well as make full use of the temporal labels of the trajectory data. We have conducted comprehensive experiments on real datasets to demonstrate the out-performance of DeepTravel over existing approaches.
Tasks
Published 2018-02-06
URL http://arxiv.org/abs/1802.02147v1
PDF http://arxiv.org/pdf/1802.02147v1.pdf
PWC https://paperswithcode.com/paper/deeptravel-a-neural-network-based-travel-time
Repo
Framework

Minimax Estimation of Neural Net Distance

Title Minimax Estimation of Neural Net Distance
Authors Kaiyi Ji, Yingbin Liang
Abstract An important class of distance metrics proposed for training generative adversarial networks (GANs) is the integral probability metric (IPM), in which the neural net distance captures the practical GAN training via two neural networks. This paper investigates the minimax estimation problem of the neural net distance based on samples drawn from the distributions. We develop the first known minimax lower bound on the estimation error of the neural net distance, and an upper bound tighter than an existing bound on the estimator error for the empirical neural net distance. Our lower and upper bounds match not only in the order of the sample size but also in terms of the norm of the parameter matrices of neural networks, which justifies the empirical neural net distance as a good approximation of the true neural net distance for training GANs in practice.
Tasks
Published 2018-11-02
URL http://arxiv.org/abs/1811.01054v1
PDF http://arxiv.org/pdf/1811.01054v1.pdf
PWC https://paperswithcode.com/paper/minimax-estimation-of-neural-net-distance
Repo
Framework
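
For orientation, the standard definition of an integral probability metric over a function class $\mathcal{F}$, and its plug-in empirical counterpart, can be written as below. The symbols ($\mu$, $\nu$, $x_i$, $y_j$, $n$, $m$) are notation assumed here, and the paper's exact convention (for example, whether the absolute value is taken) may differ; the neural net distance corresponds to choosing $\mathcal{F}$ as a class of neural networks.

$$
d_{\mathcal{F}}(\mu, \nu) = \sup_{f \in \mathcal{F}} \left| \mathbb{E}_{x \sim \mu} f(x) - \mathbb{E}_{y \sim \nu} f(y) \right|,
\qquad
\hat{d}_{\mathcal{F}} = \sup_{f \in \mathcal{F}} \left| \frac{1}{n} \sum_{i=1}^{n} f(x_i) - \frac{1}{m} \sum_{j=1}^{m} f(y_j) \right|
$$

The estimation problem described in the abstract asks how close $\hat{d}_{\mathcal{F}}$, computed from $n$ and $m$ samples, is to $d_{\mathcal{F}}(\mu, \nu)$.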

Real-Time Anomaly Detection With HMOF Feature

Title Real-Time Anomaly Detection With HMOF Feature
Authors Huihui Zhu, Bin Liu, Guojun Yin, Yan Lu, Weihai Li, Nenghai Yu
Abstract Anomaly detection is a challenging problem in intelligent video surveillance. Most existing methods are computationally expensive and cannot satisfy real-time requirements. In this paper, we propose a real-time anomaly detection framework with low computational complexity and high efficiency. A new feature, named Histogram of Magnitude Optical Flow (HMOF), is proposed to capture the motion of video patches. Compared with existing feature descriptors, HMOF is more sensitive to motion magnitude and more efficient at distinguishing anomalous information. The HMOF features are computed for foreground patches and are reconstructed by an auto-encoder for better clustering. Then, we use Gaussian Mixture Model (GMM) classifiers to distinguish anomalies from normal activities in videos. Experimental results show that our framework outperforms state-of-the-art methods and can reliably detect anomalies in real time.
Tasks Anomaly Detection, Optical Flow Estimation
Published 2018-12-12
URL http://arxiv.org/abs/1812.04980v1
PDF http://arxiv.org/pdf/1812.04980v1.pdf
PWC https://paperswithcode.com/paper/real-time-anomaly-detection-with-hmof-feature
Repo
Framework
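
To make the idea of a magnitude-based flow histogram concrete, here is a minimal sketch, assuming a dense optical-flow field for one patch is already available as an `(H, W, 2)` array. The function name `magnitude_histogram`, the bin count, and the clipping threshold `max_mag` are illustrative assumptions; the abstract does not specify the exact HMOF binning, so this should not be read as the authors' descriptor.

```python
import numpy as np

def magnitude_histogram(flow, n_bins=8, max_mag=8.0):
    """Normalized histogram of optical-flow magnitudes for one patch.

    flow: array of shape (H, W, 2) holding per-pixel (dx, dy) displacements.
    n_bins and max_mag are illustrative choices, not values from the paper.
    """
    flow = np.asarray(flow, dtype=float)
    # Per-pixel motion magnitude; direction is deliberately ignored.
    mag = np.hypot(flow[..., 0], flow[..., 1])
    # Clip so that very fast motion lands in the last bin.
    mag = np.clip(mag, 0.0, max_mag)
    hist, _ = np.histogram(mag, bins=n_bins, range=(0.0, max_mag))
    return hist / hist.sum()

# A patch where every pixel moves by (3, 4), i.e. magnitude 5.
patch_flow = np.tile(np.array([3.0, 4.0]), (4, 4, 1))
print(magnitude_histogram(patch_flow))
```

In a pipeline like the one the abstract describes, such per-patch vectors would then be passed to the auto-encoder and the GMM classifiers.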

A Deep Generative Model of Vowel Formant Typology

Title A Deep Generative Model of Vowel Formant Typology
Authors Ryan Cotterell, Jason Eisner
Abstract What makes some types of languages more probable than others? For instance, we know that almost all spoken languages contain the vowel phoneme /i/; why should that be? The field of linguistic typology seeks to answer these questions and, thereby, divine the mechanisms that underlie human language. In our work, we tackle the problem of vowel system typology, i.e., we propose a generative probability model of which vowels a language contains. In contrast to previous work, we work directly with the acoustic information – the first two formant values – rather than modeling discrete sets of phonemic symbols (IPA). We develop a novel generative probability model and report results based on a corpus of 233 languages.
Tasks
Published 2018-07-08
URL http://arxiv.org/abs/1807.02745v1
PDF http://arxiv.org/pdf/1807.02745v1.pdf
PWC https://paperswithcode.com/paper/a-deep-generative-model-of-vowel-formant
Repo
Framework

Atrial fibrosis quantification based on maximum likelihood estimator of multivariate images

Title Atrial fibrosis quantification based on maximum likelihood estimator of multivariate images
Authors Fuping Wu, Lei Li, Guang Yang, Tom Wong, Raad Mohiaddin, David Firmin, Jennifer Keegan, Lingchao Xu, Xiahai Zhuang
Abstract We present a fully automated method for segmentation and quantification of left atrial (LA) fibrosis and scars that combines two cardiac MRIs: the target late gadolinium-enhanced (LGE) image and an anatomical MRI from the same acquisition session. We formulate the joint distribution of images using a multivariate mixture model (MvMM), and employ the maximum likelihood estimator (MLE) for texture classification of the images simultaneously. The MvMM can also embed transformations assigned to the images to correct the misregistration. The iterated conditional mode algorithm is adopted for optimization. This method first extracts the anatomical shape of the LA, and then estimates a prior probability map. It projects the resulting segmentation onto the LA surface, for quantification and analysis of scarring. We applied the proposed method to 36 clinical data sets and obtained promising results (Accuracy: $0.809 \pm 0.150$, Dice: $0.556 \pm 0.187$). We compared the method with conventional algorithms and showed evidently and statistically better performance ($p<0.03$).
Tasks Texture Classification
Published 2018-10-22
URL http://arxiv.org/abs/1810.09075v1
PDF http://arxiv.org/pdf/1810.09075v1.pdf
PWC https://paperswithcode.com/paper/atrial-fibrosis-quantification-based-on
Repo
Framework
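
The Dice score quoted in the abstract above measures overlap between a predicted and a reference segmentation. A minimal sketch of the standard definition follows; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details from the paper.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / denom

# Two predicted scar pixels, one reference pixel, one in common.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because Dice ignores true negatives, it can be much lower than accuracy when the target region is small, which is consistent with the gap between the two figures reported above.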

A Cost-Sensitive Deep Belief Network for Imbalanced Classification

Title A Cost-Sensitive Deep Belief Network for Imbalanced Classification
Authors Chong Zhang, Kay Chen Tan, Haizhou Li, Geok Soon Hong
Abstract Imbalanced data with a skewed class distribution are common in many real-world applications. Deep Belief Network (DBN) is a machine learning technique that is effective in classification tasks. However, conventional DBN does not work well for imbalanced data classification because it assumes equal costs for each class. To deal with this problem, cost-sensitive approaches assign different misclassification costs to different classes without disrupting the true data sample distributions. However, due to a lack of prior knowledge, the misclassification costs are usually unknown and hard to choose in practice. Moreover, how cost-sensitive learning could improve DBN performance on imbalanced data problems has not been well studied. This paper proposes an evolutionary cost-sensitive deep belief network (ECS-DBN) for imbalanced classification. ECS-DBN uses adaptive differential evolution to optimize the misclassification costs based on training data, which provides an effective way to incorporate the evaluation measure (i.e., G-mean) into the objective function. We first optimize the misclassification costs, then apply them to the deep belief network. Adaptive differential evolution is used as the optimization algorithm; it automatically updates its parameters without the need for prior domain knowledge. The experiments show that the proposed approach consistently outperforms the state of the art on both benchmark datasets and a real-world dataset for fault diagnosis in tool condition monitoring.
Tasks
Published 2018-04-28
URL http://arxiv.org/abs/1804.10801v2
PDF http://arxiv.org/pdf/1804.10801v2.pdf
PWC https://paperswithcode.com/paper/a-cost-sensitive-deep-belief-network-for
Repo
Framework
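
The G-mean named in the abstract above is commonly defined as the geometric mean of the per-class recalls (for two classes, the square root of sensitivity times specificity). The sketch below uses that common definition; the function name `g_mean` is illustrative, and the paper may compute the measure differently in detail.

```python
import math
from collections import defaultdict

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    hits = defaultdict(int)    # correctly predicted samples per class
    counts = defaultdict(int)  # total samples per class
    for t, p in zip(y_true, y_pred):
        counts[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / counts[c] for c in counts]
    return math.prod(recalls) ** (1.0 / len(recalls))

# Always predicting the majority class scores 75% accuracy but G-mean 0.
print(g_mean([0, 0, 0, 1], [0, 0, 0, 0]))
```

This is why G-mean is a natural objective for imbalanced problems: a classifier cannot score well by ignoring the minority class.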

Towards a Theoretical Understanding of Hashing-Based Neural Nets

Title Towards a Theoretical Understanding of Hashing-Based Neural Nets
Authors Yibo Lin, Zhao Song, Lin F. Yang
Abstract Parameter reduction has been an important topic in deep learning due to the ever-increasing size of deep neural network models and the need to train and run them on resource limited machines. Despite many efforts in this area, there were no rigorous theoretical guarantees on why existing neural net compression methods should work. In this paper, we provide provable guarantees on some hashing-based parameter reduction methods in neural nets. First, we introduce a neural net compression scheme based on random linear sketching (which is usually implemented efficiently via hashing), and show that the sketched (smaller) network is able to approximate the original network on all input data coming from any smooth and well-conditioned low-dimensional manifold. The sketched network can also be trained directly via back-propagation. Next, we study the previously proposed HashedNets architecture and show that the optimization landscape of one-hidden-layer HashedNets has a local strong convexity property similar to a normal fully connected neural network. We complement our theoretical results with empirical verifications.
Tasks
Published 2018-12-26
URL http://arxiv.org/abs/1812.10244v2
PDF http://arxiv.org/pdf/1812.10244v2.pdf
PWC https://paperswithcode.com/paper/towards-a-theoretical-understanding-of
Repo
Framework
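
The weight sharing behind hashing-based compression can be illustrated with a small sketch: every entry of a large "virtual" weight matrix is looked up in a much smaller shared parameter vector. Here a seeded random index table and a random sign table stand in for the hash functions; the name `virtual_weights` and this particular construction are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def virtual_weights(shared, n_in, n_out, seed=0):
    """Expand a small parameter vector into an (n_in, n_out) weight matrix.

    Each matrix entry (i, j) is mapped to one bucket of `shared` and
    multiplied by a pseudo-random sign. The seeded generator is a stand-in
    for a fixed hash function, so the mapping is reproducible.
    """
    shared = np.asarray(shared, dtype=float)
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, shared.size, size=(n_in, n_out))
    sign = rng.choice([-1.0, 1.0], size=(n_in, n_out))
    return sign * shared[bucket]

shared = np.array([0.5, -1.0, 2.0])       # only 3 trainable parameters
W = virtual_weights(shared, n_in=6, n_out=5)
x = np.ones(6)
print((x @ W).shape)                        # a 6 -> 5 layer backed by 3 values
```

During training, the gradient of each shared parameter would be the signed sum of the gradients of all matrix entries mapped to it, which is what allows such a network to be trained directly with back-propagation.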

Area Attention

Title Area Attention
Authors Yang Li, Lukasz Kaiser, Samy Bengio, Si Si
Abstract Existing attention mechanisms are trained to attend to individual items in a collection (the memory) with a predefined, fixed granularity, e.g., a word token or an image grid. We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences. Importantly, the shape and the size of an area are dynamically determined via learning, which enables a model to attend to information with varying granularity. Area attention can easily work with existing model architectures such as multi-head attention for simultaneously attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation (both character and token-level) and image captioning, and improve upon strong (state-of-the-art) baselines in all the cases. These improvements are obtainable with a basic form of area attention that is parameter free.
Tasks Image Captioning, Machine Translation
Published 2018-10-23
URL https://arxiv.org/abs/1810.10126v6
PDF https://arxiv.org/pdf/1810.10126v6.pdf
PWC https://paperswithcode.com/paper/area-attention
Repo
Framework
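
A rough sketch of a parameter-free form of area attention over a 1D memory is shown below. It assumes that an area's key is the mean of its item keys and its value is the sum of its item values, and that areas are all contiguous spans up to a maximum width; these choices, along with the name `area_attention_1d`, are assumptions not stated in the abstract.

```python
import numpy as np

def area_attention_1d(query, keys, values, max_width=2):
    """Single-query attention over all contiguous areas of a 1D memory.

    keys: (n, d_k), values: (n, d_v), query: (d_k,).
    Area key = mean of member keys; area value = sum of member values
    (assumed parameter-free form).
    """
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)
    query = np.asarray(query, dtype=float)
    n = keys.shape[0]
    area_keys, area_values = [], []
    for width in range(1, max_width + 1):
        for start in range(n - width + 1):
            area_keys.append(keys[start:start + width].mean(axis=0))
            area_values.append(values[start:start + width].sum(axis=0))
    area_keys = np.stack(area_keys)
    area_values = np.stack(area_values)
    scores = area_keys @ query
    scores = scores - scores.max()          # numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ area_values, weights

rng = np.random.default_rng(0)
K, V, q = rng.normal(size=(4, 3)), rng.normal(size=(4, 2)), rng.normal(size=3)
out, w = area_attention_1d(q, K, V, max_width=2)
print(out.shape, w.shape)                   # 4 single items + 3 pairs = 7 areas
```

With `max_width=1` the sketch reduces to ordinary softmax attention over individual items, which is one way to see area attention as a generalization of it.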

Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation

Title Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation
Authors Peyman Passban, Qun Liu, Andy Way
Abstract Recently, neural machine translation (NMT) has emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably in the presence of morphologically rich languages (MRLs). Neural engines usually fail to tackle the large vocabulary and high out-of-vocabulary (OOV) word rate of MRLs. Therefore, it is not suitable to exploit existing word-based models to translate this set of languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level and boosts the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from a target vocabulary, the table sends auxiliary signals from the most relevant affixes in order to enrich the decoder’s current state and constrain it to provide better predictions. We evaluated our model to translate English into German, Russian, and Turkish as three MRLs and observed significant improvements.
Tasks Machine Translation
Published 2018-04-17
URL http://arxiv.org/abs/1804.06506v1
PDF http://arxiv.org/pdf/1804.06506v1.pdf
PWC https://paperswithcode.com/paper/improving-character-based-decoding-using
Repo
Framework

Machine Translation : From Statistical to modern Deep-learning practices

Title Machine Translation : From Statistical to modern Deep-learning practices
Authors Siddhant Srivastava, Anupam Shukla, Ritu Tiwari
Abstract Machine translation (MT) is an area of study in natural language processing that deals with the automatic translation of human language, from one language to another, by computer. With a rich research history spanning nearly three decades, machine translation is one of the most sought-after areas of research in the linguistics and computational communities. In this paper, we investigate the deep learning models that have achieved substantial progress in recent years and are becoming the prominent method in MT. We discuss the two main deep-learning-based machine translation approaches: one at the component or domain level, which leverages deep learning models to enhance the efficacy of Statistical Machine Translation (SMT), and end-to-end deep learning models, which use neural networks to find correspondences between the source and target languages using the encoder-decoder architecture. We conclude by providing a timeline of the major research problems solved by researchers and a comprehensive overview of present areas of research in Neural Machine Translation.
Tasks Machine Translation
Published 2018-12-11
URL http://arxiv.org/abs/1812.04238v1
PDF http://arxiv.org/pdf/1812.04238v1.pdf
PWC https://paperswithcode.com/paper/machine-translation-from-statistical-to
Repo
Framework

Translating Pro-Drop Languages with Reconstruction Models

Title Translating Pro-Drop Languages with Reconstruction Models
Authors Longyue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu
Abstract Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations. To date, very little attention has been paid to the dropped pronoun (DP) problem within neural machine translation (NMT). In this work, we propose a novel reconstruction-based approach to alleviating DP translation problems for NMT models. Firstly, DPs within all source sentences are automatically annotated with parallel information extracted from the bilingual training corpus. Next, the annotated source sentence is reconstructed from hidden representations in the NMT model. With auxiliary training objectives, in terms of reconstruction scores, the parameters associated with the NMT model are guided to produce enhanced hidden representations that are encouraged as much as possible to embed annotated DP information. Experimental results on both Chinese-English and Japanese-English dialogue translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline, which is directly built on the training data annotated with DPs.
Tasks Machine Translation
Published 2018-01-10
URL http://arxiv.org/abs/1801.03257v1
PDF http://arxiv.org/pdf/1801.03257v1.pdf
PWC https://paperswithcode.com/paper/translating-pro-drop-languages-with
Repo
Framework

Universal Dependency Parsing for Hindi-English Code-switching

Title Universal Dependency Parsing for Hindi-English Code-switching
Authors Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma
Abstract Code-switching is a phenomenon of mixing grammatical structures of two or more languages under varied social constraints. The code-switching data differ so radically from the benchmark corpora used in NLP community that the application of standard technologies to these data degrades their performance sharply. Unlike standard corpora, these data often need to go through additional processes such as language identification, normalization and/or back-transliteration for their efficient processing. In this paper, we investigate these indispensable processes and other problems associated with syntactic parsing of code-switching data and propose methods to mitigate their effects. In particular, we study dependency parsing of code-switching data of Hindi and English multilingual speakers from Twitter. We present a treebank of Hindi-English code-switching tweets under Universal Dependencies scheme and propose a neural stacking model for parsing that efficiently leverages part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks. We also present normalization and back-transliteration models with a decoding process tailored for code-switching data. Results show that our neural stacking parser is 1.5% LAS points better than the augmented parsing model and our decoding process improves results by 3.8% LAS points over the first-best normalization and/or back-transliteration.
Tasks Dependency Parsing, Language Identification, Transliteration
Published 2018-04-16
URL http://arxiv.org/abs/1804.05868v3
PDF http://arxiv.org/pdf/1804.05868v3.pdf
PWC https://paperswithcode.com/paper/universal-dependency-parsing-for-hindi
Repo
Framework
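
The LAS figures in the abstract above refer to the labeled attachment score: the fraction of tokens whose predicted head and dependency label both match the gold annotation. A minimal sketch of LAS, together with the unlabeled variant UAS, follows; the function name `attachment_scores` and the toy sentence are illustrative and not taken from the paper's treebank.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) for one or more sentences.

    gold, predicted: equal-length lists of (head_index, relation_label)
    tuples, one per token.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")
    # UAS counts tokens with the correct head; LAS also requires the label.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    las_hits = sum(tuple(g) == tuple(p) for g, p in zip(gold, predicted))
    return uas_hits / len(gold), las_hits / len(gold)

# Toy three-token sentence: all heads correct, one label wrong.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
print(attachment_scores(gold, pred))
```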