July 27, 2019

3049 words 15 mins read

Paper Group ANR 570


Optimizing Long Short-Term Memory Recurrent Neural Networks Using Ant Colony Optimization to Predict Turbine Engine Vibration

Title Optimizing Long Short-Term Memory Recurrent Neural Networks Using Ant Colony Optimization to Predict Turbine Engine Vibration
Authors AbdElRahman ElSaid, Travis Desell, Fatima El Jamiy, James Higgins, Brandon Wild
Abstract This article expands on research that has been done to develop a recurrent neural network (RNN) capable of predicting aircraft engine vibrations using long short-term memory (LSTM) neurons. LSTM RNNs can provide a more generalizable and robust method for prediction over analytical calculations of engine vibration, as analytical calculations must be solved iteratively based on specific empirical engine parameters, making this approach ungeneralizable across multiple engines. In initial work, multiple LSTM RNN architectures were proposed, evaluated and compared. This research improves the performance of the most effective LSTM network design proposed in the previous work by using a promising neuroevolution method based on ant colony optimization (ACO) to develop and enhance the LSTM cell structure of the network. A parallelized version of the ACO neuroevolution algorithm has been developed and the evolved LSTM RNNs were compared to the previously used fixed topology. The evolved networks were trained on a large database of flight data records obtained from an airline containing flights that suffered from excessive vibration. Results were obtained using MPI (Message Passing Interface) on a high performance computing (HPC) cluster, evolving 1000 different LSTM cell structures using 168 cores over 4 days. The new evolved LSTM cells showed an improvement of 1.35%, reducing prediction error from 5.51% to 4.17% when predicting excessive engine vibrations 10 seconds in the future, while at the same time dramatically reducing the number of weights from 21,170 to 11,810.
Tasks
Published 2017-10-10
URL http://arxiv.org/abs/1710.03753v1
PDF http://arxiv.org/pdf/1710.03753v1.pdf
PWC https://paperswithcode.com/paper/optimizing-long-short-term-memory-recurrent
Repo
Framework
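
The ACO-driven neuroevolution described above can be pictured as maintaining pheromone values over candidate connections inside an LSTM cell, sampling candidate cell structures ("ants"), and reinforcing the connections used by the best-performing candidates. The sketch below is a minimal, hedged illustration of that loop; the connection count, the evaporation rate, and the `evaluate` fitness function are placeholders, not the authors' parallel MPI implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_connections = 40     # candidate connections inside the LSTM cell (assumed, not the paper's count)
n_ants = 20            # candidate cell structures sampled per generation
rho = 0.1              # pheromone evaporation rate
pheromone = np.full(n_connections, 0.5)   # per-connection inclusion pheromone in [0, 1]

def evaluate(mask: np.ndarray) -> float:
    """Placeholder fitness: in the real system this would train the masked LSTM RNN on flight
    data and return its validation error; here a fake target keeps the sketch runnable."""
    target = np.zeros(n_connections)
    target[::3] = 1.0
    return float(np.abs(mask - target).sum()) + rng.normal(scale=0.1)

for generation in range(50):
    include_prob = np.clip(pheromone, 0.05, 0.95)                 # keep some exploration
    ants = [(rng.random(n_connections) < include_prob).astype(float) for _ in range(n_ants)]
    errors = [evaluate(a) for a in ants]
    best = ants[int(np.argmin(errors))]
    pheromone = (1.0 - rho) * pheromone + rho * best              # evaporate, then reinforce the best ant

print("most reinforced connections:", np.sort(np.argsort(pheromone)[-5:]))
```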

Item Recommendation with Evolving User Preferences and Experience

Title Item Recommendation with Evolving User Preferences and Experience
Authors Subhabrata Mukherjee, Hemank Lamba, Gerhard Weikum
Abstract Current recommender systems exploit user and item similarities by collaborative filtering. Some advanced methods also consider the temporal evolution of item ratings as a global background process. However, all prior methods disregard the individual evolution of a user’s experience level and how this is expressed in the user’s writing in a review community. In this paper, we model the joint evolution of user experience, interest in specific item facets, writing style, and rating behavior. This way we can generate individual recommendations that take into account the user’s maturity level (e.g., recommending art movies rather than blockbusters for a cinematography expert). As only item ratings and review texts are observables, we capture the user’s experience and interests in a latent model learned from her reviews, vocabulary and writing style. We develop a generative HMM-LDA model to trace user evolution, where the Hidden Markov Model (HMM) traces her latent experience progressing over time – with solely user reviews and ratings as observables over time. The facets of a user’s interest are drawn from a Latent Dirichlet Allocation (LDA) model derived from her reviews, as a function of her (again latent) experience level. In experiments with five real-world datasets, we show that our model improves the rating prediction over state-of-the-art baselines, by a substantial margin. We also show, in a use-case study, that our model performs well in the assessment of user experience levels.
Tasks Recommendation Systems
Published 2017-05-06
URL http://arxiv.org/abs/1705.02519v1
PDF http://arxiv.org/pdf/1705.02519v1.pdf
PWC https://paperswithcode.com/paper/item-recommendation-with-evolving-user
Repo
Framework
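
The experience-tracking part of the HMM-LDA model above rests on a hidden Markov chain over discrete experience levels, with the user's reviews and ratings as emissions. As a hedged illustration (not the authors' model), the sketch below runs the standard forward recursion for a small left-to-right experience HMM with made-up emission likelihoods.

```python
import numpy as np

n_levels = 3                       # latent experience levels (assumed)
# left-to-right transitions: experience can only stay the same or increase
A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.0, 0.0, 1.0]])
pi = np.array([1.0, 0.0, 0.0])     # every user starts at the lowest level

def forward(emission_lik: np.ndarray) -> np.ndarray:
    """emission_lik[t, k] = p(review_t, rating_t | level k); returns filtered level posteriors."""
    alpha = pi * emission_lik[0]
    alpha /= alpha.sum()
    out = [alpha]
    for lik in emission_lik[1:]:
        alpha = (alpha @ A) * lik
        alpha /= alpha.sum()
        out.append(alpha)
    return np.array(out)

# toy emission likelihoods for five consecutive reviews of one user
lik = np.array([[0.8, 0.15, 0.05],
                [0.6, 0.30, 0.10],
                [0.3, 0.50, 0.20],
                [0.2, 0.50, 0.30],
                [0.1, 0.40, 0.50]])
print(forward(lik))   # the posterior mass gradually shifts toward higher experience levels
```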

On weight initialization in deep neural networks

Title On weight initialization in deep neural networks
Authors Siddharth Krishna Kumar
Abstract A proper initialization of the weights in a neural network is critical to its convergence. Current insights into weight initialization come primarily from linear activation functions. In this paper, I develop a theory for weight initializations with non-linear activations. First, I derive a general weight initialization strategy for any neural network using activation functions differentiable at 0. Next, I derive the weight initialization strategy for the Rectified Linear Unit (RELU), and provide theoretical insights into why the Xavier initialization is a poor choice with RELU activations. My analysis provides a clear demonstration of the role of non-linearities in determining the proper weight initializations.
Tasks
Published 2017-04-28
URL http://arxiv.org/abs/1704.08863v2
PDF http://arxiv.org/pdf/1704.08863v2.pdf
PWC https://paperswithcode.com/paper/on-weight-initialization-in-deep-neural
Repo
Framework
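
The practical upshot of this line of analysis is the contrast between Xavier-style scaling (derived for activations that behave roughly linearly around 0) and the larger variance needed when ReLU zeroes half of the pre-activations. A minimal sketch of both rules, with assumed layer sizes, shows why Xavier-initialized ReLU networks lose signal with depth:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in: int, fan_out: int) -> np.ndarray:
    """Glorot/Xavier: variance 2 / (fan_in + fan_out), suited to activations ~linear at 0."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in: int, fan_out: int) -> np.ndarray:
    """He-style: variance 2 / fan_in, compensating for ReLU zeroing half the inputs."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# propagate a signal through 20 ReLU layers under each rule
x = rng.normal(size=(1000, 256))
for init in (xavier_init, he_init):
    h = x
    for _ in range(20):
        h = np.maximum(h @ init(256, 256), 0.0)
    print(init.__name__, "activation std after 20 layers:", h.std())
```

Under the He-style rule the activation scale stays roughly constant with depth, while under Xavier scaling it shrinks toward zero, which is the failure mode the paper analyzes.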

Video Captioning with Guidance of Multimodal Latent Topics

Title Video Captioning with Guidance of Multimodal Latent Topics
Authors Shizhe Chen, Jia Chen, Qin Jin, Alexander Hauptmann
Abstract The topic diversity of open-domain videos leads to various vocabularies and linguistic expressions in describing video contents, and therefore makes the video captioning task even more challenging. In this paper, we propose a unified caption framework, M&M TGM, which mines multimodal topics in an unsupervised fashion from data and guides the caption decoder with these topics. Compared to pre-defined topics, the mined multimodal topics are more semantically and visually coherent and can reflect the topic distribution of videos better. We formulate topic-aware caption generation as a multi-task learning problem, in which we add a parallel task, topic prediction, in addition to the caption task. For the topic prediction task, we use the mined topics as the teacher to train a student topic prediction model, which learns to predict the latent topics from multimodal contents of videos. The topic prediction provides intermediate supervision to the learning process. As for the caption task, we propose a novel topic-aware decoder to generate more accurate and detailed video descriptions with the guidance from latent topics. The entire learning procedure is end-to-end and it optimizes both tasks simultaneously. The results from extensive experiments conducted on the MSR-VTT and Youtube2Text datasets demonstrate the effectiveness of our proposed model. M&M TGM not only outperforms prior state-of-the-art methods on multiple evaluation metrics and on both benchmark datasets, but also achieves better generalization ability.
Tasks Multi-Task Learning, Video Captioning
Published 2017-08-31
URL http://arxiv.org/abs/1708.09667v2
PDF http://arxiv.org/pdf/1708.09667v2.pdf
PWC https://paperswithcode.com/paper/video-captioning-with-guidance-of-multimodal
Repo
Framework
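
The multi-task formulation above amounts to optimizing a caption loss and a topic-prediction loss over shared representations, with the mined topic distributions acting as the teacher. The sketch below is a hedged illustration of such a combined objective; the weighting `lam` and the exact form of both loss terms are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def caption_loss(logits, targets):
    """Token-level cross-entropy for the caption task (targets are word ids)."""
    log_probs = np.log(softmax(logits) + 1e-12)
    return float(-log_probs[np.arange(len(targets)), targets].mean())

def topic_loss(logits, teacher):
    """Cross-entropy of the student topic prediction against the mined teacher distribution."""
    return float(-(teacher * np.log(softmax(logits) + 1e-12)).sum(axis=1).mean())

def multitask_loss(caption_logits, caption_targets, topic_logits, teacher, lam=0.5):
    return caption_loss(caption_logits, caption_targets) + lam * topic_loss(topic_logits, teacher)

teacher = rng.dirichlet(np.ones(10), size=4)                 # mined multimodal topic distributions
print(multitask_loss(rng.normal(size=(4, 50)), rng.integers(0, 50, size=4),
                     rng.normal(size=(4, 10)), teacher))
```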

Feature uncertainty bounding schemes for large robust nonlinear SVM classifiers

Title Feature uncertainty bounding schemes for large robust nonlinear SVM classifiers
Authors Nicolas Couellan, Sophie Jan
Abstract We consider the binary classification problem when data are large and subject to unknown but bounded uncertainties. We address the problem by formulating the nonlinear support vector machine training problem with robust optimization. To do so, we analyze and propose two bounding schemes for uncertainties associated with random approximate features in low dimensional spaces. The proposed techniques are based on Random Fourier Features and the Nyström method. The resulting formulations can be solved with efficient stochastic approximation techniques such as stochastic (sub)-gradient, stochastic proximal gradient techniques or their variants.
Tasks
Published 2017-06-29
URL http://arxiv.org/abs/1706.09795v1
PDF http://arxiv.org/pdf/1706.09795v1.pdf
PWC https://paperswithcode.com/paper/feature-uncertainty-bounding-schemes-for
Repo
Framework
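
Random Fourier Features map inputs into a low-dimensional randomized feature space in which inner products approximate an RBF kernel, which is what makes the robust formulation tractable at scale. A minimal sketch of the mapping (the bounding schemes themselves are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_map(X: np.ndarray, n_features: int = 2000, gamma: float = 0.5) -> np.ndarray:
    """Random Fourier Features for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.normal(size=(5, 3))
Z = rff_map(X)
approx = Z @ Z.T
exact = np.exp(-0.5 * np.square(X[:, None, :] - X[None, :, :]).sum(-1))
print("max kernel approximation error:", np.abs(approx - exact).max())  # small for enough features
```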

An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection

Title An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection
Authors Chao Chen, Xiao Lin, Gabriel Terejanu
Abstract Long Short-Term Memory networks trained with gradient descent and back-propagation have achieved great success in various applications. However, point estimation of the network weights is prone to over-fitting and lacks important uncertainty information associated with the estimation. Moreover, exact Bayesian neural network methods are intractable and not applicable to real-world applications. In this study, we propose an approximate estimation of the weight uncertainty using the Ensemble Kalman Filter, which is easily scalable to a large number of weights. Furthermore, we optimize the covariance of the noise distribution in the ensemble update step using maximum likelihood estimation. To assess the proposed algorithm, we apply it to outlier detection in five real-world events retrieved from the Twitter platform.
Tasks Outlier Detection
Published 2017-12-23
URL https://arxiv.org/abs/1712.08773v2
PDF https://arxiv.org/pdf/1712.08773v2.pdf
PWC https://paperswithcode.com/paper/an-approximate-bayesian-long-short-term
Repo
Framework
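
At the core of the approach is the ensemble Kalman filter analysis step: an ensemble of weight vectors is nudged toward the observations using covariances estimated from the ensemble itself, which is what keeps the method scalable in the number of weights. The sketch below shows one generic stochastic EnKF update for a linear observation operator; it is an assumed, simplified stand-in, not the authors' LSTM-specific code.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(ensemble: np.ndarray, H: np.ndarray, y: np.ndarray, obs_noise_std: float) -> np.ndarray:
    """One EnKF analysis step.
    ensemble: (n_members, n_weights), H: (n_obs, n_weights), y: (n_obs,)."""
    n_members = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)                   # weight anomalies
    Yp = ensemble @ H.T                                    # predicted observations per member
    Y = Yp - Yp.mean(axis=0)                               # observation anomalies
    Pxy = X.T @ Y / (n_members - 1)
    Pyy = Y.T @ Y / (n_members - 1) + obs_noise_std**2 * np.eye(len(y))
    K = Pxy @ np.linalg.inv(Pyy)                           # Kalman gain from ensemble statistics
    perturbed = y + rng.normal(0.0, obs_noise_std, size=(n_members, len(y)))
    return ensemble + (perturbed - Yp) @ K.T

ensemble = rng.normal(size=(100, 8))                       # 100 samples of 8 "weights"
H = rng.normal(size=(3, 8))
y = rng.normal(size=3)
updated = enkf_update(ensemble, H, y, obs_noise_std=0.1)
print(updated.mean(axis=0))
```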

Acoustic Modeling Using a Shallow CNN-HTSVM Architecture

Title Acoustic Modeling Using a Shallow CNN-HTSVM Architecture
Authors Christopher Dane Shulby, Martha Dais Ferreira, Rodrigo F. de Mello, Sandra Maria Aluisio
Abstract High-accuracy speech recognition is especially challenging when large datasets are not available. It is possible to bridge this gap with careful, knowledge-driven parsing combined with the biologically inspired CNN and the learning guarantees of Vapnik-Chervonenkis (VC) theory. This work presents a Shallow-CNN-HTSVM (Hierarchical Tree Support Vector Machine classifier) architecture which combines a predefined knowledge-based set of rules with statistical machine learning techniques. Here we show that gross errors present even in state-of-the-art systems can be avoided and that an accurate acoustic model can be built in a hierarchical fashion. The CNN-HTSVM acoustic model outperforms traditional GMM-HMM models and the HTSVM structure outperforms an MLP multi-class classifier. More importantly, we isolate the performance of the acoustic model and provide results at both the frame and phoneme level, considering the true robustness of the model. We show that even with a small amount of data, accurate and robust recognition rates can be obtained.
Tasks Speech Recognition
Published 2017-06-27
URL http://arxiv.org/abs/1706.09055v1
PDF http://arxiv.org/pdf/1706.09055v1.pdf
PWC https://paperswithcode.com/paper/acoustic-modeling-using-a-shallow-cnn-htsvm
Repo
Framework
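
The hierarchical tree SVM idea amounts to routing each frame through a small tree of binary or few-class SVM nodes instead of one flat multi-class classifier. The sketch below is an assumed two-level illustration over made-up frame features (stand-ins for CNN embeddings); the tree structure and class grouping are purely illustrative, not the paper's phone hierarchy.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# toy frame features for four phone classes (placeholders for CNN frame embeddings)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 20)) for c in range(4)])
y = np.repeat(np.arange(4), 100)

# level 1: split the phone set into two broad groups; level 2: refine within each group
root = LinearSVC().fit(X, (y >= 2).astype(int))
left = LinearSVC().fit(X[y < 2], y[y < 2])
right = LinearSVC().fit(X[y >= 2], y[y >= 2])

def predict(x):
    branch = root.predict(x[None])[0]
    return (right if branch else left).predict(x[None])[0]

print(predict(X[0]), predict(X[350]))
```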

Machine Learning for Quantum Dynamics: Deep Learning of Excitation Energy Transfer Properties

Title Machine Learning for Quantum Dynamics: Deep Learning of Excitation Energy Transfer Properties
Authors Florian Häse, Christoph Kreisbeck, Alán Aspuru-Guzik
Abstract Understanding the relationship between the structure of light-harvesting systems and their excitation energy transfer properties is of fundamental importance in many applications, including the development of next-generation photovoltaics. Natural light harvesting in photosynthesis shows remarkable excitation energy transfer properties, which suggests that pigment-protein complexes could serve as blueprints for the design of nature-inspired devices. Mechanistic insights into energy transport dynamics can be gained by leveraging numerically involved propagation schemes such as the hierarchical equations of motion (HEOM). Solving these equations, however, is computationally costly due to the adverse scaling with the number of pigments. Therefore, virtual high-throughput screening, which has become a powerful tool in material discovery, is less readily applicable to the search for novel excitonic devices. We propose the use of artificial neural networks to bypass the computational limitations of established techniques for exploring the structure-dynamics relation in excitonic systems. Once trained, our neural networks reduce computational costs by several orders of magnitude. Our predicted transfer times and transfer efficiencies exhibit similar or even higher accuracies than frequently used approximate methods such as secular Redfield theory.
Tasks
Published 2017-07-20
URL http://arxiv.org/abs/1707.06338v1
PDF http://arxiv.org/pdf/1707.06338v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-quantum-dynamics-deep
Repo
Framework

A Flexible Approach to Automated RNN Architecture Generation

Title A Flexible Approach to Automated RNN Architecture Generation
Authors Martin Schrimpf, Stephen Merity, James Bradbury, Richard Socher
Abstract The process of designing neural architectures requires expert knowledge and extensive trial and error. While automated architecture search may simplify these requirements, the recurrent neural network (RNN) architectures generated by existing methods are limited in both flexibility and components. We propose a domain-specific language (DSL) for use in automated architecture search which can produce novel RNNs of arbitrary depth and width. The DSL is flexible enough to define standard architectures such as the Gated Recurrent Unit and Long Short Term Memory and allows the introduction of non-standard RNN components such as trigonometric curves and layer normalization. Using two different candidate generation techniques, random search with a ranking function and reinforcement learning, we explore the novel architectures produced by the RNN DSL for language modeling and machine translation domains. The resulting architectures do not follow human intuition yet perform well on their targeted tasks, suggesting the space of usable RNN architectures is far larger than previously assumed.
Tasks Language Modelling, Machine Translation, Neural Architecture Search
Published 2017-12-20
URL http://arxiv.org/abs/1712.07316v1
PDF http://arxiv.org/pdf/1712.07316v1.pdf
PWC https://paperswithcode.com/paper/a-flexible-approach-to-automated-rnn
Repo
Framework
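
The DSL idea can be made concrete by representing a candidate cell as an expression tree over a small set of operators and evaluating it on the current input and hidden state. The toy grammar below is an assumed illustration of the flavor of such a DSL, not the paper's actual operator set or search procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# each node is ("op", children...) or a leaf name; leaves are bound to tensors at evaluation time
OPS = {
    "add":  lambda a, b: a + b,
    "mul":  lambda a, b: a * b,
    "tanh": lambda a: np.tanh(a),
    "sig":  lambda a: sigmoid(a),
}

def evaluate(node, env):
    if isinstance(node, str):
        return env[node]
    op, *children = node
    return OPS[op](*(evaluate(c, env) for c in children))

# a toy candidate cell, h' = tanh(Wx) * sig(Ux + Vh) + h * sig(Vh), written as an expression tree
# over precomputed linear projections Wx, Ux, Vh and the previous hidden state h
cell = ("add",
        ("mul", ("tanh", "Wx"), ("sig", ("add", "Ux", "Vh"))),
        ("mul", "h", ("sig", "Vh")))

rng = np.random.default_rng(0)
env = {k: rng.normal(size=4) for k in ("Wx", "Ux", "Vh", "h")}
print(evaluate(cell, env))
```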

Deep Health Care Text Classification

Title Deep Health Care Text Classification
Authors Vinayakumar R, Barathi Ganesh HB, Anand Kumar M, Soman KP
Abstract Health-related social media mining is a valuable tool for the early recognition of diverse adverse medical conditions. Most existing methods are based on machine learning combined with knowledge-based learning. This working note presents Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) based embeddings for automatic health text classification in social media mining. For each task, two systems are built, and each classifies at the tweet level. The RNN and LSTM layers extract features, and a non-linear activation function at the last layer distinguishes tweets of different categories. The experiments were conducted on the 2nd Social Media Mining for Health Applications Shared Task at AMIA 2017. The experimental results are promising, and the proposed method is well suited to health text classification, primarily because it does not rely on any feature engineering mechanisms.
Tasks Feature Engineering, Text Classification
Published 2017-10-23
URL http://arxiv.org/abs/1710.08396v1
PDF http://arxiv.org/pdf/1710.08396v1.pdf
PWC https://paperswithcode.com/paper/deep-health-care-text-classification
Repo
Framework
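
The described pipeline is essentially a standard recurrent text classifier: embed tokens, run an LSTM over the tweet, and apply a softmax output layer. A hedged, generic sketch follows (assuming TensorFlow/Keras is available; the vocabulary size, sequence length, and number of classes are placeholders, not the shared-task values):

```python
import tensorflow as tf

VOCAB_SIZE, MAX_LEN, N_CLASSES = 20000, 40, 3   # placeholders, not the shared-task values

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),       # padded token ids
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
print(model.output_shape)
# model.fit(padded_token_ids, labels, epochs=5, validation_split=0.1)  # with real tweet tensors
```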

Curve Reconstruction via the Global Statistics of Natural Curves

Title Curve Reconstruction via the Global Statistics of Natural Curves
Authors Ehud Barnea, Ohad Ben-Shahar
Abstract Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Different approaches for solving that problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is both straightforward to compute and receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and oftentimes it is extensible. This, in turn, allows us to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but they also lead to important insights on the reconstruction problem, including an elegant explanation of why certain inducer configurations are more likely to yield consistent perceptual completions than others.
Tasks Image Inpainting
Published 2017-11-08
URL http://arxiv.org/abs/1711.03172v3
PDF http://arxiv.org/pdf/1711.03172v3.pdf
PWC https://paperswithcode.com/paper/curve-reconstruction-via-the-global
Repo
Framework
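
The reconstruction model above reduces to estimating the mean curve over all observed natural curves that share an inducer configuration, and the reported scale invariance is what lets curves at different sizes be pooled together. A toy sketch of that pooling-and-averaging step (the normalization convention is an assumption made for illustration):

```python
import numpy as np

def normalize(curve: np.ndarray) -> np.ndarray:
    """Translate the first inducer to the origin and rescale so the inducer gap has unit length."""
    curve = curve - curve[0]
    scale = np.linalg.norm(curve[-1])
    return curve / scale

def mean_curve(samples: list) -> np.ndarray:
    """Average normalized curve samples that share the same inducer configuration."""
    return np.mean([normalize(c) for c in samples], axis=0)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
# toy "natural curves" between two inducers, observed at different scales, with shape noise
samples = [np.column_stack([t, 0.3 * np.sin(np.pi * t) + 0.05 * rng.normal(size=t.size)]) * s
           for s in (1.0, 2.5, 7.0)]
print(mean_curve(samples)[:3])
```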

Spherical Paragraph Model

Title Spherical Paragraph Model
Authors Ruqing Zhang, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng
Abstract Representing texts as fixed-length vectors is central to many language processing tasks. Most traditional methods build text representations based on the simple Bag-of-Words (BoW) representation, which loses the rich semantic relations between words. Recent advances in natural language processing have shown that semantically meaningful representations of words can be efficiently acquired by distributed models, making it possible to build text representations based on a better foundation called the Bag-of-Word-Embedding (BoWE) representation. However, existing text representation methods using BoWE often lack sound probabilistic foundations or cannot well capture the semantic relatedness encoded in word vectors. To address these problems, we introduce the Spherical Paragraph Model (SPM), a probabilistic generative model based on BoWE, for text representation. SPM has good probabilistic interpretability and can fully leverage the rich semantics of words, the word co-occurrence information as well as the corpus-wide information to help the representation learning of texts. Experimental results on topical classification and sentiment analysis demonstrate that SPM achieves new state-of-the-art performance on several benchmark datasets.
Tasks Representation Learning, Sentiment Analysis
Published 2017-07-18
URL http://arxiv.org/abs/1707.05635v1
PDF http://arxiv.org/pdf/1707.05635v1.pdf
PWC https://paperswithcode.com/paper/spherical-paragraph-model
Repo
Framework
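
Because SPM works with bags of word embeddings as directions on the unit sphere, the natural building block is a direction-valued text representation: normalize each word vector, average, and renormalize. The minimal sketch below shows only that representation step, as an assumed illustration; the paper's full generative model (with its probabilistic machinery over such directions) is not reproduced here.

```python
import numpy as np

def spherical_text_embedding(word_vectors: np.ndarray) -> np.ndarray:
    """Project word embeddings onto the unit sphere, average, and renormalize to a unit direction."""
    unit = word_vectors / np.linalg.norm(word_vectors, axis=1, keepdims=True)
    mean = unit.mean(axis=0)
    return mean / np.linalg.norm(mean)

rng = np.random.default_rng(0)
doc = rng.normal(size=(12, 100))            # 12 word vectors of dimension 100 (placeholders)
v = spherical_text_embedding(doc)
print(v.shape, np.linalg.norm(v))           # (100,) 1.0
```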

Deep Stacked Networks with Residual Polishing for Image Inpainting

Title Deep Stacked Networks with Residual Polishing for Image Inpainting
Authors Ugur Demir, Gozde Unal
Abstract Deep neural networks have shown promising results in image inpainting even if the missing area is relatively large. However, most of the existing inpainting networks introduce undesired artifacts and noise to the repaired regions. To solve this problem, we present a novel framework which consists of two stacked convolutional neural networks that inpaint the image and remove the artifacts, respectively. The first network considers the global structure of the damaged image and coarsely fills the blank area. Then the second network modifies the repaired image to cancel the noise introduced by the first network. The proposed framework splits the problem into two distinct partitions that can be optimized separately; therefore, it can be applied to any inpainting algorithm by changing the first network. The second stage of our framework, which aims at polishing the inpainted images, can be treated as a denoising problem where a wide range of algorithms can be employed. Our results demonstrate that the proposed framework achieves significant improvement on both visual and quantitative evaluations.
Tasks Denoising, Image Inpainting
Published 2017-12-31
URL http://arxiv.org/abs/1801.00289v1
PDF http://arxiv.org/pdf/1801.00289v1.pdf
PWC https://paperswithcode.com/paper/deep-stacked-networks-with-residual-polishing
Repo
Framework
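
The key design choice above is the decomposition into a coarse inpainting stage and a residual "polishing" stage that can be swapped independently. The sketch below wires that pipeline together with trivial stand-ins (mean fill and a residual smoothing step) in place of the two CNNs, purely to illustrate the composition; none of it is the paper's network.

```python
import numpy as np

def coarse_fill(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Stand-in for the first network: fill the hole with the mean of the known pixels."""
    out = image.copy()
    out[mask] = image[~mask].mean()
    return out

def residual_polish(image: np.ndarray) -> np.ndarray:
    """Stand-in for the second network: predict a correction and add it back (residual polishing)."""
    blurred = (np.roll(image, 1, 0) + np.roll(image, -1, 0) +
               np.roll(image, 1, 1) + np.roll(image, -1, 1)) / 4.0
    residual = blurred - image
    return image + 0.5 * residual

def inpaint(image, mask):
    return residual_polish(coarse_fill(image, mask))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
hole = np.zeros_like(img, dtype=bool)
hole[10:20, 10:20] = True
print(inpaint(img, hole).shape)
```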

SafetyNet: Detecting and Rejecting Adversarial Examples Robustly

Title SafetyNet: Detecting and Rejecting Adversarial Examples Robustly
Authors Jiajun Lu, Theerasit Issaranon, David Forsyth
Abstract We describe a method to produce a network where current methods such as DeepFool have great difficulty producing adversarial samples. Our construction suggests some insights into how deep networks work. We provide a reasonable analysis showing that our construction is difficult to defeat, and show experimentally that our method is hard to defeat with both Type I and Type II attacks using several standard networks and datasets. The SafetyNet architecture is applied to an important and novel application, SceneProof, which can reliably detect whether an image is a picture of a real scene or not. SceneProof applies to images captured with depth maps (RGBD images) and checks whether a pair of image and depth map is consistent. It relies on the relative difficulty of producing naturalistic depth maps for images in post processing. We demonstrate that our SafetyNet is robust to adversarial examples built from currently known attacking approaches.
Tasks
Published 2017-04-01
URL http://arxiv.org/abs/1704.00103v2
PDF http://arxiv.org/pdf/1704.00103v2.pdf
PWC https://paperswithcode.com/paper/safetynet-detecting-and-rejecting-adversarial
Repo
Framework
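
The detect-and-reject idea can be pictured as training a separate detector on codes derived from a network's late-layer activations and refusing to classify inputs the detector flags as adversarial. The following sketch is an assumed, simplified illustration of that step on made-up activation vectors with an RBF-SVM detector; it is not the published SafetyNet architecture.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def activation_code(activations: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binarize late-layer ReLU activations into a discrete code."""
    return (activations > threshold).astype(float)

# made-up activations: "natural" and "adversarial" inputs with slightly different statistics
natural = rng.gamma(2.0, 0.40, size=(200, 64))
adversarial = rng.gamma(2.0, 0.55, size=(200, 64))
X = activation_code(np.vstack([natural, adversarial]))
y = np.array([0] * 200 + [1] * 200)

detector = SVC(kernel="rbf").fit(X, y)

def classify_or_reject(code: np.ndarray) -> str:
    return "reject (flagged adversarial)" if detector.predict(code[None])[0] == 1 else "accept"

print(classify_or_reject(activation_code(rng.gamma(2.0, 0.40, size=64))))
```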

An evolutionary strategy for DeltaE - E identification

Title An evolutionary strategy for DeltaE - E identification
Authors Katarzyna Schmidt, Oskar Wyszynski
Abstract In this article we present an automatic method for charge and mass identification of charged nuclear fragments produced in heavy ion collisions at intermediate energies. The algorithm combines a generative model of DeltaE - E relation and a Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES). The CMA-ES is a stochastic and derivative-free method employed to search parameter space of the model by means of a fitness function. The article describes details of the method along with results of an application on simulated labeled data.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08380v2
PDF http://arxiv.org/pdf/1705.08380v2.pdf
PWC https://paperswithcode.com/paper/an-evolutionary-strategy-for-deltae-e
Repo
Framework
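
The optimization loop pairs a parametric model of the DeltaE - E band for each fragment species with a derivative-free evolutionary search over the model parameters, driven by a fitness function on the labeled hits. The sketch below substitutes a plain (mu, lambda) evolution strategy for CMA-ES and a toy Bethe-like band model, just to show the shape of the fitness-driven fit; none of this is the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def band_model(E, params, Z=2, A=4):
    """Toy DeltaE(E) band: Bethe-like k * A * Z^2 / E plus an offset (assumed functional form)."""
    k, offset = params
    return k * A * Z**2 / E + offset

# simulated labeled hits from one isotope
E = rng.uniform(20.0, 200.0, size=300)
dE = band_model(E, (5.0, 1.0)) + rng.normal(0.0, 0.3, size=E.size)

def fitness(params):
    return float(np.mean((band_model(E, params) - dE) ** 2))

# simplified (mu, lambda) evolution strategy as a stand-in for CMA-ES
mean, sigma, mu, lam = np.array([1.0, 0.0]), 1.0, 5, 20
for _ in range(100):
    pop = mean + sigma * rng.normal(size=(lam, 2))
    pop = pop[np.argsort([fitness(p) for p in pop])]   # rank candidates by fitness
    mean = pop[:mu].mean(axis=0)                        # recombine the best mu candidates
    sigma *= 0.97                                       # shrink the search radius over time
print("fitted (k, offset):", mean)                      # close to the true (5.0, 1.0)
```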