January 30, 2020

3165 words 15 mins read

Paper Group ANR 441

Predicting Vehicle Behaviors Over An Extended Horizon Using Behavior Interaction Network. Using musical relationships between chord labels in automatic chord extraction tasks. Identifying Emotions from Walking using Affective and Deep Features. A Framework for Evaluating Snippet Generation for Dataset Search. Variational Coupling Revisited: Simpler …

Predicting Vehicle Behaviors Over An Extended Horizon Using Behavior Interaction Network

Title Predicting Vehicle Behaviors Over An Extended Horizon Using Behavior Interaction Network
Authors Wenchao Ding, Jing Chen, Shaojie Shen
Abstract Anticipating possible behaviors of traffic participants is an essential capability of autonomous vehicles. Many behavior detection and maneuver recognition methods only have a very limited prediction horizon that leaves inadequate time and space for planning. To avoid unsatisfactory reactive decisions, it is essential to account for long-term future rewards in planning, which requires extending the prediction horizon. In this paper, we uncover that clues to vehicle behaviors over an extended horizon can be found in vehicle interaction, which makes it possible to anticipate the likelihood of a certain behavior, even in the absence of any clear maneuver pattern. We adopt a recurrent neural network (RNN) for observation encoding, and based on that, we propose a novel vehicle behavior interaction network (VBIN) to capture the vehicle interaction from the hidden states and connection feature of each interaction pair. The output of our method is a probabilistic likelihood of multiple behavior classes, which matches the multimodal and uncertain nature of the distant future. A systematic comparison of our method against two state-of-the-art methods and another two baseline methods on a publicly available real highway dataset is provided, showing that our method has superior accuracy and advanced capability for interaction modeling.
Tasks Autonomous Vehicles
Published 2019-03-03
URL https://arxiv.org/abs/1903.00848v2
PDF https://arxiv.org/pdf/1903.00848v2.pdf
PWC https://paperswithcode.com/paper/predicting-vehicle-behaviors-over-an-extended
Repo
Framework
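
The abstract above names the main ingredients of VBIN: per-vehicle observation encoding with an RNN, pairwise interaction features built from the hidden states plus a connection feature, and a probabilistic output over behavior classes. A minimal PyTorch sketch of that pipeline is given below; the layer sizes, the content of the relative "connection" feature, the summation over neighbors, and the three behavior classes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class VBINSketch(nn.Module):
    """Illustrative behavior-interaction network (not the paper's exact model)."""

    def __init__(self, obs_dim=4, hidden=64, rel_dim=4, n_classes=3):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden, batch_first=True)    # observation encoding
        self.pair_mlp = nn.Sequential(                               # one interaction pair
            nn.Linear(2 * hidden + rel_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_classes)                     # behavior likelihoods

    def forward(self, ego_obs, other_obs, rel_feats):
        # ego_obs: (B, T, obs_dim); other_obs: (B, N, T, obs_dim); rel_feats: (B, N, rel_dim)
        B, N, T, D = other_obs.shape
        _, h_ego = self.encoder(ego_obs)                             # (1, B, hidden)
        h_ego = h_ego[-1]                                            # (B, hidden)
        _, h_oth = self.encoder(other_obs.reshape(B * N, T, D))
        h_oth = h_oth[-1].reshape(B, N, -1)                          # (B, N, hidden)
        pair_in = torch.cat(
            [h_ego.unsqueeze(1).expand(-1, N, -1), h_oth, rel_feats], dim=-1)
        interaction = self.pair_mlp(pair_in).sum(dim=1)              # aggregate over neighbors
        return self.head(interaction).softmax(dim=-1)                # multimodal behavior probs

model = VBINSketch()
probs = model(torch.randn(2, 10, 4), torch.randn(2, 5, 10, 4), torch.randn(2, 5, 4))
print(probs.shape)  # torch.Size([2, 3])
```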

Using musical relationships between chord labels in automatic chord extraction tasks

Title Using musical relationships between chord labels in automatic chord extraction tasks
Authors Tristan Carsault, Jérôme Nika, Philippe Esling
Abstract Recent research on Automatic Chord Extraction (ACE) has focused on the improvement of models based on machine learning. However, most models still fail to take into account the prior knowledge underlying the labeling alphabets (chord labels). Furthermore, recent works have shown that ACE performance is converging towards a glass ceiling. This prompts the need to focus on other aspects of the task, such as the introduction of musical knowledge in the representation, the improvement of the models towards more complex chord alphabets, and the development of more adapted evaluation methods. In this paper, we propose to exploit specific properties of and relationships between chord labels in order to improve the learning of statistical ACE models. Hence, we analyze the interdependence of the representations of chords and their associated distances, the precision of the chord alphabets, and the impact of reducing the alphabet before or after training of the model. Furthermore, we propose new training losses based on musical theory. We show that these improve the results of ACE systems based on Convolutional Neural Networks. By performing an in-depth analysis of our results, we uncover a set of related insights on ACE tasks based on statistical models, and also formalize the musical meaning of some classification errors.
Tasks
Published 2019-11-12
URL https://arxiv.org/abs/1911.04973v2
PDF https://arxiv.org/pdf/1911.04973v2.pdf
PWC https://paperswithcode.com/paper/using-musical-relationships-between-chord
Repo
Framework
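
The core idea above is to make the ACE training loss aware of relationships between chord labels rather than treating all confusions as equally wrong. As an illustration of that general idea (not the specific losses proposed in the paper), the sketch below penalizes a prediction by the expected musical distance to the target chord, given a user-supplied chord distance matrix; the matrix itself and the 25-class alphabet are assumptions.

```python
import torch

def distance_weighted_loss(logits, targets, chord_dist):
    """Expected chord distance under the predicted distribution.

    logits:     (B, C) raw scores over C chord classes
    targets:    (B,) integer target chord indices
    chord_dist: (C, C) musically motivated distances between chord labels
                (e.g. based on shared pitch classes); chord_dist[i, i] == 0
    """
    probs = logits.softmax(dim=-1)                 # (B, C)
    dist_rows = chord_dist[targets]                # (B, C) distances from each target chord
    return (probs * dist_rows).sum(dim=-1).mean()  # zero only when all mass sits on the target

# Toy usage with an assumed 25-chord alphabet (12 maj + 12 min + "no chord"):
C = 25
chord_dist = torch.rand(C, C)
chord_dist.fill_diagonal_(0.0)
loss = distance_weighted_loss(torch.randn(8, C), torch.randint(0, C, (8,)), chord_dist)
print(loss.item())
```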

Identifying Emotions from Walking using Affective and Deep Features

Title Identifying Emotions from Walking using Affective and Deep Features
Authors Tanmay Randhavane, Uttaran Bhattacharya, Kyra Kapsaskis, Kurt Gray, Aniket Bera, Dinesh Manocha
Abstract We present a new data-driven model and algorithm to identify the perceived emotions of individuals based on their walking styles. Given an RGB video of an individual walking, we extract his/her walking gait in the form of a series of 3D poses. Our goal is to exploit the gait features to classify the emotional state of the human into one of four emotions: happy, sad, angry, or neutral. Our perceived emotion recognition approach uses deep features learned via LSTM on labeled emotion datasets. Furthermore, we combine these features with affective features computed from gaits using posture and movement cues. These features are classified using a Random Forest Classifier. We show that our mapping between the combined feature space and the perceived emotional state provides 80.07% accuracy in identifying the perceived emotions. In addition to classifying discrete categories of emotions, our algorithm also predicts the values of perceived valence and arousal from gaits. We also present an EWalk (Emotion Walk) dataset that consists of videos of walking individuals with gaits and labeled emotions. To the best of our knowledge, this is the first gait-based model to identify perceived emotions from videos of walking individuals.
Tasks Emotion Recognition
Published 2019-06-14
URL https://arxiv.org/abs/1906.11884v4
PDF https://arxiv.org/pdf/1906.11884v4.pdf
PWC https://paperswithcode.com/paper/identifying-emotions-from-walking-using
Repo
Framework
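
The classification stage described above, deep LSTM features concatenated with hand-crafted affective features and fed to a Random Forest, is easy to sketch with scikit-learn. The feature dimensions and the random placeholder data below are assumptions for illustration only; in the paper the deep features come from an LSTM trained on labeled gait data and the affective features from posture and movement cues.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 200
deep_feats = rng.normal(size=(n_samples, 128))       # placeholder for LSTM gait features
affective_feats = rng.normal(size=(n_samples, 29))   # placeholder posture/movement cues
labels = rng.integers(0, 4, size=n_samples)          # happy, sad, angry, neutral

X = np.hstack([deep_feats, affective_feats])         # combined feature space
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```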

A Framework for Evaluating Snippet Generation for Dataset Search

Title A Framework for Evaluating Snippet Generation for Dataset Search
Authors Xiaxia Wang, Jinchi Chen, Shuxin Li, Gong Cheng, Jeff Z. Pan, Evgeny Kharlamov, Yuzhong Qu
Abstract Reusing existing datasets is of considerable significance to researchers and developers. Dataset search engines help a user find relevant datasets for reuse. They can present a snippet for each retrieved dataset to explain its relevance to the user’s data needs. This emerging problem of snippet generation for dataset search has not received much research attention. To provide a basis for future research, we introduce a framework for quantitatively evaluating the quality of a dataset snippet. The proposed metrics assess the extent to which a snippet matches the query intent and covers the main content of the dataset. To establish a baseline, we adapt four state-of-the-art methods from related fields to our problem, and perform an empirical evaluation based on real-world datasets and queries. We also conduct a user study to verify our findings. The results demonstrate the effectiveness of our evaluation framework, and suggest directions for future research.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01183v1
PDF https://arxiv.org/pdf/1907.01183v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-evaluating-snippet-generation
Repo
Framework
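
The framework evaluates how well a snippet matches the query intent and covers the dataset's main content. The paper defines its own metrics; the sketch below only conveys the flavor of such measures with two simple keyword-overlap scores, which is an assumption rather than the paper's formulation.

```python
def keyword_coverage(snippet_terms, reference_terms):
    """Fraction of reference terms that appear in the snippet (0 if no reference terms)."""
    snippet, reference = set(snippet_terms), set(reference_terms)
    return len(snippet & reference) / len(reference) if reference else 0.0

snippet = ["river", "discharge", "2019", "gauge"]
query = ["river", "discharge", "europe"]
dataset_keys = ["river", "gauge", "station", "discharge", "timestamp"]

print("query coverage:  ", keyword_coverage(snippet, query))         # matches query intent?
print("content coverage:", keyword_coverage(snippet, dataset_keys))  # covers main content?
```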

Variational Coupling Revisited: Simpler Models, Theoretical Connections, and Novel Applications

Title Variational Coupling Revisited: Simpler Models, Theoretical Connections, and Novel Applications
Authors Aaron Wewior, Joachim Weickert
Abstract Variational models with coupling terms are becoming increasingly popular in image analysis. They involve auxiliary variables, such that their energy minimisation splits into multiple fractional steps that can be solved more easily and efficiently. In our paper we show that coupling models offer a number of interesting properties that go far beyond their obvious numerical benefits. We demonstrate that discontinuity-preserving denoising can be achieved even with quadratic data and smoothness terms, provided that the coupling term involves the $L^1$ norm. We show that such an $L^1$ coupling term provides additional information as a powerful edge detector that has remained unexplored so far. While coupling models in the literature approximate higher order regularisation, we argue that already first order coupling models can be useful. As a specific example, we present a first order coupling model that outperforms classical TV regularisation. It also establishes a theoretical connection between TV regularisation and the Mumford-Shah segmentation approach. Unlike other Mumford-Shah algorithms, it is a strictly convex approximation, for which we can guarantee convergence of a split Bregman algorithm.
Tasks Denoising
Published 2019-12-12
URL https://arxiv.org/abs/1912.05888v1
PDF https://arxiv.org/pdf/1912.05888v1.pdf
PWC https://paperswithcode.com/paper/variational-coupling-revisited-simpler-models
Repo
Framework
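
To make the abstract's central claim concrete, a first order coupling energy of the kind described (quadratic data and smoothness terms, $L^1$ coupling between the image gradient and an auxiliary variable) could take a form like the following; this is an assumed illustrative energy, not necessarily the exact model of the paper.

```latex
E(u, v) \;=\; \int_{\Omega} \Big( (u - f)^2 \;+\; \alpha\, |v|^2 \;+\; \beta\, |\nabla u - v| \Big)\, dx
```

Here f is the noisy image, u the reconstruction, and v an auxiliary variable standing in for the gradient; the data and smoothness terms are both quadratic, and it is the $L^1$ coupling $|\nabla u - v|$ that permits discontinuities to survive, its residual being a natural candidate for the edge information the abstract refers to.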

ImmuNeCS: Neural Committee Search by an Artificial Immune System

Title ImmuNeCS: Neural Committee Search by an Artificial Immune System
Authors Luc Frachon, Wei Pang, George M. Coghill
Abstract Current Neural Architecture Search techniques can suffer from a few shortcomings, including high computational cost, excessive bias from the search space, conceptual complexity or uncertain empirical benefits over random search. In this paper, we present ImmuNeCS, an attempt at addressing these issues with a method that offers a simple, flexible, and efficient way of building deep learning models automatically, and we demonstrate its effectiveness in the context of convolutional neural networks. Instead of searching for the 1-best architecture for a given task, we focus on building a population of neural networks that are then ensembled into a neural network committee, an approach we dub ‘Neural Committee Search’. To ensure sufficient performance from the committee, our search algorithm is based on an artificial immune system that balances individual performance with population diversity. This allows us to stop the search when accuracy starts to plateau, and to bridge the performance gap through ensembling. In order to justify our method, we first verify that the chosen search space exhibits the locality property. To further improve efficiency, we also combine partial evaluation, weight inheritance, and progressive search. First, experiments are run to verify the validity of these techniques. Then, preliminary experimental results on two popular computer vision benchmarks show that our method consistently outperforms random search and yields promising results within reasonable GPU budgets. An additional experiment also shows that ImmuNeCS’s solutions transfer effectively to a more difficult task, where they achieve results comparable to a direct search on the new task. We believe these findings can open the way for new, accessible alternatives to traditional NAS.
Tasks Neural Architecture Search
Published 2019-11-18
URL https://arxiv.org/abs/1911.07729v2
PDF https://arxiv.org/pdf/1911.07729v2.pdf
PWC https://paperswithcode.com/paper/immunecs-neural-committee-search-by-an
Repo
Framework
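
The search procedure sketched in the abstract, a population of networks evolved by an artificial immune system that trades off individual accuracy against population diversity and is finally ensembled into a committee, can be caricatured in a few lines. Everything below, including the two-field architecture encoding, the mutation rule, the diversity bonus, and the placeholder fitness, is a simplified assumption rather than the actual ImmuNeCS algorithm.

```python
import random

def evaluate(arch):
    """Placeholder fitness: stands in for (partial) training plus validation accuracy."""
    return 1.0 - abs(arch["depth"] - 6) / 10 - abs(arch["width"] - 64) / 256

def diversity(arch, population):
    """Mean distance to the rest of the population (encourages a varied committee)."""
    return sum(abs(arch["depth"] - p["depth"]) + abs(arch["width"] - p["width"])
               for p in population) / max(len(population), 1)

def mutate(arch):
    return {"depth": max(1, arch["depth"] + random.choice([-1, 0, 1])),
            "width": max(8, arch["width"] + random.choice([-16, 0, 16]))}

random.seed(0)
population = [{"depth": random.randint(2, 10), "width": random.choice([16, 32, 64, 128])}
              for _ in range(12)]

for generation in range(20):
    # Score by accuracy plus a small diversity bonus, then clone and mutate the best.
    scored = sorted(population, key=lambda a: evaluate(a) + 0.01 * diversity(a, population),
                    reverse=True)
    clones = [mutate(a) for a in scored[:6] for _ in range(2)]
    population = scored[:6] + clones[:6]

committee = sorted(population, key=evaluate, reverse=True)[:5]   # ensemble the best members
print(committee)
```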

Melody Generation using an Interactive Evolutionary Algorithm

Title Melody Generation using an Interactive Evolutionary Algorithm
Authors Majid Farzaneh, Rahil Mahdian Toroghi
Abstract Music generation with the aid of computers has recently grabbed the attention of many scientists in the area of artificial intelligence. Deep learning techniques have advanced sequence production methods for this purpose. Yet, a challenging problem is how to evaluate music generated by a machine. In this paper, a methodology has been developed based upon an interactive evolutionary optimization method, in which the scoring of the generated melodies is primarily performed by human expertise during training. This music quality scoring is modeled using a Bi-LSTM recurrent neural network. Melodies generated by a genetic algorithm are then evaluated using this Bi-LSTM network. The results of this mechanism clearly show that the proposed method is able to create pleasurable melodies with desired styles and pieces. This method is also quite fast compared to state-of-the-art data-oriented evolutionary systems.
Tasks Music Generation
Published 2019-07-07
URL https://arxiv.org/abs/1907.04258v1
PDF https://arxiv.org/pdf/1907.04258v1.pdf
PWC https://paperswithcode.com/paper/melody-generation-using-an-interactive
Repo
Framework
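
The generation loop described above, a genetic algorithm whose fitness comes from a learned scorer (in the paper, a Bi-LSTM trained on human ratings), can be sketched as follows. The melody encoding, the operators, and the stand-in scoring function are assumptions for illustration; in practice `score_melody` would call the trained Bi-LSTM.

```python
import random

PITCHES = list(range(60, 73))          # one octave of MIDI pitches, C4..C5

def score_melody(melody):
    """Stand-in for the trained Bi-LSTM rating model: rewards small melodic steps."""
    steps = [abs(a - b) for a, b in zip(melody, melody[1:])]
    return -sum(steps) / len(steps)

def crossover(a, b):
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(melody, rate=0.1):
    return [random.choice(PITCHES) if random.random() < rate else note for note in melody]

random.seed(0)
population = [[random.choice(PITCHES) for _ in range(16)] for _ in range(30)]

for generation in range(50):
    population.sort(key=score_melody, reverse=True)
    parents = population[:10]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(20)]
    population = parents + children

population.sort(key=score_melody, reverse=True)
print("best melody:", population[0], "score:", round(score_melody(population[0]), 2))
```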

Deep Learning-based Denoising of Mammographic Images using Physics-driven Data Augmentation

Title Deep Learning-based Denoising of Mammographic Images using Physics-driven Data Augmentation
Authors Dominik Eckert, Sulaiman Vesal, Ludwig Ritschl, Steffen Kappler, Andreas Maier
Abstract Mammography uses low-energy X-rays to screen the human breast and is utilized by radiologists to detect breast cancer. Typically, radiologists require a mammogram with impeccable image quality for an accurate diagnosis. In this study, we propose a deep learning method based on Convolutional Neural Networks (CNNs) for mammogram denoising to improve the image quality. We first enhance the noise level and employ the Anscombe Transformation (AT) to transform Poisson noise into white Gaussian noise. With this data augmentation, a deep residual network is trained to learn the noise map of the noisy images. We show that the proposed method can remove not only simulated but also real noise. Furthermore, we compare our results with state-of-the-art denoising methods, such as BM3D and DnCNN. In this early investigation, we achieved qualitatively better mammogram denoising results.
Tasks Data Augmentation, Denoising
Published 2019-12-11
URL https://arxiv.org/abs/1912.05240v1
PDF https://arxiv.org/pdf/1912.05240v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-denoising-of-mammographic
Repo
Framework
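
The key preprocessing step mentioned above, the Anscombe transformation that converts approximately Poisson noise into roughly unit-variance Gaussian noise, has a standard closed form, shown below together with its simple algebraic inverse. How the noise level is enhanced and how the residual network is trained are not reproduced here.

```python
import numpy as np

def anscombe(x):
    """Variance-stabilizing transform: Poisson-distributed x -> approx. unit-variance Gaussian."""
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(y):
    """Direct algebraic inverse (an unbiased inverse is often preferred in practice)."""
    return (y / 2.0) ** 2 - 3.0 / 8.0

noisy = np.random.poisson(lam=20.0, size=(4, 4)).astype(float)
stabilized = anscombe(noisy)                    # feed this to a Gaussian denoiser / CNN
restored = inverse_anscombe(stabilized)         # back to the original intensity scale
print(np.allclose(noisy, restored))             # True: the transform pair is consistent
```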

Online Filter Clustering and Pruning for Efficient Convnets

Title Online Filter Clustering and Pruning for Efficient Convnets
Authors Zhengguang Zhou, Wengang Zhou, Richang Hong, Houqiang Li
Abstract Pruning filters is an effective method for accelerating deep neural networks (DNNs), but most existing approaches prune filters of a pre-trained network directly, which limits the achievable acceleration. Although each filter has its own effect in a DNN, if two filters are identical, one of them can be pruned safely. In this paper, we add an extra cluster loss term to the loss function that forces the filters in each cluster to become similar online. After training, we keep one filter in each cluster, prune the others, and fine-tune the pruned network to compensate for the loss. In particular, the clusters in every layer can be defined beforehand, which is effective for pruning DNNs within residual blocks. Extensive experiments on the CIFAR10 and CIFAR100 benchmarks demonstrate the competitive performance of our proposed filter pruning method.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11787v1
PDF https://arxiv.org/pdf/1905.11787v1.pdf
PWC https://paperswithcode.com/paper/online-filter-clustering-and-pruning-for
Repo
Framework
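
The extra cluster loss described above, which pulls the filters assigned to the same cluster toward each other during training, can be written down directly. The sketch below uses the squared distance of each filter to its cluster's mean filter; the exact loss form and how cluster assignments are chosen are the paper's, not this illustration's.

```python
import torch

def cluster_loss(weight, assignments):
    """Penalize filters for deviating from their cluster centroid.

    weight:      conv weight of shape (out_channels, in_channels, k, k)
    assignments: (out_channels,) integer cluster id per filter
    """
    filters = weight.flatten(start_dim=1)                 # (out_channels, in*k*k)
    loss = filters.new_zeros(())
    for c in assignments.unique():
        members = filters[assignments == c]
        centroid = members.mean(dim=0, keepdim=True)
        loss = loss + ((members - centroid) ** 2).sum()
    return loss / filters.shape[0]

w = torch.randn(16, 8, 3, 3, requires_grad=True)
ids = torch.randint(0, 4, (16,))                          # e.g. 4 clusters of filters
extra_loss = cluster_loss(w, ids)                         # add to the task loss with a weight
extra_loss.backward()
print(extra_loss.item())
```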

The Local Elasticity of Neural Networks

Title The Local Elasticity of Neural Networks
Authors Hangfeng He, Weijie J. Su
Abstract This paper presents a phenomenon in neural networks that we refer to as local elasticity. Roughly speaking, a classifier is said to be locally elastic if its prediction at a feature vector $\mathbf{x}'$ is not significantly perturbed after the classifier is updated via stochastic gradient descent at a (labeled) feature vector $\mathbf{x}$ that is dissimilar to $\mathbf{x}'$ in a certain sense. This phenomenon is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-life and synthetic datasets, whereas this is not observed in linear classifiers. In addition, we offer a geometric interpretation of local elasticity using the neural tangent kernel (Jacot et al., 2018). Building on top of local elasticity, we obtain pairwise similarity measures between feature vectors, which can be used for clustering in conjunction with $K$-means. The effectiveness of the clustering algorithm on the MNIST and CIFAR-10 datasets in turn corroborates the hypothesis of local elasticity of neural networks on real-life data. Finally, we discuss some implications of local elasticity to shed light on several intriguing aspects of deep neural networks.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06943v2
PDF https://arxiv.org/pdf/1910.06943v2.pdf
PWC https://paperswithcode.com/paper/the-local-elasticity-of-neural-networks
Repo
Framework
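
Local elasticity is defined operationally: take an SGD step at a labeled point $\mathbf{x}$ and measure how much the prediction at another point $\mathbf{x}'$ moves. A minimal sketch of that measurement is below; the two-layer network, the squared loss, and the step size are illustrative choices, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

def prediction_changes(x, y, probes):
    """Perform one SGD step at (x, y) and measure the prediction shift at each probe point."""
    with torch.no_grad():
        before = [net(p).item() for p in probes]
    opt.zero_grad()
    (net(x) - y).pow(2).mean().backward()   # squared loss at the update point
    opt.step()
    with torch.no_grad():
        after = [net(p).item() for p in probes]
    return [abs(a - b) for a, b in zip(after, before)]

x = torch.randn(1, 10)
probes = [x + 0.01 * torch.randn(1, 10),    # nearly the same feature vector
          torch.randn(1, 10)]               # an unrelated feature vector
near_shift, far_shift = prediction_changes(x, torch.tensor([[1.0]]), probes)
print(near_shift, far_shift)                # a locally elastic net tends to give near_shift >> far_shift
```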

Satellite-Net: Automatic Extraction of Land Cover Indicators from Satellite Imagery by Deep Learning

Title Satellite-Net: Automatic Extraction of Land Cover Indicators from Satellite Imagery by Deep Learning
Authors Eleonora Bernasconi, Francesco Pugliese, Diego Zardetto, Monica Scannapieco
Abstract In this paper we address the challenge of land cover classification for satellite images via Deep Learning (DL). Land cover classification aims to detect the physical characteristics of the territory and estimate the percentage of land occupied by a certain category of entities: vegetation, residential buildings, industrial areas, forest areas, rivers, lakes, etc. DL is a new paradigm for Big Data analytics and in particular for Computer Vision. The application of DL to image classification for land cover purposes has great potential owing to its high degree of automation and computing performance. In particular, the invention of Convolutional Neural Networks (CNNs) was fundamental to the advancements in this field. In [1], the Satellite Task Team of the UN Global Working Group describes the results achieved so far with respect to the use of earth observation for Official Statistics. However, in that study, CNNs have not yet been explored for automatic classification of imagery. This work investigates the use of CNNs for the estimation of land cover indicators, providing evidence of the first promising results. In particular, the paper proposes a customized model, called Satellite-Net, able to reach an accuracy level of up to 98% on test sets.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09423v1
PDF https://arxiv.org/pdf/1907.09423v1.pdf
PWC https://paperswithcode.com/paper/satellite-net-automatic-extraction-of-land
Repo
Framework

Exploring Conditioning for Generative Music Systems with Human-Interpretable Controls

Title Exploring Conditioning for Generative Music Systems with Human-Interpretable Controls
Authors Nicholas Meade, Nicholas Barreyre, Scott C. Lowe, Sageev Oore
Abstract Performance RNN is a machine-learning system designed primarily for the generation of solo piano performances using an event-based (rather than audio) representation. More specifically, Performance RNN is a long short-term memory (LSTM) based recurrent neural network that models polyphonic music with expressive timing and dynamics (Oore et al., 2018). The neural network uses a simple language model based on the Musical Instrument Digital Interface (MIDI) file format. Performance RNN is trained on the e-Piano Junior Competition Dataset (International Piano e-Competition, 2018), a collection of solo piano performances by expert pianists. As an artistic tool, one of the limitations of the original model has been the lack of useable controls. The standard form of Performance RNN can generate interesting pieces, but little control is provided over what specifically is generated. This paper explores a set of conditioning-based controls used to influence the generation process.
Tasks Language Modelling
Published 2019-07-09
URL https://arxiv.org/abs/1907.04352v3
PDF https://arxiv.org/pdf/1907.04352v3.pdf
PWC https://paperswithcode.com/paper/exploring-conditioning-for-generative-music
Repo
Framework
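
The conditioning explored in the paper steers generation by feeding extra control information alongside the event sequence. One common way to wire this up, shown below as an assumption rather than the paper's exact mechanism, is to concatenate a fixed control vector to each timestep's event embedding before the LSTM.

```python
import torch
import torch.nn as nn

class ConditionedEventLSTM(nn.Module):
    """Sketch: an event-sequence LSTM whose input is augmented with a control vector."""

    def __init__(self, vocab_size=388, embed_dim=128, control_dim=8, hidden=256):
        # vocab_size=388 mimics a Performance-RNN-style event vocabulary (assumed here)
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim + control_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, events, control):
        # events: (B, T) integer event ids; control: (B, control_dim), e.g. desired note density
        x = self.embed(events)                                     # (B, T, embed_dim)
        ctrl = control.unsqueeze(1).expand(-1, x.size(1), -1)      # repeat control at every step
        h, _ = self.lstm(torch.cat([x, ctrl], dim=-1))
        return self.out(h)                                         # next-event logits per step

model = ConditionedEventLSTM()
logits = model(torch.randint(0, 388, (2, 32)), torch.rand(2, 8))
print(logits.shape)  # torch.Size([2, 32, 388])
```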

A Computational Analysis of Natural Languages to Build a Sentence Structure Aware Artificial Neural Network

Title A Computational Analysis of Natural Languages to Build a Sentence Structure Aware Artificial Neural Network
Authors Alberto Calderone
Abstract Natural languages are complexly structured entities. They exhibit characterising regularities that can be exploited to link them to one another. In this work, I compare two morphological aspects of languages: Written Patterns and Sentence Structure. I show how languages spontaneously group by similarity in both analyses and derive an average language distance. Finally, exploiting Sentence Structure, I developed an Artificial Neural Network capable of distinguishing languages, suggesting that not only word roots but also grammatical sentence structure is a characterising trait which alone suffices to identify them.
Tasks
Published 2019-06-13
URL https://arxiv.org/abs/1906.05491v1
PDF https://arxiv.org/pdf/1906.05491v1.pdf
PWC https://paperswithcode.com/paper/a-computational-analysis-of-natural-languages
Repo
Framework

Multi-level Attention network using text, audio and video for Depression Prediction

Title Multi-level Attention network using text, audio and video for Depression Prediction
Authors Anupama Ray, Siddharth Kumar, Rutvik Reddy, Prerana Mukherjee, Ritu Garg
Abstract Depression has been the leading cause of mental-health illness worldwide. Major depressive disorder (MDD) is a common mental health disorder that affects people both psychologically and physically and can lead to loss of life. Due to the lack of diagnostic tests and the subjectivity involved in detecting depression, there is a growing interest in using behavioural cues to automate depression diagnosis and stage prediction. The absence of labelled behavioural datasets for such problems and the huge amount of variation possible in behaviour make the problem more challenging. This paper presents a novel multi-level attention based network for multi-modal depression prediction that fuses features from the audio, video and text modalities while learning the intra- and inter-modality relevance. The multi-level attention reinforces overall learning by selecting the most influential features within each modality for decision making. We perform exhaustive experimentation to create different regression models for the audio, video and text modalities. Several fusion models with different configurations are constructed to understand the impact of each feature and modality. We outperform the current baseline by 17.52% in terms of root mean squared error.
Tasks Decision Making
Published 2019-09-03
URL https://arxiv.org/abs/1909.01417v1
PDF https://arxiv.org/pdf/1909.01417v1.pdf
PWC https://paperswithcode.com/paper/multi-level-attention-network-using-text
Repo
Framework
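
The fusion idea described above, attention within each modality to pick influential features and attention across modalities to weight their contributions, can be sketched compactly. The feature dimensions, the particular attention form (simple learned gates and softmax weights), and the regression head below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentiveFusionSketch(nn.Module):
    """Illustrative two-level attention: over features within a modality, then over modalities."""

    def __init__(self, dims=(100, 80, 300), hidden=64):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])    # audio, video, text
        self.feat_attn = nn.ModuleList([nn.Linear(hidden, hidden) for _ in dims])
        self.mod_attn = nn.Linear(hidden, 1)
        self.head = nn.Linear(hidden, 1)                                   # depression severity (regression)

    def forward(self, modalities):
        fused = []
        for x, proj, attn in zip(modalities, self.proj, self.feat_attn):
            h = torch.tanh(proj(x))                       # (B, hidden)
            h = h * torch.sigmoid(attn(h))                # intra-modality feature gating
            fused.append(h)
        H = torch.stack(fused, dim=1)                     # (B, 3, hidden)
        w = self.mod_attn(H).softmax(dim=1)               # inter-modality attention weights
        return self.head((w * H).sum(dim=1))              # (B, 1) predicted score

model = AttentiveFusionSketch()
score = model([torch.randn(4, 100), torch.randn(4, 80), torch.randn(4, 300)])
print(score.shape)  # torch.Size([4, 1])
```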

Revealing interpretable object representations from human behavior

Title Revealing interpretable object representations from human behavior
Authors Charles Y. Zheng, Francisco Pereira, Chris I. Baker, Martin N. Hebart
Abstract To study how mental object representations are related to behavior, we estimated sparse, non-negative representations of objects using human behavioral judgments on images representative of 1,854 object categories. These representations predicted a latent similarity structure between objects, which captured most of the explainable variance in human behavioral judgments. Individual dimensions in the low-dimensional embedding were found to be highly reproducible and interpretable as conveying degrees of taxonomic membership, functionality, and perceptual attributes. We further demonstrated the predictive power of the embeddings for explaining other forms of human behavior, including categorization, typicality judgments, and feature ratings, suggesting that the dimensions reflect human conceptual representations of objects beyond the specific task.
Tasks
Published 2019-01-09
URL http://arxiv.org/abs/1901.02915v1
PDF http://arxiv.org/pdf/1901.02915v1.pdf
PWC https://paperswithcode.com/paper/revealing-interpretable-object
Repo
Framework