October 20, 2019

Paper Group AWR 335

Learning a Neural-network-based Representation for Open Set Recognition. MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations. A Neural Temporal Model for Human Motion Prediction. Neural Program Synthesis with Priority Queue Training. Statistical Analysis on E-Commerce Reviews, with Sentiment Classification using Bidirect …

Learning a Neural-network-based Representation for Open Set Recognition

Title Learning a Neural-network-based Representation for Open Set Recognition
Authors Mehadi Hassen, Philip K. Chan
Abstract Open set recognition problems exist in many domains. For example in security, new malware classes emerge regularly; therefore malware classification systems need to identify instances from unknown classes in addition to discriminating between known classes. In this paper we present a neural network based representation for addressing the open set recognition problem. In this representation instances from the same class are close to each other while instances from different classes are further apart, resulting in statistically significant improvement when compared to other approaches on three datasets from two different domains.
Tasks Malware Classification, Open Set Learning
Published 2018-02-12
URL http://arxiv.org/abs/1802.04365v1
PDF http://arxiv.org/pdf/1802.04365v1.pdf
PWC https://paperswithcode.com/paper/learning-a-neural-network-based
Repo https://github.com/shrtCKT/opennet
Framework tf
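
The abstract's idea (same-class instances close together, different classes further apart) suggests a simple way to flag unknown classes: measure the distance from a new instance to each known class and reject it when all of them are too far. The sketch below illustrates that general idea only; it is not the authors' method. The names `class_means` and `predict_open_set`, the use of class means, and the fixed `threshold` are all assumptions made for illustration.

```python
import math

def class_means(embeddings, labels):
    """Average the embedding vectors of each known class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_open_set(vec, means, threshold):
    """Return the nearest known class, or None when every class mean is too far away."""
    best_label, best_dist = None, math.inf
    for label, mean in means.items():
        dist = math.dist(vec, mean)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None
```

A return value of `None` stands for "unknown class". In practice the threshold would have to be tuned on held-out data, and the embeddings would come from the trained network rather than raw features.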

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations

Title MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations
Authors Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Gautam Naik, Erik Cambria, Rada Mihalcea
Abstract Emotion recognition in conversations is a challenging task that has recently gained popularity due to its potential applications. Until now, however, a large-scale multimodal multi-party emotional conversational database containing more than two speakers per dialogue was missing. Thus, we propose the Multimodal EmotionLines Dataset (MELD), an extension and enhancement of EmotionLines. MELD contains about 13,000 utterances from 1,433 dialogues from the TV-series Friends. Each utterance is annotated with emotion and sentiment labels, and encompasses audio, visual and textual modalities. We propose several strong multimodal baselines and show the importance of contextual and multimodal information for emotion recognition in conversations. The full dataset is available for use at http://affective-meld.github.io.
Tasks Dialogue Generation, Emotion Recognition
Published 2018-10-05
URL https://arxiv.org/abs/1810.02508v6
PDF https://arxiv.org/pdf/1810.02508v6.pdf
PWC https://paperswithcode.com/paper/meld-a-multimodal-multi-party-dataset-for
Repo https://github.com/SenticNet/MELD
Framework tf

A Neural Temporal Model for Human Motion Prediction

Title A Neural Temporal Model for Human Motion Prediction
Authors Anand Gopalakrishnan, Ankur Mali, Dan Kifer, C. Lee Giles, Alexander G. Ororbia
Abstract We propose novel neural temporal models for predicting and synthesizing human motion, achieving state-of-the-art in modeling long-term motion trajectories while being competitive with prior work in short-term prediction and requiring significantly less computation. Key aspects of our proposed system include: 1) a novel, two-level processing architecture that aids in generating planned trajectories, 2) a simple set of easily computable features that integrate derivative information, and 3) a novel multi-objective loss function that helps the model to slowly progress from simple next-step prediction to the harder task of multi-step, closed-loop prediction. Our results demonstrate that these innovations improve the modeling of long-term motion trajectories. Finally, we propose a novel metric, called Normalized Power Spectrum Similarity (NPSS), to evaluate the long-term predictive ability of motion synthesis models, complementing the popular mean-squared error (MSE) measure of Euler joint angles over time. We conduct a user study to determine if the proposed NPSS correlates with human evaluation of long-term motion more strongly than MSE and find that it indeed does. We release code and additional results (visualizations) for this paper at: https://github.com/cr7anand/neural_temporal_models
Tasks Motion Prediction
Published 2018-09-09
URL https://arxiv.org/abs/1809.03036v5
PDF https://arxiv.org/pdf/1809.03036v5.pdf
PWC https://paperswithcode.com/paper/a-neural-temporal-model-for-human-motion
Repo https://github.com/cr7anand/neural_temporal_models
Framework tf

Neural Program Synthesis with Priority Queue Training

Title Neural Program Synthesis with Priority Queue Training
Authors Daniel A. Abolafia, Mohammad Norouzi, Jonathan Shen, Rui Zhao, Quoc V. Le
Abstract Models and examples built with TensorFlow
Tasks Program Synthesis
Published 2018-01-10
URL http://arxiv.org/abs/1801.03526v2
PDF http://arxiv.org/pdf/1801.03526v2.pdf
PWC https://paperswithcode.com/paper/neural-program-synthesis-with-priority-queue
Repo https://github.com/tensorflow/models/tree/master/research/brain_coder
Framework tf

Statistical Analysis on E-Commerce Reviews, with Sentiment Classification using Bidirectional Recurrent Neural Network (RNN)

Title Statistical Analysis on E-Commerce Reviews, with Sentiment Classification using Bidirectional Recurrent Neural Network (RNN)
Authors Abien Fred Agarap, Paul Grafilon
Abstract Understanding customer sentiments is of paramount importance in marketing strategies today. Not only will it give companies an insight as to how customers perceive their products and/or services, but it will also give them an idea on how to improve their offers. This paper attempts to understand the correlation of different variables in customer reviews on a women clothing e-commerce, and to classify each review whether it recommends the reviewed product or not and whether it consists of positive, negative, or neutral sentiment. To achieve these goals, we employed univariate and multivariate analyses on dataset features except for review titles and review texts, and we implemented a bidirectional recurrent neural network (RNN) with long-short term memory unit (LSTM) for recommendation and sentiment classification. Results have shown that a recommendation is a strong indicator of a positive sentiment score, and vice-versa. On the other hand, ratings in product reviews are fuzzy indicators of sentiment scores. We also found out that the bidirectional LSTM was able to reach an F1-score of 0.88 for recommendation classification, and 0.93 for sentiment classification.
Tasks Sentiment Analysis
Published 2018-05-08
URL http://arxiv.org/abs/1805.03687v1
PDF http://arxiv.org/pdf/1805.03687v1.pdf
PWC https://paperswithcode.com/paper/statistical-analysis-on-e-commerce-reviews
Repo https://github.com/arjit3004/Recommended-System
Framework none

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection

Title VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
Authors Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, Yuyi Zhong
Abstract The automatic detection of software vulnerabilities is an important research problem. However, existing solutions to this problem rely on human experts to define features and often miss many vulnerabilities (i.e., incurring high false negative rate). In this paper, we initiate the study of using deep learning-based vulnerability detection to relieve human experts from the tedious and subjective task of manually defining features. Since deep learning is motivated to deal with problems that are very different from the problem of vulnerability detection, we need some guiding principles for applying deep learning to vulnerability detection. In particular, we need to find representations of software programs that are suitable for deep learning. For this purpose, we propose using code gadgets to represent programs and then transform them into vectors, where a code gadget is a number of (not necessarily consecutive) lines of code that are semantically related to each other. This leads to the design and implementation of a deep learning-based vulnerability detection system, called Vulnerability Deep Pecker (VulDeePecker). In order to evaluate VulDeePecker, we present the first vulnerability dataset for deep learning approaches. Experimental results show that VulDeePecker can achieve much fewer false negatives (with reasonable false positives) than other approaches. We further apply VulDeePecker to 3 software products (namely Xen, Seamonkey, and Libav) and detect 4 vulnerabilities, which are not reported in the National Vulnerability Database but were “silently” patched by the vendors when releasing later versions of these products; in contrast, these vulnerabilities are almost entirely missed by the other vulnerability detection systems we experimented with.
Tasks Vulnerability Detection
Published 2018-01-05
URL http://arxiv.org/abs/1801.01681v1
PDF http://arxiv.org/pdf/1801.01681v1.pdf
PWC https://paperswithcode.com/paper/vuldeepecker-a-deep-learning-based-system-for
Repo https://github.com/dascimal-org/MDSeqVAE/blob/master/VulDeePeck.py
Framework tf

Deep Multi-Center Learning for Face Alignment

Title Deep Multi-Center Learning for Face Alignment
Authors Zhiwen Shao, Hengliang Zhu, Xin Tan, Yangyang Hao, Lizhuang Ma
Abstract Facial landmarks are highly correlated with each other since a certain landmark can be estimated by its neighboring landmarks. Most of the existing deep learning methods only use one fully-connected layer called shape prediction layer to estimate the locations of facial landmarks. In this paper, we propose a novel deep learning framework named Multi-Center Learning with multiple shape prediction layers for face alignment. In particular, each shape prediction layer emphasizes on the detection of a certain cluster of semantically relevant landmarks respectively. Challenging landmarks are focused firstly, and each cluster of landmarks is further optimized respectively. Moreover, to reduce the model complexity, we propose a model assembling method to integrate multiple shape prediction layers into one shape prediction layer. Extensive experiments demonstrate that our method is effective for handling complex occlusions and appearance variations with real-time performance. The code for our method is available at https://github.com/ZhiwenShao/MCNet-Extension.
Tasks Face Alignment
Published 2018-08-05
URL http://arxiv.org/abs/1808.01558v2
PDF http://arxiv.org/pdf/1808.01558v2.pdf
PWC https://paperswithcode.com/paper/deep-multi-center-learning-for-face-alignment
Repo https://github.com/ZhiwenShao/MCNet-Extension
Framework none

Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors

Title Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors
Authors Cristian Rusu
Abstract The ability to decompose a signal in an orthonormal basis (a set of orthogonal components, each normalized to have unit length) using a fast numerical procedure rests at the heart of many signal processing methods and applications. The classic examples are the Fourier and wavelet transforms that enjoy numerically efficient implementations (FFT and FWT, respectively). Unfortunately, orthonormal transformations are in general unstructured, and therefore they do not enjoy low computational complexity properties. In this paper, based on Householder reflectors, we introduce a class of orthonormal matrices that are numerically efficient to manipulate: we control the complexity of matrix-vector multiplications with these matrices using a given parameter. We provide numerical algorithms that approximate any orthonormal or symmetric transform with a new orthonormal or symmetric structure made up of products of a given number of Householder reflectors. We show analyses and numerical evidence to highlight the accuracy of the proposed approximations and provide an application to the case of learning fast Mahalanobis distance metric transformations.
Tasks
Published 2018-11-19
URL https://arxiv.org/abs/1811.07624v2
PDF https://arxiv.org/pdf/1811.07624v2.pdf
PWC https://paperswithcode.com/paper/approximate-eigenvalue-decompositions-of
Repo https://github.com/cristian-rusu-research/approximate-householder-decomposition
Framework none
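
To see why a product of a few Householder reflectors is cheap to apply, recall that a reflector H = I - 2 u u^T / (u^T u) can act on a vector with a single dot product and a single scaled subtraction, so h reflectors cost on the order of n*h operations instead of n^2 for a dense matrix. The sketch below shows only this matrix-free multiplication; it does not implement the paper's approximation algorithms, and the function names are hypothetical.

```python
def reflect(u, x):
    """Apply H = I - 2 u u^T / (u^T u) to x without forming the matrix."""
    scale = 2.0 * sum(ui * xi for ui, xi in zip(u, x)) / sum(ui * ui for ui in u)
    return [xi - scale * ui for ui, xi in zip(u, x)]

def apply_reflectors(reflectors, x):
    """Compute (H_1 H_2 ... H_h) x; the rightmost reflector acts first."""
    for u in reversed(reflectors):
        x = reflect(u, x)
    return x
```

The number of reflectors plays the role of the complexity parameter mentioned in the abstract: each extra vector `u` adds one more pass over `x`. Because every reflector is orthonormal, the product preserves the Euclidean norm of `x`.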

Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Title Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
Authors Jiaxuan You, Bowen Liu, Rex Ying, Vijay Pande, Jure Leskovec
Abstract Generating novel graph structures that optimize given objectives while obeying some given underlying rules is fundamental for chemistry, biology and social science research. This is especially important in the task of molecular graph generation, whose goal is to discover novel molecules with desired properties such as drug-likeness and synthetic accessibility, while obeying physical laws such as chemical valency. However, designing models to find molecules that optimize desired properties while incorporating highly complex and non-differentiable rules remains to be a challenging task. Here we propose Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules. Experimental results show that GCPN can achieve 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and achieve 184% improvement on the constrained property optimization task.
Tasks Graph Generation
Published 2018-06-07
URL http://arxiv.org/abs/1806.02473v3
PDF http://arxiv.org/pdf/1806.02473v3.pdf
PWC https://paperswithcode.com/paper/graph-convolutional-policy-network-for-goal
Repo https://github.com/LeeJunHyun/The-Databases-for-Drug-Discovery
Framework tf

Bounding Box Regression with Uncertainty for Accurate Object Detection

Title Bounding Box Regression with Uncertainty for Accurate Object Detection
Authors Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang
Abstract Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clear as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization accuracies of various architectures with nearly no additional computation. The learned localization variance allows us to merge neighboring bounding boxes during non-maximum suppression (NMS), which further improves the localization performance. On MS-COCO, we boost the Average Precision (AP) of VGG-16 Faster R-CNN from 23.6% to 29.1%. More importantly, for ResNet-50-FPN Mask R-CNN, our method improves the AP and AP90 by 1.8% and 6.2% respectively, which significantly outperforms previous state-of-the-art bounding box refinement methods. Our code and models are available at: github.com/yihui-he/KL-Loss
Tasks Object Detection, Object Localization
Published 2018-09-23
URL http://arxiv.org/abs/1809.08545v3
PDF http://arxiv.org/pdf/1809.08545v3.pdf
PWC https://paperswithcode.com/paper/softer-nms-rethinking-bounding-box-regression
Repo https://github.com/yihui-he/softer-NMS
Framework none
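
One common way to learn a box coordinate and its variance together is to have the network predict a mean and a log-variance and to minimize a Gaussian negative log-likelihood; the learned variances can then weight neighboring boxes when they are merged. The sketch below illustrates that general idea under those assumptions. The exact loss and merging rule used in the paper and its repository may differ, and the names `gaussian_nll` and `merge_coordinate` are hypothetical.

```python
import math

def gaussian_nll(pred, log_var, target):
    """Negative log-likelihood (up to a constant) of target under N(pred, exp(log_var))."""
    return 0.5 * math.exp(-log_var) * (target - pred) ** 2 + 0.5 * log_var

def merge_coordinate(coords, variances):
    """Inverse-variance weighted average of one coordinate across overlapping boxes."""
    weights = [1.0 / v for v in variances]
    return sum(w * c for w, c in zip(weights, coords)) / sum(weights)
```

Predicting the log-variance rather than the variance keeps the variance positive without a constraint. A large predicted variance shrinks the squared-error term but pays the `0.5 * log_var` penalty, so the model is pushed to report high uncertainty only where the label is genuinely ambiguous.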

Monge-Ampère Flow for Generative Modeling

Title Monge-Ampère Flow for Generative Modeling
Authors Linfeng Zhang, Weinan E, Lei Wang
Abstract We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory. The generative map from the latent space to the data space follows a dynamical system, where a learnable potential function guides a compressible fluid to flow towards the target density distribution. Training of the model amounts to solving an optimal control problem. The Monge-Ampère flow has tractable likelihoods and supports efficient sampling and inference. One can easily impose symmetry constraints in the generative model by designing suitable scalar potential functions. We apply the approach to unsupervised density estimation of the MNIST dataset and variational calculation of the two-dimensional Ising model at the critical point. This approach brings insights and techniques from Monge-Ampère equation, optimal transport, and fluid dynamics into reversible flow-based generative models.
Tasks Density Estimation
Published 2018-09-26
URL http://arxiv.org/abs/1809.10188v1
PDF http://arxiv.org/pdf/1809.10188v1.pdf
PWC https://paperswithcode.com/paper/monge-ampere-flow-for-generative-modeling
Repo https://github.com/wangleiphy/MongeAmpereFlow
Framework pytorch

An Automated System for Epilepsy Detection using EEG Brain Signals based on Deep Learning Approach

Title An Automated System for Epilepsy Detection using EEG Brain Signals based on Deep Learning Approach
Authors Ihsan Ullah, Muhammad Hussain, Emad-ul-Haq Qazi, Hatim Aboalsamh
Abstract Epilepsy is a neurological disorder and for its detection, encephalography (EEG) is a commonly used clinical approach. Manual inspection of EEG brain signals is a time-consuming and laborious process, which puts heavy burden on neurologists and affects their performance. Several automatic techniques have been proposed using traditional approaches to assist neurologists in detecting binary epilepsy scenarios e.g. seizure vs. non-seizure or normal vs. ictal. These methods do not perform well when classifying ternary case e.g. ictal vs. normal vs. inter-ictal; the maximum accuracy for this case by the state-of-the-art-methods is 97±1%. To overcome this problem, we propose a system based on deep learning, which is an ensemble of pyramidal one-dimensional convolutional neural network (P-1D-CNN) models. In a CNN model, the bottleneck is the large number of learnable parameters. P-1D-CNN works on the concept of refinement approach and it results in 60% fewer parameters compared to traditional CNN models. Further to overcome the limitations of small amount of data, we proposed augmentation schemes for learning P-1D-CNN model. In almost all the cases concerning epilepsy detection, the proposed system gives an accuracy of 99.1±0.9% on the University of Bonn dataset.
Tasks EEG
Published 2018-01-16
URL http://arxiv.org/abs/1801.05412v1
PDF http://arxiv.org/pdf/1801.05412v1.pdf
PWC https://paperswithcode.com/paper/an-automated-system-for-epilepsy-detection
Repo https://github.com/majorash/eeg_epilepsy_conv1d
Framework none

Devil in the Details: Towards Accurate Single and Multiple Human Parsing

Title Devil in the Details: Towards Accurate Single and Multiple Human Parsing
Authors Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei, Yao Zhao, Thomas Huang
Abstract Human parsing has received considerable interest due to its wide application potentials. Nevertheless, it is still unclear how to develop an accurate human parsing system in an efficient and elegant way. In this paper, we identify several useful properties, including feature resolution, global context information and edge details, and perform rigorous analyses to reveal how to leverage them to benefit the human parsing task. The advantages of these useful properties finally result in a simple yet effective Context Embedding with Edge Perceiving (CE2P) framework for single human parsing. Our CE2P is end-to-end trainable and can be easily adopted for conducting multiple human parsing. Benefiting the superiority of CE2P, we achieved the 1st places on all three human parsing benchmarks. Without any bells and whistles, we achieved 56.50% (mIoU), 45.31% (mean $AP^r$) and 33.34% ($AP^p_{0.5}$) in LIP, CIHP and MHP v2.0, which outperform the state-of-the-arts more than 2.06%, 3.81% and 1.87%, respectively. We hope our CE2P will serve as a solid baseline and help ease future research in single/multiple human parsing. Code has been made available at https://github.com/liutinglt/CE2P.
Tasks Human Parsing, Semantic Segmentation
Published 2018-09-17
URL http://arxiv.org/abs/1809.05996v3
PDF http://arxiv.org/pdf/1809.05996v3.pdf
PWC https://paperswithcode.com/paper/devil-in-the-details-towards-accurate-single
Repo https://github.com/liutinglt/CE2P
Framework pytorch

Interpretable Convolutional Filters with SincNet

Title Interpretable Convolutional Filters with SincNet
Authors Mirco Ravanelli, Yoshua Bengio
Abstract Deep learning is currently playing a crucial role toward higher levels of artificial intelligence. This paradigm allows neural networks to learn complex and abstract representations, that are progressively obtained by combining simpler ones. Nevertheless, the internal “black-box” representations automatically discovered by current neural architectures often suffer from a lack of interpretability, making of primary interest the study of explainable machine learning techniques. This paper summarizes our recent efforts to develop a more interpretable neural model for directly processing speech from the raw waveform. In particular, we propose SincNet, a novel Convolutional Neural Network (CNN) that encourages the first layer to discover more meaningful filters by exploiting parametrized sinc functions. In contrast to standard CNNs, which learn all the elements of each filter, only low and high cutoff frequencies of band-pass filters are directly learned from data. This inductive bias offers a very compact way to derive a customized filter-bank front-end, that only depends on some parameters with a clear physical meaning. Our experiments, conducted on both speaker and speech recognition, show that the proposed architecture converges faster, performs better, and is more interpretable than standard CNNs.
Tasks Distant Speech Recognition, Speech Recognition
Published 2018-11-23
URL https://arxiv.org/abs/1811.09725v2
PDF https://arxiv.org/pdf/1811.09725v2.pdf
PWC https://paperswithcode.com/paper/interpretable-convolutional-filters-with
Repo https://github.com/mravanelli/pytorch-kaldi
Framework pytorch
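
The claim that only a low and a high cutoff frequency need to be learned can be made concrete: an ideal band-pass filter is the difference of two sinc-shaped low-pass filters, so two numbers determine every tap of the kernel. The sketch below builds such a kernel in plain Python. It is an illustration rather than the SincNet code; the function name, the normalized-frequency convention (cycles per sample) and the choice of a Hamming window are assumptions.

```python
import math

def sinc(x):
    """Unnormalized sinc: sin(x) / x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

def band_pass_kernel(f_low, f_high, length):
    """FIR band-pass taps from two cutoffs in cycles per sample (0 < f_low < f_high < 0.5)."""
    half = (length - 1) / 2
    taps = []
    for i in range(length):
        n = i - half  # time index centered on the middle tap
        # Difference of two ideal low-pass responses gives an ideal band-pass.
        ideal = (2 * f_high * sinc(2 * math.pi * f_high * n)
                 - 2 * f_low * sinc(2 * math.pi * f_low * n))
        # Hamming window (assumed) to soften the truncation of the infinite sinc.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps
```

For example, `band_pass_kernel(0.1, 0.2, 101)` yields 101 taps from just two parameters, whereas a standard convolutional filter of the same length would learn all 101 values independently. In a trainable layer the two cutoffs would be parameters updated by gradient descent.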

MixUp as Locally Linear Out-Of-Manifold Regularization

Title MixUp as Locally Linear Out-Of-Manifold Regularization
Authors Hongyu Guo, Yongyi Mao, Richong Zhang
Abstract MixUp is a recently proposed data-augmentation scheme, which linearly interpolates a random pair of training examples and correspondingly the one-hot representations of their labels. Training deep neural networks with such additional data is shown capable of significantly improving the predictive accuracy of the current art. The power of MixUp, however, is primarily established empirically and its working and effectiveness have not been explained in any depth. In this paper, we develop an understanding for MixUp as a form of “out-of-manifold regularization”, which imposes certain “local linearity” constraints on the model’s input space beyond the data manifold. This analysis enables us to identify a limitation of MixUp, which we call “manifold intrusion”. In a nutshell, manifold intrusion in MixUp is a form of under-fitting resulting from conflicts between the synthetic labels of the mixed-up examples and the labels of original training data. Such a phenomenon usually happens when the parameters controlling the generation of mixing policies are not sufficiently fine-tuned on the training data. To address this issue, we propose a novel adaptive version of MixUp, where the mixing policies are automatically learned from the data using an additional network and objective function designed to avoid manifold intrusion. The proposed regularizer, AdaMixUp, is empirically evaluated on several benchmark datasets. Extensive experiments demonstrate that AdaMixUp improves upon MixUp when applied to the current art of deep classification models.
Tasks Data Augmentation
Published 2018-09-07
URL http://arxiv.org/abs/1809.02499v3
PDF http://arxiv.org/pdf/1809.02499v3.pdf
PWC https://paperswithcode.com/paper/mixup-as-locally-linear-out-of-manifold
Repo https://github.com/SITE5039/AdaMixUp
Framework tf
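
The basic MixUp operation described in the abstract, linearly interpolating a pair of training examples and the one-hot representations of their labels, fits in a few lines. The sketch below shows plain MixUp only, not the adaptive AdaMixUp variant. The function names are hypothetical, and drawing the mixing coefficient from a Beta(alpha, alpha) distribution follows the original MixUp proposal rather than anything stated in this abstract.

```python
import random

def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def mixup_pair(x1, y1, x2, y2, lam):
    """Interpolate two inputs and their one-hot labels with the same coefficient lam in [0, 1]."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

def sample_lambda(alpha, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)
```

Mixing a class-0 example with a class-2 example at `lam = 0.25` produces the soft label `[0.25, 0.0, 0.75]`. Manifold intrusion, as the abstract describes it, arises when such a mixed input coincides with a real training example whose true label conflicts with that soft label.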