Paper Group ANR 803
Action and intention recognition of pedestrians in urban traffic
Title | Action and intention recognition of pedestrians in urban traffic |
Authors | Dimitrios Varytimidis, Fernando Alonso-Fernandez, Boris Duran, Cristofer Englund |
Abstract | Action and intention recognition of pedestrians in urban settings are challenging problems for Advanced Driver Assistance Systems as well as future autonomous vehicles to maintain smooth and safe traffic. This work investigates a number of feature extraction methods in combination with several machine learning algorithms to build knowledge on how to automatically detect the action and intention of pedestrians in urban traffic. We focus on motion and head orientation to predict whether a pedestrian is about to cross the street. The work is based on the Joint Attention for Autonomous Driving (JAAD) dataset, which contains 346 video clips of various traffic scenarios captured with cameras mounted behind the windshield of a car. An accuracy of 72% for head orientation estimation and 85% for motion detection is obtained in our experiments. |
Tasks | Autonomous Driving, Autonomous Vehicles, Intent Detection, Motion Detection |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09805v1 |
PDF | http://arxiv.org/pdf/1810.09805v1.pdf |
PWC | https://paperswithcode.com/paper/action-and-intention-recognition-of |
Repo | |
Framework | |
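The abstract describes a two-part setup: one classifier for head orientation and one for motion, combined into a crossing-intention decision. Below is a minimal sketch of that setup, assuming synthetic features and a random-forest classifier; the paper evaluates several feature extractors and learning algorithms, and its exact fusion rule may differ.

```python
# Minimal sketch: two classifiers (head orientation, motion) whose
# predictions are fused into a crossing-intention decision. Features,
# labels, and the fusion rule are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_head = rng.normal(size=(500, 64))    # e.g. appearance features of the head crop
y_head = rng.integers(0, 2, 500)       # 0 = looking away, 1 = looking at the car
X_motion = rng.normal(size=(500, 32))  # e.g. optical-flow statistics
y_motion = rng.integers(0, 2, 500)     # 0 = standing, 1 = walking

head_clf = RandomForestClassifier(random_state=0).fit(X_head, y_head)
motion_clf = RandomForestClassifier(random_state=0).fit(X_motion, y_motion)

def crossing_intent(x_head: np.ndarray, x_motion: np.ndarray) -> bool:
    """Simple AND-fusion of the two predictions (an assumption, not the
    paper's rule): flag crossing if the pedestrian is both attentive to
    traffic and in motion."""
    looking = head_clf.predict(x_head.reshape(1, -1))[0]
    walking = motion_clf.predict(x_motion.reshape(1, -1))[0]
    return bool(looking and walking)

print(crossing_intent(X_head[0], X_motion[0]))
```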
Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization
Title | Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization |
Authors | Kamal Al-Sabahi, Zhang Zuping, Yang Kang |
Abstract | Sequence generative models with RNN variants, such as LSTM and GRU, show promising performance on abstractive document summarization. However, they still have issues that limit their performance, especially when dealing with long sequences. One such issue is that, to the best of our knowledge, all current models employ a unidirectional decoder, which reasons only about the past and has limited ability to retain future context when making a prediction. This leads these models to generate unbalanced outputs. Moreover, unidirectional attention-based document summarization can only capture partial aspects of attentional regularities due to the inherent challenges in document summarization. To this end, we propose an end-to-end trainable bidirectional RNN model to tackle these issues. The model has a bidirectional encoder-decoder architecture in which both the encoder and the decoder are bidirectional LSTMs. The forward decoder is initialized with the last hidden state of the backward encoder, while the backward decoder is initialized with the last hidden state of the forward encoder. In addition, a bidirectional beam search mechanism is proposed as an approximate inference algorithm for generating output summaries from the bidirectional model. This enables the model to reason about both the past and the future and, as a result, to generate balanced outputs. Experimental results on the CNN/Daily Mail dataset show that the proposed model outperforms current abstractive state-of-the-art models by a considerable margin. |
Tasks | Abstractive Text Summarization, Document Summarization |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06662v1 |
PDF | http://arxiv.org/pdf/1809.06662v1.pdf |
PWC | https://paperswithcode.com/paper/bidirectional-attentional-encoder-decoder |
Repo | |
Framework | |
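The cross-initialization described in the abstract (forward decoder from the backward encoder's final state, backward decoder from the forward encoder's final state) can be pinned down in a few lines. A minimal PyTorch sketch, with attention and the bidirectional beam search omitted and all dimensions assumed:

```python
# Sketch of the bidirectional encoder-decoder initialization scheme.
import torch
import torch.nn as nn

hidden = 256
enc = nn.LSTM(input_size=128, hidden_size=hidden,
              bidirectional=True, batch_first=True)
dec_fwd = nn.LSTMCell(128, hidden)   # forward decoder
dec_bwd = nn.LSTMCell(128, hidden)   # backward decoder

src = torch.randn(4, 20, 128)        # (batch, src_len, embedding)
_, (h_n, c_n) = enc(src)             # h_n: (2, batch, hidden)
h_fwd_enc, h_bwd_enc = h_n[0], h_n[1]  # final state of each encoder direction
c_fwd_enc, c_bwd_enc = c_n[0], c_n[1]

# Cross-initialization, as in the abstract:
# forward decoder <- backward encoder, backward decoder <- forward encoder.
state_fwd = (h_bwd_enc, c_bwd_enc)
state_bwd = (h_fwd_enc, c_fwd_enc)

tok = torch.randn(4, 128)            # embedded first target token (dummy)
state_fwd = dec_fwd(tok, state_fwd)  # one decoding step in each direction
state_bwd = dec_bwd(tok, state_bwd)
```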
Sparse DNNs with Improved Adversarial Robustness
Title | Sparse DNNs with Improved Adversarial Robustness |
Authors | Yiwen Guo, Chao Zhang, Changshui Zhang, Yurong Chen |
Abstract | Deep neural networks (DNNs) are computationally/memory-intensive and vulnerable to adversarial attacks, making them prohibitive in some real-world applications. By converting dense models into sparse ones, pruning appears to be a promising solution to reducing the computation/memory cost. This paper studies classification models, especially DNN-based ones, to demonstrate that there exist intrinsic relationships between their sparsity and adversarial robustness. Our analyses reveal, both theoretically and empirically, that nonlinear DNN-based classifiers behave differently under $l_2$ attacks from some linear ones. We further demonstrate that an appropriately higher model sparsity implies better robustness of nonlinear DNNs, whereas over-sparsified models can find it more difficult to resist adversarial examples. |
Tasks | |
Published | 2018-10-23 |
URL | https://arxiv.org/abs/1810.09619v2 |
PDF | https://arxiv.org/pdf/1810.09619v2.pdf |
PWC | https://paperswithcode.com/paper/sparse-dnns-with-improved-adversarial |
Repo | |
Framework | |
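A sketch of the kind of experiment this line of work implies: convert a dense network into a sparse one at a chosen sparsity level, then measure accuracy under $l_2$-bounded attacks. Plain per-tensor magnitude pruning is used here as one common choice; the paper's analysis does not prescribe this exact procedure.

```python
# Sketch: magnitude-prune a model to a target sparsity (in place).
import torch
import torch.nn as nn

def magnitude_prune_(model: nn.Module, sparsity: float) -> None:
    """Zero out the smallest-magnitude fraction of each weight matrix."""
    for p in model.parameters():
        if p.dim() < 2:
            continue  # leave biases and norm parameters dense
        k = int(sparsity * p.numel())
        if k == 0:
            continue
        threshold = p.abs().flatten().kthvalue(k).values
        p.data.mul_((p.abs() > threshold).float())

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
magnitude_prune_(model, sparsity=0.9)  # "appropriately high" sparsity

total = sum(p.numel() for p in model.parameters() if p.dim() >= 2)
zeros = sum((p == 0).sum().item() for p in model.parameters() if p.dim() >= 2)
print(f"weight sparsity: {zeros / total:.2f}")
```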
Optimizing over a Restricted Policy Class in Markov Decision Processes
Title | Optimizing over a Restricted Policy Class in Markov Decision Processes |
Authors | Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, Nikos Vlassis |
Abstract | We address the problem of finding an optimal policy in a Markov decision process under a restricted policy class defined by the convex hull of a set of base policies. This problem is of great interest in applications in which a number of reasonably good (or safe) policies are already known and we are only interested in optimizing in their convex hull. We show that this problem is NP-hard to solve exactly as well as to approximate to arbitrary accuracy. However, under a condition that is akin to the occupancy measures of the base policies having large overlap, we show that there exists an efficient algorithm that finds a policy that is almost as good as the best convex combination of the base policies. The running time of the proposed algorithm is linear in the number of states and polynomial in the number of base policies. In practice, we demonstrate an efficient implementation for problems with large state spaces. Compared to traditional policy gradient methods, the proposed approach has the advantage that, apart from the computation of occupancy measures of some base policies, the iterative method need not interact with the environment during the optimization process. This is especially important in complex systems where estimating the value of a policy can be a time-consuming process. |
Tasks | Policy Gradient Methods |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09646v1 |
PDF | http://arxiv.org/pdf/1802.09646v1.pdf |
PWC | https://paperswithcode.com/paper/optimizing-over-a-restricted-policy-class-in |
Repo | |
Framework | |
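A convenient object here is the state-action occupancy measure of each base policy. The toy numpy sketch below mixes occupancy measures directly, under which the expected return is linear in the weights; note that the occupancy measure of a convex combination of policies is generally not the same convex combination of their occupancy measures, which is part of what makes the original problem hard. Everything below is illustrative (tiny random MDP quantities), not the paper's algorithm.

```python
# Sketch: value of a mixture of base-policy occupancy measures.
import numpy as np

n_states, n_actions, n_base = 5, 3, 4
rng = np.random.default_rng(0)

# Occupancy measures mu_i(s, a) of the base policies; each is a
# distribution over state-action pairs.
mu = rng.dirichlet(np.ones(n_states * n_actions), size=n_base)
mu = mu.reshape(n_base, n_states, n_actions)
reward = rng.normal(size=(n_states, n_actions))

w = np.ones(n_base) / n_base             # a point in the simplex
mu_mix = np.tensordot(w, mu, axes=1)     # mixture occupancy measure

# Expected return is linear in w under this parameterization.
value = (mu_mix * reward).sum()

# The policy induced by the mixture re-normalizes per state.
pi = mu_mix / mu_mix.sum(axis=1, keepdims=True)
print(value, pi.shape)
```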
Cluster Size Management in Multi-Stage Agglomerative Hierarchical Clustering of Acoustic Speech Segments
Title | Cluster Size Management in Multi-Stage Agglomerative Hierarchical Clustering of Acoustic Speech Segments |
Authors | Lerato Lerato, Thomas Niesler |
Abstract | Agglomerative hierarchical clustering (AHC) requires only the similarity between objects to be known. This is attractive when clustering signals of varying length, such as speech, which are not readily represented in fixed-dimensional vector space. However, AHC is characterised by $O(N^2)$ space and time complexity, making it infeasible for partitioning large datasets. This has recently been addressed by an approach based on the iterative re-clustering of independent subsets of the larger dataset. We show that, due to its iterative nature, this procedure can sometimes lead to unchecked growth of individual subsets, thereby compromising its effectiveness. We propose the integration of a simple space management strategy into the iterative process, and show experimentally that this leads to no loss in performance in terms of F-measure while guaranteeing that a threshold space complexity is not breached. |
Tasks | |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12744v1 |
PDF | http://arxiv.org/pdf/1810.12744v1.pdf |
PWC | https://paperswithcode.com/paper/cluster-size-management-in-multi-stage |
Repo | |
Framework | |
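A sketch of the multi-stage procedure under a hard subset-size cap: cluster each subset with AHC, keep one representative per cluster, and iterate; the cap guarantees no single subset exceeds the space budget. Distance, linkage, cap value, and representative choice below are assumptions, and the paper works from pairwise distances between variable-length segments (e.g. DTW) rather than the fixed-dimensional vectors used here for illustration.

```python
# Sketch: one stage of capped multi-stage AHC over fixed-size subsets.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def stage(segments: np.ndarray, max_subset: int = 200) -> np.ndarray:
    """AHC each subset of at most max_subset items; return one
    representative (here, the centroid) per cluster."""
    reps = []
    for start in range(0, len(segments), max_subset):  # enforce the size cap
        subset = segments[start:start + max_subset]
        if len(subset) < 2:
            reps.extend(subset)
            continue
        Z = linkage(subset, method="average")
        labels = fcluster(Z, t=20, criterion="maxclust")  # <= 20 clusters
        for c in np.unique(labels):
            reps.append(subset[labels == c].mean(axis=0))
    return np.asarray(reps)

X = np.random.default_rng(0).normal(size=(1000, 13))  # stand-in segment features
while len(X) > 200:        # iterate stages until one subset suffices
    X = stage(X)
print(len(X))
```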
Shortening Time Required for Adaptive Structural Learning Method of Deep Belief Network with Multi-Modal Data Arrangement
Title | Shortening Time Required for Adaptive Structural Learning Method of Deep Belief Network with Multi-Modal Data Arrangement |
Authors | Shin Kamada, Takumi Ichimura |
Abstract | Recently, Deep Learning has been widely applied in artificial intelligence and has performed especially well in image recognition. Most new Deep Learning architectures are naturally developed for image recognition. For this reason, not only numerical and text data but also time-series data are transformed into the image data format. Multi-modal data consist of two or more kinds of data, such as pictures and text. In a general method, the arrangement is formed as a squared array with no specific aim. In this paper, the data arrangement is modified according to the similarity of input-output patterns in the Adaptive Structural Learning method of Deep Belief Network. The similarity of the output signals of hidden neurons is exploited by rearranging the order of the hidden neurons. Experimental results for the data rearrangement in a squared array show a shortening of the time required for DBN learning. |
Tasks | Time Series |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.03952v1 |
PDF | http://arxiv.org/pdf/1807.03952v1.pdf |
PWC | https://paperswithcode.com/paper/shortening-time-required-for-adaptive |
Repo | |
Framework | |
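The rearrangement idea can be illustrated by reordering the columns of the squared input array so that columns with similar activation patterns become adjacent. A greedy nearest-neighbour ordering stands in below for the paper's procedure, which is driven by the similarity of hidden-neuron output signals in the adaptive DBN; the data and sizes are placeholders.

```python
# Sketch: reorder array columns by similarity of activation patterns.
import numpy as np

def similarity_order(acts: np.ndarray) -> list:
    """Greedy ordering: start at column 0, then repeatedly append the
    most correlated remaining column."""
    n = acts.shape[1]
    corr = np.corrcoef(acts.T)
    order, remaining = [0], set(range(1, n))
    while remaining:
        nxt = max(remaining, key=lambda j: corr[order[-1], j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

acts = np.random.default_rng(0).normal(size=(1000, 64))  # hidden-neuron outputs
data = np.random.default_rng(1).normal(size=(1000, 64))  # flattened inputs
order = similarity_order(acts)
rearranged = data[:, order].reshape(-1, 8, 8)  # back to a squared array
```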
Deep Variational Sufficient Dimensionality Reduction
Title | Deep Variational Sufficient Dimensionality Reduction |
Authors | Ershad Banijamali, Amir-Hossein Karimi, Ali Ghodsi |
Abstract | We consider the problem of sufficient dimensionality reduction (SDR), in which a high-dimensional observation is transformed to a low-dimensional subspace that preserves the information the observation carries about the label variable. We propose DVSDR, a deep variational approach for sufficient dimensionality reduction. The deep structure in our model has a bottleneck that represents the low-dimensional embedding of the data. We explain the SDR problem using graphical models and use the framework of variational autoencoders to maximize the lower bound of the log-likelihood of the joint distribution of the observation and label. We show that such a maximization problem can be interpreted as solving the SDR problem. DVSDR can be easily adapted to the semi-supervised learning setting. In our experiments, we show that DVSDR performs competitively on classification tasks while being able to generate novel data samples. |
Tasks | Dimensionality Reduction |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07641v1 |
PDF | http://arxiv.org/pdf/1812.07641v1.pdf |
PWC | https://paperswithcode.com/paper/deep-variational-sufficient-dimensionality |
Repo | |
Framework | |
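A minimal sketch of the shape described above, assuming specific layer sizes and equal loss weighting: an encoder to a low-dimensional bottleneck $z$, a decoder reconstructing the observation, and a classifier predicting the label from $z$, trained on a VAE-style lower bound. Dropping the label term for unlabeled batches gives the semi-supervised variant.

```python
# Sketch of a DVSDR-like objective (architecture details assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DVSDR(nn.Module):
    def __init__(self, d_in=784, d_z=10, n_classes=10):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)  # outputs mean and log-variance
        self.dec = nn.Linear(d_z, d_in)
        self.clf = nn.Linear(d_z, n_classes)

    def forward(self, x, y=None):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = F.mse_loss(self.dec(z), x)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        loss = recon + kl
        if y is not None:                     # label term; absent for
            loss = loss + F.cross_entropy(self.clf(z), y)  # unlabeled data
        return loss

x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
loss = DVSDR()(x, y)   # labeled batch; call without y for unlabeled data
loss.backward()
```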
Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation
Title | Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation |
Authors | François Plesse, Alexandru Ginsca, Bertrand Delezoide, Françoise Prêteux |
Abstract | A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world. One of the key issues is to describe the visual relationships between objects. When dealing with real world data, capturing these very diverse interactions is a difficult problem. It can be alleviated by incorporating common sense in a network. For this, we propose a framework that makes use of semantic knowledge and estimates the relevance of object pairs during both training and test phases. Extracted from precomputed models and training annotations, this information is distilled into the neural network dedicated to this task. Using this approach, we observe a significant improvement on all classes of Visual Genome, a challenging visual relationship dataset. A 68.5% relative gain on the recall at 100 is directly related to the relevance estimate and a 32.7% gain to the knowledge distillation. |
Tasks | Common Sense Reasoning |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10802v1 |
PDF | http://arxiv.org/pdf/1805.10802v1.pdf |
PWC | https://paperswithcode.com/paper/visual-relationship-detection-based-on-guided |
Repo | |
Framework | |
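The distillation component can be sketched as a standard soft-target loss: the relationship classifier is trained against the ground-truth predicate and against soft targets derived from semantic knowledge (e.g. predicate statistics for the object pair). The mixing weight, temperature, and the source of the soft targets below are assumptions.

```python
# Sketch: distilling semantic-knowledge soft targets into a predicate classifier.
import torch
import torch.nn.functional as F

def distilled_loss(logits, target, soft_targets, alpha=0.5, T=2.0):
    """Mix the hard cross-entropy with a temperature-scaled KL term
    against the knowledge-derived soft targets."""
    hard = F.cross_entropy(logits, target)
    soft = F.kl_div(F.log_softmax(logits / T, dim=-1),
                    soft_targets, reduction="batchmean") * T * T
    return (1 - alpha) * hard + alpha * soft

logits = torch.randn(8, 50, requires_grad=True)  # 50 predicate classes (assumed)
target = torch.randint(0, 50, (8,))
soft_targets = torch.softmax(torch.randn(8, 50), dim=-1)  # from the knowledge model
loss = distilled_loss(logits, target, soft_targets)
loss.backward()
```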
Data Augmentation for Spoken Language Understanding via Joint Variational Generation
Title | Data Augmentation for Spoken Language Understanding via Joint Variational Generation |
Authors | Kang Min Yoo, Youhyun Shin, Sang-goo Lee |
Abstract | Data scarcity is one of the main obstacles of domain adaptation in spoken language understanding (SLU) due to the high cost of creating manually tagged SLU datasets. Recent works in neural text generative models, particularly latent variable models such as the variational autoencoder (VAE), have shown promising results in generating plausible and natural sentences. In this paper, we propose a novel generative architecture which leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only helps alleviate the data scarcity issue in the SLU task for many datasets but also improves language understanding performance across various SLU models, supported by extensive experiments and rigorous statistical testing. |
Tasks | Data Augmentation, Domain Adaptation, Latent Variable Models, Spoken Language Understanding |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02305v2 |
PDF | http://arxiv.org/pdf/1809.02305v2.pdf |
PWC | https://paperswithcode.com/paper/data-augmentation-for-spoken-language |
Repo | |
Framework | |
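The joint-generation idea, i.e. sampling an utterance together with its annotation so that synthetic examples arrive fully tagged, can be sketched with a decoder that emits a word and a slot tag at every step. This stand-in ignores the variational machinery and uses greedy decoding; sizes and heads are assumptions.

```python
# Sketch: a decoder step that jointly emits a word and its slot tag.
import torch
import torch.nn as nn

vocab, n_tags, d = 1000, 20, 128
cell = nn.GRUCell(d, d)
word_head, tag_head = nn.Linear(d, vocab), nn.Linear(d, n_tags)
embed = nn.Embedding(vocab, d)

z = torch.randn(1, d)                     # latent drawn from the VAE prior
tok = torch.zeros(1, dtype=torch.long)    # <bos> token (assumed id 0)
h = z
words, tags = [], []
for _ in range(10):
    h = cell(embed(tok), h)
    tok = word_head(h).argmax(dim=-1)         # greedy, for illustration
    words.append(tok.item())
    tags.append(tag_head(h).argmax(dim=-1).item())  # aligned slot tag
print(list(zip(words, tags)))             # a fully annotated synthetic utterance
```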
Analytically Embedding Differential Equation Constraints into Least Squares Support Vector Machines using the Theory of Functional Connections
Title | Analytically Embedding Differential Equation Constraints into Least Squares Support Vector Machines using the Theory of Functional Connections |
Authors | Carl Leake, Hunter Johnston, Lidia Smith, Daniele Mortari |
Abstract | Differential equations (DEs) are used as numerical models to describe physical phenomena throughout the fields of engineering and science, including heat and fluid flow, structural bending, and system dynamics. While many techniques exist for finding approximate solutions to these equations, this paper compares the application of the Theory of Functional Connections (TFC) with one based on least-squares support vector machines (LS-SVM). The TFC method uses a constrained expression, an expression that always satisfies the DE constraints, which transforms the process of solving a DE into solving an unconstrained optimization problem that is ultimately solved via least-squares (LS). In addition to individual analysis, the two methods are merged into a new methodology, called constrained SVMs (CSVM), by incorporating the LS-SVM method into the TFC framework to solve unconstrained problems. Numerical tests are conducted on four sample problems: one first-order linear ordinary differential equation (ODE), one first-order nonlinear ODE, one second-order linear ODE, and one two-dimensional linear partial differential equation (PDE). Using the LS-SVM method as a benchmark, a speed comparison is made for all the problems by timing the training period, and an accuracy comparison is made using the maximum error and mean squared error on the training and test sets. In general, TFC is shown to be slightly faster (by an order of magnitude or less) and more accurate (by multiple orders of magnitude) than the LS-SVM and CSVM approaches. |
Tasks | |
Published | 2018-12-13 |
URL | https://arxiv.org/abs/1812.05571v3 |
PDF | https://arxiv.org/pdf/1812.05571v3.pdf |
PWC | https://paperswithcode.com/paper/approximating-ordinary-and-partial |
Repo | |
Framework | |
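A constrained expression for the simplest case, the initial-value constraint $y(x_0) = y_0$, is $y(x) = g(x) + (y_0 - g(x_0))$: the constraint holds identically for any choice of the free function $g$, so solving the DE reduces to an unconstrained least-squares problem over $g$. A tiny numpy illustration (a random polynomial serves as the free function purely for demonstration):

```python
# Sketch: a TFC constrained expression satisfies the constraint by construction.
import numpy as np

x0, y0 = 0.0, 1.0
coeffs = np.random.default_rng(0).normal(size=5)
g = np.polynomial.Polynomial(coeffs)   # any free function works here

def y(x):
    return g(x) + (y0 - g(x0))         # constrained expression for y(x0) = y0

assert np.isclose(y(x0), y0)           # holds for every g, by construction
```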
Speeding Up Neural Machine Translation Decoding by Cube Pruning
Title | Speeding Up Neural Machine Translation Decoding by Cube Pruning |
Authors | Wen Zhang, Liang Huang, Yang Feng, Lei Shen, Qun Liu |
Abstract | Although neural machine translation has achieved promising results, it suffers from slow translation speed. The direct consequence is that a trade-off has to be made between translation quality and speed, so its performance cannot come into full play. We apply cube pruning, a popular technique for speeding up dynamic programming, to neural machine translation to speed up translation. To construct the equivalence classes, similar target hidden states are combined, leading to fewer RNN expansion operations on the target side and fewer $\mathrm{softmax}$ operations over the large target vocabulary. The experiments show that, at the same or even better translation quality, our method translates faster than naive beam search by $3.3\times$ on GPUs and $3.5\times$ on CPUs. |
Tasks | Machine Translation |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.02992v1 |
PDF | http://arxiv.org/pdf/1809.02992v1.pdf |
PWC | https://paperswithcode.com/paper/speeding-up-neural-machine-translation |
Repo | |
Framework | |
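The core of the speedup is forming equivalence classes of beam hypotheses whose target-side hidden states are similar, so the RNN expansion and the vocabulary softmax run once per class rather than once per hypothesis. A greedy merging sketch with an assumed L2 threshold (the paper's grouping criterion may differ):

```python
# Sketch: collapse beam hypotheses with nearby hidden states into classes.
import torch

def merge_states(states, scores, eps=1e-1):
    """Greedily group hidden states within eps (L2 distance), keeping the
    best-scoring hypothesis as the representative of each class."""
    reps, rep_scores = [], []
    for h, s in sorted(zip(states, scores), key=lambda p: -p[1]):
        if all(torch.dist(h, r) > eps for r in reps):
            reps.append(h)        # opens a new equivalence class
            rep_scores.append(s)  # best score, since we visit in order
    return reps, rep_scores

states = [torch.randn(256) for _ in range(12)]  # beam hypotheses' hidden states
scores = [float(s) for s in torch.randn(12)]
reps, rep_scores = merge_states(states, scores)
# The vocabulary softmax now runs len(reps) times instead of 12.
print(len(reps))
```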
Variational Dropout via Empirical Bayes
Title | Variational Dropout via Empirical Bayes |
Authors | Valery Kharitonov, Dmitry Molchanov, Dmitry Vetrov |
Abstract | We study the Automatic Relevance Determination procedure applied to deep neural networks. We show that ARD applied to Bayesian DNNs with Gaussian approximate posterior distributions leads to a variational bound similar to that of variational dropout, and in the case of a fixed dropout rate, objectives are exactly the same. Experimental results show that the two approaches yield comparable results in practice even when the dropout rates are trained. This leads to an alternative Bayesian interpretation of dropout and mitigates some of the theoretical issues that arise with the use of improper priors in the variational dropout model. Additionally, we explore the use of the hierarchical priors in ARD and show that it helps achieve higher sparsity for the same accuracy. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00596v2 |
PDF | http://arxiv.org/pdf/1811.00596v2.pdf |
PWC | https://paperswithcode.com/paper/variational-dropout-via-empirical-bayes |
Repo | |
Framework | |
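The correspondence has a compact numerical illustration. For a factorized Gaussian posterior $\mathcal{N}(\mu, \sigma^2)$ per weight and a zero-mean Gaussian prior whose variance is set by empirical Bayes, $\tau = \mu^2 + \sigma^2$, the per-weight KL term reduces to $\frac{1}{2}\log(1 + 1/\alpha)$ with $\alpha = \sigma^2/\mu^2$, the variational-dropout rate. A check of that identity (not a training script):

```python
# Sketch: ARD KL with the empirical-Bayes prior equals a function of alpha.
import torch

mu = torch.randn(1000)
sigma2 = (torch.randn(1000) - 2.0).exp()      # posterior variances

tau = mu**2 + sigma2                          # empirical-Bayes prior variance
kl = 0.5 * (torch.log(tau / sigma2) + (sigma2 + mu**2) / tau - 1)

alpha = sigma2 / mu**2                        # variational-dropout rate
kl_vd_form = 0.5 * torch.log1p(1.0 / alpha)   # same quantity, rewritten

assert torch.allclose(kl, kl_vd_form, atol=1e-5)
```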
Convolutional Point-set Representation: A Convolutional Bridge Between a Densely Annotated Image and 3D Face Alignment
Title | Convolutional Point-set Representation: A Convolutional Bridge Between a Densely Annotated Image and 3D Face Alignment |
Authors | Yuhang Wu, Le Anh Vu Ha, Xiang Xu, Ioannis A. Kakadiaris |
Abstract | We present a robust method for estimating the facial pose and shape information from a densely annotated facial image. The method relies on Convolutional Point-set Representation (CPR), a carefully designed matrix representation to summarize different layers of information encoded in the set of detected points in the annotated image. The CPR disentangles the dependencies of shape and different pose parameters and enables updating different parameters in a sequential manner via convolutional neural networks and recurrent layers. When updating the pose parameters, we sample reprojection errors along a predicted direction and update the parameters based on the pattern of reprojection errors. This technique boosts the model's capability to search for a local minimum under challenging scenarios. We also demonstrate that annotations from different sources can be merged under the framework of CPR and contribute to outperforming the current state-of-the-art solutions for 3D face alignment. Experiments indicate the proposed CPRFA (CPR-based Face Alignment) significantly improves 3D alignment accuracy when the densely annotated image contains noise and missing values, which is common under "in-the-wild" acquisition scenarios. |
Tasks | Face Alignment |
Published | 2018-03-17 |
URL | http://arxiv.org/abs/1803.06542v2 |
PDF | http://arxiv.org/pdf/1803.06542v2.pdf |
PWC | https://paperswithcode.com/paper/convolutional-point-set-representation-a |
Repo | |
Framework | |
Training Dynamic Exponential Family Models with Causal and Lateral Dependencies for Generalized Neuromorphic Computing
Title | Training Dynamic Exponential Family Models with Causal and Lateral Dependencies for Generalized Neuromorphic Computing |
Authors | Hyeryung Jang, Osvaldo Simeone |
Abstract | Neuromorphic hardware platforms, such as Intel’s Loihi chip, support the implementation of Spiking Neural Networks (SNNs) as an energy-efficient alternative to Artificial Neural Networks (ANNs). SNNs are networks of neurons with internal analogue dynamics that communicate by means of binary time series. In this work, a probabilistic model is introduced for a generalized set-up in which the synaptic time series can take values in an arbitrary alphabet and are characterized by both causal and instantaneous statistical dependencies. The model, which can be considered as an extension of exponential family harmoniums to time series, is introduced by means of a hybrid directed-undirected graphical representation. Furthermore, distributed learning rules are derived for Maximum Likelihood and Bayesian criteria under the assumption of fully observed time series in the training set. |
Tasks | Time Series |
Published | 2018-10-21 |
URL | https://arxiv.org/abs/1810.08940v3 |
PDF | https://arxiv.org/pdf/1810.08940v3.pdf |
PWC | https://paperswithcode.com/paper/training-dynamic-exponential-family-models |
Repo | |
Framework | |
Emerging Applications of Reversible Data Hiding
Title | Emerging Applications of Reversible Data Hiding |
Authors | Dongdong Hou, Weiming Zhang, Jiayang Liu, Siyan Zhou, Dongdong Chen, Nenghai Yu |
Abstract | Reversible data hiding (RDH) is a special type of information hiding in which both the host sequence and the embedded data can be restored from the marked sequence without loss. Besides media annotation and integrity authentication, researchers have recently begun to apply RDH innovatively in many other fields. In this paper, we summarize these emerging applications, including steganography, adversarial examples, visual transformation, and image processing, and present general frameworks for making these operations reversible. To the best of our knowledge, this is the first paper to summarize the extended applications of RDH. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02928v1 |
PDF | http://arxiv.org/pdf/1811.02928v1.pdf |
PWC | https://paperswithcode.com/paper/emerging-applications-of-reversible-data |
Repo | |
Framework | |