July 29, 2019

Paper Group ANR 2

Analysis of universal adversarial perturbations


Title	Analysis of universal adversarial perturbations
Authors	Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto
Abstract	Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers. In this paper, we propose the first quantitative analysis of the robustness of classifiers to universal perturbations, and draw a formal link between the robustness to universal perturbations and the geometry of the decision boundary. Specifically, we establish theoretical bounds on the robustness of classifiers under two decision boundary models (flat and curved models). We show in particular that the robustness of deep networks to universal perturbations is driven by a key property of their curvature: there exist shared directions along which the decision boundary of deep networks is systematically positively curved. Under such conditions, we prove the existence of small universal perturbations. Our analysis further provides a novel geometric method for computing universal perturbations, in addition to explaining their properties.
Tasks
Published	2017-05-26
URL	http://arxiv.org/abs/1705.09554v1
PDF	http://arxiv.org/pdf/1705.09554v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-universal-adversarial
Repo
Framework

Efficient Spatio-Temporal Gaussian Regression via Kalman Filtering


Title	Efficient Spatio-Temporal Gaussian Regression via Kalman Filtering
Authors	Marco Todescato, Andrea Carron, Ruggero Carli, Gianluigi Pillonetto, Luca Schenato
Abstract	In this work we study the non-parametric reconstruction of spatio-temporal dynamical Gaussian processes (GPs) via GP regression from sparse and noisy data. GPs have been mainly applied to spatial regression, where they represent one of the most powerful estimation approaches, also thanks to their universal representing properties. Their extension to dynamical processes has instead been elusive so far, since classical implementations lead to unscalable algorithms. We then propose a novel procedure to address this problem by coupling GP regression and Kalman filtering. In particular, assuming space/time separability of the covariance (kernel) of the process and rational time spectrum, we build a finite-dimensional discrete-time state-space process representation amenable to Kalman filtering. With sampling over a finite set of fixed spatial locations, our major finding is that the Kalman filter state at instant $t_k$ represents a sufficient statistic to compute the minimum variance estimate of the process at any $t \geq t_k$ over the entire spatial domain. This result can be interpreted as a novel Kalman representer theorem for dynamical GPs. We then extend the study to situations where the set of spatial input locations can vary over time. The proposed algorithms are finally tested on both synthetic and real field data, also providing comparisons with standard GP and truncated GP regression techniques.
Tasks	Gaussian Processes
Published	2017-05-03
URL	http://arxiv.org/abs/1705.01485v1
PDF	http://arxiv.org/pdf/1705.01485v1.pdf
PWC	https://paperswithcode.com/paper/efficient-spatio-temporal-gaussian-regression
Repo
Framework
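
The coupling of GP regression and Kalman filtering described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes an exponential temporal kernel (whose spectrum is rational, so each spatial location evolves as a first-order autoregressive process), a squared-exponential spatial kernel, fixed sampling locations, and a constant sampling interval. All function names and parameters (`spatial_kernel`, `kalman_gp_filter`, `predict_elsewhere`, `ell_t`, `noise_var`) are hypothetical.

```python
import numpy as np

def spatial_kernel(a, b, length=0.5):
    """Squared-exponential kernel between two 1-D arrays of locations (assumed choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def kalman_gp_filter(Y, S, dt, ell_t=1.0, noise_var=0.01):
    """Filter noisy samples Y[k] taken at fixed locations S every dt seconds.

    Assumes the separable covariance Ks(s, s') * exp(-|t - t'| / ell_t), so the
    process values at S follow f[k+1] = a * f[k] + w[k] with w[k] ~ N(0, (1 - a^2) Ks).
    """
    m = len(S)
    Ks = spatial_kernel(S, S) + 1e-8 * np.eye(m)   # small jitter for numerical stability
    a = np.exp(-dt / ell_t)                        # one-step temporal correlation
    Q = (1.0 - a**2) * Ks                          # process noise keeps the prior stationary
    R = noise_var * np.eye(m)                      # measurement noise covariance
    mean, cov = np.zeros(m), Ks.copy()
    for k, y in enumerate(Y):
        if k > 0:                                  # time update (prediction)
            mean, cov = a * mean, a**2 * cov + Q
        gain = np.linalg.solve(cov + R, cov).T     # cov @ inv(cov + R), both symmetric
        mean = mean + gain @ (y - mean)            # measurement update
        cov = cov - gain @ cov
    return mean, cov, Ks

def predict_elsewhere(s_new, S, Ks, mean):
    """Map the filter state to unsampled locations via the spatial kernel."""
    weights = np.linalg.solve(Ks, spatial_kernel(S, s_new))
    return weights.T @ mean

S = np.array([0.0, 0.5, 1.0, 1.5])                 # fixed sampling locations
t = np.arange(20) * 0.1
Y = np.sin(t)[:, None] * np.cos(S)[None, :]        # synthetic field samples
mean, cov, Ks = kalman_gp_filter(Y, S, dt=0.1)
estimate = predict_elsewhere(np.array([0.75]), S, Ks, mean)
```

The filter state has the size of the number of sampling locations and is updated once per time step, so the cost per step does not grow with the length of the record. Under the separability assumption, the estimate at an unsampled location depends on the data only through that state, which is the sense in which the abstract calls the state a sufficient statistic.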

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming


Title	Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming
Authors	Tadashi Kozuno, Eiji Uchibe, Kenji Doya
Abstract	Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks. In this paper we propose a new, robust dynamic programming algorithm that unifies value iteration, advantage learning, and dynamic policy programming. We call it generalized value iteration (GVI) and its approximated version, approximate GVI (AGVI). We show AGVI’s performance guarantee, which includes performance guarantees for existing algorithms, as special cases. We discuss theoretical weaknesses of existing algorithms, and explain the advantages of AGVI. Numerical experiments in a simple environment support theoretical arguments, and suggest that AGVI is a promising alternative to previous algorithms.
Tasks
Published	2017-10-30
URL	http://arxiv.org/abs/1710.10866v1
PDF	http://arxiv.org/pdf/1710.10866v1.pdf
PWC	https://paperswithcode.com/paper/unifying-value-iteration-advantage-learning
Repo
Framework
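
As a point of reference for two of the algorithms the abstract says are unified, the sketch below implements tabular value iteration and advantage learning in one update. It is not the paper's GVI operator, whose exact form is not given in this listing. The update used here is the commonly stated advantage learning rule, Q ← TQ − α(V − Q), where T is the Bellman optimality operator and V(s) = max_a Q(s, a); setting α = 0 recovers value iteration. The two-state MDP and the function name `advantage_learning` are made up for illustration.

```python
import numpy as np

def advantage_learning(P, R, gamma=0.9, alpha=0.0, iters=500):
    """Tabular advantage learning; alpha = 0 reduces to value iteration.

    P has shape (states, actions, states); R has shape (states, actions).
    """
    Q = np.zeros(R.shape)
    for _ in range(iters):
        V = Q.max(axis=1)                        # greedy state values
        bellman = R + gamma * (P @ V)            # Bellman optimality backup, shape (S, A)
        Q = bellman - alpha * (V[:, None] - Q)   # push suboptimal actions further down
    return Q

# Hypothetical deterministic MDP: action 1 always moves to (or stays in) state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = P[1, 0, 0] = P[1, 1, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 2.0]])

Q_vi = advantage_learning(P, R, alpha=0.0)       # value iteration
Q_al = advantage_learning(P, R, alpha=0.5)       # advantage learning

gap_vi = Q_vi.max(axis=1) - Q_vi.min(axis=1)     # action gaps per state
gap_al = Q_al.max(axis=1) - Q_al.min(axis=1)
```

In this example both runs select the same greedy actions, while advantage learning widens the gap between the best and the second-best action by a factor of 1 / (1 − α).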

Exploring Latent Semantic Factors to Find Useful Product Reviews


Title	Exploring Latent Semantic Factors to Find Useful Product Reviews
Authors	Subhabrata Mukherjee, Kashyap Popat, Gerhard Weikum
Abstract	Online reviews provided by consumers are a valuable asset for e-Commerce platforms, influencing potential consumers in making purchasing decisions. However, these reviews are of varying quality, with the useful ones buried deep within a heap of non-informative reviews. In this work, we attempt to automatically identify review quality in terms of its helpfulness to the end consumers. In contrast to previous works in this domain exploiting a variety of syntactic and community-level features, we delve deep into the semantics of reviews as to what makes them useful, providing interpretable explanation for the same. We identify a set of consistency and semantic factors, all from the text, ratings, and timestamps of user-generated reviews, making our approach generalizable across all communities and domains. We explore review semantics in terms of several latent factors like the expertise of its author, his judgment about the fine-grained facets of the underlying product, and his writing style. These are cast into a Hidden Markov Model – Latent Dirichlet Allocation (HMM-LDA) based model to jointly infer: (i) reviewer expertise, (ii) item facets, and (iii) review helpfulness. Large-scale experiments on five real-world datasets from Amazon show significant improvement over state-of-the-art baselines in predicting and ranking useful reviews.
Tasks
Published	2017-05-06
URL	http://arxiv.org/abs/1705.02518v1
PDF	http://arxiv.org/pdf/1705.02518v1.pdf
PWC	https://paperswithcode.com/paper/exploring-latent-semantic-factors-to-find
Repo
Framework

Sequential Dual Deep Learning with Shape and Texture Features for Sketch Recognition


Title	Sequential Dual Deep Learning with Shape and Texture Features for Sketch Recognition
Authors	Qi Jia, Meiyu Yu, Xin Fan, Haojie Li
Abstract	Recognizing freehand sketches with high arbitrariness is greatly challenging. Most existing methods either ignore the geometric characteristics or treat sketches as handwritten characters with fixed structural ordering. Consequently, they can hardly yield high recognition performance even though sophisticated learning techniques are employed. In this paper, we propose a sequential deep learning strategy that combines both shape and texture features. A coded shape descriptor is exploited to characterize the geometry of sketch strokes with high flexibility, while the outputs of convolutional neural networks (CNNs) are taken as the abstract texture feature. We develop dual deep networks with memorable gated recurrent units (GRUs), and sequentially feed these two types of features into the dual networks, respectively. These dual networks enable the feature fusion by another gated recurrent unit (GRU), and thus accurately recognize sketches invariant to stroke ordering. The experiments on the TU-Berlin data set show that our method outperforms the average of human and state-of-the-art algorithms even when significant shape and appearance variations occur.
Tasks	Sketch Recognition
Published	2017-08-09
URL	http://arxiv.org/abs/1708.02716v1
PDF	http://arxiv.org/pdf/1708.02716v1.pdf
PWC	https://paperswithcode.com/paper/sequential-dual-deep-learning-with-shape-and
Repo
Framework

Sparse Weighted Canonical Correlation Analysis


Title	Sparse Weighted Canonical Correlation Analysis
Authors	Wenwen Min, Juan Liu, Shihua Zhang
Abstract	Given two data matrices $X$ and $Y$, sparse canonical correlation analysis (SCCA) is to seek two sparse canonical vectors $u$ and $v$ to maximize the correlation between $Xu$ and $Yv$. However, classical and sparse CCA models consider the contribution of all the samples of data matrices and thus cannot identify an underlying specific subset of samples. To this end, we propose a novel sparse weighted canonical correlation analysis (SWCCA), where weights are used for regularizing different samples. We solve the $L_0$-regularized SWCCA ($L_0$-SWCCA) using an alternating iterative algorithm. We apply $L_0$-SWCCA to synthetic data and real-world data to demonstrate its effectiveness and superiority compared to related methods. Lastly, we consider also SWCCA with different penalties like LASSO (Least absolute shrinkage and selection operator) and Group LASSO, and extend it for integrating more than three data matrices.
Tasks
Published	2017-10-13
URL	http://arxiv.org/abs/1710.04792v1
PDF	http://arxiv.org/pdf/1710.04792v1.pdf
PWC	https://paperswithcode.com/paper/sparse-weighted-canonical-correlation
Repo
Framework
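
The abstract describes the objectives only in words. One plausible way to write them down, stated here as an assumption rather than as the paper's exact formulation, is the following, where $\odot$ is the element-wise product, $w$ holds one weight per sample, and $k_u$, $k_v$, $k_w$ are hypothetical sparsity levels:

```latex
% Sparse CCA: sparse unit-norm u and v maximizing the (unnormalized) correlation of Xu and Yv
\max_{u,v}\; u^{\top} X^{\top} Y v
\quad \text{s.t.}\quad \|u\|_2 = \|v\|_2 = 1,\;\; \|u\|_0 \le k_u,\;\; \|v\|_0 \le k_v

% Weighted variant: each sample i contributes w_i (Xu)_i (Yv)_i, and w is itself sparse
\max_{u,v,w}\; w^{\top}\big[(Xu) \odot (Yv)\big] = \sum_{i} w_i\,(Xu)_i\,(Yv)_i
\quad \text{s.t.}\quad \|u\|_2 = \|v\|_2 = \|w\|_2 = 1,\;\;
\|u\|_0 \le k_u,\;\; \|v\|_0 \le k_v,\;\; \|w\|_0 \le k_w
```

With all weights equal, the second objective is proportional to the first, so a sparse $w$ is what lets the model restrict the correlation to a subset of samples.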

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning


Title	Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning
Authors	Jingkuan Song, Zhao Guo, Lianli Gao, Wu Liu, Dongxiang Zhang, Heng Tao Shen
Abstract	Recent progress has been made in using the attention-based encoder-decoder framework for video captioning. However, most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., “gun” and “shooting”) and non-visual words (e.g., “the”, “a”). These non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing the attention mechanism on non-visual words could mislead the decoder and decrease the overall performance of video captioning. To address this issue, we propose a hierarchical LSTM with adjusted temporal attention (hLSTMat) approach for video captioning. Specifically, the proposed framework utilizes the temporal attention for selecting specific frames to predict the related words, while the adjusted temporal attention is for deciding whether to depend on the visual information or the language context information. Also, a hierarchical LSTM is designed to simultaneously consider both low-level visual information and high-level language context information to support the video caption generation. To demonstrate the effectiveness of our proposed framework, we test our method on two prevalent datasets, MSVD and MSR-VTT, and experimental results show that our approach outperforms the state-of-the-art methods on both datasets.
Tasks	Language Modelling, Video Captioning
Published	2017-06-05
URL	http://arxiv.org/abs/1706.01231v1
PDF	http://arxiv.org/pdf/1706.01231v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-lstm-with-adjusted-temporal
Repo
Framework

Unconstrained Face Detection and Open-Set Face Recognition Challenge


Title	Unconstrained Face Detection and Open-Set Face Recognition Challenge
Authors	Manuel Günther, Peiyun Hu, Christian Herrmann, Chi Ho Chan, Min Jiang, Shufan Yang, Akshay Raj Dhamija, Deva Ramanan, Jürgen Beyerer, Josef Kittler, Mohamad Al Jazaery, Mohammad Iqbal Nouyed, Guodong Guo, Cezary Stankiewicz, Terrance E. Boult
Abstract	Face detection and recognition benchmarks have shifted toward more difficult environments. The challenge presented in this paper addresses the next step in the direction of automatic detection and identification of people from outdoor surveillance cameras. While face detection has shown remarkable success in images collected from the web, surveillance cameras include more diverse occlusions, poses, weather conditions and image blur. Although face verification or closed-set face identification have surpassed human capabilities on some datasets, open-set identification is much more complex as it needs to reject both unknown identities and false accepts from the face detector. We show that unconstrained face detection can approach high detection rates albeit with moderate false accept rates. By contrast, open-set face recognition is currently weak and requires much more attention.
Tasks	Face Detection, Face Identification, Face Recognition, Face Verification, Robust Face Recognition
Published	2017-08-08
URL	http://arxiv.org/abs/1708.02337v3
PDF	http://arxiv.org/pdf/1708.02337v3.pdf
PWC	https://paperswithcode.com/paper/unconstrained-face-detection-and-open-set
Repo
Framework

DPCA: Dimensionality Reduction for Discriminative Analytics of Multiple Large-Scale Datasets


Title	DPCA: Dimensionality Reduction for Discriminative Analytics of Multiple Large-Scale Datasets
Authors	Gang Wang, Jia Chen, Georgios B. Giannakis
Abstract	Principal component analysis (PCA) has well-documented merits for data extraction and dimensionality reduction. PCA deals with a single dataset at a time, and it is challenged when it comes to analyzing multiple datasets. Yet in certain setups, one wishes to extract the most significant information of one dataset relative to other datasets. Specifically, the interest may be in identifying, that is, extracting, features that are specific to a single target dataset but not the others. This paper develops a novel approach for such so-termed discriminative data analysis, and establishes its optimality in the least-squares (LS) sense under suitable data modeling assumptions. The criterion reveals linear combinations of variables by maximizing the ratio of the variance of the target data to that of the remainders. The novel approach solves a generalized eigenvalue problem by performing SVD just once. Numerical tests using synthetic and real datasets showcase the merits of the proposed approach relative to its competing alternatives.
Tasks	Dimensionality Reduction
Published	2017-10-25
URL	http://arxiv.org/abs/1710.09429v1
PDF	http://arxiv.org/pdf/1710.09429v1.pdf
PWC	https://paperswithcode.com/paper/dpca-dimensionality-reduction-for
Repo
Framework
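
The variance-ratio criterion in the abstract leads to a generalized eigenvalue problem, which can be sketched as follows. This is a minimal illustration under assumptions, not the paper's procedure: it whitens the background covariance with an eigendecomposition instead of the single SVD the abstract mentions, adds a small ridge term `eps` for invertibility, and uses made-up synthetic data. The function name `discriminative_directions` is hypothetical.

```python
import numpy as np

def discriminative_directions(X, Y, k=1, eps=1e-6):
    """Directions u maximizing (u' Cx u) / (u' Cy u) for target X and background Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cx = Xc.T @ Xc / len(Xc)                                  # target covariance
    Cy = Yc.T @ Yc / len(Yc) + eps * np.eye(X.shape[1])       # background covariance (ridged)
    evals, evecs = np.linalg.eigh(Cy)
    W = evecs / np.sqrt(evals)                                # whitening: W.T @ Cy @ W = I
    lam, V = np.linalg.eigh(W.T @ Cx @ W)                     # ordinary symmetric eigenproblem
    order = np.argsort(lam)[::-1][:k]                         # largest variance ratios first
    return W @ V[:, order], lam[order]                        # solves Cx u = lam * Cy u

# Synthetic check: both datasets vary strongly along axis 0,
# but only the target has extra variance along axis 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3)) * np.array([5.0, 3.0, 1.0])   # target data
Y = rng.normal(size=(2000, 3)) * np.array([5.0, 1.0, 1.0])   # background data
U, ratios = discriminative_directions(X, Y, k=1)
u = U[:, 0] / np.linalg.norm(U[:, 0])
```

Plain PCA on `X` would pick axis 0, the direction of largest variance, whereas the ratio criterion picks axis 1, the direction in which the target differs most from the background.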

End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks


Title	End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks
Authors	Min-je Choi, Sehun Jeong, Hakjoo Oh, Jaegul Choo
Abstract	Detecting buffer overruns from source code is one of the most common and yet challenging tasks in program analysis. Current approaches have mainly relied on rigid rules and handcrafted features devised by a few experts, limiting themselves in terms of flexible applicability and robustness due to diverse bug patterns and characteristics existing in sophisticated real-world software programs. In this paper, we propose a novel, data-driven approach that is completely end-to-end without requiring any hand-crafted features, thus free from any programming language-specific structural limitations. In particular, our approach leverages a recently proposed neural network model called memory networks, which have shown state-of-the-art performance mainly in question-answering tasks. Our experimental results using source code demonstrate that our proposed model is capable of accurately detecting simple buffer overruns. We also present in-depth analyses on how a memory network can learn to understand the semantics in programming languages solely from raw source code, such as tracing variables of interest, identifying numerical values, and performing their quantitative comparisons.
Tasks	Question Answering
Published	2017-03-07
URL	http://arxiv.org/abs/1703.02458v1
PDF	http://arxiv.org/pdf/1703.02458v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-prediction-of-buffer-overruns-from
Repo
Framework

An Augmented Lagrangian Method for Piano Transcription using Equal Loudness Thresholding and LSTM-based Decoding


Title	An Augmented Lagrangian Method for Piano Transcription using Equal Loudness Thresholding and LSTM-based Decoding
Authors	Sebastian Ewert, Mark B. Sandler
Abstract	A central goal in automatic music transcription is to detect individual note events in music recordings. An important variant is instrument-dependent music transcription where methods can use calibration data for the instruments in use. However, despite the additional information, results rarely exceed an f-measure of 80%. As a potential explanation, the transcription problem can be shown to be badly conditioned and thus relies on appropriate regularization. A recently proposed method employs a mixture of simple, convex regularizers (to stabilize the parameter estimation process) and more complex terms (to encourage more meaningful structure). In this paper, we present two extensions to this method. First, we integrate a computational loudness model to better differentiate real from spurious note detections. Second, we employ (Bidirectional) Long Short Term Memory networks to re-weight the likelihood of detected note constellations. Despite their simplicity, our two extensions lead to a drop of about 35% in note error rate compared to the state-of-the-art.
Tasks	Calibration
Published	2017-07-01
URL	http://arxiv.org/abs/1707.00160v3
PDF	http://arxiv.org/pdf/1707.00160v3.pdf
PWC	https://paperswithcode.com/paper/an-augmented-lagrangian-method-for-piano
Repo
Framework

Robust Tracking and Behavioral Modeling of Movements of Biological Collectives from Ordinary Video Recordings


Title	Robust Tracking and Behavioral Modeling of Movements of Biological Collectives from Ordinary Video Recordings
Authors	Hiroki Sayama, Farnaz Zamani Esfahlani, Ali Jazayeri, J. Scott Turner
Abstract	We propose a novel computational method to extract information about interactions among individuals with different behavioral states in a biological collective from ordinary video recordings. Assuming that individuals are acting as finite state machines, our method first detects discrete behavioral states of those individuals and then constructs a model of their state transitions, taking into account the positions and states of other individuals in the vicinity. We have tested the proposed method through applications to two real-world biological collectives: termites in an experimental setting and human pedestrians in a university campus. For each application, a robust tracking system was developed in-house, utilizing interactive human intervention (for termite tracking) or online agent-based simulation (for pedestrian tracking). In both cases, significant interactions were detected between nearby individuals with different states, demonstrating the effectiveness of the proposed method.
Tasks
Published	2017-07-23
URL	http://arxiv.org/abs/1707.07310v2
PDF	http://arxiv.org/pdf/1707.07310v2.pdf
PWC	https://paperswithcode.com/paper/robust-tracking-and-behavioral-modeling-of
Repo
Framework

Modelling dependency completion in sentence comprehension as a Bayesian hierarchical mixture process: A case study involving Chinese relative clauses


Title	Modelling dependency completion in sentence comprehension as a Bayesian hierarchical mixture process: A case study involving Chinese relative clauses
Authors	Shravan Vasishth, Nicolas Chopin, Robin Ryder, Bruno Nicenboim
Abstract	We present a case-study demonstrating the usefulness of Bayesian hierarchical mixture modelling for investigating cognitive processes. In sentence comprehension, it is widely assumed that the distance between linguistic co-dependents affects the latency of dependency resolution: the longer the distance, the longer the retrieval time (the distance-based account). An alternative theory, direct-access, assumes that retrieval times are a mixture of two distributions: one distribution represents successful retrievals (these are independent of dependency distance) and the other represents an initial failure to retrieve the correct dependent, followed by a reanalysis that leads to successful retrieval. We implement both models as Bayesian hierarchical models and show that the direct-access model explains Chinese relative clause reading time data better than the distance account.
Tasks
Published	2017-02-02
URL	http://arxiv.org/abs/1702.00564v2
PDF	http://arxiv.org/pdf/1702.00564v2.pdf
PWC	https://paperswithcode.com/paper/modelling-dependency-completion-in-sentence
Repo
Framework
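
The two accounts contrasted in the abstract can be written as likelihoods for a reading time $RT_i$. The notation below is a sketch under assumptions: the lognormal family, the symbols $\mu$, $\beta$, $\delta$, $\theta$, $\sigma$, and the variable $\mathrm{dist}_i$ are chosen for illustration and are not taken from this listing.

```latex
% Distance-based account: retrieval time grows with dependency distance
RT_i \sim \mathrm{LogNormal}\big(\mu + \beta\,\mathrm{dist}_i,\; \sigma^2\big)

% Direct-access account: a two-component mixture that does not depend on distance
RT_i \sim \theta \cdot \mathrm{LogNormal}\big(\mu + \delta,\; \sigma_1^2\big)
      \;+\; (1 - \theta) \cdot \mathrm{LogNormal}\big(\mu,\; \sigma_2^2\big),
\qquad \delta > 0
```

Here $\theta$ is the probability of an initial retrieval failure followed by reanalysis, and $\delta$ is the extra time that reanalysis costs. In a hierarchical version, parameters such as $\mu$ would additionally vary by participant and by item.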

Representation Learning by Rotating Your Faces


Title	Representation Learning by Rotating Your Faces
Authors	Luan Tran, Xi Yin, Xiaoming Liu
Abstract	The large pose discrepancy between two face images is one of the fundamental challenges in automatic face recognition. Conventional approaches to pose-invariant face recognition either perform face frontalization on, or learn a pose-invariant representation from, a non-frontal face image. We argue that it is more desirable to perform both tasks jointly to allow them to leverage each other. To this end, this paper proposes a Disentangled Representation learning-Generative Adversarial Network (DR-GAN) with three distinct novelties. First, the encoder-decoder structure of the generator enables DR-GAN to learn a representation that is both generative and discriminative, which can be used for face image synthesis and pose-invariant face recognition. Second, this representation is explicitly disentangled from other face variations such as pose, through the pose code provided to the decoder and pose estimation in the discriminator. Third, DR-GAN can take one or multiple images as the input, and generate one unified identity representation along with an arbitrary number of synthetic face images. Extensive quantitative and qualitative evaluation on a number of controlled and in-the-wild databases demonstrate the superiority of DR-GAN over the state of the art in both learning representations and rotating large-pose face images.
Tasks	Face Recognition, Image Generation, Pose Estimation, Representation Learning, Robust Face Recognition
Published	2017-05-31
URL	http://arxiv.org/abs/1705.11136v2
PDF	http://arxiv.org/pdf/1705.11136v2.pdf
PWC	https://paperswithcode.com/paper/representation-learning-by-rotating-your
Repo
Framework

Dempster-Shafer Belief Function - A New Interpretation


Title	Dempster-Shafer Belief Function - A New Interpretation
Authors	Mieczysław Kłopotek
Abstract	We develop our interpretation of the joint belief distribution and of evidential updating that matches the following basic requirements: (1) there must exist an efficient method for reasoning within this framework; (2) there must exist a clear correspondence between the contents of the knowledge base and the real world; (3) there must be a clear correspondence between the reasoning method and some real world process; (4) there must exist a clear correspondence between the results of the reasoning process and the results of the real world process corresponding to the reasoning process.
Tasks
Published	2017-04-13
URL	http://arxiv.org/abs/1704.04000v1
PDF	http://arxiv.org/pdf/1704.04000v1.pdf
PWC	https://paperswithcode.com/paper/dempster-shafer-belief-function-a-new
Repo
Framework