January 30, 2020


Paper Group ANR 298

Deep-Aligned Convolutional Neural Network for Skeleton-based Action Recognition and Segmentation. Curvature: A signature for Action Recognition in Video Sequences. Deep Compressed Sensing. Learning robust visual representations using data augmentation invariance. Iterative Self-Learning: Semi-Supervised Improvement to Dataset Volumes and Model Accu …

Deep-Aligned Convolutional Neural Network for Skeleton-based Action Recognition and Segmentation


Title	Deep-Aligned Convolutional Neural Network for Skeleton-based Action Recognition and Segmentation
Authors	Babak Hosseini, Romain Montagne, Barbara Hammer
Abstract	Convolutional neural networks (CNNs) are deep learning frameworks which are well-known for their notable performance in classification tasks. Hence, many skeleton-based action recognition and segmentation (SBARS) algorithms benefit from them in their designs. However, a shortcoming of such applications is the general lack of spatial relationships between the input features in such data types. Besides, non-uniform temporal scaling is a common issue in skeleton-based data streams, which leads to different input sizes even within one specific action category. In this work, we propose a novel deep-aligned convolutional neural network (DACNN) to tackle the above challenges for the particular problem of SBARS. Our network is designed by introducing a new type of filter in the context of CNNs, which is trained based on its alignment to the local subsequences in the inputs. These filters result in efficient predictions as well as learning interpretable patterns in the data. We empirically evaluate our framework on real-world benchmarks, showing that the proposed DACNN algorithm obtains competitive performance compared to the state-of-the-art while benefiting from a less complicated yet more interpretable model.
Tasks	Skeleton Based Action Recognition
Published	2019-11-12
URL	https://arxiv.org/abs/1911.04969v1
PDF	https://arxiv.org/pdf/1911.04969v1.pdf
PWC	https://paperswithcode.com/paper/deep-aligned-convolutional-neural-network-for
Repo
Framework

Curvature: A signature for Action Recognition in Video Sequences


Title	Curvature: A signature for Action Recognition in Video Sequences
Authors	He Chen, Gregory S. Chirikjian
Abstract	In this paper, a novel signature of human action recognition, namely the curvature of a video sequence, is introduced. In this way, the distribution of sequential data is modeled, which enables few-shot learning. Instead of depending on recognizing features within images, our algorithm views actions as sequences on the universal time scale across a whole sequence of images. The video sequence, viewed as a curve in pixel space, is aligned by reparameterization using the arclength of the curve in pixel space. Once such curvatures are obtained, statistical indexes are extracted and fed into a learning-based classifier. Overall, our method is simple but powerful. Preliminary experimental results show that our method is effective and achieves state-of-the-art performance in video-based human action recognition. Moreover, we see latent capacity in transferring this idea into other sequence-based recognition applications such as speech recognition, machine translation, and text generation.
Tasks	Few-Shot Learning, Machine Translation, Speech Recognition, Temporal Action Localization, Text Generation
Published	2019-04-30
URL	https://arxiv.org/abs/1904.13003v2
PDF	https://arxiv.org/pdf/1904.13003v2.pdf
PWC	https://paperswithcode.com/paper/curvature-a-signature-for-action-recognition
Repo
Framework
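
The abstract above describes viewing a sequence as a curve, reparameterizing it by arclength, and then computing curvatures. A minimal sketch of that idea follows, assuming each frame has already been flattened to a list of floats. The function names `arclength_resample` and `discrete_curvature` are hypothetical, and the second-difference curvature estimate is a stand-in chosen for illustration, not necessarily the estimator used by the authors.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def arclength_resample(points, n):
    """Resample a polyline at n positions equally spaced in arclength.

    Returns the resampled points and the arclength step between them.
    """
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arclength at each original point.
    cum = [0.0]
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + _dist(p, q))
    total = cum[-1]
    if total == 0:
        raise ValueError("curve has zero length")
    out, seg = [], 0
    for k in range(n):
        s = total * k / (n - 1)  # target arclength for sample k
        # Advance to the segment that contains arclength s.
        while seg < len(points) - 2 and cum[seg + 1] < s:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (s - cum[seg]) / span
        p, q = points[seg], points[seg + 1]
        out.append([a + t * (b - a) for a, b in zip(p, q)])
    return out, total / (n - 1)

def discrete_curvature(points, step):
    """Second-difference estimate |p[k-1] - 2 p[k] + p[k+1]| / step^2."""
    return [
        _dist([a + c for a, c in zip(prev, nxt)], [2 * b for b in cur]) / step ** 2
        for prev, cur, nxt in zip(points, points[1:], points[2:])
    ]

# Usage: a path with one right-angle corner, given as unevenly spaced 2-D points.
curve, h = arclength_resample([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]], 5)
kappa = discrete_curvature(curve, h)
summary = {"mean": sum(kappa) / len(kappa), "max": max(kappa)}
```

Summary statistics such as `summary` are the kind of values that could then be passed to a classifier; which statistics the paper actually extracts is not stated in the abstract.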

Deep Compressed Sensing


Title	Deep Compressed Sensing
Authors	Yan Wu, Mihaela Rosca, Timothy Lillicrap
Abstract	Compressed sensing (CS) provides an elegant framework for recovering sparse signals from compressed measurements. For example, CS can exploit the structure of natural images and recover an image from only a few random measurements. CS is flexible and data efficient, but its application has been restricted by the strong assumption of sparsity and costly reconstruction process. A recent approach that combines CS with neural network generators has removed the constraint of sparsity, but reconstruction remains slow. Here we propose a novel framework that significantly improves both the performance and speed of signal recovery by jointly training a generator and the optimisation process for reconstruction via meta-learning. We explore training the measurements with different objectives, and derive a family of models based on minimising measurement errors. We show that Generative Adversarial Nets (GANs) can be viewed as a special case in this family of models. Borrowing insights from the CS perspective, we develop a novel way of improving GANs using gradient information from the discriminator.
Tasks	Meta-Learning
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06723v2
PDF	https://arxiv.org/pdf/1905.06723v2.pdf
PWC	https://paperswithcode.com/paper/deep-compressed-sensing
Repo
Framework

Learning robust visual representations using data augmentation invariance


Title	Learning robust visual representations using data augmentation invariance
Authors	Alex Hernández-García, Peter König, Tim C. Kietzmann
Abstract	Deep convolutional neural networks trained for image object categorization have shown remarkable similarities with representations found across the primate ventral visual stream. Yet, artificial and biological networks still exhibit important differences. Here we investigate one such property: increasing invariance to identity-preserving image transformations found along the ventral stream. Despite theoretical evidence that invariance should emerge naturally from the optimization process, we present empirical evidence that the activations of convolutional neural networks trained for object categorization are not robust to identity-preserving image transformations commonly used in data augmentation. As a solution, we propose data augmentation invariance, an unsupervised learning objective which improves the robustness of the learned representations by promoting the similarity between the activations of augmented image samples. Our results show that this approach is a simple, yet effective and efficient (10 % increase in training time) way of increasing the invariance of the models while obtaining similar categorization performance.
Tasks	Data Augmentation
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04547v1
PDF	https://arxiv.org/pdf/1906.04547v1.pdf
PWC	https://paperswithcode.com/paper/learning-robust-visual-representations-using
Repo
Framework
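
The abstract above proposes an objective that promotes similarity between the activations of augmented versions of the same image. One plausible form of such an objective is sketched below, assuming activations are plain lists of floats. The name `invariance_loss` and the specific ratio (mean squared distance within an image's augmentations divided by the mean squared distance over all pairs) are illustrative assumptions, not details confirmed by the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two activation vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def invariance_loss(activations, image_ids):
    """Ratio of same-image pair distances to all-pair distances.

    activations: one vector per augmented sample in the batch.
    image_ids:   the source image each sample was augmented from.
    A value near 0 means augmentations of one image map to nearly
    identical activations relative to the spread of the whole batch.
    """
    same, every = [], []
    n = len(activations)
    for i in range(n):
        for j in range(i + 1, n):
            d = sq_dist(activations[i], activations[j])
            every.append(d)
            if image_ids[i] == image_ids[j]:
                same.append(d)
    if not same or sum(every) == 0:
        return 0.0  # nothing to compare, or a fully collapsed batch
    return (sum(same) / len(same)) / (sum(every) / len(every))

# Usage: two source images, two augmentations each.
acts = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
ids = [0, 0, 1, 1]
loss = invariance_loss(acts, ids)
```

In a training setup this term would be added to the usual categorization loss with some weight; the weighting and the layers it is applied to are left open here.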

Iterative Self-Learning: Semi-Supervised Improvement to Dataset Volumes and Model Accuracy


Title	Iterative Self-Learning: Semi-Supervised Improvement to Dataset Volumes and Model Accuracy
Authors	Robert Dupre, Jiri Fajtl, Vasileios Argyriou, Paolo Remagnin
Abstract	A novel semi-supervised learning technique is introduced based on a simple iterative learning cycle together with learned thresholding techniques and an ensemble decision support system. State-of-the-art model performance and increased training data volume are demonstrated, through the use of unlabelled data when training deeply learned classification models. Evaluation of the proposed approach is performed on commonly used datasets when evaluating semi-supervised learning techniques as well as a number of more challenging image classification datasets (CIFAR-100 and a 200 class subset of ImageNet).
Tasks	Image Classification
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02823v1
PDF	https://arxiv.org/pdf/1906.02823v1.pdf
PWC	https://paperswithcode.com/paper/iterative-self-learning-semi-supervised
Repo
Framework

Spatial Shortcut Network for Human Pose Estimation


Title	Spatial Shortcut Network for Human Pose Estimation
Authors	Te Qi, Bayram Bayramli, Usman Ali, Qinchuan Zhang, Hongtao Lu
Abstract	Like many computer vision problems, human pose estimation is challenging in that recognizing a body part requires information not only from the local area but also from areas at a large spatial distance. To pass information spatially, large convolutional kernels and deep layers have normally been used, introducing high computation cost and a large parameter space. Luckily for pose estimation, the human body is geometrically structured in images, enabling modeling of spatial dependency. In this paper, we propose a spatial shortcut network for the pose estimation task, in which information flows more easily across space. We evaluate our model with detailed analyses and present its outstanding performance with a smaller structure.
Tasks	Pose Estimation
Published	2019-04-05
URL	http://arxiv.org/abs/1904.03141v1
PDF	http://arxiv.org/pdf/1904.03141v1.pdf
PWC	https://paperswithcode.com/paper/spatial-shortcut-network-for-human-pose
Repo
Framework

One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis


Title	One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis
Authors	Vishal Sunder, Ashwin Srinivasan, Lovekesh Vig, Gautam Shroff, Rohit Rahul
Abstract	Our interest in this paper is in meeting a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, receipts, etc. In practice, users are able to provide only a very small number of example images labeled with the information that needs to be extracted. We adopt a novel two-level neuro-deductive approach, where (a) we use pre-trained deep neural networks to populate a relational database with facts about each document-image; and (b) we use a form of deductive reasoning, related to meta-interpretive learning of transition systems, to learn extraction programs: given task-specific transitions defined using the entities and relations identified by the neural detectors and a small number of instances (usually 1, sometimes 2) of images and the desired outputs, a resource-bounded meta-interpreter constructs proofs for the instance(s) via logical deduction; a set of logic programs that extract each desired entity is easily synthesized from such proofs. In most cases a single training example together with a noisy clone of itself suffices to learn a program-set that generalizes well on test documents, at which time the value of each entity is determined by a majority vote across its program-set. We demonstrate our two-level neuro-deductive approach on publicly available datasets (“Patent” and “Doctor’s Bills”) and also describe its use in a real-life industrial problem.
Tasks	Program Synthesis
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02427v1
PDF	https://arxiv.org/pdf/1906.02427v1.pdf
PWC	https://paperswithcode.com/paper/one-shot-information-extraction-from-document
Repo
Framework
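
The abstract above states that, at test time, each entity's value is decided by a majority vote across its program-set. A minimal sketch of that voting step follows, assuming each synthesized program can be represented as a Python callable that returns a value or `None`. The name `majority_vote`, the toy programs, and the dictionary-based document are hypothetical; the paper's programs are logic programs, not Python functions.

```python
from collections import Counter

def majority_vote(programs, document):
    """Run every program on the document and return the most common answer.

    Programs that raise or return None are treated as abstaining.
    Ties go to the answer that was produced first.
    """
    votes = []
    for program in programs:
        try:
            value = program(document)
        except Exception:
            continue  # a failing program simply does not vote
        if value is not None:
            votes.append(value)
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

# Usage with three toy "extraction programs" for an invoice total.
doc = {"right_of_Total": "42.00", "last_amount": "42.00", "first_amount": "7.50"}
programs = [
    lambda d: d.get("right_of_Total"),
    lambda d: d.get("last_amount"),
    lambda d: d.get("first_amount"),
]
total = majority_vote(programs, doc)
```

Letting failed programs abstain rather than abort the vote is a design choice made here for robustness; the abstract does not say how such cases are handled.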

OpenHowNet: An Open Sememe-based Lexical Knowledge Base


Title	OpenHowNet: An Open Sememe-based Lexical Knowledge Base
Authors	Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Qiang Dong, Maosong Sun, Zhendong Dong
Abstract	In this paper, we present OpenHowNet, an open sememe-based lexical knowledge base. Based on the well-known HowNet, OpenHowNet comprises three components: core data, which is composed of more than 100 thousand senses annotated with sememes; OpenHowNet Web, which gives a brief introduction to OpenHowNet and provides an online exhibition of OpenHowNet information; and OpenHowNet API, which includes several useful APIs for tasks such as accessing OpenHowNet core data and drawing sememe tree structures of senses. In the main text, we first give some background, including the definition of a sememe and details of HowNet. We then introduce previous HowNet and sememe-based research. Finally, we detail the constituents of OpenHowNet and their basic features and functionalities, give a brief summary, and list some future work.
Tasks
Published	2019-01-28
URL	http://arxiv.org/abs/1901.09957v1
PDF	http://arxiv.org/pdf/1901.09957v1.pdf
PWC	https://paperswithcode.com/paper/openhownet-an-open-sememe-based-lexical
Repo
Framework

Memory-Attended Recurrent Network for Video Captioning


Title	Memory-Attended Recurrent Network for Video Captioning
Authors	Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai
Abstract	Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on the one source video being processed. A potential disadvantage of such a design is that it cannot capture the multiple visual contexts of a word that appears in more than one relevant video in the training data. To tackle this limitation, we propose the Memory-Attended Recurrent Network (MARN) for video captioning, in which a memory structure is designed to explore the full-spectrum correspondence between a word and its various similar visual contexts across videos in the training data. Thus, our model is able to achieve a more comprehensive understanding of each word and yield higher captioning quality. Furthermore, the built memory structure enables our method to model the compatibility between adjacent words explicitly instead of asking the model to learn it implicitly, as most existing models do. Extensive validation on two real-world datasets demonstrates that our MARN consistently outperforms state-of-the-art methods.
Tasks	Video Captioning
Published	2019-05-10
URL	https://arxiv.org/abs/1905.03966v1
PDF	https://arxiv.org/pdf/1905.03966v1.pdf
PWC	https://paperswithcode.com/paper/memory-attended-recurrent-network-for-video
Repo
Framework

NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks


Title	NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks
Authors	Yandong Li, Lijun Li, Liqiang Wang, Tong Zhang, Boqing Gong
Abstract	Powerful adversarial attack methods are vital for understanding how to construct robust deep neural networks (DNNs) and for thoroughly testing defense techniques. In this paper, we propose a black-box adversarial attack algorithm that can defeat both vanilla DNNs and those generated by various recently developed defense techniques. Instead of searching for an “optimal” adversarial example for a benign input to a targeted DNN, our algorithm finds a probability density distribution over a small region centered around the input, such that a sample drawn from this distribution is likely an adversarial example, without the need to access the DNN’s internal layers or weights. Our approach is universal, as it can successfully attack different neural networks with a single algorithm. It is also strong; in tests against 2 vanilla DNNs and 13 defended ones, it outperforms state-of-the-art black-box or white-box attack methods for most test cases. Additionally, our results reveal that adversarial training remains one of the best defense techniques, and that adversarial examples are not as transferable across defended DNNs as they are across vanilla DNNs.
Tasks	Adversarial Attack
Published	2019-05-01
URL	https://arxiv.org/abs/1905.00441v3
PDF	https://arxiv.org/pdf/1905.00441v3.pdf
PWC	https://paperswithcode.com/paper/nattack-learning-the-distributions-of
Repo
Framework

A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning


Title	A Tractable Algorithm For Finite-Horizon Continuous Reinforcement Learning
Authors	Phanideep Gampa, Sairam Satwik Kondamudi, Lakshmanan Kailasam
Abstract	We consider the finite-horizon continuous reinforcement learning problem. Our contribution is three-fold. First, we give a tractable algorithm based on optimistic value iteration for the problem. Next, we give a lower bound on regret of order $\Omega(T^{2/3})$ for any algorithm that discretizes the state space, improving the previous regret bound of $\Omega(T^{1/2})$ of Ortner and Ryabko for the same problem. Next, under the assumption that the rewards and transitions are Hölder continuous, we show that the upper bound on the discretization error is $\mathrm{const} \cdot L n^{-\alpha} T$. Finally, we give some simple experiments to validate our propositions.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11245v1
PDF	https://arxiv.org/pdf/1906.11245v1.pdf
PWC	https://paperswithcode.com/paper/a-tractable-algorithm-for-finite-horizon
Repo
Framework

Monotonic Gaussian Process Flow


Title	Monotonic Gaussian Process Flow
Authors	Ivan Ustyuzhaninov, Ieva Kazlauskaite, Carl Henrik Ek, Neill D. F. Campbell
Abstract	We propose a new framework for imposing monotonicity constraints in a Bayesian nonparametric setting based on numerical solutions of stochastic differential equations. We derive a nonparametric model of monotonic functions that allows for interpretable priors and principled quantification of hierarchical uncertainty. We demonstrate the efficacy of the proposed model by providing competitive results to other probabilistic monotonic models on a number of benchmark functions. In addition, we consider the utility of a monotonic random process as a part of a hierarchical probabilistic model; we examine the task of temporal alignment of time-series data where it is beneficial to use a monotonic random process in order to preserve the uncertainty in the temporal warpings.
Tasks	Gaussian Processes, Time Series
Published	2019-05-30
URL	https://arxiv.org/abs/1905.12930v2
PDF	https://arxiv.org/pdf/1905.12930v2.pdf
PWC	https://paperswithcode.com/paper/monotonic-gaussian-process-flow
Repo
Framework

Neural Likelihoods for Multi-Output Gaussian Processes


Title	Neural Likelihoods for Multi-Output Gaussian Processes
Authors	Martin Jankowiak, Jacob Gardner
Abstract	We construct flexible likelihoods for multi-output Gaussian process models that leverage neural networks as components. We make use of sparse variational inference methods to enable scalable approximate inference for the resulting class of models. An attractive feature of these models is that they can admit analytic predictive means even when the likelihood is non-linear and the predictive distributions are non-Gaussian. We validate the modeling potential of these models in a variety of experiments in both the supervised and unsupervised setting. We demonstrate that the flexibility of these ‘neural’ likelihoods can improve prediction quality as compared to simpler Gaussian process models and that neural likelihoods can be readily combined with a variety of underlying Gaussian process models, including deep Gaussian processes.
Tasks	Gaussian Processes
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13697v1
PDF	https://arxiv.org/pdf/1905.13697v1.pdf
PWC	https://paperswithcode.com/paper/neural-likelihoods-for-multi-output-gaussian
Repo
Framework

POBA-GA: Perturbation Optimized Black-Box Adversarial Attacks via Genetic Algorithm


Title	POBA-GA: Perturbation Optimized Black-Box Adversarial Attacks via Genetic Algorithm
Authors	Jinyin Chen, Mengmeng Su, Shijing Shen, Hui Xiong, Haibin Zheng
Abstract	Most deep learning models are vulnerable to adversarial attacks. Various adversarial attacks are designed to evaluate the robustness of models and to develop defense models. Currently, each adversarial attack is proposed against its own target model and with its own evaluation metrics, and most black-box adversarial attack algorithms cannot achieve the expected success rate compared with white-box attacks. In this paper, comprehensive evaluation metrics are proposed for different adversarial attack methods. A novel perturbation optimized black-box adversarial attack based on genetic algorithm (POBA-GA) is proposed for achieving attack performance comparable to white-box attacks. Approximately optimal adversarial examples are evolved through evolutionary operations including initialization, selection, crossover and mutation. The fitness function is specifically designed to evaluate each individual example in terms of both attack ability and perturbation control. A population diversity strategy is introduced in the evolutionary process to help ensure that approximately optimal perturbations are obtained. Comprehensive experiments are carried out to test POBA-GA’s performance. Both simulation and application results show that our method is better than current state-of-the-art black-box attack methods in terms of attack capability and perturbation control.
Tasks	Adversarial Attack
Published	2019-05-01
URL	https://arxiv.org/abs/1906.03181v1
PDF	https://arxiv.org/pdf/1906.03181v1.pdf
PWC	https://paperswithcode.com/paper/poba-ga-perturbation-optimized-black-box
Repo
Framework
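
The abstract above lists the evolutionary operations (initialization, selection, crossover, mutation) and a fitness function that balances attack ability against perturbation size. A toy sketch of such a loop is given below. Everything specific in it is an assumption for illustration: the names `fitness` and `evolve`, the L2 penalty weighted by `alpha`, truncation selection, uniform crossover, Gaussian mutation, elitism, and the stand-in `score_fn` that plays the role of a black-box model's attack score. It does not reproduce the paper's actual operators or its population diversity strategy.

```python
import random

def fitness(perturbation, x, score_fn, alpha=0.1):
    """Attack score of the perturbed input minus a penalty on perturbation size."""
    adversarial = [xi + pi for xi, pi in zip(x, perturbation)]
    size = sum(p * p for p in perturbation) ** 0.5
    return score_fn(adversarial) - alpha * size

def evolve(x, score_fn, pop_size=20, generations=30, sigma=0.1, alpha=0.1, seed=0):
    """Evolve perturbations of x; returns the best one and the per-generation best fitness."""
    rng = random.Random(seed)
    # Initialization: small random perturbations around zero.
    pop = [[rng.gauss(0, sigma) for _ in x] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=lambda p: fitness(p, x, score_fn, alpha), reverse=True)
        history.append(fitness(ranked[0], x, score_fn, alpha))
        parents = ranked[: pop_size // 2]   # selection: keep the fitter half
        children = [ranked[0][:]]           # elitism: carry the best over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: take each coordinate from one parent at random.
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            # Mutation: occasionally add Gaussian noise to a coordinate.
            child = [c + rng.gauss(0, sigma) if rng.random() < 0.2 else c for c in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda p: fitness(p, x, score_fn, alpha))
    return best, history

# Usage: a stand-in black-box score that rises as the input's sum falls.
toy_score = lambda v: -sum(v)
best, history = evolve([0.5, 0.5, 0.5], toy_score, seed=1)
```

Because the best individual is copied into each new generation, the values in `history` never decrease, which makes the loop easy to sanity-check.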

AT-GAN: An Adversarial Generator Model for Non-constrained Adversarial Examples


Title	AT-GAN: An Adversarial Generator Model for Non-constrained Adversarial Examples
Authors	Xiaosen Wang, Kun He, Chuanbiao Song, Liwei Wang, John E. Hopcroft
Abstract	Despite the rapid development of adversarial machine learning, most adversarial attack and defense research focuses mainly on perturbation-based adversarial examples, which are constrained by the input images. In comparison with existing works, we propose non-constrained adversarial examples, which are generated entirely from scratch without any constraint on the input. Unlike perturbation-based attacks, or the so-called unrestricted adversarial attack which is still constrained by the input noise, we aim to learn the distribution of adversarial examples to generate non-constrained but semantically meaningful adversarial examples. Following this spirit, we propose a novel attack framework called AT-GAN (Adversarial Transfer on Generative Adversarial Net). Specifically, we first develop a normal GAN model to learn the distribution of benign data, and then transfer the pre-trained GAN model to estimate the distribution of adversarial examples for the target model. In this way, AT-GAN can learn a distribution of adversarial examples that is very close to the distribution of real data. To our knowledge, this is the first work to build an adversarial generator model that can produce adversarial examples directly from any input noise. Extensive experiments and visualizations show that the proposed AT-GAN can very efficiently generate diverse adversarial examples that are more realistic to human perception. In addition, AT-GAN yields higher attack success rates against adversarially trained models under the white-box attack setting and exhibits moderate transferability against black-box models.
Tasks	Adversarial Attack
Published	2019-04-16
URL	https://arxiv.org/abs/1904.07793v4
PDF	https://arxiv.org/pdf/1904.07793v4.pdf
PWC	https://paperswithcode.com/paper/at-gan-a-generative-attack-model-for
Repo
Framework