July 29, 2019

Paper Group AWR 116

Learning Universal Adversarial Perturbations with Generative Models


Title	Learning Universal Adversarial Perturbations with Generative Models
Authors	Jamie Hayes, George Danezis
Abstract	Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally perturbed to remain visually similar to the source input, but cause a misclassification. It was recently shown that given a dataset and classifier, there exist so-called universal adversarial perturbations: a single perturbation that causes a misclassification when applied to any input. In this work, we introduce universal adversarial networks, a generative network that is capable of fooling a target classifier when its generated output is added to a clean sample from a dataset. We show that this technique improves on known universal adversarial attacks.
Tasks	Graph Classification
Published	2017-08-17
URL	http://arxiv.org/abs/1708.05207v3
PDF	http://arxiv.org/pdf/1708.05207v3.pdf
PWC	https://paperswithcode.com/paper/learning-universal-adversarial-perturbations
Repo	https://github.com/jhayes14/UAN
Framework	pytorch

The Lifted Matrix-Space Model for Semantic Composition


Title	The Lifted Matrix-Space Model for Semantic Composition
Authors	WooJin Chung, Sheng-Fu Wang, Samuel R. Bowman
Abstract	Tree-structured neural network architectures for sentence encoding draw inspiration from the approach to semantic composition generally seen in formal linguistics, and have shown empirical improvements over comparable sequence models by doing so. Moreover, adding multiplicative interaction terms to the composition functions in these models can yield significant further improvements. However, existing compositional approaches that adopt such a powerful composition function scale poorly, with parameter counts exploding as model dimension or vocabulary size grows. We introduce the Lifted Matrix-Space model, which uses a global transformation to map vector word embeddings to matrices, which can then be composed via an operation based on matrix-matrix multiplication. Its composition function effectively transmits a larger number of activations across layers with relatively few model parameters. We evaluate our model on the Stanford NLI corpus, the Multi-Genre NLI corpus, and the Stanford Sentiment Treebank and find that it consistently outperforms TreeLSTM (Tai et al., 2015), the previous best known composition function for tree-structured models.
Tasks	Semantic Composition, Word Embeddings
Published	2017-11-09
URL	http://arxiv.org/abs/1711.03602v2
PDF	http://arxiv.org/pdf/1711.03602v2.pdf
PWC	https://paperswithcode.com/paper/the-lifted-matrix-space-model-for-semantic
Repo	https://github.com/woojinchung/lms
Framework	pytorch

Table-to-text Generation by Structure-aware Seq2seq Learning


Title	Table-to-text Generation by Structure-aware Seq2seq Learning
Authors	Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, Zhifang Sui
Abstract	Table-to-text generation aims to generate a description for a factual table, which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq architecture which consists of a field-gating encoder and a description generator with dual attention. In the encoding phase, we update the cell memory of the LSTM unit by a field gate and its corresponding field value in order to incorporate field information into the table representation. In the decoding phase, a dual attention mechanism which contains word-level attention and field-level attention is proposed to model the semantic relevance between the generated description and the table. We conduct experiments on the WIKIBIO dataset, which contains over 700k biographies and corresponding infoboxes from Wikipedia. The attention visualizations and case studies show that our model is capable of generating coherent and informative descriptions based on a comprehensive understanding of both the content and the structure of a table. Automatic evaluations also show that our model outperforms the baselines by a great margin. Code for this work is available at https://github.com/tyliupku/wiki2bio.
Tasks	Table-to-Text Generation, Text Generation
Published	2017-11-27
URL	http://arxiv.org/abs/1711.09724v1
PDF	http://arxiv.org/pdf/1711.09724v1.pdf
PWC	https://paperswithcode.com/paper/table-to-text-generation-by-structure-aware
Repo	https://github.com/tyliupku/wiki2bio
Framework	tf

Specialising Word Vectors for Lexical Entailment


Title	Specialising Word Vectors for Lexical Entailment
Authors	Ivan Vulić, Nikola Mrkšić
Abstract	We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2017-10-17
URL	http://arxiv.org/abs/1710.06371v2
PDF	http://arxiv.org/pdf/1710.06371v2.pdf
PWC	https://paperswithcode.com/paper/specialising-word-vectors-for-lexical-1
Repo	https://github.com/nmrksic/lear
Framework	tf

Accelerated Nearest Neighbor Search with Quick ADC


Title	Accelerated Nearest Neighbor Search with Quick ADC
Authors	Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec
Abstract	Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. Because it offers low response times, Product Quantization (PQ) is a popular solution. PQ compresses high-dimensional vectors into short codes using several sub-quantizers, which enables in-RAM storage of large databases. This allows fast answers to NN queries, without accessing the SSD or HDD. The key feature of PQ is that it can compute distances between short codes and high-dimensional vectors using cache-resident lookup tables. The efficiency of this technique, named Asymmetric Distance Computation (ADC), remains limited because it performs many cache accesses. In this paper, we introduce Quick ADC, a novel technique that achieves a 3 to 6 times speedup over ADC by exploiting Single Instruction Multiple Data (SIMD) units available in current CPUs. Efficiently exploiting SIMD requires algorithmic changes to the ADC procedure. Namely, Quick ADC relies on two key modifications of ADC: (i) the use of 4-bit sub-quantizers instead of the standard 8-bit sub-quantizers and (ii) the quantization of floating-point distances. This allows Quick ADC to exceed the performance of state-of-the-art systems, e.g., it achieves a Recall@100 of 0.94 in 3.4 ms on 1 billion SIFT descriptors (128-bit codes).
Tasks	Quantization
Published	2017-04-24
URL	http://arxiv.org/abs/1704.07355v1
PDF	http://arxiv.org/pdf/1704.07355v1.pdf
PWC	https://paperswithcode.com/paper/accelerated-nearest-neighbor-search-with
Repo	https://github.com/technicolor-research/quick-adc
Framework	none
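
The lookup-table idea behind ADC in the abstract above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the codebooks are random instead of trained, the sizes `M`, `K` and `D` are hypothetical, and neither the SIMD layout nor the quantization of distances that Quick ADC adds is shown. `K = 16` is chosen so that each centroid index fits in 4 bits, as in the paper's 4-bit sub-quantizers.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Cut a vector into m contiguous sub-vectors of equal length."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(split(vec, len(codebooks)), codebooks)]

def lookup_tables(query, codebooks):
    """One table per sub-quantizer: distance from the query sub-vector to each centroid."""
    return [[sq_dist(sub, c) for c in cb]
            for sub, cb in zip(split(query, len(codebooks)), codebooks)]

def adc_distance(code, tables):
    """Asymmetric distance: one table lookup per sub-quantizer, then a sum."""
    return sum(table[idx] for table, idx in zip(tables, code))

random.seed(0)
M, K, D = 4, 16, 8  # hypothetical sizes: 4 sub-quantizers, 16 centroids each, 8-dim vectors
# Assumption: random codebooks stand in for codebooks that would normally be learned.
codebooks = [[[random.random() for _ in range(D // M)] for _ in range(K)]
             for _ in range(M)]
database = [[random.random() for _ in range(D)] for _ in range(100)]
codes = [encode(v, codebooks) for v in database]

query = [random.random() for _ in range(D)]
tables = lookup_tables(query, codebooks)  # built once per query
nearest = min(range(len(codes)), key=lambda i: adc_distance(codes[i], tables))
```

The query stays uncompressed while the database vectors are codes, which is why the distance is called asymmetric. The tables are built once per query and reused for every database code, so scanning the database costs `M` lookups and additions per vector.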

Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice


Title	Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice
Authors	Hongzhou Lin, Julien Mairal, Zaid Harchaoui
Abstract	We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieve acceleration in theory and in practice is to solve these sub-problems with appropriate accuracy by using the right stopping criterion and the right warm-start strategy. We give practical guidelines to use Catalyst and present a comprehensive analysis of its global complexity. We show that Catalyst applies to a large class of algorithms, including gradient descent, block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG, MISO/Finito, and their proximal variants. For all of these methods, we establish faster rates using the Catalyst acceleration, for strongly convex and non-strongly convex objectives. We conclude with extensive experiments showing that acceleration is useful in practice, especially for ill-conditioned problems.
Tasks
Published	2017-12-15
URL	http://arxiv.org/abs/1712.05654v2
PDF	http://arxiv.org/pdf/1712.05654v2.pdf
PWC	https://paperswithcode.com/paper/catalyst-acceleration-for-first-order-convex
Repo	https://github.com/hongzhoulin89/Catalyst-QNing
Framework	none
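
The outer loop described in the abstract above (approximately solve an auxiliary problem, then extrapolate) can be sketched as follows. This is a simplified illustration and not the authors' code: the toy quadratic, the value of `KAPPA`, the fixed inner budget of 50 gradient steps, the warm start at the previous iterate, and the constant extrapolation coefficient (a choice assumed here for the strongly convex case) are all assumptions made for the example.

```python
import math

A = (1.0, 100.0)        # curvatures of the toy objective; ill-conditioned on purpose
MU, L = min(A), max(A)  # strong convexity and smoothness constants of f
KAPPA = 10.0            # assumed smoothing parameter of the auxiliary problems

def f(x):
    """Toy objective f(x) = 0.5 * sum(a_i * x_i^2)."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def grad_f(x):
    return [a * xi for a, xi in zip(A, x)]

def solve_subproblem(x_start, y, steps=50):
    """Approximately minimise h(x) = f(x) + KAPPA/2 * ||x - y||^2 by gradient descent."""
    x = list(x_start)
    step = 1.0 / (L + KAPPA)  # h is (L + KAPPA)-smooth
    for _ in range(steps):
        g = [gi + KAPPA * (xi - yi) for gi, xi, yi in zip(grad_f(x), x, y)]
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def catalyst(x0, outer_iters=60):
    """Inexact accelerated proximal point loop with constant extrapolation."""
    q = MU / (MU + KAPPA)
    beta = (1 - math.sqrt(q)) / (1 + math.sqrt(q))
    x_prev, y = list(x0), list(x0)
    for _ in range(outer_iters):
        x = solve_subproblem(x_prev, y)  # warm start at the previous iterate
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]  # extrapolation
        x_prev = x
    return x_prev

x_final = catalyst([1.0, 1.0])
```

Each auxiliary problem is better conditioned than `f` itself, since its condition number is `(L + KAPPA) / (MU + KAPPA)` instead of `L / MU`, which is what makes the inner solves cheap. The inner solver here is plain gradient descent, but any of the first-order methods listed in the abstract could take its place.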

Crowd counting via scale-adaptive convolutional neural network


Title	Crowd counting via scale-adaptive convolutional neural network
Authors	Lu Zhang, Miaojing Shi, Qiaobo Chen
Abstract	The task of crowd counting is to automatically estimate the number of pedestrians in crowd images. To cope with the scale and perspective changes that commonly exist in crowd images, state-of-the-art approaches employ multi-column CNN architectures to regress density maps of crowd images. Multiple columns have different receptive fields corresponding to pedestrians (heads) of different scales. We instead propose a scale-adaptive CNN (SaCNN) architecture with a backbone of fixed small receptive fields. We extract feature maps from multiple layers and adapt them to have the same output size; we combine them to produce the final density map. The number of people is computed by integrating the density map. We also introduce a relative count loss along with the density map loss to improve the network generalization on crowd scenes with few pedestrians, on which most representative approaches perform poorly. We conduct extensive experiments on the ShanghaiTech, UCF_CC_50 and WorldExpo datasets as well as a new dataset, SmartCity, that we collect for crowd scenes with few people. The results demonstrate significant improvements of SaCNN over the state-of-the-art.
Tasks	Crowd Counting
Published	2017-11-13
URL	http://arxiv.org/abs/1711.04433v4
PDF	http://arxiv.org/pdf/1711.04433v4.pdf
PWC	https://paperswithcode.com/paper/crowd-counting-via-scale-adaptive
Repo	https://github.com/miao0913/SaCNN-CrowdCounting-Tencent_Youtu
Framework	none
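
The statement in the abstract above that the number of people is obtained by integrating the density map can be illustrated with a small, self-contained sketch. It builds a density map from annotated head positions and sums it; in the paper the map would be predicted by the network. The kernel size, `sigma`, image size and head coordinates are hypothetical values chosen for the example.

```python
import math

def gaussian_kernel(size=7, sigma=1.5):
    """Square Gaussian kernel normalised so that its entries sum to 1."""
    half = size // 2
    k = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def density_map(heads, height, width, kernel):
    """Place one unit-mass kernel at every annotated head position."""
    dmap = [[0.0] * width for _ in range(height)]
    half = len(kernel) // 2
    for r, c in heads:
        for i, row in enumerate(kernel):
            for j, v in enumerate(row):
                rr, cc = r + i - half, c + j - half
                if 0 <= rr < height and 0 <= cc < width:  # drop mass outside the image
                    dmap[rr][cc] += v
    return dmap

def count_from_density(dmap):
    """Integrate (sum) the density map to get the estimated head count."""
    return sum(map(sum, dmap))

heads = [(10, 10), (12, 30), (25, 18)]  # hypothetical (row, col) annotations
dmap = density_map(heads, 40, 40, gaussian_kernel())
estimated = count_from_density(dmap)
```

Because every head contributes a kernel of total mass 1, the sum of the map equals the number of heads as long as no kernel is cut off by the image border.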

ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification


Title	ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification
Authors	Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O’Connor
Abstract	In this paper we propose ResnetCrowd, a deep residual architecture for simultaneous crowd counting, violent behaviour detection and crowd density level classification. To train and evaluate the proposed multi-objective technique, a new 100 image dataset referred to as Multi Task Crowd is constructed. This new dataset is the first computer vision dataset fully annotated for crowd counting, violent behaviour detection and density level classification. Our experiments show that a multi-task approach boosts individual task performance for all tasks and most notably for violent behaviour detection which receives a 9% boost in ROC curve AUC (Area under the curve). The trained ResnetCrowd model is also evaluated on several additional benchmarks highlighting the superior generalisation of crowd analysis models trained for multiple objectives.
Tasks	Crowd Counting
Published	2017-05-30
URL	http://arxiv.org/abs/1705.10698v1
PDF	http://arxiv.org/pdf/1705.10698v1.pdf
PWC	https://paperswithcode.com/paper/resnetcrowd-a-residual-deep-learning
Repo	https://github.com/lcylmhlcy/ResnetCrowd-Caffe
Framework	caffe2

Automatic Goal Generation for Reinforcement Learning Agents


Title	Automatic Goal Generation for Reinforcement Learning Agents
Authors	Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel
Abstract	Reinforcement learning is a powerful technique to train an agent to perform a task. However, an agent that is trained using reinforcement learning is only capable of achieving the single task that is specified via its reward function. Such an approach does not scale well to settings in which an agent needs to perform a diverse set of tasks, such as navigating to varying positions in a room or moving objects to varying locations. Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing. We use a generator network to propose tasks for the agent to try to achieve, specified as goal states. The generator network is optimized using adversarial training to produce tasks that are always at the appropriate level of difficulty for the agent. Our method thus automatically produces a curriculum of tasks for the agent to learn. We show that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment. Our method can also learn to achieve tasks with sparse rewards, which traditionally pose significant challenges.
Tasks
Published	2017-05-17
URL	http://arxiv.org/abs/1705.06366v5
PDF	http://arxiv.org/pdf/1705.06366v5.pdf
PWC	https://paperswithcode.com/paper/automatic-goal-generation-for-reinforcement
Repo	https://github.com/jeffchy/Artificial-Idiot
Framework	tf

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image


Title	Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image
Authors	Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao
Abstract	Although deep convolutional neural networks have been shown to efficiently eliminate coding artifacts caused by the coarse quantization of a traditional codec, it is difficult to train any neural network placed in front of the encoder, because gradients cannot be back-propagated through the codec. In this paper, we propose an end-to-end image compression framework based on convolutional neural networks to resolve the problem of non-differentiability of the quantization function in the standard codec. First, a feature description neural network is used to obtain a valid description of the ground-truth image in a low-dimensional space, so that the amount of image data is greatly reduced for storage or transmission. The valid description is then further compressed with a standard image codec such as JPEG, which introduces severe distortion and compression artifacts, especially blocking artifacts, missing detail, blurring, and ringing artifacts. Then, we use a post-processing neural network to remove these artifacts. Because it is challenging to directly learn a non-linear function for a standard codec with a convolutional neural network, we propose to learn a virtual codec neural network that approximates the projection from the valid description image to the post-processed compressed image, so that the gradient can be efficiently back-propagated from the post-processing neural network to the feature description neural network during training. Meanwhile, an advanced learning algorithm is proposed to train our deep neural networks for compression. An advantage of the proposed method is that it is compatible with existing standard codecs, and our learning strategy can be easily extended to these codecs based on convolutional neural networks. Experimental results demonstrate the advantages of the proposed method over several state-of-the-art approaches, especially at very low bit-rates.
Tasks	Image Compression, Quantization
Published	2017-12-16
URL	http://arxiv.org/abs/1712.05969v7
PDF	http://arxiv.org/pdf/1712.05969v7.pdf
PWC	https://paperswithcode.com/paper/learning-a-virtual-codec-based-on-deep
Repo	https://github.com/mdcnn/mdcnn.github.io
Framework	none

NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems


Title	NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems
Authors	Ozan Caglayan, Mercedes García-Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault
Abstract	In this paper, we present nmtpy, a flexible Python toolkit based on Theano for training Neural Machine Translation and other neural sequence-to-sequence architectures. nmtpy decouples the specification of a network from the training and inference utilities to simplify the addition of a new architecture and reduce the amount of boilerplate code to be written. nmtpy has been used for LIUM’s top-ranked submissions to WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.
Tasks	Machine Translation, Multimodal Machine Translation
Published	2017-06-01
URL	http://arxiv.org/abs/1706.00457v1
PDF	http://arxiv.org/pdf/1706.00457v1.pdf
PWC	https://paperswithcode.com/paper/nmtpy-a-flexible-toolkit-for-advanced-neural
Repo	https://github.com/lium-lst/nmtpy
Framework	none

Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets


Title	Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets
Authors	Sanjay Purushotham, Chuizheng Meng, Zhengping Che, Yan Liu
Abstract	Deep learning models (aka Deep Neural Networks) have revolutionized many fields including computer vision, natural language processing, and speech recognition, and are being increasingly used in clinical healthcare applications. However, few works exist which have benchmarked the performance of deep learning models with respect to state-of-the-art machine learning models and prognostic scoring systems on publicly available healthcare datasets. In this paper, we present the benchmarking results for several clinical prediction tasks such as mortality prediction, length of stay prediction, and ICD-9 code group prediction using Deep Learning models, an ensemble of machine learning models (Super Learner algorithm), and SAPS II and SOFA scores. We used the publicly available Medical Information Mart for Intensive Care III (MIMIC-III) (v1.4) dataset, which includes all patients admitted to an ICU at the Beth Israel Deaconess Medical Center from 2001 to 2012, for the benchmarking tasks. Our results show that deep learning models consistently outperform all the other approaches, especially when the ‘raw’ clinical time series data is used as input features to the models.
Tasks	Length-of-Stay prediction, Mortality Prediction, Speech Recognition, Time Series
Published	2017-10-23
URL	http://arxiv.org/abs/1710.08531v1
PDF	http://arxiv.org/pdf/1710.08531v1.pdf
PWC	https://paperswithcode.com/paper/benchmark-of-deep-learning-models-on-large
Repo	https://github.com/USC-Melady/Benchmarking_DL_MIMICIII
Framework	none

Convolutional neural network architecture for geometric matching


Title	Convolutional neural network architecture for geometric matching
Authors	Ignacio Rocco, Relja Arandjelović, Josef Sivic
Abstract	We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard steps of feature extraction, matching and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end. Second, we demonstrate that the network parameters can be trained from synthetically generated imagery without the need for manual annotation and that our matching layer significantly increases generalization capabilities to never seen before images. Finally, we show that the same model can perform both instance-level and category-level matching giving state-of-the-art results on the challenging Proposal Flow dataset.
Tasks
Published	2017-03-16
URL	http://arxiv.org/abs/1703.05593v2
PDF	http://arxiv.org/pdf/1703.05593v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-network-architecture-for
Repo	https://github.com/ignacio-rocco/cnngeometric_pytorch
Framework	pytorch

Action Branching Architectures for Deep Reinforcement Learning


Title	Action Branching Architectures for Deep Reinforcement Learning
Authors	Arash Tavakoli, Fabio Pardo, Petar Kormushev
Abstract	Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG).
Tasks	Continuous Control
Published	2017-11-24
URL	http://arxiv.org/abs/1711.08946v2
PDF	http://arxiv.org/pdf/1711.08946v2.pdf
PWC	https://paperswithcode.com/paper/action-branching-architectures-for-deep
Repo	https://github.com/MoMe36/BranchingDQN
Framework	pytorch
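
The scaling argument in the abstract above, a linear rather than combinatorial number of network outputs, can be made concrete with a short sketch. It only counts outputs and selects a greedy action independently per branch; the shared decision module and the dueling aggregation of BDQ are not modelled. The function names, the bin count and the action range `[-1, 1]` are hypothetical.

```python
def joint_outputs(n_dims, n_bins):
    """Outputs needed when every combination of sub-actions is one discrete action."""
    return n_bins ** n_dims

def branching_outputs(n_dims, n_bins):
    """Outputs needed with one branch of n_bins values per action dimension."""
    return n_dims * n_bins

def greedy_action(branch_q_values):
    """Pick the argmax independently in each branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in branch_q_values]

def to_continuous(indices, n_bins, low=-1.0, high=1.0):
    """Map each bin index to an evenly spaced value in [low, high]."""
    return [low + i * (high - low) / (n_bins - 1) for i in indices]

# Hypothetical task: 6 action dimensions, each discretised into 33 bins.
print(joint_outputs(6, 33), "joint outputs vs", branching_outputs(6, 33), "branch outputs")

# Two branches with three bins each; the values stand in for per-branch Q-values.
q_values = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.4]]
indices = greedy_action(q_values)
action = to_continuous(indices, n_bins=3)
```

Selecting the maximum per branch is what keeps action selection linear in the number of dimensions; a joint head would instead need one output for every combination of bins.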

Iterative Machine Teaching


Title	Iterative Machine Teaching
Authors	Weiyang Liu, Bo Dai, Ahmad Humayun, Charlene Tay, Chen Yu, Linda B. Smith, James M. Rehg, Le Song
Abstract	In this paper, we consider the problem of machine teaching, the inverse problem of machine learning. Different from traditional machine teaching, which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner. We show that the teaching complexity in the iterative case is very different from that in the batch case. Instead of constructing a minimal training set for learners, our iterative machine teaching focuses on achieving fast convergence in the learner model. Depending on the level of information the teacher has from the learner model, we design teaching algorithms which can provably reduce the number of teaching examples and achieve faster convergence than learning without teachers. We also validate our theoretical findings with extensive experiments on different data distributions and real image datasets.
Tasks
Published	2017-05-30
URL	http://arxiv.org/abs/1705.10470v3
PDF	http://arxiv.org/pdf/1705.10470v3.pdf
PWC	https://paperswithcode.com/paper/iterative-machine-teaching
Repo	https://github.com/Ipsedo/IterativeMachineTeaching
Framework	pytorch