Paper Group AWR 116
Learning Universal Adversarial Perturbations with Generative Models
Title | Learning Universal Adversarial Perturbations with Generative Models |
Authors | Jamie Hayes, George Danezis |
Abstract | Neural networks are known to be vulnerable to adversarial examples, inputs that have been intentionally perturbed to remain visually similar to the source input, but cause a misclassification. It was recently shown that, given a dataset and classifier, there exist so-called universal adversarial perturbations: a single perturbation that causes a misclassification when applied to any input. In this work, we introduce universal adversarial networks, a generative network that is capable of fooling a target classifier when its generated output is added to a clean sample from a dataset. We show that this technique improves on known universal adversarial attacks. |
Tasks | Graph Classification |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05207v3 |
http://arxiv.org/pdf/1708.05207v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-universal-adversarial-perturbations |
Repo | https://github.com/jhayes14/UAN |
Framework | pytorch |
The Lifted Matrix-Space Model for Semantic Composition
Title | The Lifted Matrix-Space Model for Semantic Composition |
Authors | WooJin Chung, Sheng-Fu Wang, Samuel R. Bowman |
Abstract | Tree-structured neural network architectures for sentence encoding draw inspiration from the approach to semantic composition generally seen in formal linguistics, and have shown empirical improvements over comparable sequence models by doing so. Moreover, adding multiplicative interaction terms to the composition functions in these models can yield significant further improvements. However, existing compositional approaches that adopt such a powerful composition function scale poorly, with parameter counts exploding as model dimension or vocabulary size grows. We introduce the Lifted Matrix-Space model, which uses a global transformation to map vector word embeddings to matrices, which can then be composed via an operation based on matrix-matrix multiplication. Its composition function effectively transmits a larger number of activations across layers with relatively few model parameters. We evaluate our model on the Stanford NLI corpus, the Multi-Genre NLI corpus, and the Stanford Sentiment Treebank and find that it consistently outperforms TreeLSTM (Tai et al., 2015), the previous best known composition function for tree-structured models. |
Tasks | Semantic Composition, Word Embeddings |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03602v2 |
http://arxiv.org/pdf/1711.03602v2.pdf | |
PWC | https://paperswithcode.com/paper/the-lifted-matrix-space-model-for-semantic |
Repo | https://github.com/woojinchung/lms |
Framework | pytorch |
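The abstract above describes lifting each word vector to a matrix with one global transformation and then composing matrices by multiplication. A minimal sketch of that idea follows; it is not the authors' implementation. The toy embeddings, the dimensions, the random `weight`, and the helper names `lift` and `compose` are all assumptions made for illustration, and the sketch uses a plain linear map and a plain matrix product where the paper describes an operation only "based on" matrix-matrix multiplication.

```python
import random

EMBED_DIM = 4   # assumed toy embedding size
SIDE = 3        # assumed side length of the lifted square matrix

rng = random.Random(0)

# One shared ("global") weight of shape (SIDE*SIDE) x EMBED_DIM.
# Its size does not depend on the vocabulary size.
weight = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
          for _ in range(SIDE * SIDE)]

def lift(vec, weight, side):
    """Map a word vector to a side x side matrix using the shared linear map."""
    flat = [sum(w * x for w, x in zip(row, vec)) for row in weight]
    return [flat[r * side:(r + 1) * side] for r in range(side)]

def compose(left, right):
    """Compose two lifted matrices with a matrix-matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*right)]
            for row in left]

# Hypothetical two-word vocabulary with random embeddings.
embeddings = {
    "very": [rng.uniform(-1, 1) for _ in range(EMBED_DIM)],
    "good": [rng.uniform(-1, 1) for _ in range(EMBED_DIM)],
}

matrix_a = lift(embeddings["very"], weight, SIDE)
matrix_b = lift(embeddings["good"], weight, SIDE)
phrase = compose(matrix_a, matrix_b)  # representation of "very good"
```

In this sketch the only composition parameters are the `SIDE * SIDE * EMBED_DIM` entries of `weight`, however many words are added to `embeddings`.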
Table-to-text Generation by Structure-aware Seq2seq Learning
Title | Table-to-text Generation by Structure-aware Seq2seq Learning |
Authors | Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, Zhifang Sui |
Abstract | Table-to-text generation aims to generate a description for a factual table, which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq architecture which consists of a field-gating encoder and a description generator with dual attention. In the encoding phase, we update the cell memory of the LSTM unit by a field gate and its corresponding field value in order to incorporate field information into the table representation. In the decoding phase, a dual attention mechanism which contains word-level attention and field-level attention is proposed to model the semantic relevance between the generated description and the table. We conduct experiments on the WIKIBIO dataset, which contains over 700k biographies and corresponding infoboxes from Wikipedia. The attention visualizations and case studies show that our model is capable of generating coherent and informative descriptions based on a comprehensive understanding of both the content and the structure of a table. Automatic evaluations also show that our model outperforms the baselines by a great margin. Code for this work is available at https://github.com/tyliupku/wiki2bio. |
Tasks | Table-to-Text Generation, Text Generation |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09724v1 |
http://arxiv.org/pdf/1711.09724v1.pdf | |
PWC | https://paperswithcode.com/paper/table-to-text-generation-by-structure-aware |
Repo | https://github.com/tyliupku/wiki2bio |
Framework | tf |
Specialising Word Vectors for Lexical Entailment
Title | Specialising Word Vectors for Lexical Entailment |
Authors | Ivan Vulić, Nikola Mrkšić |
Abstract | We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06371v2 |
http://arxiv.org/pdf/1710.06371v2.pdf | |
PWC | https://paperswithcode.com/paper/specialising-word-vectors-for-lexical-1 |
Repo | https://github.com/nmrksic/lear |
Framework | tf |
Accelerated Nearest Neighbor Search with Quick ADC
Title | Accelerated Nearest Neighbor Search with Quick ADC |
Authors | Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec |
Abstract | Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. Because it offers low response times, Product Quantization (PQ) is a popular solution. PQ compresses high-dimensional vectors into short codes using several sub-quantizers, which enables in-RAM storage of large databases. This allows fast answers to NN queries, without accessing the SSD or HDD. The key feature of PQ is that it can compute distances between short codes and high-dimensional vectors using cache-resident lookup tables. The efficiency of this technique, named Asymmetric Distance Computation (ADC), remains limited because it performs many cache accesses. In this paper, we introduce Quick ADC, a novel technique that achieves a 3 to 6 times speedup over ADC by exploiting Single Instruction Multiple Data (SIMD) units available in current CPUs. Efficiently exploiting SIMD requires algorithmic changes to the ADC procedure. Namely, Quick ADC relies on two key modifications of ADC: (i) the use of 4-bit sub-quantizers instead of the standard 8-bit sub-quantizers and (ii) the quantization of floating-point distances. This allows Quick ADC to exceed the performance of state-of-the-art systems, e.g., it achieves a Recall@100 of 0.94 in 3.4 ms on 1 billion SIFT descriptors (128-bit codes). |
Tasks | Quantization |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07355v1 |
http://arxiv.org/pdf/1704.07355v1.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-nearest-neighbor-search-with |
Repo | https://github.com/technicolor-research/quick-adc |
Framework | none |
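The lookup-table distance computation described in the abstract above can be sketched in a few lines. This is a minimal illustration of plain ADC with 16-centroid (4-bit) sub-quantizers, not the Quick ADC implementation: it contains no SIMD code and no quantization of the table entries. The codebooks are random rather than trained, and the sizes `DIM`, `M`, `K` and all function names are assumptions chosen for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Split vec into m equal-length sub-vectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def encode(vec, codebooks):
    """Short code: index of the nearest centroid in each sub-quantizer."""
    return [
        min(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
        for sub, book in zip(split(vec, len(codebooks)), codebooks)
    ]

def decode(code, codebooks):
    """Reconstruct the approximate vector by concatenating centroids."""
    return [x for idx, book in zip(code, codebooks) for x in book[idx]]

def lookup_tables(query, codebooks):
    """For each sub-quantizer, distances from the query sub-vector to all centroids."""
    return [
        [sq_dist(sub, centroid) for centroid in book]
        for sub, book in zip(split(query, len(codebooks)), codebooks)
    ]

def adc_distance(code, tables):
    """Asymmetric distance: one table lookup per sub-quantizer, then a sum."""
    return sum(table[idx] for table, idx in zip(tables, code))

rng = random.Random(0)
DIM, M, K = 8, 4, 16  # K = 16 centroids, so each index fits in 4 bits

# Assumption: random codebooks stand in for ones trained with k-means.
codebooks = [[[rng.uniform(-1, 1) for _ in range(DIM // M)] for _ in range(K)]
             for _ in range(M)]
database = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(100)]
codes = [encode(vec, codebooks) for vec in database]

query = [rng.uniform(-1, 1) for _ in range(DIM)]
tables = lookup_tables(query, codebooks)  # built once per query
nearest = min(range(len(codes)), key=lambda i: adc_distance(codes[i], tables))
```

The tables are computed once per query and reused for every database code, which is why the per-code cost reduces to `M` lookups and additions.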
Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice
Title | Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice |
Authors | Hongzhou Lin, Julien Mairal, Zaid Harchaoui |
Abstract | We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieving acceleration in theory and in practice is to solve these sub-problems with appropriate accuracy by using the right stopping criterion and the right warm-start strategy. We give practical guidelines for using Catalyst and present a comprehensive analysis of its global complexity. We show that Catalyst applies to a large class of algorithms, including gradient descent, block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG, MISO/Finito, and their proximal variants. For all of these methods, we establish faster rates using the Catalyst acceleration, for strongly convex and non-strongly convex objectives. We conclude with extensive experiments showing that acceleration is useful in practice, especially for ill-conditioned problems. |
Tasks | |
Published | 2017-12-15 |
URL | http://arxiv.org/abs/1712.05654v2 |
http://arxiv.org/pdf/1712.05654v2.pdf | |
PWC | https://paperswithcode.com/paper/catalyst-acceleration-for-first-order-convex |
Repo | https://github.com/hongzhoulin89/Catalyst-QNing |
Framework | none |
Crowd counting via scale-adaptive convolutional neural network
Title | Crowd counting via scale-adaptive convolutional neural network |
Authors | Lu Zhang, Miaojing Shi, Qiaobo Chen |
Abstract | The task of crowd counting is to automatically estimate the number of pedestrians in crowd images. To cope with the scale and perspective changes that commonly exist in crowd images, state-of-the-art approaches employ multi-column CNN architectures to regress density maps of crowd images. Multiple columns have different receptive fields corresponding to pedestrians (heads) of different scales. We instead propose a scale-adaptive CNN (SaCNN) architecture with a backbone of fixed small receptive fields. We extract feature maps from multiple layers and adapt them to have the same output size; we combine them to produce the final density map. The number of people is computed by integrating the density map. We also introduce a relative count loss along with the density map loss to improve the network generalization on crowd scenes with few pedestrians, on which most representative approaches perform poorly. We conduct extensive experiments on the ShanghaiTech, UCF_CC_50 and WorldExpo datasets as well as a new dataset, SmartCity, that we collect for crowd scenes with few people. The results demonstrate significant improvements of SaCNN over the state-of-the-art. |
Tasks | Crowd Counting |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04433v4 |
http://arxiv.org/pdf/1711.04433v4.pdf | |
PWC | https://paperswithcode.com/paper/crowd-counting-via-scale-adaptive |
Repo | https://github.com/miao0913/SaCNN-CrowdCounting-Tencent_Youtu |
Framework | none |
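The abstract above states that the number of people is computed by integrating the density map. On a discrete map this amounts to summing all cells, as in the following minimal sketch. The 4x4 map and the function name `count_from_density` are invented for illustration and are not taken from the SaCNN code.

```python
def count_from_density(density_map):
    """Integrate a discrete density map by summing every cell."""
    return sum(sum(row) for row in density_map)

# Hypothetical density map: each person contributes a total mass of 1,
# spread over the cells around the head position.
density_map = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
]

estimated_count = count_from_density(density_map)
```

Here one person is spread over four cells and another over two, so the map sums to two people.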
ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification
Title | ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification |
Authors | Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O’Connor |
Abstract | In this paper we propose ResnetCrowd, a deep residual architecture for simultaneous crowd counting, violent behaviour detection and crowd density level classification. To train and evaluate the proposed multi-objective technique, a new 100-image dataset referred to as Multi Task Crowd is constructed. This new dataset is the first computer vision dataset fully annotated for crowd counting, violent behaviour detection and density level classification. Our experiments show that a multi-task approach boosts individual task performance for all tasks, most notably for violent behaviour detection, which receives a 9% boost in ROC curve AUC (area under the curve). The trained ResnetCrowd model is also evaluated on several additional benchmarks, highlighting the superior generalisation of crowd analysis models trained for multiple objectives. |
Tasks | Crowd Counting |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10698v1 |
http://arxiv.org/pdf/1705.10698v1.pdf | |
PWC | https://paperswithcode.com/paper/resnetcrowd-a-residual-deep-learning |
Repo | https://github.com/lcylmhlcy/ResnetCrowd-Caffe |
Framework | caffe2 |
Automatic Goal Generation for Reinforcement Learning Agents
Title | Automatic Goal Generation for Reinforcement Learning Agents |
Authors | Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel |
Abstract | Reinforcement learning is a powerful technique to train an agent to perform a task. However, an agent that is trained using reinforcement learning is only capable of achieving the single task that is specified via its reward function. Such an approach does not scale well to settings in which an agent needs to perform a diverse set of tasks, such as navigating to varying positions in a room or moving objects to varying locations. Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing. We use a generator network to propose tasks for the agent to try to achieve, specified as goal states. The generator network is optimized using adversarial training to produce tasks that are always at the appropriate level of difficulty for the agent. Our method thus automatically produces a curriculum of tasks for the agent to learn. We show that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment. Our method can also learn to achieve tasks with sparse rewards, which traditionally pose significant challenges. |
Tasks | |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06366v5 |
http://arxiv.org/pdf/1705.06366v5.pdf | |
PWC | https://paperswithcode.com/paper/automatic-goal-generation-for-reinforcement |
Repo | https://github.com/jeffchy/Artificial-Idiot |
Framework | tf |
Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image
Title | Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image |
Authors | Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao |
Abstract | Although deep convolutional neural networks have been shown to efficiently eliminate coding artifacts caused by the coarse quantization of traditional codecs, it’s difficult to train any neural network placed in front of the encoder, because gradients cannot be back-propagated through the codec. In this paper, we propose an end-to-end image compression framework based on convolutional neural networks to resolve the problem of non-differentiability of the quantization function in the standard codec. First, a feature description neural network is used to obtain a valid description of the ground-truth image in a low-dimensional space, so that the amount of image data is greatly reduced for storage or transmission. This valid description is then further compressed with a standard image codec such as JPEG, which introduces severe distortion and compression artifacts, especially blocking artifacts, missing detail, blurring, and ringing artifacts. Then, we use a post-processing neural network to remove these artifacts. Because it is challenging to directly learn a non-linear function for a standard codec with a convolutional neural network, we propose to learn a virtual codec neural network that approximates the projection from the valid description image to the post-processed compressed image, so that the gradient can be efficiently back-propagated from the post-processing neural network to the feature description neural network during training. Meanwhile, an advanced learning algorithm is proposed to train our deep neural networks for compression. A key advantage of the proposed method is that it is compatible with existing standard codecs, and our learning strategy can be easily extended to these codecs based on convolutional neural networks. Experimental results demonstrate the advantages of the proposed method over several state-of-the-art approaches, especially at very low bit-rates. |
Tasks | Image Compression, Quantization |
Published | 2017-12-16 |
URL | http://arxiv.org/abs/1712.05969v7 |
http://arxiv.org/pdf/1712.05969v7.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-virtual-codec-based-on-deep |
Repo | https://github.com/mdcnn/mdcnn.github.io |
Framework | none |
NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems
Title | NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems |
Authors | Ozan Caglayan, Mercedes García-Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault |
Abstract | In this paper, we present nmtpy, a flexible Python toolkit based on Theano for training Neural Machine Translation and other neural sequence-to-sequence architectures. nmtpy decouples the specification of a network from the training and inference utilities to simplify the addition of a new architecture and reduce the amount of boilerplate code to be written. nmtpy has been used for LIUM’s top-ranked submissions to WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017. |
Tasks | Machine Translation, Multimodal Machine Translation |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00457v1 |
http://arxiv.org/pdf/1706.00457v1.pdf | |
PWC | https://paperswithcode.com/paper/nmtpy-a-flexible-toolkit-for-advanced-neural |
Repo | https://github.com/lium-lst/nmtpy |
Framework | none |
Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets
Title | Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets |
Authors | Sanjay Purushotham, Chuizheng Meng, Zhengping Che, Yan Liu |
Abstract | Deep learning models (aka Deep Neural Networks) have revolutionized many fields including computer vision, natural language processing, and speech recognition, and are being increasingly used in clinical healthcare applications. However, few works exist which have benchmarked the performance of deep learning models with respect to state-of-the-art machine learning models and prognostic scoring systems on publicly available healthcare datasets. In this paper, we present the benchmarking results for several clinical prediction tasks such as mortality prediction, length of stay prediction, and ICD-9 code group prediction using Deep Learning models, an ensemble of machine learning models (Super Learner algorithm), and SAPS II and SOFA scores. We used the Medical Information Mart for Intensive Care III (MIMIC-III) (v1.4) publicly available dataset, which includes all patients admitted to an ICU at the Beth Israel Deaconess Medical Center from 2001 to 2012, for the benchmarking tasks. Our results show that deep learning models consistently outperform all the other approaches, especially when the ‘raw’ clinical time series data is used as input features to the models. |
Tasks | Length-of-Stay prediction, Mortality Prediction, Speech Recognition, Time Series |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08531v1 |
http://arxiv.org/pdf/1710.08531v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmark-of-deep-learning-models-on-large |
Repo | https://github.com/USC-Melady/Benchmarking_DL_MIMICIII |
Framework | none |
Convolutional neural network architecture for geometric matching
Title | Convolutional neural network architecture for geometric matching |
Authors | Ignacio Rocco, Relja Arandjelović, Josef Sivic |
Abstract | We address the problem of determining correspondences between two images in agreement with a geometric model such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard steps of feature extraction, matching and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end. Second, we demonstrate that the network parameters can be trained from synthetically generated imagery without the need for manual annotation and that our matching layer significantly increases generalization capabilities to never-before-seen images. Finally, we show that the same model can perform both instance-level and category-level matching, giving state-of-the-art results on the challenging Proposal Flow dataset. |
Tasks | |
Published | 2017-03-16 |
URL | http://arxiv.org/abs/1703.05593v2 |
http://arxiv.org/pdf/1703.05593v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-architecture-for |
Repo | https://github.com/ignacio-rocco/cnngeometric_pytorch |
Framework | pytorch |
Action Branching Architectures for Deep Reinforcement Learning
Title | Action Branching Architectures for Deep Reinforcement Learning |
Authors | Arash Tavakoli, Fabio Pardo, Petar Kormushev |
Abstract | Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG). |
Tasks | Continuous Control |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.08946v2 |
http://arxiv.org/pdf/1711.08946v2.pdf | |
PWC | https://paperswithcode.com/paper/action-branching-architectures-for-deep |
Repo | https://github.com/MoMe36/BranchingDQN |
Framework | pytorch |
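The scaling argument in the abstract above can be made concrete with a little arithmetic. The sketch below compares the number of outputs needed by a single joint head with the number needed by one branch per action dimension, and shows greedy action selection done independently per branch. The choice of 11 bins and 6 dimensions, the Q-values, and the function names are illustrative assumptions; no network is trained here.

```python
def joint_output_count(bins_per_dim, action_dims):
    """Outputs for one head that enumerates every joint discrete action."""
    return bins_per_dim ** action_dims

def branching_output_count(bins_per_dim, action_dims):
    """Outputs when each action dimension gets its own branch."""
    return bins_per_dim * action_dims

def greedy_branch_actions(branch_q_values):
    """Pick the highest-valued bin independently in each branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in branch_q_values]

# Assumed setting: 6 action dimensions, each discretized into 11 bins.
joint = joint_output_count(11, 6)          # grows exponentially with dimensions
branched = branching_output_count(11, 6)   # grows linearly with dimensions

# Hypothetical Q-values for a 2-dimensional action with 3 bins per dimension.
q_values = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.0]]
action = greedy_branch_actions(q_values)
```

With these assumed numbers the joint head needs 1,771,561 outputs while the branching layout needs 66.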
Iterative Machine Teaching
Title | Iterative Machine Teaching |
Authors | Weiyang Liu, Bo Dai, Ahmad Humayun, Charlene Tay, Chen Yu, Linda B. Smith, James M. Rehg, Le Song |
Abstract | In this paper, we consider the problem of machine teaching, the inverse problem of machine learning. Different from traditional machine teaching, which views the learners as batch algorithms, we study a new paradigm where the learner uses an iterative algorithm and a teacher can feed examples sequentially and intelligently based on the current performance of the learner. We show that the teaching complexity in the iterative case is very different from that in the batch case. Instead of constructing a minimal training set for learners, our iterative machine teaching focuses on achieving fast convergence in the learner model. Depending on the level of information the teacher has from the learner model, we design teaching algorithms which can provably reduce the number of teaching examples and achieve faster convergence than learning without teachers. We also validate our theoretical findings with extensive experiments on different data distributions and real image datasets. |
Tasks | |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10470v3 |
http://arxiv.org/pdf/1705.10470v3.pdf | |
PWC | https://paperswithcode.com/paper/iterative-machine-teaching |
Repo | https://github.com/Ipsedo/IterativeMachineTeaching |
Framework | pytorch |