October 20, 2019

Paper Group AWR 224

NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension


Title	NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension
Authors	Tim Niven, Hung-Yu Kao
Abstract	The Argument Reasoning Comprehension Task requires significant language understanding and complex reasoning over world knowledge. We focus on transfer of a sentence encoder to bootstrap more complicated models given the small size of the dataset. Our best model uses a pre-trained BiLSTM to encode input sentences, learns task-specific features for the argument and warrants, then performs independent argument-warrant matching. This model achieves mean test set accuracy of 64.43%. Encoder transfer yields a significant gain to our best model over random initialization. Independent warrant matching effectively doubles the size of the dataset and provides additional regularization. We demonstrate that regularization comes from ignoring statistical correlations between warrant features and position. We also report an experiment with our best model that only matches warrants to reasons, ignoring claims. Relatively low performance degradation suggests that our model is not necessarily learning the intended task.
Tasks
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08266v1
PDF	http://arxiv.org/pdf/1804.08266v1.pdf
PWC	https://paperswithcode.com/paper/nlitrans-at-semeval-2018-task-12-transfer-of
Repo	https://github.com/IKMLab/arct
Framework	none

LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks


Title	LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks
Authors	Daniel H. Noronha, Bahar Salehpour, Steven J. E. Wilton
Abstract	Recent work has shown that Field-Programmable Gate Arrays (FPGAs) play an important role in the acceleration of Machine Learning applications. Initial specification of machine learning applications is often done using a high-level Python-oriented framework such as Tensorflow, followed by a manual translation to either C or RTL for synthesis using vendor tools. This manual translation step is time-consuming and requires expertise, which limits the applicability of FPGAs in this important domain. In this paper, we present an open-source tool-flow that maps numerical computation models written in Tensorflow to synthesizable hardware. Unlike other tools, which are often constrained by a small number of inflexible templates, our flow uses Google’s XLA compiler, which emits LLVM code directly from a Tensorflow specification. This LLVM code can then be used with a high-level synthesis tool to automatically generate hardware. We show that our flow allows users to generate Deep Neural Networks with very few lines of Python code.
Tasks
Published	2018-07-14
URL	http://arxiv.org/abs/1807.05317v1
PDF	http://arxiv.org/pdf/1807.05317v1.pdf
PWC	https://paperswithcode.com/paper/leflow-enabling-flexible-fpga-high-level
Repo	https://github.com/danielholanda/LeFlow
Framework	tf

TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation


Title	TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation
Authors	Pengcheng Yin, Graham Neubig
Abstract	We present TRANX, a transition-based neural semantic parser that maps natural language (NL) utterances into formal meaning representations (MRs). TRANX uses a transition system based on the abstract syntax description language for the target MR, which gives it two major advantages: (1) it is highly accurate, using information from the syntax of the target MR to constrain the output space and model the information flow, and (2) it is highly generalizable, and can easily be applied to new types of MR by just writing a new abstract syntax description corresponding to the allowable structures in the MR. Experiments on four different semantic parsing and code generation tasks show that our system is generalizable, extensible, and effective, registering strong results compared to existing neural semantic parsers.
Tasks	Code Generation, Semantic Parsing
Published	2018-10-05
URL	http://arxiv.org/abs/1810.02720v1
PDF	http://arxiv.org/pdf/1810.02720v1.pdf
PWC	https://paperswithcode.com/paper/tranx-a-transition-based-neural-abstract
Repo	https://github.com/pcyin/tranX
Framework	pytorch

SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension


Title	SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Authors	Taeuk Kim, Jihun Choi, Sang-goo Lee
Abstract	We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set.
Tasks	Machine Translation, Transfer Learning
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07049v1
PDF	http://arxiv.org/pdf/1805.07049v1.pdf
PWC	https://paperswithcode.com/paper/snu_ids-at-semeval-2018-task-12-sentence
Repo	https://github.com/galsang/SemEval2018-task12
Framework	pytorch

NEUZZ: Efficient Fuzzing with Neural Program Smoothing


Title	NEUZZ: Efficient Fuzzing with Neural Program Smoothing
Authors	Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, Suman Jana
Abstract	Fuzzing has become the de facto standard technique for finding software vulnerabilities. However, even state-of-the-art fuzzers are not very efficient at finding hard-to-trigger software bugs. Most popular fuzzers use evolutionary guidance to generate inputs that can trigger different bugs. Such evolutionary algorithms, while fast and simple to implement, often get stuck in fruitless sequences of random mutations. Gradient-guided optimization presents a promising alternative to evolutionary guidance. Gradient-guided techniques have been shown to significantly outperform evolutionary algorithms at solving high-dimensional structured optimization problems in domains like machine learning by efficiently utilizing gradients or higher-order derivatives of the underlying function. However, gradient-guided approaches are not directly applicable to fuzzing, as real-world program behaviors contain many discontinuities, plateaus, and ridges where gradient-based methods often get stuck. We observe that this problem can be addressed by creating a smooth surrogate function approximating the discrete branching behavior of the target program. In this paper, we propose a novel program smoothing technique using surrogate neural network models that can incrementally learn smooth approximations of a complex, real-world program’s branching behaviors. We further demonstrate that such neural network models can be used together with gradient-guided input generation schemes to significantly improve fuzzing efficiency. Our extensive evaluations demonstrate that NEUZZ significantly outperforms 10 state-of-the-art graybox fuzzers on 10 real-world programs, both at finding new bugs and at achieving higher edge coverage. NEUZZ found 31 unknown bugs that other fuzzers failed to find in 10 real-world programs and achieved 3X more edge coverage than all of the tested graybox fuzzers over 24 hours of running.
Tasks
Published	2018-07-15
URL	https://arxiv.org/abs/1807.05620v4
PDF	https://arxiv.org/pdf/1807.05620v4.pdf
PWC	https://paperswithcode.com/paper/neuzz-efficient-fuzzing-with-neural-program
Repo	https://github.com/dongdongshe/neuzz
Framework	tf

Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons


Title	Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons
Authors	Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi
Abstract	An activation boundary for a neuron refers to a separating hyperplane that determines whether the neuron is activated or deactivated. It has long been considered in neural networks that the activations of neurons, rather than their exact output values, play the most important role in forming classification-friendly partitions of the hidden feature space. However, as far as we know, this aspect of neural networks has not been considered in the literature on knowledge transfer. In this paper, we propose a knowledge transfer method via distillation of activation boundaries formed by hidden neurons. For the distillation, we propose an activation transfer loss that has the minimum value when the boundaries generated by the student coincide with those of the teacher. Since the activation transfer loss is not differentiable, we design a piecewise differentiable loss approximating the activation transfer loss. With the proposed method, the student learns a separating boundary between the activation region and the deactivation region formed by each neuron in the teacher. Through experiments on various aspects of knowledge transfer, it is verified that the proposed method outperforms the current state-of-the-art.
Tasks	Transfer Learning
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03233v2
PDF	http://arxiv.org/pdf/1811.03233v2.pdf
PWC	https://paperswithcode.com/paper/knowledge-transfer-via-distillation-of
Repo	https://github.com/bhheo/AB_distillation
Framework	pytorch
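
The contrast the abstract draws between the non-differentiable activation transfer loss and a piecewise differentiable approximation can be illustrated with a small sketch. The code below is not taken from the paper or its repository: the function names, the squared-hinge form of the surrogate, and the margin value are assumptions chosen only to show the idea on plain Python lists of pre-activation values.

```python
def activation_mismatch(teacher, student):
    """Count neurons whose on/off state differs between teacher and student.

    This is the kind of quantity that is not differentiable: it only
    changes when a student pre-activation crosses zero.
    """
    return sum((t > 0) != (s > 0) for t, s in zip(teacher, student))


def hinge_surrogate(teacher, student, margin=1.0):
    """Piecewise differentiable stand-in for the mismatch count.

    Assumed form (hypothetical, for illustration): a squared hinge that
    pushes the student above +margin where the teacher neuron is active
    and below -margin where it is inactive.
    """
    total = 0.0
    for t, s in zip(teacher, student):
        if t > 0:
            # Teacher neuron is active: penalize student values below +margin.
            total += max(0.0, margin - s) ** 2
        else:
            # Teacher neuron is inactive: penalize student values above -margin.
            total += max(0.0, margin + s) ** 2
    return total


# Pre-activations of three hidden neurons (made-up numbers).
teacher_pre = [2.0, -1.0, 0.5]
student_pre = [3.0, -2.0, -0.5]

print(activation_mismatch(teacher_pre, student_pre))
print(hinge_surrogate(teacher_pre, student_pre))
```

In this toy input only the third neuron disagrees with the teacher, and it is also the only neuron that contributes to the surrogate, so a gradient step on the surrogate would move that student value toward the teacher's side of the boundary.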

Differentiable MPC for End-to-end Planning and Control


Title	Differentiable MPC for End-to-end Planning and Control
Authors	Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter
Abstract	We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces. This provides one way of leveraging and combining the advantages of model-free and model-based approaches. Specifically, we differentiate through MPC by using the KKT conditions of the convex approximation at a fixed point of the controller. Using this strategy, we are able to learn the cost and dynamics of a controller via end-to-end learning. Our experiments focus on imitation learning in the pendulum and cartpole domains, where we learn the cost and dynamics terms of an MPC policy class. We show that our MPC policies are significantly more data-efficient than a generic neural network and that our method is superior to traditional system identification in a setting where the expert is unrealizable.
Tasks	Imitation Learning
Published	2018-10-31
URL	https://arxiv.org/abs/1810.13400v3
PDF	https://arxiv.org/pdf/1810.13400v3.pdf
PWC	https://paperswithcode.com/paper/differentiable-mpc-for-end-to-end-planning
Repo	https://github.com/locuslab/differentiable-mpc
Framework	pytorch

Visualizing the Flow of Discourse with a Concept Ontology


Title	Visualizing the Flow of Discourse with a Concept Ontology
Authors	Baoxu Shi, Tim Weninger
Abstract	Understanding and visualizing human discourse has long been a challenging task. Although recent work on argument mining has shown success in classifying the role of various sentences, the task of recognizing concepts and understanding the ways in which they are discussed remains challenging. Given an email thread or a transcript of a group discussion, our task is to extract the relevant concepts and understand how they are referenced and re-referenced throughout the discussion. In the present work, we present a preliminary approach for extracting and visualizing group discourse by adapting Wikipedia’s category hierarchy to be an external concept ontology. From a user study, we found that our method achieved better results than 4 strong alternative approaches, and we illustrate our visualization method based on the extracted discourse flows.
Tasks	Argument Mining
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08614v1
PDF	http://arxiv.org/pdf/1802.08614v1.pdf
PWC	https://paperswithcode.com/paper/visualizing-the-flow-of-discourse-with-a
Repo	https://github.com/bxshi/DiscourseVisualization
Framework	none

Encoding Spatial Relations from Natural Language


Title	Encoding Spatial Relations from Natural Language
Authors	Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann
Abstract	Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the real world. In particular, spatial relations are encoded in a way that is inconsistent with human spatial reasoning and lacking invariance to viewpoint changes. We present a system capable of capturing the semantics of spatial relations such as behind, left of, etc. from natural language. Our key contributions are a novel multi-modal objective based on generating images of scenes from their textual descriptions, and a new dataset on which to train it. We demonstrate that internal representations are robust to meaning-preserving transformations of descriptions (paraphrase invariance), while viewpoint invariance is an emergent property of the system.
Tasks
Published	2018-07-04
URL	http://arxiv.org/abs/1807.01670v2
PDF	http://arxiv.org/pdf/1807.01670v2.pdf
PWC	https://paperswithcode.com/paper/encoding-spatial-relations-from-natural
Repo	https://github.com/deepmind/slim-dataset
Framework	tf

AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity


Title	AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity
Authors	Yibo Zeng, Fei Feng, Wotao Yin
Abstract	In this paper, we propose AsyncQVI, an asynchronous-parallel Q-value iteration for discounted Markov decision processes whose transition and reward can only be sampled through a generative model. Given such a problem with $\mathcal{S}$ states, $\mathcal{A}$ actions, and a discount factor $\gamma\in(0,1)$, AsyncQVI uses memory of size $\mathcal{O}(\mathcal{S})$ and returns an $\varepsilon$-optimal policy with probability at least $1-\delta$ using $$\tilde{\mathcal{O}}\big(\frac{\mathcal{S}\mathcal{A}}{(1-\gamma)^5\varepsilon^2}\log(\frac{1}{\delta})\big)$$ samples. AsyncQVI is also the first asynchronous-parallel algorithm for discounted Markov decision processes with a sample complexity that nearly matches the theoretical lower bound. The relatively low memory footprint and parallel ability make AsyncQVI suitable for large-scale applications. In numerical tests, we compare AsyncQVI with four sample-based value iteration methods. The results show that our algorithm is highly efficient and achieves linear parallel speedup.
Tasks
Published	2018-12-03
URL	https://arxiv.org/abs/1812.00885v3
PDF	https://arxiv.org/pdf/1812.00885v3.pdf
PWC	https://paperswithcode.com/paper/asyncqvi-asynchronous-parallel-q-value
Repo	https://github.com/uclaopt/AsyncQVI
Framework	none
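
For readers unfamiliar with the setting in the abstract above, the following is a minimal sketch of plain, synchronous sample-based Q-value iteration with a generative model. It is not AsyncQVI: there is no asynchrony or parallelism, and it stores a full Q table rather than the O(S) memory the abstract describes. The function names, the toy two-state MDP, and all parameter values are assumptions made for illustration.

```python
import random


def sampled_q_value_iteration(n_states, n_actions, sample, gamma,
                              n_iters, n_samples, seed=0):
    """Synchronous Q-value iteration driven by a generative model.

    `sample(rng, s, a)` must return `(reward, next_state)`; it stands in
    for the generative model, the only access to the MDP.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        # State values implied by the current Q estimate.
        v = [max(row) for row in q]
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                # Monte Carlo estimate of r + gamma * V(s') for this pair.
                total = 0.0
                for _ in range(n_samples):
                    reward, s_next = sample(rng, s, a)
                    total += reward + gamma * v[s_next]
                new_q[s][a] = total / n_samples
        q = new_q
    return q


def toy_sample(rng, s, a):
    """Hypothetical two-state MDP: action 0 stays, action 1 switches state.

    Landing in state 1 pays reward 1. Transitions are deterministic here,
    so `rng` is unused; a stochastic model would draw from it.
    """
    next_state = s if a == 0 else 1 - s
    reward = 1.0 if next_state == 1 else 0.0
    return reward, next_state


q = sampled_q_value_iteration(2, 2, toy_sample, gamma=0.9,
                              n_iters=200, n_samples=1)
greedy_policy = [row.index(max(row)) for row in q]
print(q, greedy_policy)
```

Each sweep here queries the generative model once per sample for every state-action pair, which is where the product of states and actions in the sample-complexity bound comes from.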

Neural Adaptation Layers for Cross-domain Named Entity Recognition


Title	Neural Adaptation Layers for Cross-domain Named Entity Recognition
Authors	Bill Yuchen Lin, Wei Lu
Abstract	Recent research efforts have shown that neural architectures can be effective in conventional information extraction tasks such as named entity recognition, yielding state-of-the-art results on standard newswire datasets. However, despite significant resources required for training such models, the performance of a model trained on one domain typically degrades dramatically when applied to a different domain, yet extracting entities from new emerging domains such as social media can be of significant interest. In this paper, we empirically investigate effective methods for conveniently adapting an existing, well-trained neural NER model for a new domain. Unlike existing approaches, we propose lightweight yet effective methods for performing domain adaptation for neural models. Specifically, we introduce adaptation layers on top of existing neural architectures, where no re-training using the source domain data is required. We conduct extensive empirical studies and show that our approach significantly outperforms state-of-the-art methods.
Tasks	Cross-Domain Named Entity Recognition, Domain Adaptation, Named Entity Recognition
Published	2018-10-15
URL	http://arxiv.org/abs/1810.06368v1
PDF	http://arxiv.org/pdf/1810.06368v1.pdf
PWC	https://paperswithcode.com/paper/neural-adaptation-layers-for-cross-domain
Repo	https://github.com/yuchenlin/CDMA-NER
Framework	tf

Multilevel Language and Vision Integration for Text-to-Clip Retrieval


Title	Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Authors	Huijuan Xu, Kun He, Bryan A. Plummer, Leonid Sigal, Stan Sclaroff, Kate Saenko
Abstract	We address the problem of text-based activity retrieval in video. Given a sentence describing an activity, our task is to retrieve matching clips from an untrimmed video. To capture the inherent structures present in both text and video, we introduce a multilevel model that integrates vision and language features earlier and more tightly than prior work. First, we inject text features early on when generating clip proposals, to help eliminate unlikely clips and thus speed up processing and boost performance. Second, to learn a fine-grained similarity metric for retrieval, we use visual features to modulate the processing of query sentences at the word level in a recurrent neural network. A multi-task loss is also employed by adding query re-generation as an auxiliary task. Our approach significantly outperforms prior work on two challenging benchmarks: Charades-STA and ActivityNet Captions.
Tasks
Published	2018-04-13
URL	http://arxiv.org/abs/1804.05113v3
PDF	http://arxiv.org/pdf/1804.05113v3.pdf
PWC	https://paperswithcode.com/paper/multilevel-language-and-vision-integration
Repo	https://github.com/VisionLearningGroup/Text-to-Clip_Retrieval
Framework	none

Skin Lesions Classification Using Convolutional Neural Networks in Clinical Images


Title	Skin Lesions Classification Using Convolutional Neural Networks in Clinical Images
Authors	Danilo Barros Mendes, Nilton Correia da Silva
Abstract	Skin lesions are conditions that appear on a patient for many different reasons. One of these is abnormal growth of skin tissue, defined as cancer. This disease plagues more than 14.1 million patients and has been the cause of more than 8.2 million deaths worldwide. Therefore, the construction of a classification model for 12 lesions, including Malignant Melanoma and Basal Cell Carcinoma, is proposed. Furthermore, this work uses a ResNet-152 architecture, which was trained on 3,797 images, later augmented by a factor of 29 using positional, scale, and lighting transformations. Finally, the network was tested with 956 images and achieved an area under the curve (AUC) of 0.96 for Melanoma and 0.91 for Basal Cell Carcinoma.
Tasks
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02316v1
PDF	http://arxiv.org/pdf/1812.02316v1.pdf
PWC	https://paperswithcode.com/paper/skin-lesions-classification-using
Repo	https://github.com/aryanmisra/Skin-Lesion-Classifier
Framework	tf

Attention Based Natural Language Grounding by Navigating Virtual Environment


Title	Attention Based Natural Language Grounding by Navigating Virtual Environment
Authors	Akilesh B, Abhishek Sinha, Mausoom Sarkar, Balaji Krishnamurthy
Abstract	In this work, we focus on the problem of grounding language by training an agent to follow a set of natural language instructions and navigate to a target object in an environment. The agent receives visual information through raw pixels and a natural language instruction telling what task needs to be achieved, and is trained in an end-to-end way. We develop an attention mechanism for multi-modal fusion of visual and textual modalities that allows the agent to learn to complete the task and achieve language grounding. Our experimental results show that, on this task, our attention mechanism outperforms the existing multi-modal fusion mechanisms proposed for both 2D and 3D environments in terms of both speed and success rate. We show that the learnt textual representations are semantically meaningful as they follow vector arithmetic in the embedding space. The effectiveness of our attention approach over the contemporary fusion mechanisms is also highlighted from the textual embeddings learnt by the different approaches. We also show that our model generalizes effectively to unseen scenarios and exhibits zero-shot generalization capabilities in both 2D and 3D environments. The code for our 2D environment as well as the models that we developed for both 2D and 3D are available at https://github.com/rl-lang-grounding/rl-lang-ground.
Tasks
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08454v2
PDF	http://arxiv.org/pdf/1804.08454v2.pdf
PWC	https://paperswithcode.com/paper/attention-based-natural-language-grounding-by
Repo	https://github.com/rl-lang-grounding/rl-lang-ground
Framework	tf

t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data


Title	t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data
Authors	David M. Chan, Roshan Rao, Forrest Huang, John F. Canny
Abstract	Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods which employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-accelerated implementation of t-distributed Stochastic Neighbor Embedding (t-SNE) for visualizing datasets and models. t-SNE-CUDA significantly outperforms current implementations with 50-700x speedups on the CIFAR-10 and MNIST datasets. These speedups enable, for the first time, visualization of the neural network activations on the entire ImageNet dataset, a feat that was previously computationally intractable. We also demonstrate visualization performance in the NLP domain by visualizing the GloVe embedding vectors. From these visualizations, we can draw interesting conclusions about using the L2 metric in these embedding spaces. t-SNE-CUDA is publicly available at https://github.com/CannyLab/tsne-cuda
Tasks	Dimensionality Reduction
Published	2018-07-31
URL	http://arxiv.org/abs/1807.11824v1
PDF	http://arxiv.org/pdf/1807.11824v1.pdf
PWC	https://paperswithcode.com/paper/t-sne-cuda-gpu-accelerated-t-sne-and-its
Repo	https://github.com/CannyLab/tsne-cuda
Framework	none