Paper Group AWR 224
NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension
Title | NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension |
Authors | Tim Niven, Hung-Yu Kao |
Abstract | The Argument Reasoning Comprehension Task requires significant language understanding and complex reasoning over world knowledge. We focus on transfer of a sentence encoder to bootstrap more complicated models given the small size of the dataset. Our best model uses a pre-trained BiLSTM to encode input sentences, learns task-specific features for the argument and warrants, then performs independent argument-warrant matching. This model achieves mean test set accuracy of 64.43%. Encoder transfer yields a significant gain to our best model over random initialization. Independent warrant matching effectively doubles the size of the dataset and provides additional regularization. We demonstrate that regularization comes from ignoring statistical correlations between warrant features and position. We also report an experiment with our best model that only matches warrants to reasons, ignoring claims. Relatively low performance degradation suggests that our model is not necessarily learning the intended task. |
Tasks | |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08266v1 |
http://arxiv.org/pdf/1804.08266v1.pdf | |
PWC | https://paperswithcode.com/paper/nlitrans-at-semeval-2018-task-12-transfer-of |
Repo | https://github.com/IKMLab/arct |
Framework | none |
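The abstract's remark that independent warrant matching "effectively doubles the size of the dataset" can be illustrated with a small sketch. This is not the authors' code: the `ArctExample` fields and the `to_independent_pairs` helper are hypothetical names chosen for illustration, assuming each task instance carries a claim, a reason, two candidate warrants, and the index of the correct one.

```python
from typing import List, NamedTuple, Tuple


class ArctExample(NamedTuple):
    """Hypothetical container for one task instance (field names are assumptions)."""
    claim: str
    reason: str
    warrant0: str
    warrant1: str
    label: int  # index (0 or 1) of the correct warrant


def to_independent_pairs(
    examples: List[ArctExample],
) -> List[Tuple[str, str, str, int]]:
    """Split every two-warrant instance into two (claim, reason, warrant, target) rows.

    Each warrant is scored on its own, so the model never sees which slot
    (first or second) a warrant occupied in the original instance.
    """
    pairs = []
    for ex in examples:
        for idx, warrant in enumerate((ex.warrant0, ex.warrant1)):
            target = int(idx == ex.label)  # 1 if this warrant is the correct one
            pairs.append((ex.claim, ex.reason, warrant, target))
    return pairs
```

Because the slot index is discarded, any correlation between warrant features and position is unavailable to the matcher, which is the regularization effect the abstract describes.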
LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks
Title | LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks |
Authors | Daniel H. Noronha, Bahar Salehpour, Steven J. E. Wilton |
Abstract | Recent work has shown that Field-Programmable Gate Arrays (FPGAs) play an important role in the acceleration of Machine Learning applications. Initial specification of machine learning applications is often done using a high-level Python-oriented framework such as Tensorflow, followed by a manual translation to either C or RTL for synthesis using vendor tools. This manual translation step is time-consuming and requires expertise that limits the applicability of FPGAs in this important domain. In this paper, we present an open-source tool-flow that maps numerical computation models written in Tensorflow to synthesizable hardware. Unlike other tools, which are often constrained by a small number of inflexible templates, our flow uses Google’s XLA compiler, which emits LLVM code directly from a Tensorflow specification. This LLVM code can then be used with a high-level synthesis tool to automatically generate hardware. We show that our flow allows users to generate Deep Neural Networks with very few lines of Python code. |
Tasks | |
Published | 2018-07-14 |
URL | http://arxiv.org/abs/1807.05317v1 |
http://arxiv.org/pdf/1807.05317v1.pdf | |
PWC | https://paperswithcode.com/paper/leflow-enabling-flexible-fpga-high-level |
Repo | https://github.com/danielholanda/LeFlow |
Framework | tf |
TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation
Title | TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation |
Authors | Pengcheng Yin, Graham Neubig |
Abstract | We present TRANX, a transition-based neural semantic parser that maps natural language (NL) utterances into formal meaning representations (MRs). TRANX uses a transition system based on the abstract syntax description language for the target MR, which gives it two major advantages: (1) it is highly accurate, using information from the syntax of the target MR to constrain the output space and model the information flow, and (2) it is highly generalizable, and can easily be applied to new types of MR by just writing a new abstract syntax description corresponding to the allowable structures in the MR. Experiments on four different semantic parsing and code generation tasks show that our system is generalizable, extensible, and effective, registering strong results compared to existing neural semantic parsers. |
Tasks | Code Generation, Semantic Parsing |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02720v1 |
http://arxiv.org/pdf/1810.02720v1.pdf | |
PWC | https://paperswithcode.com/paper/tranx-a-transition-based-neural-abstract |
Repo | https://github.com/pcyin/tranX |
Framework | pytorch |
SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Title | SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension |
Authors | Taeuk Kim, Jihun Choi, Sang-goo Lee |
Abstract | We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set. |
Tasks | Machine Translation, Transfer Learning |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07049v1 |
http://arxiv.org/pdf/1805.07049v1.pdf | |
PWC | https://paperswithcode.com/paper/snu_ids-at-semeval-2018-task-12-sentence |
Repo | https://github.com/galsang/SemEval2018-task12 |
Framework | pytorch |
NEUZZ: Efficient Fuzzing with Neural Program Smoothing
Title | NEUZZ: Efficient Fuzzing with Neural Program Smoothing |
Authors | Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, Suman Jana |
Abstract | Fuzzing has become the de facto standard technique for finding software vulnerabilities. However, even state-of-the-art fuzzers are not very efficient at finding hard-to-trigger software bugs. Most popular fuzzers use evolutionary guidance to generate inputs that can trigger different bugs. Such evolutionary algorithms, while fast and simple to implement, often get stuck in fruitless sequences of random mutations. Gradient-guided optimization presents a promising alternative to evolutionary guidance. Gradient-guided techniques have been shown to significantly outperform evolutionary algorithms at solving high-dimensional structured optimization problems in domains like machine learning by efficiently utilizing gradients or higher-order derivatives of the underlying function. However, gradient-guided approaches are not directly applicable to fuzzing, as real-world program behaviors contain many discontinuities, plateaus, and ridges where gradient-based methods often get stuck. We observe that this problem can be addressed by creating a smooth surrogate function approximating the discrete branching behavior of the target program. In this paper, we propose a novel program smoothing technique using surrogate neural network models that can incrementally learn smooth approximations of a complex, real-world program’s branching behaviors. We further demonstrate that such neural network models can be used together with gradient-guided input generation schemes to significantly improve fuzzing efficiency. Our extensive evaluations demonstrate that NEUZZ significantly outperforms 10 state-of-the-art graybox fuzzers on 10 real-world programs both at finding new bugs and achieving higher edge coverage. NEUZZ found 31 unknown bugs that other fuzzers failed to find in 10 real-world programs and achieved 3X more edge coverage than all of the tested graybox fuzzers over 24 hours of running. |
Tasks | |
Published | 2018-07-15 |
URL | https://arxiv.org/abs/1807.05620v4 |
https://arxiv.org/pdf/1807.05620v4.pdf | |
PWC | https://paperswithcode.com/paper/neuzz-efficient-fuzzing-with-neural-program |
Repo | https://github.com/dongdongshe/neuzz |
Framework | tf |
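The idea of gradient-guided input generation on a smooth surrogate can be sketched in a few lines. This is a toy under stated assumptions, not the NEUZZ implementation: the surrogate here is a single logistic unit with hand-picked weights `w` and bias `b` (hypothetical stand-ins for a trained neural network), and `mutate` is an illustrative helper that nudges the input bytes with the largest gradient magnitude.

```python
import math
from typing import List, Sequence


def surrogate(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Smooth stand-in for one branch: a score in (0, 1) that the edge is taken."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x: Sequence[float], w: Sequence[float], b: float) -> List[float]:
    """Gradient of the logistic surrogate with respect to each input position."""
    s = surrogate(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]


def mutate(data: bytes, w: Sequence[float], b: float,
           top_k: int = 2, step: int = 32) -> bytes:
    """Move the top_k most influential bytes in the direction that raises the score."""
    x = [byte / 255.0 for byte in data]  # scale bytes to [0, 1]
    grad = input_gradient(x, w, b)
    hot = sorted(range(len(data)), key=lambda i: -abs(grad[i]))[:top_k]
    out = list(data)
    for i in hot:
        if grad[i] == 0:
            continue  # no signal for this byte, leave it unchanged
        direction = 1 if grad[i] > 0 else -1
        out[i] = min(255, max(0, out[i] + direction * step))
    return bytes(out)
```

The point of the sketch is only that a differentiable approximation tells the fuzzer which bytes to change and in which direction, instead of relying on random mutation.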
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons
Title | Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons |
Authors | Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi |
Abstract | An activation boundary for a neuron refers to a separating hyperplane that determines whether the neuron is activated or deactivated. It has long been considered in neural networks that the activations of neurons, rather than their exact output values, play the most important role in forming classification-friendly partitions of the hidden feature space. However, as far as we know, this aspect of neural networks has not been considered in the literature of knowledge transfer. In this paper, we propose a knowledge transfer method via distillation of activation boundaries formed by hidden neurons. For the distillation, we propose an activation transfer loss that has the minimum value when the boundaries generated by the student coincide with those of the teacher. Since the activation transfer loss is not differentiable, we design a piecewise differentiable loss approximating the activation transfer loss. With the proposed method, the student learns a separating boundary between the activation region and the deactivation region formed by each neuron in the teacher. Experiments on various aspects of knowledge transfer verify that the proposed method outperforms the current state-of-the-art. |
Tasks | Transfer Learning |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03233v2 |
http://arxiv.org/pdf/1811.03233v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-transfer-via-distillation-of |
Repo | https://github.com/bhheo/AB_distillation |
Framework | pytorch |
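The contrast between a non-differentiable activation transfer loss and a piecewise differentiable approximation can be sketched as follows. The abstract does not give the exact formulas, so both functions are assumptions: the first counts on/off disagreements between teacher and student neurons, and the second is a squared hinge surrogate with a hypothetical `margin` parameter. Inputs are lists of pre-activation values for matching neurons.

```python
from typing import Sequence


def activation_mismatch(teacher_pre: Sequence[float],
                        student_pre: Sequence[float]) -> int:
    """Count neurons whose on/off state differs (zero gradient almost everywhere)."""
    return sum(int((t > 0) != (s > 0)) for t, s in zip(teacher_pre, student_pre))


def hinge_surrogate_loss(teacher_pre: Sequence[float],
                         student_pre: Sequence[float],
                         margin: float = 1.0) -> float:
    """Squared hinge surrogate (an assumed form, not necessarily the paper's).

    Where the teacher neuron is active, the student pre-activation is pushed
    above +margin; where it is inactive, below -margin.
    """
    total = 0.0
    for t, s in zip(teacher_pre, student_pre):
        gap = max(0.0, margin - s) if t > 0 else max(0.0, margin + s)
        total += gap * gap
    return total
```

For example, with teacher values `[2.0, -1.0]` and student values `[-0.5, 0.5]`, both neurons disagree, and the surrogate returns `1.5**2 + 1.5**2 = 4.5`, a value with a usable gradient toward the teacher's side of each boundary.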
Differentiable MPC for End-to-end Planning and Control
Title | Differentiable MPC for End-to-end Planning and Control |
Authors | Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter |
Abstract | We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces. This provides one way of leveraging and combining the advantages of model-free and model-based approaches. Specifically, we differentiate through MPC by using the KKT conditions of the convex approximation at a fixed point of the controller. Using this strategy, we are able to learn the cost and dynamics of a controller via end-to-end learning. Our experiments focus on imitation learning in the pendulum and cartpole domains, where we learn the cost and dynamics terms of an MPC policy class. We show that our MPC policies are significantly more data-efficient than a generic neural network and that our method is superior to traditional system identification in a setting where the expert is unrealizable. |
Tasks | Imitation Learning |
Published | 2018-10-31 |
URL | https://arxiv.org/abs/1810.13400v3 |
https://arxiv.org/pdf/1810.13400v3.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-mpc-for-end-to-end-planning |
Repo | https://github.com/locuslab/differentiable-mpc |
Framework | pytorch |
Visualizing the Flow of Discourse with a Concept Ontology
Title | Visualizing the Flow of Discourse with a Concept Ontology |
Authors | Baoxu Shi, Tim Weninger |
Abstract | Understanding and visualizing human discourse has long been a challenging task. Although recent work on argument mining has shown success in classifying the role of various sentences, the task of recognizing concepts and understanding the ways in which they are discussed remains challenging. Given an email thread or a transcript of a group discussion, our task is to extract the relevant concepts and understand how they are referenced and re-referenced throughout the discussion. In the present work, we present a preliminary approach for extracting and visualizing group discourse by adapting Wikipedia’s category hierarchy to be an external concept ontology. From a user study, we found that our method achieved better results than 4 strong alternative approaches, and we illustrate our visualization method based on the extracted discourse flows. |
Tasks | Argument Mining |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08614v1 |
http://arxiv.org/pdf/1802.08614v1.pdf | |
PWC | https://paperswithcode.com/paper/visualizing-the-flow-of-discourse-with-a |
Repo | https://github.com/bxshi/DiscourseVisualization |
Framework | none |
Encoding Spatial Relations from Natural Language
Title | Encoding Spatial Relations from Natural Language |
Authors | Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann |
Abstract | Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the real world. In particular, spatial relations are encoded in a way that is inconsistent with human spatial reasoning and lacks invariance to viewpoint changes. We present a system capable of capturing the semantics of spatial relations such as behind, left of, etc. from natural language. Our key contributions are a novel multi-modal objective based on generating images of scenes from their textual descriptions, and a new dataset on which to train it. We demonstrate that internal representations are robust to meaning-preserving transformations of descriptions (paraphrase invariance), while viewpoint invariance is an emergent property of the system. |
Tasks | |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01670v2 |
http://arxiv.org/pdf/1807.01670v2.pdf | |
PWC | https://paperswithcode.com/paper/encoding-spatial-relations-from-natural |
Repo | https://github.com/deepmind/slim-dataset |
Framework | tf |
AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity
Title | AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity |
Authors | Yibo Zeng, Fei Feng, Wotao Yin |
Abstract | In this paper, we propose AsyncQVI, an asynchronous-parallel Q-value iteration for discounted Markov decision processes whose transition and reward can only be sampled through a generative model. Given such a problem with $\mathcal{S}$ states, $\mathcal{A}$ actions, and a discount factor $\gamma\in(0,1)$, AsyncQVI uses memory of size $\mathcal{O}(\mathcal{S})$ and returns an $\varepsilon$-optimal policy with probability at least $1-\delta$ using $$\tilde{\mathcal{O}}\big(\frac{\mathcal{S}\mathcal{A}}{(1-\gamma)^5\varepsilon^2}\log(\frac{1}{\delta})\big)$$ samples. AsyncQVI is also the first asynchronous-parallel algorithm for discounted Markov decision processes whose sample complexity nearly matches the theoretical lower bound. The relatively low memory footprint and parallel ability make AsyncQVI suitable for large-scale applications. In numerical tests, we compare AsyncQVI with four sample-based value iteration methods. The results show that our algorithm is highly efficient and achieves linear parallel speedup. |
Tasks | |
Published | 2018-12-03 |
URL | https://arxiv.org/abs/1812.00885v3 |
https://arxiv.org/pdf/1812.00885v3.pdf | |
PWC | https://paperswithcode.com/paper/asyncqvi-asynchronous-parallel-q-value |
Repo | https://github.com/uclaopt/AsyncQVI |
Framework | none |
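Sample-based Q-value iteration with a generative model can be illustrated on a toy problem. This is a sequential sketch only, not AsyncQVI: it keeps a full Q table, so it reproduces neither the asynchronous-parallel updates nor the $\mathcal{O}(\mathcal{S})$ memory footprint claimed above. The two-state MDP, the `sample_transition` generative model, and the `slip` parameter are all hypothetical.

```python
import random
from typing import List, Tuple


def sample_transition(state: int, action: int, rng: random.Random,
                      slip: float = 0.0) -> Tuple[float, int]:
    """Hypothetical generative model for a 2-state, 2-action toy MDP.

    Action a moves to state a; staying in state 1 via action 1 pays reward 1.
    With probability `slip` the other action is executed instead.
    """
    if rng.random() < slip:
        action = 1 - action
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    return reward, action  # next state equals the executed action


def sampled_q_value_iteration(n_states: int, n_actions: int, gamma: float,
                              sweeps: int, n_samples: int, rng: random.Random,
                              slip: float = 0.0) -> Tuple[List[List[float]], List[int]]:
    """Estimate Q by averaging sampled Bellman backups, then act greedily."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _sweep in range(sweeps):
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                total = 0.0
                for _draw in range(n_samples):
                    r, s_next = sample_transition(s, a, rng, slip)
                    total += r + gamma * max(q[s_next])  # sampled backup
                new_q[s][a] = total / n_samples
        q = new_q
    policy = [max(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return q, policy
```

With `slip=0.0` and `gamma=0.5` the toy problem is deterministic, so the estimates converge to the exact values (for instance, Q(1, 1) approaches 1 / (1 - 0.5) = 2) and the greedy policy picks action 1 in both states.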
Neural Adaptation Layers for Cross-domain Named Entity Recognition
Title | Neural Adaptation Layers for Cross-domain Named Entity Recognition |
Authors | Bill Yuchen Lin, Wei Lu |
Abstract | Recent research efforts have shown that neural architectures can be effective in conventional information extraction tasks such as named entity recognition, yielding state-of-the-art results on standard newswire datasets. However, despite significant resources required for training such models, the performance of a model trained on one domain typically degrades dramatically when applied to a different domain, yet extracting entities from new emerging domains such as social media can be of significant interest. In this paper, we empirically investigate effective methods for conveniently adapting an existing, well-trained neural NER model for a new domain. Unlike existing approaches, we propose lightweight yet effective methods for performing domain adaptation for neural models. Specifically, we introduce adaptation layers on top of existing neural architectures, where no re-training using the source domain data is required. We conduct extensive empirical studies and show that our approach significantly outperforms state-of-the-art methods. |
Tasks | Cross-Domain Named Entity Recognition, Domain Adaptation, Named Entity Recognition |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06368v1 |
http://arxiv.org/pdf/1810.06368v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-adaptation-layers-for-cross-domain |
Repo | https://github.com/yuchenlin/CDMA-NER |
Framework | tf |
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Title | Multilevel Language and Vision Integration for Text-to-Clip Retrieval |
Authors | Huijuan Xu, Kun He, Bryan A. Plummer, Leonid Sigal, Stan Sclaroff, Kate Saenko |
Abstract | We address the problem of text-based activity retrieval in video. Given a sentence describing an activity, our task is to retrieve matching clips from an untrimmed video. To capture the inherent structures present in both text and video, we introduce a multilevel model that integrates vision and language features earlier and more tightly than prior work. First, we inject text features early on when generating clip proposals, to help eliminate unlikely clips and thus speed up processing and boost performance. Second, to learn a fine-grained similarity metric for retrieval, we use visual features to modulate the processing of query sentences at the word level in a recurrent neural network. A multi-task loss is also employed by adding query re-generation as an auxiliary task. Our approach significantly outperforms prior work on two challenging benchmarks: Charades-STA and ActivityNet Captions. |
Tasks | |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05113v3 |
http://arxiv.org/pdf/1804.05113v3.pdf | |
PWC | https://paperswithcode.com/paper/multilevel-language-and-vision-integration |
Repo | https://github.com/VisionLearningGroup/Text-to-Clip_Retrieval |
Framework | none |
Skin Lesions Classification Using Convolutional Neural Networks in Clinical Images
Title | Skin Lesions Classification Using Convolutional Neural Networks in Clinical Images |
Authors | Danilo Barros Mendes, Nilton Correia da Silva |
Abstract | Skin lesions are conditions that appear on a patient for many different reasons. One of these is an abnormal growth in skin tissue, defined as cancer. This disease plagues more than 14.1 million patients and has been the cause of more than 8.2 million deaths worldwide. Therefore, the construction of a classification model for 12 lesions, including Malignant Melanoma and Basal Cell Carcinoma, is proposed. In this work, a ResNet-152 architecture is used, which was trained on 3,797 images, later augmented by a factor of 29 using positional, scale, and lighting transformations. Finally, the network was tested with 956 images and achieved an area under the curve (AUC) of 0.96 for Melanoma and 0.91 for Basal Cell Carcinoma. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02316v1 |
http://arxiv.org/pdf/1812.02316v1.pdf | |
PWC | https://paperswithcode.com/paper/skin-lesions-classification-using |
Repo | https://github.com/aryanmisra/Skin-Lesion-Classifier |
Framework | tf |
Attention Based Natural Language Grounding by Navigating Virtual Environment
Title | Attention Based Natural Language Grounding by Navigating Virtual Environment |
Authors | Akilesh B, Abhishek Sinha, Mausoom Sarkar, Balaji Krishnamurthy |
Abstract | In this work, we focus on the problem of grounding language by training an agent to follow a set of natural language instructions and navigate to a target object in an environment. The agent receives visual information through raw pixels and a natural language instruction stating the task to be achieved, and is trained in an end-to-end way. We develop an attention mechanism for multi-modal fusion of visual and textual modalities that allows the agent to learn to complete the task and achieve language grounding. Our experimental results show that, on this task, our attention mechanism outperforms the existing multi-modal fusion mechanisms proposed for both 2D and 3D environments in terms of both speed and success rate. We show that the learnt textual representations are semantically meaningful, as they follow vector arithmetic in the embedding space. The effectiveness of our attention approach over contemporary fusion mechanisms is also highlighted by the textual embeddings learnt by the different approaches. We also show that our model generalizes effectively to unseen scenarios and exhibits zero-shot generalization capabilities in both 2D and 3D environments. The code for our 2D environment as well as the models that we developed for both 2D and 3D are available at https://github.com/rl-lang-grounding/rl-lang-ground. |
Tasks | |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08454v2 |
http://arxiv.org/pdf/1804.08454v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-natural-language-grounding-by |
Repo | https://github.com/rl-lang-grounding/rl-lang-ground |
Framework | tf |
t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data
Title | t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data |
Authors | David M. Chan, Roshan Rao, Forrest Huang, John F. Canny |
Abstract | Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods that employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-accelerated implementation of t-distributed Stochastic Neighbor Embedding (t-SNE) for visualizing datasets and models. t-SNE-CUDA significantly outperforms current implementations with 50-700x speedups on the CIFAR-10 and MNIST datasets. These speedups enable, for the first time, visualization of the neural network activations on the entire ImageNet dataset - a feat that was previously computationally intractable. We also demonstrate visualization performance in the NLP domain by visualizing the GloVe embedding vectors. From these visualizations, we can draw interesting conclusions about using the L2 metric in these embedding spaces. t-SNE-CUDA is publicly available at https://github.com/CannyLab/tsne-cuda |
Tasks | Dimensionality Reduction |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11824v1 |
http://arxiv.org/pdf/1807.11824v1.pdf | |
PWC | https://paperswithcode.com/paper/t-sne-cuda-gpu-accelerated-t-sne-and-its |
Repo | https://github.com/CannyLab/tsne-cuda |
Framework | none |