July 26, 2019


Paper Group NAWR 10



Exploring Neural Text Simplification Models

Title Exploring Neural Text Simplification Models
Authors Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, Liviu P. Dinu
Abstract We present the first attempt at using sequence to sequence neural networks to model text simplification (TS). Unlike the previously proposed automated TS systems, our neural text simplification (NTS) systems are able to simultaneously perform lexical simplification and content reduction. An extensive human evaluation of the output has shown that NTS systems achieve almost perfect grammaticality and meaning preservation of output sentences and a higher level of simplification than the state-of-the-art automated TS systems.
Tasks Lexical Simplification, Machine Translation, Text Simplification, Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2014/
PDF https://www.aclweb.org/anthology/P17-2014
PWC https://paperswithcode.com/paper/exploring-neural-text-simplification-models
Repo https://github.com/senisioi/NeuralTextSimplification
Framework torch

End-to-End Deep Learning of Optimization Heuristics

Title End-to-End Deep Learning of Optimization Heuristics
Authors Chris Cummins, Pavlos Petoumenos, Zheng Wang, Hugh Leather
Abstract Accurate automatic optimization heuristics are necessary for dealing with the complexity and diversity of modern hardware and software. Machine learning is a proven technique for learning such heuristics, but its success is bound by the quality of the features used. These features must be hand crafted by developers through a combination of expert domain knowledge and trial and error. This makes the quality of the final model directly dependent on the skill and available time of the system architect. Our work introduces a better way for building heuristics. We develop a deep neural network that learns heuristics over raw code, entirely without using code features. The neural network simultaneously constructs appropriate representations of the code and learns how best to optimize, removing the need for manual feature creation. Further, we show that our neural nets can transfer learning from one optimization problem to another, improving the accuracy of new models, without the help of human experts. We compare the effectiveness of our automatically generated heuristics against ones with features hand-picked by experts. We examine two challenging tasks: predicting optimal mapping for heterogeneous parallelism and GPU thread coarsening factors. In 89% of the cases, the quality of our fully automatic heuristics matches or surpasses that of state-of-the-art predictive models using hand-crafted features, providing on average 14% and 12% more performance with no human effort expended on designing features.
Tasks Transfer Learning
Published 2017-11-02
URL https://ieeexplore.ieee.org/document/8091247
PDF http://homepages.inf.ed.ac.uk/hleather/publications/2017-deepopt-pact.pdf
PWC https://paperswithcode.com/paper/end-to-end-deep-learning-of-optimization
Repo https://github.com/ChrisCummins/paper-end2end-dl
Framework none

LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test

Title LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test
Authors Michael Bugert, Yevgeniy Puzikov, Andreas Rücklé, Judith Eckle-Kohler, Teresa Martin, Eugenio Martínez-Cámara, Daniil Sorokin, Maxime Peyrard, Iryna Gurevych
Abstract The Story Cloze test is a recent effort in providing a common test scenario for text understanding systems. As part of the LSDSem 2017 shared task, we present a system based on a deep learning architecture combined with a rich set of manually-crafted linguistic features. The system outperforms all known baselines for the task, suggesting that the chosen approach is promising. We additionally present two methods for generating further training data based on stories from the ROCStories corpus.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-0908/
PDF https://www.aclweb.org/anthology/W17-0908
PWC https://paperswithcode.com/paper/lsdsem-2017-exploring-data-generation-methods
Repo https://github.com/UKPLab/lsdsem2017-story-cloze
Framework tf

ATM: A distributed, collaborative, scalable system for automated machine learning

Title ATM: A distributed, collaborative, scalable system for automated machine learning
Authors Thomas Swearingen, Will Drevo, Bennett Cyphers, Alfredo Cuesta-Infante, Arun Ross, Kalyan Veeramachaneni
Abstract In this paper, we present Auto-Tuned Models, or ATM, a distributed, collaborative, scalable system for automated machine learning. Users of ATM can simply upload a dataset, choose a subset of modeling methods, and choose to use ATM’s hybrid Bayesian and multi-armed bandit optimization system. The distributed system works in a load balanced fashion to quickly deliver results in the form of ready-to-predict models, confusion matrices, cross-validation results, and training timings. By automating hyperparameter tuning and model selection, ATM returns the emphasis of the machine learning workflow to its most irreducible part: feature engineering. We demonstrate the usefulness of ATM on 420 datasets from OpenML and train over 3 million classifiers. Our initial results show ATM can beat human-generated solutions for 30% of the datasets, and can do so in 1/100th of the time.
Tasks AutoML, Feature Engineering, Hyperparameter Optimization, Model Selection
Published 2017-12-11
URL https://ieeexplore.ieee.org/document/8257923
PDF https://dai.lids.mit.edu/wp-content/uploads/2018/02/atm_IEEE_BIgData-9-1.pdf
PWC https://paperswithcode.com/paper/atm-a-distributed-collaborative-scalable
Repo https://github.com/HDI-Project/ATM
Framework none
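
The abstract above mentions a multi-armed bandit component for choosing among modeling methods. As a rough illustration of that idea only, here is a minimal sketch of UCB1, one standard bandit rule. It is an assumption that this resembles what ATM does: the abstract does not name the rule, and the function name and inputs below are hypothetical.

```python
import math

def ucb1_select(counts, mean_scores):
    """Return the index of the arm (modeling method) to try next under UCB1.

    counts[i] is how many times method i has been tried so far and
    mean_scores[i] is its average validation score (hypothetical inputs).
    """
    # Try every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    # Mean score plus an exploration bonus that shrinks as an arm is tried more.
    bounds = [
        mean + math.sqrt(2.0 * math.log(total) / n)
        for mean, n in zip(mean_scores, counts)
    ]
    return max(range(len(bounds)), key=bounds.__getitem__)
```

With equal trial counts the rule picks the method with the higher mean score; a rarely tried method receives a large bonus and is revisited even if its mean is lower.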

Semi-supervised Structured Prediction with Neural CRF Autoencoder

Title Semi-supervised Structured Prediction with Neural CRF Autoencoder
Authors Xiao Zhang, Yong Jiang, Hao Peng, Kewei Tu, Dan Goldwasser
Abstract In this paper we propose an end-to-end neural CRF autoencoder (NCRF-AE) model for semi-supervised learning of sequential structured prediction problems. Our NCRF-AE consists of two parts: an encoder which is a CRF model enhanced by deep neural networks, and a decoder which is a generative model trying to reconstruct the input. Our model has a unified structure with different loss functions for labeled and unlabeled data with shared parameters. We developed a variation of the EM algorithm for optimizing both the encoder and the decoder simultaneously by decoupling their parameters. Our experimental results on the Part-of-Speech (POS) tagging task in eight different languages show that our model can outperform competitive systems in both supervised and semi-supervised scenarios.
Tasks Part-Of-Speech Tagging, Structured Prediction
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1179/
PDF https://www.aclweb.org/anthology/D17-1179
PWC https://paperswithcode.com/paper/semi-supervised-structured-prediction-with
Repo https://github.com/cosmozhang/NCRF-AE
Framework none

Aggregating and Predicting Sequence Labels from Crowd Annotations

Title Aggregating and Predicting Sequence Labels from Crowd Annotations
Authors An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease
Abstract Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.
Tasks Named Entity Recognition, Part-Of-Speech Tagging
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1028/
PDF https://www.aclweb.org/anthology/P17-1028
PWC https://paperswithcode.com/paper/aggregating-and-predicting-sequence-labels
Repo https://github.com/thanhan/seqcrowd-acl17
Framework none
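
To make the aggregation task in the abstract above concrete, the following is a minimal sketch of per-token majority voting, a simple baseline for merging crowd labels. It is not the Hidden Markov Model variant the paper proposes; the function name and the example labels are made up for illustration.

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate per-token labels from several annotators by majority vote.

    annotations: list of label sequences, one per annotator, all covering
    the same tokens. Ties go to the label encountered first among the tied ones.
    """
    if not annotations:
        return []
    length = len(annotations[0])
    if any(len(seq) != length for seq in annotations):
        raise ValueError("all annotators must label the same tokens")
    consensus = []
    # zip(*annotations) yields, for each token, the labels given by every annotator.
    for token_labels in zip(*annotations):
        consensus.append(Counter(token_labels).most_common(1)[0][0])
    return consensus

crowd = [
    ["B-PER", "O", "O"],
    ["B-PER", "I-PER", "O"],
    ["O", "I-PER", "O"],
]
consensus = majority_vote(crowd)
```

Because each token is voted on independently, the result can be an inconsistent label sequence, which is one reason to model the sequence as a whole.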

Deep Pyramid Convolutional Neural Networks for Text Categorization

Title Deep Pyramid Convolutional Neural Networks for Text Categorization
Authors Rie Johnson, Tong Zhang
Abstract This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. In the literature, several deep and complex neural networks have been proposed for this task, assuming availability of relatively large amounts of training data. However, the associated computational complexity increases as the networks go deeper, which poses serious challenges in practical applications. Moreover, it was shown recently that shallow word-level CNNs are more accurate and much faster than the state-of-the-art very deep nets such as character-level CNNs even in the setting of large training data. Motivated by these findings, we carefully studied deepening of word-level CNNs to capture global representations of text, and found a simple network architecture with which the best accuracy can be obtained by increasing the network depth without increasing computational cost by much. We call it deep pyramid CNN. The proposed model with 15 weight layers outperforms the previous best models on six benchmark datasets for sentiment classification and topic categorization.
Tasks Sentiment Analysis, Text Classification
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1052/
PDF https://www.aclweb.org/anthology/P17-1052
PWC https://paperswithcode.com/paper/deep-pyramid-convolutional-neural-networks
Repo https://github.com/Cheneng/DPCNN
Framework pytorch

Machine Learning of Accurate Energy-conserving Molecular Force Fields

Title Machine Learning of Accurate Energy-conserving Molecular Force Fields
Authors Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, I., Schütt, K. T., Müller, K.-R.
Abstract Using conservation of energy, a fundamental property of closed classical and quantum mechanical systems, we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The GDML implementation is able to reproduce global potential energy surfaces of intermediate-sized molecules with an accuracy of 0.3 kcal mol−1 for energies and 1 kcal mol−1 Å−1 for atomic forces using only 1000 conformational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including benzene, toluene, naphthalene, ethanol, uracil, and aspirin. The challenge of constructing conservative force fields is accomplished in our work by learning in a Hilbert space of vector-valued functions that obey the law of energy conservation. The GDML approach enables quantitative molecular dynamics simulations for molecules at a fraction of cost of explicit AIMD calculations, thereby allowing the construction of efficient force fields with the accuracy and transferability of high-level ab initio methods.
Tasks MD17 dataset
Published 2017-05-05
URL https://advances.sciencemag.org/content/3/5/e1603015
PDF https://advances.sciencemag.org/content/3/5/e1603015/tab-pdf
PWC https://paperswithcode.com/paper/machine-learning-of-accurate-energy
Repo https://github.com/stefanch/sGDML
Framework pytorch

RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos

Title RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos
Authors Wenbin Du, Yali Wang, Yu Qiao
Abstract Recent studies demonstrate the effectiveness of Recurrent Neural Networks (RNNs) for action recognition in videos. However, previous works mainly utilize video-level category as supervision to train RNNs, which may prevent RNNs from learning complex motion structures along time. In this paper, we propose a recurrent pose-attention network (RPAN) to address this challenge, where we introduce a novel pose-attention mechanism to adaptively learn pose-related features at every time-step action prediction of RNNs. More specifically, we make three main contributions in this paper. Firstly, unlike previous works on pose-related action recognition, our RPAN is an end-to-end recurrent network which can exploit important spatial-temporal evolutions of human pose to assist action recognition in a unified framework. Secondly, instead of learning individual human-joint features separately, our pose-attention mechanism learns robust human-part features by sharing attention parameters partially on the semantically-related human joints. These human-part features are then fed into the human-part pooling layer to construct a highly-discriminative pose-related representation for temporal action modeling. Thirdly, one important byproduct of our RPAN is pose estimation in videos, which can be used for coarse pose annotation in action videos. We evaluate the proposed RPAN quantitatively and qualitatively on two popular benchmarks, i.e., Sub-JHMDB and PennAction. Experimental results show that RPAN outperforms the recent state-of-the-art methods on these challenging datasets.
Tasks Action Recognition In Videos, Pose Estimation, Skeleton Based Action Recognition
Published 2017-10-22
URL https://doi.org/10.1109/ICCV.2017.402
PDF https://doi.org/10.1109/ICCV.2017.402
PWC https://paperswithcode.com/paper/rpan-an-end-to-end-recurrent-pose-attention-1
Repo https://github.com/agethen/RPAN
Framework tf

On-demand Injection of Lexical Knowledge for Recognising Textual Entailment

Title On-demand Injection of Lexical Knowledge for Recognising Textual Entailment
Authors Pascual Martínez-Gómez, Koji Mineshima, Yusuke Miyao, Daisuke Bekki
Abstract We approach the recognition of textual entailment using logical semantic representations and a theorem prover. In this setup, lexical divergences that preserve semantic entailment between the source and target texts need to be explicitly stated. However, recognising subsentential semantic relations is not trivial. We address this problem by monitoring the proof of the theorem and detecting unprovable sub-goals that share predicate arguments with logical premises. If a linguistic relation exists, then an appropriate axiom is constructed on-demand and the theorem proving continues. Experiments show that this approach is effective and precise, producing a system that outperforms other logic-based systems and is competitive with state-of-the-art statistical methods.
Tasks Automated Theorem Proving, Information Retrieval, Natural Language Inference, Question Answering
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1067/
PDF https://www.aclweb.org/anthology/E17-1067
PWC https://paperswithcode.com/paper/on-demand-injection-of-lexical-knowledge-for
Repo https://github.com/mynlp/ccg2lambda
Framework none

Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?

Title Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?
Authors Shangmin Guo, Xiangrong Zeng, Shizhu He, Kang Liu, Jun Zhao
Abstract As one of the most important tests in China, Gaokao is designed to be difficult enough to distinguish excellent high school students. In this work, we detail the Gaokao History Multiple Choice Questions (GKHMC) and propose two different approaches to address them using various resources. One approach is based on an entity search technique (IR approach); the other is based on a text entailment approach in which we specifically employ deep neural networks (NN approach). Experiments on our collected real Gaokao questions show that the two approaches are good at different categories of questions: the IR approach performs much better on entity questions (EQs), while the NN approach shows its advantage on sentence questions (SQs). We achieve state-of-the-art performance and show that it is indispensable to apply a hybrid method when participating in real-world tests.
Tasks Information Retrieval, Question Answering, Reading Comprehension
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1011/
PDF https://www.aclweb.org/anthology/E17-1011
PWC https://paperswithcode.com/paper/which-is-the-effective-way-for-gaokao
Repo https://github.com/IACASNLPIR/GKHMC
Framework none

Beyond Filters: Compact Feature Map for Portable Deep Model

Title Beyond Filters: Compact Feature Map for Portable Deep Model
Authors Yunhe Wang, Chang Xu, Chao Xu, Dacheng Tao
Abstract Convolutional neural networks (CNNs) have shown extraordinary performance in a number of applications, but they are usually heavy by design for the sake of accuracy. Beyond compressing the filters in CNNs, this paper focuses on the redundancy in the feature maps derived from the large number of filters in a layer. We propose to extract an intrinsic representation of the feature maps and preserve the discriminability of the features. A circulant matrix is employed to formulate the feature map transformation, which only requires O(d log d) computation complexity to embed a d-dimensional feature map. The filter is then re-configured to establish the mapping from the original input to the new compact feature map, and the resulting network can preserve intrinsic information of the original network with significantly fewer parameters, which not only decreases the online memory for launching the CNN but also accelerates the computation speed. Experiments on benchmark image datasets demonstrate the superiority of the proposed algorithm over state-of-the-art methods.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=466
PDF http://proceedings.mlr.press/v70/wang17m/wang17m.pdf
PWC https://paperswithcode.com/paper/beyond-filters-compact-feature-map-for
Repo https://github.com/YunheWang/RedCNN
Framework none
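
The O(d log d) cost quoted in the abstract above follows from a general property of circulant matrices: multiplying one by a vector is a circular convolution, which the FFT computes quickly. The sketch below shows only that generic property in NumPy; it is not the paper's feature map transformation, and the function names are chosen here for illustration.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by x in O(d log d)."""
    # A circulant product is a circular convolution, i.e. an elementwise
    # product in the Fourier domain.
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def circulant_dense(c):
    """Build the same circulant matrix explicitly (O(d^2)), for checking only."""
    d = len(c)
    return np.array([[c[(i - j) % d] for j in range(d)] for i in range(d)])

rng = np.random.default_rng(0)
c = rng.normal(size=8)
x = rng.normal(size=8)
fast = circulant_matvec(c, x)
slow = circulant_dense(c) @ x
```

Here `fast` and `slow` agree up to floating-point error, while the FFT route never materialises the d-by-d matrix.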

End-to-End System for Bacteria Habitat Extraction

Title End-to-End System for Bacteria Habitat Extraction
Authors Farrokh Mehryary, Kai Hakala, Suwisa Kaewphan, Jari Björne, Tapio Salakoski, Filip Ginter
Abstract We introduce an end-to-end system capable of named-entity detection, normalization and relation extraction for extracting information about bacteria and their habitats from biomedical literature. Our system is based on deep learning, CRF classifiers and vector space models. We train and evaluate the system on the BioNLP 2016 Shared Task Bacteria Biotope data. The official evaluation shows that the joint performance of our entity detection and relation extraction models outperforms the winning team of the Shared Task by 19pp on F1-score, establishing a new top score for the task. We also achieve state-of-the-art results in the normalization task. Our system is open source and freely available at https://github.com/TurkuNLP/BHE.
Tasks Named Entity Recognition, Relation Extraction
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2310/
PDF https://www.aclweb.org/anthology/W17-2310
PWC https://paperswithcode.com/paper/end-to-end-system-for-bacteria-habitat
Repo https://github.com/TurkuNLP/BHE
Framework none

UD Annotatrix: An annotation tool for Universal Dependencies

Title UD Annotatrix: An annotation tool for Universal Dependencies
Authors Francis M. Tyers, Mariya Sheyanova, Jonathan North Washington
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7604/
PDF https://www.aclweb.org/anthology/W17-7604
PWC https://paperswithcode.com/paper/ud-annotatrix-an-annotation-tool-for
Repo https://github.com/jonorthwash/ud-annotatrix
Framework none

Deep IV: A Flexible Approach for Counterfactual Prediction

Title Deep IV: A Flexible Approach for Counterfactual Prediction
Authors Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy
Abstract Counterfactual prediction requires understanding causal relationships between so-called treatment and outcome variables. This paper provides a recipe for augmenting deep learning methods to accurately characterize such relationships in the presence of instrument variables (IVs) – sources of treatment randomization that are conditionally independent from the outcomes. Our IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework allows us to take advantage of off-the-shelf supervised learning techniques to estimate causal effects by adapting the loss function. Experiments show that it outperforms existing machine learning approaches.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=883
PDF http://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf
PWC https://paperswithcode.com/paper/deep-iv-a-flexible-approach-for
Repo https://github.com/jhartford/DeepIV
Framework none
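
The two-stage structure described in the abstract above has a classical linear counterpart, two-stage least squares (2SLS). The sketch below shows that linear counterpart only, as an assumption-laden illustration: Deep IV, per the abstract, uses neural networks in both stages and integrates over the conditional treatment distribution, which this code does not do. The synthetic data and all names are made up.

```python
import numpy as np

def two_stage_least_squares(z, t, y):
    """Estimate the effect of treatment t on outcome y using instrument z."""
    Z = np.column_stack([np.ones_like(z), z])
    # Stage 1: predict the treatment from the instrument.
    t_hat = Z @ np.linalg.lstsq(Z, t, rcond=None)[0]
    # Stage 2: regress the outcome on the predicted treatment.
    T_hat = np.column_stack([np.ones_like(t_hat), t_hat])
    intercept, effect = np.linalg.lstsq(T_hat, y, rcond=None)[0]
    return effect

# Synthetic data (made up): u is an unobserved confounder of t and y,
# and the true treatment effect is 2.0.
rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
t = z + u + 0.1 * rng.normal(size=n)
y = 2.0 * t + 3.0 * u + 0.1 * rng.normal(size=n)

iv_effect = two_stage_least_squares(z, t, y)   # uses the instrument
ols_effect = np.polyfit(t, y, 1)[0]            # naive regression of y on t
```

In this setup the naive regression is biased upward by the confounder, while the instrument-based estimate lands close to the true effect.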