July 26, 2019


Paper Group NAWR 10


Exploring Neural Text Simplification Models


Title	Exploring Neural Text Simplification Models
Authors	Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, Liviu P. Dinu
Abstract	We present the first attempt at using sequence to sequence neural networks to model text simplification (TS). Unlike the previously proposed automated TS systems, our neural text simplification (NTS) systems are able to simultaneously perform lexical simplification and content reduction. An extensive human evaluation of the output has shown that NTS systems achieve almost perfect grammaticality and meaning preservation of output sentences and a higher level of simplification than the state-of-the-art automated TS systems.
Tasks	Lexical Simplification, Machine Translation, Text Simplification, Word Embeddings
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2014/
PDF	https://www.aclweb.org/anthology/P17-2014
PWC	https://paperswithcode.com/paper/exploring-neural-text-simplification-models
Repo	https://github.com/senisioi/NeuralTextSimplification
Framework	torch

End-to-End Deep Learning of Optimization Heuristics


Title	End-to-End Deep Learning of Optimization Heuristics
Authors	Chris Cummins, Pavlos Petoumenos, Zheng Wang, Hugh Leather
Abstract	Accurate automatic optimization heuristics are necessary for dealing with the complexity and diversity of modern hardware and software. Machine learning is a proven technique for learning such heuristics, but its success is bound by the quality of the features used. These features must be hand crafted by developers through a combination of expert domain knowledge and trial and error. This makes the quality of the final model directly dependent on the skill and available time of the system architect. Our work introduces a better way for building heuristics. We develop a deep neural network that learns heuristics over raw code, entirely without using code features. The neural network simultaneously constructs appropriate representations of the code and learns how best to optimize, removing the need for manual feature creation. Further, we show that our neural nets can transfer learning from one optimization problem to another, improving the accuracy of new models, without the help of human experts. We compare the effectiveness of our automatically generated heuristics against ones with features hand-picked by experts. We examine two challenging tasks: predicting optimal mapping for heterogeneous parallelism and GPU thread coarsening factors. In 89% of the cases, the quality of our fully automatic heuristics matches or surpasses that of state-of-the-art predictive models using hand-crafted features, providing on average 14% and 12% more performance with no human effort expended on designing features.
Tasks	Transfer Learning
Published	2017-11-02
URL	https://ieeexplore.ieee.org/document/8091247
PDF	http://homepages.inf.ed.ac.uk/hleather/publications/2017-deepopt-pact.pdf
PWC	https://paperswithcode.com/paper/end-to-end-deep-learning-of-optimization
Repo	https://github.com/ChrisCummins/paper-end2end-dl
Framework	none

LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test


Title	LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test
Authors	Michael Bugert, Yevgeniy Puzikov, Andreas Rücklé, Judith Eckle-Kohler, Teresa Martin, Eugenio Martínez-Cámara, Daniil Sorokin, Maxime Peyrard, Iryna Gurevych
Abstract	The Story Cloze test is a recent effort in providing a common test scenario for text understanding systems. As part of the LSDSem 2017 shared task, we present a system based on a deep learning architecture combined with a rich set of manually-crafted linguistic features. The system outperforms all known baselines for the task, suggesting that the chosen approach is promising. We additionally present two methods for generating further training data based on stories from the ROCStories corpus.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-0908/
PDF	https://www.aclweb.org/anthology/W17-0908
PWC	https://paperswithcode.com/paper/lsdsem-2017-exploring-data-generation-methods
Repo	https://github.com/UKPLab/lsdsem2017-story-cloze
Framework	tf

ATM: A distributed, collaborative, scalable system for automated machine learning


Title	ATM: A distributed, collaborative, scalable system for automated machine learning
Authors	Thomas Swearingen, Will Drevo, Bennett Cyphers, Alfredo Cuesta-Infante, Arun Ross, Kalyan Veeramachaneni
Abstract	In this paper, we present Auto-Tuned Models, or ATM, a distributed, collaborative, scalable system for automated machine learning. Users of ATM can simply upload a dataset, choose a subset of modeling methods, and choose to use ATM’s hybrid Bayesian and multi-armed bandit optimization system. The distributed system works in a load balanced fashion to quickly deliver results in the form of ready-to-predict models, confusion matrices, cross-validation results, and training timings. By automating hyperparameter tuning and model selection, ATM returns the emphasis of the machine learning workflow to its most irreducible part: feature engineering. We demonstrate the usefulness of ATM on 420 datasets from OpenML and train over 3 million classifiers. Our initial results show ATM can beat human-generated solutions for 30% of the datasets, and can do so in 1/100th of the time.
Tasks	AutoML, Feature Engineering, Hyperparameter Optimization, Model Selection
Published	2017-12-11
URL	https://ieeexplore.ieee.org/document/8257923
PDF	https://dai.lids.mit.edu/wp-content/uploads/2018/02/atm_IEEE_BIgData-9-1.pdf
PWC	https://paperswithcode.com/paper/atm-a-distributed-collaborative-scalable
Repo	https://github.com/HDI-Project/ATM
Framework	none
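
The ATM abstract mentions a multi-armed bandit component for choosing among modeling methods but does not say which bandit rule is used. As a rough illustration only, the sketch below uses UCB1, one common rule, to decide which method to try next. The function names, the method names, and the fixed scores are all hypothetical and are not taken from ATM.

```python
import math


def ucb1_select(counts, totals):
    """Pick the next arm (modeling method) using the UCB1 rule."""
    # Try every arm once before using the confidence bound.
    for arm, pulls in counts.items():
        if pulls == 0:
            return arm
    total_pulls = sum(counts.values())

    def upper_bound(arm):
        mean = totals[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        return mean + bonus

    return max(counts, key=upper_bound)


def run_bandit(evaluate, arms, rounds):
    """Spend `rounds` evaluations across `arms`; return pulls per arm."""
    counts = {arm: 0 for arm in arms}
    totals = {arm: 0.0 for arm in arms}
    for _ in range(rounds):
        arm = ucb1_select(counts, totals)
        totals[arm] += evaluate(arm)  # e.g. a cross-validation score in [0, 1]
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    # Made-up, noise-free scores purely for demonstration.
    made_up_scores = {"knn": 0.2, "svm": 0.8, "tree": 0.5}
    pulls = run_bandit(lambda arm: made_up_scores[arm], list(made_up_scores), 500)
    print(pulls)
```

In this toy setting most of the budget ends up on the best-scoring method, while the weaker ones still receive occasional exploratory trials.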

Semi-supervised Structured Prediction with Neural CRF Autoencoder


Title	Semi-supervised Structured Prediction with Neural CRF Autoencoder
Authors	Xiao Zhang, Yong Jiang, Hao Peng, Kewei Tu, Dan Goldwasser
Abstract	In this paper we propose an end-to-end neural CRF autoencoder (NCRF-AE) model for semi-supervised learning of sequential structured prediction problems. Our NCRF-AE consists of two parts: an encoder which is a CRF model enhanced by deep neural networks, and a decoder which is a generative model trying to reconstruct the input. Our model has a unified structure with different loss functions for labeled and unlabeled data with shared parameters. We developed a variation of the EM algorithm for optimizing both the encoder and the decoder simultaneously by decoupling their parameters. Our experimental results over the Part-of-Speech (POS) tagging task on eight different languages show that our model can outperform competitive systems in both supervised and semi-supervised scenarios.
Tasks	Part-Of-Speech Tagging, Structured Prediction
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1179/
PDF	https://www.aclweb.org/anthology/D17-1179
PWC	https://paperswithcode.com/paper/semi-supervised-structured-prediction-with
Repo	https://github.com/cosmozhang/NCRF-AE
Framework	none

Aggregating and Predicting Sequence Labels from Crowd Annotations


Title	Aggregating and Predicting Sequence Labels from Crowd Annotations
Authors	An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease
Abstract	Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.
Tasks	Named Entity Recognition, Part-Of-Speech Tagging
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1028/
PDF	https://www.aclweb.org/anthology/P17-1028
PWC	https://paperswithcode.com/paper/aggregating-and-predicting-sequence-labels
Repo	https://github.com/thanhan/seqcrowd-acl17
Framework	none

Deep Pyramid Convolutional Neural Networks for Text Categorization


Title	Deep Pyramid Convolutional Neural Networks for Text Categorization
Authors	Rie Johnson, Tong Zhang
Abstract	This paper proposes a low-complexity word-level deep convolutional neural network (CNN) architecture for text categorization that can efficiently represent long-range associations in text. In the literature, several deep and complex neural networks have been proposed for this task, assuming availability of relatively large amounts of training data. However, the associated computational complexity increases as the networks go deeper, which poses serious challenges in practical applications. Moreover, it was shown recently that shallow word-level CNNs are more accurate and much faster than the state-of-the-art very deep nets such as character-level CNNs even in the setting of large training data. Motivated by these findings, we carefully studied deepening of word-level CNNs to capture global representations of text, and found a simple network architecture with which the best accuracy can be obtained by increasing the network depth without increasing computational cost by much. We call it deep pyramid CNN. The proposed model with 15 weight layers outperforms the previous best models on six benchmark datasets for sentiment classification and topic categorization.
Tasks	Sentiment Analysis, Text Classification
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1052/
PDF	https://www.aclweb.org/anthology/P17-1052
PWC	https://paperswithcode.com/paper/deep-pyramid-convolutional-neural-networks
Repo	https://github.com/Cheneng/DPCNN
Framework	pytorch

Machine Learning of Accurate Energy-conserving Molecular Force Fields


Title	Machine Learning of Accurate Energy-conserving Molecular Force Fields
Authors	Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof T. Schütt, Klaus-Robert Müller
Abstract	Using conservation of energy, a fundamental property of closed classical and quantum mechanical systems, we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The GDML implementation is able to reproduce global potential energy surfaces of intermediate-sized molecules with an accuracy of 0.3 kcal mol⁻¹ for energies and 1 kcal mol⁻¹ Å⁻¹ for atomic forces using only 1000 conformational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including benzene, toluene, naphthalene, ethanol, uracil, and aspirin. The challenge of constructing conservative force fields is accomplished in our work by learning in a Hilbert space of vector-valued functions that obey the law of energy conservation. The GDML approach enables quantitative molecular dynamics simulations for molecules at a fraction of the cost of explicit AIMD calculations, thereby allowing the construction of efficient force fields with the accuracy and transferability of high-level ab initio methods.
Tasks	MD17 dataset
Published	2017-05-05
URL	https://advances.sciencemag.org/content/3/5/e1603015
PDF	https://advances.sciencemag.org/content/3/5/e1603015/tab-pdf
PWC	https://paperswithcode.com/paper/machine-learning-of-accurate-energy
Repo	https://github.com/stefanch/sGDML
Framework	pytorch

RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos


Title	RPAN: An End-to-End Recurrent Pose-Attention Network for Action Recognition in Videos
Authors	Wenbin Du, Yali Wang, Yu Qiao
Abstract	Recent studies demonstrate the effectiveness of Recurrent Neural Networks (RNNs) for action recognition in videos. However, previous works mainly utilize video-level category as supervision to train RNNs, which may prohibit RNNs to learn complex motion structures along time. In this paper, we propose a recurrent pose-attention network (RPAN) to address this challenge, where we introduce a novel pose-attention mechanism to adaptively learn pose-related features at every time-step action prediction of RNNs. More specifically, we make three main contributions in this paper. Firstly, unlike previous works on pose-related action recognition, our RPAN is an end-to-end recurrent network which can exploit important spatial-temporal evolutions of human pose to assist action recognition in a unified framework. Secondly, instead of learning individual human-joint features separately, our pose-attention mechanism learns robust human-part features by sharing attention parameters partially on the semantically-related human joints. These human-part features are then fed into the human-part pooling layer to construct a highly-discriminative pose-related representation for temporal action modeling. Thirdly, one important byproduct of our RPAN is pose estimation in videos, which can be used for coarse pose annotation in action videos. We evaluate the proposed RPAN quantitatively and qualitatively on two popular benchmarks, i.e., Sub-JHMDB and PennAction. Experimental results show that RPAN outperforms the recent state-of-the-art methods on these challenging datasets.
Tasks	Action Recognition In Videos, Pose Estimation, Skeleton Based Action Recognition
Published	2017-10-22
URL	https://doi.org/10.1109/ICCV.2017.402
PDF	https://doi.org/10.1109/ICCV.2017.402
PWC	https://paperswithcode.com/paper/rpan-an-end-to-end-recurrent-pose-attention-1
Repo	https://github.com/agethen/RPAN
Framework	tf

On-demand Injection of Lexical Knowledge for Recognising Textual Entailment


Title	On-demand Injection of Lexical Knowledge for Recognising Textual Entailment
Authors	Pascual Martínez-Gómez, Koji Mineshima, Yusuke Miyao, Daisuke Bekki
Abstract	We approach the recognition of textual entailment using logical semantic representations and a theorem prover. In this setup, lexical divergences that preserve semantic entailment between the source and target texts need to be explicitly stated. However, recognising subsentential semantic relations is not trivial. We address this problem by monitoring the proof of the theorem and detecting unprovable sub-goals that share predicate arguments with logical premises. If a linguistic relation exists, then an appropriate axiom is constructed on-demand and the theorem proving continues. Experiments show that this approach is effective and precise, producing a system that outperforms other logic-based systems and is competitive with state-of-the-art statistical methods.
Tasks	Automated Theorem Proving, Information Retrieval, Natural Language Inference, Question Answering
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1067/
PDF	https://www.aclweb.org/anthology/E17-1067
PWC	https://paperswithcode.com/paper/on-demand-injection-of-lexical-knowledge-for
Repo	https://github.com/mynlp/ccg2lambda
Framework	none

Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?


Title	Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?
Authors	Shangmin Guo, Xiangrong Zeng, Shizhu He, Kang Liu, Jun Zhao
Abstract	As one of the most important tests in China, Gaokao is designed to be difficult enough to distinguish excellent high school students. In this work, we detail the Gaokao History Multiple Choice Questions (GKHMC) and propose two different approaches to address them using various resources. One approach is based on an entity search technique (IR approach); the other is based on a text entailment approach in which we specifically employ deep neural networks (NN approach). Experiments on our collected real Gaokao questions show that the two approaches are good at different categories of questions: the IR approach performs much better on entity questions (EQs), while the NN approach shows its advantage on sentence questions (SQs). We achieve state-of-the-art performance and show that it's indispensable to apply a hybrid method when participating in real-world tests.
Tasks	Information Retrieval, Question Answering, Reading Comprehension
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1011/
PDF	https://www.aclweb.org/anthology/E17-1011
PWC	https://paperswithcode.com/paper/which-is-the-effective-way-for-gaokao
Repo	https://github.com/IACASNLPIR/GKHMC
Framework	none

Beyond Filters: Compact Feature Map for Portable Deep Model


Title	Beyond Filters: Compact Feature Map for Portable Deep Model
Authors	Yunhe Wang, Chang Xu, Chao Xu, Dacheng Tao
Abstract	Convolutional neural networks (CNNs) have shown extraordinary performance in a number of applications, but they are usually of heavy design for the accuracy reason. Beyond compressing the filters in CNNs, this paper focuses on the redundancy in the feature maps derived from the large number of filters in a layer. We propose to extract intrinsic representation of the feature maps and preserve the discriminability of the features. A circulant matrix is employed to formulate the feature map transformation, which only requires O(d log d) computational complexity to embed a d-dimensional feature map. The filter is then re-configured to establish the mapping from original input to the new compact feature map, and the resulting network can preserve intrinsic information of the original network with significantly fewer parameters, which not only decreases the online memory for launching CNN but also accelerates the computation speed. Experiments on benchmark image datasets demonstrate the superiority of the proposed algorithm over state-of-the-art methods.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=466
PDF	http://proceedings.mlr.press/v70/wang17m/wang17m.pdf
PWC	https://paperswithcode.com/paper/beyond-filters-compact-feature-map-for
Repo	https://github.com/YunheWang/RedCNN
Framework	none
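
The O(d log d) figure in the abstract above is the standard cost of multiplying a circulant matrix by a vector, because that product is a circular convolution and can be computed with the FFT. The sketch below illustrates this general fact only and is not the paper's feature map transformation. It assumes d is a power of two, and all function names are illustrative.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; the inverse is returned unnormalized."""
    n = len(values)
    if n == 0:
        raise ValueError("input must be non-empty")
    if n == 1:
        return [complex(values[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circulant_matvec_fft(first_column, x):
    """Compute C @ x in O(d log d), where C[i][j] = first_column[(i - j) % d]."""
    d = len(first_column)
    if len(x) != d:
        raise ValueError("vector length must match the matrix size")
    spectrum = [a * b for a, b in zip(fft(first_column), fft(x))]
    # Divide by d to normalize the inverse transform.
    return [value.real / d for value in fft(spectrum, inverse=True)]


def circulant_matvec_dense(first_column, x):
    """Reference O(d^2) product, used only to check the fast version."""
    d = len(first_column)
    return [
        sum(first_column[(i - j) % d] * x[j] for j in range(d))
        for i in range(d)
    ]


if __name__ == "__main__":
    column = [1.0, 2.0, 3.0, 4.0]
    vector = [1.0, 0.0, -1.0, 2.0]
    print(circulant_matvec_fft(column, vector))
    print(circulant_matvec_dense(column, vector))
```

Both functions return the same vector up to floating-point error. The fast version also needs only the d entries of the first column rather than the full d × d matrix.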

End-to-End System for Bacteria Habitat Extraction


Title	End-to-End System for Bacteria Habitat Extraction
Authors	Farrokh Mehryary, Kai Hakala, Suwisa Kaewphan, Jari Björne, Tapio Salakoski, Filip Ginter
Abstract	We introduce an end-to-end system capable of named-entity detection, normalization and relation extraction for extracting information about bacteria and their habitats from biomedical literature. Our system is based on deep learning, CRF classifiers and vector space models. We train and evaluate the system on the BioNLP 2016 Shared Task Bacteria Biotope data. The official evaluation shows that the joint performance of our entity detection and relation extraction models outperforms the winning team of the Shared Task by 19pp on F1-score, establishing a new top score for the task. We also achieve state-of-the-art results in the normalization task. Our system is open source and freely available at https://github.com/TurkuNLP/BHE.
Tasks	Named Entity Recognition, Relation Extraction
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2310/
PDF	https://www.aclweb.org/anthology/W17-2310
PWC	https://paperswithcode.com/paper/end-to-end-system-for-bacteria-habitat
Repo	https://github.com/TurkuNLP/BHE
Framework	none

UD Annotatrix: An annotation tool for Universal Dependencies


Title	UD Annotatrix: An annotation tool for Universal Dependencies
Authors	Francis M. Tyers, Mariya Sheyanova, Jonathan North Washington
Abstract
Tasks
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-7604/
PDF	https://www.aclweb.org/anthology/W17-7604
PWC	https://paperswithcode.com/paper/ud-annotatrix-an-annotation-tool-for
Repo	https://github.com/jonorthwash/ud-annotatrix
Framework	none

Deep IV: A Flexible Approach for Counterfactual Prediction


Title	Deep IV: A Flexible Approach for Counterfactual Prediction
Authors	Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy
Abstract	Counterfactual prediction requires understanding causal relationships between so-called treatment and outcome variables. This paper provides a recipe for augmenting deep learning methods to accurately characterize such relationships in the presence of instrument variables (IVs) – sources of treatment randomization that are conditionally independent from the outcomes. Our IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework allows us to take advantage of off-the-shelf supervised learning techniques to estimate causal effects by adapting the loss function. Experiments show that it outperforms existing machine learning approaches.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=883
PDF	http://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf
PWC	https://paperswithcode.com/paper/deep-iv-a-flexible-approach-for
Repo	https://github.com/jhartford/DeepIV
Framework	none
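
The two-stage structure described in the Deep IV abstract has a classic linear counterpart, two-stage least squares (2SLS). The sketch below shows that simpler linear case on simulated data and is not the Deep IV method itself, which replaces both stages with neural networks. The data-generating process, the coefficients, and the function names are assumptions made up for illustration.

```python
import random


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate(n, true_effect=2.0, seed=0):
    """Toy data: a hidden confounder drives both treatment and outcome."""
    rng = random.Random(seed)
    instruments, treatments, outcomes = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # instrument: affects the outcome only through t
        u = rng.gauss(0, 1)  # unobserved confounder
        t = z + u + rng.gauss(0, 0.1)
        y = true_effect * t + 3.0 * u + rng.gauss(0, 0.1)
        instruments.append(z)
        treatments.append(t)
        outcomes.append(y)
    return instruments, treatments, outcomes


def ordinary_least_squares(t, y):
    """Naive slope of y on t; biased here because of the confounder."""
    return covariance(t, y) / covariance(t, t)


def two_stage_least_squares(z, t, y):
    """Stage 1 predicts t from z; stage 2 regresses y on that prediction."""
    slope = covariance(z, t) / covariance(z, z)
    mean_z = sum(z) / len(z)
    mean_t = sum(t) / len(t)
    t_hat = [mean_t + slope * (zi - mean_z) for zi in z]
    return covariance(t_hat, y) / covariance(t_hat, t_hat)


if __name__ == "__main__":
    z, t, y = simulate(20000)
    print("OLS estimate: ", ordinary_least_squares(t, y))
    print("2SLS estimate:", two_stage_least_squares(z, t, y))
```

In this simulation the naive regression overstates the effect because the confounder pushes treatment and outcome in the same direction, while the instrumented estimate should land close to the simulated effect of 2.0.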