Paper Group NAWR 2
Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction
Title | Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction |
Authors | Christopher Bryant, Mariano Felice, Ted Briscoe |
Abstract | Until now, error type performance for Grammatical Error Correction (GEC) systems could only be measured in terms of recall because system output is not annotated. To overcome this problem, we introduce ERRANT, a grammatical ERRor ANnotation Toolkit designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework. This not only facilitates error type evaluation at different levels of granularity, but can also be used to reduce annotator workload and standardise existing GEC datasets. Human experts rated the automatic edits as "Good" or "Acceptable" in at least 95% of cases, so we applied ERRANT to the system output of the CoNLL-2014 shared task to carry out a detailed error type analysis for the first time. |
Tasks | Grammatical Error Correction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1074/ |
https://www.aclweb.org/anthology/P17-1074 | |
PWC | https://paperswithcode.com/paper/automatic-annotation-and-evaluation-of-error |
Repo | https://github.com/chrisjbryant/errant |
Framework | none |
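As an illustration, here is a minimal sketch of edit extraction with the released toolkit, assuming the errant package's load/parse/annotate interface and a downloaded spaCy English model; the sentences are toy examples.

```python
# A sketch of automatic edit extraction and classification with ERRANT
# (pip install errant), assuming its load/parse/annotate interface and a
# downloaded spaCy English model; the sentences are toy examples.
import errant

annotator = errant.load("en")                    # spaCy + English classifier

orig = annotator.parse("This are a sentence .")
cor = annotator.parse("This is a sentence .")

# Align the parallel sentences, extract edits, and classify each edit type.
for edit in annotator.annotate(orig, cor):
    print(edit.o_str, "->", edit.c_str, edit.type)   # e.g. are -> is R:VERB:SVA
```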
LightGBM: A Highly Efficient Gradient Boosting Decision Tree
Title | LightGBM: A Highly Efficient Gradient Boosting Decision Tree |
Authors | Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu |
Abstract | Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, the efficiency and scalability are still unsatisfactory when the feature dimension is high and data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain quite accurate estimation of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., they rarely take nonzero values simultaneously), to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much). We call our new GBDT implementation with GOSS and EFB LightGBM. Our experiments on multiple public datasets show that, LightGBM speeds up the training process of conventional GBDT by up to over 20 times while achieving almost the same accuracy. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree |
http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf | |
PWC | https://paperswithcode.com/paper/lightgbm-a-highly-efficient-gradient-boosting |
Repo | https://github.com/Microsoft/LightGBM |
Framework | none |
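A minimal training sketch with the repo's Python package follows; the dataset and parameter values are illustrative, and GOSS can be selected via the boosting parameter.

```python
# A minimal LightGBM training sketch (pip install lightgbm); the data and
# parameter values are illustrative. GOSS and EFB are features of the
# library itself; EFB is applied automatically on sparse data.
import numpy as np
import lightgbm as lgb

X = np.random.rand(1000, 50)                      # 1000 instances, 50 features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)         # toy binary labels

train_data = lgb.Dataset(X, label=y)
params = {
    "objective": "binary",
    "num_leaves": 31,              # leaf-wise tree growth
    "learning_rate": 0.1,
    "boosting": "goss",            # Gradient-based One-Side Sampling
}
booster = lgb.train(params, train_data, num_boost_round=100)
preds = booster.predict(X[:5])                    # predicted probabilities
```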
A Factored Neural Network Model for Characterizing Online Discussions in Vector Space
Title | A Factored Neural Network Model for Characterizing Online Discussions in Vector Space |
Authors | Hao Cheng, Hao Fang, Mari Ostendorf |
Abstract | We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums. The model links different context with related language factors in the embedding space, providing a way to interpret the factored embeddings. Evaluated on a community endorsement prediction task using a large collection of topic-varying Reddit discussions, the factored embeddings consistently achieve improvement over other text representations. Qualitative analysis shows that the model captures community style and topic, as well as response trigger patterns. |
Tasks | Feature Engineering |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1243/ |
https://www.aclweb.org/anthology/D17-1243 | |
PWC | https://paperswithcode.com/paper/a-factored-neural-network-model-for |
Repo | https://github.com/hao-cheng/factored_neural |
Framework | tf |
Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
Title | Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus |
Authors | Courtney Napoles, Joel Tetreault, Aasish Pappu, Enrica Rosato, Brian Provenzale |
Abstract | This work presents a dataset and annotation scheme for the new task of identifying "good" conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-0802/ |
https://www.aclweb.org/anthology/W17-0802 | |
PWC | https://paperswithcode.com/paper/finding-good-conversations-online-the-yahoo |
Repo | https://github.com/cnap/ynacc |
Framework | none |
3D CNNs on Distance Matrices for Human Action Recognition
Title | 3D CNNs on Distance Matrices for Human Action Recognition |
Authors | Alejandro Hernandez Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Francesc Moreno-Noguer |
Abstract | In this paper we are interested in recognizing human actions from sequences of 3D skeleton data. For this purpose we combine a 3D Convolutional Neural Network with body representations based on Euclidean Distance Matrices (EDMs), which have recently been shown to be very effective at capturing the geometric structure of the human pose. One inherent limitation of EDMs, however, is that they are defined only up to a permutation of the skeleton joints, i.e., randomly shuffling the ordering of the joints yields many different representations. In order to address this issue we introduce a novel architecture that simultaneously, and in an end-to-end manner, learns an optimal transformation of the joints while optimizing the rest of the parameters of the convolutional network. The proposed approach achieves state-of-the-art results on 3 benchmarks, including the recent NTU RGB-D dataset, for which we improve on previous LSTM-based methods by more than 10 percentage points, also surpassing other CNN-based methods while using almost 1000 times fewer parameters. |
Tasks | Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2017-10-23 |
URL | https://doi.org/10.1145/3123266.3123299 |
http://www.iri.upc.edu/files/scidoc/1954-3D-CNNs-on-distance-matrices-for-human-action-recognition.pdf | |
PWC | https://paperswithcode.com/paper/3d-cnns-on-distance-matrices-for-human-action |
Repo | https://github.com/magnux/DMNN |
Framework | none |
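The EDM representation the paper builds on is straightforward to compute; here is a numpy sketch with synthetic joints, also showing the joint-permutation ambiguity that the learned transformation addresses.

```python
# Sketch: build a Euclidean Distance Matrix (EDM) per frame from 3D joints.
# Joint coordinates here are synthetic; a real skeleton (e.g. NTU RGB-D)
# provides 25 joints per frame.
import numpy as np

def edm(joints):
    """joints: (J, 3) array of 3D joint positions -> (J, J) distance matrix."""
    diff = joints[:, None, :] - joints[None, :, :]   # pairwise differences
    return np.sqrt((diff ** 2).sum(-1))

frame = np.random.rand(25, 3)
D = edm(frame)

# Shuffling the joint order permutes rows/columns of D: this is the
# ambiguity the paper's learned joint transformation resolves.
perm = np.random.permutation(25)
assert np.allclose(D[perm][:, perm], edm(frame[perm]))
```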
Abnormal event detection on BMTT-PETS 2017 surveillance challenge
Title | Abnormal event detection on BMTT-PETS 2017 surveillance challenge |
Authors | Kothapalli Vignesh, Gaurav Yadav, Amit Sethi |
Abstract | In this paper, we have proposed a method to detect abnormal events for human group activities. Our main contribution is to develop a strategy that learns with very few videos by isolating the action and by using supervised learning. First, we subtract the background of each frame by modeling each pixel as a mixture of Gaussians (MoG) to concentrate the higher-order learning on the foreground alone. Next, features are extracted from each frame using a convolutional neural network (CNN) that is trained to classify between normal and abnormal frames. These feature vectors are fed into a long short-term memory (LSTM) network to learn the long-term dependencies between frames. The LSTM is also trained to classify abnormal frames, while extracting the temporal features of the frames. Finally, we classify the frames as abnormal or normal depending on the output of a linear SVM, whose input is the feature vector computed by the LSTM. |
Tasks | Abnormal Event Detection In Video, Anomaly Detection In Surveillance Videos |
Published | 2017-07-26 |
URL | http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/html/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/papers/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/abnormal-event-detection-on-bmtt-pets-2017 |
Repo | https://github.com/gauraviitg/BMTT-PETS-2017-surveillance-challenge |
Framework | none |
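A sketch of the pipeline's first stage, MoG background subtraction, using OpenCV's MOG2 subtractor; the video path and parameter values are illustrative.

```python
# Sketch of the pipeline's first stage: MoG background subtraction to
# isolate the foreground before CNN feature extraction. Uses OpenCV's
# MOG2 subtractor; the file path and parameters are illustrative.
import cv2

cap = cv2.VideoCapture("surveillance.avi")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                     # per-pixel MoG foreground mask
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    # `foreground` would then be fed to the CNN feature extractor.
cap.release()
```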
Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models
Title | Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models |
Authors | Harsh Jhamtani, Varun Gangal, Eduard Hovy, Eric Nyberg |
Abstract | Variations in writing styles are commonly used to adapt the content to a specific context, audience, or purpose. However, applying stylistic variations is still by and large a manual process, and there have been few efforts toward automating it. In this paper we explore automated methods to transform text from modern English to Shakespearean English using an end-to-end trainable neural model with pointers to enable copy action. To tackle the limited amount of parallel data, we pre-train embeddings of words by leveraging external dictionaries mapping Shakespearean words to modern English words as well as additional text. Our methods are able to get a BLEU score of 31+, an improvement of ≈ 6 points above the strongest baseline. We publicly release our code to foster further research in this area. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4902/ |
https://www.aclweb.org/anthology/W17-4902 | |
PWC | https://paperswithcode.com/paper/shakespearizing-modern-language-using-copy |
Repo | https://github.com/harsh19/Shakespearizing-Modern-English |
Framework | tf |
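The copy mechanism mixes a generation distribution with attention-based copying of source tokens; here is a toy numpy sketch of that mixture (all values are illustrative, not the paper's trained model).

```python
# Toy sketch of the pointer/copy mixture used by copy-enriched seq2seq:
# the final distribution interpolates vocabulary generation with copying
# source tokens via attention. All values here are illustrative.
import numpy as np

vocab = ["thou", "you", "art", "are"]
p_vocab = np.array([0.1, 0.5, 0.1, 0.3])      # decoder's generation distribution
attention = np.array([0.7, 0.3])              # over the source tokens
source_ids = [0, 2]                           # "thou art" as vocab indices
p_gen = 0.4                                   # learned copy/generate switch

p_copy = np.zeros(len(vocab))
for attn, idx in zip(attention, source_ids):
    p_copy[idx] += attn                       # scatter attention mass onto vocab

p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
print(vocab[int(p_final.argmax())])           # copying favors "thou"
```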
Spatiotemporal Multiplier Networks for Video Action Recognition
Title | Spatiotemporal Multiplier Networks for Video Action Recognition |
Authors | Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes |
Abstract | This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification accuracy. To capture long-term dependencies we inject identity mapping kernels for learning temporal relationships. Our architecture is fully convolutional in spacetime and able to evaluate a video in a single forward pass. Empirical investigation reveals that our model produces state-of-the-art results on two standard action recognition datasets. |
Tasks | Action Recognition In Videos, Temporal Action Localization |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/spatiotemporal-multiplier-networks-for-video |
Repo | https://github.com/feichtenhofer/st-resnet |
Framework | none |
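The core idea, multiplicative motion gating inside a residual unit, reduces to a small arithmetic change over additive two-stream fusion; a numpy sketch with stand-in shapes and a placeholder residual function follows.

```python
# Sketch of the multiplicative gating the paper studies: the motion stream
# gates the appearance stream inside a residual unit, instead of being
# added. Shapes and the residual function are illustrative stand-ins.
import numpy as np

def residual_fn(x):
    return np.maximum(0.0, x)              # placeholder for a conv-BN-ReLU stack

appearance = np.random.rand(8, 7, 7)       # (channels, H, W) appearance features
motion = np.random.rand(8, 7, 7)           # matching motion-stream features

additive = appearance + residual_fn(motion)             # plain two-stream fusion
gated = appearance + residual_fn(appearance * motion)   # multiplicative motion gating
```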
Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
Title | Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things |
Authors | Ashish Kumar, Saurabh Goyal, Manik Varma |
Abstract | This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board having an 8 bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2 KB RAM and 32 KB read-only flash. Bonsai maintains prediction accuracy while minimizing model size and prediction costs by: (a) developing a tree model which learns a single, shallow, sparse tree with powerful nodes; (b) sparsely projecting all data into a low-dimensional space in which the tree is learnt; and (c) jointly learning all tree and projection parameters. Experimental results on multiple benchmark datasets demonstrate that Bonsai can make predictions in milliseconds even on slow microcontrollers, can fit in KB of memory, has lower battery consumption than all other algorithms while achieving prediction accuracies that can be as much as 30% higher than state-of-the-art methods for resource-efficient machine learning. Bonsai is also shown to generalize to other resource constrained settings beyond IoT by generating significantly better search results as compared to Bing’s L3 ranker when the model size is restricted to 300 bytes. Bonsai’s code can be downloaded from (http://www.manikvarma.org/code/Bonsai/download.html). |
Tasks | Action Classification |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=696 |
http://proceedings.mlr.press/v70/kumar17a/kumar17a.pdf | |
PWC | https://paperswithcode.com/paper/resource-efficient-machine-learning-in-2-kb |
Repo | https://github.com/Microsoft/EdgeML |
Framework | tf |
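A hedged numpy sketch of Bonsai-style prediction, not the EdgeML implementation: a sparse low-dimensional projection followed by a single shallow tree in which every node on the root-to-leaf path contributes to the score.

```python
# Sketch of Bonsai-style prediction: project the input with a sparse matrix
# Z into a low-dimensional space, then sum the predictions of every node on
# the root-to-leaf path of one shallow tree. Dimensions and parameters are
# illustrative, not the EdgeML implementation.
import numpy as np

d, D, depth = 5, 100, 3                    # projected dim, input dim, tree depth
rng = np.random.default_rng(0)
Z = rng.standard_normal((d, D)) * (rng.random((d, D)) < 0.1)  # sparse projection
n_nodes = 2 ** (depth + 1) - 1
W = rng.standard_normal((n_nodes, d))      # per-node predictors
V = rng.standard_normal((n_nodes, d))      # per-node gating directions
theta = rng.standard_normal((n_nodes, d))  # per-node branching hyperplanes

def predict(x):
    z = Z @ x                              # low-dimensional projection
    node, score = 0, 0.0
    while True:
        score += (W[node] @ z) * np.tanh(V[node] @ z)   # node contribution
        if node >= n_nodes // 2:           # reached a leaf
            return score
        node = 2 * node + (1 if theta[node] @ z > 0 else 2)

print(predict(rng.standard_normal(D)))
```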
Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT)
Title | Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT) |
Authors | Lesly Miculicich Werlen, Andrei Popescu-Belis |
Abstract | In this paper, we define and assess a reference-based metric to evaluate the accuracy of pronoun translation (APT). The metric automatically aligns a candidate and a reference translation using GIZA++ augmented with specific heuristics, and then counts the number of identical or different pronouns, with provision for legitimate variations and omitted pronouns. All counts are then combined into one score. The metric is applied to the results of seven systems (including the baseline) that participated in the DiscoMT 2015 shared task on pronoun translation from English to French. The APT metric reaches around 0.993-0.999 Pearson correlation with human judges (depending on the parameters of APT), while other automatic metrics such as BLEU, METEOR, or those specific to pronouns used at DiscoMT 2015 reach only 0.972-0.986 Pearson correlation. |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4802/ |
https://www.aclweb.org/anthology/W17-4802 | |
PWC | https://paperswithcode.com/paper/validation-of-an-automatic-metric-for-the |
Repo | https://github.com/idiap/APT |
Framework | none |
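A toy sketch of the counting step, assuming a candidate-reference word alignment is already given; APT itself obtains the alignment from GIZA++ with extra heuristics and additionally handles legitimate variations and omitted pronouns.

```python
# Toy sketch of the APT idea: given a word alignment between candidate and
# reference, count pronouns translated identically vs. differently. The
# sentences and alignment are illustrative.
PRONOUNS = {"il", "elle", "ils", "elles"}

candidate = ["il", "mange", "la", "pomme"]
reference = ["elle", "mange", "la", "pomme"]
alignment = {0: 0, 1: 1, 2: 2, 3: 3}      # candidate index -> reference index

identical = different = 0
for c_idx, r_idx in alignment.items():
    if candidate[c_idx] in PRONOUNS or reference[r_idx] in PRONOUNS:
        if candidate[c_idx] == reference[r_idx]:
            identical += 1
        else:
            different += 1
print(identical, different)                # here: 0 identical, 1 different
```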
DeepCD: Learning Deep Complementary Descriptors for Patch Representations
Title | DeepCD: Learning Deep Complementary Descriptors for Patch Representations |
Authors | Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, Yung-Yu Chuang |
Abstract | This paper presents the DeepCD framework which learns a pair of complementary descriptors jointly for a patch by employing deep learning techniques. It can be achieved by taking any descriptor learning architecture for learning a leading descriptor and augmenting the architecture with an additional network stream for learning a complementary descriptor. To enforce the complementary property, a new network layer, called data-dependent modulation (DDM) layer, is introduced for adaptively learning the augmented network stream with the emphasis on the training data that are not well handled by the leading stream. By optimizing the proposed joint loss function with late fusion, the obtained descriptors are complementary to each other and their fusion improves performance. Experiments on several problems and datasets show that the proposed method is simple yet effective, outperforming state-of-the-art methods. |
Tasks | |
Published | 2017-10-01 |
URL | http://openaccess.thecvf.com/content_iccv_2017/html/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.html |
http://openaccess.thecvf.com/content_ICCV_2017/papers/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/deepcd-learning-deep-complementary |
Repo | https://github.com/shamangary/DeepCD |
Framework | none |
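A loose numpy sketch of the DDM intuition, upweighting the complementary stream's loss on samples the leading descriptor handles poorly; the weighting scheme below is an assumption for illustration, not the paper's layer.

```python
# Hedged sketch of the data-dependent modulation (DDM) idea: emphasize the
# complementary stream's training on samples where the leading descriptor
# does badly. Per-sample losses are toy values, and this softmax weighting
# is an illustrative assumption, not the paper's formulation.
import numpy as np

leading_loss = np.array([0.1, 0.9, 0.4])       # per-sample leading-stream loss
weights = np.exp(leading_loss) / np.exp(leading_loss).sum()  # hard samples weigh more
complementary_loss = np.array([0.3, 0.2, 0.5])
joint_loss = leading_loss.mean() + (weights * complementary_loss).sum()
```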
The JAIST Machine Translation Systems for WMT 17
Title | The JAIST Machine Translation Systems for WMT 17 |
Authors | Hai-Long Trieu, Trung-Tin Pham, Le-Minh Nguyen |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4741/ |
https://www.aclweb.org/anthology/W17-4741 | |
PWC | https://paperswithcode.com/paper/the-jaist-machine-translation-systems-for-wmt |
Repo | https://github.com/nguyenlab/WMT17-JAIST |
Framework | none |
Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings
Title | Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings |
Authors | Terrence Szymanski |
Abstract | This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies ("word $w_1$ is to word $w_2$ as word $w_3$ is to word $w_4$") through vector addition. Here, I show that temporal word analogies ("word $w_1$ at time $t_\alpha$ is like word $w_2$ at time $t_\beta$") can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as "Ronald Reagan in 1987 is like Bill Clinton in 1997", or "Walkman in 1987 is like iPod in 2007". |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2071/ |
https://www.aclweb.org/anthology/P17-2071 | |
PWC | https://paperswithcode.com/paper/temporal-word-analogies-identifying-lexical |
Repo | https://github.com/tdszyman/twapy |
Framework | none |
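One common way to transform independent embedding spaces into a common vector space is orthogonal Procrustes; a numpy sketch follows (the paper's exact transformation may differ, and the embedding matrices here are random stand-ins).

```python
# Sketch: map embeddings trained on one time period into another period's
# space so nearest neighbors become temporal analogies. Orthogonal
# Procrustes via SVD is one standard choice; the paper's exact transform
# may differ, and these matrices are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 100))   # vocab x dim, time t_alpha
B = rng.standard_normal((5000, 100))   # same vocab rows, time t_beta

# Solve min_W ||A W - B||_F subject to W orthogonal.
U, _, Vt = np.linalg.svd(A.T @ B)
W = U @ Vt

def analogy(vec_alpha, target_space=B):
    """Index of the nearest t_beta word for a t_alpha vector."""
    mapped = vec_alpha @ W
    sims = target_space @ mapped / (
        np.linalg.norm(target_space, axis=1) * np.linalg.norm(mapped))
    return int(sims.argmax())
```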
An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark
Title | An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark |
Authors | Sergio Ramírez-Gallego, Héctor Mouriño-Talín, David Martínez-Rego, Verónica Bolón-Canedo, José Manuel Benítez, Amparo Alonso-Betanzos, Francisco Herrera |
Abstract | With the advent of extremely high dimensional datasets, dimensionality reduction techniques are becoming mandatory. Of the many techniques available, feature selection (FS) is of growing interest for its ability to identify both relevant features and frequently repeated instances in huge datasets. We aim to demonstrate that standard FS methods can be parallelized in big data platforms like Apache Spark so as to boost both performance and accuracy. We propose a distributed implementation of a generic FS framework that includes a broad group of well-known information theory-based methods. Experimental results for a broad set of real-world datasets show that our distributed framework is capable of rapidly dealing with ultrahigh-dimensional datasets as well as those with a huge number of samples, outperforming the sequential version in all the cases studied. |
Tasks | Dimensionality Reduction, Feature Selection |
Published | 2017-07-06 |
URL | https://ieeexplore.ieee.org/abstract/document/7970198 |
https://sci2s.ugr.es/sites/default/files/bbvasoftware/publications/07970198.pdf | |
PWC | https://paperswithcode.com/paper/an-information-theory-based-feature-selection |
Repo | https://github.com/sramirez/spark-infotheoretic-feature-selection |
Framework | none |
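A single-machine analogue of the information-theoretic criterion using scikit-learn is sketched below; the paper's contribution is distributing such criteria over Apache Spark.

```python
# Single-machine analogue of information theory-based feature selection:
# rank features by mutual information with the label via scikit-learn.
# The paper distributes criteria of this family over Apache Spark.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((1000, 200))                        # 1000 samples, 200 features
y = (X[:, 3] + 0.1 * rng.random(1000) > 0.5).astype(int)  # label depends on feature 3

mi = mutual_info_classif(X, y, random_state=0)
top10 = np.argsort(mi)[::-1][:10]                  # 10 most informative features
print(top10)                                       # feature 3 should rank highly
```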
imputeTS: Time Series Missing Value Imputation in R
Title | imputeTS: Time Series Missing Value Imputation in R |
Authors | Steffen Moritz, Thomas Bartz-Beielstein |
Abstract | The imputeTS package specializes in univariate time series imputation. It offers multiple state-of-the-art imputation algorithm implementations along with plotting functions for time series missing data statistics. While imputation in general is a well-known problem and widely covered by R packages, finding packages able to fill missing values in univariate time series is more complicated. The reason for this lies in the fact that most imputation algorithms rely on inter-attribute correlations, while univariate time series imputation instead needs to employ time dependencies. This paper provides an introduction to the imputeTS package and its provided algorithms and tools. Furthermore, it gives a short overview of univariate time series imputation in R. |
Tasks | Imputation, Multivariate Time Series Imputation, Time Series |
Published | 2017-06-01 |
URL | http://doi.org/10.32614/RJ-2017-009 |
https://journal.r-project.org/archive/2017/RJ-2017-009/RJ-2017-009.pdf | |
PWC | https://paperswithcode.com/paper/imputets-time-series-missing-value-imputation |
Repo | https://github.com/SteffenMoritz/imputeTS |
Framework | none |
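The package itself is R; as a Python analogue of imputation that exploits time dependencies rather than inter-attribute correlations, one can interpolate over the time index with pandas (comparable in spirit to imputeTS's interpolation-based imputation, not its API).

```python
# Python analogue of univariate time series imputation via time dependency:
# interpolate missing values over a datetime index with pandas. This mirrors
# the spirit of imputeTS's interpolation imputation, not its R API.
import numpy as np
import pandas as pd

idx = pd.date_range("2017-01-01", periods=8, freq="D")
series = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0, 6.0, np.nan, 8.0], index=idx)

filled = series.interpolate(method="time")   # linear in time across the gaps
print(filled)
```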