July 26, 2019

2696 words 13 mins read

Paper Group NAWR 2

Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. A Factored Neural Network Model for Characterizing Online Discussions in Vector Space. Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus. 3D CNNs on Distance Matrices for Human …

Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction

Title Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction
Authors Christopher Bryant, Mariano Felice, Ted Briscoe
Abstract Until now, error type performance for Grammatical Error Correction (GEC) systems could only be measured in terms of recall because system output is not annotated. To overcome this problem, we introduce ERRANT, a grammatical ERRor ANnotation Toolkit designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework. This not only facilitates error type evaluation at different levels of granularity, but can also be used to reduce annotator workload and standardise existing GEC datasets. Human experts rated the automatic edits as "Good" or "Acceptable" in at least 95% of cases, so we applied ERRANT to the system output of the CoNLL-2014 shared task to carry out a detailed error type analysis for the first time.
Tasks Grammatical Error Correction
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1074/
PDF https://www.aclweb.org/anthology/P17-1074
PWC https://paperswithcode.com/paper/automatic-annotation-and-evaluation-of-error
Repo https://github.com/chrisjbryant/errant
Framework none
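
ERRANT itself pairs a linguistically informed aligner with a rule-based error classifier, but the core idea of extracting edits from a parallel original/corrected sentence pair can be illustrated with a toy difflib sketch (illustrative only, not ERRANT's algorithm):

```python
import difflib

def extract_edits(orig_tokens, corr_tokens):
    # Collect the non-matching spans between the original and corrected
    # token sequences; each span is a candidate edit.
    sm = difflib.SequenceMatcher(a=orig_tokens, b=corr_tokens)
    return [(op, orig_tokens[i1:i2], corr_tokens[j1:j2])
            for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal"]

print(extract_edits("I has a apple".split(), "I have an apple".split()))
# [('replace', ['has', 'a'], ['have', 'an'])]
```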

LightGBM: A Highly Efficient Gradient Boosting Decision Tree

Title LightGBM: A Highly Efficient Gradient Boosting Decision Tree
Authors Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu
Abstract Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, efficiency and scalability remain unsatisfactory when the feature dimension is high and the data size is large. A major reason is that, for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and use only the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain a quite accurate estimate of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., features that rarely take nonzero values simultaneously) to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve a quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much). We call our new GBDT implementation with GOSS and EFB LightGBM. Our experiments on multiple public datasets show that LightGBM speeds up the training process of conventional GBDT by over 20 times while achieving almost the same accuracy.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree
PDF http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf
PWC https://paperswithcode.com/paper/lightgbm-a-highly-efficient-gradient-boosting
Repo https://github.com/Microsoft/LightGBM
Framework none
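
The GOSS idea from the abstract is easy to sketch: keep the instances with the largest gradients, sample the rest, and reweight the sample so the gain estimate stays roughly unbiased. A minimal numpy illustration (not LightGBM's internals; `a` and `b` are the paper's sampling ratios):

```python
import numpy as np

def goss_sample(grads, a=0.2, b=0.1, seed=0):
    # Keep the top a*100% of instances by |gradient|, randomly sample b*100%
    # of the remainder, and reweight the sampled small-gradient instances by
    # (1 - a) / b to compensate for the subsampling.
    rng = np.random.default_rng(seed)
    n = len(grads)
    order = np.argsort(-np.abs(grads))
    top, rest = order[: int(a * n)], order[int(a * n):]
    sampled = rng.choice(rest, size=int(b * n), replace=False)
    idx = np.concatenate([top, sampled])
    weights = np.ones(len(idx))
    weights[len(top):] = (1 - a) / b
    return idx, weights

idx, w = goss_sample(np.random.randn(1000))  # 300 instances instead of 1000
```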

A Factored Neural Network Model for Characterizing Online Discussions in Vector Space

Title A Factored Neural Network Model for Characterizing Online Discussions in Vector Space
Authors Hao Cheng, Hao Fang, Mari Ostendorf
Abstract We develop a novel factored neural model that learns comment embeddings in an unsupervised way, leveraging the structure of distributional context in online discussion forums. The model links different contexts with related language factors in the embedding space, providing a way to interpret the factored embeddings. Evaluated on a community endorsement prediction task using a large collection of topic-varying Reddit discussions, the factored embeddings consistently achieve improvements over other text representations. Qualitative analysis shows that the model captures community style and topic, as well as response trigger patterns.
Tasks Feature Engineering
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1243/
PDF https://www.aclweb.org/anthology/D17-1243
PWC https://paperswithcode.com/paper/a-factored-neural-network-model-for
Repo https://github.com/hao-cheng/factored_neural
Framework tf

Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus

Title Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
Authors Courtney Napoles, Joel Tetreault, Aasish Pappu, Enrica Rosato, Brian Provenzale
Abstract This work presents a dataset and annotation scheme for the new task of identifying "good" conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-0802/
PDF https://www.aclweb.org/anthology/W17-0802
PWC https://paperswithcode.com/paper/finding-good-conversations-online-the-yahoo
Repo https://github.com/cnap/ynacc
Framework none

3D CNNs on Distance Matrices for Human Action Recognition

Title 3D CNNs on Distance Matrices for Human Action Recognition
Authors Alejandro Hernandez Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Francesc Moreno-Noguer
Abstract In this paper we are interested in recognizing human actions from sequences of 3D skeleton data. For this purpose we combine a 3D Convolutional Neural Network with body representations based on Euclidean Distance Matrices (EDMs), which have recently been shown to be very effective at capturing the geometric structure of the human pose. One inherent limitation of EDMs, however, is that they are defined only up to a permutation of the skeleton joints, i.e., randomly shuffling the ordering of the joints yields many different representations. In order to address this issue we introduce a novel architecture that simultaneously, and in an end-to-end manner, learns an optimal transformation of the joints while optimizing the rest of the parameters of the convolutional network. The proposed approach achieves state-of-the-art results on 3 benchmarks, including the recent NTU RGB-D dataset, for which we improve on previous LSTM-based methods by more than 10 percentage points, also surpassing other CNN-based methods while using almost 1000 times fewer parameters.
Tasks Skeleton Based Action Recognition, Temporal Action Localization
Published 2017-10-23
URL https://doi.org/10.1145/3123266.3123299
PDF http://www.iri.upc.edu/files/scidoc/1954-3D-CNNs-on-distance-matrices-for-human-action-recognition.pdf
PWC https://paperswithcode.com/paper/3d-cnns-on-distance-matrices-for-human-action
Repo https://github.com/magnux/DMNN
Framework none
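
A Euclidean Distance Matrix is straightforward to compute from a frame of 3D joints; a small numpy sketch makes the paper's permutation-ambiguity point concrete:

```python
import numpy as np

def edm(joints):
    # joints: (J, 3) array of 3D joint coordinates for one frame.
    diff = joints[:, None, :] - joints[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))   # (J, J) pairwise distances

# Shuffling the joint order permutes the rows and columns of the EDM, which
# is exactly the ambiguity the paper's learned joint transformation resolves.
joints = np.random.rand(25, 3)                 # e.g., 25 NTU RGB-D joints
assert np.allclose(edm(joints), edm(joints).T)
```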

Abnormal event detection on BMTT-PETS 2017 surveillance challenge

Title Abnormal event detection on BMTT-PETS 2017 surveillance challenge
Authors Kothapalli Vignesh, Gaurav Yadav, Amit Sethi
Abstract In this paper, we propose a method to detect abnormal events in human group activities. Our main contribution is a strategy that learns from very few videos by isolating the action and using supervised learning. First, we subtract the background of each frame by modeling each pixel as a mixture of Gaussians (MoG), so that the higher-order learning is applied only to the foreground. Next, features are extracted from each frame using a convolutional neural network (CNN) trained to classify between normal and abnormal frames. These feature vectors are fed into a long short-term memory (LSTM) network to learn the long-term dependencies between frames. The LSTM is also trained to classify abnormal frames while extracting the temporal features of the frames. Finally, we classify frames as abnormal or normal depending on the output of a linear SVM whose input is the features computed by the LSTM.
Tasks Abnormal Event Detection In Video, Anomaly Detection In Surveillance Videos
Published 2017-07-26
URL http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/html/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/papers/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/abnormal-event-detection-on-bmtt-pets-2017
Repo https://github.com/gauraviitg/BMTT-PETS-2017-surveillance-challenge
Framework none
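
The first stage of the pipeline, MoG background subtraction, has a standard OpenCV implementation. A hedged sketch (the video path is hypothetical, and the authors' exact parameters are not given in the abstract):

```python
import cv2

# MOG2 models each pixel as a mixture of Gaussians, as in the paper's
# background-subtraction stage.
cap = cv2.VideoCapture("surveillance.avi")   # hypothetical input video
mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = mog.apply(frame)               # nonzero pixels = foreground
    foreground = cv2.bitwise_and(frame, frame, mask=fg_mask)
    # `foreground` would then be fed to the CNN feature extractor.
cap.release()
```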

Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models

Title Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models
Authors Harsh Jhamtani, Varun Gangal, Eduard Hovy, Eric Nyberg
Abstract Variations in writing style are commonly used to adapt content to a specific context, audience, or purpose. However, applying stylistic variations is still by and large a manual process, and there have been few efforts towards automating it. In this paper we explore automated methods to transform text from modern English to Shakespearean English using an end-to-end trainable neural model with pointers to enable copy actions. To tackle the limited amount of parallel data, we pre-train word embeddings by leveraging external dictionaries mapping Shakespearean words to modern English words, as well as additional text. Our methods achieve a BLEU score of 31+, an improvement of approximately 6 points over the strongest baseline. We publicly release our code to foster further research in this area.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4902/
PDF https://www.aclweb.org/anthology/W17-4902
PWC https://paperswithcode.com/paper/shakespearizing-modern-language-using-copy
Repo https://github.com/harsh19/Shakespearizing-Modern-English
Framework tf
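
A pointer/copy mechanism blends the decoder's generation distribution with a copy distribution induced by attention over source tokens. A schematic numpy sketch of that blending step (the paper's exact formulation may differ):

```python
import numpy as np

def mixed_output_dist(p_gen, vocab_dist, attn, src_ids):
    # With probability p_gen emit from the vocabulary distribution; with
    # probability 1 - p_gen copy a source token, weighted by attention.
    out = p_gen * vocab_dist
    for tok_id, weight in zip(src_ids, attn):
        out[tok_id] += (1.0 - p_gen) * weight
    return out

vocab_dist = np.full(10, 0.1)                       # toy 10-word vocabulary
dist = mixed_output_dist(0.7, vocab_dist, np.array([0.6, 0.4]), [3, 7])
assert abs(dist.sum() - 1.0) < 1e-9                 # still a distribution
```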

Spatiotemporal Multiplier Networks for Video Action Recognition

Title Spatiotemporal Multiplier Networks for Video Action Recognition
Authors Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes
Abstract This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification accuracy. To capture long-term dependencies we inject identity mapping kernels for learning temporal relationships. Our architecture is fully convolutional in spacetime and able to evaluate a video in a single forward pass. Empirical investigation reveals that our model produces state-of-the-art results on two standard action recognition datasets.
Tasks Action Recognition In Videos, Temporal Action Localization
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/spatiotemporal-multiplier-networks-for-video
Repo https://github.com/feichtenhofer/st-resnet
Framework none
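
The paper's central operation is multiplicative gating between the appearance and motion pathways of a residual network. A schematic sketch of one gated residual unit (the paper studies several concrete placements of the gate):

```python
import numpy as np

def gated_residual_unit(appearance, motion, residual_fn):
    # Inject motion information into the appearance pathway by elementwise
    # multiplication, then keep the identity shortcut (schematic only).
    return appearance + residual_fn(appearance * motion)

x_app = np.random.rand(8, 16)   # appearance-stream features
x_mot = np.random.rand(8, 16)   # motion-stream features
out = gated_residual_unit(x_app, x_mot, residual_fn=lambda z: 0.1 * z)
```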

Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things

Title Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
Authors Ashish Kumar, Saurabh Goyal, Manik Varma
Abstract This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board with an 8-bit ATmega328P microcontroller operating at 16 MHz, with no native floating point support, 2 KB RAM and 32 KB read-only flash. Bonsai maintains prediction accuracy while minimizing model size and prediction costs by: (a) developing a tree model which learns a single, shallow, sparse tree with powerful nodes; (b) sparsely projecting all data into a low-dimensional space in which the tree is learnt; and (c) jointly learning all tree and projection parameters. Experimental results on multiple benchmark datasets demonstrate that Bonsai can make predictions in milliseconds even on slow microcontrollers, can fit in KB of memory, and has lower battery consumption than all other algorithms, while achieving prediction accuracies that can be as much as 30% higher than state-of-the-art methods for resource-efficient machine learning. Bonsai is also shown to generalize to other resource-constrained settings beyond IoT by generating significantly better search results than Bing’s L3 ranker when the model size is restricted to 300 bytes. Bonsai’s code can be downloaded from (http://www.manikvarma.org/code/Bonsai/download.html).
Tasks Action Classification
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=696
PDF http://proceedings.mlr.press/v70/kumar17a/kumar17a.pdf
PWC https://paperswithcode.com/paper/resource-efficient-machine-learning-in-2-kb
Repo https://github.com/Microsoft/EdgeML
Framework tf
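
Bonsai's prediction combines a sparse low-dimensional projection with a single shallow tree whose internal nodes also contribute to the score. A simplified numpy sketch of that scorer (see the EdgeML repo for the real implementation; the node functions here follow the paper's description only loosely):

```python
import numpy as np

def bonsai_predict(x, Z, theta, W, V, sigma=1.0, depth=3):
    # Z sparsely projects x into a low-dimensional space; every node k on the
    # root-to-leaf path adds (W_k . z) * tanh(sigma * V_k . z), and branching
    # follows the sign of theta_k . z.
    z = Z @ x
    score, k = 0.0, 0                 # k walks a complete binary tree
    for _ in range(depth):
        score += (W[k] @ z) * np.tanh(sigma * (V[k] @ z))
        k = 2 * k + 1 if theta[k] @ z <= 0 else 2 * k + 2
    return score

# Example shapes for depth 3: Z is (d_proj, d_in); theta, W, V are (7, d_proj).
```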

Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT)

Title Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT)
Authors Lesly Miculicich Werlen, Andrei Popescu-Belis
Abstract In this paper, we define and assess a reference-based metric to evaluate the accuracy of pronoun translation (APT). The metric automatically aligns a candidate and a reference translation using GIZA++ augmented with specific heuristics, and then counts the number of identical or different pronouns, with provision for legitimate variations and omitted pronouns. All counts are then combined into one score. The metric is applied to the results of seven systems (including the baseline) that participated in the DiscoMT 2015 shared task on pronoun translation from English to French. The APT metric reaches around 0.993-0.999 Pearson correlation with human judges (depending on the parameters of APT), while other automatic metrics such as BLEU, METEOR, or those specific to pronouns used at DiscoMT 2015 reach only 0.972-0.986 Pearson correlation.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4802/
PDF https://www.aclweb.org/anthology/W17-4802
PWC https://paperswithcode.com/paper/validation-of-an-automatic-metric-for-the
Repo https://github.com/idiap/APT
Framework none
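
At its core, APT counts identical and different pronouns between an aligned candidate and reference. A toy Python sketch of that counting step (the real metric uses GIZA++ alignments and provisions for legitimate variations and omitted pronouns; the pronoun set below is a made-up example):

```python
# Example French pronoun set for illustration only.
PRONOUNS = {"il", "elle", "ils", "elles", "le", "la", "les"}

def apt_like_score(candidate, reference, alignment):
    # alignment: dict mapping candidate token index -> reference token index.
    identical = different = 0
    for i, tok in enumerate(candidate):
        if tok.lower() in PRONOUNS and i in alignment:
            if tok.lower() == reference[alignment[i]].lower():
                identical += 1
            else:
                different += 1
    total = identical + different
    return identical / total if total else 1.0
```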

DeepCD: Learning Deep Complementary Descriptors for Patch Representations

Title DeepCD: Learning Deep Complementary Descriptors for Patch Representations
Authors Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, Yung-Yu Chuang
Abstract This paper presents the DeepCD framework, which jointly learns a pair of complementary descriptors for a patch using deep learning techniques. This is achieved by taking any descriptor learning architecture for learning a leading descriptor and augmenting it with an additional network stream for learning a complementary descriptor. To enforce the complementary property, a new network layer, called the data-dependent modulation (DDM) layer, is introduced for adaptively training the augmented network stream with emphasis on the training data that are not well handled by the leading stream. By optimizing the proposed joint loss function with late fusion, the obtained descriptors are complementary to each other and their fusion improves performance. Experiments on several problems and datasets show that the proposed method is simple yet effective, outperforming state-of-the-art methods.
Tasks
Published 2017-10-01
URL http://openaccess.thecvf.com/content_iccv_2017/html/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2017/papers/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.pdf
PWC https://paperswithcode.com/paper/deepcd-learning-deep-complementary
Repo https://github.com/shamangary/DeepCD
Framework none
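
The DDM layer's stated goal is to emphasize training data the leading stream handles poorly. As a loose, assumption-laden illustration of that reweighting idea (the actual DDM layer is a learned, data-dependent network component, not this fixed formula):

```python
import numpy as np

def modulation_weights(leading_losses, temperature=1.0):
    # Up-weight training pairs on which the leading descriptor incurs high
    # loss, so the complementary stream focuses on them (illustration only).
    w = np.exp(np.asarray(leading_losses) / temperature)
    return w / w.sum()

print(modulation_weights([0.1, 0.9, 2.0]))  # hardest pair gets most weight
```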

The JAIST Machine Translation Systems for WMT 17

Title The JAIST Machine Translation Systems for WMT 17
Authors Hai-Long Trieu, Trung-Tin Pham, Le-Minh Nguyen
Abstract
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4741/
PDF https://www.aclweb.org/anthology/W17-4741
PWC https://paperswithcode.com/paper/the-jaist-machine-translation-systems-for-wmt
Repo https://github.com/nguyenlab/WMT17-JAIST
Framework none

Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings

Title Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings
Authors Terrence Szymanski
Abstract This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies ("word $w_1$ is to word $w_2$ as word $w_3$ is to word $w_4$") through vector addition. Here, I show that temporal word analogies ("word $w_1$ at time $t_\alpha$ is like word $w_2$ at time $t_\beta$") can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as "Ronald Reagan in 1987 is like Bill Clinton in 1997" or "Walkman in 1987 is like iPod in 2007".
Tasks Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2071/
PDF https://www.aclweb.org/anthology/P17-2071
PWC https://paperswithcode.com/paper/temporal-word-analogies-identifying-lexical
Repo https://github.com/tdszyman/twapy
Framework none
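
The key step is mapping independently trained embedding spaces into a common space. Orthogonal Procrustes over shared anchor words is one standard way to do this (the paper does not necessarily use this exact solver):

```python
import numpy as np

def align_embeddings(W_src, W_tgt):
    # W_src, W_tgt: (n, d) embeddings of the same n anchor words in two time
    # periods. Returns the orthogonal map R minimizing ||W_src @ R - W_tgt||.
    U, _, Vt = np.linalg.svd(W_src.T @ W_tgt)
    return U @ Vt

# After alignment, a 1987 vector can be compared to 1997 vectors directly:
# nearest neighbors of W_src[i] @ R among the rows of W_tgt.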

An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark

Title An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark
Authors Sergio Ramírez-Gallego, Héctor Mouriño-Talín, David Martínez-Rego, Verónica Bolón-Canedo, José Manuel Benítez, Amparo Alonso-Betanzos, Francisco Herrera
Abstract With the advent of extremely high-dimensional datasets, dimensionality reduction techniques are becoming mandatory. Of the many techniques available, feature selection (FS) is of growing interest for its ability to identify both relevant features and frequently repeated instances in huge datasets. We aim to demonstrate that standard FS methods can be parallelized in big data platforms like Apache Spark so as to boost both performance and accuracy. We propose a distributed implementation of a generic FS framework that includes a broad group of well-known information theory-based methods. Experimental results for a broad set of real-world datasets show that our distributed framework is capable of rapidly dealing with ultrahigh-dimensional datasets as well as those with a huge number of samples, outperforming the sequential version in all the cases studied.
Tasks Dimensionality Reduction, Feature Selection
Published 2017-07-06
URL https://ieeexplore.ieee.org/abstract/document/7970198
PDF https://sci2s.ugr.es/sites/default/files/bbvasoftware/publications/07970198.pdf
PWC https://paperswithcode.com/paper/an-information-theory-based-feature-selection
Repo https://github.com/sramirez/spark-infotheoretic-feature-selection
Framework none
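
On a single machine, the information theory-based feature selection that the framework distributes can be sketched with scikit-learn's mutual information scorer (illustrative only; the paper's contribution is the distributed Spark implementation of a whole family of such criteria):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Rank features by mutual information with the class label.
X, y = make_classification(n_samples=1000, n_features=50, n_informative=5,
                           random_state=0)
mi = mutual_info_classif(X, y, random_state=0)
top10 = np.argsort(mi)[::-1][:10]   # indices of the 10 most relevant features
print(top10)
```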

imputeTS: Time Series Missing Value Imputation in R

Title imputeTS: Time Series Missing Value Imputation in R
Authors Steffen Moritz, Thomas Bartz-Beielstein
Abstract The imputeTS package specializes in univariate time series imputation. It offers multiple state-of-the-art imputation algorithm implementations along with plotting functions for time series missing data statistics. While imputation in general is a well-known problem and widely covered by R packages, finding packages able to fill missing values in univariate time series is more complicated. The reason for this is that most imputation algorithms rely on inter-attribute correlations, while univariate time series imputation instead needs to exploit time dependencies. This paper provides an introduction to the imputeTS package and its algorithms and tools, and gives a short overview of univariate time series imputation in R.
Tasks Imputation, Multivariate Time Series Imputation, Time Series
Published 2017-06-01
URL http://doi.org/10.32614/RJ-2017-009
PDF https://journal.r-project.org/archive/2017/RJ-2017-009/RJ-2017-009.pdf
PWC https://paperswithcode.com/paper/imputets-time-series-missing-value-imputation
Repo https://github.com/SteffenMoritz/imputeTS
Framework none
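
imputeTS is an R package, but the core idea, exploiting time dependencies rather than inter-attribute correlations, can be illustrated in a few lines of pandas (roughly analogous to imputeTS's interpolation-based imputation):

```python
import numpy as np
import pandas as pd

# A univariate daily series with a gap; time-aware linear interpolation
# fills it using neighboring observations.
ts = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0],
               index=pd.date_range("2017-01-01", periods=5, freq="D"))
print(ts.interpolate(method="time"))
```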