Paper Group NAWR 2
Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction
Title | Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction |
Authors | Christopher Bryant, Mariano Felice, Ted Briscoe |
Abstract | Until now, error type performance for Grammatical Error Correction (GEC) systems could only be measured in terms of recall because system output is not annotated. To overcome this problem, we introduce ERRANT, a grammatical ERRor ANnotation Toolkit designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework. This not only facilitates error type evaluation at different levels of granularity, but can also be used to reduce annotator workload and standardise existing GEC datasets. Human experts rated the automatic edits as "Good" or "Acceptable" in at least 95% of cases, so we applied ERRANT to the system output of the CoNLL-2014 shared task to carry out a detailed error type analysis for the first time. |
Tasks | Grammatical Error Correction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1074/ |
https://www.aclweb.org/anthology/P17-1074 | |
PWC | https://paperswithcode.com/paper/automatic-annotation-and-evaluation-of-error |
Repo | https://github.com/chrisjbryant/errant |
Framework | none |
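As an illustration, here is a minimal sketch of edit extraction with the released toolkit, assuming the errant package's load/parse/annotate interface and a downloaded spaCy English model; the sentences are toy examples.

```python
# A sketch of automatic edit extraction and classification with ERRANT
# (pip install errant), assuming its load/parse/annotate interface and a
# downloaded spaCy English model; the sentences are toy examples.
import errant

annotator = errant.load("en")                    # spaCy + English classifier

orig = annotator.parse("This are a sentence .")
cor = annotator.parse("This is a sentence .")

# Align the parallel sentences, extract edits, and classify each edit type.
for edit in annotator.annotate(orig, cor):
    print(edit.o_str, "->", edit.c_str, edit.type)   # e.g. are -> is R:VERB:SVA
```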
LightGBM: A Highly Efficient Gradient Boosting Decision Tree
Title | LightGBM: A Highly Efficient Gradient Boosting Decision Tree |
Authors | Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu |
Abstract | Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, the efficiency and scalability are still unsatisfactory when the feature dimension is high and data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain quite accurate estimation of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., they rarely take nonzero values simultaneously), to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much). We call our new GBDT implementation with GOSS and EFB LightGBM. Our experiments on multiple public datasets show that, LightGBM speeds up the training process of conventional GBDT by up to over 20 times while achieving almost the same accuracy. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree |
http://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf | |
PWC | https://paperswithcode.com/paper/lightgbm-a-highly-efficient-gradient-boosting |
Repo | https://github.com/Microsoft/LightGBM |
Framework | none |
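A minimal training sketch with the repo's Python package follows; the dataset and parameter values are illustrative, and GOSS can be selected via the boosting parameter.

```python
# A minimal LightGBM training sketch (pip install lightgbm); the data and
# parameter values are illustrative. GOSS and EFB are features of the
# library itself; EFB is applied automatically on sparse data.
import numpy as np
import lightgbm as lgb

X = np.random.rand(1000, 50)                      # 1000 instances, 50 features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)         # toy binary labels

train_data = lgb.Dataset(X, label=y)
params = {
    "objective": "binary",
    "num_leaves": 31,              # leaf-wise tree growth
    "learning_rate": 0.1,
    "boosting": "goss",            # Gradient-based One-Side Sampling
}
booster = lgb.train(params, train_data, num_boost_round=100)
preds = booster.predict(X[:5])                    # predicted probabilities
```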
A Factored Neural Network Model for Characterizing Online Discussions in Vector Space
Title | A Factored Neural Network Model for Characterizing Online Discussions in Vector Space |
Authors | Hao Cheng, Hao Fang, Mari Ostendorf |
Abstract | We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums. The model links different context with related language factors in the embedding space, providing a way to interpret the factored embeddings. Evaluated on a community endorsement prediction task using a large collection of topic-varying Reddit discussions, the factored embeddings consistently achieve improvement over other text representations. Qualitative analysis shows that the model captures community style and topic, as well as response trigger patterns. |
Tasks | Feature Engineering |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1243/ |
https://www.aclweb.org/anthology/D17-1243 | |
PWC | https://paperswithcode.com/paper/a-factored-neural-network-model-for |
Repo | https://github.com/hao-cheng/factored_neural |
Framework | tf |
Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
Title | Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus |
Authors | Courtney Napoles, Joel Tetreault, Aasish Pappu, Enrica Rosato, Brian Provenzale |
Abstract | This work presents a dataset and annotation scheme for the new task of identifying "good" conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-0802/ |
https://www.aclweb.org/anthology/W17-0802 | |
PWC | https://paperswithcode.com/paper/finding-good-conversations-online-the-yahoo |
Repo | https://github.com/cnap/ynacc |
Framework | none |
3D CNNs on Distance Matrices for Human Action Recognition
Title | 3D CNNs on Distance Matrices for Human Action Recognition |
Authors | Alejandro Hernandez Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Francesc Moreno-Noguer |
Abstract | In this paper we are interested in recognizing human actions from sequences of 3D skeleton data. For this purpose we combine a 3D Convolutional Neural Network with body representations based on Euclidean Distance Matrices (EDMs), which have recently been shown to be very effective at capturing the geometric structure of the human pose. One inherent limitation of EDMs, however, is that they are defined only up to a permutation of the skeleton joints, i.e., randomly shuffling the ordering of the joints yields many different representations. In order to address this issue we introduce a novel architecture that simultaneously, and in an end-to-end manner, learns an optimal transformation of the joints while optimizing the rest of the parameters of the convolutional network. The proposed approach achieves state-of-the-art results on 3 benchmarks, including the recent NTU RGB-D dataset, for which we improve on previous LSTM-based methods by more than 10 percentage points, also surpassing other CNN-based methods while using almost 1000 times fewer parameters. |
Tasks | Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2017-10-23 |
URL | https://doi.org/10.1145/3123266.3123299 |
http://www.iri.upc.edu/files/scidoc/1954-3D-CNNs-on-distance-matrices-for-human-action-recognition.pdf | |
PWC | https://paperswithcode.com/paper/3d-cnns-on-distance-matrices-for-human-action |
Repo | https://github.com/magnux/DMNN |
Framework | none |
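The EDM representation the paper builds on is straightforward to compute; here is a numpy sketch with synthetic joints, also showing the joint-permutation ambiguity that the learned transformation addresses.

```python
# Sketch: build a Euclidean Distance Matrix (EDM) per frame from 3D joints.
# Joint coordinates here are synthetic; a real skeleton (e.g. NTU RGB-D)
# provides 25 joints per frame.
import numpy as np

def edm(joints):
    """joints: (J, 3) array of 3D joint positions -> (J, J) distance matrix."""
    diff = joints[:, None, :] - joints[None, :, :]   # pairwise differences
    return np.sqrt((diff ** 2).sum(-1))

frame = np.random.rand(25, 3)
D = edm(frame)

# Shuffling the joint order permutes rows/columns of D: this is the
# ambiguity the paper's learned joint transformation resolves.
perm = np.random.permutation(25)
assert np.allclose(D[perm][:, perm], edm(frame[perm]))
```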
Abnormal event detection on BMTT-PETS 2017 surveillance challenge
Title | Abnormal event detection on BMTT-PETS 2017 surveillance challenge |
Authors | Kothapalli Vignesh, Gaurav Yadav, Amit Sethi |
Abstract | In this paper, we have proposed a method to detect abnormal events for human group activities. Our main contribution is to develop a strategy that learns with very few videos by isolating the action and by using supervised learning. First, we subtract the background of each frame by modeling each pixel as a mixture of Gaussians (MoG) to concentrate the higher-order learning on the foreground alone. Next, features are extracted from each frame using a convolutional neural network (CNN) that is trained to classify between normal and abnormal frames. These feature vectors are fed into a long short-term memory (LSTM) network to learn the long-term dependencies between frames. The LSTM is also trained to classify abnormal frames, while extracting the temporal features of the frames. Finally, we classify the frames as abnormal or normal depending on the output of a linear SVM, whose input is the feature vector computed by the LSTM. |
Tasks | Abnormal Event Detection In Video, Anomaly Detection In Surveillance Videos |
Published | 2017-07-26 |
URL | http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/html/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017_workshops/w34/papers/Vignesh_Abnormal_Event_Detection_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/abnormal-event-detection-on-bmtt-pets-2017 |
Repo | https://github.com/gauraviitg/BMTT-PETS-2017-surveillance-challenge |
Framework | none |
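A sketch of the pipeline's first stage, MoG background subtraction, using OpenCV's MOG2 subtractor; the video path and parameter values are illustrative.

```python
# Sketch of the pipeline's first stage: MoG background subtraction to
# isolate the foreground before CNN feature extraction. Uses OpenCV's
# MOG2 subtractor; the file path and parameters are illustrative.
import cv2

cap = cv2.VideoCapture("surveillance.avi")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                     # per-pixel MoG foreground mask
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    # `foreground` would then be fed to the CNN feature extractor.
cap.release()
```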
Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models
Title | Shakespearizing Modern Language Using Copy-Enriched Sequence to Sequence Models |
Authors | Harsh Jhamtani, Varun Gangal, Eduard Hovy, Eric Nyberg |
Abstract | Variations in writing styles are commonly used to adapt the content to a specific context, audience, or purpose. However, applying stylistic variations is still by and large a manual process, and there have been few efforts toward automating it. In this paper we explore automated methods to transform text from modern English to Shakespearean English using an end-to-end trainable neural model with pointers to enable copy action. To tackle the limited amount of parallel data, we pre-train embeddings of words by leveraging external dictionaries mapping Shakespearean words to modern English words as well as additional text. Our methods are able to get a BLEU score of 31+, an improvement of ≈ 6 points above the strongest baseline. We publicly release our code to foster further research in this area. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4902/ |
https://www.aclweb.org/anthology/W17-4902 | |
PWC | https://paperswithcode.com/paper/shakespearizing-modern-language-using-copy |
Repo | https://github.com/harsh19/Shakespearizing-Modern-English |
Framework | tf |
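The copy mechanism mixes a generation distribution with attention-based copying of source tokens; here is a toy numpy sketch of that mixture (all values are illustrative, not the paper's trained model).

```python
# Toy sketch of the pointer/copy mixture used by copy-enriched seq2seq:
# the final distribution interpolates vocabulary generation with copying
# source tokens via attention. All values here are illustrative.
import numpy as np

vocab = ["thou", "you", "art", "are"]
p_vocab = np.array([0.1, 0.5, 0.1, 0.3])      # decoder's generation distribution
attention = np.array([0.7, 0.3])              # over the source tokens
source_ids = [0, 2]                           # "thou art" as vocab indices
p_gen = 0.4                                   # learned copy/generate switch

p_copy = np.zeros(len(vocab))
for attn, idx in zip(attention, source_ids):
    p_copy[idx] += attn                       # scatter attention mass onto vocab

p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
print(vocab[int(p_final.argmax())])           # copying favors "thou"
```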
Spatiotemporal Multiplier Networks for Video Action Recognition
Title | Spatiotemporal Multiplier Networks for Video Action Recognition |
Authors | Christoph Feichtenhofer, Axel Pinz, Richard P. Wildes |
Abstract | This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification accuracy. To capture long-term dependencies we inject identity mapping kernels for learning temporal relationships. Our architecture is fully convolutional in spacetime and able to evaluate a video in a single forward pass. Empirical investigation reveals that our model produces state-of-the-art results on two standard action recognition datasets. |
Tasks | Action Recognition In Videos, Temporal Action Localization |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Feichtenhofer_Spatiotemporal_Multiplier_Networks_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/spatiotemporal-multiplier-networks-for-video |
Repo | https://github.com/feichtenhofer/st-resnet |
Framework | none |
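The core idea, multiplicative motion gating inside a residual unit, reduces to a small arithmetic change over additive two-stream fusion; a numpy sketch with stand-in shapes and a placeholder residual function follows.

```python
# Sketch of the multiplicative gating the paper studies: the motion stream
# gates the appearance stream inside a residual unit, instead of being
# added. Shapes and the residual function are illustrative stand-ins.
import numpy as np

def residual_fn(x):
    return np.maximum(0.0, x)              # placeholder for a conv-BN-ReLU stack

appearance = np.random.rand(8, 7, 7)       # (channels, H, W) appearance features
motion = np.random.rand(8, 7, 7)           # matching motion-stream features

additive = appearance + residual_fn(motion)             # plain two-stream fusion
gated = appearance + residual_fn(appearance * motion)   # multiplicative motion gating
```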
Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
Title | Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things |
Authors | Ashish Kumar, Saurabh Goyal, Manik Varma |
Abstract | This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board having an 8 bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2 KB RAM and 32 KB read-only flash. Bonsai maintains prediction accuracy while minimizing model size and prediction costs by: (a) developing a tree model which learns a single, shallow, sparse tree with powerful nodes; (b) sparsely projecting all data into a low-dimensional space in which the tree is learnt; and (c) jointly learning all tree and projection parameters. Experimental results on multiple benchmark datasets demonstrate that Bonsai can make predictions in milliseconds even on slow microcontrollers, can fit in KB of memory, has lower battery consumption than all other algorithms while achieving prediction accuracies that can be as much as 30% higher than state-of-the-art methods for resource-efficient machine learning. Bonsai is also shown to generalize to other resource constrained settings beyond IoT by generating significantly better search results as compared to Bing’s L3 ranker when the model size is restricted to 300 bytes. Bonsai’s code can be downloaded from (http://www.manikvarma.org/code/Bonsai/download.html). |
Tasks | Action Classification |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=696 |
http://proceedings.mlr.press/v70/kumar17a/kumar17a.pdf | |
PWC | https://paperswithcode.com/paper/resource-efficient-machine-learning-in-2-kb |
Repo | https://github.com/Microsoft/EdgeML |
Framework | tf |
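A hedged numpy sketch of Bonsai-style prediction, not the EdgeML implementation: a sparse low-dimensional projection followed by a single shallow tree in which every node on the root-to-leaf path contributes to the score.

```python
# Sketch of Bonsai-style prediction: project the input with a sparse matrix
# Z into a low-dimensional space, then sum the predictions of every node on
# the root-to-leaf path of one shallow tree. Dimensions and parameters are
# illustrative, not the EdgeML implementation.
import numpy as np

d, D, depth = 5, 100, 3                    # projected dim, input dim, tree depth
rng = np.random.default_rng(0)
Z = rng.standard_normal((d, D)) * (rng.random((d, D)) < 0.1)  # sparse projection
n_nodes = 2 ** (depth + 1) - 1
W = rng.standard_normal((n_nodes, d))      # per-node predictors
V = rng.standard_normal((n_nodes, d))      # per-node gating directions
theta = rng.standard_normal((n_nodes, d))  # per-node branching hyperplanes

def predict(x):
    z = Z @ x                              # low-dimensional projection
    node, score = 0, 0.0
    while True:
        score += (W[node] @ z) * np.tanh(V[node] @ z)   # node contribution
        if node >= n_nodes // 2:           # reached a leaf
            return score
        node = 2 * node + (1 if theta[node] @ z > 0 else 2)

print(predict(rng.standard_normal(D)))
```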
Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT)
Title | Validation of an Automatic Metric for the Accuracy of Pronoun Translation (APT) |
Authors | Lesly Miculicich Werlen, Andrei Popescu-Belis |
Abstract | In this paper, we define and assess a reference-based metric to evaluate the accuracy of pronoun translation (APT). The metric automatically aligns a candidate and a reference translation using GIZA++ augmented with specific heuristics, and then counts the number of identical or different pronouns, with provision for legitimate variations and omitted pronouns. All counts are then combined into one score. The metric is applied to the results of seven systems (including the baseline) that participated in the DiscoMT 2015 shared task on pronoun translation from English to French. The APT metric reaches around 0.993-0.999 Pearson correlation with human judges (depending on the parameters of APT), while other automatic metrics such as BLEU, METEOR, or those specific to pronouns used at DiscoMT 2015 reach only 0.972-0.986 Pearson correlation. |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4802/ |
https://www.aclweb.org/anthology/W17-4802 | |
PWC | https://paperswithcode.com/paper/validation-of-an-automatic-metric-for-the |
Repo | https://github.com/idiap/APT |
Framework | none |
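A toy sketch of the counting step, assuming a candidate-reference word alignment is already given; APT itself obtains the alignment from GIZA++ with extra heuristics and additionally handles legitimate variations and omitted pronouns.

```python
# Toy sketch of the APT idea: given a word alignment between candidate and
# reference, count pronouns translated identically vs. differently. The
# sentences and alignment are illustrative.
PRONOUNS = {"il", "elle", "ils", "elles"}

candidate = ["il", "mange", "la", "pomme"]
reference = ["elle", "mange", "la", "pomme"]
alignment = {0: 0, 1: 1, 2: 2, 3: 3}      # candidate index -> reference index

identical = different = 0
for c_idx, r_idx in alignment.items():
    if candidate[c_idx] in PRONOUNS or reference[r_idx] in PRONOUNS:
        if candidate[c_idx] == reference[r_idx]:
            identical += 1
        else:
            different += 1
print(identical, different)                # here: 0 identical, 1 different
```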
DeepCD: Learning Deep Complementary Descriptors for Patch Representations
Title | DeepCD: Learning Deep Complementary Descriptors for Patch Representations |
Authors | Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, Yung-Yu Chuang |
Abstract | This paper presents the DeepCD framework which learns a pair of complementary descriptors jointly for a patch by employing deep learning techniques. It can be achieved by taking any descriptor learning architecture for learning a leading descriptor and augmenting the architecture with an additional network stream for learning a complementary descriptor. To enforce the complementary property, a new network layer, called data-dependent modulation (DDM) layer, is introduced for adaptively learning the augmented network stream with the emphasis on the training data that are not well handled by the leading stream. By optimizing the proposed joint loss function with late fusion, the obtained descriptors are complementary to each other and their fusion improves performance. Experiments on several problems and datasets show that the proposed method is simple yet effective, outperforming state-of-the-art methods. |
Tasks | |
Published | 2017-10-01 |
URL | http://openaccess.thecvf.com/content_iccv_2017/html/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.html |
http://openaccess.thecvf.com/content_ICCV_2017/papers/Yang_DeepCD_Learning_Deep_ICCV_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/deepcd-learning-deep-complementary |
Repo | https://github.com/shamangary/DeepCD |
Framework | none |
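A loose numpy sketch of the DDM intuition, upweighting the complementary stream's loss on samples the leading descriptor handles poorly; the weighting scheme below is an assumption for illustration, not the paper's layer.

```python
# Hedged sketch of the data-dependent modulation (DDM) idea: emphasize the
# complementary stream's training on samples where the leading descriptor
# does badly. Per-sample losses are toy values, and this softmax weighting
# is an illustrative assumption, not the paper's formulation.
import numpy as np

leading_loss = np.array([0.1, 0.9, 0.4])       # per-sample leading-stream loss
weights = np.exp(leading_loss) / np.exp(leading_loss).sum()  # hard samples weigh more
complementary_loss = np.array([0.3, 0.2, 0.5])
joint_loss = leading_loss.mean() + (weights * complementary_loss).sum()
```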
The JAIST Machine Translation Systems for WMT 17
Title | The JAIST Machine Translation Systems for WMT 17 |
Authors | Hai-Long Trieu, Trung-Tin Pham, Le-Minh Nguyen |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4741/ |
https://www.aclweb.org/anthology/W17-4741 | |
PWC | https://paperswithcode.com/paper/the-jaist-machine-translation-systems-for-wmt |
Repo | https://github.com/nguyenlab/WMT17-JAIST |
Framework | none |
Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings
Title | Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings |
Authors | Terrence Szymanski |
Abstract | This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies ("word $w_1$ is to word $w_2$ as word $w_3$ is to word $w_4$") through vector addition. Here, I show that temporal word analogies ("word $w_1$ at time $t_\alpha$ is like word $w_2$ at time $t_\beta$") can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as "Ronald Reagan in 1987 is like Bill Clinton in 1997", or "Walkman in 1987 is like iPod in 2007". |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2071/ |
https://www.aclweb.org/anthology/P17-2071 | |
PWC | https://paperswithcode.com/paper/temporal-word-analogies-identifying-lexical |
Repo | https://github.com/tdszyman/twapy |
Framework | none |
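One common way to transform independent embedding spaces into a common vector space is orthogonal Procrustes; a numpy sketch follows (the paper's exact transformation may differ, and the embedding matrices here are random stand-ins).

```python
# Sketch: map embeddings trained on one time period into another period's
# space so nearest neighbors become temporal analogies. Orthogonal
# Procrustes via SVD is one standard choice; the paper's exact transform
# may differ, and these matrices are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5000, 100))   # vocab x dim, time t_alpha
B = rng.standard_normal((5000, 100))   # same vocab rows, time t_beta

# Solve min_W ||A W - B||_F subject to W orthogonal.
U, _, Vt = np.linalg.svd(A.T @ B)
W = U @ Vt

def analogy(vec_alpha, target_space=B):
    """Index of the nearest t_beta word for a t_alpha vector."""
    mapped = vec_alpha @ W
    sims = target_space @ mapped / (
        np.linalg.norm(target_space, axis=1) * np.linalg.norm(mapped))
    return int(sims.argmax())
```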
An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark
Title | An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark |
Authors | Sergio Ramírez-Gallego, Héctor Mouriño-Talín, David Martínez-Rego, Verónica Bolón-Canedo, José Manuel Benítez, Amparo Alonso-Betanzos, Francisco Herrera |
Abstract | With the advent of extremely high dimensional datasets, dimensionality reduction techniques are becoming mandatory. Of the many techniques available, feature selection (FS) is of growing interest for its ability to identify both relevant features and frequently repeated instances in huge datasets. We aim to demonstrate that standard FS methods can be parallelized in big data platforms like Apache Spark so as to boost both performance and accuracy. We propose a distributed implementation of a generic FS framework that includes a broad group of well-known information theory-based methods. Experimental results for a broad set of real-world datasets show that our distributed framework is capable of rapidly dealing with ultrahigh-dimensional datasets as well as those with a huge number of samples, outperforming the sequential version in all the cases studied. |
Tasks | Dimensionality Reduction, Feature Selection |
Published | 2017-07-06 |
URL | https://ieeexplore.ieee.org/abstract/document/7970198 |
https://sci2s.ugr.es/sites/default/files/bbvasoftware/publications/07970198.pdf | |
PWC | https://paperswithcode.com/paper/an-information-theory-based-feature-selection |
Repo | https://github.com/sramirez/spark-infotheoretic-feature-selection |
Framework | none |
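A single-machine analogue of the information-theoretic criterion using scikit-learn is sketched below; the paper's contribution is distributing such criteria over Apache Spark.

```python
# Single-machine analogue of information theory-based feature selection:
# rank features by mutual information with the label via scikit-learn.
# The paper distributes criteria of this family over Apache Spark.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((1000, 200))                        # 1000 samples, 200 features
y = (X[:, 3] + 0.1 * rng.random(1000) > 0.5).astype(int)  # label depends on feature 3

mi = mutual_info_classif(X, y, random_state=0)
top10 = np.argsort(mi)[::-1][:10]                  # 10 most informative features
print(top10)                                       # feature 3 should rank highly
```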
imputeTS: Time Series Missing Value Imputation in R
Title | imputeTS: Time Series Missing Value Imputation in R |
Authors | Steffen Moritz, Thomas Bartz-Beielstein |
Abstract | The imputeTS package specializes in univariate time series imputation. It offers multiple state-of-the-art imputation algorithm implementations along with plotting functions for time series missing data statistics. While imputation in general is a well-known problem and widely covered by R packages, finding packages able to fill missing values in univariate time series is more complicated. The reason for this lies in the fact that most imputation algorithms rely on inter-attribute correlations, while univariate time series imputation instead needs to employ time dependencies. This paper provides an introduction to the imputeTS package and its provided algorithms and tools. Furthermore, it gives a short overview of univariate time series imputation in R. |
Tasks | Imputation, Multivariate Time Series Imputation, Time Series |
Published | 2017-06-01 |
URL | http://doi.org/10.32614/RJ-2017-009 |
https://journal.r-project.org/archive/2017/RJ-2017-009/RJ-2017-009.pdf | |
PWC | https://paperswithcode.com/paper/imputets-time-series-missing-value-imputation |
Repo | https://github.com/SteffenMoritz/imputeTS |
Framework | none |
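The package itself is R; as a Python analogue of imputation that exploits time dependencies rather than inter-attribute correlations, one can interpolate over the time index with pandas (comparable in spirit to imputeTS's interpolation-based imputation, not its API).

```python
# Python analogue of univariate time series imputation via time dependency:
# interpolate missing values over a datetime index with pandas. This mirrors
# the spirit of imputeTS's interpolation imputation, not its R API.
import numpy as np
import pandas as pd

idx = pd.date_range("2017-01-01", periods=8, freq="D")
series = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0, 6.0, np.nan, 8.0], index=idx)

filled = series.interpolate(method="time")   # linear in time across the gaps
print(filled)
```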