Paper Group NANR 72
DeepMiner at SemEval-2018 Task 1: Emotion Intensity Recognition Using Deep Representation Learning
Title | DeepMiner at SemEval-2018 Task 1: Emotion Intensity Recognition Using Deep Representation Learning |
Authors | Habibeh Naderi, Behrouz Haji Soleimani, Saif Mohammad, Svetlana Kiritchenko, Stan Matwin |
Abstract | In this paper, we propose a regression system to infer the emotion intensity of a tweet. We develop a multi-aspect feature learning mechanism to capture the most discriminative semantic features of a tweet as well as the emotion information conveyed by each word in it. We combine six types of feature groups: (1) a tweet representation learned by an LSTM deep neural network on the training data, (2) a tweet representation learned by an LSTM network on a large corpus of tweets that contain emotion words (a distant supervision corpus), (3) word embeddings trained on the distant supervision corpus and averaged over all words in a tweet, (4) word and character n-grams, (5) features derived from various sentiment and emotion lexicons, and (6) other hand-crafted features. As part of the word embedding training, we also learn the distributed representations of multi-word expressions (MWEs) and negated forms of words. An SVR regressor is then trained over the full set of features. We evaluate the effectiveness of our ensemble feature sets on the SemEval-2018 Task 1 datasets and achieve a Pearson correlation of 72% on the task of tweet emotion intensity prediction. |
Tasks | Feature Engineering, Representation Learning, Sentiment Analysis, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1045/ |
https://www.aclweb.org/anthology/S18-1045 | |
PWC | https://paperswithcode.com/paper/deepminer-at-semeval-2018-task-1-emotion |
Repo | |
Framework | |
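The abstract above reports its result as a Pearson correlation between predicted and gold intensity scores. As a reminder of what that metric measures, here is a minimal standard-library sketch; the function name `pearson` and the toy inputs in the usage note are illustrative assumptions, not part of the DeepMiner system.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and standard deviations, up to a shared 1/n factor that cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (std_x * std_y)
```

Calling `pearson(predicted, gold)` on two lists of intensity scores returns a value in [-1, 1]; a reported "72%" corresponds to a coefficient of about 0.72.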
CBFC: a parallel L2 speech corpus for Korean and French learners
Title | CBFC: a parallel L2 speech corpus for Korean and French learners |
Authors | Hiyon Yoo, Inyoung Kim |
Abstract | |
Tasks | Language Acquisition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1054/ |
https://www.aclweb.org/anthology/L18-1054 | |
PWC | https://paperswithcode.com/paper/cbfc-a-parallel-l2-speech-corpus-for-korean |
Repo | |
Framework | |
Learning with Structured Representations for Negation Scope Extraction
Title | Learning with Structured Representations for Negation Scope Extraction |
Authors | Hao Li, Wei Lu |
Abstract | We report an empirical study on the task of negation scope extraction given the negation cue. Our key observation is that certain useful information such as features related to negation cue, long-distance dependencies as well as some latent structural information can be exploited for such a task. We design approaches based on conditional random fields (CRF), semi-Markov CRF, as well as latent-variable CRF models to capture such information. Extensive experiments on several standard datasets demonstrate that our approaches are able to achieve better results than existing approaches reported in the literature. |
Tasks | Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2085/ |
https://www.aclweb.org/anthology/P18-2085 | |
PWC | https://paperswithcode.com/paper/learning-with-structured-representations-for |
Repo | |
Framework | |
A Morphology-Based Representation Model for LSTM-Based Dependency Parsing of Agglutinative Languages
Title | A Morphology-Based Representation Model for LSTM-Based Dependency Parsing of Agglutinative Languages |
Authors | Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk |
Abstract | We propose two word representation models for agglutinative languages that better capture the similarities between words which have similar tasks in sentences. Our models highlight the morphological features in words and embed morphological information into their dense representations. We have tested our models on an LSTM-based dependency parser with character-based word embeddings proposed by Ballesteros et al. (2015). We participated in the CoNLL 2018 Shared Task on multilingual parsing from raw text to universal dependencies as the BOUN team. We show that our morphology-based embedding models improve the parsing performance for most of the agglutinative languages. |
Tasks | Dependency Parsing, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2024/ |
https://www.aclweb.org/anthology/K18-2024 | |
PWC | https://paperswithcode.com/paper/a-morphology-based-representation-model-for |
Repo | |
Framework | |
Fixing Weight Decay Regularization in Adam
Title | Fixing Weight Decay Regularization in Adam |
Authors | Ilya Loshchilov, Frank Hutter |
Abstract | We note that common implementations of adaptive gradient algorithms, such as Adam, limit the potential benefit of weight decay regularization, because the weights do not decay multiplicatively (as would be expected for standard weight decay) but by an additive constant factor. We propose a simple way to resolve this issue by decoupling weight decay and the optimization steps taken w.r.t. the loss function. We provide empirical evidence that our proposed modification (i) decouples the optimal choice of weight decay factor from the setting of the learning rate for both standard SGD and Adam, and (ii) substantially improves Adam’s generalization performance, allowing it to compete with SGD with momentum on image classification datasets (on which it was previously typically outperformed by the latter). We also demonstrate that longer optimization runs require smaller weight decay values for optimal results and introduce a normalized variant of weight decay to reduce this dependence. Finally, we propose a version of Adam with warm restarts (AdamWR) that has strong anytime performance while achieving state-of-the-art results on CIFAR-10 and ImageNet32x32. Our source code will become available after the review process. |
Tasks | Image Classification |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rk6qdGgCZ |
https://openreview.net/pdf?id=rk6qdGgCZ | |
PWC | https://paperswithcode.com/paper/fixing-weight-decay-regularization-in-adam |
Repo | |
Framework | |
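The difference between weight decay folded into the gradient and decoupled weight decay, as described in the abstract above, can be illustrated on a single scalar weight. This is a minimal sketch and not the authors' code: the function `adam_step`, its parameter names, and the default hyperparameters are hypothetical, and scaling the decoupled decay by the learning rate is one common convention that is assumed here.

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=False):
    """One Adam update for a single scalar weight (illustrative only).

    Returns the new weight and the updated moment estimates (m, v).
    """
    if not decoupled:
        # L2-style regularization: the decay term joins the gradient and is
        # therefore rescaled by Adam's adaptive denominator below.
        grad = grad + weight_decay * w
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        # Decoupled decay: shrink the weight directly, independent of the
        # gradient-based step (assumed convention: scaled by lr).
        w_new -= lr * weight_decay * w
    return w_new, m, v
```

With a zero loss gradient on the first step, the coupled variant moves the weight by roughly `lr` whatever the size of the weight, while the decoupled variant multiplies it by `1 - lr * weight_decay`, which is the multiplicative behavior the abstract refers to.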
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI
Title | Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI |
Authors | |
Abstract | |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5700/ |
https://www.aclweb.org/anthology/W18-5700 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2018-emnlp-workshop-scai |
Repo | |
Framework | |
Zero-shot Cross Language Text Classification
Title | Zero-shot Cross Language Text Classification |
Authors | Dan Svenstrup, Jonas Meinertz Hansen, Ole Winther |
Abstract | Labeled text classification datasets are typically only available in a few select languages. In order to train a model for, e.g., news categorization in a language $L_t$ without a suitable text classification dataset, there are two options. The first option is to create a new labeled dataset by hand, and the second option is to transfer label information from an existing labeled dataset in a source language $L_s$ to the target language $L_t$. In this paper we propose a method for sharing label information across languages by means of a language-independent text encoder. The encoder will give almost identical representations to multilingual versions of the same text. This means that labeled data in one language can be used to train a classifier that works for the rest of the languages. The encoder is trained independently of any concrete classification task and can therefore subsequently be used for any classification task. We show that it is possible to obtain good performance even in the case where only a comparable corpus of texts is available. |
Tasks | Text Classification |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=S1XXq6lRW |
https://openreview.net/pdf?id=S1XXq6lRW | |
PWC | https://paperswithcode.com/paper/zero-shot-cross-language-text-classification |
Repo | |
Framework | |
A Gold Anaphora Annotation Layer on an Eye Movement Corpus
Title | A Gold Anaphora Annotation Layer on an Eye Movement Corpus |
Authors | Olga Seminck, Pascal Amsili |
Abstract | |
Tasks | Abstract Anaphora Resolution |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1555/ |
https://www.aclweb.org/anthology/L18-1555 | |
PWC | https://paperswithcode.com/paper/a-gold-anaphora-annotation-layer-on-an-eye |
Repo | |
Framework | |
SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering
Title | SYSTRAN Participation to the WMT2018 Shared Task on Parallel Corpus Filtering |
Authors | MinhQuang Pham, Josep Crego, Jean Senellart |
Abstract | This paper describes the participation of SYSTRAN in the shared task on parallel corpus filtering at the Third Conference on Machine Translation (WMT 2018). We participate for the first time using a neural sentence similarity classifier which aims at predicting the relatedness of sentence pairs in a multilingual context. The paper describes the main characteristics of our approach and discusses the results obtained on the data sets published for the shared task. |
Tasks | Feature Engineering, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6485/ |
https://www.aclweb.org/anthology/W18-6485 | |
PWC | https://paperswithcode.com/paper/systran-participation-to-the-wmt2018-shared |
Repo | |
Framework | |
MITRE at SemEval-2018 Task 11: Commonsense Reasoning without Commonsense Knowledge
Title | MITRE at SemEval-2018 Task 11: Commonsense Reasoning without Commonsense Knowledge |
Authors | Elizabeth Merkhofer, John Henderson, David Bloom, Laura Strickhart, Guido Zarrella |
Abstract | This paper describes MITRE's participation in SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. The techniques explored range from simple bag-of-ngrams classifiers to neural architectures with varied attention and alignment mechanisms. Logistic regression ties the systems together into an ensemble submitted for evaluation. The resulting system answers reading comprehension questions with 82.27% accuracy. |
Tasks | Common Sense Reasoning, Information Retrieval, Reading Comprehension |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1181/ |
https://www.aclweb.org/anthology/S18-1181 | |
PWC | https://paperswithcode.com/paper/mitre-at-semeval-2018-task-11-commonsense |
Repo | |
Framework | |
Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks
Title | Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks |
Authors | Agastya Kalra, Abdullah Rashwan, Wei-Shou Hsu, Pascal Poupart, Prashant Doshi, Georgios Trimponias |
Abstract | Sum-product networks have recently emerged as an attractive representation due to their dual view as a special type of deep neural network with clear semantics and a special type of probabilistic graphical model for which inference is always tractable. Those properties follow from some conditions (i.e., completeness and decomposability) that must be respected by the structure of the network. As a result, it is not easy to specify a valid sum-product network by hand and therefore structure learning techniques are typically used in practice. This paper describes a new online structure learning technique for feed-forward and recurrent SPNs. The algorithm is demonstrated on real-world datasets with continuous features for which it is not clear what network architecture might be best, including sequence datasets of varying length. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7926-online-structure-learning-for-feed-forward-and-recurrent-sum-product-networks |
http://papers.nips.cc/paper/7926-online-structure-learning-for-feed-forward-and-recurrent-sum-product-networks.pdf | |
PWC | https://paperswithcode.com/paper/online-structure-learning-for-feed-forward |
Repo | |
Framework | |
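The two structural conditions named in the abstract above can be made concrete with a toy sum-product network over binary variables. This is a sketch under stated assumptions, not the paper's structure learning algorithm: the tuple-based node encoding, the helper names (`leaf`, `product`, `weighted_sum`, `scope`, `is_valid`, `evaluate`), and the example weights are all hypothetical. Completeness means the children of a sum node share the same scope; decomposability means the children of a product node have disjoint scopes.

```python
def leaf(var, p_true):
    """Bernoulli leaf over one binary variable."""
    return ("leaf", var, p_true)


def product(*children):
    return ("prod", children)


def weighted_sum(*weighted_children):
    """Each argument is a (weight, child) pair."""
    return ("sum", weighted_children)


def scope(node):
    """Set of variables a node depends on."""
    if node[0] == "leaf":
        return frozenset([node[1]])
    if node[0] == "prod":
        return frozenset().union(*(scope(c) for c in node[1]))
    return frozenset().union(*(scope(c) for _, c in node[1]))


def is_valid(node):
    """Check decomposability (products) and completeness (sums)."""
    if node[0] == "leaf":
        return True
    if node[0] == "prod":
        scopes = [scope(c) for c in node[1]]
        # Disjoint scopes: sizes add up to the size of the union.
        disjoint = sum(len(s) for s in scopes) == len(frozenset().union(*scopes))
        return disjoint and all(is_valid(c) for c in node[1])
    scopes = [scope(c) for _, c in node[1]]
    complete = all(s == scopes[0] for s in scopes)
    return complete and all(is_valid(c) for _, c in node[1])


def evaluate(node, evidence):
    """Probability of the evidence; variables absent from it are summed out."""
    if node[0] == "leaf":
        _, var, p_true = node
        if var not in evidence:
            return 1.0  # a marginalized leaf integrates to one
        return p_true if evidence[var] else 1.0 - p_true
    if node[0] == "prod":
        result = 1.0
        for child in node[1]:
            result *= evaluate(child, evidence)
        return result
    return sum(w * evaluate(child, evidence) for w, child in node[1])


# A two-component mixture over variables A and B (weights chosen arbitrarily).
example_spn = weighted_sum(
    (0.6, product(leaf("A", 0.9), leaf("B", 0.2))),
    (0.4, product(leaf("A", 0.1), leaf("B", 0.7))),
)
```

For a valid network, a single bottom-up pass answers both joint and marginal queries: `evaluate(example_spn, {"A": True, "B": True})` gives the joint probability, and `evaluate(example_spn, {"A": True})` marginalizes out B in the same pass.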
TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality
Title | TQ-AutoTest – An Automated Test Suite for (Machine) Translation Quality |
Authors | Vivien Macketanz, Renlong Ai, Aljoscha Burchardt, Hans Uszkoreit |
Abstract | |
Tasks | Machine Translation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1142/ |
https://www.aclweb.org/anthology/L18-1142 | |
PWC | https://paperswithcode.com/paper/tq-autotest-a-an-automated-test-suite-for |
Repo | |
Framework | |
GKR: the Graphical Knowledge Representation for semantic parsing
Title | GKR: the Graphical Knowledge Representation for semantic parsing |
Authors | Aikaterini-Lida Kalouli, Richard Crouch |
Abstract | This paper describes the first version of an open-source semantic parser that creates graphical representations of sentences to be used for further semantic processing, e.g. for natural language inference, reasoning and semantic similarity. The Graphical Knowledge Representation which is output by the parser is inspired by the Abstract Knowledge Representation, which separates out conceptual and contextual levels of representation that deal respectively with the subject matter of a sentence and its existential commitments. Our representation is a layered graph with each sub-graph holding different kinds of information, including one sub-graph for concepts and one for contexts. Our first evaluation of the system shows an F-score of 85% in accurately representing sentences as semantic graphs. |
Tasks | Natural Language Inference, Semantic Parsing, Semantic Similarity, Semantic Textual Similarity |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1304/ |
https://www.aclweb.org/anthology/W18-1304 | |
PWC | https://paperswithcode.com/paper/gkr-the-graphical-knowledge-representation |
Repo | |
Framework | |
Trajectory Convolution for Action Recognition
Title | Trajectory Convolution for Action Recognition |
Authors | Yue Zhao, Yuanjun Xiong, Dahua Lin |
Abstract | How to leverage the temporal dimension is a key question in video analysis. Recent works suggest an efficient approach to video feature learning, i.e., factorizing 3D convolutions into separate components respectively for spatial and temporal convolutions. The temporal convolution, however, comes with an implicit assumption – the feature maps across time steps are well aligned so that the features at the same locations can be aggregated. This assumption may be overly strong in practical applications, especially in action recognition where the motion serves as a crucial cue. In this work, we propose a new CNN architecture, TrajectoryNet, which incorporates trajectory convolution, a new operation for integrating features along the temporal dimension, to replace the existing temporal convolution. This operation explicitly takes into account the changes in contents caused by deformation or motion, allowing the visual features to be aggregated along the motion paths, i.e., trajectories. On two large-scale action recognition datasets, namely, Something-Something and Kinetics, the proposed network architecture achieves notable improvement over strong baselines. |
Tasks | Temporal Action Localization |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7489-trajectory-convolution-for-action-recognition |
http://papers.nips.cc/paper/7489-trajectory-convolution-for-action-recognition.pdf | |
PWC | https://paperswithcode.com/paper/trajectory-convolution-for-action-recognition |
Repo | |
Framework | |
Fast Stochastic AUC Maximization with $O(1/n)$-Convergence Rate
Title | Fast Stochastic AUC Maximization with $O(1/n)$-Convergence Rate |
Authors | Mingrui Liu, Xiaoxuan Zhang, Zaiyi Chen, Xiaoyu Wang, Tianbao Yang |
Abstract | In this paper, we consider statistical learning with AUC (area under ROC curve) maximization in the classical stochastic setting where one random data point drawn from an unknown distribution is revealed at each iteration for updating the model. Although consistent convex surrogate losses for AUC maximization have been proposed to make the problem tractable, it remains a challenging problem to design fast optimization algorithms in the classical stochastic setting because the convex surrogate loss depends on random pairs of examples from positive and negative classes. Building on a saddle point formulation for a consistent square loss, this paper proposes a novel stochastic algorithm to improve the standard $O(1/\sqrt{n})$ convergence rate to an $\widetilde O(1/n)$ convergence rate without a strong convexity assumption or any favorable statistical assumptions (e.g., low noise), where $n$ is the number of random samples. To the best of our knowledge, this is the first stochastic algorithm for AUC maximization with a statistical convergence rate as fast as $O(1/n)$ up to a logarithmic factor. Extensive experiments on eight large-scale benchmark data sets demonstrate the superior performance of the proposed algorithm compared with existing stochastic or online algorithms for AUC maximization. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1940 |
http://proceedings.mlr.press/v80/liu18g/liu18g.pdf | |
PWC | https://paperswithcode.com/paper/fast-stochastic-auc-maximization-with-o1n |
Repo | |
Framework | |
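The abstract above notes that AUC surrogate losses depend on pairs of positive and negative examples, which is what makes one-sample-at-a-time optimization hard. The sketch below shows that pairwise structure directly; it does not implement the paper's stochastic saddle-point algorithm. The function names are hypothetical, and the squared-margin form `(1 - (s_pos - s_neg))**2` is assumed here as a commonly used square surrogate for AUC.

```python
def _split_scores(scores, labels):
    """Separate scores of positive (label == 1) and negative examples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = _split_scores(scores, labels)
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_square_loss(scores, labels):
    """Mean of (1 - (s_pos - s_neg))^2 over all positive-negative pairs."""
    pos, neg = _split_scores(scores, labels)
    total = sum((1.0 - (p - q)) ** 2 for p in pos for q in neg)
    return total / (len(pos) * len(neg))
```

Both functions loop over every positive-negative pair, so a single new example changes the objective through all pairs it participates in; reformulations such as the saddle-point one mentioned in the abstract are aimed at avoiding that coupling.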