January 24, 2020

2521 words 12 mins read

Paper Group NANR 247

DEEP-TRIM: REVISITING L1 REGULARIZATION FOR CONNECTION PRUNING OF DEEP NETWORK

Title DEEP-TRIM: REVISITING L1 REGULARIZATION FOR CONNECTION PRUNING OF DEEP NETWORK
Authors Chih-Kuan Yeh, Ian E.H. Yen, Hong-You Chen, Chun-Pei Yang, Shou-De Lin, Pradeep Ravikumar
Abstract State-of-the-art deep neural networks (DNNs) typically have tens of millions of parameters, which might not fit into the upper levels of the memory hierarchy, thus increasing the inference time and energy consumption significantly, and prohibiting their use on edge devices such as mobile phones. The compression of DNN models has therefore become an active area of research recently, with \emph{connection pruning} emerging as one of the most successful strategies. A very natural approach is to prune connections of DNNs via $\ell_1$ regularization, but recent empirical investigations have suggested that this does not work as well in the context of DNN compression. In this work, we revisit this simple strategy and analyze it rigorously, to show that: (a) any \emph{stationary point} of an $\ell_1$-regularized layerwise-pruning objective has its number of non-zero elements bounded by the number of penalized prediction logits, regardless of the strength of the regularization; (b) successful pruning highly relies on an accurate optimization solver, and there is a trade-off between compression speed and distortion of prediction accuracy, controlled by the strength of regularization. Our theoretical results thus suggest that $\ell_1$ pruning could be successful provided we use an accurate optimization solver. We corroborate this in our experiments, where we show that simple $\ell_1$ regularization with an Adamax-L1(cumulative) solver gives pruning ratio competitive to the state-of-the-art.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=r1exVhActQ
PDF https://openreview.net/pdf?id=r1exVhActQ
PWC https://paperswithcode.com/paper/deep-trim-revisiting-l1-regularization-for
Repo
Framework
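
To make the strategy concrete, here is a minimal PyTorch sketch of layerwise $\ell_1$-regularized training followed by magnitude pruning. It is an illustration under stated assumptions: plain Adamax with a subgradient $\ell_1$ penalty stands in for the paper's Adamax-L1(cumulative) solver, and the architecture, learning rate, and threshold are placeholders.

```python
# Sketch only, not the authors' code: l1-regularized training, then
# magnitude pruning of the weights the penalty has driven toward zero.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
optimizer = torch.optim.Adamax(model.parameters(), lr=2e-3)
criterion = nn.CrossEntropyLoss()
l1_strength = 1e-4  # hypothetical; per the paper, regularization strength trades
                    # compression speed against distortion of prediction accuracy

def train_step(x, y):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    # l1 penalty over weight matrices (biases left unpenalized)
    l1 = sum(p.abs().sum() for n, p in model.named_parameters() if "weight" in n)
    (loss + l1_strength * l1).backward()
    optimizer.step()

def prune(threshold=1e-3):
    # zero out weights whose magnitude the l1 term has pushed below the threshold
    with torch.no_grad():
        for n, p in model.named_parameters():
            if "weight" in n:
                p.mul_((p.abs() > threshold).float())
```

The paper's argument is that solver quality matters here: a loose optimizer leaves many small-but-nonzero weights, so thresholding then costs accuracy.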

Presenting TWITTIRÒ-UD: An Italian Twitter Treebank in Universal Dependencies

Title Presenting TWITTIRÒ-UD: An Italian Twitter Treebank in Universal Dependencies
Authors Alessandra Teresa Cignarella, Cristina Bosco, Paolo Rosso
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7723/
PDF https://www.aclweb.org/anthology/W19-7723
PWC https://paperswithcode.com/paper/presenting-twittiro-ud-an-italian-twitter
Repo
Framework

NLP at SemEval-2019 Task 6: Detecting Offensive language using Neural Networks

Title NLP at SemEval-2019 Task 6: Detecting Offensive language using Neural Networks
Authors Prashant Kapil, Asif Ekbal, Dipankar Das
Abstract In this paper we describe several deep learning architectures built to participate in the SemEval-2019 shared task OffensEval: Identifying and Categorizing Offensive Language in Social Media. The dataset was annotated with a three-level annotation scheme, and the sub-tasks were to distinguish offensive from non-offensive content, to categorize the offense, and to identify its target. Deep learning models with POS information as features were also leveraged for classification. The models that performed best on the individual sub-tasks were a stacked CNN-BiLSTM with attention, a BiLSTM with POS information added to the word features, and a BiLSTM for the third task. Our models achieved macro F1 scores of 0.7594, 0.5378 and 0.4588 on sub-tasks A, B and C respectively, ranking 33rd, 54th and 52nd out of 103, 75 and 65 submissions.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2105/
PDF https://www.aclweb.org/anthology/S19-2105
PWC https://paperswithcode.com/paper/nlp-at-semeval-2019-task-6-detecting
Repo
Framework
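
As a concrete illustration of the "BiLSTM with POS information added to the word features" model named above, here is a hedged PyTorch sketch; vocabulary sizes, embedding dimensions, the mean-pooling step, and the output head are illustrative assumptions, not the authors' configuration.

```python
# Sketch of a BiLSTM classifier that concatenates word and POS embeddings.
import torch
import torch.nn as nn

class BiLSTMWithPOS(nn.Module):
    def __init__(self, vocab_size=20000, pos_size=50, emb_dim=100,
                 pos_dim=20, hidden=128, num_classes=2):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(pos_size, pos_dim)
        self.lstm = nn.LSTM(emb_dim + pos_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_classes)

    def forward(self, words, pos_tags):
        # word and POS features are concatenated per token
        x = torch.cat([self.word_emb(words), self.pos_emb(pos_tags)], dim=-1)
        h, _ = self.lstm(x)
        # mean-pool over time, then classify (pooling choice is an assumption)
        return self.out(h.mean(dim=1))

model = BiLSTMWithPOS()
logits = model(torch.randint(0, 20000, (8, 40)), torch.randint(0, 50, (8, 40)))
```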

Explainable Artificial Intelligence and its potential within Industry

Title Explainable Artificial Intelligence and its potential within Industry
Authors Saad Mahamood
Abstract
Tasks
Published 2019-01-01
URL https://www.aclweb.org/anthology/W19-8401/
PDF https://www.aclweb.org/anthology/W19-8401
PWC https://paperswithcode.com/paper/explainable-artificial-intelligence-and-its
Repo
Framework

Action Assessment by Joint Relation Graphs

Title Action Assessment by Joint Relation Graphs
Authors Jia-Hui Pan, Jibin Gao, Wei-Shi Zheng
Abstract We present a new model to assess the performance of actions from videos, through graph-based joint relation modelling. Previous works mainly focused on the whole scene, including the performer's body and background, yet ignored the detailed joint interactions. This is insufficient for fine-grained, accurate action assessment, because the action quality of each joint is dependent on its neighbouring joints. Therefore, we propose to learn the detailed joint motion based on the joint relations. We build trainable Joint Relation Graphs, and analyze joint motion on them. We propose two novel modules, the Joint Commonality Module and the Joint Difference Module, for joint motion learning. The Joint Commonality Module models the general motion for certain body parts, and the Joint Difference Module models the motion differences within body parts. We evaluate our method on six public Olympic actions for performance assessment. Our method outperforms previous approaches (+0.0912) and the whole-scene analysis (+0.0623) in Spearman's rank correlation. We also demonstrate our model's ability to interpret the action assessment process.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Pan_Action_Assessment_by_Joint_Relation_Graphs_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Pan_Action_Assessment_by_Joint_Relation_Graphs_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/action-assessment-by-joint-relation-graphs
Repo
Framework
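
The gains quoted above (+0.0912 and +0.0623) are improvements in Spearman's rank correlation between predicted and judged action-quality scores. A minimal sketch of that metric on placeholder data:

```python
# Evaluation metric named in the abstract; the score arrays are toy data.
from scipy.stats import spearmanr

predicted = [82.1, 74.5, 90.3, 66.7, 71.0]  # hypothetical model assessments
judged    = [80.0, 70.5, 92.0, 65.0, 74.5]  # hypothetical judge scores

rho, pval = spearmanr(predicted, judged)
print(f"Spearman's rho = {rho:.4f}")  # reported gains are deltas in this rho
```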

A Progressive Model to Enable Continual Learning for Semantic Slot Filling

Title A Progressive Model to Enable Continual Learning for Semantic Slot Filling
Authors Yilin Shen, Xiangyu Zeng, Hongxia Jin
Abstract Semantic slot filling is one of the major tasks in spoken language understanding (SLU). After a slot filling model is trained on pre-collected data, it is crucial to continually improve the model after deployment to learn users' new expressions. As the data amount grows, it becomes infeasible either to store such huge data and repeatedly retrain the model on all of it, or to fine-tune the model only on new data without forgetting old expressions. In this paper, we introduce a novel progressive slot filling model, ProgModel. ProgModel consists of a novel context gate that transfers previously learned knowledge to a small expanded component, and meanwhile enables this new component to be trained quickly on new data. As such, ProgModel learns the new knowledge using only the new data at each round, while preserving the previously learned expressions. Our experiments show that ProgModel needs much less training time and a smaller model size to outperform various model fine-tuning competitors by up to 4.24% and 3.03% on two benchmark datasets.
Tasks Continual Learning, Slot Filling, Spoken Language Understanding
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1126/
PDF https://www.aclweb.org/anthology/D19-1126
PWC https://paperswithcode.com/paper/a-progressive-model-to-enable-continual
Repo
Framework
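
Here is one plausible reading of ProgModel's context-gate idea as a PyTorch sketch: a frozen, previously trained encoder is combined with a small newly added component through a learned gate, so only the new component trains on new data. The module choices and all sizes are assumptions, not the authors' implementation.

```python
# Speculative sketch of a progressive slot tagger with a context gate.
import torch
import torch.nn as nn

class ProgressiveSlotTagger(nn.Module):
    def __init__(self, old_encoder, input_dim=100, old_dim=256,
                 new_dim=64, num_slots=30):
        super().__init__()
        self.old_encoder = old_encoder              # trained before deployment
        for p in self.old_encoder.parameters():     # frozen: no forgetting
            p.requires_grad = False
        self.new_encoder = nn.LSTM(input_dim, new_dim, batch_first=True)
        self.gate = nn.Linear(old_dim + new_dim, old_dim + new_dim)
        self.out = nn.Linear(old_dim + new_dim, num_slots)

    def forward(self, x):
        h_old, _ = self.old_encoder(x)              # preserved knowledge
        h_new, _ = self.new_encoder(x)              # learns the new expressions
        h = torch.cat([h_old, h_new], dim=-1)
        g = torch.sigmoid(self.gate(h))             # context gate
        return self.out(g * h)                      # per-token slot logits

old = nn.LSTM(100, 256, batch_first=True)           # stands in for the old model
tagger = ProgressiveSlotTagger(old)
logits = tagger(torch.randn(2, 12, 100))             # (batch, tokens, num_slots)
```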

Large-Scale, Diverse, Paraphrastic Bitexts via Sampling and Clustering

Title Large-Scale, Diverse, Paraphrastic Bitexts via Sampling and Clustering
Authors J. Edward Hu, Abhinav Singh, Nils Holzenberger, Matt Post, Benjamin Van Durme
Abstract Producing diverse paraphrases of a sentence is a challenging task. Natural paraphrase corpora are scarce and limited, while existing large-scale resources are automatically generated via back-translation and rely on beam search, which tends to lack diversity. We describe ParaBank 2, a new resource that contains multiple diverse sentential paraphrases, produced from a bilingual corpus using negative constraints, inference sampling, and clustering. We show that ParaBank 2 significantly surpasses prior work in both lexical and syntactic diversity while being meaning-preserving, as measured by human judgments and standardized metrics. Further, we illustrate how such paraphrastic resources may be used to refine contextualized encoders, leading to improvements in downstream tasks.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/K19-1005/
PDF https://www.aclweb.org/anthology/K19-1005
PWC https://paperswithcode.com/paper/large-scale-diverse-paraphrastic-bitexts-via
Repo
Framework
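
To make the "sampling and clustering" step concrete, here is a hedged sketch of picking diverse representatives from many sampled paraphrase candidates: cluster the candidates and keep one per cluster. TF-IDF features and k-means stand in for whatever representation and clustering ParaBank 2 actually uses; the candidates are toy data.

```python
# Illustrative diversity selection: cluster candidates, keep one per cluster.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

candidates = [
    "the cat sat on the mat",
    "a cat was sitting on the mat",
    "the mat had a cat on it",
    "on the mat sat a cat",
]  # placeholder samples from a constrained decoder

X = TfidfVectorizer().fit_transform(candidates)
k = 2  # number of diverse paraphrases to keep (assumption)
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

representatives, seen = [], set()
for cand, lab in zip(candidates, labels):
    if lab not in seen:            # first candidate in each cluster wins
        seen.add(lab)
        representatives.append(cand)
print(representatives)
```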

Efficient Deep Approximation of GMMs

Title Efficient Deep Approximation of GMMs
Authors Shirin Jalali, Carl Nuzman, Iraj Saniee
Abstract The universal approximation theorem states that any regular function can be approximated closely using a single hidden layer neural network. Some recent work has shown that, for some special functions, the number of nodes in such an approximation could be exponentially reduced with multi-layer neural networks. In this work, we extend this idea to a rich class of functions, namely the discriminant functions that arise in optimal Bayesian classification of Gaussian mixture models (GMMs) in $\mathbb{R}^n$. We show that such functions can be approximated with arbitrary precision using $O(n)$ nodes in a neural network with two hidden layers (deep neural network), while in contrast, a neural network with a single hidden layer (shallow neural network) would require at least $O(\exp(n))$ nodes or exponentially large coefficients. Given the universality of the Gaussian distribution in the feature spaces of data, e.g., in speech, image and text, our results shed light on the observed efficiency of deep neural networks in practical classification problems.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8704-efficient-deep-approximation-of-gmms
PDF http://papers.nips.cc/paper/8704-efficient-deep-approximation-of-gmms.pdf
PWC https://paperswithcode.com/paper/efficient-deep-approximation-of-gmms
Repo
Framework
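
For reference, the discriminant functions the paper approximates are the Bayes-optimal log-likelihood ratios between GMM class-conditional densities. In standard textbook form (not necessarily the paper's exact notation or assumptions):

```latex
% Discriminant between two GMM classes in \mathbb{R}^n with priors P(c_1), P(c_2);
% class c has mixture weights \pi_{c,k}, means \mu_{c,k}, covariances \Sigma_{c,k}.
f(x) = \log \sum_{k} \pi_{1,k}\, \mathcal{N}(x;\, \mu_{1,k}, \Sigma_{1,k})
     - \log \sum_{k} \pi_{2,k}\, \mathcal{N}(x;\, \mu_{2,k}, \Sigma_{2,k})
% Decide class 1 iff f(x) > \log\bigl(P(c_2)/P(c_1)\bigr).
```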

Aggregating Bidirectional Encoder Representations Using MatchLSTM for Sequence Matching

Title Aggregating Bidirectional Encoder Representations Using MatchLSTM for Sequence Matching
Authors Bo Shao, Yeyun Gong, Weizhen Qi, Nan Duan, Xiaola Lin
Abstract In this work, we propose an aggregation method that combines the Bidirectional Encoder Representations from Transformers (BERT) with a MatchLSTM layer for sequence matching. Given a sentence pair, we extract its output representations from BERT. We then extend BERT with a MatchLSTM layer to obtain further interaction between the sentence pair for sequence matching tasks. Taking natural language inference as an example, we split the BERT output into two parts, one from the premise sentence and one from the hypothesis sentence. At each position of the hypothesis sentence, both the weighted representation of the premise sentence and the representation of the current token are fed into an LSTM. We jointly train the aggregation layer and the pre-trained layers for sequence matching. We conduct experiments on two publicly available datasets, WikiQA and SNLI. Experiments show that our model achieves significant improvements over state-of-the-art methods on both datasets.
Tasks Natural Language Inference
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1626/
PDF https://www.aclweb.org/anthology/D19-1626
PWC https://paperswithcode.com/paper/aggregating-bidirectional-encoder
Repo
Framework
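
A hedged PyTorch sketch of the aggregation step described above: at each hypothesis position, attend over the premise slice of BERT's output, concatenate the attended premise vector with the current token vector, and run an LSTM over the result. The dot-product attention and all dimensions are assumptions.

```python
# Sketch of a MatchLSTM layer over the two halves of BERT's output.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchLSTMAggregator(nn.Module):
    def __init__(self, bert_dim=768, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(2 * bert_dim, hidden, batch_first=True)

    def forward(self, premise, hypothesis):
        # premise: (B, Lp, d) and hypothesis: (B, Lh, d) are the BERT output
        # split at the premise/hypothesis boundary
        scores = torch.bmm(hypothesis, premise.transpose(1, 2))   # (B, Lh, Lp)
        attn = F.softmax(scores, dim=-1)
        weighted_premise = torch.bmm(attn, premise)               # (B, Lh, d)
        matched = torch.cat([weighted_premise, hypothesis], dim=-1)
        out, _ = self.lstm(matched)                               # (B, Lh, hidden)
        return out

agg = MatchLSTMAggregator()
out = agg(torch.randn(2, 20, 768), torch.randn(2, 15, 768))       # (2, 15, 256)
```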

Convolutional neural networks for low-resource morpheme segmentation: baseline or state-of-the-art?

Title Convolutional neural networks for low-resource morpheme segmentation: baseline or state-of-the-art?
Authors Alexey Sorokin
Abstract We apply convolutional neural networks to the task of shallow morpheme segmentation, using low-resource datasets for 5 different languages. We show that in both fully supervised and semi-supervised settings our model beats previous state-of-the-art approaches. We argue that convolutional neural networks reflect the local nature of morpheme segmentation better than other semi-supervised approaches.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4218/
PDF https://www.aclweb.org/anthology/W19-4218
PWC https://paperswithcode.com/paper/convolutional-neural-networks-for-low
Repo
Framework
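
A minimal sketch of a convolutional segmenter in the spirit of the abstract: character embeddings, stacked 1-D convolutions, and a per-character classifier over segmentation tags. The B/M/E/S tag set and all sizes are illustrative assumptions.

```python
# Sketch of a per-character CNN tagger for shallow morpheme segmentation.
import torch
import torch.nn as nn

class ConvMorphSegmenter(nn.Module):
    def __init__(self, n_chars=100, emb=32, channels=64, n_tags=4, layers=3):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)
        convs, in_ch = [], emb
        for _ in range(layers):
            # padding=2 preserves sequence length for kernel_size=5
            convs += [nn.Conv1d(in_ch, channels, kernel_size=5, padding=2),
                      nn.ReLU()]
            in_ch = channels
        self.convs = nn.Sequential(*convs)
        self.out = nn.Linear(channels, n_tags)

    def forward(self, chars):                  # chars: (B, L) character ids
        x = self.emb(chars).transpose(1, 2)    # (B, emb, L) for Conv1d
        h = self.convs(x).transpose(1, 2)      # (B, L, channels)
        return self.out(h)                     # (B, L, n_tags) tag logits

seg = ConvMorphSegmenter()
tags = seg(torch.randint(0, 100, (4, 16))).argmax(-1)  # predicted tags per char
```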

Incivility Detection in Online Comments

Title Incivility Detection in Online Comments
Authors Farig Sadeque, Stephen Rains, Yotam Shmargad, Kate Kenski, Kevin Coe, Steven Bethard
Abstract Incivility in public discourse has been a major concern in recent times, as it can negatively affect the quality and tenacity of the discourse. In this paper, we present neural models that can learn to detect name-calling and vulgarity from a newspaper comment section. We show that, in contrast to prior work on detecting toxic language, fine-grained incivilities like name-calling cannot be accurately detected by simple models such as logistic regression. We apply the models trained on the newspaper comment data to detect uncivil comments in a Russian troll dataset, and find that despite the change of domain, the model makes accurate predictions.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-1031/
PDF https://www.aclweb.org/anthology/S19-1031
PWC https://paperswithcode.com/paper/incivility-detection-in-online-comments
Repo
Framework
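
For context, the "simple model" the abstract argues is insufficient for fine-grained incivility is logistic regression; a minimal bag-of-words sketch on toy data (features and labels are placeholders):

```python
# Baseline of the kind the paper compares against, on placeholder data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = ["you are an idiot", "I disagree with this policy",
            "what a moron", "nice article"]
labels = [1, 0, 1, 0]  # 1 = name-calling, 0 = civil (toy labels)

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
baseline.fit(comments, labels)
print(baseline.predict(["total idiot"]))
```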

Evaluating BERT for natural language inference: A case study on the CommitmentBank

Title Evaluating BERT for natural language inference: A case study on the CommitmentBank
Authors Nanjiang Jiang, Marie-Catherine de Marneffe
Abstract Natural language inference (NLI) datasets (e.g., MultiNLI) were collected by soliciting hypotheses for a given premise from annotators. Such data collection led to annotation artifacts: systems can identify the premise-hypothesis relationship without observing the premise (e.g., negation in the hypothesis being indicative of contradiction). We address this problem by recasting the CommitmentBank for NLI, which contains items involving reasoning over the extent to which a speaker is committed to complements of clause-embedding verbs under entailment-canceling environments (conditional, negation, modal and question). Instead of being constructed to stand in certain relationships with the premise, hypotheses in the recast CommitmentBank are the complements of the clause-embedding verb in each premise, leading to no annotation artifacts in the hypothesis. A state-of-the-art BERT-based model performs well on the CommitmentBank with 85% F1. However, analysis of model behavior shows that the BERT models still do not capture the full complexity of pragmatic reasoning, nor encode some of the linguistic generalizations, highlighting room for improvement.
Tasks Natural Language Inference
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1630/
PDF https://www.aclweb.org/anthology/D19-1630
PWC https://paperswithcode.com/paper/evaluating-bert-for-natural-language
Repo
Framework

Classifying Author Intention for Writer Feedback in Related Work

Title Classifying Author Intention for Writer Feedback in Related Work
Authors Arlene Casey, Bonnie Webber, Dorota Glowacka
Abstract The ability to produce high-quality publishable material is critical to academic success, but many post-graduate students struggle to learn to do so. While recent years have seen an increase in tools designed to provide feedback on aspects of writing, one aspect that has so far been neglected is the Related Work section of academic research papers. To address this, we have trained a supervised classifier on a corpus of 94 Related Work sections and evaluated it against a manually annotated gold standard. The classifier uses novel features pertaining to citation types and co-reference, along with patterns found from studying Related Work sections. We show that these novel features contribute to classifier performance, which compares favourably with other similar work that classifies author intentions and considers feedback for academic writing.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1021/
PDF https://www.aclweb.org/anthology/R19-1021
PWC https://paperswithcode.com/paper/classifying-author-intention-for-writer
Repo
Framework

The Risk of Racial Bias in Hate Speech Detection

Title The Risk of Racial Bias in Hate Speech Detection
Authors Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. Smith
Abstract We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. We first uncover unexpected correlations between surface markers of African American English (AAE) and ratings of toxicity in several widely-used hate speech datasets. Then, we show that models trained on these corpora acquire and propagate these biases, such that AAE tweets and tweets by self-identified African Americans are up to two times more likely to be labelled as offensive compared to others. Finally, we propose dialect and race priming as ways to reduce the racial bias in annotation, showing that when annotators are made explicitly aware of an AAE tweet's dialect they are significantly less likely to label the tweet as offensive.
Tasks Hate Speech Detection
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1163/
PDF https://www.aclweb.org/anthology/P19-1163
PWC https://paperswithcode.com/paper/the-risk-of-racial-bias-in-hate-speech
Repo
Framework

Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

Title Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling
Authors Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin
Abstract A core problem of information retrieval (IR) is relevance matching, which is to rank documents by relevance to a user's query. On the other hand, many NLP problems, such as question answering and paraphrase identification, can be considered variants of semantic matching, which is to measure the semantic distance between two pieces of short texts. While at a high level both relevance and semantic matching require modeling textual similarity, many existing techniques for one cannot be easily adapted to the other. To bridge this gap, we propose a novel model, HCAN (Hybrid Co-Attention Network), that comprises (1) a hybrid encoder module that includes ConvNet-based and LSTM-based encoders, (2) a relevance matching module that measures soft term matches with importance weighting at multiple granularities, and (3) a semantic matching module with co-attention mechanisms that capture context-aware semantic relatedness. Evaluations on multiple IR and NLP benchmarks demonstrate state-of-the-art effectiveness compared to approaches that do not exploit pretraining on external data. Extensive ablation studies suggest that relevance and semantic matching signals are complementary across many problem settings, regardless of the choice of underlying encoders.
Tasks Information Retrieval, Paraphrase Identification, Question Answering
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1540/
PDF https://www.aclweb.org/anthology/D19-1540
PWC https://paperswithcode.com/paper/bridging-the-gap-between-relevance-matching
Repo
Framework
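
A generic sketch of the co-attention step in HCAN's semantic-matching module: the two sequences attend over each other through a shared similarity matrix. This dot-product formulation is a common choice and an assumption here, not the authors' exact parameterization.

```python
# Sketch of bidirectional (co-)attention between query and document encodings.
import torch
import torch.nn.functional as F

def co_attention(query_reps, doc_reps):
    # query_reps: (B, Lq, d), doc_reps: (B, Ld, d)
    sim = torch.bmm(query_reps, doc_reps.transpose(1, 2))        # (B, Lq, Ld)
    q2d = torch.bmm(F.softmax(sim, dim=-1), doc_reps)            # query -> doc
    d2q = torch.bmm(F.softmax(sim.transpose(1, 2), dim=-1), query_reps)
    return q2d, d2q   # context-aware representations for each side

q2d, d2q = co_attention(torch.randn(2, 8, 300), torch.randn(2, 40, 300))
```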