October 15, 2019

Paper Group NANR 273

Distributed Clustering via LSH Based Data Partitioning


Title	Distributed Clustering via LSH Based Data Partitioning
Authors	Aditya Bhaskara, Maheshakya Wijewardena
Abstract	Given the importance of clustering in the analysis of large scale data, distributed algorithms for formulations such as k-means, k-median, etc. have been extensively studied. A successful approach here has been the “reduce and merge” paradigm, in which each machine reduces its input size to $\tilde{O}(k)$, and this data reduction continues (possibly iteratively) until all the data fits on one machine, at which point the problem is solved locally. This approach has the intrinsic bottleneck that each machine must solve a problem of size $\geq k$, and needs to communicate at least $\Omega(k)$ points to the other machines. We propose a novel data partitioning idea to overcome this bottleneck, and in effect, have different machines focus on “finding different clusters”. Under the assumption that we know the optimum value of the objective up to a poly(n) factor (arbitrary polynomial), we establish worst-case approximation guarantees for our method. We see that our algorithm results in lower communication as well as a near-optimal number of ‘rounds’ of computation (in the popular MapReduce framework).
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2443
PDF	http://proceedings.mlr.press/v80/bhaskara18a/bhaskara18a.pdf
PWC	https://paperswithcode.com/paper/distributed-clustering-via-lsh-based-data
Repo
Framework

Lexical and Semantic Features for Cross-lingual Text Reuse Classification: an Experiment in English and Latin Paraphrases


Title	Lexical and Semantic Features for Cross-lingual Text Reuse Classification: an Experiment in English and Latin Paraphrases
Authors	Maria Moritz, David Steding
Abstract
Tasks	Semantic Textual Similarity, Word Embeddings, Word Sense Disambiguation
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1311/
PDF	https://www.aclweb.org/anthology/L18-1311
PWC	https://paperswithcode.com/paper/lexical-and-semantic-features-for-cross
Repo
Framework

Fast Node Embeddings: Learning Ego-Centric Representations


Title	Fast Node Embeddings: Learning Ego-Centric Representations
Authors	Tiago Pimentel, Adriano Veloso, Nivio Ziviani
Abstract	Representation learning is one of the foundations of Deep Learning and allowed important improvements on several Machine Learning tasks, such as Neural Machine Translation, Question Answering and Speech Recognition. Recent works have proposed new methods for learning representations for nodes and edges in graphs. Several of these methods are based on the SkipGram algorithm, and they usually process a large number of multi-hop neighbors in order to produce the context from which node representations are learned. In this paper, we propose an effective and also efficient method for generating node embeddings in graphs that employs a restricted number of permutations over the immediate neighborhood of a node as context to generate its representation, thus ego-centric representations. We present a thorough evaluation showing that our method outperforms state-of-the-art methods in six different datasets related to the problems of link prediction and node classification, being one to three orders of magnitude faster than baselines when generating node embeddings for very large graphs.
Tasks	Link Prediction, Machine Translation, Node Classification, Question Answering, Representation Learning, Speech Recognition
Published	2018-01-01
URL	https://openreview.net/forum?id=SJyfrl-0b
PDF	https://openreview.net/pdf?id=SJyfrl-0b
PWC	https://paperswithcode.com/paper/fast-node-embeddings-learning-ego-centric
Repo
Framework

ECNU at SemEval-2018 Task 10: Evaluating Simple but Effective Features on Machine Learning Methods for Semantic Difference Detection


Title	ECNU at SemEval-2018 Task 10: Evaluating Simple but Effective Features on Machine Learning Methods for Semantic Difference Detection
Authors	Yunxiao Zhou, Man Lan, Yuanbin Wu
Abstract	This paper describes the system we submitted to Task 10 (Capturing Discriminative Attributes) in SemEval 2018. Given a triple (word1, word2, attribute), this task is to predict whether it exemplifies a semantic difference or not. We design and investigate several word embedding features, PMI features and WordNet features together with supervised machine learning methods to address this task. Officially released results show that our system ranks above average.
Tasks	Feature Engineering, Machine Translation, Semantic Textual Similarity, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1165/
PDF	https://www.aclweb.org/anthology/S18-1165
PWC	https://paperswithcode.com/paper/ecnu-at-semeval-2018-task-10-evaluating
Repo
Framework

Improving User Impression in Spoken Dialog System with Gradual Speech Form Control


Title	Improving User Impression in Spoken Dialog System with Gradual Speech Form Control
Authors	Yukiko Kageyama, Yuya Chiba, Takashi Nose, Akinori Ito
Abstract	This paper examines a method to improve the user impression of a spoken dialog system by introducing a mechanism that gradually changes form of utterances every time the user uses the system. In some languages, including Japanese, the form of utterances changes corresponding to social relationship between the talker and the listener. Thus, this mechanism can be effective to express the system's intention to make social distance to the user closer; however, an actual effect of this method is not investigated enough when introduced to the dialog system. In this paper, we conduct dialog experiments and show that controlling the form of system utterances can improve the users' impression.
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-5026/
PDF	https://www.aclweb.org/anthology/W18-5026
PWC	https://paperswithcode.com/paper/improving-user-impression-in-spoken-dialog
Repo
Framework

AmritaNLP at SemEval-2018 Task 10: Capturing discriminative attributes using convolution neural network over global vector representation.


Title	AmritaNLP at SemEval-2018 Task 10: Capturing discriminative attributes using convolution neural network over global vector representation.
Authors	Vivek Vinayan, Anand Kumar M, Soman K P
Abstract	The “Capturing Discriminative Attributes” shared task is the tenth task, conjoint with SemEval-2018. The task is to predict if a word can capture distinguishing attributes of one word from another. We use GloVe word embedding, pre-trained on openly sourced corpus for this task. A base representation is initially established over varied dimensions. These representations are evaluated based on validation scores over two models, first on an SVM based classifier and second on a one dimension CNN model. The scores are used to further develop the representation with vector combinations, by considering various distance measures. These measures correspond to offset vectors which are concatenated as features, mainly to improve upon the F1 score, with the best accuracy. The features are then further tuned on the validation scores, to achieve highest F1 score. Our evaluation narrowed down to two representations, classified on CNN models, having a total dimension length of 1204 & 1203 for the final submissions. Of the two, the latter feature representation delivered our best F1 score of 0.658024 (as per result).
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1166/
PDF	https://www.aclweb.org/anthology/S18-1166
PWC	https://paperswithcode.com/paper/amritanlp-at-semeval-2018-task-10-capturing
Repo
Framework

UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?


Title	UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?
Authors	Jonathan Beaulieu, Dennis Asamoah Owusu
Abstract	In this paper, we present our system for assigning an emoji to a tweet based on the text. Each tweet was originally posted with an emoji which the task providers removed. Our task was to decide, out of 20 emojis, which one originally came with the tweet. Two datasets were provided, one in English and the other in Spanish. We treated the task as a standard classification task with the emojis as our classes and the tweets as our documents. Our best performing system used a Bag of Words model with a Linear Support Vector Machine as its classifier. We achieved a macro F1 score of 32.73% for the English data and 17.98% for the Spanish data.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1061/
PDF	https://www.aclweb.org/anthology/S18-1061
PWC	https://paperswithcode.com/paper/umduluth-cs8761-at-semeval-2018-task-2-emojis
Repo
Framework
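
The macro F1 score reported in the abstract above averages per-class F1 with equal weight for every class. The following is a minimal sketch of that metric in plain Python, not the task's official scorer; the function name `macro_f1` is hypothetical, and the label set is assumed to be the union of gold and predicted labels, which may differ from how the task organizers fixed the 20 emoji classes.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))  # assumption: label set inferred from the data
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only; these are not data from the task.
gold = ["heart", "heart", "joy", "joy"]
predicted = ["heart", "joy", "joy", "joy"]
score = macro_f1(gold, predicted)
```

Because every class counts equally, rare emoji classes that a classifier never predicts pull the macro average down sharply, which is one reason macro F1 can be much lower than accuracy on skewed label distributions.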

Learning to Exploit Stability for 3D Scene Parsing


Title	Learning to Exploit Stability for 3D Scene Parsing
Authors	Yilun Du, Zhijian Liu, Hector Basevi, Ales Leonardis, Bill Freeman, Josh Tenenbaum, Jiajun Wu
Abstract	Human scene understanding uses a variety of visual and non-visual cues to perform inference on object types, poses, and relations. Physics is a rich and universal cue which we exploit to enhance scene understanding. We integrate the physical cue of stability into the learning process using a REINFORCE approach coupled to a physics engine, and apply this to the problem of producing the 3D bounding boxes and poses of objects in a scene. We first show that applying physics supervision to an existing scene understanding model increases performance, produces more stable predictions, and allows training to an equivalent performance level with fewer annotated training examples. We then present a novel architecture for 3D scene parsing named Prim R-CNN, learning to predict bounding boxes as well as their 3D size, translation, and rotation. With physics supervision, Prim R-CNN outperforms existing scene understanding approaches on this problem. Finally, we show that applying physics supervision on unlabeled real images improves real domain transfer of models trained on synthetic data.
Tasks	Scene Parsing, Scene Understanding
Published	2018-12-01
URL	http://papers.nips.cc/paper/7444-learning-to-exploit-stability-for-3d-scene-parsing
PDF	http://papers.nips.cc/paper/7444-learning-to-exploit-stability-for-3d-scene-parsing.pdf
PWC	https://paperswithcode.com/paper/learning-to-exploit-stability-for-3d-scene
Repo
Framework

UMD at SemEval-2018 Task 10: Can Word Embeddings Capture Discriminative Attributes?


Title	UMD at SemEval-2018 Task 10: Can Word Embeddings Capture Discriminative Attributes?
Authors	Alexander Zhang, Marine Carpuat
Abstract	We describe the University of Maryland's submission to SemEval-2018 Task 10, “Capturing Discriminative Attributes”: given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2. Our study aims to determine whether word embeddings can address this challenging task. Our submission casts this problem as supervised binary classification using only word embedding features. Using a Gaussian SVM model trained only on validation data results in an F-score of 60%. We also show that cosine similarity features are more effective, both in unsupervised systems (F-score of 65%) and supervised systems (F-score of 67%).
Tasks	Semantic Textual Similarity, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1170/
PDF	https://www.aclweb.org/anthology/S18-1170
PWC	https://paperswithcode.com/paper/umd-at-semeval-2018-task-10-can-word
Repo
Framework
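
The abstract above mentions cosine similarity features over triples (w1, w2, d). One way to picture an unsupervised use of such features is the sketch below. It is an illustration under stated assumptions, not the authors' system: the decision rule (compare cos(w1, d) with cos(w2, d) against a threshold), the helper names `cosine` and `is_discriminative`, the `threshold` parameter, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_discriminative(w1, w2, d, threshold=0.0):
    """Assumed rule: d discriminates w1 from w2 if it is closer to w1 by more than threshold."""
    return cosine(w1, d) - cosine(w2, d) > threshold


# Toy 3-dimensional "embeddings" for illustration only; real systems use pretrained vectors.
banana = [0.9, 0.1, 0.0]
apple = [0.1, 0.9, 0.0]
yellow = [1.0, 0.0, 0.0]
result = is_discriminative(banana, apple, yellow)
```

In a supervised setting, the two similarities (or their difference) would instead be passed as input features to a classifier, and the threshold would be learned rather than fixed by hand.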

NTU NLP Lab System at SemEval-2018 Task 10: Verifying Semantic Differences by Integrating Distributional Information and Expert Knowledge


Title	NTU NLP Lab System at SemEval-2018 Task 10: Verifying Semantic Differences by Integrating Distributional Information and Expert Knowledge
Authors	Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen
Abstract	This paper presents the NTU NLP Lab system for the SemEval-2018 Capturing Discriminative Attributes task. Word embeddings, pointwise mutual information (PMI), ConceptNet edges and shortest path lengths are utilized as input features to build binary classifiers to tell whether an attribute is discriminative for a pair of concepts. Our neural network model reaches about 73% F1 score on the test set and ranks 3rd in the task. Though the attributes to deal with in this task are all visual, our models are not provided with any image data. The results indicate that visual information can be derived from textual data.
Tasks	Machine Translation, Semantic Textual Similarity, Sentiment Analysis, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1171/
PDF	https://www.aclweb.org/anthology/S18-1171
PWC	https://paperswithcode.com/paper/ntu-nlp-lab-system-at-semeval-2018-task-10
Repo
Framework

CSReader at SemEval-2018 Task 11: Multiple Choice Question Answering as Textual Entailment


Title	CSReader at SemEval-2018 Task 11: Multiple Choice Question Answering as Textual Entailment
Authors	Zhengping Jiang, Qi Sun
Abstract	In this document we present an end-to-end machine reading comprehension system that solves multiple choice questions with a textual entailment perspective. Since some of the knowledge required is not explicitly mentioned in the text, we try to exploit commonsense knowledge by using pretrained word embeddings during contextual embeddings and by dynamically generating a weighted representation of related script knowledge. In the model two kinds of prediction structures are ensembled, and the final accuracy of our system is 10 percent higher than the naive baseline.
Tasks	Common Sense Reasoning, Language Modelling, Machine Reading Comprehension, Natural Language Inference, Question Answering, Reading Comprehension, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1176/
PDF	https://www.aclweb.org/anthology/S18-1176
PWC	https://paperswithcode.com/paper/csreader-at-semeval-2018-task-11-multiple
Repo
Framework

ECNU at SemEval-2018 Task 11: Using Deep Learning Method to Address Machine Comprehension Task


Title	ECNU at SemEval-2018 Task 11: Using Deep Learning Method to Address Machine Comprehension Task
Authors	Yixuan Sheng, Man Lan, Yuanbin Wu
Abstract	This paper describes the system we submitted to Task 11 in SemEval 2018, i.e., Machine Comprehension using Commonsense Knowledge. Given a passage and some questions that each have two candidate answers, this task requires the participating system to select, from the candidate answers, the one that matches the meaning of the original text or commonsense knowledge. For this task, we use a deep learning method to obtain the final predicted answer by calculating the relevance between the choice representations and the question-aware document representation.
Tasks	Machine Reading Comprehension, Reading Comprehension
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1175/
PDF	https://www.aclweb.org/anthology/S18-1175
PWC	https://paperswithcode.com/paper/ecnu-at-semeval-2018-task-11-using-deep
Repo
Framework

YNU-HPCC at Semeval-2018 Task 11: Using an Attention-based CNN-LSTM for Machine Comprehension using Commonsense Knowledge


Title	YNU-HPCC at Semeval-2018 Task 11: Using an Attention-based CNN-LSTM for Machine Comprehension using Commonsense Knowledge
Authors	Hang Yuan, Jin Wang, Xuejie Zhang
Abstract	This shared task is a typical question answering task. Compared with a normal question answering system, it needs to give the answer to the question based on the text provided. The essence of the problem is actually reading comprehension. Typically, there are several questions for each text that correspond to it. And for each question, there are two candidate answers (and only one of them is correct). To solve this problem, the usual approach is to use convolutional neural networks (CNN) and recurrent neural networks (RNN) or their improved models (such as long short-term memory (LSTM)). In this paper, an attention-based CNN-LSTM model is proposed for this task. By adding an attention mechanism and combining the two models, the experimental results have been significantly improved.
Tasks	Question Answering, Reading Comprehension
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1177/
PDF	https://www.aclweb.org/anthology/S18-1177
PWC	https://paperswithcode.com/paper/ynu-hpcc-at-semeval-2018-task-11-using-an
Repo
Framework

Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation


Title	Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation
Authors	Zhen Zhang, Mianzhi Wang, Yan Huang, Arye Nehorai
Abstract	Domain shift, which occurs when there is a mismatch between the distributions of training (source) and testing (target) datasets, usually results in poor performance of the trained model on the target domain. Existing algorithms typically solve this issue by reducing the distribution discrepancy in the input spaces. However, for kernel-based learning machines, performance highly depends on the statistical properties of data in reproducing kernel Hilbert spaces (RKHS). Motivated by these considerations, we propose a novel strategy for matching distributions in RKHS, which is done by aligning the RKHS covariance matrices (descriptors) across domains. This strategy is a generalization of the correlation alignment problem in Euclidean spaces to (potentially) infinite-dimensional feature spaces. In this paper, we provide two alignment approaches, for both of which we obtain closed-form expressions via kernel matrices. Furthermore, our approaches are scalable to large datasets since they can naturally handle out-of-sample instances. We conduct extensive experiments (248 domain adaptation tasks) to evaluate our approaches. Experimental results show that our approaches outperform other state-of-the-art methods in both accuracy and computational efficiency.
Tasks	Domain Adaptation
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Aligning_Infinite-Dimensional_Covariance_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Aligning_Infinite-Dimensional_Covariance_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/aligning-infinite-dimensional-covariance
Repo
Framework

BLCU_NLP at SemEval-2018 Task 12: An Ensemble Model for Argument Reasoning Based on Hierarchical Attention


Title	BLCU_NLP at SemEval-2018 Task 12: An Ensemble Model for Argument Reasoning Based on Hierarchical Attention
Authors	Meiqian Zhao, Chunhua Liu, Lu Liu, Yan Zhao, Dong Yu
Abstract	To comprehend an argument and fill the gap between claims and reasons, it is vital to find the implicit supporting warrants behind. In this paper, we propose a hierarchical attention model to identify the right warrant which explains why the reason stands for the claim. Our model focuses not only on the similar part between warrants and other information but also on the contradictory part between two opposing warrants. In addition, we use the ensemble method for different models. Our model achieves an accuracy of 61%, ranking second in this task. Experimental results demonstrate that our model is effective to make correct choices.
Tasks	Common Sense Reasoning, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1186/
PDF	https://www.aclweb.org/anthology/S18-1186
PWC	https://paperswithcode.com/paper/blcu_nlp-at-semeval-2018-task-12-an-ensemble
Repo
Framework