October 15, 2019


Paper Group NANR 59

Designing a Russian Idiom-Annotated Corpus. Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER. Can Domain Adaptation be Handled as Analogies?. Efficient Projection onto the Perfect Phylogeny Model. A Deep Predictive Coding Network for Learning Latent Representations. DMCB at SemEval-2018 Task 1: Transfer Learnin …

Designing a Russian Idiom-Annotated Corpus


Title	Designing a Russian Idiom-Annotated Corpus
Authors	Katsiaryna Aharodnik, Anna Feldman, Jing Peng
Abstract
Tasks	Machine Translation, Word Embeddings
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1402/
PDF	https://www.aclweb.org/anthology/L18-1402
PWC	https://paperswithcode.com/paper/designing-a-russian-idiom-annotated-corpus
Repo
Framework

Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER


Title	Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER
Authors	Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya
Abstract	Multilingual learning for Neural Named Entity Recognition (NNER) involves jointly training a neural network for multiple languages. Typically, the goal is improving the NER performance of one of the languages (the primary language) using the other assisting languages. We show that the divergence in the tag distributions of the common named entities between the primary and assisting languages can reduce the effectiveness of multilingual learning. To alleviate this problem, we propose a metric based on symmetric KL divergence to filter out the highly divergent training instances in the assisting language. We empirically show that our data selection strategy improves NER performance in many languages, including those with very limited training data.
Tasks	Domain Adaptation, Machine Translation, Named Entity Recognition
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-2064/
PDF	https://www.aclweb.org/anthology/P18-2064
PWC	https://paperswithcode.com/paper/judicious-selection-of-training-data-in
Repo
Framework
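The abstract above refers to a metric based on symmetric KL divergence between tag distributions. The following is a minimal sketch of that quantity only, not the authors' implementation. It assumes each tag distribution is a dictionary mapping a tag to its probability; the smoothing constant `eps`, the helper `select_entities`, and its `threshold` parameter are hypothetical, and filtering by entity rather than by training instance is a simplification made for illustration.

```python
import math


def symmetric_kl(p, q, eps=1e-9):
    """Return KL(p || q) + KL(q || p) for two tag distributions.

    p and q map tag names to probabilities. eps is an assumed smoothing
    constant that keeps the logarithm finite when a tag is absent on one side.
    """
    tags = set(p) | set(q)
    ps = {t: p.get(t, 0.0) + eps for t in tags}
    qs = {t: q.get(t, 0.0) + eps for t in tags}
    zp, zq = sum(ps.values()), sum(qs.values())
    total = 0.0
    for t in tags:
        a, b = ps[t] / zp, qs[t] / zq
        # a*log(a/b) + b*log(b/a) equals (a - b) * log(a / b), which is >= 0.
        total += (a - b) * math.log(a / b)
    return total


def select_entities(primary, assisting, threshold):
    """Hypothetical filter: keep common entities with low divergence.

    primary and assisting map an entity string to its tag distribution.
    """
    common = set(primary) & set(assisting)
    return sorted(
        e for e in common
        if symmetric_kl(primary[e], assisting[e]) <= threshold
    )
```

Entities whose tag distributions agree across the two languages score close to zero, so a small threshold keeps them and drops entities that are tagged very differently.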

Can Domain Adaptation be Handled as Analogies?


Title	Can Domain Adaptation be Handled as Analogies?
Authors	Núria Bel, Joel Pocostales
Abstract
Tasks	Aspect-Based Sentiment Analysis, Document Classification, Domain Adaptation, Sentiment Analysis, Text Classification, Word Embeddings
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1406/
PDF	https://www.aclweb.org/anthology/L18-1406
PWC	https://paperswithcode.com/paper/can-domain-adaptation-be-handled-as-analogies
Repo
Framework

Efficient Projection onto the Perfect Phylogeny Model


Title	Efficient Projection onto the Perfect Phylogeny Model
Authors	Bei Jia, Surjyendu Ray, Sam Safavi, José Bento
Abstract	Several algorithms build on the perfect phylogeny model to infer evolutionary trees. This problem is particularly hard when evolutionary trees are inferred from the fraction of genomes that have mutations in different positions, across different samples. Existing algorithms might do extensive searches over the space of possible trees. At the center of these algorithms is a projection problem that assigns a fitness cost to phylogenetic trees. In order to perform a wide search over the space of the trees, it is critical to solve this projection problem fast. In this paper, we use Moreau’s decomposition for proximal operators, and a tree reduction scheme, to develop a new algorithm to compute this projection. Our algorithm terminates with an exact solution in a finite number of steps, and is extremely fast. In particular, it can search over all evolutionary trees with fewer than 11 nodes, a size relevant for several biological problems (more than 2 billion trees) in about 2 hours.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7665-efficient-projection-onto-the-perfect-phylogeny-model
PDF	http://papers.nips.cc/paper/7665-efficient-projection-onto-the-perfect-phylogeny-model.pdf
PWC	https://paperswithcode.com/paper/efficient-projection-onto-the-perfect
Repo
Framework
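The abstract above relies on Moreau's decomposition for proximal operators. As a toy illustration of the identity itself, and not of the paper's projection onto the perfect phylogeny model, the sketch below uses the L1 norm: the proximal operator of `lam * ||.||_1` is soft-thresholding, its conjugate counterpart is projection onto the L-infinity ball of radius `lam`, and the two pieces sum back to the input. All function names here are hypothetical.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * ||.||_1, applied elementwise."""
    return [max(abs(v) - lam, 0.0) * (1.0 if v > 0 else -1.0) for v in x]


def project_linf_ball(x, lam):
    """Euclidean projection onto {y : max_i |y_i| <= lam}."""
    return [min(max(v, -lam), lam) for v in x]


def moreau_recombine(x, lam):
    """Sum of the two parts of Moreau's decomposition.

    For f = lam * ||.||_1 the decomposition reads
        x = prox_f(x) + proj_{L-inf ball of radius lam}(x),
    so this function should return x up to rounding.
    """
    shrunk = soft_threshold(x, lam)
    clipped = project_linf_ball(x, lam)
    return [a + b for a, b in zip(shrunk, clipped)]
```

The practical point of the identity is that whichever of the two operators is cheaper to compute gives the other one for free by subtraction.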

A Deep Predictive Coding Network for Learning Latent Representations


Title	A Deep Predictive Coding Network for Learning Latent Representations
Authors	Shirin Dora, Cyriel Pennartz, Sander Bohte
Abstract	It has been argued that the brain is a prediction machine that continuously learns how to make better predictions about the stimuli received from the external environment. For this purpose, it builds a model of the world around us and uses this model to infer the external stimulus. Predictive coding has been proposed as a mechanism through which the brain might be able to build such a model of the external environment. However, it is not clear how predictive coding can be used to build deep neural network models of the brain while complying with the architectural constraints imposed by the brain. In this paper, we describe an algorithm to build a deep generative model using predictive coding that can be used to infer latent representations about the stimuli received from the external environment. Specifically, we used predictive coding to train a deep neural network on real-world images in an unsupervised learning paradigm. To understand the capacity of the network with regard to modeling the external environment, we studied the latent representations generated by the model on images of objects that are never presented to the model during training. Despite the novel features of these objects, the model is able to infer the latent representations for them. Furthermore, the reconstructions of the original images obtained from these latent representations preserve the important details of these objects.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=Hy8hkYeRb
PDF	https://openreview.net/pdf?id=Hy8hkYeRb
PWC	https://paperswithcode.com/paper/a-deep-predictive-coding-network-for-learning
Repo
Framework

DMCB at SemEval-2018 Task 1: Transfer Learning of Sentiment Classification Using Group LSTM for Emotion Intensity prediction


Title	DMCB at SemEval-2018 Task 1: Transfer Learning of Sentiment Classification Using Group LSTM for Emotion Intensity prediction
Authors	Youngmin Kim, Hyunju Lee
Abstract	This paper describes a system that participated in SemEval-2018 Task 1 "Affect in tweets" and predicts emotional intensities. We use Group LSTM with an attention model and transfer learning with sentiment classification data as the source data (SemEval 2017 Task 4a). The transfer model structure consists of a source domain and a target domain. Additionally, we try a new dropout that is applied to LSTMs in the Group LSTM. Our system ranked 8th in subtask 1a (emotion intensity regression). We also show various results with different architectures in the source, target and transfer models.
Tasks	Sentiment Analysis, Transfer Learning, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1044/
PDF	https://www.aclweb.org/anthology/S18-1044
PWC	https://paperswithcode.com/paper/dmcb-at-semeval-2018-task-1-transfer-learning
Repo
Framework

DialCrowd: A toolkit for easy dialog system assessment


Title	DialCrowd: A toolkit for easy dialog system assessment
Authors	Kyusong Lee, Tiancheng Zhao, Alan W. Black, Maxine Eskenazi
Abstract	When creating a dialog system, developers need to test each version to ensure that it is performing correctly. Recently the trend has been to test on large datasets or to ask many users to try out a system. Crowdsourcing has solved the issue of finding users, but it presents new challenges such as how to use a crowdsourcing platform and what type of test is appropriate. DialCrowd has been designed to make system assessment easier and to ensure the quality of the result. This paper describes DialCrowd, what specific needs it fulfills and how it works. It then relates a test of DialCrowd by a group of dialog system developers.
Tasks	Chatbot
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-5028/
PDF	https://www.aclweb.org/anthology/W18-5028
PWC	https://paperswithcode.com/paper/dialcrowd-a-toolkit-for-easy-dialog-system
Repo
Framework

Identification of Parallel Sentences in Comparable Monolingual Corpora from Different Registers


Title	Identification of Parallel Sentences in Comparable Monolingual Corpora from Different Registers
Authors	Rémi Cardon, Natalia Grabar
Abstract	Parallel aligned sentences provide useful information for different NLP applications. Yet, this kind of data is seldom available, especially for languages other than English. We propose to exploit comparable corpora in French which are distinguished by their registers (specialized and simplified versions) to detect and align parallel sentences. These corpora are related to the biomedical area. Our purpose is to state whether a given pair of specialized and simplified sentences is to be aligned or not. Manually created reference data show 0.76 inter-annotator agreement. We exploit a set of features and several automatic classifiers. The automatic alignment reaches up to 0.93 Precision, Recall and F-measure. In order to better evaluate the method, it is applied to data in English from the SemEval STS competitions. The same features and models are applied in monolingual and cross-lingual contexts, in which they show up to 0.90 and 0.73 F-measure, respectively.
Tasks	Information Retrieval, Machine Translation, Text Simplification
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5610/
PDF	https://www.aclweb.org/anthology/W18-5610
PWC	https://paperswithcode.com/paper/identification-of-parallel-sentences-in
Repo
Framework

A Biresolution Spectral Framework for Product Quantization


Title	A Biresolution Spectral Framework for Product Quantization
Authors	Lopamudra Mukherjee, Sathya N. Ravi, Jiming Peng, Vikas Singh
Abstract	Product quantization (PQ) (and its variants) has been effectively used to encode high-dimensional data into compact codes for many problems in vision. In principle, PQ decomposes the given data into a number of lower-dimensional subspaces where the quantization proceeds independently for each subspace. While the original PQ approach does not explicitly optimize for these subspaces, later proposals have argued that the performance tends to benefit significantly if such subspaces are chosen in an optimal manner. Despite such consensus, existing approaches in the literature diverge in terms of which specific properties of these subspaces are desirable and how one should proceed to solve/optimize them. Nonetheless, despite the empirical support, there is less clarity regarding the theoretical properties that underlie these experimental benefits for quantization problems in general. In this paper, we study the quantization problem in the setting where subspaces are orthogonal and show that this problem is intricately related to a specific type of spectral decomposition of the data. This insight not only opens the door to a rich body of work in spectral analysis, but also leads to distinct computational benefits. Our resultant biresolution spectral formulation captures both the subspace projection error as well as the quantization error within the same framework. After a reformulation, the core steps of our algorithm involve a simple eigen decomposition step, which can be solved efficiently. We show that our method performs very favorably against a number of state-of-the-art methods on standard data sets.
Tasks	Quantization
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Mukherjee_A_Biresolution_Spectral_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Mukherjee_A_Biresolution_Spectral_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/a-biresolution-spectral-framework-for-product
Repo
Framework
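The abstract above describes product quantization as splitting the data into lower-dimensional subspaces and quantizing each one independently. The sketch below shows only that basic encode and decode step, under the assumption that the per-subspace codebooks are already given; it does not implement the paper's biresolution spectral method. The function names and the codebook layout (a list of codebooks, each a list of codewords) are hypothetical.

```python
def pq_encode(x, codebooks):
    """Encode vector x as one codeword index per subspace.

    codebooks[j] is the list of codewords for subspace j. The vector is
    split into len(codebooks) contiguous chunks of equal length.
    """
    m = len(codebooks)
    if len(x) % m != 0:
        raise ValueError("vector length must be divisible by the number of subspaces")
    d = len(x) // m
    codes = []
    for j, book in enumerate(codebooks):
        sub = x[j * d:(j + 1) * d]
        # Nearest codeword by squared Euclidean distance within this subspace.
        best = min(
            range(len(book)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(sub, book[k])),
        )
        codes.append(best)
    return codes


def pq_decode(codes, codebooks):
    """Reconstruct an approximation of x by concatenating chosen codewords."""
    out = []
    for k, book in zip(codes, codebooks):
        out.extend(book[k])
    return out
```

With `m` subspaces and `k` codewords each, a vector is stored as `m` small integers while the scheme can still represent `k ** m` distinct reconstructions.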

Risamálheild: A Very Large Icelandic Text Corpus


Title	Risamálheild: A Very Large Icelandic Text Corpus
Authors	Steinþór Steingrímsson, Sigrún Helgadóttir, Eiríkur Rögnvaldsson, Starkaður Barkarson, Jón Guðnason
Abstract
Tasks	Machine Translation
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1690/
PDF	https://www.aclweb.org/anthology/L18-1690
PWC	https://paperswithcode.com/paper/risamalheild-a-very-large-icelandic-text
Repo
Framework


Fine-Grained Emotion Detection in Health-Related Online Posts


Title	Fine-Grained Emotion Detection in Health-Related Online Posts
Authors	Hamed Khanpour, Cornelia Caragea
Abstract	Detecting fine-grained emotions in online health communities provides insightful information about patients' emotional states. However, current computational approaches to emotion detection from health-related posts focus only on identifying messages that contain emotions, with no emphasis on the emotion type, using a set of handcrafted features. In this paper, we take a step further and propose to detect fine-grained emotion types from health-related posts and show how high-level and abstract features derived from deep neural networks combined with lexicon-based features can be employed to detect emotions.
Tasks	Emotion Recognition
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1147/
PDF	https://www.aclweb.org/anthology/D18-1147
PWC	https://paperswithcode.com/paper/fine-grained-emotion-detection-in-health
Repo
Framework

Crowdsourced Corpus of Sentence Simplification with Core Vocabulary


Title	Crowdsourced Corpus of Sentence Simplification with Core Vocabulary
Authors	Akihiro Katsuta, Kazuhide Yamamoto
Abstract
Tasks	Machine Translation, Text Simplification
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1072/
PDF	https://www.aclweb.org/anthology/L18-1072
PWC	https://paperswithcode.com/paper/crowdsourced-corpus-of-sentence
Repo
Framework

A Framework for Multi-Language Service Design with the Language Grid


Title	A Framework for Multi-Language Service Design with the Language Grid
Authors	Donghui Lin, Yohei Murakami, Toru Ishida
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1518/
PDF	https://www.aclweb.org/anthology/L18-1518
PWC	https://paperswithcode.com/paper/a-framework-for-multi-language-service-design
Repo
Framework

使用長短期記憶類神經網路建構中文語音辨識器之研究 (A study on Mandarin speech recognition using Long Short-Term Memory neural network) [In Chinese]


Title	使用長短期記憶類神經網路建構中文語音辨識器之研究 (A study on Mandarin speech recognition using Long Short-Term Memory neural network) [In Chinese]
Authors	Chien-hung Lai, Yih-Ru Wang
Abstract
Tasks	Speech Recognition
Published	2018-10-01
URL	https://www.aclweb.org/anthology/O18-1011/
PDF	https://www.aclweb.org/anthology/O18-1011
PWC	https://paperswithcode.com/paper/a12c-ece-eccc2e-aoa-eae3e34-ea-a1c-c-a-study
Repo
Framework

台語古詩朗誦系統 (A Taiwanese Text-to-Speech System for Ancient Poems) [In Chinese]


Title	台語古詩朗誦系統 (A Taiwanese Text-to-Speech System for Ancient Poems) [In Chinese]
Authors	Yu-Lin Tsai, Chao-Hsiang Huang, Chuan-Jie Lin
Abstract
Tasks
Published	2018-10-01
URL	https://www.aclweb.org/anthology/O18-1019/
PDF	https://www.aclweb.org/anthology/O18-1019
PWC	https://paperswithcode.com/paper/aeaaeeac3ca-taiwanese-text-to-speech-system
Repo
Framework