Paper Group NANR 178
Generalizing Graph Matching beyond Quadratic Assignment Model
Title | Generalizing Graph Matching beyond Quadratic Assignment Model |
Authors | Tianshu Yu, Junchi Yan, Yilin Wang, Wei Liu, Baoxin Li |
Abstract | Graph matching, which can be formulated as a quadratic assignment problem (QAP), has received persistent attention over decades. We show that a large family of functions, which we define as Separable Functions, can approximate discrete graph matching in the continuous domain asymptotically by varying the approximation-controlling parameters. We also study the properties of global optimality and devise convex/concave-preserving extensions to the widely used Lawler’s QAP form. Our theoretical findings show the potential for deriving new algorithms and techniques for graph matching. We deliver solvers based on two specific instances of Separable Functions, and the state-of-the-art performance of our method is verified on popular benchmarks. |
Tasks | Graph Matching |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7365-generalizing-graph-matching-beyond-quadratic-assignment-model |
http://papers.nips.cc/paper/7365-generalizing-graph-matching-beyond-quadratic-assignment-model.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-graph-matching-beyond-quadratic |
Repo | |
Framework | |
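For reference, the Lawler's QAP form that the abstract builds on can be written as below; the notation (affinity matrix K, assignment matrix X) is the standard one and is assumed here rather than quoted from the paper.

```latex
% Standard Lawler's QAP form of graph matching (notation assumed, not quoted
% from the paper): K is the n1*n2 x n1*n2 affinity matrix, X the assignment
% matrix, and vec(X) its column-wise vectorization.
\max_{X}\ \operatorname{vec}(X)^{\top} K \operatorname{vec}(X)
\quad \text{s.t.} \quad
X \in \{0,1\}^{n_1 \times n_2},\quad
X\,\mathbf{1}_{n_2} = \mathbf{1}_{n_1},\quad
X^{\top}\mathbf{1}_{n_1} \le \mathbf{1}_{n_2}.
```

The Separable Functions introduced in the paper are used to approximate this discrete problem in the continuous domain; the exact relaxation is given in the paper itself.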
Design of a Tigrinya Language Speech Corpus for Speech Recognition
Title | Design of a Tigrinya Language Speech Corpus for Speech Recognition |
Authors | Hafte Abera, Sebsibe H/Mariam |
Abstract | In this paper, we describe the first Tigrinya-language speech corpus designed and developed for speech recognition purposes. Tigrinya, often written as Tigrigna (ትግርኛ) /tɪˈɡrinjə/, belongs to the Semitic branch of the Afro-Asiatic languages and shows the characteristic features of a Semitic language. It is spoken by the ethnic Tigray-Tigrigna people in the Horn of Africa. The paper outlines the corpus design process and analyses related work on speech corpus creation for other languages. The authors also describe the procedures used to create the speech recognition corpus for Tigrinya, an under-resourced language. One hundred and thirty native speakers of Tigrinya were recorded for the training and test datasets. Each speaker read 100 texts consisting of syllabically rich and balanced sentences. Ten thousand sentences were used for the prompt sheets; these sentences contained all of the contextual syllables and phones. |
Tasks | Speech Recognition |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3811/ |
https://www.aclweb.org/anthology/W18-3811 | |
PWC | https://paperswithcode.com/paper/design-of-a-tigrinya-language-speech-corpus |
Repo | |
Framework | |
Analysis of Implicit Conditions in Database Search Dialogues
Title | Analysis of Implicit Conditions in Database Search Dialogues |
Authors | Shun-ya Fukunaga, Hitoshi Nishikawa, Takenobu Tokunaga, Hikaru Yokono, Tetsuro Takahashi |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1434/ |
https://www.aclweb.org/anthology/L18-1434 | |
PWC | https://paperswithcode.com/paper/analysis-of-implicit-conditions-in-database |
Repo | |
Framework | |
Neural Network Methods for Natural Language Processing by Yoav Goldberg
Title | Neural Network Methods for Natural Language Processing by Yoav Goldberg |
Authors | Yang Liu, Meng Zhang |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/J18-1008/ |
https://www.aclweb.org/anthology/J18-1008 | |
PWC | https://paperswithcode.com/paper/neural-network-methods-for-natural-language |
Repo | |
Framework | |
Festina Lente: A Farewell from the Editor
Title | Festina Lente: A Farewell from the Editor |
Authors | Paola Merlo |
Abstract | |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/J18-2007/ |
https://www.aclweb.org/anthology/J18-2007 | |
PWC | https://paperswithcode.com/paper/festina-lente-a-farewell-from-the-editor |
Repo | |
Framework | |
Identifying Opinion-Topics and Polarity of Parliamentary Debate Motions
Title | Identifying Opinion-Topics and Polarity of Parliamentary Debate Motions |
Authors | Gavin Abercrombie, Riza Theresa Batista-Navarro |
Abstract | Analysis of the topics mentioned and opinions expressed in parliamentary debate motions – or proposals – is difficult for human readers, but necessary for understanding and automatic processing of the content of the subsequent speeches. We present a dataset of debate motions with pre-existing 'policy' labels, and investigate the utility of these labels for simultaneous topic and opinion polarity analysis. For topic detection, we apply one-versus-the-rest supervised topic classification, finding that good performance is achieved in predicting the policy topics, and that textual features derived from the debate titles associated with the motions are particularly indicative of motion topic. We then examine whether the output could also be used to determine the positions taken by proposers towards the different policies by investigating how well humans agree in interpreting the opinion polarities of the motions. Finding very high levels of agreement, we conclude that the policies used can be reliable labels for use in these tasks, and that successful topic detection can therefore provide opinion analysis of the motions 'for free'. |
Tasks | Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6241/ |
https://www.aclweb.org/anthology/W18-6241 | |
PWC | https://paperswithcode.com/paper/identifying-opinion-topics-and-polarity-of |
Repo | |
Framework | |
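A minimal sketch of one-versus-the-rest topic classification over debate-motion text, in the spirit of the abstract above. The TF-IDF features, linear SVM, and toy motion titles are assumptions; the paper's actual feature set and classifier may differ.

```python
# Hedged sketch: one-vs-the-rest policy-topic classification from motion text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for motion titles and their policy labels (hypothetical data).
titles = [
    "That this House supports an increase in the minimum wage",
    "That this House believes the railways should be renationalised",
    "That this House opposes further airport expansion",
]
policies = ["employment", "transport", "environment"]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    OneVsRestClassifier(LinearSVC()),
)
clf.fit(titles, policies)
print(clf.predict(["That this House supports high-speed rail investment"]))
```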
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Title | Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning |
Authors | Chen Shi, Qi Chen, Lei Sha, Sujian Li, Xu Sun, Houfeng Wang, Lintao Zhang |
Abstract | The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and low in coverage. In this paper, we instead propose our framework, auto-dialabel, to automatically cluster dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can greatly reduce human labeling cost, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results. |
Tasks | Active Learning |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1072/ |
https://www.aclweb.org/anthology/D18-1072 | |
PWC | https://paperswithcode.com/paper/auto-dialabel-labeling-dialogue-data-with |
Repo | |
Framework | |
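A hedged sketch of the clustering stage described in the abstract: utterance feature vectors (here random stand-ins for the autoencoder-assembled context features mentioned in the paper) are grouped by hierarchical clustering with a distance threshold instead of a fixed cluster count. The threshold value and feature dimensionality are assumptions, not values from the paper, and the paper's dynamic thresholding is more involved than a single fixed cut-off.

```python
# Hedged sketch: threshold-based hierarchical clustering of utterance features.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 32))  # placeholder for assembled context features

clustering = AgglomerativeClustering(
    n_clusters=None,            # let the threshold decide how many intents emerge
    distance_threshold=8.0,     # assumed cut-off; the paper adapts this dynamically
    linkage="ward",
)
intent_ids = clustering.fit_predict(features)
print("number of induced intent clusters:", intent_ids.max() + 1)
```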
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Title | Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018) |
Authors | |
Abstract | |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3900/ |
https://www.aclweb.org/anthology/W18-3900 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-fifth-workshop-on-nlp-for |
Repo | |
Framework | |
Supervised and Unsupervised Minimalist Quality Estimators: Vicomtech’s Participation in the WMT 2018 Quality Estimation Task
Title | Supervised and Unsupervised Minimalist Quality Estimators: Vicomtech’s Participation in the WMT 2018 Quality Estimation Task |
Authors | Thierry Etchegoyhen, Eva Martínez Garcia, Andoni Azpeitia |
Abstract | We describe Vicomtech's participation in the WMT 2018 shared task on quality estimation, for which we submitted minimalist quality estimators. The core of our approach is based on two simple features: lexical translation overlaps and language model cross-entropy scores. These features are exploited in two system variants: uMQE is an unsupervised system, where the final quality score is obtained by averaging individual feature scores; sMQE is a supervised variant, where the final score is estimated by a Support Vector Regressor trained on the available annotated datasets. The main goal of our minimalist approach to quality estimation is to provide reliable estimators that require minimal deployment effort, few resources, and, in the case of uMQE, do not depend on costly data annotation or post-editing. Our approach was applied to all language pairs in sentence quality estimation, obtaining competitive results across the board. |
Tasks | Language Modelling, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6461/ |
https://www.aclweb.org/anthology/W18-6461 | |
PWC | https://paperswithcode.com/paper/supervised-and-unsupervised-minimalist |
Repo | |
Framework | |
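A minimal sketch of the two estimator variants described above. The two features (lexical translation overlap and language-model cross-entropy) are named in the abstract, but how they are computed and scaled here is an assumption, as is the use of default SVR hyperparameters.

```python
# Hedged sketch: unsupervised (uMQE) and supervised (sMQE) minimalist estimators.
import numpy as np
from sklearn.svm import SVR

def umqe(overlap_scores, lm_xent_scores):
    """Unsupervised variant: average the (already normalised) feature scores."""
    feats = np.column_stack([overlap_scores, lm_xent_scores])
    return feats.mean(axis=1)

def train_smqe(overlap_scores, lm_xent_scores, gold_quality):
    """Supervised variant: fit a Support Vector Regressor on the same features."""
    feats = np.column_stack([overlap_scores, lm_xent_scores])
    return SVR().fit(feats, gold_quality)

# Toy usage with made-up feature values and gold quality labels.
overlap = np.array([0.8, 0.3, 0.6])
xent = np.array([0.7, 0.2, 0.5])   # assumed rescaled so that higher = better
gold = np.array([0.75, 0.30, 0.55])
print(umqe(overlap, xent))
print(train_smqe(overlap, xent, gold).predict(np.column_stack([overlap, xent])))
```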
Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation
Title | Discourse-Related Language Contrasts in English-Croatian Human and Machine Translation |
Authors | Margita Šoštarić, Christian Hardmeier, Sara Stymne |
Abstract | We present an analysis of a number of coreference phenomena in English-Croatian human and machine translations. The aim is to shed light on the differences in the way these structurally different languages make use of discourse information and provide insights for discourse-aware machine translation system development. The phenomena are automatically identified in parallel data using annotation produced by parsers and word alignment tools, enabling us to pinpoint patterns of interest in both languages. We make the analysis more fine-grained by including three corpora pertaining to three different registers. In a second step, we create a test set with the challenging linguistic constructions and use it to evaluate the performance of three MT systems. We show that both SMT and NMT systems struggle with handling these discourse phenomena, even though NMT tends to perform somewhat better than SMT. By providing an overview of patterns frequently occurring in actual language use, as well as by pointing out the weaknesses of current MT systems that commonly mistranslate them, we hope to contribute to the effort of resolving the issue of discourse phenomena in MT applications. |
Tasks | Machine Translation, Word Alignment |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6305/ |
https://www.aclweb.org/anthology/W18-6305 | |
PWC | https://paperswithcode.com/paper/discourse-related-language-contrasts-in |
Repo | |
Framework | |
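A hedged illustration (not the authors' code) of how word alignments can be used to pinpoint discourse-related correspondences of the kind described in the abstract, here checking which target token an English pronoun aligns to. The sentence pair, alignment, and pronoun list are made up.

```python
# Hedged sketch: extracting pronoun correspondences from word-aligned parallel data.
en_tokens = ["The", "report", "was", "published", "because", "it", "was", "ready"]
hr_tokens = ["Izvještaj", "je", "objavljen", "jer", "je", "bio", "gotov"]
# (source index, target index) pairs, e.g. as produced by a word aligner.
alignment = [(1, 0), (3, 2), (4, 3), (7, 6)]

PRONOUNS = {"it", "this", "that"}
aligned_targets = {s: t for s, t in alignment}

for i, tok in enumerate(en_tokens):
    if tok.lower() in PRONOUNS:
        target = hr_tokens[aligned_targets[i]] if i in aligned_targets else None
        print(f"EN pronoun '{tok}' -> HR {target!r} (None = dropped/unaligned)")
```

In this toy pair the pronoun "it" has no aligned target token, which is exactly the kind of pro-drop contrast between English and Croatian that such an extraction step can surface.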
Please Clap: Modeling Applause in Campaign Speeches
Title | Please Clap: Modeling Applause in Campaign Speeches |
Authors | Jon Gillick, David Bamman |
Abstract | This work examines the rhetorical techniques that speakers employ during political campaigns. We introduce a new corpus of speeches from campaign events in the months leading up to the 2016 U.S. presidential election and develop new models for predicting moments of audience applause. In contrast to existing datasets, we tackle the challenge of working with transcripts that derive from uncorrected closed captioning, using associated audio recordings to automatically extract and align labels for instances of audience applause. In prediction experiments, we find that lexical features carry the most information, but that a variety of features are predictive, including prosody, long-term contextual dependencies, and theoretically motivated features designed to capture rhetorical techniques. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1009/ |
https://www.aclweb.org/anthology/N18-1009 | |
PWC | https://paperswithcode.com/paper/please-clap-modeling-applause-in-campaign |
Repo | |
Framework | |
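A hedged sketch of applause prediction from lexical features, the most informative feature class according to the abstract. The bag-of-words plus logistic regression setup and the toy sentences are assumptions, not the paper's actual model or data.

```python
# Hedged sketch: predicting whether applause follows a sentence from lexical features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "We are going to win so much, believe me",
    "And that is why I am asking for your vote",
    "The committee will reconvene next Tuesday",
    "Thank you, thank you all, God bless America",
]
applause = [1, 1, 0, 1]   # made-up labels: did applause follow this sentence?

model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(sentences, applause)
print(model.predict_proba(["Together we will make this country great again"])[:, 1])
```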
Improving Neural Program Synthesis with Inferred Execution Traces
Title | Improving Neural Program Synthesis with Inferred Execution Traces |
Authors | Richard Shin, Illia Polosukhin, Dawn Song |
Abstract | The task of program synthesis, or automatically generating programs that are consistent with a provided specification, remains challenging in artificial intelligence. As in other fields of AI, deep learning-based end-to-end approaches have made great advances in program synthesis. However, more so than other fields such as computer vision, program synthesis provides greater opportunities to explicitly exploit structured information such as execution traces, which contain a superset of the information in input/output pairs. While execution traces are highly useful for program synthesis, they are more difficult to obtain than input/output pairs, so we use the insight that we can split the process into two parts: infer the trace from the input/output example, then infer the program from the trace. This simple modification leads to state-of-the-art results in program synthesis in the Karel domain, improving accuracy to 81.3% from the 77.12% of prior work. |
Tasks | Program Synthesis |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8107-improving-neural-program-synthesis-with-inferred-execution-traces |
http://papers.nips.cc/paper/8107-improving-neural-program-synthesis-with-inferred-execution-traces.pdf | |
PWC | https://paperswithcode.com/paper/improving-neural-program-synthesis-with |
Repo | |
Framework | |
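A hedged sketch of the two-stage decomposition described in the abstract: first infer an execution trace from an input/output example, then infer the program from the traces. Both stages are stubbed out here with hypothetical functions; the paper uses learned neural models over the Karel domain.

```python
# Hedged sketch: two-stage synthesis pipeline (I/O -> trace -> program).
from typing import List, Tuple

IOExample = Tuple[str, str]

def infer_trace(example: IOExample) -> List[str]:
    """Stage 1 (stub): predict the action sequence the hidden program performed."""
    # A real implementation would run a trained trace-prediction model here.
    return ["move", "turnLeft", "putMarker"]

def infer_program(traces: List[List[str]]) -> str:
    """Stage 2 (stub): predict program source consistent with the predicted traces."""
    # A real implementation would condition a program decoder on the traces.
    return "def run(): move(); turnLeft(); putMarker()"

def synthesize(examples: List[IOExample]) -> str:
    traces = [infer_trace(ex) for ex in examples]
    return infer_program(traces)

print(synthesize([("grid_in_1", "grid_out_1"), ("grid_in_2", "grid_out_2")]))
```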
Modeling the Readability of German Targeting Adults and Children: An empirically broad analysis and its cross-corpus validation
Title | Modeling the Readability of German Targeting Adults and Children: An empirically broad analysis and its cross-corpus validation |
Authors | Zarah Weiß, Detmar Meurers |
Abstract | We analyze two novel data sets of German educational media texts targeting adults and children. The analysis is based on 400 automatically extracted measures of linguistic complexity from a wide range of linguistic domains. We show that both data sets exhibit broad linguistic adaptation to the target audience, which generalizes across both data sets. Our most successful binary classification model for German readability robustly shows high accuracy between 89.4% and 98.9% for both data sets. To our knowledge, this comprehensive German readability model is the first for which robust cross-corpus performance has been shown. The research also contributes resources for German readability assessment that are externally validated as successful for different target audiences: we compiled a new corpus of German news broadcast subtitles, the Tagesschau/Logo corpus, and crawled a GEO/GEOlino corpus substantially enlarging the data compiled by Hancke et al. 2012. |
Tasks | Information Retrieval, Text Simplification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1026/ |
https://www.aclweb.org/anthology/C18-1026 | |
PWC | https://paperswithcode.com/paper/modeling-the-readability-of-german-targeting |
Repo | |
Framework | |
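A hedged, much-reduced sketch of the binary readability setup in the abstract: the paper uses around 400 automatically extracted complexity measures, whereas this toy version computes only two (mean sentence length and type-token ratio) with naive tokenisation, and the example texts and labels are made up.

```python
# Hedged sketch: binary readability classification from complexity measures.
import numpy as np
from sklearn.linear_model import LogisticRegression

def complexity_features(text: str) -> list:
    sents = [s for s in text.split(".") if s.strip()]
    tokens = text.lower().split()
    mean_sent_len = len(tokens) / max(len(sents), 1)
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)
    return [mean_sent_len, type_token_ratio]

texts = [
    "Die Katze schläft. Sie ist müde.",                                          # toy child-directed text
    "Die geldpolitischen Beschlüsse der Notenbank wurden kontrovers diskutiert.",  # toy adult-directed text
]
labels = [0, 1]  # 0 = children, 1 = adults (made-up labels)

X = np.array([complexity_features(t) for t in texts])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```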
Learning Gaussian Policies from Smoothed Action Value Functions
Title | Learning Gaussian Policies from Smoothed Action Value Functions |
Authors | Ofir Nachum, Mohammad Norouzi, George Tucker, Dale Schuurmans |
Abstract | State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value used in SARSA. We show that such smoothed Q-values still satisfy a Bellman equation, making them naturally learnable from experience sampled from an environment. Moreover, the gradients of expected reward with respect to the mean and covariance of a parameterized Gaussian policy can be recovered from the gradient and Hessian of the smoothed Q-value function. Based on these relationships we develop new algorithms for training a Gaussian policy directly from a learned Q-value approximator. The approach is also amenable to proximal optimization techniques by augmenting the objective with a penalty on KL-divergence from a previous policy. We find that the ability to learn both a mean and covariance during training allows this approach to achieve strong results on standard continuous control benchmarks. |
Tasks | Continuous Control, Q-Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1nLkl-0Z |
https://openreview.net/pdf?id=B1nLkl-0Z | |
PWC | https://paperswithcode.com/paper/learning-gaussian-policies-from-smoothed |
Repo | |
Framework | |
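A hedged restatement of the quantities named in the abstract; the precise theorem in the paper may differ in conditioning, constants, and how the expectation over states and discounting are handled.

```latex
% Smoothed Q-value and its relation to the Gaussian policy gradient (hedged
% paraphrase of the abstract, not the paper's exact statement).
\tilde{Q}^{\pi}(s, a) \;=\; \mathbb{E}_{\tilde{a} \sim \mathcal{N}(a,\, \Sigma)}
    \big[\, Q^{\pi}(s, \tilde{a}) \,\big],
\qquad
\nabla_{\mu} J \;\propto\; \nabla_{a} \tilde{Q}^{\pi}(s, a)\big|_{a = \mu(s)},
\qquad
\nabla_{\Sigma} J \;\propto\; \tfrac{1}{2}\, \nabla^{2}_{a} \tilde{Q}^{\pi}(s, a)\big|_{a = \mu(s)}.
```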
Retrospective Encoders for Video Summarization
Title | Retrospective Encoders for Video Summarization |
Authors | Ke Zhang, Kristen Grauman, Fei Sha |
Abstract | Supervised learning techniques have shown substantial progress on video summarization. State-of-the-art approaches mostly regard the predicted summary and the human summary as two sequences (sets), and minimize discriminative losses that measure element-wise discrepancy. Such training objectives do not explicitly model how well the predicted summary preserves semantic information in the video. Moreover, those methods often demand a large amount of human generated summaries. In this paper, we propose a novel sequence-to-sequence learning model to address these deficiencies. The key idea is to complement the discriminative losses with another loss which measures if the predicted summary preserves the same information as in the original video. To this end, we propose to augment standard sequence learning models with an additional "retrospective encoder" that embeds the predicted summary into an abstract semantic space. The embedding is then compared to the embedding of the original video in the same space. The intuition is that both embeddings ought to be close to each other for a video and its corresponding summary. Thus our approach adds to the discriminative loss a metric learning loss that minimizes the distance between such pairs while maximizing the distances between unmatched ones. One important advantage is that the metric learning loss readily allows learning from videos without human generated summaries. Extensive experimental results show that our model outperforms existing ones by a large margin in both supervised and semi-supervised settings. |
Tasks | Metric Learning, Video Summarization |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Ke_Zhang_Retrospective_Encoders_for_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Ke_Zhang_Retrospective_Encoders_for_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/retrospective-encoders-for-video |
Repo | |
Framework | |
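A generic margin-based formulation of the metric-learning idea described in the abstract, given only as an illustration and not as the paper's exact loss: pull a video embedding and the embedding of its predicted summary together, and push embeddings of unmatched video/summary pairs apart.

```latex
% Hedged, generic contrastive formulation of the retrospective-encoder loss:
% e_v is the video embedding, e_{\hat{s}} the embedding of its predicted summary,
% e_{\hat{s}'} an unmatched summary embedding, d a distance, m a margin.
\mathcal{L}_{\text{metric}} \;=\;
\sum_{(v,\, \hat{s})} \Big[\, d\big(e_{v}, e_{\hat{s}}\big)
\;+\; \max\big(0,\; m - d(e_{v}, e_{\hat{s}'})\big) \,\Big].
```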