October 15, 2019

2269 words 11 mins read

Paper Group NANR 262

Exploiting Attention to Reveal Shortcomings in Memory Models. Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification. ‘Indicatements’ that character language models learn English morpho-syntactic units and regularities. Sequence-to-Segment Networks for Segment Detection. Evaluation of generative networks throu …

Exploiting Attention to Reveal Shortcomings in Memory Models

Title Exploiting Attention to Reveal Shortcomings in Memory Models
Authors Kaylee Burns, Aida Nematzadeh, Erin Grant, Alison Gopnik, Tom Griffiths
Abstract The decision making processes of deep networks are difficult to understand, and while their accuracy often improves with increased architectural complexity, so too does their opacity. Practical use of machine learning models, especially for question-answering applications, demands a system that is interpretable. We analyze the attention of a memory network model to reconcile contradictory performance on a challenging question-answering dataset that is inspired by theory-of-mind experiments. We equate success on questions to task classification, which explains not only test-time failures but also how well the model generalizes to new training conditions.
Tasks Decision Making, Question Answering
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5454/
PDF https://www.aclweb.org/anthology/W18-5454
PWC https://paperswithcode.com/paper/exploiting-attention-to-reveal-shortcomings
Repo
Framework

Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification

Title Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification
Authors Cheng Wang, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang
Abstract We propose a novel deep network called Mancs that addresses the person re-identification problem from two aspects: fully utilizing the attention mechanism to handle person misalignment, and properly sampling for the ranking loss to obtain more stable person representations. Technically, we contribute a novel fully attentional block which is deeply supervised and can be plugged into any CNN, and a novel curriculum sampling method which is effective for training ranking losses. The learning tasks are integrated into a unified framework and jointly optimized. Experiments have been carried out on Market1501, CUHK03 and DukeMTMC. All the results show that Mancs can significantly outperform the previous state of the art. In addition, the effectiveness of the newly proposed ideas has been confirmed by extensive ablation studies. (A minimal sketch of a plug-in attention block follows this entry.)
Tasks Person Re-Identification
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/mancs-a-multi-task-attentional-network-with
Repo
Framework
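
As a hedged illustration of the "fully attentional block ... that can be plugged into any CNN" mentioned above, here is a minimal PyTorch sketch of a plug-in attention block in that spirit. The class name, layer sizes and residual connection are assumptions made for illustration; this is not the actual Mancs architecture.

```python
import torch
import torch.nn as nn

class SimpleAttentionBlock(nn.Module):
    """Re-weights a CNN feature map with a learned soft mask (illustrative only)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Bottleneck that produces a per-location, per-channel mask in [0, 1].
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Residual form: keep the original features and add the attended ones.
        return x + x * self.mask(x)

feat = torch.randn(2, 256, 16, 8)       # e.g. a mid-level feature map from any CNN backbone
out = SimpleAttentionBlock(256)(feat)   # output has the same shape as the input
```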

‘Indicatements’ that character language models learn English morpho-syntactic units and regularities

Title ‘Indicatements’ that character language models learn English morpho-syntactic units and regularities
Authors Yova Kementchedjhieva, Adam Lopez
Abstract Character language models have access to surface morphological patterns, but it is not clear whether or *how* they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction.
Tasks Feature Engineering, Language Modelling, Machine Translation, Morphological Tagging, Speech Recognition
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5417/
PDF https://www.aclweb.org/anthology/W18-5417
PWC https://paperswithcode.com/paper/indicatements-that-character-language-models-1
Repo
Framework

Sequence-to-Segment Networks for Segment Detection

Title Sequence-to-Segment Networks for Segment Detection
Authors Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras
Abstract Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments. To address this problem, we propose the Sequence-to-Segment Network (S^2N), a novel end-to-end sequential encoder-decoder architecture. S^2N first encodes the input into a sequence of hidden states that progressively capture both local and holistic information. It then employs a novel decoding architecture, called Segment Detection Unit (SDU), that integrates the decoder state and encoder hidden states to detect segments sequentially. During training, we formulate the assignment of predicted segments to ground truth as bipartite matching and use the Earth Mover’s Distance to calculate the localization errors. We experiment with S^2N on temporal action proposal generation and video summarization and show that S^2N achieves state-of-the-art performance on both tasks. (The bipartite matching step is sketched after this entry.)
Tasks Temporal Action Proposal Generation, Video Summarization
Published 2018-12-01
URL http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection
PDF http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection.pdf
PWC https://paperswithcode.com/paper/sequence-to-segment-networks-for-segment
Repo
Framework
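
The training-time assignment described above (matching predicted segments to ground truth via bipartite matching) can be sketched in a few lines of SciPy. One hedge: the paper uses the Earth Mover's Distance as the localization cost, whereas the L1 distance over (start, end) pairs below is only an illustrative stand-in.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Segments as (start, end) pairs; the values are made up for illustration.
pred = np.array([[2.0, 8.0], [15.0, 22.0], [30.0, 33.0]])   # predicted segments
gt = np.array([[3.0, 9.0], [14.0, 21.0]])                   # ground-truth segments

# cost[i, j] = proxy localization error between prediction i and ground truth j
# (the paper uses the Earth Mover's Distance here instead).
cost = np.abs(pred[:, None, :] - gt[None, :, :]).sum(axis=-1)

# One-to-one assignment that minimizes total cost; extra predictions stay unmatched.
rows, cols = linear_sum_assignment(cost)
print([(int(r), int(c)) for r, c in zip(rows, cols)])   # [(0, 0), (1, 1)] for these values
```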

Evaluation of generative networks through their data augmentation capacity

Title Evaluation of generative networks through their data augmentation capacity
Authors Timothée Lesort, Florian Bordes, Jean-Francois Goudou, David Filliat
Abstract Generative networks are known to be difficult to assess. Recent works on generative models, especially on generative adversarial networks, produce nice samples of varied categories of images. But the validation of their quality is highly dependent on the method used. A good generator should generate data which contain meaningful and varied information and that fit the distribution of a dataset. This paper presents a new method to assess a generator. Our approach is based on training a classifier with a mixture of real and generated samples. We train a generative model over a labeled training set, then we use this generative model to sample new data points that we mix with the original training data. This mixture of real and generated data is then used to train a classifier, which is afterwards tested on a given labeled test dataset. We compare this result with the score of the same classifier trained on the real training data mixed with noise. By computing the classifier’s accuracy with different ratios of samples from both distributions (real and generated), we are able to estimate whether the generator successfully fits and generalizes the distribution of the dataset. Our experiments compare the results of different generators from the VAE and GAN frameworks on the MNIST and Fashion-MNIST datasets. (The evaluation protocol is sketched after this entry.)
Tasks Data Augmentation
Published 2018-01-01
URL https://openreview.net/forum?id=HJ1HFlZAb
PDF https://openreview.net/pdf?id=HJ1HFlZAb
PWC https://paperswithcode.com/paper/evaluation-of-generative-networks-through
Repo
Framework
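
A minimal sketch of the protocol described above, assuming scikit-learn's digits dataset and a logistic-regression classifier as stand-ins. The `sample_generator` function is a hypothetical placeholder for labeled samples drawn from whatever VAE or GAN is under evaluation; here it just returns noisy copies of real data so the script runs end to end.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)

def sample_generator(n):
    # Placeholder "generator": noisy copies of real training images.
    # Replace with labeled samples from the trained VAE/GAN being assessed.
    idx = rng.integers(0, len(X_train), size=n)
    return X_train[idx] + rng.normal(scale=2.0, size=(n, X.shape[1])), y_train[idx]

# Train the same classifier on mixes with an increasing share of generated data
# and compare accuracy on the held-out real test set.
for ratio in (0.0, 0.25, 0.5, 0.75):
    n_gen = int(ratio * len(X_train))
    X_gen, y_gen = sample_generator(n_gen)
    X_mix = np.vstack([X_train[: len(X_train) - n_gen], X_gen])
    y_mix = np.concatenate([y_train[: len(y_train) - n_gen], y_gen])
    clf = LogisticRegression(max_iter=2000).fit(X_mix, y_mix)
    print(f"generated ratio {ratio:.2f}: test accuracy {clf.score(X_test, y_test):.3f}")
```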

Mapping Texts to Scripts: An Entailment Study

Title Mapping Texts to Scripts: An Entailment Study
Authors Simon Ostermann, Hannah Seitz, Stefan Thater, Manfred Pinkal
Abstract
Tasks Natural Language Inference
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1512/
PDF https://www.aclweb.org/anthology/L18-1512
PWC https://paperswithcode.com/paper/mapping-texts-to-scripts-an-entailment-study
Repo
Framework

Predicting Foreign Language Usage from English-Only Social Media Posts

Title Predicting Foreign Language Usage from English-Only Social Media Posts
Authors Svitlana Volkova, Stephen Ranshous, Lawrence Phillips
Abstract Social media is known for its multi-cultural and multilingual interactions, a natural product of which is code-mixing. Multilingual speakers mix languages in their tweets to address a different audience, express certain feelings, or attract attention. This paper presents a large-scale analysis of 6 million tweets produced by 27 thousand multilingual users speaking 12 other languages besides English. We rely on this corpus to build predictive models that infer the non-English languages users speak exclusively from their English tweets. Unlike the native language identification task, we rely on large amounts of informal social media communications rather than ESL essays. We contrast the predictive power of state-of-the-art machine learning models trained on lexical, syntactic, and stylistic signals with neural network models learned from word, character and byte representations extracted from English-only tweets. We report that content, style and syntax are the most predictive of non-English languages that users speak on Twitter. Neural network models learned from byte representations of user content combined with transfer learning yield the best performance. Finally, by analyzing cross-lingual transfer – the influence of non-English languages on various levels of linguistic performance in English – we present novel findings on stylistic and syntactic variations across speakers of 12 languages in social media. (A toy character n-gram model is sketched after this entry.)
Tasks Cross-Lingual Transfer, Language Identification, Native Language Identification, Transfer Learning
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-2096/
PDF https://www.aclweb.org/anthology/N18-2096
PWC https://paperswithcode.com/paper/predicting-foreign-language-usage-from
Repo
Framework
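
As a hedged illustration of the feature-based models the abstract contrasts with neural ones, the toy sketch below maps English-only text to a label for another language the author speaks, using character n-gram features. The four example tweets and their labels are fabricated placeholders, not data from the paper's corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy English-only tweets, each labeled with another language the author speaks.
tweets = [
    "heading to the park, lovely weather today",
    "just finished my homework, time for coffee",
    "big match tonight, can't wait to watch it",
    "new phone arrived, the camera is amazing",
]
other_language = ["spanish", "russian", "spanish", "russian"]   # fabricated labels

# Character n-grams stand in for the lexical/stylistic signals the paper studies.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(tweets, other_language)
print(model.predict(["off to the beach with friends this weekend"]))
```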

Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)

Title Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)
Authors Trefor Evans, Prasanth Nair
Abstract We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nyström approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, the computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or type-II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two million training points and 10^33 inducing points. (A generic Nyström approximation is sketched after this entry.)
Tasks Bayesian Inference, Gaussian Processes
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2139
PDF http://proceedings.mlr.press/v80/evans18a/evans18a.pdf
PWC https://paperswithcode.com/paper/scalable-gaussian-processes-with-grid
Repo
Framework
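
The building block behind GP-GRIEF is a Nyström-style low-rank kernel approximation over inducing points; a generic version is sketched below for intuition. The Kronecker/Khatri-Rao structure that lets the method scale to on the order of 10^33 grid-structured inducing points is not reproduced here, and the lengthscale, grid size and jitter are arbitrary choices.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))    # training inputs
Z = rng.uniform(-3, 3, size=(30, 2))     # inducing points (a Cartesian grid in GP-GRIEF)

Kxz = rbf(X, Z)
Kzz = rbf(Z, Z) + 1e-8 * np.eye(len(Z))          # jitter for numerical stability
K_nystrom = Kxz @ np.linalg.solve(Kzz, Kxz.T)    # low-rank approximation of K(X, X)

K_exact = rbf(X, X)
rel_err = np.linalg.norm(K_exact - K_nystrom) / np.linalg.norm(K_exact)
print(f"relative approximation error: {rel_err:.3f}")
```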

Rectify Heterogeneous Models with Semantic Mapping

Title Rectify Heterogeneous Models with Semantic Mapping
Authors Han-Jia Ye, De-Chuan Zhan, Yuan Jiang, Zhi-Hua Zhou
Abstract On the way to the robust learner for real-world applications, there are still great challenges, including considering unknown environments with limited data. Learnware (Zhou, 2016) describes a novel perspective and claims that learning models should have reusable and evolvable properties. We propose to Encode Meta InformaTion of features (EMIT) as the model specification for characterizing the changes, which grants the model evolvability to bridge heterogeneous feature spaces. Then, pre-trained models from related tasks can be Reused by our REctiFy via heterOgeneous pRedictor Mapping (REFORM) framework. In summary, the pre-trained model is adapted to a new environment with different features, through model refining on only a small amount of training data in the current task. Experimental results over both synthetic and real-world tasks with diverse feature configurations validate the effectiveness and practical utility of the proposed framework.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1971
PDF http://proceedings.mlr.press/v80/ye2018c/ye2018c.pdf
PWC https://paperswithcode.com/paper/rectify-heterogeneous-models-with-semantic
Repo
Framework

Simplified Abugidas

Title Simplified Abugidas
Authors Chenchen Ding, Masao Utiyama, Eiichiro Sumita
Abstract An abugida is a writing system where the consonant letters represent syllables with a default vowel and other vowels are denoted by diacritics. We investigate the feasibility of recovering the original text written in an abugida after omitting subordinate diacritics and merging consonant letters with similar phonetic values. This is crucial for developing more efficient input methods by reducing the complexity in abugidas. Four abugidas in the southern Brahmic family, i.e., Thai, Burmese, Khmer, and Lao, were studied using a newswire 20,000-sentence dataset. We compared the recovery performance of a support vector machine and an LSTM-based recurrent neural network, finding that the abugida graphemes could be recovered with 94%–97% accuracy at the top-1 level and 98%–99% at the top-4 level, even after omitting most diacritics (10–30 types) and merging the remaining 30–50 characters into 21 graphemes.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2078/
PDF https://www.aclweb.org/anthology/P18-2078
PWC https://paperswithcode.com/paper/simplified-abugidas
Repo
Framework

EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context.

Title EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context.
Authors Cédric Fayet, Arnaud Delhay, Damien Lolive, Pierre-François Marteau
Abstract
Tasks Action Detection, Anomaly Detection, Fraud Detection, Intrusion Detection
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1339/
PDF https://www.aclweb.org/anthology/L18-1339
PWC https://paperswithcode.com/paper/emoly-emotion-and-anomaly-a-new-corpus-for
Repo
Framework

Work Smart - Reducing Effort in Short-Answer Grading

Title Work Smart - Reducing Effort in Short-Answer Grading
Authors Margot Mieskes, Ulrike Padó
Abstract
Tasks Active Learning, Reading Comprehension
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-7107/
PDF https://www.aclweb.org/anthology/W18-7107
PWC https://paperswithcode.com/paper/work-smart-reducing-effort-in-short-answer
Repo
Framework

A Study of the Importance of External Knowledge in the Named Entity Recognition Task

Title A Study of the Importance of External Knowledge in the Named Entity Recognition Task
Authors Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum
Abstract In this work, we discuss the importance of external knowledge for performing Named Entity Recognition (NER). We present a novel modular framework that divides the knowledge into four categories according to the depth of knowledge they convey. Each category consists of a set of features automatically generated from different information sources, such as a knowledge-base, a list of names, or document-specific semantic annotations. Further, we show the effects on performance when incrementally adding deeper knowledge and discuss effectiveness/efficiency trade-offs.
Tasks Named Entity Recognition, Question Answering
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2039/
PDF https://www.aclweb.org/anthology/P18-2039
PWC https://paperswithcode.com/paper/a-study-of-the-importance-of-external
Repo
Framework

A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings

Title A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings
Authors Nina Poerner, Florian Schiel
Abstract
Tasks Chunking, Language Modelling, Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1452/
PDF https://www.aclweb.org/anthology/L18-1452
PWC https://paperswithcode.com/paper/a-web-service-for-pre-segmenting-very-long
Repo
Framework

Firing Bandits: Optimizing Crowdfunding

Title Firing Bandits: Optimizing Crowdfunding
Authors Lalit Jain, Kevin Jamieson
Abstract In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit Kickstarter, as a variant of the multi-armed bandit problem. In our setting, Bernoulli arms emit no rewards until their cumulative number of successes over any number of trials exceeds a fixed threshold, and then provide no additional reward for any further trials, a process reminiscent of a neuron firing once it reaches its action potential and then saturating. In the spirit of an infinite-armed bandit problem, the player can add new arms whose expected probability of success is drawn i.i.d. from an unknown distribution; this endless supply of projects models the harsh reality that the number of projects seeking funding greatly exceeds the total capital available from lenders. Crowdfunding platforms naturally fall under this setting, where the arms are potential projects and their probability of success is the probability that a potential funder decides to fund a project after reviewing it. The goal is to play arms (prioritize the display of projects on a webpage) to maximize the number of arms that reach the firing threshold (meet their goal amount) using as few total trials (number of impressions) as possible over all the played arms. We provide an algorithm for this setting and prove sublinear regret bounds. (The reward process is simulated in the sketch after this entry.)
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2462
PDF http://proceedings.mlr.press/v80/jain18a/jain18a.pdf
PWC https://paperswithcode.com/paper/firing-bandits-optimizing-crowdfunding
Repo
Framework
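
The reward process described above is easy to simulate. The sketch below implements the "firing" mechanics with a uniformly random pulling policy as a placeholder; the paper's actual algorithm and its regret analysis are not reproduced, and the threshold and success probabilities are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
threshold = 5                              # successes needed before an arm "fires"
p = rng.uniform(0.05, 0.3, size=20)        # unknown per-arm success probabilities
successes = np.zeros(len(p), dtype=int)
fired = np.zeros(len(p), dtype=bool)

total_reward = 0
n_pulls = 2000
for _ in range(n_pulls):
    arm = rng.integers(len(p))             # placeholder policy: pull an arm uniformly at random
    if not fired[arm]:
        successes[arm] += int(rng.random() < p[arm])
        if successes[arm] >= threshold:
            fired[arm] = True              # the arm fires exactly once ...
            total_reward += 1              # ... and never pays out again

print(f"{total_reward} of {len(p)} arms fired within {n_pulls} pulls")
```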