October 15, 2019

2269 words 11 mins read

Paper Group NANR 262

Exploiting Attention to Reveal Shortcomings in Memory Models. Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification. ‘Indicatements’ that character language models learn English morpho-syntactic units and regularities. Sequence-to-Segment Networks for Segment Detection. Evaluation of generative networks throu …

Exploiting Attention to Reveal Shortcomings in Memory Models

Title Exploiting Attention to Reveal Shortcomings in Memory Models
Authors Kaylee Burns, Aida Nematzadeh, Erin Grant, Alison Gopnik, Tom Griffiths
Abstract The decision making processes of deep networks are difficult to understand, and while their accuracy often improves with increased architectural complexity, so too does their opacity. Practical use of machine learning models, especially for question-answering applications, demands a system that is interpretable. We analyze the attention of a memory network model to reconcile contradictory performance on a challenging question-answering dataset that is inspired by theory-of-mind experiments. We equate success on questions to task classification, which explains not only test-time failures but also how well the model generalizes to new training conditions.
Tasks Decision Making, Question Answering
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5454/
PDF https://www.aclweb.org/anthology/W18-5454
PWC https://paperswithcode.com/paper/exploiting-attention-to-reveal-shortcomings
Repo
Framework

Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification

Title Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification
Authors Cheng Wang, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang
Abstract We propose a novel deep network called Mancs that addresses the person re-identification problem from two aspects: fully utilizing the attention mechanism to handle person misalignment, and properly sampling for the ranking loss to obtain more stable person representations. Technically, we contribute a novel fully attentional block which is deeply supervised and can be plugged into any CNN, and a novel curriculum sampling method which is effective for training ranking losses. The learning tasks are integrated into a unified framework and jointly optimized. Experiments have been carried out on Market1501, CUHK03 and DukeMTMC. All the results show that Mancs can significantly outperform the previous state of the art. In addition, the effectiveness of the newly proposed ideas has been confirmed by extensive ablation studies. (A minimal sketch of a plug-in attention block follows this entry.)
Tasks Person Re-Identification
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/mancs-a-multi-task-attentional-network-with
Repo
Framework
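
As a hedged illustration of the "fully attentional block ... that can be plugged into any CNN" mentioned above, here is a minimal PyTorch sketch of a plug-in attention block in that spirit. The class name, layer sizes and residual connection are assumptions made for illustration; this is not the actual Mancs architecture.

```python
import torch
import torch.nn as nn

class SimpleAttentionBlock(nn.Module):
    """Re-weights a CNN feature map with a learned soft mask (illustrative only)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Bottleneck that produces a per-location, per-channel mask in [0, 1].
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Residual form: keep the original features and add the attended ones.
        return x + x * self.mask(x)

feat = torch.randn(2, 256, 16, 8)       # e.g. a mid-level feature map from any CNN backbone
out = SimpleAttentionBlock(256)(feat)   # output has the same shape as the input
```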

‘Indicatements’ that character language models learn English morpho-syntactic units and regularities

Title ‘Indicatements’ that character language models learn English morpho-syntactic units and regularities
Authors Yova Kementchedjhieva, Adam Lopez
Abstract Character language models have access to surface morphological patterns, but it is not clear whether or *how* they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction.
Tasks Feature Engineering, Language Modelling, Machine Translation, Morphological Tagging, Speech Recognition
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5417/
PDF https://www.aclweb.org/anthology/W18-5417
PWC https://paperswithcode.com/paper/indicatements-that-character-language-models-1
Repo
Framework

Sequence-to-Segment Networks for Segment Detection

Title Sequence-to-Segment Networks for Segment Detection
Authors Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras
Abstract Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments. To address this problem, we propose the Sequence-to-Segment Network (S^2N), a novel end-to-end sequential encoder-decoder architecture. S^2N first encodes the input into a sequence of hidden states that progressively capture both local and holistic information. It then employs a novel decoding architecture, called Segment Detection Unit (SDU), that integrates the decoder state and encoder hidden states to detect segments sequentially. During training, we formulate the assignment of predicted segments to ground truth as bipartite matching and use the Earth Mover’s Distance to calculate the localization errors. We experiment with S^2N on temporal action proposal generation and video summarization and show that S^2N achieves state-of-the-art performance on both tasks. (The bipartite matching step is sketched after this entry.)
Tasks Temporal Action Proposal Generation, Video Summarization
Published 2018-12-01
URL http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection
PDF http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection.pdf
PWC https://paperswithcode.com/paper/sequence-to-segment-networks-for-segment
Repo
Framework
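
The training-time assignment described above (matching predicted segments to ground truth via bipartite matching) can be sketched in a few lines of SciPy. One hedge: the paper uses the Earth Mover's Distance as the localization cost, whereas the L1 distance over (start, end) pairs below is only an illustrative stand-in.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Segments as (start, end) pairs; the values are made up for illustration.
pred = np.array([[2.0, 8.0], [15.0, 22.0], [30.0, 33.0]])   # predicted segments
gt = np.array([[3.0, 9.0], [14.0, 21.0]])                   # ground-truth segments

# cost[i, j] = proxy localization error between prediction i and ground truth j
# (the paper uses the Earth Mover's Distance here instead).
cost = np.abs(pred[:, None, :] - gt[None, :, :]).sum(axis=-1)

# One-to-one assignment that minimizes total cost; extra predictions stay unmatched.
rows, cols = linear_sum_assignment(cost)
print([(int(r), int(c)) for r, c in zip(rows, cols)])   # [(0, 0), (1, 1)] for these values
```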

Evaluation of generative networks through their data augmentation capacity

Title Evaluation of generative networks through their data augmentation capacity
Authors Timothée Lesort, Florian Bordes, Jean-Francois Goudou, David Filliat
Abstract Generative networks are known to be difficult to assess. Recent works on generative models, especially on generative adversarial networks, produce nice samples of varied categories of images. But the validation of their quality is highly dependent on the method used. A good generator should generate data which contain meaningful and varied information and that fit the distribution of a dataset. This paper presents a new method to assess a generator. Our approach is based on training a classifier with a mixture of real and generated samples. We train a generative model over a labeled training set, then we use this generative model to sample new data points that we mix with the original training data. This mixture of real and generated data is then used to train a classifier, which is afterwards tested on a given labeled test dataset. We compare this result with the score of the same classifier trained on the real training data mixed with noise. By computing the classifier’s accuracy with different ratios of samples from both distributions (real and generated), we are able to estimate whether the generator successfully fits and generalizes the distribution of the dataset. Our experiments compare the results of different generators from the VAE and GAN frameworks on the MNIST and Fashion-MNIST datasets. (The evaluation protocol is sketched after this entry.)
Tasks Data Augmentation
Published 2018-01-01
URL https://openreview.net/forum?id=HJ1HFlZAb
PDF https://openreview.net/pdf?id=HJ1HFlZAb
PWC https://paperswithcode.com/paper/evaluation-of-generative-networks-through
Repo
Framework
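
A minimal sketch of the protocol described above, assuming scikit-learn's digits dataset and a logistic-regression classifier as stand-ins. The `sample_generator` function is a hypothetical placeholder for labeled samples drawn from whatever VAE or GAN is under evaluation; here it just returns noisy copies of real data so the script runs end to end.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)

def sample_generator(n):
    # Placeholder "generator": noisy copies of real training images.
    # Replace with labeled samples from the trained VAE/GAN being assessed.
    idx = rng.integers(0, len(X_train), size=n)
    return X_train[idx] + rng.normal(scale=2.0, size=(n, X.shape[1])), y_train[idx]

# Train the same classifier on mixes with an increasing share of generated data
# and compare accuracy on the held-out real test set.
for ratio in (0.0, 0.25, 0.5, 0.75):
    n_gen = int(ratio * len(X_train))
    X_gen, y_gen = sample_generator(n_gen)
    X_mix = np.vstack([X_train[: len(X_train) - n_gen], X_gen])
    y_mix = np.concatenate([y_train[: len(y_train) - n_gen], y_gen])
    clf = LogisticRegression(max_iter=2000).fit(X_mix, y_mix)
    print(f"generated ratio {ratio:.2f}: test accuracy {clf.score(X_test, y_test):.3f}")
```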

Mapping Texts to Scripts: An Entailment Study

Title Mapping Texts to Scripts: An Entailment Study
Authors Simon Ostermann, Hannah Seitz, Stefan Thater, Manfred Pinkal
Abstract
Tasks Natural Language Inference
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1512/
PDF https://www.aclweb.org/anthology/L18-1512
PWC https://paperswithcode.com/paper/mapping-texts-to-scripts-an-entailment-study
Repo
Framework

Predicting Foreign Language Usage from English-Only Social Media Posts

Title Predicting Foreign Language Usage from English-Only Social Media Posts
Authors Svitlana Volkova, Stephen Ranshous, Lawrence Phillips
Abstract Social media is known for its multi-cultural and multilingual interactions, a natural product of which is code-mixing. Multilingual speakers mix languages in their tweets to address a different audience, express certain feelings, or attract attention. This paper presents a large-scale analysis of 6 million tweets produced by 27 thousand multilingual users speaking 12 other languages besides English. We rely on this corpus to build predictive models that infer the non-English languages users speak exclusively from their English tweets. Unlike the native language identification task, we rely on large amounts of informal social media communications rather than ESL essays. We contrast the predictive power of state-of-the-art machine learning models trained on lexical, syntactic, and stylistic signals with neural network models learned from word, character and byte representations extracted from English-only tweets. We report that content, style and syntax are the most predictive of non-English languages that users speak on Twitter. Neural network models learned from byte representations of user content combined with transfer learning yield the best performance. Finally, by analyzing cross-lingual transfer – the influence of non-English languages on various levels of linguistic performance in English – we present novel findings on stylistic and syntactic variations across speakers of 12 languages in social media. (A toy character n-gram model is sketched after this entry.)
Tasks Cross-Lingual Transfer, Language Identification, Native Language Identification, Transfer Learning
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-2096/
PDF https://www.aclweb.org/anthology/N18-2096
PWC https://paperswithcode.com/paper/predicting-foreign-language-usage-from
Repo
Framework
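
As a hedged illustration of the feature-based models the abstract contrasts with neural ones, the toy sketch below maps English-only text to a label for another language the author speaks, using character n-gram features. The four example tweets and their labels are fabricated placeholders, not data from the paper's corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy English-only tweets, each labeled with another language the author speaks.
tweets = [
    "heading to the park, lovely weather today",
    "just finished my homework, time for coffee",
    "big match tonight, can't wait to watch it",
    "new phone arrived, the camera is amazing",
]
other_language = ["spanish", "russian", "spanish", "russian"]   # fabricated labels

# Character n-grams stand in for the lexical/stylistic signals the paper studies.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(tweets, other_language)
print(model.predict(["off to the beach with friends this weekend"]))
```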

Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)

Title Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)
Authors Trefor Evans, Prasanth Nair
Abstract We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nyström approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, the computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or type-II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two million training points and 10^33 inducing points. (A generic Nyström approximation is sketched after this entry.)
Tasks Bayesian Inference, Gaussian Processes
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2139
PDF http://proceedings.mlr.press/v80/evans18a/evans18a.pdf
PWC https://paperswithcode.com/paper/scalable-gaussian-processes-with-grid
Repo
Framework
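
The building block behind GP-GRIEF is a Nyström-style low-rank kernel approximation over inducing points; a generic version is sketched below for intuition. The Kronecker/Khatri-Rao structure that lets the method scale to on the order of 10^33 grid-structured inducing points is not reproduced here, and the lengthscale, grid size and jitter are arbitrary choices.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))    # training inputs
Z = rng.uniform(-3, 3, size=(30, 2))     # inducing points (a Cartesian grid in GP-GRIEF)

Kxz = rbf(X, Z)
Kzz = rbf(Z, Z) + 1e-8 * np.eye(len(Z))          # jitter for numerical stability
K_nystrom = Kxz @ np.linalg.solve(Kzz, Kxz.T)    # low-rank approximation of K(X, X)

K_exact = rbf(X, X)
rel_err = np.linalg.norm(K_exact - K_nystrom) / np.linalg.norm(K_exact)
print(f"relative approximation error: {rel_err:.3f}")
```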

Rectify Heterogeneous Models with Semantic Mapping

Title Rectify Heterogeneous Models with Semantic Mapping
Authors Han-Jia Ye, De-Chuan Zhan, Yuan Jiang, Zhi-Hua Zhou
Abstract On the way to the robust learner for real-world applications, there are still great challenges, including considering unknown environments with limited data. Learnware (Zhou, 2016) describes a novel perspective and claims that learning models should have reusable and evolvable properties. We propose to Encode Meta InformaTion of features (EMIT) as the model specification for characterizing the changes, which grants the model evolvability to bridge heterogeneous feature spaces. Then, pre-trained models from related tasks can be Reused by our REctiFy via heterOgeneous pRedictor Mapping (REFORM) framework. In summary, the pre-trained model is adapted to a new environment with different features, through model refining on only a small amount of training data in the current task. Experimental results over both synthetic and real-world tasks with diverse feature configurations validate the effectiveness and practical utility of the proposed framework.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1971
PDF http://proceedings.mlr.press/v80/ye2018c/ye2018c.pdf
PWC https://paperswithcode.com/paper/rectify-heterogeneous-models-with-semantic
Repo
Framework

Simplified Abugidas

Title Simplified Abugidas
Authors Chenchen Ding, Masao Utiyama, Eiichiro Sumita
Abstract An abugida is a writing system where the consonant letters represent syllables with a default vowel and other vowels are denoted by diacritics. We investigate the feasibility of recovering the original text written in an abugida after omitting subordinate diacritics and merging consonant letters with similar phonetic values. This is crucial for developing more efficient input methods by reducing the complexity in abugidas. Four abugidas in the southern Brahmic family, i.e., Thai, Burmese, Khmer, and Lao, were studied using a newswire 20,000-sentence dataset. We compared the recovery performance of a support vector machine and an LSTM-based recurrent neural network, finding that the abugida graphemes could be recovered with 94%–97% accuracy at the top-1 level and 98%–99% at the top-4 level, even after omitting most diacritics (10–30 types) and merging the remaining 30–50 characters into 21 graphemes.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2078/
PDF https://www.aclweb.org/anthology/P18-2078
PWC https://paperswithcode.com/paper/simplified-abugidas
Repo
Framework

EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context.

Title EMO&LY (EMOtion and AnomaLY): A new corpus for anomaly detection in an audiovisual stream with emotional context.
Authors Cédric Fayet, Arnaud Delhay, Damien Lolive, Pierre-François Marteau
Abstract
Tasks Action Detection, Anomaly Detection, Fraud Detection, Intrusion Detection
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1339/
PDF https://www.aclweb.org/anthology/L18-1339
PWC https://paperswithcode.com/paper/emoly-emotion-and-anomaly-a-new-corpus-for
Repo
Framework

Work Smart - Reducing Effort in Short-Answer Grading

Title Work Smart - Reducing Effort in Short-Answer Grading
Authors Margot Mieskes, Ulrike Padó
Abstract
Tasks Active Learning, Reading Comprehension
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-7107/
PDF https://www.aclweb.org/anthology/W18-7107
PWC https://paperswithcode.com/paper/work-smart-reducing-effort-in-short-answer
Repo
Framework

A Study of the Importance of External Knowledge in the Named Entity Recognition Task

Title A Study of the Importance of External Knowledge in the Named Entity Recognition Task
Authors Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum
Abstract In this work, we discuss the importance of external knowledge for performing Named Entity Recognition (NER). We present a novel modular framework that divides the knowledge into four categories according to the depth of knowledge they convey. Each category consists of a set of features automatically generated from different information sources, such as a knowledge-base, a list of names, or document-specific semantic annotations. Further, we show the effects on performance when incrementally adding deeper knowledge and discuss effectiveness/efficiency trade-offs.
Tasks Named Entity Recognition, Question Answering
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2039/
PDF https://www.aclweb.org/anthology/P18-2039
PWC https://paperswithcode.com/paper/a-study-of-the-importance-of-external
Repo
Framework

A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings

Title A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings
Authors Nina Poerner, Florian Schiel
Abstract
Tasks Chunking, Language Modelling, Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1452/
PDF https://www.aclweb.org/anthology/L18-1452
PWC https://paperswithcode.com/paper/a-web-service-for-pre-segmenting-very-long
Repo
Framework

Firing Bandits: Optimizing Crowdfunding

Title Firing Bandits: Optimizing Crowdfunding
Authors Lalit Jain, Kevin Jamieson
Abstract In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit Kickstarter, as a variant of the multi-armed bandit problem. In our setting, Bernoulli arms emit no rewards until their cumulative number of successes over any number of trials exceeds a fixed threshold, and then provide no additional reward for any further trials, a process reminiscent of a neuron firing once it reaches its action potential and then saturating. In the spirit of an infinite-armed bandit problem, the player can add new arms whose expected probability of success is drawn i.i.d. from an unknown distribution; this endless supply of projects models the harsh reality that the number of projects seeking funding greatly exceeds the total capital available from lenders. Crowdfunding platforms naturally fall under this setting, where the arms are potential projects and their probability of success is the probability that a potential funder decides to fund a project after reviewing it. The goal is to play arms (prioritize the display of projects on a webpage) to maximize the number of arms that reach the firing threshold (meet their goal amount) using as few total trials (number of impressions) as possible over all the played arms. We provide an algorithm for this setting and prove sublinear regret bounds. (The reward process is simulated in the sketch after this entry.)
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2462
PDF http://proceedings.mlr.press/v80/jain18a/jain18a.pdf
PWC https://paperswithcode.com/paper/firing-bandits-optimizing-crowdfunding
Repo
Framework
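
The reward process described above is easy to simulate. The sketch below implements the "firing" mechanics with a uniformly random pulling policy as a placeholder; the paper's actual algorithm and its regret analysis are not reproduced, and the threshold and success probabilities are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
threshold = 5                              # successes needed before an arm "fires"
p = rng.uniform(0.05, 0.3, size=20)        # unknown per-arm success probabilities
successes = np.zeros(len(p), dtype=int)
fired = np.zeros(len(p), dtype=bool)

total_reward = 0
n_pulls = 2000
for _ in range(n_pulls):
    arm = rng.integers(len(p))             # placeholder policy: pull an arm uniformly at random
    if not fired[arm]:
        successes[arm] += int(rng.random() < p[arm])
        if successes[arm] >= threshold:
            fired[arm] = True              # the arm fires exactly once ...
            total_reward += 1              # ... and never pays out again

print(f"{total_reward} of {len(p)} arms fired within {n_pulls} pulls")
```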