Paper Group NANR 262
Exploiting Attention to Reveal Shortcomings in Memory Models
Title | Exploiting Attention to Reveal Shortcomings in Memory Models |
Authors | Kaylee Burns, Aida Nematzadeh, Erin Grant, Alison Gopnik, Tom Griffiths |
Abstract | The decision making processes of deep networks are difficult to understand and while their accuracy often improves with increased architectural complexity, so too does their opacity. Practical use of machine learning models, especially for question and answering applications, demands a system that is interpretable. We analyze the attention of a memory network model to reconcile contradictory performance on a challenging question-answering dataset that is inspired by theory-of-mind experiments. We equate success on questions to task classification, which explains not only test-time failures but also how well the model generalizes to new training conditions. |
Tasks | Decision Making, Question Answering |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5454/ |
https://www.aclweb.org/anthology/W18-5454 | |
PWC | https://paperswithcode.com/paper/exploiting-attention-to-reveal-shortcomings |
Repo | |
Framework | |
Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification
Title | Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification |
Authors | Cheng Wang, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang |
Abstract | We propose a novel deep network called Mancs that solves the person re-identification problem from the following aspects: fully utilizing the attention mechanism for the person misalignment problem and properly sampling for the ranking loss to obtain more stable person representation. Technically, we contribute a novel fully attentional block which is deeply supervised and can be plugged into any CNN, and a novel curriculum sampling method which is effective for training ranking losses. The learning tasks are integrated into a unified framework and jointly optimized. Experiments have been carried out on Market1501, CUHK03 and DukeMTMC. All the results show that Mancs can significantly outperform the previous state-of-the-arts. In addition, the effectiveness of the newly proposed ideas has been confirmed by extensive ablation studies. |
Tasks | Person Re-Identification |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Cheng_Wang_Mancs_A_Multi-task_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/mancs-a-multi-task-attentional-network-with |
Repo | |
Framework | |
‘Indicatements’ that character language models learn English morpho-syntactic units and regularities
Title | ‘Indicatements’ that character language models learn English morpho-syntactic units and regularities |
Authors | Yova Kementchedjhieva, Adam Lopez |
Abstract | Character language models have access to surface morphological patterns, but it is not clear whether or how they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction. |
Tasks | Feature Engineering, Language Modelling, Machine Translation, Morphological Tagging, Speech Recognition |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5417/ |
https://www.aclweb.org/anthology/W18-5417 | |
PWC | https://paperswithcode.com/paper/indicatements-that-character-language-models-1 |
Repo | |
Framework | |
Sequence-to-Segment Networks for Segment Detection
Title | Sequence-to-Segment Networks for Segment Detection |
Authors | Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras |
Abstract | Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments. To address this problem, we propose the Sequence-to-Segment Network (S$^2$N), a novel end-to-end sequential encoder-decoder architecture. S$^2$N first encodes the input into a sequence of hidden states that progressively capture both local and holistic information. It then employs a novel decoding architecture, called Segment Detection Unit (SDU), that integrates the decoder state and encoder hidden states to detect segments sequentially. During training, we formulate the assignment of predicted segments to ground truth as bipartite matching and use the Earth Mover’s Distance to calculate the localization errors. We experiment with S$^2$N on temporal action proposal generation and video summarization and show that S$^2$N achieves state-of-the-art performance on both tasks. |
Tasks | Temporal Action Proposal Generation, Video Summarization |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection |
http://papers.nips.cc/paper/7610-sequence-to-segment-networks-for-segment-detection.pdf | |
PWC | https://paperswithcode.com/paper/sequence-to-segment-networks-for-segment |
Repo | |
Framework | |
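
The Sequence-to-Segment abstract above describes assigning predicted segments to ground-truth segments by bipartite matching. The following is a minimal sketch of that idea only, not the authors' implementation: the use of temporal IoU as the matching score, the brute-force search over permutations, and the names `interval_iou` and `match_segments` are all assumptions made for illustration. A real system would likely use a dedicated assignment solver, and the Earth Mover's Distance localization loss is not shown.

```python
from itertools import permutations


def interval_iou(a, b):
    """Temporal intersection-over-union of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def match_segments(predicted, ground_truth):
    """Match each ground-truth segment to a distinct prediction.

    Hypothetical helper: maximizes total IoU by brute force, which is
    only practical for a handful of segments. Assumes there are at
    least as many predictions as ground-truth segments.
    """
    best_score, best_assignment = -1.0, None
    for perm in permutations(range(len(predicted)), len(ground_truth)):
        score = sum(
            interval_iou(predicted[p], ground_truth[g])
            for g, p in enumerate(perm)
        )
        if score > best_score:
            best_score, best_assignment = score, perm
    # best_assignment[g] is the index of the prediction matched to ground truth g
    return list(best_assignment), best_score


predicted = [(0, 4), (10, 14), (5, 9)]
ground_truth = [(5, 10), (0, 5)]
assignment, total_iou = match_segments(predicted, ground_truth)
print(assignment)  # [2, 0]
```

Here the first ground-truth segment is paired with the third prediction and the second with the first, while the prediction `(10, 14)` is left unmatched because it overlaps neither target.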
Evaluation of generative networks through their data augmentation capacity
Title | Evaluation of generative networks through their data augmentation capacity |
Authors | Timothée Lesort, Florian Bordes, Jean-Francois Goudou, David Filliat |
Abstract | Generative networks are known to be difficult to assess. Recent works on generative models, especially on generative adversarial networks, produce nice samples of varied categories of images. But the validation of their quality is highly dependent on the method used. A good generator should generate data which contain meaningful and varied information and that fit the distribution of a dataset. This paper presents a new method to assess a generator. Our approach is based on training a classifier with a mixture of real and generated samples. We train a generative model over a labeled training set, then we use this generative model to sample new data points that we mix with the original training data. This mixture of real and generated data is thus used to train a classifier which is afterwards tested on a given labeled test dataset. We compare this result with the score of the same classifier trained on the real training data mixed with noise. By computing the classifier’s accuracy with different ratios of samples from both distributions (real and generated) we are able to estimate if the generator successfully fits and is able to generalize the distribution of the dataset. Our experiments compare the result of different generators from the VAE and GAN framework on MNIST and fashion MNIST dataset. |
Tasks | Data Augmentation |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJ1HFlZAb |
https://openreview.net/pdf?id=HJ1HFlZAb | |
PWC | https://paperswithcode.com/paper/evaluation-of-generative-networks-through |
Repo | |
Framework | |
Mapping Texts to Scripts: An Entailment Study
Title | Mapping Texts to Scripts: An Entailment Study |
Authors | Simon Ostermann, Hannah Seitz, Stefan Thater, Manfred Pinkal |
Abstract | |
Tasks | Natural Language Inference |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1512/ |
https://www.aclweb.org/anthology/L18-1512 | |
PWC | https://paperswithcode.com/paper/mapping-texts-to-scripts-an-entailment-study |
Repo | |
Framework | |
Predicting Foreign Language Usage from English-Only Social Media Posts
Title | Predicting Foreign Language Usage from English-Only Social Media Posts |
Authors | Svitlana Volkova, Stephen Ranshous, Lawrence Phillips |
Abstract | Social media is known for its multi-cultural and multilingual interactions, a natural product of which is code-mixing. Multilingual speakers mix languages they tweet to address a different audience, express certain feelings, or attract attention. This paper presents a large-scale analysis of 6 million tweets produced by 27 thousand multilingual users speaking 12 other languages besides English. We rely on this corpus to build predictive models to infer non-English languages that users speak exclusively from their English tweets. Unlike native language identification task, we rely on large amounts of informal social media communications rather than ESL essays. We contrast the predictive power of the state-of-the-art machine learning models trained on lexical, syntactic, and stylistic signals with neural network models learned from word, character and byte representations extracted from English only tweets. We report that content, style and syntax are the most predictive of non-English languages that users speak on Twitter. Neural network models learned from byte representations of user content combined with transfer learning yield the best performance. Finally, by analyzing cross-lingual transfer – the influence of non-English languages on various levels of linguistic performance in English, we present novel findings on stylistic and syntactic variations across speakers of 12 languages in social media. |
Tasks | Cross-Lingual Transfer, Language Identification, Native Language Identification, Transfer Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2096/ |
https://www.aclweb.org/anthology/N18-2096 | |
PWC | https://paperswithcode.com/paper/predicting-foreign-language-usage-from |
Repo | |
Framework | |
Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF)
Title | Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF) |
Authors | Trefor Evans, Prasanth Nair |
Abstract | We introduce a kernel approximation strategy that enables computation of the Gaussian process log marginal likelihood and all hyperparameter derivatives in O(p) time. Our GRIEF kernel consists of p eigenfunctions found using a Nystrom approximation from a dense Cartesian product grid of inducing points. By exploiting algebraic properties of Kronecker and Khatri-Rao tensor products, computational complexity of the training procedure can be practically independent of the number of inducing points. This allows us to use arbitrarily many inducing points to achieve a globally accurate kernel approximation, even in high-dimensional problems. The fast likelihood evaluation enables type-I or II Bayesian inference on large-scale datasets. We benchmark our algorithms on real-world problems with up to two-million training points and 10^33 inducing points. |
Tasks | Bayesian Inference, Gaussian Processes |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2139 |
http://proceedings.mlr.press/v80/evans18a/evans18a.pdf | |
PWC | https://paperswithcode.com/paper/scalable-gaussian-processes-with-grid |
Repo | |
Framework | |
Rectify Heterogeneous Models with Semantic Mapping
Title | Rectify Heterogeneous Models with Semantic Mapping |
Authors | Han-Jia Ye, De-Chuan Zhan, Yuan Jiang, Zhi-Hua Zhou |
Abstract | On the way to the robust learner for real-world applications, there are still great challenges, including considering unknown environments with limited data. Learnware (Zhou, 2016) describes a novel perspective, and claims that learning models should have reusable and evolvable properties. We propose to Encode Meta InformaTion of features (EMIT), as the model specification for characterizing the changes, which grants the model evolvability to bridge heterogeneous feature spaces. Then, pre-trained models from related tasks can be Reused by our REctiFy via heterOgeneous pRedictor Mapping (REFORM) framework. In summary, the pre-trained model is adapted to a new environment with different features, through model refining on only a small amount of training data in the current task. Experimental results over both synthetic and real-world tasks with diverse feature configurations validate the effectiveness and practical utility of the proposed framework. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1971 |
http://proceedings.mlr.press/v80/ye2018c/ye2018c.pdf | |
PWC | https://paperswithcode.com/paper/rectify-heterogeneous-models-with-semantic |
Repo | |
Framework | |
Simplified Abugidas
Title | Simplified Abugidas |
Authors | Chenchen Ding, Masao Utiyama, Eiichiro Sumita |
Abstract | An abugida is a writing system where the consonant letters represent syllables with a default vowel and other vowels are denoted by diacritics. We investigate the feasibility of recovering the original text written in an abugida after omitting subordinate diacritics and merging consonant letters with similar phonetic values. This is crucial for developing more efficient input methods by reducing the complexity in abugidas. Four abugidas in the southern Brahmic family, i.e., Thai, Burmese, Khmer, and Lao, were studied using a newswire 20,000-sentence dataset. We compared the recovery performance of a support vector machine and an LSTM-based recurrent neural network, finding that the abugida graphemes could be recovered with 94% - 97% accuracy at the top-1 level and 98% - 99% at the top-4 level, even after omitting most diacritics (10 - 30 types) and merging the remaining 30 - 50 characters into 21 graphemes. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2078/ |
https://www.aclweb.org/anthology/P18-2078 | |
PWC | https://paperswithcode.com/paper/simplified-abugidas |
Repo | |
Framework | |
EMO&LY (EMOtion and AnomaLY) : A new corpus for anomaly detection in an audiovisual stream with emotional context.
Title | EMO&LY (EMOtion and AnomaLY) : A new corpus for anomaly detection in an audiovisual stream with emotional context. |
Authors | Cédric Fayet, Arnaud Delhay, Damien Lolive, Pierre-François Marteau |
Abstract | |
Tasks | Action Detection, Anomaly Detection, Fraud Detection, Intrusion Detection |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1339/ |
https://www.aclweb.org/anthology/L18-1339 | |
PWC | https://paperswithcode.com/paper/emoly-emotion-and-anomaly-a-new-corpus-for |
Repo | |
Framework | |
Work Smart - Reducing Effort in Short-Answer Grading
Title | Work Smart - Reducing Effort in Short-Answer Grading |
Authors | Margot Mieskes, Ulrike Padó |
Abstract | |
Tasks | Active Learning, Reading Comprehension |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-7107/ |
https://www.aclweb.org/anthology/W18-7107 | |
PWC | https://paperswithcode.com/paper/work-smart-reducing-effort-in-short-answer |
Repo | |
Framework | |
A Study of the Importance of External Knowledge in the Named Entity Recognition Task
Title | A Study of the Importance of External Knowledge in the Named Entity Recognition Task |
Authors | Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum |
Abstract | In this work, we discuss the importance of external knowledge for performing Named Entity Recognition (NER). We present a novel modular framework that divides the knowledge into four categories according to the depth of knowledge they convey. Each category consists of a set of features automatically generated from different information sources, such as a knowledge-base, a list of names, or document-specific semantic annotations. Further, we show the effects on performance when incrementally adding deeper knowledge and discuss effectiveness/efficiency trade-offs. |
Tasks | Named Entity Recognition, Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2039/ |
https://www.aclweb.org/anthology/P18-2039 | |
PWC | https://paperswithcode.com/paper/a-study-of-the-importance-of-external |
Repo | |
Framework | |
A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings
Title | A Web Service for Pre-segmenting Very Long Transcribed Speech Recordings |
Authors | Nina Poerner, Florian Schiel |
Abstract | |
Tasks | Chunking, Language Modelling, Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1452/ |
https://www.aclweb.org/anthology/L18-1452 | |
PWC | https://paperswithcode.com/paper/a-web-service-for-pre-segmenting-very-long |
Repo | |
Framework | |
Firing Bandits: Optimizing Crowdfunding
Title | Firing Bandits: Optimizing Crowdfunding |
Authors | Lalit Jain, Kevin Jamieson |
Abstract | In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit KickStarter, as a variant of the multi-armed bandit problem. In our setting, Bernoulli arms emit no rewards until their cumulative number of successes over any number of trials exceeds a fixed threshold and then provide no additional reward for any additional trials - a process reminiscent of a neuron firing once it reaches the action potential and then saturating. In the spirit of an infinite armed bandit problem, the player can add new arms whose expected probability of success is drawn iid from an unknown distribution – this endless supply of projects models the harsh reality that the number of projects seeking funding greatly exceeds the total capital available from lenders. Crowdfunding platforms naturally fall under this setting where the arms are potential projects, and their probability of success is the probability that a potential funder decides to fund it after reviewing it. The goal is to play arms (prioritize the display of projects on a webpage) to maximize the number of arms that reach the firing threshold (meet their goal amount) using as few total trials (number of impressions) as possible over all the played arms. We provide an algorithm for this setting and prove sublinear regret bounds. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2462 |
http://proceedings.mlr.press/v80/jain18a/jain18a.pdf | |
PWC | https://paperswithcode.com/paper/firing-bandits-optimizing-crowdfunding |
Repo | |
Framework | |
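
The reward model in the Firing Bandits abstract above can be illustrated with a short simulation. This is a sketch of the arm behaviour only, not the authors' algorithm or its regret analysis: the class name `FiringArm`, the helper `pulls_until_fired`, and the choice to fire when the success count reaches (rather than strictly exceeds) the threshold are assumptions made for illustration.

```python
import random


class FiringArm:
    """Bernoulli arm that pays one reward when its success count hits a threshold."""

    def __init__(self, p_success, threshold, rng):
        self.p_success = p_success  # probability that a single trial succeeds
        self.threshold = threshold  # successes needed before the arm fires
        self.rng = rng
        self.successes = 0
        self.fired = False

    def pull(self):
        """Play the arm once; return 1 only on the pull that makes it fire."""
        if self.fired:
            return 0  # a saturated arm gives no additional reward
        if self.rng.random() < self.p_success:
            self.successes += 1
        if self.successes >= self.threshold:
            self.fired = True
            return 1
        return 0


def pulls_until_fired(arm, max_pulls):
    """Return how many pulls were needed to fire the arm, or None if it never fired."""
    for n in range(1, max_pulls + 1):
        if arm.pull() == 1:
            return n
    return None


rng = random.Random(0)
sure_arm = FiringArm(p_success=1.0, threshold=3, rng=rng)
print(pulls_until_fired(sure_arm, max_pulls=10))  # 3
```

In crowdfunding terms, each pull corresponds to one impression of a project and the firing event to the project meeting its goal, so a policy in this setting is judged by how many arms it fires per pull spent.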