January 25, 2020

2951 words 14 mins read

Paper Group NAWR 22

Are Red Roses Red? Evaluating Consistency of Question-Answering Models

Title Are Red Roses Red? Evaluating Consistency of Question-Answering Models
Authors Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh
Abstract Although current evaluation of question-answering systems treats predictions in isolation, we need to consider the relationship between predictions to measure true understanding. A model should be penalized for answering “no” to “Is the rose red?” if it answers “red” to “What color is the rose?”. We propose a method to automatically extract such implications for instances from two QA datasets, VQA and SQuAD, which we then use to evaluate the consistency of models. Human evaluation shows these generated implications are well formed and valid. Consistency evaluation provides crucial insights into gaps in existing models, while retraining with implication-augmented data improves consistency on both synthetic and human-generated implications.
Tasks Question Answering, Visual Question Answering
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1621/
PDF https://www.aclweb.org/anthology/P19-1621
PWC https://paperswithcode.com/paper/are-red-roses-red-evaluating-consistency-of
Repo https://github.com/marcotcr/qa_consistency
Framework tf
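
A minimal, hedged sketch of the kind of consistency evaluation the abstract describes: given a model's answer to an original question, check whether its answers to the implied questions agree. The `qa_model` callable and the toy example are placeholders, not the authors' implication-generation pipeline (see the linked repo for that).

```python
# Sketch only: score answer consistency given implied (question, expected_answer)
# pairs. qa_model and the example data below are hypothetical.

def consistency_score(qa_model, examples):
    """examples: list of (context, question, implications), where implications is a
    list of (implied_question, expected_answer) pairs derived from the model's own
    answer to the original question."""
    checked, consistent = 0, 0
    for context, question, implications in examples:
        _ = qa_model(context, question)  # original prediction the implications follow from
        for implied_q, expected in implications:
            checked += 1
            if qa_model(context, implied_q).strip().lower() == expected.strip().lower():
                consistent += 1
    return consistent / max(checked, 1)

# Usage with a toy model that always answers "red":
toy_model = lambda context, question: "red"
examples = [("A red rose.", "What color is the rose?", [("Is the rose red?", "yes")])]
print(consistency_score(toy_model, examples))  # 0.0 -> the toy model is inconsistent
```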

Collect and Select: Semantic Alignment Metric Learning for Few-Shot Learning

Title Collect and Select: Semantic Alignment Metric Learning for Few-Shot Learning
Authors Fusheng Hao, Fengxiang He, Jun Cheng, Lei Wang, Jianzhong Cao, Dacheng Tao
Abstract Few-shot learning aims to learn latent patterns from few training examples and has shown promise in practice. However, directly calculating the distances between the query image and support images in existing methods may cause ambiguity, because dominant objects can be located anywhere in the images. To address this issue, this paper proposes a Semantic Alignment Metric Learning (SAML) method for few-shot learning that aligns the semantically relevant dominant objects through a “collect-and-select” strategy. Specifically, we first calculate a relation matrix (RM) to “collect” the distances between each pair of local regions of the 3D tensor extracted from a query image and the mean tensor of the support images. Then, an attention technique is applied to “select” the semantically relevant pairs and put more weight on them. Afterwards, a multi-layer perceptron (MLP) is used to map the reweighted RMs to their corresponding similarity scores. Theoretical analysis demonstrates the generalization ability of SAML and gives a theoretical guarantee. Empirical results demonstrate that semantic alignment is achieved. Extensive experiments on benchmark datasets validate the strengths of the proposed approach and demonstrate that SAML significantly outperforms the current state-of-the-art methods. The source code is available at https://github.com/haofusheng/SAML.
Tasks Few-Shot Learning, Metric Learning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Hao_Collect_and_Select_Semantic_Alignment_Metric_Learning_for_Few-Shot_Learning_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Hao_Collect_and_Select_Semantic_Alignment_Metric_Learning_for_Few-Shot_Learning_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/collect-and-select-semantic-alignment-metric
Repo https://github.com/haofusheng/SAML
Framework pytorch
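
Below is a minimal PyTorch sketch of the collect-and-select scoring step as described in the abstract above; the layer sizes, the distance metric and the attention form are assumptions, not the configuration in the authors' repository.

```python
# Sketch of a "collect-and-select" scoring head; sizes and distance are assumptions.
import torch
import torch.nn as nn

class SAMLHead(nn.Module):
    def __init__(self, num_regions=36, hidden=128):
        super().__init__()
        # MLP that maps the reweighted relation matrix to a similarity score.
        self.mlp = nn.Sequential(nn.Linear(num_regions * num_regions, hidden),
                                 nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, query_feats, support_mean_feats):
        # query_feats, support_mean_feats: (num_regions, channels) local descriptors.
        # "Collect": relation matrix of (negative) pairwise distances between local regions.
        rm = -torch.cdist(query_feats.unsqueeze(0), support_mean_feats.unsqueeze(0)).squeeze(0)
        # "Select": attention puts more weight on semantically relevant pairs.
        attn = torch.softmax(rm.flatten(), dim=0).view_as(rm)
        return self.mlp((rm * attn).flatten())

# Usage on random features for one query image and one class prototype:
head = SAMLHead()
print(head(torch.randn(36, 64), torch.randn(36, 64)).item())
```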

Enhancing Opinion Role Labeling with Semantic-Aware Word Representations from Semantic Role Labeling

Title Enhancing Opinion Role Labeling with Semantic-Aware Word Representations from Semantic Role Labeling
Authors Meishan Zhang, Peili Liang, Guohong Fu
Abstract Opinion role labeling (ORL) is an important task for fine-grained opinion mining, which identifies important opinion arguments such as holder and target for a given opinion trigger. The task is highly correlated with semantic role labeling (SRL), which identifies important semantic arguments such as agent and patient for a given predicate. As predicate agents and patients usually correspond to opinion holders and targets respectively, SRL could be valuable for ORL. In this work, we propose a simple and novel method to enhance ORL by utilizing SRL, presenting semantic-aware word representations which are learned from SRL. The representations are then fed into a baseline neural ORL model as basic inputs. We verify the proposed method on the benchmark MPQA corpus. Experimental results show that the proposed method is highly effective. In addition, we compare the method with two representative methods of SRL integration, finding that our method outperforms both significantly, achieving 1.47% higher F-scores than the better of the two.
Tasks Fine-Grained Opinion Analysis, Opinion Mining, Semantic Role Labeling
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1066/
PDF https://www.aclweb.org/anthology/N19-1066
PWC https://paperswithcode.com/paper/enhancing-opinion-role-labeling-with-semantic
Repo https://github.com/zhangmeishan/SRL4ORL
Framework pytorch
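
A hedged sketch of the integration idea: hidden states from a pre-trained SRL encoder serve as semantic-aware word representations and are concatenated with ordinary word embeddings before the ORL sequence labeler. The module names and dimensions here are illustrative, not the paper's architecture.

```python
# Illustrative only: concatenate frozen SRL hidden states with word embeddings.
import torch
import torch.nn as nn

class ORLInput(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)

    def forward(self, word_ids, srl_states):
        # word_ids: (seq_len,); srl_states: (seq_len, srl_dim) from a pre-trained SRL encoder.
        basic = self.word_emb(word_ids)                 # ordinary word embeddings
        return torch.cat([basic, srl_states], dim=-1)   # semantic-aware inputs for the ORL model

inputs = ORLInput()(torch.tensor([1, 5, 7]), torch.randn(3, 200))
print(inputs.shape)  # torch.Size([3, 300])
```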

We Need to Talk about Standard Splits

Title We Need to Talk about Standard Splits
Authors Kyle Gorman, Steven Bedrick
Abstract It is standard practice in speech & language technology to rank systems according to their performance on a test set held out for evaluation. However, few researchers apply statistical tests to determine whether differences in performance are likely to arise by chance, and few examine the stability of system ranking across multiple training-testing splits. We conduct replication and reproduction experiments with nine part-of-speech taggers published between 2000 and 2018, each of which claimed state-of-the-art performance on a widely-used “standard split”. While we replicate results on the standard split, we fail to reliably reproduce some rankings when we repeat this analysis with randomly generated training-testing splits. We argue that randomly generated splits should be used in system evaluation.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1267/
PDF https://www.aclweb.org/anthology/P19-1267
PWC https://paperswithcode.com/paper/we-need-to-talk-about-standard-splits
Repo https://github.com/kylebgorman/SOTA-taggers
Framework none
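
A small sketch (not the authors' code) of the evaluation protocol the paper argues for: compare two systems on many randomly generated training-testing splits and check how stable the ranking is, rather than trusting a single standard split. The `train_and_eval_*` callables stand in for whatever taggers are being compared.

```python
# Sketch of ranking-stability checking over random splits; taggers are placeholders.
import random

def ranking_stability(sentences, train_and_eval_a, train_and_eval_b,
                      n_splits=20, train_frac=0.9, seed=0):
    """train_and_eval_*: callables (train_set, test_set) -> accuracy."""
    rng = random.Random(seed)
    wins_a = 0
    for _ in range(n_splits):
        data = sentences[:]
        rng.shuffle(data)
        cut = int(train_frac * len(data))
        train, test = data[:cut], data[cut:]
        if train_and_eval_a(train, test) > train_and_eval_b(train, test):
            wins_a += 1
    # A value far from both 0 and 1 means the ranking is not stable across splits.
    return wins_a / n_splits
```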

Fast classification of small X-ray diffraction datasets using data augmentation and deep neural networks

Title Fast classification of small X-ray diffraction datasets using data augmentation and deep neural networks
Authors Felipe Oviedo, Zekun Ren, Shijing Sun, Charles Settens, Zhe Liu, Noor Titan Putri Hartono, Savitha Ramasamy, Brian L. DeCost, Siyu I. P. Tian, Giuseppe Romano, Aaron Gilad Kusne, Tonio Buonassisi
Abstract X-ray diffraction (XRD) data acquisition and analysis is among the most time-consuming steps in the development cycle of novel thin-film materials. We propose a machine learning-enabled approach to predict crystallographic dimensionality and space group from a limited number of thin-film XRD patterns. We overcome the scarce data problem intrinsic to novel materials development by coupling a supervised machine learning approach with a model-agnostic, physics-informed data augmentation strategy using simulated data from the Inorganic Crystal Structure Database (ICSD) and experimental data. As a test case, 115 thin-film metal-halides spanning three dimensionalities and seven space groups are synthesized and classified. After testing various algorithms, we develop and implement an all convolutional neural network, with cross-validated accuracies for dimensionality and space group classification of 93 and 89%, respectively. We propose average class activation maps, computed from a global average pooling layer, to allow high model interpretability by human experimentalists, elucidating the root causes of misclassification. Finally, we systematically evaluate the maximum XRD pattern step size (data acquisition rate) before loss of predictive accuracy occurs, and determine it to be 0.16° 2θ, which enables an XRD pattern to be obtained and classified in 5.5 min or less.
Tasks Data Augmentation, Interpretable Machine Learning, Material Classification, Material Recognition, Time Series Classification, X-Ray Diffraction (XRD)
Published 2019-05-17
URL https://www.nature.com/articles/s41524-019-0196-x
PDF https://www.nature.com/articles/s41524-019-0196-x.pdf
PWC https://paperswithcode.com/paper/fast-classification-of-small-x-ray-1
Repo https://github.com/PV-Lab/autoXRD
Framework tf
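
The class activation maps mentioned in the abstract follow the standard global-average-pooling construction: the map for a class is the channel-wise weighted sum of the last convolutional feature maps, using that class's dense-layer weights. The sketch below shows that computation with assumed shapes; it is not the autoXRD code.

```python
# Standard CAM computation for a 1-D diffraction pattern; shapes are assumed.
import numpy as np

def class_activation_map(conv_features, fc_weights, class_idx):
    # conv_features: (channels, length) output of the last conv layer for one pattern.
    # fc_weights: (num_classes, channels) weights of the dense layer after global average pooling.
    cam = fc_weights[class_idx] @ conv_features        # (length,)
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)                    # normalise to [0, 1] for plotting

cam = class_activation_map(np.random.rand(64, 300), np.random.rand(7, 64), class_idx=2)
print(cam.shape)  # (300,) -> which 2-theta regions drove the space-group prediction
```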

Pooled Contextualized Embeddings for Named Entity Recognition

Title Pooled Contextualized Embeddings for Named Entity Recognition
Authors Alan Akbik, Tanja Bergmann, Roland Vollgraf
Abstract Contextual string embeddings are a recent type of contextualized word embedding that were shown to yield state-of-the-art results when utilized in a range of sequence labeling tasks. They are based on character-level language models which treat text as distributions over characters and are capable of generating embeddings for any string of characters within any textual context. However, such purely character-based approaches struggle to produce meaningful embeddings if a rare string is used in an underspecified context. To address this drawback, we propose a method in which we dynamically aggregate contextualized embeddings of each unique string that we encounter. We then use a pooling operation to distill a “global” word representation from all contextualized instances. We evaluate these “pooled contextualized embeddings” on common named entity recognition (NER) tasks such as CoNLL-03 and WNUT and show that our approach significantly improves the state-of-the-art for NER. We make all code and pre-trained models available to the research community for use and reproduction.
Tasks Named Entity Recognition
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1078/
PDF https://www.aclweb.org/anthology/N19-1078
PWC https://paperswithcode.com/paper/pooled-contextualized-embeddings-for-named
Repo https://github.com/zalandoresearch/flair
Framework pytorch
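
A minimal sketch of the pooling mechanism the abstract describes, under assumptions about the details: every contextual embedding seen for a surface form is stored in a memory, pooled, and concatenated with the current contextual embedding. The released implementation lives in the flair repository linked above.

```python
# Sketch only: memory-based pooling of contextual embeddings per surface form.
import numpy as np
from collections import defaultdict

class PooledEmbeddings:
    def __init__(self, pool="mean"):
        self.memory = defaultdict(list)
        self.pool = pool

    def embed(self, word, contextual_vec):
        self.memory[word].append(contextual_vec)
        stacked = np.stack(self.memory[word])
        pooled = stacked.mean(0) if self.pool == "mean" else stacked.max(0)
        return np.concatenate([contextual_vec, pooled])  # local + global view of the word

emb = PooledEmbeddings()
v1 = emb.embed("Fung", np.random.rand(8))   # rare string, first (underspecified) context
v2 = emb.embed("Fung", np.random.rand(8))   # later contexts refine the pooled part
print(v1.shape, v2.shape)  # (16,) (16,)
```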

Deep Copycat Networks for Text-to-Text Generation

Title Deep Copycat Networks for Text-to-Text Generation
Authors Julia Ive, Pranava Madhyastha, Lucia Specia
Abstract Most text-to-text generation tasks, for example text summarisation and text simplification, require copying words from the input to the output. We introduce Copycat, a transformer-based pointer network for such tasks which obtains competitive results in abstractive text summarisation and generates more abstractive summaries. We propose a further extension of this architecture for automatic post-editing, where generation is conditioned over two inputs (source language and machine translation), and the model is capable of deciding where to copy information from. This approach achieves competitive performance when compared to state-of-the-art automated post-editing systems. More importantly, we show that it addresses a well-known limitation of automatic post-editing - overcorrecting translations - and that our novel mechanism for copying source language words improves the results.
Tasks Automatic Post-Editing, Machine Translation, Text Generation, Text Simplification
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1318/
PDF https://www.aclweb.org/anthology/D19-1318
PWC https://paperswithcode.com/paper/deep-copycat-networks-for-text-to-text
Repo https://github.com/ImperialNLP/CopyCat
Framework tf
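
The copy behaviour described above rests on a pointer-style mixing step; the sketch below shows the generic pointer-generator formulation (a learned gate mixes a vocabulary distribution with attention over source tokens), which is illustrative rather than the authors' exact parameterisation.

```python
# Generic pointer-generator mixing step (illustrative, not the paper's exact model).
import numpy as np

def mix_copy_and_generate(p_vocab, attention, src_token_ids, p_gen):
    # p_vocab: (vocab_size,) generation distribution; attention: (src_len,) over source tokens;
    # p_gen: scalar gate deciding how much probability mass goes to generation vs. copying.
    final = p_gen * p_vocab
    for pos, tok in enumerate(src_token_ids):
        final[tok] += (1.0 - p_gen) * attention[pos]   # route copy mass to source words
    return final

p_vocab = np.full(10, 0.1)
attention = np.array([0.7, 0.2, 0.1])
final = mix_copy_and_generate(p_vocab, attention, src_token_ids=[4, 2, 9], p_gen=0.6)
print(final.sum())  # 1.0 -> still a valid distribution
```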

SNU IDS at SemEval-2019 Task 3: Addressing Training-Test Class Distribution Mismatch in Conversational Classification

Title SNU IDS at SemEval-2019 Task 3: Addressing Training-Test Class Distribution Mismatch in Conversational Classification
Authors Sanghwan Bae, Jihun Choi, Sang-goo Lee
Abstract We present several techniques to tackle the mismatch in class distributions between training and test data in the Contextual Emotion Detection task of SemEval 2019, extending existing methods for the class imbalance problem. By reducing the distance between the predicted distribution and the ground-truth distribution, these techniques consistently improve performance. We also propose a novel neural architecture which utilizes a representation of the overall context as well as of each utterance. The combination of the methods and the models achieved a micro F1 score of about 0.766 on the final evaluation.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2054/
PDF https://www.aclweb.org/anthology/S19-2054
PWC https://paperswithcode.com/paper/snu-ids-at-semeval-2019-task-3-addressing
Repo https://github.com/baaesh/semeval19_task3
Framework pytorch
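
One simple technique in the spirit of the abstract's prior-mismatch correction is to rescale predicted class probabilities by the ratio of (assumed) test priors to training priors and renormalise; this generic adjustment is sketched below and is not necessarily the exact method the authors used.

```python
# Generic prior-shift adjustment of softmax outputs; class names/priors are illustrative.
import numpy as np

def adjust_for_prior_shift(probs, train_prior, test_prior):
    # probs: (n_samples, n_classes) softmax outputs of a model trained under train_prior.
    w = np.asarray(test_prior) / np.asarray(train_prior)
    adjusted = probs * w
    return adjusted / adjusted.sum(axis=1, keepdims=True)

probs = np.array([[0.5, 0.3, 0.1, 0.1]])          # e.g. happy/sad/angry/others scores
train_prior = [0.25, 0.25, 0.25, 0.25]
test_prior = [0.04, 0.04, 0.04, 0.88]             # "others" dominates at test time
print(adjust_for_prior_shift(probs, train_prior, test_prior))
```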

Robust Person Re-Identification by Modelling Feature Uncertainty

Title Robust Person Re-Identification by Modelling Feature Uncertainty
Authors Tianyuan Yu, Da Li, Yongxin Yang, Timothy M. Hospedales, Tao Xiang
Abstract We aim to learn deep person re-identification (ReID) models that are robust against noisy training data. Two types of noise are prevalent in practice: (1) label noise caused by human annotator errors and (2) data outliers caused by person detector errors or occlusion. Both types of noise pose serious problems for training ReID models, yet have been largely ignored so far. In this paper, we propose a novel deep network termed DistributionNet for robust ReID. Instead of representing each person image as a feature vector, DistributionNet models it as a Gaussian distribution with its variance representing the uncertainty of the extracted features. A carefully designed loss is formulated in DistributionNet to unevenly allocate uncertainty across training samples. Consequently, noisy samples are assigned large variance/uncertainty, which effectively alleviates their negative impacts on model fitting. Extensive experiments demonstrate that our model is more effective than alternative noise-robust deep models. The source code is available at: https://github.com/TianyuanYu/DistributionNet
Tasks Person Re-Identification
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Robust_Person_Re-Identification_by_Modelling_Feature_Uncertainty_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Yu_Robust_Person_Re-Identification_by_Modelling_Feature_Uncertainty_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/robust-person-re-identification-by-modelling
Repo https://github.com/TianyuanYu/DistributionNet
Framework tf
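
A hedged PyTorch sketch of the core modelling idea: each image is encoded as a Gaussian over features rather than a point, with samples drawn via the reparameterisation trick so that high-variance (uncertain) images contribute softer features. The encoder head and its sizes below are placeholders, not DistributionNet itself.

```python
# Sketch of a Gaussian feature head with reparameterised sampling; sizes are assumed.
import torch
import torch.nn as nn

class GaussianEmbedding(nn.Module):
    def __init__(self, in_dim=512, feat_dim=128):
        super().__init__()
        self.mu = nn.Linear(in_dim, feat_dim)
        self.log_var = nn.Linear(in_dim, feat_dim)

    def forward(self, x):
        mu, log_var = self.mu(x), self.log_var(x)
        feat = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterised sample
        return feat, mu, log_var

enc = GaussianEmbedding()
feat, mu, log_var = enc(torch.randn(4, 512))
print(feat.shape, log_var.exp().mean().item())  # per-image uncertainty estimate
```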

Continuous Quality Control and Advanced Text Segment Annotation with WAT-SL 2.0

Title Continuous Quality Control and Advanced Text Segment Annotation with WAT-SL 2.0
Authors Christina Lohr, Johannes Kiesel, Stephanie Luther, Johannes Hellrich, Tobias Kolditz, Benno Stein, Udo Hahn
Abstract Today's widely used annotation tools were designed for annotating typically short textual mentions of entities or relations, making their interface cumbersome to use for long(er) stretches of text, e.g., sentences running over several lines in a document. They also lack systematic support for hierarchically structured labels, i.e., one label being conceptually more general than another (e.g., anamnesis in relation to family anamnesis). Moreover, as a more fundamental shortcoming of today's tools, they provide no continuous quality control mechanisms for the annotation process, an essential feature to intrinsically support iterative cycles in the development of annotation guidelines. We alleviated these problems by developing WAT-SL 2.0, an open-source web-based annotation tool for long-segment labeling, hierarchically structured label sets and built-ins for quality control.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4025/
PDF https://www.aclweb.org/anthology/W19-4025
PWC https://paperswithcode.com/paper/continuous-quality-control-and-advanced-text
Repo https://github.com/webis-de/wat
Framework none

Unsupervised Multilingual Word Embedding with Limited Resources using Neural Language Models

Title Unsupervised Multilingual Word Embedding with Limited Resources using Neural Language Models
Authors Takashi Wada, Tomoharu Iwata, Yuji Matsumoto
Abstract Recently, a variety of unsupervised methods have been proposed that map pre-trained word embeddings of different languages into the same space without any parallel data. These methods aim to find a linear transformation based on the assumption that monolingual word embeddings are approximately isomorphic between languages. However, it has been demonstrated that this assumption holds true only under specific conditions, and with limited resources the performance of these methods decreases drastically. To overcome this problem, we propose a new unsupervised multilingual embedding method that does not rely on such an assumption and performs well under resource-poor scenarios, namely when only a small amount of monolingual data (i.e., 50k sentences) is available, or when the domains of monolingual data differ across languages. Our proposed model, which we call ‘Multilingual Neural Language Models’, shares some of the network parameters among multiple languages, and encodes sentences of multiple languages into the same space. The model jointly learns word embeddings of different languages in the same space, and generates multilingual embeddings without any parallel data or pre-training. Our experiments on word alignment tasks have demonstrated that, in the low-resource condition, our model substantially outperforms existing unsupervised and even supervised methods trained with 500 bilingual pairs of words. Our model also outperforms unsupervised methods given different-domain corpora across languages. Our code is publicly available.
Tasks Word Alignment, Word Embeddings
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1300/
PDF https://www.aclweb.org/anthology/P19-1300
PWC https://paperswithcode.com/paper/unsupervised-multilingual-word-embedding-with
Repo https://github.com/twadada/multilingual-nlm
Framework pytorch

A Bayesian Theory of Conformity in Collective Decision Making

Title A Bayesian Theory of Conformity in Collective Decision Making
Authors Koosha Khalvati, Saghar Mirbagheri, Seongmin A. Park, Jean-Claude Dreher, Rajesh P. N. Rao
Abstract In collective decision making, members of a group need to coordinate their actions in order to achieve a desirable outcome. When there is no direct communication between group members, one should decide based on inferring others’ intentions from their actions. The inference of others’ intentions is called “theory of mind” and can involve different levels of reasoning, from a single inference on a hidden variable to considering others partially or fully optimal and reasoning about their actions conditioned on one’s own actions (levels of “theory of mind”). In this paper, we present a new Bayesian theory of collective decision making based on a simple yet most commonly observed behavior: conformity. We show that such a Bayesian framework allows one to achieve any level of theory of mind in collective decision making. The viability of our framework is demonstrated on two different experiments, a consensus task with 120 subjects and a volunteer’s dilemma task with 29 subjects, each with multiple conditions.
Tasks Decision Making
Published 2019-12-01
URL http://papers.nips.cc/paper/9164-a-bayesian-theory-of-conformity-in-collective-decision-making
PDF http://papers.nips.cc/paper/9164-a-bayesian-theory-of-conformity-in-collective-decision-making.pdf
PWC https://paperswithcode.com/paper/a-bayesian-theory-of-conformity-in-collective
Repo https://github.com/kooosha/BayesianConformity
Framework none

Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration

Title Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration
Authors Clarice Poon, Jingwei Liang
Abstract The alternating direction method of multipliers (ADMM) is one of the most widely used first-order optimisation methods in the literature owing to its simplicity, flexibility and efficiency. Over the years, numerous efforts have been made to improve the performance of the method, such as the inertial technique. By studying the geometric properties of ADMM, we discuss the limitations of current inertial accelerated ADMM and then present and analyze an adaptive acceleration scheme for the method. Numerical experiments on problems arising from image processing, statistics and machine learning demonstrate the advantages of the proposed acceleration approach.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8955-trajectory-of-alternating-direction-method-of-multipliers-and-adaptive-acceleration
PDF http://papers.nips.cc/paper/8955-trajectory-of-alternating-direction-method-of-multipliers-and-adaptive-acceleration.pdf
PWC https://paperswithcode.com/paper/trajectory-of-alternating-direction-method-of
Repo https://github.com/jliang993/A3DMM
Framework none
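
To make the iteration being accelerated concrete, here is a standard numpy ADMM for the lasso; the adaptive and inertial acceleration studied in the paper is not reproduced here (see the A3DMM repository for that).

```python
# Textbook ADMM for the lasso: minimise 0.5*||Ax - b||^2 + lam*||z||_1 subject to x = z.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, b, lam=0.1, rho=1.0, n_iter=200):
    n = A.shape[1]
    x = z = u = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    inv = np.linalg.inv(AtA + rho * np.eye(n))        # cached factor for the x-update
    for _ in range(n_iter):
        x = inv @ (Atb + rho * (z - u))               # x-update (quadratic subproblem)
        z = soft_threshold(x + u, lam / rho)          # z-update (proximal step for the l1 term)
        u = u + x - z                                 # dual update
    return z

A, b = np.random.randn(50, 20), np.random.randn(50)
print(np.count_nonzero(np.round(admm_lasso(A, b), 4)))  # sparse solution
```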

Learning to Extract Flawless Slow Motion From Blurry Videos

Title Learning to Extract Flawless Slow Motion From Blurry Videos
Authors Meiguang Jin, Zhe Hu, Paolo Favaro
Abstract In this paper, we introduce the task of generating a sharp slow-motion video given a low frame rate blurry video. We propose a data-driven approach, where the training data is captured with a high frame rate camera and blurry images are simulated through an averaging process. While it is possible to train a neural network to recover the sharp frames from their average, there is no guarantee of temporal smoothness for the resulting video, as the frames are estimated independently. To address the temporal smoothness requirement, we propose a system with two networks: one, DeblurNet, to predict sharp keyframes, and a second, InterpNet, to predict intermediate frames between the generated keyframes. A smooth transition is ensured by interpolating between consecutive keyframes using InterpNet. Moreover, the proposed scheme enables a further increase in frame rate without retraining the network, by applying InterpNet recursively between pairs of sharp frames. We evaluate the proposed method on several datasets, including a novel dataset captured with a Sony RX V camera. We also demonstrate its ability to increase the frame rate up to 20 times on real blurry videos.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Jin_Learning_to_Extract_Flawless_Slow_Motion_From_Blurry_Videos_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Jin_Learning_to_Extract_Flawless_Slow_Motion_From_Blurry_Videos_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-to-extract-flawless-slow-motion-from
Repo https://github.com/MeiguangJin/slow-motion
Framework pytorch
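
A small sketch of the recursive frame-rate increase described in the abstract: an interpolation network is applied repeatedly between consecutive frames, doubling the frame rate per pass. The interp_net argument below is a placeholder for InterpNet.

```python
# Illustrative only: recursive midpoint interpolation doubles the frame rate per pass.
def upsample_frames(frames, interp_net, passes=2):
    for _ in range(passes):
        out = []
        for a, b in zip(frames, frames[1:]):
            out.extend([a, interp_net(a, b)])     # insert a predicted middle frame
        out.append(frames[-1])
        frames = out
    return frames

# Usage with a toy "network" that averages numeric stand-ins for frames:
print(upsample_frames([0.0, 8.0], interp_net=lambda a, b: (a + b) / 2, passes=2))
# [0.0, 2.0, 4.0, 6.0, 8.0]
```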

Unsupervised Co-Learning on G-Manifolds Across Irreducible Representations

Title Unsupervised Co-Learning on G-Manifolds Across Irreducible Representations
Authors Yifeng Fan, Tingran Gao, Zhizhen Jane Zhao
Abstract We introduce a novel co-learning paradigm for manifolds naturally admitting an action of a transformation group $\mathcal{G}$, motivated by recent developments on learning a manifold from attached fibre bundle structures. We utilize a representation-theoretic mechanism that canonically associates multiple independent vector bundles over a common base manifold, which provides multiple views of the geometry of the underlying manifold. The consistency across these fibre bundles provides a common basis for performing unsupervised manifold co-learning through the redundancy created artificially across irreducible representations of the transformation group. We demonstrate the efficacy of our proposed algorithmic paradigm through drastically improved robust nearest-neighbor identification in cryo-electron microscopy image analysis and improved clustering accuracy in community detection.
Tasks Community Detection
Published 2019-12-01
URL http://papers.nips.cc/paper/9105-unsupervised-co-learning-on-g-manifolds-across-irreducible-representations
PDF http://papers.nips.cc/paper/9105-unsupervised-co-learning-on-g-manifolds-across-irreducible-representations.pdf
PWC https://paperswithcode.com/paper/unsupervised-co-learning-on-g-manifolds
Repo https://github.com/frankfyf/G-manifold-learning
Framework none