January 26, 2020

2865 words 14 mins read

Paper Group ANR 1415

A Study of BERT for Non-Factoid Question-Answering under Passage Length Constraints. Multi-modal Discriminative Model for Vision-and-Language Navigation. Mean-field inference methods for neural networks. AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving. Multi-hop Question Answering via Reasoning Chains. Diachronic Topics …

A Study of BERT for Non-Factoid Question-Answering under Passage Length Constraints

Title A Study of BERT for Non-Factoid Question-Answering under Passage Length Constraints
Authors Yosi Mass, Haggai Roitman, Shai Erera, Or Rivlin, Bar Weiner, David Konopnicki
Abstract We study the use of BERT for non-factoid question-answering, focusing on the passage re-ranking task under varying passage lengths. To this end, we explore the fine-tuning of BERT in different learning-to-rank setups, comprising both point-wise and pair-wise methods, resulting in substantial improvements over the state-of-the-art. We then analyze the effectiveness of BERT for different passage lengths and suggest how to cope with large passages.
Tasks Learning-To-Rank, Passage Re-Ranking, Question Answering
Published 2019-08-19
URL https://arxiv.org/abs/1908.06780v1
PDF https://arxiv.org/pdf/1908.06780v1.pdf
PWC https://paperswithcode.com/paper/a-study-of-bert-for-non-factoid-question
Repo
Framework
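
As a quick illustration of the point-wise learning-to-rank setup mentioned in the abstract, here is a minimal re-ranking sketch built on Hugging Face Transformers; the checkpoint, query, and passages are placeholders rather than the paper's setup, and the classification head here is untrained.

```python
# Point-wise BERT passage re-ranking sketch (not the authors' code).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # classifier head is freshly initialized
model.eval()

query = "why does my sourdough starter smell like acetone?"
passages = [
    "A starved starter produces acetone-like odours; feed it more often.",
    "Sourdough bread is baked at around 230 degrees Celsius.",
]

scores = []
with torch.no_grad():
    for passage in passages:
        # Point-wise setup: each (query, passage) pair is scored independently.
        inputs = tokenizer(query, passage, truncation=True, max_length=256,
                           return_tensors="pt")
        logits = model(**inputs).logits
        scores.append(logits[0, 1].item())   # "relevant" class logit as score

for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:+.3f}  {passage}")
```

A pair-wise variant would instead compare two passages for the same query and train on which of the two is more relevant.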

Multi-modal Discriminative Model for Vision-and-Language Navigation

Title Multi-modal Discriminative Model for Vision-and-Language Navigation
Authors Haoshuo Huang, Vihan Jain, Harsh Mehta, Jason Baldridge, Eugene Ie
Abstract Vision-and-Language Navigation (VLN) is a natural language grounding task where agents have to interpret natural language instructions in the context of visual scenes in a dynamic environment to achieve prescribed navigation goals. Successful agents must have the ability to parse natural language of varying linguistic styles, ground it in potentially unfamiliar scenes, and plan and react to ambiguous environmental feedback. Generalization ability is limited by the amount of human annotated data. In particular, paired vision-language sequence data is expensive to collect. We develop a discriminator that evaluates how well an instruction explains a given path in the VLN task using multi-modal alignment. Our study reveals that only a small fraction of the high-quality augmented data from Fried et al. (2018), as scored by our discriminator, is useful for training VLN agents with similar performance on previously unseen environments. We also show that a VLN agent warm-started with pre-trained components from the discriminator outperforms the benchmark success rate of 35.5 by 10% relative on previously unseen environments.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1905.13358v1
PDF https://arxiv.org/pdf/1905.13358v1.pdf
PWC https://paperswithcode.com/paper/multi-modal-discriminative-model-for-vision
Repo
Framework
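
The sketch below shows the general shape of an instruction-path alignment discriminator in PyTorch; the encoders, dimensions, and toy tensors are assumptions for illustration and do not reproduce the paper's multi-modal alignment architecture.

```python
# Hedged sketch: score how well an instruction explains a path of visual
# observations, trained as a binary alignment discriminator.
import torch
import torch.nn as nn

class AlignmentDiscriminator(nn.Module):
    def __init__(self, vocab_size=1000, word_dim=64, visual_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, word_dim)
        self.text_enc = nn.GRU(word_dim, hidden, batch_first=True)
        self.path_enc = nn.GRU(visual_dim, hidden, batch_first=True)

    def forward(self, instr_tokens, path_feats):
        # instr_tokens: (B, T_text) word ids; path_feats: (B, T_path, visual_dim)
        _, h_text = self.text_enc(self.embed(instr_tokens))
        _, h_path = self.path_enc(path_feats)
        # Alignment score: dot product of the two final hidden states.
        return (h_text[-1] * h_path[-1]).sum(dim=-1)

disc = AlignmentDiscriminator()
instr = torch.randint(0, 1000, (2, 12))   # toy instruction token ids
path = torch.randn(2, 6, 128)             # toy per-step visual features
labels = torch.tensor([1.0, 0.0])         # 1 = instruction explains the path
loss = nn.BCEWithLogitsLoss()(disc(instr, path), labels)
loss.backward()
print(loss.item())
```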

Mean-field inference methods for neural networks

Title Mean-field inference methods for neural networks
Authors Marylou Gabrié
Abstract Machine learning algorithms relying on deep neural networks recently allowed a great leap forward in artificial intelligence. Despite the popularity of their applications, the efficiency of these algorithms remains largely unexplained from a theoretical point of view. The mathematical description of learning problems involves very large collections of interacting random variables, difficult to handle analytically as well as numerically. This complexity is precisely the object of study of statistical physics. Its mission, originally pointed towards natural systems, is to understand how macroscopic behaviors arise from microscopic laws. Mean-field methods are one type of approximation strategy developed in this view. We review a selection of classical mean-field methods and recent progress relevant for inference in neural networks. In particular, we remind the principles of derivations of high-temperature expansions, the replica method and message passing algorithms, highlighting their equivalences and complementarities. We also provide references for past and current directions of research on neural networks relying on mean-field methods.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.00890v2
PDF https://arxiv.org/pdf/1911.00890v2.pdf
PWC https://paperswithcode.com/paper/mean-field-inference-methods-for-neural
Repo
Framework
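
As a concrete example of the classical methods the review covers, here is naive mean-field inference on a small Ising model, iterating the self-consistent magnetisation equations to a fixed point (toy couplings and fields chosen here for illustration).

```python
# Naive mean-field for an Ising model: iterate
#   m_i = tanh( beta * (sum_j J_ij m_j + h_i) )
# until the magnetisations reach a fixed point.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 20, 0.5
J = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
h = rng.normal(scale=0.1, size=n)

m = np.zeros(n)                      # mean-field magnetisations
for _ in range(200):
    m_new = np.tanh(beta * (J @ m + h))
    if np.max(np.abs(m_new - m)) < 1e-8:
        m = m_new
        break
    m = 0.5 * m + 0.5 * m_new        # damping for a stable fixed-point iteration

print("mean-field magnetisations:", np.round(m, 3))
```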

AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving

Title AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving
Authors Sumanth Chennupati, Ganesh Sistu, Senthil Yogamani, Samir Rawashdeh
Abstract Decision making in automated driving is highly specific to the environment, and thus semantic segmentation plays a key role in recognizing the objects in the environment around the car. Pixel-level classification, once considered a challenging task, is now becoming mature enough to be productized in a car. However, semantic annotation is time consuming and quite expensive. Synthetic datasets with domain adaptation techniques have been used to alleviate the lack of large annotated datasets. In this work, we explore an alternate approach of leveraging the annotations of other tasks to improve semantic segmentation. Recently, multi-task learning has become a popular paradigm in automated driving, demonstrating that jointly learning multiple tasks improves the overall performance of each task. Motivated by this, we use auxiliary tasks like depth estimation to improve the performance of the semantic segmentation task. We propose adaptive task loss weighting techniques to address scale issues in multi-task loss functions, which become more crucial with auxiliary tasks. We experimented on automotive datasets including SYNTHIA and KITTI and obtained 3% and 5% improvements in accuracy, respectively.
Tasks Decision Making, Depth Estimation, Domain Adaptation, Multi-Task Learning, Semantic Segmentation
Published 2019-01-17
URL http://arxiv.org/abs/1901.05808v1
PDF http://arxiv.org/pdf/1901.05808v1.pdf
PWC https://paperswithcode.com/paper/auxnet-auxiliary-tasks-enhanced-semantic
Repo
Framework
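
The paper's adaptive task loss weighting is not reproduced here, but the sketch below illustrates one common adaptive scheme, learnable per-task log-variances that rebalance losses living on different scales; treat it as a stand-in, not the authors' method.

```python
# Adaptive multi-task loss weighting via learnable per-task log-variances
# (a common scheme used here as a stand-in for the paper's technique).
import torch
import torch.nn as nn

class AdaptiveTaskLoss(nn.Module):
    def __init__(self, num_tasks=2):
        super().__init__()
        # One learnable log-variance per task; a large variance downweights a task.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

weighting = AdaptiveTaskLoss(num_tasks=2)
seg_loss = torch.tensor(2.3, requires_grad=True)     # toy segmentation loss
depth_loss = torch.tensor(0.07, requires_grad=True)  # toy depth loss, other scale
total = weighting([seg_loss, depth_loss])
total.backward()
print(total.item(), weighting.log_vars.grad)
```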

Multi-hop Question Answering via Reasoning Chains

Title Multi-hop Question Answering via Reasoning Chains
Authors Jifan Chen, Shih-ting Lin, Greg Durrett
Abstract Multi-hop question answering requires models to gather information from different parts of a text to answer a question. Most current approaches learn to address this task in an end-to-end way with neural networks, without maintaining an explicit representation of the reasoning process. We propose a method to extract a discrete reasoning chain over the text, which consists of a series of sentences leading to the answer. We then feed the extracted chains to a BERT-based QA model to do final answer prediction. Critically, we do not rely on gold annotated chains or “supporting facts”: at training time, we derive pseudo-gold reasoning chains using heuristics based on named entity recognition and coreference resolution. Nor do we rely on these annotations at test time, as our model learns to extract chains from raw text alone. We test our approach on two recently proposed large multi-hop question answering datasets: WikiHop and HotpotQA, and achieve state-of-the-art performance on WikiHop and strong performance on HotpotQA. Our analysis shows the properties of chains that are crucial for high performance: in particular, modeling extraction sequentially is important, as is dealing with each candidate sentence in a context-aware way. Furthermore, human evaluation shows that our extracted chains allow humans to give answers with high confidence, indicating that these are a strong intermediate abstraction for this task.
Tasks Coreference Resolution, Named Entity Recognition, Question Answering
Published 2019-10-07
URL https://arxiv.org/abs/1910.02610v1
PDF https://arxiv.org/pdf/1910.02610v1.pdf
PWC https://paperswithcode.com/paper/multi-hop-question-answering-via-reasoning
Repo
Framework
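
A toy sketch of the pseudo-gold chain idea: greedily link sentences that share an entity mention until the answer is reached. Plain string matching stands in for the NER and coreference heuristics used in the paper.

```python
# Derive a toy pseudo-gold reasoning chain by following shared entity strings
# from a question entity to the answer (stand-in for NER/coreference heuristics).
def extract_chain(sentences, question_entity, answer):
    chain, frontier = [], {question_entity.lower()}
    remaining = list(sentences)
    while remaining:
        # Pick the first unused sentence mentioning an entity in the frontier.
        hit = next((s for s in remaining
                    if any(e in s.lower() for e in frontier)), None)
        if hit is None:
            break
        chain.append(hit)
        remaining.remove(hit)
        # Capitalised tokens serve as crude entity mentions.
        frontier.update(w.lower() for w in hit.split() if w[0].isupper())
        if answer.lower() in hit.lower():
            return chain
    return chain

sents = [
    "Alan Turing worked at Bletchley Park.",
    "Bletchley Park is located in Milton Keynes.",
    "Milton Keynes is a town in England.",
]
print(extract_chain(sents, "Alan Turing", "Milton Keynes"))
```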

Diachronic Topics in New High German Poetry

Title Diachronic Topics in New High German Poetry
Authors Thomas N. Haider
Abstract Statistical topic models are increasingly and popularly used by Digital Humanities scholars to perform distant reading tasks on literary data. They allow us to estimate what people talk about. Latent Dirichlet Allocation (LDA) in particular has shown its usefulness, as it is unsupervised, robust, easy to use, scalable, and it offers interpretable results. In a preliminary study, we apply LDA to a corpus of New High German poetry (textgrid, with 51k poems and 8m tokens), and use the distribution of topics over documents for a classification of poems into time periods and for authorship attribution.
Tasks Topic Models
Published 2019-09-24
URL https://arxiv.org/abs/1909.11189v1
PDF https://arxiv.org/pdf/1909.11189v1.pdf
PWC https://paperswithcode.com/paper/diachronic-topics-in-new-high-german-poetry
Repo
Framework
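
For readers unfamiliar with the workflow, here is a minimal LDA example with scikit-learn on a toy corpus; the actual study uses the textgrid poetry corpus and different settings.

```python
# Minimal LDA sketch: fit topics on a toy collection and inspect the
# per-document topic distribution (the features used downstream for
# period classification and authorship attribution).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "moon night stars silver light",
    "night dream sleep silver moon",
    "war soldier blood iron battle",
    "battle sword iron honour war",
]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)       # topic distribution per document

terms = vectorizer.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = comp.argsort()[-4:][::-1]
    print(f"topic {k}:", [terms[i] for i in top])
print(doc_topics.round(2))
```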

Variational Bayesian Context-aware Representation for Grocery Recommendation

Title Variational Bayesian Context-aware Representation for Grocery Recommendation
Authors Zaiqiao Meng, Richard McCreadie, Craig Macdonald, Iadh Ounis
Abstract Grocery recommendation is an important recommendation use-case, which aims to predict which items a user might choose to buy in the future, based on their shopping history. However, existing methods only represent each user and item by a single deterministic point in a low-dimensional continuous space. In addition, most of these methods are trained by maximizing the co-occurrence likelihood with a simple Skip-gram-based formulation, which limits the expressive ability of their embeddings and the resulting recommendation performance. In this paper, we propose the Variational Bayesian Context-Aware Representation (VBCAR) model for grocery recommendation, which is a novel variational Bayesian model that learns the user and item latent vectors by leveraging basket context information from past user-item interactions. We train our VBCAR model based on the Bayesian Skip-gram framework coupled with amortized variational inference so that it can learn more expressive latent representations that integrate both non-linearity and Bayesian behaviour. Experiments conducted on a large real-world grocery recommendation dataset show that our proposed VBCAR model can significantly outperform existing state-of-the-art grocery recommendation methods.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07705v2
PDF https://arxiv.org/pdf/1909.07705v2.pdf
PWC https://paperswithcode.com/paper/variational-bayesian-context-aware
Repo
Framework
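
A hedged sketch of the variational Bayesian skip-gram ingredient: Gaussian user/item embeddings sampled with the reparameterisation trick, a dot-product co-occurrence score, and a KL regulariser. Dimensions and weights are toy values, not the VBCAR configuration.

```python
# Variational Bayesian skip-gram step, heavily simplified (not the VBCAR code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianEmbedding(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.mu = nn.Embedding(n, dim)
        self.logvar = nn.Embedding(n, dim)

    def forward(self, idx):
        mu, logvar = self.mu(idx), self.logvar(idx)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return z, kl

users, items = GaussianEmbedding(100, 16), GaussianEmbedding(500, 16)
u, i = torch.tensor([3, 7]), torch.tensor([42, 11])   # toy observed interactions
zu, kl_u = users(u)
zi, kl_i = items(i)
logits = (zu * zi).sum(-1)                            # co-occurrence score
loss = F.binary_cross_entropy_with_logits(logits, torch.ones(2)) \
       + 1e-3 * (kl_u + kl_i).mean()
loss.backward()
print(loss.item())
```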

What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues

Title What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues
Authors Xintong Yu, Hongming Zhang, Yangqiu Song, Yan Song, Changshui Zhang
Abstract Grounding a pronoun to a visual object it refers to requires complex reasoning from various information sources, especially in conversational scenarios. For example, when people in a conversation talk about something all speakers can see, they often directly use pronouns (e.g., it) to refer to it without previous introduction. This fact brings a huge challenge for modern natural language understanding systems, particularly conventional context-based pronoun coreference models. To tackle this challenge, in this paper, we formally define the task of visual-aware pronoun coreference resolution (PCR) and introduce VisPro, a large-scale dialogue PCR dataset, to investigate whether and how the visual information can help resolve pronouns in dialogues. We then propose a novel visual-aware PCR model, VisCoref, for this task and conduct comprehensive experiments and case studies on our dataset. Results demonstrate the importance of the visual information in this PCR case and show the effectiveness of the proposed model.
Tasks Coreference Resolution
Published 2019-09-01
URL https://arxiv.org/abs/1909.00421v1
PDF https://arxiv.org/pdf/1909.00421v1.pdf
PWC https://paperswithcode.com/paper/what-you-see-is-what-you-get-visual-pronoun
Repo
Framework

Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network

Title Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network
Authors Sunil Kumar Sahu, Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou
Abstract Inter-sentence relation extraction deals with a number of complex semantic relationships in documents, which require local, non-local, syntactic and semantic dependencies. Existing methods do not fully exploit such dependencies. We present a novel inter-sentence relation extraction model that builds a labelled edge graph convolutional neural network model on a document-level graph. The graph is constructed using various inter- and intra-sentence dependencies to capture local and non-local dependency information. In order to predict the relation of an entity pair, we utilise multi-instance learning with bi-affine pairwise scoring. Experimental results show that our model achieves comparable performance to the state-of-the-art neural models on two biochemistry datasets. Our analysis shows that all the types in the graph are effective for inter-sentence relation extraction.
Tasks Relation Extraction
Published 2019-06-11
URL https://arxiv.org/abs/1906.04684v1
PDF https://arxiv.org/pdf/1906.04684v1.pdf
PWC https://paperswithcode.com/paper/inter-sentence-relation-extraction-with
Repo
Framework
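
The sketch below captures the core of a labelled-edge graph convolution, one weight matrix per edge type aggregated over a document-level graph; the node counts, dimensions, and edge types are toy assumptions, not the paper's configuration.

```python
# Labelled-edge graph convolution layer: per-edge-type weights aggregated
# over a document-level graph (toy dimensions).
import torch
import torch.nn as nn

class LabelledEdgeGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_edge_types):
        super().__init__()
        self.weights = nn.Parameter(
            torch.randn(num_edge_types, in_dim, out_dim) * 0.01)
        self.self_loop = nn.Linear(in_dim, out_dim)

    def forward(self, x, adjs):
        # x: (N, in_dim) node features; adjs: (num_edge_types, N, N) adjacencies
        out = self.self_loop(x)
        for t in range(adjs.shape[0]):
            deg = adjs[t].sum(-1, keepdim=True).clamp(min=1.0)
            out = out + (adjs[t] / deg) @ x @ self.weights[t]
        return torch.relu(out)

layer = LabelledEdgeGCNLayer(in_dim=32, out_dim=32, num_edge_types=4)
nodes = torch.randn(10, 32)                    # e.g. word/mention nodes
adjs = (torch.rand(4, 10, 10) > 0.8).float()   # syntactic, coref, ... edge types
print(layer(nodes, adjs).shape)                # torch.Size([10, 32])
```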

Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

Title Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients
Authors Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu
Abstract Hamiltonian Monte Carlo (HMC) is a state-of-the-art Markov chain Monte Carlo sampling algorithm for drawing samples from smooth probability densities over continuous spaces. We study the variant most widely used in practice, Metropolized HMC with the Störmer-Verlet or leapfrog integrator, and make two primary contributions. First, we provide a non-asymptotic upper bound on the mixing time of the Metropolized HMC with explicit choices of stepsize and number of leapfrog steps. This bound gives a precise quantification of the faster convergence of Metropolized HMC relative to simpler MCMC algorithms such as the Metropolized random walk, or Metropolized Langevin algorithm. Second, we provide a general framework for sharpening mixing time bounds of Markov chains initialized at a substantial distance from the target distribution over continuous spaces. We apply this sharpening device to the Metropolized random walk and Langevin algorithms, thereby obtaining improved mixing time bounds from a non-warm initial distribution.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12247v1
PDF https://arxiv.org/pdf/1905.12247v1.pdf
PWC https://paperswithcode.com/paper/fast-mixing-of-metropolized-hamiltonian-monte
Repo
Framework
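
For reference, here is Metropolized HMC with the leapfrog integrator on a standard Gaussian target, a minimal instance of the algorithm the paper analyses; the stepsize and number of leapfrog steps are chosen arbitrarily here.

```python
# Metropolized HMC with the leapfrog (Störmer-Verlet) integrator
# on a standard Gaussian target.
import numpy as np

def log_density(x):          # log p(x) up to a constant
    return -0.5 * np.dot(x, x)

def grad_log_density(x):
    return -x

def hmc_step(x, step_size=0.2, n_leapfrog=10, rng=np.random.default_rng()):
    p = rng.normal(size=x.shape)                         # resample momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * step_size * grad_log_density(x_new)   # initial half step
    for _ in range(n_leapfrog - 1):
        x_new += step_size * p_new
        p_new += step_size * grad_log_density(x_new)
    x_new += step_size * p_new
    p_new += 0.5 * step_size * grad_log_density(x_new)   # final half step
    # Metropolis accept/reject on the joint (position, momentum) energy.
    log_accept = (log_density(x_new) - 0.5 * np.dot(p_new, p_new)) \
               - (log_density(x) - 0.5 * np.dot(p, p))
    return x_new if np.log(rng.uniform()) < log_accept else x

x = np.zeros(5)
samples = []
for _ in range(2000):
    x = hmc_step(x)
    samples.append(x.copy())
print(np.mean(samples, axis=0), np.var(samples, axis=0))  # ~0 mean, ~1 variance
```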

Ellipsis and Coreference Resolution as Question Answering

Title Ellipsis and Coreference Resolution as Question Answering
Authors Rahul Aralikatte, Matthew Lamm, Daniel Hardt, Anders Søgaard
Abstract Coreference and many forms of ellipsis are similar to reading comprehension questions, in that in order to resolve these, we need to identify an appropriate text span in the previous discourse. This paper exploits this analogy and proposes to use an architecture developed for machine comprehension for ellipsis and coreference resolution. We present both single-task and joint models and evaluate them across standard benchmarks, outperforming the current state of the art for ellipsis by up to 48.5% error reduction – and for coreference by 37.5% error reduction.
Tasks Coreference Resolution, Question Answering
Published 2019-08-29
URL https://arxiv.org/abs/1908.11141v1
PDF https://arxiv.org/pdf/1908.11141v1.pdf
PWC https://paperswithcode.com/paper/ellipsis-and-coreference-resolution-as
Repo
Framework
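
A minimal sketch of the "resolution as question answering" framing: ask a reading-comprehension model which earlier span a pronoun refers to. The checkpoint and the question template are placeholders, not the models or prompts used in the paper.

```python
# Cast pronoun resolution as extractive QA over the preceding discourse.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("Sophie lent her bicycle to Marta last week, "
           "and she still has not returned it.")
question = 'In the passage, what does the pronoun "it" refer to?'

answer = qa(question=question, context=context)
print(answer["answer"], answer["score"])   # ideally a span like "her bicycle"
```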

3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation

Title 3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation
Authors Yunlu Chen, Thomas Mensink, Efstratios Gavves
Abstract A key challenge for RGB-D segmentation is how to effectively incorporate 3D geometric information from the depth channel into 2D appearance features. We propose to model the effective receptive field of 2D convolution based on the scale and locality from the 3D neighborhood. Standard convolutions are local in the image space ($u, v$), often with a fixed receptive field of 3x3 pixels. We propose to define convolutions local with respect to the corresponding point in the 3D real-world space ($x, y, z$), where the depth channel is used to adapt the receptive field of the convolution, which makes the resulting filters invariant to scale and focused on a certain range of depth. We introduce 3D Neighborhood Convolution (3DN-Conv), a convolutional operator around 3D neighborhoods. Further, we can use estimated depth to apply our RGB-D based semantic segmentation model to RGB input. Experimental results validate that our proposed 3DN-Conv operator improves semantic segmentation, using either ground-truth depth (RGB-D) or estimated depth (RGB).
Tasks Semantic Segmentation
Published 2019-10-03
URL https://arxiv.org/abs/1910.01460v1
PDF https://arxiv.org/pdf/1910.01460v1.pdf
PWC https://paperswithcode.com/paper/3d-neighborhood-convolution-learning-depth
Repo
Framework

A Three-Feature Model to Predict Colour Change Blindness

Title A Three-Feature Model to Predict Colour Change Blindness
Authors Steven Le Moan, Marius Pedersen
Abstract Change blindness is a striking shortcoming of our visual system which is exploited in the popular “Spot the difference” game. It makes us unable to notice large visual changes happening right before our eyes and illustrates the fact that we see much less than we think we do. We introduce a fully automated model to predict colour change blindness in cartoon images based on two low-level image features and observer experience. Using linear regression with only three parameters, the predictions of the proposed model correlate significantly with measured detection times. We also demonstrate the efficacy of the model to classify stimuli in terms of difficulty.
Tasks
Published 2019-08-25
URL https://arxiv.org/abs/1909.04147v1
PDF https://arxiv.org/pdf/1909.04147v1.pdf
PWC https://paperswithcode.com/paper/a-three-feature-model-to-predict-colour
Repo
Framework
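
A minimal sketch of fitting a small linear model to detection times; the feature values and times below are made up, and the paper's actual predictors are two low-level image features plus observer experience.

```python
# Fit a small linear model to (made-up) change-detection times.
import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: [image feature 1, image feature 2, observer experience] (toy values).
X = np.array([
    [0.12, 0.80, 1.0],
    [0.45, 0.30, 0.0],
    [0.33, 0.55, 1.0],
    [0.70, 0.20, 0.0],
    [0.25, 0.65, 1.0],
])
detection_time = np.array([4.1, 9.8, 5.6, 12.3, 4.9])   # seconds, made up

model = LinearRegression().fit(X, detection_time)
print(model.coef_, model.intercept_)
print(model.predict([[0.5, 0.4, 1.0]]))
```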

When does Diversity Help Generalization in Classification Ensembles?

Title When does Diversity Help Generalization in Classification Ensembles?
Authors Yijun Bian, Huanhuan Chen
Abstract Ensembles, as a widely used and effective technique in the machine learning community, succeed thanks to a key element: “diversity.” The relationship between diversity and generalization, unfortunately, is not entirely understood and remains an open research issue. To reveal the effect of diversity on the generalization of classification ensembles, we investigate three issues on diversity, i.e., the measurement of diversity, the relationship between the proposed diversity and generalization error, and the utilization of this relationship for ensemble pruning. In the diversity measurement, we measure diversity by an error decomposition inspired by regression ensembles, which decomposes the error of classification ensembles into accuracy and diversity. Then we formulate the relationship between the measured diversity and ensemble performance through the theorem of margin and generalization, and observe that the generalization error is reduced effectively only when the measured diversity is increased in a few specific ranges, while in other ranges larger diversity is less beneficial for increasing the generalization of an ensemble. Besides, we propose a pruning method based on diversity management to utilize this relationship, which can increase diversity appropriately and shrink the size of the ensemble with non-decreasing performance. The experiments validate the effectiveness of the proposed relationship between the measured diversity and the ensemble generalization error.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.13631v1
PDF https://arxiv.org/pdf/1910.13631v1.pdf
PWC https://paperswithcode.com/paper/when-does-diversity-help-generalization-in
Repo
Framework
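
The regression-ensemble error decomposition that inspires the paper's diversity measure can be checked numerically: with uniform averaging, the ensemble's squared error equals the average member error minus the average ambiguity (the spread of members around the ensemble mean).

```python
# Numerical check of the regression-ensemble ambiguity decomposition:
#   ensemble error = average member error - average ambiguity (diversity).
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(size=200)                          # targets
preds = y + rng.normal(scale=0.5, size=(5, 200))  # 5 noisy ensemble members

ens = preds.mean(axis=0)                          # simple averaging ensemble
ens_err = np.mean((ens - y) ** 2)
avg_member_err = np.mean((preds - y) ** 2)
avg_ambiguity = np.mean((preds - ens) ** 2)       # member spread = diversity

print(ens_err, avg_member_err - avg_ambiguity)    # the two numbers coincide
```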

NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language

Title NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language
Authors Leon Weber, Pasquale Minervini, Jannes Münchmeyer, Ulf Leser, Tim Rocktäschel
Abstract Rule-based models are attractive for various tasks because they inherently lead to interpretable and explainable decisions and can easily incorporate prior knowledge. However, such systems are difficult to apply to problems involving natural language, due to its linguistic variability. In contrast, neural models can cope very well with ambiguity by learning distributed representations of words and their composition from data, but lead to models that are difficult to interpret. In this paper, we describe a model combining neural networks with logic programming in a novel manner for solving multi-hop reasoning tasks over natural language. Specifically, we propose to use a Prolog prover which we extend to utilize a similarity function over pretrained sentence encoders. We fine-tune the representations for the similarity function via backpropagation. This leads to a system that can apply rule-based reasoning to natural language, and induce domain-specific rules from training data. We evaluate the proposed system on two different question answering tasks, showing that it outperforms two baselines, BIDAF (Seo et al., 2016a) and FAST QA (Weissenborn et al., 2017b), on a subset of the WikiHop corpus and achieves competitive results on the MedHop data set (Welbl et al., 2017).
Tasks Question Answering
Published 2019-06-14
URL https://arxiv.org/abs/1906.06187v1
PDF https://arxiv.org/pdf/1906.06187v1.pdf
PWC https://paperswithcode.com/paper/nlprolog-reasoning-with-weak-unification-for-1
Repo
Framework
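
A hedged sketch of the weak-unification idea: two predicates unify when their embeddings are similar enough. A character n-gram vectoriser stands in for the pretrained sentence encoders used by NLProlog, and the threshold is arbitrary.

```python
# Weak unification via embedding similarity (crude stand-in for a sentence encoder).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

facts = ["is located in", "is the capital of", "plays for", "was born in"]
query = "is situated in"

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 4))
X = vec.fit_transform(facts + [query]).toarray()   # rows are L2-normalised

sims = X[:-1] @ X[-1]                  # cosine similarity of query to each fact
threshold = 0.2                        # unification threshold (arbitrary)
for fact, s in zip(facts, sims):
    print(f"{s:.2f}  {'unifies' if s > threshold else 'fails'}  {fact!r}")
```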