January 31, 2020

Paper Group AWR 390

Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers


Title	Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers
Authors	Divyat Mahajan, Chenhao Tan, Amit Sharma
Abstract	Explaining the output of a complex machine learning (ML) model often requires approximation using a simpler model. To construct interpretable explanations that are also consistent with the original ML model, counterfactual examples (showing how the model’s output changes with small perturbations to the input) have been proposed. This paper extends the work in counterfactual explanations by addressing the challenge of the feasibility of such examples. For explanations of ML models in critical domains such as healthcare and finance, counterfactual examples are useful for an end-user only to the extent that perturbation of feature inputs is feasible in the real world. We formulate the problem of feasibility as preserving causal relationships among input features and present a method that uses (partial) structural causal models to generate actionable counterfactuals. When feasibility constraints may not be easily expressed, we propose an alternative method that optimizes for feasibility as people interact with its output and provide oracle-like feedback. Our experiments on MNIST, Bayesian networks and the widely used “Adult” dataset show that our proposed methods can generate counterfactual explanations that satisfy feasibility constraints better than previous approaches. The code repository can be accessed here: https://github.com/divyat09/cf-feasibility
Tasks	Decision Making
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03277v2
PDF	https://arxiv.org/pdf/1912.03277v2.pdf
PWC	https://paperswithcode.com/paper/preserving-causal-constraints-in
Repo	https://github.com/microsoft/DiCE
Framework	tf

A fast, complete, point cloud based loop closure for LiDAR odometry and mapping


Title	A fast, complete, point cloud based loop closure for LiDAR odometry and mapping
Authors	Jiarong Lin, Fu Zhang
Abstract	This paper presents a loop closure method to correct the long-term drift in LiDAR odometry and mapping (LOAM). Our proposed method computes a 2D histogram for each keyframe (a local map patch) and uses the normalized cross-correlation of the 2D histograms as the similarity metric between the current keyframe and those in the map. We show that this method is fast, invariant to rotation, and produces reliable and accurate loop detection. The proposed method is implemented with careful engineering and integrated into the LOAM algorithm, forming a complete and practical system ready to use. To benefit the community by serving as a benchmark for loop closure, the entire system is made open source on GitHub.
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11811v1
PDF	https://arxiv.org/pdf/1909.11811v1.pdf
PWC	https://paperswithcode.com/paper/a-fast-complete-point-cloud-based-loop
Repo	https://github.com/hku-mars/loam_livox
Framework	tf
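
The similarity metric named in the abstract above, normalized cross-correlation (NCC) between 2D histograms, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the `threshold` value, and the list-of-rows histogram layout are all assumptions, and the sketch does not attempt the rotation invariance the paper describes.

```python
import math

def normalized_cross_correlation(a, b):
    """NCC between two equally sized 2D histograms given as lists of rows."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("histograms must be non-empty and the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance-like numerator and the product of the two spreads.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    # A flat histogram has zero spread; treat it as uncorrelated.
    return num / den if den > 0 else 0.0

def best_loop_candidate(current, keyframes, threshold=0.9):
    """Return (index, score) of the most similar stored keyframe, or None.

    `threshold` is a hypothetical acceptance level, not a value from the paper.
    """
    best = None
    for index, histogram in enumerate(keyframes):
        score = normalized_cross_correlation(current, histogram)
        if score >= threshold and (best is None or score > best[1]):
            best = (index, score)
    return best

# Toy usage: the second stored keyframe is a scaled copy of the current one.
current = [[1, 2], [3, 4]]
stored = [[[4, 3], [2, 1]], [[2, 4], [6, 8]]]
print(best_loop_candidate(current, stored))
```

Because NCC subtracts the mean and divides by the spread, it ignores uniform scaling of the bin counts, which is why a denser scan of the same place can still match.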

Learning a smooth kernel regularizer for convolutional neural networks


Title	Learning a smooth kernel regularizer for convolutional neural networks
Authors	Reuben Feinman, Brenden M. Lake
Abstract	Modern deep neural networks require a tremendous amount of data to train, often needing hundreds or thousands of labeled examples to learn an effective representation. For these networks to work with less data, more structure must be built into their architectures or learned from previous experience. The learned weights of convolutional neural networks (CNNs) trained on large datasets for object recognition contain a substantial amount of structure. These representations have parallels to simple cells in the primary visual cortex, where receptive fields are smooth and contain many regularities. Incorporating smoothness constraints over the kernel weights of modern CNN architectures is a promising way to improve their sample complexity. We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights. The correlation parameters of this regularizer are learned from previous experience, yielding a method with a hierarchical Bayesian interpretation. We show that our correlated regularizer can help constrain models for visual recognition, improving over an L2 regularization baseline.
Tasks	L2 Regularization, Object Recognition
Published	2019-03-05
URL	http://arxiv.org/abs/1903.01882v1
PDF	http://arxiv.org/pdf/1903.01882v1.pdf
PWC	https://paperswithcode.com/paper/learning-a-smooth-kernel-regularizer-for
Repo	https://github.com/rfeinman/SK-regularization
Framework	tf
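
To illustrate why a spatially aware penalty can separate kernels that L2 regularization treats identically, here is a small sketch. It does not reproduce the paper's learned, correlated regularizer: as a stand-in it uses a fixed neighbor-difference penalty (the quadratic form of a grid Laplacian), and both function names are hypothetical.

```python
def l2_penalty(kernel):
    """Standard L2 penalty: sum of squared weights, blind to spatial layout."""
    return sum(w * w for row in kernel for w in row)

def smoothness_penalty(kernel):
    """Sum of squared differences between horizontally and vertically
    adjacent weights. Smooth kernels score low, noisy kernels score high."""
    rows, cols = len(kernel), len(kernel[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:
                total += (kernel[i][j] - kernel[i][j + 1]) ** 2
            if i + 1 < rows:
                total += (kernel[i][j] - kernel[i + 1][j]) ** 2
    return total

# Two 3x3 kernels with the same L2 norm but very different spatial structure.
smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checkerboard = [[1, -1, 1], [-1, 1, -1], [1, -1, 1]]
print(l2_penalty(smooth), l2_penalty(checkerboard))
print(smoothness_penalty(smooth), smoothness_penalty(checkerboard))
```

Both kernels have the same L2 penalty, yet only the checkerboard is penalized by the neighbor-difference term. The paper's approach differs in that its correlation parameters are learned from previous experience rather than fixed by hand.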

Developing a Fine-Grained Corpus for a Less-resourced Language: the case of Kurdish


Title	Developing a Fine-Grained Corpus for a Less-resourced Language: the case of Kurdish
Authors	Roshna Omer Abdulrahman, Hossein Hassani, Sina Ahmadi
Abstract	Kurdish is a less-resourced language consisting of different dialects written in various scripts. Approximately 30 million people in different countries speak the language. The lack of corpora is one of the main obstacles in Kurdish language processing. In this paper, we present KTC, the Kurdish Textbooks Corpus, which is composed of 31 K-12 textbooks in the Sorani dialect. The corpus is normalized and categorized into 12 educational subjects containing 693,800 tokens (110,297 types). Our resource is publicly available for non-commercial use under the CC BY-NC-SA 4.0 license.
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11467v1
PDF	https://arxiv.org/pdf/1909.11467v1.pdf
PWC	https://paperswithcode.com/paper/developing-a-fine-grained-corpus-for-a-less-1
Repo	https://github.com/KurdishBLARK/KTC
Framework	none

Evaluating Recurrent Neural Network Explanations


Title	Evaluating Recurrent Neural Network Explanations
Authors	Leila Arras, Ahmed Osman, Klaus-Robert Müller, Wojciech Samek
Abstract	Recently, several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular of LSTMs. The goal of these methods is to understand the network’s decisions by assigning to each input variable, e.g., a word, a relevance score indicating the extent to which it contributed to a particular prediction. In previous works, some of these methods were not yet compared to one another, or were evaluated only qualitatively. We close this gap by systematically and quantitatively comparing these methods in different settings, namely (1) a toy arithmetic task which we use as a sanity check, (2) a five-class sentiment prediction of movie reviews, and (3) an exploration of the usefulness of word relevances for building sentence-level representations. Lastly, using the method that performed best in our experiments, we show how specific linguistic phenomena such as negation in sentiment analysis are reflected in relevance patterns, and how the relevance visualization can help to understand the misclassification of individual samples.
Tasks	Sentiment Analysis
Published	2019-04-26
URL	https://arxiv.org/abs/1904.11829v3
PDF	https://arxiv.org/pdf/1904.11829v3.pdf
PWC	https://paperswithcode.com/paper/evaluating-recurrent-neural-network
Repo	https://github.com/ArrasL/LRP_for_LSTM
Framework	none

Physics-Constrained Deep Learning for High-dimensional Surrogate Modeling and Uncertainty Quantification without Labeled Data


Title	Physics-Constrained Deep Learning for High-dimensional Surrogate Modeling and Uncertainty Quantification without Labeled Data
Authors	Yinhao Zhu, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis, Paris Perdikaris
Abstract	Surrogate modeling and uncertainty quantification tasks for PDE systems are most often considered as supervised learning problems where input and output data pairs are used for training. The construction of such emulators is by definition a small data problem which poses challenges to deep learning approaches that have been developed to operate in the big data regime. Even in cases where such models have been shown to have good predictive capability in high dimensions, they fail to address constraints in the data implied by the PDE model. This paper provides a methodology that incorporates the governing equations of the physical model in the loss/likelihood functions. The resulting physics-constrained, deep learning models are trained without any labeled data (e.g. employing only input data) and provide predictive responses comparable to those of data-driven models while obeying the constraints of the problem at hand. This work employs a convolutional encoder-decoder neural network approach as well as a conditional flow-based generative model for the solution of PDEs, surrogate model construction, and uncertainty quantification tasks. The methodology is posed as a minimization problem of the reverse Kullback-Leibler (KL) divergence between the model predictive density and the reference conditional density, where the latter is defined as the Boltzmann-Gibbs distribution at a given inverse temperature with the underlying potential relating to the PDE system of interest. The generalization capability of these models to out-of-distribution input is considered. Quantification and interpretation of the predictive uncertainty is provided for a number of problems.
Tasks
Published	2019-01-18
URL	http://arxiv.org/abs/1901.06314v1
PDF	http://arxiv.org/pdf/1901.06314v1.pdf
PWC	https://paperswithcode.com/paper/physics-constrained-deep-learning-for-high
Repo	https://github.com/cics-nd/pde-surrogate
Framework	pytorch
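
The reverse KL objective described in the abstract above can be written out as a short derivation. The notation is an assumption made here for illustration and is not taken from the paper: $x$ is the input, $y$ the output field, $p_\theta$ the model predictive density, $V$ the potential tied to the PDE, $\beta$ the inverse temperature, and $Z_\beta$ the normalizing constant.

```latex
p_\beta(y \mid x) = \frac{\exp\!\big(-\beta\, V(y, x)\big)}{Z_\beta(x)}, \qquad
D_{\mathrm{KL}}\big(p_\theta(y \mid x)\,\|\,p_\beta(y \mid x)\big)
  = \mathbb{E}_{p_\theta}\big[\log p_\theta(y \mid x)\big]
  + \beta\,\mathbb{E}_{p_\theta}\big[V(y, x)\big]
  + \log Z_\beta(x)
```

Since $\log Z_\beta(x)$ does not depend on $\theta$, minimizing this divergence only requires samples from the model and evaluations of the potential, which is consistent with the abstract's claim that training needs no labeled output data.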

Dual Attention Networks for Visual Reference Resolution in Visual Dialog


Title	Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Authors	Gi-Cheon Kang, Jaeseo Lim, Byoung-Tak Zhang
Abstract	Visual dialog (VisDial) is a task which requires an AI agent to answer a series of questions grounded in an image. Unlike in visual question answering (VQA), the series of questions should be able to capture a temporal context from a dialog history and exploit visually-grounded information. A problem called visual reference resolution involves these challenges, requiring the agent to resolve ambiguous references in a given question and find the references in a given image. In this paper, we propose Dual Attention Networks (DAN) for visual reference resolution. DAN consists of two kinds of attention networks, REFER and FIND. Specifically, the REFER module learns latent relationships between a given question and a dialog history by employing a self-attention mechanism. The FIND module takes image features and reference-aware representations (i.e., the output of the REFER module) as input, and performs visual grounding via a bottom-up attention mechanism. We qualitatively and quantitatively evaluate our model on the VisDial v1.0 and v0.9 datasets, showing that DAN outperforms the previous state-of-the-art model by a significant margin.
Tasks	Question Answering, Visual Dialog, Visual Question Answering
Published	2019-02-25
URL	https://arxiv.org/abs/1902.09368v3
PDF	https://arxiv.org/pdf/1902.09368v3.pdf
PWC	https://paperswithcode.com/paper/dual-attention-networks-for-visual-reference
Repo	https://github.com/gicheonkang/DAN-VisDial
Framework	pytorch

Towards meta-interpretive learning of programming language semantics


Title	Towards meta-interpretive learning of programming language semantics
Authors	Sándor Bartha, James Cheney
Abstract	We introduce a new application for inductive logic programming: learning the semantics of programming languages from example evaluations. In this short paper, we explored a simplified task in this domain using the Metagol meta-interpretive learning system. We highlighted the challenging aspects of this scenario, including abstracting over function symbols, nonterminating examples, and learning non-observed predicates, and proposed extensions to Metagol helpful for overcoming these challenges, which may prove useful in other domains.
Tasks
Published	2019-07-20
URL	https://arxiv.org/abs/1907.08834v1
PDF	https://arxiv.org/pdf/1907.08834v1.pdf
PWC	https://paperswithcode.com/paper/towards-meta-interpretive-learning-of
Repo	https://github.com/barthasanyi/metagol_PLS
Framework	none

When to Trust Your Model: Model-Based Policy Optimization


Title	When to Trust Your Model: Model-Based Policy Optimization
Authors	Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
Abstract	Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data. In this paper, we study the role of model usage in policy optimization both theoretically and empirically. We first formulate and analyze a model-based reinforcement learning algorithm with a guarantee of monotonic improvement at each step. In practice, this analysis is overly pessimistic and suggests that real off-policy data is always preferable to model-generated on-policy data, but we show that an empirical estimate of model generalization can be incorporated into such analysis to justify model usage. Motivated by this analysis, we then demonstrate that a simple procedure of using short model-generated rollouts branched from real data has the benefits of more complicated model-based algorithms without the usual pitfalls. In particular, this approach surpasses the sample efficiency of prior model-based methods, matches the asymptotic performance of the best model-free algorithms, and scales to horizons that cause other model-based methods to fail entirely.
Tasks
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08253v2
PDF	https://arxiv.org/pdf/1906.08253v2.pdf
PWC	https://paperswithcode.com/paper/when-to-trust-your-model-model-based-policy
Repo	https://github.com/JannerM/mbpo
Framework	none
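
The "short model-generated rollouts branched from real data" procedure named in the abstract above can be sketched roughly as below. This is an illustrative outline under stated assumptions, not the authors' code: `branched_rollouts`, `model_step`, and `policy` are hypothetical names, and the learned dynamics model is replaced by a toy function.

```python
import random

def branched_rollouts(real_states, model_step, policy, horizon, num_rollouts, rng):
    """Generate short model rollouts that each start from a real state.

    real_states : states previously observed in the true environment
    model_step  : callable (state, action) -> (next_state, reward), standing in
                  for a learned dynamics model
    policy      : callable state -> action
    horizon     : number of model steps per rollout (kept short to limit
                  compounding model error)
    """
    model_buffer = []
    for _ in range(num_rollouts):
        # Branch point: sample a start state from real data, not from the model.
        state = rng.choice(real_states)
        for _ in range(horizon):
            action = policy(state)
            next_state, reward = model_step(state, action)
            model_buffer.append((state, action, reward, next_state))
            state = next_state
    return model_buffer

# Toy usage with a hand-written "model" on integer states.
def toy_model(state, action):
    next_state = state + action
    return next_state, -abs(next_state)

def toy_policy(state):
    return -1 if state > 0 else 1

buffer = branched_rollouts([0, 5, -3], toy_model, toy_policy,
                           horizon=2, num_rollouts=4, rng=random.Random(0))
print(len(buffer))
```

The transitions collected this way would then be used for policy optimization alongside real data; that training step is omitted here.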

Context-Aware Visual Compatibility Prediction


Title	Context-Aware Visual Compatibility Prediction
Authors	Guillem Cucurull, Perouz Taslakian, David Vazquez
Abstract	How do we determine whether two or more clothing items are compatible or visually appealing? Part of the answer lies in an understanding of visual aesthetics, and is biased by personal preferences shaped by social attitudes, time, and place. In this work we propose a method that predicts compatibility between two items based on their visual features, as well as their context. We define context as the products that are known to be compatible with each of these items. Our model is in contrast to other metric learning approaches that rely on pairwise comparisons between item features alone. We address the compatibility prediction problem using a graph neural network that learns to generate product embeddings conditioned on their context. We present results for two prediction tasks (fill in the blank and outfit compatibility) tested on two fashion datasets, Polyvore and Fashion-Gen, and on a subset of the Amazon dataset; we achieve state-of-the-art results when using context information and show how test performance improves as more context is used.
Tasks	Metric Learning
Published	2019-02-10
URL	http://arxiv.org/abs/1902.03646v2
PDF	http://arxiv.org/pdf/1902.03646v2.pdf
PWC	https://paperswithcode.com/paper/context-aware-visual-compatibility-prediction
Repo	https://github.com/gcucurull/visual-compatibility
Framework	tf

CoLight: Learning Network-level Cooperation for Traffic Signal Control


Title	CoLight: Learning Network-level Cooperation for Traffic Signal Control
Authors	Hua Wei, Nan Xu, Huichu Zhang, Guanjie Zheng, Xinshi Zang, Chacha Chen, Weinan Zhang, Yanmin Zhu, Kai Xu, Zhenhui Li
Abstract	Cooperation among the traffic signals enables vehicles to move through intersections more quickly. Conventional transportation approaches implement cooperation by pre-calculating the offsets between two intersections. Such pre-calculated offsets are not suitable for dynamic traffic environments. To enable cooperation of traffic signals, in this paper, we propose a model, CoLight, which uses graph attentional networks to facilitate communication. Specifically, for a target intersection in a network, CoLight can not only incorporate the temporal and spatial influences of neighboring intersections on the target intersection, but also build up index-free modeling of neighboring intersections. To the best of our knowledge, we are the first to use graph attentional networks in the setting of reinforcement learning for traffic signal control and to conduct experiments on a large-scale road network with hundreds of traffic signals. In experiments, we demonstrate that by learning the communication, the proposed model can achieve superior performance over the state-of-the-art methods.
Tasks	Multi-agent Reinforcement Learning
Published	2019-05-11
URL	https://arxiv.org/abs/1905.05717v2
PDF	https://arxiv.org/pdf/1905.05717v2.pdf
PWC	https://paperswithcode.com/paper/colight-learning-network-level-cooperation
Repo	https://github.com/PKU-AI-Edge/DGN
Framework	tf

Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper)


Title	Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper)
Authors	Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria
Abstract	Sarcasm is often expressed through several verbal and non-verbal cues, e.g., a change of tone, overemphasis in a word, a drawn-out syllable, or a straight-looking face. Most of the recent work in sarcasm detection has been carried out on textual data. In this paper, we argue that incorporating multimodal cues can improve the automatic classification of sarcasm. As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows. MUStARD consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context of historical utterances in the dialogue, which provides additional information on the scenario where the utterance occurs. Our initial results show that the use of multimodal information can reduce the relative error rate of sarcasm detection by up to 12.9% in F-score when compared to the use of individual modalities. The full dataset is publicly available for use at https://github.com/soujanyaporia/MUStARD
Tasks	Sarcasm Detection
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01815v1
PDF	https://arxiv.org/pdf/1906.01815v1.pdf
PWC	https://paperswithcode.com/paper/towards-multimodal-sarcasm-detection-an
Repo	https://github.com/soujanyaporia/MUStARD
Framework	pytorch

Introducing MANtIS: a novel Multi-Domain Information Seeking Dialogues Dataset


Title	Introducing MANtIS: a novel Multi-Domain Information Seeking Dialogues Dataset
Authors	Gustavo Penha, Alexandru Balan, Claudia Hauff
Abstract	Conversational search is an approach to information retrieval (IR), where users engage in a dialogue with an agent in order to satisfy their information needs. Previous conceptual work described properties and actions a good agent should exhibit. Unlike them, we present a novel conceptual model defined in terms of conversational goals, which enables us to reason about current research practices in conversational search. Based on the literature, we elicit how existing tasks and test collections from the fields of IR, natural language processing (NLP) and dialogue systems (DS) fit into this model. We describe a set of characteristics that an ideal conversational search dataset should have. Lastly, we introduce MANtIS (the code and dataset are available at https://guzpenha.github.io/MANtIS/), a large-scale dataset containing multi-domain and grounded information seeking dialogues that fulfill all of our dataset desiderata. We provide baseline results for the conversation response ranking and user intent prediction tasks.
Tasks	Information Retrieval
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04639v1
PDF	https://arxiv.org/pdf/1912.04639v1.pdf
PWC	https://paperswithcode.com/paper/introducing-mantis-a-novel-multi-domain
Repo	https://github.com/Guzpenha/MANtIS
Framework	none

Metre as a stylometric feature in Latin hexameter poetry


Title	Metre as a stylometric feature in Latin hexameter poetry
Authors	Benjamin Nagy
Abstract	This paper demonstrates that metre is a privileged indicator of authorial style in classical Latin hexameter poetry. Using only metrical features, pairwise classification experiments are performed between 5 first-century authors (10 comparisons) using four different machine-learning models. The results showed a two-label classification accuracy of at least 95% with samples as small as ten lines and no greater than eighty lines (up to around 500 words). These sample sizes are an order of magnitude smaller than those typically recommended for BOW (‘bag of words’) or n-gram approaches, and the reported accuracy is outstanding. Additionally, this paper explores the potential for novelty (forgery) detection, or ‘one-class classification’. An analysis of the disputed Aldine Additamentum (Sil. Ital. Puni. 8:144-225) concludes (p=0.0013) that the metrical style differs significantly from that of the rest of the poem.
Tasks
Published	2019-11-28
URL	https://arxiv.org/abs/1911.12478v2
PDF	https://arxiv.org/pdf/1911.12478v2.pdf
PWC	https://paperswithcode.com/paper/metre-as-a-stylometric-feature-in-latin
Repo	https://github.com/bnagy/hexml-paper
Framework	none

Realizing Petabyte Scale Acoustic Modeling


Title	Realizing Petabyte Scale Acoustic Modeling
Authors	Sree Hari Krishnan Parthasarathi, Nitin Sivakrishnan, Pranav Ladkat, Nikko Strom
Abstract	Large scale machine learning (ML) systems such as the Alexa automatic speech recognition (ASR) system continue to improve with increasing amounts of manually transcribed training data. Instead of scaling manual transcription to impractical levels, we utilize semi-supervised learning (SSL) to learn acoustic models (AM) from the vast firehose of untranscribed audio data. Learning an AM from 1 million hours of audio presents unique ML and system design challenges. We present the design and evaluation of a highly scalable and resource efficient SSL system for AM. Employing the student/teacher learning paradigm, we focus on the student learning subsystem: a scalable and robust data pipeline that generates features and targets from raw audio, and an efficient model pipeline, including the distributed trainer, that builds a student model. Our evaluations show that, even without extensive hyper-parameter tuning, we obtain relative accuracy improvements in the 10 to 20% range, with higher gains in noisier conditions. The end-to-end processing time of this SSL system was 12 days, and several components in this system can trivially scale linearly with more compute resources.
Tasks	Speech Recognition
Published	2019-04-24
URL	http://arxiv.org/abs/1904.10584v1
PDF	http://arxiv.org/pdf/1904.10584v1.pdf
PWC	https://paperswithcode.com/paper/realizing-petabyte-scale-acoustic-modeling
Repo	https://github.com/OpenSourceAI/sota_server
Framework	none