January 31, 2020


Paper Group AWR 424

On the computation of counterfactual explanations – A survey

Title On the computation of counterfactual explanations – A survey
Authors André Artelt, Barbara Hammer
Abstract Due to the increasing use of machine learning in practice, it becomes more and more important to be able to explain the predictions and behavior of machine learning models. Counterfactual explanations are one such instance: they provide intuitive and useful explanations of machine learning models. In this survey we review model-specific methods for efficiently computing counterfactual explanations of many different machine learning models, and we propose methods for models that have not been considered in the literature so far.
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.07749v1
PDF https://arxiv.org/pdf/1911.07749v1.pdf
PWC https://paperswithcode.com/paper/on-the-computation-of-counterfactual
Repo https://github.com/andreArtelt/OnTheComputationOfCounterfactualExplanations
Framework none
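
The survey covers model-specific algorithms, but the underlying idea is easy to illustrate with the generic gradient-based formulation of Wachter et al.: search for a nearby input whose prediction flips to a desired class. Below is a minimal PyTorch sketch of that generic baseline; the model, feature sizes, and hyperparameters are illustrative, not from the survey.

```python
import torch

def counterfactual(model, x, target_class, lam=0.1, steps=500, lr=0.05):
    """Find x_cf close to x (in L1) that the model assigns to target_class."""
    x_cf = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        logits = model(x_cf.unsqueeze(0))
        # classification term flips the prediction; the L1 term keeps the
        # counterfactual close to (and sparse with respect to) x
        loss = (torch.nn.functional.cross_entropy(logits, target)
                + lam * torch.norm(x_cf - x, p=1))
        loss.backward()
        opt.step()
    return x_cf.detach()

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(),
                            torch.nn.Linear(8, 2))
x = torch.randn(4)
x_cf = counterfactual(model, x, target_class=1)
print(model(x_cf.unsqueeze(0)).argmax().item())  # ideally the target class, 1
```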

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

Title The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
Authors Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann
Abstract This paper studies the problem of predicting the distribution over multiple possible future paths of people as they move through various visual scenes. We make two main contributions. The first is a new dataset, created in a realistic 3D simulator, that is based on real-world trajectory data and then extrapolated by human annotators to achieve different latent goals. This provides the first benchmark for quantitative evaluation of models that predict multi-future trajectories. The second is a new model for generating multiple plausible future trajectories, which combines multi-scale location encodings with convolutional RNNs over graphs. We refer to our model as Multiverse. We show that our model achieves the best results on our dataset, as well as on the real-world VIRAT/ActEV dataset (which contains just one possible future).
Tasks Trajectory Prediction
Published 2019-12-13
URL https://arxiv.org/abs/1912.06445v3
PDF https://arxiv.org/pdf/1912.06445v3.pdf
PWC https://paperswithcode.com/paper/the-garden-of-forking-paths-towards-multi
Repo https://github.com/JunweiLiang/ForkingPaths
Framework none
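
The "multi-scale location encoding" can be pictured as discretizing a position over grids of different resolutions: coarse for scene-level context, fine for precise locations. A toy NumPy sketch, with grid sizes chosen for illustration rather than taken from the paper:

```python
import numpy as np

def grid_encoding(xy, extent=(1920, 1080), grid=(12, 8)):
    """One-hot occupancy of a 2D position over a grid of the given size."""
    gx = min(int(xy[0] / extent[0] * grid[0]), grid[0] - 1)
    gy = min(int(xy[1] / extent[1] * grid[1]), grid[1] - 1)
    enc = np.zeros(grid[0] * grid[1])
    enc[gy * grid[0] + gx] = 1.0
    return enc

loc = (960.0, 540.0)
coarse = grid_encoding(loc, grid=(12, 8))   # scene-level cell
fine = grid_encoding(loc, grid=(48, 32))    # precise cell within the scene
print(coarse.argmax(), fine.argmax())
```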

Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae

Title Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae
Authors Piero Molino, Yang Wang, Jiawei Zhang
Abstract Embeddings are a fundamental component of many modern machine learning and natural language processing models. Understanding and visualizing them is essential for gathering insights about the information they capture and the behavior of the models. The state of the art in analyzing embeddings consists of projecting them onto two-dimensional planes without any interpretable semantics associated with the axes of the projection, which makes detailed analyses and comparisons among multiple sets of embeddings challenging. In this work, we propose to use explicit axes defined as algebraic formulae over embeddings to project them into a lower-dimensional but semantically meaningful subspace, as a simple yet effective analysis and visualization methodology. This methodology assigns an interpretable semantics to the measures of variability and the axes of visualizations, allowing for both comparisons among different sets of embeddings and fine-grained inspection of the embedding spaces. We demonstrate the power of the proposed methodology through a series of case studies that make use of visualizations constructed around the underlying methodology and through a user study. The results show how the methodology is effective at providing more profound insights than classical projection methods and how it is widely applicable to many other use cases.
Tasks Word Embeddings
Published 2019-05-28
URL https://arxiv.org/abs/1905.12099v1
PDF https://arxiv.org/pdf/1905.12099v1.pdf
PWC https://paperswithcode.com/paper/parallax-visualizing-and-understanding-the
Repo https://github.com/uber-research/parallax
Framework none
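
The core idea is compact: instead of letting PCA or t-SNE pick opaque axes, the user writes algebraic formulae over embeddings, and the vectors are projected onto the resulting directions. A minimal NumPy sketch with random toy vectors (Parallax itself adds an interactive UI on top of real, pre-trained embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(50)
       for w in ["king", "queen", "man", "woman", "nurse", "doctor"]}

def axis(v):
    """Normalize a formula result into a unit projection axis."""
    return v / np.linalg.norm(v)

# explicit axes defined as algebraic formulae over embeddings
gender_axis = axis(emb["man"] - emb["woman"])
royalty_axis = axis((emb["king"] + emb["queen"]) / 2)

for w, v in emb.items():
    print(f"{w:>8}: gender={v @ gender_axis:+.2f} "
          f"royalty={v @ royalty_axis:+.2f}")
```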

Co-Separating Sounds of Visual Objects

Title Co-Separating Sounds of Visual Objects
Authors Ruohan Gao, Kristen Grauman
Abstract Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel. Current methods for visually-guided audio source separation sidestep the issue by training with artificially mixed video clips, but this puts unwieldy restrictions on training data collection and may even prevent learning the properties of “true” mixed sounds. We introduce a co-separation training paradigm that permits learning object-level sounds from unlabeled multi-source videos. Our novel training objective requires that the deep neural network’s separated audio for similar-looking objects be consistently identifiable, while simultaneously reproducing accurate video-level audio tracks for each source training pair. Our approach disentangles sounds in realistic test videos, even in cases where an object was not observed individually during training. We obtain state-of-the-art results on visually-guided audio source separation and audio denoising for the MUSIC, AudioSet, and AV-Bench datasets.
Tasks Audio Denoising, Denoising
Published 2019-04-16
URL https://arxiv.org/abs/1904.07750v2
PDF https://arxiv.org/pdf/1904.07750v2.pdf
PWC https://paperswithcode.com/paper/co-separating-sounds-of-visual-objects
Repo https://github.com/rhgao/co-separation
Framework pytorch
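
At the heart of the paradigm is a "mix-and-separate" objective over pairs of unlabeled videos: the audio of two videos is mixed, a network separates one track per detected visual object, and each video's separated tracks must sum back to that video's original audio. A hedged sketch of that reconstruction term, with a toy stand-in for the separation network (the paper additionally uses an object-consistency classification loss):

```python
import torch
import torch.nn.functional as F

def coseparation_loss(separate, spec_a, spec_b, objects_a, objects_b):
    mix = spec_a + spec_b                          # mix the pair's audio
    sep_a = [separate(mix, o) for o in objects_a]  # per-object estimates
    sep_b = [separate(mix, o) for o in objects_b]
    # each video's separated object sounds must sum back to its own track
    return F.l1_loss(sum(sep_a), spec_a) + F.l1_loss(sum(sep_b), spec_b)

def toy_separate(mix, obj_feat):
    """Stand-in for the real network: a scalar mask from an object feature."""
    return torch.sigmoid(obj_feat.mean()) * mix

spec_a, spec_b = torch.rand(1, 257, 64), torch.rand(1, 257, 64)  # spectrograms
objs_a, objs_b = [torch.randn(128)], [torch.randn(128), torch.randn(128)]
print(coseparation_loss(toy_separate, spec_a, spec_b, objs_a, objs_b))
```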

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Title GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Authors Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu
Abstract The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies by aggregating query-specific global context at each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within this general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. Its lightweight nature allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both the simplified NLNet and SENet on major benchmarks for various recognition tasks. The code and configurations are released at https://github.com/xvjiarui/GCNet.
Tasks Instance Segmentation, Object Detection, Object Recognition
Published 2019-04-25
URL http://arxiv.org/abs/1904.11492v1
PDF http://arxiv.org/pdf/1904.11492v1.pdf
PWC https://paperswithcode.com/paper/gcnet-non-local-networks-meet-squeeze
Repo https://github.com/xggIoU/GCNet_global_context_module_tensorflow
Framework tf
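
The GC block itself is small enough to sketch in full: softmax attention pooling over all positions (the query-independent context), a bottleneck transform, and broadcast addition back onto the feature map. A PyTorch sketch following the paper's description (the released code has additional configuration options):

```python
import torch
import torch.nn as nn

class GCBlock(nn.Module):
    def __init__(self, c, ratio=16):
        super().__init__()
        self.attn = nn.Conv2d(c, 1, kernel_size=1)   # context modeling
        self.transform = nn.Sequential(              # bottleneck transform
            nn.Conv2d(c, c // ratio, 1),
            nn.LayerNorm([c // ratio, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(c // ratio, c, 1),
        )

    def forward(self, x):
        n, c, h, w = x.shape
        w_attn = self.attn(x).view(n, 1, h * w).softmax(dim=-1)       # (N,1,HW)
        ctx = torch.bmm(x.view(n, c, h * w), w_attn.transpose(1, 2))  # (N,C,1)
        return x + self.transform(ctx.view(n, c, 1, 1))  # fusion by addition

x = torch.randn(2, 64, 32, 32)
print(GCBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```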

What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering

Title What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering
Authors Tushar Khot, Ashish Sabharwal, Peter Clark
Abstract Multi-hop textual question answering requires combining information from multiple sentences. We focus on a natural setting where, unlike typical reading comprehension, only partial information is provided with each question. The model must retrieve and use additional knowledge to correctly answer the question. To tackle this challenge, we develop a novel approach that explicitly identifies the knowledge gap between a key span in the provided knowledge and the answer choices. The model, GapQA, learns to fill this gap by determining the relationship between the span and an answer choice, based on retrieved knowledge targeting this gap. We propose jointly training a model to simultaneously fill this knowledge gap and compose it with the provided partial knowledge. On the OpenBookQA dataset, given partial knowledge, explicitly identifying what’s missing substantially outperforms previous approaches.
Tasks Question Answering, Reading Comprehension
Published 2019-09-19
URL https://arxiv.org/abs/1909.09253v1
PDF https://arxiv.org/pdf/1909.09253v1.pdf
PWC https://paperswithcode.com/paper/whats-missing-a-knowledge-gap-guided-approach
Repo https://github.com/allenai/missing-fact
Framework none

Stock Prices Prediction using Deep Learning Models

Title Stock Prices Prediction using Deep Learning Models
Authors Jialin Liu, Fei Chao, Yu-Chen Lin, Chih-Min Lin
Abstract Financial markets have a vital role in the development of modern society: they enable the deployment of economic resources, and changes in stock prices reflect changes in the market. In this study, we focus on predicting stock prices with a deep learning model. This is a challenging task, because the information related to stock prices carries much noise and uncertainty. This work therefore uses sparse autoencoders with one-dimensional (1-D) residual convolutional networks, a deep learning model, to de-noise the data. A long short-term memory (LSTM) network is then used to predict the stock price. Past prices, indices, and macroeconomic variables are the features used to predict the next day's price. Experimental results show that 1-D residual convolutional networks de-noise data and extract deep features better than a model that combines wavelet transforms (WT) and stacked autoencoders (SAEs). In addition, we compare the performance of the model on two different forecast targets: the absolute stock price and the price rate of change. The results show that predicting the price rate of change works better than predicting absolute prices directly.
Tasks
Published 2019-09-25
URL https://arxiv.org/abs/1909.12227v1
PDF https://arxiv.org/pdf/1909.12227v1.pdf
PWC https://paperswithcode.com/paper/stock-prices-prediction-using-deep-learning
Repo https://github.com/koos808/Papers_books_summary
Framework none
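
The pipeline has two stages: de-noise the feature sequence with a convolutional autoencoder, then feed the de-noised sequence to an LSTM that predicts the next day's price rate of change (the better-performing target). A minimal PyTorch sketch with illustrative layer sizes, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class DenoiseAE(nn.Module):
    def __init__(self, n_feat):
        super().__init__()
        self.enc = nn.Conv1d(n_feat, 16, kernel_size=3, padding=1)
        self.dec = nn.Conv1d(16, n_feat, kernel_size=3, padding=1)
    def forward(self, x):               # x: (batch, n_feat, time)
        return self.dec(torch.relu(self.enc(x)))

class RatePredictor(nn.Module):
    def __init__(self, n_feat, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):               # x: (batch, time, n_feat)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])    # predicted next-day rate of change

n_feat, T = 8, 30
features = torch.randn(4, n_feat, T)    # past prices, indices, macro variables
denoised = DenoiseAE(n_feat)(features)
rate = RatePredictor(n_feat)(denoised.transpose(1, 2))
print(rate.shape)  # torch.Size([4, 1])
```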

Semi-Supervised Learning by Augmented Distribution Alignment

Title Semi-Supervised Learning by Augmented Distribution Alignment
Authors Qin Wang, Wen Li, Luc Van Gool
Abstract In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled and unlabeled data. To this end, we propose to align the empirical distributions of labeled and unlabeled data to alleviate this bias. On the one hand, we adopt an adversarial training strategy to minimize the distribution distance between labeled and unlabeled data, inspired by work on domain adaptation. On the other hand, to deal with the small sample size of the labeled data, we propose a simple interpolation strategy to generate pseudo training samples. These two strategies can be easily implemented in existing deep neural networks. We demonstrate the effectiveness of our proposed approach on the benchmark SVHN and CIFAR10 datasets. Our code is available at https://github.com/qinenergy/adanet.
Tasks Domain Adaptation
Published 2019-05-20
URL https://arxiv.org/abs/1905.08171v2
PDF https://arxiv.org/pdf/1905.08171v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-by-augmented
Repo https://github.com/qinenergy/adanet
Framework tf
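
Both ingredients are simple to write down: mixup-style interpolation between labeled and unlabeled samples to enlarge the effective labeled set, and a discriminator that tries to tell labeled from unlabeled features, which the feature extractor is trained to fool (via gradient reversal in the paper). A hedged sketch with illustrative feature sizes:

```python
import torch

def interpolate(x_labeled, x_unlabeled, alpha=1.0):
    """Mixup-style pseudo samples between a labeled and an unlabeled batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x_labeled + (1 - lam) * x_unlabeled, lam

disc = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU(),
                           torch.nn.Linear(32, 1))

def alignment_loss(feat_labeled, feat_unlabeled):
    # the discriminator separates labeled from unlabeled features; the
    # feature extractor is trained adversarially to make them indistinguishable
    logits = torch.cat([disc(feat_labeled), disc(feat_unlabeled)])
    targets = torch.cat([torch.ones(len(feat_labeled), 1),
                         torch.zeros(len(feat_unlabeled), 1)])
    return torch.nn.functional.binary_cross_entropy_with_logits(logits, targets)

feat_l, feat_u = torch.randn(8, 64), torch.randn(32, 64)
x_mix, lam = interpolate(feat_l, feat_u[:8])
print(alignment_loss(feat_l, feat_u).item(), lam.item())
```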

SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition

Title SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition
Authors Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Yi Jiang, Naiyan Wang, Zhaoxiang Zhang
Abstract A Simple and Versatile Framework for Object Detection and Instance Recognition
Tasks Autonomous Driving, Object Detection
Published 2019-03-14
URL http://arxiv.org/abs/1903.05831v1
PDF http://arxiv.org/pdf/1903.05831v1.pdf
PWC https://paperswithcode.com/paper/simpledet-a-simple-and-versatile-distributed
Repo https://github.com/tusimple/simpledet
Framework mxnet

HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead

Title HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead
Authors Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
Abstract Recent neural conversation models that attempt to incorporate emotion and generate empathetic responses have either focused on conditioning the output on a given emotion or on incorporating the user's current emotional state. While these approaches have been successful to some extent in generating more diverse and seemingly engaging utterances, they do not factor in how the user would feel towards the generated dialogue response. Hence, in this paper, we advocate such look-ahead of user emotion as the key to modeling and generating empathetic dialogue responses. We thus train a Sentiment Predictor to estimate the user sentiment look-ahead towards the generated system responses, which is then used as the reward function for generating more empathetic responses. Human evaluation results show that our model outperforms other baselines in empathy, relevance, and fluency.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08487v1
PDF https://arxiv.org/pdf/1906.08487v1.pdf
PWC https://paperswithcode.com/paper/happybot-generating-empathetic-dialogue
Repo https://github.com/HLTCHKUST/sentiment-lookahead
Framework pytorch
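
The reward mechanism can be sketched as plain REINFORCE: a sentiment predictor scores how the user would likely feel after each sampled response, and that look-ahead score weights the policy-gradient loss of the generator. A minimal sketch with made-up numbers standing in for a real sentiment predictor and dialogue model:

```python
import torch

def reinforce_loss(log_probs, rewards, baseline=0.0):
    # log_probs: summed token log-probs of each sampled response
    # rewards:   sentiment look-ahead score for each response
    advantage = rewards - baseline
    return -(advantage.detach() * log_probs).mean()

log_probs = torch.tensor([-12.3, -9.8], requires_grad=True)
rewards = torch.tensor([0.7, 0.2])  # from the sentiment predictor
loss = reinforce_loss(log_probs, rewards, baseline=rewards.mean())
loss.backward()  # pushes probability mass toward high-sentiment responses
```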

Transparent Classification with Multilayer Logical Perceptrons and Random Binarization

Title Transparent Classification with Multilayer Logical Perceptrons and Random Binarization
Authors Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang
Abstract Models with a transparent inner structure and high classification performance are required to reduce potential risk and provide trust for users in domains like health care, finance, security, etc. However, existing models struggle to satisfy both properties simultaneously. In this paper, we propose a new hierarchical rule-based model for classification tasks, named Concept Rule Sets (CRS), which has both a strong expressive ability and a transparent inner structure. To address the challenge of efficiently learning the non-differentiable CRS model, we propose a novel neural network architecture, the Multilayer Logical Perceptron (MLLP), which is a continuous version of CRS. Using MLLP and our proposed Random Binarization (RB) method, we can search for the discrete solution of CRS in continuous space using gradient descent and ensure that the discrete CRS behaves almost the same as the corresponding continuous MLLP. Experiments on 12 public data sets show that CRS outperforms the state-of-the-art approaches and that the complexity of the learned CRS is close to that of a simple decision tree. Source code is available at https://github.com/12wang3/mllp.
Tasks
Published 2019-12-10
URL https://arxiv.org/abs/1912.04695v2
PDF https://arxiv.org/pdf/1912.04695v2.pdf
PWC https://paperswithcode.com/paper/transparent-classification-with-multilayer
Repo https://github.com/12wang3/mllp
Framework pytorch
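
The Random Binarization idea can be sketched with a straight-through estimator: on each forward pass, a random subset of weights is snapped to binary values, while gradients still flow to the underlying continuous weights, so the continuous MLLP is forced to behave like its discrete CRS counterpart. An illustrative sketch, not the authors' exact scheme:

```python
import torch

def random_binarize(w, p=0.5, threshold=0.5):
    mask = (torch.rand_like(w) < p).float()   # which weights to binarize
    w_bin = (w > threshold).float()           # snap selected weights to {0, 1}
    w_mixed = mask * w_bin + (1 - mask) * w
    # straight-through: forward uses w_mixed, backward sees identity w.r.t. w
    return w + (w_mixed - w).detach()

w = torch.rand(4, 3, requires_grad=True)
out = random_binarize(w).sum()
out.backward()
print(w.grad)  # all ones: the gradient passed straight through
```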

Learning Efficient Detector with Semi-supervised Adaptive Distillation

Title Learning Efficient Detector with Semi-supervised Adaptive Distillation
Authors Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen
Abstract Knowledge Distillation (KD) has been used in image classification for model compression. However, few studies apply this technique to single-stage object detectors. Focal loss shows that the accumulated errors of easily-classified samples dominate the overall loss in the training process. This problem is also encountered when applying KD to the detection task. For KD, the teacher-defined hard samples are far more important than any others. We propose ADL to address this issue by adaptively mimicking the teacher's logits, with more attention paid to two types of hard samples: hard-to-learn samples predicted by the teacher with low certainty, and hard-to-mimic samples with a large gap between the teacher's and the student's predictions. ADL enlarges the distillation loss for hard-to-learn and hard-to-mimic samples and reduces it for the dominant easy samples, enabling distillation to work on single-stage detectors for the first time, even when the student and the teacher are identical. Moreover, ADL is effective in both the supervised and the semi-supervised setting, even when the labeled and unlabeled data come from different distributions. For distillation on unlabeled data, ADL achieves better performance than existing data distillation, which simply utilizes hard targets, making the student detector surpass its teacher. On the COCO database, semi-supervised adaptive distillation (SAD) makes a student detector with a ResNet-50 backbone surpass its teacher with a ResNet-101 backbone, while the student has half of the teacher's computational complexity. The code is available at https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation
Tasks Image Classification, Model Compression
Published 2019-01-02
URL http://arxiv.org/abs/1901.00366v2
PDF http://arxiv.org/pdf/1901.00366v2.pdf
PWC https://paperswithcode.com/paper/learning-efficient-detector-with-semi
Repo https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation
Framework none
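
In the spirit of ADL, the per-sample distillation loss is scaled up, focal-loss style, when the teacher is uncertain (high entropy: hard-to-learn) or when the student diverges from the teacher (high KL: hard-to-mimic). The sketch below is illustrative; the paper's exact weighting may differ in its details:

```python
import torch
import torch.nn.functional as F

def adaptive_distillation_loss(student_logits, teacher_logits,
                               beta=1.5, gamma=2.0):
    p_t = teacher_logits.softmax(dim=-1)
    log_p_s = student_logits.log_softmax(dim=-1)
    kl = F.kl_div(log_p_s, p_t, reduction="none").sum(-1)  # hard-to-mimic
    entropy = -(p_t * p_t.clamp_min(1e-8).log()).sum(-1)   # hard-to-learn
    # focal-style weight: near 0 for easy samples, near 1 for hard ones
    weight = (1 - torch.exp(-(kl + beta * entropy))) ** gamma
    return (weight * kl).mean()

s = torch.randn(8, 81)  # student logits (e.g., 80 classes + background)
t = torch.randn(8, 81)  # teacher logits
print(adaptive_distillation_loss(s, t))
```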

HoloGAN: Unsupervised learning of 3D representations from natural images

Title HoloGAN: Unsupervised learning of 3D representations from natural images
Authors Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, Yong-Liang Yang
Abstract We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and how to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still generating images with visual quality similar to or higher than that of other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only; in particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.
Tasks Image Generation, Novel View Synthesis
Published 2019-04-02
URL https://arxiv.org/abs/1904.01326v2
PDF https://arxiv.org/pdf/1904.01326v2.pdf
PWC https://paperswithcode.com/paper/hologan-unsupervised-learning-of-3d
Repo https://github.com/thunguyenphuoc/HoloGAN
Framework tf
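
The mechanism behind the explicit pose control is a rigid-body transform applied to the learnt 3D feature volume before it is rendered to 2D. A minimal PyTorch sketch of rotating a feature volume about the vertical axis (the real generator wraps this in learned 3D and 2D convolutions):

```python
import math
import torch
import torch.nn.functional as F

def rotate_volume(vol, angle_deg):
    """Rotate a (N, C, D, H, W) feature volume about the vertical axis."""
    a = math.radians(angle_deg)
    theta = torch.tensor([[[ math.cos(a), 0., math.sin(a), 0.],
                           [ 0.,          1., 0.,          0.],
                           [-math.sin(a), 0., math.cos(a), 0.]]])
    theta = theta.expand(vol.size(0), -1, -1)
    grid = F.affine_grid(theta, vol.shape, align_corners=False)
    return F.grid_sample(vol, grid, align_corners=False)

features = torch.randn(1, 16, 8, 8, 8)  # learnt 3D features
rotated = rotate_volume(features, 30.0)
print(rotated.shape)                     # torch.Size([1, 16, 8, 8, 8])
```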

Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding

Title Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding
Authors Zehao Yu, Jia Zheng, Dongze Lian, Zihan Zhou, Shenghua Gao
Abstract Single-image piece-wise planar 3D reconstruction aims to simultaneously segment plane instances and recover 3D plane parameters from an image. Most recent approaches leverage convolutional neural networks (CNNs) and achieve promising results. However, these methods are limited to detecting a fixed number of planes with a certain learned order. To tackle this problem, we propose a novel two-stage method based on associative embedding, inspired by its recent success in instance segmentation. In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings. The plane instances are then obtained by grouping the embedding vectors in planar regions via an efficient mean shift clustering algorithm. In the second stage, we estimate the parameters of each plane instance by considering both pixel-level and instance-level consistencies. With the proposed method, we are able to detect an arbitrary number of planes. Extensive experiments on public datasets validate the effectiveness and efficiency of our method. Furthermore, our method runs at 30 fps at test time and could thus facilitate many real-time applications such as visual SLAM and human-robot interaction. Code is available at https://github.com/svip-lab/PlanarReconstruction.
Tasks 3D Plane Detection, 3D Reconstruction, Instance Segmentation, Semantic Segmentation
Published 2019-02-26
URL http://arxiv.org/abs/1902.09777v3
PDF http://arxiv.org/pdf/1902.09777v3.pdf
PWC https://paperswithcode.com/paper/single-image-piece-wise-planar-3d
Repo https://github.com/svip-lab/PlanarReconstruction
Framework pytorch
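
Stage one's grouping step is easy to picture: pixels from the same plane land close together in embedding space, and mean shift finds the modes without fixing the number of planes in advance. A plain, unaccelerated NumPy sketch on toy 2D embeddings (the paper uses an efficient variant):

```python
import numpy as np

def mean_shift(embeddings, bandwidth=0.5, iters=20):
    """Move each point toward the kernel-weighted mean of its neighborhood."""
    modes = embeddings.copy()
    for _ in range(iters):
        for i, m in enumerate(modes):
            d2 = np.sum((embeddings - m) ** 2, axis=1)
            w = np.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel weights
            modes[i] = (w[:, None] * embeddings).sum(0) / w.sum()
    # pixels whose modes coincide belong to the same plane instance
    return np.round(modes, 1)

rng = np.random.default_rng(0)
emb = np.concatenate([rng.standard_normal((50, 2)) * 0.1 + c
                      for c in ([0, 0], [3, 3])])  # two toy "planes"
print(np.unique(mean_shift(emb), axis=0))           # roughly two modes
```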

GraphNVP: An Invertible Flow Model for Generating Molecular Graphs

Title GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Authors Kaushalya Madhawa, Katushiko Ishiguro, Kosuke Nakago, Motoki Abe
Abstract We propose GraphNVP, the first invertible, normalizing-flow-based molecular graph generation model. We decompose the generation of a graph into two steps: generation of (i) an adjacency tensor and (ii) node attributes. This decomposition, combined with two novel reversible flows, enables exact likelihood maximization on graph-structured data. We empirically demonstrate that our model efficiently generates valid molecular graphs with almost no duplicated molecules. In addition, we observe that the learned latent space can be used to generate molecules with desired chemical properties.
Tasks Graph Generation
Published 2019-05-28
URL https://arxiv.org/abs/1905.11600v1
PDF https://arxiv.org/pdf/1905.11600v1.pdf
PWC https://paperswithcode.com/paper/graphnvp-an-invertible-flow-model-for
Repo https://github.com/pfnet-research/chainer-chemistry
Framework none
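
The reversible building block behind such flows is the affine coupling layer: half the dimensions are scaled and shifted by functions of the other half, which makes the transform exactly invertible with a tractable log-determinant. A sketch applied to a node-attribute matrix (the adjacency tensor gets analogous coupling flows in the paper); shapes and the inner network are illustrative:

```python
import torch
import torch.nn as nn

class Coupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        y2 = x2 * torch.exp(s) + t       # scale-and-shift half the dims
        log_det = s.sum(dim=-1)          # exact log-determinant
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=-1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)

flow = Coupling(8)
x = torch.randn(5, 8)                    # 5 nodes, 8 attributes each
z, log_det = flow(x)
print(torch.allclose(flow.inverse(z), x, atol=1e-5))  # True
```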