Paper Group AWR 424
On the computation of counterfactual explanations – A survey
Title | On the computation of counterfactual explanations – A survey |
Authors | André Artelt, Barbara Hammer |
Abstract | Due to the increasing use of machine learning in practice, it becomes more and more important to be able to explain the predictions and behavior of machine learning models. One instance of explanations is the counterfactual explanation, which provides an intuitive and useful explanation of a machine learning model. In this survey we review model-specific methods for efficiently computing counterfactual explanations of many different machine learning models and propose methods for models that have not been considered in the literature so far. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.07749v1 |
PDF | https://arxiv.org/pdf/1911.07749v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-computation-of-counterfactual |
Repo | https://github.com/andreArtelt/OnTheComputationOfCounterfactualExplanations |
Framework | none |
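To make the idea concrete, here is a minimal counterfactual-search sketch for a linear classifier: find the point closest to the query that lands on the other side of the decision boundary. This is a generic penalized-optimization formulation, not one of the survey's model-specific algorithms; the `reg` and `margin` parameters are illustrative.

```python
# Minimal counterfactual search for a linear classifier via penalized
# optimization; an illustrative sketch, not the survey's methods.
import numpy as np
from scipy.optimize import minimize

def counterfactual(w, b, x, target_sign, reg=10.0, margin=0.1):
    """Find x' close to x with sign(w @ x' + b) == target_sign."""
    def objective(xp):
        closeness = np.sum((xp - x) ** 2)                          # stay near the query
        violation = max(0.0, margin - target_sign * (w @ xp + b))  # hinge on the boundary
        return closeness + reg * violation
    return minimize(objective, x).x

# Toy example: flip a positive prediction to a negative one.
w, b = np.array([1.0, -2.0]), 0.5
x = np.array([2.0, 0.0])                         # w @ x + b = 2.5 > 0
x_cf = counterfactual(w, b, x, target_sign=-1.0)
print(x_cf, w @ x_cf + b)                        # close to x, on the other side
```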
The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
Title | The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction |
Authors | Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann |
Abstract | This paper studies the problem of predicting the distribution over multiple possible future paths of people as they move through various visual scenes. We make two main contributions. The first contribution is a new dataset, created in a realistic 3D simulator, which is based on real-world trajectory data and then extrapolated by human annotators to achieve different latent goals. This provides the first benchmark for quantitative evaluation of models that predict multi-future trajectories. The second contribution is a new model to generate multiple plausible future trajectories, which contains novel designs that use multi-scale location encodings and convolutional RNNs over graphs. We refer to our model as Multiverse. We show that our model achieves the best results on our dataset, as well as on the real-world VIRAT/ActEV dataset (which contains just one possible future). |
Tasks | Trajectory Prediction |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06445v3 |
PDF | https://arxiv.org/pdf/1912.06445v3.pdf |
PWC | https://paperswithcode.com/paper/the-garden-of-forking-paths-towards-multi |
Repo | https://github.com/JunweiLiang/ForkingPaths |
Framework | none |
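One component named in the abstract is a convolutional RNN over a spatial scene representation. Below is a hedged, minimal sketch of such a cell in PyTorch, over a plain 2D grid rather than a graph; all layer sizes are stand-ins, not the Multiverse configuration.

```python
# Minimal convolutional RNN cell over a 2D scene grid; illustrative sizes,
# not the authors' Multiverse model.
import torch
import torch.nn as nn

class ConvRNNCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=k // 2)

    def forward(self, x, h):
        # x: (B, in_ch, H, W) grid-encoded observation, h: (B, hid_ch, H, W)
        return torch.tanh(self.conv(torch.cat([x, h], dim=1)))

cell = ConvRNNCell(in_ch=8, hid_ch=16)
h = torch.zeros(1, 16, 18, 32)                 # hidden state over an 18x32 scene grid
for t in range(5):                             # unroll over observed timesteps
    x = torch.randn(1, 8, 18, 32)              # stand-in for per-cell scene features
    h = cell(x, h)
print(h.shape)                                 # torch.Size([1, 16, 18, 32])
```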
Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae
Title | Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae |
Authors | Piero Molino, Yang Wang, Jiawei Zhang |
Abstract | Embeddings are a fundamental component of many modern machine learning and natural language processing models. Understanding and visualizing them is essential for gathering insights about the information they capture and the behavior of the models. The state of the art in analyzing embeddings consists of projecting them onto two-dimensional planes without any interpretable semantics associated with the axes of the projection, which makes detailed analyses and comparisons among multiple sets of embeddings challenging. In this work, we propose to use explicit axes defined as algebraic formulae over embeddings to project them into a lower-dimensional but semantically meaningful subspace, as a simple yet effective analysis and visualization methodology. This methodology assigns an interpretable semantics to the measures of variability and the axes of visualizations, allowing for both comparisons among different sets of embeddings and fine-grained inspection of the embedding spaces. We demonstrate the power of the proposed methodology through a series of case studies that make use of visualizations constructed around the underlying methodology and through a user study. The results show how the methodology is effective at providing more profound insights than classical projection methods and how it is widely applicable to many other use cases. |
Tasks | Word Embeddings |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12099v1 |
PDF | https://arxiv.org/pdf/1905.12099v1.pdf |
PWC | https://paperswithcode.com/paper/parallax-visualizing-and-understanding-the |
Repo | https://github.com/uber-research/parallax |
Framework | none |
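The core mechanism is easy to illustrate: define axes as algebraic formulae over embeddings (for example, the difference of two word vectors) and project every embedding onto them. The sketch below uses random stand-in vectors, not real embeddings, and is not the Parallax tool itself.

```python
# Projection onto explicit, formula-defined axes; random stand-in vectors.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["king", "man", "woman", "queen", "doctor"]}

def axis(plus, minus):
    """Axis defined by an algebraic formula over embeddings: e(plus) - e(minus)."""
    v = emb[plus] - emb[minus]
    return v / np.linalg.norm(v)

x_axis = axis("king", "man")       # e.g. a "royalty" direction
y_axis = axis("woman", "man")      # e.g. a "gender" direction
for w, v in emb.items():
    print(w, v @ x_axis, v @ y_axis)   # interpretable 2D coordinates
```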
Co-Separating Sounds of Visual Objects
Title | Co-Separating Sounds of Visual Objects |
Authors | Ruohan Gao, Kristen Grauman |
Abstract | Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel. Current methods for visually-guided audio source separation sidestep the issue by training with artificially mixed video clips, but this puts unwieldy restrictions on training data collection and may even prevent learning the properties of “true” mixed sounds. We introduce a co-separation training paradigm that permits learning object-level sounds from unlabeled multi-source videos. Our novel training objective requires that the deep neural network’s separated audio for similar-looking objects be consistently identifiable, while simultaneously reproducing accurate video-level audio tracks for each source training pair. Our approach disentangles sounds in realistic test videos, even in cases where an object was not observed individually during training. We obtain state-of-the-art results on visually-guided audio source separation and audio denoising for the MUSIC, AudioSet, and AV-Bench datasets. |
Tasks | Audio Denoising, Denoising |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07750v2 |
PDF | https://arxiv.org/pdf/1904.07750v2.pdf |
PWC | https://paperswithcode.com/paper/co-separating-sounds-of-visual-objects |
Repo | https://github.com/rhgao/co-separation |
Framework | pytorch |
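A hedged sketch of the co-separation objective: mix the audio of two training videos, separate a spectrogram per detected object, and require that summing each video's own objects reproduces that video's audio track. The separator network and all shapes below are placeholders, not the paper's architecture.

```python
# Co-separation training objective, sketched; toy separator and shapes.
import torch
import torch.nn as nn

F_BINS, T = 256, 64
separator = nn.Sequential(nn.Linear(F_BINS + 128, 512), nn.ReLU(),
                          nn.Linear(512, F_BINS), nn.Sigmoid())  # mask head

def separate(mix_spec, obj_feat):
    # mix_spec: (T, F_BINS) mixture spectrogram, obj_feat: (128,) object feature
    inp = torch.cat([mix_spec, obj_feat.unsqueeze(0).expand(T, -1)], dim=1)
    return separator(inp) * mix_spec             # masked per-object spectrogram

audio_a, audio_b = torch.rand(T, F_BINS), torch.rand(T, F_BINS)
mix = audio_a + audio_b                           # "mixture" of two training videos
objs_a = [torch.randn(128) for _ in range(2)]     # detected objects in video A
objs_b = [torch.randn(128)]                       # detected objects in video B

sep_a = sum(separate(mix, f) for f in objs_a)
sep_b = sum(separate(mix, f) for f in objs_b)
loss = ((sep_a - audio_a) ** 2).mean() + ((sep_b - audio_b) ** 2).mean()
loss.backward()                                   # each video's track must be recovered
```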
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Title | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond |
Authors | Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu |
Abstract | The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies, via aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within the general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both the simplified NLNet and SENet on major benchmarks for various recognition tasks. The code and configurations are released at https://github.com/xvjiarui/GCNet. |
Tasks | Instance Segmentation, Object Detection, Object Recognition |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11492v1 |
PDF | http://arxiv.org/pdf/1904.11492v1.pdf |
PWC | https://paperswithcode.com/paper/gcnet-non-local-networks-meet-squeeze |
Repo | https://github.com/xggIoU/GCNet_global_context_module_tensorflow |
Framework | tf |
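Following the three-step framework the abstract describes (query-independent context pooling via a softmax attention map, a bottleneck transform, broadcast fusion by addition), a compact PyTorch rendering of a GC block might look as follows; consult the released repository for the official implementation.

```python
# GC block sketch: context modeling + bottleneck transform + additive fusion.
import torch
import torch.nn as nn

class GCBlock(nn.Module):
    def __init__(self, ch, ratio=16):
        super().__init__()
        self.attn = nn.Conv2d(ch, 1, kernel_size=1)        # context modeling
        self.transform = nn.Sequential(                    # bottleneck transform
            nn.Conv2d(ch, ch // ratio, 1),
            nn.LayerNorm([ch // ratio, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // ratio, ch, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        w_attn = torch.softmax(self.attn(x).view(b, 1, h * w), dim=2)      # (B,1,HW)
        context = torch.bmm(w_attn, x.view(b, c, h * w).transpose(1, 2))   # (B,1,C)
        context = context.reshape(b, c, 1, 1)              # one context for all queries
        return x + self.transform(context)                 # fuse by broadcast addition

x = torch.randn(2, 64, 14, 14)
print(GCBlock(64)(x).shape)                                # torch.Size([2, 64, 14, 14])
```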
What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering
Title | What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering |
Authors | Tushar Khot, Ashish Sabharwal, Peter Clark |
Abstract | Multi-hop textual question answering requires combining information from multiple sentences. We focus on a natural setting where, unlike typical reading comprehension, only partial information is provided with each question. The model must retrieve and use additional knowledge to correctly answer the question. To tackle this challenge, we develop a novel approach that explicitly identifies the knowledge gap between a key span in the provided knowledge and the answer choices. The model, GapQA, learns to fill this gap by determining the relationship between the span and an answer choice, based on retrieved knowledge targeting this gap. We propose jointly training a model to simultaneously fill this knowledge gap and compose it with the provided partial knowledge. On the OpenBookQA dataset, given partial knowledge, explicitly identifying what’s missing substantially outperforms previous approaches. |
Tasks | Question Answering, Reading Comprehension |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09253v1 |
PDF | https://arxiv.org/pdf/1909.09253v1.pdf |
PWC | https://paperswithcode.com/paper/whats-missing-a-knowledge-gap-guided-approach |
Repo | https://github.com/allenai/missing-fact |
Framework | none |
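A loose sketch of the flow the abstract describes: pick a key span from the provided partial knowledge, retrieve facts targeting the gap between that span and each answer choice, and score the span-choice relationship. The span picker, retriever, and scorer below are trivial stand-ins for GapQA's learned modules.

```python
# GapQA-style control flow with placeholder components; illustrative only.
def answer(partial_fact, choices, retrieve, relation_score):
    span = max(partial_fact.split(), key=len)      # stand-in for the key-span picker
    best, best_score = None, float("-inf")
    for choice in choices:
        facts = retrieve(span, choice)             # knowledge targeting this gap
        score = max(relation_score(span, choice, f) for f in facts)
        if score > best_score:
            best, best_score = choice, score
    return best

# Toy usage with placeholder retrieval and scoring:
retrieve = lambda span, choice: [f"{span} is related to {choice}"]
relation_score = lambda span, choice, fact: len(set(span) & set(choice))
print(answer("metal conducts electricity", ["iron nail", "wooden spoon"],
             retrieve, relation_score))
```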
Stock Prices Prediction using Deep Learning Models
Title | Stock Prices Prediction using Deep Learning Models |
Authors | Jialin Liu, Fei Chao, Yu-Chen Lin, Chih-Min Lin |
Abstract | Financial markets have a vital role in the development of modern society. They allow the deployment of economic resources. Changes in stock prices reflect changes in the market. In this study, we focus on predicting stock prices with a deep learning model. This is a challenging task, because there is much noise and uncertainty in the information related to stock prices. This work therefore uses sparse autoencoders with one-dimensional (1-D) residual convolutional networks, a deep learning model, to de-noise the data. A long short-term memory (LSTM) network is then used to predict the stock price. Past prices, indices, and macroeconomic variables are the features used to predict the next day’s price. Experimental results show that 1-D residual convolutional networks can de-noise data and extract deep features better than a model that combines wavelet transforms (WT) and stacked autoencoders (SAEs). In addition, we compare the performance of the model with two different forecast targets: the absolute stock price and the price rate of change. The results show that predicting the stock price through the price rate of change is better than predicting absolute prices directly. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.12227v1 |
PDF | https://arxiv.org/pdf/1909.12227v1.pdf |
PWC | https://paperswithcode.com/paper/stock-prices-prediction-using-deep-learning |
Repo | https://github.com/koos808/Papers_books_summary |
Framework | none |
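An illustrative sketch of the pipeline described above: a 1-D residual convolutional block to de-noise feature sequences, followed by an LSTM that predicts the next day's price rate of change. The dimensions and random inputs are placeholders, not the paper's setup.

```python
# 1-D residual conv de-noiser + LSTM price predictor; illustrative sketch.
import torch
import torch.nn as nn

class ResBlock1D(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.net = nn.Sequential(nn.Conv1d(ch, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv1d(ch, ch, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.net(x))    # residual connection

class Predictor(nn.Module):
    def __init__(self, n_feat=10, hid=32):
        super().__init__()
        self.denoise = ResBlock1D(n_feat)
        self.lstm = nn.LSTM(n_feat, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)
    def forward(self, x):                      # x: (B, T, n_feat)
        z = self.denoise(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])           # next-day rate of change

x = torch.randn(4, 30, 10)                     # 4 series, 30 days, 10 features
print(Predictor()(x).shape)                    # torch.Size([4, 1])
```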
Semi-Supervised Learning by Augmented Distribution Alignment
Title | Semi-Supervised Learning by Augmented Distribution Alignment |
Authors | Qin Wang, Wen Li, Luc Van Gool |
Abstract | In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled data and unlabeled data. To this end, we propose to align the empirical distributions of labeled and unlabeled data to alleviate the bias. On one hand, we adopt an adversarial training strategy to minimize the distribution distance between labeled and unlabeled data as inspired by domain adaptation works. On the other hand, to deal with the small sample size issue of labeled data, we also propose a simple interpolation strategy to generate pseudo training samples. Those two strategies can be easily implemented into existing deep neural networks. We demonstrate the effectiveness of our proposed approach on the benchmark SVHN and CIFAR10 datasets. Our code is available at \url{https://github.com/qinenergy/adanet}. |
Tasks | Domain Adaptation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08171v2 |
PDF | https://arxiv.org/pdf/1905.08171v2.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-by-augmented |
Repo | https://github.com/qinenergy/adanet |
Framework | tf |
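Both strategies in the abstract are simple to sketch: a mixup-style interpolation between labeled and unlabeled inputs, and a discriminator trained to tell the labeled and unlabeled feature distributions apart (with the feature extractor later updated to fool it). The snippet below is a PyTorch illustration under those assumptions; the released code is TensorFlow.

```python
# Augmented-distribution-alignment sketch: interpolation + adversarial critic.
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Linear(32, 64), nn.ReLU())     # shared feature extractor
disc = nn.Sequential(nn.Linear(64, 1))                 # labeled-vs-unlabeled critic
bce = nn.BCEWithLogitsLoss()

x_l, x_u = torch.randn(8, 32), torch.randn(64, 32)     # few labeled, many unlabeled

# (2) interpolation: pseudo samples between labeled and unlabeled inputs
lam = torch.rand(8, 1)
x_mix = lam * x_l + (1 - lam) * x_u[:8]

# (1) adversarial alignment: the critic separates the two sets; the feature
# extractor would then be updated to fool it (sign flip or gradient reversal).
f_l, f_u = feat(torch.cat([x_l, x_mix])), feat(x_u)
d_loss = bce(disc(f_l), torch.ones(16, 1)) + bce(disc(f_u), torch.zeros(64, 1))
d_loss.backward()
```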
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition
Title | SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition |
Authors | Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Yi Jiang, Naiyan Wang, Zhaoxiang Zhang |
Abstract | A Simple and Versatile Framework for Object Detection and Instance Recognition |
Tasks | Autonomous Driving, Object Detection |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.05831v1 |
PDF | http://arxiv.org/pdf/1903.05831v1.pdf |
PWC | https://paperswithcode.com/paper/simpledet-a-simple-and-versatile-distributed |
Repo | https://github.com/tusimple/simpledet |
Framework | mxnet |
HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead
Title | HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead |
Authors | Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung |
Abstract | Recent neural conversation models that attempted to incorporate emotion and generate empathetic responses either focused on conditioning the output to a given emotion, or incorporating the current user emotional state. While these approaches have been successful to some extent in generating more diverse and seemingly engaging utterances, they do not factor in how the user would feel towards the generated dialogue response. Hence, in this paper, we advocate such look-ahead of user emotion as the key to modeling and generating empathetic dialogue responses. We thus train a Sentiment Predictor to estimate the user sentiment look-ahead towards the generated system responses, which is then used as the reward function for generating more empathetic responses. Human evaluation results show that our model outperforms other baselines in empathy, relevance, and fluency. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08487v1 |
PDF | https://arxiv.org/pdf/1906.08487v1.pdf |
PWC | https://paperswithcode.com/paper/happybot-generating-empathetic-dialogue |
Repo | https://github.com/HLTCHKUST/sentiment-lookahead |
Framework | pytorch |
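The training signal described above can be sketched as policy-gradient learning: a sentiment predictor scores the generated response, and that score is the REINFORCE reward for the generator. Both models below are tiny random stand-ins for the paper's networks.

```python
# Sentiment look-ahead as a REINFORCE reward; toy stand-in models.
import torch
import torch.nn as nn

VOCAB = 100
generator = nn.Linear(16, VOCAB)             # maps a dialogue state to token logits
sentiment = lambda tokens: torch.rand(())    # stand-in for the trained Sentiment Predictor

state = torch.randn(16)
dist = torch.distributions.Categorical(logits=generator(state))
token = dist.sample()                        # sample one response token

reward = sentiment(token)                    # predicted user sentiment look-ahead
loss = -reward * dist.log_prob(token)        # REINFORCE: favor high-sentiment output
loss.backward()
```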
Transparent Classification with Multilayer Logical Perceptrons and Random Binarization
Title | Transparent Classification with Multilayer Logical Perceptrons and Random Binarization |
Authors | Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang |
Abstract | Models with a transparent inner structure and high classification performance are required to reduce potential risk and provide trust for users in domains like health care, finance, security, etc. However, it is hard for existing models to satisfy these two properties simultaneously. In this paper, we propose a new hierarchical rule-based model for classification tasks, named Concept Rule Sets (CRS), which has both a strong expressive ability and a transparent inner structure. To address the challenge of efficiently learning the non-differentiable CRS model, we propose a novel neural network architecture, the Multilayer Logical Perceptron (MLLP), which is a continuous version of CRS. Using MLLP and the Random Binarization (RB) method we propose, we can search for the discrete solution of CRS in continuous space using gradient descent and ensure the discrete CRS acts almost the same as the corresponding continuous MLLP. Experiments on 12 public data sets show that CRS outperforms the state-of-the-art approaches and that the complexity of the learned CRS is close to that of a simple decision tree. Source code is available at https://github.com/12wang3/mllp. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04695v2 |
PDF | https://arxiv.org/pdf/1912.04695v2.pdf |
PWC | https://paperswithcode.com/paper/transparent-classification-with-multilayer |
Repo | https://github.com/12wang3/mllp |
Framework | pytorch |
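A hedged sketch of the Random Binarization idea: during training, a random subset of the continuous weights is snapped to {0, 1} so the continuous model stays close to its discrete rule-set counterpart, while gradients still flow through the non-binarized weights. The layer below is a toy soft-AND, not the full MLLP.

```python
# Random Binarization on a toy soft-conjunction layer; illustrative only.
import torch

def random_binarize(w, p=0.5):
    """Binarize a random fraction p of weights; keep gradients on the rest."""
    mask = (torch.rand_like(w) < p).float()
    hard = (w > 0.5).float()
    return mask * hard + (1 - mask) * w       # binarized weights are held fixed

w = torch.rand(4, 8, requires_grad=True)      # membership weights in [0, 1]
x = torch.rand(2, 8)                          # 8 binary-ish input concepts

wb = random_binarize(w)
soft_and = torch.prod(1 - wb * (1 - x.unsqueeze(1)), dim=2)  # (2, 4) rule activations
soft_and.sum().backward()                     # gradients flow via non-binarized weights
print(soft_and.shape)
```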
Learning Efficient Detector with Semi-supervised Adaptive Distillation
Title | Learning Efficient Detector with Semi-supervised Adaptive Distillation |
Authors | Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen |
Abstract | Knowledge Distillation (KD) has been used in image classification for model compression. However, few studies apply this technique to single-stage object detectors. Focal loss shows that the accumulated errors of easily-classified samples dominate the overall loss in the training process. This problem is also encountered when applying KD to the detection task. For KD, the teacher-defined hard samples are far more important than any others. We propose ADL to address this issue by adaptively mimicking the teacher’s logits, with more attention paid to two types of hard samples: hard-to-learn samples predicted by the teacher with low certainty and hard-to-mimic samples with a large gap between the teacher’s and the student’s predictions. ADL enlarges the distillation loss for hard-to-learn and hard-to-mimic samples and reduces the distillation loss for the dominant easy samples, enabling distillation to work on single-stage detectors for the first time, even if the student and the teacher are identical. Besides, ADL is effective in both the supervised and the semi-supervised setting, even when the labeled data and unlabeled data are from different distributions. For distillation on unlabeled data, ADL achieves better performance than existing data distillation, which simply utilizes hard targets, making the student detector surpass its teacher. On the COCO database, semi-supervised adaptive distillation (SAD) makes a student detector with a ResNet-50 backbone surpass its teacher with a ResNet-101 backbone, while the student has half of the teacher’s computational complexity. The code is available at https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation |
Tasks | Image Classification, Model Compression |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00366v2 |
PDF | http://arxiv.org/pdf/1901.00366v2.pdf |
PWC | https://paperswithcode.com/paper/learning-efficient-detector-with-semi |
Repo | https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation |
Framework | none |
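The weighting scheme in the abstract can be sketched as a focal-style modulation of a per-sample distillation loss: up-weight samples where the teacher is uncertain (hard-to-learn) or where teacher and student disagree (hard-to-mimic). The exact form and the beta/gamma values below are assumptions, not the paper's definition.

```python
# ADL-style adaptively weighted distillation loss; hedged sketch.
import torch
import torch.nn.functional as F

def adl_loss(student_logits, teacher_logits, beta=1.5, gamma=1.0):
    p_t = torch.sigmoid(teacher_logits)
    p_s = torch.sigmoid(student_logits)
    # Per-element KL(p_t || p_s), written via binary cross-entropies.
    kl = F.binary_cross_entropy(p_s, p_t, reduction="none") \
         - F.binary_cross_entropy(p_t, p_t, reduction="none")
    entropy = -(p_t * p_t.clamp_min(1e-6).log()
                + (1 - p_t) * (1 - p_t).clamp_min(1e-6).log())  # teacher uncertainty
    gap = (p_t - p_s).abs()                                     # teacher-student gap
    weight = (1 - torch.exp(-(beta * entropy + gamma * gap))).detach()
    return (weight * kl).mean()                                 # easy samples down-weighted

s = torch.randn(16, 80, requires_grad=True)    # student logits (e.g. per anchor/class)
t = torch.randn(16, 80)                        # teacher logits
adl_loss(s, t).backward()
```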
HoloGAN: Unsupervised learning of 3D representations from natural images
Title | HoloGAN: Unsupervised learning of 3D representations from natural images |
Authors | Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, Yong-Liang Yang |
Abstract | We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and how to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner. |
Tasks | Image Generation, Novel View Synthesis |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01326v2 |
PDF | https://arxiv.org/pdf/1904.01326v2.pdf |
PWC | https://paperswithcode.com/paper/hologan-unsupervised-learning-of-3d |
Repo | https://github.com/thunguyenphuoc/HoloGAN |
Framework | tf |
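A minimal sketch of the core mechanic described above: apply a rigid-body rotation to a learnt 3D feature volume before projecting it toward an image. The volume here is random, only rotation about the vertical axis is shown, and the depth-sum projection is a naive stand-in for HoloGAN's learned projector.

```python
# Rigid-body transform of a 3D feature volume via grid sampling; sketch only.
import math
import torch
import torch.nn.functional as F

feat3d = torch.randn(1, 64, 16, 16, 16)        # (B, C, D, H, W) learnt 3D features

def rotate_y(volume, angle):
    c, s = math.cos(angle), math.sin(angle)
    # 3x4 affine matrix for a rotation about the y (vertical) axis
    theta = torch.tensor([[[c, 0.0, s, 0.0],
                           [0.0, 1.0, 0.0, 0.0],
                           [-s, 0.0, c, 0.0]]])
    grid = F.affine_grid(theta, volume.shape, align_corners=False)
    return F.grid_sample(volume, grid, align_corners=False)

rotated = rotate_y(feat3d, math.pi / 6)        # pose control via rigid transform
projected = rotated.sum(dim=2)                 # naive projection along depth
print(projected.shape)                         # torch.Size([1, 64, 16, 16])
```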
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding
Title | Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding |
Authors | Zehao Yu, Jia Zheng, Dongze Lian, Zihan Zhou, Shenghua Gao |
Abstract | Single-image piece-wise planar 3D reconstruction aims to simultaneously segment plane instances and recover 3D plane parameters from an image. Most recent approaches leverage convolutional neural networks (CNNs) and achieve promising results. However, these methods are limited to detecting a fixed number of planes with certain learned order. To tackle this problem, we propose a novel two-stage method based on associative embedding, inspired by its recent success in instance segmentation. In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings. Then, the plane instances are obtained by grouping the embedding vectors in planar regions via an efficient mean shift clustering algorithm. In the second stage, we estimate the parameter for each plane instance by considering both pixel-level and instance-level consistencies. With the proposed method, we are able to detect an arbitrary number of planes. Extensive experiments on public datasets validate the effectiveness and efficiency of our method. Furthermore, our method runs at 30 fps at the testing time, thus could facilitate many real-time applications such as visual SLAM and human-robot interaction. Code is available at https://github.com/svip-lab/PlanarReconstruction. |
Tasks | 3D Plane Detection, 3D Reconstruction, Instance Segmentation, Semantic Segmentation |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09777v3 |
PDF | http://arxiv.org/pdf/1902.09777v3.pdf |
PWC | https://paperswithcode.com/paper/single-image-piece-wise-planar-3d |
Repo | https://github.com/svip-lab/PlanarReconstruction |
Framework | pytorch |
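The stage-one associative embedding can be sketched as a discriminative pull/push loss: pixels of the same plane instance are pulled toward their instance mean, and the means of different instances are pushed apart. The margins below are illustrative, not the paper's values.

```python
# Pull/push associative-embedding loss over per-pixel embeddings; sketch.
import torch

def embedding_loss(emb, labels, delta_pull=0.5, delta_push=1.5):
    # emb: (N, D) per-pixel embeddings, labels: (N,) plane-instance ids
    means, pull = [], 0.0
    for k in labels.unique():
        e_k = emb[labels == k]
        mu = e_k.mean(dim=0)
        means.append(mu)
        pull = pull + ((e_k - mu).norm(dim=1) - delta_pull).clamp_min(0).pow(2).mean()
    means = torch.stack(means)
    d = torch.cdist(means, means)                  # pairwise distances between means
    push = (delta_push - d).clamp_min(0).pow(2)
    push = push.triu(1).sum() / max(1, len(means) * (len(means) - 1) // 2)
    return pull / len(means) + push

emb = torch.randn(1000, 8, requires_grad=True)     # embeddings for 1000 pixels
labels = torch.randint(0, 4, (1000,))              # 4 plane instances
embedding_loss(emb, labels).backward()
```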
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Title | GraphNVP: An Invertible Flow Model for Generating Molecular Graphs |
Authors | Kaushalya Madhawa, Katushiko Ishiguro, Kosuke Nakago, Motoki Abe |
Abstract | We propose GraphNVP, the first invertible, normalizing flow-based molecular graph generation model. We decompose the generation of a graph into two steps: generation of (i) an adjacency tensor and (ii) node attributes. This decomposition yields the exact likelihood maximization on graph-structured data, combined with two novel reversible flows. We empirically demonstrate that our model efficiently generates valid molecular graphs with almost no duplicated molecules. In addition, we observe that the learned latent space can be used to generate molecules with desired chemical properties. |
Tasks | Graph Generation |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11600v1 |
PDF | https://arxiv.org/pdf/1905.11600v1.pdf |
PWC | https://paperswithcode.com/paper/graphnvp-an-invertible-flow-model-for |
Repo | https://github.com/pfnet-research/chainer-chemistry |
Framework | none |
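A tiny sketch of the affine coupling transform underlying normalizing flows like GraphNVP, applied here to generic feature rows standing in for node attributes: half the dimensions are transformed conditioned on the other half, so the map is exactly invertible with a tractable log-determinant. This is generic coupling, not the paper's graph-specific flows.

```python
# Affine coupling layer with exact inverse and log|det J|; generic sketch.
import torch
import torch.nn as nn

D = 8
net = nn.Sequential(nn.Linear(D // 2, 32), nn.ReLU(), nn.Linear(32, D))  # scale+shift

def forward_flow(x):
    x1, x2 = x[:, : D // 2], x[:, D // 2 :]
    log_s, t = net(x1).chunk(2, dim=1)
    z2 = x2 * log_s.exp() + t                  # affine transform of the second half
    return torch.cat([x1, z2], dim=1), log_s.sum(dim=1)   # z and log|det J|

def inverse_flow(z):
    z1, z2 = z[:, : D // 2], z[:, D // 2 :]
    log_s, t = net(z1).chunk(2, dim=1)
    return torch.cat([z1, (z2 - t) * (-log_s).exp()], dim=1)

x = torch.randn(5, D)                          # stand-in for node attribute rows
z, logdet = forward_flow(x)
print(torch.allclose(inverse_flow(z), x, atol=1e-5))      # True: exactly invertible
```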