Paper Group AWR 424
On the computation of counterfactual explanations – A survey
Title | On the computation of counterfactual explanations – A survey |
Authors | André Artelt, Barbara Hammer |
Abstract | Due to the increasing use of machine learning in practice, it becomes more and more important to be able to explain the predictions and behavior of machine learning models. One instance of explanations is the counterfactual explanation, which provides an intuitive and useful explanation of a machine learning model. In this survey we review model-specific methods for efficiently computing counterfactual explanations of many different machine learning models and propose methods for models that have not been considered in the literature so far. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.07749v1 |
PDF | https://arxiv.org/pdf/1911.07749v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-computation-of-counterfactual |
Repo | https://github.com/andreArtelt/OnTheComputationOfCounterfactualExplanations |
Framework | none |
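To make the idea concrete, here is a minimal counterfactual-search sketch for a linear classifier: find the point closest to the query that lands on the other side of the decision boundary. This is a generic penalized-optimization formulation, not one of the survey's model-specific algorithms; the `reg` and `margin` parameters are illustrative.

```python
# Minimal counterfactual search for a linear classifier via penalized
# optimization; an illustrative sketch, not the survey's methods.
import numpy as np
from scipy.optimize import minimize

def counterfactual(w, b, x, target_sign, reg=10.0, margin=0.1):
    """Find x' close to x with sign(w @ x' + b) == target_sign."""
    def objective(xp):
        closeness = np.sum((xp - x) ** 2)                          # stay near the query
        violation = max(0.0, margin - target_sign * (w @ xp + b))  # hinge on the boundary
        return closeness + reg * violation
    return minimize(objective, x).x

# Toy example: flip a positive prediction to a negative one.
w, b = np.array([1.0, -2.0]), 0.5
x = np.array([2.0, 0.0])                         # w @ x + b = 2.5 > 0
x_cf = counterfactual(w, b, x, target_sign=-1.0)
print(x_cf, w @ x_cf + b)                        # close to x, on the other side
```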
The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
Title | The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction |
Authors | Junwei Liang, Lu Jiang, Kevin Murphy, Ting Yu, Alexander Hauptmann |
Abstract | This paper studies the problem of predicting the distribution over multiple possible future paths of people as they move through various visual scenes. We make two main contributions. The first contribution is a new dataset, created in a realistic 3D simulator, which is based on real-world trajectory data and then extrapolated by human annotators to achieve different latent goals. This provides the first benchmark for quantitative evaluation of models that predict multi-future trajectories. The second contribution is a new model to generate multiple plausible future trajectories, which contains novel designs that use multi-scale location encodings and convolutional RNNs over graphs. We refer to our model as Multiverse. We show that our model achieves the best results on our dataset, as well as on the real-world VIRAT/ActEV dataset (which contains just one possible future). |
Tasks | Trajectory Prediction |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06445v3 |
PDF | https://arxiv.org/pdf/1912.06445v3.pdf |
PWC | https://paperswithcode.com/paper/the-garden-of-forking-paths-towards-multi |
Repo | https://github.com/JunweiLiang/ForkingPaths |
Framework | none |
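One component named in the abstract is a convolutional RNN over a spatial scene representation. Below is a hedged, minimal sketch of such a cell in PyTorch, over a plain 2D grid rather than a graph; all layer sizes are stand-ins, not the Multiverse configuration.

```python
# Minimal convolutional RNN cell over a 2D scene grid; illustrative sizes,
# not the authors' Multiverse model.
import torch
import torch.nn as nn

class ConvRNNCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=k // 2)

    def forward(self, x, h):
        # x: (B, in_ch, H, W) grid-encoded observation, h: (B, hid_ch, H, W)
        return torch.tanh(self.conv(torch.cat([x, h], dim=1)))

cell = ConvRNNCell(in_ch=8, hid_ch=16)
h = torch.zeros(1, 16, 18, 32)                 # hidden state over an 18x32 scene grid
for t in range(5):                             # unroll over observed timesteps
    x = torch.randn(1, 8, 18, 32)              # stand-in for per-cell scene features
    h = cell(x, h)
print(h.shape)                                 # torch.Size([1, 16, 18, 32])
```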
Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae
Title | Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae |
Authors | Piero Molino, Yang Wang, Jiawei Zhang |
Abstract | Embeddings are a fundamental component of many modern machine learning and natural language processing models. Understanding and visualizing them is essential for gathering insights about the information they capture and the behavior of the models. The state of the art in analyzing embeddings consists of projecting them onto two-dimensional planes without any interpretable semantics associated with the axes of the projection, which makes detailed analyses and comparisons among multiple sets of embeddings challenging. In this work, we propose to use explicit axes defined as algebraic formulae over embeddings to project them into a lower-dimensional but semantically meaningful subspace, as a simple yet effective analysis and visualization methodology. This methodology assigns an interpretable semantics to the measures of variability and the axes of visualizations, allowing for both comparisons among different sets of embeddings and fine-grained inspection of the embedding spaces. We demonstrate the power of the proposed methodology through a series of case studies that make use of visualizations constructed around the underlying methodology and through a user study. The results show how the methodology is effective at providing more profound insights than classical projection methods and how it is widely applicable to many other use cases. |
Tasks | Word Embeddings |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12099v1 |
PDF | https://arxiv.org/pdf/1905.12099v1.pdf |
PWC | https://paperswithcode.com/paper/parallax-visualizing-and-understanding-the |
Repo | https://github.com/uber-research/parallax |
Framework | none |
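The core mechanism is easy to illustrate: define axes as algebraic formulae over embeddings (for example, the difference of two word vectors) and project every embedding onto them. The sketch below uses random stand-in vectors, not real embeddings, and is not the Parallax tool itself.

```python
# Projection onto explicit, formula-defined axes; random stand-in vectors.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["king", "man", "woman", "queen", "doctor"]}

def axis(plus, minus):
    """Axis defined by an algebraic formula over embeddings: e(plus) - e(minus)."""
    v = emb[plus] - emb[minus]
    return v / np.linalg.norm(v)

x_axis = axis("king", "man")       # e.g. a "royalty" direction
y_axis = axis("woman", "man")      # e.g. a "gender" direction
for w, v in emb.items():
    print(w, v @ x_axis, v @ y_axis)   # interpretable 2D coordinates
```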
Co-Separating Sounds of Visual Objects
Title | Co-Separating Sounds of Visual Objects |
Authors | Ruohan Gao, Kristen Grauman |
Abstract | Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel. Current methods for visually-guided audio source separation sidestep the issue by training with artificially mixed video clips, but this puts unwieldy restrictions on training data collection and may even prevent learning the properties of “true” mixed sounds. We introduce a co-separation training paradigm that permits learning object-level sounds from unlabeled multi-source videos. Our novel training objective requires that the deep neural network’s separated audio for similar-looking objects be consistently identifiable, while simultaneously reproducing accurate video-level audio tracks for each source training pair. Our approach disentangles sounds in realistic test videos, even in cases where an object was not observed individually during training. We obtain state-of-the-art results on visually-guided audio source separation and audio denoising for the MUSIC, AudioSet, and AV-Bench datasets. |
Tasks | Audio Denoising, Denoising |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07750v2 |
PDF | https://arxiv.org/pdf/1904.07750v2.pdf |
PWC | https://paperswithcode.com/paper/co-separating-sounds-of-visual-objects |
Repo | https://github.com/rhgao/co-separation |
Framework | pytorch |
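A hedged sketch of the co-separation objective: mix the audio of two training videos, separate a spectrogram per detected object, and require that summing each video's own objects reproduces that video's audio track. The separator network and all shapes below are placeholders, not the paper's architecture.

```python
# Co-separation training objective, sketched; toy separator and shapes.
import torch
import torch.nn as nn

F_BINS, T = 256, 64
separator = nn.Sequential(nn.Linear(F_BINS + 128, 512), nn.ReLU(),
                          nn.Linear(512, F_BINS), nn.Sigmoid())  # mask head

def separate(mix_spec, obj_feat):
    # mix_spec: (T, F_BINS) mixture spectrogram, obj_feat: (128,) object feature
    inp = torch.cat([mix_spec, obj_feat.unsqueeze(0).expand(T, -1)], dim=1)
    return separator(inp) * mix_spec             # masked per-object spectrogram

audio_a, audio_b = torch.rand(T, F_BINS), torch.rand(T, F_BINS)
mix = audio_a + audio_b                           # "mixture" of two training videos
objs_a = [torch.randn(128) for _ in range(2)]     # detected objects in video A
objs_b = [torch.randn(128)]                       # detected objects in video B

sep_a = sum(separate(mix, f) for f in objs_a)
sep_b = sum(separate(mix, f) for f in objs_b)
loss = ((sep_a - audio_a) ** 2).mean() + ((sep_b - audio_b) ** 2).mean()
loss.backward()                                   # each video's track must be recovered
```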
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Title | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond |
Authors | Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu |
Abstract | The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies, via aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within the general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both the simplified NLNet and SENet on major benchmarks for various recognition tasks. The code and configurations are released at https://github.com/xvjiarui/GCNet. |
Tasks | Instance Segmentation, Object Detection, Object Recognition |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11492v1 |
PDF | http://arxiv.org/pdf/1904.11492v1.pdf |
PWC | https://paperswithcode.com/paper/gcnet-non-local-networks-meet-squeeze |
Repo | https://github.com/xggIoU/GCNet_global_context_module_tensorflow |
Framework | tf |
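Following the three-step framework the abstract describes (query-independent context pooling via a softmax attention map, a bottleneck transform, broadcast fusion by addition), a compact PyTorch rendering of a GC block might look as follows; consult the released repository for the official implementation.

```python
# GC block sketch: context modeling + bottleneck transform + additive fusion.
import torch
import torch.nn as nn

class GCBlock(nn.Module):
    def __init__(self, ch, ratio=16):
        super().__init__()
        self.attn = nn.Conv2d(ch, 1, kernel_size=1)        # context modeling
        self.transform = nn.Sequential(                    # bottleneck transform
            nn.Conv2d(ch, ch // ratio, 1),
            nn.LayerNorm([ch // ratio, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // ratio, ch, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        w_attn = torch.softmax(self.attn(x).view(b, 1, h * w), dim=2)      # (B,1,HW)
        context = torch.bmm(w_attn, x.view(b, c, h * w).transpose(1, 2))   # (B,1,C)
        context = context.reshape(b, c, 1, 1)              # one context for all queries
        return x + self.transform(context)                 # fuse by broadcast addition

x = torch.randn(2, 64, 14, 14)
print(GCBlock(64)(x).shape)                                # torch.Size([2, 64, 14, 14])
```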
What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering
Title | What’s Missing: A Knowledge Gap Guided Approach for Multi-hop Question Answering |
Authors | Tushar Khot, Ashish Sabharwal, Peter Clark |
Abstract | Multi-hop textual question answering requires combining information from multiple sentences. We focus on a natural setting where, unlike typical reading comprehension, only partial information is provided with each question. The model must retrieve and use additional knowledge to correctly answer the question. To tackle this challenge, we develop a novel approach that explicitly identifies the knowledge gap between a key span in the provided knowledge and the answer choices. The model, GapQA, learns to fill this gap by determining the relationship between the span and an answer choice, based on retrieved knowledge targeting this gap. We propose jointly training a model to simultaneously fill this knowledge gap and compose it with the provided partial knowledge. On the OpenBookQA dataset, given partial knowledge, explicitly identifying what’s missing substantially outperforms previous approaches. |
Tasks | Question Answering, Reading Comprehension |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09253v1 |
PDF | https://arxiv.org/pdf/1909.09253v1.pdf |
PWC | https://paperswithcode.com/paper/whats-missing-a-knowledge-gap-guided-approach |
Repo | https://github.com/allenai/missing-fact |
Framework | none |
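A loose sketch of the flow the abstract describes: pick a key span from the provided partial knowledge, retrieve facts targeting the gap between that span and each answer choice, and score the span-choice relationship. The span picker, retriever, and scorer below are trivial stand-ins for GapQA's learned modules.

```python
# GapQA-style control flow with placeholder components; illustrative only.
def answer(partial_fact, choices, retrieve, relation_score):
    span = max(partial_fact.split(), key=len)      # stand-in for the key-span picker
    best, best_score = None, float("-inf")
    for choice in choices:
        facts = retrieve(span, choice)             # knowledge targeting this gap
        score = max(relation_score(span, choice, f) for f in facts)
        if score > best_score:
            best, best_score = choice, score
    return best

# Toy usage with placeholder retrieval and scoring:
retrieve = lambda span, choice: [f"{span} is related to {choice}"]
relation_score = lambda span, choice, fact: len(set(span) & set(choice))
print(answer("metal conducts electricity", ["iron nail", "wooden spoon"],
             retrieve, relation_score))
```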
Stock Prices Prediction using Deep Learning Models
Title | Stock Prices Prediction using Deep Learning Models |
Authors | Jialin Liu, Fei Chao, Yu-Chen Lin, Chih-Min Lin |
Abstract | Financial markets have a vital role in the development of modern society. They allow the deployment of economic resources. Changes in stock prices reflect changes in the market. In this study, we focus on predicting stock prices with a deep learning model. This is a challenging task, because there is much noise and uncertainty in the information related to stock prices. This work therefore uses sparse autoencoders with one-dimensional (1-D) residual convolutional networks, a deep learning model, to de-noise the data. A long short-term memory (LSTM) network is then used to predict the stock price. Past prices, indices, and macroeconomic variables are the features used to predict the next day’s price. Experimental results show that 1-D residual convolutional networks can de-noise data and extract deep features better than a model that combines wavelet transforms (WT) and stacked autoencoders (SAEs). In addition, we compare the performance of the model with two different forecast targets: the absolute stock price and the price rate of change. The results show that predicting the stock price through the price rate of change is better than predicting absolute prices directly. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.12227v1 |
PDF | https://arxiv.org/pdf/1909.12227v1.pdf |
PWC | https://paperswithcode.com/paper/stock-prices-prediction-using-deep-learning |
Repo | https://github.com/koos808/Papers_books_summary |
Framework | none |
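An illustrative sketch of the pipeline described above: a 1-D residual convolutional block to de-noise feature sequences, followed by an LSTM that predicts the next day's price rate of change. The dimensions and random inputs are placeholders, not the paper's setup.

```python
# 1-D residual conv de-noiser + LSTM price predictor; illustrative sketch.
import torch
import torch.nn as nn

class ResBlock1D(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.net = nn.Sequential(nn.Conv1d(ch, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv1d(ch, ch, 3, padding=1))
    def forward(self, x):
        return torch.relu(x + self.net(x))    # residual connection

class Predictor(nn.Module):
    def __init__(self, n_feat=10, hid=32):
        super().__init__()
        self.denoise = ResBlock1D(n_feat)
        self.lstm = nn.LSTM(n_feat, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)
    def forward(self, x):                      # x: (B, T, n_feat)
        z = self.denoise(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])           # next-day rate of change

x = torch.randn(4, 30, 10)                     # 4 series, 30 days, 10 features
print(Predictor()(x).shape)                    # torch.Size([4, 1])
```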
Semi-Supervised Learning by Augmented Distribution Alignment
Title | Semi-Supervised Learning by Augmented Distribution Alignment |
Authors | Qin Wang, Wen Li, Luc Van Gool |
Abstract | In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled data and unlabeled data. To this end, we propose to align the empirical distributions of labeled and unlabeled data to alleviate the bias. On one hand, we adopt an adversarial training strategy to minimize the distribution distance between labeled and unlabeled data as inspired by domain adaptation works. On the other hand, to deal with the small sample size issue of labeled data, we also propose a simple interpolation strategy to generate pseudo training samples. Those two strategies can be easily implemented into existing deep neural networks. We demonstrate the effectiveness of our proposed approach on the benchmark SVHN and CIFAR10 datasets. Our code is available at \url{https://github.com/qinenergy/adanet}. |
Tasks | Domain Adaptation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08171v2 |
PDF | https://arxiv.org/pdf/1905.08171v2.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-by-augmented |
Repo | https://github.com/qinenergy/adanet |
Framework | tf |
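Both strategies in the abstract are simple to sketch: a mixup-style interpolation between labeled and unlabeled inputs, and a discriminator trained to tell the labeled and unlabeled feature distributions apart (with the feature extractor later updated to fool it). The snippet below is a PyTorch illustration under those assumptions; the released code is TensorFlow.

```python
# Augmented-distribution-alignment sketch: interpolation + adversarial critic.
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Linear(32, 64), nn.ReLU())     # shared feature extractor
disc = nn.Sequential(nn.Linear(64, 1))                 # labeled-vs-unlabeled critic
bce = nn.BCEWithLogitsLoss()

x_l, x_u = torch.randn(8, 32), torch.randn(64, 32)     # few labeled, many unlabeled

# (2) interpolation: pseudo samples between labeled and unlabeled inputs
lam = torch.rand(8, 1)
x_mix = lam * x_l + (1 - lam) * x_u[:8]

# (1) adversarial alignment: the critic separates the two sets; the feature
# extractor would then be updated to fool it (sign flip or gradient reversal).
f_l, f_u = feat(torch.cat([x_l, x_mix])), feat(x_u)
d_loss = bce(disc(f_l), torch.ones(16, 1)) + bce(disc(f_u), torch.zeros(64, 1))
d_loss.backward()
```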
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition
Title | SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition |
Authors | Yuntao Chen, Chenxia Han, Yanghao Li, Zehao Huang, Yi Jiang, Naiyan Wang, Zhaoxiang Zhang |
Abstract | A Simple and Versatile Framework for Object Detection and Instance Recognition |
Tasks | Autonomous Driving, Object Detection |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.05831v1 |
PDF | http://arxiv.org/pdf/1903.05831v1.pdf |
PWC | https://paperswithcode.com/paper/simpledet-a-simple-and-versatile-distributed |
Repo | https://github.com/tusimple/simpledet |
Framework | mxnet |
HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead
Title | HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead |
Authors | Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung |
Abstract | Recent neural conversation models that attempted to incorporate emotion and generate empathetic responses either focused on conditioning the output to a given emotion, or incorporating the current user emotional state. While these approaches have been successful to some extent in generating more diverse and seemingly engaging utterances, they do not factor in how the user would feel towards the generated dialogue response. Hence, in this paper, we advocate such look-ahead of user emotion as the key to modeling and generating empathetic dialogue responses. We thus train a Sentiment Predictor to estimate the user sentiment look-ahead towards the generated system responses, which is then used as the reward function for generating more empathetic responses. Human evaluation results show that our model outperforms other baselines in empathy, relevance, and fluency. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08487v1 |
PDF | https://arxiv.org/pdf/1906.08487v1.pdf |
PWC | https://paperswithcode.com/paper/happybot-generating-empathetic-dialogue |
Repo | https://github.com/HLTCHKUST/sentiment-lookahead |
Framework | pytorch |
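The training signal described above can be sketched as policy-gradient learning: a sentiment predictor scores the generated response, and that score is the REINFORCE reward for the generator. Both models below are tiny random stand-ins for the paper's networks.

```python
# Sentiment look-ahead as a REINFORCE reward; toy stand-in models.
import torch
import torch.nn as nn

VOCAB = 100
generator = nn.Linear(16, VOCAB)             # maps a dialogue state to token logits
sentiment = lambda tokens: torch.rand(())    # stand-in for the trained Sentiment Predictor

state = torch.randn(16)
dist = torch.distributions.Categorical(logits=generator(state))
token = dist.sample()                        # sample one response token

reward = sentiment(token)                    # predicted user sentiment look-ahead
loss = -reward * dist.log_prob(token)        # REINFORCE: favor high-sentiment output
loss.backward()
```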
Transparent Classification with Multilayer Logical Perceptrons and Random Binarization
Title | Transparent Classification with Multilayer Logical Perceptrons and Random Binarization |
Authors | Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang |
Abstract | Models with a transparent inner structure and high classification performance are required to reduce potential risk and provide trust for users in domains like health care, finance, security, etc. However, it is hard for existing models to satisfy these two properties simultaneously. In this paper, we propose a new hierarchical rule-based model for classification tasks, named Concept Rule Sets (CRS), which has both a strong expressive ability and a transparent inner structure. To address the challenge of efficiently learning the non-differentiable CRS model, we propose a novel neural network architecture, the Multilayer Logical Perceptron (MLLP), which is a continuous version of CRS. Using MLLP and the Random Binarization (RB) method we propose, we can search for the discrete solution of CRS in continuous space using gradient descent and ensure the discrete CRS acts almost the same as the corresponding continuous MLLP. Experiments on 12 public data sets show that CRS outperforms the state-of-the-art approaches and that the complexity of the learned CRS is close to that of a simple decision tree. Source code is available at https://github.com/12wang3/mllp. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04695v2 |
PDF | https://arxiv.org/pdf/1912.04695v2.pdf |
PWC | https://paperswithcode.com/paper/transparent-classification-with-multilayer |
Repo | https://github.com/12wang3/mllp |
Framework | pytorch |
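A hedged sketch of the Random Binarization idea: during training, a random subset of the continuous weights is snapped to {0, 1} so the continuous model stays close to its discrete rule-set counterpart, while gradients still flow through the non-binarized weights. The layer below is a toy soft-AND, not the full MLLP.

```python
# Random Binarization on a toy soft-conjunction layer; illustrative only.
import torch

def random_binarize(w, p=0.5):
    """Binarize a random fraction p of weights; keep gradients on the rest."""
    mask = (torch.rand_like(w) < p).float()
    hard = (w > 0.5).float()
    return mask * hard + (1 - mask) * w       # binarized weights are held fixed

w = torch.rand(4, 8, requires_grad=True)      # membership weights in [0, 1]
x = torch.rand(2, 8)                          # 8 binary-ish input concepts

wb = random_binarize(w)
soft_and = torch.prod(1 - wb * (1 - x.unsqueeze(1)), dim=2)  # (2, 4) rule activations
soft_and.sum().backward()                     # gradients flow via non-binarized weights
print(soft_and.shape)
```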
Learning Efficient Detector with Semi-supervised Adaptive Distillation
Title | Learning Efficient Detector with Semi-supervised Adaptive Distillation |
Authors | Shitao Tang, Litong Feng, Wenqi Shao, Zhanghui Kuang, Wei Zhang, Yimin Chen |
Abstract | Knowledge Distillation (KD) has been used in image classification for model compression. However, few studies apply this technique to single-stage object detectors. Focal loss shows that the accumulated errors of easily-classified samples dominate the overall loss in the training process. This problem is also encountered when applying KD to the detection task. For KD, the teacher-defined hard samples are far more important than any others. We propose ADL to address this issue by adaptively mimicking the teacher’s logits, with more attention paid to two types of hard samples: hard-to-learn samples predicted by the teacher with low certainty and hard-to-mimic samples with a large gap between the teacher’s and the student’s predictions. ADL enlarges the distillation loss for hard-to-learn and hard-to-mimic samples and reduces the distillation loss for the dominant easy samples, enabling distillation to work on single-stage detectors for the first time, even if the student and the teacher are identical. Besides, ADL is effective in both the supervised and the semi-supervised setting, even when the labeled data and unlabeled data are from different distributions. For distillation on unlabeled data, ADL achieves better performance than existing data distillation, which simply utilizes hard targets, making the student detector surpass its teacher. On the COCO database, semi-supervised adaptive distillation (SAD) makes a student detector with a ResNet-50 backbone surpass its teacher with a ResNet-101 backbone, while the student has half of the teacher’s computational complexity. The code is available at https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation |
Tasks | Image Classification, Model Compression |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00366v2 |
PDF | http://arxiv.org/pdf/1901.00366v2.pdf |
PWC | https://paperswithcode.com/paper/learning-efficient-detector-with-semi |
Repo | https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation |
Framework | none |
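The weighting scheme in the abstract can be sketched as a focal-style modulation of a per-sample distillation loss: up-weight samples where the teacher is uncertain (hard-to-learn) or where teacher and student disagree (hard-to-mimic). The exact form and the beta/gamma values below are assumptions, not the paper's definition.

```python
# ADL-style adaptively weighted distillation loss; hedged sketch.
import torch
import torch.nn.functional as F

def adl_loss(student_logits, teacher_logits, beta=1.5, gamma=1.0):
    p_t = torch.sigmoid(teacher_logits)
    p_s = torch.sigmoid(student_logits)
    # Per-element KL(p_t || p_s), written via binary cross-entropies.
    kl = F.binary_cross_entropy(p_s, p_t, reduction="none") \
         - F.binary_cross_entropy(p_t, p_t, reduction="none")
    entropy = -(p_t * p_t.clamp_min(1e-6).log()
                + (1 - p_t) * (1 - p_t).clamp_min(1e-6).log())  # teacher uncertainty
    gap = (p_t - p_s).abs()                                     # teacher-student gap
    weight = (1 - torch.exp(-(beta * entropy + gamma * gap))).detach()
    return (weight * kl).mean()                                 # easy samples down-weighted

s = torch.randn(16, 80, requires_grad=True)    # student logits (e.g. per anchor/class)
t = torch.randn(16, 80)                        # teacher logits
adl_loss(s, t).backward()
```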
HoloGAN: Unsupervised learning of 3D representations from natural images
Title | HoloGAN: Unsupervised learning of 3D representations from natural images |
Authors | Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, Yong-Liang Yang |
Abstract | We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and how to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner. |
Tasks | Image Generation, Novel View Synthesis |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01326v2 |
PDF | https://arxiv.org/pdf/1904.01326v2.pdf |
PWC | https://paperswithcode.com/paper/hologan-unsupervised-learning-of-3d |
Repo | https://github.com/thunguyenphuoc/HoloGAN |
Framework | tf |
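A minimal sketch of the core mechanic described above: apply a rigid-body rotation to a learnt 3D feature volume before projecting it toward an image. The volume here is random, only rotation about the vertical axis is shown, and the depth-sum projection is a naive stand-in for HoloGAN's learned projector.

```python
# Rigid-body transform of a 3D feature volume via grid sampling; sketch only.
import math
import torch
import torch.nn.functional as F

feat3d = torch.randn(1, 64, 16, 16, 16)        # (B, C, D, H, W) learnt 3D features

def rotate_y(volume, angle):
    c, s = math.cos(angle), math.sin(angle)
    # 3x4 affine matrix for a rotation about the y (vertical) axis
    theta = torch.tensor([[[c, 0.0, s, 0.0],
                           [0.0, 1.0, 0.0, 0.0],
                           [-s, 0.0, c, 0.0]]])
    grid = F.affine_grid(theta, volume.shape, align_corners=False)
    return F.grid_sample(volume, grid, align_corners=False)

rotated = rotate_y(feat3d, math.pi / 6)        # pose control via rigid transform
projected = rotated.sum(dim=2)                 # naive projection along depth
print(projected.shape)                         # torch.Size([1, 64, 16, 16])
```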
Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding
Title | Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding |
Authors | Zehao Yu, Jia Zheng, Dongze Lian, Zihan Zhou, Shenghua Gao |
Abstract | Single-image piece-wise planar 3D reconstruction aims to simultaneously segment plane instances and recover 3D plane parameters from an image. Most recent approaches leverage convolutional neural networks (CNNs) and achieve promising results. However, these methods are limited to detecting a fixed number of planes with certain learned order. To tackle this problem, we propose a novel two-stage method based on associative embedding, inspired by its recent success in instance segmentation. In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings. Then, the plane instances are obtained by grouping the embedding vectors in planar regions via an efficient mean shift clustering algorithm. In the second stage, we estimate the parameter for each plane instance by considering both pixel-level and instance-level consistencies. With the proposed method, we are able to detect an arbitrary number of planes. Extensive experiments on public datasets validate the effectiveness and efficiency of our method. Furthermore, our method runs at 30 fps at the testing time, thus could facilitate many real-time applications such as visual SLAM and human-robot interaction. Code is available at https://github.com/svip-lab/PlanarReconstruction. |
Tasks | 3D Plane Detection, 3D Reconstruction, Instance Segmentation, Semantic Segmentation |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09777v3 |
PDF | http://arxiv.org/pdf/1902.09777v3.pdf |
PWC | https://paperswithcode.com/paper/single-image-piece-wise-planar-3d |
Repo | https://github.com/svip-lab/PlanarReconstruction |
Framework | pytorch |
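The stage-one associative embedding can be sketched as a discriminative pull/push loss: pixels of the same plane instance are pulled toward their instance mean, and the means of different instances are pushed apart. The margins below are illustrative, not the paper's values.

```python
# Pull/push associative-embedding loss over per-pixel embeddings; sketch.
import torch

def embedding_loss(emb, labels, delta_pull=0.5, delta_push=1.5):
    # emb: (N, D) per-pixel embeddings, labels: (N,) plane-instance ids
    means, pull = [], 0.0
    for k in labels.unique():
        e_k = emb[labels == k]
        mu = e_k.mean(dim=0)
        means.append(mu)
        pull = pull + ((e_k - mu).norm(dim=1) - delta_pull).clamp_min(0).pow(2).mean()
    means = torch.stack(means)
    d = torch.cdist(means, means)                  # pairwise distances between means
    push = (delta_push - d).clamp_min(0).pow(2)
    push = push.triu(1).sum() / max(1, len(means) * (len(means) - 1) // 2)
    return pull / len(means) + push

emb = torch.randn(1000, 8, requires_grad=True)     # embeddings for 1000 pixels
labels = torch.randint(0, 4, (1000,))              # 4 plane instances
embedding_loss(emb, labels).backward()
```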
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Title | GraphNVP: An Invertible Flow Model for Generating Molecular Graphs |
Authors | Kaushalya Madhawa, Katushiko Ishiguro, Kosuke Nakago, Motoki Abe |
Abstract | We propose GraphNVP, the first invertible, normalizing flow-based molecular graph generation model. We decompose the generation of a graph into two steps: generation of (i) an adjacency tensor and (ii) node attributes. This decomposition yields the exact likelihood maximization on graph-structured data, combined with two novel reversible flows. We empirically demonstrate that our model efficiently generates valid molecular graphs with almost no duplicated molecules. In addition, we observe that the learned latent space can be used to generate molecules with desired chemical properties. |
Tasks | Graph Generation |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11600v1 |
PDF | https://arxiv.org/pdf/1905.11600v1.pdf |
PWC | https://paperswithcode.com/paper/graphnvp-an-invertible-flow-model-for |
Repo | https://github.com/pfnet-research/chainer-chemistry |
Framework | none |
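A tiny sketch of the affine coupling transform underlying normalizing flows like GraphNVP, applied here to generic feature rows standing in for node attributes: half the dimensions are transformed conditioned on the other half, so the map is exactly invertible with a tractable log-determinant. This is generic coupling, not the paper's graph-specific flows.

```python
# Affine coupling layer with exact inverse and log|det J|; generic sketch.
import torch
import torch.nn as nn

D = 8
net = nn.Sequential(nn.Linear(D // 2, 32), nn.ReLU(), nn.Linear(32, D))  # scale+shift

def forward_flow(x):
    x1, x2 = x[:, : D // 2], x[:, D // 2 :]
    log_s, t = net(x1).chunk(2, dim=1)
    z2 = x2 * log_s.exp() + t                  # affine transform of the second half
    return torch.cat([x1, z2], dim=1), log_s.sum(dim=1)   # z and log|det J|

def inverse_flow(z):
    z1, z2 = z[:, : D // 2], z[:, D // 2 :]
    log_s, t = net(z1).chunk(2, dim=1)
    return torch.cat([z1, (z2 - t) * (-log_s).exp()], dim=1)

x = torch.randn(5, D)                          # stand-in for node attribute rows
z, logdet = forward_flow(x)
print(torch.allclose(inverse_flow(z), x, atol=1e-5))      # True: exactly invertible
```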