January 27, 2020

Paper Group ANR 1298

Uncertainty Based Detection and Relabeling of Noisy Image Labels


Title	Uncertainty Based Detection and Relabeling of Noisy Image Labels
Authors	Jan M. Köhler, Maximilian Autenrieth, William H. Beluch
Abstract	Deep neural networks (DNNs) are powerful tools in computer vision tasks. However, in many realistic scenarios label noise is prevalent in the training images, and overfitting to these noisy labels can significantly harm the generalization performance of DNNs. We propose a novel technique to identify data with noisy labels based on the different distributions of the predictive uncertainties from a DNN over the clean and noisy data. Additionally, the behavior of the uncertainty over the course of training helps to identify the network weights that can best be used to relabel the noisy labels. Data with noisy labels can therefore be cleaned in an iterative process. Our proposed method can be easily implemented, and shows promising performance on the task of noisy label detection on CIFAR-10 and CIFAR-100.
Tasks
Published	2019-05-29
URL	https://arxiv.org/abs/1906.11876v1
PDF	https://arxiv.org/pdf/1906.11876v1.pdf
PWC	https://paperswithcode.com/paper/uncertainty-based-detection-and-relabeling-of
Repo
Framework
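
The abstract above describes separating clean from noisy samples by their predictive uncertainty. A minimal sketch of that idea follows, assuming predictive entropy as the uncertainty measure and a fixed threshold as the decision rule; both are illustrative assumptions, and the names `predictive_entropy` and `flag_noisy` are hypothetical rather than taken from the paper.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def flag_noisy(prob_rows, threshold):
    """Return indices of samples whose entropy exceeds the threshold.

    The fixed threshold is an assumption made for this sketch; it is not
    the decision rule described in the paper.
    """
    return [i for i, row in enumerate(prob_rows)
            if predictive_entropy(row) > threshold]

# Toy softmax outputs for three training images (made-up numbers).
probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.05, 0.90, 0.05],  # fairly confident prediction
]
suspects = flag_noisy(probs, threshold=0.5)
```

In practice the threshold would have to be chosen from the observed uncertainty distributions rather than fixed by hand.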

Improving Sequence-to-Sequence Learning via Optimal Transport


Title	Improving Sequence-to-Sequence Learning via Optimal Transport
Authors	Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin
Abstract	Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE). However, standard MLE training considers a word-level objective, predicting the next word given the previous ground-truth partial sentence. This procedure focuses on modeling local syntactic patterns, and may fail to capture long-range semantic structure. We present a novel solution to alleviate these issues. Our approach imposes global sequence-level guidance via new supervision based on optimal transport, enabling the overall characterization and preservation of semantic features. We further show that this method can be understood as a Wasserstein gradient flow trying to match our model to the ground truth sequence distribution. Extensive experiments are conducted to validate the utility of the proposed approach, showing consistent improvements over a wide variety of NLP tasks, including machine translation, abstractive text summarization, and image captioning.
Tasks	Abstractive Text Summarization, Image Captioning, Machine Translation, Text Summarization
Published	2019-01-18
URL	http://arxiv.org/abs/1901.06283v1
PDF	http://arxiv.org/pdf/1901.06283v1.pdf
PWC	https://paperswithcode.com/paper/improving-sequence-to-sequence-learning-via
Repo
Framework

On the Robustness of Human Pose Estimation


Title	On the Robustness of Human Pose Estimation
Authors	Sahil Shah, Naman Jain, Abhishek Sharma, Arjun Jain
Abstract	This paper provides, to the best of our knowledge, the first comprehensive and exhaustive study of adversarial attacks on human pose estimation. Besides highlighting the important differences between well-studied classification and human pose-estimation systems w.r.t. adversarial attacks, we also provide deep insights into the design choices of pose-estimation systems to shape future work. We compare the robustness of several pose-estimation architectures trained on the standard datasets, MPII and COCO. In doing so, we also explore the problem of attacking non-classification based networks, including regression-based networks, which has been virtually unexplored in the past. We find that, compared to classification and semantic segmentation, human pose estimation architectures are relatively robust to adversarial attacks, with single-step attacks being surprisingly ineffective. Our study shows that heatmap-based pose-estimation models fare better than their direct regression-based counterparts and that systems which explicitly model anthropomorphic semantics of the human body are significantly more robust. We find that targeted attacks are more difficult to obtain than untargeted ones and that some body joints are easier to fool than others. We present visualizations of universal perturbations to facilitate unprecedented insights into their workings on pose estimation. Additionally, we show that they generalize well across different networks on both datasets.
Tasks	Pose Estimation, Semantic Segmentation
Published	2019-08-18
URL	https://arxiv.org/abs/1908.06401v1
PDF	https://arxiv.org/pdf/1908.06401v1.pdf
PWC	https://paperswithcode.com/paper/on-the-robustness-of-human-pose-estimation
Repo
Framework

Hamiltonian Generative Networks


Title	Hamiltonian Generative Networks
Authors	Peter Toth, Danilo Jimenez Rezende, Andrew Jaegle, Sébastien Racanière, Aleksandar Botev, Irina Higgins
Abstract	The Hamiltonian formalism plays a central role in classical and quantum physics. Hamiltonians are the main tool for modelling the continuous time evolution of systems with conserved quantities, and they come equipped with many useful properties, like time reversibility and smooth interpolation in time. These properties are important for many machine learning problems - from sequence prediction to reinforcement learning and density modelling - but are not typically provided out of the box by standard tools such as recurrent neural networks. In this paper, we introduce the Hamiltonian Generative Network (HGN), the first approach capable of consistently learning Hamiltonian dynamics from high-dimensional observations (such as images) without restrictive domain assumptions. Once trained, we can use HGN to sample new trajectories, perform rollouts both forward and backward in time and even speed up or slow down the learned dynamics. We demonstrate how a simple modification of the network architecture turns HGN into a powerful normalising flow model, called Neural Hamiltonian Flow (NHF), that uses Hamiltonian dynamics to model expressive densities. We hope that our work serves as a first practical demonstration of the value that the Hamiltonian formalism can bring to deep learning.
Tasks
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13789v2
PDF	https://arxiv.org/pdf/1909.13789v2.pdf
PWC	https://paperswithcode.com/paper/hamiltonian-generative-networks
Repo
Framework
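
The abstract above highlights time reversibility and rollouts forward and backward in time as properties of Hamiltonian dynamics. The sketch below illustrates those properties with a leapfrog integrator on a hand-written harmonic oscillator. It assumes unit mass and a known potential; HGN instead learns the Hamiltonian from observations, so this is only an illustration of the underlying dynamics, and the function names are hypothetical.

```python
def leapfrog(q, p, dt, n_steps, grad_potential):
    """Integrate dq/dt = p, dp/dt = -dV/dq with the leapfrog scheme.

    Assumes kinetic energy p**2 / 2 (unit mass); this is a simplification
    chosen for the sketch.
    """
    for _ in range(n_steps):
        p -= 0.5 * dt * grad_potential(q)  # half step in momentum
        q += dt * p                        # full step in position
        p -= 0.5 * dt * grad_potential(q)  # half step in momentum
    return q, p

def energy(q, p):
    """Hamiltonian of the toy system: H = p^2/2 + q^2/2."""
    return 0.5 * p * p + 0.5 * q * q

grad_v = lambda q: q  # V(q) = q^2 / 2, so dV/dq = q

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.1, n_steps=100, grad_potential=grad_v)
# Rolling back with a negated step size retraces the same trajectory.
q_back, p_back = leapfrog(q1, p1, dt=-0.1, n_steps=100, grad_potential=grad_v)
```

Changing `dt` while keeping the same Hamiltonian corresponds to speeding up or slowing down the rollout.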

Interpreting Verbal Irony: Linguistic Strategies and the Connection to the Type of Semantic Incongruity


Title	Interpreting Verbal Irony: Linguistic Strategies and the Connection to the Type of Semantic Incongruity
Authors	Debanjan Ghosh, Elena Musi, Kartikeya Upasani, Smaranda Muresan
Abstract	Human communication often involves the use of verbal irony or sarcasm, where the speakers usually mean the opposite of what they say. To better understand how verbal irony is expressed by the speaker and interpreted by the hearer we conduct a crowdsourcing task: given an utterance expressing verbal irony, users are asked to verbalize their interpretation of the speaker’s ironic message. We propose a typology of linguistic strategies for verbal irony interpretation and link it to various theoretical linguistic frameworks. We design computational models to capture these strategies and present empirical studies aimed to answer three questions: (1) what is the distribution of linguistic strategies used by hearers to interpret ironic messages?; (2) do hearers adopt similar strategies for interpreting the speaker’s ironic intent?; and (3) does the type of semantic incongruity in the ironic message (explicit vs. implicit) influence the choice of interpretation strategies by the hearers?
Tasks
Published	2019-11-03
URL	https://arxiv.org/abs/1911.00891v2
PDF	https://arxiv.org/pdf/1911.00891v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-verbal-irony-linguistic
Repo
Framework

Vector Field Neural Networks


Title	Vector Field Neural Networks
Authors	Daniel Vieira, Joao Paixao
Abstract	This work begins by establishing a mathematical formalization between different geometrical interpretations of Neural Networks, providing a first contribution. From this starting point, a new interpretation is explored, using the idea of implicit vector fields moving data as particles in a flow. A new architecture, Vector Field Neural Networks (VFNN), is proposed based on this interpretation, with the vector field becoming explicit. A specific implementation of the VFNN using Euler’s method to solve ordinary differential equations (ODEs) and Gaussian vector fields is tested. The first experiments present visual results highlighting the important features of the new architecture and providing another contribution with the geometrically interpretable regularization of model parameters. Then, the new architecture is evaluated for different hyperparameters and inputs, with the objective of evaluating the influence on model performance, computational time, and complexity. The VFNN model is compared against the known basic models Naive Bayes, Feed Forward Neural Networks, and Support Vector Machines (SVM), showing comparable, or better, results for different datasets. Finally, the conclusion provides many new questions and ideas for improvement of the model that can be used to increase model performance.
Tasks
Published	2019-05-16
URL	https://arxiv.org/abs/1905.07033v1
PDF	https://arxiv.org/pdf/1905.07033v1.pdf
PWC	https://paperswithcode.com/paper/vector-field-neural-networks
Repo
Framework
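
The abstract above describes data points moved as particles by an explicit Gaussian vector field, integrated with Euler's method. A minimal 2-D sketch of that mechanism follows. The parameterization (fixed centers, one weight vector per center, a shared `sigma`) and the function names are assumptions for illustration; in the paper these would be learned parameters, and the exact form may differ.

```python
import math

def gaussian_field(x, centers, weights, sigma=1.0):
    """Evaluate V(x) = sum_k w_k * exp(-||x - c_k||^2 / (2 sigma^2)) in 2-D."""
    vx, vy = 0.0, 0.0
    for (cx, cy), (wx, wy) in zip(centers, weights):
        d2 = (x[0] - cx) ** 2 + (x[1] - cy) ** 2
        g = math.exp(-d2 / (2.0 * sigma ** 2))
        vx += wx * g
        vy += wy * g
    return vx, vy

def euler_flow(x, centers, weights, step=0.1, n_steps=10, sigma=1.0):
    """Move the point x along the field with explicit Euler steps."""
    for _ in range(n_steps):
        vx, vy = gaussian_field(x, centers, weights, sigma)
        x = (x[0] + step * vx, x[1] + step * vy)
    return x

centers = [(0.0, 0.0)]
weights = [(1.0, 0.0)]  # the field pushes points in the +x direction
one_step = euler_flow((0.0, 0.0), centers, weights, step=0.1, n_steps=1)
moved = euler_flow((0.0, 0.0), centers, weights, step=0.1, n_steps=10)
```

A classifier built on this idea would apply the flow to every input and then fit a simple decision boundary to the moved points.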

Conceptor Debiasing of Word Representations Evaluated on WEAT


Title	Conceptor Debiasing of Word Representations Evaluated on WEAT
Authors	Saket Karve, Lyle Ungar, João Sedoc
Abstract	Bias in word embeddings such as Word2Vec has been widely investigated, and many efforts have been made to remove such bias. We show how to use conceptor debiasing to post-process both traditional and contextualized word embeddings. Our conceptor debiasing can simultaneously remove racial and gender biases and, unlike standard debiasing methods, can make effective use of heterogeneous lists of biased words. We show that conceptor debiasing diminishes racial and gender bias of word representations as measured using the Word Embedding Association Test (WEAT) of Caliskan et al. (2017).
Tasks	Word Embeddings
Published	2019-06-14
URL	https://arxiv.org/abs/1906.05993v1
PDF	https://arxiv.org/pdf/1906.05993v1.pdf
PWC	https://paperswithcode.com/paper/conceptor-debiasing-of-word-representations
Repo
Framework
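
The abstract above measures bias with the Word Embedding Association Test (WEAT). The sketch below computes the WEAT effect size for two target sets X, Y and two attribute sets A, B, using the mean cosine-similarity difference as the association score. The use of the population standard deviation and the toy 2-D vectors are assumptions for illustration; implementations differ on this normalization detail.

```python
import math
import statistics

def cosine(u, v):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(w, A, B):
    """s(w, A, B): mean similarity to A minus mean similarity to B."""
    sim_a = [cosine(w, a) for a in A]
    sim_b = [cosine(w, b) for b in B]
    return statistics.mean(sim_a) - statistics.mean(sim_b)

def weat_effect_size(X, Y, A, B):
    """Difference of mean associations, normalized by their spread."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    # Population standard deviation over X and Y combined (an assumption).
    return (statistics.mean(sx) - statistics.mean(sy)) / statistics.pstdev(sx + sy)

# Toy embeddings: X aligns with A, and Y aligns with B.
X, Y = [(1.0, 0.0)], [(0.0, 1.0)]
A, B = [(1.0, 0.0)], [(0.0, 1.0)]
effect = weat_effect_size(X, Y, A, B)
```

A debiasing method is judged successful under this test when the effect size moves toward zero after post-processing the embeddings.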

Deep Learning for Anomaly Detection: A Survey


Title	Deep Learning for Anomaly Detection: A Survey
Authors	Raghavendra Chalapathy, Sanjay Chawla
Abstract	Anomaly detection is an important problem that has been well-studied within diverse research areas and application domains. The aim of this survey is two-fold. First, we present a structured and comprehensive overview of research methods in deep learning-based anomaly detection. Second, we review the adoption of these methods for anomaly detection across various application domains and assess their effectiveness. We have grouped state-of-the-art research techniques into different categories based on the underlying assumptions and approach adopted. Within each category we outline the basic anomaly detection technique, along with its variants, and present the key assumptions used to differentiate between normal and anomalous behavior. For each category, we also present the advantages and limitations and discuss the computational complexity of the techniques in real application domains. Finally, we outline open issues in research and challenges faced while adopting these techniques.
Tasks	Anomaly Detection
Published	2019-01-10
URL	http://arxiv.org/abs/1901.03407v2
PDF	http://arxiv.org/pdf/1901.03407v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-anomaly-detection-a-survey
Repo
Framework

A Computational Model for Tensor Core Units


Title	A Computational Model for Tensor Core Units
Authors	Francesco Silvestri, Flavio Vella
Abstract	To respond to the need for efficient training and inference of deep neural networks, a plethora of domain-specific hardware architectures have been introduced, such as Google Tensor Processing Units and NVIDIA Tensor Cores. A common feature of these architectures is a hardware circuit for efficiently computing a dense matrix multiplication of a given small size. In order to broaden the class of algorithms that exploit these systems, we propose a computational model, named the TCU model, that captures the ability to natively multiply small matrices. We then use the TCU model for designing fast algorithms for linear algebra problems, including dense and sparse matrix multiplication, FFT, integer multiplication, and polynomial evaluation. We finally highlight a relation between the TCU model and the external memory model.
Tasks
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06649v1
PDF	https://arxiv.org/pdf/1908.06649v1.pdf
PWC	https://paperswithcode.com/paper/a-computational-model-for-tensor-core-units
Repo
Framework
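
The abstract above centers on hardware that natively multiplies small matrices of a fixed size. The sketch below shows how a larger dense product can be decomposed into calls to such a primitive. Here `tile_matmul` is a plain-Python stand-in for the hardware unit, and the tile size `s` is a free parameter; this is a generic blocked multiplication written for illustration, not the algorithm or cost analysis from the paper.

```python
def tile_matmul(a, b):
    """Stand-in for the hardware primitive: multiply two small square tiles."""
    s = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(s)) for j in range(s)]
            for i in range(s)]

def get_tile(m, row, col, s):
    """Copy the s x s tile of m whose top-left corner is (row, col)."""
    return [r[col:col + s] for r in m[row:row + s]]

def blocked_matmul(a, b, s):
    """Multiply two n x n matrices using only s x s tile products."""
    n = len(a)
    assert n % s == 0, "this sketch assumes n is a multiple of the tile size"
    c = [[0] * n for _ in range(n)]
    for i in range(0, n, s):
        for j in range(0, n, s):
            for k in range(0, n, s):
                t = tile_matmul(get_tile(a, i, k, s), get_tile(b, k, j, s))
                # Accumulate the tile product into the output block.
                for di in range(s):
                    for dj in range(s):
                        c[i + di][j + dj] += t[di][dj]
    return c

a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
c = blocked_matmul(a, b, s=2)
```

For an n x n product this issues (n/s)^3 calls to the primitive, which is the quantity a model of such hardware would account for.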

A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning


Title	A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning
Authors	Pengfei Wang, Chengquan Zhang, Fei Qi, Zuming Huang, Mengyi En, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi
Abstract	Detecting scene text of arbitrary shapes has been a challenging task over the past years. In this paper, we propose a novel segmentation-based text detector, namely SAST, which employs a context attended multi-task learning framework based on a Fully Convolutional Network (FCN) to learn various geometric properties for the reconstruction of polygonal representation of text regions. Taking sequential characteristics of text into consideration, a Context Attention Block is introduced to capture long-range dependencies of pixel information to obtain a more reliable segmentation. In post-processing, a Point-to-Quad assignment method is proposed to cluster pixels into text instances by integrating both high-level object knowledge and low-level pixel information in a single shot. Moreover, the polygonal representation of arbitrarily-shaped text can be extracted with the proposed geometric properties much more effectively. Experiments on several benchmarks, including ICDAR2015, ICDAR2017-MLT, SCUT-CTW1500, and Total-Text, demonstrate that SAST achieves better or comparable performance in terms of accuracy. Furthermore, the proposed algorithm runs at 27.63 FPS on SCUT-CTW1500 with a Hmean of 81.0% on a single NVIDIA Titan Xp graphics card, surpassing most of the existing segmentation-based methods.
Tasks	Multi-Task Learning
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05498v1
PDF	https://arxiv.org/pdf/1908.05498v1.pdf
PWC	https://paperswithcode.com/paper/a-single-shot-arbitrarily-shaped-text
Repo
Framework

Driving Style Encoder: Situational Reward Adaptation for General-Purpose Planning in Automated Driving


Title	Driving Style Encoder: Situational Reward Adaptation for General-Purpose Planning in Automated Driving
Authors	Sascha Rosbach, Vinit James, Simon Großjohann, Silviu Homoceanu, Xing Li, Stefan Roth
Abstract	General-purpose planning algorithms for automated driving combine mission, behavior, and local motion planning. Such planning algorithms map features of the environment and driving kinematics into complex reward functions. To achieve this, planning experts often rely on linear reward functions. The specification and tuning of these reward functions is a tedious process and requires significant experience. Moreover, a manually designed linear reward function does not generalize across different driving situations. In this work, we propose a deep learning approach based on inverse reinforcement learning that generates situation-dependent reward functions. Our neural network provides a mapping between features and actions of sampled driving policies of a model-predictive control-based planner and predicts reward functions for upcoming planning cycles. In our evaluation, we compare the driving style of reward functions predicted by our deep network against clustered and linear reward functions. Our proposed deep learning approach outperforms clustered linear reward functions and is on par with linear reward functions with a priori knowledge about the situation.
Tasks	Motion Planning
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03509v1
PDF	https://arxiv.org/pdf/1912.03509v1.pdf
PWC	https://paperswithcode.com/paper/driving-style-encoder-situational-reward
Repo
Framework

Variational Inference MPC for Bayesian Model-based Reinforcement Learning


Title	Variational Inference MPC for Bayesian Model-based Reinforcement Learning
Authors	Masashi Okada, Tadahiro Taniguchi
Abstract	In recent studies on model-based reinforcement learning (MBRL), incorporating uncertainty in forward dynamics is a state-of-the-art strategy to enhance learning performance, making MBRL competitive with cutting-edge model-free methods, especially in simulated robotics tasks. Probabilistic ensembles with trajectory sampling (PETS) is a leading type of MBRL, which applies Bayesian inference to dynamics modeling and uses model predictive control (MPC) with stochastic optimization via the cross entropy method (CEM). In this paper, we propose a novel extension to the uncertainty-aware MBRL. Our main contributions are twofold: Firstly, we introduce a variational inference MPC, which reformulates various stochastic methods, including CEM, in a Bayesian fashion. Secondly, we propose a novel instance of the framework, called probabilistic action ensembles with trajectory sampling (PaETS). As a result, our Bayesian MBRL can involve multimodal uncertainties both in dynamics and optimal trajectories. In comparison to PETS, our method consistently improves asymptotic performance on several challenging locomotion tasks.
Tasks	Bayesian Inference, Stochastic Optimization
Published	2019-07-08
URL	https://arxiv.org/abs/1907.04202v2
PDF	https://arxiv.org/pdf/1907.04202v2.pdf
PWC	https://paperswithcode.com/paper/variational-inference-mpc-for-bayesian-model
Repo
Framework
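
The abstract above builds on the cross entropy method (CEM) as the stochastic optimizer inside MPC. The sketch below shows plain CEM on a one-dimensional toy cost: sample candidates from a Gaussian, keep the lowest-cost elites, and refit the Gaussian to them. The cost function, hyperparameters, and the name `cem_minimize` are hypothetical, and this is the standard single-Gaussian CEM, not the variational or PaETS formulation proposed in the paper.

```python
import random
import statistics

def cem_minimize(cost, mean=0.0, std=2.0, n_samples=100, n_elite=10,
                 n_iters=15, seed=0):
    """Minimize a scalar cost with the cross entropy method."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        # Sample candidate actions from the current Gaussian.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Keep the candidates with the lowest cost.
        elites = sorted(samples, key=cost)[:n_elite]
        # Refit the sampling distribution to the elites.
        mean = statistics.mean(elites)
        std = statistics.pstdev(elites)
    return mean

# Toy cost with its minimum at x = 3 (an illustrative stand-in for a
# trajectory cost evaluated under a learned dynamics model).
best = cem_minimize(lambda x: (x - 3.0) ** 2)
```

Because the sampling distribution is a single Gaussian, this form of CEM tends to commit to one mode, which is the limitation that multimodal approaches aim to address.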

Detecting mechanical loosening of total hip replacement implant from plain radiograph using deep convolutional neural network


Title	Detecting mechanical loosening of total hip replacement implant from plain radiograph using deep convolutional neural network
Authors	Alireza Borjali, Antonia F. Chen, Orhun K. Muratoglu, Mohammad A. Morid, Kartik M. Varadarajan
Abstract	Plain radiography is widely used to detect mechanical loosening of total hip replacement (THR) implants. Currently, radiographs are assessed manually by medical professionals, which may be prone to poor inter- and intra-observer reliability and low accuracy. Furthermore, manual detection of mechanical loosening of THR implants requires experienced clinicians who might not always be readily available, potentially resulting in delayed diagnosis. In this study, we present a novel, fully automatic and interpretable approach to detect mechanical loosening of THR implants from plain radiographs using a deep convolutional neural network (CNN). We trained a CNN on 40 patients' anteroposterior hip X-rays using five-fold cross-validation and compared its performance with a high-volume, board-certified orthopaedic surgeon (AFC). To increase the confidence in the machine outcome, we also implemented saliency maps to visualize where the CNN looked to make a diagnosis. The CNN outperformed the orthopaedic surgeon in diagnosing mechanical loosening of THR implants, achieving significantly higher sensitivity (0.94) than the orthopaedic surgeon (0.53) with the same specificity (0.96). The saliency maps showed that the CNN looked at clinically relevant features to make a diagnosis. Such CNNs can be used for automatic radiologic assessment of mechanical loosening of THR implants to supplement the practitioners' decision-making process, increasing their diagnostic accuracy, and freeing them to engage in more patient-centric care.
Tasks	Decision Making
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00943v1
PDF	https://arxiv.org/pdf/1912.00943v1.pdf
PWC	https://paperswithcode.com/paper/detecting-mechanical-loosening-of-total-hip
Repo
Framework
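
The abstract above compares the CNN and the surgeon by sensitivity and specificity. As a reminder of how those two metrics are defined, the sketch below computes them from confusion-matrix counts. The counts used here are arbitrary numbers chosen for illustration and are not taken from the study.

```python
def sensitivity(tp, fn):
    """True positive rate: fraction of loosened implants correctly detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: fraction of stable implants correctly cleared."""
    return tn / (tn + fp)

# Arbitrary illustrative counts (not data from the paper).
tp, fn, tn, fp = 8, 2, 9, 1
sens = sensitivity(tp, fn)  # 8 / (8 + 2)
spec = specificity(tn, fp)  # 9 / (9 + 1)
```

Two readers can share the same specificity while differing widely in sensitivity, which is the situation the abstract reports.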

Learning to Learn in Simulation


Title	Learning to Learn in Simulation
Authors	Ervin Teng, Bob Iannucci
Abstract	Deep learning often requires the manual collection and annotation of a training set. On robotic platforms, can we partially automate this task by training the robot to be curious, i.e., to seek out beneficial training information in the environment? In this work, we address the problem of curiosity as it relates to online, real-time, human-in-the-loop training of an object detection algorithm onboard a drone, where motion is constrained to two dimensions. We use a 3D simulation environment and deep reinforcement learning to train a curiosity agent to, in turn, train the object detection model. This agent could have one of two conflicting objectives: train as quickly as possible, or train with minimal human input. We outline a reward function that allows the curiosity agent to learn either of these objectives, while taking into account some of the physical characteristics of the drone platform on which it is meant to run. In addition, we show that we can weigh the importance of achieving these objectives by adjusting a parameter in the reward function.
Tasks	Object Detection
Published	2019-02-05
URL	http://arxiv.org/abs/1902.01569v1
PDF	http://arxiv.org/pdf/1902.01569v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-learn-in-simulation
Repo
Framework

Explainable Semantic Mapping for First Responders


Title	Explainable Semantic Mapping for First Responders
Authors	Jean Oh, Martial Hebert, Hae-Gon Jeon, Xavier Perez, Chia Dai, Yeeho Song
Abstract	One of the key challenges in the semantic mapping problem in post-disaster environments is how to analyze a large amount of data efficiently with minimal supervision. To address this challenge, we propose a deep learning-based semantic mapping tool consisting of three main ideas. First, we develop a frugal semantic segmentation algorithm that uses only a small amount of labeled data. Next, we investigate the problem of learning to detect a new class of object using just a few training examples. Finally, we develop an explainable cost map learning algorithm that can be quickly trained to generate traversability cost maps using only raw sensor data such as aerial-view imagery. This paper presents an overview of the proposed idea and the lessons learned.
Tasks	Semantic Segmentation
Published	2019-10-15
URL	https://arxiv.org/abs/1910.07093v1
PDF	https://arxiv.org/pdf/1910.07093v1.pdf
PWC	https://paperswithcode.com/paper/explainable-semantic-mapping-for-first
Repo
Framework