January 30, 2020

3482 words 17 mins read

Paper Group ANR 434

A New Local Transformation Module for Few-shot Segmentation

Title A New Local Transformation Module for Few-shot Segmentation
Authors Yuwei Yang, Fanman Meng, Hongliang Li, Qingbo Wu, Xiaolong Xu, Shuai Chen
Abstract Few-shot segmentation segments object regions of new classes with only a few manual annotations. Its key step is to establish a transformation module between support images (annotated images) and query images (unlabeled images), so that the segmentation cues of support images can guide the segmentation of query images. Existing methods build the transformation model on global cues, which however ignores the local cues that this paper verifies to be very important for the transformation. This paper proposes a new transformation module based on local cues, where the relationship of the local features is used for transformation. To enhance the generalization performance of the network, the relationship matrix is calculated in a high-dimensional metric embedding space based on cosine distance. In addition, to handle the challenging mapping problem from low-level local relationships to high-level semantic cues, we propose to apply the generalized inverse matrix of the annotation matrix of the support images to transform the relationship matrix linearly, which is non-parametric and class-agnostic. The result of the matrix transformation can be regarded as an attention map with high-level semantic cues, based on which a transformation module can be built simply. The proposed transformation module is a general module that can replace the transformation module in existing few-shot segmentation frameworks. We verify the effectiveness of the proposed method on the Pascal VOC 2012 dataset. The mIoU reaches 57.0% in 1-shot and 60.6% in 5-shot, outperforming the state-of-the-art method by 1.6% and 3.5%, respectively.
Tasks
Published 2019-10-14
URL https://arxiv.org/abs/1910.05886v2
PDF https://arxiv.org/pdf/1910.05886v2.pdf
PWC https://paperswithcode.com/paper/a-new-local-transformation-module-for-few
Repo
Framework
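
Below is a minimal NumPy sketch of the core computation as the abstract describes it: a cosine-similarity relationship matrix between local support and query features, mapped to a semantic attention map through the Moore-Penrose generalized inverse of the support annotation matrix. The shapes, the one-hot mask layout, and the orientation of the pseudo-inverse are assumptions on my part, not the authors' code.

```python
import numpy as np

def local_transform_attention(support_feats, query_feats, support_mask):
    """support_feats: (N_s, C) local support features,
    query_feats: (N_q, C) local query features,
    support_mask: (N_s, K) one-hot annotation matrix over K classes."""
    # Cosine similarity = dot product of L2-normalized features.
    s = support_feats / (np.linalg.norm(support_feats, axis=1, keepdims=True) + 1e-8)
    q = query_feats / (np.linalg.norm(query_feats, axis=1, keepdims=True) + 1e-8)
    R = q @ s.T                                # (N_q, N_s) relationship matrix
    # Non-parametric, class-agnostic linear mapping via the generalized inverse
    # of the annotation matrix: R ~ A @ M^T  =>  A = R @ pinv(M^T).
    return R @ np.linalg.pinv(support_mask.T)  # (N_q, K) attention-like map

rng = np.random.default_rng(0)
support = rng.normal(size=(64, 32))
query = rng.normal(size=(64, 32))
mask = np.eye(2)[rng.integers(0, 2, size=64)]  # hypothetical one-hot mask
print(local_transform_attention(support, query, mask).shape)  # (64, 2)
```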

Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit

Title Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit
Authors Belinda Tzen, Maxim Raginsky
Abstract In deep latent Gaussian models, the latent variable is generated by a time-inhomogeneous Markov chain, where at each time step we pass the current state through a parametric nonlinear map, such as a feedforward neural net, and add a small independent Gaussian perturbation. This work considers the diffusion limit of such models, where the number of layers tends to infinity, while the step size and the noise variance tend to zero. The limiting latent object is an Itô diffusion process that solves a stochastic differential equation (SDE) whose drift and diffusion coefficient are implemented by neural nets. We develop a variational inference framework for these *neural SDEs* via stochastic automatic differentiation in Wiener space, where the variational approximations to the posterior are obtained by Girsanov (mean-shift) transformation of the standard Wiener process and the computation of gradients is based on the theory of stochastic flows. This permits the use of black-box SDE solvers and automatic differentiation for end-to-end inference. Experimental results with synthetic data are provided.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09883v2
PDF https://arxiv.org/pdf/1905.09883v2.pdf
PWC https://paperswithcode.com/paper/neural-stochastic-differential-equations-deep
Repo
Framework
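
To make the diffusion-limit picture concrete, here is a hedged sketch that simulates dX = mu(X, t) dt + sigma(X, t) dW by Euler-Maruyama, with tiny neural nets as drift and diffusion. The net shapes, random weights, and step count are illustrative stand-ins, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny one-hidden-layer nets for drift and diffusion; weights are random
# stand-ins (a real model would train them via the variational objective).
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)
W_mu = 0.1 * rng.normal(size=(1, 16))
W_sig = 0.1 * rng.normal(size=(1, 16))

def drift(x, t):
    h = np.tanh(W1 @ np.array([x, t]) + b1)
    return (W_mu @ h).item()

def diffusion(x, t):
    h = np.tanh(W1 @ np.array([x, t]) + b1)
    return np.log1p(np.exp(W_sig @ h)).item()  # softplus keeps sigma > 0

def euler_maruyama(x0=0.0, T=1.0, n_steps=100):
    """Simulate dX = drift dt + diffusion dW on [0, T]."""
    dt = T / n_steps
    x, path = x0, [x0]
    for i in range(n_steps):
        t = i * dt
        x += drift(x, t) * dt + diffusion(x, t) * np.sqrt(dt) * rng.normal()
        path.append(x)
    return np.array(path)

print(euler_maruyama()[-1])  # one sample of the latent at time T
```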

CERTIFAI: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models

Title CERTIFAI: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models
Authors Shubham Sharma, Jette Henderson, Joydeep Ghosh
Abstract As artificial intelligence plays an increasingly important role in our society, there are ethical and moral obligations for both businesses and researchers to ensure that their machine learning models are designed, deployed, and maintained responsibly. These models need to be rigorously audited for fairness, robustness, transparency, and interpretability. A variety of methods have been developed that focus on these issues in isolation; however, managing these methods in conjunction with model development can be cumbersome and time-consuming. In this paper, we introduce a unified and model-agnostic approach to address these issues: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models (CERTIFAI). Unlike previous methods in this domain, CERTIFAI is a general tool that can be applied to any black-box model and any type of input data. Given a model and an input instance, CERTIFAI uses a custom genetic algorithm to generate counterfactuals: instances close to the input that change the prediction of the model. We demonstrate how these counterfactuals can be used to examine issues of robustness, interpretability, transparency, and fairness. Additionally, we introduce CERScore, the first black-box model robustness score that performs comparably to methods that have access to model internals.
Tasks
Published 2019-05-20
URL https://arxiv.org/abs/1905.07857v1
PDF https://arxiv.org/pdf/1905.07857v1.pdf
PWC https://paperswithcode.com/paper/certifai-counterfactual-explanations-for
Repo
Framework
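
The paper's search is a custom genetic algorithm; the sketch below is a generic GA doing the same job in spirit: evolve instances close to the input whose prediction flips. The stand-in model, fitness form, and mutation scale are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    """Stand-in black-box binary classifier."""
    return int(x.sum() > 0)

def counterfactual(x, pop_size=50, gens=100, sigma=0.5):
    y0 = model(x)
    pop = x + rng.normal(scale=sigma, size=(pop_size, x.size))
    for _ in range(gens):
        changed = np.array([model(p) != y0 for p in pop])
        dist = np.linalg.norm(pop - x, axis=1)
        # Only instances that flip the prediction are viable; prefer close ones.
        fitness = np.where(changed, -dist, -np.inf)
        parents = pop[np.argsort(fitness)[::-1][: pop_size // 2]]
        children = parents + rng.normal(scale=sigma, size=parents.shape)
        pop = np.vstack([parents, children])
    changed = np.array([model(p) != y0 for p in pop])
    if not changed.any():
        return None
    return pop[changed][np.argmin(np.linalg.norm(pop[changed] - x, axis=1))]

print(counterfactual(np.array([-1.0, -0.5])))  # nearest prediction-flipping point
```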

DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter

Title DeepDA: LSTM-based Deep Data Association Network for Multi-Targets Tracking in Clutter
Authors Huajun Liu, Hui Zhang, Christoph Mertz
Abstract In this paper, a Long Short-Term Memory (LSTM) based deep data association algorithm, named DeepDA, is proposed for multi-target tracking in clutter to deal with the underlying NP-hard combinatorial optimization problem. Unlike classical data association methods, which involve complex models and accurate prior knowledge of clutter density, filter covariance, or association gating, data-driven deep learning methods have been extensively researched for this topic. We first formally redefine the data association problem for multi-target tracking under an unknown number of targets, missed detections, and clutter, which goes beyond a one-to-one mapping between observations and targets. An LSTM network is then designed to learn the measurement-to-track association probability from noisy radar measurements and existing tracks. After supervised training with BPTT and the RMSprop optimizer, this LSTM-based deep neural network outputs the association probability directly. Experimental results on simulated data show strong performance in association ratio, target ID switches, and time consumption when tracking multiple targets, even when the targets cross each other in complicated clutter environments.
Tasks Combinatorial Optimization
Published 2019-07-16
URL https://arxiv.org/abs/1907.09915v1
PDF https://arxiv.org/pdf/1907.09915v1.pdf
PWC https://paperswithcode.com/paper/deepda-lstm-based-deep-data-association
Repo
Framework
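
A hedged PyTorch sketch of the idea: an LSTM consumes a sequence of noisy measurements together with the existing track states and emits per-measurement association probabilities over tracks, plus one slot for the clutter/no-track hypothesis. The dimensions, input concatenation, and softmax head are my assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DeepDASketch(nn.Module):
    def __init__(self, meas_dim=2, track_dim=2, hidden=64, max_tracks=5):
        super().__init__()
        self.lstm = nn.LSTM(meas_dim + track_dim * max_tracks, hidden,
                            batch_first=True)
        # One extra output per measurement for the clutter hypothesis.
        self.head = nn.Linear(hidden, max_tracks + 1)

    def forward(self, meas_seq, track_states):
        # meas_seq: (B, T, meas_dim); track_states: (B, track_dim * max_tracks)
        T = meas_seq.size(1)
        tracks = track_states.unsqueeze(1).expand(-1, T, -1)
        h, _ = self.lstm(torch.cat([meas_seq, tracks], dim=-1))
        return torch.softmax(self.head(h), dim=-1)  # (B, T, max_tracks + 1)

net = DeepDASketch()
probs = net(torch.randn(1, 10, 2), torch.randn(1, 10))
print(probs.shape)  # torch.Size([1, 10, 6])
```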

Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body

Title Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body
Authors Hongtao Wu, Deven Misra, Gregory S. Chirikjian
Abstract For robots to exhibit a high level of intelligence in the real world, they must be able to assess objects for which they have no prior knowledge. Therefore, it is crucial for robots to perceive object affordances by reasoning about physical interactions with the object. In this paper, we propose a novel method to provide robots with an imagination of object affordances using physical simulations. The class of chair is chosen here as an initial category of objects to illustrate a more general paradigm. In our method, the robot “imagines” the affordance of an arbitrarily oriented object as a chair by simulating a physical “sitting” interaction between an articulated human body and the object. This object affordance reasoning is used as a cue for object classification (chair vs. non-chair). Moreover, if an object is classified as a chair, the affordance reasoning can also predict the upright pose of the object which allows the sitting interaction to take place. We call this type of pose the functional pose. We demonstrate our method in chair classification on synthetic 3D CAD models. Although our method uses only 20 models for training, it outperforms appearance-based deep learning methods, which require a large amount of training data, when the upright orientation is not assumed to be known a priori. In addition, we showcase that the functional pose predictions of our method on both synthetic models and real objects scanned by a depth camera align well with human judgments.
Tasks Object Classification, Physical Simulations
Published 2019-09-17
URL https://arxiv.org/abs/1909.07572v1
PDF https://arxiv.org/pdf/1909.07572v1.pdf
PWC https://paperswithcode.com/paper/is-that-a-chair-imagining-affordances-using
Repo
Framework
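
The classification-by-imagination loop reduces to: score each candidate orientation by a simulated sitting interaction, classify as chair if the best score clears a threshold, and report the best orientation as the functional pose. The sketch below is purely schematic; `simulate_sitting` is a hypothetical stub standing in for the physics engine, and the threshold is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_sitting(model_path, orientation):
    """Stub for a physics rollout of an articulated human body on the object.
    A real implementation would return stability/contact metrics."""
    return float(rng.random())

def classify_chair(model_path, n_orientations=24, threshold=0.8):
    scores = [(simulate_sitting(model_path, o), o) for o in range(n_orientations)]
    best_score, best_orientation = max(scores)
    # The best-scoring orientation doubles as the predicted functional pose.
    return best_score > threshold, best_orientation

print(classify_chair("cad_model.obj"))  # hypothetical CAD model path
```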

Anatomically-Informed Multiple Linear Assignment Problems for White Matter Bundle Segmentation

Title Anatomically-Informed Multiple Linear Assignment Problems for White Matter Bundle Segmentation
Authors Giulia Bertò, Paolo Avesani, Franco Pestilli, Daniel Bullock, Bradley Caron, Emanuele Olivetti
Abstract Segmenting white matter bundles from human tractograms is a task of interest for several applications. Current methods for bundle segmentation consider either only prior knowledge about the relative anatomical position of a bundle, or only its geometrical properties. Our aim is to improve the results of segmentation by proposing a method that takes into account information about both the underlying anatomy and the geometry of bundles at the same time. To achieve this goal, we extend a state-of-the-art example-based method based on the Linear Assignment Problem (LAP) by including prior anatomical information within the optimization process. The proposed method shows a significant improvement with respect to the original method, in particular on small bundles.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.07077v1
PDF https://arxiv.org/pdf/1907.07077v1.pdf
PWC https://paperswithcode.com/paper/anatomically-informed-multiple-linear
Repo
Framework
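
A minimal sketch of the anatomically-informed LAP idea: assign example-bundle streamlines to target-tractogram streamlines by minimizing a cost that mixes a geometric distance with an anatomical prior term. The descriptors, the prior term, and the mixing weight below are illustrative assumptions, not the paper's exact cost.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_example, n_target, dim = 5, 20, 3

example = rng.normal(size=(n_example, dim))    # stand-in streamline descriptors
target = rng.normal(size=(n_target, dim))
anat_prior = rng.random((n_example, n_target)) # e.g. ROI/waypoint disagreement

geom = np.linalg.norm(example[:, None] - target[None, :], axis=-1)
alpha = 0.5                                    # anatomy/geometry trade-off
cost = (1 - alpha) * geom + alpha * anat_prior

rows, cols = linear_sum_assignment(cost)       # one target streamline per example
print(list(zip(rows, cols)))
```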

Pairwise coupling of convolutional neural networks for better explicability of classification systems

Title Pairwise coupling of convolutional neural networks for better explicability of classification systems
Authors Ondrej Šuch, Peter Tarábek, Katarína Bachratá, Andrea Tinajová
Abstract We examine several aspects of the explicability of a classification system built from neural networks. The first aspect is pairwise explicability, which is the ability to provide the most accurate prediction when the range of possibilities is narrowed to just two. Next we consider explicability in development, which means the ability to make incremental improvements in prediction accuracy based on observed deficiencies of the system. The inherent stochasticity of neural-network-based classifiers can be interpreted using likelihood randomness explicability. Finally, sureness explicability indicates the confidence of the classifying system to make any prediction at all. These concepts are examined in the framework of pairwise coupling, a non-trainable metamodel that originated during the development of support vector machines. Several methodologies are evaluated, of which the key one is shown to be the choice of the pairwise coupling method. We compare two methods: the established Wu-Lin-Weng method and the recently proposed Bayes covariant method. Our experiments indicate that the Wu-Lin-Weng method gives more weight to a single pairwise classifier, whereas the latter tries to balance information from the whole matrix of pairwise likelihoods. This translates into higher accuracy and better sureness predictions for the Bayes covariant method. Pairwise coupling methodology has its costs, especially in terms of the number of parameters (but not necessarily in terms of training costs). However, when additional explicability aspects beyond accuracy are desired in an application, pairwise coupling models are a promising alternative to the established methodology.
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.03645v1
PDF https://arxiv.org/pdf/1911.03645v1.pdf
PWC https://paperswithcode.com/paper/pairwise-coupling-of-convolutional-neural
Repo
Framework
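
For context, here is a hedged sketch of one pairwise-coupling step, my reading of the least-squares variant of Wu-Lin-Weng (2004): recover class probabilities p from a matrix r of pairwise estimates r[i, j] ~ P(class i | class i or j), by minimizing the squared coupling residuals subject to the probabilities summing to one.

```python
import numpy as np

def wlw_coupling(r):
    """Minimize sum_{i != j} (r[j,i] p_i - r[i,j] p_j)^2 s.t. sum(p) = 1."""
    k = r.shape[0]
    Q = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i == j:
                Q[i, i] = sum(r[s, i] ** 2 for s in range(k) if s != i)
            else:
                Q[i, j] = -r[j, i] * r[i, j]
    # Equality-constrained quadratic program solved via its KKT linear system.
    A = np.block([[Q, np.ones((k, 1))], [np.ones((1, k)), np.zeros((1, 1))]])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    return np.linalg.solve(A, b)[:k]

r = np.array([[0.0, 0.7, 0.6],
              [0.3, 0.0, 0.8],
              [0.4, 0.2, 0.0]])  # pairwise estimates, r[i, j] + r[j, i] = 1
print(wlw_coupling(r))
```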

Consistent Optimization for Single-Shot Object Detection

Title Consistent Optimization for Single-Shot Object Detection
Authors Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Jianbo Shi
Abstract We present consistent optimization for single-stage object detection. Previous single-stage object detectors usually rely on regular, densely sampled anchors to generate hypotheses for optimizing the model. Through an examination of the behavior of the detector, we observe that the misalignment between the optimization target and the inference configuration has hindered performance improvement. We propose to bridge this gap with consistent optimization, an extension of the traditional single-stage detector’s optimization strategy. Consistent optimization focuses on matching the training hypotheses and the inference quality by utilizing the refined anchors during training. To assess its effectiveness, we evaluate various design choices based on the state-of-the-art RetinaNet detector. We demonstrate that it is the consistent optimization, not the architecture design, that yields the performance boost. Consistent optimization is nearly cost-free, and achieves stable performance gains independent of model capacity or input scale. Specifically, consistent optimization improves RetinaNet from 39.1 AP to 40.1 AP on the COCO dataset without any bells or whistles, surpassing the accuracy of all existing state-of-the-art one-stage detectors with a ResNet-101 backbone. The code will be made available.
Tasks Object Detection
Published 2019-01-19
URL http://arxiv.org/abs/1901.06563v2
PDF http://arxiv.org/pdf/1901.06563v2.pdf
PWC https://paperswithcode.com/paper/consistent-optimization-for-single-shot
Repo
Framework
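
A schematic sketch of the core idea as the abstract states it: assign classification targets by the IoU of the *refined* anchors (the boxes the detector actually scores at inference) rather than the original anchors. The IoU helper, the (x1, y1, x2, y2) box format, and the additive "refinement" are simplifications of mine.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def assign_targets(anchors, regression_deltas, gt_box, pos_thresh=0.5):
    labels = []
    for anchor, delta in zip(anchors, regression_deltas):
        refined = anchor + delta  # simplified stand-in for box regression
        # Key point: the label comes from the refined box, not the raw anchor.
        labels.append(1 if iou(refined, gt_box) >= pos_thresh else 0)
    return labels

anchors = np.array([[0, 0, 10, 10], [20, 20, 30, 30]], float)
deltas = np.array([[1, 1, 1, 1], [-15, -15, -15, -15]], float)
print(assign_targets(anchors, deltas, np.array([2, 2, 12, 12], float)))  # [1, 0]
```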

Exploration-Enhanced POLITEX

Title Exploration-Enhanced POLITEX
Authors Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari, Gellert Weisz
Abstract We study algorithms for average-cost reinforcement learning problems with value function approximation. Our starting point is the recently proposed POLITEX algorithm, a version of policy iteration where the policy produced in each iteration is near-optimal in hindsight for the sum of all past value function estimates. POLITEX has sublinear regret guarantees in uniformly-mixing MDPs when the value estimation error can be controlled, which can be satisfied if all policies sufficiently explore the environment. Unfortunately, this assumption is often unrealistic. Motivated by the rapid growth of interest in developing policies that learn to explore their environment in the absence of rewards (also known as no-reward learning), we replace the assumption that all policies explore the environment with the weaker assumption that a single, sufficiently exploring policy is available beforehand. The main contribution of the paper is a modification of POLITEX that incorporates such an exploration policy in a way that allows us to obtain a regret guarantee similar to the previous one, but without requiring that all policies explore the environment. In addition to the novel theoretical guarantees, we demonstrate the benefits of our scheme on environments which are difficult to explore using simple schemes like dithering. While the solution we obtain may not achieve the best possible regret, it is the first result that shows how to control the regret in the presence of function approximation errors on problems where exploration is nontrivial. Our approach can also be seen as a way of reducing the problem of minimizing regret to learning a good exploration policy. We believe that modular approaches like ours can be highly beneficial in tackling harder control problems.
Tasks
Published 2019-08-27
URL https://arxiv.org/abs/1908.10479v1
PDF https://arxiv.org/pdf/1908.10479v1.pdf
PWC https://paperswithcode.com/paper/exploration-enhanced-politex
Repo
Framework
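
To make the base POLITEX update concrete, here is a minimal tabular sketch of one standard instantiation: the phase-k policy is a softmax over the *sum* of all past action-value estimates. Q-estimation is stubbed out with random values, and eta, the sizes, and the phase count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, eta, n_phases = 4, 3, 0.5, 10

def estimate_q(policy):
    """Stub: a real implementation runs the policy and fits Q from data."""
    return rng.normal(size=(n_states, n_actions))

q_sum = np.zeros((n_states, n_actions))
policy = np.full((n_states, n_actions), 1.0 / n_actions)
for _ in range(n_phases):
    q_sum += estimate_q(policy)                  # accumulate past estimates
    logits = eta * q_sum                         # softmax over the running sum
    policy = np.exp(logits - logits.max(axis=1, keepdims=True))
    policy /= policy.sum(axis=1, keepdims=True)
print(policy.round(3))
```

The paper's modification then mixes in a fixed, sufficiently exploring policy; the loop above is only the unmodified starting point.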

Mixed-Supervised Dual-Network for Medical Image Segmentation

Title Mixed-Supervised Dual-Network for Medical Image Segmentation
Authors Duo Wang, Ming Li, Nir Ben-Shlomo, C. Eduardo Corrales, Yu Cheng, Tao Zhang, Jagadeesan Jayender
Abstract Deep learning based medical image segmentation models usually require large datasets with high-quality dense segmentations to train, which are very time-consuming and expensive to prepare. One way to tackle this challenge is the mixed-supervised learning framework, in which only part of the data is densely annotated with segmentation labels and the rest is weakly labeled with bounding boxes. The model is trained jointly in a multi-task learning setting. In this paper, we propose Mixed-Supervised Dual-Network (MSDN), a novel architecture which consists of two separate networks for the detection and segmentation tasks respectively, and a series of connection modules between the layers of the two networks. These connection modules are used to transfer useful information from the auxiliary detection task to help the segmentation task. We propose to use a recent technique called “Squeeze and Excitation” in the connection module to boost the transfer. We conduct experiments on two medical image segmentation datasets. The proposed MSDN model outperforms multiple baselines.
Tasks Medical Image Segmentation, Multi-Task Learning, Semantic Segmentation
Published 2019-07-24
URL https://arxiv.org/abs/1907.10209v2
PDF https://arxiv.org/pdf/1907.10209v2.pdf
PWC https://paperswithcode.com/paper/mixed-supervised-dual-network-for-medical
Repo
Framework
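
A hedged PyTorch sketch of a Squeeze-and-Excitation connection module of the kind MSDN describes: detection features are channel-reweighted by an SE block and fused into the segmentation features at the same resolution. The channel sizes, reduction ratio, and additive fusion are my assumptions.

```python
import torch
import torch.nn as nn

class SEConnection(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)   # global channel statistics
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, seg_feat, det_feat):
        b, c, _, _ = det_feat.shape
        w = self.excite(self.squeeze(det_feat).view(b, c)).view(b, c, 1, 1)
        return seg_feat + det_feat * w  # transfer reweighted detection cues

m = SEConnection(64)
out = m(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```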

Application-level Studies of Cellular Neural Network-based Hardware Accelerators

Title Application-level Studies of Cellular Neural Network-based Hardware Accelerators
Authors Qiuwen Lou, Indranil Palit, Tang Li, Andras Horvath, Michael Niemier, X. Sharon Hu
Abstract As cost and performance benefits associated with Moore’s Law scaling slow, researchers are studying alternative architectures (e.g., based on analog and/or spiking circuits) and/or computational models (e.g., convolutional and recurrent neural networks) to perform application-level tasks faster, more energy efficiently, and/or more accurately. We investigate cellular neural network (CeNN)-based co-processors at the application-level for these metrics. While it is well-known that CeNNs can be well-suited for spatio-temporal information processing, few (if any) studies have quantified the energy/delay/accuracy of a CeNN-friendly algorithm and compared the CeNN-based approach to the best von Neumann algorithm at the application level. We present an evaluation framework for such studies. As a case study, a CeNN-friendly target-tracking algorithm was developed and mapped to an array architecture developed in conjunction with the algorithm. We compare the energy, delay, and accuracy of our architecture/algorithm (assuming all overheads) to the most accurate von Neumann algorithm (Struck). Von Neumann CPU data is measured on an Intel i5 chip. The CeNN approach is capable of matching the accuracy of Struck, and can offer approximately 1000x improvements in energy-delay product.
Tasks
Published 2019-02-28
URL https://arxiv.org/abs/1903.06649v2
PDF https://arxiv.org/pdf/1903.06649v2.pdf
PWC https://paperswithcode.com/paper/application-level-studies-of-cellular-neural
Repo
Framework
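
As background for readers unfamiliar with CeNNs, the sketch below simulates the standard cellular neural network dynamics the paper builds on, x' = -x + A*y + B*u + z with the usual piecewise-linear output y = 0.5(|x+1| - |x-1|), discretized with a forward-Euler step. The edge-detection-style templates are illustrative, not the paper's tracking templates.

```python
import numpy as np
from scipy.signal import convolve2d

def cenn_step(x, u, A, B, z, dt=0.1):
    y = np.clip(x, -1.0, 1.0)  # equals 0.5 * (|x + 1| - |x - 1|)
    dx = -x + convolve2d(y, A, mode="same") + convolve2d(u, B, mode="same") + z
    return x + dt * dx

u = np.random.default_rng(0).choice([-1.0, 1.0], size=(16, 16))  # input image
x = u.copy()
A = np.array([[0, 0, 0], [0, 2, 0], [0, 0, 0]], float)           # feedback
B = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], float)   # feedforward
for _ in range(50):
    x = cenn_step(x, u, A, B, z=-1.0)
print(np.clip(x, -1, 1))  # settled cell outputs
```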

Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

Title Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade
Authors Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath
Abstract For automatic speech translation (AST), end-to-end approaches are outperformed by cascaded models that transcribe with automatic speech recognition (ASR), then translate with machine translation (MT). A major cause of the performance gap is that, while existing AST corpora are small, massive datasets exist for both the ASR and MT subsystems. In this work, we evaluate several data augmentation and pretraining approaches for AST, by comparing all on the same datasets. Simple data augmentation by translating ASR transcripts proves most effective on the English–French augmented LibriSpeech dataset, closing the performance gap from 8.2 to 1.4 BLEU, compared to a very strong cascade that could directly utilize copious ASR and MT data. The same end-to-end approach plus fine-tuning closes the gap on the English–Romanian MuST-C dataset from 6.7 to 3.7 BLEU. In addition to these results, we present practical recommendations for augmentation and pretraining approaches. Finally, we decrease the performance gap to 0.01 BLEU using a Transformer-based architecture.
Tasks Data Augmentation, Machine Translation, Speech Recognition
Published 2019-09-14
URL https://arxiv.org/abs/1909.06515v2
PDF https://arxiv.org/pdf/1909.06515v2.pdf
PWC https://paperswithcode.com/paper/leveraging-out-of-task-data-for-end-to-end
Repo
Framework
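
The most effective trick the paper reports reduces to a simple pipeline: turn an ASR corpus of (audio, transcript) pairs into synthetic AST data by machine-translating the transcripts. The schematic sketch below uses a stub `translate`; in practice a trained English-French MT model goes there, and the data layout is my assumption.

```python
def translate(text_en):
    """Stub: replace with a trained English->French MT system."""
    return "<fr> " + text_en

def augment_asr_to_ast(asr_corpus):
    # (audio, transcript) pairs become (audio, translation) pairs.
    return [(audio, translate(transcript)) for audio, transcript in asr_corpus]

asr_corpus = [("utt1.wav", "hello world"), ("utt2.wav", "good morning")]
print(augment_asr_to_ast(asr_corpus))
```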

Machine Learning with Multi-Site Imaging Data: An Empirical Study on the Impact of Scanner Effects

Title Machine Learning with Multi-Site Imaging Data: An Empirical Study on the Impact of Scanner Effects
Authors Ben Glocker, Robert Robinson, Daniel C. Castro, Qi Dou, Ender Konukoglu
Abstract This is an empirical study to investigate the impact of scanner effects when using machine learning on multi-site neuroimaging data. We utilize structural T1-weighted brain MRI obtained from two different studies, Cam-CAN and UK Biobank. For the purpose of our investigation, we construct a dataset consisting of brain scans from 592 age- and sex-matched individuals, 296 subjects from each original study. Our results demonstrate that even after careful pre-processing with state-of-the-art neuroimaging pipelines a classifier can easily distinguish between the origin of the data with very high accuracy. Our analysis on the example application of sex classification suggests that current approaches to harmonize data are unable to remove scanner-specific bias leading to overly optimistic performance estimates and poor generalization. We conclude that multi-site data harmonization remains an open challenge and particular care needs to be taken when using such data with advanced machine learning methods for predictive modelling.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04597v1
PDF https://arxiv.org/pdf/1910.04597v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-with-multi-site-imaging-data
Repo
Framework
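
The paper's probe is easy to reproduce in spirit: if a classifier can predict which site or scanner a pre-processed scan came from, scanner bias survives harmonization. A minimal scikit-learn sketch, with random features plus a small site-specific offset standing in for real image-derived features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_per_site, n_features = 296, 50  # 296 matched subjects per site, as in the study
# Simulate a small site-specific shift on otherwise matched features.
site_a = rng.normal(size=(n_per_site, n_features))
site_b = rng.normal(size=(n_per_site, n_features)) + 0.3
X = np.vstack([site_a, site_b])
y = np.array([0] * n_per_site + [1] * n_per_site)

acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"site-classification accuracy: {acc:.2f}")  # well above chance
```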

Band-Limited Gaussian Processes: The Sinc Kernel

Title Band-Limited Gaussian Processes: The Sinc Kernel
Authors Felipe Tobar
Abstract We propose a novel class of Gaussian processes (GPs) whose spectra have compact support, meaning that their sample trajectories are almost-surely band-limited. As a complement to the growing literature on spectral design of covariance kernels, the core of our proposal is to model power spectral densities through a rectangular function, which results in a kernel based on the sinc function with straightforward extensions to non-centred (around zero frequency) and frequency-varying cases. In addition to its use in regression, the relationship between the sinc kernel and the classic theory is illuminated; in particular, the Shannon-Nyquist theorem is interpreted as posterior reconstruction under the proposed kernel. Additionally, we show that the sinc kernel is instrumental in two fundamental signal processing applications: first, in stereo amplitude modulation, where the non-centred sinc kernel arises naturally; second, in band-pass filtering, where the proposed kernel allows for a Bayesian treatment that is robust to observation noise and missing data. The developed theory is complemented with illustrative graphic examples and validated experimentally using real-world data.
Tasks Gaussian Processes
Published 2019-09-16
URL https://arxiv.org/abs/1909.07279v1
PDF https://arxiv.org/pdf/1909.07279v1.pdf
PWC https://paperswithcode.com/paper/band-limited-gaussian-processes-the-sinc
Repo
Framework
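
A hedged sketch of the (non-centred) sinc kernel: a rectangular power spectral density of width `width` centred at `freq` gives, by the Wiener-Khinchin theorem, a sinc-times-cosine covariance. The exact parameterization is my assumption; the jitter term is only for numerical stability when sampling.

```python
import numpy as np

def sinc_kernel(t1, t2, sigma2=1.0, width=1.0, freq=0.0):
    """Covariance of a GP whose PSD is a rectangle of width `width` at +/- freq.
    np.sinc is the normalized sinc, sin(pi x) / (pi x)."""
    tau = t1[:, None] - t2[None, :]
    return sigma2 * np.sinc(width * tau) * np.cos(2 * np.pi * freq * tau)

t = np.linspace(0, 10, 200)
K = sinc_kernel(t, t, width=0.5, freq=1.0) + 1e-8 * np.eye(t.size)
sample = np.random.default_rng(0).multivariate_normal(np.zeros(t.size), K)
print(sample[:5])  # a draw from an (approximately) band-limited GP
```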

Clustered Reinforcement Learning

Title Clustered Reinforcement Learning
Authors Xiao Ma, Shen-Yi Zhao, Wu-Jun Li
Abstract Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards. During exploration, the agent tries to discover novel areas or high-reward (quality) areas. In most existing methods, the novelty and quality in the neighboring area of the current state are not well utilized to guide the exploration of the agent. To tackle this problem, we propose a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in RL. CRL adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the agent. Experiments on a continuous control task and several *Atari 2600* games show that CRL can outperform other state-of-the-art methods and achieve the best performance in most cases.
Tasks Atari Games, Continuous Control, Efficient Exploration
Published 2019-06-06
URL https://arxiv.org/abs/1906.02457v1
PDF https://arxiv.org/pdf/1906.02457v1.pdf
PWC https://paperswithcode.com/paper/clustered-reinforcement-learning
Repo
Framework
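
A minimal sketch of the CRL idea as described: cluster the visited states, then give a bonus that mixes novelty (inverse cluster visit count) with quality (mean reward seen in the cluster). The count-based novelty form, the weights, and the use of k-means are my assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
states = rng.normal(size=(500, 4))  # states collected so far (stand-in data)
rewards = rng.random(500)           # rewards observed at those states

km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(states)
counts = np.bincount(km.labels_, minlength=8)
mean_reward = np.array([rewards[km.labels_ == c].mean() for c in range(8)])

def bonus(state, beta=1.0, gamma=1.0):
    """Novelty + quality bonus for the cluster containing `state`."""
    c = km.predict(state.reshape(1, -1))[0]
    return beta / np.sqrt(counts[c]) + gamma * mean_reward[c]

print(bonus(states[0]))
```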