January 29, 2020

Paper Group ANR 735

Rethinking Person Re-Identification with Confidence


Title	Rethinking Person Re-Identification with Confidence
Authors	George Adaimi, Sven Kreiss, Alexandre Alahi
Abstract	A common challenge in person re-identification systems is to differentiate people with very similar appearances. The current learning frameworks based on cross-entropy minimization are not suited for this challenge. To tackle this issue, we propose to modify the cross-entropy loss and model confidence in the representation learning framework using three methods: label smoothing, confidence penalty, and deep variational information bottleneck. A key property of our approach is the fact that we do not make use of any hand-crafted human characteristics but rather focus our attention on the learning supervision. Although methods modeling confidence did not show significant improvements on other computer vision tasks such as object classification, we are able to show their notable effect on the task of re-identifying people outperforming state-of-the-art methods on 3 publicly available datasets. Our analysis and experiments not only offer insights into the problems that person re-id suffers from, but also provide a simple and straightforward recipe to tackle this issue.
Tasks	Object Classification, Person Re-Identification, Representation Learning
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04692v1
PDF	https://arxiv.org/pdf/1906.04692v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-person-re-identification-with
Repo
Framework
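
Two of the confidence-modeling methods named in the abstract above, label smoothing and the confidence penalty, can be written as small changes to the cross-entropy loss. The following is a minimal sketch using the standard textbook forms of both terms, not the authors' implementation; the function names and the `smoothing` and `penalty` values are illustrative assumptions, and the variational information bottleneck variant is not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    return [z - m - log_total for z in logits]


def cross_entropy(logits, label, smoothing=0.0, penalty=0.0):
    """Cross-entropy with optional label smoothing and confidence penalty.

    smoothing: mass eps spread uniformly over all K classes (assumed value).
    penalty:   weight beta on the entropy of the prediction (assumed value).
    """
    log_probs = log_softmax(logits)
    probs = [math.exp(lp) for lp in log_probs]
    k = len(logits)
    # Smoothed target: (1 - eps) on the true class plus eps / K on every class.
    target = [smoothing / k + (1.0 - smoothing) * (i == label) for i in range(k)]
    ce = -sum(t * lp for t, lp in zip(target, log_probs))
    # Entropy of the prediction; subtracting it penalises over-confident outputs.
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))
    return ce - penalty * entropy


if __name__ == "__main__":
    logits = [2.0, 0.0, -1.0]
    print(cross_entropy(logits, label=0))
    print(cross_entropy(logits, label=0, smoothing=0.1))
    print(cross_entropy(logits, label=0, penalty=0.1))
```

With both arguments left at zero the function reduces to ordinary cross-entropy, which makes the two modifications easy to compare in isolation.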

Scanner Invariant Representations for Diffusion MRI Harmonization


Title	Scanner Invariant Representations for Diffusion MRI Harmonization
Authors	Daniel Moyer, Greg Ver Steeg, Chantal M. W. Tax, Paul M. Thompson
Abstract	Purpose: In the present work we describe the correction of diffusion-weighted MRI for site and scanner biases using a novel method based on invariant representation. Theory and Methods: Pooled imaging data from multiple sources are subject to variation between the sources. Correcting for these biases has become very important as imaging studies increase in size and multi-site cases become more common. We propose learning an intermediate representation invariant to site/protocol variables, a technique adapted from information theory-based algorithmic fairness; by leveraging the data processing inequality, such a representation can then be used to create an image reconstruction that is uninformative of its original source, yet still faithful to underlying structures. To implement this, we use a deep learning method based on variational auto-encoders (VAE) to construct scanner invariant encodings of the imaging data. Results: To evaluate our method, we use training data from the 2018 MICCAI Computational Diffusion MRI (CDMRI) Challenge Harmonization dataset. Our proposed method shows improvements on independent test data relative to a recently published baseline method on each subtask, mapping data from three different scanning contexts to and from one separate target scanning context. Conclusion: As imaging studies continue to grow, the use of pooled multi-site imaging will similarly increase. Invariant representation presents a strong candidate for the harmonization of these data.
Tasks	Image Reconstruction
Published	2019-04-10
URL	https://arxiv.org/abs/1904.05375v2
PDF	https://arxiv.org/pdf/1904.05375v2.pdf
PWC	https://paperswithcode.com/paper/scanner-invariant-representations-for
Repo
Framework

Richer priors for infinitely wide multi-layer perceptrons


Title	Richer priors for infinitely wide multi-layer perceptrons
Authors	Russell Tsuchida, Fred Roosta, Marcus Gallagher
Abstract	It is well-known that the distribution over functions induced through a zero-mean iid prior distribution over the parameters of a multi-layer perceptron (MLP) converges to a Gaussian process (GP), under mild conditions. We extend this result firstly to independent priors with general zero or non-zero means, and secondly to a family of partially exchangeable priors which generalise iid priors. We discuss how the second prior arises naturally when considering an equivalence class of functions in an MLP and through training processes such as stochastic gradient descent. The model resulting from partially exchangeable priors is a GP, with an additional level of inference in the sense that the prior and posterior predictive distributions require marginalisation over hyperparameters. We derive the kernels of the limiting GP in deep MLPs, and show empirically that these kernels avoid certain pathologies present in previously studied priors. We empirically evaluate our claims of convergence by measuring the maximum mean discrepancy between finite width models and limiting models. We compare the performance of our new limiting model to some previously discussed models on synthetic regression problems. We observe increasing ill-conditioning of the marginal likelihood and hyper-posterior as the depth of the model increases, drawing parallels with finite width networks which require notoriously involved optimisation tricks.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1911.12927v1
PDF	https://arxiv.org/pdf/1911.12927v1.pdf
PWC	https://paperswithcode.com/paper/richer-priors-for-infinitely-wide-multi-layer
Repo
Framework
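
The abstract above measures convergence with the maximum mean discrepancy (MMD) between finite-width and limiting models. As a rough illustration of that statistic only, here is a sketch of the biased squared-MMD estimate between two scalar samples with a Gaussian kernel. The kernel choice, the bandwidth, and the function names are assumptions made for the example; the paper's own estimator and kernel may differ.

```python
import math
import random


def rbf(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(us, vs):
        total = sum(rbf(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for outputs of a finite-width model and of its limiting model.
    same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
    same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
    shifted = [rng.gauss(1.5, 1.0) for _ in range(200)]
    print(mmd_squared(same_a, same_b))   # small when distributions match
    print(mmd_squared(same_a, shifted))  # larger when they differ
```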

Augmented Replay Memory in Reinforcement Learning With Continuous Control


Title	Augmented Replay Memory in Reinforcement Learning With Continuous Control
Authors	Mirza Ramicic, Andrea Bonarini
Abstract	Online reinforcement learning agents are currently able to process an increasing amount of data by converting it into higher-order value functions. This expansion of the information collected from the environment increases the agent’s state space, enabling it to scale up to more complex problems, but also increases the risk of forgetting by learning on redundant or conflicting data. To improve the approximation of a large amount of data, a random mini-batch of the past experiences stored in the replay memory buffer is often replayed at each learning step. The proposed work takes inspiration from a biological mechanism which acts as a protective layer of the human brain’s higher cognitive functions: active memory consolidation mitigates the effect of forgetting previous memories by dynamically processing the new ones. Similar dynamics are implemented by the proposed augmented memory replay (AMR), which is capable of optimizing the replay of experiences from the agent’s memory structure by altering or augmenting their relevance. Experimental results show that an evolved AMR augmentation function capable of increasing the significance of specific memories is able to further increase the stability and convergence speed of learning algorithms dealing with the complexity of continuous action domains.
Tasks	Continuous Control
Published	2019-12-29
URL	https://arxiv.org/abs/1912.12719v1
PDF	https://arxiv.org/pdf/1912.12719v1.pdf
PWC	https://paperswithcode.com/paper/augmented-replay-memory-in-reinforcement
Repo
Framework
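
The baseline mechanism the abstract above builds on, replaying a random mini-batch from a replay memory buffer at each learning step, can be sketched as follows. This shows only plain uniform replay; the AMR augmentation function itself is not described in enough detail here to reproduce. The class name, the transition layout, and the capacity are hypothetical choices for illustration.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        """Store one transition (layout is an assumption of this sketch)."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a uniformly sampled mini-batch without replacement."""
        size = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    memory = ReplayMemory(capacity=1000, seed=0)
    for step in range(50):
        memory.push(step, 0.1 * step, 1.0, step + 1, False)
    batch = memory.sample(8)
    print(len(memory), len(batch))
```

A relevance-based scheme would replace the uniform draw in `sample` or alter the stored rewards, which is the part the paper's augmentation function addresses.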

Event-scheduling algorithms with Kalikow decomposition for simulating potentially infinite neuronal networks


Title	Event-scheduling algorithms with Kalikow decomposition for simulating potentially infinite neuronal networks
Authors	Tien Cuong Phi, Alexandre Muzy, Patricia Reynaud-Bouret
Abstract	Event-scheduling algorithms can compute in continuous time the next occurrence of points (as events) of a counting process based on their current conditional intensity. In particular, event-scheduling algorithms can be adapted to simulate the activity of finite neuronal networks. These algorithms are based on Ogata’s thinning strategy (Ogata, 1981), which always needs to simulate the whole network to access the behaviour of one particular neuron of the network. On the other hand, for discrete time models, theoretical algorithms based on Kalikow decomposition can pick at random influencing neurons and perform a perfect simulation (meaning without approximations) of the behaviour of one given neuron embedded in an infinite network, at every time step. These algorithms are currently not computationally tractable in continuous time. To solve this problem, an event-scheduling algorithm with Kalikow decomposition is proposed here for the sequential simulation of point process neuronal models satisfying this decomposition. This new algorithm is applied to infinite neuronal networks whose finite time simulation is a prerequisite to realistic brain modeling.
Tasks	Point Processes
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10576v1
PDF	https://arxiv.org/pdf/1910.10576v1.pdf
PWC	https://paperswithcode.com/paper/event-scheduling-algorithms-with-kalikow
Repo
Framework
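
The thinning strategy mentioned in the abstract above generates candidate points from a dominating rate and accepts each one with probability equal to the ratio of the true conditional intensity to that rate. Below is a minimal sketch for a single self-exciting unit with an exponential kernel. The intensity model, the parameter values, and the function names are assumptions chosen for illustration; the Kalikow-based algorithm proposed in the paper is not shown.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given past events (exponential kernel)."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s <= t)


def simulate_by_thinning(mu, alpha, beta, horizon, seed=None):
    """Simulate event times on (0, horizon] by thinning; requires mu > 0."""
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        # The intensity only decays between events, so its current value
        # is a valid upper bound until the next accepted point.
        bound = hawkes_intensity(t, events, mu, alpha, beta)
        t += rng.expovariate(bound)
        if t > horizon:
            return events
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)


if __name__ == "__main__":
    spikes = simulate_by_thinning(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0, seed=1)
    print(len(spikes), spikes[:5])
```

Every acceptance test evaluates the intensity against the full event history, which reflects the cost the abstract attributes to thinning-based simulation of a whole network.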

Analyzing Recurrent Neural Network by Probabilistic Abstraction


Title	Analyzing Recurrent Neural Network by Probabilistic Abstraction
Authors	Guoliang Dong, Jingyi Wang, Jun Sun, Yang Zhang, Xinyu Wang, Ting Dai, Jin Song Dong
Abstract	Neural networks are becoming the dominant approach for solving many real-world problems like computer vision and natural language processing due to their exceptional performance as end-to-end solutions. However, deep learning models are complex and work in a black-box manner in general. This hinders humans from understanding how such systems make decisions or analyzing them using traditional software analysis techniques like testing and verification. To solve this problem and bridge the gap, several recent approaches have proposed to extract simple models in the form of finite-state automata or weighted automata for human understanding and reasoning. The results are however not encouraging, for reasons such as low accuracy and scalability issues. In this work, we propose to extract models in the form of probabilistic automata from recurrent neural network models instead. Our work distinguishes itself from existing approaches in two important ways. One is that we extract probabilistic models to compensate for the limited expressiveness of simple models (compared to that of deep neural networks). This is inspired by the observation that human reasoning is often ‘probabilistic’. The other is that we identify the right level of abstraction based on hierarchical clustering so that the models are extracted in a task-specific way. We conducted experiments on several real-world datasets using state-of-the-art RNN architectures including GRU and LSTM. The results show that our approach improves existing model extraction approaches significantly and can produce simple models which accurately mimic the original models.
Tasks
Published	2019-09-22
URL	https://arxiv.org/abs/1909.10023v1
PDF	https://arxiv.org/pdf/1909.10023v1.pdf
PWC	https://paperswithcode.com/paper/190910023
Repo
Framework

HireNet: a Hierarchical Attention Model for the Automatic Analysis of Asynchronous Video Job Interviews


Title	HireNet: a Hierarchical Attention Model for the Automatic Analysis of Asynchronous Video Job Interviews
Authors	Léo Hemamou, Ghazi Felhi, Vincent Vandenbussche, Jean-Claude Martin, Chloé Clavel
Abstract	New technologies drastically change recruitment techniques. Some research projects aim at designing interactive systems that help candidates practice job interviews. Other studies aim at the automatic detection of social signals (e.g. smile, turn of speech, etc.) in videos of job interviews. These studies are limited with respect to the number of interviews they process, but also by the fact that they only analyze simulated job interviews (e.g. students pretending to apply for a fake position). Asynchronous video interviewing tools have become mature products on the human resources market, and thus, a popular step in the recruitment process. As part of a project to help recruiters, we collected a corpus of more than 7000 candidates having asynchronous video job interviews for real positions and recording videos of themselves answering a set of questions. We propose a new hierarchical attention model called HireNet that aims at predicting the hirability of the candidates as evaluated by recruiters. In HireNet, an interview is considered as a sequence of questions and answers containing salient social signals. Two contextual sources of information are modeled in HireNet: the words contained in the question and in the job position. Our model achieves better F1-scores than previous approaches for each modality (verbal content, audio and video). Results from early and late multimodal fusion suggest that more sophisticated fusion schemes are needed to improve on the monomodal results. Finally, some examples of moments captured by the attention mechanisms suggest our model could potentially be used to help find key moments in an asynchronous job interview.
Tasks
Published	2019-07-25
URL	https://arxiv.org/abs/1907.11062v1
PDF	https://arxiv.org/pdf/1907.11062v1.pdf
PWC	https://paperswithcode.com/paper/hirenet-a-hierarchical-attention-model-for
Repo
Framework

Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods


Title	Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods
Authors	Franca Hoffmann, Bamdad Hosseini, Zhi Ren, Andrew M. Stuart
Abstract	Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
Tasks
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07658v2
PDF	https://arxiv.org/pdf/1906.07658v2.pdf
PWC	https://paperswithcode.com/paper/consistency-of-semi-supervised-learning
Repo
Framework

Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling


Title	Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling
Authors	Siyuan Feng, Tan Lee, Zhiyuan Peng
Abstract	This study addresses the problem of unsupervised subword unit discovery from untranscribed speech. It forms the basis of the ultimate goal of ZeroSpeech 2019, building text-to-speech systems without text labels. In this work, unit discovery is formulated as a pipeline of phonetically discriminative feature learning and unit inference. One major difficulty in robust unsupervised feature learning is dealing with speaker variation. Here the robustness towards speaker variation is achieved by applying adversarial training and FHVAE based disentangled speech representation learning. A comparison of the two approaches as well as their combination is studied in a DNN-bottleneck feature (DNN-BNF) architecture. Experiments are conducted on ZeroSpeech 2019 and 2017. Experimental results on ZeroSpeech 2017 show that both approaches are effective while the latter is more prominent, and that their combination brings further marginal improvement in the across-speaker condition. Results on ZeroSpeech 2019 show that in the ABX discriminability task, our approaches significantly outperform the official baseline, and are competitive with or even outperform the official topline. The proposed unit sequence smoothing algorithm improves synthesis quality, at the cost of a slight decrease in ABX discriminability.
Tasks	Representation Learning
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07234v3
PDF	https://arxiv.org/pdf/1906.07234v3.pdf
PWC	https://paperswithcode.com/paper/combining-adversarial-training-and
Repo
Framework

EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment


Title	EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment
Authors	Mohamed Elgharib, Mallikarjun BR, Ayush Tewari, Hyeongwoo Kim, Wentao Liu, Hans-Peter Seidel, Christian Theobalt
Abstract	Face performance capture and reenactment techniques use multiple cameras and sensors, positioned at a distance from the face or mounted on heavy wearable devices. This limits their applications in mobile and outdoor environments. We present EgoFace, a radically new lightweight setup for face performance capture and front-view videorealistic reenactment using a single egocentric RGB camera. Our lightweight setup allows operations in uncontrolled environments, and lends itself to telepresence applications such as video-conferencing from dynamic environments. The input image is projected into a low dimensional latent space of the facial expression parameters. Through careful adversarial training of the parameter-space synthetic rendering, a videorealistic animation is produced. Our problem is challenging as the human visual system is sensitive to the smallest face irregularities that could occur in the final results. This sensitivity is even stronger for video results. Our solution is trained in a pre-processing stage, in a supervised manner without manual annotations. EgoFace captures a wide variety of facial expressions, including mouth movements and asymmetrical expressions. It works under varying illumination, backgrounds and movements, handles people from different ethnicities, and can operate in real time.
Tasks
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10822v1
PDF	https://arxiv.org/pdf/1905.10822v1.pdf
PWC	https://paperswithcode.com/paper/egoface-egocentric-face-performance-capture
Repo
Framework

Learning to Remember from a Multi-Task Teacher


Title	Learning to Remember from a Multi-Task Teacher
Authors	Yuwen Xiong, Mengye Ren, Raquel Urtasun
Abstract	Recent studies on catastrophic forgetting during sequential learning typically focus on fixing the accuracy of the predictions for a previously learned task. In this paper we argue that the outputs of neural networks are subject to rapid changes when learning a new data distribution, and networks that appear to “forget” everything still contain useful representation towards previous tasks. Instead of enforcing the output accuracy to stay the same, we propose to reduce the effect of catastrophic forgetting on the representation level, as the output layer can be quickly recovered later with a small number of examples. Towards this goal, we propose an experimental setup that measures the amount of representational forgetting, and develop a novel meta-learning algorithm to overcome this issue. The proposed meta-learner produces weight updates of a sequential learning network, mimicking a multi-task teacher network’s representation. We show that our meta-learner can improve its learned representations on new tasks, while maintaining a good representation for old tasks.
Tasks	Meta-Learning
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04650v2
PDF	https://arxiv.org/pdf/1910.04650v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-remember-from-a-multi-task-1
Repo
Framework

Reliable Estimation of Individual Treatment Effect with Causal Information Bottleneck


Title	Reliable Estimation of Individual Treatment Effect with Causal Information Bottleneck
Authors	Sungyub Kim, Yongsu Baek, Sung Ju Hwang, Eunho Yang
Abstract	Estimating individual level treatment effects (ITE) from observational data is a challenging and important area in causal machine learning and is commonly considered in diverse mission-critical applications. In this paper, we propose an information theoretic approach in order to find more reliable representations for estimating ITE. We leverage the Information Bottleneck (IB) principle, which addresses the trade-off between conciseness and predictive power of representation. With the introduction of an extended graphical model for causal information bottleneck, we encourage the independence between the learned representation and the treatment type. We also introduce an additional form of a regularizer from the perspective of understanding ITE in the semi-supervised learning framework to ensure more reliable representations. Experimental results show that our model achieves the state-of-the-art results and exhibits more reliable prediction performances with uncertainty information on real-world datasets.
Tasks
Published	2019-06-07
URL	https://arxiv.org/abs/1906.03118v1
PDF	https://arxiv.org/pdf/1906.03118v1.pdf
PWC	https://paperswithcode.com/paper/reliable-estimation-of-individual-treatment
Repo
Framework

EasyLabel: A Semi-Automatic Pixel-wise Object Annotation Tool for Creating Robotic RGB-D Datasets


Title	EasyLabel: A Semi-Automatic Pixel-wise Object Annotation Tool for Creating Robotic RGB-D Datasets
Authors	Markus Suchi, Timothy Patten, David Fischinger, Markus Vincze
Abstract	Developing robot perception systems for recognizing objects in the real-world requires computer vision algorithms to be carefully scrutinized with respect to the expected operating domain. This demands large quantities of ground truth data to rigorously evaluate the performance of algorithms. This paper presents the EasyLabel tool for easily acquiring high quality ground truth annotation of objects at the pixel-level in densely cluttered scenes. In a semi-automatic process, complex scenes are incrementally built and EasyLabel exploits depth change to extract precise object masks at each step. We use this tool to generate the Object Cluttered Indoor Dataset (OCID) that captures diverse settings of objects, background, context, sensor to scene distance, viewpoint angle and lighting conditions. OCID is used to perform a systematic comparison of existing object segmentation methods. The baseline comparison supports the need for pixel- and object-wise annotation to progress robot vision towards realistic applications. This insight reveals the usefulness of EasyLabel and OCID to better understand the challenges that robots face in the real-world.
Tasks	Semantic Segmentation
Published	2019-02-05
URL	http://arxiv.org/abs/1902.01626v2
PDF	http://arxiv.org/pdf/1902.01626v2.pdf
PWC	https://paperswithcode.com/paper/easylabel-a-semi-automatic-pixel-wise-object
Repo
Framework

Separation of Chaotic Signals by Reservoir Computing


Title	Separation of Chaotic Signals by Reservoir Computing
Authors	Sanjukta Krishnagopal, Michelle Girvan, Edward Ott, Brian Hunt
Abstract	We demonstrate the utility of machine learning in the separation of superimposed chaotic signals using a technique called Reservoir Computing. We assume no knowledge of the dynamical equations that produce the signals, and require only training data consisting of finite time samples of the component signals. We test our method on signals that are formed as linear combinations of signals from two Lorenz systems with different parameters. Comparing our nonlinear method with the optimal linear solution to the separation problem, the Wiener filter, we find that our method significantly outperforms the Wiener filter in all the scenarios we study. Furthermore, this difference is particularly striking when the component signals have similar frequency spectra. Indeed, our method works well when the component frequency spectra are indistinguishable - a case where a Wiener filter performs essentially no separation.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.10080v2
PDF	https://arxiv.org/pdf/1910.10080v2.pdf
PWC	https://paperswithcode.com/paper/separation-of-chaotic-signals-by-reservoir
Repo
Framework

An Optimized and Energy-Efficient Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks


Title	An Optimized and Energy-Efficient Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks
Authors	Julia El Zini, Yara Rizk, Mariette Awad
Abstract	Recurrent neural networks (RNN) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation through time (BPTT) which is prohibitively expensive, especially when the length of the time dependencies and the number of hidden neurons increase. To reduce the training time, extreme learning machines (ELMs) have been recently applied to RNN training, reaching a 99% speedup on some applications. Due to its non-iterative nature, ELM training, when parallelized, has the potential to reach higher speedups than BPTT. In this work, we present an optimized parallel RNN training algorithm based on ELM that takes advantage of the GPU shared memory and of parallel QR factorization algorithms to efficiently reach optimal solutions. The theoretical analysis of the proposed algorithm is presented on six RNN architectures, including LSTM and GRU, and its performance is empirically tested on ten time-series prediction applications. The proposed algorithm is shown to reach up to 845 times speedup over its sequential counterpart and to require up to 20x less time to train than parallel BPTT.
Tasks	Decision Making, Time Series, Time Series Prediction
Published	2019-11-26
URL	https://arxiv.org/abs/1911.13252v1
PDF	https://arxiv.org/pdf/1911.13252v1.pdf
PWC	https://paperswithcode.com/paper/an-optimized-and-energy-efficient-parallel
Repo
Framework