Paper Group ANR 1018
Active Testing: An Efficient and Robust Framework for Estimating Accuracy. Mix&Match - Agent Curricula for Reinforcement Learning. Inducing and Embedding Senses with Scaled Gumbel Softmax. Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning. Regularizing Deep Hashing Networks Using GA …
Active Testing: An Efficient and Robust Framework for Estimating Accuracy
Title | Active Testing: An Efficient and Robust Framework for Estimating Accuracy |
Authors | Phuc Nguyen, Deva Ramanan, Charless Fowlkes |
Abstract | Much recent work on visual recognition aims to scale up learning to massive, noisily-annotated datasets. We address the problem of scaling up the evaluation of such models to large-scale datasets with noisy labels. Current protocols for doing so require a human user to either vet (re-annotate) a small fraction of the test set and ignore the rest, or else correct errors in annotation as they are found through manual inspection of results. In this work, we re-formulate the problem as one of active testing, and examine strategies for efficiently querying a user so as to obtain an accurate performance estimate with minimal vetting. We demonstrate the effectiveness of our proposed active testing framework on estimating two performance metrics, Precision@K and mean Average Precision, for two popular computer vision tasks, multi-label classification and instance segmentation. We further show that our approach is able to save significant human annotation effort and is more robust than alternative evaluation protocols. |
Tasks | Instance Segmentation, Multi-Label Classification, Semantic Segmentation |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00493v1 |
PDF | http://arxiv.org/pdf/1807.00493v1.pdf |
PWC | https://paperswithcode.com/paper/active-testing-an-efficient-and-robust |
Repo | |
Framework | |
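To make the vetting idea concrete, here is a minimal Python sketch of a vet-a-subset estimator for Precision@K: a small random subset of the top-K is re-annotated, the agreement rate between vetted and noisy labels is measured on that subset, and the unvetted remainder is corrected accordingly. All names are illustrative, and this is a simplification of the paper's framework, which actively chooses which examples to vet rather than sampling at random.

```python
import random

def estimate_precision_at_k(scores, noisy_labels, vet_oracle, k=100, budget=20, seed=0):
    """Estimate Precision@K by vetting only `budget` items of the top-K.

    Vetted items contribute their trusted labels directly; unvetted items
    contribute their noisy labels, corrected by the label-agreement rate
    observed on the vetted subset.
    """
    rng = random.Random(seed)
    top_k = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    vetted = rng.sample(top_k, min(budget, k))

    true_pos, agree = 0, 0
    for i in vetted:
        y = vet_oracle(i)                     # trusted, human-vetted label
        true_pos += y
        agree += int(y == noisy_labels[i])
    p_agree = agree / max(len(vetted), 1)     # P(noisy label is correct)

    unvetted = [i for i in top_k if i not in set(vetted)]
    noisy_pos = sum(noisy_labels[i] for i in unvetted)
    # expected true positives among unvetted items: noisy positives that
    # are actually right plus noisy negatives that are actually wrong
    est_pos = noisy_pos * p_agree + (len(unvetted) - noisy_pos) * (1 - p_agree)
    return (true_pos + est_pos) / k

noisy = [1, 1, 0, 1, 0] * 40                  # toy noisy labels
gen = random.Random(1)
scores = [gen.random() for _ in noisy]
oracle = lambda i: noisy[i] if i % 7 else 1 - noisy[i]   # 1-in-7 labels flipped
print(estimate_precision_at_k(scores, noisy, oracle, k=50, budget=10))
```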
Mix&Match - Agent Curricula for Reinforcement Learning
Title | Mix&Match - Agent Curricula for Reinforcement Learning |
Authors | Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Simon Osindero, Nicolas Heess, Razvan Pascanu |
Abstract | We introduce Mix&Match (M&M) - a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents. Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping from solutions found by simpler agents. In contradistinction to typical curriculum learning approaches, we do not gradually modify the tasks or environments presented, but instead use a process to gradually alter how the policy is represented internally. We show the broad applicability of our method by demonstrating significant performance gains in three different experimental setups: (1) We train an agent able to control more than 700 actions in a challenging 3D first-person task; using our method to progress through an action-space curriculum we achieve both faster training and better final performance than one obtains using traditional methods. (2) We further show that M&M can be used successfully to progress through a curriculum of architectural variants defining an agent's internal state. (3) Finally, we illustrate how a variant of our method can be used to improve agent performance in a multitask setting. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01780v1 |
PDF | http://arxiv.org/pdf/1806.01780v1.pdf |
PWC | https://paperswithcode.com/paper/mixmatch-agent-curricula-for-reinforcement |
Repo | |
Framework | |
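The core mechanism is a mixture policy that interpolates between a simple agent and a complex one while a mixing weight is annealed from 0 to 1. A minimal numpy sketch with an assumed linear schedule; the paper additionally matches the two policies with a distillation term and tunes the schedule during training:

```python
import numpy as np

def mixed_policy(p_simple, p_complex, alpha):
    """Mixture policy pi_mm = (1 - alpha) * pi_simple + alpha * pi_complex.

    Early in training (alpha near 0) the agent acts like the fast-learning
    simple agent; as alpha anneals to 1, control shifts to the complex agent.
    """
    return (1.0 - alpha) * p_simple + alpha * p_complex

p_simple = np.array([0.7, 0.3])        # toy 2-action distributions
p_complex = np.array([0.2, 0.8])
for step in range(0, 10001, 2500):
    alpha = min(1.0, step / 10000.0)   # assumed linear anneal schedule
    print(step, mixed_policy(p_simple, p_complex, alpha))
```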
Inducing and Embedding Senses with Scaled Gumbel Softmax
Title | Inducing and Embedding Senses with Scaled Gumbel Softmax |
Authors | Fenfei Guo, Mohit Iyyer, Jordan Boyd-Graber |
Abstract | Methods for learning word sense embeddings represent a single word with multiple sense-specific vectors. These methods should not only produce interpretable sense embeddings, but should also learn how to select which sense to use in a given context. We propose an unsupervised model that learns sense embeddings using a modified Gumbel softmax function, which allows for differentiable discrete sense selection. Our model produces sense embeddings that are competitive (and sometimes state of the art) on multiple similarity-based downstream evaluations. However, performance on these downstream evaluation tasks does not correlate with interpretability of sense embeddings, as we discover through an interpretability comparison with competing multi-sense embeddings. While many previous approaches perform well on downstream evaluations, they do not produce interpretable embeddings and learn duplicated sense groups; our method achieves the best of both worlds. |
Tasks | |
Published | 2018-04-22 |
URL | https://arxiv.org/abs/1804.08077v2 |
PDF | https://arxiv.org/pdf/1804.08077v2.pdf |
PWC | https://paperswithcode.com/paper/inducing-and-embedding-senses-with-scaled |
Repo | |
Framework | |
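A minimal numpy sketch of the selection mechanism: standard Gumbel softmax perturbs the sense logits with Gumbel(0, 1) noise and applies a tempered softmax, and the scaled variant shrinks the noise so selection is driven more by the context than by the noise. The `scale` parameter name is an assumption, and a real implementation would live in a differentiable framework rather than numpy:

```python
import numpy as np

def scaled_gumbel_softmax(logits, temperature=0.5, scale=0.1, rng=None):
    """Soft, nearly discrete selection over candidate senses (sketch).

    Adds scaled Gumbel(0, 1) noise to the logits and applies a tempered
    softmax; smaller `scale` makes the draw hew closer to the argmax of
    the logits, while the softmax keeps the operation differentiable.
    """
    rng = rng or np.random.default_rng(0)
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + scale * gumbel) / temperature
    y = np.exp(y - y.max())                 # stable softmax
    return y / y.sum()

# toy: 3 candidate senses for a word in a given context
print(scaled_gumbel_softmax(np.array([2.0, 0.5, 0.1])))
```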
Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning
Title | Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning |
Authors | Ilkay Oksuz, Bram Ruijsink, Esther Puyol-Anton, James Clough, Gastao Cruz, Aurelien Bustin, Claudia Prieto, Rene Botnar, Daniel Rueckert, Julia A. Schnabel, Andrew P. King |
Abstract | Good quality of medical images is a prerequisite for the success of subsequent image analysis pipelines. Quality assessment of medical images is therefore an essential activity, and for large population studies such as the UK Biobank (UKBB), manual identification of artefacts such as those caused by unanticipated motion is tedious and time-consuming. Therefore, there is an urgent need for automatic image quality assessment techniques. In this paper, we propose a method to automatically detect the presence of motion-related artefacts in cardiac magnetic resonance (CMR) cine images. We compare two deep learning architectures to classify poor quality CMR images: 1) 3D spatio-temporal Convolutional Neural Networks (3D-CNN), 2) Long-term Recurrent Convolutional Network (LRCN). Although motion artefacts are common in real clinical settings, the high-quality imaging of the UKBB, which comprises cross-sectional population data of volunteers who do not necessarily have health problems, creates a highly imbalanced classification problem. Due to the high number of good quality images compared to the relatively low number of images with motion artefacts, we propose a novel data augmentation scheme based on synthetic artefact creation in k-space. We also investigate a learning approach using a predetermined curriculum based on synthetic artefact severity. We evaluate our pipeline on a subset of the UK Biobank data set consisting of 3510 CMR images. The LRCN architecture outperformed the 3D-CNN architecture and was able to detect 2D+time short axis images with motion artefacts in less than 1 ms with high recall. We compare our approach to a range of state-of-the-art quality assessment methods. The novel data augmentation and curriculum learning approaches both improved classification performance, achieving an overall area under the ROC curve of 0.89. |
Tasks | Data Augmentation, Image Quality Assessment |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12185v2 |
PDF | http://arxiv.org/pdf/1810.12185v2.pdf |
PWC | https://paperswithcode.com/paper/automatic-cnn-based-detection-of-cardiac-mr |
Repo | |
Framework | |
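The augmentation idea translates directly into a few lines of numpy: transform a clean cine frame to k-space, overwrite a subset of phase-encoding lines with lines from a spatially shifted copy of the image (simulating motion between acquisitions), and transform back. The line spacing and shift below are illustrative knobs, not the paper's exact artefact-severity parameters:

```python
import numpy as np

def add_kspace_motion_artefact(image, corrupt_every=8, shift_px=3):
    """Create a synthetic motion artefact in k-space (2D sketch).

    Replacing every `corrupt_every`-th k-space line with the corresponding
    line of a shifted copy of the image mimics inter-acquisition motion
    and produces the characteristic ghosting in image space.
    """
    k_clean = np.fft.fftshift(np.fft.fft2(image))
    moved = np.roll(image, shift_px, axis=0)       # simulated patient motion
    k_moved = np.fft.fftshift(np.fft.fft2(moved))
    k_corrupt = k_clean.copy()
    k_corrupt[::corrupt_every, :] = k_moved[::corrupt_every, :]
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k_corrupt)))

image = np.zeros((64, 64)); image[16:48, 16:48] = 1.0   # toy phantom
artefacted = add_kspace_motion_artefact(image)
print(float(np.abs(artefacted - image).max()))          # ghosting magnitude
```

Varying `corrupt_every` and `shift_px` gives a family of artefact severities, which is what makes the predetermined curriculum (easy, severe artefacts first; subtle ones later) straightforward to construct.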
Regularizing Deep Hashing Networks Using GAN Generated Fake Images
Title | Regularizing Deep Hashing Networks Using GAN Generated Fake Images |
Authors | Libing Geng, Yan Pan, Jikai Chen, Hanjiang Lai |
Abstract | Recently, deep-networks-based hashing (deep hashing) has become a leading approach for large-scale image retrieval. It aims to learn a compact bitwise representation for images via deep networks, so that similar images are mapped to nearby hash codes. Since a deep network model usually has a large number of parameters, it may well be too complex for the available training data, leading to over-fitting. To address this issue, in this paper, we propose a simple two-stage pipeline to learn deep hashing models, by regularizing the deep hashing networks using fake images. The first stage is to generate fake images from the original training set without extra data, via a generative adversarial network (GAN). In the second stage, we propose a deep architecture to learn hash functions, in which we use a maximum-entropy based loss to incorporate the newly created fake images by the GAN. We show that this loss acts as a strong regularizer of the deep architecture, by penalizing low-entropy output hash codes. This loss can also be interpreted as a model ensemble by simultaneously training many network models with massive weight sharing but over different training sets. Empirical evaluation results on several benchmark datasets show that the proposed method achieves superior performance over state-of-the-art hashing methods. |
Tasks | Image Retrieval |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09466v2 |
PDF | http://arxiv.org/pdf/1803.09466v2.pdf |
PWC | https://paperswithcode.com/paper/regularizing-deep-hashing-networks-using-gan |
Repo | |
Framework | |
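The regularizer itself is compact: each hash bit of a fake image is a sigmoid output, and the loss rewards high binary entropy, pushing those bits toward 0.5 so the network cannot assign confident codes to images off the real data manifold. A hedged numpy sketch, with bitwise binary entropy as an assumed reading of the loss:

```python
import numpy as np

def max_entropy_penalty(fake_logits):
    """Loss term for hash bits of GAN-generated images (sketch).

    Each hash bit is a sigmoid output p in (0, 1); the binary entropy
    H(p) is maximal at p = 0.5, so returning -mean(H(p)) as a loss
    penalizes confident (low-entropy) codes for fake images.
    """
    p = 1.0 / (1.0 + np.exp(-np.asarray(fake_logits, dtype=float)))
    eps = 1e-12
    entropy = -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))
    return float(-entropy.mean())

print(max_entropy_penalty([3.0, -2.0, 0.1]))   # confident bits raise the loss
```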
Robust Hypothesis Testing Using Wasserstein Uncertainty Sets
Title | Robust Hypothesis Testing Using Wasserstein Uncertainty Sets |
Authors | Rui Gao, Liyan Xie, Yao Xie, Huan Xu |
Abstract | We develop a novel computationally efficient and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions: sets centered around the empirical distributions and defined via the Wasserstein metric, so our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that such an approximation renders a nearly-optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of this approximation, with complexity independent of the dimension of the observation space and, in general, nearly independent of the sample size. A real-data example using human activity data demonstrates the excellent performance of the new robust detector. |
Tasks | |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10611v1 |
PDF | http://arxiv.org/pdf/1805.10611v1.pdf |
PWC | https://paperswithcode.com/paper/robust-hypothesis-testing-using-wasserstein |
Repo | |
Framework | |
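In the notation assumed below (the paper's own symbols may differ), the detector solves a minimax problem in which the uncertainty sets are Wasserstein balls of radius θ_k around the empirical distributions Q_k:

```latex
% Minimax robust test with Wasserstein uncertainty sets (notation assumed):
% Q_1, Q_2 are empirical distributions, theta_1, theta_2 the ball radii,
% W the Wasserstein metric, and phi a test deciding between H_1 and H_2.
\[
\min_{\phi}\ \max_{P_1 \in \mathcal{P}_1,\, P_2 \in \mathcal{P}_2}
  \Pr\nolimits_{P_1}\!\left(\phi \text{ rejects } H_1\right)
  + \Pr\nolimits_{P_2}\!\left(\phi \text{ rejects } H_2\right),
\qquad
\mathcal{P}_k = \{\, P : W(P, Q_k) \le \theta_k \,\},\quad k = 1, 2.
\]
```

The paper then develops a convex safe approximation of this formulation and, by exploiting the structure of the least favorable distributions, a tractable reformulation.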
A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking
Title | A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking |
Authors | Henning Tjaden, Ulrich Schwanecke, Elmar Schömer, Daniel Cremers |
Abstract | We propose an algorithm for real-time 6DOF pose tracking of rigid 3D objects using a monocular RGB camera. The key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized using first-order gradient descent techniques, we systematically derive a Gauss-Newton optimization scheme which gives rise to drastically faster convergence and highly accurate and robust tracking performance. We furthermore propose a novel complex dataset dedicated to the task of monocular object pose tracking and make it publicly available to the community. To our knowledge, it is the first to address the common and important scenario in which both the camera and the objects are moving simultaneously in cluttered scenes. In numerous experiments - including on our own proposed dataset - we demonstrate that the proposed Gauss-Newton approach outperforms existing approaches, in particular in the presence of cluttered backgrounds, heterogeneous objects and partial occlusions. |
Tasks | Multiple Object Tracking, Object Tracking, Pose Tracking |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02087v2 |
PDF | http://arxiv.org/pdf/1807.02087v2.pdf |
PWC | https://paperswithcode.com/paper/a-region-based-gauss-newton-approach-to-real |
Repo | |
Framework | |
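The optimization machinery is ordinary Gauss-Newton on a residual vector; the paper's contribution is deriving the residuals and Jacobians for the region-based color-histogram cost. A generic, runnable sketch of the iteration, with a toy curve-fitting residual standing in for the pose-tracking cost:

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20, damping=1e-6):
    """Generic Gauss-Newton loop (sketch, not the paper's 6DOF variant).

    Minimizes 0.5 * ||r(x)||^2 via the normal equations
    (J^T J) dx = -J^T r, which converges far faster than first-order
    gradient descent near the optimum, the property the tracker exploits.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r, J = residual(x), jacobian(x)
        dx = np.linalg.solve(J.T @ J + damping * np.eye(len(x)), -J.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < 1e-10:
            break
    return x

# toy: fit y = a * exp(b * t) to data (illustrative residual and Jacobian)
t = np.linspace(0.0, 1.0, 30); y = 2.0 * np.exp(-1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)
print(gauss_newton(res, jac, [1.0, 0.0]))     # should approach (2.0, -1.5)
```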
Studying oppressive cityscapes of Bangladesh
Title | Studying oppressive cityscapes of Bangladesh |
Authors | Halima Akhter, Nazmus Saquib, Deeni Fatiha |
Abstract | In a densely populated city like Dhaka (Bangladesh), a growing number of high-rise buildings is an inevitable reality. However, they pose mental health risks for citizens in terms of detachment from natural light, sky view, greenery, and environmental landscapes. The housing economy and rent structure in different areas may or may not take account of such environmental factors. In this paper, we build a computer vision based pipeline to study factors like sky visibility, greenery in the sidewalks, and dominant colors present in streets from a pedestrian’s perspective. We show that people in lower economy classes may suffer from lower sky visibility, whereas people in higher economy classes may suffer from lack of greenery in their environment, both of which could be possibly addressed by implementing rent restructuring schemes. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.10413v1 |
PDF | http://arxiv.org/pdf/1812.10413v1.pdf |
PWC | https://paperswithcode.com/paper/studying-oppressive-cityscapes-of-bangladesh |
Repo | |
Framework | |
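Downstream of the vision model, the per-image measurements reduce to pixel-fraction statistics over street-level photos. A minimal sketch, assuming an upstream semantic segmenter has already produced a per-pixel label map and using hypothetical class ids:

```python
import numpy as np

SKY, GREENERY = 1, 2   # hypothetical class ids from a semantic segmenter

def streetscape_stats(label_map):
    """Sky-visibility and greenery fractions for one street-level image.

    The pipeline in the paper aggregates such per-image fractions by
    neighbourhood and relates them to the local housing economy.
    """
    total = label_map.size
    return {
        "sky_fraction": float((label_map == SKY).sum()) / total,
        "green_fraction": float((label_map == GREENERY).sum()) / total,
    }

toy = np.random.default_rng(0).integers(0, 4, size=(480, 640))  # fake labels
print(streetscape_stats(toy))
```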
Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fitting
Title | Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fitting |
Authors | Yuuki Yokoyama, Tomu Katsumata, Muneki Yasuda |
Abstract | Generalization is one of the most important issues in machine learning problems. In this study, we consider generalization in restricted Boltzmann machines (RBMs). We propose an RBM with multivalued hidden variables, which is a simple extension of conventional RBMs. We demonstrate that the proposed model is better than the conventional model via numerical experiments for contrastive divergence learning with artificial data and a classification problem with MNIST. |
Tasks | |
Published | 2018-11-30 |
URL | https://arxiv.org/abs/1811.12587v4 |
PDF | https://arxiv.org/pdf/1811.12587v4.pdf |
PWC | https://paperswithcode.com/paper/restricted-boltzmann-machine-with-multivalued |
Repo | |
Framework | |
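The extension is easiest to see in the energy function, which is unchanged except that each hidden unit ranges over s equally spaced values in [-1, 1] instead of two. A numpy sketch with an illustrative s = 4 (the spacing of the levels is an assumption of this sketch):

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """RBM energy E(v, h) = -b.v - c.h - v.W.h (standard bilinear form).

    In the conventional model h is binary; the multivalued extension lets
    each hidden unit take one of s discrete values in [-1, 1], which the
    authors show suppresses over-fitting.
    """
    return float(-(b @ v) - (c @ h) - v @ W @ h)

rng = np.random.default_rng(0)
n_v, n_h, s = 6, 3, 4
W = rng.normal(size=(n_v, n_h))
b, c = rng.normal(size=n_v), rng.normal(size=n_h)
v = rng.choice([0.0, 1.0], size=n_v)            # visible units stay binary
levels = np.linspace(-1.0, 1.0, s)              # multivalued hidden states
h = rng.choice(levels, size=n_h)
print(rbm_energy(v, h, W, b, c))
```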
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
Title | Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings |
Authors | John D. Co-Reyes, YuXuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine |
Abstract | In this work, we take a representation learning perspective on hierarchical reinforcement learning, where the problem of learning lower layers in a hierarchy is transformed into the problem of learning trajectory-level generative models. We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems. Our proposed model, SeCTAR, draws inspiration from variational autoencoders, and learns latent representations of trajectories. A key component of this method is to learn both a latent-conditioned policy and a latent-conditioned model which are consistent with each other. Given the same latent, the policy generates a trajectory which should match the trajectory predicted by the model. This model provides a built-in prediction mechanism, by predicting the outcome of closed loop policy behavior. We propose a novel algorithm for performing hierarchical RL with this model, combining model-based planning in the learned latent space with an unsupervised exploration objective. We show that our model is effective at reasoning over long horizons with sparse rewards for several simulated tasks, outperforming standard reinforcement learning methods and prior methods for hierarchical reasoning, model-based planning, and exploration. |
Tasks | Hierarchical Reinforcement Learning, Representation Learning |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02813v1 |
PDF | http://arxiv.org/pdf/1806.02813v1.pdf |
PWC | https://paperswithcode.com/paper/self-consistent-trajectory-autoencoder |
Repo | |
Framework | |
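The self-consistency constraint can be phrased as a loss between the policy's rollout and the model's decoded trajectory for the same latent. The sketch below uses mean squared error between state sequences as an assumed stand-in; the paper scores agreement through trajectory likelihoods inside a variational objective:

```python
import numpy as np

def consistency_loss(policy_rollout, model_prediction):
    """SeCTAR's self-consistency term as a simple loss (sketch).

    For a shared latent z, the latent-conditioned policy produces a state
    rollout and the latent-conditioned model decodes a predicted
    trajectory; training pushes the two to agree, which is what gives the
    model its built-in prediction mechanism for planning.
    """
    return float(np.mean((policy_rollout - model_prediction) ** 2))

horizon, state_dim = 10, 4
rng = np.random.default_rng(0)
rollout = rng.normal(size=(horizon, state_dim))     # from pi(a | s, z)
predicted = rng.normal(size=(horizon, state_dim))   # from the decoder p(tau | z)
print(consistency_loss(rollout, predicted))
```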
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Title | AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms |
Authors | Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo |
Abstract | This paper describes a method based on sequence-to-sequence (Seq2Seq) learning with attention and context preservation mechanisms for voice conversion (VC) tasks. Seq2Seq has been outstanding at numerous tasks involving sequence modeling such as speech synthesis and recognition, machine translation, and image captioning. In contrast to current VC techniques, our method 1) stabilizes and accelerates the training procedure through a guided attention loss and the proposed context preservation losses, 2) allows not only spectral envelopes but also fundamental frequency contours and durations of speech to be converted, 3) requires no context information such as phoneme labels, and 4) requires no time-aligned source and target speech data in advance. In our experiments, the proposed VC framework can be trained in only one day on a single NVIDIA Tesla K80 GPU, while the quality of the synthesized speech is higher than that of speech converted by Gaussian mixture model-based VC and is comparable to that of speech generated by recurrent neural network-based text-to-speech synthesis, which can be regarded as an upper limit on VC performance. |
Tasks | Image Captioning, Machine Translation, Speech Synthesis, Text-To-Speech Synthesis, Voice Conversion |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.04076v1 |
PDF | http://arxiv.org/pdf/1811.04076v1.pdf |
PWC | https://paperswithcode.com/paper/atts2s-vc-sequence-to-sequence-voice |
Repo | |
Framework | |
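The guided attention loss the method relies on has a simple closed form: the attention matrix is penalized by a weight that vanishes on the diagonal and grows away from it, nudging the source-target alignment toward a monotonic path. A runnable numpy sketch of that loss (the width parameter g is an illustrative choice):

```python
import numpy as np

def guided_attention_loss(attention, g=0.2):
    """Guided attention loss for roughly diagonal alignments (sketch).

    For an attention matrix A of shape (N, T), the penalty weight
    W[n, t] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)) is near zero on the
    diagonal and large off it, so minimizing mean(A * W) keeps the
    alignment close to a monotonic, diagonal path.
    """
    n_rows, n_cols = attention.shape
    n = np.arange(n_rows)[:, None] / n_rows
    t = np.arange(n_cols)[None, :] / n_cols
    w = 1.0 - np.exp(-((n - t) ** 2) / (2.0 * g ** 2))
    return float(np.mean(attention * w))

toy = np.eye(50) + 0.01                          # nearly diagonal alignment
toy = toy / toy.sum(axis=1, keepdims=True)       # rows as distributions
print(guided_attention_loss(toy))                # small loss, as expected
```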
Speeding up the Metabolism in E-commerce by Reinforcement Mechanism Design
Title | Speeding up the Metabolism in E-commerce by Reinforcement Mechanism Design |
Authors | Hua-Lin He, Chun-Xiang Pan, Qing Da, An-Xiang Zeng |
Abstract | In a large E-commerce platform, all the participants compete for impressions under the allocation mechanism of the platform. Existing methods mainly focus on the short-term return based on current observations rather than the long-term return. In this paper, we formally establish a lifecycle model for products, by defining the introduction, growth, maturity and decline stages and their transitions throughout the whole life period. Based on this model, we further propose a reinforcement-learning-based mechanism design framework for impression allocation, which incorporates a first-principal-component-based permutation and a novel experience generation method, to maximize the short-term as well as the long-term return of the platform. With the power of trial-and-error, it is possible to optimize impression allocation strategies globally, which contributes to the healthy development of participants and the platform itself. We evaluate our algorithm on a simulated environment built on one of the largest E-commerce platforms, and a significant improvement has been achieved in comparison with the baseline solutions. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00448v1 |
PDF | http://arxiv.org/pdf/1807.00448v1.pdf |
PWC | https://paperswithcode.com/paper/speeding-up-the-metabolism-in-e-commerce-by |
Repo | |
Framework | |
Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning
Title | Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning |
Authors | Mariano Chouza, Stephen Roberts, Stefan Zohren |
Abstract | In this paper we model the loss function of high-dimensional optimization problems by a Gaussian random field, or equivalently a Gaussian process. Our aim is to study gradient descent in such loss functions or energy landscapes and compare it to results obtained from real high-dimensional optimization problems such as those encountered in deep learning. In particular, we analyze the distribution of the improved loss function after a step of gradient descent, provide analytic expressions for the moments, and prove asymptotic normality as the dimension of the parameter space becomes large. Moreover, we compare this with the expectation of the global minimum of the landscape obtained by means of the Euler characteristic of excursion sets. Besides complementing our analytical findings with numerical results from simulated Gaussian random fields, we also compare them to loss functions obtained from optimisation problems on synthetic and real data sets, by proposing a “black box” random field toy-model for a deep neural network loss function. |
Tasks | |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09119v1 |
PDF | http://arxiv.org/pdf/1803.09119v1.pdf |
PWC | https://paperswithcode.com/paper/gradient-descent-in-gaussian-random-fields-as |
Repo | |
Framework | |
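The experimental setup is easy to reproduce in miniature: sample a Gaussian random field from its RBF covariance and run gradient descent on the sampled landscape. A 1D toy version with a finite-difference gradient (the paper works in high dimensions and derives the step distribution analytically):

```python
import numpy as np

def sample_gaussian_field(n=60, length_scale=0.1, rng=None):
    """Sample a 1D Gaussian random field on a grid via its RBF covariance."""
    rng = rng or np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, n)
    cov = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * length_scale ** 2))
    return x, rng.multivariate_normal(np.zeros(n), cov + 1e-8 * np.eye(n))

x, f = sample_gaussian_field()
i = 30                                            # starting grid index
for _ in range(100):
    # central-difference gradient, then one unit grid step downhill
    grad = (f[min(i + 1, len(f) - 1)] - f[max(i - 1, 0)]) / (2 * (x[1] - x[0]))
    i = int(np.clip(i - np.sign(grad), 0, len(f) - 1))
print("loss after descent:", f[i])
```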
Deep Similarity Metric Learning for Real-Time Pedestrian Tracking
Title | Deep Similarity Metric Learning for Real-Time Pedestrian Tracking |
Authors | Michael Thoreau, Navinda Kottege |
Abstract | Tracking by detection is a common approach to solving the Multiple Object Tracking problem. In this paper we show how learning a deep similarity metric can improve three key aspects of pedestrian tracking on a multiple object tracking benchmark. We train a convolutional neural network to learn an embedding function in a Siamese configuration on a large person re-identification dataset. The offline-trained embedding network is integrated into the tracking formulation to improve performance while retaining real-time speed. The proposed tracker stores appearance metrics while detections are strong, using this appearance information to: prevent ID switches, associate tracklets through occlusion, and propose new detections where detector confidence is low. This method achieves competitive results in evaluation, especially among online, real-time approaches. We present an ablative study showing the impact of each of the three uses of our deep appearance metric. |
Tasks | Metric Learning, Multiple Object Tracking, Object Tracking, Person Re-Identification |
Published | 2018-06-20 |
URL | https://arxiv.org/abs/1806.07592v2 |
PDF | https://arxiv.org/pdf/1806.07592v2.pdf |
PWC | https://paperswithcode.com/paper/improving-online-multiple-object-tracking |
Repo | |
Framework | |
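At association time the learned metric is used in a straightforward way: detections are matched to stored track appearances by embedding similarity, and weak matches are left unassigned. A hedged sketch with greedy cosine-similarity matching; the threshold and the greedy scheme are assumptions of this sketch, not the paper's exact formulation:

```python
import numpy as np

def associate(track_embeddings, det_embeddings, threshold=0.6):
    """Greedy appearance-based data association (sketch).

    Embeddings come from a Siamese-trained network; each track claims its
    most similar unclaimed detection by cosine similarity, and matches
    below `threshold` are dropped, which is what helps prevent ID
    switches and bridge occlusions.
    """
    def unit(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)
    sims = unit(track_embeddings) @ unit(det_embeddings).T
    matches = {}
    for t in np.argsort(-sims.max(axis=1)):        # most confident tracks first
        d = int(np.argmax(sims[t]))
        if sims[t, d] >= threshold and d not in matches.values():
            matches[int(t)] = d
    return matches

rng = np.random.default_rng(0)
print(associate(rng.normal(size=(3, 128)), rng.normal(size=(4, 128))))
```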
Extreme Classification in Log Memory
Title | Extreme Classification in Log Memory |
Authors | Qixuan Huang, Yiqiu Wang, Tharun Medini, Anshumali Shrivastava |
Abstract | We present Merged-Averaged Classifiers via Hashing (MACH) for K-classification with ultra-large values of K. Compared to traditional one-vs-all classifiers that require O(Kd) memory and inference cost, MACH only needs O(d log K) memory (where d is the dimensionality) and only O(K log K + d log K) operations for inference. MACH is a generic K-classification algorithm, with provable theoretical guarantees, which requires O(log K) memory without any assumption on the relationship between classes. MACH uses universal hashing to reduce classification with a large number of classes to a few independent classification tasks, each with a small (constant) number of classes. We provide a theoretical quantification of the discriminability-memory tradeoff. With MACH we can train on the ODP dataset, with 100,000 classes and 400,000 features, on a single Titan X GPU, with a classification accuracy of 19.28%, which is the best-reported accuracy on this dataset. Before this work, the best-performing baseline was a one-vs-all classifier that requires 40 billion parameters (160 GB model size) and achieves 9% accuracy. In contrast, MACH can achieve 9% accuracy with a 480x reduction in model size (a mere 0.3 GB). With MACH, we also demonstrate complete training of the fine-grained ImageNet dataset (compressed size 104 GB), with 21,000 classes, on a single GPU. To the best of our knowledge, this is the first work to demonstrate complete training of these extreme-class datasets on a single Titan X. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04254v1 |
PDF | http://arxiv.org/pdf/1810.04254v1.pdf |
PWC | https://paperswithcode.com/paper/extreme-classification-in-log-memory |
Repo | |
Framework | |
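The hashing reduction is simple to sketch: R universal hash functions each map the K classes into B buckets, one small B-way classifier is trained per hash function, and a class score is recovered at inference by averaging the probabilities of its buckets across the R repetitions. A toy numpy version of the decoding step, with illustrative hash functions and made-up meta-classifier outputs:

```python
import numpy as np

def mach_decode(meta_probs, hash_fns, num_classes):
    """MACH decoding (sketch): score class k by averaging, over the R
    repetitions, the probability the r-th small classifier assigned to
    the bucket that class k hashes into.
    """
    R = len(hash_fns)
    scores = np.zeros(num_classes)
    for k in range(num_classes):
        scores[k] = sum(meta_probs[r][hash_fns[r](k)] for r in range(R)) / R
    return scores

K, B, R = 1000, 32, 6                      # classes, buckets, repetitions
rng = np.random.default_rng(0)
params = [(int(a), int(b)) for a, b in rng.integers(1, 2**31 - 1, size=(R, 2))]
# 2-universal hash family h(k) = ((a*k + b) mod p) mod B, p prime
hash_fns = [lambda k, a=a, b=b: ((a * k + b) % (2**31 - 1)) % B for a, b in params]

# stand-ins for the R meta-classifiers' bucket distributions on one input
meta_probs = [rng.dirichlet(np.ones(B)) for _ in range(R)]
print("top class:", int(np.argmax(mach_decode(meta_probs, hash_fns, K))))
```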