Paper Group ANR 1018
Active Testing: An Efficient and Robust Framework for Estimating Accuracy. Mix&Match - Agent Curricula for Reinforcement Learning. Inducing and Embedding Senses with Scaled Gumbel Softmax. Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning. Regularizing Deep Hashing Networks Using GA …
Active Testing: An Efficient and Robust Framework for Estimating Accuracy
Title | Active Testing: An Efficient and Robust Framework for Estimating Accuracy |
Authors | Phuc Nguyen, Deva Ramanan, Charless Fowlkes |
Abstract | Much recent work on visual recognition aims to scale up learning to massive, noisily-annotated datasets. We address the problem of scaling up the evaluation of such models to large-scale datasets with noisy labels. Current protocols for doing so require a human user to either vet (re-annotate) a small fraction of the test set and ignore the rest, or else correct errors in annotation as they are found through manual inspection of results. In this work, we re-formulate the problem as one of active testing, and examine strategies for efficiently querying a user so as to obtain an accurate performance estimate with minimal vetting. We demonstrate the effectiveness of our proposed active testing framework on estimating two performance metrics, Precision@K and mean Average Precision, for two popular computer vision tasks, multi-label classification and instance segmentation. We further show that our approach is able to save significant human annotation effort and is more robust than alternative evaluation protocols. |
Tasks | Instance Segmentation, Multi-Label Classification, Semantic Segmentation |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00493v1 |
PDF | http://arxiv.org/pdf/1807.00493v1.pdf |
PWC | https://paperswithcode.com/paper/active-testing-an-efficient-and-robust |
Repo | |
Framework | |
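To make the vetting idea concrete, here is a minimal Python sketch of a vet-a-subset estimator for Precision@K: a small random subset of the top-K is re-annotated, the agreement rate between vetted and noisy labels is measured on that subset, and the unvetted remainder is corrected accordingly. All names are illustrative, and this is a simplification of the paper's framework, which actively chooses which examples to vet rather than sampling at random.

```python
import random

def estimate_precision_at_k(scores, noisy_labels, vet_oracle, k=100, budget=20, seed=0):
    """Estimate Precision@K by vetting only `budget` items of the top-K.

    Vetted items contribute their trusted labels directly; unvetted items
    contribute their noisy labels, corrected by the label-agreement rate
    observed on the vetted subset.
    """
    rng = random.Random(seed)
    top_k = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    vetted = rng.sample(top_k, min(budget, k))

    true_pos, agree = 0, 0
    for i in vetted:
        y = vet_oracle(i)                     # trusted, human-vetted label
        true_pos += y
        agree += int(y == noisy_labels[i])
    p_agree = agree / max(len(vetted), 1)     # P(noisy label is correct)

    unvetted = [i for i in top_k if i not in set(vetted)]
    noisy_pos = sum(noisy_labels[i] for i in unvetted)
    # expected true positives among unvetted items: noisy positives that
    # are actually right plus noisy negatives that are actually wrong
    est_pos = noisy_pos * p_agree + (len(unvetted) - noisy_pos) * (1 - p_agree)
    return (true_pos + est_pos) / k

noisy = [1, 1, 0, 1, 0] * 40                  # toy noisy labels
gen = random.Random(1)
scores = [gen.random() for _ in noisy]
oracle = lambda i: noisy[i] if i % 7 else 1 - noisy[i]   # 1-in-7 labels flipped
print(estimate_precision_at_k(scores, noisy, oracle, k=50, budget=10))
```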
Mix&Match - Agent Curricula for Reinforcement Learning
Title | Mix&Match - Agent Curricula for Reinforcement Learning |
Authors | Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Simon Osindero, Nicolas Heess, Razvan Pascanu |
Abstract | We introduce Mix&Match (M&M) - a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents. Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping from solutions found by simpler agents. In contradistinction to typical curriculum learning approaches, we do not gradually modify the tasks or environments presented, but instead use a process to gradually alter how the policy is represented internally. We show the broad applicability of our method by demonstrating significant performance gains in three different experimental setups: (1) We train an agent able to control more than 700 actions in a challenging 3D first-person task; using our method to progress through an action-space curriculum we achieve both faster training and better final performance than one obtains using traditional methods. (2) We further show that M&M can be used successfully to progress through a curriculum of architectural variants defining an agent's internal state. (3) Finally, we illustrate how a variant of our method can be used to improve agent performance in a multitask setting. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01780v1 |
PDF | http://arxiv.org/pdf/1806.01780v1.pdf |
PWC | https://paperswithcode.com/paper/mixmatch-agent-curricula-for-reinforcement |
Repo | |
Framework | |
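The core mechanism is a mixture policy that interpolates between a simple agent and a complex one while a mixing weight is annealed from 0 to 1. A minimal numpy sketch with an assumed linear schedule; the paper additionally matches the two policies with a distillation term and tunes the schedule during training:

```python
import numpy as np

def mixed_policy(p_simple, p_complex, alpha):
    """Mixture policy pi_mm = (1 - alpha) * pi_simple + alpha * pi_complex.

    Early in training (alpha near 0) the agent acts like the fast-learning
    simple agent; as alpha anneals to 1, control shifts to the complex agent.
    """
    return (1.0 - alpha) * p_simple + alpha * p_complex

p_simple = np.array([0.7, 0.3])        # toy 2-action distributions
p_complex = np.array([0.2, 0.8])
for step in range(0, 10001, 2500):
    alpha = min(1.0, step / 10000.0)   # assumed linear anneal schedule
    print(step, mixed_policy(p_simple, p_complex, alpha))
```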
Inducing and Embedding Senses with Scaled Gumbel Softmax
Title | Inducing and Embedding Senses with Scaled Gumbel Softmax |
Authors | Fenfei Guo, Mohit Iyyer, Jordan Boyd-Graber |
Abstract | Methods for learning word sense embeddings represent a single word with multiple sense-specific vectors. These methods should not only produce interpretable sense embeddings, but should also learn how to select which sense to use in a given context. We propose an unsupervised model that learns sense embeddings using a modified Gumbel softmax function, which allows for differentiable discrete sense selection. Our model produces sense embeddings that are competitive (and sometimes state of the art) on multiple similarity-based downstream evaluations. However, performance on these downstream evaluation tasks does not correlate with interpretability of sense embeddings, as we discover through an interpretability comparison with competing multi-sense embeddings. While many previous approaches perform well on downstream evaluations, they do not produce interpretable embeddings and learn duplicated sense groups; our method achieves the best of both worlds. |
Tasks | |
Published | 2018-04-22 |
URL | https://arxiv.org/abs/1804.08077v2 |
PDF | https://arxiv.org/pdf/1804.08077v2.pdf |
PWC | https://paperswithcode.com/paper/inducing-and-embedding-senses-with-scaled |
Repo | |
Framework | |
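A minimal numpy sketch of the selection mechanism: standard Gumbel softmax perturbs the sense logits with Gumbel(0, 1) noise and applies a tempered softmax, and the scaled variant shrinks the noise so selection is driven more by the context than by the noise. The `scale` parameter name is an assumption, and a real implementation would live in a differentiable framework rather than numpy:

```python
import numpy as np

def scaled_gumbel_softmax(logits, temperature=0.5, scale=0.1, rng=None):
    """Soft, nearly discrete selection over candidate senses (sketch).

    Adds scaled Gumbel(0, 1) noise to the logits and applies a tempered
    softmax; smaller `scale` makes the draw hew closer to the argmax of
    the logits, while the softmax keeps the operation differentiable.
    """
    rng = rng or np.random.default_rng(0)
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + scale * gumbel) / temperature
    y = np.exp(y - y.max())                 # stable softmax
    return y / y.sum()

# toy: 3 candidate senses for a word in a given context
print(scaled_gumbel_softmax(np.array([2.0, 0.5, 0.1])))
```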
Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning
Title | Automatic CNN-based detection of cardiac MR motion artefacts using k-space data augmentation and curriculum learning |
Authors | Ilkay Oksuz, Bram Ruijsink, Esther Puyol-Anton, James Clough, Gastao Cruz, Aurelien Bustin, Claudia Prieto, Rene Botnar, Daniel Rueckert, Julia A. Schnabel, Andrew P. King |
Abstract | Good quality of medical images is a prerequisite for the success of subsequent image analysis pipelines. Quality assessment of medical images is therefore an essential activity, and for large population studies such as the UK Biobank (UKBB), manual identification of artefacts such as those caused by unanticipated motion is tedious and time-consuming. Therefore, there is an urgent need for automatic image quality assessment techniques. In this paper, we propose a method to automatically detect the presence of motion-related artefacts in cardiac magnetic resonance (CMR) cine images. We compare two deep learning architectures to classify poor quality CMR images: 1) 3D spatio-temporal Convolutional Neural Networks (3D-CNN), 2) Long-term Recurrent Convolutional Network (LRCN). Although motion artefacts are common in real clinical settings, the high-quality imaging of the UKBB, which comprises cross-sectional population data of volunteers who do not necessarily have health problems, creates a highly imbalanced classification problem. Due to the high number of good quality images compared to the relatively low number of images with motion artefacts, we propose a novel data augmentation scheme based on synthetic artefact creation in k-space. We also investigate a learning approach using a predetermined curriculum based on synthetic artefact severity. We evaluate our pipeline on a subset of the UK Biobank data set consisting of 3510 CMR images. The LRCN architecture outperformed the 3D-CNN architecture and was able to detect 2D+time short axis images with motion artefacts in less than 1 ms with high recall. We compare our approach to a range of state-of-the-art quality assessment methods. The novel data augmentation and curriculum learning approaches both improved classification performance, achieving an overall area under the ROC curve of 0.89. |
Tasks | Data Augmentation, Image Quality Assessment |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12185v2 |
PDF | http://arxiv.org/pdf/1810.12185v2.pdf |
PWC | https://paperswithcode.com/paper/automatic-cnn-based-detection-of-cardiac-mr |
Repo | |
Framework | |
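The augmentation idea translates directly into a few lines of numpy: transform a clean cine frame to k-space, overwrite a subset of phase-encoding lines with lines from a spatially shifted copy of the image (simulating motion between acquisitions), and transform back. The line spacing and shift below are illustrative knobs, not the paper's exact artefact-severity parameters:

```python
import numpy as np

def add_kspace_motion_artefact(image, corrupt_every=8, shift_px=3):
    """Create a synthetic motion artefact in k-space (2D sketch).

    Replacing every `corrupt_every`-th k-space line with the corresponding
    line of a shifted copy of the image mimics inter-acquisition motion
    and produces the characteristic ghosting in image space.
    """
    k_clean = np.fft.fftshift(np.fft.fft2(image))
    moved = np.roll(image, shift_px, axis=0)       # simulated patient motion
    k_moved = np.fft.fftshift(np.fft.fft2(moved))
    k_corrupt = k_clean.copy()
    k_corrupt[::corrupt_every, :] = k_moved[::corrupt_every, :]
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k_corrupt)))

image = np.zeros((64, 64)); image[16:48, 16:48] = 1.0   # toy phantom
artefacted = add_kspace_motion_artefact(image)
print(float(np.abs(artefacted - image).max()))          # ghosting magnitude
```

Varying `corrupt_every` and `shift_px` gives a family of artefact severities, which is what makes the predetermined curriculum (easy, severe artefacts first; subtle ones later) straightforward to construct.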
Regularizing Deep Hashing Networks Using GAN Generated Fake Images
Title | Regularizing Deep Hashing Networks Using GAN Generated Fake Images |
Authors | Libing Geng, Yan Pan, Jikai Chen, Hanjiang Lai |
Abstract | Recently, deep-networks-based hashing (deep hashing) has become a leading approach for large-scale image retrieval. It aims to learn a compact bitwise representation for images via deep networks, so that similar images are mapped to nearby hash codes. Since a deep network model usually has a large number of parameters, it may well be too complex for the available training data, leading to over-fitting. To address this issue, in this paper, we propose a simple two-stage pipeline to learn deep hashing models, by regularizing the deep hashing networks using fake images. The first stage is to generate fake images from the original training set without extra data, via a generative adversarial network (GAN). In the second stage, we propose a deep architecture to learn hash functions, in which we use a maximum-entropy based loss to incorporate the newly created fake images by the GAN. We show that this loss acts as a strong regularizer of the deep architecture, by penalizing low-entropy output hash codes. This loss can also be interpreted as a model ensemble by simultaneously training many network models with massive weight sharing but over different training sets. Empirical evaluation results on several benchmark datasets show that the proposed method achieves superior performance over state-of-the-art hashing methods. |
Tasks | Image Retrieval |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09466v2 |
PDF | http://arxiv.org/pdf/1803.09466v2.pdf |
PWC | https://paperswithcode.com/paper/regularizing-deep-hashing-networks-using-gan |
Repo | |
Framework | |
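The regularizer itself is compact: each hash bit of a fake image is a sigmoid output, and the loss rewards high binary entropy, pushing those bits toward 0.5 so the network cannot assign confident codes to images off the real data manifold. A hedged numpy sketch, with bitwise binary entropy as an assumed reading of the loss:

```python
import numpy as np

def max_entropy_penalty(fake_logits):
    """Loss term for hash bits of GAN-generated images (sketch).

    Each hash bit is a sigmoid output p in (0, 1); the binary entropy
    H(p) is maximal at p = 0.5, so returning -mean(H(p)) as a loss
    penalizes confident (low-entropy) codes for fake images.
    """
    p = 1.0 / (1.0 + np.exp(-np.asarray(fake_logits, dtype=float)))
    eps = 1e-12
    entropy = -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))
    return float(-entropy.mean())

print(max_entropy_penalty([3.0, -2.0, 0.1]))   # confident bits raise the loss
```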
Robust Hypothesis Testing Using Wasserstein Uncertainty Sets
Title | Robust Hypothesis Testing Using Wasserstein Uncertainty Sets |
Authors | Rui Gao, Liyan Xie, Yao Xie, Huan Xu |
Abstract | We develop a novel computationally efficient and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions: sets centered around the empirical distributions and defined via the Wasserstein metric, so our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that such an approximation renders a nearly-optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of this approximation, with complexity independent of the dimension of the observation space and, in general, nearly independent of the sample size. A real-data example using human activity data demonstrates the excellent performance of the new robust detector. |
Tasks | |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10611v1 |
PDF | http://arxiv.org/pdf/1805.10611v1.pdf |
PWC | https://paperswithcode.com/paper/robust-hypothesis-testing-using-wasserstein |
Repo | |
Framework | |
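In the notation assumed below (the paper's own symbols may differ), the detector solves a minimax problem in which the uncertainty sets are Wasserstein balls of radius θ_k around the empirical distributions Q_k:

```latex
% Minimax robust test with Wasserstein uncertainty sets (notation assumed):
% Q_1, Q_2 are empirical distributions, theta_1, theta_2 the ball radii,
% W the Wasserstein metric, and phi a test deciding between H_1 and H_2.
\[
\min_{\phi}\ \max_{P_1 \in \mathcal{P}_1,\, P_2 \in \mathcal{P}_2}
  \Pr\nolimits_{P_1}\!\left(\phi \text{ rejects } H_1\right)
  + \Pr\nolimits_{P_2}\!\left(\phi \text{ rejects } H_2\right),
\qquad
\mathcal{P}_k = \{\, P : W(P, Q_k) \le \theta_k \,\},\quad k = 1, 2.
\]
```

The paper then develops a convex safe approximation of this formulation and, by exploiting the structure of the least favorable distributions, a tractable reformulation.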
A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking
Title | A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking |
Authors | Henning Tjaden, Ulrich Schwanecke, Elmar Schömer, Daniel Cremers |
Abstract | We propose an algorithm for real-time 6DOF pose tracking of rigid 3D objects using a monocular RGB camera. The key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized using first-order gradient descent techniques, we systematically derive a Gauss-Newton optimization scheme which gives rise to drastically faster convergence and highly accurate and robust tracking performance. We furthermore propose a novel complex dataset dedicated to the task of monocular object pose tracking and make it publicly available to the community. To our knowledge, it is the first to address the common and important scenario in which both the camera and the objects are moving simultaneously in cluttered scenes. In numerous experiments - including on our own proposed dataset - we demonstrate that the proposed Gauss-Newton approach outperforms existing approaches, in particular in the presence of cluttered backgrounds, heterogeneous objects and partial occlusions. |
Tasks | Multiple Object Tracking, Object Tracking, Pose Tracking |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02087v2 |
PDF | http://arxiv.org/pdf/1807.02087v2.pdf |
PWC | https://paperswithcode.com/paper/a-region-based-gauss-newton-approach-to-real |
Repo | |
Framework | |
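The optimization machinery is ordinary Gauss-Newton on a residual vector; the paper's contribution is deriving the residuals and Jacobians for the region-based color-histogram cost. A generic, runnable sketch of the iteration, with a toy curve-fitting residual standing in for the pose-tracking cost:

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20, damping=1e-6):
    """Generic Gauss-Newton loop (sketch, not the paper's 6DOF variant).

    Minimizes 0.5 * ||r(x)||^2 via the normal equations
    (J^T J) dx = -J^T r, which converges far faster than first-order
    gradient descent near the optimum, the property the tracker exploits.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r, J = residual(x), jacobian(x)
        dx = np.linalg.solve(J.T @ J + damping * np.eye(len(x)), -J.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < 1e-10:
            break
    return x

# toy: fit y = a * exp(b * t) to data (illustrative residual and Jacobian)
t = np.linspace(0.0, 1.0, 30); y = 2.0 * np.exp(-1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)
print(gauss_newton(res, jac, [1.0, 0.0]))     # should approach (2.0, -1.5)
```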
Studying oppressive cityscapes of Bangladesh
Title | Studying oppressive cityscapes of Bangladesh |
Authors | Halima Akhter, Nazmus Saquib, Deeni Fatiha |
Abstract | In a densely populated city like Dhaka (Bangladesh), a growing number of high-rise buildings is an inevitable reality. However, they pose mental health risks for citizens in terms of detachment from natural light, sky view, greenery, and environmental landscapes. The housing economy and rent structure in different areas may or may not take account of such environmental factors. In this paper, we build a computer vision based pipeline to study factors like sky visibility, greenery in the sidewalks, and dominant colors present in streets from a pedestrian’s perspective. We show that people in lower economy classes may suffer from lower sky visibility, whereas people in higher economy classes may suffer from lack of greenery in their environment, both of which could be possibly addressed by implementing rent restructuring schemes. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.10413v1 |
PDF | http://arxiv.org/pdf/1812.10413v1.pdf |
PWC | https://paperswithcode.com/paper/studying-oppressive-cityscapes-of-bangladesh |
Repo | |
Framework | |
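Downstream of the vision model, the per-image measurements reduce to pixel-fraction statistics over street-level photos. A minimal sketch, assuming an upstream semantic segmenter has already produced a per-pixel label map and using hypothetical class ids:

```python
import numpy as np

SKY, GREENERY = 1, 2   # hypothetical class ids from a semantic segmenter

def streetscape_stats(label_map):
    """Sky-visibility and greenery fractions for one street-level image.

    The pipeline in the paper aggregates such per-image fractions by
    neighbourhood and relates them to the local housing economy.
    """
    total = label_map.size
    return {
        "sky_fraction": float((label_map == SKY).sum()) / total,
        "green_fraction": float((label_map == GREENERY).sum()) / total,
    }

toy = np.random.default_rng(0).integers(0, 4, size=(480, 640))  # fake labels
print(streetscape_stats(toy))
```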
Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fitting
Title | Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fitting |
Authors | Yuuki Yokoyama, Tomu Katsumata, Muneki Yasuda |
Abstract | Generalization is one of the most important issues in machine learning problems. In this study, we consider generalization in restricted Boltzmann machines (RBMs). We propose an RBM with multivalued hidden variables, which is a simple extension of conventional RBMs. We demonstrate that the proposed model is better than the conventional model via numerical experiments for contrastive divergence learning with artificial data and a classification problem with MNIST. |
Tasks | |
Published | 2018-11-30 |
URL | https://arxiv.org/abs/1811.12587v4 |
PDF | https://arxiv.org/pdf/1811.12587v4.pdf |
PWC | https://paperswithcode.com/paper/restricted-boltzmann-machine-with-multivalued |
Repo | |
Framework | |
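The extension is easiest to see in the energy function, which is unchanged except that each hidden unit ranges over s equally spaced values in [-1, 1] instead of two. A numpy sketch with an illustrative s = 4 (the spacing of the levels is an assumption of this sketch):

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """RBM energy E(v, h) = -b.v - c.h - v.W.h (standard bilinear form).

    In the conventional model h is binary; the multivalued extension lets
    each hidden unit take one of s discrete values in [-1, 1], which the
    authors show suppresses over-fitting.
    """
    return float(-(b @ v) - (c @ h) - v @ W @ h)

rng = np.random.default_rng(0)
n_v, n_h, s = 6, 3, 4
W = rng.normal(size=(n_v, n_h))
b, c = rng.normal(size=n_v), rng.normal(size=n_h)
v = rng.choice([0.0, 1.0], size=n_v)            # visible units stay binary
levels = np.linspace(-1.0, 1.0, s)              # multivalued hidden states
h = rng.choice(levels, size=n_h)
print(rbm_energy(v, h, W, b, c))
```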
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
Title | Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings |
Authors | John D. Co-Reyes, YuXuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine |
Abstract | In this work, we take a representation learning perspective on hierarchical reinforcement learning, where the problem of learning lower layers in a hierarchy is transformed into the problem of learning trajectory-level generative models. We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems. Our proposed model, SeCTAR, draws inspiration from variational autoencoders, and learns latent representations of trajectories. A key component of this method is to learn both a latent-conditioned policy and a latent-conditioned model which are consistent with each other. Given the same latent, the policy generates a trajectory which should match the trajectory predicted by the model. This model provides a built-in prediction mechanism, by predicting the outcome of closed loop policy behavior. We propose a novel algorithm for performing hierarchical RL with this model, combining model-based planning in the learned latent space with an unsupervised exploration objective. We show that our model is effective at reasoning over long horizons with sparse rewards for several simulated tasks, outperforming standard reinforcement learning methods and prior methods for hierarchical reasoning, model-based planning, and exploration. |
Tasks | Hierarchical Reinforcement Learning, Representation Learning |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02813v1 |
PDF | http://arxiv.org/pdf/1806.02813v1.pdf |
PWC | https://paperswithcode.com/paper/self-consistent-trajectory-autoencoder |
Repo | |
Framework | |
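The self-consistency constraint can be phrased as a loss between the policy's rollout and the model's decoded trajectory for the same latent. The sketch below uses mean squared error between state sequences as an assumed stand-in; the paper scores agreement through trajectory likelihoods inside a variational objective:

```python
import numpy as np

def consistency_loss(policy_rollout, model_prediction):
    """SeCTAR's self-consistency term as a simple loss (sketch).

    For a shared latent z, the latent-conditioned policy produces a state
    rollout and the latent-conditioned model decodes a predicted
    trajectory; training pushes the two to agree, which is what gives the
    model its built-in prediction mechanism for planning.
    """
    return float(np.mean((policy_rollout - model_prediction) ** 2))

horizon, state_dim = 10, 4
rng = np.random.default_rng(0)
rollout = rng.normal(size=(horizon, state_dim))     # from pi(a | s, z)
predicted = rng.normal(size=(horizon, state_dim))   # from the decoder p(tau | z)
print(consistency_loss(rollout, predicted))
```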
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Title | AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms |
Authors | Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo |
Abstract | This paper describes a method based on sequence-to-sequence (Seq2Seq) learning with attention and context preservation mechanisms for voice conversion (VC) tasks. Seq2Seq has been outstanding at numerous tasks involving sequence modeling such as speech synthesis and recognition, machine translation, and image captioning. In contrast to current VC techniques, our method 1) stabilizes and accelerates the training procedure through a guided attention loss and the proposed context preservation losses, 2) allows not only spectral envelopes but also fundamental frequency contours and durations of speech to be converted, 3) requires no context information such as phoneme labels, and 4) requires no time-aligned source and target speech data in advance. In our experiments, the proposed VC framework can be trained in only one day on a single NVIDIA Tesla K80 GPU, while the quality of the synthesized speech is higher than that of speech converted by Gaussian mixture model-based VC and is comparable to that of speech generated by recurrent neural network-based text-to-speech synthesis, which can be regarded as an upper limit on VC performance. |
Tasks | Image Captioning, Machine Translation, Speech Synthesis, Text-To-Speech Synthesis, Voice Conversion |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.04076v1 |
PDF | http://arxiv.org/pdf/1811.04076v1.pdf |
PWC | https://paperswithcode.com/paper/atts2s-vc-sequence-to-sequence-voice |
Repo | |
Framework | |
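The guided attention loss the method relies on has a simple closed form: the attention matrix is penalized by a weight that vanishes on the diagonal and grows away from it, nudging the source-target alignment toward a monotonic path. A runnable numpy sketch of that loss (the width parameter g is an illustrative choice):

```python
import numpy as np

def guided_attention_loss(attention, g=0.2):
    """Guided attention loss for roughly diagonal alignments (sketch).

    For an attention matrix A of shape (N, T), the penalty weight
    W[n, t] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)) is near zero on the
    diagonal and large off it, so minimizing mean(A * W) keeps the
    alignment close to a monotonic, diagonal path.
    """
    n_rows, n_cols = attention.shape
    n = np.arange(n_rows)[:, None] / n_rows
    t = np.arange(n_cols)[None, :] / n_cols
    w = 1.0 - np.exp(-((n - t) ** 2) / (2.0 * g ** 2))
    return float(np.mean(attention * w))

toy = np.eye(50) + 0.01                          # nearly diagonal alignment
toy = toy / toy.sum(axis=1, keepdims=True)       # rows as distributions
print(guided_attention_loss(toy))                # small loss, as expected
```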
Speeding up the Metabolism in E-commerce by Reinforcement Mechanism Design
Title | Speeding up the Metabolism in E-commerce by Reinforcement Mechanism Design |
Authors | Hua-Lin He, Chun-Xiang Pan, Qing Da, An-Xiang Zeng |
Abstract | In a large E-commerce platform, all the participants compete for impressions under the allocation mechanism of the platform. Existing methods mainly focus on the short-term return based on current observations rather than the long-term return. In this paper, we formally establish a lifecycle model for products, by defining the introduction, growth, maturity and decline stages and their transitions throughout the whole life period. Based on this model, we further propose a reinforcement-learning-based mechanism design framework for impression allocation, which incorporates a first-principal-component-based permutation and a novel experience generation method, to maximize the short-term as well as the long-term return of the platform. With the power of trial-and-error, it is possible to optimize impression allocation strategies globally, which contributes to the healthy development of participants and the platform itself. We evaluate our algorithm on a simulated environment built on one of the largest E-commerce platforms, and a significant improvement has been achieved in comparison with the baseline solutions. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00448v1 |
PDF | http://arxiv.org/pdf/1807.00448v1.pdf |
PWC | https://paperswithcode.com/paper/speeding-up-the-metabolism-in-e-commerce-by |
Repo | |
Framework | |
Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning
Title | Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning |
Authors | Mariano Chouza, Stephen Roberts, Stefan Zohren |
Abstract | In this paper we model the loss function of high-dimensional optimization problems by a Gaussian random field, or equivalently a Gaussian process. Our aim is to study gradient descent in such loss functions or energy landscapes and compare it to results obtained from real high-dimensional optimization problems such as those encountered in deep learning. In particular, we analyze the distribution of the improved loss function after a step of gradient descent, provide analytic expressions for the moments, and prove asymptotic normality as the dimension of the parameter space becomes large. Moreover, we compare this with the expectation of the global minimum of the landscape obtained by means of the Euler characteristic of excursion sets. Besides complementing our analytical findings with numerical results from simulated Gaussian random fields, we also compare them to loss functions obtained from optimisation problems on synthetic and real data sets, by proposing a “black box” random field toy-model for a deep neural network loss function. |
Tasks | |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09119v1 |
PDF | http://arxiv.org/pdf/1803.09119v1.pdf |
PWC | https://paperswithcode.com/paper/gradient-descent-in-gaussian-random-fields-as |
Repo | |
Framework | |
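The experimental setup is easy to reproduce in miniature: sample a Gaussian random field from its RBF covariance and run gradient descent on the sampled landscape. A 1D toy version with a finite-difference gradient (the paper works in high dimensions and derives the step distribution analytically):

```python
import numpy as np

def sample_gaussian_field(n=60, length_scale=0.1, rng=None):
    """Sample a 1D Gaussian random field on a grid via its RBF covariance."""
    rng = rng or np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, n)
    cov = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * length_scale ** 2))
    return x, rng.multivariate_normal(np.zeros(n), cov + 1e-8 * np.eye(n))

x, f = sample_gaussian_field()
i = 30                                            # starting grid index
for _ in range(100):
    # central-difference gradient, then one unit grid step downhill
    grad = (f[min(i + 1, len(f) - 1)] - f[max(i - 1, 0)]) / (2 * (x[1] - x[0]))
    i = int(np.clip(i - np.sign(grad), 0, len(f) - 1))
print("loss after descent:", f[i])
```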
Deep Similarity Metric Learning for Real-Time Pedestrian Tracking
Title | Deep Similarity Metric Learning for Real-Time Pedestrian Tracking |
Authors | Michael Thoreau, Navinda Kottege |
Abstract | Tracking by detection is a common approach to solving the Multiple Object Tracking problem. In this paper we show how learning a deep similarity metric can improve three key aspects of pedestrian tracking on a multiple object tracking benchmark. We train a convolutional neural network to learn an embedding function in a Siamese configuration on a large person re-identification dataset. The offline-trained embedding network is integrated into the tracking formulation to improve performance while retaining real-time speed. The proposed tracker stores appearance metrics while detections are strong, using this appearance information to: prevent ID switches, associate tracklets through occlusion, and propose new detections where detector confidence is low. This method achieves competitive results in evaluation, especially among online, real-time approaches. We present an ablative study showing the impact of each of the three uses of our deep appearance metric. |
Tasks | Metric Learning, Multiple Object Tracking, Object Tracking, Person Re-Identification |
Published | 2018-06-20 |
URL | https://arxiv.org/abs/1806.07592v2 |
PDF | https://arxiv.org/pdf/1806.07592v2.pdf |
PWC | https://paperswithcode.com/paper/improving-online-multiple-object-tracking |
Repo | |
Framework | |
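At association time the learned metric is used in a straightforward way: detections are matched to stored track appearances by embedding similarity, and weak matches are left unassigned. A hedged sketch with greedy cosine-similarity matching; the threshold and the greedy scheme are assumptions of this sketch, not the paper's exact formulation:

```python
import numpy as np

def associate(track_embeddings, det_embeddings, threshold=0.6):
    """Greedy appearance-based data association (sketch).

    Embeddings come from a Siamese-trained network; each track claims its
    most similar unclaimed detection by cosine similarity, and matches
    below `threshold` are dropped, which is what helps prevent ID
    switches and bridge occlusions.
    """
    def unit(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)
    sims = unit(track_embeddings) @ unit(det_embeddings).T
    matches = {}
    for t in np.argsort(-sims.max(axis=1)):        # most confident tracks first
        d = int(np.argmax(sims[t]))
        if sims[t, d] >= threshold and d not in matches.values():
            matches[int(t)] = d
    return matches

rng = np.random.default_rng(0)
print(associate(rng.normal(size=(3, 128)), rng.normal(size=(4, 128))))
```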
Extreme Classification in Log Memory
Title | Extreme Classification in Log Memory |
Authors | Qixuan Huang, Yiqiu Wang, Tharun Medini, Anshumali Shrivastava |
Abstract | We present Merged-Averaged Classifiers via Hashing (MACH) for K-classification with ultra-large values of K. Compared to traditional one-vs-all classifiers that require O(Kd) memory and inference cost, MACH only needs O(d log K) memory (where d is the dimensionality) and only O(K log K + d log K) operations for inference. MACH is a generic K-classification algorithm, with provable theoretical guarantees, which requires O(log K) memory without any assumption on the relationship between classes. MACH uses universal hashing to reduce classification with a large number of classes to a few independent classification tasks, each with a small (constant) number of classes. We provide a theoretical quantification of the discriminability-memory tradeoff. With MACH we can train on the ODP dataset, with 100,000 classes and 400,000 features, on a single Titan X GPU, with a classification accuracy of 19.28%, which is the best-reported accuracy on this dataset. Before this work, the best-performing baseline was a one-vs-all classifier that requires 40 billion parameters (160 GB model size) and achieves 9% accuracy. In contrast, MACH can achieve 9% accuracy with a 480x reduction in model size (a mere 0.3 GB). With MACH, we also demonstrate complete training of the fine-grained ImageNet dataset (compressed size 104 GB), with 21,000 classes, on a single GPU. To the best of our knowledge, this is the first work to demonstrate complete training of these extreme-class datasets on a single Titan X. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04254v1 |
PDF | http://arxiv.org/pdf/1810.04254v1.pdf |
PWC | https://paperswithcode.com/paper/extreme-classification-in-log-memory |
Repo | |
Framework | |
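The hashing reduction is simple to sketch: R universal hash functions each map the K classes into B buckets, one small B-way classifier is trained per hash function, and a class score is recovered at inference by averaging the probabilities of its buckets across the R repetitions. A toy numpy version of the decoding step, with illustrative hash functions and made-up meta-classifier outputs:

```python
import numpy as np

def mach_decode(meta_probs, hash_fns, num_classes):
    """MACH decoding (sketch): score class k by averaging, over the R
    repetitions, the probability the r-th small classifier assigned to
    the bucket that class k hashes into.
    """
    R = len(hash_fns)
    scores = np.zeros(num_classes)
    for k in range(num_classes):
        scores[k] = sum(meta_probs[r][hash_fns[r](k)] for r in range(R)) / R
    return scores

K, B, R = 1000, 32, 6                      # classes, buckets, repetitions
rng = np.random.default_rng(0)
params = [(int(a), int(b)) for a, b in rng.integers(1, 2**31 - 1, size=(R, 2))]
# 2-universal hash family h(k) = ((a*k + b) mod p) mod B, p prime
hash_fns = [lambda k, a=a, b=b: ((a * k + b) % (2**31 - 1)) % B for a, b in params]

# stand-ins for the R meta-classifiers' bucket distributions on one input
meta_probs = [rng.dirichlet(np.ones(B)) for _ in range(R)]
print("top class:", int(np.argmax(mach_decode(meta_probs, hash_fns, K))))
```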