Paper Group ANR 484
Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data. GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining. Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval. Deep Mixture of Experts via Shallow Embedding. What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits. …
Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data
Title | Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data |
Authors | Yi Yu, Suhua Tang, Kiyoharu Aizawa, Akiko Aizawa |
Abstract | In this work, travel destinations and business locations are taken as venues. Discovering a venue from a photo is very important for context-aware applications. Unfortunately, few efforts have paid attention to complicated real-world images such as venue photos generated by users. Our goal is fine-grained venue discovery from heterogeneous social multimodal data. To this end, we propose a novel deep learning model, Category-based Deep Canonical Correlation Analysis (C-DCCA). Given a photo as input, this model performs (i) exact venue search (find the venue where the photo was taken) and (ii) group venue search (find relevant venues with the same category as that of the photo), by exploiting the cross-modal correlation between the input photo and the textual descriptions of venues. In this model, data from different modalities are projected into a shared space via deep networks. Pairwise correlation (between data of different modalities from the same venue), used for exact venue search, and category-based correlation (between data of different modalities from different venues of the same category), used for group venue search, are jointly optimized. Because a single photo cannot fully reflect the rich textual description of a venue, the number of photos per venue in the training phase is increased to capture more aspects of a venue. We build a new venue-aware multimodal dataset by integrating Wikipedia featured articles and Foursquare venue photos. Experimental results on this dataset confirm the feasibility of the proposed method. Moreover, the evaluation on another publicly available dataset confirms that the proposed method outperforms state-of-the-art approaches for cross-modal retrieval between images and text. |
Tasks | Cross-Modal Retrieval |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02997v1 |
PDF | http://arxiv.org/pdf/1805.02997v1.pdf |
PWC | https://paperswithcode.com/paper/category-based-deep-cca-for-fine-grained |
Repo | |
Framework | |
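The joint objective described in the abstract can be illustrated with a toy sketch: a pairwise term couples the photo and text of the same venue, and a category term couples photos and texts of different venues that share a category, both computed in a shared embedding space. The snippet below uses cosine similarity as a simple stand-in for the canonical correlation objective; the linear projections, the weight `alpha`, and all shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(x, W):
    """Toy 'deep network': a single linear projection into the shared space."""
    return x @ W

def cosine(a, b):
    return np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-8)

# Illustrative data: 8 venues, image features (d=32), text features (d=64), category ids.
imgs  = rng.normal(size=(8, 32))
texts = rng.normal(size=(8, 64))
cats  = np.array([0, 0, 1, 1, 2, 2, 3, 3])

W_img, W_txt = rng.normal(size=(32, 16)), rng.normal(size=(64, 16))
zi, zt = project(imgs, W_img), project(texts, W_txt)

# Pairwise term: image/text of the *same* venue should be close in the shared space.
pairwise = cosine(zi, zt).mean()

# Category term: image of venue i vs. text of any *other* venue with the same category.
same_cat = (cats[:, None] == cats[None, :]) & ~np.eye(len(cats), dtype=bool)
ii, jj = np.nonzero(same_cat)
category = cosine(zi[ii], zt[jj]).mean()

alpha = 0.5  # illustrative trade-off between exact and group venue search
loss = -(pairwise + alpha * category)  # maximize both correlations jointly
print(f"pairwise={pairwise:.3f}  category={category:.3f}  loss={loss:.3f}")
```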
GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining
Title | GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining |
Authors | Kamil Bennani-Smires, Claudiu Musat, Andreea Hossmann, Michael Baeriswyl |
Abstract | The dramatic success of deep neural networks across multiple application areas often relies on experts painstakingly designing a network architecture specific to each task. To simplify this process and make it more accessible, an emerging research effort seeks to automate the design of neural network architectures, using, e.g., evolutionary algorithms, reinforcement learning, or simple search in a constrained space of neural modules. Considering the typical size of the search space (e.g. $10^{10}$ candidates for a $10$-layer network) and the cost of evaluating a single candidate, current architecture search methods are very restricted. They either rely on static pre-built modules to be recombined for the task at hand, or they define a static hand-crafted framework within which they can generate new architectures from the simplest possible operations. In this paper, we relax these restrictions by capitalizing on the collective wisdom contained in the plethora of neural networks published in online code repositories. Concretely, we (a) extract and publish GitGraph, a corpus of neural architectures and their descriptions; (b) create problem-specific neural architecture search spaces, implemented as a textual search mechanism over GitGraph; and (c) propose a method of identifying unique common subgraphs within the architectures solving each problem (e.g., image processing, reinforcement learning), which can then serve as modules in the newly created problem-specific neural search space. |
Tasks | Neural Architecture Search |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05159v1 |
PDF | http://arxiv.org/pdf/1801.05159v1.pdf |
PWC | https://paperswithcode.com/paper/gitgraph-architecture-search-space-creation |
Repo | |
Framework | |
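As a rough illustration of mining frequent computational substructures, the sketch below counts how often small operator chains (here, simple two-edge paths) recur across a toy corpus of architecture graphs; frequent chains could then seed a problem-specific search space. Real frequent-subgraph mining is considerably more involved, and the graphs, support threshold, and chain length are illustrative assumptions.

```python
from collections import Counter

# Toy corpus: each architecture is a list of (src_op, dst_op) edges.
corpus = [
    [("conv", "relu"), ("relu", "pool"), ("pool", "conv"), ("conv", "relu")],
    [("conv", "relu"), ("relu", "pool"), ("pool", "fc")],
    [("embed", "lstm"), ("lstm", "fc"), ("fc", "softmax")],
]

def two_edge_paths(edges):
    """Enumerate operator chains a -> b -> c appearing in one architecture."""
    out = set()
    for a, b in edges:
        for c, d in edges:
            if b == c:
                out.add((a, b, d))
    return out

# Count in how many architectures each chain occurs (its support).
support = Counter(chain for arch in corpus for chain in two_edge_paths(arch))
min_support = 2
modules = [chain for chain, s in support.items() if s >= min_support]
print(modules)  # e.g. [('conv', 'relu', 'pool')] -> a candidate reusable module
```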
Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
Title | Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval |
Authors | Lin Wu, Yang Wang, Ling Shao |
Abstract | In this paper, we propose a novel deep generative approach to cross-modal retrieval that learns hash functions in the absence of paired training samples through a cycle-consistency loss. Our approach employs an adversarial training scheme to learn a pair of hash functions that enable translation between modalities while preserving the underlying semantic relationships. To endow the hash codes of each input-output pair with semantics, a cycle-consistency loss is further imposed on top of the adversarial training to strengthen the correlations between inputs and their corresponding outputs. Our approach learns hash functions generatively, such that the learned hash codes maximally correlate each input-output correspondence while also being able to regenerate the inputs so as to minimize the information loss. Learning to hash is thus performed by jointly optimizing the parameters of the hash functions across modalities as well as the associated generative models. Extensive experiments on a variety of large-scale cross-modal datasets demonstrate that our proposed method achieves better retrieval results than state-of-the-art methods. |
Tasks | Cross-Modal Retrieval |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11013v2 |
PDF | http://arxiv.org/pdf/1804.11013v2.pdf |
PWC | https://paperswithcode.com/paper/cycle-consistent-deep-generative-hashing-for |
Repo | |
Framework | |
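The cycle-consistency idea is the easiest part to sketch: relaxed hash codes from each modality are translated to the other modality and back, and the round trip should reproduce the original code even when no paired samples are available. The adversarial term and the exact generative models from the paper are omitted, and all networks below are toy linear/tanh stand-ins chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_img(x, W):   # image -> relaxed binary code in [-1, 1]
    return np.tanh(x @ W)

def hash_txt(y, V):   # text -> relaxed binary code in [-1, 1]
    return np.tanh(y @ V)

def translate(code, T):  # toy cross-modal "translator" acting on codes
    return np.tanh(code @ T)

x = rng.normal(size=(4, 32))    # image features
y = rng.normal(size=(4, 300))   # text features (not paired with x in general)
W, V = rng.normal(size=(32, 16)), rng.normal(size=(300, 16))
T_xy, T_yx = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))

bx, by = hash_img(x, W), hash_txt(y, V)

# Cycle consistency: translating a code to the other modality and back
# should reproduce the original code.
cycle_x = np.mean((translate(translate(bx, T_xy), T_yx) - bx) ** 2)
cycle_y = np.mean((translate(translate(by, T_yx), T_xy) - by) ** 2)
print(f"cycle-consistency loss: {cycle_x + cycle_y:.4f}")
```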
Deep Mixture of Experts via Shallow Embedding
Title | Deep Mixture of Experts via Shallow Embedding |
Authors | Xin Wang, Fisher Yu, Lisa Dunlap, Yi-An Ma, Ruth Wang, Azalia Mirhoseini, Trevor Darrell, Joseph E. Gonzalez |
Abstract | Larger networks generally have greater representational power at the cost of increased computational complexity. Sparsifying such networks has been an active area of research but has been generally limited to static regularization or dynamic approaches using reinforcement learning. We explore a mixture-of-experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis. Our novel DeepMoE architecture increases the representational power of standard convolutional networks by adaptively sparsifying and recalibrating channel-wise features in each convolutional layer. We employ a multi-headed sparse gating network to determine the selection and scaling of channels for each input, leveraging exponential combinations of experts within a single convolutional network. Our proposed architecture is evaluated on four benchmark datasets and tasks, and we show that DeepMoEs are able to achieve higher accuracy with lower computation than standard convolutional networks. |
Tasks | Few-Shot Learning, Meta-Learning, Zero-Shot Learning |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01531v3 |
PDF | http://arxiv.org/pdf/1806.01531v3.pdf |
PWC | https://paperswithcode.com/paper/tafe-net-task-aware-feature-embeddings-for |
Repo | |
Framework | |
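The channel-wise gating idea can be sketched as follows: a shallow embedding of the input produces a sparse, non-negative gate vector per convolutional layer, which rescales (and zeroes out) that layer's channels on a per-example basis. The layer sizes, the single gating head per block, and the ReLU-based sparsification below are illustrative assumptions rather than the exact DeepMoE design.

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """One convolution whose output channels ('experts') are scaled by a sparse gate."""
    def __init__(self, in_ch, out_ch, embed_dim):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.gate = nn.Linear(embed_dim, out_ch)   # one gating head for this layer

    def forward(self, x, embedding):
        g = torch.relu(self.gate(embedding))        # ReLU keeps gates sparse and >= 0
        return torch.relu(self.conv(x)) * g[:, :, None, None]

class ToyDeepMoE(nn.Module):
    def __init__(self, embed_dim=32):
        super().__init__()
        self.embed = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                   nn.Linear(3, embed_dim), nn.ReLU())
        self.block1 = GatedConvBlock(3, 16, embed_dim)
        self.block2 = GatedConvBlock(16, 32, embed_dim)
        self.head = nn.Linear(32, 10)

    def forward(self, x):
        e = self.embed(x)          # shallow, per-example embedding drives all gates
        h = self.block1(x, e)
        h = self.block2(h, e)
        return self.head(h.mean(dim=(2, 3)))

logits = ToyDeepMoE()(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```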
What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits
Title | What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits |
Authors | Lilian Besson, Emilie Kaufmann |
Abstract | An online reinforcement learning algorithm is anytime if it does not need to know in advance the horizon T of the experiment. A well-known technique to obtain an anytime algorithm from any non-anytime algorithm is the “Doubling Trick”. In the context of adversarial or stochastic multi-armed bandits, the performance of an algorithm is measured by its regret, and we study two families of sequences of growing horizons (geometric and exponential) to generalize previously known results that certain doubling tricks can be used to conserve certain regret bounds. In a broad setting, we prove that a geometric doubling trick can be used to conserve (minimax) bounds in $R_T = O(\sqrt{T})$ but cannot conserve (distribution-dependent) bounds in $R_T = O(\log T)$. We give insights as to why exponential doubling tricks may be better, as they conserve bounds in $R_T = O(\log T)$, and are close to conserving bounds in $R_T = O(\sqrt{T})$. |
Tasks | Multi-Armed Bandits |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.06971v1 |
PDF | http://arxiv.org/pdf/1803.06971v1.pdf |
PWC | https://paperswithcode.com/paper/what-doubling-tricks-can-and-cant-do-for |
Repo | |
Framework | |
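The doubling trick itself is easy to sketch: a non-anytime bandit algorithm that needs the horizon up front is simply restarted on a sequence of growing horizons. In the toy code below, `DummyAlg` and `run_fixed_horizon` are hypothetical stand-ins for any such algorithm; the geometric schedule uses $T_i = T_0 \cdot b^i$ and the exponential one $T_i = T_0^{b^i}$, one common parameterization of the two families studied in the paper.

```python
def run_fixed_horizon(algorithm, horizon):
    """Hypothetical stand-in: run a non-anytime bandit algorithm for `horizon` steps."""
    for _ in range(horizon):
        algorithm.step()

def doubling_trick(algorithm, total_steps, schedule):
    """Restart `algorithm` on growing horizons until `total_steps` pulls are spent."""
    spent, i = 0, 0
    while spent < total_steps:
        horizon = min(schedule(i), total_steps - spent)
        algorithm.reset()                 # forget everything at each restart
        run_fixed_horizon(algorithm, horizon)
        spent += horizon
        i += 1

# Geometric horizons T_i = T0 * b**i: enough to conserve O(sqrt(T)) minimax bounds.
geometric = lambda i, T0=100, b=2: T0 * b**i
# Exponential horizons T_i = T0 ** (b**i): needed to conserve O(log T) bounds.
exponential = lambda i, T0=2, b=2: T0 ** (b**i)

class DummyAlg:
    def __init__(self): self.pulls = 0
    def reset(self): self.pulls = 0
    def step(self): self.pulls += 1

doubling_trick(DummyAlg(), total_steps=1000, schedule=geometric)
print([geometric(i) for i in range(5)])    # [100, 200, 400, 800, 1600]
print([exponential(i) for i in range(5)])  # [2, 4, 16, 256, 65536]
```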
Finding beans in burgers: Deep semantic-visual embedding with localization
Title | Finding beans in burgers: Deep semantic-visual embedding with localization |
Authors | Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord |
Abstract | Several works have proposed to learn a two-path neural network that maps images and texts, respectively, into a shared Euclidean space in which geometry captures useful semantic relationships. Such a multi-modal embedding can be trained and used for various tasks, notably image captioning. In the present work, we introduce a new architecture of this type, with a visual path that leverages recent space-aware pooling mechanisms. Combined with a textual path that is jointly trained from scratch, our semantic-visual embedding offers a versatile model. Once trained under the supervision of captioned images, it yields new state-of-the-art performance on cross-modal retrieval. It also allows the localization of new concepts from the embedding space into any input image, delivering state-of-the-art results on the visual grounding of phrases. |
Tasks | Cross-Modal Retrieval, Image Captioning |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01720v2 |
PDF | http://arxiv.org/pdf/1804.01720v2.pdf |
PWC | https://paperswithcode.com/paper/finding-beans-in-burgers-deep-semantic-visual |
Repo | |
Framework | |
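A minimal sketch of a two-path embedding trained with a ranking objective: image and caption features are projected into the same space, and a margin-based loss over in-batch negatives pushes mismatched pairs apart. The margin, the linear projections, and the use of in-batch negatives are illustrative assumptions; the paper's space-aware visual pooling used for localization is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

# Toy batch: 4 matching (image, caption) pairs.
img_feat, txt_feat = rng.normal(size=(4, 512)), rng.normal(size=(4, 300))
W_img, W_txt = rng.normal(size=(512, 64)), rng.normal(size=(300, 64))

vi = l2_normalize(img_feat @ W_img)   # visual path -> shared space
vt = l2_normalize(txt_feat @ W_txt)   # textual path -> shared space

sim = vi @ vt.T                       # cosine similarities, matches on the diagonal
pos = np.diag(sim)
margin = 0.2

# Hinge ranking loss over in-batch negatives, in both retrieval directions.
loss_i2t = np.maximum(0, margin + sim - pos[:, None])
loss_t2i = np.maximum(0, margin + sim - pos[None, :])
np.fill_diagonal(loss_i2t, 0)
np.fill_diagonal(loss_t2i, 0)
print("ranking loss:", (loss_i2t.sum() + loss_t2i.sum()) / len(pos))
```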
Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
Title | Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval |
Authors | Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, Dacheng Tao |
Abstract | Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal hashing in a self-supervised fashion. The primary contribution of this work is that two adversarial networks are leveraged to maximize the semantic correlation and consistency of the representations between different modalities. In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations. Such information guides the feature learning process and preserves the modality relationships in both the common semantic space and the Hamming space. Extensive experiments carried out on three benchmark datasets validate that the proposed SSAH surpasses the state-of-the-art methods. |
Tasks | Cross-Modal Retrieval |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01223v1 |
PDF | http://arxiv.org/pdf/1804.01223v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-adversarial-hashing-networks |
Repo | |
Framework | |
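The role of the adversarial networks can be sketched with a toy modality discriminator: if a classifier cannot tell whether a code came from the image network or the text network, the two modalities' representations have been aligned. The self-supervised label network and the Hamming-space objectives from the paper are omitted; the logistic discriminator and all shapes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy relaxed hash codes produced by an image network and a text network.
img_codes = np.tanh(rng.normal(size=(8, 16)))
txt_codes = np.tanh(rng.normal(size=(8, 16)))

w = rng.normal(size=16)                 # toy modality discriminator (logistic)
codes = np.vstack([img_codes, txt_codes])
is_image = np.concatenate([np.ones(8), np.zeros(8)])

p = sigmoid(codes @ w)
disc_loss = -np.mean(is_image * np.log(p + 1e-8) + (1 - is_image) * np.log(1 - p + 1e-8))

# The hashing networks are trained to *fool* the discriminator (maximize its loss),
# which pushes image and text codes to become indistinguishable, i.e. modality-invariant.
adversarial_objective = -disc_loss
print(f"discriminator loss {disc_loss:.3f}, adversarial objective {adversarial_objective:.3f}")
```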
Are All Training Examples Created Equal? An Empirical Study
Title | Are All Training Examples Created Equal? An Empirical Study |
Authors | Kailas Vodrahalli, Ke Li, Jitendra Malik |
Abstract | Modern computer vision algorithms often rely on very large training datasets. However, it is conceivable that a carefully selected subsample of the dataset is sufficient for training. In this paper, we propose a gradient-based importance measure that we use to empirically analyze relative importance of training images in four datasets of varying complexity. We find that in some cases, a small subsample is indeed sufficient for training. For other datasets, however, the relative differences in importance are negligible. These results have important implications for active learning on deep networks. Additionally, our analysis method can be used as a general tool to better understand diversity of training examples in datasets. |
Tasks | Active Learning |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12569v1 |
PDF | http://arxiv.org/pdf/1811.12569v1.pdf |
PWC | https://paperswithcode.com/paper/are-all-training-examples-created-equal-an |
Repo | |
Framework | |
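A rough version of a gradient-based importance measure: score each training example by the norm of its individual loss gradient, then keep only the top-scoring subsample. The logistic-regression model and the "keep the top half" rule are illustrative assumptions, not the paper's exact measure or protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data and a rough logistic model.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
w = rng.normal(size=10) * 0.1

p = 1.0 / (1.0 + np.exp(-(X @ w)))
per_example_grad = (p - y)[:, None] * X        # gradient of the log loss, one row per example
importance = np.linalg.norm(per_example_grad, axis=1)

keep = np.argsort(importance)[::-1][: len(X) // 2]   # retain the most 'important' half
X_sub, y_sub = X[keep], y[keep]
print(f"kept {len(keep)} of {len(X)} examples; mean importance {importance[keep].mean():.3f}")
```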
Sparse Reduced Rank Regression With Nonconvex Regularization
Title | Sparse Reduced Rank Regression With Nonconvex Regularization |
Authors | Ziping Zhao, Daniel P. Palomar |
Abstract | In this paper, the estimation problem for the sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection, with applications in signal processing, econometrics, etc. The problem is formulated as minimizing a least squares loss with a sparsity-inducing penalty under an orthogonality constraint. Convex sparsity-inducing functions have been used for SRRR in the literature. In this work, a nonconvex function is proposed to induce sparsity more effectively. An efficient algorithm based on the alternating minimization (or projection) method is developed to solve the nonconvex optimization problem. Numerical simulations show that the proposed algorithm is much more efficient than the benchmark methods and that the nonconvex function can yield better estimation accuracy. |
Tasks | Dimensionality Reduction |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07247v1 |
PDF | http://arxiv.org/pdf/1803.07247v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-reduced-rank-regression-with-nonconvex |
Repo | |
Framework | |
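The alternating scheme can be illustrated on the simpler convex case: writing the coefficient matrix as $A B^\top$ with $B$ orthonormal, the $B$-step is an orthogonal Procrustes problem solved by an SVD, and the $A$-step is a proximal-gradient update, with soft-thresholding (an $\ell_1$ penalty) standing in for the paper's nonconvex sparsity-inducing function. Dimensions, step size, and penalty weight are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 100, 20, 8, 2                    # samples, predictors, responses, rank
X = rng.normal(size=(n, p))
A_true = np.zeros((p, r)); A_true[:3] = rng.normal(size=(3, r))   # row-sparse loadings
B_true, _ = np.linalg.qr(rng.normal(size=(q, r)))
Y = X @ A_true @ B_true.T + 0.1 * rng.normal(size=(n, q))

A = rng.normal(size=(p, r)) * 0.01
B, _ = np.linalg.qr(rng.normal(size=(q, r)))
lam, step = 0.1, 1.0 / np.linalg.norm(X, 2) ** 2

for _ in range(200):
    # B-step: orthogonal Procrustes, maximize tr(B^T Y^T X A) subject to B^T B = I.
    U, _, Vt = np.linalg.svd(Y.T @ X @ A, full_matrices=False)
    B = U @ Vt
    # A-step: one proximal gradient step with an l1 penalty (convex stand-in
    # for the paper's nonconvex sparsity-inducing function).
    grad = X.T @ (X @ A @ B.T - Y) @ B
    A = A - step * grad
    A = np.sign(A) * np.maximum(np.abs(A) - step * lam, 0.0)

print("rows with nonzero loadings:", np.where(np.abs(A).sum(axis=1) > 1e-3)[0])
```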
Applying an Ensemble Learning Method for Improving Multi-label Classification Performance
Title | Applying an Ensemble Learning Method for Improving Multi-label Classification Performance |
Authors | Amirreza Mahdavi-Shahri, Mahboobeh Houshmand, Mahdi Yaghoobi, Mehrdad Jalali |
Abstract | In recent years, the multi-label classification problem has attracted considerable attention. In this kind of classification, each sample is associated with a set of class labels. Ensemble approaches are supervised learning algorithms in which an operator takes a number of learning algorithms, called base-level algorithms, and combines their outcomes to make a prediction. The simplest form of ensemble learning is to train the base-level algorithms on random subsets of the data and then let them vote for the most popular classification, or to average their predictions. In this study, an ensemble learning method is proposed for improving multi-label classification evaluation criteria. We have compared our method with well-known base-level algorithms on several datasets. Experimental results show that the proposed approach outperforms the well-known base classifiers on the multi-label classification problem. |
Tasks | Multi-Label Classification |
Published | 2018-01-07 |
URL | http://arxiv.org/abs/1801.02149v1 |
PDF | http://arxiv.org/pdf/1801.02149v1.pdf |
PWC | https://paperswithcode.com/paper/applying-an-ensemble-learning-method-for |
Repo | |
Framework | |
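The voting step of such an ensemble is simple to illustrate: given binary label predictions from several base-level classifiers (each trained on its own random subset of the data), take a per-label majority vote. The base learners themselves are abstracted away here; the array shapes and the 0.5 voting threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_learners, n_samples, n_labels = 5, 6, 4

# Hypothetical binary predictions of 5 base-level classifiers on 6 samples with 4 labels.
preds = rng.integers(0, 2, size=(n_learners, n_samples, n_labels))

votes = preds.mean(axis=0)                 # fraction of learners predicting each label
majority = (votes >= 0.5).astype(int)      # per-label majority vote
print(majority)
```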
Decentralized Decision-Making Over Multi-Task Networks
Title | Decentralized Decision-Making Over Multi-Task Networks |
Authors | Sahar Khawatmi, Abdelhak M. Zoubir, Ali H. Sayed |
Abstract | In important applications involving multi-task networks with multiple objectives, agents in the network need to decide between these multiple objectives and reach an agreement about which single objective to follow for the network. In this work we propose a distributed decision-making algorithm. The agents are assumed to observe data that may be generated by different models. Through localized interactions, the agents reach agreement about which model to track and interact with each other in order to enhance the network performance. We investigate the approach for both static and mobile networks. The simulations illustrate the performance of the proposed strategies. |
Tasks | Decision Making |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08843v2 |
PDF | http://arxiv.org/pdf/1812.08843v2.pdf |
PWC | https://paperswithcode.com/paper/decentralized-decision-making-over-multi-task |
Repo | |
Framework | |
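A toy version of reaching agreement through localized interactions: each agent holds a belief vector over the candidate objectives/models and repeatedly averages its beliefs with those of its neighbors (a diffusion/consensus step) until the whole network settles on a single model. The ring topology, uniform combination weights, and random initial beliefs are illustrative assumptions, not the paper's strategy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_models = 6, 3

# Each agent starts with a noisy local preference over the candidate models.
beliefs = rng.random(size=(n_agents, n_models))
beliefs /= beliefs.sum(axis=1, keepdims=True)

# Ring topology: each agent averages with itself and its two neighbours.
neighbors = [(i, (i - 1) % n_agents, (i + 1) % n_agents) for i in range(n_agents)]

for _ in range(50):
    beliefs = np.stack([beliefs[list(nbrs)].mean(axis=0) for nbrs in neighbors])

decisions = beliefs.argmax(axis=1)
print("per-agent decision:", decisions)    # after enough diffusion steps, all agents agree
```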
Faster Bounding Box Annotation for Object Detection in Indoor Scenes
Title | Faster Bounding Box Annotation for Object Detection in Indoor Scenes |
Authors | Bishwo Adhikari, Jukka Peltomäki, Jussi Puura, Heikki Huttunen |
Abstract | This paper proposes an approach for rapid bounding box annotation for object detection datasets. The procedure consists of two stages: the first is to annotate a part of the dataset manually, and the second proposes annotations for the remaining samples using a model trained on the first-stage annotations. We experimentally study which first/second-stage split minimizes the total workload. In addition, we introduce a new fully labeled object detection dataset collected from indoor scenes. Compared to other indoor datasets, our collection has more class categories, different backgrounds, lighting conditions, occlusions and higher intra-class differences. We train deep learning based object detectors with a number of state-of-the-art models and compare them in terms of speed and accuracy. The fully annotated dataset is freely available to the research community. |
Tasks | Object Detection, Object Detection In Indoor Scenes |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.03142v1 |
PDF | http://arxiv.org/pdf/1807.03142v1.pdf |
PWC | https://paperswithcode.com/paper/faster-bounding-box-annotation-for-object |
Repo | |
Framework | |
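The split-selection question can be framed as a simple workload model: annotating the first part fully by hand costs a fixed time per image, while the remaining images only need their proposed boxes verified or corrected, at a cost that shrinks as the model (trained on more manual data) gets better. The cost constants and the accuracy curve below are illustrative assumptions used only to show how one would search for the workload-minimizing split, not the paper's measured numbers.

```python
import numpy as np

n_images = 10_000
t_manual, t_correct, t_verify = 30.0, 8.0, 2.0   # seconds per image (assumed)

def model_accuracy(n_manual):
    """Hypothetical accuracy of a detector trained on n_manual hand-labelled images."""
    return 1.0 - np.exp(-n_manual / 2000.0)

def total_workload(n_manual):
    acc = model_accuracy(n_manual)
    n_auto = n_images - n_manual
    # Correct proposals only need a quick check; wrong ones need a full correction.
    return n_manual * t_manual + n_auto * ((1.0 - acc) * t_correct + acc * t_verify)

splits = np.arange(500, n_images, 500)
costs = np.array([total_workload(n) for n in splits])
best = splits[costs.argmin()]
print(f"best first-stage size: {best} images ({costs.min() / 3600:.1f} h of annotation)")
```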
Linear Bandits with Stochastic Delayed Feedback
Title | Linear Bandits with Stochastic Delayed Feedback |
Authors | Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner |
Abstract | Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by practitioners hoping to apply existing algorithms is that usually the feedback is randomly delayed and delays are only partially observable. For example, while a purchase is usually observable some time after the display, the decision of not buying is never explicitly sent to the system. In other words, the learner only observes delayed positive events. We formalize this problem as a novel stochastic delayed linear bandit and propose ${\tt OTFLinUCB}$ and ${\tt OTFLinTS}$, two computationally efficient algorithms able to integrate new information as it becomes available and to deal with the permanently censored feedback. We prove optimal $\tilde O(\smash{d\sqrt{T}})$ bounds on the regret of the first algorithm and study the dependency on delay-dependent parameters. Our model, assumptions and results are validated by experiments on simulated and real data. |
Tasks | Multi-Armed Bandits |
Published | 2018-07-05 |
URL | https://arxiv.org/abs/1807.02089v3 |
PDF | https://arxiv.org/pdf/1807.02089v3.pdf |
PWC | https://paperswithcode.com/paper/contextual-bandits-under-delayed-feedback |
Repo | |
Framework | |
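A heavily simplified sketch of linear bandit learning with delayed feedback: a LinUCB-style learner updates its design matrix when it acts, but rewards (conversions) only arrive some random number of rounds later. This is plain delayed LinUCB for illustration, not the paper's OTFLinUCB/OTFLinTS; in particular, the permanent censoring of non-conversions, which is the paper's key difficulty, is not modelled, and the delay distribution and parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, horizon, alpha = 5, 10, 2000, 1.0
arms = rng.normal(size=(n_arms, d)); arms /= np.linalg.norm(arms, axis=1, keepdims=True)
theta_star = rng.normal(size=d); theta_star /= np.linalg.norm(theta_star)

A = np.eye(d)            # regularized design matrix
b = np.zeros(d)
pending = []             # (arrival_round, features, reward) not yet revealed

for t in range(horizon):
    # Incorporate any feedback whose random delay has elapsed.
    arrived = [p for p in pending if p[0] <= t]
    pending = [p for p in pending if p[0] > t]
    for _, x, r in arrived:
        b += r * x
    theta_hat = np.linalg.solve(A, b)

    # LinUCB-style optimistic arm choice.
    A_inv = np.linalg.inv(A)
    ucb = arms @ theta_hat + alpha * np.sqrt(np.einsum("ad,dc,ac->a", arms, A_inv, arms))
    x = arms[int(np.argmax(ucb))]

    A += np.outer(x, x)                                        # design updated immediately
    reward = float(rng.random() < 0.5 * (1 + x @ theta_star))   # Bernoulli conversion
    delay = int(rng.geometric(0.05))                            # feedback arrives later
    pending.append((t + delay, x, reward))

print("estimated vs. true best arm:",
      int(np.argmax(arms @ theta_hat)), int(np.argmax(arms @ theta_star)))
```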
Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text
Title | Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text |
Authors | Ji Wen, Xu Sun, Xuancheng Ren, Qi Su |
Abstract | Relation classification is an important semantic processing task in the field of natural language processing. In this paper, we propose the task of relation classification for Chinese literature text. A new dataset of Chinese literature text is constructed to facilitate the study in this task. We present a novel model, named Structure Regularized Bidirectional Recurrent Convolutional Neural Network (SR-BRCNN), to identify the relation between entities. The proposed model learns relation representations along the shortest dependency path (SDP) extracted from the structure regularized dependency tree, which has the benefits of reducing the complexity of the whole model. Experimental results show that the proposed method significantly improves the F1 score by 10.3, and outperforms the state-of-the-art approaches on Chinese literature text. |
Tasks | Relation Classification |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05662v1 |
PDF | http://arxiv.org/pdf/1803.05662v1.pdf |
PWC | https://paperswithcode.com/paper/structure-regularized-neural-network-for |
Repo | |
Framework | |
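The shortest dependency path (SDP) that the model operates on is easy to illustrate: treat the dependency parse as an undirected graph and take the shortest path between the two entity tokens. The toy parse below (given as head indices) is a made-up English example for readability, since the dataset itself is Chinese literature text.

```python
import networkx as nx

# Toy dependency parse of "The old temple stands quietly beside the river".
tokens = ["The", "old", "temple", "stands", "quietly", "beside", "the", "river"]
heads  = [2, 2, 3, -1, 3, 3, 7, 5]          # head index per token, -1 = root

G = nx.Graph([(i, h) for i, h in enumerate(heads) if h != -1])

e1, e2 = tokens.index("temple"), tokens.index("river")
sdp = nx.shortest_path(G, source=e1, target=e2)
print(" -> ".join(tokens[i] for i in sdp))   # temple -> stands -> beside -> river
```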
A Covariance Matrix Self-Adaptation Evolution Strategy for Optimization under Linear Constraints
Title | A Covariance Matrix Self-Adaptation Evolution Strategy for Optimization under Linear Constraints |
Authors | Patrick Spettel, Hans-Georg Beyer, Michael Hellwig |
Abstract | This paper addresses the development of a covariance matrix self-adaptation evolution strategy (CMSA-ES) for solving optimization problems with linear constraints. The proposed algorithm is referred to as Linear Constraint CMSA-ES (lcCMSA-ES). It uses a specially built mutation operator together with repair by projection to satisfy the constraints. The lcCMSA-ES evolves itself on a linear manifold defined by the constraints. The objective function is only evaluated at feasible search points (interior point method). This is a property often required in application domains such as simulation optimization and finite element methods. The algorithm is tested on a variety of test problems and yields promising results. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.05845v2 |
PDF | http://arxiv.org/pdf/1806.05845v2.pdf |
PWC | https://paperswithcode.com/paper/a-covariance-matrix-self-adaptation-evolution |
Repo | |
Framework | |
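The "repair by projection" component is straightforward to sketch for linear equality constraints: an infeasible mutation x is mapped to the closest point on the affine manifold {x : Ax = b} before evaluation, so the objective is only ever queried at feasible points. The toy loop below handles only equality constraints and omits the covariance and step-size self-adaptation of the full lcCMSA-ES; the objective, population sizes, and constraints are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(2, n))          # linear equality constraints A x = b
b = rng.normal(size=2)

def project(x):
    """Orthogonal projection of x onto the affine manifold {x : A x = b}."""
    return x - A.T @ np.linalg.solve(A @ A.T, A @ x - b)

def objective(x):                    # toy sphere objective; any black box would do
    return float(np.sum(x ** 2))

# Minimal (mu, lambda)-style loop: mutate, repair by projection, evaluate, recombine.
parent = project(rng.normal(size=n))
sigma = 0.5
for _ in range(100):
    offspring = [project(parent + sigma * rng.normal(size=n)) for _ in range(10)]
    offspring.sort(key=objective)
    parent = np.mean(offspring[:3], axis=0)   # recombination of the best mutants
                                              # (a mean of feasible points stays feasible)

print("constraint violation:", np.abs(A @ parent - b).max(), " objective:", objective(parent))
```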