October 18, 2019

2846 words 14 mins read

Paper Group ANR 484

Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data. GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining. Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval. Deep Mixture of Experts via Shallow Embedding. What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits. …

Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data

Title Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data
Authors Yi Yu, Suhua Tang, Kiyoharu Aizawa, Akiko Aizawa
Abstract In this work, travel destinations and business locations are taken as venues. Discovering a venue from a photo is very important for context-aware applications. Unfortunately, few efforts have paid attention to complicated real-world images such as venue photos generated by users. Our goal is fine-grained venue discovery from heterogeneous social multimodal data. To this end, we propose a novel deep learning model, Category-based Deep Canonical Correlation Analysis (C-DCCA). Given a photo as input, this model performs (i) exact venue search (find the venue where the photo was taken), and (ii) group venue search (find relevant venues with the same category as that of the photo), via the cross-modal correlation between the input photo and the textual descriptions of venues. In this model, data in different modalities are projected into the same space via deep networks. Pairwise correlation (between different modal data from the same venue) for exact venue search and category-based correlation (between different modal data from different venues with the same category) for group venue search are jointly optimized. Because a photo cannot fully reflect the rich textual description of a venue, the number of photos per venue in the training phase is increased to capture more aspects of a venue. We build a new venue-aware multimodal dataset by integrating Wikipedia featured articles and Foursquare venue photos. Experimental results on this dataset confirm the feasibility of the proposed method. Moreover, an evaluation on another publicly available dataset confirms that the proposed method outperforms state-of-the-art methods for cross-modal retrieval between image and text.
Tasks Cross-Modal Retrieval
Published 2018-05-08
URL http://arxiv.org/abs/1805.02997v1
PDF http://arxiv.org/pdf/1805.02997v1.pdf
PWC https://paperswithcode.com/paper/category-based-deep-cca-for-fine-grained
Repo
Framework
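
Below is a minimal PyTorch sketch of the kind of joint objective the abstract describes: two branches project image and text features into a shared space, and training combines a pairwise term (matching photo/venue pairs) with a category-level term (same-category venues). This is not the authors' code; the feature dimensions, the cosine-similarity surrogate for "correlation", and the weighting are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    """Projects one modality into the shared embedding space."""
    def __init__(self, in_dim, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, out_dim))
    def forward(self, x):
        return F.normalize(self.net(x), dim=1)   # unit-norm embeddings

def cdcca_style_loss(img_emb, txt_emb, categories, alpha=0.5):
    # Pairwise term: matched image/text rows should be highly similar.
    sim = img_emb @ txt_emb.t()
    pairwise = -sim.diag().mean()
    # Category term: cross-modal pairs from different venues of the same category.
    same_cat = (categories[:, None] == categories[None, :]).float()
    same_cat.fill_diagonal_(0)
    category = -(sim * same_cat).sum() / same_cat.sum().clamp(min=1.0)
    return pairwise + alpha * category

img_branch, txt_branch = Branch(2048), Branch(300)   # assumed CNN / text feature sizes
imgs, txts = torch.randn(8, 2048), torch.randn(8, 300)
cats = torch.randint(0, 3, (8,))
cdcca_style_loss(img_branch(imgs), txt_branch(txts), cats).backward()
```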

GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining

Title GitGraph - Architecture Search Space Creation through Frequent Computational Subgraph Mining
Authors Kamil Bennani-Smires, Claudiu Musat, Andreea Hossmann, Michael Baeriswyl
Abstract The dramatic success of deep neural networks across multiple application areas often relies on experts painstakingly designing a network architecture specific to each task. To simplify this process and make it more accessible, an emerging research effort seeks to automate the design of neural network architectures, using, e.g., evolutionary algorithms, reinforcement learning, or simple search in a constrained space of neural modules. Considering the typical size of the search space (e.g. $10^{10}$ candidates for a $10$-layer network) and the cost of evaluating a single candidate, current architecture search methods are very restricted. They either rely on static pre-built modules to be recombined for the task at hand, or they define a static hand-crafted framework within which they can generate new architectures from the simplest possible operations. In this paper, we relax these restrictions by capitalizing on the collective wisdom contained in the plethora of neural networks published in online code repositories. Concretely, we (a) extract and publish GitGraph, a corpus of neural architectures and their descriptions; (b) create problem-specific neural architecture search spaces, implemented as a textual search mechanism over GitGraph; and (c) propose a method for identifying unique common subgraphs within the architectures solving each problem (e.g., image processing, reinforcement learning), which can then serve as modules in the newly created problem-specific neural search space.
Tasks Neural Architecture Search
Published 2018-01-16
URL http://arxiv.org/abs/1801.05159v1
PDF http://arxiv.org/pdf/1801.05159v1.pdf
PWC https://paperswithcode.com/paper/gitgraph-architecture-search-space-creation
Repo
Framework
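
As a toy illustration of the mining step, the sketch below counts how many architecture graphs contain each labeled edge (the simplest possible "subgraph") and keeps those above a support threshold. The op names and graphs are made up, and real frequent-subgraph mining (e.g. gSpan-style algorithms over GitGraph) handles much larger patterns.

```python
from collections import Counter

# Each architecture is a list of (src_op, dst_op) edges; these examples are hypothetical.
architectures = [
    [("conv3x3", "batchnorm"), ("batchnorm", "relu"), ("relu", "maxpool")],
    [("conv3x3", "batchnorm"), ("batchnorm", "relu"), ("relu", "conv3x3")],
    [("dense", "relu"), ("relu", "dropout")],
]

support = Counter()
for graph in architectures:
    for edge in set(graph):        # count each pattern at most once per graph (support)
        support[edge] += 1

min_support = 2
frequent = {e: s for e, s in support.items() if s >= min_support}
print(frequent)   # conv3x3->batchnorm and batchnorm->relu both appear in 2 graphs
```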

Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval

Title Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
Authors Lin Wu, Yang Wang, Ling Shao
Abstract In this paper, we propose a novel deep generative approach to cross-modal retrieval that learns hash functions in the absence of paired training samples through a cycle consistency loss. Our approach employs an adversarial training scheme to learn a pair of hash functions enabling translation between modalities while assuming the underlying semantic relationship. To endow the hash codes with semantics tied to each input-output pair, a cycle consistency loss is further imposed on top of the adversarial training to strengthen the correlations between inputs and corresponding outputs. Our approach learns hash functions generatively, such that the learned hash codes maximally correlate each input-output correspondence while also regenerating the inputs so as to minimize the information loss. Learning to hash is thus performed by jointly optimizing the parameters of the hash functions across modalities as well as the associated generative models. Extensive experiments on a variety of large-scale cross-modal datasets demonstrate that our proposed method achieves better retrieval results than the state-of-the-art.
Tasks Cross-Modal Retrieval
Published 2018-04-30
URL http://arxiv.org/abs/1804.11013v2
PDF http://arxiv.org/pdf/1804.11013v2.pdf
PWC https://paperswithcode.com/paper/cycle-consistent-deep-generative-hashing-for
Repo
Framework
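
The sketch below illustrates the cycle-consistency idea in isolation, assuming simple linear hash encoders/decoders and tanh-relaxed codes: an image is hashed, "translated" into the text space, hashed again, and decoded back, and the reconstruction error ties inputs to outputs. The adversarial losses from the paper are omitted, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Hasher(nn.Module):
    def __init__(self, in_dim, bits=32):
        super().__init__()
        self.fc = nn.Linear(in_dim, bits)
    def forward(self, x):
        return torch.tanh(self.fc(x))          # relaxed binary code in (-1, 1)

img_hash, txt_hash = Hasher(2048), Hasher(300)
img_decode = nn.Linear(32, 2048)               # code -> image features
txt_decode = nn.Linear(32, 300)                # code -> text features

imgs = torch.randn(8, 2048)                    # no paired text is needed for this term

# image -> code -> synthetic text features -> code -> reconstructed image features
code = img_hash(imgs)
code_back = txt_hash(txt_decode(code))
recon = img_decode(code_back)

cycle_loss = F.l1_loss(recon, imgs) + F.mse_loss(code_back, code)
cycle_loss.backward()
```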

Deep Mixture of Experts via Shallow Embedding

Title Deep Mixture of Experts via Shallow Embedding
Authors Xin Wang, Fisher Yu, Lisa Dunlap, Yi-An Ma, Ruth Wang, Azalia Mirhoseini, Trevor Darrell, Joseph E. Gonzalez
Abstract Larger networks generally have greater representational power at the cost of increased computational complexity. Sparsifying such networks has been an active area of research but has been generally limited to static regularization or dynamic approaches using reinforcement learning. We explore a mixture of experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis. Our novel DeepMoE architecture increases the representational power of standard convolutional networks by adaptively sparsifying and recalibrating channel-wise features in each convolutional layer. We employ a multi-headed sparse gating network to determine the selection and scaling of channels for each input, leveraging exponential combinations of experts within a single convolutional network. Our proposed architecture is evaluated on four benchmark datasets and tasks, and we show that Deep-MoEs are able to achieve higher accuracy with lower computation than standard convolutional networks.
Tasks Few-Shot Learning, Meta-Learning, Zero-Shot Learning
Published 2018-06-05
URL http://arxiv.org/abs/1806.01531v3
PDF http://arxiv.org/pdf/1806.01531v3.pdf
PWC https://paperswithcode.com/paper/tafe-net-task-aware-feature-embeddings-for
Repo
Framework
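
A minimal sketch of channel-wise gating driven by a shallow embedding, which is the mechanism the abstract describes; it is not the released DeepMoE code, and the embedding network, gate head, and layer sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    """A conv layer whose output channels are scaled (and possibly zeroed) per example."""
    def __init__(self, in_ch, out_ch, embed_dim=64):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.gate = nn.Linear(embed_dim, out_ch)          # one gating head for this layer
    def forward(self, x, embedding):
        g = F.relu(self.gate(embedding))                  # ReLU keeps the gates sparse and non-negative
        return F.relu(self.conv(x)) * g[:, :, None, None]

# Shallow embedding of the raw input, shared by the gating heads.
embedder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))
block = GatedConvBlock(3, 32)

x = torch.randn(4, 3, 32, 32)
print(block(x, embedder(x)).shape)                        # torch.Size([4, 32, 32, 32])
```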

What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits

Title What Doubling Tricks Can and Can’t Do for Multi-Armed Bandits
Authors Lilian Besson, Emilie Kaufmann
Abstract An online reinforcement learning algorithm is anytime if it does not need to know in advance the horizon T of the experiment. A well-known technique to obtain an anytime algorithm from any non-anytime algorithm is the “Doubling Trick”. In the context of adversarial or stochastic multi-armed bandits, the performance of an algorithm is measured by its regret, and we study two families of sequences of growing horizons (geometric and exponential) to generalize previously known results that certain doubling tricks can be used to conserve certain regret bounds. In a broad setting, we prove that a geometric doubling trick can be used to conserve (minimax) bounds in $R_T = O(\sqrt{T})$ but cannot conserve (distribution-dependent) bounds in $R_T = O(\log T)$. We give insights as to why exponential doubling tricks may be better, as they conserve bounds in $R_T = O(\log T)$, and are close to conserving bounds in $R_T = O(\sqrt{T})$.
Tasks Multi-Armed Bandits
Published 2018-03-19
URL http://arxiv.org/abs/1803.06971v1
PDF http://arxiv.org/pdf/1803.06971v1.pdf
PWC https://paperswithcode.com/paper/what-doubling-tricks-can-and-cant-do-for
Repo
Framework
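
The sketch below shows the doubling trick itself: a horizon-dependent algorithm (a plain UCB index is used here as a stand-in) is restarted on a growing sequence of horizons, either geometric or exponential; the constants and the Bernoulli arms are illustrative.

```python
import numpy as np

def geometric_horizons(T0=100, b=2, n=6):
    return [T0 * b**i for i in range(n)]          # T_i = T_0 * b^i

def exponential_horizons(T0=100, b=2, n=4):
    return [T0 * b**(2**i) for i in range(n)]     # T_i = T_0 * b^(2^i)

class UCB:
    """Simple UCB index whose exploration rate uses the known phase horizon."""
    def __init__(self, T, k=2):
        self.T, self.counts, self.sums = T, np.zeros(k), np.zeros(k)
    def choose(self):
        if (self.counts == 0).any():
            return int(np.argmin(self.counts))
        ucb = self.sums / self.counts + np.sqrt(2 * np.log(self.T) / self.counts)
        return int(np.argmax(ucb))
    def update(self, arm, r):
        self.counts[arm] += 1
        self.sums[arm] += r

def run_with_doubling(bandit_factory, arms, horizons, rng):
    """Restart a fresh, horizon-tuned instance at each phase and play until its horizon."""
    t, total = 0, 0.0
    for T_i in horizons:
        algo = bandit_factory(T_i)
        while t < T_i:
            arm = algo.choose()
            r = float(rng.random() < arms[arm])   # Bernoulli reward
            algo.update(arm, r)
            total += r
            t += 1
    return total

rng = np.random.default_rng(0)
print(run_with_doubling(UCB, arms=[0.4, 0.6], horizons=geometric_horizons(), rng=rng))
```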

Finding beans in burgers: Deep semantic-visual embedding with localization

Title Finding beans in burgers: Deep semantic-visual embedding with localization
Authors Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord
Abstract Several works have proposed to learn a two-path neural network that maps images and texts, respectively, into the same shared Euclidean space, where geometry captures useful semantic relationships. Such a multi-modal embedding can be trained and used for various tasks, notably image captioning. In the present work, we introduce a new architecture of this type, with a visual path that leverages recent space-aware pooling mechanisms. Combined with a textual path which is jointly trained from scratch, our semantic-visual embedding offers a versatile model. Once trained under the supervision of captioned images, it yields new state-of-the-art performance on cross-modal retrieval. It also allows the localization of new concepts from the embedding space into any input image, delivering state-of-the-art results on the visual grounding of phrases.
Tasks Cross-Modal Retrieval, Image Captioning
Published 2018-04-05
URL http://arxiv.org/abs/1804.01720v2
PDF http://arxiv.org/pdf/1804.01720v2.pdf
PWC https://paperswithcode.com/paper/finding-beans-in-burgers-deep-semantic-visual
Repo
Framework
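
A small sketch of the localization step implied by the abstract: once spatial visual features and a phrase embedding live in the same space, scoring every location of the feature map against the phrase yields a grounding heatmap. The feature sizes and the cosine score are assumptions, not the paper's exact pooling path.

```python
import torch
import torch.nn.functional as F

def localize(feature_map, text_emb):
    """feature_map: (C, H, W) visual features projected to the joint space;
    text_emb: (C,) embedding of the query phrase. Returns an (H, W) heatmap."""
    C, H, W = feature_map.shape
    feats = F.normalize(feature_map.reshape(C, H * W), dim=0)   # unit-norm per location
    query = F.normalize(text_emb, dim=0)
    return (query @ feats).reshape(H, W)                        # cosine similarity per location

fmap = torch.randn(256, 14, 14)     # e.g. spatial CNN features (assumed size)
phrase = torch.randn(256)           # embedding of the query phrase from the text path
heatmap = localize(fmap, phrase)
print(heatmap.argmax())             # flattened index of the most responsive location
```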

Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval

Title Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
Authors Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, Dacheng Tao
Abstract Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (\textbf{SSAH}) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal hashing in a self-supervised fashion. The primary contribution of this work is that two adversarial networks are leveraged to maximize the semantic correlation and consistency of the representations between different modalities. In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations. Such information guides the feature learning process and preserves the modality relationships in both the common semantic space and the Hamming space. Extensive experiments carried out on three benchmark datasets validate that the proposed SSAH surpasses the state-of-the-art methods.
Tasks Cross-Modal Retrieval
Published 2018-04-04
URL http://arxiv.org/abs/1804.01223v1
PDF http://arxiv.org/pdf/1804.01223v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-adversarial-hashing-networks
Repo
Framework
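
The sketch below isolates the adversarial ingredient: a discriminator tries to tell which modality a representation came from while the encoders try to fool it, pushing image and text features toward a common distribution. The self-supervised semantic network and the hashing losses of SSAH are omitted, and the layer sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

img_enc = nn.Linear(2048, 128)
txt_enc = nn.Linear(300, 128)
disc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))   # modality classifier

imgs, txts = torch.randn(8, 2048), torch.randn(8, 300)
f_img, f_txt = img_enc(imgs), txt_enc(txts)

# Discriminator step: label image features 1, text features 0.
labels = torch.cat([torch.ones(8), torch.zeros(8)])
logits_d = torch.cat([disc(f_img.detach()), disc(f_txt.detach())]).squeeze(1)
d_loss = F.binary_cross_entropy_with_logits(logits_d, labels)

# Encoder step: make the discriminator mispredict the modality.
logits_g = torch.cat([disc(f_img), disc(f_txt)]).squeeze(1)
g_loss = F.binary_cross_entropy_with_logits(logits_g, 1 - labels)
# d_loss and g_loss would be optimized alternately alongside the hashing objectives.
```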

Are All Training Examples Created Equal? An Empirical Study

Title Are All Training Examples Created Equal? An Empirical Study
Authors Kailas Vodrahalli, Ke Li, Jitendra Malik
Abstract Modern computer vision algorithms often rely on very large training datasets. However, it is conceivable that a carefully selected subsample of the dataset is sufficient for training. In this paper, we propose a gradient-based importance measure that we use to empirically analyze relative importance of training images in four datasets of varying complexity. We find that in some cases, a small subsample is indeed sufficient for training. For other datasets, however, the relative differences in importance are negligible. These results have important implications for active learning on deep networks. Additionally, our analysis method can be used as a general tool to better understand diversity of training examples in datasets.
Tasks Active Learning
Published 2018-11-30
URL http://arxiv.org/abs/1811.12569v1
PDF http://arxiv.org/pdf/1811.12569v1.pdf
PWC https://paperswithcode.com/paper/are-all-training-examples-created-equal-an
Repo
Framework
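
A minimal sketch of a gradient-based importance score in the spirit of the abstract (the exact measure used in the paper may differ): each training example is scored by the norm of the parameter gradient its loss induces, and the highest-scoring examples form a candidate subsample.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(20, 5)                          # stand-in classifier
xs, ys = torch.randn(16, 20), torch.randint(0, 5, (16,))

scores = []
for x, y in zip(xs, ys):
    model.zero_grad()
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
    scores.append(grad_norm.item())

ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking[:5])                                # most "important" examples for this model
```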

Sparse Reduced Rank Regression With Nonconvex Regularization

Title Sparse Reduced Rank Regression With Nonconvex Regularization
Authors Ziping Zhao, Daniel P. Palomar
Abstract In this paper, the estimation problem for the sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection, with applications in signal processing, econometrics, etc. The problem is formulated as minimizing the least squares loss with a sparsity-inducing penalty under an orthogonality constraint. Convex sparsity-inducing functions have been used for SRRR in the literature. In this work, a nonconvex function is proposed to induce sparsity more effectively. An efficient algorithm is developed based on the alternating minimization (or projection) method to solve the nonconvex optimization problem. Numerical simulations show that the proposed algorithm is much more efficient than the benchmark methods and that the nonconvex function can result in better estimation accuracy.
Tasks Dimensionality Reduction
Published 2018-03-20
URL http://arxiv.org/abs/1803.07247v1
PDF http://arxiv.org/pdf/1803.07247v1.pdf
PWC https://paperswithcode.com/paper/sparse-reduced-rank-regression-with-nonconvex
Repo
Framework
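
A small numpy sketch of an SRRR-style alternating scheme on the model Y ≈ X A Bᵀ with orthonormal B and sparse A; for simplicity it uses a convex l1 soft-threshold in place of the paper's nonconvex penalty, and the problem sizes, penalty weight, and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 200, 30, 10, 3
X = rng.standard_normal((n, p))
A_true = np.zeros((p, r)); A_true[:5] = rng.standard_normal((5, r))   # only 5 active rows
B_true, _ = np.linalg.qr(rng.standard_normal((q, r)))
Y = X @ A_true @ B_true.T + 0.1 * rng.standard_normal((n, q))

def soft(M, t):                       # elementwise soft-thresholding (convex stand-in)
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

A = np.zeros((p, r))
B, _ = np.linalg.qr(rng.standard_normal((q, r)))
lam, step = 0.1, 1.0 / np.linalg.norm(X, 2) ** 2
for _ in range(200):
    # B-step (orthogonality): Procrustes solution via the SVD of Y.T @ X @ A.
    U, _, Vt = np.linalg.svd(Y.T @ X @ A, full_matrices=False)
    B = U @ Vt
    # A-step (sparsity): one proximal-gradient step on ||Y @ B - X @ A||_F^2.
    grad = -X.T @ (Y @ B - X @ A)
    A = soft(A - step * grad, step * lam)

print(np.round(A, 2)[:7])   # entries beyond the first 5 rows should be much smaller
```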

Applying an Ensemble Learning Method for Improving Multi-label Classification Performance

Title Applying an Ensemble Learning Method for Improving Multi-label Classification Performance
Authors Amirreza Mahdavi-Shahri, Mahboobeh Houshmand, Mahdi Yaghoobi, Mehrdad Jalali
Abstract In recent years, the multi-label classification problem has become an increasingly active research topic. In this kind of classification, each sample is associated with a set of class labels. Ensemble approaches are supervised learning algorithms in which an operator takes a number of learning algorithms, namely base-level algorithms, and combines their outcomes to make an estimation. The simplest form of ensemble learning is to train the base-level algorithms on random subsets of the data and then let them vote for the most popular classification or average the predictions of the base-level algorithms. In this study, an ensemble learning method is proposed for improving multi-label classification evaluation criteria. We compare our method with well-known base-level algorithms on several datasets. Experimental results show that the proposed approach outperforms the well-known base-level classifiers on the multi-label classification problem.
Tasks Multi-Label Classification
Published 2018-01-07
URL http://arxiv.org/abs/1801.02149v1
PDF http://arxiv.org/pdf/1801.02149v1.pdf
PWC https://paperswithcode.com/paper/applying-an-ensemble-learning-method-for
Repo
Framework
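
The sketch below implements the generic recipe the abstract describes, not the paper's specific method: several base multi-label classifiers are trained on random subsets and their per-label probabilities are averaged; the synthetic dataset, base learner, and 0.5 threshold are assumptions.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, Y = make_multilabel_classification(n_samples=300, n_features=20, n_classes=5, random_state=0)
rng = np.random.default_rng(0)

members = []
for _ in range(5):                                       # 5 base-level learners
    idx = rng.choice(len(X), size=len(X) // 2, replace=False)
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X[idx], Y[idx])
    members.append(clf)

probs = np.mean([m.predict_proba(X) for m in members], axis=0)
Y_hat = (probs >= 0.5).astype(int)                       # averaged vote per label
print((Y_hat == Y).mean())                               # per-label accuracy of the ensemble
```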

Decentralized Decision-Making Over Multi-Task Networks

Title Decentralized Decision-Making Over Multi-Task Networks
Authors Sahar Khawatmi, Abdelhak M. Zoubir, Ali H. Sayed
Abstract In important applications involving multi-task networks with multiple objectives, agents in the network need to decide between these multiple objectives and reach an agreement about which single objective to follow for the network. In this work we propose a distributed decision-making algorithm. The agents are assumed to observe data that may be generated by different models. Through localized interactions, the agents reach agreement about which model to track and interact with each other in order to enhance the network performance. We investigate the approach for both static and mobile networks. The simulations illustrate the performance of the proposed strategies.
Tasks Decision Making
Published 2018-12-20
URL http://arxiv.org/abs/1812.08843v2
PDF http://arxiv.org/pdf/1812.08843v2.pdf
PWC https://paperswithcode.com/paper/decentralized-decision-making-over-multi-task
Repo
Framework
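
A minimal numpy sketch of diffusion-style adaptation over a network, the general machinery behind such multi-agent strategies (the paper's specific rule for deciding among competing models is not reproduced here): each agent takes a local LMS step on its own streaming data and then averages its estimate with its neighbors', so the network gradually agrees on a common model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, mu = 6, 4, 0.05
w_true = rng.standard_normal(dim)                 # the single model the network should settle on

A = np.eye(n_agents)                              # ring topology: self + two neighbors
for k in range(n_agents):
    A[k, (k - 1) % n_agents] = A[k, (k + 1) % n_agents] = 1.0
A = A / A.sum(axis=1, keepdims=True)              # row-stochastic combination weights

W = np.zeros((n_agents, dim))                     # each row: one agent's estimate
for t in range(2000):
    # Adapt: each agent observes its own (x, d) and takes an LMS step.
    for k in range(n_agents):
        x = rng.standard_normal(dim)
        d = x @ w_true + 0.1 * rng.standard_normal()
        W[k] += mu * (d - x @ W[k]) * x
    # Combine: average with neighbors according to A.
    W = A @ W

print(np.linalg.norm(W - w_true, axis=1))          # every agent ends up close to w_true
```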

Faster Bounding Box Annotation for Object Detection in Indoor Scenes

Title Faster Bounding Box Annotation for Object Detection in Indoor Scenes
Authors Bishwo Adhikari, Jukka Peltomäki, Jussi Puura, Heikki Huttunen
Abstract This paper proposes an approach for rapid bounding box annotation for object detection datasets. The procedure consists of two stages: the first step is to annotate a part of the dataset manually, and the second step proposes annotations for the remaining samples using a model trained with the first-stage annotations. We experimentally study which first/second stage split minimizes the total workload. In addition, we introduce a new fully labeled object detection dataset collected from indoor scenes. Compared to other indoor datasets, our collection has more class categories, different backgrounds, lighting conditions, occlusion, and high intra-class differences. We train deep learning based object detectors with a number of state-of-the-art models and compare them in terms of speed and accuracy. The fully annotated dataset is released and freely available to the research community.
Tasks Object Detection, Object Detection In Indoor Scenes
Published 2018-07-03
URL http://arxiv.org/abs/1807.03142v1
PDF http://arxiv.org/pdf/1807.03142v1.pdf
PWC https://paperswithcode.com/paper/faster-bounding-box-annotation-for-object
Repo
Framework
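
The two-stage trade-off can be made concrete with a toy cost model: annotating a fraction f of the images manually costs full drawing time, while the rest only need corrections whose cost shrinks as the first-stage model improves. All numbers and the saturation curve below are hypothetical, not measurements from the paper.

```python
import numpy as np

N = 10_000            # dataset size (assumed)
t_manual = 35.0       # seconds to draw boxes from scratch (assumed)
t_verify = 8.0        # seconds to accept or fix a good proposal (assumed)

def correction_time(f):
    # Assumed model: proposals improve with more first-stage data, saturating toward t_verify.
    return t_verify + (t_manual - t_verify) * np.exp(-8.0 * f)

fractions = np.linspace(0.05, 0.95, 19)
total = fractions * N * t_manual + (1 - fractions) * N * correction_time(fractions)
best = fractions[np.argmin(total)]
print(f"best first-stage fraction ~ {best:.2f}, total ~ {total.min() / 3600:.1f} h")
```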

Linear Bandits with Stochastic Delayed Feedback

Title Linear Bandits with Stochastic Delayed Feedback
Authors Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner
Abstract Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by practitioners hoping to apply existing algorithms is that usually the feedback is randomly delayed and delays are only partially observable. For example, while a purchase is usually observable some time after the display, the decision of not buying is never explicitly sent to the system. In other words, the learner only observes delayed positive events. We formalize this problem as a novel stochastic delayed linear bandit and propose ${\tt OTFLinUCB}$ and ${\tt OTFLinTS}$, two computationally efficient algorithms able to integrate new information as it becomes available and to deal with the permanently censored feedback. We prove optimal $\tilde O(\smash{d\sqrt{T}})$ bounds on the regret of the first algorithm and study the dependency on delay-dependent parameters. Our model, assumptions and results are validated by experiments on simulated and real data.
Tasks Multi-Armed Bandits
Published 2018-07-05
URL https://arxiv.org/abs/1807.02089v3
PDF https://arxiv.org/pdf/1807.02089v3.pdf
PWC https://paperswithcode.com/paper/contextual-bandits-under-delayed-feedback
Repo
Framework
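
A small simulation sketch of the feedback model (not of OTFLinUCB/OTFLinTS themselves): conversions are Bernoulli, a positive conversion is revealed only after a random delay, and negatives are never revealed, so at any round the learner sees only the conversions whose delay has already elapsed. The conversion rate and delay distribution are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p_convert, mean_delay = 500, 0.3, 20.0
pending = []                                  # (reveal_time, action) for not-yet-seen conversions
observed = np.zeros(T, dtype=bool)

for t in range(T):
    action = rng.integers(0, 3)               # stand-in for the chosen arm / context
    if rng.random() < p_convert:              # a conversion happens...
        delay = rng.exponential(mean_delay)
        pending.append((t + delay, action))   # ...but is only revealed after the delay
    # Feedback available at round t: conversions whose delay has elapsed.
    revealed = [a for (rt, a) in pending if rt <= t]
    pending = [(rt, a) for (rt, a) in pending if rt > t]
    observed[t] = len(revealed) > 0           # the learner can only update from these

print(observed.sum(), "rounds received at least one (delayed) conversion")
```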

Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text

Title Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text
Authors Ji Wen, Xu Sun, Xuancheng Ren, Qi Su
Abstract Relation classification is an important semantic processing task in the field of natural language processing. In this paper, we propose the task of relation classification for Chinese literature text. A new dataset of Chinese literature text is constructed to facilitate the study in this task. We present a novel model, named Structure Regularized Bidirectional Recurrent Convolutional Neural Network (SR-BRCNN), to identify the relation between entities. The proposed model learns relation representations along the shortest dependency path (SDP) extracted from the structure regularized dependency tree, which has the benefits of reducing the complexity of the whole model. Experimental results show that the proposed method significantly improves the F1 score by 10.3, and outperforms the state-of-the-art approaches on Chinese literature text.
Tasks Relation Classification
Published 2018-03-15
URL http://arxiv.org/abs/1803.05662v1
PDF http://arxiv.org/pdf/1803.05662v1.pdf
PWC https://paperswithcode.com/paper/structure-regularized-neural-network-for
Repo
Framework
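
A minimal sketch of the shortest-dependency-path (SDP) extraction the abstract relies on, using networkx; the toy sentence and dependency edges are made up, and the SR-BRCNN encoder that consumes the path is not shown.

```python
import networkx as nx

# (head, dependent) edges of a toy dependency parse of "Alice bought a book from Bob"
edges = [("bought", "Alice"), ("bought", "book"), ("book", "a"),
         ("bought", "from"), ("from", "Bob")]
tree = nx.Graph(edges)            # treat the dependency tree as an undirected graph

sdp = nx.shortest_path(tree, source="Alice", target="Bob")
print(sdp)   # ['Alice', 'bought', 'from', 'Bob'] -> the token sequence fed to the relation model
```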

A Covariance Matrix Self-Adaptation Evolution Strategy for Optimization under Linear Constraints

Title A Covariance Matrix Self-Adaptation Evolution Strategy for Optimization under Linear Constraints
Authors Patrick Spettel, Hans-Georg Beyer, Michael Hellwig
Abstract This paper addresses the development of a covariance matrix self-adaptation evolution strategy (CMSA-ES) for solving optimization problems with linear constraints. The proposed algorithm is referred to as Linear Constraint CMSA-ES (lcCMSA-ES). It uses a specially built mutation operator together with repair by projection to satisfy the constraints. The lcCMSA-ES evolves itself on a linear manifold defined by the constraints. The objective function is only evaluated at feasible search points (interior point method). This is a property often required in application domains such as simulation optimization and finite element methods. The algorithm is tested on a variety of test problems and shows promising results.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.05845v2
PDF http://arxiv.org/pdf/1806.05845v2.pdf
PWC https://paperswithcode.com/paper/a-covariance-matrix-self-adaptation-evolution
Repo
Framework
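
A small numpy sketch of the "repair by projection" ingredient for the simplest case of equality constraints A x = b (the full lcCMSA-ES, its covariance self-adaptation, and more general linear constraints are not reproduced): infeasible mutants are projected back onto the constraint manifold, so the objective is only ever evaluated at feasible points.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 1.0]])        # constraint: x1 + x2 + x3 = 1
b = np.array([1.0])

def repair(x):
    """Orthogonal projection of x onto {x : A x = b}."""
    correction = A.T @ np.linalg.solve(A @ A.T, A @ x - b)
    return x - correction

def objective(x):                       # toy sphere objective, evaluated only on feasible points
    return float(x @ x)

parent = repair(rng.standard_normal(3))
sigma = 0.3
for gen in range(50):                   # (1, 10)-ES with repair; step-size adaptation omitted
    offspring = [repair(parent + sigma * rng.standard_normal(3)) for _ in range(10)]
    parent = min(offspring, key=objective)

print(parent, A @ parent)               # stays on the constraint plane, near [1/3, 1/3, 1/3]
```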