May 7, 2019


Paper Group ANR 122


Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks


Title	Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks
Authors	Gonçalo Oliveira, Xavier Frazão, André Pimentel, Bernardete Ribeiro
Abstract	Brand recognition is a very challenging topic with many useful applications in localization recognition, advertisement and marketing. In this paper we present an automatic graphic logo detection system that robustly handles unconstrained imaging conditions. Our approach is based on Fast Region-based Convolutional Networks (FRCN) proposed by Ross Girshick, which have shown state-of-the-art performance in several generic object recognition tasks (PASCAL Visual Object Classes challenges). In particular, we use two CNN models pre-trained with the ILSVRC ImageNet dataset and we look at the selective search of window ‘proposals’ in the pre-processing stage and data augmentation to enhance the logo recognition rate. The novelty lies in the use of transfer learning to leverage powerful Convolutional Neural Network models trained with large-scale datasets and repurpose them in the context of graphic logo detection. Another benefit of this framework is that it allows for multiple detections of graphic logos using regions that are likely to have an object. Experimental results with the FlickrLogos-32 dataset show not only the promising performance of our developed models with respect to noise and other transformations a graphic logo can be subject to, but also its superiority over state-of-the-art systems with hand-crafted models and features.
Tasks	Data Augmentation, Logo Recognition, Object Recognition, Transfer Learning
Published	2016-04-20
URL	http://arxiv.org/abs/1604.06083v1
PDF	http://arxiv.org/pdf/1604.06083v1.pdf
PWC	https://paperswithcode.com/paper/automatic-graphic-logo-detection-via-fast
Repo
Framework

On the convergence of gradient-like flows with noisy gradient input


Title	On the convergence of gradient-like flows with noisy gradient input
Authors	Panayotis Mertikopoulos, Mathias Staudigl
Abstract	In view of solving convex optimization problems with noisy gradient input, we analyze the asymptotic behavior of gradient-like flows under stochastic disturbances. Specifically, we focus on the widely studied class of mirror descent schemes for convex programs with compact feasible regions, and we examine the dynamics’ convergence and concentration properties in the presence of noise. In the vanishing noise limit, we show that the dynamics converge to the solution set of the underlying problem (a.s.). Otherwise, when the noise is persistent, we show that the dynamics are concentrated around interior solutions in the long run, and they converge to boundary solutions that are sufficiently “sharp”. Finally, we show that a suitably rectified variant of the method converges irrespective of the magnitude of the noise (or the structure of the underlying convex program), and we derive an explicit estimate for its rate of convergence.
Tasks
Published	2016-11-21
URL	http://arxiv.org/abs/1611.06730v2
PDF	http://arxiv.org/pdf/1611.06730v2.pdf
PWC	https://paperswithcode.com/paper/on-the-convergence-of-gradient-like-flows
Repo
Framework
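
As a rough illustration of the setting in this abstract, the sketch below runs a discrete-time entropic mirror descent (exponentiated gradient) on the probability simplex with persistently noisy gradients. It is not the authors' continuous-time dynamics or their rectified variant; the linear objective, the step size `eta`, the noise level `sigma`, and the function name are all assumptions chosen for illustration. A linear objective is used because its minimiser is a vertex of the simplex, which is the kind of "sharp" boundary solution the abstract refers to.

```python
import math
import random


def entropic_mirror_descent(cost, steps=2000, eta=0.1, sigma=1.0, seed=0):
    """Minimise <cost, x> over the probability simplex from noisy gradients.

    Noisy gradients are accumulated in a dual "score" vector; the primal
    iterate is the softmax (entropic mirror map) of that score.
    """
    rng = random.Random(seed)
    score = [0.0] * len(cost)
    x = [1.0 / len(cost)] * len(cost)
    for _ in range(steps):
        # The gradient of a linear objective is `cost`; add zero-mean noise.
        noisy_grad = [c + rng.gauss(0.0, sigma) for c in cost]
        score = [s - eta * g for s, g in zip(score, noisy_grad)]
        # Mirror step: numerically stable softmax of the dual score.
        top = max(score)
        weights = [math.exp(s - top) for s in score]
        total = sum(weights)
        x = [w / total for w in weights]
    return x


if __name__ == "__main__":
    # Coordinate 1 has the smallest cost, so mass should concentrate there.
    print(entropic_mirror_descent([0.3, 0.1, 0.5]))
```

With these assumed settings the accumulated drift toward the cheapest coordinate grows linearly in the number of steps while the accumulated noise grows only like its square root, which is why the iterate ends up near the vertex despite the noise.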


Online Influence Maximization in Non-Stationary Social Networks


Title	Online Influence Maximization in Non-Stationary Social Networks
Authors	Yixin Bao, Xiaoke Wang, Zhi Wang, Chuan Wu, Francis C. M. Lau
Abstract	Social networks have been popular platforms for information propagation. An important use case is viral marketing: given a promotion budget, an advertiser can choose some influential users as the seed set and provide them free or discounted sample products; in this way, the advertiser hopes to increase the popularity of the product in the users’ friend circles by the word-of-mouth effect, and thus maximize the number of users that information about the product can reach. There has been a body of literature studying the influence maximization problem. Nevertheless, the existing studies mostly investigate the problem on a one-off basis, assuming fixed known influence probabilities among users, or the knowledge of the exact social network topology. In practice, the social network topology and the influence probabilities are typically unknown to the advertiser, and they can vary over time, e.g., when social ties are newly established, strengthened or weakened. In this paper, we focus on a dynamic non-stationary social network and design a randomized algorithm, RSB, based on multi-armed bandit optimization, to maximize influence propagation over time. The algorithm produces a sequence of online decisions and calibrates its explore-exploit strategy utilizing outcomes of previous decisions. It is rigorously proven to achieve an upper-bounded regret in reward and to be applicable to large-scale social networks. Practical effectiveness of the algorithm is evaluated using both synthetic and real-world datasets, which demonstrates that our algorithm outperforms previous stationary methods under non-stationary conditions.
Tasks
Published	2016-04-26
URL	http://arxiv.org/abs/1604.07638v1
PDF	http://arxiv.org/pdf/1604.07638v1.pdf
PWC	https://paperswithcode.com/paper/online-influence-maximization-in-non
Repo
Framework
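
The abstract does not spell out RSB, so the sketch below substitutes EXP3, a standard randomized bandit algorithm that makes no stationarity assumption, to show how a randomized explore-exploit strategy can be calibrated from the outcomes of earlier decisions. Treating each arm as a candidate seed choice, the Bernoulli reward model, and the names `exp3` and `make_bernoulli_env` are assumptions for illustration only.

```python
import math
import random


def exp3(reward_fn, n_arms, horizon, gamma=0.1, seed=0):
    """Run EXP3 and return how many times each arm was pulled.

    reward_fn(t, arm) must return a reward in [0, 1]. It may depend on t,
    which is how non-stationarity enters.
    """
    rng = random.Random(seed)
    log_w = [0.0] * n_arms  # log-weights, kept in log space for stability
    pulls = [0] * n_arms
    for t in range(horizon):
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * wi / total + gamma / n_arms for wi in w]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        pulls[arm] += 1
        # Importance-weighted reward estimate for the pulled arm only.
        log_w[arm] += gamma * (reward / probs[arm]) / n_arms
    return pulls


def make_bernoulli_env(means_at, seed=1):
    """Build reward_fn from means_at(t), a list of success probabilities."""
    rng = random.Random(seed)

    def reward_fn(t, arm):
        return 1.0 if rng.random() < means_at(t)[arm] else 0.0

    return reward_fn


if __name__ == "__main__":
    # Hypothetical drift: the better arm changes halfway through.
    drifting = make_bernoulli_env(
        lambda t: [0.2, 0.8] if t < 2500 else [0.8, 0.2]
    )
    print(exp3(drifting, n_arms=2, horizon=5000))
```

The exploration rate `gamma` keeps every arm's sampling probability bounded away from zero, which is what lets the estimates recover when the reward distribution changes.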

Parameterized Analysis of Multi-objective Evolutionary Algorithms and the Weighted Vertex Cover Problem


Title	Parameterized Analysis of Multi-objective Evolutionary Algorithms and the Weighted Vertex Cover Problem
Authors	Mojgan Pourhassan, Feng Shi, Frank Neumann
Abstract	A rigorous runtime analysis of evolutionary multi-objective optimization for the classical vertex cover problem in the context of parameterized complexity analysis has been presented by Kratsch and Neumann (2013). In this paper, we extend the analysis to the weighted vertex cover problem and provide a fixed parameter evolutionary algorithm with respect to OPT, the cost of the optimal solution for the problem. Moreover, using a diversity mechanism, we present a multi-objective evolutionary algorithm that finds a 2-approximation in expected polynomial time and introduce a population-based evolutionary algorithm which finds a $(1+\varepsilon)$-approximation in expected time $O(n\cdot 2^{\min\{n,\,2(1-\varepsilon)OPT\}} + n^3)$.
Tasks
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01495v1
PDF	http://arxiv.org/pdf/1604.01495v1.pdf
PWC	https://paperswithcode.com/paper/parameterized-analysis-of-multi-objective
Repo
Framework
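
To make the 2-approximation guarantee concrete, here is a minimal sketch of the classical deterministic pricing (local-ratio) method for weighted vertex cover, checked against a brute-force optimum on a tiny graph. This is not the evolutionary algorithm analysed in the paper; it only illustrates what a cover of cost at most 2·OPT looks like. The function names and the example graph are assumptions.

```python
from itertools import combinations


def pricing_vertex_cover(weights, edges):
    """Return a vertex cover whose weight is at most twice the optimum.

    Each edge "pays" as much as both endpoints can still afford; vertices
    whose residual weight reaches zero form the cover.
    """
    residual = dict(weights)
    for u, v in edges:
        pay = min(residual[u], residual[v])
        residual[u] -= pay
        residual[v] -= pay
    endpoints = {x for edge in edges for x in edge}
    return {v for v in endpoints if residual[v] == 0}


def optimal_cover_cost(weights, edges):
    """Brute-force optimum; exponential, for tiny sanity checks only."""
    vertices = list(weights)
    best = sum(weights.values())
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                best = min(best, sum(weights[v] for v in chosen))
    return best


if __name__ == "__main__":
    weights = {"a": 3, "b": 2, "c": 4, "d": 1}  # integer weights keep == 0 exact
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
    cover = pricing_vertex_cover(weights, edges)
    print(sorted(cover), sum(weights[v] for v in cover),
          optimal_cover_cost(weights, edges))
```

The guarantee follows because every payment is charged to an edge that any cover must also pay for, and each payment is counted against at most two vertices.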

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations


Title	Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Authors	Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Fei-Fei Li
Abstract	Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that involve not just recognizing, but reasoning about our visual world. However, models used to tackle the rich content in images for cognitive tasks are still being trained using the same datasets designed for perceptual tasks. To achieve success at cognitive tasks, models need to understand the interactions and relationships between objects in an image. When asked “What vehicle is the person riding?”, computers will need to identify the objects in an image as well as the relationships riding(man, carriage) and pulling(horse, carriage) in order to answer correctly that “the person is riding a horse-drawn carriage”. In this paper, we present the Visual Genome dataset to enable the modeling of such relationships. We collect dense annotations of objects, attributes, and relationships within each image to learn these models. Specifically, our dataset contains over 100K images where each image has an average of 21 objects, 18 attributes, and 18 pairwise relationships between objects. We canonicalize the objects, attributes, relationships, and noun phrases in region descriptions and question-answer pairs to WordNet synsets. Together, these annotations represent the densest and largest dataset of image descriptions, objects, attributes, relationships, and question answers.
Tasks	Image Classification, Question Answering
Published	2016-02-23
URL	http://arxiv.org/abs/1602.07332v1
PDF	http://arxiv.org/pdf/1602.07332v1.pdf
PWC	https://paperswithcode.com/paper/visual-genome-connecting-language-and-vision
Repo
Framework

Joint Recursive Monocular Filtering of Camera Motion and Disparity Map


Title	Joint Recursive Monocular Filtering of Camera Motion and Disparity Map
Authors	Johannes Berger, Christoph Schnörr
Abstract	Monocular scene reconstruction is essential for modern applications such as robotics or autonomous driving. Although stereo methods usually result in better accuracy than monocular methods, they are more expensive and more difficult to calibrate. In this work, we present a novel second order optimal minimum energy filter that jointly estimates the camera motion, the disparity map and also higher order kinematics recursively on a product Lie group containing a novel disparity group. This mathematical framework makes it possible to cope with non-Euclidean state spaces, non-linear observations and high dimensions, which is infeasible for most classical filters. To be robust against outliers, we use a generalized Charbonnier energy function in this framework rather than a quadratic energy function as proposed in related work. Experiments confirm that our method enables accurate reconstructions on par with the state of the art.
Tasks	Autonomous Driving
Published	2016-06-07
URL	http://arxiv.org/abs/1606.02092v1
PDF	http://arxiv.org/pdf/1606.02092v1.pdf
PWC	https://paperswithcode.com/paper/joint-recursive-monocular-filtering-of-camera
Repo
Framework

Even Trolls Are Useful: Efficient Link Classification in Signed Networks


Title	Even Trolls Are Useful: Efficient Link Classification in Signed Networks
Authors	Géraud Le Falher, Fabio Vitale
Abstract	We address the problem of classifying the links of signed social networks given their full structural topology. Motivated by a binary user behaviour assumption, which is supported by decades of research in psychology, we develop an efficient and surprisingly simple approach to solve this classification problem. Our methods operate both within the active and batch settings. We demonstrate that the algorithms we developed are extremely fast in both theoretical and practical terms. Within the active setting, we provide a new complexity measure and a rigorous analysis of our methods that hold for arbitrary signed networks. We validate our theoretical claims by carrying out a set of experiments on three well-known real-world datasets, showing that our methods outperform the competitors while being much faster.
Tasks
Published	2016-02-29
URL	http://arxiv.org/abs/1602.08986v1
PDF	http://arxiv.org/pdf/1602.08986v1.pdf
PWC	https://paperswithcode.com/paper/even-trolls-are-useful-efficient-link
Repo
Framework

Can a CNN Recognize Catalan Diet?


Title	Can a CNN Recognize Catalan Diet?
Authors	Pedro Herruzo, Marc Bolaños, Petia Radeva
Abstract	Nowadays, we can find several diseases related to the unhealthy diet habits of the population, such as diabetes, obesity, anemia, bulimia and anorexia. In many cases, these diseases are related to the food consumption of people. The Mediterranean diet is scientifically known as a healthy diet that helps to prevent many metabolic diseases. In particular, our work focuses on the recognition of Mediterranean food and dishes. The development of this methodology would make it possible to analyse the daily habits of users with wearable cameras, within the topic of lifelogging. By using automatic mechanisms we could build an objective tool for the analysis of the patient’s behaviour, allowing specialists to discover unhealthy food patterns and understand the user’s lifestyle. With the aim of automatically recognizing a complete diet, we introduce a challenging multi-labeled dataset related to the Mediterranean diet called FoodCAT. The first type of label provided consists of 115 food classes with an average of 400 images per dish, and the second one consists of 12 food categories with an average of 3800 pictures per class. This dataset will serve as a basis for the development of automatic diet recognition. In this context, deep learning, and more specifically Convolutional Neural Networks (CNNs), are currently the state-of-the-art methods for automatic food recognition. In our work, we compare several architectures for image classification, with the purpose of diet recognition. Applying the best model for recognising food categories, we achieve a top-1 accuracy of 72.29%, and top-5 of 97.07%. In a complete diet recognition of dishes from the Mediterranean diet, enlarged with the Food-101 dataset for international dishes recognition, we achieve a top-1 accuracy of 68.07%, and top-5 of 89.53%, for a total of 115+101 food classes.
Tasks	Food Recognition, Image Classification
Published	2016-07-29
URL	http://arxiv.org/abs/1607.08811v1
PDF	http://arxiv.org/pdf/1607.08811v1.pdf
PWC	https://paperswithcode.com/paper/can-a-cnn-recognize-catalan-diet
Repo
Framework
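
The top-1 and top-5 figures quoted in this abstract are top-k accuracies: a prediction counts as correct when the true class is among the k highest-scoring classes. The sketch below shows that computation on made-up scores; the function name and the toy data are assumptions and have nothing to do with FoodCAT itself.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k best-scored classes.

    scores: one list of per-class scores per sample.
    labels: the true class index of each sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    scores = [
        [0.1, 0.7, 0.2],  # true class 1, ranked first
        [0.5, 0.3, 0.2],  # true class 1, ranked second
        [0.2, 0.3, 0.5],  # true class 0, ranked last
    ]
    labels = [1, 1, 0]
    for k in (1, 2, 3):
        print(k, top_k_accuracy(scores, labels, k))
```

Top-k accuracy can only grow with k, which is why the top-5 numbers in the abstract are higher than the corresponding top-1 numbers.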

Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments


Title	Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments
Authors	Markus Wulfmeier, Dominic Zeng Wang, Ingmar Posner
Abstract	In this work, we present an approach to learn cost maps for driving in complex urban environments from a very large number of demonstrations of driving behaviour by human experts. The learned cost maps are constructed directly from raw sensor measurements, bypassing the effort of manually designing cost maps as well as features. When deploying the learned cost maps, the trajectories generated not only replicate human-like driving behaviour but are also demonstrably robust against systematic errors in putative robot configuration. To achieve this we deploy a Maximum Entropy based, non-linear IRL framework which uses Fully Convolutional Neural Networks (FCNs) to represent the cost model underlying expert driving behaviour. Using a deep, parametric approach enables us to scale efficiently to large datasets and complex behaviours by being run-time independent of dataset extent during deployment. We demonstrate the scalability and the performance of the proposed approach on an ambitious dataset collected over the course of one year including more than 25k demonstration trajectories extracted from over 120km of driving around pedestrianised areas in the city of Milton Keynes, UK. We evaluate the resulting cost representations by showing the advantages over a carefully manually designed cost map and, in addition, demonstrate its robustness to systematic errors by learning precise cost-maps even in the presence of system calibration perturbations.
Tasks	Calibration
Published	2016-07-08
URL	http://arxiv.org/abs/1607.02329v1
PDF	http://arxiv.org/pdf/1607.02329v1.pdf
PWC	https://paperswithcode.com/paper/watch-this-scalable-cost-function-learning
Repo
Framework

Fast methods for training Gaussian processes on large data sets


Title	Fast methods for training Gaussian processes on large data sets
Authors	Christopher J. Moore, Alvin J. K. Chua, Christopher P. L. Berry, Jonathan R. Gair
Abstract	Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large data sets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
Tasks	Gaussian Processes
Published	2016-04-05
URL	http://arxiv.org/abs/1604.01250v2
PDF	http://arxiv.org/pdf/1604.01250v2.pdf
PWC	https://paperswithcode.com/paper/fast-methods-for-training-gaussian-processes
Repo
Framework

Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation


Title	Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Authors	Jie Zhou, Ying Cao, Xuguang Wang, Peng Li, Wei Xu
Abstract	Neural machine translation (NMT) aims at solving machine translation (MT) problems using neural networks and has exhibited promising results in recent years. However, most of the existing NMT models are shallow and there is still a performance gap between a single NMT model and the best conventional MT system. In this work, we introduce a new type of linear connections, named fast-forward connections, based on deep Long Short-Term Memory (LSTM) networks, and an interleaved bi-directional architecture for stacking the LSTM layers. Fast-forward connections play an essential role in propagating the gradients and building a deep topology of depth 16. On the WMT’14 English-to-French task, we achieve BLEU=37.7 with a single attention model, which outperforms the corresponding single shallow model by 6.2 BLEU points. This is the first time that a single NMT model achieves state-of-the-art performance and outperforms the best conventional model by 0.7 BLEU points. We can still achieve BLEU=36.3 even without using an attention mechanism. After special handling of unknown words and model ensembling, we obtain the best score reported to date on this task with BLEU=40.4. Our models are also validated on the more difficult WMT’14 English-to-German task.
Tasks	Machine Translation
Published	2016-06-14
URL	http://arxiv.org/abs/1606.04199v3
PDF	http://arxiv.org/pdf/1606.04199v3.pdf
PWC	https://paperswithcode.com/paper/deep-recurrent-models-with-fast-forward
Repo
Framework

Deep Reinforcement Learning With Macro-Actions


Title	Deep Reinforcement Learning With Macro-Actions
Authors	Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan
Abstract	Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain. In this paper, we explore output representation modeling in the form of temporal abstraction to improve convergence and reliability of deep reinforcement learning approaches. We concentrate on macro-actions, and evaluate these on different Atari 2600 games, where we show that they yield significant improvements in learning speed. Additionally, we show that they can even achieve better scores than DQN. We offer analysis and explanation for both convergence and final results, revealing a problem deep RL approaches have with sparse reward signals.
Tasks	Atari Games
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04615v1
PDF	http://arxiv.org/pdf/1606.04615v1.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-with-macro
Repo
Framework
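
A macro-action can be read as a fixed sequence of primitive actions that the agent commits to as a single decision. The sketch below illustrates that idea on a toy corridor environment. It is not the authors' architecture or the Atari setup; `Corridor`, `run_macro`, and the reward scheme are hypothetical names and choices made for illustration.

```python
class Corridor:
    """Toy environment: start at cell 0, reward 1.0 on reaching the last cell."""

    def __init__(self, length=6):
        self.length = length
        self.position = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor.
        self.position = min(max(self.position + action, 0), self.length)
        done = self.position == self.length
        return self.position, (1.0 if done else 0.0), done


def run_macro(env, macro):
    """Execute a sequence of primitive actions as one decision.

    Rewards are summed over the sequence, and execution stops early if
    the episode ends.
    """
    total = 0.0
    state, done = None, False
    for action in macro:
        state, reward, done = env.step(action)
        total += reward
        if done:
            break
    return state, total, done


if __name__ == "__main__":
    env = Corridor(length=6)
    print(run_macro(env, (1, 1, 1)))
    print(run_macro(env, (1, 1, 1)))
```

In this toy case the goal is reached after two macro-level decisions instead of six primitive ones, so a sparse reward is fewer decisions away from the start state.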

1-bit Matrix Completion: PAC-Bayesian Analysis of a Variational Approximation


Title	1-bit Matrix Completion: PAC-Bayesian Analysis of a Variational Approximation
Authors	Vincent Cottet, Pierre Alquier
Abstract	Due to challenging applications such as collaborative filtering, the matrix completion problem has been widely studied in the past few years. Different approaches rely on different structure assumptions on the matrix at hand. Here, we focus on the completion of a (possibly) low-rank matrix with binary entries, the so-called 1-bit matrix completion problem. Our approach relies on tools from machine learning theory: empirical risk minimization and its convex relaxations. We propose an algorithm to compute a variational approximation of the pseudo-posterior. Thanks to the convex relaxation, the corresponding minimization problem is bi-convex, and thus the method behaves well in practice. We also study the performance of this variational approximation through PAC-Bayesian learning bounds. In contrast to previous works that focused on upper bounds on the estimation error of M with various matrix norms, we are able to derive from this analysis a PAC bound on the prediction error of our algorithm. We focus essentially on convex relaxation through the hinge loss, for which we present the complete analysis, a complete simulation study and a test on the MovieLens data set. However, we also discuss a variational approximation to deal with the logistic loss.
Tasks	Matrix Completion
Published	2016-04-14
URL	http://arxiv.org/abs/1604.04191v1
PDF	http://arxiv.org/pdf/1604.04191v1.pdf
PWC	https://paperswithcode.com/paper/1-bit-matrix-completion-pac-bayesian-analysis
Repo
Framework

Deep Structured Features for Semantic Segmentation


Title	Deep Structured Features for Semantic Segmentation
Authors	Michael Tschannen, Lukas Cavigelli, Fabian Mentzer, Thomas Wiatowski, Luca Benini
Abstract	We propose a highly structured neural network architecture for semantic segmentation with an extremely small model size, suitable for low-power embedded and mobile platforms. Specifically, our architecture combines i) a Haar wavelet-based tree-like convolutional neural network (CNN), ii) a random layer realizing a radial basis function kernel approximation, and iii) a linear classifier. While stages i) and ii) are completely pre-specified, only the linear classifier is learned from data. We apply the proposed architecture to outdoor scene and aerial image semantic segmentation and show that the accuracy of our architecture is competitive with conventional pixel classification CNNs. Furthermore, we demonstrate that the proposed architecture is data efficient in the sense of matching the accuracy of pixel classification CNNs when trained on a much smaller data set.
Tasks	Semantic Segmentation
Published	2016-09-26
URL	http://arxiv.org/abs/1609.07916v3
PDF	http://arxiv.org/pdf/1609.07916v3.pdf
PWC	https://paperswithcode.com/paper/deep-structured-features-for-semantic
Repo
Framework
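
Stage ii) of the architecture in this abstract is a random layer that approximates a radial basis function kernel. One common construction for such a layer is random Fourier features; whether the paper uses exactly this construction is an assumption here. The sketch below builds the random feature map and compares the resulting inner product with the exact Gaussian kernel; the function names, the bandwidth `sigma`, and the feature count are illustrative choices.

```python
import math
import random


def make_rff(dim, n_features, sigma=1.0, seed=0):
    """Random Fourier feature map for exp(-||x - y||^2 / (2 * sigma^2)).

    The random projections and phases are drawn once and then frozen, so
    nothing in this layer is learned from data.
    """
    rng = random.Random(seed)
    ws = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
          for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return features


def rbf_kernel(x, y, sigma=1.0):
    """Exact Gaussian RBF kernel, used as the reference value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


if __name__ == "__main__":
    phi = make_rff(dim=2, n_features=4000)
    x, y = [0.3, -0.2], [-0.1, 0.4]
    approx = sum(a * b for a, b in zip(phi(x), phi(y)))
    print(approx, rbf_kernel(x, y))
```

A linear classifier trained on such features behaves approximately like a kernel machine with the RBF kernel, which matches the abstract's description of fixed stages followed by a learned linear classifier.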

Online but Accurate Inference for Latent Variable Models with Local Gibbs Sampling


Title	Online but Accurate Inference for Latent Variable Models with Local Gibbs Sampling
Authors	Christophe Dupuy, Francis Bach
Abstract	We study parameter inference in large-scale latent variable models. We first propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We then propose a novel inference method for the frequentist estimation of parameters, that adapts MCMC methods to online inference of latent variable models with the proper use of local Gibbs sampling. Then, for latent Dirichlet allocation, we provide an extensive set of experiments and comparisons with existing work, where our new approach outperforms all previously proposed methods. In particular, using Gibbs sampling for latent variable inference is superior to variational inference in terms of test log-likelihoods. Moreover, Bayesian inference through variational methods performs poorly, sometimes leading to worse fits with latent variables of higher dimensionality.
Tasks	Bayesian Inference, Latent Variable Models
Published	2016-03-08
URL	http://arxiv.org/abs/1603.02644v5
PDF	http://arxiv.org/pdf/1603.02644v5.pdf
PWC	https://paperswithcode.com/paper/online-but-accurate-inference-for-latent
Repo
Framework