Paper Group ANR 317
Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks
Title | Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks |
Authors | Jinghua Wang, Zhenhua Wang, Dacheng Tao, Simon See, Gang Wang |
Abstract | In this paper, we tackle the problem of RGB-D semantic segmentation of indoor images. We take advantage of deconvolutional networks which can predict pixel-wise class labels, and develop a new structure for deconvolution of multiple modalities. We propose a novel feature transformation network to bridge the convolutional networks and deconvolutional networks. In the feature transformation network, we correlate the two modalities by discovering common features between them, as well as characterize each modality by discovering modality-specific features. With the common features, we not only closely correlate the two modalities, but also allow them to borrow features from each other to enhance the representation of shared information. With the specific features, we capture the visual patterns that are only visible in one modality. The proposed network achieves competitive segmentation accuracy on the NYU depth datasets V1 and V2. |
Tasks | Semantic Segmentation |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01082v1 |
http://arxiv.org/pdf/1608.01082v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-common-and-specific-features-for-rgb |
Repo | |
Framework | |
Detecting Temporally Consistent Objects in Videos through Object Class Label Propagation
Title | Detecting Temporally Consistent Objects in Videos through Object Class Label Propagation |
Authors | Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen |
Abstract | Object proposals for detecting moving or static video objects need to address issues such as speed, memory complexity and temporal consistency. We propose an efficient Video Object Proposal (VOP) generation method and show its efficacy in learning a better video object detector. A deep-learning based video object detector learned using the proposed VOP achieves state-of-the-art detection performance on the YouTube-Objects dataset. We further propose a clustering of VOPs which can efficiently be used for detecting objects in video in a streaming fashion. As opposed to applying per-frame convolutional neural network (CNN) based object detection, our proposed method called Objects in Video Enabler thRough LAbel Propagation (OVERLAP) needs to classify only a small fraction of all candidate proposals in every video frame through streaming clustering of object proposals and class-label propagation. Source code will be made available soon. |
Tasks | Object Detection |
Published | 2016-01-20 |
URL | http://arxiv.org/abs/1601.05447v1 |
http://arxiv.org/pdf/1601.05447v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-temporally-consistent-objects-in |
Repo | |
Framework | |
Regularized Optimal Transport and the Rot Mover’s Distance
Title | Regularized Optimal Transport and the Rot Mover’s Distance |
Authors | Arnaud Dessein, Nicolas Papadakis, Jean-Luc Rouas |
Abstract | This paper presents a unified framework for smooth convex regularization of discrete optimal transport problems. In this context, the regularized optimal transport turns out to be equivalent to a matrix nearness problem with respect to Bregman divergences. Our framework thus naturally generalizes a previously proposed regularization based on the Boltzmann-Shannon entropy related to the Kullback-Leibler divergence, and solved with the Sinkhorn-Knopp algorithm. We call the regularized optimal transport distance the rot mover’s distance in reference to the classical earth mover’s distance. We develop two generic schemes that we respectively call the alternate scaling algorithm and the non-negative alternate scaling algorithm, to compute efficiently the regularized optimal plans depending on whether the domain of the regularizer lies within the non-negative orthant or not. These schemes are based on Dykstra’s algorithm with alternate Bregman projections, and further exploit the Newton-Raphson method when applied to separable divergences. We enhance the separable case with a sparse extension to deal with high data dimensions. We also instantiate our proposed framework and discuss the inherent specificities for well-known regularizers and statistical divergences in the machine learning and information geometry communities. Finally, we demonstrate the merits of our methods with experiments using synthetic data to illustrate the effect of different regularizers and penalties on the solutions, as well as real-world data for a pattern recognition application to audio scene classification. |
Tasks | Scene Classification |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06447v4 |
http://arxiv.org/pdf/1610.06447v4.pdf | |
PWC | https://paperswithcode.com/paper/regularized-optimal-transport-and-the-rot |
Repo | |
Framework | |
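The entropic special case mentioned in the abstract (Boltzmann-Shannon regularization solved with the Sinkhorn-Knopp algorithm) can be sketched in a few lines. This is a minimal NumPy illustration of the alternating-scaling idea only, not the paper's generalized ASA/NASA schemes for arbitrary Bregman divergences; the histograms, cost matrix, and regularization strength are arbitrary toy choices.

```python
import numpy as np

def sinkhorn(r, c, M, reg=1.0, n_iter=500):
    """Entropy-regularized OT plan between histograms r and c with
    cost matrix M, via Sinkhorn-Knopp alternating scaling."""
    K = np.exp(-M / reg)            # Gibbs kernel
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iter):
        u = r / (K @ v)             # scale rows toward marginal r
        v = c / (K.T @ u)           # scale columns toward marginal c
    P = u[:, None] * K * v[None, :]
    return P                        # regularized transport plan

r = np.array([0.5, 0.3, 0.2])       # source histogram
c = np.array([0.2, 0.3, 0.5])       # target histogram
M = (np.arange(3)[:, None] - np.arange(3)[None, :]) ** 2.0
P = sinkhorn(r, c, M)
rot_cost = float(np.sum(P * M))     # regularized transport cost
```

With small, well-conditioned kernels like this one, a few hundred scaling iterations bring both marginals of `P` very close to `r` and `c`.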
Heuristic Approaches for Generating Local Process Models through Log Projections
Title | Heuristic Approaches for Generating Local Process Models through Log Projections |
Authors | Niek Tax, Natalia Sidorova, Wil M. P. van der Aalst, Reinder Haakma |
Abstract | Local Process Model (LPM) discovery is focused on the mining of a set of process models where each model describes the behavior represented in the event log only partially, i.e. subsets of possible events are taken into account to create so-called local process models. Often such smaller models provide valuable insights into the behavior of the process, especially when no adequate and comprehensible single overall process model exists that is able to describe the traces of the process from start to end. The practical application of LPM discovery is however hindered by computational issues in the case of logs with many activities (problems may already occur when there are more than 17 unique activities). In this paper, we explore three heuristics to discover subsets of activities that lead to useful log projections with the goal of speeding up LPM discovery considerably while still finding high-quality LPMs. We found that a Markov clustering approach to create projection sets results in the largest improvement of execution time, with discovered LPMs still being better than with the use of randomly generated activity sets of the same size. Another heuristic, based on log entropy, yields a more moderate speedup, but enables the discovery of higher quality LPMs. The third heuristic, based on the relative information gain, shows unstable performance: for some data sets the speedup and LPM quality are higher than with the log entropy based method, while for other data sets there is no speedup at all. |
Tasks | |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.02876v1 |
http://arxiv.org/pdf/1610.02876v1.pdf | |
PWC | https://paperswithcode.com/paper/heuristic-approaches-for-generating-local |
Repo | |
Framework | |
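As a rough illustration of the log-projection idea from the abstract, the sketch below ranks candidate activity subsets by the Shannon entropy of the projected log. The helper names and the exact scoring rule are illustrative assumptions, not the paper's heuristics; a real event log would also carry timestamps and case attributes omitted here.

```python
from collections import Counter
from itertools import combinations
from math import log2

def project(log, activities):
    """Project an event log (list of traces) onto a subset of activities."""
    keep = set(activities)
    return [[a for a in trace if a in keep] for trace in log]

def activity_entropy(log):
    """Shannon entropy of the activity frequency distribution of a log."""
    counts = Counter(a for trace in log for a in trace)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in counts.values())

def rank_projections(log, subset_size=2):
    """Score every activity subset by the entropy of its projected log."""
    activities = sorted({a for trace in log for a in trace})
    scored = [(activity_entropy(project(log, s)), s)
              for s in combinations(activities, subset_size)]
    return sorted(scored, reverse=True)

event_log = [["a", "b", "c"], ["a", "b"], ["a", "c", "b"], ["b", "a"]]
ranked = rank_projections(event_log)
```

Highly-ranked subsets would then be handed to the (expensive) LPM miner instead of the full activity set.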
AdversariaLib: An Open-source Library for the Security Evaluation of Machine Learning Algorithms Under Attack
Title | AdversariaLib: An Open-source Library for the Security Evaluation of Machine Learning Algorithms Under Attack |
Authors | Igino Corona, Battista Biggio, Davide Maiorca |
Abstract | We present AdversariaLib, an open-source Python library for the security evaluation of machine learning (ML) against carefully-targeted attacks. It supports the implementation of several attacks proposed thus far in the literature of adversarial learning, allows for the evaluation of a wide range of ML algorithms, runs on multiple platforms, and has multi-processing enabled. The library has a modular architecture that makes it easy to use and to extend by implementing novel attacks and countermeasures. It relies on other widely-used open-source ML libraries, including scikit-learn and FANN. Classification algorithms are implemented and optimized in C/C++, allowing for a fast evaluation of the simulated attacks. The package is distributed under the GNU General Public License v3, and it is available for download at http://sourceforge.net/projects/adversarialib. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04786v1 |
http://arxiv.org/pdf/1611.04786v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarialib-an-open-source-library-for-the |
Repo | |
Framework | |
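AdversariaLib's own API is not reproduced here. As context for the kind of attack such libraries evaluate, the sketch below runs a gradient-descent evasion against a toy linear classifier: the attacker nudges a malicious sample along the negative score gradient until it crosses the decision boundary. All names, weights, and step sizes are illustrative assumptions.

```python
import numpy as np

# Toy linear classifier f(x) = w.x + b; positive score means "malicious".
w = np.array([1.0, 2.0])
b = -1.0

def score(x):
    return float(w @ x + b)

def evade(x, step=0.1, max_iter=100):
    """Gradient-descent evasion: move x against the score gradient
    (which is simply w for a linear model) until classified benign."""
    x = x.astype(float).copy()
    g = w / np.linalg.norm(w)       # unit gradient of the score
    for _ in range(max_iter):
        if score(x) < 0:            # crossed the decision boundary
            break
        x -= step * g
    return x

x0 = np.array([2.0, 1.0])           # initially malicious sample
x_adv = evade(x0)
```

A security evaluation then measures how small the perturbation `x_adv - x0` needs to be for the attack to succeed.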
Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
Title | Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition |
Authors | Théodore Bluche |
Abstract | Offline handwriting recognition systems require cropped text line images for both training and recognition. On the one hand, the annotation of position and transcript at line level is costly to obtain. On the other hand, automatic line segmentation algorithms are prone to errors, compromising the subsequent recognition. In this paper, we propose a modification of the popular and efficient multi-dimensional long short-term memory recurrent neural networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs. More particularly, we replace the collapse layer transforming the two-dimensional representation into a sequence of predictions by a recurrent version which can recognize one line at a time. In the proposed model, a neural network performs a kind of implicit line segmentation by computing attention weights on the image representation. The experiments on paragraphs of the Rimes and IAM databases yield results that are competitive with those of networks trained at line level, and constitute a significant step towards end-to-end transcription of full documents. |
Tasks | |
Published | 2016-04-28 |
URL | http://arxiv.org/abs/1604.08352v1 |
http://arxiv.org/pdf/1604.08352v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-line-segmentation-and-transcription-for |
Repo | |
Framework | |
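The attention-weighted collapse described in the abstract can be pictured with a static single-step sketch: a learned scorer assigns each row of the 2D feature map a weight per column, and the map collapses to a 1D sequence as a convex combination of rows. The paper's version is recurrent (recognizing one line at a time); shapes and the dot-product scorer here are simplified assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_collapse(features, w_att):
    """Collapse a 2D feature map (H x W x D) into a 1D sequence (W x D)
    by attending over the vertical axis with scorer w_att."""
    H, W, D = features.shape
    out = np.zeros((W, D))
    for j in range(W):
        scores = features[:, j, :] @ w_att   # (H,) attention scores
        alpha = softmax(scores)              # weights over the H rows
        out[j] = alpha @ features[:, j, :]   # weighted sum of rows
    return out

rng = np.random.default_rng(1)
features = rng.normal(size=(8, 5, 4))        # toy paragraph feature map
w_att = rng.normal(size=4)
seq = attention_collapse(features, w_att)
```

Because each output vector is a convex combination of the column's rows, the collapsed sequence stays within the range of the input features.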
Refined Lower Bounds for Adversarial Bandits
Title | Refined Lower Bounds for Adversarial Bandits |
Authors | Sébastien Gerchinovitz, Tor Lattimore |
Abstract | We provide new lower bounds on the regret that must be suffered by adversarial bandit algorithms. The new results show that recent upper bounds that either (a) hold with high probability or (b) depend on the total loss of the best arm or (c) depend on the quadratic variation of the losses, are close to tight. Besides this, we prove two impossibility results. First, the existence of a single arm that is optimal in every round cannot improve the regret in the worst case. Second, the regret cannot scale with the effective range of the losses. In contrast, both results are possible in the full-information setting. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07416v2 |
http://arxiv.org/pdf/1605.07416v2.pdf | |
PWC | https://paperswithcode.com/paper/refined-lower-bounds-for-adversarial-bandits |
Repo | |
Framework | |
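The paper proves lower bounds; the classical upper-bound side they are compared against is the Exp3 algorithm, sketched below on a toy adversarial loss sequence for context. The learning rate, horizon, and loss sequence are arbitrary illustrative choices.

```python
import math
import random

def exp3(n_arms, losses, eta):
    """Exp3 for adversarial bandits with losses in [0, 1]: sample from
    exponential weights, update only the played arm with an
    importance-weighted loss estimate."""
    w = [1.0] * n_arms
    total_loss = 0.0
    for round_losses in losses:          # adversary fixes losses per round
        s = sum(w)
        p = [wi / s for wi in w]
        arm = random.choices(range(n_arms), weights=p)[0]
        loss = round_losses[arm]
        total_loss += loss
        est = loss / p[arm]              # unbiased estimate of the loss
        w[arm] *= math.exp(-eta * est)   # down-weight the played arm
    return total_loss

random.seed(0)
T, n = 500, 3
# Oblivious adversary: arm 0 is consistently best.
losses = [[0.1, 0.9, 0.9] for _ in range(T)]
algo_loss = exp3(n, losses, eta=0.05)
best_arm_loss = 0.1 * T
regret = algo_loss - best_arm_loss
```

On this easy sequence the regret stays far below the worst-case O(sqrt(T n log n)) bound; the paper's lower bounds concern sequences engineered to be hard.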
Compressive Imaging with Iterative Forward Models
Title | Compressive Imaging with Iterative Forward Models |
Authors | Hsiou-Yuan Liu, Ulugbek S. Kamilov, Dehong Liu, Hassan Mansour, Petros T. Boufounos |
Abstract | We propose a new compressive imaging method for reconstructing 2D or 3D objects from their scattered wave-field measurements. Our method relies on a novel, nonlinear measurement model that can account for the multiple scattering phenomenon, which makes the method preferable in applications where linear measurement models are inaccurate. We construct the measurement model by expanding the scattered wave-field with an accelerated-gradient method, which is guaranteed to converge and is suitable for large-scale problems. We provide explicit formulas for computing the gradient of our measurement model with respect to the unknown image, which enables image formation with a sparsity-driven numerical optimization algorithm. We validate the method both analytically and with numerical simulations. |
Tasks | |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01852v1 |
http://arxiv.org/pdf/1610.01852v1.pdf | |
PWC | https://paperswithcode.com/paper/compressive-imaging-with-iterative-forward |
Repo | |
Framework | |
Generalization error bounds for learning to rank: Does the length of document lists matter?
Title | Generalization error bounds for learning to rank: Does the length of document lists matter? |
Authors | Ambuj Tewari, Sougata Chaudhuri |
Abstract | We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well known ListNet method, there is *no* degradation in generalization ability as document lists become longer. We also provide novel generalization error bounds under $\ell_1$ regularization and faster convergence rates if the loss function is smooth. |
Tasks | Learning-To-Rank |
Published | 2016-03-06 |
URL | http://arxiv.org/abs/1603.01860v1 |
http://arxiv.org/pdf/1603.01860v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-error-bounds-for-learning-to |
Repo | |
Framework | |
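The ListNet cross-entropy loss referenced in the abstract compares the top-one permutation probabilities induced by the true relevance labels with those induced by the predicted scores; a minimal sketch (labels and scores are toy values):

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)                # stabilized softmax
    e = np.exp(z)
    return e / e.sum()

def listnet_loss(scores, relevance):
    """ListNet top-one cross entropy between the permutation
    probabilities of the true relevance and the predicted scores."""
    p_true = softmax(relevance)
    p_pred = softmax(scores)
    return float(-np.sum(p_true * np.log(p_pred)))

relevance = np.array([3.0, 1.0, 0.0])   # graded labels for one query
good = np.array([2.5, 0.8, 0.1])        # scores matching the true order
bad = np.array([0.1, 0.8, 2.5])         # scores inverting the order
```

The loss is minimized when the predicted distribution matches the label-induced one, and grows as the predicted order diverges from it; note it depends on the whole document list of a query, which is exactly the list length the paper's bounds are about.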
Nonparametric semi-supervised learning of class proportions
Title | Nonparametric semi-supervised learning of class proportions |
Authors | Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac |
Abstract | The problem of developing binary classifiers from positive and unlabeled data is often encountered in machine learning. A common requirement in this setting is to approximate posterior probabilities of positive and negative classes for a previously unseen data point. This problem can be decomposed into two steps: (i) the development of accurate predictors that discriminate between positive and unlabeled data, and (ii) the accurate estimation of the prior probabilities of positive and negative examples. In this work we primarily focus on the latter subproblem. We study nonparametric class prior estimation and formulate this problem as an estimation of mixing proportions in two-component mixture models, given a sample from one of the components and another sample from the mixture itself. We show that estimation of mixing proportions is generally ill-defined and propose a canonical form to obtain identifiability while maintaining the flexibility to model any distribution. We use insights from this theory to elucidate the optimization surface of the class priors and propose an algorithm for estimating them. To address the problems of high-dimensional density estimation, we provide practical transformations to low-dimensional spaces that preserve class priors. Finally, we demonstrate the efficacy of our method on univariate and multivariate data. |
Tasks | Density Estimation |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01944v1 |
http://arxiv.org/pdf/1601.01944v1.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-semi-supervised-learning-of |
Repo | |
Framework | |
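The mixture-proportion view in the abstract can be illustrated with known densities: the largest pi for which f_mix - pi * f_pos remains a non-negative measure is inf_x f_mix(x) / f_pos(x). The toy below evaluates that infimum on a grid for two Gaussians. The paper's estimator is nonparametric and works from samples; this sketch assumes the densities are known, which sidesteps exactly the estimation problem the paper addresses.

```python
import numpy as np

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_proportion(f_mix, f_pos, grid):
    """Largest pi with f_mix - pi * f_pos non-negative everywhere:
    pi* = inf_x f_mix(x) / f_pos(x), approximated on a grid."""
    ratio = f_mix(grid) / f_pos(grid)
    return float(np.min(ratio))

true_pi = 0.3
f_pos = lambda x: normal_pdf(x, mu=3.0)   # positive-class component
f_neg = lambda x: normal_pdf(x, mu=0.0)   # negative-class component
f_mix = lambda x: true_pi * f_pos(x) + (1 - true_pi) * f_neg(x)

grid = np.linspace(-5, 10, 2000)
pi_hat = mixture_proportion(f_mix, f_pos, grid)
```

Here the infimum is attained in the right tail, where the negative component vanishes relative to the positive one, so the estimate recovers the true proportion; the paper shows when such recovery is identifiable at all.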
Optimal Dynamic Coverage Infrastructure for Large-Scale Fleets of Reconnaissance UAVs
Title | Optimal Dynamic Coverage Infrastructure for Large-Scale Fleets of Reconnaissance UAVs |
Authors | Yaniv Altshuler, Alex Pentland, Shlomo Bekhor, Yoram Shiftan, Alfred Bruckstein |
Abstract | Current state of the art in the field of UAV activation relies solely on human operators for the design and adaptation of the drones’ flying routes. Furthermore, this is done today on an individual level (one vehicle per operator), with the exception of a handful of new systems comprised of a small number of self-organizing swarms, manually guided by a human operator. Drone-based monitoring is of great importance in a variety of civilian domains, such as road safety, homeland security, and even environmental control. In its military aspect, efficiently detecting evading targets by a fleet of unmanned drones has an ever increasing impact on the ability of modern armies to engage in warfare. The latter is true both for traditional symmetric conflicts among armies and for asymmetric ones. Be it a speeding driver, a polluting trailer or a covert convoy, the basic challenge remains the same – how can its detection probability be maximized using as few drones as possible. In this work we propose a novel approach for the optimization of large scale swarms of reconnaissance drones – capable of producing on-demand optimal coverage strategies for any given search scenario. Given an estimate of the threat’s potential damages, as well as the types of monitoring drones available and their comparative performance, our proposed method generates an analytically provable strategy, stating the optimal number and types of drones to be deployed, in order to cost-efficiently monitor a pre-defined region for targets maneuvering on a given road network. We demonstrate our model using a unique dataset of the Israeli transportation network, on which different drone deployment schemes are evaluated. |
Tasks | |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05735v1 |
http://arxiv.org/pdf/1611.05735v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-dynamic-coverage-infrastructure-for |
Repo | |
Framework | |
Generalized Dropout
Title | Generalized Dropout |
Authors | Suraj Srinivas, R. Venkatesh Babu |
Abstract | Deep Neural Networks often require good regularizers to generalize well. Dropout is one such regularizer that is widely used among Deep Learning practitioners. Recent work has shown that Dropout can also be viewed as performing Approximate Bayesian Inference over the network parameters. In this work, we generalize this notion and introduce a rich family of regularizers which we call Generalized Dropout. One set of methods in this family, called Dropout++, is a version of Dropout with trainable parameters. Classical Dropout emerges as a special case of this method. Another member of this family selects the width of neural network layers. Experiments show that these methods help in improving generalization performance over Dropout. |
Tasks | Bayesian Inference |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06791v1 |
http://arxiv.org/pdf/1611.06791v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-dropout |
Repo | |
Framework | |
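A plausible reading of Dropout++ is dropout whose per-unit retain probabilities are themselves trainable. The forward pass below is a hedged sketch of that idea: the sigmoid parameterization and the inverted scaling are assumptions for illustration, not the paper's exact formulation (which trains the parameters via approximate Bayesian inference).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dropout_pp(x, theta, train=True):
    """Dropout with per-unit trainable retain probabilities
    p_i = sigmoid(theta_i). Classical dropout is the special case
    where every theta_i is frozen at the same value."""
    p = sigmoid(theta)
    if train:
        mask = rng.random(x.shape) < p   # Bernoulli(p_i) mask per unit
        return x * mask / p              # inverted scaling: E[output] = x
    return x                             # identity at test time

x = np.ones(10000)
theta = np.zeros(10000)                  # p_i = 0.5 everywhere
out = dropout_pp(x, theta)
```

Gradients with respect to `theta` would let the network learn how aggressively to drop each unit, and driving some p_i toward zero effectively selects the layer width, as the abstract describes.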
Neural Machine Translation with Latent Semantic of Image and Text
Title | Neural Machine Translation with Latent Semantic of Image and Text |
Authors | Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo |
Abstract | Although attention-based Neural Machine Translation has achieved great success, the attention mechanism cannot capture the entire meaning of the source sentence because it generates a target word depending heavily on the relevant parts of the source sentence. Earlier studies have introduced a latent variable to capture the entire meaning of a sentence and achieved improvements over attention-based Neural Machine Translation. We follow this approach, and we believe that capturing the meaning of a sentence benefits from image information because human beings understand the meaning of language not only from textual information but also from perceptual information such as that gained from vision. As described herein, we propose a neural machine translation model that introduces a continuous latent variable containing underlying semantics extracted from texts and images. Our model, which can be trained end-to-end, requires image information only during training. Experiments conducted with an English–German translation task show that our model outperforms the baseline. |
Tasks | Machine Translation |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08459v1 |
http://arxiv.org/pdf/1611.08459v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-machine-translation-with-latent |
Repo | |
Framework | |
Online Learning for Wireless Distributed Computing
Title | Online Learning for Wireless Distributed Computing |
Authors | Yi-Hsuan Kao, Kwame Wright, Bhaskar Krishnamachari, Fan Bai |
Abstract | There has been a growing interest in Wireless Distributed Computing (WDC), which leverages collaborative computing over multiple wireless devices. WDC enables complex applications that a single device cannot support individually. However, the problem of assigning tasks over multiple devices becomes challenging in the dynamic environments encountered in real-world settings, considering that the resource availability and channel conditions change over time in unpredictable ways due to mobility and other factors. In this paper, we formulate a task assignment problem as an online learning problem using an adversarial multi-armed bandit framework. We propose MABSTA, a novel online learning algorithm that learns the performance of unknown devices and channel qualities continually through exploratory probing and makes task assignment decisions by exploiting the gained knowledge. For maximal adaptability, MABSTA is designed to make no stochastic assumption about the environment. We analyze it mathematically and provide a worst-case performance guarantee for any dynamic environment. We also compare it with the optimal offline policy as well as other baselines via emulations on trace data obtained from a wireless IoT testbed, and show that it offers competitive and robust performance in all cases. To the best of our knowledge, MABSTA is the first online algorithm in this domain of task assignment problems to provide a provable performance guarantee. |
Tasks | |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.02830v1 |
http://arxiv.org/pdf/1611.02830v1.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-for-wireless-distributed |
Repo | |
Framework | |
A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image
Title | A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image |
Authors | Ruiqi Zhao, Yan Wang, Aleix Martinez |
Abstract | Three-dimensional shape reconstruction of 2D landmark points on a single image is a hallmark of human vision, but is a task that has been proven difficult for computer vision algorithms. We define a feed-forward deep neural network algorithm that can reconstruct 3D shapes from 2D landmark points almost perfectly (i.e., with extremely small reconstruction errors), even when these 2D landmarks are from a single image. Our experimental results show an improvement of up to two-fold over state-of-the-art computer vision algorithms; 3D shape reconstruction of human faces is given at a reconstruction error < .004, cars at .0022, human bodies at .022, and highly-deformable flags at an error of .0004. Our algorithm was also a top performer at the 2016 3D Face Alignment in the Wild Challenge competition (held in conjunction with the European Conference on Computer Vision, ECCV) that required the reconstruction of 3D face shape from a single image. The derived algorithm can be trained in a couple of hours and testing runs at more than 1,000 frames/s on an i7 desktop. We also present an innovative data augmentation approach that allows us to train the system efficiently with a small number of samples. The system is also robust to noise (e.g., imprecise landmark points) and missing data (e.g., occluded or undetected landmark points). |
Tasks | Data Augmentation, Face Alignment |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.09058v1 |
http://arxiv.org/pdf/1609.09058v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-fast-and-highly-accurate-algorithm |
Repo | |
Framework | |