Paper Group ANR 317
Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks
Title | Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks |
Authors | Jinghua Wang, Zhenhua Wang, Dacheng Tao, Simon See, Gang Wang |
Abstract | In this paper, we tackle the problem of RGB-D semantic segmentation of indoor images. We take advantage of deconvolutional networks which can predict pixel-wise class labels, and develop a new structure for deconvolution of multiple modalities. We propose a novel feature transformation network to bridge the convolutional networks and deconvolutional networks. In the feature transformation network, we correlate the two modalities by discovering common features between them, as well as characterize each modality by discovering modality-specific features. With the common features, we not only closely correlate the two modalities, but also allow them to borrow features from each other to enhance the representation of shared information. With the specific features, we capture the visual patterns that are only visible in one modality. The proposed network achieves competitive segmentation accuracy on the NYU depth datasets V1 and V2. |
Tasks | Semantic Segmentation |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01082v1 |
http://arxiv.org/pdf/1608.01082v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-common-and-specific-features-for-rgb |
Repo | |
Framework | |
Detecting Temporally Consistent Objects in Videos through Object Class Label Propagation
Title | Detecting Temporally Consistent Objects in Videos through Object Class Label Propagation |
Authors | Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen |
Abstract | Object proposals for detecting moving or static video objects need to address issues such as speed, memory complexity and temporal consistency. We propose an efficient Video Object Proposal (VOP) generation method and show its efficacy in learning a better video object detector. A deep-learning based video object detector learned using the proposed VOP achieves state-of-the-art detection performance on the YouTube-Objects dataset. We further propose a clustering of VOPs which can efficiently be used for detecting objects in video in a streaming fashion. As opposed to applying per-frame convolutional neural network (CNN) based object detection, our proposed method called Objects in Video Enabler thRough LAbel Propagation (OVERLAP) needs to classify only a small fraction of all candidate proposals in every video frame through streaming clustering of object proposals and class-label propagation. Source code will be made available soon. |
Tasks | Object Detection |
Published | 2016-01-20 |
URL | http://arxiv.org/abs/1601.05447v1 |
http://arxiv.org/pdf/1601.05447v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-temporally-consistent-objects-in |
Repo | |
Framework | |
Regularized Optimal Transport and the Rot Mover’s Distance
Title | Regularized Optimal Transport and the Rot Mover’s Distance |
Authors | Arnaud Dessein, Nicolas Papadakis, Jean-Luc Rouas |
Abstract | This paper presents a unified framework for smooth convex regularization of discrete optimal transport problems. In this context, the regularized optimal transport turns out to be equivalent to a matrix nearness problem with respect to Bregman divergences. Our framework thus naturally generalizes a previously proposed regularization based on the Boltzmann-Shannon entropy related to the Kullback-Leibler divergence, and solved with the Sinkhorn-Knopp algorithm. We call the regularized optimal transport distance the rot mover’s distance in reference to the classical earth mover’s distance. We develop two generic schemes that we respectively call the alternate scaling algorithm and the non-negative alternate scaling algorithm, to compute efficiently the regularized optimal plans depending on whether the domain of the regularizer lies within the non-negative orthant or not. These schemes are based on Dykstra’s algorithm with alternate Bregman projections, and further exploit the Newton-Raphson method when applied to separable divergences. We enhance the separable case with a sparse extension to deal with high data dimensions. We also instantiate our proposed framework and discuss the inherent specificities for well-known regularizers and statistical divergences in the machine learning and information geometry communities. Finally, we demonstrate the merits of our methods with experiments using synthetic data to illustrate the effect of different regularizers and penalties on the solutions, as well as real-world data for a pattern recognition application to audio scene classification. |
Tasks | Scene Classification |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06447v4 |
http://arxiv.org/pdf/1610.06447v4.pdf | |
PWC | https://paperswithcode.com/paper/regularized-optimal-transport-and-the-rot |
Repo | |
Framework | |
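The entropic special case mentioned in the abstract (Boltzmann-Shannon regularization solved with the Sinkhorn-Knopp algorithm) can be sketched in a few lines. This is a minimal NumPy illustration of the alternating-scaling idea only, not the paper's generalized ASA/NASA schemes for arbitrary Bregman divergences; the histograms, cost matrix, and regularization strength are arbitrary toy choices.

```python
import numpy as np

def sinkhorn(r, c, M, reg=1.0, n_iter=500):
    """Entropy-regularized OT plan between histograms r and c with
    cost matrix M, via Sinkhorn-Knopp alternating scaling."""
    K = np.exp(-M / reg)            # Gibbs kernel
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iter):
        u = r / (K @ v)             # scale rows toward marginal r
        v = c / (K.T @ u)           # scale columns toward marginal c
    P = u[:, None] * K * v[None, :]
    return P                        # regularized transport plan

r = np.array([0.5, 0.3, 0.2])       # source histogram
c = np.array([0.2, 0.3, 0.5])       # target histogram
M = (np.arange(3)[:, None] - np.arange(3)[None, :]) ** 2.0
P = sinkhorn(r, c, M)
rot_cost = float(np.sum(P * M))     # regularized transport cost
```

With small, well-conditioned kernels like this one, a few hundred scaling iterations bring both marginals of `P` very close to `r` and `c`.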
Heuristic Approaches for Generating Local Process Models through Log Projections
Title | Heuristic Approaches for Generating Local Process Models through Log Projections |
Authors | Niek Tax, Natalia Sidorova, Wil M. P. van der Aalst, Reinder Haakma |
Abstract | Local Process Model (LPM) discovery is focused on the mining of a set of process models where each model describes the behavior represented in the event log only partially, i.e. subsets of possible events are taken into account to create so-called local process models. Often such smaller models provide valuable insights into the behavior of the process, especially when no adequate and comprehensible single overall process model exists that is able to describe the traces of the process from start to end. The practical application of LPM discovery is however hindered by computational issues in the case of logs with many activities (problems may already occur when there are more than 17 unique activities). In this paper, we explore three heuristics to discover subsets of activities that lead to useful log projections with the goal of speeding up LPM discovery considerably while still finding high-quality LPMs. We found that a Markov clustering approach to create projection sets results in the largest improvement of execution time, with discovered LPMs still being better than with the use of randomly generated activity sets of the same size. Another heuristic, based on log entropy, yields a more moderate speedup, but enables the discovery of higher quality LPMs. The third heuristic, based on the relative information gain, shows unstable performance: for some data sets the speedup and LPM quality are higher than with the log entropy based method, while for other data sets there is no speedup at all. |
Tasks | |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.02876v1 |
http://arxiv.org/pdf/1610.02876v1.pdf | |
PWC | https://paperswithcode.com/paper/heuristic-approaches-for-generating-local |
Repo | |
Framework | |
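As a rough illustration of the log-projection idea from the abstract, the sketch below ranks candidate activity subsets by the Shannon entropy of the projected log. The helper names and the exact scoring rule are illustrative assumptions, not the paper's heuristics; a real event log would also carry timestamps and case attributes omitted here.

```python
from collections import Counter
from itertools import combinations
from math import log2

def project(log, activities):
    """Project an event log (list of traces) onto a subset of activities."""
    keep = set(activities)
    return [[a for a in trace if a in keep] for trace in log]

def activity_entropy(log):
    """Shannon entropy of the activity frequency distribution of a log."""
    counts = Counter(a for trace in log for a in trace)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in counts.values())

def rank_projections(log, subset_size=2):
    """Score every activity subset by the entropy of its projected log."""
    activities = sorted({a for trace in log for a in trace})
    scored = [(activity_entropy(project(log, s)), s)
              for s in combinations(activities, subset_size)]
    return sorted(scored, reverse=True)

event_log = [["a", "b", "c"], ["a", "b"], ["a", "c", "b"], ["b", "a"]]
ranked = rank_projections(event_log)
```

Highly-ranked subsets would then be handed to the (expensive) LPM miner instead of the full activity set.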
AdversariaLib: An Open-source Library for the Security Evaluation of Machine Learning Algorithms Under Attack
Title | AdversariaLib: An Open-source Library for the Security Evaluation of Machine Learning Algorithms Under Attack |
Authors | Igino Corona, Battista Biggio, Davide Maiorca |
Abstract | We present AdversariaLib, an open-source Python library for the security evaluation of machine learning (ML) against carefully-targeted attacks. It supports the implementation of several attacks proposed thus far in the literature of adversarial learning, allows for the evaluation of a wide range of ML algorithms, runs on multiple platforms, and has multi-processing enabled. The library has a modular architecture that makes it easy to use and to extend by implementing novel attacks and countermeasures. It relies on other widely-used open-source ML libraries, including scikit-learn and FANN. Classification algorithms are implemented and optimized in C/C++, allowing for a fast evaluation of the simulated attacks. The package is distributed under the GNU General Public License v3, and it is available for download at http://sourceforge.net/projects/adversarialib. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04786v1 |
http://arxiv.org/pdf/1611.04786v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarialib-an-open-source-library-for-the |
Repo | |
Framework | |
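AdversariaLib's own API is not reproduced here. As context for the kind of attack such libraries evaluate, the sketch below runs a gradient-descent evasion against a toy linear classifier: the attacker nudges a malicious sample along the negative score gradient until it crosses the decision boundary. All names, weights, and step sizes are illustrative assumptions.

```python
import numpy as np

# Toy linear classifier f(x) = w.x + b; positive score means "malicious".
w = np.array([1.0, 2.0])
b = -1.0

def score(x):
    return float(w @ x + b)

def evade(x, step=0.1, max_iter=100):
    """Gradient-descent evasion: move x against the score gradient
    (which is simply w for a linear model) until classified benign."""
    x = x.astype(float).copy()
    g = w / np.linalg.norm(w)       # unit gradient of the score
    for _ in range(max_iter):
        if score(x) < 0:            # crossed the decision boundary
            break
        x -= step * g
    return x

x0 = np.array([2.0, 1.0])           # initially malicious sample
x_adv = evade(x0)
```

A security evaluation then measures how small the perturbation `x_adv - x0` needs to be for the attack to succeed.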
Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
Title | Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition |
Authors | Théodore Bluche |
Abstract | Offline handwriting recognition systems require cropped text line images for both training and recognition. On the one hand, the annotation of position and transcript at line level is costly to obtain. On the other hand, automatic line segmentation algorithms are prone to errors, compromising the subsequent recognition. In this paper, we propose a modification of the popular and efficient multi-dimensional long short-term memory recurrent neural networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs. More particularly, we replace the collapse layer transforming the two-dimensional representation into a sequence of predictions by a recurrent version which can recognize one line at a time. In the proposed model, a neural network performs a kind of implicit line segmentation by computing attention weights on the image representation. The experiments on paragraphs of the Rimes and IAM databases yield results that are competitive with those of networks trained at line level, and constitute a significant step towards end-to-end transcription of full documents. |
Tasks | |
Published | 2016-04-28 |
URL | http://arxiv.org/abs/1604.08352v1 |
http://arxiv.org/pdf/1604.08352v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-line-segmentation-and-transcription-for |
Repo | |
Framework | |
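The attention-weighted collapse described in the abstract can be pictured with a static single-step sketch: a learned scorer assigns each row of the 2D feature map a weight per column, and the map collapses to a 1D sequence as a convex combination of rows. The paper's version is recurrent (recognizing one line at a time); shapes and the dot-product scorer here are simplified assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_collapse(features, w_att):
    """Collapse a 2D feature map (H x W x D) into a 1D sequence (W x D)
    by attending over the vertical axis with scorer w_att."""
    H, W, D = features.shape
    out = np.zeros((W, D))
    for j in range(W):
        scores = features[:, j, :] @ w_att   # (H,) attention scores
        alpha = softmax(scores)              # weights over the H rows
        out[j] = alpha @ features[:, j, :]   # weighted sum of rows
    return out

rng = np.random.default_rng(1)
features = rng.normal(size=(8, 5, 4))        # toy paragraph feature map
w_att = rng.normal(size=4)
seq = attention_collapse(features, w_att)
```

Because each output vector is a convex combination of the column's rows, the collapsed sequence stays within the range of the input features.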
Refined Lower Bounds for Adversarial Bandits
Title | Refined Lower Bounds for Adversarial Bandits |
Authors | Sébastien Gerchinovitz, Tor Lattimore |
Abstract | We provide new lower bounds on the regret that must be suffered by adversarial bandit algorithms. The new results show that recent upper bounds that either (a) hold with high probability or (b) depend on the total loss of the best arm or (c) depend on the quadratic variation of the losses, are close to tight. Besides this, we prove two impossibility results. First, the existence of a single arm that is optimal in every round cannot improve the regret in the worst case. Second, the regret cannot scale with the effective range of the losses. In contrast, both results are possible in the full-information setting. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07416v2 |
http://arxiv.org/pdf/1605.07416v2.pdf | |
PWC | https://paperswithcode.com/paper/refined-lower-bounds-for-adversarial-bandits |
Repo | |
Framework | |
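The paper proves lower bounds; the classical upper-bound side they are compared against is the Exp3 algorithm, sketched below on a toy adversarial loss sequence for context. The learning rate, horizon, and loss sequence are arbitrary illustrative choices.

```python
import math
import random

def exp3(n_arms, losses, eta):
    """Exp3 for adversarial bandits with losses in [0, 1]: sample from
    exponential weights, update only the played arm with an
    importance-weighted loss estimate."""
    w = [1.0] * n_arms
    total_loss = 0.0
    for round_losses in losses:          # adversary fixes losses per round
        s = sum(w)
        p = [wi / s for wi in w]
        arm = random.choices(range(n_arms), weights=p)[0]
        loss = round_losses[arm]
        total_loss += loss
        est = loss / p[arm]              # unbiased estimate of the loss
        w[arm] *= math.exp(-eta * est)   # down-weight the played arm
    return total_loss

random.seed(0)
T, n = 500, 3
# Oblivious adversary: arm 0 is consistently best.
losses = [[0.1, 0.9, 0.9] for _ in range(T)]
algo_loss = exp3(n, losses, eta=0.05)
best_arm_loss = 0.1 * T
regret = algo_loss - best_arm_loss
```

On this easy sequence the regret stays far below the worst-case O(sqrt(T n log n)) bound; the paper's lower bounds concern sequences engineered to be hard.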
Compressive Imaging with Iterative Forward Models
Title | Compressive Imaging with Iterative Forward Models |
Authors | Hsiou-Yuan Liu, Ulugbek S. Kamilov, Dehong Liu, Hassan Mansour, Petros T. Boufounos |
Abstract | We propose a new compressive imaging method for reconstructing 2D or 3D objects from their scattered wave-field measurements. Our method relies on a novel, nonlinear measurement model that can account for the multiple scattering phenomenon, which makes the method preferable in applications where linear measurement models are inaccurate. We construct the measurement model by expanding the scattered wave-field with an accelerated-gradient method, which is guaranteed to converge and is suitable for large-scale problems. We provide explicit formulas for computing the gradient of our measurement model with respect to the unknown image, which enables image formation with a sparsity-driven numerical optimization algorithm. We validate the method both analytically and with numerical simulations. |
Tasks | |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01852v1 |
http://arxiv.org/pdf/1610.01852v1.pdf | |
PWC | https://paperswithcode.com/paper/compressive-imaging-with-iterative-forward |
Repo | |
Framework | |
Generalization error bounds for learning to rank: Does the length of document lists matter?
Title | Generalization error bounds for learning to rank: Does the length of document lists matter? |
Authors | Ambuj Tewari, Sougata Chaudhuri |
Abstract | We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well known ListNet method, there is *no* degradation in generalization ability as document lists become longer. We also provide novel generalization error bounds under $\ell_1$ regularization and faster convergence rates if the loss function is smooth. |
Tasks | Learning-To-Rank |
Published | 2016-03-06 |
URL | http://arxiv.org/abs/1603.01860v1 |
http://arxiv.org/pdf/1603.01860v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-error-bounds-for-learning-to |
Repo | |
Framework | |
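The ListNet cross-entropy loss referenced in the abstract compares the top-one permutation probabilities induced by the true relevance labels with those induced by the predicted scores; a minimal sketch (labels and scores are toy values):

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)                # stabilized softmax
    e = np.exp(z)
    return e / e.sum()

def listnet_loss(scores, relevance):
    """ListNet top-one cross entropy between the permutation
    probabilities of the true relevance and the predicted scores."""
    p_true = softmax(relevance)
    p_pred = softmax(scores)
    return float(-np.sum(p_true * np.log(p_pred)))

relevance = np.array([3.0, 1.0, 0.0])   # graded labels for one query
good = np.array([2.5, 0.8, 0.1])        # scores matching the true order
bad = np.array([0.1, 0.8, 2.5])         # scores inverting the order
```

The loss is minimized when the predicted distribution matches the label-induced one, and grows as the predicted order diverges from it; note it depends on the whole document list of a query, which is exactly the list length the paper's bounds are about.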
Nonparametric semi-supervised learning of class proportions
Title | Nonparametric semi-supervised learning of class proportions |
Authors | Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac |
Abstract | The problem of developing binary classifiers from positive and unlabeled data is often encountered in machine learning. A common requirement in this setting is to approximate posterior probabilities of positive and negative classes for a previously unseen data point. This problem can be decomposed into two steps: (i) the development of accurate predictors that discriminate between positive and unlabeled data, and (ii) the accurate estimation of the prior probabilities of positive and negative examples. In this work we primarily focus on the latter subproblem. We study nonparametric class prior estimation and formulate this problem as an estimation of mixing proportions in two-component mixture models, given a sample from one of the components and another sample from the mixture itself. We show that estimation of mixing proportions is generally ill-defined and propose a canonical form to obtain identifiability while maintaining the flexibility to model any distribution. We use insights from this theory to elucidate the optimization surface of the class priors and propose an algorithm for estimating them. To address the problems of high-dimensional density estimation, we provide practical transformations to low-dimensional spaces that preserve class priors. Finally, we demonstrate the efficacy of our method on univariate and multivariate data. |
Tasks | Density Estimation |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01944v1 |
http://arxiv.org/pdf/1601.01944v1.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-semi-supervised-learning-of |
Repo | |
Framework | |
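The mixture-proportion view in the abstract can be illustrated with known densities: the largest pi for which f_mix - pi * f_pos remains a non-negative measure is inf_x f_mix(x) / f_pos(x). The toy below evaluates that infimum on a grid for two Gaussians. The paper's estimator is nonparametric and works from samples; this sketch assumes the densities are known, which sidesteps exactly the estimation problem the paper addresses.

```python
import numpy as np

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_proportion(f_mix, f_pos, grid):
    """Largest pi with f_mix - pi * f_pos non-negative everywhere:
    pi* = inf_x f_mix(x) / f_pos(x), approximated on a grid."""
    ratio = f_mix(grid) / f_pos(grid)
    return float(np.min(ratio))

true_pi = 0.3
f_pos = lambda x: normal_pdf(x, mu=3.0)   # positive-class component
f_neg = lambda x: normal_pdf(x, mu=0.0)   # negative-class component
f_mix = lambda x: true_pi * f_pos(x) + (1 - true_pi) * f_neg(x)

grid = np.linspace(-5, 10, 2000)
pi_hat = mixture_proportion(f_mix, f_pos, grid)
```

Here the infimum is attained in the right tail, where the negative component vanishes relative to the positive one, so the estimate recovers the true proportion; the paper shows when such recovery is identifiable at all.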
Optimal Dynamic Coverage Infrastructure for Large-Scale Fleets of Reconnaissance UAVs
Title | Optimal Dynamic Coverage Infrastructure for Large-Scale Fleets of Reconnaissance UAVs |
Authors | Yaniv Altshuler, Alex Pentland, Shlomo Bekhor, Yoram Shiftan, Alfred Bruckstein |
Abstract | Current state of the art in the field of UAV activation relies solely on human operators for the design and adaptation of the drones’ flying routes. Furthermore, this is done today on an individual level (one vehicle per operator), with the exception of a handful of new systems comprised of a small number of self-organizing swarms, manually guided by a human operator. Drone-based monitoring is of great importance in a variety of civilian domains, such as road safety, homeland security, and even environmental control. In its military aspect, efficiently detecting evading targets by a fleet of unmanned drones has an ever increasing impact on the ability of modern armies to engage in warfare. The latter is true both for traditional symmetric conflicts among armies and for asymmetric ones. Be it a speeding driver, a polluting trailer or a covert convoy, the basic challenge remains the same – how can its detection probability be maximized using as few drones as possible. In this work we propose a novel approach for the optimization of large scale swarms of reconnaissance drones – capable of producing on-demand optimal coverage strategies for any given search scenario. Given an estimate of the threat’s potential damages, as well as the types of monitoring drones available and their comparative performance, our proposed method generates an analytically provable strategy, stating the optimal number and types of drones to be deployed, in order to cost-efficiently monitor a pre-defined region for targets maneuvering on a given road network. We demonstrate our model using a unique dataset of the Israeli transportation network, on which different drone deployment schemes are evaluated. |
Tasks | |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05735v1 |
http://arxiv.org/pdf/1611.05735v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-dynamic-coverage-infrastructure-for |
Repo | |
Framework | |
Generalized Dropout
Title | Generalized Dropout |
Authors | Suraj Srinivas, R. Venkatesh Babu |
Abstract | Deep Neural Networks often require good regularizers to generalize well. Dropout is one such regularizer that is widely used among Deep Learning practitioners. Recent work has shown that Dropout can also be viewed as performing Approximate Bayesian Inference over the network parameters. In this work, we generalize this notion and introduce a rich family of regularizers which we call Generalized Dropout. One set of methods in this family, called Dropout++, is a version of Dropout with trainable parameters. Classical Dropout emerges as a special case of this method. Another member of this family selects the width of neural network layers. Experiments show that these methods help in improving generalization performance over Dropout. |
Tasks | Bayesian Inference |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06791v1 |
http://arxiv.org/pdf/1611.06791v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-dropout |
Repo | |
Framework | |
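A plausible reading of Dropout++ is dropout whose per-unit retain probabilities are themselves trainable. The forward pass below is a hedged sketch of that idea: the sigmoid parameterization and the inverted scaling are assumptions for illustration, not the paper's exact formulation (which trains the parameters via approximate Bayesian inference).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dropout_pp(x, theta, train=True):
    """Dropout with per-unit trainable retain probabilities
    p_i = sigmoid(theta_i). Classical dropout is the special case
    where every theta_i is frozen at the same value."""
    p = sigmoid(theta)
    if train:
        mask = rng.random(x.shape) < p   # Bernoulli(p_i) mask per unit
        return x * mask / p              # inverted scaling: E[output] = x
    return x                             # identity at test time

x = np.ones(10000)
theta = np.zeros(10000)                  # p_i = 0.5 everywhere
out = dropout_pp(x, theta)
```

Gradients with respect to `theta` would let the network learn how aggressively to drop each unit, and driving some p_i toward zero effectively selects the layer width, as the abstract describes.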
Neural Machine Translation with Latent Semantic of Image and Text
Title | Neural Machine Translation with Latent Semantic of Image and Text |
Authors | Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Yutaka Matsuo |
Abstract | Although attention-based Neural Machine Translation has achieved great success, the attention mechanism cannot capture the entire meaning of the source sentence because it generates a target word depending heavily on the relevant parts of the source sentence. Earlier studies have introduced a latent variable to capture the entire meaning of a sentence and achieved improvements over attention-based Neural Machine Translation. We follow this approach, and we believe that capturing the meaning of a sentence benefits from image information because human beings understand the meaning of language not only from textual information but also from perceptual information such as that gained from vision. As described herein, we propose a neural machine translation model that introduces a continuous latent variable containing underlying semantics extracted from texts and images. Our model, which can be trained end-to-end, requires image information only during training. Experiments conducted with an English–German translation task show that our model outperforms the baseline. |
Tasks | Machine Translation |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08459v1 |
http://arxiv.org/pdf/1611.08459v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-machine-translation-with-latent |
Repo | |
Framework | |
Online Learning for Wireless Distributed Computing
Title | Online Learning for Wireless Distributed Computing |
Authors | Yi-Hsuan Kao, Kwame Wright, Bhaskar Krishnamachari, Fan Bai |
Abstract | There has been a growing interest in Wireless Distributed Computing (WDC), which leverages collaborative computing over multiple wireless devices. WDC enables complex applications that a single device cannot support individually. However, the problem of assigning tasks over multiple devices becomes challenging in the dynamic environments encountered in real-world settings, considering that the resource availability and channel conditions change over time in unpredictable ways due to mobility and other factors. In this paper, we formulate a task assignment problem as an online learning problem using an adversarial multi-armed bandit framework. We propose MABSTA, a novel online learning algorithm that learns the performance of unknown devices and channel qualities continually through exploratory probing and makes task assignment decisions by exploiting the gained knowledge. For maximal adaptability, MABSTA is designed to make no stochastic assumption about the environment. We analyze it mathematically and provide a worst-case performance guarantee for any dynamic environment. We also compare it with the optimal offline policy as well as other baselines via emulations on trace data obtained from a wireless IoT testbed, and show that it offers competitive and robust performance in all cases. To the best of our knowledge, MABSTA is the first online algorithm in this domain of task assignment problems to provide a provable performance guarantee. |
Tasks | |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.02830v1 |
http://arxiv.org/pdf/1611.02830v1.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-for-wireless-distributed |
Repo | |
Framework | |
A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image
Title | A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image |
Authors | Ruiqi Zhao, Yan Wang, Aleix Martinez |
Abstract | Three-dimensional shape reconstruction of 2D landmark points on a single image is a hallmark of human vision, but is a task that has been proven difficult for computer vision algorithms. We define a feed-forward deep neural network algorithm that can reconstruct 3D shapes from 2D landmark points almost perfectly (i.e., with extremely small reconstruction errors), even when these 2D landmarks are from a single image. Our experimental results show an improvement of up to two-fold over state-of-the-art computer vision algorithms; 3D shape reconstruction of human faces is given at a reconstruction error < .004, cars at .0022, human bodies at .022, and highly-deformable flags at an error of .0004. Our algorithm was also a top performer at the 2016 3D Face Alignment in the Wild Challenge competition (held in conjunction with the European Conference on Computer Vision, ECCV) that required the reconstruction of 3D face shape from a single image. The derived algorithm can be trained in a couple of hours and testing runs at more than 1,000 frames/s on an i7 desktop. We also present an innovative data augmentation approach that allows us to train the system efficiently with a small number of samples. The system is also robust to noise (e.g., imprecise landmark points) and missing data (e.g., occluded or undetected landmark points). |
Tasks | Data Augmentation, Face Alignment |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.09058v1 |
http://arxiv.org/pdf/1609.09058v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-fast-and-highly-accurate-algorithm |
Repo | |
Framework | |