May 7, 2019

Paper Group ANR 10

DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters


Title	DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters
Authors	Hanjoo Kim, Jaehong Park, Jaehee Jang, Sungroh Yoon
Abstract	The increasing complexity of deep neural networks (DNNs) has made it challenging to exploit existing large-scale data processing pipelines for handling massive data and parameters involved in DNN training. Distributed computing platforms and GPGPU-based acceleration provide a mainstream solution to this computational challenge. In this paper, we propose DeepSpark, a distributed and parallel deep learning framework that exploits Apache Spark on commodity clusters. To support parallel operations, DeepSpark automatically distributes workloads and parameters to Caffe/TensorFlow-running nodes using Spark, and iteratively aggregates training results by a novel lock-free asynchronous variant of the popular elastic averaging stochastic gradient descent based update scheme, effectively complementing the synchronized processing capabilities of Spark. DeepSpark is an ongoing project, and the current release is available at http://deepspark.snu.ac.kr.
Tasks
Published	2016-02-26
URL	http://arxiv.org/abs/1602.08191v3
PDF	http://arxiv.org/pdf/1602.08191v3.pdf
PWC	https://paperswithcode.com/paper/deepspark-a-spark-based-distributed-deep
Repo
Framework
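
As a rough illustration of the elastic averaging update that the DeepSpark abstract builds on, the sketch below simulates the generic elastic-averaging SGD exchange on a toy one-dimensional quadratic. It is not DeepSpark's lock-free variant and uses neither Spark nor Caffe: workers are simply visited in turn inside one process. The function name `easgd_simulation`, the objective, and all hyperparameters (`lr`, `alpha`, `tau`) are assumptions chosen for illustration.

```python
import random


def easgd_simulation(num_workers=4, steps=200, lr=0.05, alpha=0.1, tau=5, seed=0):
    """Simulate elastic averaging SGD on f(x) = 0.5 * (x - 3)^2 with noisy gradients."""
    rng = random.Random(seed)
    target = 3.0
    center = 0.0  # shared "center" parameter that workers are elastically tied to
    workers = [rng.uniform(-5.0, 5.0) for _ in range(num_workers)]
    for step in range(1, steps + 1):
        for i in range(num_workers):
            # Local SGD step on the worker's own noisy gradient.
            grad = (workers[i] - target) + rng.gauss(0.0, 0.1)
            workers[i] -= lr * grad
            # Every tau steps, pull the worker and the center toward each other.
            if step % tau == 0:
                diff = workers[i] - center
                workers[i] -= alpha * diff
                center += alpha * diff
    return center, workers


center, workers = easgd_simulation()
print("center:", center)
print("workers:", workers)
```

The elastic term lets each worker explore on its own between exchanges while keeping it loosely tied to the shared center, which is why the exchange period `tau` and the coupling strength `alpha` are separate knobs in this sketch.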

Optimizing Quantiles in Preference-based Markov Decision Processes


Title	Optimizing Quantiles in Preference-based Markov Decision Processes
Authors	Hugo Gilbert, Paul Weng, Yan Xu
Abstract	In the Markov decision process model, policies are usually evaluated by expected cumulative rewards. As this decision criterion is not always suitable, we propose in this paper an algorithm for computing a policy optimal for the quantile criterion. Both finite and infinite horizons are considered. Finally, we experimentally evaluate our approach on random MDPs and on a data center control problem.
Tasks
Published	2016-12-01
URL	http://arxiv.org/abs/1612.00094v1
PDF	http://arxiv.org/pdf/1612.00094v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-quantiles-in-preference-based
Repo
Framework
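
To make the difference between the expectation and the quantile criterion concrete, here is a minimal sketch that compares the two on hand-made outcome distributions. It does not implement the paper's algorithm for computing an optimal policy; the two distributions (`risky`, `safe`), the numeric outcome values, and the helper names are hypothetical.

```python
def lower_quantile(outcomes, tau):
    """Return the smallest value v such that P(X <= v) >= tau.

    `outcomes` is a list of (value, probability) pairs.
    """
    cumulative = 0.0
    for value, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= tau - 1e-12:  # small tolerance for float round-off
            return value
    raise ValueError("probabilities must sum to at least tau")


def expectation(outcomes):
    """Expected value of a discrete outcome distribution."""
    return sum(value * prob for value, prob in outcomes)


# Hypothetical distributions over final outcomes induced by two policies.
risky = [(0, 0.6), (150, 0.4)]
safe = [(40, 1.0)]

for name, dist in (("risky", risky), ("safe", safe)):
    print(name, "mean:", expectation(dist), "median:", lower_quantile(dist, 0.5))
```

Under these assumed numbers the risky policy has the higher mean while the safe policy has the higher median, so the two criteria rank the policies differently. The quantile also only needs an ordering on outcomes, which is what makes it usable in a preference-based setting.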

EgoReID: Cross-view Self-Identification and Human Re-identification in Egocentric and Surveillance Videos


Title	EgoReID: Cross-view Self-Identification and Human Re-identification in Egocentric and Surveillance Videos
Authors	Shervin Ardeshir, Sandesh Sharma, Ali Borji
Abstract	Human identification remains one of the challenging tasks in the computer vision community due to drastic changes in visual features across different viewpoints, lighting conditions, occlusion, etc. Most of the literature has focused on exploring human re-identification across viewpoints that are not too drastically different in nature. Cameras usually capture oblique or side views of humans, leaving room for a lot of geometric and visual reasoning. Given the recent popularity of egocentric and top-view vision, re-identification across these two drastically different views can now be explored. Given an egocentric and a top-view video, our goal is to identify the cameraman in the content of the top-view video, and also re-identify the people visible in the egocentric video, by matching them to the identities present in the top-view video. We propose a CRF-based method to address the two problems. Our experimental results demonstrate the efficiency of the proposed approach over a variety of videos recorded from two views.
Tasks	Person Re-Identification, Visual Reasoning
Published	2016-12-24
URL	http://arxiv.org/abs/1612.08153v1
PDF	http://arxiv.org/pdf/1612.08153v1.pdf
PWC	https://paperswithcode.com/paper/egoreid-cross-view-self-identification-and
Repo
Framework

Learning Over Long Time Lags


Title	Learning Over Long Time Lags
Authors	Hojjat Salehinejad
Abstract	The advantage of recurrent neural networks (RNNs) in learning dependencies between time-series data has distinguished RNNs from other deep learning models. Recently, many advances have been proposed in this emerging field. However, the literature lacks a comprehensive review of memory models in RNNs. This paper provides a fundamental review of RNNs and the long short-term memory (LSTM) model. It then surveys recent advances in different memory enhancements and learning techniques for capturing long-term dependencies in RNNs.
Tasks	Time Series
Published	2016-02-13
URL	http://arxiv.org/abs/1602.04335v1
PDF	http://arxiv.org/pdf/1602.04335v1.pdf
PWC	https://paperswithcode.com/paper/learning-over-long-time-lags
Repo
Framework

Bayesian Variable Selection for Globally Sparse Probabilistic PCA


Title	Bayesian Variable Selection for Globally Sparse Probabilistic PCA
Authors	Charles Bouveyron, Pierre Latouche, Pierre-Alexandre Mattei
Abstract	Sparse versions of principal component analysis (PCA) have imposed themselves as simple, yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables is difficult since each axis has its own sparsity pattern and has to be interpreted separately. To overcome this drawback, we propose a Bayesian procedure called globally sparse probabilistic PCA (GSPPCA) that makes it possible to obtain several sparse components with the same sparsity pattern. This allows the practitioner to identify the original variables which are relevant to describe the data. To this end, using Roweis’ probabilistic interpretation of PCA and a Gaussian prior on the loading matrix, we provide the first exact computation of the marginal likelihood of a Bayesian PCA model. To avoid the drawbacks of discrete model selection, a simple relaxation of this framework is presented. It makes it possible to find a path of models using a variational expectation-maximization algorithm. The exact marginal likelihood is then maximized over this path. This approach is illustrated on real and synthetic data sets. In particular, using unlabeled microarray data, GSPPCA infers much more relevant gene subsets than traditional sparse PCA algorithms.
Tasks	Model Selection
Published	2016-05-19
URL	http://arxiv.org/abs/1605.05918v2
PDF	http://arxiv.org/pdf/1605.05918v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-variable-selection-for-globally
Repo
Framework

Semi-Supervised Representation Learning based on Probabilistic Labeling


Title	Semi-Supervised Representation Learning based on Probabilistic Labeling
Authors	Ershad Banijamali, Ali Ghodsi
Abstract	In this paper, we present a new algorithm for semi-supervised representation learning. In this algorithm, we first find a vector representation for the labels of the data points based on their local positions in the space. Then, we map the data to a lower-dimensional space using a linear transformation such that the dependency between the transformed data and the assigned labels is maximized. In fact, we try to find a mapping that is as discriminative as possible. The approach uses the Hilbert-Schmidt Independence Criterion (HSIC) as the dependence measure. We also present a kernelized version of the algorithm, which allows non-linear transformations and provides more flexibility in finding the appropriate mapping. Use of unlabeled data for learning a new representation is not always beneficial, and there is no algorithm that can deterministically guarantee an improvement in performance by exploiting unlabeled data. Therefore, we also propose a bound on the performance of the algorithm, which can be used to determine the effectiveness of using the unlabeled data in the algorithm. We demonstrate the ability of the algorithm to find the transformation using both toy examples and real-world datasets.
Tasks	Representation Learning
Published	2016-05-10
URL	http://arxiv.org/abs/1605.03072v3
PDF	http://arxiv.org/pdf/1605.03072v3.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-representation-learning-based
Repo
Framework
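
The dependence measure named in the abstract can be sketched in a few lines. The code below computes the standard biased empirical HSIC estimate, trace(K H L H) / (n - 1)^2, with linear kernels on both sides. It only evaluates the criterion and does not perform the paper's optimization over a linear map; the kernel choice, the helper names, and the toy data are assumptions.

```python
def linear_kernel(rows):
    """Gram matrix of inner products between all pairs of row vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in rows] for x in rows]


def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(X, Y):
    """Biased empirical HSIC between paired samples X and Y (lists of vectors)."""
    n = len(X)
    Kc = center(linear_kernel(X))
    L = linear_kernel(Y)
    # trace(H K H L) equals the elementwise sum because L is symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


X = [[0.0], [1.0], [2.0], [3.0]]
Y_dependent = [[0.0], [2.0], [4.0], [6.0]]  # Y = 2 * X
Y_constant = [[1.0], [1.0], [1.0], [1.0]]   # carries no information about X

print("dependent:", hsic(X, Y_dependent))
print("constant:", hsic(X, Y_constant))
```

A larger value indicates stronger dependence, so a representation learner in this spirit would search for the transformation of `X` that drives this quantity up against the label representation.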

Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery


Title	Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery
Authors	Alina Marcu, Marius Leordeanu
Abstract	Visual context is important in object recognition and it is still an open problem in computer vision. Along with the advent of deep convolutional neural networks (CNN), using contextual information with such systems is starting to receive attention in the literature. At the same time, aerial imagery is gaining momentum. While advances in deep learning make good progress in aerial image analysis, this problem still poses many great challenges. Aerial images are often taken under poor lighting conditions and contain low-resolution objects, often occluded by trees or taller buildings. In this domain, in particular, visual context could be of great help, but there are still very few papers that consider context in aerial image understanding. Here we introduce context as a complementary way of recognizing objects. We propose a dual-stream deep neural network model that processes information along two independent pathways, one for local and another for global visual reasoning. The two are later combined in the final layers of processing. Our model learns to combine local object appearance as well as information from the larger scene at the same time and in a complementary way, such that together they form a powerful classifier. We test our dual-stream network on the task of segmentation of buildings and roads in aerial images and obtain state-of-the-art results on the Massachusetts Buildings Dataset. We also introduce two new datasets, for buildings and road segmentation, respectively, and study the relative importance of local appearance vs. the larger scene, as well as their performance in combination. While our local-global model could also be useful in general recognition tasks, we clearly demonstrate the effectiveness of visual context in conjunction with deep nets for aerial image understanding.
Tasks	Object Recognition, Visual Reasoning
Published	2016-05-18
URL	http://arxiv.org/abs/1605.05462v1
PDF	http://arxiv.org/pdf/1605.05462v1.pdf
PWC	https://paperswithcode.com/paper/dual-local-global-contextual-pathways-for
Repo
Framework

Filling in the details: Perceiving from low fidelity images


Title	Filling in the details: Perceiving from low fidelity images
Authors	Farahnaz Ahmed Wick, Michael L. Wick, Marc Pomplun
Abstract	Humans perceive their surroundings in great detail even though most of our visual field is reduced to low-fidelity color-deprived (e.g. dichromatic) input by the retina. In contrast, most deep learning architectures are computationally wasteful in that they consider every part of the input when performing an image processing task. Yet, the human visual system is able to perform visual reasoning despite having only a small fovea of high visual acuity. With this in mind, we wish to understand the extent to which connectionist architectures are able to learn from and reason with low acuity, distorted inputs. Specifically, we train autoencoders to generate full-detail images from low-detail “foveations” of those images and then measure their ability to reconstruct the full-detail images from the foveated versions. By varying the type of foveation, we can study how well the architectures can cope with various types of distortion. We find that the autoencoder compensates for lower detail by learning increasingly global feature functions. In many cases, the learnt features are suitable for reconstructing the original full-detail image. For example, we find that the networks accurately perceive color in the periphery, even when 75% of the input is achromatic.
Tasks	Foveation, Visual Reasoning
Published	2016-04-14
URL	http://arxiv.org/abs/1604.04125v1
PDF	http://arxiv.org/pdf/1604.04125v1.pdf
PWC	https://paperswithcode.com/paper/filling-in-the-details-perceiving-from-low
Repo
Framework

Tracking multiple moving objects in images using Markov Chain Monte Carlo


Title	Tracking multiple moving objects in images using Markov Chain Monte Carlo
Authors	Lan Jiang, Sumeetpal S. Singh
Abstract	A new Bayesian state and parameter learning algorithm for multiple target tracking (MTT) models with image observations is proposed. Specifically, a Markov chain Monte Carlo algorithm is designed to sample from the posterior distribution of the unknown number of targets, their birth and death times, states and model parameters, which constitutes the complete solution to the tracking problem. The conventional approach is to pre-process the images to extract point observations and then perform tracking. We model the image generation process directly to avoid potential loss of information when extracting point observations. Numerical examples show that our algorithm has improved tracking performance over commonly used techniques, for both synthetic examples and real fluorescent microscopy data, especially in the case of dim targets with overlapping illuminated regions.
Tasks	Image Generation
Published	2016-03-17
URL	http://arxiv.org/abs/1603.05522v1
PDF	http://arxiv.org/pdf/1603.05522v1.pdf
PWC	https://paperswithcode.com/paper/tracking-multiple-moving-objects-in-images
Repo
Framework

Structured Convolution Matrices for Energy-efficient Deep learning


Title	Structured Convolution Matrices for Energy-efficient Deep learning
Authors	Rathinakumar Appuswamy, Tapan Nayak, John Arthur, Steven Esser, Paul Merolla, Jeffrey Mckinstry, Timothy Melano, Myron Flickner, Dharmendra Modha
Abstract	We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toeplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We also put forward a novel method to train binary convolutional networks by utilising an existing connection between noisy-rectified linear units and binary activations.
Tasks
Published	2016-06-08
URL	http://arxiv.org/abs/1606.02407v1
PDF	http://arxiv.org/pdf/1606.02407v1.pdf
PWC	https://paperswithcode.com/paper/structured-convolution-matrices-for-energy
Repo
Framework
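
The link between convolution and Toeplitz matrices mentioned in the abstract can be seen in one dimension. The sketch below builds the Toeplitz matrix of a "valid" sliding-window filter (cross-correlation, as used in CNN layers) and checks that multiplying by it matches the direct computation. The paper works with block Toeplitz matrices for 2D convolutions on neuromorphic hardware; this 1D simplification and all function names are assumptions.

```python
def toeplitz_conv_matrix(kernel, input_len):
    """Matrix whose product with a signal equals the 'valid' sliding-window filter.

    Row i holds the kernel shifted right by i positions, so every diagonal of the
    matrix is constant, which is the defining property of a Toeplitz matrix.
    """
    k = len(kernel)
    out_len = input_len - k + 1
    return [
        [kernel[j - i] if 0 <= j - i < k else 0 for j in range(input_len)]
        for i in range(out_len)
    ]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sliding_window(kernel, signal):
    """Direct 'valid' cross-correlation, for comparison."""
    k = len(kernel)
    return [
        sum(kernel[t] * signal[i + t] for t in range(k))
        for i in range(len(signal) - k + 1)
    ]


kernel = [1, 0, -1]
signal = [1, 2, 4, 7, 11]
T = toeplitz_conv_matrix(kernel, len(signal))
print(T)
print(matvec(T, signal), sliding_window(kernel, signal))
```

Because the matrix is fully determined by the few kernel entries, a convolutional layer can be stored and constrained far more compactly than a dense weight matrix of the same shape.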

Reaching Unanimous Agreements Within Agent-Based Negotiation Teams With Linear and Monotonic Utility Functions


Title	Reaching Unanimous Agreements Within Agent-Based Negotiation Teams With Linear and Monotonic Utility Functions
Authors	Victor Sanchez-Anguix, Vicente Julian, Vicente Botti, Ana Garcia-Fornes
Abstract	In this article, an agent-based negotiation model for negotiation teams that negotiate a deal with an opponent is presented. Agent-based negotiation teams are groups of agents that join together as a single negotiation party because they share an interest that is related to the negotiation process. The model relies on a trusted mediator that coordinates and helps team members in the decisions that they have to take during the negotiation process: which offer is sent to the opponent, and whether the offers received from the opponent are accepted. The main strength of the proposed negotiation model is the fact that it guarantees unanimity within team decisions since decisions report a utility to team members that is greater than or equal to their aspiration levels at each negotiation round. This work analyzes how unanimous decisions are taken within the team and the robustness of the model against different types of manipulations. An empirical evaluation is also performed to study the impact of the different parameters of the model.
Tasks
Published	2016-04-16
URL	http://arxiv.org/abs/1604.04728v1
PDF	http://arxiv.org/pdf/1604.04728v1.pdf
PWC	https://paperswithcode.com/paper/reaching-unanimous-agreements-within-agent
Repo
Framework
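
The unanimity condition described in the abstract (a decision must give every team member a utility at or above that member's aspiration level) can be sketched as a simple check. This is only the acceptance test, not the paper's mediated protocol; the weighted-sum utility, the example weights and aspiration levels, and the function names are hypothetical.

```python
def linear_utility(weights, offer):
    """Linear additive utility: weighted sum of the offer's issue values."""
    return sum(w * x for w, x in zip(weights, offer))


def is_unanimously_acceptable(offer, team_weights, aspirations):
    """True if every team member's utility meets that member's aspiration level."""
    return all(
        linear_utility(weights, offer) >= aspiration
        for weights, aspiration in zip(team_weights, aspirations)
    )


# Two team members who weigh the two negotiation issues differently.
team_weights = [[0.5, 0.5], [0.2, 0.8]]
aspirations = [0.5, 0.5]

print(is_unanimously_acceptable([0.8, 0.4], team_weights, aspirations))
print(is_unanimously_acceptable([0.6, 0.6], team_weights, aspirations))
```

In this toy setup the first offer fails because the second member values the second issue more, so a mediator following this rule would have to keep searching for an offer that clears every member's threshold.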

Scale-free network optimization: foundations and algorithms


Title	Scale-free network optimization: foundations and algorithms
Authors	Patrick Rebeschini, Sekhar Tatikonda
Abstract	We investigate the fundamental principles that drive the development of scalable algorithms for network optimization. Despite the significant amount of work on parallel and decentralized algorithms in the optimization community, the methods that have been proposed typically rely on strict separability assumptions for objective function and constraints. Besides sparsity, these methods typically do not exploit the strength of the interaction between variables in the system. We propose a notion of correlation in constrained optimization that is based on the sensitivity of the optimal solution upon perturbations of the constraints. We develop a general theory of sensitivity of optimizers that extends beyond the infinitesimal setting. We present instances in network optimization where the correlation decays exponentially fast with respect to the natural distance in the network, and we design algorithms that can exploit this decay to yield dimension-free optimization. Our results are the first of their kind, and open new possibilities in the theory of local algorithms.
Tasks
Published	2016-02-12
URL	http://arxiv.org/abs/1602.04227v1
PDF	http://arxiv.org/pdf/1602.04227v1.pdf
PWC	https://paperswithcode.com/paper/scale-free-network-optimization-foundations
Repo
Framework

The optimality of coarse categories in decision-making and information storage


Title	The optimality of coarse categories in decision-making and information storage
Authors	Michael Mandler
Abstract	An agent who lacks preferences and instead makes decisions using criteria that are costly to create should select efficient sets of criteria, where the cost of making a given number of choice distinctions is minimized. Under mild conditions, efficiency requires that binary criteria with only two categories per criterion are chosen. When applied to the problem of determining the optimal number of digits in an information storage device, this result implies that binary digits (bits) are the efficient solution, even when the marginal cost of using additional digits declines rapidly to 0. This short paper pays particular attention to the symmetry conditions entailed when sets of criteria are efficient.
Tasks	Decision Making
Published	2016-06-24
URL	http://arxiv.org/abs/1606.07529v1
PDF	http://arxiv.org/pdf/1606.07529v1.pdf
PWC	https://paperswithcode.com/paper/the-optimality-of-coarse-categories-in
Repo
Framework

Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models


Title	Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Authors	Pranay Dighe, Afsaneh Asaei, Herve Bourlard
Abstract	Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets to learn better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high-dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principal component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft targets for DNN acoustic modeling, which also enables training with untranscribed data. Experiments conducted on the AMI corpus show a 4.6% relative reduction in word error rate.
Tasks	Speech Recognition
Published	2016-10-18
URL	http://arxiv.org/abs/1610.05688v1
PDF	http://arxiv.org/pdf/1610.05688v1.pdf
PWC	https://paperswithcode.com/paper/low-rank-and-sparse-soft-targets-to-learn
Repo
Framework

Single Image 3D Interpreter Network


Title	Single Image 3D Interpreter Network
Authors	Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman
Abstract	Understanding 3D object structure from a single image is an important but difficult task in computer vision, mostly due to the lack of 3D object annotations in real images. Previous work tackles this problem by either solving an optimization task given 2D keypoint positions, or training on synthetic data with ground truth 3D information. In this work, we propose 3D INterpreter Network (3D-INN), an end-to-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data. This is made possible mainly by two technical innovations. First, we propose a Projection Layer, which projects estimated 3D structure to 2D space, so that 3D-INN can be trained to predict 3D structural parameters supervised by 2D annotations on real images. Second, heatmaps of keypoints serve as an intermediate representation connecting real and synthetic data, enabling 3D-INN to benefit from the variation and abundance of synthetic 3D objects, without suffering from the difference between the statistics of real and synthesized images due to imperfect rendering. The network achieves state-of-the-art performance on both 2D keypoint estimation and 3D structure recovery. We also show that the recovered 3D information can be used in other vision applications, such as 3D rendering and image retrieval.
Tasks	Image Retrieval
Published	2016-04-29
URL	http://arxiv.org/abs/1604.08685v2
PDF	http://arxiv.org/pdf/1604.08685v2.pdf
PWC	https://paperswithcode.com/paper/single-image-3d-interpreter-network
Repo
Framework
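
The idea behind a projection layer, as the abstract describes it, is that predicted 3D structure is projected to 2D so it can be compared against 2D keypoint annotations. The sketch below shows that comparison with a plain pinhole camera and a single rotation angle. It is a simplified stand-in rather than the 3D-INN layer itself; the camera parameterization (`yaw`, `translation`, `focal_length`), the function names, and the sample points are assumptions.

```python
import math


def project_points(points_3d, yaw, translation, focal_length):
    """Rotate points about the y-axis, translate them, and apply perspective projection."""
    c, s = math.cos(yaw), math.sin(yaw)
    projected = []
    for x, y, z in points_3d:
        # Rotation about the vertical (y) axis.
        xr = c * x + s * z
        zr = -s * x + c * z
        # Move into the camera frame; the depth zc is assumed to be positive.
        xc = xr + translation[0]
        yc = y + translation[1]
        zc = zr + translation[2]
        projected.append((focal_length * xc / zc, focal_length * yc / zc))
    return projected


def reprojection_loss(points_3d, keypoints_2d, yaw, translation, focal_length):
    """Mean squared distance between projected 3D points and 2D keypoint annotations."""
    predicted = project_points(points_3d, yaw, translation, focal_length)
    return sum(
        (px - kx) ** 2 + (py - ky) ** 2
        for (px, py), (kx, ky) in zip(predicted, keypoints_2d)
    ) / len(keypoints_2d)


points = [(1.0, 2.0, 0.0), (-1.0, 0.5, 1.0)]
camera = (0.0, 0.0, 5.0)
keypoints = project_points(points, 0.0, camera, 1.0)  # stand-in for 2D annotations

print(reprojection_loss(points, keypoints, 0.0, camera, 1.0))
print(reprojection_loss(points, keypoints, 0.3, camera, 1.0))
```

Since every operation here is differentiable in the 3D coordinates and camera parameters, a loss of this kind can in principle pass gradients from 2D annotations back to 3D predictions, which is the role the abstract assigns to the projection layer.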