Paper Group ANR 10
DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters
Title | DeepSpark: A Spark-Based Distributed Deep Learning Framework for Commodity Clusters |
Authors | Hanjoo Kim, Jaehong Park, Jaehee Jang, Sungroh Yoon |
Abstract | The increasing complexity of deep neural networks (DNNs) has made it challenging to exploit existing large-scale data processing pipelines for handling the massive data and parameters involved in DNN training. Distributed computing platforms and GPGPU-based acceleration provide a mainstream solution to this computational challenge. In this paper, we propose DeepSpark, a distributed and parallel deep learning framework that exploits Apache Spark on commodity clusters. To support parallel operations, DeepSpark automatically distributes workloads and parameters to Caffe/TensorFlow-running nodes using Spark, and iteratively aggregates training results by a novel lock-free asynchronous variant of the popular elastic averaging stochastic gradient descent based update scheme, effectively complementing the synchronized processing capabilities of Spark. DeepSpark is an ongoing project, and the current release is available at http://deepspark.snu.ac.kr. |
Tasks | |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08191v3, http://arxiv.org/pdf/1602.08191v3.pdf |
PWC | https://paperswithcode.com/paper/deepspark-a-spark-based-distributed-deep |
Repo | |
Framework | |
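The abstract above builds on elastic averaging SGD (EASGD). As a rough illustration of the idea, the sketch below runs one synchronous EASGD round in plain Python. It is the textbook synchronous update, not DeepSpark's lock-free asynchronous variant, and the function name `easgd_round` and the parameters `lr` and `rho` are assumptions chosen for this example.

```python
def easgd_round(workers, center, grads, lr=0.1, rho=0.5):
    """One synchronous elastic-averaging SGD round (illustrative sketch).

    workers: list of parameter vectors (lists of floats), one per worker
    center:  shared center parameter vector
    grads:   list of gradient vectors, one per worker
    """
    alpha = lr * rho  # strength of the elastic pull between worker and center
    new_workers = []
    new_center = list(center)
    for x, g in zip(workers, grads):
        # Elastic difference between this worker and the (old) center.
        diff = [xi - ci for xi, ci in zip(x, center)]
        # Worker: local gradient step plus a pull toward the center.
        new_workers.append([xi - lr * gi - alpha * di
                            for xi, gi, di in zip(x, g, diff)])
        # Center: pulled toward this worker by the same elastic term.
        new_center = [ci + alpha * di for ci, di in zip(new_center, diff)]
    return new_workers, new_center


# Two workers on either side of the center, with zero gradients:
# each worker drifts toward the center while the center stays put.
workers, center = easgd_round([[1.0], [3.0]], [2.0], [[0.0], [0.0]])
```

With zero gradients the update reduces to the elastic term alone, which makes the averaging behaviour easy to inspect in isolation.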
Optimizing Quantiles in Preference-based Markov Decision Processes
Title | Optimizing Quantiles in Preference-based Markov Decision Processes |
Authors | Hugo Gilbert, Paul Weng, Yan Xu |
Abstract | In the Markov decision process model, policies are usually evaluated by expected cumulative rewards. As this decision criterion is not always suitable, we propose in this paper an algorithm for computing a policy optimal for the quantile criterion. Both finite and infinite horizons are considered. Finally we experimentally evaluate our approach on random MDPs and on a data center control problem. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00094v1, http://arxiv.org/pdf/1612.00094v1.pdf |
PWC | https://paperswithcode.com/paper/optimizing-quantiles-in-preference-based |
Repo | |
Framework | |
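To make the quantile criterion mentioned in the abstract above concrete, here is a minimal sketch that computes the lower tau-quantile of a distribution over preference-ordered outcomes and uses it to compare two hypothetical policies. It is not the paper's algorithm for finding an optimal policy; the function name and the example distributions are assumptions for illustration only.

```python
def lower_quantile(outcome_probs, tau):
    """Return the lower tau-quantile of a distribution over ordered outcomes.

    outcome_probs: list of (outcome, probability) pairs, sorted from the
                   least preferred outcome to the most preferred one.
    tau:           quantile level in (0, 1].
    """
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    cumulative = 0.0
    for outcome, prob in outcome_probs:
        cumulative += prob
        # First outcome whose cumulative probability reaches tau
        # (small tolerance guards against floating-point round-off).
        if cumulative >= tau - 1e-12:
            return outcome
    return outcome_probs[-1][0]


# Hypothetical outcome distributions induced by two policies.
policy_a = [("bad", 0.4), ("ok", 0.2), ("good", 0.4)]
policy_b = [("bad", 0.1), ("ok", 0.3), ("good", 0.6)]
median_a = lower_quantile(policy_a, 0.5)
median_b = lower_quantile(policy_b, 0.5)
```

Because only the order of the outcomes matters, the criterion needs no numeric rewards, which is what makes it usable in a preference-based setting.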
EgoReID: Cross-view Self-Identification and Human Re-identification in Egocentric and Surveillance Videos
Title | EgoReID: Cross-view Self-Identification and Human Re-identification in Egocentric and Surveillance Videos |
Authors | Shervin Ardeshir, Sandesh Sharma, Ali Broji |
Abstract | Human identification remains one of the challenging tasks in the computer vision community due to drastic changes in visual features across different viewpoints, lighting conditions, occlusion, etc. Most of the literature has focused on exploring human re-identification across viewpoints that are not too drastically different in nature. Cameras usually capture oblique or side views of humans, leaving room for a lot of geometric and visual reasoning. Given the recent popularity of egocentric and top-view vision, re-identification across these two drastically different views can now be explored. Given an egocentric and a top-view video, our goal is to identify the cameraman in the content of the top-view video, and also to re-identify the people visible in the egocentric video by matching them to the identities present in the top-view video. We propose a CRF-based method to address the two problems. Our experimental results demonstrate the efficiency of the proposed approach over a variety of videos recorded from two views. |
Tasks | Person Re-Identification, Visual Reasoning |
Published | 2016-12-24 |
URL | http://arxiv.org/abs/1612.08153v1, http://arxiv.org/pdf/1612.08153v1.pdf |
PWC | https://paperswithcode.com/paper/egoreid-cross-view-self-identification-and |
Repo | |
Framework | |
Learning Over Long Time Lags
Title | Learning Over Long Time Lags |
Authors | Hojjat Salehinejad |
Abstract | The advantage of recurrent neural networks (RNNs) in learning dependencies between time-series data has distinguished RNNs from other deep learning models. Recently, many advances have been proposed in this emerging field. However, there is a lack of comprehensive review on memory models in RNNs in the literature. This paper provides a fundamental review of RNNs and the long short-term memory (LSTM) model. It then surveys recent advances in memory enhancements and learning techniques for capturing long-term dependencies in RNNs. |
Tasks | Time Series |
Published | 2016-02-13 |
URL | http://arxiv.org/abs/1602.04335v1, http://arxiv.org/pdf/1602.04335v1.pdf |
PWC | https://paperswithcode.com/paper/learning-over-long-time-lags |
Repo | |
Framework | |
Bayesian Variable Selection for Globally Sparse Probabilistic PCA
Title | Bayesian Variable Selection for Globally Sparse Probabilistic PCA |
Authors | Charles Bouveyron, Pierre Latouche, Pierre-Alexandre Mattei |
Abstract | Sparse versions of principal component analysis (PCA) have imposed themselves as simple, yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables is difficult since each axis has its own sparsity pattern and has to be interpreted separately. To overcome this drawback, we propose a Bayesian procedure called globally sparse probabilistic PCA (GSPPCA) that yields several sparse components with the same sparsity pattern. This allows the practitioner to identify the original variables which are relevant to describe the data. To this end, using Roweis’ probabilistic interpretation of PCA and a Gaussian prior on the loading matrix, we provide the first exact computation of the marginal likelihood of a Bayesian PCA model. To avoid the drawbacks of discrete model selection, a simple relaxation of this framework is presented. It makes it possible to find a path of models using a variational expectation-maximization algorithm. The exact marginal likelihood is then maximized over this path. This approach is illustrated on real and synthetic data sets. In particular, using unlabeled microarray data, GSPPCA infers much more relevant gene subsets than traditional sparse PCA algorithms. |
Tasks | Model Selection |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.05918v2, http://arxiv.org/pdf/1605.05918v2.pdf |
PWC | https://paperswithcode.com/paper/bayesian-variable-selection-for-globally |
Repo | |
Framework | |
Semi-Supervised Representation Learning based on Probabilistic Labeling
Title | Semi-Supervised Representation Learning based on Probabilistic Labeling |
Authors | Ershad Banijamali, Ali Ghodsi |
Abstract | In this paper, we present a new algorithm for semi-supervised representation learning. In this algorithm, we first find a vector representation for the labels of the data points based on their local positions in the space. Then, we map the data to a lower-dimensional space using a linear transformation such that the dependency between the transformed data and the assigned labels is maximized. In fact, we try to find a mapping that is as discriminative as possible. The approach uses the Hilbert-Schmidt Independence Criterion (HSIC) as the dependence measure. We also present a kernelized version of the algorithm, which allows non-linear transformations and provides more flexibility in finding the appropriate mapping. Use of unlabeled data for learning a new representation is not always beneficial, and there is no algorithm that can deterministically guarantee the improvement of the performance by exploiting unlabeled data. Therefore, we also propose a bound on the performance of the algorithm, which can be used to determine the effectiveness of using the unlabeled data in the algorithm. We demonstrate the ability of the algorithm in finding the transformation using both toy examples and real-world datasets. |
Tasks | Representation Learning |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.03072v3, http://arxiv.org/pdf/1605.03072v3.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-representation-learning-based |
Repo | |
Framework | |
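The dependence measure named in the abstract above, HSIC, has a simple empirical form: tr(KHLH) / (n - 1)^2, where K and L are kernel matrices on the data and the labels and H is the centering matrix. The sketch below computes this biased estimate with linear kernels in plain Python. It only illustrates the measure, not the paper's learning algorithm, and the helper names are assumptions for this example.

```python
def linear_kernel(rows):
    """Gram matrix of inner products between all pairs of rows."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def center(K):
    """Double-center a square matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(X, Y):
    """Biased empirical HSIC estimate with linear kernels on X and Y."""
    n = len(X)
    Kc = center(linear_kernel(X))
    L = linear_kernel(Y)
    # tr(K H L H) equals the elementwise sum of (H K H) * L for symmetric L.
    trace = sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


X = [[0.0], [1.0], [2.0], [3.0]]
dependent = hsic(X, [[0.0], [1.0], [2.0], [3.0]])    # labels track the data
independent = hsic(X, [[1.0], [1.0], [1.0], [1.0]])  # constant labels
```

A larger value indicates stronger dependence between the two sets; constant labels carry no information about the data, so the estimate is zero up to round-off.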
Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery
Title | Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery |
Authors | Alina Marcu, Marius Leordeanu |
Abstract | Visual context is important in object recognition and it is still an open problem in computer vision. Along with the advent of deep convolutional neural networks (CNN), using contextual information with such systems starts to receive attention in the literature. At the same time, aerial imagery is gaining momentum. While advances in deep learning make good progress in aerial image analysis, this problem still poses many great challenges. Aerial images are often taken under poor lighting conditions and contain low resolution objects, many times occluded by trees or taller buildings. In this domain, in particular, visual context could be of great help, but there are still very few papers that consider context in aerial image understanding. Here we introduce context as a complementary way of recognizing objects. We propose a dual-stream deep neural network model that processes information along two independent pathways, one for local and another for global visual reasoning. The two are later combined in the final layers of processing. Our model learns to combine local object appearance as well as information from the larger scene at the same time and in a complementary way, such that together they form a powerful classifier. We test our dual-stream network on the task of segmentation of buildings and roads in aerial images and obtain state-of-the-art results on the Massachusetts Buildings Dataset. We also introduce two new datasets, for buildings and road segmentation, respectively, and study the relative importance of local appearance vs. the larger scene, as well as their performance in combination. While our local-global model could also be useful in general recognition tasks, we clearly demonstrate the effectiveness of visual context in conjunction with deep nets for aerial image understanding. |
Tasks | Object Recognition, Visual Reasoning |
Published | 2016-05-18 |
URL | http://arxiv.org/abs/1605.05462v1, http://arxiv.org/pdf/1605.05462v1.pdf |
PWC | https://paperswithcode.com/paper/dual-local-global-contextual-pathways-for |
Repo | |
Framework | |
Filling in the details: Perceiving from low fidelity images
Title | Filling in the details: Perceiving from low fidelity images |
Authors | Farahnaz Ahmed Wick, Michael L. Wick, Marc Pomplun |
Abstract | Humans perceive their surroundings in great detail even though most of our visual field is reduced to low-fidelity color-deprived (e.g. dichromatic) input by the retina. In contrast, most deep learning architectures are computationally wasteful in that they consider every part of the input when performing an image processing task. Yet, the human visual system is able to perform visual reasoning despite having only a small fovea of high visual acuity. With this in mind, we wish to understand the extent to which connectionist architectures are able to learn from and reason with low acuity, distorted inputs. Specifically, we train autoencoders to generate full-detail images from low-detail “foveations” of those images and then measure their ability to reconstruct the full-detail images from the foveated versions. By varying the type of foveation, we can study how well the architectures can cope with various types of distortion. We find that the autoencoder compensates for lower detail by learning increasingly global feature functions. In many cases, the learnt features are suitable for reconstructing the original full-detail image. For example, we find that the networks accurately perceive color in the periphery, even when 75% of the input is achromatic. |
Tasks | Foveation, Visual Reasoning |
Published | 2016-04-14 |
URL | http://arxiv.org/abs/1604.04125v1, http://arxiv.org/pdf/1604.04125v1.pdf |
PWC | https://paperswithcode.com/paper/filling-in-the-details-perceiving-from-low |
Repo | |
Framework | |
Tracking multiple moving objects in images using Markov Chain Monte Carlo
Title | Tracking multiple moving objects in images using Markov Chain Monte Carlo |
Authors | Lan Jiang, Sumeetpal S. Singh |
Abstract | A new Bayesian state and parameter learning algorithm for multiple target tracking (MTT) models with image observations is proposed. Specifically, a Markov chain Monte Carlo algorithm is designed to sample from the posterior distribution of the unknown number of targets, their birth and death times, states and model parameters, which constitutes the complete solution to the tracking problem. The conventional approach is to pre-process the images to extract point observations and then perform tracking. We model the image generation process directly to avoid potential loss of information when extracting point observations. Numerical examples show that our algorithm has improved tracking performance over commonly used techniques, for both synthetic examples and real fluorescent microscopy data, especially in the case of dim targets with overlapping illuminated regions. |
Tasks | Image Generation |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05522v1, http://arxiv.org/pdf/1603.05522v1.pdf |
PWC | https://paperswithcode.com/paper/tracking-multiple-moving-objects-in-images |
Repo | |
Framework | |
Structured Convolution Matrices for Energy-efficient Deep learning
Title | Structured Convolution Matrices for Energy-efficient Deep learning |
Authors | Rathinakumar Appuswamy, Tapan Nayak, John Arthur, Steven Esser, Paul Merolla, Jeffrey Mckinstry, Timothy Melano, Myron Flickner, Dharmendra Modha |
Abstract | We derive a relationship between network representation in energy-efficient neuromorphic architectures and block Toeplitz convolutional matrices. Inspired by this connection, we develop deep convolutional networks using a family of structured convolutional matrices and achieve state-of-the-art trade-off between energy efficiency and classification accuracy for well-known image recognition tasks. We also put forward a novel method to train binary convolutional networks by utilising an existing connection between noisy-rectified linear units and binary activations. |
Tasks | |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02407v1, http://arxiv.org/pdf/1606.02407v1.pdf |
PWC | https://paperswithcode.com/paper/structured-convolution-matrices-for-energy |
Repo | |
Framework | |
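The link between convolution and Toeplitz matrices that the abstract above relies on can be seen in one dimension: sliding a filter over a signal gives the same result as multiplying the signal by a matrix that is constant along each diagonal. The sketch below checks this for a small example. It is a 1D simplification of the block Toeplitz case, and all function names are assumptions for illustration.

```python
def toeplitz_matrix(kernel, input_len):
    """Build the matrix whose product with a signal equals sliding `kernel`
    over it ("valid" mode, no padding, no kernel flip as in CNN layers)."""
    k = len(kernel)
    out_len = input_len - k + 1
    rows = []
    for r in range(out_len):
        row = [0.0] * input_len
        for j, w in enumerate(kernel):
            row[r + j] = w  # same weights shifted one column per row
        rows.append(row)
    return rows


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sliding_window(kernel, signal):
    """Direct sliding-window filtering for comparison."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


kernel = [1.0, 0.0, -1.0]
signal = [1.0, 2.0, 4.0, 7.0, 11.0]
via_matrix = matvec(toeplitz_matrix(kernel, len(signal)), signal)
via_window = sliding_window(kernel, signal)
```

Both routes produce the same output, so constraints placed on the matrix structure translate directly into constraints on the convolutional layer.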
Reaching Unanimous Agreements Within Agent-Based Negotiation Teams With Linear and Monotonic Utility Functions
Title | Reaching Unanimous Agreements Within Agent-Based Negotiation Teams With Linear and Monotonic Utility Functions |
Authors | Victor Sanchez-Anguix, Vicente Julian, Vicente Botti, Ana Garcia-Fornes |
Abstract | In this article, an agent-based negotiation model for negotiation teams that negotiate a deal with an opponent is presented. Agent-based negotiation teams are groups of agents that join together as a single negotiation party because they share an interest that is related to the negotiation process. The model relies on a trusted mediator that coordinates and helps team members in the decisions that they have to take during the negotiation process: which offer is sent to the opponent, and whether the offers received from the opponent are accepted. The main strength of the proposed negotiation model is the fact that it guarantees unanimity within team decisions since decisions report a utility to team members that is greater than or equal to their aspiration levels at each negotiation round. This work analyzes how unanimous decisions are taken within the team and the robustness of the model against different types of manipulations. An empirical evaluation is also performed to study the impact of the different parameters of the model. |
Tasks | |
Published | 2016-04-16 |
URL | http://arxiv.org/abs/1604.04728v1, http://arxiv.org/pdf/1604.04728v1.pdf |
PWC | https://paperswithcode.com/paper/reaching-unanimous-agreements-within-agent |
Repo | |
Framework | |
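The unanimity condition described in the abstract above, that a decision gives every team member a utility at least equal to their aspiration level, can be sketched in a few lines. The example assumes linear utility functions expressed as weight vectors over normalized issue values. The function names, weights, and aspiration levels are hypothetical, and the mediator protocol itself is not modelled.

```python
def linear_utility(weights, offer):
    """Linear utility: weighted sum of the (normalized) issue values."""
    return sum(w * v for w, v in zip(weights, offer))


def unanimously_acceptable(team_weights, aspirations, offer):
    """True if the offer meets or exceeds every member's aspiration level."""
    return all(linear_utility(weights, offer) >= aspiration
               for weights, aspiration in zip(team_weights, aspirations))


# Hypothetical two-member team negotiating over two issues.
team_weights = [[0.5, 0.5], [0.8, 0.2]]
aspirations = [0.6, 0.5]
balanced_ok = unanimously_acceptable(team_weights, aspirations, [0.7, 0.7])
skewed_ok = unanimously_acceptable(team_weights, aspirations, [0.2, 1.0])
```

The balanced offer satisfies both members, while the skewed one falls below the second member's aspiration level and would therefore not be a unanimous team decision.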
Scale-free network optimization: foundations and algorithms
Title | Scale-free network optimization: foundations and algorithms |
Authors | Patrick Rebeschini, Sekhar Tatikonda |
Abstract | We investigate the fundamental principles that drive the development of scalable algorithms for network optimization. Despite the significant amount of work on parallel and decentralized algorithms in the optimization community, the methods that have been proposed typically rely on strict separability assumptions for the objective function and constraints. Besides sparsity, these methods typically do not exploit the strength of the interaction between variables in the system. We propose a notion of correlation in constrained optimization that is based on the sensitivity of the optimal solution upon perturbations of the constraints. We develop a general theory of sensitivity of optimizers that extends beyond the infinitesimal setting. We present instances in network optimization where the correlation decays exponentially fast with respect to the natural distance in the network, and we design algorithms that can exploit this decay to yield dimension-free optimization. Our results are the first of their kind, and open new possibilities in the theory of local algorithms. |
Tasks | |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.04227v1, http://arxiv.org/pdf/1602.04227v1.pdf |
PWC | https://paperswithcode.com/paper/scale-free-network-optimization-foundations |
Repo | |
Framework | |
The optimality of coarse categories in decision-making and information storage
Title | The optimality of coarse categories in decision-making and information storage |
Authors | Michael Mandler |
Abstract | An agent who lacks preferences and instead makes decisions using criteria that are costly to create should select efficient sets of criteria, where the cost of making a given number of choice distinctions is minimized. Under mild conditions, efficiency requires that binary criteria with only two categories per criterion are chosen. When applied to the problem of determining the optimal number of digits in an information storage device, this result implies that binary digits (bits) are the efficient solution, even when the marginal cost of using additional digits declines rapidly to 0. This short paper pays particular attention to the symmetry conditions entailed when sets of criteria are efficient. |
Tasks | Decision Making |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07529v1, http://arxiv.org/pdf/1606.07529v1.pdf |
PWC | https://paperswithcode.com/paper/the-optimality-of-coarse-categories-in |
Repo | |
Framework | |
Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Title | Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models |
Authors | Pranay Dighe, Afsaneh Asaei, Herve Bourlard |
Abstract | Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond to context-dependent tied states or senones. The present work addresses some limitations of GMM-HMM senone alignments for DNN training. We hypothesize that the senone probabilities obtained from a DNN trained with binary labels can provide more accurate targets to learn better acoustic models. However, DNN outputs bear inaccuracies which are exhibited as high-dimensional unstructured noise, whereas the informative components are structured and low-dimensional. We exploit principal component analysis (PCA) and sparse coding to characterize the senone subspaces. Enhanced probabilities obtained from low-rank and sparse reconstructions are used as soft targets for DNN acoustic modeling, which also enables training with untranscribed data. Experiments conducted on the AMI corpus show a 4.6% relative reduction in word error rate. |
Tasks | Speech Recognition |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05688v1, http://arxiv.org/pdf/1610.05688v1.pdf |
PWC | https://paperswithcode.com/paper/low-rank-and-sparse-soft-targets-to-learn |
Repo | |
Framework | |
Single Image 3D Interpreter Network
Title | Single Image 3D Interpreter Network |
Authors | Jiajun Wu, Tianfan Xue, Joseph J. Lim, Yuandong Tian, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman |
Abstract | Understanding 3D object structure from a single image is an important but difficult task in computer vision, mostly due to the lack of 3D object annotations in real images. Previous work tackles this problem by either solving an optimization task given 2D keypoint positions, or training on synthetic data with ground truth 3D information. In this work, we propose 3D INterpreter Network (3D-INN), an end-to-end framework which sequentially estimates 2D keypoint heatmaps and 3D object structure, trained on both real 2D-annotated images and synthetic 3D data. This is made possible mainly by two technical innovations. First, we propose a Projection Layer, which projects estimated 3D structure to 2D space, so that 3D-INN can be trained to predict 3D structural parameters supervised by 2D annotations on real images. Second, heatmaps of keypoints serve as an intermediate representation connecting real and synthetic data, enabling 3D-INN to benefit from the variation and abundance of synthetic 3D objects, without suffering from the difference between the statistics of real and synthesized images due to imperfect rendering. The network achieves state-of-the-art performance on both 2D keypoint estimation and 3D structure recovery. We also show that the recovered 3D information can be used in other vision applications, such as 3D rendering and image retrieval. |
Tasks | Image Retrieval |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08685v2, http://arxiv.org/pdf/1604.08685v2.pdf |
PWC | https://paperswithcode.com/paper/single-image-3d-interpreter-network |
Repo | |
Framework | |