May 6, 2019


Paper Group ANR 218

Crafting GBD-Net for Object Detection. A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade. A Probabilistic Framework for Deep Learning. A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data. StruClus: Structural Clustering of Large-Scale Graph Databases. CIFAR-10: KNN-based Ensembl …

Crafting GBD-Net for Object Detection


Title	Crafting GBD-Net for Object Detection
Authors	Xingyu Zeng, Wanli Ouyang, Junjie Yan, Hongsheng Li, Tong Xiao, Kun Wang, Yu Liu, Yucong Zhou, Bin Yang, Zhe Wang, Hui Zhou, Xiaogang Wang
Abstract	The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but dependent on individual samples. Gated functions are therefore needed to control message transmission, whose on-or-offs are controlled by extra visual evidence from the input sample. The effectiveness of GBD-Net is shown through experiments on three object detection datasets, ImageNet, Pascal VOC2007 and Microsoft COCO. This paper also shows the details of our approach in winning the ImageNet object detection challenge of 2016, with source code provided at https://github.com/craftGBD/craftGBD.
Tasks	Object Detection
Published	2016-10-08
URL	http://arxiv.org/abs/1610.02579v1
PDF	http://arxiv.org/pdf/1610.02579v1.pdf
PWC	https://paperswithcode.com/paper/crafting-gbd-net-for-object-detection
Repo
Framework
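
The gated message passing described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: each support region is reduced to a single float, the convolutions are replaced by a hypothetical scalar weight `w_msg`, and the gate parameters `w_gate` and `b_gate` are likewise assumptions made only for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_bidirectional_pass(features, w_msg=0.5, w_gate=1.0, b_gate=0.0):
    """Pass gated messages between neighboring support regions in two directions.

    features: one float per support region, ordered from smallest to largest.
    w_msg, w_gate, b_gate: hypothetical scalars standing in for learned weights.
    """
    n = len(features)

    # Direction 1: messages flow from smaller regions to larger ones.
    fwd = [0.0] * n
    for i in range(n):
        fwd[i] = features[i]
        if i > 0:
            # The gate depends on the sender's input, so it can switch
            # the message on or off per sample.
            gate = sigmoid(w_gate * features[i - 1] + b_gate)
            fwd[i] += gate * w_msg * fwd[i - 1]
        fwd[i] = max(fwd[i], 0.0)  # ReLU nonlinearity

    # Direction 2: messages flow from larger regions to smaller ones.
    bwd = [0.0] * n
    for i in range(n - 1, -1, -1):
        bwd[i] = features[i]
        if i < n - 1:
            gate = sigmoid(w_gate * features[i + 1] + b_gate)
            bwd[i] += gate * w_msg * bwd[i + 1]
        bwd[i] = max(bwd[i], 0.0)

    # Merge both directions into one feature per region.
    return [f + b for f, b in zip(fwd, bwd)]
```

With a strongly negative `b_gate` the gates close and every region keeps only its own feature (counted once per direction); with a strongly positive `b_gate` the messages pass almost unchanged.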

A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade


Title	A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade
Authors	Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Daria S. Kopaliani
Abstract	A new architecture and learning algorithms for the multidimensional hybrid cascade neural network with neuron pool optimization in each cascade are proposed in this paper. The proposed system differs from the well-known cascade systems in its capability to process multidimensional time series in an online mode, which makes it possible to process non-stationary stochastic and chaotic signals with the required accuracy. Compared to conventional analogs, the proposed system provides computational simplicity and possesses both tracking and filtering capabilities.
Tasks	Time Series
Published	2016-10-20
URL	http://arxiv.org/abs/1610.06485v1
PDF	http://arxiv.org/pdf/1610.06485v1.pdf
PWC	https://paperswithcode.com/paper/a-multidimensional-cascade-neuro-fuzzy-system
Repo
Framework

A Probabilistic Framework for Deep Learning


Title	A Probabilistic Framework for Deep Learning
Authors	Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk
Abstract	We develop a probabilistic framework for deep learning based on the Deep Rendering Mixture Model (DRMM), a new generative probabilistic model that explicitly captures variations in data due to latent task nuisance variables. We demonstrate that max-sum inference in the DRMM yields an algorithm that exactly reproduces the operations in deep convolutional neural networks (DCNs), providing a first principles derivation. Our framework provides new insights into the successes and shortcomings of DCNs as well as a principled route to their improvement. DRMM training via the Expectation-Maximization (EM) algorithm is a powerful alternative to DCN back-propagation, and initial training results are promising. Classification based on the DRMM and other variants outperforms DCNs in supervised digit classification, training 2-3x faster while achieving similar accuracy. Moreover, the DRMM is applicable to semi-supervised and unsupervised learning tasks, achieving results that are state-of-the-art in several categories on the MNIST benchmark and comparable to state of the art on the CIFAR10 benchmark.
Tasks
Published	2016-12-06
URL	http://arxiv.org/abs/1612.01936v1
PDF	http://arxiv.org/pdf/1612.01936v1.pdf
PWC	https://paperswithcode.com/paper/a-probabilistic-framework-for-deep-learning
Repo
Framework

A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data


Title	A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data
Authors	Hussein A. Abbass, George Leu, Kathryn Merrick
Abstract	Despite the advances made in artificial intelligence, software agents, and robotics, there is little we see today that we can truly call a fully autonomous system. We conjecture that the main inhibitor for advancing autonomy is lack of trust. Trusted autonomy is the scientific and engineering field to establish the foundations and ground work for developing trusted autonomous systems (robotics and software agents) that can be used in our daily life, and can be integrated with humans seamlessly, naturally and efficiently. In this paper, we review this literature to reveal opportunities for researchers and practitioners to work on topics that can create a leap forward in advancing the field of trusted autonomy. We focus the paper on the 'trust' component as the uniting technology between humans and machines. Our inquiry into this topic revolves around three sub-topics: (1) reviewing and positioning the trust modelling literature for the purpose of trusted autonomy; (2) reviewing a critical subset of sensor technologies that allow a machine to sense human states; and (3) distilling some critical questions for advancing the field of trusted autonomy. The inquiry is augmented with conceptual models that we propose along the way by recompiling and reshaping the literature into forms that enable trusted autonomous systems to become a reality. The paper offers a vision for a Trusted Cyborg Swarm, an extension of our previous Cognitive Cyber Symbiosis concept, whereby humans and machines meld together in a harmonious, seamless, and coordinated manner.
Tasks
Published	2016-03-16
URL	http://arxiv.org/abs/1604.00921v1
PDF	http://arxiv.org/pdf/1604.00921v1.pdf
PWC	https://paperswithcode.com/paper/a-review-of-theoretical-and-practical
Repo
Framework

StruClus: Structural Clustering of Large-Scale Graph Databases


Title	StruClus: Structural Clustering of Large-Scale Graph Databases
Authors	Till Schäfer, Petra Mutzel
Abstract	We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the clustering process, and helps to interpret the clustering results. The projection-based nature of the clustering approach allows us to bypass dimensionality and feature extraction problems that arise in the context of graph datasets reduced to pairwise distances or feature vectors. While achieving high-quality and human-interpretable clusterings, the runtime of the algorithm only grows linearly with the number of graphs. Furthermore, the approach is easy to parallelize and therefore suitable for very large datasets. Our extensive experimental evaluation on synthetic and real world datasets demonstrates the superiority of our approach over existing structural and subspace clustering algorithms, both from a runtime and a quality point of view.
Tasks
Published	2016-09-28
URL	http://arxiv.org/abs/1609.09000v1
PDF	http://arxiv.org/pdf/1609.09000v1.pdf
PWC	https://paperswithcode.com/paper/struclus-structural-clustering-of-large-scale
Repo
Framework

CIFAR-10: KNN-based Ensemble of Classifiers


Title	CIFAR-10: KNN-based Ensemble of Classifiers
Authors	Yehya Abouelnaga, Ola S. Ali, Hager Rady, Mohamed Moustafa
Abstract	In this paper, we study the performance of different classifiers on the CIFAR-10 dataset, and build an ensemble of classifiers to reach a better performance. We show that, on CIFAR-10, K-Nearest Neighbors (KNN) and Convolutional Neural Network (CNN), on some classes, are mutually exclusive, and thus yield higher accuracy when combined. We reduce KNN overfitting using Principal Component Analysis (PCA), and ensemble it with a CNN to increase its accuracy. Our approach improves our best CNN model from 93.33% to 94.03%.
Tasks
Published	2016-11-15
URL	http://arxiv.org/abs/1611.04905v1
PDF	http://arxiv.org/pdf/1611.04905v1.pdf
PWC	https://paperswithcode.com/paper/cifar-10-knn-based-ensemble-of-classifiers
Repo
Framework
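
One way to picture a KNN and CNN ensemble is a weighted average of the two classifiers' class-probability vectors. The abstract does not state the exact combination rule, so the averaging scheme, the function names, and the `knn_weight` value below are all assumptions for illustration only.

```python
def knn_class_probs(neighbor_labels, num_classes):
    """Turn the labels of the k nearest neighbors into a vote-share vector."""
    counts = [0] * num_classes
    for label in neighbor_labels:
        counts[label] += 1
    k = len(neighbor_labels)
    return [c / k for c in counts]


def ensemble_predict(cnn_probs, knn_probs, knn_weight=0.3):
    """Weighted average of two probability vectors; returns (label, probs).

    knn_weight is a hypothetical mixing weight, not a value from the paper.
    """
    combined = [
        (1.0 - knn_weight) * c + knn_weight * k
        for c, k in zip(cnn_probs, knn_probs)
    ]
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined


# Hypothetical example: the CNN slightly prefers class 0,
# but four of five nearest neighbors vote for class 1.
cnn_probs = [0.45, 0.40, 0.15]
knn_probs = knn_class_probs([1, 1, 1, 0, 1], num_classes=3)
label, combined = ensemble_predict(cnn_probs, knn_probs)
```

In this made-up case the ensemble picks class 1 although the CNN alone would pick class 0, which is the kind of complementary behavior the abstract refers to.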

Semi-supervised deep learning by metric embedding


Title	Semi-supervised deep learning by metric embedding
Authors	Elad Hoffer, Nir Ailon
Abstract	Deep networks are successfully used as classification models yielding state-of-the-art results when trained on a large number of labeled samples. These models, however, are usually much less suited for semi-supervised problems because of their tendency to overfit easily when trained on small amounts of data. In this work we explore a new training objective that targets a semi-supervised regime with only a small subset of labeled data. This criterion is based on a deep metric embedding over distance relations within the set of labeled samples, together with constraints over the embeddings of the unlabeled set. The final learned representations are discriminative in Euclidean space, and hence can be used with subsequent nearest-neighbor classification using the labeled samples.
Tasks
Published	2016-11-04
URL	http://arxiv.org/abs/1611.01449v2
PDF	http://arxiv.org/pdf/1611.01449v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-deep-learning-by-metric
Repo
Framework
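
The final step mentioned in the abstract, nearest-neighbor classification in Euclidean embedding space, can be sketched as follows. The embeddings here are made-up 2-D points; in practice they would come from the trained network, and the function name is hypothetical.

```python
import math


def nearest_neighbor_label(query, labeled_embeddings, labels):
    """Return the label of the labeled embedding closest to the query (1-NN)."""
    best = min(
        range(len(labeled_embeddings)),
        key=lambda i: math.dist(query, labeled_embeddings[i]),
    )
    return labels[best]


# Hypothetical embeddings of three labeled samples.
labeled_embeddings = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
labels = ["cat", "cat", "dog"]

prediction_a = nearest_neighbor_label([4.2, 4.8], labeled_embeddings, labels)
prediction_b = nearest_neighbor_label([0.4, 0.1], labeled_embeddings, labels)
```

This only covers inference; the training objective over labeled and unlabeled embeddings is not reproduced here.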

Finding the Topic of a Set of Images


Title	Finding the Topic of a Set of Images
Authors	Gonzalo Vaca-Castano
Abstract	In this paper we introduce the problem of determining the topic that a set of images is describing, where every topic is represented as a set of words. Unlike related problems such as tag assignment, a) we assume multiple images are used as input instead of a single image, b) input images are typically not visually related, c) input images are not necessarily semantically close, and d) the output word space is unconstrained. In our proposed solution, visual information of each query image is used to retrieve similar images with text labels (tags) from an image database. We consider a scenario where the tags are very noisy and diverse, given that they were obtained by implicit crowd-sourcing in a database of 1 million images and over seventy-seven thousand tags. The words or tags associated with each query are processed jointly in a word selection algorithm using random walks that refines the search topic, rejecting words that are not part of the topic and producing a set of words that fairly describes the topic. Experiments on a dataset of 300 topics, with up to twenty images per topic, show that our algorithm performs better than the proposed baseline for any number of query images. We also present a new Conditional Random Field (CRF) word mapping algorithm that preserves the semantic similarity of the mapped words, increasing the performance of the results over the baseline.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2016-06-25
URL	http://arxiv.org/abs/1606.07921v1
PDF	http://arxiv.org/pdf/1606.07921v1.pdf
PWC	https://paperswithcode.com/paper/finding-the-topic-of-a-set-of-images
Repo
Framework
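
The abstract mentions word selection by random walks without giving details. The sketch below is one possible reading, not the paper's algorithm: it assumes a co-occurrence graph built from the tag lists retrieved for each query image and ranks words with a random walk with uniform restart (a PageRank-style power iteration). The function name, `damping`, and `iterations` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rank_words_by_random_walk(tag_lists, damping=0.85, iterations=50):
    """Rank words by a random walk on a tag co-occurrence graph.

    tag_lists: one list of tags per query image (hypothetical input format).
    Returns all words sorted from highest to lowest walk score.
    """
    words = sorted({w for tags in tag_lists for w in tags})
    if not words:
        return []

    # Edge weight = number of tag lists in which two words co-occur.
    weights = defaultdict(lambda: defaultdict(float))
    for tags in tag_lists:
        for a, b in combinations(sorted(set(tags)), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0

    n = len(words)
    scores = {w: 1.0 / n for w in words}
    for _ in range(iterations):
        # Restart mass is spread uniformly over all words.
        new_scores = {w: (1.0 - damping) / n for w in words}
        for w in words:
            total = sum(weights[w].values())
            if total == 0.0:
                # Isolated word: spread its mass uniformly.
                for v in words:
                    new_scores[v] += damping * scores[w] / n
            else:
                for v, wt in weights[w].items():
                    new_scores[v] += damping * scores[w] * wt / total
        scores = new_scores

    return sorted(words, key=lambda w: (-scores[w], w))


# Hypothetical tags retrieved for three query images.
ranked = rank_words_by_random_walk(
    [["beach", "sea", "sand"], ["sea", "beach", "car"], ["sea", "sun", "beach"]]
)
```

Words that co-occur across many query images accumulate the most walk mass, so off-topic tags such as "car" fall toward the bottom of the ranking and can be rejected with a cutoff.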

Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation


Title	Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation
Authors	Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu
Abstract	We study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL). Fingerspelling comprises a significant but relatively understudied part of ASL. Recognizing fingerspelling is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected. In this work we collect and annotate a new data set of continuous fingerspelling videos, compare several types of recognizers, and explore the problem of signer variation. Our best-performing models are segmental (semi-Markov) conditional random fields using deep neural network-based features. In the signer-dependent setting, our recognizers achieve up to about 92% letter accuracy. The multi-signer setting is much more challenging, but with neural network adaptation we achieve up to 83% letter accuracies in this setting.
Tasks
Published	2016-09-26
URL	http://arxiv.org/abs/1609.07876v1
PDF	http://arxiv.org/pdf/1609.07876v1.pdf
PWC	https://paperswithcode.com/paper/lexicon-free-fingerspelling-recognition-from
Repo
Framework

Invariant feature extraction from event based stimuli


Title	Invariant feature extraction from event based stimuli
Authors	Thusitha N. Chandrapala, Bertram E. Shi
Abstract	We propose a novel architecture, the event-based GASSOM, for learning and extracting invariant representations from event streams originating from neuromorphic vision sensors. The framework is inspired by feed-forward cortical models for visual processing. The model, which is based on the concepts of sparsity and temporal slowness, is able to learn feature extractors that resemble neurons in the primary visual cortex. Layers of units in the proposed model can be cascaded to learn feature extractors with different levels of complexity and selectivity. We explore the applicability of the framework on real world tasks by using the learned network for object recognition. The proposed model achieves higher classification accuracy compared to other state-of-the-art event based processing methods. Our results also demonstrate the generality and robustness of the method, as the recognizers for different data sets and different tasks all used the same set of learned feature detectors, which were trained on data collected independently of the testing data.
Tasks	Object Recognition
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04327v3
PDF	http://arxiv.org/pdf/1604.04327v3.pdf
PWC	https://paperswithcode.com/paper/invariant-feature-extraction-from-event-based
Repo
Framework

Single Image Restoration for Participating Media Based on Prior Fusion


Title	Single Image Restoration for Participating Media Based on Prior Fusion
Authors	Joel D. O. Gaya, Felipe Codevilla, Amanda C. Duarte, Paulo L. Drews-Jr, Silvia S. Botelho
Abstract	This paper describes a method to restore degraded images captured in participating media: fog, turbid water, sand storm, etc. Unlike related work that deals with only one medium, we obtain generality by using an image formation model and a fusion of new image priors. The model considers the image color variation produced by the medium. The proposed restoration method is based on the fusion of these priors and supported by statistics collected on images acquired in both non-participating and participating media. The key to the method is to fuse two complementary measures: local contrast and color data. The obtained results on underwater and foggy images demonstrate the capabilities of the proposed method. Moreover, we evaluated our method using a special dataset for which a ground-truth image is available.
Tasks	Image Restoration
Published	2016-03-06
URL	http://arxiv.org/abs/1603.01864v2
PDF	http://arxiv.org/pdf/1603.01864v2.pdf
PWC	https://paperswithcode.com/paper/single-image-restoration-for-participating
Repo
Framework

Fast Methods for Recovering Sparse Parameters in Linear Low Rank Models


Title	Fast Methods for Recovering Sparse Parameters in Linear Low Rank Models
Authors	Ashkan Esmaeili, Arash Amini, Farokh Marvasti
Abstract	In this paper, we investigate the recovery of a sparse weight vector (parameters vector) from a set of noisy linear combinations. However, only partial information about the matrix representing the linear combinations is available. Assuming a low-rank structure for the matrix, one natural solution would be to first apply a matrix completion on the data, and then to solve the resulting compressed sensing problem. In big data applications such as massive MIMO and medical data, the matrix completion step imposes a huge computational burden. Here, we propose to reduce the computational cost of the completion task by ignoring the columns corresponding to zero elements in the sparse vector. To this end, we employ a technique to initially approximate the support of the sparse vector. We further propose to unify the partial matrix completion and sparse vector recovery into an augmented four-step problem. Simulation results reveal that the augmented approach achieves the best performance, while both proposed methods outperform the natural two-step technique with substantially lower computational requirements.
Tasks	Matrix Completion
Published	2016-06-26
URL	http://arxiv.org/abs/1606.08009v2
PDF	http://arxiv.org/pdf/1606.08009v2.pdf
PWC	https://paperswithcode.com/paper/fast-methods-for-recovering-sparse-parameters
Repo
Framework

Contextual Symmetries in Probabilistic Graphical Models


Title	Contextual Symmetries in Probabilistic Graphical Models
Authors	Ankit Anand, Aditya Grover, Mausam, Parag Singla
Abstract	An important approach for efficient inference in probabilistic graphical models exploits symmetries among objects in the domain. Symmetric variables (states) are collapsed into meta-variables (meta-states) and inference algorithms are run over the lifted graphical model instead of the flat one. Our paper extends existing definitions of symmetry by introducing the novel notion of contextual symmetry. Two states that are not globally symmetric can be contextually symmetric under some specific assignment to a subset of variables, referred to as the context variables. Contextual symmetry subsumes previous symmetry definitions and can represent a large class of symmetries not representable earlier. We show how to compute contextual symmetries by reducing the task to the problem of graph isomorphism. We extend previous work on exploiting symmetries in the MCMC framework to the case of contextual symmetries. Our experiments on several domains of interest demonstrate that exploiting contextual symmetries can result in significant computational gains.
Tasks
Published	2016-06-30
URL	http://arxiv.org/abs/1606.09594v1
PDF	http://arxiv.org/pdf/1606.09594v1.pdf
PWC	https://paperswithcode.com/paper/contextual-symmetries-in-probabilistic
Repo
Framework

Tracking Human-like Natural Motion Using Deep Recurrent Neural Networks


Title	Tracking Human-like Natural Motion Using Deep Recurrent Neural Networks
Authors	Youngbin Park, Sungphill Moon, Il Hong Suh
Abstract	The Kinect skeleton tracker is able to achieve considerable human body tracking performance in a convenient and low-cost manner. However, the tracker often captures unnatural human poses such as discontinuous and vibrated motions when self-occlusions occur. A majority of approaches tackle this problem by using multiple Kinect sensors in a workspace. The measurements from different sensors are then combined in a Kalman filter framework, or an optimization problem is formulated for sensor fusion. However, these methods usually require heuristics to measure the reliability of measurements observed from each Kinect sensor. In this paper, we developed a method to improve the Kinect skeleton using a single Kinect sensor, in which a supervised learning technique was employed to correct unnatural tracking motions. Specifically, deep recurrent neural networks were used for improving joint positions and velocities of the Kinect skeleton, and three methods were proposed to integrate the refined positions and velocities for further enhancement. Moreover, we suggested a novel measure to evaluate the naturalness of captured motions. We evaluated the proposed approach by comparison with the ground truth obtained using a commercial optical marker-based motion capture system.
Tasks	Motion Capture, Sensor Fusion
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04528v1
PDF	http://arxiv.org/pdf/1604.04528v1.pdf
PWC	https://paperswithcode.com/paper/tracking-human-like-natural-motion-using-deep
Repo
Framework

DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding


Title	DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
Authors	Yinda Zhang, Mingru Bai, Pushmeet Kohli, Shahram Izadi, Jianxiong Xiao
Abstract	While deep neural networks have led to human-level performance on computer vision tasks, they have yet to demonstrate similar gains for holistic scene understanding. In particular, 3D context has been shown to be an extremely important cue for scene understanding, yet very little research has been done on integrating context information with deep models. This paper presents an approach to embed 3D context into the topology of a neural network trained to perform holistic scene understanding. Given a depth image depicting a 3D scene, our network aligns the observed scene with a predefined 3D scene template, and then reasons about the existence and location of each object within the scene template. In doing so, our model recognizes multiple objects in a single forward pass of a 3D convolutional neural network, capturing both global scene and local object information simultaneously. To create training data for this 3D network, we generate partly hallucinated depth images which are rendered by replacing real objects with a repository of CAD models of the same object category. Extensive experiments demonstrate the effectiveness of our algorithm compared to the state of the art. Source code and data are available at http://deepcontext.cs.princeton.edu.
Tasks	Scene Understanding
Published	2016-03-16
URL	http://arxiv.org/abs/1603.04922v4
PDF	http://arxiv.org/pdf/1603.04922v4.pdf
PWC	https://paperswithcode.com/paper/deepcontext-context-encoding-neural-pathways
Repo
Framework