Paper Group ANR 218
Crafting GBD-Net for Object Detection. A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade. A Probabilistic Framework for Deep Learning. A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data. StruClus: Structural Clustering of Large-Scale Graph Databases. CIFAR-10: KNN-based Ensembl …
Crafting GBD-Net for Object Detection
Title | Crafting GBD-Net for Object Detection |
Authors | Xingyu Zeng, Wanli Ouyang, Junjie Yan, Hongsheng Li, Tong Xiao, Kun Wang, Yu Liu, Yucong Zhou, Bin Yang, Zhe Wang, Hui Zhou, Xiaogang Wang |
Abstract | The visual cues from multiple support regions of different sizes and resolutions are complementary in classifying a candidate box in object detection. Effective integration of local and contextual visual cues from these regions has become a fundamental problem in object detection. In this paper, we propose a gated bi-directional CNN (GBD-Net) to pass messages among features from different support regions during both feature learning and feature extraction. Such message passing can be implemented through convolution between neighboring support regions in two directions and can be conducted in various layers. Therefore, local and contextual visual patterns can validate the existence of each other by learning their nonlinear relationships, and their close interactions are modeled in a more complex way. It is also shown that message passing is not always helpful but depends on the individual sample. Gated functions are therefore needed to control message transmission, whose on-or-off states are determined by extra visual evidence from the input sample. The effectiveness of GBD-Net is shown through experiments on three object detection datasets: ImageNet, PASCAL VOC2007, and Microsoft COCO. This paper also details our approach to winning the 2016 ImageNet object detection challenge, with source code provided at \url{https://github.com/craftGBD/craftGBD}. |
Tasks | Object Detection |
Published | 2016-10-08 |
URL | http://arxiv.org/abs/1610.02579v1 |
http://arxiv.org/pdf/1610.02579v1.pdf | |
PWC | https://paperswithcode.com/paper/crafting-gbd-net-for-object-detection |
Repo | |
Framework | |
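A minimal sketch of the gating idea from the abstract above: a message passed from one support region's features to another is scaled by a sample-dependent sigmoid gate, so unhelpful messages can be switched off per sample. The vector shapes, the ReLU message transform, and all weights below are illustrative assumptions, not the paper's actual architecture (which passes messages between convolutional feature maps):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_message(h_src, h_dst, w_msg, w_gate, b_gate):
    """One gated message pass from a source support region's feature
    vector to a destination region's, in the spirit of GBD-Net.

    The gate is a scalar in (0, 1) computed from the source features,
    so the message can be suppressed for individual samples."""
    gate = sigmoid(h_src @ w_gate + b_gate)   # sample-dependent on/off
    message = np.maximum(h_src @ w_msg, 0.0)  # ReLU-transformed message
    return h_dst + gate * message             # gated residual update

rng = np.random.default_rng(0)
d = 8
h_small = rng.normal(size=d)            # features from a tight support region
h_large = rng.normal(size=d)            # features from a larger context region
w_msg = rng.normal(size=(d, d)) * 0.1
w_gate = rng.normal(size=d) * 0.1

h_large_updated = gated_message(h_small, h_large, w_msg, w_gate, b_gate=0.0)
print(h_large_updated.shape)
```

In the bi-directional setting the same update would also run from the larger region back to the smaller one, with separate weights per direction.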
A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade
Title | A Multidimensional Cascade Neuro-Fuzzy System with Neuron Pool Optimization in Each Cascade |
Authors | Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Daria S. Kopaliani |
Abstract | A new architecture and learning algorithms for the multidimensional hybrid cascade neural network with neuron pool optimization in each cascade are proposed in this paper. The proposed system differs from the well-known cascade systems in its capability to process multidimensional time series in an online mode, which makes it possible to process non-stationary stochastic and chaotic signals with the required accuracy. Compared to conventional analogs, the proposed system provides computational simplicity and possesses both tracking and filtering capabilities. |
Tasks | Time Series |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06485v1 |
http://arxiv.org/pdf/1610.06485v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multidimensional-cascade-neuro-fuzzy-system |
Repo | |
Framework | |
A Probabilistic Framework for Deep Learning
Title | A Probabilistic Framework for Deep Learning |
Authors | Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk |
Abstract | We develop a probabilistic framework for deep learning based on the Deep Rendering Mixture Model (DRMM), a new generative probabilistic model that explicitly captures variations in data due to latent task nuisance variables. We demonstrate that max-sum inference in the DRMM yields an algorithm that exactly reproduces the operations in deep convolutional neural networks (DCNs), providing a first-principles derivation. Our framework provides new insights into the successes and shortcomings of DCNs as well as a principled route to their improvement. DRMM training via the Expectation-Maximization (EM) algorithm is a powerful alternative to DCN back-propagation, and initial training results are promising. Classification based on the DRMM and other variants outperforms DCNs in supervised digit classification, training 2-3x faster while achieving similar accuracy. Moreover, the DRMM is applicable to semi-supervised and unsupervised learning tasks, achieving results that are state of the art in several categories on the MNIST benchmark and comparable to the state of the art on the CIFAR-10 benchmark. |
Tasks | |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01936v1 |
http://arxiv.org/pdf/1612.01936v1.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-framework-for-deep-learning |
Repo | |
Framework | |
A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data
Title | A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data |
Authors | Hussein A. Abbass, George Leu, Kathryn Merrick |
Abstract | Despite the advances made in artificial intelligence, software agents, and robotics, there is little we see today that we can truly call a fully autonomous system. We conjecture that the main inhibitor for advancing autonomy is lack of trust. Trusted autonomy is the scientific and engineering field that establishes the foundations and groundwork for developing trusted autonomous systems (robotics and software agents) that can be used in our daily life, and can be integrated with humans seamlessly, naturally, and efficiently. In this paper, we review this literature to reveal opportunities for researchers and practitioners to work on topics that can create a leap forward in advancing the field of trusted autonomy. We focus the paper on the 'trust' component as the uniting technology between humans and machines. Our inquiry into this topic revolves around three sub-topics: (1) reviewing and positioning the trust modelling literature for the purpose of trusted autonomy; (2) reviewing a critical subset of sensor technologies that allow a machine to sense human states; and (3) distilling some critical questions for advancing the field of trusted autonomy. The inquiry is augmented with conceptual models that we propose along the way by recompiling and reshaping the literature into forms that enable trusted autonomous systems to become a reality. The paper offers a vision for a Trusted Cyborg Swarm, an extension of our previous Cognitive Cyber Symbiosis concept, whereby humans and machines meld together in a harmonious, seamless, and coordinated manner. |
Tasks | |
Published | 2016-03-16 |
URL | http://arxiv.org/abs/1604.00921v1 |
http://arxiv.org/pdf/1604.00921v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-theoretical-and-practical |
Repo | |
Framework | |
StruClus: Structural Clustering of Large-Scale Graph Databases
Title | StruClus: Structural Clustering of Large-Scale Graph Databases |
Authors | Till Schäfer, Petra Mutzel |
Abstract | We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the clustering process, and helps to interpret the clustering results. The projection-based nature of the clustering approach allows us to bypass dimensionality and feature extraction problems that arise in the context of graph datasets reduced to pairwise distances or feature vectors. While achieving high quality and (human) interpretable clusterings, the runtime of the algorithm only grows linearly with the number of graphs. Furthermore, the approach is easy to parallelize and therefore suitable for very large datasets. Our extensive experimental evaluation on synthetic and real world datasets demonstrates the superiority of our approach over existing structural and subspace clustering algorithms from both a runtime and a quality point of view. |
Tasks | |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.09000v1 |
http://arxiv.org/pdf/1609.09000v1.pdf | |
PWC | https://paperswithcode.com/paper/struclus-structural-clustering-of-large-scale |
Repo | |
Framework | |
CIFAR-10: KNN-based Ensemble of Classifiers
Title | CIFAR-10: KNN-based Ensemble of Classifiers |
Authors | Yehya Abouelnaga, Ola S. Ali, Hager Rady, Mohamed Moustafa |
Abstract | In this paper, we study the performance of different classifiers on the CIFAR-10 dataset and build an ensemble of classifiers to reach a better performance. We show that, on CIFAR-10, K-Nearest Neighbors (KNN) and Convolutional Neural Networks (CNNs) make errors on mutually exclusive sets of samples for some classes, and thus yield higher accuracy when combined. We reduce KNN overfitting using Principal Component Analysis (PCA) and ensemble it with a CNN to increase its accuracy. Our approach improves our best CNN model from 93.33% to 94.03%. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04905v1 |
http://arxiv.org/pdf/1611.04905v1.pdf | |
PWC | https://paperswithcode.com/paper/cifar-10-knn-based-ensemble-of-classifiers |
Repo | |
Framework | |
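The ensemble described above can be sketched as soft voting over class probabilities: KNN on PCA-reduced features, combined with a network's softmax output by a weighted average. Everything below (the toy data, the 0.3/0.7 weighting, and the random stand-in for CNN probabilities) is an illustrative assumption, not the paper's tuned pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for CIFAR-10 feature vectors.
X, y = make_classification(n_samples=600, n_features=50, n_informative=20,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA before KNN reduces dimensionality (and, per the paper, KNN overfitting).
pca = PCA(n_components=10).fit(X_tr)
knn = KNeighborsClassifier(n_neighbors=5).fit(pca.transform(X_tr), y_tr)
p_knn = knn.predict_proba(pca.transform(X_te))

# Stand-in for the CNN's class probabilities; a real pipeline would use the
# softmax output of a trained network instead.
rng = np.random.default_rng(0)
p_cnn = rng.dirichlet(np.ones(4), size=len(y_te))

# Soft-voting ensemble: weighted average of the two probability tables.
p_ens = 0.3 * p_knn + 0.7 * p_cnn
y_pred = p_ens.argmax(axis=1)
print(y_pred.shape)
```

The weights would normally be tuned on a validation split; because both inputs are valid probability distributions, their weighted average is one as well.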
Semi-supervised deep learning by metric embedding
Title | Semi-supervised deep learning by metric embedding |
Authors | Elad Hoffer, Nir Ailon |
Abstract | Deep networks are successfully used as classification models, yielding state-of-the-art results when trained on a large number of labeled samples. These models, however, are usually much less suited for semi-supervised problems because of their tendency to overfit easily when trained on small amounts of data. In this work we explore a new training objective that targets a semi-supervised regime with only a small subset of labeled data. This criterion is based on a deep metric embedding over distance relations within the set of labeled samples, together with constraints over the embeddings of the unlabeled set. The final learned representations are discriminative in Euclidean space, and hence can be used for subsequent nearest-neighbor classification using the labeled samples. |
Tasks | |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.01449v2 |
http://arxiv.org/pdf/1611.01449v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-deep-learning-by-metric |
Repo | |
Framework | |
Finding the Topic of a Set of Images
Title | Finding the Topic of a Set of Images |
Authors | Gonzalo Vaca-Castano |
Abstract | In this paper we introduce the problem of determining the topic that a set of images describes, where every topic is represented as a set of words. Different from other problems such as tag assignment: a) multiple images are used as input instead of a single image; b) the input images are typically not visually related; c) the input images are not necessarily semantically close; and d) the output word space is unconstrained. In our proposed solution, the visual information of each query image is used to retrieve similar images with text labels (tags) from an image database. We consider a scenario where the tags are very noisy and diverse, given that they were obtained by implicit crowd-sourcing in a database of 1 million images and over seventy-seven thousand tags. The words or tags associated with each query are processed jointly by a random-walk-based word selection algorithm that refines the search topic, rejecting words that are not part of the topic and producing a set of words that fairly describe it. Experiments on a dataset of 300 topics, with up to twenty images per topic, show that our algorithm performs better than the proposed baseline for any number of query images. We also present a new Conditional Random Field (CRF) word mapping algorithm that preserves the semantic similarity of the mapped words, increasing the performance of the results over the baseline. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-06-25 |
URL | http://arxiv.org/abs/1606.07921v1 |
http://arxiv.org/pdf/1606.07921v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-the-topic-of-a-set-of-images |
Repo | |
Framework | |
Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation
Title | Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation |
Authors | Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu |
Abstract | We study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL). Fingerspelling comprises a significant but relatively understudied part of ASL. Recognizing fingerspelling is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected. In this work we collect and annotate a new data set of continuous fingerspelling videos, compare several types of recognizers, and explore the problem of signer variation. Our best-performing models are segmental (semi-Markov) conditional random fields using deep neural network-based features. In the signer-dependent setting, our recognizers achieve up to about 92% letter accuracy. The multi-signer setting is much more challenging, but with neural network adaptation we achieve up to 83% letter accuracies in this setting. |
Tasks | |
Published | 2016-09-26 |
URL | http://arxiv.org/abs/1609.07876v1 |
http://arxiv.org/pdf/1609.07876v1.pdf | |
PWC | https://paperswithcode.com/paper/lexicon-free-fingerspelling-recognition-from |
Repo | |
Framework | |
Invariant feature extraction from event based stimuli
Title | Invariant feature extraction from event based stimuli |
Authors | Thusitha N. Chandrapala, Bertram E. Shi |
Abstract | We propose a novel architecture, the event-based GASSOM, for learning and extracting invariant representations from event streams originating from neuromorphic vision sensors. The framework is inspired by feed-forward cortical models for visual processing. The model, which is based on the concepts of sparsity and temporal slowness, is able to learn feature extractors that resemble neurons in the primary visual cortex. Layers of units in the proposed model can be cascaded to learn feature extractors with different levels of complexity and selectivity. We explore the applicability of the framework to real-world tasks by using the learned network for object recognition. The proposed model achieves higher classification accuracy than other state-of-the-art event-based processing methods. Our results also demonstrate the generality and robustness of the method, as the recognizers for different data sets and different tasks all used the same set of learned feature detectors, which were trained on data collected independently of the testing data. |
Tasks | Object Recognition |
Published | 2016-04-15 |
URL | http://arxiv.org/abs/1604.04327v3 |
http://arxiv.org/pdf/1604.04327v3.pdf | |
PWC | https://paperswithcode.com/paper/invariant-feature-extraction-from-event-based |
Repo | |
Framework | |
Single Image Restoration for Participating Media Based on Prior Fusion
Title | Single Image Restoration for Participating Media Based on Prior Fusion |
Authors | Joel D. O. Gaya, Felipe Codevilla, Amanda C. Duarte, Paulo L. Drews-Jr, Silvia S. Botelho |
Abstract | This paper describes a method to restore degraded images captured in participating media such as fog, turbid water, and sand storms. Unlike related work that deals with only a single medium, we obtain generality by using an image formation model and a fusion of new image priors. The model considers the image color variation produced by the medium. The proposed restoration method is based on the fusion of these priors and is supported by statistics collected on images acquired in both non-participating and participating media. The key to the method is fusing two complementary measures: local contrast and color data. The results obtained on underwater and foggy images demonstrate the capabilities of the proposed method. Moreover, we evaluated our method using a special dataset for which a ground-truth image is available. |
Tasks | Image Restoration |
Published | 2016-03-06 |
URL | http://arxiv.org/abs/1603.01864v2 |
http://arxiv.org/pdf/1603.01864v2.pdf | |
PWC | https://paperswithcode.com/paper/single-image-restoration-for-participating |
Repo | |
Framework | |
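A rough sketch of the fusion idea in the abstract above: compute a local-contrast measure and a simple color measure per pixel, then fuse them into one weight map. Both priors below and the normalized-product fusion rule are illustrative assumptions; the paper's actual priors and the statistics supporting them differ:

```python
import numpy as np

def local_contrast(gray, k=3):
    """Local standard deviation as a simple contrast measure (one of many
    possible contrast priors; not the paper's exact formulation)."""
    h, w = gray.shape
    pad = k // 2
    g = np.pad(gray, pad, mode="edge")
    out = np.empty_like(gray)
    for i in range(h):
        for j in range(w):
            out[i, j] = g[i:i + k, j:j + k].std()
    return out

def fuse_priors(img):
    """Fuse a contrast prior with a color prior into a single per-pixel
    weight map via a normalized product (an illustrative fusion rule)."""
    gray = img.mean(axis=2)
    contrast = local_contrast(gray)
    # Color prior: per-pixel channel spread, a crude proxy for how strongly
    # the medium has washed out the colors.
    color = img.max(axis=2) - img.min(axis=2)
    fused = contrast * color
    return fused / (fused.max() + 1e-8)

rng = np.random.default_rng(0)
img = rng.random((16, 16, 3))   # synthetic RGB image in [0, 1]
weights = fuse_priors(img)
print(weights.shape)
```

A restoration pipeline would then use such a map to weight each prior's contribution per pixel when inverting the image formation model.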
Fast Methods for Recovering Sparse Parameters in Linear Low Rank Models
Title | Fast Methods for Recovering Sparse Parameters in Linear Low Rank Models |
Authors | Ashkan Esmaeili, Arash Amini, Farokh Marvasti |
Abstract | In this paper, we investigate the recovery of a sparse weight vector (parameter vector) from a set of noisy linear combinations. However, only partial information about the matrix representing the linear combinations is available. Assuming a low-rank structure for the matrix, one natural solution would be to first apply matrix completion to the data, and then to solve the resulting compressed sensing problem. In big data applications such as massive MIMO and medical data, the matrix completion step imposes a huge computational burden. Here, we propose to reduce the computational cost of the completion task by ignoring the columns corresponding to zero elements in the sparse vector. To this end, we employ a technique to initially approximate the support of the sparse vector. We further propose to unify the partial matrix completion and sparse vector recovery into an augmented four-step problem. Simulation results reveal that the augmented approach achieves the best performance, while both proposed methods outperform the natural two-step technique with substantially lower computational requirements. |
Tasks | Matrix Completion |
Published | 2016-06-26 |
URL | http://arxiv.org/abs/1606.08009v2 |
http://arxiv.org/pdf/1606.08009v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-methods-for-recovering-sparse-parameters |
Repo | |
Framework | |
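The "natural two-step technique" that the abstract uses as its baseline can be sketched as: complete the partially observed low-rank matrix, then run a sparse regression on the completed matrix. Below, a crude mean-fill plus truncated SVD stands in for a proper matrix completion solver, and Lasso stands in for the compressed sensing step; all sizes and the penalty value are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, r = 80, 40, 3

# Low-rank measurement matrix with entries missing at random.
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, p))
mask = rng.random((n, p)) < 0.7          # ~70% of entries observed
x_true = np.zeros(p)
x_true[:4] = [2.0, -1.5, 1.0, 0.5]       # sparse parameter vector
y = A @ x_true + 0.01 * rng.normal(size=n)

# Step 1: crude completion -- fill missing entries with column means,
# then project onto the top-r singular subspace.
A_obs = np.where(mask, A, np.nan)
col_means = np.nanmean(A_obs, axis=0)
A_fill = np.where(mask, A, col_means)
U, s, Vt = np.linalg.svd(A_fill, full_matrices=False)
A_hat = (U[:, :r] * s[:r]) @ Vt[:r]

# Step 2: sparse recovery on the completed matrix.
x_hat = Lasso(alpha=0.01).fit(A_hat, y).coef_
print(x_hat.shape)
```

The paper's point is that this baseline wastes completion effort on columns whose coefficients are zero; its proposed methods first approximate the support and complete only the relevant columns.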
Contextual Symmetries in Probabilistic Graphical Models
Title | Contextual Symmetries in Probabilistic Graphical Models |
Authors | Ankit Anand, Aditya Grover, Mausam, Parag Singla |
Abstract | An important approach for efficient inference in probabilistic graphical models exploits symmetries among objects in the domain. Symmetric variables (states) are collapsed into meta-variables (meta-states), and inference algorithms are run over the lifted graphical model instead of the flat one. Our paper extends existing definitions of symmetry by introducing the novel notion of contextual symmetry. Two states that are not globally symmetric can be contextually symmetric under some specific assignment to a subset of variables, referred to as the context variables. Contextual symmetry subsumes previous symmetry definitions and can represent a large class of symmetries not representable earlier. We show how to compute contextual symmetries by reducing the problem to graph isomorphism. We extend previous work on exploiting symmetries in the MCMC framework to the case of contextual symmetries. Our experiments on several domains of interest demonstrate that exploiting contextual symmetries can result in significant computational gains. |
Tasks | |
Published | 2016-06-30 |
URL | http://arxiv.org/abs/1606.09594v1 |
http://arxiv.org/pdf/1606.09594v1.pdf | |
PWC | https://paperswithcode.com/paper/contextual-symmetries-in-probabilistic |
Repo | |
Framework | |
Tracking Human-like Natural Motion Using Deep Recurrent Neural Networks
Title | Tracking Human-like Natural Motion Using Deep Recurrent Neural Networks |
Authors | Youngbin Park, Sungphill Moon, Il Hong Suh |
Abstract | The Kinect skeleton tracker is able to achieve considerable human body tracking performance in a convenient and low-cost manner. However, the tracker often captures unnatural human poses, such as discontinuous and jittery motions, when self-occlusions occur. A majority of approaches tackle this problem by using multiple Kinect sensors in a workspace; the measurements from the different sensors are then combined in a Kalman filter framework, or an optimization problem is formulated for sensor fusion. However, these methods usually require heuristics to measure the reliability of the measurements observed from each Kinect sensor. In this paper, we developed a method to improve the Kinect skeleton using a single Kinect sensor, in which a supervised learning technique was employed to correct unnatural tracking motions. Specifically, deep recurrent neural networks were used to improve the joint positions and velocities of the Kinect skeleton, and three methods were proposed to integrate the refined positions and velocities for further enhancement. Moreover, we suggested a novel measure to evaluate the naturalness of captured motions. We evaluated the proposed approach by comparison with the ground truth obtained using a commercial optical marker-based motion capture system. |
Tasks | Motion Capture, Sensor Fusion |
Published | 2016-04-15 |
URL | http://arxiv.org/abs/1604.04528v1 |
http://arxiv.org/pdf/1604.04528v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-human-like-natural-motion-using-deep |
Repo | |
Framework | |
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
Title | DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding |
Authors | Yinda Zhang, Mingru Bai, Pushmeet Kohli, Shahram Izadi, Jianxiong Xiao |
Abstract | While deep neural networks have led to human-level performance on computer vision tasks, they have yet to demonstrate similar gains for holistic scene understanding. In particular, 3D context has been shown to be an extremely important cue for scene understanding - yet very little research has been done on integrating context information with deep models. This paper presents an approach to embed 3D context into the topology of a neural network trained to perform holistic scene understanding. Given a depth image depicting a 3D scene, our network aligns the observed scene with a predefined 3D scene template, and then reasons about the existence and location of each object within the scene template. In doing so, our model recognizes multiple objects in a single forward pass of a 3D convolutional neural network, capturing both global scene and local object information simultaneously. To create training data for this 3D network, we generate partly hallucinated depth images which are rendered by replacing real objects with a repository of CAD models of the same object category. Extensive experiments demonstrate the effectiveness of our algorithm compared to state-of-the-art methods. Source code and data are available at http://deepcontext.cs.princeton.edu. |
Tasks | Scene Understanding |
Published | 2016-03-16 |
URL | http://arxiv.org/abs/1603.04922v4 |
http://arxiv.org/pdf/1603.04922v4.pdf | |
PWC | https://paperswithcode.com/paper/deepcontext-context-encoding-neural-pathways |
Repo | |
Framework | |