May 6, 2019

Paper Group ANR 172

Example-Based Image Synthesis via Randomized Patch-Matching

Title Example-Based Image Synthesis via Randomized Patch-Matching
Authors Yi Ren, Yaniv Romano, Michael Elad
Abstract Image and texture synthesis is a challenging task that has long been drawing attention in the fields of image processing, graphics, and machine learning. This problem consists of modeling the desired type of images, either through training examples or via parametric modeling, and then generating images that belong to the same statistical origin. This work addresses the image synthesis task, focusing on two specific families of images – handwritten digits and face images. This paper offers two main contributions. First, we suggest a simple and intuitive algorithm capable of generating such images in a unified way. The proposed approach is pyramidal, consisting of upscaling and refining the estimated image several times. For each upscaling stage, the algorithm randomly draws small patches from a patch database, and merges these to form a coherent and novel image with high visual quality. The second contribution is a general framework for the evaluation of the generation performance, which combines three aspects: the likelihood, the originality and the spread of the synthesized images. We assess the proposed synthesis scheme and show that the results are similar in nature, and yet different from the ones found in the training set, suggesting that a true synthesis effect has been obtained.
Tasks Image Generation, Texture Synthesis
Published 2016-09-23
URL http://arxiv.org/abs/1609.07370v1
PDF http://arxiv.org/pdf/1609.07370v1.pdf
PWC https://paperswithcode.com/paper/example-based-image-synthesis-via-randomized
Repo
Framework

DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation

Title DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation
Authors Hao Chen, Xiaojuan Qi, Lequan Yu, Pheng-Ann Heng
Abstract The morphology of glands has been used routinely by pathologists to assess the malignancy degree of adenocarcinomas. Accurate segmentation of glands from histology images is a crucial step to obtain reliable morphological statistics for quantitative diagnosis. In this paper, we proposed an efficient deep contour-aware network (DCAN) to solve this challenging problem under a unified multi-task learning framework. In the proposed network, multi-level contextual features from the hierarchical architecture are explored with auxiliary supervision for accurate gland segmentation. When incorporated with multi-task regularization during the training, the discriminative capability of intermediate features can be further improved. Moreover, our network can not only output accurate probability maps of glands, but also depict clear contours simultaneously for separating clustered objects, which further boosts the gland segmentation performance. This unified framework can be efficient when applied to large-scale histopathological data without resorting to additional steps to generate contours based on low-level cues for post-separating. Our method won the 2015 MICCAI Gland Segmentation Challenge out of 13 competitive teams, surpassing all the other methods by a significant margin.
Tasks Multi-Task Learning
Published 2016-04-10
URL http://arxiv.org/abs/1604.02677v1
PDF http://arxiv.org/pdf/1604.02677v1.pdf
PWC https://paperswithcode.com/paper/dcan-deep-contour-aware-networks-for-accurate
Repo
Framework

Dynamic Decomposition of Spatiotemporal Neural Signals

Title Dynamic Decomposition of Spatiotemporal Neural Signals
Authors Luca Ambrogioni, Marcel A. J. van Gerven, Eric Maris
Abstract Neural signals are characterized by rich temporal and spatiotemporal dynamics that reflect the organization of cortical networks. Theoretical research has shown how neural networks can operate at different dynamic ranges that correspond to specific types of information processing. Here we present a data analysis framework that uses a linearized model of these dynamic states in order to decompose the measured neural signal into a series of components that capture both rhythmic and non-rhythmic neural activity. The method is based on stochastic differential equations and Gaussian process regression. Through computer simulations and analysis of magnetoencephalographic data, we demonstrate the efficacy of the method in identifying meaningful modulations of oscillatory signals corrupted by structured temporal and spatiotemporal noise. These results suggest that the method is particularly suitable for the analysis and interpretation of complex temporal and spatiotemporal neural signals.
Tasks
Published 2016-05-09
URL http://arxiv.org/abs/1605.02609v1
PDF http://arxiv.org/pdf/1605.02609v1.pdf
PWC https://paperswithcode.com/paper/dynamic-decomposition-of-spatiotemporal
Repo
Framework

Development of a 3D tongue motion visualization platform based on ultrasound image sequences

Title Development of a 3D tongue motion visualization platform based on ultrasound image sequences
Authors Kele Xu, Yin Yang, Aurore Jaumard-Hakoun, Clemence Leboullenger, Gerard Dreyfus, Pierre Roussel, Maureen Stone, Bruce Denby
Abstract This article describes the development of a platform designed to visualize the 3D motion of the tongue using ultrasound image sequences. An overview of the system design is given and promising results are presented. Compared to the analysis of motion in 2D image sequences, such a system can provide additional visual information and a quantitative description of the tongue's 3D motion. The platform can be useful in a variety of fields, such as speech production and articulation training.
Tasks
Published 2016-05-19
URL http://arxiv.org/abs/1605.06106v1
PDF http://arxiv.org/pdf/1605.06106v1.pdf
PWC https://paperswithcode.com/paper/development-of-a-3d-tongue-motion
Repo
Framework

Photographic home styles in Congress: a computer vision approach

Title Photographic home styles in Congress: a computer vision approach
Authors L. Jason Anastasopoulos, Dhruvil Badani, Crystal Lee, Shiry Ginosar, Jake Williams
Abstract While members of Congress now routinely communicate with constituents using images on a variety of internet platforms, little is known about how images are used as a means of strategic political communication. This is due primarily to computational limitations which have prevented large-scale, systematic analyses of image features. New developments in computer vision, however, are bringing the systematic study of images within reach. Here, we develop a framework for understanding visual political communication by extending Fenno’s analysis of home style (Fenno 1978) to images and introduce “photographic” home styles. Using approximately 192,000 photographs collected from MCs' Facebook profiles, we build machine learning software with convolutional neural networks and conduct an image manipulation experiment to explore how the race of people that MCs pose with shapes photographic home styles. We find evidence that electoral pressures shape photographic home styles and demonstrate that Democratic and Republican members of Congress use images in very different ways.
Tasks
Published 2016-11-29
URL http://arxiv.org/abs/1611.09942v2
PDF http://arxiv.org/pdf/1611.09942v2.pdf
PWC https://paperswithcode.com/paper/photographic-home-styles-in-congress-a
Repo
Framework

Partial Least Squares Regression on Riemannian Manifolds and Its Application in Classifications

Title Partial Least Squares Regression on Riemannian Manifolds and Its Application in Classifications
Authors Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu, Baocai Yin
Abstract Partial least squares regression (PLSR) has been a popular technique to explore the linear relationship between two datasets. However, most algorithm implementations of PLSR may only achieve a suboptimal solution through an optimization on the Euclidean space. In this paper, we propose several novel PLSR models on Riemannian manifolds and develop optimization algorithms based on Riemannian geometry of manifolds. This algorithm can calculate all the factors of PLSR globally to avoid suboptimal solutions. In a number of experiments, we have demonstrated the benefits of applying the proposed model and algorithm to a variety of learning tasks in pattern recognition and object classification.
Tasks Object Classification
Published 2016-09-21
URL http://arxiv.org/abs/1609.06434v1
PDF http://arxiv.org/pdf/1609.06434v1.pdf
PWC https://paperswithcode.com/paper/partial-least-squares-regression-on
Repo
Framework

An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering

Title An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering
Authors Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Olena O. Boiko
Abstract A new approach to data stream clustering with the help of an ensemble of adaptive neuro-fuzzy systems is proposed. The proposed ensemble is formed with adaptive neuro-fuzzy self-organizing Kohonen maps in a parallel processing mode. A final result is chosen by the best neuro-fuzzy self-organizing Kohonen map.
Tasks
Published 2016-10-20
URL http://arxiv.org/abs/1610.06490v1
PDF http://arxiv.org/pdf/1610.06490v1.pdf
PWC https://paperswithcode.com/paper/an-ensemble-of-adaptive-neuro-fuzzy-kohonen
Repo
Framework

Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

Title Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition
Authors Pranay Dighe, Gil Luyet, Afsaneh Asaei, Herve Bourlard
Abstract We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representation of the test posteriors using this dictionary enables projection to the space of training data. Relying on the fact that the intrinsic dimensions of the posterior subspaces are indeed very small and the matrix of all posteriors belonging to a class has a very low rank, we demonstrate how low-dimensional structures enable further enhancement of the posteriors and rectify the spurious errors due to mismatch conditions. The enhanced acoustic modeling method leads to improvements in a continuous speech recognition task using a hybrid DNN-HMM (hidden Markov model) framework in both clean and noisy conditions, where up to a 15.4% relative reduction in word error rate (WER) is achieved.
Tasks Dictionary Learning, Speech Recognition
Published 2016-01-22
URL http://arxiv.org/abs/1601.05936v1
PDF http://arxiv.org/pdf/1601.05936v1.pdf
PWC https://paperswithcode.com/paper/exploiting-low-dimensional-structures-to
Repo
Framework

Diversified Visual Attention Networks for Fine-Grained Object Classification

Title Diversified Visual Attention Networks for Fine-Grained Object Classification
Authors Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan
Abstract Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation. Recently, visual attention models have been applied to automatically localize the discriminative regions of an image for better capturing critical difference and demonstrated promising performance. However, without consideration of the diversity in attention process, most existing attention models perform poorly in classifying fine-grained objects. In this paper, we propose a diversified visual attention network (DVAN) to address the problems of fine-grained object classification, which substantially relieves the dependency on strongly-supervised information for learning to localize discriminative regions compared with attentionless models. More importantly, DVAN explicitly pursues the diversity of attention and is able to gather discriminative information to the maximal extent. Multiple attention canvases are generated to extract convolutional features for attention. An LSTM recurrent unit is employed to learn the attentiveness and discrimination of attention canvases. The proposed DVAN has the ability to attend the object from coarse to fine granularity, and a dynamic internal representation for classification is built up by incrementally combining the information from different locations and scales of the image. Extensive experiments conducted on CUB-2011, Stanford Dogs and Stanford Cars datasets have demonstrated that the proposed diversified visual attention networks achieve competitive performance compared to the state-of-the-art approaches, without using any prior knowledge, user interaction or external resource in training or testing.
Tasks Object Classification
Published 2016-06-28
URL http://arxiv.org/abs/1606.08572v2
PDF http://arxiv.org/pdf/1606.08572v2.pdf
PWC https://paperswithcode.com/paper/diversified-visual-attention-networks-for
Repo
Framework

A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique

Title A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique
Authors Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita
Abstract Most of the existing methods for bilingual word embedding only consider shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit (Bilingual Sense Clique, BSC), which is derived from a maximum complete sub-graph of a pointwise mutual information based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step that most existing works require is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods.
Tasks Dimensionality Reduction
Published 2016-07-29
URL http://arxiv.org/abs/1607.08692v2
PDF http://arxiv.org/pdf/1607.08692v2.pdf
PWC https://paperswithcode.com/paper/a-novel-bilingual-word-embedding-method-for
Repo
Framework

On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition

Title On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition
Authors Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, Ian McGraw
Abstract We study the problem of compressing recurrent neural networks (RNNs). In particular, we focus on the compression of RNN acoustic models, which are motivated by the goal of building compact and accurate speech recognition systems which can be run efficiently on mobile devices. In this work, we present a technique for general recurrent model compression that jointly compresses both recurrent and non-recurrent inter-layer weight matrices. We find that the proposed technique allows us to reduce the size of our Long Short-Term Memory (LSTM) acoustic model to a third of its original size with negligible loss in accuracy.
Tasks Large Vocabulary Continuous Speech Recognition, Model Compression, Speech Recognition
Published 2016-03-25
URL http://arxiv.org/abs/1603.08042v2
PDF http://arxiv.org/pdf/1603.08042v2.pdf
PWC https://paperswithcode.com/paper/on-the-compression-of-recurrent-neural
Repo
Framework
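
To make the size reduction reported in the abstract above concrete, here is a small parameter-counting sketch. It assumes compression by low-rank factorization with one projection matrix shared between a recurrent and an inter-layer weight matrix, which is only one reading of "jointly compresses"; the abstract does not state the factorization. The function names and the layer sizes are hypothetical.

```python
def dense_params(rows_a, rows_b, cols):
    """Parameters in two dense matrices W_a (rows_a x cols) and W_b (rows_b x cols)."""
    return (rows_a + rows_b) * cols


def shared_projection_params(rows_a, rows_b, cols, rank):
    """Parameters after W_a ~ Z_a P and W_b ~ Z_b P with one shared P (rank x cols).

    Assumption: both matrices reuse the same projection P.
    """
    return rank * (rows_a + rows_b + cols)


def max_rank_for_fraction(rows_a, rows_b, cols, numerator, denominator):
    """Largest rank whose factorized size is at most numerator/denominator of dense."""
    budget = dense_params(rows_a, rows_b, cols) * numerator // denominator
    return budget // (rows_a + rows_b + cols)


# Hypothetical layer: hidden size 500 with four stacked LSTM gates (2000 rows each).
rank = max_rank_for_fraction(2000, 2000, 500, 1, 3)
compressed = shared_projection_params(2000, 2000, 500, rank)
original = dense_params(2000, 2000, 500)
print(rank, compressed, original)
```

Under these assumed sizes the count shows how small the shared rank must be to reach a one-third budget; whether accuracy survives at that rank is an empirical question the sketch does not address.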

Robotic Grasp Detection using Deep Convolutional Neural Networks

Title Robotic Grasp Detection using Deep Convolutional Neural Networks
Authors Sulabh Kumra, Christopher Kanan
Abstract Deep learning has significantly advanced computer vision and natural language processing. While there have been some successes in robotics using deep learning, it has not been widely adopted. In this paper, we present a novel robotic grasp detection system that predicts the best grasping pose of a parallel-plate robotic gripper for novel objects using the RGB-D image of the scene. The proposed model uses a deep convolutional neural network to extract features from the scene and then uses a shallow convolutional neural network to predict the grasp configuration for the object of interest. Our multi-modal model achieved an accuracy of 89.21% on the standard Cornell Grasp Dataset and runs at real-time speeds. This redefines the state-of-the-art for robotic grasp detection.
Tasks Robotic Grasping
Published 2016-11-24
URL http://arxiv.org/abs/1611.08036v4
PDF http://arxiv.org/pdf/1611.08036v4.pdf
PWC https://paperswithcode.com/paper/robotic-grasp-detection-using-deep
Repo
Framework

LCNN: Lookup-based Convolutional Neural Network

Title LCNN: Lookup-based Convolutional Neural Network
Authors Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi
Abstract Porting state-of-the-art deep learning algorithms to resource-constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables efficient learning and inference. We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions by a few lookups to a dictionary that is trained to cover the space of weights in CNNs. Training LCNN involves jointly learning a dictionary and a small set of linear combinations. The size of the dictionary naturally traces a spectrum of trade-offs between efficiency and accuracy. Our experimental results on the ImageNet challenge show that LCNN can offer a 3.2x speedup while achieving 55.1% top-1 accuracy using the AlexNet architecture. Our fastest LCNN offers a 37.6x speedup over AlexNet while maintaining 44.3% top-1 accuracy. LCNN not only offers dramatic speedups at inference, but it also enables efficient training. In this paper, we show the benefits of LCNN in few-shot learning and few-iteration learning, two crucial aspects of on-device training of deep learning models.
Tasks Few-Shot Learning
Published 2016-11-20
URL http://arxiv.org/abs/1611.06473v2
PDF http://arxiv.org/pdf/1611.06473v2.pdf
PWC https://paperswithcode.com/paper/lcnn-lookup-based-convolutional-neural
Repo
Framework
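
The idea of encoding filters as a few lookups into a shared dictionary, as described in the abstract above, can be illustrated at a single spatial position. This is a simplified sketch, not the LCNN implementation: the helper names, the flat-vector dictionary, and the toy numbers are assumptions. It shows that projecting the input onto the dictionary once and then combining a few looked-up projections per filter gives the same responses as applying the expanded dense filters.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def expand_filter(dictionary, entries):
    """Build a dense filter as a sparse linear combination of dictionary vectors."""
    dense = [0.0] * len(dictionary[0])
    for index, coeff in entries:
        for k, value in enumerate(dictionary[index]):
            dense[k] += coeff * value
    return dense


def lookup_responses(x, dictionary, lookups):
    """Filter responses computed through the dictionary instead of dense filters.

    The input is projected onto every dictionary vector once; each filter
    then needs only a few lookups into those shared projections.
    """
    projections = [dot(x, d) for d in dictionary]
    return [sum(c * projections[i] for i, c in entries) for entries in lookups]


# Hypothetical sizes: 3 dictionary vectors, 4 filters with 2 lookups each.
dictionary = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.5, 0.5, 0.5]]
lookups = [[(0, 1.0), (1, 2.0)], [(1, -1.0), (2, 4.0)],
           [(0, 0.5), (2, 1.0)], [(0, -2.0), (1, 3.0)]]
x = [3.0, -1.0, 2.0]

fast = lookup_responses(x, dictionary, lookups)
slow = [dot(x, expand_filter(dictionary, e)) for e in lookups]
print(fast, slow)
```

The saving comes from sharing: the dictionary projections are computed once per input, so adding filters costs only a few multiply-adds each, which is consistent with the efficiency/accuracy trade-off the abstract attributes to dictionary size.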

Randomized Kaczmarz for Rank Aggregation from Pairwise Comparisons

Title Randomized Kaczmarz for Rank Aggregation from Pairwise Comparisons
Authors Vivek S. Borkar, Nikhil Karamchandani, Sharad Mirani
Abstract We revisit the problem of inferring the overall ranking among entities in the framework of the Bradley-Terry-Luce (BTL) model, based on available empirical data on pairwise preferences. By a simple transformation, we can cast the problem as that of solving a noisy linear system, for which a ready algorithm is available in the form of the randomized Kaczmarz method. This scheme is provably convergent, has excellent empirical performance, and is amenable to on-line, distributed and asynchronous variants. Convergence, convergence rate, and error analysis of the proposed algorithm are presented and several numerical experiments are conducted whose results validate our theoretical findings.
Tasks
Published 2016-05-09
URL http://arxiv.org/abs/1605.02470v1
PDF http://arxiv.org/pdf/1605.02470v1.pdf
PWC https://paperswithcode.com/paper/randomized-kaczmarz-for-rank-aggregation-from
Repo
Framework
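
The randomized Kaczmarz method named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `randomized_kaczmarz`, the sampling rule (rows drawn with probability proportional to their squared norm, a common variant), and the toy data are assumptions. The toy system encodes differences of BTL log-scores as one possible linearization; the abstract does not spell out the transformation the paper uses.

```python
import random


def randomized_kaczmarz(A, b, iters=500, seed=0):
    """Approximately solve A x = b, starting from x = 0.

    Each step samples one row (probability proportional to its squared
    norm) and projects the current iterate onto that row's hyperplane.
    """
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    sq_norms = [sum(a * a for a in row) for row in A]
    row_ids = range(len(A))
    for _ in range(iters):
        i = rng.choices(row_ids, weights=sq_norms)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        step = residual / sq_norms[i]
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Hypothetical toy data: three items with zero-mean log-scores (1, 0, -1).
# Each row encodes one noiseless comparison as s_i - s_j = log-odds.
A = [[1, -1, 0], [0, 1, -1], [1, 0, -1]]
b = [1.0, 1.0, 2.0]
scores = randomized_kaczmarz(A, b)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking)
```

Because the iterate starts at zero and only ever moves along rows of `A`, it stays zero-mean here, which resolves the shift ambiguity of score differences without an extra constraint. With noisy comparisons the system is generally inconsistent and the iterates would hover near, rather than converge to, a solution.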

Bounded Rational Decision-Making in Feedforward Neural Networks

Title Bounded Rational Decision-Making in Feedforward Neural Networks
Authors Felix Leibfried, Daniel Alexander Braun
Abstract Bounded rational decision-makers transform sensory input into motor output under limited computational resources. Mathematically, such decision-makers can be modeled as information-theoretic channels with limited transmission rate. Here, we apply this formalism for the first time to multilayer feedforward neural networks. We derive synaptic weight update rules for two scenarios, where either each neuron is considered as a bounded rational decision-maker or the network as a whole. In the update rules, bounded rationality translates into information-theoretically motivated types of regularization in weight space. In experiments on the MNIST benchmark classification task for handwritten digits, we show that such information-theoretic regularization successfully prevents overfitting across different architectures and attains results that are competitive with other recent techniques like dropout, dropconnect and Bayes by backprop, for both ordinary and convolutional neural networks.
Tasks Decision Making
Published 2016-02-26
URL http://arxiv.org/abs/1602.08332v2
PDF http://arxiv.org/pdf/1602.08332v2.pdf
PWC https://paperswithcode.com/paper/bounded-rational-decision-making-in
Repo
Framework