Paper Group ANR 172
Example-Based Image Synthesis via Randomized Patch-Matching
Title | Example-Based Image Synthesis via Randomized Patch-Matching |
Authors | Yi Ren, Yaniv Romano, Michael Elad |
Abstract | Image and texture synthesis is a challenging task that has long drawn attention in the fields of image processing, graphics, and machine learning. The problem consists of modeling the desired type of images, either through training examples or via parametric modeling, and then generating images that belong to the same statistical origin. This work addresses the image synthesis task, focusing on two specific families of images: handwritten digits and face images. This paper offers two main contributions. First, we suggest a simple and intuitive algorithm capable of generating such images in a unified way. The proposed approach is pyramidal, consisting of upscaling and refining the estimated image several times. For each upscaling stage, the algorithm randomly draws small patches from a patch database and merges them to form a coherent and novel image with high visual quality. The second contribution is a general framework for evaluating generation performance, which combines three aspects: the likelihood, the originality, and the spread of the synthesized images. We assess the proposed synthesis scheme and show that the results are similar in nature to, and yet different from, the ones found in the training set, suggesting that a true synthesis effect has been obtained. |
Tasks | Image Generation, Texture Synthesis |
Published | 2016-09-23 |
URL | http://arxiv.org/abs/1609.07370v1 |
http://arxiv.org/pdf/1609.07370v1.pdf | |
PWC | https://paperswithcode.com/paper/example-based-image-synthesis-via-randomized |
Repo | |
Framework | |
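The refinement step described in the abstract above (draw patches at random from a database, then merge them) can be illustrated with a small sketch. This is not the authors' algorithm: the restriction to the `k` closest database patches, the averaging of overlapping pixels, and the names `extract_patches` and `refine` are all assumptions made for illustration, and the pyramidal upscaling between passes is omitted.

```python
import random


def extract_patches(img, size):
    """Collect every size x size patch of a 2-D list image as a flat tuple."""
    h, w = len(img), len(img[0])
    return [
        tuple(img[r + dr][c + dc] for dr in range(size) for dc in range(size))
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]


def refine(estimate, database, size=2, k=2, seed=0):
    """One refinement pass over the current estimate.

    Assumed scheme: each patch of the estimate is replaced by a patch drawn
    at random from its k closest database patches, and pixels covered by
    several patches are averaged.
    """
    rng = random.Random(seed)
    h, w = len(estimate), len(estimate[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            cur = [estimate[r + dr][c + dc]
                   for dr in range(size) for dc in range(size)]
            # Rank database patches by squared distance to the current patch.
            ranked = sorted(
                database,
                key=lambda p: sum((a - b) ** 2 for a, b in zip(p, cur)),
            )
            chosen = rng.choice(ranked[:k])  # random draw among the k closest
            for idx, value in enumerate(chosen):
                dr, dc = divmod(idx, size)
                total[r + dr][c + dc] += value
                count[r + dr][c + dc] += 1
    return [[total[r][c] / count[r][c] for c in range(w)] for r in range(h)]
```

With `k=1` the draw is deterministic and a training image is reproduced exactly; larger `k` trades fidelity to the current estimate for novelty.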
DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation
Title | DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation |
Authors | Hao Chen, Xiaojuan Qi, Lequan Yu, Pheng-Ann Heng |
Abstract | The morphology of glands has been used routinely by pathologists to assess the malignancy degree of adenocarcinomas. Accurate segmentation of glands from histology images is a crucial step to obtain reliable morphological statistics for quantitative diagnosis. In this paper, we propose an efficient deep contour-aware network (DCAN) to solve this challenging problem under a unified multi-task learning framework. In the proposed network, multi-level contextual features from the hierarchical architecture are explored with auxiliary supervision for accurate gland segmentation. When incorporated with multi-task regularization during training, the discriminative capability of intermediate features can be further improved. Moreover, our network not only outputs accurate probability maps of glands, but also depicts clear contours simultaneously for separating clustered objects, which further boosts the gland segmentation performance. This unified framework can be efficient when applied to large-scale histopathological data without resorting to additional steps to generate contours based on low-level cues for post-separating. Our method won the 2015 MICCAI Gland Segmentation Challenge out of 13 competitive teams, surpassing all the other methods by a significant margin. |
Tasks | Multi-Task Learning |
Published | 2016-04-10 |
URL | http://arxiv.org/abs/1604.02677v1 |
http://arxiv.org/pdf/1604.02677v1.pdf | |
PWC | https://paperswithcode.com/paper/dcan-deep-contour-aware-networks-for-accurate |
Repo | |
Framework | |
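The abstract above says the network outputs both gland probability maps and contours, and that the contours help separate clustered objects. A minimal sketch of how two such maps could be fused is given here. The fusion rule (keep a pixel when the object probability is high and the contour probability is low), both thresholds, and the function names are illustrative assumptions; the abstract does not specify them.

```python
def fuse(prob_object, prob_contour, t_object=0.5, t_contour=0.5):
    """Binary mask: object-like pixels that are not on a predicted contour."""
    return [
        [1 if po >= t_object and pc < t_contour else 0
         for po, pc in zip(row_o, row_c)]
        for row_o, row_c in zip(prob_object, prob_contour)
    ]


def count_components(mask):
    """Count 4-connected foreground regions with an iterative flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                components += 1
                stack = [(r, c)]
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return components
```

On two touching objects, thresholding the object map alone yields one merged region, while subtracting the predicted contour splits it into two.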
Dynamic Decomposition of Spatiotemporal Neural Signals
Title | Dynamic Decomposition of Spatiotemporal Neural Signals |
Authors | Luca Ambrogioni, Marcel A. J. van Gerven, Eric Maris |
Abstract | Neural signals are characterized by rich temporal and spatiotemporal dynamics that reflect the organization of cortical networks. Theoretical research has shown how neural networks can operate at different dynamic ranges that correspond to specific types of information processing. Here we present a data analysis framework that uses a linearized model of these dynamic states in order to decompose the measured neural signal into a series of components that capture both rhythmic and non-rhythmic neural activity. The method is based on stochastic differential equations and Gaussian process regression. Through computer simulations and analysis of magnetoencephalographic data, we demonstrate the efficacy of the method in identifying meaningful modulations of oscillatory signals corrupted by structured temporal and spatiotemporal noise. These results suggest that the method is particularly suitable for the analysis and interpretation of complex temporal and spatiotemporal neural signals. |
Tasks | |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02609v1 |
http://arxiv.org/pdf/1605.02609v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-decomposition-of-spatiotemporal |
Repo | |
Framework | |
Development of a 3D tongue motion visualization platform based on ultrasound image sequences
Title | Development of a 3D tongue motion visualization platform based on ultrasound image sequences |
Authors | Kele Xu, Yin Yang, Aurore Jaumard-Hakoun, Clemence Leboullenger, Gerard Dreyfus, Pierre Roussel, Maureen Stone, Bruce Denby |
Abstract | This article describes the development of a platform designed to visualize the 3D motion of the tongue using ultrasound image sequences. An overview of the system design is given and promising results are presented. Compared to the analysis of motion in 2D image sequences, such a system can provide additional visual information and a quantitative description of the tongue's 3D motion. The platform can be useful in a variety of fields, such as speech production and articulation training. |
Tasks | |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.06106v1 |
http://arxiv.org/pdf/1605.06106v1.pdf | |
PWC | https://paperswithcode.com/paper/development-of-a-3d-tongue-motion |
Repo | |
Framework | |
Photographic home styles in Congress: a computer vision approach
Title | Photographic home styles in Congress: a computer vision approach |
Authors | L. Jason Anastasopoulos, Dhruvil Badani, Crystal Lee, Shiry Ginosar, Jake Williams |
Abstract | While members of Congress now routinely communicate with constituents using images on a variety of internet platforms, little is known about how images are used as a means of strategic political communication. This is due primarily to computational limitations which have prevented large-scale, systematic analyses of image features. New developments in computer vision, however, are bringing the systematic study of images within reach. Here, we develop a framework for understanding visual political communication by extending Fenno’s analysis of home style (Fenno 1978) to images and introduce “photographic” home styles. Using approximately 192,000 photographs collected from MCs' Facebook profiles, we build machine learning software with convolutional neural networks and conduct an image manipulation experiment to explore how the race of people that MCs pose with shapes photographic home styles. We find evidence that electoral pressures shape photographic home styles and demonstrate that Democratic and Republican members of Congress use images in very different ways. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09942v2 |
http://arxiv.org/pdf/1611.09942v2.pdf | |
PWC | https://paperswithcode.com/paper/photographic-home-styles-in-congress-a |
Repo | |
Framework | |
Partial Least Squares Regression on Riemannian Manifolds and Its Application in Classifications
Title | Partial Least Squares Regression on Riemannian Manifolds and Its Application in Classifications |
Authors | Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu, Baocai Yin |
Abstract | Partial least squares regression (PLSR) has been a popular technique to explore the linear relationship between two datasets. However, most algorithm implementations of PLSR may achieve only a suboptimal solution through optimization in Euclidean space. In this paper, we propose several novel PLSR models on Riemannian manifolds and develop optimization algorithms based on the Riemannian geometry of manifolds. This algorithm can calculate all the factors of PLSR globally to avoid suboptimal solutions. In a number of experiments, we have demonstrated the benefits of applying the proposed model and algorithm to a variety of learning tasks in pattern recognition and object classification. |
Tasks | Object Classification |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06434v1 |
http://arxiv.org/pdf/1609.06434v1.pdf | |
PWC | https://paperswithcode.com/paper/partial-least-squares-regression-on |
Repo | |
Framework | |
An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering
Title | An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering |
Authors | Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Olena O. Boiko |
Abstract | A new approach to data stream clustering with the help of an ensemble of adaptive neuro-fuzzy systems is proposed. The proposed ensemble is formed with adaptive neuro-fuzzy self-organizing Kohonen maps in a parallel processing mode. A final result is chosen by the best neuro-fuzzy self-organizing Kohonen map. |
Tasks | |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06490v1 |
http://arxiv.org/pdf/1610.06490v1.pdf | |
PWC | https://paperswithcode.com/paper/an-ensemble-of-adaptive-neuro-fuzzy-kohonen |
Repo | |
Framework | |
Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition
Title | Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition |
Authors | Pranay Dighe, Gil Luyet, Afsaneh Asaei, Herve Bourlard |
Abstract | We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representation of the test posteriors using this dictionary enables projection to the space of training data. Relying on the fact that the intrinsic dimensions of the posterior subspaces are indeed very small and the matrix of all posteriors belonging to a class has a very low rank, we demonstrate how low-dimensional structures enable further enhancement of the posteriors and rectify the spurious errors due to mismatched conditions. The enhanced acoustic modeling method leads to improvements on a continuous speech recognition task using a hybrid DNN-HMM (hidden Markov model) framework in both clean and noisy conditions, where up to a 15.4% relative reduction in word error rate (WER) is achieved. |
Tasks | Dictionary Learning, Speech Recognition |
Published | 2016-01-22 |
URL | http://arxiv.org/abs/1601.05936v1 |
http://arxiv.org/pdf/1601.05936v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-low-dimensional-structures-to |
Repo | |
Framework | |
Diversified Visual Attention Networks for Fine-Grained Object Classification
Title | Diversified Visual Attention Networks for Fine-Grained Object Classification |
Authors | Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan |
Abstract | Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation. Recently, visual attention models have been applied to automatically localize the discriminative regions of an image to better capture critical differences, and have demonstrated promising performance. However, without consideration of the diversity in the attention process, most existing attention models perform poorly in classifying fine-grained objects. In this paper, we propose a diversified visual attention network (DVAN) to address the problems of fine-grained object classification, which substantially relieves the dependency on strongly-supervised information for learning to localize discriminative regions compared with attentionless models. More importantly, DVAN explicitly pursues the diversity of attention and is able to gather discriminative information to the maximal extent. Multiple attention canvases are generated to extract convolutional features for attention. An LSTM recurrent unit is employed to learn the attentiveness and discrimination of attention canvases. The proposed DVAN has the ability to attend to the object from coarse to fine granularity, and a dynamic internal representation for classification is built up by incrementally combining the information from different locations and scales of the image. Extensive experiments conducted on the CUB-2011, Stanford Dogs and Stanford Cars datasets have demonstrated that the proposed diversified visual attention networks achieve competitive performance compared to the state-of-the-art approaches, without using any prior knowledge, user interaction or external resource in training or testing. |
Tasks | Object Classification |
Published | 2016-06-28 |
URL | http://arxiv.org/abs/1606.08572v2 |
http://arxiv.org/pdf/1606.08572v2.pdf | |
PWC | https://paperswithcode.com/paper/diversified-visual-attention-networks-for |
Repo | |
Framework | |
A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique
Title | A Novel Bilingual Word Embedding Method for Lexical Translation Using Bilingual Sense Clique |
Authors | Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita |
Abstract | Most of the existing methods for bilingual word embedding only consider shallow context or simple co-occurrence information. In this paper, we propose a latent bilingual sense unit (Bilingual Sense Clique, BSC), which is derived from a maximum complete sub-graph of a pointwise-mutual-information-based graph over a bilingual corpus. In this way, we treat source and target words equally, and the separate bilingual projection step required by most existing work is no longer necessary. Several dimension reduction methods are evaluated to summarize the BSC-word relationship. The proposed method is evaluated on bilingual lexicon translation tasks, and empirical results show that bilingual sense embedding methods outperform existing bilingual word embedding methods. |
Tasks | Dimensionality Reduction |
Published | 2016-07-29 |
URL | http://arxiv.org/abs/1607.08692v2 |
http://arxiv.org/pdf/1607.08692v2.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-bilingual-word-embedding-method-for |
Repo | |
Framework | |
On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition
Title | On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition |
Authors | Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, Ian McGraw |
Abstract | We study the problem of compressing recurrent neural networks (RNNs). In particular, we focus on the compression of RNN acoustic models, which are motivated by the goal of building compact and accurate speech recognition systems which can be run efficiently on mobile devices. In this work, we present a technique for general recurrent model compression that jointly compresses both recurrent and non-recurrent inter-layer weight matrices. We find that the proposed technique allows us to reduce the size of our Long Short-Term Memory (LSTM) acoustic model to a third of its original size with negligible loss in accuracy. |
Tasks | Large Vocabulary Continuous Speech Recognition, Model Compression, Speech Recognition |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.08042v2 |
http://arxiv.org/pdf/1603.08042v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-compression-of-recurrent-neural |
Repo | |
Framework | |
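To give a feel for how jointly compressing a recurrent matrix and the inter-layer matrix that reads the same hidden state can shrink a model, the following sketch counts parameters under one possible scheme. The scheme (both matrices factored through a single shared low-rank projection), the function names, and the sizes in the usage note are assumptions for illustration; the abstract above does not spell out the factorization, and real LSTM layers hold several gate matrices rather than one.

```python
def uncompressed_params(n_hidden, n_next):
    """Weights in a recurrent matrix (n_hidden x n_hidden) plus the
    inter-layer matrix feeding the next layer (n_next x n_hidden)."""
    return n_hidden * n_hidden + n_next * n_hidden


def shared_projection_params(n_hidden, n_next, rank):
    """Assumed joint scheme: W_rec ~ Z_rec @ P and W_next ~ Z_next @ P,
    where the projection P (rank x n_hidden) is shared by both factors."""
    z_rec = n_hidden * rank      # Z_rec: n_hidden x rank
    z_next = n_next * rank       # Z_next: n_next x rank
    projection = rank * n_hidden  # P: rank x n_hidden, counted once
    return z_rec + z_next + projection
```

For hypothetical sizes `n_hidden = n_next = 100` and `rank = 10`, the count drops from 20000 to 3000 weights; the rank controls the trade-off between size and approximation quality.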
Robotic Grasp Detection using Deep Convolutional Neural Networks
Title | Robotic Grasp Detection using Deep Convolutional Neural Networks |
Authors | Sulabh Kumra, Christopher Kanan |
Abstract | Deep learning has significantly advanced computer vision and natural language processing. While there have been some successes in robotics using deep learning, it has not been widely adopted. In this paper, we present a novel robotic grasp detection system that predicts the best grasping pose of a parallel-plate robotic gripper for novel objects using the RGB-D image of the scene. The proposed model uses a deep convolutional neural network to extract features from the scene and then uses a shallow convolutional neural network to predict the grasp configuration for the object of interest. Our multi-modal model achieved an accuracy of 89.21% on the standard Cornell Grasp Dataset and runs at real-time speeds. This redefines the state-of-the-art for robotic grasp detection. |
Tasks | Robotic Grasping |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08036v4 |
http://arxiv.org/pdf/1611.08036v4.pdf | |
PWC | https://paperswithcode.com/paper/robotic-grasp-detection-using-deep |
Repo | |
Framework | |
LCNN: Lookup-based Convolutional Neural Network
Title | LCNN: Lookup-based Convolutional Neural Network |
Authors | Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi |
Abstract | Porting state-of-the-art deep learning algorithms to resource-constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables efficient learning and inference. We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions by a few lookups to a dictionary that is trained to cover the space of weights in CNNs. Training LCNN involves jointly learning a dictionary and a small set of linear combinations. The size of the dictionary naturally traces a spectrum of trade-offs between efficiency and accuracy. Our experimental results on the ImageNet challenge show that LCNN can offer a 3.2x speedup while achieving 55.1% top-1 accuracy using the AlexNet architecture. Our fastest LCNN offers a 37.6x speedup over AlexNet while maintaining 44.3% top-1 accuracy. LCNN not only offers dramatic speedups at inference, but it also enables efficient training. In this paper, we show the benefits of LCNN in few-shot learning and few-iteration learning, two crucial aspects of on-device training of deep learning models. |
Tasks | Few-Shot Learning |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06473v2 |
http://arxiv.org/pdf/1611.06473v2.pdf | |
PWC | https://paperswithcode.com/paper/lcnn-lookup-based-convolutional-neural |
Repo | |
Framework | |
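The core idea in the abstract above, expressing each filter as a small linear combination of shared dictionary entries, can be sketched on a single input vector. Because the dot product is linear, the input is projected onto the dictionary once and each filter output is then assembled from a few looked-up responses. The data layout (`lookups` as lists of `(index, coefficient)` pairs) and the function names are assumptions for illustration; a full convolution and the training procedure are not shown.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dense_outputs(x, dictionary, lookups):
    """Reference path: materialize each filter, then apply it to x."""
    outputs = []
    for combo in lookups:
        weight = [0.0] * len(x)
        for index, coeff in combo:
            for d in range(len(x)):
                weight[d] += coeff * dictionary[index][d]
        outputs.append(dot(x, weight))
    return outputs


def lookup_outputs(x, dictionary, lookups):
    """Lookup path: one pass over the dictionary, then a few lookups
    and multiply-adds per filter."""
    responses = [dot(x, atom) for atom in dictionary]
    return [
        sum(coeff * responses[index] for index, coeff in combo)
        for combo in lookups
    ]
```

Both paths give the same outputs; the lookup path does work proportional to the dictionary size plus the number of lookups, rather than to the number of filters times the filter size.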
Randomized Kaczmarz for Rank Aggregation from Pairwise Comparisons
Title | Randomized Kaczmarz for Rank Aggregation from Pairwise Comparisons |
Authors | Vivek S. Borkar, Nikhil Karamchandani, Sharad Mirani |
Abstract | We revisit the problem of inferring the overall ranking among entities in the framework of the Bradley-Terry-Luce (BTL) model, based on available empirical data on pairwise preferences. By a simple transformation, we can cast the problem as that of solving a noisy linear system, for which a ready algorithm is available in the form of the randomized Kaczmarz method. This scheme is provably convergent, has excellent empirical performance, and is amenable to on-line, distributed and asynchronous variants. Convergence, convergence rate, and error analysis of the proposed algorithm are presented and several numerical experiments are conducted whose results validate our theoretical findings. |
Tasks | |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02470v1 |
http://arxiv.org/pdf/1605.02470v1.pdf | |
PWC | https://paperswithcode.com/paper/randomized-kaczmarz-for-rank-aggregation-from |
Repo | |
Framework | |
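A minimal sketch of the approach outlined in the abstract above follows. It assumes one common transformation, which may differ from the one used in the paper: under BTL, P(i beats j) = w_i / (w_i + w_j), so the log-odds of the empirical win rate give one linear equation theta_i - theta_j = log(p_ij / (1 - p_ij)) in the log-scores theta = log w. The function names and the input format are hypothetical, and every win rate must lie strictly between 0 and 1.

```python
import math
import random


def kaczmarz_btl(pairs, n, iters=2000, seed=0):
    """Estimate centered BTL log-scores with randomized Kaczmarz.

    pairs: list of (i, j, p_ij), p_ij = empirical fraction of comparisons
    in which item i beat item j (assumed to satisfy 0 < p_ij < 1).
    """
    rng = random.Random(seed)
    theta = [0.0] * n
    for _ in range(iters):
        # Every row has squared norm 2, so norm-proportional sampling
        # reduces to a uniform choice of equation.
        i, j, p = rng.choice(pairs)
        target = math.log(p / (1.0 - p))
        residual = target - (theta[i] - theta[j])
        # Project onto the hyperplane theta_i - theta_j = target.
        theta[i] += residual / 2.0
        theta[j] -= residual / 2.0
    # Scores are only defined up to a common shift, so center them.
    mean = sum(theta) / n
    return [t - mean for t in theta]


def ranking(theta):
    """Item indices ordered from strongest to weakest."""
    return sorted(range(len(theta)), key=lambda i: -theta[i])
```

With noiseless win rates the system is consistent and the iterates converge to the true centered log-scores; with noisy rates the iterates only hover near a solution, which is where the error analysis mentioned in the abstract becomes relevant.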
Bounded Rational Decision-Making in Feedforward Neural Networks
Title | Bounded Rational Decision-Making in Feedforward Neural Networks |
Authors | Felix Leibfried, Daniel Alexander Braun |
Abstract | Bounded rational decision-makers transform sensory input into motor output under limited computational resources. Mathematically, such decision-makers can be modeled as information-theoretic channels with limited transmission rate. Here, we apply this formalism for the first time to multilayer feedforward neural networks. We derive synaptic weight update rules for two scenarios, where either each neuron is considered as a bounded rational decision-maker or the network as a whole. In the update rules, bounded rationality translates into information-theoretically motivated types of regularization in weight space. In experiments on the MNIST benchmark classification task for handwritten digits, we show that such information-theoretic regularization successfully prevents overfitting across different architectures and attains results that are competitive with other recent techniques like dropout, dropconnect and Bayes by backprop, for both ordinary and convolutional neural networks. |
Tasks | Decision Making |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08332v2 |
http://arxiv.org/pdf/1602.08332v2.pdf | |
PWC | https://paperswithcode.com/paper/bounded-rational-decision-making-in |
Repo | |
Framework | |