July 29, 2019

3056 words 15 mins read

Paper Group ANR 146


Bio-Inspired Multi-Layer Spiking Neural Network Extracts Discriminative Features from Speech Signals

Title Bio-Inspired Multi-Layer Spiking Neural Network Extracts Discriminative Features from Speech Signals
Authors Amirhossein Tavanaei, Anthony Maida
Abstract Spiking neural networks (SNNs) enable power-efficient implementations due to their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals, which can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer of leaky integrate-and-fire (LIF) neurons represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity learning rule. This layer represents the discriminative features through probabilistic LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show performance above 96%, which compares favorably with popular statistical feature extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.
Tasks
Published 2017-06-10
URL http://arxiv.org/abs/1706.03170v1
PDF http://arxiv.org/pdf/1706.03170v1.pdf
PWC https://paperswithcode.com/paper/bio-inspired-multi-layer-spiking-neural
Repo
Framework
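
The convolutional layer above is built from leaky integrate-and-fire (LIF) units. As a minimal illustration of the neuron model (not the authors' implementation, and with arbitrary time constant, threshold, and input drive), a discrete-time LIF update looks like this:

```python
import numpy as np

def lif_simulate(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate a single leaky integrate-and-fire neuron.

    input_current: 1-D array of input drive per time step (arbitrary units).
    Returns the membrane potential trace and a binary spike train.
    All constants here are illustrative placeholders, not the paper's values.
    """
    v = v_reset
    potentials, spikes = [], []
    for i_t in input_current:
        # Leaky integration: decay toward rest, then add input.
        v += dt / tau * (-(v - v_reset)) + i_t
        if v >= v_thresh:          # threshold crossing -> emit spike
            spikes.append(1)
            v = v_reset            # reset after spiking
        else:
            spikes.append(0)
        potentials.append(v)
    return np.array(potentials), np.array(spikes)

# Example: noisy constant drive produces a regular spike train.
rng = np.random.default_rng(0)
drive = 0.08 + 0.02 * rng.standard_normal(200)
v_trace, spike_train = lif_simulate(drive)
print("spike count:", spike_train.sum())
```

The fully connected layer in the paper additionally applies its probabilistic STDP rule to the spikes such units emit; that learning rule is not shown here.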

EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation

Title EZLearn: Exploiting Organic Supervision in Large-Scale Data Annotation
Authors Maxim Grechkin, Hoifung Poon, Bill Howe
Abstract Many real-world applications require automated data annotation, such as identifying tissue origins based on gene expressions and classifying images into semantic categories. Annotation classes are often numerous and subject to changes over time, and annotating examples has become the major bottleneck for supervised learning methods. In science and other high-value domains, large repositories of data samples are often available, together with two sources of organic supervision: a lexicon for the annotation classes, and text descriptions that accompany some data samples. Distant supervision has emerged as a promising paradigm for exploiting such indirect supervision by automatically annotating examples where the text description contains a class mention in the lexicon. However, due to linguistic variations and ambiguities, such training data is inherently noisy, which limits the accuracy of this approach. In this paper, we introduce an auxiliary natural language processing system for the text modality, and incorporate co-training to reduce noise and augment signal in distant supervision. Without using any manually labeled data, our EZLearn system learned to accurately annotate data samples in functional genomics and scientific figure comprehension, substantially outperforming state-of-the-art supervised methods trained on tens of thousands of annotated examples.
Tasks
Published 2017-09-25
URL http://arxiv.org/abs/1709.08600v3
PDF http://arxiv.org/pdf/1709.08600v3.pdf
PWC https://paperswithcode.com/paper/ezlearn-exploiting-organic-supervision-in
Repo
Framework
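
The organic supervision step described above can be illustrated with a toy lexicon matcher: a sample receives a noisy label when its text description mentions a class name or a listed synonym from the lexicon. The lexicon and descriptions below are invented placeholders, not EZLearn's data or code; in the full system, co-training with the auxiliary text classifier then cleans up and augments these noisy labels.

```python
# Toy illustration of lexicon-based distant supervision (not the EZLearn codebase).
from typing import Dict, List, Optional

def distant_label(description: str, lexicon: Dict[str, List[str]]) -> Optional[str]:
    """Return a noisy class label if the description mentions a lexicon term."""
    text = description.lower()
    hits = [cls for cls, terms in lexicon.items()
            if any(term.lower() in text for term in terms)]
    # Ambiguous or empty matches yield no label; co-training would refine these.
    return hits[0] if len(hits) == 1 else None

lexicon = {  # hypothetical tissue lexicon
    "liver": ["liver", "hepatic"],
    "kidney": ["kidney", "renal"],
}
samples = [
    "RNA-seq of hepatic tissue from donor 12",
    "expression profile, renal cortex biopsy",
    "unlabeled cell line sample",
]
print([distant_label(s, lexicon) for s in samples])
# -> ['liver', 'kidney', None]
```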

Uncertainty-Aware Organ Classification for Surgical Data Science Applications in Laparoscopy

Title Uncertainty-Aware Organ Classification for Surgical Data Science Applications in Laparoscopy
Authors S. Moccia, S. J. Wirkert, H. Kenngott, A. S. Vemuri, M. Apitz, B. Mayer, E. De Momi, L. S. Mattos, L. Maier-Hein
Abstract Objective: Surgical data science is evolving into a research field that aims to observe everything occurring within and around the treatment process to provide situation-aware data-driven assistance. In the context of endoscopic video analysis, the accurate classification of organs in the field of view of the camera poses a technical challenge. Herein, we propose a new approach to anatomical structure classification and image tagging that features an intrinsic measure of confidence to estimate its own performance with high reliability and which can be applied to both RGB and multispectral imaging (MI) data. Methods: Organ recognition is performed using a superpixel classification strategy based on textural and reflectance information. Classification confidence is estimated by analyzing the dispersion of class probabilities. Assessment of the proposed technology is performed through a comprehensive in vivo study with seven pigs. Results: When applied to image tagging, mean accuracy in our experiments increased from 65% (RGB) and 80% (MI) to 90% (RGB) and 96% (MI) with the confidence measure. Conclusion: Results showed that the confidence measure had a significant influence on the classification accuracy, and MI data are better suited for anatomical structure labeling than RGB data. Significance: This work significantly enhances the state of the art in automatic labeling of endoscopic videos by introducing the use of the confidence metric, and by being the first study to use MI data for in vivo laparoscopic tissue classification. The data of our experiments will be released as the first in vivo MI dataset upon publication of this paper.
Tasks
Published 2017-06-21
URL http://arxiv.org/abs/1706.07002v2
PDF http://arxiv.org/pdf/1706.07002v2.pdf
PWC https://paperswithcode.com/paper/uncertainty-aware-organ-classification-for
Repo
Framework
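
The confidence measure above is derived from the dispersion of the per-superpixel class probabilities. One common way to operationalize such a measure, which may differ in detail from the paper's exact metric, is one minus the normalized entropy of the probability vector; the rejection threshold below is an arbitrary placeholder.

```python
import numpy as np

def confidence_from_probs(probs: np.ndarray) -> np.ndarray:
    """Confidence in [0, 1] from per-sample class probabilities (rows sum to 1).

    Uses 1 - normalized entropy: peaked distributions score close to 1,
    near-uniform distributions close to 0. This is one plausible dispersion
    measure, not necessarily the one used in the paper.
    """
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    return 1.0 - entropy / np.log(probs.shape[1])

probs = np.array([
    [0.96, 0.02, 0.02],   # confident prediction
    [0.40, 0.35, 0.25],   # ambiguous prediction
])
conf = confidence_from_probs(probs)
keep = conf > 0.5          # hypothetical rejection threshold
print(conf.round(3), keep)
```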

Greedy Structure Learning of Hierarchical Compositional Models

Title Greedy Structure Learning of Hierarchical Compositional Models
Authors Adam Kortylewski, Aleksander Wieczorek, Mario Wieser, Clemens Blumer, Sonali Parbhoo, Andreas Morel-Forster, Volker Roth, Thomas Vetter
Abstract In this work, we consider the problem of learning a hierarchical generative model of an object from a set of images which show examples of the object in the presence of variable background clutter. Existing approaches to this problem are limited by making strong a priori assumptions about the object’s geometric structure and require segmented training data for learning. In this paper, we propose a novel framework for learning hierarchical compositional models (HCMs) which do not suffer from the mentioned limitations. We present a generalized formulation of HCMs and describe a greedy structure learning framework that consists of two phases: bottom-up part learning and top-down model composition. Our framework integrates the foreground-background segmentation problem into the structure learning task via a background model. As a result, we can jointly optimize for the number of layers in the hierarchy, the number of parts per layer and a foreground-background segmentation based on class labels only. We show that the learned HCMs are semantically meaningful and achieve competitive results when compared to other generative object models on object classification on a standard transfer learning dataset.
Tasks Object Classification, Transfer Learning
Published 2017-01-22
URL http://arxiv.org/abs/1701.06171v4
PDF http://arxiv.org/pdf/1701.06171v4.pdf
PWC https://paperswithcode.com/paper/greedy-structure-learning-of-hierarchical
Repo
Framework

Technical Report: Implementation and Validation of a Smart Health Application

Title Technical Report: Implementation and Validation of a Smart Health Application
Authors Fran Casino, Constantinos Patsakis, Antoni Martinez-Balleste, Frederic Borras, Edgar Batista
Abstract In this article, we explain in detail the internal structures and databases of a smart health application. Moreover, we describe how to generate a statistically sound synthetic dataset using real-world medical data.
Tasks
Published 2017-06-13
URL http://arxiv.org/abs/1706.04109v1
PDF http://arxiv.org/pdf/1706.04109v1.pdf
PWC https://paperswithcode.com/paper/technical-report-implementation-and
Repo
Framework

A First Empirical Study of Emphatic Temporal Difference Learning

Title A First Empirical Study of Emphatic Temporal Difference Learning
Authors Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton
Abstract In this paper we present the first empirical study of the emphatic temporal-difference learning algorithm (ETD), comparing it with conventional temporal-difference learning, in particular, with linear TD(0), on on-policy and off-policy variations of the Mountain Car problem. The initial motivation for developing ETD was that it has good convergence properties under off-policy training (Sutton, Mahmood and White 2016), but it is also a new algorithm for the on-policy case. In both our on-policy and off-policy experiments, we found that each method converged to a characteristic asymptotic level of error, with ETD better than TD(0). TD(0) achieved a still lower error level temporarily before falling back to its higher asymptote, whereas ETD never showed this kind of “bounce”. In the off-policy case (in which TD(0) is not guaranteed to converge), ETD was significantly slower.
Tasks
Published 2017-05-11
URL http://arxiv.org/abs/1705.04185v2
PDF http://arxiv.org/pdf/1705.04185v2.pdf
PWC https://paperswithcode.com/paper/a-first-empirical-study-of-emphatic-temporal
Repo
Framework
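
For readers unfamiliar with ETD, a compact sketch of the linear ETD(0) update (following Sutton, Mahmood and White 2016, with the interest fixed to 1 and lambda = 0) is given below. The features, reward, and importance-sampling ratios are placeholder inputs rather than the Mountain Car setup used in the paper.

```python
import numpy as np

def etd0_step(w, x, x_next, reward, rho, rho_prev, followon,
              gamma=0.99, alpha=0.01, interest=1.0):
    """One linear ETD(0) update (after Sutton, Mahmood & White 2016), lambda = 0.

    rho      : importance ratio pi(a_t|s_t)/mu(a_t|s_t) for the current step
    rho_prev : ratio from the previous step (1.0 at the start of an episode)
    followon : followon trace F_{t-1} carried between calls
    """
    followon = gamma * rho_prev * followon + interest   # F_t
    emphasis = followon                                 # M_t = F_t when lambda = 0
    td_error = reward + gamma * w @ x_next - w @ x
    w = w + alpha * rho * emphasis * td_error * x
    return w, followon

# On-policy usage (rho = rho_prev = 1), with made-up two-dimensional features.
w, F = np.zeros(2), 0.0
w, F = etd0_step(w, np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                 reward=1.0, rho=1.0, rho_prev=1.0, followon=F)
print(w, F)
```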

Compressive Estimation of a Stochastic Process with Unknown Autocorrelation Function

Title Compressive Estimation of a Stochastic Process with Unknown Autocorrelation Function
Authors Mahdi Barzegar Khalilsarai, Saeid Haghighatshoar, Giuseppe Caire, Gerhard Wunder
Abstract In this paper, we study the prediction of a circularly symmetric zero-mean stationary Gaussian process from a window of observations consisting of finitely many samples. This is a prevalent problem in a wide range of applications in communication theory and signal processing. Due to stationarity, when the autocorrelation function or equivalently the power spectral density (PSD) of the process is available, the Minimum Mean Squared Error (MMSE) predictor is readily obtained. In particular, it is given by a linear operator that depends on autocorrelation of the process as well as the noise power in the observed samples. The prediction becomes, however, quite challenging when the PSD of the process is unknown. In this paper, we propose a blind predictor that does not require the a priori knowledge of the PSD of the process and compare its performance with that of an MMSE predictor that has a full knowledge of the PSD. To design such a blind predictor, we use the random spectral representation of a stationary Gaussian process. We apply the well-known atomic-norm minimization technique to the observed samples to obtain a discrete quantization of the underlying random spectrum, which we use to predict the process. Our simulation results show that this estimator has a good performance comparable with that of the MMSE estimator.
Tasks Quantization
Published 2017-05-09
URL http://arxiv.org/abs/1705.03420v1
PDF http://arxiv.org/pdf/1705.03420v1.pdf
PWC https://paperswithcode.com/paper/compressive-estimation-of-a-stochastic
Repo
Framework
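
When the autocorrelation function is known, the MMSE predictor mentioned in the abstract is the linear (Wiener-style) estimator sketched below. The exponential autocorrelation and noise level are assumptions chosen purely for illustration, and the blind atomic-norm predictor that is the paper's actual contribution is not reproduced here.

```python
import numpy as np

def mmse_predict(y, lags_obs, lag_pred, autocorr, noise_var):
    """Linear MMSE prediction of a zero-mean stationary process.

    y        : observed noisy samples taken at integer lags `lags_obs`
    lag_pred : lag at which to predict the clean process
    autocorr : function r(tau) giving the autocorrelation at lag tau
    """
    lags_obs = np.asarray(lags_obs)
    R = autocorr(np.abs(lags_obs[:, None] - lags_obs[None, :]))  # covariance of observations
    r = autocorr(np.abs(lag_pred - lags_obs))                    # cross-covariance with target
    weights = np.linalg.solve(R + noise_var * np.eye(len(y)), r)
    return weights @ y

# Illustration with an assumed exponential autocorrelation r(tau) = 0.9**|tau|.
rng = np.random.default_rng(1)
autocorr = lambda tau: 0.9 ** np.abs(tau)
y = rng.standard_normal(5)   # stand-in for a noisy observation window
print(mmse_predict(y, lags_obs=range(5), lag_pred=6, autocorr=autocorr, noise_var=0.1))
```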

Facial Expression Recognition using Visual Saliency and Deep Learning

Title Facial Expression Recognition using Visual Saliency and Deep Learning
Authors Viraj Mavani, Shanmuganathan Raman, Krishna P Miyapuram
Abstract We have developed a convolutional neural network for the purpose of recognizing facial expressions in human beings. We have fine-tuned the existing convolutional neural network model trained on the visual recognition dataset used in the ILSVRC2012 to two widely used facial expression datasets - CFEE and RaFD, which when trained and tested independently yielded test accuracies of 74.79% and 95.71%, respectively. Generalization of results was evident by training on one dataset and testing on the other. Further, the image product of the cropped faces and their visual saliency maps were computed using Deep Multi-Layer Network for saliency prediction and were fed to the facial expression recognition CNN. In the most generalized experiment, we observed the top-1 accuracy in the test set to be 65.39%. General confusion trends between different facial expressions as exhibited by humans were also observed.
Tasks Facial Expression Recognition, Saliency Prediction
Published 2017-08-26
URL http://arxiv.org/abs/1708.08016v1
PDF http://arxiv.org/pdf/1708.08016v1.pdf
PWC https://paperswithcode.com/paper/facial-expression-recognition-using-visual
Repo
Framework
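
The "image product of the cropped faces and their visual saliency maps" amounts to an element-wise weighting of the face crop by its saliency map before it is passed to the expression CNN. A rough sketch follows; the min-max normalization is an assumption, and random arrays stand in for a real face crop and a real saliency prediction.

```python
import numpy as np

def saliency_weighted_input(face_crop: np.ndarray, saliency: np.ndarray) -> np.ndarray:
    """Element-wise product of a face crop (H, W, 3) with a saliency map (H, W).

    The saliency map is rescaled to [0, 1] and broadcast over the colour channels;
    the exact normalization used in the paper may differ.
    """
    s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
    return face_crop * s[..., None]

rng = np.random.default_rng(0)
face = rng.random((224, 224, 3)).astype(np.float32)   # stand-in for a cropped face
sal = rng.random((224, 224)).astype(np.float32)       # stand-in for a predicted saliency map
weighted = saliency_weighted_input(face, sal)
print(weighted.shape)   # (224, 224, 3), ready for the expression CNN
```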

Faithful to the Original: Fact Aware Neural Abstractive Summarization

Title Faithful to the Original: Fact Aware Neural Abstractive Summarization
Authors Ziqiang Cao, Furu Wei, Wenjie Li, Sujian Li
Abstract Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on the improvement of informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parse technologies to extract actual fact descriptions from the source text. The dual-attention sequence-to-sequence framework is then proposed to force the generation to be conditioned on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can greatly reduce fake summaries by 80%. Notably, the fact descriptions also bring significant improvement on informativeness since they often condense the meaning of the source text.
Tasks Abstractive Text Summarization, Open Information Extraction
Published 2017-11-13
URL http://arxiv.org/abs/1711.04434v1
PDF http://arxiv.org/pdf/1711.04434v1.pdf
PWC https://paperswithcode.com/paper/faithful-to-the-original-fact-aware-neural
Repo
Framework
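
The dual-attention decoder conditions each generation step on two encoders, one over the source text and one over the extracted fact descriptions. The sketch below fuses the two attention contexts with a learned gate; this gated combination is an assumption made for illustration and may differ from the exact fusion the authors use.

```python
import numpy as np

def dual_attention_context(query, src_states, fact_states, w_gate):
    """Combine attention over source states and fact states into one context.

    query       : decoder state, shape (d,)
    src_states  : encoder states of the source text, shape (n_src, d)
    fact_states : encoder states of the extracted facts, shape (n_fact, d)
    w_gate      : gate parameters, shape (3 * d,)  (hypothetical parameterization)
    """
    def attend(states):
        scores = states @ query                       # dot-product attention
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ states                       # context vector, shape (d,)

    c_src, c_fact = attend(src_states), attend(fact_states)
    gate = 1.0 / (1.0 + np.exp(-w_gate @ np.concatenate([query, c_src, c_fact])))
    return gate * c_src + (1.0 - gate) * c_fact       # fused context for the decoder

rng = np.random.default_rng(0)
d = 8
ctx = dual_attention_context(rng.standard_normal(d),
                             rng.standard_normal((5, d)),
                             rng.standard_normal((3, d)),
                             rng.standard_normal(3 * d))
print(ctx.shape)   # (8,)
```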

Machine Learning for Survival Analysis: A Survey

Title Machine Learning for Survival Analysis: A Survey
Authors Ping Wang, Yan Li, Chandan K. Reddy
Abstract Accurately predicting the time of occurrence of an event of interest is a critical problem in longitudinal data analysis. One of the main challenges in this context is the presence of instances whose event outcomes become unobservable after a certain time point or when some instances do not experience any event during the monitoring period. Such a phenomenon is called censoring which can be effectively handled using survival analysis techniques. Traditionally, statistical approaches have been widely developed in the literature to overcome this censoring issue. In addition, many machine learning algorithms are adapted to effectively handle survival data and tackle other challenging problems that arise in real-world data. In this survey, we provide a comprehensive and structured review of the representative statistical methods along with the machine learning techniques used in survival analysis and provide a detailed taxonomy of the existing methods. We also discuss several topics that are closely related to survival analysis and illustrate several successful applications in various real-world application domains. We hope that this paper will provide a more thorough understanding of the recent advances in survival analysis and offer some guidelines on applying these approaches to solve new problems that arise in applications with censored data.
Tasks Survival Analysis
Published 2017-08-15
URL http://arxiv.org/abs/1708.04649v1
PDF http://arxiv.org/pdf/1708.04649v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-survival-analysis-a
Repo
Framework
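
As a concrete illustration of the censoring problem this survey is organized around, the Kaplan-Meier estimator below computes a survival curve from follow-up times in which some subjects are right-censored. The times and censoring flags are made-up toy data, and the estimator is a standard statistical baseline rather than anything specific to the survey.

```python
import numpy as np

def kaplan_meier(times, event_observed):
    """Kaplan-Meier survival curve for right-censored data.

    times          : array of follow-up times
    event_observed : 1 if the event occurred at that time, 0 if censored
    Returns (distinct event times, estimated survival probability after each).
    """
    times = np.asarray(times, dtype=float)
    event_observed = np.asarray(event_observed, dtype=bool)
    event_times = np.unique(times[event_observed])
    survival, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                       # still under observation
        deaths = np.sum((times == t) & event_observed)     # events exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy data: 0 marks a censored subject (lost to follow-up before any event).
t, s = kaplan_meier(times=[2, 3, 3, 5, 8, 8, 12],
                    event_observed=[1, 1, 0, 1, 0, 1, 0])
print(dict(zip(t, s.round(3))))
```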

Generative and Discriminative Text Classification with Recurrent Neural Networks

Title Generative and Discriminative Text Classification with Recurrent Neural Networks
Authors Dani Yogatama, Chris Dyer, Wang Ling, Phil Blunsom
Abstract We empirically characterize the performance of discriminative and generative LSTM models for text classification. We find that although RNN-based generative models are more powerful than their bag-of-words ancestors (e.g., they account for conditional dependencies across words in a document), they have higher asymptotic error rates than discriminatively trained RNN models. However we also find that generative models approach their asymptotic error rate more rapidly than their discriminative counterparts—the same pattern that Ng & Jordan (2001) proved holds for linear classification models that make more naive conditional independence assumptions. Building on this finding, we hypothesize that RNN-based generative classification models will be more robust to shifts in the data distribution. This hypothesis is confirmed in a series of experiments in zero-shot and continual learning settings that show that generative models substantially outperform discriminative models.
Tasks Continual Learning, Text Classification
Published 2017-03-06
URL http://arxiv.org/abs/1703.01898v2
PDF http://arxiv.org/pdf/1703.01898v2.pdf
PWC https://paperswithcode.com/paper/generative-and-discriminative-text
Repo
Framework
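
The generative classifier discussed above labels a document by combining a class-conditional language model with a class prior via Bayes' rule. In the sketch below, the paper's class-conditional LSTMs are replaced by simple unigram stand-ins; only the decision rule argmax_y [log p(x|y) + log p(y)] reflects the setup described in the abstract.

```python
import numpy as np

def generative_classify(doc_tokens, class_log_priors, class_lms):
    """Pick argmax_y [ log p(doc | y) + log p(y) ].

    class_lms maps each class to a function returning the log-likelihood of the
    token sequence under that class's language model; in the paper these are
    class-conditional LSTMs, here they are arbitrary stand-ins.
    """
    scores = {y: lm(doc_tokens) + class_log_priors[y] for y, lm in class_lms.items()}
    return max(scores, key=scores.get), scores

# Stand-in "language models": smoothed unigram log-likelihoods from fixed counts.
def make_unigram_lm(counts, vocab_size=1000, alpha=1.0):
    total = sum(counts.values()) + alpha * vocab_size
    return lambda tokens: sum(
        np.log((counts.get(tok, 0) + alpha) / total) for tok in tokens)

class_lms = {
    "sports": make_unigram_lm({"game": 50, "team": 40, "score": 30}),
    "politics": make_unigram_lm({"vote": 50, "policy": 40, "party": 30}),
}
label, scores = generative_classify(["team", "score", "game"],
                                     class_log_priors={"sports": np.log(0.5),
                                                       "politics": np.log(0.5)},
                                     class_lms=class_lms)
print(label)   # 'sports'
```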

Optimizing expected word error rate via sampling for speech recognition

Title Optimizing expected word error rate via sampling for speech recognition
Authors Matt Shannon
Abstract State-level minimum Bayes risk (sMBR) training has become the de facto standard for sequence-level training of speech recognition acoustic models. It has an elegant formulation using the expectation semiring, and gives large improvements in word error rate (WER) over models trained solely using cross-entropy (CE) or connectionist temporal classification (CTC). sMBR training optimizes the expected number of frames at which the reference and hypothesized acoustic states differ. It may be preferable to optimize the expected WER, but WER does not interact well with the expectation semiring, and previous approaches based on computing expected WER exactly involve expanding the lattices used during training. In this paper we show how to perform optimization of the expected WER by sampling paths from the lattices used during conventional sMBR training. The gradient of the expected WER is itself an expectation, and so may be approximated using Monte Carlo sampling. We show experimentally that optimizing WER during acoustic model training gives 5% relative improvement in WER over a well-tuned sMBR baseline on a 2-channel query recognition task (Google Home).
Tasks Speech Recognition
Published 2017-06-08
URL http://arxiv.org/abs/1706.02776v1
PDF http://arxiv.org/pdf/1706.02776v1.pdf
PWC https://paperswithcode.com/paper/optimizing-expected-word-error-rate-via
Repo
Framework
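
The key identity used here is that the gradient of an expected loss is itself an expectation, so it can be estimated by sampling hypotheses and weighting their log-probability gradients by the loss minus a baseline. The toy sketch below applies that estimator to a small categorical distribution over hypotheses rather than to real lattices, and the losses are made-up word-error counts.

```python
import numpy as np

def sampled_expected_loss_grad(logits, losses, n_samples=1000, rng=None):
    """Monte Carlo estimate of d E[loss] / d logits for a categorical model.

    grad = E[(loss(h) - baseline) * d log p(h) / d logits], estimated by sampling;
    the same identity underlies sampling paths from lattices in the paper.
    """
    rng = rng or np.random.default_rng(0)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    samples = rng.choice(len(logits), size=n_samples, p=probs)
    baseline = losses[samples].mean()                  # variance-reducing baseline
    grad = np.zeros_like(logits)
    for h in samples:
        dlogp = -probs.copy()
        dlogp[h] += 1.0                                # d log p(h) / d logits
        grad += (losses[h] - baseline) * dlogp
    return grad / n_samples

logits = np.array([0.0, 0.5, -0.5])
losses = np.array([2.0, 0.0, 3.0])   # made-up word-error counts per hypothesis
print(sampled_expected_loss_grad(logits, losses).round(3))
```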

Reconstruction from Periodic Nonlinearities, With Applications to HDR Imaging

Title Reconstruction from Periodic Nonlinearities, With Applications to HDR Imaging
Authors Viraj Shah, Mohammadreza Soltani, Chinmay Hegde
Abstract We consider the problem of reconstructing signals and images from periodic nonlinearities. For such problems, we design a measurement scheme that supports efficient reconstruction; moreover, our method can be adapted to extend to compressive sensing-based signal and image acquisition systems. Our techniques can be potentially useful for reducing the measurement complexity of high dynamic range (HDR) imaging systems, with little loss in reconstruction quality. Several numerical experiments on real data demonstrate the effectiveness of our approach.
Tasks Compressive Sensing
Published 2017-09-29
URL http://arxiv.org/abs/1710.00109v1
PDF http://arxiv.org/pdf/1710.00109v1.pdf
PWC https://paperswithcode.com/paper/reconstruction-from-periodic-nonlinearities
Repo
Framework
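
The measurement model in this line of work passes linear measurements of the signal through a periodic nonlinearity such as a modulo operation. A minimal sketch of that forward model (with made-up dimensions and period) is below; the paper's contribution is the recovery algorithm that inverts it, which is not reproduced here.

```python
import numpy as np

def periodic_measurements(x, A, period=1.0):
    """Forward model y = mod(A @ x, period): linear measurements passed
    through a periodic (modulo) nonlinearity, as in modulo/HDR-style sensing."""
    return np.mod(A @ x, period)

rng = np.random.default_rng(0)
n, m = 64, 128                      # signal length and number of measurements (illustrative)
x = rng.standard_normal(n) * 0.1    # unknown signal
A = rng.standard_normal((m, n)) / np.sqrt(n)
y = periodic_measurements(x, A)
print(y.shape, y.min() >= 0.0, y.max() < 1.0)
```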

Direct detection of pixel-level myocardial infarction areas via a deep-learning algorithm

Title Direct detection of pixel-level myocardial infarction areas via a deep-learning algorithm
Authors Chenchu Xu, Lei Xu, Zhifan Gao, Shen Zhao, Heye Zhang, Yanping Zhang, Xiuquan Du, Shu Zhao, Dhanjoo Ghista, Shuo Li
Abstract Accurate detection of the myocardial infarction (MI) area is crucial for early diagnosis planning and follow-up management. In this study, we propose an end-to-end deep-learning algorithm framework (OF-RNN ) to accurately detect the MI area at the pixel level. Our OF-RNN consists of three different function layers: the heart localization layers, which can accurately and automatically crop the region-of-interest (ROI) sequences, including the left ventricle, using the whole cardiac magnetic resonance image sequences; the motion statistical layers, which are used to build a time-series architecture to capture two types of motion features (at the pixel-level) by integrating the local motion features generated by long short-term memory-recurrent neural networks and the global motion features generated by deep optical flows from the whole ROI sequence, which can effectively characterize myocardial physiologic function; and the fully connected discriminate layers, which use stacked auto-encoders to further learn these features, and they use a softmax classifier to build the correspondences from the motion features to the tissue identities (infarction or not) for each pixel. Through the seamless connection of each layer, our OF-RNN can obtain the area, position, and shape of the MI for each patient. Our proposed framework yielded an overall classification accuracy of 94.35% at the pixel level, from 114 clinical subjects. These results indicate the potential of our proposed method in aiding standardized MI assessments.
Tasks Time Series
Published 2017-06-10
URL http://arxiv.org/abs/1706.03182v1
PDF http://arxiv.org/pdf/1706.03182v1.pdf
PWC https://paperswithcode.com/paper/direct-detection-of-pixel-level-myocardial
Repo
Framework

Eigen Evolution Pooling for Human Action Recognition

Title Eigen Evolution Pooling for Human Action Recognition
Authors Yang Wang, Vinh Tran, Minh Hoai
Abstract We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors. Eigen evolution pooling is designed to produce compact feature representations for a sequence of feature vectors, while maximally preserving as much information about the sequence as possible, especially the temporal evolution of the features over time. Eigen evolution pooling is a general pooling method that can be applied to any sequence of feature vectors, from low-level RGB values to high-level Convolutional Neural Network (CNN) feature vectors. We show that eigen evolution pooling is more effective than average, max, and rank pooling for encoding the dynamics of human actions in video. We demonstrate the power of eigen evolution pooling on UCF101 and Hollywood2 datasets, two human action recognition benchmarks, and achieve state-of-the-art performance.
Tasks Temporal Action Localization
Published 2017-08-17
URL http://arxiv.org/abs/1708.05465v1
PDF http://arxiv.org/pdf/1708.05465v1.pdf
PWC https://paperswithcode.com/paper/eigen-evolution-pooling-for-human-action
Repo
Framework
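
One plausible reading of eigen evolution pooling is that the (T, d) matrix of per-frame features is summarized by projecting it onto its leading temporal singular vectors, which capture the dominant evolution of the features over time. The sketch below implements that reading with an SVD; it is an interpretation of the idea, not the authors' code.

```python
import numpy as np

def eigen_evolution_pool(features, n_components=2):
    """Pool a (T, d) sequence of feature vectors into n_components vectors of size d.

    Projecting the sequence onto its top temporal singular vectors gives
    s_k * v_k, a fixed number of d-dimensional summary vectors; this is one
    way to realize the idea sketched in the abstract, not the exact method.
    """
    features = np.asarray(features, dtype=float)
    _, s, vt = np.linalg.svd(features, full_matrices=False)
    return s[:n_components, None] * vt[:n_components]   # shape (n_components, d)

rng = np.random.default_rng(0)
frame_features = rng.standard_normal((30, 512))   # e.g. 30 per-frame CNN descriptors
pooled = eigen_evolution_pool(frame_features, n_components=2)
print(pooled.shape)   # (2, 512) fixed-size representation of the clip
```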