May 7, 2019

Paper Group ANR 24

Semi-Supervised Phone Classification using Deep Neural Networks and Stochastic Graph-Based Entropic Regularization. Toward Interpretable Topic Discovery via Anchored Correlation Explanation. Recycle deep features for better object detection. Deep vs. shallow networks : An approximation theory perspective. Preorder-Based Triangle: A Modified Version …

Semi-Supervised Phone Classification using Deep Neural Networks and Stochastic Graph-Based Entropic Regularization


Title	Semi-Supervised Phone Classification using Deep Neural Networks and Stochastic Graph-Based Entropic Regularization
Authors	Sunil Thulasidasan, Jeffrey Bilmes
Abstract	We describe a graph-based semi-supervised learning framework in the context of deep neural networks that uses a graph-based entropic regularizer to favor smooth solutions over a graph induced by the data. The main contribution of this work is a computationally efficient, stochastic graph-regularization technique that uses mini-batches that are consistent with the graph structure, but also provides enough stochasticity (in terms of mini-batch data diversity) for convergence of stochastic gradient descent methods to good solutions. For this work, we focus on results of frame-level phone classification accuracy on the TIMIT speech corpus but our method is general and scalable to much larger data sets. Results indicate that our method significantly improves classification accuracy compared to the fully-supervised case when the fraction of labeled data is low, and it is competitive with other methods in the fully labeled case.
Tasks
Published	2016-12-15
URL	http://arxiv.org/abs/1612.04899v2
PDF	http://arxiv.org/pdf/1612.04899v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-phone-classification-using
Repo
Framework

Toward Interpretable Topic Discovery via Anchored Correlation Explanation


Title	Toward Interpretable Topic Discovery via Anchored Correlation Explanation
Authors	Kyle Reing, David C. Kale, Greg Ver Steeg, Aram Galstyan
Abstract	Many predictive tasks, such as diagnosing a patient based on their medical chart, are ultimately defined by the decisions of human experts. Unfortunately, encoding experts’ knowledge is often time consuming and expensive. We propose a simple way to use fuzzy and informal knowledge from experts to guide discovery of interpretable latent topics in text. The underlying intuition of our approach is that latent factors should be informative about both correlations in the data and a set of relevance variables specified by an expert. Mathematically, this approach is a combination of the information bottleneck and Total Correlation Explanation (CorEx). We give a preliminary evaluation of Anchored CorEx, showing that it produces more coherent and interpretable topics on two distinct corpora.
Tasks
Published	2016-06-22
URL	http://arxiv.org/abs/1606.07043v1
PDF	http://arxiv.org/pdf/1606.07043v1.pdf
PWC	https://paperswithcode.com/paper/toward-interpretable-topic-discovery-via
Repo
Framework

Recycle deep features for better object detection


Title	Recycle deep features for better object detection
Authors	Wei Li, Matthias Breier, Dorit Merhof
Abstract	Aiming at improving the performance of existing detection algorithms developed for different applications, we propose a region regression-based multi-stage class-agnostic detection pipeline, whereby the existing algorithms are employed for providing the initial detection proposals. Better detection is obtained by exploiting the power of deep learning in the region regression scheme while avoiding the requirement for a huge amount of reference data for training deep neural networks. Additionally, a novel network architecture with recycled deep features is proposed, which provides superior regression results compared to the commonly used architectures. As demonstrated on a data set with ~1200 samples of different classes, it is feasible to successfully train a deep neural network in our proposed architecture and use it to obtain the desired detection performance. Since only slight modifications are required to common network architectures and since the deep neural network is trained using the standard hyperparameters, the proposed detection is readily accessible and can be easily adapted to a broad variety of detection tasks.
Tasks	Object Detection
Published	2016-07-18
URL	http://arxiv.org/abs/1607.05066v1
PDF	http://arxiv.org/pdf/1607.05066v1.pdf
PWC	https://paperswithcode.com/paper/recycle-deep-features-for-better-object
Repo
Framework

Deep vs. shallow networks : An approximation theory perspective


Title	Deep vs. shallow networks : An approximation theory perspective
Authors	Hrushikesh Mhaskar, Tomaso Poggio
Abstract	The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden-layer architectures. The paper announces new results for a non-smooth activation function - the ReLU function - used in present-day neural networks, as well as for the Gaussian networks. We propose a new definition of relative dimension to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning.
Tasks
Published	2016-08-10
URL	http://arxiv.org/abs/1608.03287v1
PDF	http://arxiv.org/pdf/1608.03287v1.pdf
PWC	https://paperswithcode.com/paper/deep-vs-shallow-networks-an-approximation
Repo
Framework

Preorder-Based Triangle: A Modified Version of Bilattice-Based Triangle for Belief Revision in Nonmonotonic Reasoning


Title	Preorder-Based Triangle: A Modified Version of Bilattice-Based Triangle for Belief Revision in Nonmonotonic Reasoning
Authors	Kumar Sankar Ray, Sandip Paul, Diganta Saha
Abstract	Bilattice-based triangle provides an elegant algebraic structure for reasoning with vague and uncertain information. But the truth and knowledge ordering of intervals in bilattice-based triangle cannot handle repetitive belief revisions, which are an essential characteristic of nonmonotonic reasoning. Moreover, the ordering induced over the intervals by the bilattice-based triangle is sometimes not intuitive. In this work, we construct an alternative algebraic structure, namely preorder-based triangle, and we formulate proper logical connectives for it. It is also demonstrated that preorder-based triangle serves as a better alternative to the bilattice-based triangle for reasoning in application areas that involve nonmonotonic fuzzy reasoning with uncertain information.
Tasks
Published	2016-09-19
URL	http://arxiv.org/abs/1609.05616v4
PDF	http://arxiv.org/pdf/1609.05616v4.pdf
PWC	https://paperswithcode.com/paper/preorder-based-triangle-a-modified-version-of
Repo
Framework

A Novel Online Real-time Classifier for Multi-label Data Streams


Title	A Novel Online Real-time Classifier for Multi-label Data Streams
Authors	Rajasekar Venkatesan, Meng Joo Er, Shiqian Wu, Mahardhika Pratama
Abstract	In this paper, a novel extreme learning machine based online multi-label classifier for real-time data streams is proposed. Multi-label classification is one of the actively researched machine learning paradigms that has gained much attention in recent years due to its rapidly increasing real-world applications. In contrast to traditional binary and multi-class classification, multi-label classification involves association of each of the input samples with a set of target labels simultaneously. There are no real-time online neural network based multi-label classifiers available in the literature. In this paper, we exploit the inherent nature of high speed exhibited by the extreme learning machines to develop a novel online real-time classifier for multi-label data streams. The developed classifier is evaluated on datasets from different application domains for consistency, performance and speed. The experimental studies show that the proposed method outperforms the existing state-of-the-art techniques in terms of speed and accuracy and can classify multi-label data streams in real-time.
Tasks	Multi-Label Classification
Published	2016-08-31
URL	http://arxiv.org/abs/1608.08905v1
PDF	http://arxiv.org/pdf/1608.08905v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-online-real-time-classifier-for-multi
Repo
Framework

A Reduction for Optimizing Lattice Submodular Functions with Diminishing Returns


Title	A Reduction for Optimizing Lattice Submodular Functions with Diminishing Returns
Authors	Alina Ene, Huy L. Nguyen
Abstract	A function $f: \mathbb{Z}_+^E \rightarrow \mathbb{R}_+$ is DR-submodular if it satisfies $f({\bf x} + \chi_i) - f({\bf x}) \ge f({\bf y} + \chi_i) - f({\bf y})$ for all ${\bf x}\le {\bf y}, i\in E$. Recently, the problem of maximizing a DR-submodular function $f: \mathbb{Z}_+^E \rightarrow \mathbb{R}_+$ subject to a budget constraint $\|{\bf x}\|_1 \leq B$ as well as additional constraints has received significant attention \cite{SKIK14,SY15,MYK15,SY16}. In this note, we give a generic reduction from the DR-submodular setting to the submodular setting. The running time of the reduction and the size of the resulting submodular instance depend only logarithmically on $B$. Using this reduction, one can translate the results for unconstrained and constrained submodular maximization to the DR-submodular setting for many types of constraints in a unified manner.
Tasks
Published	2016-06-27
URL	http://arxiv.org/abs/1606.08362v2
PDF	http://arxiv.org/pdf/1606.08362v2.pdf
PWC	https://paperswithcode.com/paper/a-reduction-for-optimizing-lattice-submodular
Repo
Framework
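
The DR-submodular inequality quoted in the abstract above can be checked by brute force on a small integer box. The sketch below is illustrative only and is not the reduction described in the paper; the function names (`is_dr_submodular`, `concave_of_sum`, `convex_of_sum`) and the box bound `max_val` are hypothetical choices made for this example.

```python
from itertools import product
import math


def is_dr_submodular(f, n, max_val, tol=1e-12):
    """Brute-force check of f(x + e_i) - f(x) >= f(y + e_i) - f(y)
    for all x <= y (coordinate-wise) in {0, ..., max_val}^n and all i."""
    points = list(product(range(max_val + 1), repeat=n))
    for x in points:
        for y in points:
            # Only pairs with x <= y in every coordinate are constrained.
            if any(a > b for a, b in zip(x, y)):
                continue
            for i in range(n):
                x_plus = x[:i] + (x[i] + 1,) + x[i + 1:]
                y_plus = y[:i] + (y[i] + 1,) + y[i + 1:]
                if f(x_plus) - f(x) < f(y_plus) - f(y) - tol:
                    return False
    return True


def concave_of_sum(x):
    # A concave function of the coordinate sum has diminishing marginal gains.
    return math.sqrt(sum(x))


def convex_of_sum(x):
    # A convex function of the coordinate sum has increasing marginal gains.
    return float(sum(x)) ** 2


print(is_dr_submodular(concave_of_sum, n=2, max_val=3))
print(is_dr_submodular(convex_of_sum, n=2, max_val=3))
```

The check enumerates every pair of lattice points, so it is only practical for tiny instances. It is meant to make the definition concrete, not to serve as an optimization routine.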

Scope for Machine Learning in Digital Manufacturing


Title	Scope for Machine Learning in Digital Manufacturing
Authors	Martin Baumers, Ender Ozcan
Abstract	This provocation paper provides an overview of the underlying optimisation problem in the emerging field of Digital Manufacturing. Initially, this paper discusses how the notion of Digital Manufacturing is transforming from a term describing a suite of software tools for the integration of production and design functions towards a more general concept incorporating computerised manufacturing and supply chain processes, as well as information collection and utilisation across the product life cycle. On this basis, we use the example of one such manufacturing process, Additive Manufacturing, to identify an integrated multi-objective optimisation problem underlying Digital Manufacturing. Forming an opportunity for a concurrent application of data science and optimisation, a set of challenges arising from this problem is outlined.
Tasks
Published	2016-09-19
URL	http://arxiv.org/abs/1609.05835v1
PDF	http://arxiv.org/pdf/1609.05835v1.pdf
PWC	https://paperswithcode.com/paper/scope-for-machine-learning-in-digital
Repo
Framework

Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders


Title	Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders
Authors	Will Grathwohl, Aaron Wilson
Abstract	There are many forms of feature information present in video data. Principal among them are object identity information, which is largely static across multiple video frames, and object pose and style information, which continuously transforms from frame to frame. Most existing models confound these two types of representation by mapping them to a shared feature space. In this paper we propose a probabilistic approach for learning separable representations of object identity and pose information using unsupervised video data. Our approach leverages a deep generative model with a factored prior distribution that encodes properties of temporal invariances in the hidden feature set. Learning is achieved via variational inference. We present results of learning identity and pose information on a dataset of moving characters as well as a dataset of rotating 3D objects. Our experimental results demonstrate our model’s success in factoring its representation, and demonstrate that the model achieves improved performance in transfer learning tasks.
Tasks	Transfer Learning
Published	2016-12-14
URL	http://arxiv.org/abs/1612.04440v2
PDF	http://arxiv.org/pdf/1612.04440v2.pdf
PWC	https://paperswithcode.com/paper/disentangling-space-and-time-in-video-with
Repo
Framework

Text Flow: A Unified Text Detection System in Natural Scene Images


Title	Text Flow: A Unified Text Detection System in Natural Scene Images
Authors	Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan
Abstract	The prevalent scene text detection approach follows four sequential steps comprising character candidate detection, false character candidate removal, text line extraction, and text line verification. However, errors occur and accumulate throughout each of these sequential steps, which often leads to low detection performance. To address these issues, we propose a unified scene text detection system, namely Text Flow, by utilizing the minimum cost (min-cost) flow network model. With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively. The proposed technique has been tested on three public datasets, i.e., the ICDAR2011 dataset, the ICDAR2013 dataset and a multilingual dataset, and it outperforms the state-of-the-art methods on all three datasets with much higher recall and F-score. The good performance on the multilingual dataset shows that the proposed technique can be used for the detection of texts in different languages.
Tasks	Scene Text Detection
Published	2016-04-23
URL	http://arxiv.org/abs/1604.06877v1
PDF	http://arxiv.org/pdf/1604.06877v1.pdf
PWC	https://paperswithcode.com/paper/text-flow-a-unified-text-detection-system-in
Repo
Framework

On Coreset Constructions for the Fuzzy $K$-Means Problem


Title	On Coreset Constructions for the Fuzzy $K$-Means Problem
Authors	Johannes Blömer, Sascha Brauer, Kathrin Bujna
Abstract	The fuzzy $K$-means problem is a popular generalization of the well-known $K$-means problem to soft clusterings. We present the first coresets for fuzzy $K$-means with size linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points. We show that these coresets can be employed in the computation of a $(1+\epsilon)$-approximation for fuzzy $K$-means, improving previously presented results. We further show that our coresets can be maintained in an insertion-only streaming setting, where data points arrive one-by-one.
Tasks
Published	2016-12-22
URL	http://arxiv.org/abs/1612.07516v3
PDF	http://arxiv.org/pdf/1612.07516v3.pdf
PWC	https://paperswithcode.com/paper/on-coreset-constructions-for-the-fuzzy-k
Repo
Framework
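
The abstract above does not state the objective, so as context here is a minimal sketch of the usual textbook fuzzy $K$-means cost, $\sum_n \sum_k r_{nk}^m \|x_n - \mu_k\|^2$ with memberships $r_{nk} \ge 0$ summing to 1 over $k$. This formulation is an assumption and may differ in detail from the one used in the paper. The function name `fuzzy_kmeans_cost` and the fuzzifier default `m=2.0` are hypothetical choices for this example.

```python
def fuzzy_kmeans_cost(points, centers, memberships, m=2.0):
    """Standard fuzzy K-means cost: sum over points n and clusters k of
    r[n][k]**m * squared Euclidean distance between point n and center k.
    Assumes each row of `memberships` is non-negative and sums to 1."""
    cost = 0.0
    for x, r_row in zip(points, memberships):
        for c, r in zip(centers, r_row):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, c))
            cost += (r ** m) * sq_dist
    return cost


points = [(0.0, 0.0), (2.0, 0.0)]
centers = [(0.0, 0.0), (2.0, 0.0)]

# Hard (0/1) memberships recover the ordinary K-means cost.
hard = [[1.0, 0.0], [0.0, 1.0]]
# Soft memberships spread each point's weight over both clusters.
soft = [[0.5, 0.5], [0.5, 0.5]]

print(fuzzy_kmeans_cost(points, centers, hard))  # 0.0
print(fuzzy_kmeans_cost(points, centers, soft))  # 2.0
```

With 0/1 memberships, $r^m = r$ and the cost reduces to the $K$-means objective, which is the sense in which fuzzy $K$-means generalizes $K$-means to soft clusterings.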

PAC-Bayesian Theory Meets Bayesian Inference


Title	PAC-Bayesian Theory Meets Bayesian Inference
Authors	Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien
Abstract	We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam’s razor criteria, under the assumption that the data is generated by an i.i.d. distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks.
Tasks	Bayesian Inference
Published	2016-05-27
URL	http://arxiv.org/abs/1605.08636v4
PDF	http://arxiv.org/pdf/1605.08636v4.pdf
PWC	https://paperswithcode.com/paper/pac-bayesian-theory-meets-bayesian-inference
Repo
Framework

Structured Sparse Regression via Greedy Hard-Thresholding


Title	Structured Sparse Regression via Greedy Hard-Thresholding
Authors	Prateek Jain, Nikhil Rao, Inderjit Dhillon
Abstract	Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP-hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can not only be avoided by appealing to submodular optimization, but such methods come with strong theoretical guarantees even in the presence of poorly conditioned data (e.g., when two features have correlation $\geq 0.99$), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity.
Tasks
Published	2016-02-19
URL	http://arxiv.org/abs/1602.06042v2
PDF	http://arxiv.org/pdf/1602.06042v2.pdf
PWC	https://paperswithcode.com/paper/structured-sparse-regression-via-greedy-hard
Repo
Framework
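
To make the hard thresholding methods mentioned in the abstract above concrete, here is a minimal sketch of plain (non-group) iterative hard thresholding for least squares. It is not the paper's greedy group-sparse algorithm: the projection simply keeps the `k` largest-magnitude coefficients, which is the easy case that becomes NP-hard for overlapping groups. The names `hard_threshold` and `iht_step` and the toy data are hypothetical.

```python
def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [v[i] if i in keep else 0.0 for i in range(len(v))]


def iht_step(X, y, w, k, step):
    """One iterative-hard-thresholding step for 0.5 * ||X w - y||^2:
    a gradient step followed by projection onto k-sparse vectors."""
    residual = [sum(xij * wj for xij, wj in zip(row, w)) - yi
                for row, yi in zip(X, y)]
    # Gradient of the least-squares loss is X^T (X w - y).
    grad = [sum(X[r][j] * residual[r] for r in range(len(X)))
            for j in range(len(w))]
    return hard_threshold([wj - step * gj for wj, gj in zip(w, grad)], k)


# Toy problem with an identity design matrix, so one unit step lands on y
# before thresholding.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.1, -2.0]
w = iht_step(X, y, [0.0, 0.0, 0.0], k=2, step=1.0)
print(w)  # [3.0, 0.0, -2.0]
```

In practice the step would be repeated until the iterate stops changing, with a step size chosen according to the conditioning of `X`.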

Latent Model Ensemble with Auto-localization


Title	Latent Model Ensemble with Auto-localization
Authors	Miao Sun, Tony X. Han, Xun Xu, Ming-Chang Liu, Ahmad Khodayari-Rostamabad
Abstract	Deep Convolutional Neural Networks (CNN) have exhibited superior performance in many visual recognition tasks including image classification, object detection, and scene labeling, due to their large learning capacity and resistance to overfitting. For the image classification task, most of the current deep CNN-based approaches take the whole size-normalized image as input and have achieved quite promising results. Compared with the previously dominating approaches based on feature extraction, pooling, and classification, the deep CNN-based approaches mainly rely on the learning capability of deep CNN to achieve superior results: the burden of minimizing intra-class variation while maximizing inter-class difference is entirely dependent on the implicit feature learning component of deep CNN; we rely upon the implicitly learned filters and pooling component to select the discriminative regions, which correspond to the activated neurons. However, if the irrelevant regions constitute a large portion of the image of interest, the classification performance of the deep CNN, which takes the whole image as input, can be heavily affected. To solve this issue, we propose a novel latent CNN framework, which treats the most discriminative region as a latent variable. We can jointly learn the global CNN with the latent CNN to avoid the aforementioned big irrelevant region issue, and our experimental results show the evident advantage of the proposed latent CNN over traditional deep CNN: latent CNN outperforms the state-of-the-art performance of deep CNN on standard benchmark datasets including the CIFAR-10, CIFAR-100, MNIST and PASCAL VOC 2007 Classification dataset.
Tasks	Image Classification, Object Detection
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04333v2
PDF	http://arxiv.org/pdf/1604.04333v2.pdf
PWC	https://paperswithcode.com/paper/latent-model-ensemble-with-auto-localization
Repo
Framework

Towards End-to-End Audio-Sheet-Music Retrieval


Title	Towards End-to-End Audio-Sheet-Music Retrieval
Authors	Matthias Dorfer, Andreas Arzt, Gerhard Widmer
Abstract	This paper demonstrates the feasibility of learning to retrieve short snippets of sheet music (images) when given a short query excerpt of music (audio), and vice versa, without any symbolic representation of music or scores. This would be highly useful in many content-based musical retrieval scenarios. Our approach is based on Deep Canonical Correlation Analysis (DCCA) and learns correlated latent spaces allowing for cross-modality retrieval in both directions. Initial experiments with relatively simple monophonic music show promising results.
Tasks
Published	2016-12-15
URL	http://arxiv.org/abs/1612.05070v1
PDF	http://arxiv.org/pdf/1612.05070v1.pdf
PWC	https://paperswithcode.com/paper/towards-end-to-end-audio-sheet-music
Repo
Framework