Paper Group ANR 24
Semi-Supervised Phone Classification using Deep Neural Networks and Stochastic Graph-Based Entropic Regularization
Title | Semi-Supervised Phone Classification using Deep Neural Networks and Stochastic Graph-Based Entropic Regularization |
Authors | Sunil Thulasidasan, Jeffrey Bilmes |
Abstract | We describe a graph-based semi-supervised learning framework in the context of deep neural networks that uses a graph-based entropic regularizer to favor smooth solutions over a graph induced by the data. The main contribution of this work is a computationally efficient, stochastic graph-regularization technique that uses mini-batches that are consistent with the graph structure, but also provides enough stochasticity (in terms of mini-batch data diversity) for convergence of stochastic gradient descent methods to good solutions. For this work, we focus on results of frame-level phone classification accuracy on the TIMIT speech corpus but our method is general and scalable to much larger data sets. Results indicate that our method significantly improves classification accuracy compared to the fully-supervised case when the fraction of labeled data is low, and it is competitive with other methods in the fully labeled case. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.04899v2 |
http://arxiv.org/pdf/1612.04899v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-phone-classification-using |
Repo | |
Framework | |
Toward Interpretable Topic Discovery via Anchored Correlation Explanation
Title | Toward Interpretable Topic Discovery via Anchored Correlation Explanation |
Authors | Kyle Reing, David C. Kale, Greg Ver Steeg, Aram Galstyan |
Abstract | Many predictive tasks, such as diagnosing a patient based on their medical chart, are ultimately defined by the decisions of human experts. Unfortunately, encoding experts’ knowledge is often time consuming and expensive. We propose a simple way to use fuzzy and informal knowledge from experts to guide discovery of interpretable latent topics in text. The underlying intuition of our approach is that latent factors should be informative about both correlations in the data and a set of relevance variables specified by an expert. Mathematically, this approach is a combination of the information bottleneck and Total Correlation Explanation (CorEx). We give a preliminary evaluation of Anchored CorEx, showing that it produces more coherent and interpretable topics on two distinct corpora. |
Tasks | |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.07043v1 |
http://arxiv.org/pdf/1606.07043v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-interpretable-topic-discovery-via |
Repo | |
Framework | |
Recycle deep features for better object detection
Title | Recycle deep features for better object detection |
Authors | Wei Li, Matthias Breier, Dorit Merhof |
Abstract | Aiming at improving the performance of existing detection algorithms developed for different applications, we propose a region regression-based multi-stage class-agnostic detection pipeline, whereby the existing algorithms are employed for providing the initial detection proposals. Better detection is obtained by exploiting the power of deep learning in the region regression scheme while avoiding the need for a huge amount of reference data for training deep neural networks. Additionally, a novel network architecture with recycled deep features is proposed, which provides superior regression results compared to the commonly used architectures. As demonstrated on a data set with ~1200 samples of different classes, it is feasible to successfully train a deep neural network in our proposed architecture and use it to obtain the desired detection performance. Since only slight modifications are required to common network architectures and since the deep neural network is trained using the standard hyperparameters, the proposed detection is readily accessible and can be easily adapted to a broad variety of detection tasks. |
Tasks | Object Detection |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.05066v1 |
http://arxiv.org/pdf/1607.05066v1.pdf | |
PWC | https://paperswithcode.com/paper/recycle-deep-features-for-better-object |
Repo | |
Framework | |
Deep vs. shallow networks : An approximation theory perspective
Title | Deep vs. shallow networks : An approximation theory perspective |
Authors | Hrushikesh Mhaskar, Tomaso Poggio |
Abstract | The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden-layer architectures. The paper announces new results for a non-smooth activation function - the ReLU function - used in present-day neural networks, as well as for the Gaussian networks. We propose a new definition of relative dimension to encapsulate different notions of sparsity of a function class that can possibly be exploited by deep networks but not by shallow ones to drastically reduce the complexity required for approximation and learning. |
Tasks | |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03287v1 |
http://arxiv.org/pdf/1608.03287v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-vs-shallow-networks-an-approximation |
Repo | |
Framework | |
Preorder-Based Triangle: A Modified Version of Bilattice-Based Triangle for Belief Revision in Nonmonotonic Reasoning
Title | Preorder-Based Triangle: A Modified Version of Bilattice-Based Triangle for Belief Revision in Nonmonotonic Reasoning |
Authors | Kumar Sankar Ray, Sandip Paul, Diganta Saha |
Abstract | The bilattice-based triangle provides an elegant algebraic structure for reasoning with vague and uncertain information. But the truth and knowledge orderings of intervals in the bilattice-based triangle cannot handle repetitive belief revisions, which are an essential characteristic of nonmonotonic reasoning. Moreover, the ordering induced over the intervals by the bilattice-based triangle is sometimes not intuitive. In this work, we construct an alternative algebraic structure, namely the preorder-based triangle, and we formulate proper logical connectives for it. It is also demonstrated that the preorder-based triangle is a better alternative to the bilattice-based triangle for reasoning in application areas that involve nonmonotonic fuzzy reasoning with uncertain information. |
Tasks | |
Published | 2016-09-19 |
URL | http://arxiv.org/abs/1609.05616v4 |
http://arxiv.org/pdf/1609.05616v4.pdf | |
PWC | https://paperswithcode.com/paper/preorder-based-triangle-a-modified-version-of |
Repo | |
Framework | |
A Novel Online Real-time Classifier for Multi-label Data Streams
Title | A Novel Online Real-time Classifier for Multi-label Data Streams |
Authors | Rajasekar Venkatesan, Meng Joo Er, Shiqian Wu, Mahardhika Pratama |
Abstract | In this paper, a novel extreme learning machine based online multi-label classifier for real-time data streams is proposed. Multi-label classification is one of the actively researched machine learning paradigms that has gained much attention in recent years due to its rapidly increasing real-world applications. In contrast to traditional binary and multi-class classification, multi-label classification involves association of each of the input samples with a set of target labels simultaneously. There is no real-time online neural network based multi-label classifier available in the literature. In this paper, we exploit the inherent high speed of extreme learning machines to develop a novel online real-time classifier for multi-label data streams. The developed classifier is evaluated on datasets from different application domains for consistency, performance and speed. The experimental studies show that the proposed method outperforms the existing state-of-the-art techniques in terms of speed and accuracy and can classify multi-label data streams in real time. |
Tasks | Multi-Label Classification |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08905v1 |
http://arxiv.org/pdf/1608.08905v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-online-real-time-classifier-for-multi |
Repo | |
Framework | |
A Reduction for Optimizing Lattice Submodular Functions with Diminishing Returns
Title | A Reduction for Optimizing Lattice Submodular Functions with Diminishing Returns |
Authors | Alina Ene, Huy L. Nguyen |
Abstract | A function $f: \mathbb{Z}_+^E \rightarrow \mathbb{R}_+$ is DR-submodular if it satisfies $f({\bf x} + \chi_i) - f({\bf x}) \ge f({\bf y} + \chi_i) - f({\bf y})$ for all ${\bf x}\le {\bf y}, i\in E$. Recently, the problem of maximizing a DR-submodular function $f: \mathbb{Z}_+^E \rightarrow \mathbb{R}_+$ subject to a budget constraint $\lVert {\bf x} \rVert_1 \leq B$ as well as additional constraints has received significant attention \cite{SKIK14,SY15,MYK15,SY16}. In this note, we give a generic reduction from the DR-submodular setting to the submodular setting. The running time of the reduction and the size of the resulting submodular instance depend only \emph{logarithmically} on $B$. Using this reduction, one can translate the results for unconstrained and constrained submodular maximization to the DR-submodular setting for many types of constraints in a unified manner. |
Tasks | |
Published | 2016-06-27 |
URL | http://arxiv.org/abs/1606.08362v2 |
http://arxiv.org/pdf/1606.08362v2.pdf | |
PWC | https://paperswithcode.com/paper/a-reduction-for-optimizing-lattice-submodular |
Repo | |
Framework | |
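The DR-submodular inequality quoted in the abstract above can be checked by brute force on a small lattice. The following is a minimal sketch, not the paper's reduction: the function names, the box bound `max_value`, and the two test functions are illustrative assumptions, and the unit vector $\chi_i$ is realized by adding 1 to coordinate `i`.

```python
from itertools import product
from math import sqrt


def is_dr_submodular(f, n, max_value, tol=1e-12):
    """Brute-force check of f(x + e_i) - f(x) >= f(y + e_i) - f(y) for all x <= y.

    Only points with coordinates in {0, ..., max_value} are tested, so a True
    result is evidence on that box, not a proof for the whole lattice.
    """
    points = list(product(range(max_value + 1), repeat=n))
    for x in points:
        for y in points:
            # The inequality only constrains componentwise-ordered pairs x <= y.
            if any(a > b for a, b in zip(x, y)):
                continue
            for i in range(n):
                x_up = x[:i] + (x[i] + 1,) + x[i + 1:]
                y_up = y[:i] + (y[i] + 1,) + y[i + 1:]
                if f(x_up) - f(x) < f(y_up) - f(y) - tol:
                    return False
    return True


def concave_of_sum(x):
    # Concave function of the coordinate sum: marginal gains shrink as x grows.
    return sqrt(sum(x))


def square_of_sum(x):
    # Convex function of the coordinate sum: marginal gains grow, so DR fails.
    return float(sum(x) ** 2)
```

For example, `is_dr_submodular(concave_of_sum, 2, 3)` holds on the tested box, while `square_of_sum` violates the inequality already at `x = (0, 0)`, `y = (1, 0)`.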
Scope for Machine Learning in Digital Manufacturing
Title | Scope for Machine Learning in Digital Manufacturing |
Authors | Martin Baumers, Ender Ozcan |
Abstract | This provocation paper provides an overview of the underlying optimisation problem in the emerging field of Digital Manufacturing. Initially, this paper discusses how the notion of Digital Manufacturing is transforming from a term describing a suite of software tools for the integration of production and design functions towards a more general concept incorporating computerised manufacturing and supply chain processes, as well as information collection and utilisation across the product life cycle. On this basis, we use the example of one such manufacturing process, Additive Manufacturing, to identify an integrated multi-objective optimisation problem underlying Digital Manufacturing. Forming an opportunity for a concurrent application of data science and optimisation, a set of challenges arising from this problem is outlined. |
Tasks | |
Published | 2016-09-19 |
URL | http://arxiv.org/abs/1609.05835v1 |
http://arxiv.org/pdf/1609.05835v1.pdf | |
PWC | https://paperswithcode.com/paper/scope-for-machine-learning-in-digital |
Repo | |
Framework | |
Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders
Title | Disentangling Space and Time in Video with Hierarchical Variational Auto-encoders |
Authors | Will Grathwohl, Aaron Wilson |
Abstract | There are many forms of feature information present in video data. Principal among them are object identity information, which is largely static across multiple video frames, and object pose and style information, which continuously transforms from frame to frame. Most existing models confound these two types of representation by mapping them to a shared feature space. In this paper we propose a probabilistic approach for learning separable representations of object identity and pose information using unsupervised video data. Our approach leverages a deep generative model with a factored prior distribution that encodes properties of temporal invariances in the hidden feature set. Learning is achieved via variational inference. We present results of learning identity and pose information on a dataset of moving characters as well as a dataset of rotating 3D objects. Our experimental results demonstrate our model’s success in factoring its representation, and demonstrate that the model achieves improved performance in transfer learning tasks. |
Tasks | Transfer Learning |
Published | 2016-12-14 |
URL | http://arxiv.org/abs/1612.04440v2 |
http://arxiv.org/pdf/1612.04440v2.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-space-and-time-in-video-with |
Repo | |
Framework | |
Text Flow: A Unified Text Detection System in Natural Scene Images
Title | Text Flow: A Unified Text Detection System in Natural Scene Images |
Authors | Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan |
Abstract | The prevalent scene text detection approach follows four sequential steps comprising character candidate detection, false character candidate removal, text line extraction, and text line verification. However, errors occur and accumulate throughout each of these sequential steps, which often leads to low detection performance. To address these issues, we propose a unified scene text detection system, namely Text Flow, by utilizing the minimum cost (min-cost) flow network model. With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively. The proposed technique has been tested on three public datasets, i.e., the ICDAR2011 dataset, the ICDAR2013 dataset and a multilingual dataset, and it outperforms the state-of-the-art methods on all three datasets with much higher recall and F-score. The good performance on the multilingual dataset shows that the proposed technique can be used for the detection of texts in different languages. |
Tasks | Scene Text Detection |
Published | 2016-04-23 |
URL | http://arxiv.org/abs/1604.06877v1 |
http://arxiv.org/pdf/1604.06877v1.pdf | |
PWC | https://paperswithcode.com/paper/text-flow-a-unified-text-detection-system-in |
Repo | |
Framework | |
On Coreset Constructions for the Fuzzy $K$-Means Problem
Title | On Coreset Constructions for the Fuzzy $K$-Means Problem |
Authors | Johannes Blömer, Sascha Brauer, Kathrin Bujna |
Abstract | The fuzzy $K$-means problem is a popular generalization of the well-known $K$-means problem to soft clusterings. We present the first coresets for fuzzy $K$-means with size linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points. We show that these coresets can be employed in the computation of a $(1+\epsilon)$-approximation for fuzzy $K$-means, improving previously presented results. We further show that our coresets can be maintained in an insertion-only streaming setting, where data points arrive one-by-one. |
Tasks | |
Published | 2016-12-22 |
URL | http://arxiv.org/abs/1612.07516v3 |
http://arxiv.org/pdf/1612.07516v3.pdf | |
PWC | https://paperswithcode.com/paper/on-coreset-constructions-for-the-fuzzy-k |
Repo | |
Framework | |
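To make "soft clusterings" concrete, here is a minimal sketch of the fuzzy $K$-means cost in its common textbook form, $\sum_n \sum_k r_{nk}^m \lVert x_n - \mu_k \rVert^2$ with memberships $r_{nk}$ summing to 1 over $k$. The abstract does not spell out the objective, so this formulation, the fuzzifier `m`, and all function names are assumptions for illustration; nothing here constructs a coreset.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def memberships(point, centers, m=2.0):
    """Cost-minimizing soft assignment of one point for fixed centers (m > 1)."""
    d = [squared_distance(point, c) for c in centers]
    # A point sitting exactly on one or more centers is split evenly among them.
    zeros = [k for k, dk in enumerate(d) if dk == 0.0]
    if zeros:
        return [1.0 / len(zeros) if k in zeros else 0.0
                for k in range(len(centers))]
    # Closed form: r_k is proportional to (squared distance)^(-1 / (m - 1)).
    inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_kmeans_cost(points, centers, m=2.0):
    """Sum over points and clusters of r_nk^m * ||x_n - mu_k||^2."""
    cost = 0.0
    for p in points:
        r = memberships(p, centers, m)
        cost += sum(rk ** m * squared_distance(p, c)
                    for rk, c in zip(r, centers))
    return cost
```

A point midway between two centers gets membership 0.5 in each, whereas hard $K$-means would have to pick one of them; a coreset in the sense of the abstract would be a small weighted point set on which this cost stays within a $(1 \pm \epsilon)$ factor of the cost on the full data.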
PAC-Bayesian Theory Meets Bayesian Inference
Title | PAC-Bayesian Theory Meets Bayesian Inference |
Authors | Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien |
Abstract | We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam’s razor criterion, under the assumption that the data is generated by an i.i.d. distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks. |
Tasks | Bayesian Inference |
Published | 2016-05-27 |
URL | http://arxiv.org/abs/1605.08636v4 |
http://arxiv.org/pdf/1605.08636v4.pdf | |
PWC | https://paperswithcode.com/paper/pac-bayesian-theory-meets-bayesian-inference |
Repo | |
Framework | |
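The link described in the abstract above can be illustrated with a standard variational identity (the Gibbs variational principle). This is a sketch under assumed notation, not the paper's exact theorem: $\pi$ is a prior over parameters $\theta$, $\rho$ ranges over posteriors, $n$ is the sample size, and $\widehat{L}$ is the empirical negative log-likelihood.

```latex
% Trade-off between empirical risk and KL divergence, minimized over posteriors rho:
\min_{\rho}\Big\{ n\,\mathbb{E}_{\theta\sim\rho}\big[\widehat{L}(\theta)\big]
  + \mathrm{KL}(\rho\,\|\,\pi) \Big\}
  = -\ln \mathbb{E}_{\theta\sim\pi}\big[e^{-n\widehat{L}(\theta)}\big].

% With the negative log-likelihood loss, the right-hand side is the
% negative log marginal likelihood:
n\,\widehat{L}(\theta) = -\sum_{i=1}^{n}\ln p(x_i\mid\theta)
\;\Longrightarrow\;
\mathbb{E}_{\theta\sim\pi}\big[e^{-n\widehat{L}(\theta)}\big]
  = \int \pi(\theta)\prod_{i=1}^{n} p(x_i\mid\theta)\,d\theta
  = p(x_1,\dots,x_n).
```

The minimum is attained at $\rho^*(\theta) \propto \pi(\theta)\,e^{-n\widehat{L}(\theta)}$, which under this loss is the Bayesian posterior, so a bound whose data-dependent part is this trade-off is smallest when the marginal likelihood is largest.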
Structured Sparse Regression via Greedy Hard-Thresholding
Title | Structured Sparse Regression via Greedy Hard-Thresholding |
Authors | Prateek Jain, Nikhil Rao, Inderjit Dhillon |
Abstract | Several learning applications require solving high-dimensional regression problems where the relevant features belong to a small number of (overlapping) groups. For very large datasets and under standard sparsity constraints, hard thresholding methods have proven to be extremely efficient, but such methods require NP-hard projections when dealing with overlapping groups. In this paper, we show that such NP-hard projections can be avoided by appealing to submodular optimization, and that the resulting methods come with strong theoretical guarantees even in the presence of poorly conditioned data (e.g., when two features have correlation $\geq 0.99$), which existing analyses cannot handle. These methods exhibit an interesting computation-accuracy trade-off and can be extended to significantly harder problems such as sparse overlapping groups. Experiments on both real and synthetic data validate our claims and demonstrate that the proposed methods are orders of magnitude faster than other greedy and convex relaxation techniques for learning with group-structured sparsity. |
Tasks | |
Published | 2016-02-19 |
URL | http://arxiv.org/abs/1602.06042v2 |
http://arxiv.org/pdf/1602.06042v2.pdf | |
PWC | https://paperswithcode.com/paper/structured-sparse-regression-via-greedy-hard |
Repo | |
Framework | |
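The idea of replacing an exact projection onto overlapping groups with a greedy, submodular-style selection can be sketched as follows. This is an illustrative assumption about the general technique, not the authors' algorithm: the function name, the energy-based gain, and the stopping rule are hypothetical.

```python
def greedy_group_threshold(v, groups, k):
    """Keep the entries of v covered by up to k greedily chosen groups.

    groups is a list of index lists that may overlap. Each round picks the
    group whose not-yet-covered entries carry the most squared magnitude.
    """
    covered = set()

    def gain(group):
        # Marginal gain: energy of entries this group would newly cover.
        return sum(v[i] ** 2 for i in set(group) - covered)

    for _ in range(min(k, len(groups))):
        best = max(groups, key=gain)
        if gain(best) == 0.0:
            break  # nothing left to gain from any group
        covered |= set(best)
    return [x if i in covered else 0.0 for i, x in enumerate(v)]
```

The covered energy is a weighted coverage function, which is monotone submodular, so greedy selection captures at least a $(1 - 1/e)$ fraction of the best energy achievable with `k` groups. In an iterative hard-thresholding loop, a step like this would stand in for the exact projection after each gradient update.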
Latent Model Ensemble with Auto-localization
Title | Latent Model Ensemble with Auto-localization |
Authors | Miao Sun, Tony X. Han, Xun Xu, Ming-Chang Liu, Ahmad Khodayari-Rostamabad |
Abstract | Deep Convolutional Neural Networks (CNN) have exhibited superior performance in many visual recognition tasks including image classification, object detection, and scene labeling, due to their large learning capacity and resistance to overfitting. For the image classification task, most of the current deep CNN-based approaches take the whole size-normalized image as input and have achieved quite promising results. Compared with the previously dominating approaches based on feature extraction, pooling, and classification, the deep CNN-based approaches mainly rely on the learning capability of deep CNN to achieve superior results: the burden of minimizing intra-class variation while maximizing inter-class difference is entirely dependent on the implicit feature learning component of deep CNN; we rely upon the implicitly learned filters and pooling component to select the discriminative regions, which correspond to the activated neurons. However, if the irrelevant regions constitute a large portion of the image of interest, the classification performance of the deep CNN, which takes the whole image as input, can be heavily affected. To solve this issue, we propose a novel latent CNN framework, which treats the most discriminative region as a latent variable. We can jointly learn the global CNN with the latent CNN to avoid the aforementioned big irrelevant region issue, and our experimental results show the evident advantage of the proposed latent CNN over traditional deep CNN: latent CNN outperforms the state-of-the-art performance of deep CNN on standard benchmark datasets including the CIFAR-10, CIFAR-100, MNIST and PASCAL VOC 2007 Classification dataset. |
Tasks | Image Classification, Object Detection |
Published | 2016-04-15 |
URL | http://arxiv.org/abs/1604.04333v2 |
http://arxiv.org/pdf/1604.04333v2.pdf | |
PWC | https://paperswithcode.com/paper/latent-model-ensemble-with-auto-localization |
Repo | |
Framework | |
Towards End-to-End Audio-Sheet-Music Retrieval
Title | Towards End-to-End Audio-Sheet-Music Retrieval |
Authors | Matthias Dorfer, Andreas Arzt, Gerhard Widmer |
Abstract | This paper demonstrates the feasibility of learning to retrieve short snippets of sheet music (images) when given a short query excerpt of music (audio), and vice versa, without any symbolic representation of music or scores. This would be highly useful in many content-based musical retrieval scenarios. Our approach is based on Deep Canonical Correlation Analysis (DCCA) and learns correlated latent spaces allowing for cross-modality retrieval in both directions. Initial experiments with relatively simple monophonic music show promising results. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05070v1 |
http://arxiv.org/pdf/1612.05070v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-end-to-end-audio-sheet-music |
Repo | |
Framework | |
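Once audio and sheet-music snippets are embedded in correlated latent spaces, retrieval in either direction reduces to a nearest-neighbour search. The sketch below assumes cosine similarity as the ranking score and uses made-up embedding vectors in place of network outputs; neither choice is confirmed by the abstract.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, candidates, top_k=1):
    """Return indices of the top_k candidates most similar to the query embedding."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine_similarity(query, candidates[i]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical embeddings: one audio query and three sheet-music snippets.
audio_query = [1.0, 0.0]
sheet_snippets = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
best_match = retrieve(audio_query, sheet_snippets)
```

Because the score is symmetric, the same `retrieve` function serves the reverse direction: pass a sheet-music embedding as the query and a list of audio embeddings as the candidates.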