Paper Group ANR 366
Light Cascaded Convolutional Neural Networks for Accurate Player Detection. Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing. Clustering Patients with Tensor Decomposition. Performance Prediction and Optimization of Solar Water Heater via a Knowledge-Based Machine Learning Method. Provable Alternating Gradient Descent for …
Light Cascaded Convolutional Neural Networks for Accurate Player Detection
Title | Light Cascaded Convolutional Neural Networks for Accurate Player Detection |
Authors | Keyu Lu, Jianhui Chen, James J. Little, Hangen He |
Abstract | Vision based player detection is important in sports applications. Accuracy, efficiency, and low memory consumption are desirable for real-time tasks such as intelligent broadcasting and automatic event classification. In this paper, we present a cascaded convolutional neural network (CNN) that satisfies all three of these requirements. Our method first trains a binary (player/non-player) classification network from labeled image patches. Then, our method efficiently applies the network to a whole image in testing. We conducted experiments on basketball and soccer games. Experimental results demonstrate that our method can accurately detect players under challenging conditions such as varying illumination, highly dynamic camera movements and motion blur. Compared with conventional CNNs, our approach achieves state-of-the-art accuracy on both games with 1000x fewer parameters (i.e., it is light). |
Tasks | |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10230v1 |
http://arxiv.org/pdf/1709.10230v1.pdf | |
PWC | https://paperswithcode.com/paper/light-cascaded-convolutional-neural-networks |
Repo | |
Framework | |
Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing
Title | Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing |
Authors | Hanzhang Hu, Debadeepta Dey, Martial Hebert, J. Andrew Bagnell |
Abstract | This work considers the trade-off between accuracy and test-time computational cost of deep neural networks (DNNs) via \emph{anytime} predictions from auxiliary predictions. Specifically, we optimize auxiliary losses jointly in an \emph{adaptive} weighted sum, where the weights are inversely proportional to the average of each loss. Intuitively, this balances the losses to have the same scale. We present theoretical considerations that motivate this approach from multiple viewpoints, including connecting it to optimizing the geometric mean of the expectation of each loss, an objective that ignores the scale of losses. Experimentally, the adaptive weights induce more competitive anytime predictions on multiple recognition datasets and models than non-adaptive approaches, including weighting all losses equally. In particular, anytime neural networks (ANNs) can achieve the same accuracy faster using adaptive weights on a small network than using static constant weights on a large one. For problems with high performance saturation, we also show that a sequence of exponentially deepening ANNs can achieve near-optimal anytime results at any budget, at the cost of a constant fraction of extra computation. |
Tasks | |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06832v3 |
http://arxiv.org/pdf/1708.06832v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-anytime-predictions-in-neural |
Repo | |
Framework | |
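A minimal sketch of the adaptive weighting the abstract describes — each auxiliary loss is weighted inversely to its running average, so all losses contribute at the same scale. The class name, the exponential-moving-average decay, and the epsilon are illustrative choices, not taken from the paper:

```python
class AdaptiveLossBalancer:
    """Weight each loss inversely to its running average (a sketch)."""

    def __init__(self, n_losses, decay=0.99, eps=1e-8):
        self.avg = [1.0] * n_losses  # running average of each loss
        self.decay = decay
        self.eps = eps

    def weighted_sum(self, losses):
        """Update running averages and return the adaptively weighted sum."""
        total = 0.0
        for i, loss in enumerate(losses):
            self.avg[i] = self.decay * self.avg[i] + (1 - self.decay) * loss
            total += loss / (self.avg[i] + self.eps)  # weight ~ 1 / average
        return total
```

A loss that sits exactly at its running average contributes roughly 1 to the sum, which is the "same scale" intuition from the abstract.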
Clustering Patients with Tensor Decomposition
Title | Clustering Patients with Tensor Decomposition |
Authors | Matteo Ruffini, Ricard Gavaldà, Esther Limón |
Abstract | In this paper we present a method for the unsupervised clustering of high-dimensional binary data, with a special focus on electronic healthcare records. We present a robust and efficient heuristic to address this problem using tensor decomposition. We explain why this approach is preferable to more commonly used distance-based methods for tasks such as clustering patient records. We run the algorithm on two datasets of healthcare records, obtaining clinically meaningful results. |
Tasks | |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08994v1 |
http://arxiv.org/pdf/1708.08994v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-patients-with-tensor-decomposition |
Repo | |
Framework | |
Performance Prediction and Optimization of Solar Water Heater via a Knowledge-Based Machine Learning Method
Title | Performance Prediction and Optimization of Solar Water Heater via a Knowledge-Based Machine Learning Method |
Authors | Hao Li, Zhijian Liu |
Abstract | Measuring the performance of solar energy and heat transfer systems requires considerable time, economic cost, and manpower. Meanwhile, directly predicting their performance is challenging due to their complicated internal structures. Fortunately, a knowledge-based machine learning method can provide a promising prediction and optimization strategy for the performance of energy systems. In this Chapter, the authors will show how they utilize the machine learning models trained from a large experimental database to perform precise prediction and optimization on a solar water heater (SWH) system. A new energy system optimization strategy based on a high-throughput screening (HTS) process is proposed. This Chapter consists of: i) comparative studies on a variety of machine learning models (artificial neural networks (ANNs), support vector machines (SVM), and extreme learning machines (ELM)) to predict the performance of SWHs; ii) development of ANN-based software to assist quick prediction; and iii) introduction of a computational HTS method to design a high-performance SWH system. |
Tasks | |
Published | 2017-10-06 |
URL | http://arxiv.org/abs/1710.02511v1 |
http://arxiv.org/pdf/1710.02511v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-prediction-and-optimization-of |
Repo | |
Framework | |
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
Title | Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations |
Authors | Yuanzhi Li, Yingyu Liang |
Abstract | Non-negative matrix factorization is a basic tool for decomposing data into the feature and weight matrices under non-negativity constraints, and in practice is often solved in the alternating minimization framework. However, it is unclear whether such algorithms can recover the ground-truth feature matrix when the weights for different features are highly correlated, which is common in applications. This paper proposes a simple and natural alternating gradient descent based algorithm, and shows that with a mild initialization it provably recovers the ground-truth in the presence of strong correlations. In most interesting cases, the correlation can be of the same order as the highest possible. Our analysis also reveals several favorable features of the algorithm, including robustness to noise. We complement our theoretical results with empirical studies on semi-synthetic datasets, demonstrating its advantage over several popular methods in recovering the ground-truth. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.04097v1 |
http://arxiv.org/pdf/1706.04097v1.pdf | |
PWC | https://paperswithcode.com/paper/provable-alternating-gradient-descent-for-non |
Repo | |
Framework | |
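A generic sketch of alternating (projected) gradient descent for NMF, the algorithmic template the abstract refers to: minimize ||X − WH||² over W, H ≥ 0 by alternating gradient steps followed by projection onto the non-negative orthant. The step size, iteration count, and random initialization are illustrative; the paper's algorithm and its provable initialization differ in details:

```python
import numpy as np

def nmf_altgd(X, k, iters=500, lr=0.02, seed=0):
    """Alternating projected gradient descent for X ~ W @ H, W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        R = W @ H - X                               # residual
        W = np.maximum(W - lr * (R @ H.T), 0.0)     # gradient step + projection
        R = W @ H - X
        H = np.maximum(H - lr * (W.T @ R), 0.0)
    return W, H
```

On an exactly low-rank non-negative matrix, this simple scheme drives the reconstruction error down quickly, which is what makes the question of *provable* recovery under correlated weights interesting.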
Randomized Kernel Methods for Least-Squares Support Vector Machines
Title | Randomized Kernel Methods for Least-Squares Support Vector Machines |
Authors | M. Andrecut |
Abstract | The least-squares support vector machine is a frequently used kernel method for non-linear regression and classification tasks. Here we discuss several approximation algorithms for the least-squares support vector machine classifier. The proposed methods are based on randomized block kernel matrices, and we show that they provide good accuracy and reliable scaling for multi-class classification problems with relatively large data sets. Also, we present several numerical experiments that illustrate the practical applicability of the proposed methods. |
Tasks | |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07830v1 |
http://arxiv.org/pdf/1703.07830v1.pdf | |
PWC | https://paperswithcode.com/paper/randomized-kernel-methods-for-least-squares |
Repo | |
Framework | |
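The abstract's approach builds on randomized block kernel matrices; the following Nyström-style subsample of kernel columns is one common instance of that idea, shown here as a hedged sketch rather than the authors' exact algorithm. A random block C of the kernel matrix replaces the full n×n matrix, and a regularized least-squares problem is solved in the reduced space:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lssvm_block_fit(X, y, m=20, lam=1e-2, gamma=1.0, seed=0):
    """Fit an approximate LS-SVM on labels y in {-1, +1} using an
    n x m random block of the kernel matrix (Nystrom-style sketch)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    C = rbf_kernel(X, X[idx], gamma)  # random block of columns of K
    # Regularized least squares:  min_b ||C b - y||^2 + lam ||b||^2
    b = np.linalg.solve(C.T @ C + lam * np.eye(C.shape[1]), C.T @ y)
    return X[idx], b

def lssvm_block_predict(block, b, X_test, gamma=1.0):
    return rbf_kernel(X_test, block, gamma) @ b
```

The cost drops from O(n³) for the full kernel solve to O(nm²) for the block solve, which is the scaling benefit the abstract highlights for large datasets.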
Continuously heterogeneous hyper-objects in cryo-EM and 3-D movies of many temporal dimensions
Title | Continuously heterogeneous hyper-objects in cryo-EM and 3-D movies of many temporal dimensions |
Authors | Roy R. Lederman, Amit Singer |
Abstract | Single particle cryo-electron microscopy (EM) is an increasingly popular method for determining the 3-D structure of macromolecules from noisy 2-D images of single macromolecules whose orientations and positions are random and unknown. One of the great opportunities in cryo-EM is to recover the structure of macromolecules in heterogeneous samples, where multiple types or multiple conformations are mixed together. Indeed, in recent years, many tools have been introduced for the analysis of multiple discrete classes of molecules mixed together in a cryo-EM experiment. However, many interesting structures have a continuum of conformations which do not fit discrete models nicely; the analysis of such continuously heterogeneous models has remained a more elusive goal. In this manuscript, we propose to represent heterogeneous molecules and similar structures as higher dimensional objects. We generalize the basic operations used in many existing reconstruction algorithms, making our approach generic in the sense that, in principle, existing algorithms can be adapted to reconstruct those higher dimensional objects. As proof of concept, we present a prototype of a new algorithm which we use to solve simulated reconstruction problems. |
Tasks | |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.02899v1 |
http://arxiv.org/pdf/1704.02899v1.pdf | |
PWC | https://paperswithcode.com/paper/continuously-heterogeneous-hyper-objects-in |
Repo | |
Framework | |
HMM-based Writer Identification in Music Score Documents without Staff-Line Removal
Title | HMM-based Writer Identification in Music Score Documents without Staff-Line Removal |
Authors | Partha Pratim Roy, Ayan Kumar Bhunia, Umapada Pal |
Abstract | Writer identification from musical score documents is a challenging task due to the inherent problem of musical symbols overlapping with staff lines. Most existing work on writer identification in musical score documents was performed after a preprocessing stage of staff-line removal. In this paper we propose a novel writer identification framework for musical documents that does not remove staff lines. In our approach, a Hidden Markov Model is used to model the writing style of the writers without removing staff lines. Sliding-window features are extracted from musical score lines and used to build writer-specific HMM models. Given a query musical sheet, a writer-specific confidence for each musical line is returned by each writer-specific model using a log-likelihood score. Next, a page-level log-likelihood score is computed by a weighted combination of these scores from the corresponding line images of the page. A novel Factor Analysis based feature selection technique is applied to the sliding-window features to reduce the noise arising from staff lines, which improves writer identification performance. In our framework we also propose a novel score-line detection approach for musical sheets using HMMs. Experiments were performed on the CVC-MUSCIMA dataset, and the results show that the proposed approach is efficient for score-line detection and writer identification without removing staff lines. To give an idea of the computation time of our method, a detailed analysis of execution time is also provided. |
Tasks | Feature Selection |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06828v2 |
http://arxiv.org/pdf/1707.06828v2.pdf | |
PWC | https://paperswithcode.com/paper/hmm-based-writer-identification-in-music |
Repo | |
Framework | |
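A sketch of the sliding-window feature extraction the abstract describes: a fixed-width window slides left to right over a (binary) score-line image, and each window position yields a small feature vector that feeds the HMM. The window width and the zone-density features below are illustrative, not the paper's exact feature set:

```python
def sliding_window_features(image, width=4, zones=4):
    """image: 2-D list of 0/1 pixels (rows x cols).
    Returns one feature vector (ink density per horizontal zone)
    per window position, scanning left to right."""
    rows, cols = len(image), len(image[0])
    zone_h = rows // zones
    feats = []
    for x in range(0, cols - width + 1, width):
        vec = []
        for z in range(zones):
            ink = sum(image[r][c]
                      for r in range(z * zone_h, (z + 1) * zone_h)
                      for c in range(x, x + width))
            vec.append(ink / (zone_h * width))  # normalized ink density
        feats.append(vec)
    return feats
```

Because the features summarize ink density rather than connected components, the staff lines simply contribute a roughly constant offset per zone — which is the kind of structured noise the paper's Factor Analysis based feature selection is designed to suppress.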
Intelligence Quotient and Intelligence Grade of Artificial Intelligence
Title | Intelligence Quotient and Intelligence Grade of Artificial Intelligence |
Authors | Feng Liu, Yong Shi, Ying Liu |
Abstract | Although artificial intelligence is currently one of the most interesting areas of scientific research, the potential threats posed by emerging AI systems remain a source of persistent controversy. To address the issue of AI threat, this study proposes a standard intelligence model that unifies AI and human characteristics in terms of four aspects of knowledge, i.e., input, output, mastery, and creation. Using this model, we address three challenges, namely, expanding the von Neumann architecture; testing and ranking the intelligence quotient of naturally and artificially intelligent systems, including humans, Google, Bing, Baidu, and Siri; and finally, dividing artificially intelligent systems into seven grades from robots to Google Brain. Based on this, we conclude that AlphaGo belongs to the third grade. |
Tasks | |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10242v2 |
http://arxiv.org/pdf/1709.10242v2.pdf | |
PWC | https://paperswithcode.com/paper/intelligence-quotient-and-intelligence-grade |
Repo | |
Framework | |
Hybrid Dialog State Tracker with ASR Features
Title | Hybrid Dialog State Tracker with ASR Features |
Authors | Miroslav Vodolán, Rudolf Kadlec, Jan Kleindienst |
Abstract | This paper presents a hybrid dialog state tracker enhanced by trainable Spoken Language Understanding (SLU) for slot-filling dialog systems. Our architecture is inspired by previously proposed neural-network-based belief-tracking systems. In addition, we extended some parts of our modular architecture with differentiable rules to allow end-to-end training. We hypothesize that these rules allow our tracker to generalize better than pure machine-learning based systems. For evaluation, we used the Dialog State Tracking Challenge (DSTC) 2 dataset - a popular belief tracking testbed with dialogs from a restaurant information system. To our knowledge, our hybrid tracker sets a new state-of-the-art result in three out of four categories within the DSTC2. |
Tasks | Slot Filling, Spoken Language Understanding |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06336v1 |
http://arxiv.org/pdf/1702.06336v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-dialog-state-tracker-with-asr-features |
Repo | |
Framework | |
Cold Fusion: Training Seq2Seq Models Together with Language Models
Title | Cold Fusion: Training Seq2Seq Models Together with Language Models |
Authors | Anuroop Sriram, Heewoo Jun, Sanjeev Satheesh, Adam Coates |
Abstract | Sequence-to-sequence (Seq2Seq) models with attention have excelled at tasks which involve generating natural language sentences such as machine translation, image captioning and speech recognition. Performance has further been improved by leveraging unlabeled data, often in the form of a language model. In this work, we present the Cold Fusion method, which leverages a pre-trained language model during training, and show its effectiveness on the speech recognition task. We show that Seq2Seq models with Cold Fusion are able to better utilize language information, enjoying i) faster convergence and better generalization, and ii) almost complete transfer to a new domain while using less than 10% of the labeled training data. |
Tasks | Image Captioning, Language Modelling, Machine Translation, Speech Recognition |
Published | 2017-08-21 |
URL | http://arxiv.org/abs/1708.06426v1 |
http://arxiv.org/pdf/1708.06426v1.pdf | |
PWC | https://paperswithcode.com/paper/cold-fusion-training-seq2seq-models-together |
Repo | |
Framework | |
Nonparametric Variational Auto-encoders for Hierarchical Representation Learning
Title | Nonparametric Variational Auto-encoders for Hierarchical Representation Learning |
Authors | Prasoon Goyal, Zhiting Hu, Xiaodan Liang, Chenyu Wang, Eric Xing |
Abstract | The recently developed variational autoencoders (VAEs) have proved to be an effective confluence of the rich representational power of neural networks with Bayesian methods. However, most work on VAEs uses a rather simple prior over the latent variables, such as a standard normal distribution, thereby restricting their application to relatively simple phenomena. In this work, we propose hierarchical nonparametric variational autoencoders, which combine tree-structured Bayesian nonparametric priors with VAEs to enable infinite flexibility of the latent representation space. Both the neural parameters and Bayesian priors are learned jointly using tailored variational inference. The resulting model induces a hierarchical structure of latent semantic concepts underlying the data corpus, and infers accurate representations of data instances. We apply our model to video representation learning. Our method is able to discover highly interpretable activity hierarchies, and obtains improved clustering accuracy and generalization capacity based on the learned rich representations. |
Tasks | Representation Learning |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07027v2 |
http://arxiv.org/pdf/1703.07027v2.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-variational-auto-encoders-for |
Repo | |
Framework | |
Regret Minimization in Behaviorally-Constrained Zero-Sum Games
Title | Regret Minimization in Behaviorally-Constrained Zero-Sum Games |
Authors | Gabriele Farina, Christian Kroer, Tuomas Sandholm |
Abstract | No-regret learning has emerged as a powerful tool for solving extensive-form games. This was facilitated by the counterfactual-regret minimization (CFR) framework, which relies on the instantiation of regret minimizers for simplexes at each information set of the game. We use an instantiation of the CFR framework to develop algorithms for solving behaviorally-constrained (and, as a special case, perturbed in the Selten sense) extensive-form games, which allows us to compute approximate Nash equilibrium refinements. Nash equilibrium refinements are motivated by a major deficiency in Nash equilibrium: it provides virtually no guarantees on how it will play in parts of the game tree that are reached with zero probability. Refinements can mend this issue, but have not been adopted in practice, mostly due to a lack of scalable algorithms. We show that, compared to standard algorithms, our method finds solutions that have substantially better refinement properties, while enjoying a convergence rate that is comparable to that of state-of-the-art algorithms for Nash equilibrium computation both in theory and practice. |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03441v1 |
http://arxiv.org/pdf/1711.03441v1.pdf | |
PWC | https://paperswithcode.com/paper/regret-minimization-in-behaviorally |
Repo | |
Framework | |
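The building block the abstract mentions — a regret minimizer on the simplex at each information set — is classically instantiated with regret matching, where the current strategy is proportional to the positive cumulative regrets. A minimal sketch (the utility vectors fed in are illustrative; the paper's constrained variant modifies the feasible set):

```python
class RegretMatcher:
    """Regret matching over n actions: strategy ~ positive cumulative regret."""

    def __init__(self, n_actions):
        self.cum_regret = [0.0] * n_actions

    def strategy(self):
        pos = [max(r, 0.0) for r in self.cum_regret]
        s = sum(pos)
        n = len(pos)
        return [p / s for p in pos] if s > 0 else [1.0 / n] * n

    def observe(self, utilities):
        """Update cumulative regrets after seeing per-action utilities."""
        sigma = self.strategy()
        expected = sum(p * u for p, u in zip(sigma, utilities))
        for a, u in enumerate(utilities):
            self.cum_regret[a] += u - expected
```

Against a fixed utility vector, the strategy quickly concentrates on the best action; CFR-style frameworks run one such minimizer per information set and combine their regrets into a game-wide guarantee.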
Linear Time Computation of Moments in Sum-Product Networks
Title | Linear Time Computation of Moments in Sum-Product Networks |
Authors | Han Zhao, Geoff Gordon |
Abstract | Bayesian online algorithms for Sum-Product Networks (SPNs) need to update their posterior distribution after seeing one single additional instance. To do so, they must compute moments of the model parameters under this distribution. The best existing method for computing such moments scales quadratically in the size of the SPN, although it scales linearly for trees. This unfortunate scaling makes Bayesian online algorithms prohibitively expensive, except for small or tree-structured SPNs. We propose an optimal linear-time algorithm that works even when the SPN is a general directed acyclic graph (DAG), which significantly broadens the applicability of Bayesian online algorithms for SPNs. There are three key ingredients in the design and analysis of our algorithm: 1) for each edge in the graph, we construct a linear time reduction from the moment computation problem to a joint inference problem in SPNs; 2) using the property that each SPN computes a multilinear polynomial, we give an efficient procedure for polynomial evaluation by differentiation without expanding the polynomial, which may contain exponentially many monomials; 3) we propose a dynamic programming method to further reduce the computation of the moments of all the edges in the graph from quadratic to linear. We demonstrate the usefulness of our linear time algorithm by applying it to develop a linear time assumed density filter (ADF) for SPNs. |
Tasks | |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04767v2 |
http://arxiv.org/pdf/1702.04767v2.pdf | |
PWC | https://paperswithcode.com/paper/linear-time-computation-of-moments-in-sum |
Repo | |
Framework | |
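The differentiation idea behind ingredient 2) can be sketched concretely: an SPN computes a multilinear polynomial as a DAG of sum and product nodes, so its value and all partial derivatives with respect to the leaves come from one forward plus one backward pass — the possibly exponentially many monomials are never expanded. The node encoding below is an illustrative simplification, not the paper's data structure:

```python
def spn_forward_backward(nodes, leaf_values):
    """nodes: topologically ordered list of ('leaf', name),
    ('sum', [(weight, child_idx), ...]), or ('prod', [child_idx, ...]),
    with the root last.  Returns (root value, d(root)/d(node) for all nodes)."""
    val = [0.0] * len(nodes)
    for i, (kind, arg) in enumerate(nodes):          # forward pass
        if kind == 'leaf':
            val[i] = leaf_values[arg]
        elif kind == 'sum':
            val[i] = sum(w * val[c] for w, c in arg)
        else:  # product of distinct children (multilinearity)
            v = 1.0
            for c in arg:
                v *= val[c]
            val[i] = v
    grad = [0.0] * len(nodes)
    grad[-1] = 1.0                                   # seed at the root
    for i in range(len(nodes) - 1, -1, -1):          # backward pass
        kind, arg = nodes[i]
        if kind == 'sum':
            for w, c in arg:
                grad[c] += w * grad[i]
        elif kind == 'prod':
            for c in arg:
                others = 1.0
                for d in arg:
                    if d != c:
                        others *= val[d]
                grad[c] += grad[i] * others
    return val[-1], grad
```

For example, the DAG for f(x, y) = x·y + x evaluated at x = 2, y = 3 yields f = 8 with df/dx = y + 1 = 4 and df/dy = x = 2, all in a single sweep.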
Ensembles of Randomized Time Series Shapelets Provide Improved Accuracy while Reducing Computational Costs
Title | Ensembles of Randomized Time Series Shapelets Provide Improved Accuracy while Reducing Computational Costs |
Authors | Atif Raza, Stefan Kramer |
Abstract | Shapelets are discriminative time series subsequences that allow the generation of interpretable classification models, which provide faster and generally better classification than the nearest-neighbor approach. However, the shapelet discovery process requires the evaluation of all possible subsequences of all time series in the training set, making it extremely computationally intensive. Consequently, shapelet discovery for large time series datasets quickly becomes intractable. A number of improvements have been proposed to reduce the training time. These techniques use approximation or discretization and often lead to reduced classification accuracy compared to the exact method. We propose the use of ensembles of shapelet-based classifiers obtained by randomly sampling the shapelet candidates. Random sampling reduces the number of evaluated candidates and consequently the required computational cost, while the classification accuracy of the resulting models is not significantly different from that of the exact algorithm. The combination of randomized classifiers rectifies the inaccuracies of individual models thanks to the diversity of the solutions. Based on the experiments performed, we show that the proposed approach of using an ensemble of inexpensive classifiers provides better classification accuracy than the exact method at significantly lower computational cost. |
Tasks | Time Series |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06712v1 |
http://arxiv.org/pdf/1702.06712v1.pdf | |
PWC | https://paperswithcode.com/paper/ensembles-of-randomized-time-series-shapelets |
Repo | |
Framework | |
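A sketch of the two pieces each ensemble member needs under this scheme: random sampling of shapelet candidates in place of exhaustive enumeration, and the min-distance transform that turns each series into a feature vector for an off-the-shelf classifier. Candidate counts and shapelet lengths are illustrative parameters, not the paper's settings:

```python
import random

def min_dist(series, shapelet):
    """Smallest squared Euclidean distance between the shapelet and any
    equal-length subsequence of the series."""
    L = len(shapelet)
    return min(sum((series[i + j] - shapelet[j]) ** 2 for j in range(L))
               for i in range(len(series) - L + 1))

def sample_shapelets(train, n_candidates=10, length=3, seed=0):
    """Draw random subsequences as shapelet candidates (not exhaustive)."""
    rng = random.Random(seed)
    cands = []
    for _ in range(n_candidates):
        ts = rng.choice(train)
        start = rng.randrange(len(ts) - length + 1)
        cands.append(ts[start:start + length])
    return cands

def shapelet_transform(dataset, shapelets):
    """Map each series to its vector of min-distances to the shapelets."""
    return [[min_dist(ts, s) for s in shapelets] for ts in dataset]
```

Each ensemble member repeats this with a different random seed; the diversity across members is exactly what lets the combined vote recover the accuracy lost by any single cheap model.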