Paper Group ANR 308
The observer-assisted method for adjusting hyper-parameters in deep learning algorithms. Training LDCRF model on unsegmented sequences using Connectionist Temporal Classification. Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce. Document Clustering Games in Static and Dynamic Scenarios …
The observer-assisted method for adjusting hyper-parameters in deep learning algorithms
Title | The observer-assisted method for adjusting hyper-parameters in deep learning algorithms |
Authors | Maciej Wielgosz |
Abstract | This paper presents a concept of a novel method for adjusting hyper-parameters in Deep Learning (DL) algorithms. An external agent-observer monitors the performance of a selected Deep Learning algorithm. The observer learns to model the DL algorithm using a series of random experiments. Consequently, it may be used to predict the response of the DL algorithm, in terms of a selected quality measure, to a given set of hyper-parameters. This makes it possible to construct an ensemble composed of a series of evaluators which constitute an observer-assisted architecture. The architecture may be used to gradually iterate toward the best achievable quality score in tiny steps governed by a unit of progress. The algorithm is stopped when the maximum number of steps is reached or no further progress is made. |
Tasks | |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10328v1 |
http://arxiv.org/pdf/1611.10328v1.pdf | |
PWC | https://paperswithcode.com/paper/the-observer-assisted-method-for-adjusting |
Repo | |
Framework | |
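The loop described above can be sketched in a few lines: run random experiments against a quality function, fit a simple surrogate "observer" to the responses, then climb the surrogate in tiny steps. The quadratic `quality` stand-in and the nearest-neighbour surrogate are illustrative assumptions, not the paper's design.

```python
import random

random.seed(0)

def quality(lr):
    # Hypothetical stand-in for the DL algorithm's quality measure,
    # peaked at lr = 0.3; the paper treats this response as a black box.
    return -(lr - 0.3) ** 2

# Observer: model the algorithm from a series of random experiments.
observations = [(h, quality(h)) for h in
                (random.uniform(0.0, 1.0) for _ in range(50))]

def predict(lr):
    # Simple 1-nearest-neighbour surrogate of the observed responses.
    return min(observations, key=lambda o: abs(o[0] - lr))[1]

# Iterate toward the best achievable score in tiny steps (the "unit of
# progress"), stopping at the step budget or when no progress is made.
step, lr, best = 0.02, 0.5, float("-inf")
for _ in range(200):
    lr = max((lr - step, lr, lr + step), key=predict)
    if predict(lr) <= best:
        break
    best = predict(lr)
```

A real observer would be a learned regression model over many hyper-parameters rather than a 1-D nearest-neighbour lookup.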
Training LDCRF model on unsegmented sequences using Connectionist Temporal Classification
Title | Training LDCRF model on unsegmented sequences using Connectionist Temporal Classification |
Authors | Amir Ahooye Atashin, Kamaledin Ghiasi-Shirazi, Ahad Harati |
Abstract | Many machine learning problems such as speech recognition, gesture recognition, and handwriting recognition are concerned with simultaneous segmentation and labeling of sequence data. Latent-dynamic conditional random field (LDCRF) is a well-known discriminative method that has been successfully used for this task. However, LDCRF can only be trained with pre-segmented data sequences in which the label of each frame is available a priori. In the realm of neural networks, the invention of connectionist temporal classification (CTC) made it possible to train recurrent neural networks on unsegmented sequences with great success. In this paper, we use CTC to train an LDCRF model on unsegmented sequences. Experimental results on two gesture recognition tasks show that the proposed method outperforms LDCRFs, hidden Markov models, and conditional random fields. |
Tasks | Gesture Recognition, Speech Recognition |
Published | 2016-06-26 |
URL | http://arxiv.org/abs/1606.08051v3 |
http://arxiv.org/pdf/1606.08051v3.pdf | |
PWC | https://paperswithcode.com/paper/training-ldcrf-model-on-unsegmented-sequences |
Repo | |
Framework | |
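The CTC objective at the heart of the paper can be illustrated with the standard forward recursion over a blank-extended label sequence. The tiny two-frame example is illustrative and not tied to the LDCRF specifics.

```python
BLANK = 0

def ctc_prob(y, labels):
    """P(labels | y) under CTC, where y[t][k] is the per-frame
    probability of symbol k and BLANK separates repeated labels."""
    ext = [BLANK]
    for l in labels:
        ext += [l, BLANK]              # blank-extended sequence l'
    T, S = len(y), len(ext)
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = y[0][BLANK]
    if S > 1:
        alpha[0][1] = y[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s >= 1:
                a += alpha[t - 1][s - 1]
            # The skip transition is allowed unless the current symbol
            # is a blank or repeats the symbol two positions back.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * y[t][ext[s]]
    return alpha[-1][-1] + (alpha[-1][-2] if S > 1 else 0.0)

# Two frames, uniform over {blank, 'a'}: the frame-level paths
# "aa", "a-", "-a" all collapse to the label sequence "a".
p = ctc_prob([[0.5, 0.5], [0.5, 0.5]], [1])
```

Three of the four equally likely paths collapse to "a", so `p` comes out to 0.75.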
Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce
Title | Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce |
Authors | Tom Zahavy, Alessandro Magnani, Abhinandan Krishnan, Shie Mannor |
Abstract | Classifying products into categories precisely and efficiently is a major challenge in modern e-commerce. The high traffic of new products uploaded daily and the dynamic nature of the categories raise the need for machine learning models that can reduce the cost and time of human editors. In this paper, we propose a decision-level fusion approach for multi-modal product classification using text and image inputs. We train input-specific state-of-the-art deep neural networks for each input source, show the potential of forging them together into a multi-modal architecture, and train a novel policy network that learns to choose between them. Finally, we demonstrate that our multi-modal network improves the top-1 accuracy over both networks on a real-world large-scale product classification dataset that we collected from Walmart.com. While we focus on the image-text fusion that characterizes e-commerce domains, our algorithms can be easily applied to other modalities such as audio, video, physical sensors, etc. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09534v1 |
http://arxiv.org/pdf/1611.09534v1.pdf | |
PWC | https://paperswithcode.com/paper/is-a-picture-worth-a-thousand-words-a-deep |
Repo | |
Framework | |
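Decision-level fusion of the kind described can be sketched as follows; the policy here is a stub confidence score rather than the learned policy network of the paper.

```python
def fuse(text_probs, image_probs, policy_score):
    """Pick one input-specific network's prediction per example.
    policy_score in [0, 1] is the (learned) confidence that the
    text network is the better expert for this example."""
    chosen = text_probs if policy_score >= 0.5 else image_probs
    return max(range(len(chosen)), key=chosen.__getitem__)

# Text net is confident about class 1, image net about class 0;
# the policy trusts the text net for this example.
label = fuse([0.1, 0.9], [0.8, 0.2], policy_score=0.7)
```

The point of decision-level (rather than feature-level) fusion is that each expert stays independently trainable, and the policy only has to learn which one to believe.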
Document Clustering Games in Static and Dynamic Scenarios
Title | Document Clustering Games in Static and Dynamic Scenarios |
Authors | Rocco Tripodi, Marcello Pelillo |
Abstract | In this work we propose a game-theoretic model for document clustering. Each document to be clustered is represented as a player and each cluster as a strategy. Each player receives a reward from interacting with other players, and tries to maximize it by choosing its best strategies. The geometry of the data is modeled with a weighted graph that encodes the pairwise similarity among documents, so that similar players are constrained to choose similar strategies, updating their strategy preferences at each iteration of the games. We used different approaches to find the prototypical elements of the clusters, and with this information we divided the players into two disjoint sets: one collecting players with a definite strategy, and the other collecting players that try to learn the correct strategy to play from the others. The latter set of players can be considered as new data points that have to be clustered according to previous information. This representation is useful in scenarios in which the data are streamed continuously. The evaluation of the system was conducted on 13 document datasets using different settings. It shows that the proposed method performs well compared to different document clustering algorithms. |
Tasks | |
Published | 2016-07-08 |
URL | http://arxiv.org/abs/1607.02436v1 |
http://arxiv.org/pdf/1607.02436v1.pdf | |
PWC | https://paperswithcode.com/paper/document-clustering-games-in-static-and |
Repo | |
Framework | |
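The game dynamics can be sketched with discrete-time replicator dynamics on the similarity graph: each player's payoff for a cluster is the similarity-weighted support its neighbours give that cluster. A minimal sketch under these assumptions:

```python
def replicator_clustering(W, X, iters=50):
    """W: pairwise document similarities; X[i]: player i's mixed
    strategy (a distribution over clusters). Each round, every
    strategy's share grows in proportion to its payoff."""
    n, k = len(X), len(X[0])
    for _ in range(iters):
        new_X = []
        for i in range(n):
            payoff = [sum(W[i][j] * X[j][c] for j in range(n))
                      for c in range(k)]
            avg = sum(X[i][c] * payoff[c] for c in range(k))
            new_X.append([X[i][c] * payoff[c] / avg for c in range(k)])
        X = new_X
    return X

# Two similar pairs of documents; a slight initial bias breaks the tie.
W = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
X = replicator_clustering(W, [[0.6, 0.4], [0.5, 0.5],
                              [0.4, 0.6], [0.5, 0.5]])
```

After the dynamics converge, similar documents hold (near-)pure strategies for the same cluster, which is the clustering read off from the game.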
Automatic Identification of Scenedesmus Polymorphic Microalgae from Microscopic Images
Title | Automatic Identification of Scenedesmus Polymorphic Microalgae from Microscopic Images |
Authors | Jhony-Heriberto Giraldo-Zuluaga, Geman Diez, Alexander Gomez, Tatiana Martinez, Mariana Peñuela Vasquez, Jesus Francisco Vargas Bonilla, Augusto Salazar |
Abstract | Microalgae counting is used to measure biomass quantity. Usually, it is performed manually using a Neubauer chamber and expert criterion, with the risk of a high error rate. This paper addresses the methodology for automatic identification of Scenedesmus microalgae (used in methane production and the food industry) and applies it to images captured by a digital microscope. The use of contrast adaptive histogram equalization for pre-processing and active contours for segmentation is presented. The calculation of statistical features (Histogram of Oriented Gradients, Hu and Zernike moments) together with texture features (Haralick and Local Binary Patterns descriptors) is proposed for algae characterization. Scenedesmus algae can build coenobia consisting of 1, 2, 4, and 8 cells. The number of algae in each coenobium helps to determine the amount of lipids, proteins, and other substances in a given sample of an algae crop. Knowledge of the quantity of those elements improves the quality of bioprocess applications. Classification of coenobia achieves accuracies of 98.63% and 97.32% with a Support Vector Machine (SVM) and an Artificial Neural Network (ANN), respectively. According to the results, it is possible to consider the proposed methodology as an alternative to the traditional technique for algae counting. The database used in this paper is publicly available for download. |
Tasks | |
Published | 2016-12-21 |
URL | http://arxiv.org/abs/1612.07379v2 |
http://arxiv.org/pdf/1612.07379v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-identification-of-scenedesmus |
Repo | |
Framework | |
Robust Ensemble Classifier Combination Based on Noise Removal with One-Class SVM
Title | Robust Ensemble Classifier Combination Based on Noise Removal with One-Class SVM |
Authors | Ferhat Özgür Çatak |
Abstract | In machine learning, as the number of labeled input samples becomes very large, it is difficult to build a classification model because the input data set does not fit in memory during the training phase of the algorithm; it is therefore necessary to partition the data in order to handle the overall data set. Bagging- and boosting-based data partitioning methods have been used broadly in data mining and pattern recognition. Both of these methods have shown great potential for improving classification model performance. This study is concerned with the analysis of data set partitioning with noise removal and its impact on the performance of multiple-classifier models. We propose noise-filtering preprocessing at each data set partition to improve classifier model performance, and we apply a Gini-impurity approach to find the best split percentage for the noise filter ratio. The filtered sub-data sets are then used to train the individual ensemble models. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.02888v1 |
http://arxiv.org/pdf/1602.02888v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-ensemble-classifier-combination-based |
Repo | |
Framework | |
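A minimal sketch of the pipeline: partition the data, filter suspected noise per partition, train a model per filtered partition, and combine by majority vote. For self-containment, a simple distance-from-the-class-mean filter stands in for the one-class SVM, and a nearest-class-mean classifier stands in for the base learners; both are assumptions of this sketch, not the paper's components.

```python
def filter_noise(points, labels, keep_ratio=0.8):
    """Drop the points farthest from their class mean
    (a crude stand-in for the one-class-SVM noise filter)."""
    by_class = {}
    for p, l in zip(points, labels):
        by_class.setdefault(l, []).append(p)
    means = {l: sum(v) / len(v) for l, v in by_class.items()}
    ranked = sorted(zip(points, labels),
                    key=lambda pl: abs(pl[0] - means[pl[1]]))
    kept = ranked[:max(1, int(keep_ratio * len(ranked)))]
    return [p for p, _ in kept], [l for _, l in kept]

def train(points, labels):
    """Nearest-class-mean classifier standing in for the base learner."""
    by_class = {}
    for p, l in zip(points, labels):
        by_class.setdefault(l, []).append(p)
    means = {l: sum(v) / len(v) for l, v in by_class.items()}
    return lambda x: min(means, key=lambda l: abs(x - means[l]))

# Two partitions of 1-D data: class 0 near 0.0, class 1 near 1.0,
# each partition polluted by one mislabeled outlier.
partitions = [([0.0, 0.1, 0.9, 1.0, 5.0], [0, 0, 1, 1, 0]),
              ([0.05, 0.15, 0.95, 1.05, -4.0], [0, 0, 1, 1, 1])]
models = [train(*filter_noise(ps, ls)) for ps, ls in partitions]

def ensemble(x):
    votes = [m(x) for m in models]
    return max(set(votes), key=votes.count)
```

The filter removes each partition's outlier before training, so the per-partition models (and hence the vote) are not pulled toward the noise.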
Sub-Sampled Newton Methods II: Local Convergence Rates
Title | Sub-Sampled Newton Methods II: Local Convergence Rates |
Authors | Farbod Roosta-Khorasani, Michael W. Mahoney |
Abstract | Many data-fitting applications require the solution of an optimization problem involving a sum of a large number of functions of a high-dimensional parameter. Here, we consider the problem of minimizing a sum of $n$ functions over a convex constraint set $\mathcal{X} \subseteq \mathbb{R}^{p}$ where both $n$ and $p$ are large. In such problems, sub-sampling as a way to reduce $n$ can offer a great deal of computational efficiency. Within the context of second-order methods, we first give quantitative local convergence results for variants of Newton's method where the Hessian is uniformly sub-sampled. Using random matrix concentration inequalities, one can sub-sample in a way that preserves the curvature information. Using such a sub-sampling strategy, we establish locally Q-linear and Q-superlinear convergence rates. We also give additional convergence results for when the sub-sampled Hessian is regularized by modifying its spectrum or by Levenberg-type regularization. Finally, in addition to Hessian sub-sampling, we consider sub-sampling the gradient as a way to further reduce the computational complexity per iteration. We use approximate matrix multiplication results from randomized numerical linear algebra (RandNLA) to obtain the proper sampling strategy, and we establish locally R-linear convergence rates. In such a setting, we also show that a very aggressive sample size increase results in an R-superlinearly convergent algorithm. While the sample size depends on the condition number of the problem, our convergence rates are problem-independent, i.e., they do not depend on the quantities related to the problem. Hence, our analysis here can be used to complement the results of our basic framework from the companion paper, [38], by exploring algorithmic trade-offs that are important in practice. |
Tasks | |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04738v3 |
http://arxiv.org/pdf/1601.04738v3.pdf | |
PWC | https://paperswithcode.com/paper/sub-sampled-newton-methods-ii-local |
Repo | |
Framework | |
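The Hessian sub-sampling idea for a sum of $n$ functions can be sketched on a toy 1-D least-squares problem. This is illustrative only; the paper's analysis covers high dimensions, convex constraints, and precise sample-size conditions.

```python
import random

random.seed(1)
n = 200
a = [random.uniform(1.0, 2.0) for _ in range(n)]
b = [2.5 * ai + random.gauss(0.0, 0.1) for ai in a]

# Minimize f(x) = sum_i (a_i * x - b_i)^2.
def grad(x):
    return 2.0 * sum(ai * (ai * x - bi) for ai, bi in zip(a, b))

def subsampled_hessian(sample):
    # Scaled curvature estimate from a uniform sub-sample of the n terms;
    # the full Hessian would be 2 * sum_i a_i^2.
    return 2.0 * (n / len(sample)) * sum(a[i] ** 2 for i in sample)

x = 0.0
for _ in range(30):
    sample = random.sample(range(n), n // 4)   # sub-sample 25% of terms
    x -= grad(x) / subsampled_hessian(sample)  # Newton step, cheap Hessian

x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai ** 2 for ai in a)
```

As long as the sub-sampled curvature stays close to the true curvature (which concentration inequalities guarantee for large enough samples), each step contracts the error and the iterates converge to the least-squares solution.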
Provable learning of Noisy-or Networks
Title | Provable learning of Noisy-or Networks |
Authors | Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski |
Abstract | Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (i.e., the coordinates of a given datapoint) are explained as a probabilistic function of some hidden variables. Finding parameters with the maximum likelihood is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structures: topic models, mixture models, hidden Markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer {\em noisy-or} network, which is a textbook example of a Bayes net, and is used, for example, in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in the future, is the analysis of tensor decomposition in the presence of systematic error (i.e., where the noise/error is correlated with the signal and does not decrease as the number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity, our analysis is stated assuming that the network parameters were chosen from a probability distribution, but the method seems more generally applicable. |
Tasks | Latent Variable Models, Topic Models |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08795v1 |
http://arxiv.org/pdf/1612.08795v1.pdf | |
PWC | https://paperswithcode.com/paper/provable-learning-of-noisy-or-networks |
Repo | |
Framework | |
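The single-layer noisy-or model itself is easy to state: each active parent (disease) independently fails to trigger the child (symptom) with probability $1 - w_j$. A minimal sketch of the model, separate from the tensor-decomposition learning procedure the paper develops:

```python
def noisy_or(active_parents, weights, leak=0.0):
    """P(symptom = 1 | set of active diseases) under the noisy-or
    model: the symptom stays off only if the leak and every active
    parent all independently fail to trigger it."""
    p_off = 1.0 - leak
    for j in active_parents:
        p_off *= 1.0 - weights[j]
    return 1.0 - p_off

# Two active diseases, each triggering the symptom with probability 0.5:
# the symptom fires unless both fail, so P = 1 - 0.5 * 0.5 = 0.75.
p = noisy_or({0, 1}, weights={0: 0.5, 1: 0.5})
```

The nonlinearity is visible here: the symptom probability is 1 minus a product over parents, not a linear function of them, which is why linear matrix methods alone do not suffice.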
A Geometrical Approach to Topic Model Estimation
Title | A Geometrical Approach to Topic Model Estimation |
Authors | Zheng Tracy Ke |
Abstract | In probabilistic topic models, the quantity of interest (a low-rank matrix consisting of topic vectors) is hidden in the text corpus matrix, masked by noise, and the Singular Value Decomposition (SVD) is a potentially useful tool for learning such a low-rank matrix. However, the connection between this low-rank matrix and the singular vectors of the text corpus matrix is usually complicated and hard to spell out, so using SVD to learn topic models is challenging. In this paper, we overcome the challenge by revealing a surprising insight: there is a low-dimensional simplex structure which can be viewed as a bridge between the low-rank matrix of interest and the SVD of the text corpus matrix, and which allows us to conveniently reconstruct the former using the latter. This insight motivates a new SVD-based approach to learning topic models, which we analyze with delicate random matrix theory, deriving the rate of convergence. We support our methods and theory numerically, using both simulated data and real data. |
Tasks | Topic Models |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04478v1 |
http://arxiv.org/pdf/1608.04478v1.pdf | |
PWC | https://paperswithcode.com/paper/a-geometrical-approach-to-topic-model |
Repo | |
Framework | |
Multi Model Data mining approach for Heart failure prediction
Title | Multi Model Data mining approach for Heart failure prediction |
Authors | Priyanka H U, Vivek R |
Abstract | Developing predictive modelling solutions for risk estimation is extremely challenging in health-care informatics. Risk estimation involves integration of heterogeneous clinical sources having different representations from different health-care providers, making the task increasingly complex. Such sources are typically voluminous and diverse, and change significantly over time. Therefore, distributed and parallel computing tools, collectively termed big data tools, are needed to synthesize these sources and assist the physician in making the right clinical decisions. In this work we propose a multi-model predictive architecture, a novel approach for combining the predictive ability of multiple models for better prediction accuracy. We demonstrate the effectiveness and efficiency of the proposed work on data from the Framingham Heart Study. Results show that the proposed multi-model predictive architecture is able to provide better accuracy than the best-model approach. By modelling the errors of the predictive models we are able to choose a subset of models which yields accurate results. More information was modelled into the system by multi-level mining, which has resulted in enhanced predictive accuracy. |
Tasks | |
Published | 2016-09-29 |
URL | http://arxiv.org/abs/1609.09194v1 |
http://arxiv.org/pdf/1609.09194v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-model-data-mining-approach-for-heart |
Repo | |
Framework | |
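The multi-model idea, i.e. score candidate models on held-out data, keep the subset with the lowest error, and combine their votes, can be sketched as follows. The threshold-based toy models and the median error cutoff are assumptions of this sketch, not the paper's pipeline.

```python
def select_and_vote(models, val_x, val_y, x):
    """Keep models whose validation error is not above the median
    error, then predict by majority vote of the kept subset."""
    def error(m):
        return sum(m(v) != y for v, y in zip(val_x, val_y)) / len(val_x)
    errs = sorted(error(m) for m in models)
    median = errs[len(errs) // 2]
    kept = [m for m in models if error(m) <= median]
    votes = [m(x) for m in kept]
    return max(set(votes), key=votes.count)

# Three toy "models": two reasonable thresholds and one that is
# always wrong; error-based selection discards the bad one.
models = [lambda v: int(v > 0.5),
          lambda v: int(v > 0.4),
          lambda v: 1 - int(v > 0.5)]
val_x, val_y = [0.1, 0.3, 0.7, 0.9], [0, 0, 1, 1]
pred = select_and_vote(models, val_x, val_y, 0.8)
```

Selecting by modelled error before voting is what separates this from a plain ensemble: a consistently wrong model never gets a vote.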
Stylometric Analysis of Early Modern Period English Plays
Title | Stylometric Analysis of Early Modern Period English Plays |
Authors | Mark Eisen, Santiago Segarra, Gabriel Egan, Alejandro Ribeiro |
Abstract | Function word adjacency networks (WANs) are used to study the authorship of plays from the Early Modern English period. In these networks, nodes are function words and directed edges between two nodes represent the relative frequency of directed co-appearance of the two words. For every analyzed play, a WAN is constructed and these are aggregated to generate author profile networks. We first study the similarity of writing styles between Early English playwrights by comparing the profile WANs. The accuracy of using WANs for authorship attribution is then demonstrated by attributing known plays among six popular playwrights. Moreover, the WAN method is shown to outperform other frequency-based methods on attributing Early English plays. In addition, WANs are shown to be reliable classifiers even when attributing collaborative plays. For several plays of disputed co-authorship, a deeper analysis is performed by attributing every act and scene separately, in which we both corroborate existing breakdowns and provide evidence of new assignments. |
Tasks | |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05670v2 |
http://arxiv.org/pdf/1610.05670v2.pdf | |
PWC | https://paperswithcode.com/paper/stylometric-analysis-of-early-modern-period |
Repo | |
Framework | |
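The core data structure, a function word adjacency network, can be sketched with simple windowed co-appearance counts. Raw counts within a fixed window are a simplification of this sketch; the published method normalises into probabilities and discounts by distance.

```python
def build_wan(tokens, function_words, window=2):
    """Directed function-word adjacency counts: edge (u, v) counts
    how often function word v appears within `window` tokens after
    an occurrence of function word u."""
    wan = {}
    for i, u in enumerate(tokens):
        if u not in function_words:
            continue
        for v in tokens[i + 1:i + 1 + window]:
            if v in function_words:
                wan[(u, v)] = wan.get((u, v), 0) + 1
    return wan

wan = build_wan("the cat and the dog and a bird".split(),
                {"the", "and", "a"})
```

Per-play networks built this way are aggregated into author profiles, and attribution compares a disputed play's network against each profile.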
Resource Allocation in a MAC with and without security via Game Theoretic Learning
Title | Resource Allocation in a MAC with and without security via Game Theoretic Learning |
Authors | Shahid Mehraj Shah, Krishna Chaitanya A, Vinod Sharma |
Abstract | In this paper a $K$-user fading multiple access channel (F-MAC), with and without security constraints, is studied. First we consider an F-MAC without the security constraints. Under the assumption of individual CSI at the users, we pose the problem of power allocation as a stochastic game in which the receiver sends an ACK or a NACK depending on whether it was able to decode the message or not. We use the multiplicative-weights no-regret algorithm to obtain a Coarse Correlated Equilibrium (CCE). Then we consider the case where the users can decode each other's ACK/NACK. In this scenario we provide an algorithm to maximize the weighted sum-utility of all the users and obtain a Pareto-optimal point (PP). A PP is socially optimal but may be unfair to individual users. Next we consider the case where the users can cooperate with each other so as to reject a policy that is unfair to an individual user. We then obtain a Nash bargaining solution (NBS), which in addition to being Pareto optimal, is also fair to each user. Next we study a $K$-user fading multiple access wiretap channel with the CSI of Eve available to the users. We use the previous algorithms to obtain a CCE, a PP, and an NBS. Next we consider the case where each user does not know the CSI of Eve but only its distribution. In that case we use secrecy outage as the criterion for the receiver to send an ACK or a NACK. Here also we use the previous algorithms to obtain a CCE, a PP, or an NBS. Finally we show that our algorithms can be extended to the case where a user can transmit at different rates. At the end we provide a few examples to compute the different solutions and compare them under different CSI scenarios. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01346v1 |
http://arxiv.org/pdf/1607.01346v1.pdf | |
PWC | https://paperswithcode.com/paper/resource-allocation-in-a-mac-with-and-without |
Repo | |
Framework | |
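The multiplicative-weights no-regret update used to reach a CCE can be sketched in its simplest, fixed-payoff form; the actual game has stochastic ACK/NACK feedback and one learner per user.

```python
import math

def multiplicative_weights(payoffs, rounds=200, eta=0.2):
    """Maintain one weight per action (e.g. a power level); every
    round, multiply each weight by exp(eta * payoff) and play the
    normalised mixture. The mixture concentrates on high-payoff
    actions, yielding vanishing regret."""
    w = [1.0] * len(payoffs)
    for _ in range(rounds):
        w = [wi * math.exp(eta * p) for wi, p in zip(w, payoffs)]
    total = sum(w)
    return [wi / total for wi in w]

# Three candidate power levels with stationary utilities:
# the mixture concentrates on the best one.
mix = multiplicative_weights([0.1, 0.9, 0.5])
```

When every user runs such a no-regret learner, the empirical distribution of joint play converges to the set of coarse correlated equilibria, which is the equilibrium notion the abstract invokes.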
Stability revisited: new generalisation bounds for the Leave-one-Out
Title | Stability revisited: new generalisation bounds for the Leave-one-Out |
Authors | Alain Celisse, Benjamin Guedj |
Abstract | The present paper provides a new generic strategy leading to non-asymptotic theoretical guarantees on the Leave-one-Out procedure applied to a broad class of learning algorithms. This strategy relies on two main ingredients: the new notion of $L^q$ stability, and the strong use of moment inequalities. $L^q$ stability extends the existing notion of hypothesis stability while remaining weaker than uniform stability. It leads to new PAC exponential generalisation bounds for Leave-one-Out under mild assumptions. In the literature, such bounds are available only for uniformly stable algorithms, under boundedness assumptions for instance. As a first step, our generic strategy is applied to the Ridge regression algorithm. |
Tasks | |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06412v1 |
http://arxiv.org/pdf/1608.06412v1.pdf | |
PWC | https://paperswithcode.com/paper/stability-revisited-new-generalisation-bounds |
Repo | |
Framework | |
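The Leave-one-Out procedure the bounds apply to, instantiated for the 1-D ridge estimator the paper uses as its first application, can be sketched as:

```python
def loo_risk(xs, ys, lam):
    """Leave-one-Out risk estimate for 1-D ridge regression:
    refit w = <x, y> / (<x, x> + lam) on each (n-1)-point subset
    and score the squared error on the held-out point."""
    n = len(xs)
    errs = []
    for i in range(n):
        xtr = [x for j, x in enumerate(xs) if j != i]
        ytr = [y for j, y in enumerate(ys) if j != i]
        w = (sum(x * y for x, y in zip(xtr, ytr))
             / (sum(x * x for x in xtr) + lam))
        errs.append((ys[i] - w * xs[i]) ** 2)
    return sum(errs) / n

# Perfectly linear data: the LOO risk vanishes without
# regularisation and grows as lam shrinks the estimator.
r0 = loo_risk([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], lam=0.0)
r1 = loo_risk([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], lam=1.0)
```

The paper's $L^q$-stability bounds control how far this LOO estimate can deviate from the true risk, without requiring the uniform stability assumed by earlier bounds.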
Learning rotation invariant convolutional filters for texture classification
Title | Learning rotation invariant convolutional filters for texture classification |
Authors | Diego Marcos, Michele Volpi, Devis Tuia |
Abstract | We present a method for learning discriminative filters using a shallow Convolutional Neural Network (CNN). We encode rotation invariance directly in the model by tying the weights of groups of filters to several rotated versions of the canonical filter in the group. These filters can be used to extract rotation invariant features well-suited for image classification. We test this learning procedure on a texture classification benchmark, where the orientations of the training images differ from those of the test images. We obtain results comparable to the state-of-the-art. Compared to standard shallow CNNs, the proposed method obtains higher classification performance while reducing by an order of magnitude the number of parameters to be learned. |
Tasks | Image Classification, Texture Classification |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06720v2 |
http://arxiv.org/pdf/1604.06720v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-rotation-invariant-convolutional |
Repo | |
Framework | |
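The weight-tying idea, one canonical filter plus its rotated copies with responses pooled over orientations, can be sketched with 90-degree rotations (the paper also ties finer rotation angles via interpolation):

```python
def rot90(f):
    """Rotate a square filter by 90 degrees."""
    n = len(f)
    return [[f[n - 1 - c][r] for c in range(n)] for r in range(n)]

def oriented_response(patch, filt):
    """Correlate the patch with the canonical filter and its three
    rotated, weight-tied copies, then max-pool over orientations.
    Only the canonical filter's weights are free parameters."""
    rots = [filt]
    for _ in range(3):
        rots.append(rot90(rots[-1]))
    dot = lambda a, b: sum(a[i][j] * b[i][j]
                           for i in range(len(a)) for j in range(len(a)))
    return max(dot(r, patch) for r in rots)

patch, filt = [[1, 2], [3, 4]], [[1, 0], [0, 0]]
r = oriented_response(patch, filt)
```

Because rotating the input only permutes which tied copy fires, the max-pooled response is invariant to 90-degree rotations of the patch, and the parameter count stays that of a single filter.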
Sequential ranking under random semi-bandit feedback
Title | Sequential ranking under random semi-bandit feedback |
Authors | Hossein Vahabi, Paul Lagrée, Claire Vernade, Olivier Cappé |
Abstract | In many web applications, a recommendation is not a single item suggested to a user but a list of possibly interesting contents that may be ranked in some contexts. The combinatorial bandit problem has been studied quite extensively over the last two years, and many theoretical results now exist: lower bounds on the regret and asymptotically optimal algorithms. However, because of the variety of situations that can be considered, these results are designed to solve the problem for a specific reward structure, such as the Cascade Model. The present work focuses on the problem of ranking items when the user is allowed to click on several items while scanning the list from top to bottom. |
Tasks | |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01450v2 |
http://arxiv.org/pdf/1603.01450v2.pdf | |
PWC | https://paperswithcode.com/paper/sequential-ranking-under-random-semi-bandit |
Repo | |
Framework | |
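The setting, show a ranked list and observe a click on every displayed item as the user scans top to bottom, can be simulated with a simple per-item index. Plain UCB1 and the synthetic click probabilities below are assumptions of this toy; the paper analyses KL-UCB-style algorithms.

```python
import math
import random

random.seed(0)
true_click_prob = [0.8, 0.5, 0.2]     # hypothetical item attractiveness
n_items, list_len = 3, 2
shows, clicks = [0] * n_items, [0] * n_items

for t in range(1, 2001):
    # UCB1 index per item; display the list_len highest-index items.
    ucb = [clicks[i] / shows[i] + math.sqrt(2 * math.log(t) / shows[i])
           if shows[i] else float("inf") for i in range(n_items)]
    ranking = sorted(range(n_items), key=lambda i: -ucb[i])[:list_len]
    for i in ranking:                  # semi-bandit feedback: a click
        shows[i] += 1                  # outcome is observed for every
        clicks[i] += random.random() < true_click_prob[i]  # shown item

means = [clicks[i] / shows[i] for i in range(n_items)]
```

Semi-bandit feedback (one observation per displayed item, rather than one for the whole list) is what lets each item's estimate be updated independently.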