Paper Group ANR 125
Performance Comparison of Algorithms for Movie Rating Estimation
Title | Performance Comparison of Algorithms for Movie Rating Estimation |
Authors | Alper Kose, Can Kanbak, Noyan Evirgen |
Abstract | In this paper, our goal is to compare the performance of three different algorithms for predicting the ratings that potential users will give to movies, given a user-movie rating matrix of past observations. To this end, we evaluate User-Based Collaborative Filtering, Iterative Matrix Factorization and Yehuda Koren’s integrated model combining neighborhood and factorization, using root mean square error (RMSE) as the performance evaluation metric. In short, we do not observe significant differences among their performances, especially when the increase in complexity is considered. We can conclude that Iterative Matrix Factorization performs fairly well despite its simplicity. |
Tasks | |
Published | 2017-11-05 |
URL | http://arxiv.org/abs/1711.01647v1 |
http://arxiv.org/pdf/1711.01647v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-comparison-of-algorithms-for |
Repo | |
Framework | |
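The shared ingredients of the comparison, RMSE as the metric and a latent-factor rating model, can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the rating matrix, hyperparameters, and the plain-SGD fitting loop below are all invented for demonstration.

```python
import math
import random

def rmse(pred, truth):
    """Root mean square error over the observed (user, movie) ratings."""
    return math.sqrt(sum((pred[k] - truth[k]) ** 2 for k in truth) / len(truth))

def factorize(ratings, n_users, n_movies, k=2, lr=0.02, reg=0.05, epochs=500, seed=0):
    """Plain SGD matrix factorization: rating(u, m) is modeled as U[u] . V[m]."""
    rnd = random.Random(seed)
    U = [[rnd.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rnd.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]
    for _ in range(epochs):
        for (u, m), r in ratings.items():
            e = r - sum(U[u][f] * V[m][f] for f in range(k))
            for f in range(k):
                uf, vf = U[u][f], V[m][f]
                U[u][f] += lr * (e * vf - reg * uf)
                V[m][f] += lr * (e * uf - reg * vf)
    return U, V

# Toy 3-user x 3-movie rating matrix (observed entries only).
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 2, (2, 2): 1}
U, V = factorize(ratings, 3, 3)
pred = {(u, m): sum(U[u][f] * V[m][f] for f in range(2)) for (u, m) in ratings}
print(round(rmse(pred, ratings), 2))
```

The printed value is a training-set RMSE on toy data; the paper of course evaluates RMSE on held-out ratings.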
Theory II: Landscape of the Empirical Risk in Deep Learning
Title | Theory II: Landscape of the Empirical Risk in Deep Learning |
Authors | Qianli Liao, Tomaso Poggio |
Abstract | Previous theoretical work on deep learning and neural network optimization tends to focus on avoiding saddle points and local minima. However, the practical observation is that, at least for the most successful Deep Convolutional Neural Networks (DCNNs), practitioners can always increase the network size to fit the training data (an extreme example being [1]). The most successful DCNNs, such as VGG and ResNets, are best used with a degree of “overparametrization”. In this work, we characterize, with a mix of theory and experiments, the landscape of the empirical risk of overparametrized DCNNs. We first prove, in the regression framework, the existence of a large number of degenerate global minimizers with zero empirical error (modulo inconsistent equations). The argument, which relies on Bezout’s theorem, is rigorous when the ReLUs are replaced by a polynomial nonlinearity (which empirically works as well). As described in our Theory III paper [2], the same minimizers are degenerate and thus very likely to be found by SGD, which will furthermore select with higher probability the most robust zero-minimizer. We further experimentally explored and visualized the landscape of the empirical risk of a DCNN on CIFAR-10 during the entire training process, and especially the global minima. Finally, based on our theoretical and experimental results, we propose an intuitive model of the landscape of a DCNN’s empirical loss surface, which might not be as complicated as commonly believed. |
Tasks | |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09833v2 |
http://arxiv.org/pdf/1703.09833v2.pdf | |
PWC | https://paperswithcode.com/paper/theory-ii-landscape-of-the-empirical-risk-in |
Repo | |
Framework | |
Users prefer Guetzli JPEG over same-sized libjpeg
Title | Users prefer Guetzli JPEG over same-sized libjpeg |
Authors | Jyrki Alakuijala, Robert Obryk, Zoltan Szabadka, Jan Wassenberg |
Abstract | We report on pairwise comparisons by human raters of JPEG images from libjpeg and our new Guetzli encoder. Although the files are size-matched, 75% of ratings favor Guetzli. This implies that the Butteraugli psychovisual image similarity metric, which guides Guetzli, is reasonably close to human perception at high quality levels. We provide access to the raw ratings and source images for further analysis and study. |
Tasks | |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04416v1 |
http://arxiv.org/pdf/1703.04416v1.pdf | |
PWC | https://paperswithcode.com/paper/users-prefer-guetzli-jpeg-over-same-sized |
Repo | |
Framework | |
To tune or not to tune the number of trees in random forest?
Title | To tune or not to tune the number of trees in random forest? |
Authors | Philipp Probst, Anne-Laure Boulesteix |
Abstract | The number of trees T in the random forest (RF) algorithm for supervised learning has to be set by the user. It is controversial whether T should simply be set to the largest computationally manageable value, or whether a smaller T may in some cases be better. While the principle underlying bagging is that “more trees are better”, in practice the classification error rate sometimes reaches a minimum before increasing again as the number of trees grows. The goal of this paper is four-fold: (i) providing theoretical results showing that the expected error rate may be a non-monotonous function of the number of trees, and explaining under which circumstances this happens; (ii) providing theoretical results showing that such non-monotonous patterns cannot be observed for other performance measures such as the Brier score and the logarithmic loss (for classification) and the mean squared error (for regression); (iii) illustrating the extent of the problem through an application to a large number (n = 306) of datasets from the public database OpenML; (iv) finally, arguing in favor of setting T to a computationally feasible large number, depending on the convergence properties of the desired performance measure. |
Tasks | |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05654v1 |
http://arxiv.org/pdf/1705.05654v1.pdf | |
PWC | https://paperswithcode.com/paper/to-tune-or-not-to-tune-the-number-of-trees-in |
Repo | |
Framework | |
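The contrast between (i) and (ii) can be reproduced exactly in a stylized forest where each tree votes correctly with a fixed per-observation probability. This is an illustration of the phenomenon under invented probabilities, not the paper's derivation: the mixture of "easy" and "hard" observations below is a toy choice.

```python
from math import comb

def vote_error(T, p):
    """P(majority of T trees is wrong) when each tree is right w.p. p (odd T)."""
    return sum(comb(T, k) * p**k * (1 - p)**(T - k) for k in range(T // 2 + 1))

def brier(T, p):
    """Expected Brier score of the forest's averaged vote (true class coded 1)."""
    return (1 - p) ** 2 + p * (1 - p) / T

# 90% "easy" observations (each tree right w.p. 0.9)
# and 10% "hard" ones (each tree right w.p. 0.45).
Ts = [1, 11, 101, 1001]
errors = [0.9 * vote_error(T, 0.9) + 0.1 * vote_error(T, 0.45) for T in Ts]
briers = [0.9 * brier(T, 0.9) + 0.1 * brier(T, 0.45) for T in Ts]
print([round(e, 4) for e in errors])  # dips, then creeps back up: non-monotonous
print([round(b, 4) for b in briers])  # strictly decreasing in T
```

The error rate reaches a minimum at a moderate T and then rises again (the hard observations' majority vote converges to the wrong class), while the Brier score improves monotonically with T.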
An Automatic Contextual Analysis and Clustering Classifiers Ensemble approach to Sentiment Analysis
Title | An Automatic Contextual Analysis and Clustering Classifiers Ensemble approach to Sentiment Analysis |
Authors | Murtadha Talib AL-Sharuee, Fei Liu, Mahardhika Pratama |
Abstract | Product reviews are one of the major resources for determining public sentiment. The existing literature on review sentiment analysis mainly utilizes the supervised paradigm, which needs labeled training data and suffers from domain dependency. This article addresses these issues by describing a completely automatic approach to sentiment analysis based on unsupervised ensemble learning. The method consists of two phases. The first phase is contextual analysis, which has five processes, namely (1) data preparation; (2) spelling correction; (3) intensifier handling; (4) negation handling and (5) contrast handling. The second phase comprises the unsupervised learning approach, which is an ensemble of clustering classifiers using a majority voting mechanism with different weight schemes. The base classifier of the ensemble method is a modified k-means algorithm, in which the initial centroids are extracted from the feature set using SentiWordNet (SWN). We also introduce new sentiment analysis problems of Australian airlines and home builders, which offer potential benchmark problems in the sentiment analysis field. Our experiments on datasets from different domains show that the contextual analysis and ensemble phases improve clustering performance in terms of accuracy, stability and generalization ability. |
Tasks | Sentiment Analysis, Spelling Correction |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10130v1 |
http://arxiv.org/pdf/1705.10130v1.pdf | |
PWC | https://paperswithcode.com/paper/an-automatic-contextual-analysis-and |
Repo | |
Framework | |
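The lexicon-seeded ("modified") k-means idea can be sketched on one-dimensional review polarity scores. Everything below is invented for illustration: the scores stand in for per-review sums of lexicon word polarities, and the seed centroids stand in for SentiWordNet-derived positive/negative anchors; no real SWN lookup is performed.

```python
def seeded_kmeans(scores, centroids, iters=20):
    """k-means whose initial centroids come from a sentiment lexicon
    rather than from random data points."""
    centroids = list(centroids)
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for x in scores:
            j = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            clusters[j].append(x)
        # Recompute centroids; an empty cluster keeps its old centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical per-review polarity scores.
scores = [0.8, 0.6, 0.9, -0.7, -0.5, -0.9, 0.4, -0.3]
# Seed the two centroids at lexicon-style positive/negative anchors.
centroids, clusters = seeded_kmeans(scores, [1.0, -1.0])
print(sorted(clusters[0]))  # [0.4, 0.6, 0.8, 0.9]  -> "positive" cluster
print(sorted(clusters[1]))  # [-0.9, -0.7, -0.5, -0.3] -> "negative" cluster
```

Seeding fixes which cluster means "positive" and which means "negative", which is what lets the ensemble vote coherently without labels.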
Sparsity, variance and curvature in multi-armed bandits
Title | Sparsity, variance and curvature in multi-armed bandits |
Authors | Sébastien Bubeck, Michael B. Cohen, Yuanzhi Li |
Abstract | In (online) learning theory the concepts of sparsity, variance and curvature are well understood and are routinely used to obtain refined regret and generalization bounds. In this paper we further our understanding of these concepts in the more challenging limited-feedback scenario. We consider the adversarial multi-armed bandit and linear bandit settings and solve several open problems pertaining to the existence of algorithms with favorable regret bounds under the following assumptions: (i) sparsity of the individual losses, (ii) small variation of the loss sequence, and (iii) curvature of the action set. Specifically we show that (i) for $s$-sparse losses one can obtain $\tilde{O}(\sqrt{s T})$-regret (solving an open problem by Kwon and Perchet), (ii) for loss sequences with variation bounded by $Q$ one can obtain $\tilde{O}(\sqrt{Q})$-regret (solving an open problem by Kale and Hazan), and (iii) for linear bandit on an $\ell_p^n$ ball one can obtain $\tilde{O}(\sqrt{n T})$-regret for $p \in [1,2]$ and one has $\tilde{\Omega}(n \sqrt{T})$-regret for $p>2$ (solving an open problem by Bubeck, Cesa-Bianchi and Kakade). A key new insight to obtain these results is to use regularizers satisfying more refined conditions than general self-concordance. |
Tasks | Multi-Armed Bandits |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.01037v1 |
http://arxiv.org/pdf/1711.01037v1.pdf | |
PWC | https://paperswithcode.com/paper/sparsity-variance-and-curvature-in-multi |
Repo | |
Framework | |
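The paper's algorithms use carefully chosen mirror-descent regularizers that don't fit a short sketch, but the limited-feedback setting they refine is easy to demonstrate with the standard Exp3 baseline: only the chosen arm's loss is observed each round, and an importance-weighted estimate stands in for the full loss vector. The loss sequence and learning rate below are toy choices, not from the paper.

```python
import math
import random

def exp3(loss_rows, eta, seed=0):
    """Exp3 for adversarial multi-armed bandits: each round only the chosen
    arm's loss is observed, fed back as an unbiased importance-weighted estimate."""
    rnd = random.Random(seed)
    n = len(loss_rows[0])
    est_loss = [0.0] * n               # cumulative estimated loss per arm
    total = 0.0
    for losses in loss_rows:
        z = [math.exp(-eta * w) for w in est_loss]
        s = sum(z)
        probs = [v / s for v in z]     # exponential weights over arms
        arm = rnd.choices(range(n), probs)[0]
        total += losses[arm]
        est_loss[arm] += losses[arm] / probs[arm]  # unbiased loss estimate
    return total

# Toy adversarial sequence in which arm 0 is the best fixed arm.
T, n = 2000, 3
seq = [[0.2 if i == 0 else 0.6 for i in range(n)] for _ in range(T)]
algo_loss = exp3(seq, eta=math.sqrt(math.log(n) / (n * T)))
best_loss = min(sum(row[i] for row in seq) for i in range(n))
regret = algo_loss - best_loss
print(regret >= 0)  # True; regret grows sublinearly in T
```

The refinements in the paper replace the negative-entropy regularizer implicit in Exp3 with ones adapted to sparsity, variation, or the geometry of the action set.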
A Hierarchical Probabilistic Model for Facial Feature Detection
Title | A Hierarchical Probabilistic Model for Facial Feature Detection |
Authors | Yue Wu, Ziheng Wang, Qiang Ji |
Abstract | Facial feature detection from facial images has attracted great attention in the field of computer vision. It is a nontrivial task since the appearance and shape of the face tend to change under different conditions. In this paper, we propose a hierarchical probabilistic model that can infer the true locations of facial features given the image measurements, even when the face exhibits significant expression and pose variation. The hierarchical model implicitly captures the lower-level shape variations of facial components using a mixture model. Furthermore, at the higher level, it also learns the joint relationship among facial components, facial expression, and pose through automatic structure learning and parameter estimation of the probabilistic model. Experimental results on benchmark databases demonstrate the effectiveness of the proposed hierarchical probabilistic model. |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.05732v1 |
http://arxiv.org/pdf/1709.05732v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hierarchical-probabilistic-model-for-facial |
Repo | |
Framework | |
Restricted Boltzmann Machines for Robust and Fast Latent Truth Discovery
Title | Restricted Boltzmann Machines for Robust and Fast Latent Truth Discovery |
Authors | Klaus Broelemann, Thomas Gottron, Gjergji Kasneci |
Abstract | We address the problem of latent truth discovery, LTD for short, where the goal is to discover the underlying true values of entity attributes in the presence of noisy, conflicting or incomplete information. Despite a multitude of algorithms in the literature that address the LTD problem, little is known about their overall performance with respect to effectiveness (in terms of truth discovery capabilities), efficiency and robustness. A practical LTD approach should satisfy all these characteristics so that it can be applied to heterogeneous datasets of varying quality and degrees of cleanliness. We propose a novel algorithm for LTD that satisfies the above requirements. The proposed model is based on Restricted Boltzmann Machines, and is thus coined LTD-RBM. In extensive experiments on various heterogeneous and publicly available datasets, LTD-RBM is superior to state-of-the-art LTD techniques in terms of an overall consideration of effectiveness, efficiency and robustness. |
Tasks | |
Published | 2017-12-31 |
URL | http://arxiv.org/abs/1801.00283v1 |
http://arxiv.org/pdf/1801.00283v1.pdf | |
PWC | https://paperswithcode.com/paper/restricted-boltzmann-machines-for-robust-and |
Repo | |
Framework | |
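For context, the simplest LTD estimator that this line of work improves on is plain majority voting over conflicting source claims. A minimal sketch on invented toy claims; the paper's LTD-RBM additionally learns source reliability, which voting ignores and which is not shown here.

```python
from collections import Counter, defaultdict

def majority_truth(claims):
    """Baseline latent truth discovery: for each entity attribute, keep the
    value asserted by the most sources (ties broken by first-seen order)."""
    by_attr = defaultdict(list)
    for source, attr, value in claims:
        by_attr[attr].append(value)
    return {attr: Counter(vals).most_common(1)[0][0]
            for attr, vals in by_attr.items()}

# Conflicting claims from four sources about two entity attributes.
claims = [
    ("s1", "einstein.birth_year", 1879),
    ("s2", "einstein.birth_year", 1879),
    ("s3", "einstein.birth_year", 1878),  # noisy source
    ("s1", "bohr.birth_year", 1885),
    ("s4", "bohr.birth_year", 1885),
]
print(majority_truth(claims))
# {'einstein.birth_year': 1879, 'bohr.birth_year': 1885}
```

Voting fails when unreliable sources outnumber reliable ones on an attribute, which is exactly the regime where reliability-aware models such as LTD-RBM are meant to help.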
Empirical Analysis of the Necessary and Sufficient Conditions of the Echo State Property
Title | Empirical Analysis of the Necessary and Sufficient Conditions of the Echo State Property |
Authors | Sebastián Basterrech |
Abstract | The Echo State Network (ESN) is a specific recurrent network that has gained popularity in recent years. The model contains a recurrent network, named the reservoir, that is fixed during the learning process. The reservoir is used to transform the input space into a larger space. A fundamental property that impacts model accuracy is the Echo State Property (ESP). There are two main theoretical results related to the ESP. First, a sufficient condition for the existence of the ESP, which involves the singular values of the reservoir matrix. Second, a necessary condition for the ESP: the ESP can be violated depending on the spectral radius of the reservoir matrix. There is a theoretical gap between these necessary and sufficient conditions. This article presents an empirical analysis of the accuracy and the projections of reservoirs that fall within this theoretical gap. It gives some insights into the generation of the reservoir matrix. From previous work, it is already known that optimal accuracy is obtained near the border of stability of the dynamics. According to our empirical results, this border seems to be closer to the sufficient condition than to the necessary condition of the ESP. |
Tasks | |
Published | 2017-03-20 |
URL | http://arxiv.org/abs/1703.06664v1 |
http://arxiv.org/pdf/1703.06664v1.pdf | |
PWC | https://paperswithcode.com/paper/empirical-analysis-of-the-necessary-and |
Repo | |
Framework | |
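Both conditions can be checked numerically: the sufficient condition requires the largest singular value of the reservoir matrix to be below 1, the necessary condition requires the spectral radius to be below 1. A sketch with a hand-picked 2x2 "reservoir" that falls inside the gap between them (the matrix is triangular, so its eigenvalues can be read off the diagonal; real reservoirs are large and random, which this toy does not attempt).

```python
import math

# Toy reservoir chosen to sit in the theoretical gap:
# spectral radius < 1 (necessary condition holds) but
# largest singular value > 1 (sufficient condition violated).
W = [[0.8, 1.0],
     [0.0, 0.3]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def largest_singular_value(M, iters=200):
    """Power iteration on M^T M; the top eigenvalue's square root is sigma_max."""
    v = [1.0] * len(M[0])
    Mt = transpose(M)
    for _ in range(iters):
        w = matvec(Mt, matvec(M, v))
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = matvec(M, v)
    return math.sqrt(sum(x * x for x in w))  # ||Mv|| for the top singular vector

# For a triangular matrix the eigenvalues sit on the diagonal.
rho = max(abs(W[i][i]) for i in range(len(W)))
sigma = largest_singular_value(W)
print(rho < 1 < sigma)  # True: inside the gap between the two conditions
```

In ESN practice the reservoir is typically generated randomly and then rescaled so that one of these two quantities hits a target value; the paper's empirical question is which of the two borders the optimum sits closer to.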
Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality
Title | Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality |
Authors | Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar |
Abstract | Today, the optimal performance of existing noise-suppression algorithms, both data-driven and those based on classic statistical methods, is range-bound to specific levels of instantaneous input signal-to-noise ratios. In this paper, we present a new approach that improves the adaptivity of such algorithms, enabling them to perform robustly across a wide range of input signal and noise types. Our methodology is based on the dynamic control of algorithmic parameters via reinforcement learning. Specifically, we model the noise-suppression module as a black box, requiring no knowledge of the algorithmic mechanics except a simple feedback from the output. We utilize this feedback as the reward signal for a reinforcement-learning agent that learns a policy to adapt the algorithmic parameters for every incoming audio frame (16 ms of data). Our preliminary results show that such a control mechanism can substantially increase the overall performance of the underlying noise-suppression algorithm; 42% and 16% improvements in output SNR and MSE, respectively, when compared to no adaptivity. |
Tasks | Speech Enhancement |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10791v2 |
http://arxiv.org/pdf/1711.10791v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-to-adapt-speech |
Repo | |
Framework | |
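The control idea can be sketched as a tabular epsilon-greedy agent that picks a suppression level per frame purely from black-box reward feedback. Everything below is a stand-in: the reward function, the SNR buckets, and the three suppression levels are invented for illustration and do not model the actual enhancement module or its parameters.

```python
import random

def reward(snr_bucket, level):
    """Hypothetical black-box feedback: noisy frames (bucket 0) prefer
    aggressive suppression (level 2); clean frames (bucket 2) prefer gentle."""
    best = {0: 2, 1: 1, 2: 0}[snr_bucket]
    return 1.0 - 0.4 * abs(level - best)

def train(steps=5000, eps=0.1, seed=0):
    """Epsilon-greedy bandit per frame-quality bucket, running-mean values."""
    rnd = random.Random(seed)
    n_buckets, n_levels = 3, 3
    value = [[0.0] * n_levels for _ in range(n_buckets)]
    count = [[0] * n_levels for _ in range(n_buckets)]
    for _ in range(steps):
        s = rnd.randrange(n_buckets)          # incoming frame's SNR bucket
        if rnd.random() < eps:
            a = rnd.randrange(n_levels)       # explore
        else:
            a = max(range(n_levels), key=lambda i: value[s][i])  # exploit
        r = reward(s, a)
        count[s][a] += 1
        value[s][a] += (r - value[s][a]) / count[s][a]  # running mean
    return [max(range(n_levels), key=lambda i: value[s][i]) for s in range(n_buckets)]

print(train())  # learned suppression level per SNR bucket
```

The agent learns a different parameter setting for each input-quality regime from reward alone, which is the essence of treating the suppression module as a black box.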
Algorithms for Approximate Subtropical Matrix Factorization
Title | Algorithms for Approximate Subtropical Matrix Factorization |
Authors | Sanjar Karaev, Pauli Miettinen |
Abstract | Matrix factorization methods are important tools in data mining and analysis. They can be used for many tasks, ranging from dimensionality reduction to visualization. In this paper we concentrate on the use of matrix factorizations for finding patterns in the data. Rather than using the standard algebra, where the approximation of the original matrix is built by summing rank-1 components, we use the subtropical algebra: an algebra over the nonnegative real values in which summation is replaced by the maximum operator. Subtropical matrix factorizations allow “winner-takes-all” interpretations of the rank-1 components, revealing different structure than the normal (nonnegative) factorizations. We study the complexity and sparsity of the factorizations, and present a framework for finding low-rank subtropical factorizations. We present two specific algorithms, called Capricorn and Cancer, that are part of our framework. They can be used with data that has been corrupted with different types of noise, and with different error metrics, including the sum-of-absolute differences, Frobenius norm, and Jensen–Shannon divergence. Our experiments show that the algorithms perform well on data that has subtropical structure, and that they can find factorizations that are both sparse and easy to interpret. |
Tasks | Dimensionality Reduction |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.08872v1 |
http://arxiv.org/pdf/1707.08872v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-approximate-subtropical-matrix |
Repo | |
Framework | |
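The max-times product underlying these factorizations is simple to state: the sum over rank-1 components is replaced by an elementwise maximum, so each cell of the reconstruction is determined by a single "winning" component. A small sketch with invented factor matrices (finding good factors, as Capricorn and Cancer do, is the hard part and is not shown):

```python
def maxtimes_reconstruct(B, C):
    """Subtropical (max-times) product: sum over rank-1 terms replaced by max,
    giving a 'winner-takes-all' combination of the factors."""
    n, k, m = len(B), len(B[0]), len(C[0])
    return [[max(B[i][f] * C[f][j] for f in range(k)) for j in range(m)]
            for i in range(n)]

# Two rank-1 "patterns"; each output cell keeps only the dominant one.
B = [[2.0, 0.0],
     [1.0, 1.0],
     [0.0, 3.0]]
C = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 2.0]]
A = maxtimes_reconstruct(B, C)
print(A)  # [[2.0, 4.0, 0.0], [1.0, 2.0, 2.0], [0.0, 3.0, 6.0]]
```

Note how the middle row mixes both patterns cell by cell rather than adding them, which is what makes the components individually interpretable.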
Evidence Against Evidence Theory (?!)
Title | Evidence Against Evidence Theory (?!) |
Authors | Mieczysław A. Kłopotek, Andrzej Matuszewski |
Abstract | This paper is concerned with the apparent greatest weakness of the Mathematical Theory of Evidence (MTE) of Shafer [1976], strongly criticized by Wasserman [1992]: its relationship to frequencies. Weaknesses of various proposed probabilistic interpretations of MTE belief functions are demonstrated. A new frequency-based interpretation is presented that overcomes various drawbacks of earlier interpretations. |
Tasks | |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02929v1 |
http://arxiv.org/pdf/1706.02929v1.pdf | |
PWC | https://paperswithcode.com/paper/evidence-against-evidence-theory |
Repo | |
Framework | |
Code Completion with Neural Attention and Pointer Networks
Title | Code Completion with Neural Attention and Pointer Networks |
Authors | Jian Li, Yue Wang, Michael R. Lyu, Irwin King |
Abstract | Intelligent code completion has become an essential research task for accelerating modern software development. To facilitate effective code completion for dynamically-typed programming languages, we apply neural language models learned from large codebases, and develop a tailored attention mechanism for code completion. However, standard neural language models, even with an attention mechanism, cannot correctly predict out-of-vocabulary (OoV) words, which restricts code completion performance. In this paper, inspired by the prevalence of locally repeated terms in program source code and the recently proposed pointer copy mechanism, we propose a pointer mixture network for better predicting OoV words in code completion. Based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an RNN component, or regenerate an OoV word from the local context through a pointer component. Experiments on two benchmark datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task. |
Tasks | |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09573v2 |
http://arxiv.org/pdf/1711.09573v2.pdf | |
PWC | https://paperswithcode.com/paper/code-completion-with-neural-attention-and |
Repo | |
Framework | |
Deep learning analysis of the myocardium in coronary CT angiography for identification of patients with functionally significant coronary artery stenosis
Title | Deep learning analysis of the myocardium in coronary CT angiography for identification of patients with functionally significant coronary artery stenosis |
Authors | Majd Zreik, Nikolas Lessmann, Robbert W. van Hamersvelt, Jelmer M. Wolterink, Michiel Voskuil, Max A. Viergever, Tim Leiner, Ivana Išgum |
Abstract | In patients with coronary artery stenoses of intermediate severity, the functional significance needs to be determined. Fractional flow reserve (FFR) measurement, performed during invasive coronary angiography (ICA), is most often used in clinical practice. To reduce the number of ICA procedures, we present a method for automatic identification of patients with functionally significant coronary artery stenoses, employing deep learning analysis of the left ventricle (LV) myocardium in rest coronary CT angiography (CCTA). The study includes consecutively acquired CCTA scans of 166 patients with FFR measurements. To identify patients with a functionally significant coronary artery stenosis, analysis is performed in several stages. First, the LV myocardium is segmented using a multiscale convolutional neural network (CNN). To characterize the segmented LV myocardium, it is subsequently encoded using an unsupervised convolutional autoencoder (CAE). Thereafter, patients are classified according to the presence of functionally significant stenosis using an SVM classifier based on the extracted and clustered encodings. Quantitative evaluation of LV myocardium segmentation in 20 images resulted in an average Dice coefficient of 0.91 and an average mean absolute distance between the segmented and reference LV boundaries of 0.7 mm. Classification of patients was evaluated in the remaining 126 CCTA scans in 50 10-fold cross-validation experiments and resulted in an area under the receiver operating characteristic curve of 0.74 ± 0.02. At sensitivity levels of 0.60, 0.70 and 0.80, the corresponding specificity was 0.77, 0.71 and 0.59, respectively. The results demonstrate that automatic analysis of the LV myocardium in a single CCTA scan acquired at rest, without assessment of the anatomy of the coronary arteries, can be used to identify patients with functionally significant coronary artery stenosis. |
Tasks | |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.08917v2 |
http://arxiv.org/pdf/1711.08917v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-analysis-of-the-myocardium-in |
Repo | |
Framework | |
Data set operations to hide decision tree rules
Title | Data set operations to hide decision tree rules |
Authors | Dimitris Kalles, Vassilios S. Verykios, Georgios Feretzakis, Athanasios Papagelis |
Abstract | This paper focuses on preserving the privacy of sensitive patterns when inducing decision trees. We adopt a record augmentation approach for hiding sensitive classification rules in binary datasets. Such a hiding methodology is preferred over other heuristic solutions, like output perturbation or cryptographic techniques, which restrict the usability of the data, since the raw data itself remains readily available for public use. We show some key lemmas related to the hiding process, and we also demonstrate the methodology with an example and an indicative experiment using a prototype hiding tool. |
Tasks | |
Published | 2017-06-18 |
URL | http://arxiv.org/abs/1706.05733v1 |
http://arxiv.org/pdf/1706.05733v1.pdf | |
PWC | https://paperswithcode.com/paper/data-set-operations-to-hide-decision-tree |
Repo | |
Framework | |