Paper Group ANR 609
Discovering Playing Patterns: Time Series Clustering of Free-To-Play Game Data
Title | Discovering Playing Patterns: Time Series Clustering of Free-To-Play Game Data |
Authors | Alain Saas, Anna Guitart, África Periáñez |
Abstract | The classification of time series data is a challenge common to all data-driven fields. However, there is no agreement about which techniques are most efficient for grouping unlabeled time-ordered data. This is because a successful classification of time series patterns depends on the goal and the domain of interest, i.e. it is application-dependent. In this article, we study free-to-play game data. In this domain, clustering similar time series information is increasingly important due to the large amount of data collected by current mobile and web applications. We evaluate which methods accurately cluster time series of mobile games, focusing on player behavior data. We identify and validate several aspects of the clustering: the similarity measures and the representation techniques used to reduce the high dimensionality of time series. As a robustness test, we compare various temporal datasets of player activity from two free-to-play video games. With these techniques we extract temporal patterns of player behavior relevant for the evaluation of game events and game-business diagnosis. Our experiments provide intuitive visualizations to validate the results of the clustering and to determine the optimal number of clusters. Additionally, we assess the common characteristics of players belonging to the same group. This study allows us to improve our understanding of player dynamics and churn behavior. |
Tasks | Time Series, Time Series Clustering |
Published | 2017-10-06 |
URL | http://arxiv.org/abs/1710.02268v1 |
http://arxiv.org/pdf/1710.02268v1.pdf | |
PWC | https://paperswithcode.com/paper/discovering-playing-patterns-time-series |
Repo | |
Framework | |
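The abstract above stresses that the choice of similarity measure drives time-series clustering quality. A classic elastic measure used for exactly this kind of player-activity data is Dynamic Time Warping; here is a minimal pure-Python sketch (not the authors' code, and the activity curves are made up):

```python
def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic-programming DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    # dp[i][j] = cost of the best warping path aligning a[:i] with b[:j]
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # insertion
                                  dp[i][j - 1],      # deletion
                                  dp[i - 1][j - 1])  # match
    return dp[n][m]

# Two daily-activity curves with the same shape but shifted in time align cheaply,
# which Euclidean distance would miss.
burst_early = [0, 5, 5, 0, 0, 0]
burst_late  = [0, 0, 0, 5, 5, 0]
flat        = [1, 1, 1, 1, 1, 1]
print(dtw_distance(burst_early, burst_late))  # → 0.0: identical shape after warping
print(dtw_distance(burst_early, flat))        # → 12.0: genuinely different shape
```

A DTW distance matrix like this can then be fed to any clustering routine (hierarchical, k-medoids, etc.).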
Multilingual Speech Recognition With A Single End-To-End Model
Title | Multilingual Speech Recognition With A Single End-To-End Model |
Authors | Shubham Toshniwal, Tara N. Sainath, Ron J. Weiss, Bo Li, Pedro Moreno, Eugene Weinstein, Kanishka Rao |
Abstract | Training a conventional automatic speech recognition (ASR) system to support multiple languages is challenging because the sub-word unit, lexicon and word inventories are typically language specific. In contrast, sequence-to-sequence models are well suited for multilingual ASR because they encapsulate an acoustic, pronunciation and language model jointly in a single network. In this work we present a single sequence-to-sequence ASR model trained on 9 different Indian languages, which have very little overlap in their scripts. Specifically, we take a union of language-specific grapheme sets and train a grapheme-based sequence-to-sequence model jointly on data from all languages. We find that this model, which is not explicitly given any information about language identity, improves recognition performance by 21% relative compared to analogous sequence-to-sequence models trained on each language individually. By modifying the model to accept a language identifier as an additional input feature, we further improve performance by an additional 7% relative and eliminate confusion between different languages. |
Tasks | Language Modelling, Speech Recognition |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.01694v2 |
http://arxiv.org/pdf/1711.01694v2.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-speech-recognition-with-a-single |
Repo | |
Framework | |
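The multilingual model above takes a union of language-specific grapheme sets and optionally prepends a language-ID token to the input. The vocabulary-building side can be sketched in a few lines (the corpora and the `<lang>` tag format here are my own illustration, not the paper's):

```python
def build_joint_vocab(corpora):
    """Union of language-specific grapheme sets: a joint multilingual target vocabulary."""
    vocab = set()
    for sentences in corpora.values():
        for s in sentences:
            vocab.update(s)  # graphemes = individual characters
    return sorted(vocab)

def to_targets(sentence, lang=None):
    """Grapheme target sequence, optionally prefixed with a language-ID token."""
    return ([f"<{lang}>"] if lang else []) + list(sentence)

# Scripts with little overlap, as for the 9 Indian languages in the paper.
corpora = {"hi": ["नमस्ते"], "ta": ["வணக்கம்"], "en": ["hello"]}
vocab = build_joint_vocab(corpora)
print(len(vocab))
print(to_targets("hello", lang="en"))  # ['<en>', 'h', 'e', 'l', 'l', 'o']
```

A sequence-to-sequence model trained over this joint vocabulary never needs a per-language lexicon, which is the point the abstract makes.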
Occlusion Aware Unsupervised Learning of Optical Flow
Title | Occlusion Aware Unsupervised Learning of Optical Flow |
Authors | Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu |
Abstract | It has been recently shown that a convolutional neural network can learn optical flow estimation with unsupervised learning. However, the performance of unsupervised methods still lags considerably behind that of their supervised counterparts. Occlusion and large motion are among the major factors limiting current unsupervised optical flow methods. In this work we introduce a new method that models occlusion explicitly and a new warping scheme that facilitates the learning of large motion. Our method shows promising results on the Flying Chairs, MPI-Sintel and KITTI benchmark datasets. In particular, on the KITTI dataset, where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning. |
Tasks | Optical Flow Estimation |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.05890v2 |
http://arxiv.org/pdf/1711.05890v2.pdf | |
PWC | https://paperswithcode.com/paper/occlusion-aware-unsupervised-learning-of |
Repo | |
Framework | |
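A standard way to reason about occlusion in unsupervised flow, and the intuition behind modeling it explicitly, is the forward-backward consistency check: a pixel whose forward flow is not undone by the backward flow at its target has likely been occluded. A 1D integer-flow toy (my simplification, not the paper's exact formulation):

```python
def occlusion_mask(flow_fw, flow_bw, thresh=0):
    """Mark pixel x as occluded when chasing the forward flow and then the
    backward flow at the target does not return (near) x."""
    n = len(flow_fw)
    mask = []
    for x in range(n):
        target = x + flow_fw[x]
        if 0 <= target < n:
            residual = abs(flow_fw[x] + flow_bw[target])
            mask.append(residual > thresh)
        else:
            mask.append(True)  # flowed out of frame: treat as occluded
    return mask

# Pixels 0 and 1 shift right by 2 and are matched by the backward flow;
# pixels 2 and 3 flow out of frame and are flagged.
flow_fw = [2, 2, 2, 2]
flow_bw = [0, 0, -2, -2]
print(occlusion_mask(flow_fw, flow_bw))  # [False, False, True, True]
```

In an unsupervised photometric loss, such masked pixels would simply be excluded, so the network is not penalized for pixels that have no true correspondence.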
Using Word Embedding for Cross-Language Plagiarism Detection
Title | Using Word Embedding for Cross-Language Plagiarism Detection |
Authors | J. Ferrero, F. Agnes, L. Besacier, D. Schwab |
Abstract | This paper proposes to use distributed representation of words (word embeddings) in cross-language textual similarity detection. The main contributions of this paper are the following: (a) we introduce new cross-language similarity detection methods based on distributed representation of words; (b) we combine the different methods proposed to verify their complementarity and finally obtain an overall F1 score of 89.15% for English-French similarity detection at chunk level (88.5% at sentence level) on a very challenging corpus. |
Tasks | Word Embeddings |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03082v1 |
http://arxiv.org/pdf/1702.03082v1.pdf | |
PWC | https://paperswithcode.com/paper/usingword-embedding-for-cross-language |
Repo | |
Framework | |
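The core idea of embedding-based cross-language similarity detection is that words in an aligned bilingual space land near their translations, so averaged sentence vectors of a text and its translation score high cosine similarity. A toy sketch with made-up two-dimensional "embeddings" (not the authors' vectors or methods):

```python
import math

# Tiny made-up bilingual space: English and French words for the same concept
# get nearly identical vectors, mimicking an aligned cross-language embedding.
emb = {
    "cat":    (1.0, 0.1), "chat": (0.9, 0.1),
    "sleeps": (0.1, 1.0), "dort": (0.1, 0.9),
    "stock":  (-1.0, 0.2),
}

def sentence_vec(words):
    """Average of the word vectors (a simple distributed sentence representation)."""
    dims = len(next(iter(emb.values())))
    v = [0.0] * dims
    for w in words:
        for d, x in enumerate(emb[w]):
            v[d] += x
    return [x / len(words) for x in v]

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))

sim = cosine(sentence_vec(["cat", "sleeps"]), sentence_vec(["chat", "dort"]))
print(round(sim, 3))  # near 1.0: a likely translated (plagiarised) pair
```

The paper's contribution is in building and combining much richer detection methods on top of this principle, at chunk and sentence level.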
Iterative Spectral Clustering for Unsupervised Object Localization
Title | Iterative Spectral Clustering for Unsupervised Object Localization |
Authors | Aditya Vora, Shanmuganathan Raman |
Abstract | This paper addresses the problem of unsupervised object localization in an image. Unlike previous supervised and weakly supervised algorithms that require bounding box or image-level annotations for training classifiers in order to learn features representing the object, we propose a simple yet effective technique for localization using iterative spectral clustering. This iterative spectral clustering approach, along with an appropriate cluster selection strategy in each iteration, naturally helps in searching for the object region in the image. In order to estimate the final localization window, we group the proposals obtained from the iterative spectral clustering step based on perceptual similarity, and average the coordinates of the proposals from the top-scoring groups. We benchmark our algorithm on challenging datasets such as Object Discovery and PASCAL VOC 2007, achieving average CorLoc percentages of 51% and 35% respectively, which is comparable to various weakly supervised algorithms despite being completely unsupervised. |
Tasks | Object Localization, Unsupervised Object Localization |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09719v1 |
http://arxiv.org/pdf/1706.09719v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-spectral-clustering-for |
Repo | |
Framework | |
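The spectral step that the method iterates can be illustrated in isolation: partition a similarity graph by the sign of the Fiedler vector of its Laplacian. A toy bipartition on a 6-node graph (this is the generic spectral-clustering primitive, not the paper's full proposal-selection pipeline):

```python
import numpy as np

def spectral_bipartition(W):
    """Split nodes by the sign of the Fiedler vector (eigenvector of the
    second-smallest eigenvalue of the unnormalized graph Laplacian)."""
    L = np.diag(W.sum(axis=1)) - W      # Laplacian of the similarity graph
    _, vecs = np.linalg.eigh(L)         # eigh: eigenvalues in ascending order
    fiedler = vecs[:, 1]
    return (fiedler > 0).astype(int)

# Two dense blocks (nodes 0-2 and 3-5) joined by one weak edge of weight 0.1.
W = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, .1, 0, 0],
    [0, 0, .1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
labels = spectral_bipartition(W)
print(labels)  # nodes 0-2 in one cluster, nodes 3-5 in the other
```

In the paper, nodes would be region proposals and W their appearance similarity, with a cluster selection rule applied after each such split.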
Deep Fault Analysis and Subset Selection in Solar Power Grids
Title | Deep Fault Analysis and Subset Selection in Solar Power Grids |
Authors | Biswarup Bhattacharya, Abhishek Sinha |
Abstract | Non-availability of reliable and sustainable electric power is a major problem in the developing world. Renewable energy sources like solar are not very lucrative at the current stage due to various uncertainties like weather, storage and land use, among others. There also exist various other issues like mis-commitment of power, absence of intelligent fault analysis, congestion, etc. In this paper, we propose a novel deep learning-based system for predicting faults and selecting power generators optimally so as to reduce costs and ensure higher reliability in solar power systems. The results are highly encouraging and they suggest that the approaches proposed in this paper have the potential to be applied successfully in the developing world. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.02810v1 |
http://arxiv.org/pdf/1711.02810v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-fault-analysis-and-subset-selection-in |
Repo | |
Framework | |
Natasha 2: Faster Non-Convex Optimization Than SGD
Title | Natasha 2: Faster Non-Convex Optimization Than SGD |
Authors | Zeyuan Allen-Zhu |
Abstract | We design a stochastic algorithm to train any smooth neural network to $\varepsilon$-approximate local minima, using $O(\varepsilon^{-3.25})$ backpropagations. The previously best-known result was essentially $O(\varepsilon^{-4})$, by SGD. More broadly, it finds $\varepsilon$-approximate local minima of any smooth nonconvex function at rate $O(\varepsilon^{-3.25})$, with only oracle access to stochastic gradients. |
Tasks | |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08694v4 |
http://arxiv.org/pdf/1708.08694v4.pdf | |
PWC | https://paperswithcode.com/paper/natasha-2-faster-non-convex-optimization-than |
Repo | |
Framework | |
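The baseline that the rate above improves on is plain SGD with a stochastic gradient oracle. As a sketch of the setting (Natasha 2 itself is more elaborate, combining variance reduction with negative-curvature search; this only illustrates the baseline on a toy 1D nonconvex function I chose):

```python
import random

def sgd_local_min(grad, x0, sigma=0.1, steps=20000, lr=0.01, seed=0):
    """Plain SGD on noisy gradients; the noise also kicks the iterate off the
    unstable stationary point at x = 0."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        g = grad(x) + rng.gauss(0.0, sigma)  # stochastic gradient oracle
        x -= lr * g
    return x

# f(x) = (x^2 - 1)^2: smooth and nonconvex, local minima at x = +/-1,
# with a stationary point at x = 0 where the gradient vanishes.
grad = lambda x: 4 * x * (x * x - 1)
x = sgd_local_min(grad, x0=0.0)
print(round(abs(x), 2))  # ≈ 1.0: an approximate local minimum, |f'(x)| small
```

Counting how many such oracle calls are needed as a function of the target accuracy $\varepsilon$ is exactly the quantity the paper improves from $O(\varepsilon^{-4})$ to $O(\varepsilon^{-3.25})$.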
Consistent Nonparametric Different-Feature Selection via the Sparsest $k$-Subgraph Problem
Title | Consistent Nonparametric Different-Feature Selection via the Sparsest $k$-Subgraph Problem |
Authors | Satoshi Hara, Takayuki Katsuki, Hiroki Yanagisawa, Masaaki Imaizumi, Takafumi Ono, Ryo Okamoto, Shigeki Takeuchi |
Abstract | Two-sample feature selection is the problem of finding features that describe a difference between two probability distributions, which is a ubiquitous problem in both scientific and engineering studies. However, existing methods have limited applicability because of their restrictive assumptions on data distributions or computational difficulty. In this paper, we resolve these difficulties by formulating the problem as a sparsest $k$-subgraph problem. The proposed method is nonparametric and does not assume any specific parametric models on the data distributions. We show that the proposed method is computationally efficient and does not require any extra computation for model selection. Moreover, we prove that the proposed method provides a consistent estimator of features under mild conditions. Our experimental results show that the proposed method outperforms the current method with regard to both accuracy and computation time. |
Tasks | Feature Selection, Model Selection |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1707.09688v2 |
http://arxiv.org/pdf/1707.09688v2.pdf | |
PWC | https://paperswithcode.com/paper/consistent-nonparametric-different-feature |
Repo | |
Framework | |
Parametric t-Distributed Stochastic Exemplar-centered Embedding
Title | Parametric t-Distributed Stochastic Exemplar-centered Embedding |
Authors | Martin Renqiang Min, Hongyu Guo, Dinghan Shen |
Abstract | Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation. However, the performance of pt-SNE is highly sensitive to the hyper-parameter batch size due to conflicting optimization goals, and it often produces dramatically different embeddings with different choices of user-defined perplexities. To effectively solve these issues, we present parametric t-distributed stochastic exemplar-centered embedding methods. Our strategy learns embedding parameters by comparing given data only with precomputed exemplars, resulting in a cost function with linear computational and memory complexity, which is further reduced by noise contrastive samples. Moreover, we propose a shallow embedding network with high-order feature interactions for data visualization, which is much easier to tune yet produces performance comparable to the deep neural network employed by pt-SNE. We empirically demonstrate, using several benchmark datasets, that our proposed methods significantly outperform pt-SNE in terms of robustness, visual effects, and quantitative evaluations. |
Tasks | |
Published | 2017-10-14 |
URL | http://arxiv.org/abs/1710.05128v5 |
http://arxiv.org/pdf/1710.05128v5.pdf | |
PWC | https://paperswithcode.com/paper/parametric-t-distributed-stochastic-exemplar |
Repo | |
Framework | |
Online Photometric Calibration for Auto Exposure Video for Realtime Visual Odometry and SLAM
Title | Online Photometric Calibration for Auto Exposure Video for Realtime Visual Odometry and SLAM |
Authors | Paul Bergmann, Rui Wang, Daniel Cremers |
Abstract | Recent direct visual odometry and SLAM algorithms have demonstrated impressive levels of precision. However, they require a photometric camera calibration in order to achieve competitive results. Hence, the respective algorithm cannot be directly applied to an off-the-shelf camera or to a video sequence acquired with an unknown camera. In this work we propose a method for online photometric calibration which enables processing auto exposure videos with visual odometry precisions that are on par with those of photometrically calibrated videos. Our algorithm recovers the exposure times of consecutive frames, the camera response function, and the attenuation factors of the sensor irradiance due to vignetting. Gain-robust KLT feature tracks are used to obtain scene point correspondences as input to a nonlinear optimization framework. We show that our approach can reliably calibrate arbitrary video sequences by evaluating it on datasets for which full photometric ground truth is available. We further show that our calibration can improve the performance of a state-of-the-art direct visual odometry method that works solely on pixel intensities, calibrating for photometric parameters in an online fashion in realtime. |
Tasks | Calibration, Visual Odometry |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02081v1 |
http://arxiv.org/pdf/1710.02081v1.pdf | |
PWC | https://paperswithcode.com/paper/online-photometric-calibration-for-auto |
Repo | |
Framework | |
ContextVP: Fully Context-Aware Video Prediction
Title | ContextVP: Fully Context-Aware Video Prediction |
Authors | Wonmin Byeon, Qin Wang, Rupesh Kumar Srivastava, Petros Koumoutsakos |
Abstract | Video prediction models based on convolutional networks, recurrent networks, and their combinations often result in blurry predictions. We identify an important contributing factor for imprecise predictions that has not been studied adequately in the literature: blind spots, i.e., lack of access to all relevant past information for accurately predicting the future. To address this issue, we introduce a fully context-aware architecture that captures the entire available past context for each pixel using Parallel Multi-Dimensional LSTM units and aggregates it using blending units. Our model outperforms a strong baseline network of 20 recurrent convolutional layers and yields state-of-the-art performance for next step prediction on three challenging real-world video datasets: Human 3.6M, Caltech Pedestrian, and UCF-101. Moreover, it does so with fewer parameters than several recently proposed models, and does not rely on deep convolutional networks, multi-scale architectures, separation of background and foreground modeling, motion flow learning, or adversarial training. These results highlight that full awareness of past context is of crucial importance for video prediction. |
Tasks | Video Prediction |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08518v3 |
http://arxiv.org/pdf/1710.08518v3.pdf | |
PWC | https://paperswithcode.com/paper/contextvp-fully-context-aware-video |
Repo | |
Framework | |
CHAOS: A Parallelization Scheme for Training Convolutional Neural Networks on Intel Xeon Phi
Title | CHAOS: A Parallelization Scheme for Training Convolutional Neural Networks on Intel Xeon Phi |
Authors | Andre Viebke, Suejb Memeti, Sabri Pllana, Ajith Abraham |
Abstract | Deep learning is an important component of big-data analytic tools and intelligent applications, such as self-driving cars, computer vision, speech recognition, or precision medicine. However, the training process is computationally intensive, and often requires a large amount of time if performed sequentially. Modern parallel computing systems provide the capability to reduce the required training time of deep neural networks. In this paper, we present our parallelization scheme for training convolutional neural networks (CNN) named Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS). Major features of CHAOS include the support for thread and vector parallelism, non-instant updates of weight parameters during back-propagation without a significant delay, and implicit synchronization in arbitrary order. CHAOS is tailored for parallel computing systems that are accelerated with the Intel Xeon Phi. We evaluate our parallelization approach empirically using measurement techniques and performance modeling for various numbers of threads and CNN architectures. Experimental results for the MNIST dataset of handwritten digits using the total number of threads on the Xeon Phi show speedups of up to 103x compared to the execution on one thread of the Xeon Phi, 14x compared to the sequential execution on Intel Xeon E5, and 58x compared to the sequential execution on Intel Core i5. |
Tasks | Self-Driving Cars, Speech Recognition |
Published | 2017-02-25 |
URL | http://arxiv.org/abs/1702.07908v1 |
http://arxiv.org/pdf/1702.07908v1.pdf | |
PWC | https://paperswithcode.com/paper/chaos-a-parallelization-scheme-for-training |
Repo | |
Framework | |
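The "Hogwild" ingredient of CHAOS is lock-free asynchronous SGD: threads update shared weights without synchronization and still converge. CHAOS itself targets CNNs on Xeon Phi with vector parallelism; the following is only a minimal illustration of the lock-free idea on least-squares regression (data and hyperparameters are my own):

```python
import random
import threading

def hogwild_train(X, y, w, lr=0.05, steps=200, n_threads=4):
    """Hogwild-style SGD: threads update the shared weight list without locks,
    so updates interleave in arbitrary order."""
    def worker(seed):
        rng = random.Random(seed)
        for _ in range(steps):
            i = rng.randrange(len(X))
            pred = sum(wj * xj for wj, xj in zip(w, X[i]))
            err = pred - y[i]
            for j, xj in enumerate(X[i]):
                w[j] -= lr * err * xj  # unsynchronized write to shared state
    threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

# Data generated from w* = [2, -1]; since the data is exactly consistent,
# w* is a fixed point of every update and the threads jointly recover it.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -1.0, 1.0, 3.0]
w = hogwild_train(X, y, [0.0, 0.0])
print([round(v, 1) for v in w])  # close to [2.0, -1.0]
```

Lost or stale updates from the unsynchronized writes only slow convergence slightly; they do not change the fixed point, which is the intuition behind "controlled" Hogwild's tolerance of arbitrary synchronization order.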
Testing and Learning on Distributions with Symmetric Noise Invariance
Title | Testing and Learning on Distributions with Symmetric Noise Invariance |
Authors | Ho Chung Leon Law, Christopher Yau, Dino Sejdinovic |
Abstract | Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting distance between distributions, are useful tools for fully nonparametric two-sample testing and learning on distributions. However, it is rare that all possible differences between samples are of interest: discovered differences can be due to different types of measurement noise, data collection artefacts or other irrelevant sources of variability. We propose distances between distributions which encode invariance to additive symmetric noise, aimed at testing whether the assumed true underlying processes differ. Moreover, we construct invariant features of distributions, leading to learning algorithms robust to the impairment of the input distributions with symmetric additive noise. |
Tasks | |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07596v2 |
http://arxiv.org/pdf/1703.07596v2.pdf | |
PWC | https://paperswithcode.com/paper/testing-and-learning-on-distributions-with |
Repo | |
Framework | |
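The MMD that the paper starts from has a simple empirical form: average within-sample kernel values minus cross-sample ones. A sketch of the plain (noise-sensitive) MMD with an RBF kernel, using made-up Gaussian samples; the paper's contribution is then to replace this with noise-invariant variants:

```python
import numpy as np

def mmd2_rbf(X, Y, bandwidth=1.0):
    """Biased empirical MMD^2 with RBF kernel k(a,b) = exp(-||a-b||^2 / (2*bw^2))."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 1))
Y_same = rng.normal(0.0, 1.0, size=(200, 1))   # same distribution as X
Y_shift = rng.normal(2.0, 1.0, size=(200, 1))  # mean-shifted distribution
print(round(mmd2_rbf(X, Y_same), 3), round(mmd2_rbf(X, Y_shift), 3))
```

Because this statistic reacts to *any* distributional difference, including additive measurement noise, the paper constructs alternative distances that deliberately ignore symmetric noise components.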
Beliefs and Probability in Bacchus’ l.p. Logic: A 3-Valued Logic Solution to Apparent Counter-intuition
Title | Beliefs and Probability in Bacchus’ l.p. Logic: A 3-Valued Logic Solution to Apparent Counter-intuition |
Authors | Mieczysław A. Kłopotek |
Abstract | The fundamental discrepancy between first-order logic and statistical inference (global versus local properties of the universe) is shown to be the obstacle to integrating logic and probability in the L.p. logic of Bacchus. To overcome the counterintuitive behaviour of L.p., a 3-valued logic is proposed. |
Tasks | |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03342v1 |
http://arxiv.org/pdf/1704.03342v1.pdf | |
PWC | https://paperswithcode.com/paper/beliefs-and-probability-in-bacchus-lp-logic |
Repo | |
Framework | |
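The paper's specific 3-valued system is not reproduced here, but Kleene's strong 3-valued logic is the standard example of how a third truth value changes classical behaviour, and can be encoded in a few lines (the numeric encoding is my own convention):

```python
# Truth values: 1.0 = true, 0.0 = false, 0.5 = undetermined (the third value).
def not3(a):
    return 1.0 - a

def and3(a, b):
    return min(a, b)

def or3(a, b):
    return max(a, b)

def implies3(a, b):
    return or3(not3(a), b)  # material implication, Kleene-style

U = 0.5
# With a third value, the classical tautology 'p or not p' can fail:
print(or3(U, not3(U)))  # → 0.5, not 1.0
```

It is exactly this kind of departure from two-valued tautologies that lets a 3-valued semantics absorb statements that look counterintuitive when forced into classical truth values.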
Dropout as a Low-Rank Regularizer for Matrix Factorization
Title | Dropout as a Low-Rank Regularizer for Matrix Factorization |
Authors | Jacopo Cavazza, Pietro Morerio, Benjamin Haeffele, Connor Lane, Vittorio Murino, Rene Vidal |
Abstract | Regularization for matrix factorization (MF) and approximation problems has been carried out in many different ways. Due to its popularity in deep learning, dropout has been applied also for this class of problems. Despite its solid empirical performance, the theoretical properties of dropout as a regularizer remain quite elusive for this class of problems. In this paper, we present a theoretical analysis of dropout for MF, where Bernoulli random variables are used to drop columns of the factors. We demonstrate the equivalence between dropout and a fully deterministic model for MF in which the factors are regularized by the sum of the product of squared Euclidean norms of the columns. Additionally, we inspect the case of a variable sized factorization and we prove that dropout achieves the global minimum of a convex approximation problem with (squared) nuclear norm regularization. As a result, we conclude that dropout can be used as a low-rank regularizer with data dependent singular-value thresholding. |
Tasks | |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05092v1 |
http://arxiv.org/pdf/1710.05092v1.pdf | |
PWC | https://paperswithcode.com/paper/dropout-as-a-low-rank-regularizer-for-matrix |
Repo | |
Framework | |
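The equivalence stated in the abstract, between column dropout and a deterministic objective regularized by the sum of products of squared column norms, can be checked numerically. Below, b_i ~ Bernoulli(theta) drops columns of the factors and the reconstruction is rescaled by 1/theta so its expectation is UVᵀ (the notation theta for the keep probability is mine):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, theta = 4, 3, 5, 0.7
U = rng.normal(size=(m, r))
V = rng.normal(size=(n, r))
X = rng.normal(size=(m, n))

# Deterministic objective:
# ||X - U V^T||_F^2 + (1-theta)/theta * sum_i ||u_i||^2 ||v_i||^2
det = np.sum((X - U @ V.T) ** 2) + (1 - theta) / theta * np.sum(
    (U ** 2).sum(axis=0) * (V ** 2).sum(axis=0))

# Monte Carlo estimate of E || X - (1/theta) U diag(b) V^T ||_F^2
trials = 20000
acc = 0.0
for _ in range(trials):
    b = rng.random(r) < theta              # Bernoulli(theta) column mask
    acc += np.sum((X - (U * b) @ V.T / theta) ** 2)
mc = acc / trials
print(round(det, 2), round(mc, 2))  # the two objectives agree up to MC error
```

The agreement follows from the bias-variance decomposition: independent column masks contribute variance (1-theta)/theta times each rank-one term's squared Frobenius norm, which is precisely ||u_i||² ||v_i||².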