Paper Group ANR 571
A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games
Title | A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games |
Authors | Felix Leibfried, Nate Kushman, Katja Hofmann |
Abstract | Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts of data. Model-based techniques are more data-efficient, but need to acquire explicit knowledge about the environment. In this paper, we take a step towards using model-based techniques in environments with a high-dimensional visual state space by demonstrating that it is possible to learn system dynamics and the reward structure jointly. Our contribution is to extend a recently developed deep neural network for video frame prediction in Atari games to enable reward prediction as well. To this end, we phrase a joint optimization problem for minimizing both video frame and reward reconstruction loss, and adapt network parameters accordingly. Empirical evaluations on five Atari games demonstrate accurate cumulative reward prediction of up to 200 frames. We consider these results as opening up important directions for model-based reinforcement learning in complex, initially unknown environments. |
Tasks | Atari Games |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.07078v2 |
http://arxiv.org/pdf/1611.07078v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-for-joint-video |
Repo | |
Framework | |
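The joint objective described in the abstract above (frame reconstruction loss plus reward reconstruction loss) can be illustrated with a minimal sketch. The specific loss choices (mean squared error for frames, cross-entropy over discrete reward classes), the `weight` parameter, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def frame_loss(pred_frame, true_frame):
    """Mean squared error between two flattened frames (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred_frame, true_frame)) / len(true_frame)


def reward_loss(reward_probs, reward_class):
    """Cross-entropy of the predicted reward distribution (assumed loss)."""
    return -math.log(reward_probs[reward_class])


def joint_loss(pred_frame, true_frame, reward_probs, reward_class, weight=1.0):
    """Single scalar objective covering both prediction targets.

    `weight` is a hypothetical trade-off coefficient between the two terms.
    """
    return frame_loss(pred_frame, true_frame) + weight * reward_loss(
        reward_probs, reward_class
    )
```

Minimizing one scalar of this kind lets a shared network receive gradients from both the frame and the reward targets, which is the general idea of a joint optimization problem.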
Combining policy gradient and Q-learning
Title | Combining policy gradient and Q-learning |
Authors | Brendan O’Donoghue, Remi Munos, Koray Kavukcuoglu, Volodymyr Mnih |
Abstract | Policy gradient is an efficient technique for improving a policy in a reinforcement learning setting. However, vanilla online variants are on-policy only and not able to take advantage of off-policy data. In this paper we describe a new technique that combines policy gradient with off-policy Q-learning, drawing experience from a replay buffer. This is motivated by making a connection between the fixed points of the regularized policy gradient algorithm and the Q-values. This connection allows us to estimate the Q-values from the action preferences of the policy, to which we apply Q-learning updates. We refer to the new technique as ‘PGQL’, for policy gradient and Q-learning. We also establish an equivalency between action-value fitting techniques and actor-critic algorithms, showing that regularized policy gradient techniques can be interpreted as advantage function learning algorithms. We conclude with some numerical examples that demonstrate improved data efficiency and stability of PGQL. In particular, we tested PGQL on the full suite of Atari games and achieved performance exceeding that of both asynchronous advantage actor-critic (A3C) and Q-learning. |
Tasks | Atari Games, Q-Learning |
Published | 2016-11-05 |
URL | http://arxiv.org/abs/1611.01626v3 |
http://arxiv.org/pdf/1611.01626v3.pdf | |
PWC | https://paperswithcode.com/paper/combining-policy-gradient-and-q-learning |
Repo | |
Framework | |
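The idea of estimating Q-values from the policy's action preferences, mentioned in the abstract above, can be sketched as follows. The abstract does not give the formula; the form used here, Q(s, a) = alpha * (log pi(a|s) + H(pi(.|s))) + V(s), is an assumption based on an entropy-regularized fixed point, and `alpha`, `value`, and the function names are hypothetical.

```python
import math


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def q_from_policy(prefs, value, alpha):
    """Assumed estimate: Q(a) = alpha * (log pi(a) + entropy(pi)) + V."""
    pi = softmax(prefs)
    entropy = -sum(p * math.log(p) for p in pi)
    return [alpha * (math.log(p) + entropy) + value for p in pi]
```

Under this assumed form, the policy-weighted average of the estimated Q-values equals the state value, and actions with higher preferences receive higher Q-values. A Q-learning update could then be applied to these estimates using off-policy samples from a replay buffer.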
Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification
Title | Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification |
Authors | Ali Diba, Ali Mohammad Pazandeh, Luc Van Gool |
Abstract | Video and action classification have advanced considerably with deep neural networks, especially two-stream CNNs that take RGB and optical flow as inputs and deliver outstanding performance in video analysis. One shortcoming of these methods is that motion information is extracted outside the CNNs, which is relatively time-consuming even on GPUs. End-to-end methods that learn motion representations, such as 3D CNNs, can therefore be both faster and accurate. We present novel deep CNNs that use a 3D architecture to model actions and motion representations efficiently, with the aim of being accurate and as fast as real time. Our new networks learn distinctive models that combine deep motion features with the appearance model by learning optical flow features inside the network. |
Tasks | Action Classification, Optical Flow Estimation, Video Classification |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08851v2 |
http://arxiv.org/pdf/1608.08851v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-two-stream-motion-and-appearance-3d |
Repo | |
Framework | |
Deep Motion Features for Visual Tracking
Title | Deep Motion Features for Visual Tracking |
Authors | Susanna Gladh, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg |
Abstract | Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone. |
Tasks | Optical Flow Estimation, Temporal Action Localization, Video Classification, Visual Tracking |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06615v1 |
http://arxiv.org/pdf/1612.06615v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-motion-features-for-visual-tracking |
Repo | |
Framework | |
Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition
Title | Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition |
Authors | César Roberto de Souza, Adrien Gaidon, Eleonora Vig, Antonio Manuel López |
Abstract | Action recognition in videos is a challenging task due to the complexity of the spatio-temporal patterns to model and the difficulty of acquiring and learning from large quantities of video data. Deep learning, although a breakthrough for image classification and showing promise for videos, has still not clearly superseded action recognition methods using hand-crafted features, even when training on massive datasets. In this paper, we introduce hybrid video classification architectures based on carefully designed unsupervised representations of hand-crafted spatio-temporal features classified by supervised deep networks. As we show in our experiments on five popular benchmarks for action recognition, our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10000 short clips) and yet improves significantly on the state of the art, including recent deep models trained on millions of manually labelled images and videos. |
Tasks | Action Recognition In Videos, Image Classification, Temporal Action Localization, Video Classification |
Published | 2016-08-25 |
URL | http://arxiv.org/abs/1608.07138v1 |
http://arxiv.org/pdf/1608.07138v1.pdf | |
PWC | https://paperswithcode.com/paper/sympathy-for-the-details-dense-trajectories |
Repo | |
Framework | |
A Novel Memetic Feature Selection Algorithm
Title | A Novel Memetic Feature Selection Algorithm |
Authors | Mohadeseh Montazeri, Hamid Reza Naji, Mitra Montazeri, Ahmad Faraahi |
Abstract | Feature selection is the problem of finding efficient features among all features such that the final feature set improves accuracy and reduces complexity. In feature selection algorithms, search strategies are key aspects. Since feature selection is an NP-hard problem, heuristic algorithms have been studied to solve it. In this paper, we propose a method based on a memetic algorithm to find an efficient feature subset for a classification problem. It incorporates a filter method in the genetic algorithm to improve classification performance and accelerate the search in identifying core feature subsets. In particular, the method adds or deletes a feature from a candidate feature subset based on the multivariate feature information. An empirical study on commonly used data sets from the University of California, Irvine shows that the proposed method outperforms existing methods. |
Tasks | Feature Selection |
Published | 2016-01-26 |
URL | http://arxiv.org/abs/1601.06933v1 |
http://arxiv.org/pdf/1601.06933v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-memetic-feature-selection-algorithm |
Repo | |
Framework | |
Automatic Content-aware Non-Photorealistic Rendering of Images
Title | Automatic Content-aware Non-Photorealistic Rendering of Images |
Authors | Akshay Gadi Patil, Shanmuganathan Raman |
Abstract | Non-photorealistic rendering techniques work on image features and often manipulate a set of characteristics such as edges and texture to achieve a desired depiction of the scene. Most computational photography methods decompose an image using edge preserving filters and work on the resulting base and detail layers independently to achieve desired visual effects. We propose a new approach for content-aware non-photorealistic rendering of images where we manipulate the visually salient and the non-salient regions separately. We propose a novel content-aware framework in order to render an image for applications such as detail exaggeration, artificial blurring and image abstraction. The processed regions of the image are blended seamlessly for all these applications. We demonstrate that content awareness of the proposed method leads to automatic generation of non-photorealistic rendering of the same image for the different applications mentioned above. |
Tasks | |
Published | 2016-04-07 |
URL | http://arxiv.org/abs/1604.01962v4 |
http://arxiv.org/pdf/1604.01962v4.pdf | |
PWC | https://paperswithcode.com/paper/automatic-content-aware-non-photorealistic |
Repo | |
Framework | |
Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition
Title | Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition |
Authors | Mingrui Liu, Tianbao Yang |
Abstract | Recent studies have shown that the proximal gradient (PG) method and the accelerated proximal gradient (APG) method with restarting can enjoy linear convergence under a weaker condition than strong convexity, namely a quadratic growth condition (QGC). However, the faster convergence of the restarting APG method relies on the potentially unknown constant in the QGC to restart APG appropriately, which restricts its applicability. We address this issue by developing novel adaptive gradient converging methods, i.e., leveraging the magnitude of the proximal gradient as a criterion for restart and termination. Our analysis extends to a much more general condition beyond the QGC, namely the Hölderian error bound (HEB) condition. The key technique for our development is a novel synthesis of adaptive regularization and a conditional restarting scheme, which extends previous work focusing on strongly convex problems to a much broader family of problems. Furthermore, we demonstrate that our results have important implications and applications in machine learning: (i) if the objective function is coercive and semi-algebraic, PG’s convergence speed is essentially $o(\frac{1}{t})$, where $t$ is the total number of iterations; (ii) if the objective function consists of an $\ell_1$, $\ell_\infty$, $\ell_{1,\infty}$, or Huber norm regularization and a convex smooth piecewise quadratic loss (e.g., square loss, squared hinge loss and Huber loss), the proposed algorithm is parameter-free and enjoys a faster linear convergence than PG without any other assumptions (e.g., restricted eigenvalue condition). It is notable that our linear convergence results for the aforementioned problems are global instead of local. To the best of our knowledge, these improved results are the first shown in this work. |
Tasks | |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07609v2 |
http://arxiv.org/pdf/1611.07609v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-accelerated-gradient-converging-1 |
Repo | |
Framework | |
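The abstract above uses the magnitude of the proximal gradient as a criterion for restart and termination. The sketch below shows only the termination half, on plain (non-accelerated) proximal gradient; it is not the authors' adaptive method. The toy objective, 0.5 * sum_i a_i (x_i - b_i)^2 + lam * ||x||_1 with positive a_i, and all names are assumptions chosen so the minimizer is known in closed form.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v| (scalar soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(a, b, lam, tol=1e-10, max_iter=10000):
    """Plain proximal gradient on the toy separable lasso objective.

    Stops when the proximal gradient norm ||(x - x_new) / eta|| <= tol.
    """
    eta = 1.0 / max(a)  # step size 1/L, with L the largest curvature
    x = [0.0] * len(b)
    for _ in range(max_iter):
        grad = [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]
        x_new = [
            soft_threshold(xi - eta * gi, eta * lam) for xi, gi in zip(x, grad)
        ]
        # Proximal gradient mapping: G(x) = (x - x_new) / eta
        prox_grad_norm = sum(((xi - xn) / eta) ** 2 for xi, xn in zip(x, x_new)) ** 0.5
        x = x_new
        if prox_grad_norm <= tol:
            break
    return x
```

The proximal gradient norm is zero exactly at a minimizer of the composite objective, which is why it can act as a stopping test without knowing problem constants. For this toy problem the minimizer is `soft_threshold(b_i, lam / a_i)` in each coordinate.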
Semantic Similarity Strategies for Job Title Classification
Title | Semantic Similarity Strategies for Job Title Classification |
Authors | Yun Zhu, Faizan Javed, Ozgur Ozturk |
Abstract | Automatic and accurate classification of items enables numerous downstream applications in many domains. These applications can range from faceted browsing of items to product recommendations and big data analytics. In the online recruitment domain, we refer to classifying job ads to pre-defined or custom occupation categories as job title classification. A large-scale job title classification system can power various downstream applications such as semantic search, job recommendations and labor market analytics. In this paper, we discuss experiments conducted to improve our in-house job title classification system. The classification component of the system is composed of a two-stage coarse and fine level classifier cascade that classifies input text such as job titles and/or job ads to one of the thousands of job titles in our taxonomy. To improve classification accuracy and effectiveness, we experiment with various semantic representation strategies such as averaged W2V vectors and document similarity measures such as Word Mover's Distance (WMD). Our initial results show an overall improvement in accuracy of Carotene[1]. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.06268v1 |
http://arxiv.org/pdf/1609.06268v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-similarity-strategies-for-job-title |
Repo | |
Framework | |
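The averaged word-vector representation mentioned in the abstract above can be illustrated with a small sketch. The two-dimensional embeddings below are invented toy values, not real W2V vectors, and the function names are hypothetical; the sketch only shows averaging token vectors and comparing titles by cosine similarity.

```python
import math

# Toy embeddings invented for illustration only (not trained vectors).
TOY_EMBEDDINGS = {
    "software": [1.0, 0.0],
    "engineer": [0.8, 0.2],
    "developer": [0.9, 0.1],
    "registered": [0.1, 0.9],
    "nurse": [0.0, 1.0],
}


def average_vector(text, embeddings):
    """Average the vectors of known tokens; None if no token is known."""
    vecs = [embeddings[t] for t in text.lower().split() if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

With these toy values, "software engineer" is closer to "software developer" than to "registered nurse", which is the kind of signal a title classifier could use when exact tokens differ.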
A Unified Approach for Learning the Parameters of Sum-Product Networks
Title | A Unified Approach for Learning the Parameters of Sum-Product Networks |
Authors | Han Zhao, Pascal Poupart, Geoff Gordon |
Abstract | We present a unified approach for learning the parameters of Sum-Product networks (SPNs). We prove that any complete and decomposable SPN is equivalent to a mixture of trees where each tree corresponds to a product of univariate distributions. Based on the mixture model perspective, we characterize the objective function when learning SPNs based on the maximum likelihood estimation (MLE) principle and show that the optimization problem can be formulated as a signomial program. We construct two parameter learning algorithms for SPNs by using sequential monomial approximations (SMA) and the concave-convex procedure (CCCP), respectively. The two proposed methods naturally admit multiplicative updates, hence effectively avoiding the projection operation. With the help of the unified framework, we also show that, in the case of SPNs, CCCP leads to the same algorithm as Expectation Maximization (EM) despite the fact that they are different in general. |
Tasks | |
Published | 2016-01-03 |
URL | http://arxiv.org/abs/1601.00318v4 |
http://arxiv.org/pdf/1601.00318v4.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-approach-for-learning-the |
Repo | |
Framework | |
On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations
Title | On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations |
Authors | Xueyu Mao, Purnamrita Sarkar, Deepayan Chakrabarti |
Abstract | The problem of finding overlapping communities in networks has gained much attention recently. Optimization-based approaches use non-negative matrix factorization (NMF) or variants, but the global optimum cannot be provably attained in general. Model-based approaches, such as the popular mixed-membership stochastic blockmodel or MMSB (Airoldi et al., 2008), use parameters for each node to specify the overlapping communities, but standard inference techniques cannot guarantee consistency. We link the two approaches, by (a) establishing sufficient conditions for the symmetric NMF optimization to have a unique solution under MMSB, and (b) proposing a computationally efficient algorithm called GeoNMF that is provably optimal and hence consistent for a broad parameter regime. We demonstrate its accuracy on both simulated and real-world datasets. |
Tasks | |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00084v2 |
http://arxiv.org/pdf/1607.00084v2.pdf | |
PWC | https://paperswithcode.com/paper/on-mixed-memberships-and-symmetric |
Repo | |
Framework | |
Self-Sustaining Iterated Learning
Title | Self-Sustaining Iterated Learning |
Authors | Bernard Chazelle, Chu Wang |
Abstract | An important result from psycholinguistics (Griffiths & Kalish, 2005) states that no language can be learned iteratively by rational agents in a self-sustaining manner. We show how to modify the learning process slightly in order to achieve self-sustainability. Our work is in two parts. First, we characterize iterated learnability in geometric terms and show how a slight, steady increase in the lengths of the training sessions ensures self-sustainability for any discrete language class. In the second part, we tackle the nondiscrete case and investigate self-sustainability for iterated linear regression. We discuss the implications of our findings to issues of non-equilibrium dynamics in natural algorithms. |
Tasks | |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03960v1 |
http://arxiv.org/pdf/1609.03960v1.pdf | |
PWC | https://paperswithcode.com/paper/self-sustaining-iterated-learning |
Repo | |
Framework | |
Logarithmic Time One-Against-Some
Title | Logarithmic Time One-Against-Some |
Authors | Hal Daume III, Nikos Karampatziakis, John Langford, Paul Mineiro |
Abstract | We create a new online reduction of multiclass classification to binary classification for which training and prediction time scale logarithmically with the number of classes. Compared to previous approaches, we obtain substantially better statistical performance for two reasons: First, we prove a tighter and more complete boosting theorem, and second we translate the results more directly into an algorithm. We show that several simple techniques give rise to an algorithm that can compete with one-against-all in both space and predictive power while offering exponential improvements in speed when the number of classes is large. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04988v2 |
http://arxiv.org/pdf/1606.04988v2.pdf | |
PWC | https://paperswithcode.com/paper/logarithmic-time-one-against-some |
Repo | |
Framework | |
PCA/LDA Approach for Text-Independent Speaker Recognition
Title | PCA/LDA Approach for Text-Independent Speaker Recognition |
Authors | Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith |
Abstract | Various algorithms for text-independent speaker recognition have been developed through the decades, aiming to improve both accuracy and efficiency. This paper presents a novel PCA/LDA-based approach that is faster than traditional statistical model-based methods and achieves competitive results. First, the performance based on only PCA and only LDA is measured; then a mixed model, taking advantage of both methods, is introduced. A subset of the TIMIT corpus composed of 200 male speakers is used for enrollment, validation and testing. The best results achieve 100%, 96% and 95% classification rates at population levels of 50, 100 and 200, using 39-dimensional MFCC features with delta and double delta. These results are based on 12-second text-independent speech for training and 4-second data for testing. They are comparable to the conventional MFCC-GMM methods, but require significantly less time to train and operate. |
Tasks | Speaker Recognition, Text-Independent Speaker Recognition |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.08045v1 |
http://arxiv.org/pdf/1602.08045v1.pdf | |
PWC | https://paperswithcode.com/paper/pcalda-approach-for-text-independent-speaker |
Repo | |
Framework | |
Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions
Title | Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions |
Authors | Shaofei Wang, Charless C. Fowlkes |
Abstract | We describe an end-to-end framework for learning the parameters of a min-cost flow multi-target tracking problem with quadratic trajectory interactions, including suppression of overlapping tracks and contextual cues about co-occurrence of different objects. Our approach utilizes structured prediction with a tracking-specific loss function to learn the complete set of model parameters. In this learning framework, we evaluate two different approaches to finding an optimal set of tracks under a quadratic model objective, one based on an LP relaxation and the other based on novel greedy variants of dynamic programming that handle pairwise interactions. We find the greedy algorithms achieve almost equivalent accuracy to the LP relaxation while being up to 10x faster than a commercial LP solver. We evaluate trained models on three challenging benchmarks. Surprisingly, we find that with proper parameter learning, our simple data association model without explicit appearance/motion reasoning is able to achieve comparable or better accuracy than many state-of-the-art methods that use far more complex motion features or appearance affinity metric learning. |
Tasks | Metric Learning, Structured Prediction |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01394v1 |
http://arxiv.org/pdf/1610.01394v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-optimal-parameters-for-multi-target |
Repo | |
Framework | |