May 5, 2019

Paper Group ANR 571

A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games


Title	A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games
Authors	Felix Leibfried, Nate Kushman, Katja Hofmann
Abstract	Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts of data. Model-based techniques are more data-efficient, but need to acquire explicit knowledge about the environment. In this paper, we take a step towards using model-based techniques in environments with a high-dimensional visual state space by demonstrating that it is possible to learn system dynamics and the reward structure jointly. Our contribution is to extend a recently developed deep neural network for video frame prediction in Atari games to enable reward prediction as well. To this end, we phrase a joint optimization problem for minimizing both video frame and reward reconstruction loss, and adapt network parameters accordingly. Empirical evaluations on five Atari games demonstrate accurate cumulative reward prediction of up to 200 frames. We consider these results as opening up important directions for model-based reinforcement learning in complex, initially unknown environments.
Tasks	Atari Games
Published	2016-11-21
URL	http://arxiv.org/abs/1611.07078v2
PDF	http://arxiv.org/pdf/1611.07078v2.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-approach-for-joint-video
Repo
Framework
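
The joint optimization described in the abstract above can be pictured as a single objective that adds a frame reconstruction term and a reward prediction term. The following is only a minimal sketch: the abstract does not state the loss forms, so the squared-error frame loss, the cross-entropy reward loss over discrete reward classes, and the `weight` parameter are all assumptions made for illustration.

```python
import math

def frame_loss(pred, target):
    """Mean squared error between predicted and true pixel intensities (assumed form)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def reward_loss(probs, reward_class):
    """Cross-entropy of the true reward class under predicted probabilities (assumed form)."""
    return -math.log(probs[reward_class])

def joint_loss(pred_frame, true_frame, reward_probs, reward_class, weight=1.0):
    """Weighted sum of both losses; `weight` is a hypothetical trade-off parameter."""
    return frame_loss(pred_frame, true_frame) + weight * reward_loss(reward_probs, reward_class)

# Toy example: a two-pixel "frame" and three possible reward classes.
loss = joint_loss([0.2, 0.8], [0.0, 1.0], [0.1, 0.7, 0.2], reward_class=1)
print(loss)
```

Minimizing one combined scalar lets a shared network be trained on both signals at once, which is the general idea the abstract refers to.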

Combining policy gradient and Q-learning


Title	Combining policy gradient and Q-learning
Authors	Brendan O’Donoghue, Remi Munos, Koray Kavukcuoglu, Volodymyr Mnih
Abstract	Policy gradient is an efficient technique for improving a policy in a reinforcement learning setting. However, vanilla online variants are on-policy only and not able to take advantage of off-policy data. In this paper we describe a new technique that combines policy gradient with off-policy Q-learning, drawing experience from a replay buffer. This is motivated by making a connection between the fixed points of the regularized policy gradient algorithm and the Q-values. This connection allows us to estimate the Q-values from the action preferences of the policy, to which we apply Q-learning updates. We refer to the new technique as ‘PGQL’, for policy gradient and Q-learning. We also establish an equivalency between action-value fitting techniques and actor-critic algorithms, showing that regularized policy gradient techniques can be interpreted as advantage function learning algorithms. We conclude with some numerical examples that demonstrate improved data efficiency and stability of PGQL. In particular, we tested PGQL on the full suite of Atari games and achieved performance exceeding that of both asynchronous advantage actor-critic (A3C) and Q-learning.
Tasks	Atari Games, Q-Learning
Published	2016-11-05
URL	http://arxiv.org/abs/1611.01626v3
PDF	http://arxiv.org/pdf/1611.01626v3.pdf
PWC	https://paperswithcode.com/paper/combining-policy-gradient-and-q-learning
Repo
Framework
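
The step of estimating Q-values from the policy's action preferences can be illustrated with a small numerical check. This sketch assumes an entropy-regularized setting in which the fixed-point policy is a softmax of Q/alpha, so that Q(s, a) = alpha * (log pi(a|s) + H(pi)) + V(s). The function names, the regularization weight `alpha`, and the toy values are hypothetical, and this is not the authors' implementation.

```python
import math

def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def q_from_policy(probs, value, alpha):
    """Q estimate alpha * (log pi(a) + entropy) + V under the softmax assumption."""
    entropy = -sum(p * math.log(p) for p in probs)
    return [alpha * (math.log(p) + entropy) + value for p in probs]

alpha = 0.5                       # hypothetical regularization weight
true_q = [1.0, 2.0, 0.5]          # toy Q-values for one state
probs = softmax([q / alpha for q in true_q])
value = sum(p * q for p, q in zip(probs, true_q))  # V(s) as the policy-weighted Q
recovered = q_from_policy(probs, value, alpha)
print(recovered)
```

Under these assumptions the recovered values match `true_q` up to floating-point error, which is what makes it possible to apply Q-learning style updates to quantities derived from the policy.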

Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification


Title	Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification
Authors	Ali Diba, Ali Mohammad Pazandeh, Luc Van Gool
Abstract	Video and action classification have advanced greatly with deep neural networks, especially two-stream CNNs that take RGB and optical flow as inputs and achieve outstanding performance in video analysis. One shortcoming of these methods is that motion information is extracted outside the CNNs, which is relatively time-consuming even on GPUs. End-to-end methods that learn motion representations directly, such as 3D CNNs, can therefore be both faster and accurate. We present novel deep CNNs that use a 3D architecture to model actions and motion representations efficiently, aiming to be accurate while running as fast as real time. Our new networks learn distinctive models that combine deep motion features with the appearance model by learning optical flow features inside the network.
Tasks	Action Classification, Optical Flow Estimation, Video Classification
Published	2016-08-31
URL	http://arxiv.org/abs/1608.08851v2
PDF	http://arxiv.org/pdf/1608.08851v2.pdf
PWC	https://paperswithcode.com/paper/efficient-two-stream-motion-and-appearance-3d
Repo
Framework

Deep Motion Features for Visual Tracking


Title	Deep Motion Features for Visual Tracking
Authors	Susanna Gladh, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg
Abstract	Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
Tasks	Optical Flow Estimation, Temporal Action Localization, Video Classification, Visual Tracking
Published	2016-12-20
URL	http://arxiv.org/abs/1612.06615v1
PDF	http://arxiv.org/pdf/1612.06615v1.pdf
PWC	https://paperswithcode.com/paper/deep-motion-features-for-visual-tracking
Repo
Framework

Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition


Title	Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition
Authors	César Roberto de Souza, Adrien Gaidon, Eleonora Vig, Antonio Manuel López
Abstract	Action recognition in videos is a challenging task due to the complexity of the spatio-temporal patterns to model and the difficulty to acquire and learn on large quantities of video data. Deep learning, although a breakthrough for image classification and showing promise for videos, has still not clearly superseded action recognition methods using hand-crafted features, even when training on massive datasets. In this paper, we introduce hybrid video classification architectures based on carefully designed unsupervised representations of hand-crafted spatio-temporal features classified by supervised deep networks. As we show in our experiments on five popular benchmarks for action recognition, our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10000 short clips) and yet improves significantly on the state of the art, including recent deep models trained on millions of manually labelled images and videos.
Tasks	Action Recognition In Videos, Image Classification, Temporal Action Localization, Video Classification
Published	2016-08-25
URL	http://arxiv.org/abs/1608.07138v1
PDF	http://arxiv.org/pdf/1608.07138v1.pdf
PWC	https://paperswithcode.com/paper/sympathy-for-the-details-dense-trajectories
Repo
Framework

A Novel Memetic Feature Selection Algorithm


Title	A Novel Memetic Feature Selection Algorithm
Authors	Mohadeseh Montazeri, Hamid Reza Naji, Mitra Montazeri, Ahmad Faraahi
Abstract	Feature selection is the problem of finding efficient features among all features, such that the final feature set improves accuracy and reduces complexity. In feature selection algorithms, search strategies are key aspects. Since feature selection is an NP-hard problem, heuristic algorithms have been studied to solve it. In this paper, we propose a method based on a memetic algorithm to find an efficient feature subset for a classification problem. It incorporates a filter method into the genetic algorithm to improve classification performance and accelerate the search for core feature subsets. In particular, the method adds or deletes a feature from a candidate feature subset based on multivariate feature information. An empirical study on commonly used data sets from the University of California, Irvine shows that the proposed method outperforms existing methods.
Tasks	Feature Selection
Published	2016-01-26
URL	http://arxiv.org/abs/1601.06933v1
PDF	http://arxiv.org/pdf/1601.06933v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-memetic-feature-selection-algorithm
Repo
Framework

Automatic Content-aware Non-Photorealistic Rendering of Images


Title	Automatic Content-aware Non-Photorealistic Rendering of Images
Authors	Akshay Gadi Patil, Shanmuganathan Raman
Abstract	Non-photorealistic rendering techniques work on image features and often manipulate a set of characteristics such as edges and texture to achieve a desired depiction of the scene. Most computational photography methods decompose an image using edge preserving filters and work on the resulting base and detail layers independently to achieve desired visual effects. We propose a new approach for content-aware non-photorealistic rendering of images where we manipulate the visually salient and the non-salient regions separately. We propose a novel content-aware framework in order to render an image for applications such as detail exaggeration, artificial blurring and image abstraction. The processed regions of the image are blended seamlessly for all these applications. We demonstrate that content awareness of the proposed method leads to automatic generation of non-photorealistic rendering of the same image for the different applications mentioned above.
Tasks
Published	2016-04-07
URL	http://arxiv.org/abs/1604.01962v4
PDF	http://arxiv.org/pdf/1604.01962v4.pdf
PWC	https://paperswithcode.com/paper/automatic-content-aware-non-photorealistic
Repo
Framework

Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition


Title	Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition
Authors	Mingrui Liu, Tianbao Yang
Abstract	Recent studies have shown that the proximal gradient (PG) method and the accelerated proximal gradient (APG) method with restarting can enjoy linear convergence under a weaker condition than strong convexity, namely a quadratic growth condition (QGC). However, the faster convergence of the restarting APG method relies on the potentially unknown constant in QGC to appropriately restart APG, which restricts its applicability. We address this issue by developing novel adaptive gradient converging methods, i.e., leveraging the magnitude of the proximal gradient as a criterion for restart and termination. Our analysis extends to a much more general condition beyond the QGC, namely the Hölderian error bound (HEB) condition. The key technique for our development is a novel synthesis of adaptive regularization and a conditional restarting scheme, which extends previous work focusing on strongly convex problems to a much broader family of problems. Furthermore, we demonstrate that our results have important implications and applications in machine learning: (i) if the objective function is coercive and semi-algebraic, PG’s convergence speed is essentially $o(\frac{1}{t})$, where $t$ is the total number of iterations; (ii) if the objective function consists of an $\ell_1$, $\ell_\infty$, $\ell_{1,\infty}$, or Huber norm regularization and a convex smooth piecewise quadratic loss (e.g., square loss, squared hinge loss and Huber loss), the proposed algorithm is parameter-free and enjoys a faster linear convergence than PG without any other assumptions (e.g., restricted eigenvalue condition). It is notable that our linear convergence results for the aforementioned problems are global instead of local. To the best of our knowledge, these improved results are the first shown in this work.
Tasks
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07609v2
PDF	http://arxiv.org/pdf/1611.07609v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-accelerated-gradient-converging-1
Repo
Framework
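
The idea of using the gradient magnitude as a criterion for restart and termination can be sketched on a smooth problem with no regularizer, where the proximal gradient reduces to the ordinary gradient. The code below is an illustration under that assumption, not the authors' algorithm: it omits their adaptive regularization, and the function name, the halving rule for restarts, and the toy quadratic are all hypothetical choices.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def accelerated_gradient_with_restart(grad, x0, lipschitz, eps=1e-6, max_iter=100000):
    """Nesterov-style accelerated gradient; momentum is reset whenever the
    gradient norm has halved since the current stage began (assumed rule)."""
    x, y, t = list(x0), list(x0), 1.0
    ref = norm(grad(x))  # gradient magnitude at the start of the current stage
    for k in range(1, max_iter + 1):
        g = grad(y)
        x_new = [yi - gi / lipschitz for yi, gi in zip(y, g)]
        g_norm = norm(grad(x_new))
        if g_norm <= eps:              # termination test on gradient magnitude
            return x_new, k
        if g_norm <= 0.5 * ref:        # restart: drop momentum, begin a new stage
            ref, t, y = g_norm, 1.0, list(x_new)
        else:                          # standard momentum update
            t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
            beta = (t - 1.0) / t_new
            y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
            t = t_new
        x = x_new
    return x, max_iter

# Toy problem: f(x) = 0.5 * sum(s_i * x_i^2), with gradient s_i * x_i.
scales = [1.0, 10.0, 100.0]
grad = lambda x: [s * xi for s, xi in zip(scales, x)]
solution, iterations = accelerated_gradient_with_restart(grad, [1.0, 1.0, 1.0], lipschitz=100.0)
print(solution, iterations)
```

Note that the restart rule here needs no growth constant, only quantities the method already computes, which is the practical appeal the abstract points to.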

Semantic Similarity Strategies for Job Title Classification


Title	Semantic Similarity Strategies for Job Title Classification
Authors	Yun Zhu, Faizan Javed, Ozgur Ozturk
Abstract	Automatic and accurate classification of items enables numerous downstream applications in many domains. These applications can range from faceted browsing of items to product recommendations and big data analytics. In the online recruitment domain, we refer to classifying job ads to pre-defined or custom occupation categories as job title classification. A large-scale job title classification system can power various downstream applications such as semantic search, job recommendations and labor market analytics. In this paper, we discuss experiments conducted to improve our in-house job title classification system. The classification component of the system is composed of a two-stage coarse and fine level classifier cascade that classifies input text such as job titles and/or job ads to one of the thousands of job titles in our taxonomy. To improve classification accuracy and effectiveness, we experiment with various semantic representation strategies such as average W2V vectors and document similarity measures such as Word Mover's Distance (WMD). Our initial results show an overall improvement in accuracy of Carotene[1].
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2016-09-20
URL	http://arxiv.org/abs/1609.06268v1
PDF	http://arxiv.org/pdf/1609.06268v1.pdf
PWC	https://paperswithcode.com/paper/semantic-similarity-strategies-for-job-title
Repo
Framework
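
One of the representation strategies named in the abstract, averaging word vectors, can be sketched as follows: each text is represented by the mean of its word vectors, and candidate titles are ranked by cosine similarity. The three-dimensional embeddings and the function names below are invented for illustration; a real system would use trained word2vec vectors and its own taxonomy.

```python
import math

# Toy 3-dimensional "embeddings", made up for this example only.
EMBEDDINGS = {
    "software": [0.9, 0.1, 0.0],
    "engineer": [0.8, 0.2, 0.1],
    "developer": [0.85, 0.15, 0.05],
    "java": [0.9, 0.0, 0.1],
    "registered": [0.0, 0.9, 0.2],
    "nurse": [0.1, 0.95, 0.1],
}

def average_vector(text):
    """Mean of the vectors of all known words in the text."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words in text")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def closest_title(query, titles):
    """Return the candidate title whose averaged vector is most similar to the query."""
    q = average_vector(query)
    return max(titles, key=lambda t: cosine(q, average_vector(t)))

best = closest_title("Java Developer", ["Software Engineer", "Registered Nurse"])
print(best)
```

Averaging ignores word order, which is one reason a document-level measure such as WMD might be tried alongside it.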

A Unified Approach for Learning the Parameters of Sum-Product Networks


Title	A Unified Approach for Learning the Parameters of Sum-Product Networks
Authors	Han Zhao, Pascal Poupart, Geoff Gordon
Abstract	We present a unified approach for learning the parameters of Sum-Product networks (SPNs). We prove that any complete and decomposable SPN is equivalent to a mixture of trees where each tree corresponds to a product of univariate distributions. Based on the mixture model perspective, we characterize the objective function when learning SPNs based on the maximum likelihood estimation (MLE) principle and show that the optimization problem can be formulated as a signomial program. We construct two parameter learning algorithms for SPNs by using sequential monomial approximations (SMA) and the concave-convex procedure (CCCP), respectively. The two proposed methods naturally admit multiplicative updates, hence effectively avoiding the projection operation. With the help of the unified framework, we also show that, in the case of SPNs, CCCP leads to the same algorithm as Expectation Maximization (EM) despite the fact that they are different in general.
Tasks
Published	2016-01-03
URL	http://arxiv.org/abs/1601.00318v4
PDF	http://arxiv.org/pdf/1601.00318v4.pdf
PWC	https://paperswithcode.com/paper/a-unified-approach-for-learning-the
Repo
Framework

On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations


Title	On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations
Authors	Xueyu Mao, Purnamrita Sarkar, Deepayan Chakrabarti
Abstract	The problem of finding overlapping communities in networks has gained much attention recently. Optimization-based approaches use non-negative matrix factorization (NMF) or variants, but the global optimum cannot be provably attained in general. Model-based approaches, such as the popular mixed-membership stochastic blockmodel or MMSB (Airoldi et al., 2008), use parameters for each node to specify the overlapping communities, but standard inference techniques cannot guarantee consistency. We link the two approaches, by (a) establishing sufficient conditions for the symmetric NMF optimization to have a unique solution under MMSB, and (b) proposing a computationally efficient algorithm called GeoNMF that is provably optimal and hence consistent for a broad parameter regime. We demonstrate its accuracy on both simulated and real-world datasets.
Tasks
Published	2016-07-01
URL	http://arxiv.org/abs/1607.00084v2
PDF	http://arxiv.org/pdf/1607.00084v2.pdf
PWC	https://paperswithcode.com/paper/on-mixed-memberships-and-symmetric
Repo
Framework

Self-Sustaining Iterated Learning


Title	Self-Sustaining Iterated Learning
Authors	Bernard Chazelle, Chu Wang
Abstract	An important result from psycholinguistics (Griffiths & Kalish, 2005) states that no language can be learned iteratively by rational agents in a self-sustaining manner. We show how to modify the learning process slightly in order to achieve self-sustainability. Our work is in two parts. First, we characterize iterated learnability in geometric terms and show how a slight, steady increase in the lengths of the training sessions ensures self-sustainability for any discrete language class. In the second part, we tackle the nondiscrete case and investigate self-sustainability for iterated linear regression. We discuss the implications of our findings to issues of non-equilibrium dynamics in natural algorithms.
Tasks
Published	2016-09-13
URL	http://arxiv.org/abs/1609.03960v1
PDF	http://arxiv.org/pdf/1609.03960v1.pdf
PWC	https://paperswithcode.com/paper/self-sustaining-iterated-learning
Repo
Framework

Logarithmic Time One-Against-Some


Title	Logarithmic Time One-Against-Some
Authors	Hal Daume III, Nikos Karampatziakis, John Langford, Paul Mineiro
Abstract	We create a new online reduction of multiclass classification to binary classification for which training and prediction time scale logarithmically with the number of classes. Compared to previous approaches, we obtain substantially better statistical performance for two reasons: First, we prove a tighter and more complete boosting theorem, and second we translate the results more directly into an algorithm. We show that several simple techniques give rise to an algorithm that can compete with one-against-all in both space and predictive power while offering exponential improvements in speed when the number of classes is large.
Tasks
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04988v2
PDF	http://arxiv.org/pdf/1606.04988v2.pdf
PWC	https://paperswithcode.com/paper/logarithmic-time-one-against-some
Repo
Framework

PCA/LDA Approach for Text-Independent Speaker Recognition


Title	PCA/LDA Approach for Text-Independent Speaker Recognition
Authors	Zhenhao Ge, Sudhendu R. Sharma, Mark J. T. Smith
Abstract	Various algorithms for text-independent speaker recognition have been developed through the decades, aiming to improve both accuracy and efficiency. This paper presents a novel PCA/LDA-based approach that is faster than traditional statistical model-based methods and achieves competitive results. First, the performance based on only PCA and only LDA is measured; then a mixed model, taking advantage of both methods, is introduced. A subset of the TIMIT corpus composed of 200 male speakers is used for enrollment, validation and testing. The best results achieve 100%, 96% and 95% classification rates at population levels of 50, 100 and 200, using 39-dimensional MFCC features with delta and double delta. These results are based on 12-second text-independent speech for training and 4-second data for test. These are comparable to the conventional MFCC-GMM methods, but require significantly less time to train and operate.
Tasks	Speaker Recognition, Text-Independent Speaker Recognition
Published	2016-02-25
URL	http://arxiv.org/abs/1602.08045v1
PDF	http://arxiv.org/pdf/1602.08045v1.pdf
PWC	https://paperswithcode.com/paper/pcalda-approach-for-text-independent-speaker
Repo
Framework

Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions


Title	Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions
Authors	Shaofei Wang, Charless C. Fowlkes
Abstract	We describe an end-to-end framework for learning the parameters of a min-cost flow multi-target tracking problem with quadratic trajectory interactions, including suppression of overlapping tracks and contextual cues about co-occurrence of different objects. Our approach utilizes structured prediction with a tracking-specific loss function to learn the complete set of model parameters. In this learning framework, we evaluate two different approaches to finding an optimal set of tracks under a quadratic model objective, one based on an LP relaxation and the other based on novel greedy variants of dynamic programming that handle pairwise interactions. We find the greedy algorithms achieve almost equivalent accuracy to the LP relaxation while being up to 10x faster than a commercial LP solver. We evaluate trained models on three challenging benchmarks. Surprisingly, we find that with proper parameter learning, our simple data association model without explicit appearance/motion reasoning is able to achieve comparable or better accuracy than many state-of-the-art methods that use far more complex motion features or appearance affinity metric learning.
Tasks	Metric Learning, Structured Prediction
Published	2016-10-05
URL	http://arxiv.org/abs/1610.01394v1
PDF	http://arxiv.org/pdf/1610.01394v1.pdf
PWC	https://paperswithcode.com/paper/learning-optimal-parameters-for-multi-target
Repo
Framework