October 19, 2019


Paper Group ANR 368

From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning. Universal Adversarial Training. Object and Text-guided Semantics for CNN-based Activity Recognition. Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning. Detecting Correlations with Little Memory and Communication. A C …

From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning


Title	From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning
Authors	Mohammadhosein Hasanbeig, Lacra Pavel
Abstract	The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning. The standard analysis of log-linear learning needs a highly structured environment, i.e. strong assumptions about the game from an implementation perspective. In this paper, we introduce a variant of log-linear learning that provides asymptotic guarantees while relaxing the structural assumptions to include synchronous updates and limitations in information available to the players. On the other hand, model-free reinforcement learning is able to perform even under weaker assumptions on players’ knowledge about the environment and other players’ strategies. We propose a reinforcement algorithm that uses a double-aggregation scheme in order to deepen players’ insight about the environment and constant learning step-size which achieves a higher convergence rate. Numerical experiments are conducted to verify each algorithm’s robustness and performance.
Tasks
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02277v2
PDF	http://arxiv.org/pdf/1802.02277v2.pdf
PWC	https://paperswithcode.com/paper/from-game-theoretic-multi-agent-log-linear
Repo
Framework
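
As a rough illustration of the log-linear learning rule named in the abstract above, the following sketch implements the textbook asynchronous version: one randomly chosen player revises its action by sampling from a Boltzmann (softmax) distribution over its utilities. This is not the paper's relaxed variant with synchronous updates. The function names, the two-player identical-interest game, and the temperature values are assumptions made for the example.

```python
import math
import random


def log_linear_probs(utilities, temperature):
    """Boltzmann (softmax) distribution over actions given their utilities."""
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp((u - top) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def run_log_linear(payoff, steps, temperature, seed=0):
    """Textbook asynchronous log-linear learning in a two-player game.

    payoff[a0][a1] is the utility shared by both players (hypothetical
    identical-interest game chosen for illustration).
    """
    rng = random.Random(seed)
    n_actions = len(payoff)
    actions = [rng.randrange(n_actions), rng.randrange(n_actions)]
    for _ in range(steps):
        i = rng.randrange(2)  # exactly one player revises per step
        other = actions[1 - i]
        if i == 0:
            utils = [payoff[a][other] for a in range(n_actions)]
        else:
            utils = [payoff[other][a] for a in range(n_actions)]
        probs = log_linear_probs(utils, temperature)
        actions[i] = rng.choices(list(range(n_actions)), weights=probs)[0]
    return actions
```

A lower temperature concentrates the distribution on the best response, while a higher one keeps the players exploring.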

Universal Adversarial Training


Title	Universal Adversarial Training
Authors	Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein
Abstract	Standard adversarial attacks change the predicted class label of a selected image by adding specially tailored small perturbations to its pixels. In contrast, a universal perturbation is an update that can be added to any image in a broad class of images, while still changing the predicted class label. We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. We propose a simple optimization-based universal attack that reduces the top-1 accuracy of various network architectures on ImageNet to less than 20%, while learning the universal perturbation 13X faster than the standard method. To defend against these perturbations, we propose universal adversarial training, which models the problem of robust classifier generation as a two-player min-max game, and produces robust models with only 2X the cost of natural training. We also propose a simultaneous stochastic gradient method that is almost free of extra computation, which allows us to do universal adversarial training on ImageNet.
Tasks
Published	2018-11-27
URL	https://arxiv.org/abs/1811.11304v2
PDF	https://arxiv.org/pdf/1811.11304v2.pdf
PWC	https://paperswithcode.com/paper/universal-adversarial-training
Repo
Framework
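
To make the idea of a universal perturbation concrete, here is a minimal sketch that learns a single perturbation shared by every input, using projected sign-gradient ascent on the mean logistic loss of a fixed linear classifier. The linear model, the L-infinity budget `eps`, and all function names are assumptions chosen for illustration. The paper's own attack targets deep networks on ImageNet and is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def universal_perturbation(weights, data, labels, eps, step, iters):
    """Learn one delta that raises the mean logistic loss over all inputs.

    Assumes a fixed linear classifier and labels in {-1, +1}; delta is kept
    inside the L-infinity ball of radius eps by clipping after every step.
    """
    dim = len(weights)
    delta = [0.0] * dim
    for _ in range(iters):
        grad = [0.0] * dim
        for x, y in zip(data, labels):
            margin = y * sum(w * (xi + di) for w, xi, di in zip(weights, x, delta))
            # d/d(delta) of log(1 + exp(-margin)) is -y * sigmoid(-margin) * w
            coeff = -y * sigmoid(-margin)
            for j in range(dim):
                grad[j] += coeff * weights[j] / len(data)
        # ascent step on the loss, then projection onto the eps-ball
        delta = [max(-eps, min(eps, d + step * sign(g))) for d, g in zip(delta, grad)]
    return delta
```

The same `delta` is added to every input, which is what distinguishes a universal perturbation from a per-image attack.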

Object and Text-guided Semantics for CNN-based Activity Recognition


Title	Object and Text-guided Semantics for CNN-based Activity Recognition
Authors	Sungmin Eum, Christopher Reale, Heesung Kwon, Claire Bonial, Clare Voss
Abstract	Many previous methods have demonstrated the importance of considering semantically relevant objects for carrying out video-based human activity recognition, yet none of the methods have harvested the power of large text corpora to relate the objects and the activities to be transferred into learning a unified deep convolutional neural network. We present a novel activity recognition CNN which co-learns the object recognition task in an end-to-end multitask learning scheme to improve upon the baseline activity recognition performance. We further improve upon the multitask learning approach by exploiting a text-guided semantic space to select the most relevant objects with respect to the target activities. To the best of our knowledge, we are the first to investigate this approach.
Tasks	Activity Recognition, Human Activity Recognition, Object Recognition
Published	2018-05-04
URL	http://arxiv.org/abs/1805.01818v1
PDF	http://arxiv.org/pdf/1805.01818v1.pdf
PWC	https://paperswithcode.com/paper/object-and-text-guided-semantics-for-cnn
Repo
Framework

Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning


Title	Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning
Authors	Baoxiang Wang, Tongfang Sun, Xianjun Sam Zheng
Abstract	In recent years, reinforcement learning (RL) methods have been applied to model gameplay with great success, achieving super-human performance in various environments, such as Atari, Go, and Poker. However, those studies mostly focus on winning the game and have largely ignored the rich and complex human motivations, which are essential for understanding different players’ diverse behaviors. In this paper, we present a novel method called Multi-Motivation Behavior Modeling (MMBM) that takes the multifaceted human motivations into consideration and models the underlying value structure of the players using inverse RL. Our approach does not require access to the dynamics of the system, making it feasible to model complex interactive environments such as massively multiplayer online games. MMBM is tested on the World of Warcraft Avatar History dataset, which recorded over 70,000 users’ gameplay spanning a three-year period. Our model reveals significant differences in value structures among different player groups. Using the results of motivation modeling, we also predict and explain their diverse gameplay behaviors and provide a quantitative assessment of how the redesign of the game environment impacts players’ behaviors.
Tasks
Published	2018-07-01
URL	http://arxiv.org/abs/1807.00366v2
PDF	http://arxiv.org/pdf/1807.00366v2.pdf
PWC	https://paperswithcode.com/paper/beyond-winning-and-losing-modeling-human
Repo
Framework

Detecting Correlations with Little Memory and Communication


Title	Detecting Correlations with Little Memory and Communication
Authors	Yuval Dagan, Ohad Shamir
Abstract	We study the problem of identifying correlations in multivariate data, under information constraints: Either on the amount of memory that can be used by the algorithm, or the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir [2014], which studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems.
Tasks
Published	2018-03-04
URL	http://arxiv.org/abs/1803.01420v2
PDF	http://arxiv.org/pdf/1803.01420v2.pdf
PWC	https://paperswithcode.com/paper/detecting-correlations-with-little-memory-and
Repo
Framework

A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese


Title	A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese
Authors	Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu
Abstract	The choice of modeling units is critical to automatic speech recognition (ASR) tasks. Conventional ASR systems typically choose context-dependent states (CD-states) or context-dependent phonemes (CD-phonemes) as their modeling units. However, this choice has been challenged by sequence-to-sequence attention-based models, which integrate an acoustic, pronunciation and language model into a single neural network. On English ASR tasks, previous attempts have already shown that graphemes can outperform phonemes as the modeling unit in sequence-to-sequence attention-based models. In this paper, we are concerned with modeling units on Mandarin Chinese ASR tasks using sequence-to-sequence attention-based models with the Transformer. Five modeling units are explored including context-independent phonemes (CI-phonemes), syllables, words, sub-words and characters. Experiments on HKUST datasets demonstrate that the lexicon-free modeling units can outperform lexicon-related modeling units in terms of character error rate (CER). Among the five modeling units, the character-based model performs best and establishes a new state-of-the-art CER of $26.64\%$ on HKUST datasets without a hand-designed lexicon or extra language model integration, which corresponds to a $4.8\%$ relative improvement over the existing best CER of $28.0\%$ by the joint CTC-attention based encoder-decoder network.
Tasks	Language Modelling, Sequence-To-Sequence Speech Recognition, Speech Recognition
Published	2018-05-16
URL	http://arxiv.org/abs/1805.06239v2
PDF	http://arxiv.org/pdf/1805.06239v2.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-modeling-units-in-sequence-to
Repo
Framework
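
For readers unfamiliar with the metric, the sketch below shows one common way to compute character error rate (edit distance divided by reference length) and the relative improvement between two CER values. The function names are assumptions, and the paper's exact scoring setup is not given in the abstract. Plugging in the reported figures, (28.0 - 26.64) / 28.0 comes to about 4.86%, in line with the roughly 4.8% quoted above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline, new):
    """Fractional reduction of an error rate relative to the baseline."""
    return (baseline - new) / baseline
```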

Context-aware Cascade Attention-based RNN for Video Emotion Recognition


Title	Context-aware Cascade Attention-based RNN for Video Emotion Recognition
Authors	Man-Chin Sun, Shih-Huan Hsu, Min-Chun Yang, Jen-Hsien Chien
Abstract	Emotion recognition can provide crucial information about the user in many applications when building human-computer interaction (HCI) systems. Most current research on visual emotion recognition focuses on exploring facial features. However, context information including the surrounding environment and the human body can also provide extra clues to recognize emotion more accurately. Inspired by the “sequence to sequence model” for neural machine translation, which models input and output sequences by an encoder and a decoder in a recurrent neural network (RNN) architecture respectively, a novel architecture, “CACA-RNN”, is proposed in this work. The proposed network consists of two RNNs in a cascaded architecture to process both context and facial information to perform video emotion classification. Results of the model were submitted to the video emotion recognition sub-challenge in the Multimodal Emotion Recognition Challenge (MEC2017). CACA-RNN outperforms the MEC2017 baseline (mAP of 21.7%): it achieved mAP of 45.51% on the testing set in the video only challenge.
Tasks	Emotion Classification, Emotion Recognition, Machine Translation, Multimodal Emotion Recognition, Video Emotion Recognition
Published	2018-05-30
URL	http://arxiv.org/abs/1805.12098v1
PDF	http://arxiv.org/pdf/1805.12098v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-cascade-attention-based-rnn-for
Repo
Framework

Spectral Image Visualization Using Generative Adversarial Networks


Title	Spectral Image Visualization Using Generative Adversarial Networks
Authors	Siyu Chen, Danping Liao, Yuntao Qian
Abstract	Spectral images captured by satellites and radio-telescopes are analyzed to obtain information about the distribution of geological compositions, distant asters as well as undersea terrain. Spectral images usually contain tens to hundreds of continuous narrow spectral bands and are widely used in various fields. But the vast majority of those image signals are beyond the visible range, which calls for special visualization techniques. The visualizations of spectral images should convey as much information as possible from the original signal and facilitate image interpretation. However, most of the existing visualization methods display spectral images in false colors, which contradict human experience and expectation. In this paper, we present a novel visualization generative adversarial network (GAN) to display spectral images in natural colors. To achieve our goal, we propose a loss function which consists of an adversarial loss and a structure loss. The adversarial loss pushes our solution to the natural image distribution using a discriminator network that is trained to differentiate between false-color images and natural-color images. We also use a cycle loss as the structure constraint to guarantee structure consistency. Experimental results show that our method is able to generate structure-preserved and natural-looking visualizations.
Tasks
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02290v1
PDF	http://arxiv.org/pdf/1802.02290v1.pdf
PWC	https://paperswithcode.com/paper/spectral-image-visualization-using-generative
Repo
Framework

Handling Concept Drift via Model Reuse


Title	Handling Concept Drift via Model Reuse
Authors	Peng Zhao, Le-Wen Cai, Zhi-Hua Zhou
Abstract	In many real-world applications, data are often collected in the form of a stream, and thus the distribution usually changes in nature, which is referred to as concept drift in the literature. We propose a novel and effective approach to handle concept drift via model reuse, leveraging previous knowledge by reusing models. Each model is associated with a weight representing its reusability towards current data, and the weight is adaptively adjusted according to the model performance. We provide generalization and regret analysis. Experimental results also validate the superiority of our approach on both synthetic and real-world datasets.
Tasks
Published	2018-09-08
URL	http://arxiv.org/abs/1809.02804v1
PDF	http://arxiv.org/pdf/1809.02804v1.pdf
PWC	https://paperswithcode.com/paper/handling-concept-drift-via-model-reuse
Repo
Framework
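
The abstract above says each reused model carries a weight that is adjusted according to its performance, but it does not state the update rule. The sketch below assumes a standard multiplicative (Hedge-style exponential weights) update as one plausible instance. The function names and the learning rate `eta` are hypothetical.

```python
import math


def update_weights(weights, losses, eta):
    """Exponential-weights update: models with larger loss lose weight.

    This multiplicative rule is an assumption for illustration; the paper's
    actual update is not given in the abstract.
    """
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def weighted_predict(weights, predictions):
    """Combine the predictions of previously trained models by their weights."""
    return sum(w * p for w, p in zip(weights, predictions))
```

After each batch of stream data, the losses of the stored models on that batch would be passed to `update_weights`, so models trained on a distribution close to the current one gradually dominate the combined prediction.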

Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS


Title	Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS
Authors	Mahardhika Pratama, Choiru Za’in, Eric Pardede
Abstract	Many distributed machine learning frameworks have recently been built to speed up the large-scale data learning process. However, most distributed machine learning algorithms used in these frameworks still rely on an offline model, which cannot cope with data stream problems. In fact, large-scale data are mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel Evolving Large-scale Data Stream Analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn large-scale data streams. The Scalable PANFIS framework incorporates the active learning (AL) strategy and two model fusion methods. The AL accelerates the distributed learning process to generate an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate an initial model to generate the final model. The final model represents the update of current large-scale data knowledge which can be used to infer future data. Extensive experiments on this framework are validated by measuring the accuracy and running time of four combinations of Scalable PANFIS and other Spark-based built-in algorithms. The results indicate that Scalable PANFIS with AL makes training almost two times faster than Scalable PANFIS without AL. The results also show that both the rule merging and the voting mechanisms yield similar accuracy in general among Scalable PANFIS algorithms and that they are generally better than Spark-based algorithms. In terms of running time, the Scalable PANFIS training time outperforms all Spark-based algorithms when classifying numerous benchmark datasets.
Tasks	Active Learning
Published	2018-07-18
URL	http://arxiv.org/abs/1807.06996v1
PDF	http://arxiv.org/pdf/1807.06996v1.pdf
PWC	https://paperswithcode.com/paper/evolving-large-scale-data-stream-analytics
Repo
Framework

Improving Temporal Interpolation of Head and Body Pose using Gaussian Process Regression in a Matrix Completion Setting


Title	Improving Temporal Interpolation of Head and Body Pose using Gaussian Process Regression in a Matrix Completion Setting
Authors	Stephanie Tan, Hayley Hung
Abstract	This paper presents a model for head and body pose estimation (HBPE) when labelled samples are highly sparse. The current state-of-the-art multimodal approach to HBPE utilizes the matrix completion method in a transductive setting to predict pose labels for unobserved samples. Based on this approach, the proposed method tackles HBPE when manually annotated ground truth labels are temporally sparse. We posit that the current state-of-the-art approach oversimplifies the temporal sparsity assumption by using Laplacian smoothing. Our final solution uses: i) Gaussian process regression in place of Laplacian smoothing, ii) head and body coupling, and iii) nuclear norm minimization in the matrix completion setting. The model is applied to the challenging SALSA dataset to benchmark against the state-of-the-art method. Our presented formulation outperforms the state-of-the-art significantly in this particular setting, e.g. with 5% ground truth labels as training data, head pose accuracy and body pose accuracy are approximately 62% and 70%, respectively. As well as fitting a more flexible model to missing labels in time, we posit that our approach also loosens the head and body coupling constraint, allowing for a more expressive model of the head and body pose typically seen during conversational interaction in groups. This provides a new baseline to improve upon for future integration of multimodal sensor data for the purpose of HBPE.
Tasks	Matrix Completion, Pose Estimation
Published	2018-08-06
URL	http://arxiv.org/abs/1808.01837v1
PDF	http://arxiv.org/pdf/1808.01837v1.pdf
PWC	https://paperswithcode.com/paper/improving-temporal-interpolation-of-head-and
Repo
Framework

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator


Title	SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Authors	Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang
Abstract	In this paper, we propose a new technique named Stochastic Path-Integrated Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of interest with significantly reduced computational cost. We apply SPIDER to two tasks, namely the stochastic first-order and zeroth-order methods. For the stochastic first-order method, combining SPIDER with normalized gradient descent, we propose two new algorithms, namely SPIDER-SFO and SPIDER-SFO$^{+}$, that solve non-convex stochastic optimization problems using stochastic gradients only. We provide sharp error-bound results on their convergence rates. In particular, we prove that the SPIDER-SFO and SPIDER-SFO$^{+}$ algorithms achieve a record-breaking gradient computation cost of $\mathcal{O}\left( \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3} ) \right)$ for finding an $\epsilon$-approximate first-order stationary point and $\tilde{\mathcal{O}}\left( \min( n^{1/2} \epsilon^{-2}+\epsilon^{-2.5}, \epsilon^{-3} ) \right)$ for finding an $(\epsilon, \mathcal{O}(\epsilon^{0.5}))$-approximate second-order stationary point, respectively. In addition, we prove that SPIDER-SFO nearly matches the algorithmic lower bound for finding approximate first-order stationary points under the gradient Lipschitz assumption in the finite-sum setting. For the stochastic zeroth-order method, we prove a cost of $\mathcal{O}( d \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3}) )$ which outperforms all existing results.
Tasks	Stochastic Optimization
Published	2018-07-04
URL	http://arxiv.org/abs/1807.01695v2
PDF	http://arxiv.org/pdf/1807.01695v2.pdf
PWC	https://paperswithcode.com/paper/spider-near-optimal-non-convex-optimization-1
Repo
Framework
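
The path-integrated estimator combined with normalized gradient descent, as described in the abstract above, can be sketched on a toy one-dimensional finite-sum problem. The estimator is refreshed with a full gradient every few steps and otherwise updated with the difference of one sampled component gradient at the current and previous iterates. The quadratic components, batch size of one, step size, and stopping tolerance are all assumptions for the example, not the paper's parameter choices.

```python
import random


def spider_minimize(curvatures, centers, x0, step, refresh_every, iters,
                    tol=1e-6, seed=0):
    """Toy 1-D SPIDER-style loop on f(x) = mean_i 0.5 * c_i * (x - a_i)^2."""
    rng = random.Random(seed)
    n = len(centers)

    def grad_i(i, x):
        return curvatures[i] * (x - centers[i])

    def full_grad(x):
        return sum(grad_i(i, x) for i in range(n)) / n

    x, x_prev, v = x0, x0, 0.0
    for k in range(iters):
        if k % refresh_every == 0:
            v = full_grad(x)  # periodic full-gradient refresh
        else:
            i = rng.randrange(n)  # one sampled component per step
            # path-integrated update: add the sampled gradient difference
            v = grad_i(i, x) - grad_i(i, x_prev) + v
        if abs(v) <= tol:
            break  # estimator is (numerically) zero, stop
        # normalized step: move a fixed distance against the estimate
        x_prev, x = x, x - step * v / abs(v)
    return x
```

Because each step has fixed length `step`, the iterate settles into a small neighborhood of the minimizer rather than converging exactly, and the neighborhood shrinks with the step size.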

On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization


Title	On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization
Authors	Zhehui Chen, Xingguo Li, Lin F. Yang, Jarvis Haupt, Tuo Zhao
Abstract	We study constrained nonconvex optimization problems in machine learning, signal processing, and stochastic control. It is well-known that these problems can be rewritten as a minimax problem in a Lagrangian form. However, due to the lack of convexity, their landscape is not well understood and how to find the stable equilibria of the Lagrangian function is still unknown. To bridge the gap, we study the landscape of the Lagrangian function. Further, we define a special class of Lagrangian functions. They enjoy two properties: 1. Equilibria are either stable or unstable (formal definition in Section 2); 2. Stable equilibria correspond to the global optima of the original problem. We show that a generalized eigenvalue (GEV) problem, including canonical correlation analysis and other problems, belongs to the class. Specifically, we characterize its stable and unstable equilibria by leveraging an invariant group and symmetric property (more details in Section 3). Motivated by these neat geometric structures, we propose a simple, efficient, and stochastic primal-dual algorithm solving the online GEV problem. Theoretically, we provide sufficient conditions, based on which we establish an asymptotic convergence rate and obtain the first sample complexity result for the online GEV problem by diffusion approximations, which are widely used in applied probability and stochastic control. Numerical results are provided to support our theory.
Tasks
Published	2018-06-13
URL	https://arxiv.org/abs/1806.05151v3
PDF	https://arxiv.org/pdf/1806.05151v3.pdf
PWC	https://paperswithcode.com/paper/on-landscape-of-lagrangian-functions-and
Repo
Framework

ABC: Efficient Selection of Machine Learning Configuration on Large Dataset


Title	ABC: Efficient Selection of Machine Learning Configuration on Large Dataset
Authors	Silu Huang, Chi Wang, Bolin Ding, Surajit Chaudhuri
Abstract	A machine learning configuration refers to a combination of preprocessor, learner, and hyperparameters. Given a set of configurations and a large dataset randomly split into training and testing set, we study how to efficiently select the best configuration with approximately the highest testing accuracy when trained from the training set. To guarantee small accuracy loss, we develop a solution using confidence interval (CI)-based progressive sampling and pruning strategy. Compared to using full data to find the exact best configuration, our solution achieves more than two orders of magnitude speedup, while the returned top configuration has identical or close test accuracy.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03250v2
PDF	http://arxiv.org/pdf/1811.03250v2.pdf
PWC	https://paperswithcode.com/paper/abc-efficient-selection-of-machine-learning
Repo
Framework
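
A minimal sketch of confidence-interval-based progressive sampling with pruning, in the spirit of the abstract above: accuracy is estimated on growing samples, and any configuration whose upper confidence bound falls below the best lower bound is dropped. The Hoeffding interval, the precomputed per-example correctness lists, and the function names are assumptions for illustration. The paper's own CI construction and its handling of training-set sampling are not reproduced here.

```python
import math


def hoeffding_radius(n, delta):
    """Half-width of a two-sided Hoeffding interval for a mean of n values in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def select_config(correct, sample_sizes, delta=0.05):
    """Pick a configuration by progressive sampling with CI-based pruning.

    correct maps a configuration name to a list of 0/1 outcomes per test
    example (hypothetical precomputed input); each entry of sample_sizes
    must not exceed the length of those lists.
    """
    alive = set(correct)
    bounds = {}
    for n in sample_sizes:
        radius = hoeffding_radius(n, delta)
        bounds = {}
        for name in alive:
            acc = sum(correct[name][:n]) / n
            bounds[name] = (acc - radius, acc + radius, acc)
        best_lower = max(lower for lower, _, _ in bounds.values())
        # prune configurations that cannot be the best with high confidence
        alive = {name for name in alive if bounds[name][1] >= best_lower}
        if len(alive) == 1:
            break
    return max(alive, key=lambda name: bounds[name][2])
```

Clearly inferior configurations are eliminated on small samples, so only close contenders are ever evaluated on the larger ones, which is where the speedup comes from.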

Geospatial distributions reflect rates of evolution of features of language


Title	Geospatial distributions reflect rates of evolution of features of language
Authors	Henri Kauhanen, Deepthi Gopal, Tobias Galla, Ricardo Bermúdez-Otero
Abstract	Different structural features of human language change at different rates and thus exhibit different temporal stabilities. Existing methods of linguistic stability estimation depend upon the prior genealogical classification of the world’s languages into language families; these methods result in unreliable stability estimates for features which are sensitive to horizontal transfer between families and whenever data are aggregated from families of divergent time depths. To overcome these problems, we describe a method of stability estimation without family classifications, based on mathematical modelling and the analysis of contemporary geospatial distributions of linguistic features. Regressing the estimates produced by our model against those of a genealogical method, we report broad agreement but also important differences. In particular, we show that our approach is not liable to some of the false positives and false negatives incurred by the genealogical method. Our results suggest that the historical evolution of a linguistic feature leaves a footprint in its global geospatial distribution, and that rates of evolution can be recovered from these distributions by treating language dynamics as a spatially extended stochastic process.
Tasks
Published	2018-01-29
URL	http://arxiv.org/abs/1801.09637v1
PDF	http://arxiv.org/pdf/1801.09637v1.pdf
PWC	https://paperswithcode.com/paper/geospatial-distributions-reflect-rates-of
Repo
Framework