Paper Group ANR 368
From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning. Universal Adversarial Training. Object and Text-guided Semantics for CNN-based Activity Recognition. Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning. Detecting Correlations with Little Memory and Communication. A C …
From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning
Title | From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning |
Authors | Mohammadhosein Hasanbeig, Lacra Pavel |
Abstract | The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning. The standard analysis of log-linear learning needs a highly structured environment, i.e. strong assumptions about the game from an implementation perspective. In this paper, we introduce a variant of log-linear learning that provides asymptotic guarantees while relaxing the structural assumptions to include synchronous updates and limitations in information available to the players. On the other hand, model-free reinforcement learning is able to perform even under weaker assumptions on players’ knowledge about the environment and other players’ strategies. We propose a reinforcement algorithm that uses a double-aggregation scheme in order to deepen players’ insight about the environment and constant learning step-size which achieves a higher convergence rate. Numerical experiments are conducted to verify each algorithm’s robustness and performance. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02277v2 |
http://arxiv.org/pdf/1802.02277v2.pdf | |
PWC | https://paperswithcode.com/paper/from-game-theoretic-multi-agent-log-linear |
Repo | |
Framework | |
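As an illustration of the baseline that this abstract starts from, the following is a minimal sketch of standard (asynchronous) log-linear learning, not the paper's relaxed variant. The two-action coordination game, the temperature `tau`, and the function names are all assumptions made for the example: at each step one randomly chosen player resamples its action with probability proportional to `exp(utility / tau)`.

```python
import math
import random

def log_linear_probs(utilities, tau):
    """Softmax over utilities with temperature tau (numerically stabilised)."""
    m = max(utilities)
    exps = [math.exp((u - m) / tau) for u in utilities]
    z = sum(exps)
    return [e / z for e in exps]

def run_log_linear(payoff, steps=2000, tau=0.1, seed=0):
    """Two-player identical-interest game; payoff[own][other] is the shared utility.

    Hypothetical setup: exactly one player updates per step, which is the
    structural assumption the abstract says the paper relaxes.
    """
    rng = random.Random(seed)
    actions = [0, 1]  # initial joint action profile
    for _ in range(steps):
        i = rng.randrange(2)                 # pick the updating player
        other = actions[1 - i]               # opponent's current action
        utils = [payoff[a][other] for a in range(len(payoff))]
        probs = log_linear_probs(utils, tau)
        actions[i] = rng.choices(range(len(probs)), weights=probs)[0]
    return actions

# Example coordination game: both players prefer matching on action 1.
final_profile = run_log_linear([[1.0, 0.0], [0.0, 2.0]])
```

A small `tau` makes each update close to a best response, while a large `tau` makes it close to uniform exploration.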
Universal Adversarial Training
Title | Universal Adversarial Training |
Authors | Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein |
Abstract | Standard adversarial attacks change the predicted class label of a selected image by adding specially tailored small perturbations to its pixels. In contrast, a universal perturbation is an update that can be added to any image in a broad class of images, while still changing the predicted class label. We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. We propose a simple optimization-based universal attack that reduces the top-1 accuracy of various network architectures on ImageNet to less than 20%, while learning the universal perturbation 13X faster than the standard method. To defend against these perturbations, we propose universal adversarial training, which models the problem of robust classifier generation as a two-player min-max game, and produces robust models with only 2X the cost of natural training. We also propose a simultaneous stochastic gradient method that is almost free of extra computation, which allows us to do universal adversarial training on ImageNet. |
Tasks | |
Published | 2018-11-27 |
URL | https://arxiv.org/abs/1811.11304v2 |
https://arxiv.org/pdf/1811.11304v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-adversarial-training |
Repo | |
Framework | |
Object and Text-guided Semantics for CNN-based Activity Recognition
Title | Object and Text-guided Semantics for CNN-based Activity Recognition |
Authors | Sungmin Eum, Christopher Reale, Heesung Kwon, Claire Bonial, Clare Voss |
Abstract | Many previous methods have demonstrated the importance of considering semantically relevant objects for carrying out video-based human activity recognition, yet none of the methods have harvested the power of large text corpora to relate the objects and the activities to be transferred into learning a unified deep convolutional neural network. We present a novel activity recognition CNN which co-learns the object recognition task in an end-to-end multitask learning scheme to improve upon the baseline activity recognition performance. We further improve upon the multitask learning approach by exploiting a text-guided semantic space to select the most relevant objects with respect to the target activities. To the best of our knowledge, we are the first to investigate this approach. |
Tasks | Activity Recognition, Human Activity Recognition, Object Recognition |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01818v1 |
http://arxiv.org/pdf/1805.01818v1.pdf | |
PWC | https://paperswithcode.com/paper/object-and-text-guided-semantics-for-cnn |
Repo | |
Framework | |
Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning
Title | Beyond Winning and Losing: Modeling Human Motivations and Behaviors Using Inverse Reinforcement Learning |
Authors | Baoxiang Wang, Tongfang Sun, Xianjun Sam Zheng |
Abstract | In recent years, reinforcement learning (RL) methods have been applied to model gameplay with great success, achieving super-human performance in various environments, such as Atari, Go, and Poker. However, those studies mostly focus on winning the game and have largely ignored the rich and complex human motivations, which are essential for understanding different players’ diverse behaviors. In this paper, we present a novel method called Multi-Motivation Behavior Modeling (MMBM) that takes the multifaceted human motivations into consideration and models the underlying value structure of the players using inverse RL. Our approach does not require access to the dynamics of the system, making it feasible to model complex interactive environments such as massively multiplayer online games. MMBM is tested on the World of Warcraft Avatar History dataset, which recorded the gameplay of over 70,000 users over a three-year period. Our model reveals significant differences in value structures among different player groups. Using the results of motivation modeling, we also predict and explain their diverse gameplay behaviors and provide a quantitative assessment of how the redesign of the game environment impacts players’ behaviors. |
Tasks | |
Published | 2018-07-01 |
URL | http://arxiv.org/abs/1807.00366v2 |
http://arxiv.org/pdf/1807.00366v2.pdf | |
PWC | https://paperswithcode.com/paper/beyond-winning-and-losing-modeling-human |
Repo | |
Framework | |
Detecting Correlations with Little Memory and Communication
Title | Detecting Correlations with Little Memory and Communication |
Authors | Yuval Dagan, Ohad Shamir |
Abstract | We study the problem of identifying correlations in multivariate data, under information constraints: Either on the amount of memory that can be used by the algorithm, or the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir [2014], which studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems. |
Tasks | |
Published | 2018-03-04 |
URL | http://arxiv.org/abs/1803.01420v2 |
http://arxiv.org/pdf/1803.01420v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-correlations-with-little-memory-and |
Repo | |
Framework | |
A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese
Title | A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese |
Authors | Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu |
Abstract | The choice of modeling units is critical to automatic speech recognition (ASR) tasks. Conventional ASR systems typically choose context-dependent states (CD-states) or context-dependent phonemes (CD-phonemes) as their modeling units. However, this choice has been challenged by sequence-to-sequence attention-based models, which integrate an acoustic, pronunciation and language model into a single neural network. On English ASR tasks, previous attempts have already shown that graphemes can outperform phonemes as the modeling unit in sequence-to-sequence attention-based models. In this paper, we are concerned with modeling units on Mandarin Chinese ASR tasks using sequence-to-sequence attention-based models with the Transformer. Five modeling units are explored: context-independent phonemes (CI-phonemes), syllables, words, sub-words and characters. Experiments on HKUST datasets demonstrate that the lexicon-free modeling units can outperform lexicon-related modeling units in terms of character error rate (CER). Among the five modeling units, the character-based model performs best and establishes a new state-of-the-art CER of 26.64% on HKUST datasets without a hand-designed lexicon or extra language model integration, which corresponds to a 4.8% relative improvement over the existing best CER of 28.0% by the joint CTC-attention based encoder-decoder network. |
Tasks | Language Modelling, Sequence-To-Sequence Speech Recognition, Speech Recognition |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06239v2 |
http://arxiv.org/pdf/1805.06239v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comparison-of-modeling-units-in-sequence-to |
Repo | |
Framework | |
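The relative improvement quoted in this abstract can be checked with a short calculation. The sketch below uses only the two CER values stated in the abstract; the function name is an assumption for illustration.

```python
def relative_improvement(baseline_cer, new_cer):
    """Fractional reduction in error rate relative to the baseline."""
    return (baseline_cer - new_cer) / baseline_cer

# CER values as stated in the abstract (percent).
rel = relative_improvement(28.0, 26.64)
print(f"{rel * 100:.2f}% relative improvement")
```

The result is about 4.86%, consistent with the roughly 4.8% figure reported in the abstract.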
Context-aware Cascade Attention-based RNN for Video Emotion Recognition
Title | Context-aware Cascade Attention-based RNN for Video Emotion Recognition |
Authors | Man-Chin Sun, Shih-Huan Hsu, Min-Chun Yang, Jen-Hsien Chien |
Abstract | Emotion recognition can provide crucial information about the user in many applications when building human-computer interaction (HCI) systems. Most current research on visual emotion recognition focuses on exploring facial features. However, context information, including the surrounding environment and the human body, can also provide extra clues to recognize emotion more accurately. Inspired by the “sequence to sequence” model for neural machine translation, which models input and output sequences with an encoder and a decoder, respectively, in a recurrent neural network (RNN) architecture, a novel architecture, “CACA-RNN”, is proposed in this work. The proposed network consists of two RNNs in a cascaded architecture that process both context and facial information to perform video emotion classification. Results of the model were submitted to the video emotion recognition sub-challenge in the Multimodal Emotion Recognition Challenge (MEC2017). CACA-RNN outperforms the MEC2017 baseline (mAP of 21.7%): it achieved an mAP of 45.51% on the testing set in the video-only challenge. |
Tasks | Emotion Classification, Emotion Recognition, Machine Translation, Multimodal Emotion Recognition, Video Emotion Recognition |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12098v1 |
http://arxiv.org/pdf/1805.12098v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-cascade-attention-based-rnn-for |
Repo | |
Framework | |
Spectral Image Visualization Using Generative Adversarial Networks
Title | Spectral Image Visualization Using Generative Adversarial Networks |
Authors | Siyu Chen, Danping Liao, Yuntao Qian |
Abstract | Spectral images captured by satellites and radio-telescopes are analyzed to obtain information about geological compositions distributions, distant asters as well as undersea terrain. Spectral images usually contain tens to hundreds of continuous narrow spectral bands and are widely used in various fields. But the vast majority of those image signals are beyond the visible range, which calls for special visualization techniques. The visualizations of spectral images should convey as much information as possible from the original signal and facilitate image interpretation. However, most of the existing visualization methods display spectral images in false colors, which contradict human experience and expectation. In this paper, we present a novel visualization generative adversarial network (GAN) to display spectral images in natural colors. To achieve our goal, we propose a loss function which consists of an adversarial loss and a structure loss. The adversarial loss pushes our solution to the natural image distribution using a discriminator network that is trained to differentiate between false-color images and natural-color images. We also use a cycle loss as the structure constraint to guarantee structure consistency. Experimental results show that our method is able to generate structure-preserved and natural-looking visualizations. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02290v1 |
http://arxiv.org/pdf/1802.02290v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-image-visualization-using-generative |
Repo | |
Framework | |
Handling Concept Drift via Model Reuse
Title | Handling Concept Drift via Model Reuse |
Authors | Peng Zhao, Le-Wen Cai, Zhi-Hua Zhou |
Abstract | In many real-world applications, data are often collected in the form of a stream, and thus the distribution usually changes in nature, which is referred to as concept drift in the literature. We propose a novel and effective approach to handle concept drift via model reuse, leveraging previous knowledge by reusing models. Each model is associated with a weight representing its reusability towards current data, and the weight is adaptively adjusted according to the model performance. We provide generalization and regret analysis. Experimental results also validate the superiority of our approach on both synthetic and real-world datasets. |
Tasks | |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02804v1 |
http://arxiv.org/pdf/1809.02804v1.pdf | |
PWC | https://paperswithcode.com/paper/handling-concept-drift-via-model-reuse |
Repo | |
Framework | |
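The abstract's idea of weighting previous models by their performance on current data can be illustrated with a generic exponential-weights (Hedge-style) update. This is only a sketch under assumptions: the abstract does not specify the paper's update rule, and the learning rate `eta`, the loss values, and the function names here are hypothetical.

```python
import math

def update_weights(weights, losses, eta=0.5):
    """Multiplicative update: models with a higher loss on the current batch lose weight."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]  # renormalise to sum to 1

def weighted_predict(weights, predictions):
    """Combine the reused models' numeric predictions by their current weights."""
    return sum(w * p for w, p in zip(weights, predictions))

# Two previously trained models; the second performs worse on the current data.
weights = update_weights([0.5, 0.5], losses=[0.0, 1.0])
combined = weighted_predict(weights, predictions=[1.0, -1.0])
```

After the update, the first model carries more weight, so the combined prediction moves towards its output.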
Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS
Title | Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS |
Authors | Mahardhika Pratama, Choiru Za’in, Eric Pardede |
Abstract | Many distributed machine learning frameworks have recently been built to speed up the large-scale data learning process. However, most distributed machine learning in these frameworks still relies on an offline algorithm model, which cannot cope with data stream problems. In fact, large-scale data are mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel Evolving Large-scale Data Stream Analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn from large-scale data streams. The Scalable PANFIS framework incorporates an active learning (AL) strategy and two model fusion methods. The AL accelerates the distributed learning process to generate an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate the initial model to generate the final model. The final model represents the update of current large-scale data knowledge, which can be used to infer future data. The framework is evaluated in extensive experiments that measure the accuracy and running time of four combinations of Scalable PANFIS and other Spark-based built-in algorithms. The results indicate that Scalable PANFIS with AL trains almost twice as fast as Scalable PANFIS without AL. The results also show that both the rule merging and the voting mechanisms yield similar accuracy in general among Scalable PANFIS algorithms, and that they are generally better than Spark-based algorithms. In terms of running time, the Scalable PANFIS training time outperforms all Spark-based algorithms when classifying numerous benchmark datasets. |
Tasks | Active Learning |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06996v1 |
http://arxiv.org/pdf/1807.06996v1.pdf | |
PWC | https://paperswithcode.com/paper/evolving-large-scale-data-stream-analytics |
Repo | |
Framework | |
Improving Temporal Interpolation of Head and Body Pose using Gaussian Process Regression in a Matrix Completion Setting
Title | Improving Temporal Interpolation of Head and Body Pose using Gaussian Process Regression in a Matrix Completion Setting |
Authors | Stephanie Tan, Hayley Hung |
Abstract | This paper presents a model for head and body pose estimation (HBPE) when labelled samples are highly sparse. The current state-of-the-art multimodal approach to HBPE utilizes the matrix completion method in a transductive setting to predict pose labels for unobserved samples. Based on this approach, the proposed method tackles HBPE when manually annotated ground truth labels are temporally sparse. We posit that the current state-of-the-art approach oversimplifies the temporal sparsity assumption by using Laplacian smoothing. Our final solution uses: i) Gaussian process regression in place of Laplacian smoothing, ii) head and body coupling, and iii) nuclear norm minimization in the matrix completion setting. The model is applied to the challenging SALSA dataset for benchmarking against the state-of-the-art method. Our presented formulation outperforms the state-of-the-art significantly in this particular setting, e.g. at 5% ground truth labels as training data, head pose accuracy and body pose accuracy are approximately 62% and 70%, respectively. As well as fitting a more flexible model to missing labels in time, we posit that our approach also loosens the head and body coupling constraint, allowing for a more expressive model of the head and body pose typically seen during conversational interaction in groups. This provides a new baseline to improve upon for future integration of multimodal sensor data for the purpose of HBPE. |
Tasks | Matrix Completion, Pose Estimation |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.01837v1 |
http://arxiv.org/pdf/1808.01837v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-temporal-interpolation-of-head-and |
Repo | |
Framework | |
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Title | SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator |
Authors | Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang |
Abstract | In this paper, we propose a new technique named Stochastic Path-Integrated Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of interest with significantly reduced computational cost. We apply SPIDER to two tasks, namely the stochastic first-order and zeroth-order methods. For the stochastic first-order method, combining SPIDER with normalized gradient descent, we propose two new algorithms, namely SPIDER-SFO and SPIDER-SFO$^{+}$, that solve non-convex stochastic optimization problems using stochastic gradients only. We provide sharp error-bound results on their convergence rates. In particular, we prove that the SPIDER-SFO and SPIDER-SFO$^{+}$ algorithms achieve a record-breaking gradient computation cost of $\mathcal{O}\left( \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3} ) \right)$ for finding an $\epsilon$-approximate first-order stationary point and $\tilde{\mathcal{O}}\left( \min( n^{1/2} \epsilon^{-2}+\epsilon^{-2.5}, \epsilon^{-3} ) \right)$ for finding an $(\epsilon, \mathcal{O}(\epsilon^{0.5}))$-approximate second-order stationary point, respectively. In addition, we prove that SPIDER-SFO nearly matches the algorithmic lower bound for finding approximate first-order stationary points under the gradient Lipschitz assumption in the finite-sum setting. For the stochastic zeroth-order method, we prove a cost of $\mathcal{O}( d \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3}) )$, which outperforms all existing results. |
Tasks | Stochastic Optimization |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01695v2 |
http://arxiv.org/pdf/1807.01695v2.pdf | |
PWC | https://paperswithcode.com/paper/spider-near-optimal-non-convex-optimization-1 |
Repo | |
Framework | |
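The combination of a path-integrated gradient estimator with normalized gradient descent described in this abstract can be sketched on a toy problem. Everything concrete below is an assumption for illustration: the one-dimensional finite sum of quadratics, the batch size of one, the refresh period `q`, and the step size `eta` are not the paper's settings. The estimator is refreshed with a full gradient every `q` steps and otherwise updated by adding the sampled gradient difference between consecutive iterates.

```python
import random

def spider_minimize(a, c, x0=0.0, eta=0.01, q=10, steps=1000, seed=0):
    """Minimise (1/n) * sum_i 0.5 * c[i] * (x - a[i])**2 with a SPIDER-style estimator."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(i, x):
        # Gradient of the i-th component f_i(x) = 0.5 * c[i] * (x - a[i])**2
        return c[i] * (x - a[i])

    x_prev, x, v = x0, x0, 0.0
    for k in range(steps):
        if k % q == 0:
            # Periodic refresh: exact full gradient at the current iterate
            v = sum(grad_i(i, x) for i in range(n)) / n
        else:
            # Path-integrated update: previous estimate plus sampled gradient difference
            i = rng.randrange(n)
            v = grad_i(i, x) - grad_i(i, x_prev) + v
        x_prev = x
        if v != 0.0:
            x = x - eta * v / abs(v)  # normalised step of fixed length eta
    return x

# Hypothetical toy data for the example.
data_rng = random.Random(1)
a = [data_rng.uniform(-2, 2) for _ in range(50)]
c = [data_rng.uniform(0.5, 1.5) for _ in range(50)]
x_star = sum(ci * ai for ci, ai in zip(c, a)) / sum(c)  # closed-form minimiser
x_hat = spider_minimize(a, c)
```

Because each step has fixed length `eta`, the iterate settles into a small neighbourhood of the minimiser rather than converging exactly; the neighbourhood shrinks as `eta` and `q` decrease.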
On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization
Title | On Landscape of Lagrangian Functions and Stochastic Search for Constrained Nonconvex Optimization |
Authors | Zhehui Chen, Xingguo Li, Lin F. Yang, Jarvis Haupt, Tuo Zhao |
Abstract | We study constrained nonconvex optimization problems in machine learning, signal processing, and stochastic control. It is well known that these problems can be rewritten as a minimax problem in a Lagrangian form. However, due to the lack of convexity, their landscape is not well understood, and how to find the stable equilibria of the Lagrangian function is still unknown. To bridge the gap, we study the landscape of the Lagrangian function. Further, we define a special class of Lagrangian functions. They enjoy two properties: (1) equilibria are either stable or unstable (formal definition in Section 2); (2) stable equilibria correspond to the global optima of the original problem. We show that a generalized eigenvalue (GEV) problem, including canonical correlation analysis and other problems, belongs to the class. Specifically, we characterize its stable and unstable equilibria by leveraging an invariant group and symmetric property (more details in Section 3). Motivated by these neat geometric structures, we propose a simple, efficient, and stochastic primal-dual algorithm for solving the online GEV problem. Theoretically, we provide sufficient conditions, based on which we establish an asymptotic convergence rate and obtain the first sample complexity result for the online GEV problem by diffusion approximations, which are widely used in applied probability and stochastic control. Numerical results are provided to support our theory. |
Tasks | |
Published | 2018-06-13 |
URL | https://arxiv.org/abs/1806.05151v3 |
https://arxiv.org/pdf/1806.05151v3.pdf | |
PWC | https://paperswithcode.com/paper/on-landscape-of-lagrangian-functions-and |
Repo | |
Framework | |
ABC: Efficient Selection of Machine Learning Configuration on Large Dataset
Title | ABC: Efficient Selection of Machine Learning Configuration on Large Dataset |
Authors | Silu Huang, Chi Wang, Bolin Ding, Surajit Chaudhuri |
Abstract | A machine learning configuration refers to a combination of preprocessor, learner, and hyperparameters. Given a set of configurations and a large dataset randomly split into training and testing set, we study how to efficiently select the best configuration with approximately the highest testing accuracy when trained from the training set. To guarantee small accuracy loss, we develop a solution using confidence interval (CI)-based progressive sampling and pruning strategy. Compared to using full data to find the exact best configuration, our solution achieves more than two orders of magnitude speedup, while the returned top configuration has identical or close test accuracy. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03250v2 |
http://arxiv.org/pdf/1811.03250v2.pdf | |
PWC | https://paperswithcode.com/paper/abc-efficient-selection-of-machine-learning |
Repo | |
Framework | |
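The confidence-interval-based progressive sampling and pruning mentioned in this abstract can be illustrated with a generic successive-elimination sketch. It is a simplification under assumptions, not the ABC algorithm: it only subsamples test examples for already-trained configurations, uses a Hoeffding bound, and doubles the sample size each round. The function names, `n0`, `delta`, and the simulated data are all hypothetical.

```python
import math
import random

def hoeffding_radius(n, delta):
    """Half-width of a two-sided Hoeffding interval for a mean of n values in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def select_config(correct, n0=100, delta=0.05):
    """correct[c][i] is 1 if configuration c classifies test example i correctly.

    Returns (index of the selected configuration, number of examples used).
    """
    n_total = len(correct[0])
    alive = set(range(len(correct)))
    n = min(n0, n_total)
    while True:
        means = {c: sum(correct[c][:n]) / n for c in alive}
        r = hoeffding_radius(n, delta)
        best_lower = max(means.values()) - r
        # Prune configurations whose upper bound falls below the best lower bound
        alive = {c for c in alive if means[c] + r >= best_lower}
        if len(alive) == 1 or n == n_total:
            return max(alive, key=lambda c: means[c]), n
        n = min(2 * n, n_total)  # progressive sampling: double the sample

# Simulated per-example correctness for three configurations (hypothetical accuracies).
sim_rng = random.Random(0)
true_acc = [0.9, 0.6, 0.5]
correct = [[1 if sim_rng.random() < p else 0 for _ in range(20000)] for p in true_acc]
best, used = select_config(correct)
```

When the configurations are well separated, the loop typically stops after examining only a small fraction of the examples, which is the source of the speedup that this style of method aims for.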
Geospatial distributions reflect rates of evolution of features of language
Title | Geospatial distributions reflect rates of evolution of features of language |
Authors | Henri Kauhanen, Deepthi Gopal, Tobias Galla, Ricardo Bermúdez-Otero |
Abstract | Different structural features of human language change at different rates and thus exhibit different temporal stabilities. Existing methods of linguistic stability estimation depend upon the prior genealogical classification of the world’s languages into language families; these methods result in unreliable stability estimates for features which are sensitive to horizontal transfer between families and whenever data are aggregated from families of divergent time depths. To overcome these problems, we describe a method of stability estimation without family classifications, based on mathematical modelling and the analysis of contemporary geospatial distributions of linguistic features. Regressing the estimates produced by our model against those of a genealogical method, we report broad agreement but also important differences. In particular, we show that our approach is not liable to some of the false positives and false negatives incurred by the genealogical method. Our results suggest that the historical evolution of a linguistic feature leaves a footprint in its global geospatial distribution, and that rates of evolution can be recovered from these distributions by treating language dynamics as a spatially extended stochastic process. |
Tasks | |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09637v1 |
http://arxiv.org/pdf/1801.09637v1.pdf | |
PWC | https://paperswithcode.com/paper/geospatial-distributions-reflect-rates-of |
Repo | |
Framework | |