January 27, 2020

3291 words 16 mins read

Paper Group ANR 1210

Automatic 4D Facial Expression Recognition via Collaborative Cross-domain Dynamic Image Network. Mining Data from the Congressional Record. Wield: Systematic Reinforcement Learning With Progressive Randomization. A unified sparse optimization framework to learn parsimonious physics-informed models from data. An Unsupervised Character-Aware Neural A …

Automatic 4D Facial Expression Recognition via Collaborative Cross-domain Dynamic Image Network

Title Automatic 4D Facial Expression Recognition via Collaborative Cross-domain Dynamic Image Network
Authors Muzammil Behzad, Nhat Vo, Xiaobai Li, Guoying Zhao
Abstract This paper proposes a novel 4D Facial Expression Recognition (FER) method using a Collaborative Cross-domain Dynamic Image Network (CCDN). Given 4D face scan data, we first compute its geometrical images and then combine their correlated information in the proposed cross-domain image representations. The acquired set is then used to generate cross-domain dynamic images (CDIs) via rank pooling, which encapsulates facial deformations over time within a single image. For the training phase, these CDIs are fed into an end-to-end deep learning model, and the resultant predictions collaborate over multiple views for a performance gain in expression classification. Furthermore, we propose a 4D augmentation scheme that not only expands the training data scale but also introduces significant facial muscle movement patterns to improve FER performance. Results from extensive experiments on the commonly used BU-4DFE dataset under widely adopted settings show that our proposed method outperforms state-of-the-art 4D FER methods, achieving an accuracy of 96.5% and indicating its effectiveness.
Tasks Facial Expression Recognition
Published 2019-05-07
URL https://arxiv.org/abs/1905.02319v2
PDF https://arxiv.org/pdf/1905.02319v2.pdf
PWC https://paperswithcode.com/paper/automatic-4d-facial-expression-recognition
Repo
Framework
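
The rank-pooling step that collapses an image sequence into a single dynamic image has a well-known closed-form approximation (Bilen et al.'s approximate rank pooling). The sketch below applies it to a synthetic sequence; the paper's CDIs apply the same idea to cross-domain geometrical image sequences, and the pipeline details here are assumptions, not the authors' code.

```python
import numpy as np

def approximate_rank_pooling(frames):
    """Collapse a (T, H, W, C) frame sequence into one dynamic image.

    Uses the closed-form coefficients of approximate rank pooling
    (Bilen et al.); the paper's CDIs apply rank pooling to cross-domain
    geometrical image sequences rather than raw video frames.
    """
    T = frames.shape[0]
    # harmonic[t] = H_t = sum_{i=1..t} 1/i, with H_0 = 0.
    harmonic = np.concatenate([[0.0], np.cumsum(1.0 / np.arange(1, T + 1))])
    t = np.arange(1, T + 1)
    # alpha_t = 2(T - t + 1) - (T + 1) * (H_T - H_{t-1}),  t = 1..T
    alpha = 2 * (T - t + 1) - (T + 1) * (harmonic[T] - harmonic[t - 1])
    return np.tensordot(alpha, frames.astype(np.float64), axes=(0, 0))

# Example: pool 16 synthetic 64x64 single-channel "geometrical images".
seq = np.random.rand(16, 64, 64, 1)
dynamic_image = approximate_rank_pooling(seq)  # shape (64, 64, 1)
```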

Mining Data from the Congressional Record

Title Mining Data from the Congressional Record
Authors Zhengyu Ma, Tianjiao Qi, James Route, Amir Ziai
Abstract We propose a data storage and analysis method for using the US Congressional record as a policy analysis tool. We use Amazon Web Services and the Solr search engine to store and process Congressional record data from 1789 to the present, and then query Solr to find how frequently language related to tax increases and decreases appears. This frequency data is compared to six economic indicators. Our preliminary results indicate potential relationships between incidence of tax discussion and multiple indicators. We present our data storage and analysis procedures, as well as results from comparisons to all six indicators.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.00529v1
PDF https://arxiv.org/pdf/1906.00529v1.pdf
PWC https://paperswithcode.com/paper/190600529
Repo
Framework
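
Solr exposes hit counts through its standard select handler, so the frequency queries the abstract describes can be sketched with plain HTTP. The core name (`congress`) and field names (`text`, `year`) below are hypothetical stand-ins for whatever schema the authors used.

```python
import requests

SOLR_URL = "http://localhost:8983/solr/congress/select"  # hypothetical core name

def hit_count(phrase, year):
    """Count Congressional Record documents in a given year matching a phrase.

    Assumes documents were indexed with hypothetical `text` and `year`
    fields; only the hit count is requested (rows=0).
    """
    params = {
        "q": f'text:"{phrase}"',
        "fq": f"year:{year}",
        "rows": 0,
        "wt": "json",
    }
    resp = requests.get(SOLR_URL, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]["numFound"]

# Frequency series for tax-increase language, to be compared against
# economic indicators downstream.
series = {y: hit_count("tax increase", y) for y in range(2000, 2011)}
```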

Wield: Systematic Reinforcement Learning With Progressive Randomization

Title Wield: Systematic Reinforcement Learning With Progressive Randomization
Authors Michael Schaarschmidt, Kai Fricke, Eiko Yoneki
Abstract Reinforcement learning frameworks have introduced abstractions to implement and execute algorithms at scale. They assume standardized simulator interfaces but are not concerned with identifying suitable task representations. We present Wield, a first-of-its-kind system to facilitate task design for practical reinforcement learning. Through software primitives, Wield enables practitioners to decouple system-interface and deployment-specific configuration from state and action design. To guide experimentation, Wield further introduces a novel task design protocol and classification scheme centred around staged randomization to incrementally evaluate model capabilities.
Tasks
Published 2019-09-15
URL https://arxiv.org/abs/1909.06844v1
PDF https://arxiv.org/pdf/1909.06844v1.pdf
PWC https://paperswithcode.com/paper/wield-systematic-reinforcement-learning-with
Repo
Framework

A unified sparse optimization framework to learn parsimonious physics-informed models from data

Title A unified sparse optimization framework to learn parsimonious physics-informed models from data
Authors Kathleen Champion, Peng Zheng, Aleksandr Y. Aravkin, Steven L. Brunton, J. Nathan Kutz
Abstract Machine learning (ML) is redefining what is possible in data-intensive fields of science and engineering. However, applying ML to problems in the physical sciences comes with a unique set of challenges: scientists want physically interpretable models that can (i) generalize to predict previously unobserved behaviors, (ii) provide effective forecasting predictions (extrapolation), and (iii) be certifiable. Autonomous systems will necessarily interact with changing and uncertain environments, motivating the need for models that can accurately extrapolate based on physical principles (e.g. Newton’s second law of classical mechanics, F=ma). Standard ML approaches have shown impressive performance for predicting dynamics in an interpolatory regime, but the resulting models often lack interpretability and fail to generalize. In this paper, we introduce a unified sparse optimization framework that learns governing dynamical systems models from data, selecting relevant terms in the dynamics from a library of possible functions. The resulting models are parsimonious, have physical interpretations, and can generalize to new parameter regimes. Our framework allows the use of non-convex sparsity-promoting regularization functions and can be adapted to address key challenges in scientific problems and data sets, including outliers, parametric dependencies, and physical constraints. We show that the approach discovers parsimonious dynamical models on several example systems, including a spiking neuron model. This flexible approach can be tailored to the unique challenges associated with a wide range of applications and data sets, providing a powerful ML-based framework for learning governing models for physical systems from data.
Tasks
Published 2019-06-25
URL https://arxiv.org/abs/1906.10612v1
PDF https://arxiv.org/pdf/1906.10612v1.pdf
PWC https://paperswithcode.com/paper/a-unified-sparse-optimization-framework-to
Repo
Framework
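
The core idea, selecting active terms from a library of candidate functions via sparse regression, can be illustrated with the classic sequentially thresholded least squares (STLSQ) baseline from the SINDy literature; the paper's unified framework generalizes this to non-convex penalties, constraints, and outlier-robust losses. A minimal sketch on a toy system:

```python
import numpy as np

def stlsq(Theta, dXdt, threshold=0.1, iters=10):
    """Sequentially thresholded least squares: a simple sparse-regression
    baseline for selecting dynamics terms from a candidate library.
    (The paper's framework generalizes this to non-convex penalties,
    physical constraints, and outlier-robust losses.)"""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):          # refit each state separately
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dXdt[:, k],
                                             rcond=None)[0]
    return Xi

# Toy example: recover xdot = -2x from noisy data with library [1, x, x^2].
x = np.linspace(-1, 1, 200).reshape(-1, 1)
dx = -2 * x + 0.01 * np.random.randn(*x.shape)
Theta = np.hstack([np.ones_like(x), x, x ** 2])
print(stlsq(Theta, dx).ravel())  # approximately [0, -2, 0]
```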

An Unsupervised Character-Aware Neural Approach to Word and Context Representation Learning

Title An Unsupervised Character-Aware Neural Approach to Word and Context Representation Learning
Authors Giuseppe Marra, Andrea Zugarini, Stefano Melacci, Marco Maggini
Abstract In the last few years, neural networks have been intensively used to develop meaningful distributed representations of words and the contexts around them. When these representations, also known as “embeddings”, are learned from large unsupervised corpora, they can be transferred to different tasks with positive effects on performance, especially when only limited supervision is available. In this work, we further extend this concept, presenting an unsupervised neural architecture that jointly learns word and context embeddings, processing words as sequences of characters. This allows our model to spot regularities due to word morphology and to avoid the need for a fixed-size input vocabulary of words. We show that we can learn compact encoders that, despite the relatively small number of parameters, achieve strong performance in downstream tasks, comparing them with related state-of-the-art approaches and with fully supervised methods.
Tasks Representation Learning
Published 2019-07-19
URL https://arxiv.org/abs/1908.01819v1
PDF https://arxiv.org/pdf/1908.01819v1.pdf
PWC https://paperswithcode.com/paper/an-unsupervised-character-aware-neural
Repo
Framework
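
A minimal sketch of the character-level idea: build a word's embedding from its character sequence so that morphology is shared across words and no fixed word vocabulary is needed. The architecture and sizes below are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Minimal character-aware word encoder in the spirit of the paper:
    a word embedding is built from its character sequence, so morphology
    is exploited and no fixed word vocabulary is required."""

    def __init__(self, n_chars=128, char_dim=16, word_dim=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rnn = nn.LSTM(char_dim, word_dim // 2, batch_first=True,
                           bidirectional=True)

    def forward(self, char_ids):              # (batch, max_word_len)
        h, _ = self.rnn(self.char_emb(char_ids))
        # Concatenate the last forward and first backward hidden states.
        return torch.cat([h[:, -1, :h.size(2) // 2],
                          h[:, 0, h.size(2) // 2:]], dim=-1)

enc = CharWordEncoder()
ids = torch.tensor([[ord(c) for c in "playing"]])   # toy ASCII encoding
print(enc(ids).shape)                               # torch.Size([1, 64])
```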

Contextual Compositionality Detection with External Knowledge Bases and Word Embeddings

Title Contextual Compositionality Detection with External Knowledge Bases and Word Embeddings
Authors Dongsheng Wang, Qiuchi Li, Lucas Chaves Lima, Jakob Grue Simonsen, Christina Lioma
Abstract When the meaning of a phrase cannot be inferred from the individual meanings of its words (e.g., hot dog), that phrase is said to be non-compositional. Automatic compositionality detection in multi-word phrases is critical in any application of semantic processing, such as search engines; failing to detect non-compositional phrases can hurt system effectiveness notably. Existing research treats phrases as either compositional or non-compositional in a deterministic manner. In this paper, we operationalize the viewpoint that compositionality is contextual rather than deterministic, i.e., that whether a phrase is compositional or non-compositional depends on its context. For example, the phrase ‘green card’ is compositional when referring to a green colored card, whereas it is non-compositional when meaning permanent residence authorization. We address the challenge of detecting this type of contextual compositionality as follows: given a multi-word phrase, we enrich the word embedding representing its semantics with evidence about its global context (terms it often collocates with) as well as its local context (narratives where that phrase is used, which we call usage scenarios). We further extend this representation with information extracted from external knowledge bases. The resulting representation incorporates both localized context and more general usage of the phrase, and allows us to detect its compositionality in a non-deterministic and contextual way. Empirical evaluation of our model on a dataset of phrase compositionality, manually collected by crowdsourcing contextual compositionality assessments, shows that our model outperforms state-of-the-art baselines notably on detecting phrase compositionality.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08389v1
PDF http://arxiv.org/pdf/1903.08389v1.pdf
PWC https://paperswithcode.com/paper/contextual-compositionality-detection-with
Repo
Framework
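
As a toy reduction of the idea, one can score compositionality by comparing a context-enriched phrase vector against the composition of its word vectors; the paper's full model additionally folds in knowledge-base evidence and usage-scenario narratives. Everything below (vectors, mixing weights) is hypothetical.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def compositionality_score(phrase_vec, word_vecs, context_vecs):
    """Toy contextual compositionality score: compare the phrase's
    context-enriched representation with the composition (mean) of its
    word embeddings. Low similarity suggests non-compositional usage.
    This is an illustrative reduction, not the paper's full model."""
    enriched = 0.5 * phrase_vec + 0.5 * np.mean(context_vecs, axis=0)
    composed = np.mean(word_vecs, axis=0)
    return cosine(enriched, composed)

rng = np.random.default_rng(0)
d = 50
green, card = rng.normal(size=d), rng.normal(size=d)
phrase = rng.normal(size=d)                  # stand-in for "green card"
ctx = rng.normal(size=(4, d))                # vectors of nearby terms
print(compositionality_score(phrase, [green, card], ctx))
```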

Ethically Aligned Opportunistic Scheduling for Productive Laziness

Title Ethically Aligned Opportunistic Scheduling for Productive Laziness
Authors Han Yu, Chunyan Miao, Yongqing Zheng, Lizhen Cui, Simon Fauvel, Cyril Leung
Abstract In artificial intelligence (AI)-mediated workforce management systems (e.g., crowdsourcing), long-term success depends on workers accomplishing tasks productively and resting well. This dual objective can be summarized by the concept of productive laziness. Existing scheduling approaches mostly focus on efficiency but overlook worker wellbeing through proper rest. In order to enable workforce management systems to follow the IEEE Ethically Aligned Design guidelines to prioritize worker wellbeing, we propose a distributed Computational Productive Laziness (CPL) approach in this paper. It intelligently recommends personalized work-rest schedules based on local data concerning a worker’s capabilities and situational factors to incorporate opportunistic resting and achieve superlinear collective productivity without the need for explicit coordination messages. Extensive experiments based on a real-world dataset of over 5,000 workers demonstrate that CPL enables workers to spend 70% of the effort to complete 90% of the tasks on average, providing more ethically aligned scheduling than existing approaches.
Tasks
Published 2019-01-02
URL http://arxiv.org/abs/1901.00298v1
PDF http://arxiv.org/pdf/1901.00298v1.pdf
PWC https://paperswithcode.com/paper/ethically-aligned-opportunistic-scheduling
Repo
Framework

Distributed Power Control for Large Energy Harvesting Networks: A Multi-Agent Deep Reinforcement Learning Approach

Title Distributed Power Control for Large Energy Harvesting Networks: A Multi-Agent Deep Reinforcement Learning Approach
Authors Mohit K. Sharma, Alessio Zappone, Mohamad Assaad, Merouane Debbah, Spyridon Vassilaras
Abstract In this paper, we develop a multi-agent reinforcement learning (MARL) framework to obtain online power control policies for a large energy harvesting (EH) multiple access channel, when only causal information about the EH process and wireless channel is available. In the proposed framework, we model the online power control problem as a discrete-time mean-field game (MFG), and analytically show that the MFG has a unique stationary solution. Next, we leverage the fictitious play property of mean-field games and deep reinforcement learning to learn the stationary solution of the game in a completely distributed fashion. We analytically show that the proposed procedure converges to the unique stationary solution of the MFG. This, in turn, ensures that the optimal policies can be learned in a completely distributed fashion. In order to benchmark the performance of the distributed policies, we also develop deep neural network (DNN) based centralized and distributed online power control schemes. Our simulation results show the efficacy of the proposed power control policies. In particular, the DNN-based centralized power control policies provide very good performance for large EH networks, for which the design of optimal policies is intractable using conventional methods such as Markov decision processes. Further, the performance of both distributed policies is close to the throughput achieved by the centralized policies.
Tasks Multi-agent Reinforcement Learning
Published 2019-04-01
URL https://arxiv.org/abs/1904.00601v2
PDF https://arxiv.org/pdf/1904.00601v2.pdf
PWC https://paperswithcode.com/paper/distributed-power-control-for-large-energy
Repo
Framework
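
The fictitious-play skeleton the paper builds on is simple to state: maintain the mean-field belief as a running average of past population behaviors and best-respond to it each round. The toy contraction below stands in for the best response that the paper computes with deep RL; it is a sketch of the iteration, not of the EH power-control model.

```python
import numpy as np

def fictitious_play(best_response, mu0, rounds=500):
    """Skeleton of the fictitious-play iteration: the mean-field belief
    is the running average of past best responses. `best_response` maps
    a mean field to new population behavior; in the paper it is computed
    by deep RL on the induced single-agent problem."""
    mu = mu0
    for k in range(1, rounds + 1):
        br = best_response(mu)
        mu = mu + (br - mu) / (k + 1)          # running average of plays
    return mu

# Toy example: a contraction toward a fixed point stands in for the game.
target = np.array([0.2, 0.5, 0.3])
br = lambda mu: 0.5 * mu + 0.5 * target
print(fictitious_play(br, np.array([1.0, 0.0, 0.0])))  # approaches target
```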

Cooperative Multi-Agent Reinforcement Learning Framework for Scalping Trading

Title Cooperative Multi-Agent Reinforcement Learning Framework for Scalping Trading
Authors Uk Jo, Taehyun Jo, Wanjun Kim, Iljoo Yoon, Dongseok Lee, Seungho Lee
Abstract We explore deep reinforcement learning (RL) algorithms for scalping trading and find that no appropriate trading gym or agent examples exist. We therefore propose a gym and agent for finance, in the style of the OpenAI Gym. We also introduce a new RL framework based on a hybrid algorithm that combines supervised learning with RL and uses meaningful observations, such as order book and settlement data, informed by watching how scalpers trade; this information is crucial for deciding trader behavior. To feed these data into our model, we use a spatio-temporal convolution layer (Conv3D) for order book data and a temporal CNN (Conv1D) for settlement data, both preprocessed by an episode filter we developed. The agent consists of four sub-agents, each with a clearly defined goal, and we adopt both value-based and policy-based algorithms in our framework. With these features, the agent mimics scalpers as closely as possible. RL algorithms have already begun to transcend human capabilities in many domains; this approach could be a starting point for surpassing humans in the financial stock market as well, and a useful reference for anyone designing RL algorithms for real-world domains. Finally, we evaluate our framework and report experimental progress.
Tasks Multi-agent Reinforcement Learning
Published 2019-03-31
URL http://arxiv.org/abs/1904.00441v1
PDF http://arxiv.org/pdf/1904.00441v1.pdf
PWC https://paperswithcode.com/paper/cooperative-multi-agent-reinforcement-1
Repo
Framework
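
The two-branch feature extractor the abstract describes (Conv3D over order-book tensors, Conv1D over settlement series) can be sketched as follows; input shapes, channel counts, and the action head are illustrative assumptions rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class ScalpingNet(nn.Module):
    """Two-branch feature extractor along the lines the abstract describes:
    Conv3D over an order-book tensor (time x price levels x features) and
    Conv1D over a settlement-data series."""

    def __init__(self, n_actions=3):
        super().__init__()
        self.book = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))                     # -> (B, 8, 1, 1, 1)
        self.settle = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))                     # -> (B, 8, 1)
        self.head = nn.Linear(16, n_actions)

    def forward(self, order_book, settlement):
        b = self.book(order_book).flatten(1)
        s = self.settle(settlement).flatten(1)
        return self.head(torch.cat([b, s], dim=1))       # action values

net = ScalpingNet()
q = net(torch.randn(2, 1, 16, 10, 4),    # batch, channel, T, levels, feats
        torch.randn(2, 1, 32))           # batch, channel, T
print(q.shape)                           # torch.Size([2, 3])
```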

Why Deep Transformers are Difficult to Converge? From Computation Order to Lipschitz Restricted Parameter Initialization

Title Why Deep Transformers are Difficult to Converge? From Computation Order to Lipschitz Restricted Parameter Initialization
Authors Hongfei Xu, Qiuhui Liu, Josef van Genabith, Jingyi Zhang
Abstract The Transformer translation model employs residual connections and layer normalization to ease the optimization difficulties caused by its multi-layer encoder/decoder structure. However, several previous works show that even with residual connections and layer normalization, deep Transformers still have difficulty training; in particular, a Transformer model with more than 12 encoder/decoder layers fails to converge. In this paper, we first empirically demonstrate that a simple modification in the official implementation, which changes the computation order of residual connection and layer normalization, can effectively ease the optimization of deep Transformers. In addition, we closely compare the subtle differences in computation order, and propose a parameter initialization method that simply puts a Lipschitz restriction on the initialization of Transformers but can effectively ensure their convergence. We empirically show that with proper parameter initialization, deep Transformers with the original computation order can converge, in contrast to all previous works, and obtain significant improvements with up to 24 layers. Our proposed approach additionally makes it possible to benefit from deep decoders, whereas previous works focus on deep encoders.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03179v1
PDF https://arxiv.org/pdf/1911.03179v1.pdf
PWC https://paperswithcode.com/paper/why-deep-transformers-are-difficult-to
Repo
Framework
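
The computation-order difference at the heart of the paper is between post-layer-norm (the original Transformer order) and pre-layer-norm residual blocks. A minimal sketch, with a single linear layer standing in for the attention/FFN sublayer:

```python
import torch
import torch.nn as nn

def post_ln(x, sublayer, norm):
    # Original Transformer order: residual add, then layer normalization.
    return norm(x + sublayer(x))

def pre_ln(x, sublayer, norm):
    # Reordered variant: normalize first, add the residual afterwards,
    # which leaves an identity path and eases optimization of deep stacks.
    return x + sublayer(norm(x))

# Minimal sketch of the computation-order difference the paper studies;
# a real block would use attention/FFN sublayers, not a single Linear.
d = 8
x = torch.randn(4, d)
sub, norm = nn.Linear(d, d), nn.LayerNorm(d)
print(post_ln(x, sub, norm).shape, pre_ln(x, sub, norm).shape)
```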

Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus

Title Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus
Authors Yan Zhang, Michael M. Zavlanos
Abstract In this paper, we propose a distributed off-policy actor critic method to solve multi-agent reinforcement learning problems. Specifically, we assume that all agents keep local estimates of the global optimal policy parameter and update their local value function estimates independently. Then, we introduce an additional consensus step to let all the agents asymptotically achieve agreement on the global optimal policy function. The convergence analysis of the proposed algorithm is provided and the effectiveness of the proposed algorithm is validated using a distributed resource allocation example. Compared to relevant distributed actor critic methods, here the agents do not share information about their local tasks, but instead they coordinate to estimate the global policy function.
Tasks Multi-agent Reinforcement Learning
Published 2019-03-21
URL http://arxiv.org/abs/1903.09255v1
PDF http://arxiv.org/pdf/1903.09255v1.pdf
PWC https://paperswithcode.com/paper/distributed-off-policy-actor-critic
Repo
Framework
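
The consensus step is plain weighted averaging of neighbors' policy parameters under a doubly stochastic mixing matrix. A minimal sketch, with a fully connected graph standing in for the communication topology:

```python
import numpy as np

def consensus_step(local_params, W):
    """One consensus averaging round over policy parameters.

    local_params: (n_agents, dim) array of each agent's policy estimate.
    W: (n_agents, n_agents) doubly stochastic mixing matrix respecting
    the communication graph. Repeated rounds drive the rows toward
    agreement; in the paper this step is interleaved with local
    actor-critic updates.
    """
    return W @ local_params

n, dim = 4, 3
rng = np.random.default_rng(1)
theta = rng.normal(size=(n, dim))
W = np.full((n, n), 1.0 / n)        # fully connected graph, uniform weights
for _ in range(5):
    theta = consensus_step(theta, W)
print(theta)                         # all rows (near-)identical
```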

Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces

Title Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces
Authors Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan
Abstract Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete or continuous action spaces. However, to the best of our knowledge, no previous work has succeeded in applying DRL to multi-agent problems with discrete-continuous hybrid (or parameterized) action spaces, which are very common in practice. Our work fills this gap by proposing two novel algorithms: Deep Multi-Agent Parameterized Q-Networks (Deep MAPQN) and Deep Multi-Agent Hierarchical Hybrid Q-Networks (Deep MAHHQN). We follow the centralized-training, decentralized-execution paradigm: different levels of communication between agents are used to facilitate training, while each agent executes its policy independently based on local observations. Our empirical results on several challenging tasks (simulated RoboCup Soccer and the game Ghost Story) show that both Deep MAPQN and Deep MAHHQN are effective and significantly outperform existing independent deep parameterized Q-learning methods.
Tasks Multi-agent Reinforcement Learning, Q-Learning
Published 2019-03-12
URL http://arxiv.org/abs/1903.04959v1
PDF http://arxiv.org/pdf/1903.04959v1.pdf
PWC https://paperswithcode.com/paper/deep-multi-agent-reinforcement-learning-with-1
Repo
Framework
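
Hybrid action selection can be sketched in the P-DQN style that Deep MAPQN extends to the multi-agent setting: an actor proposes continuous parameters for every discrete action, and a Q-network scores each discrete action given those parameters. Network sizes and dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HybridActionAgent(nn.Module):
    """Sketch of parameterized (discrete-continuous) action selection:
    an actor proposes continuous parameters for every discrete action and
    a Q-network scores each (action, parameters) pair."""

    def __init__(self, obs_dim=8, n_discrete=3, param_dim=2):
        super().__init__()
        self.n, self.p = n_discrete, param_dim
        self.actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                   nn.Linear(64, n_discrete * param_dim),
                                   nn.Tanh())
        self.q = nn.Sequential(nn.Linear(obs_dim + n_discrete * param_dim, 64),
                               nn.ReLU(), nn.Linear(64, n_discrete))

    def act(self, obs):
        params = self.actor(obs)                       # all actions' params
        qvals = self.q(torch.cat([obs, params], dim=-1))
        a = qvals.argmax(dim=-1)                       # discrete choice
        chosen = params.view(-1, self.n, self.p)[torch.arange(obs.size(0)), a]
        return a, chosen

agent = HybridActionAgent()
a, p = agent.act(torch.randn(5, 8))
print(a.shape, p.shape)    # torch.Size([5]) torch.Size([5, 2])
```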

Bandwidth Reduction using Importance Weighted Pruning on Ring AllReduce

Title Bandwidth Reduction using Importance Weighted Pruning on Ring AllReduce
Authors Zehua Cheng, Zhenghua Xu
Abstract Training large deep learning models inevitably requires a large-scale cluster equipped with accelerators. Deep gradient compression can greatly improve bandwidth utilization and speed up training, but it is hard to implement on a ring structure. In this paper, we find that redundant gradients and gradient staleness have a negative effect on training. We observe that at different epochs and steps, neural networks focus on updating different layers and parameters. To save communication bandwidth and preserve accuracy on a ring structure, which removes the bandwidth bottleneck as the number of nodes increases, we propose a new algorithm that measures the importance of gradients on a large-scale ring all-reduce cluster based on the magnitude of the ratio of a parameter's gradient to its value. Our importance-weighted pruning approach achieves gradient compression ratios of 64x on AlexNet and 58.8x on ResNet50 on ImageNet. Meanwhile, to maintain the sparseness of gradient propagation, each node randomly broadcasts the indices of its important gradients, and the remaining nodes use these indices to perform the all-reduce update. This speeds up model convergence while preserving training accuracy.
Tasks
Published 2019-01-06
URL http://arxiv.org/abs/1901.01544v1
PDF http://arxiv.org/pdf/1901.01544v1.pdf
PWC https://paperswithcode.com/paper/bandwidth-reduction-using-importance-weighted
Repo
Framework
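
The importance measure itself, the magnitude of the gradient-to-parameter ratio, reduces to a top-k selection per node before the ring all-reduce. A minimal sketch, with the keep ratio as an assumed hyperparameter:

```python
import torch

def prune_by_importance(grads, params, keep_ratio=0.01):
    """Keep only the gradients whose |grad / param| ratio is largest,
    a sketch of the importance measure described in the abstract.
    Returns the mask-applied gradient and the kept indices that a node
    would broadcast before the ring all-reduce."""
    ratio = (grads / (params.abs() + 1e-12)).abs().flatten()
    k = max(1, int(keep_ratio * ratio.numel()))
    idx = torch.topk(ratio, k).indices
    sparse = torch.zeros_like(grads).flatten()
    sparse[idx] = grads.flatten()[idx]
    return sparse.view_as(grads), idx

w = torch.randn(1000)
g = torch.randn(1000)
sparse_g, kept = prune_by_importance(g, w, keep_ratio=0.05)
print(kept.numel(), "of", g.numel(), "gradients kept")
```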

Worst Cases Policy Gradients

Title Worst Cases Policy Gradients
Authors Yichuan Charlie Tang, Jian Zhang, Ruslan Salakhutdinov
Abstract Recent advances in deep reinforcement learning have demonstrated the capability of learning complex control policies from many types of environments. When learning policies for safety-critical applications, it is essential to be sensitive to risks and avoid catastrophic events. Towards this goal, we propose an actor-critic framework that models the uncertainty of the future and simultaneously learns a policy based on that uncertainty model. Specifically, given a distribution of the future return for any state and action, we optimize policies for varying levels of conditional Value-at-Risk. The learned policy can map the same state to different actions depending on the propensity for risk. We demonstrate the effectiveness of our approach in the domain of driving simulations, where we learn maneuvers in two scenarios. Our learned controller can dynamically select actions along a continuous axis, where safe and conservative behaviors are found at one end while riskier behaviors are found at the other. Finally, when testing with very different simulation parameters, our risk-averse policies generalize significantly better compared to other reinforcement learning approaches.
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.03618v1
PDF https://arxiv.org/pdf/1911.03618v1.pdf
PWC https://paperswithcode.com/paper/worst-cases-policy-gradients
Repo
Framework
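
Conditional Value-at-Risk (CVaR) at level alpha is the expected return over the worst alpha-fraction of outcomes, so alpha interpolates between worst-case and risk-neutral objectives. A minimal sketch on sampled returns (the paper instead learns the return distribution with a critic):

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Conditional Value-at-Risk of a return sample: the mean of the worst
    alpha-fraction of outcomes. The paper's actor-critic learns a return
    distribution and optimizes policies for varying alpha, trading off
    risk aversion against expected performance."""
    q = np.quantile(returns, alpha)          # Value-at-Risk threshold
    tail = returns[returns <= q]
    return tail.mean()

rng = np.random.default_rng(0)
R = rng.normal(loc=1.0, scale=2.0, size=10_000)   # sampled episode returns
for a in (0.05, 0.25, 1.0):
    print(f"CVaR_{a}: {cvar(R, a):.3f}")          # alpha=1 -> plain mean
```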

Gaussian mixture model decomposition of multivariate signals

Title Gaussian mixture model decomposition of multivariate signals
Authors Gustav Zickert, Can Evren Yarman
Abstract We propose a greedy variational method for decomposing a non-negative multivariate signal as a weighted sum of Gaussians which, borrowing the terminology from statistics, we refer to as a Gaussian mixture model (GMM). Mixture components are added one at a time in two steps. In the first step, a new Gaussian atom and an amplitude are chosen based on a heuristic that aims to minimize the 2-norm of the residual. In the second step the 2-norm of the residual is further decreased by simultaneously adjusting all current Gaussians. Notably, our method has the following features: (1) It accepts multivariate signals, i.e., sampled multivariate functions, histograms, time series, images, etc., as input. (2) The method can handle general (i.e. ellipsoidal) Gaussians. (3) No prior assumption on the number of mixture components is needed. To the best of our knowledge, no previous method for GMM decomposition simultaneously enjoys all these features. Since finding the optimal atom is a non-convex problem, an important point is how to initialize each new atom. We initialize the mean at the maximum of the residual. As a motivation for this initialization procedure, we prove an upper bound, which cannot be improved by a global constant, for the distance from any mode of a GMM to the set of corresponding means. For mixtures of spherical Gaussians with common variance $\sigma^2$, the bound takes the simple form $\sqrt{n}\sigma$. We evaluate our method on one- and two-dimensional signals. We also discuss the relation between clustering and signal decomposition, and compare our method to the baseline expectation maximization algorithm.
Tasks Time Series
Published 2019-09-01
URL https://arxiv.org/abs/1909.00367v1
PDF https://arxiv.org/pdf/1909.00367v1.pdf
PWC https://paperswithcode.com/paper/gaussian-mixture-model-decomposition-of
Repo
Framework
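
One greedy step of the method is easy to sketch on a 1-D signal: initialize the new atom's mean at the residual maximum (the choice the paper's mode-to-mean bound motivates), solve for the 2-norm-optimal amplitude, and subtract. The joint refinement of all current atoms, and general ellipsoidal covariances, are omitted here.

```python
import numpy as np

def greedy_gmm_step(x, residual, sigma=1.0):
    """One greedy atom-selection step on a 1-D signal: place a new Gaussian
    mean at the residual maximum (the initialization the paper motivates),
    pick the amplitude by least squares, and subtract the atom.
    A full implementation would also jointly refine all current atoms."""
    mu = x[np.argmax(residual)]
    atom = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    amp = atom @ residual / (atom @ atom)        # 2-norm-optimal amplitude
    return residual - amp * atom, (mu, amp)

x = np.linspace(-5, 5, 500)
signal = 2 * np.exp(-0.5 * (x - 1.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)
res, atoms = signal.copy(), []
for _ in range(2):
    res, atom = greedy_gmm_step(x, res)
    atoms.append(atom)
print(atoms)   # means near 1.0 and -2.0, amplitudes near 2 and 1
```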