Paper Group ANR 213
Incorporating Effective Global Information via Adaptive Gate Attention for Text Classification
Title | Incorporating Effective Global Information via Adaptive Gate Attention for Text Classification |
Authors | Xianming Li, Zongxi Li, Yingbin Zhao, Haoran Xie, Qing Li |
Abstract | The dominant text classification studies focus on training classifiers using textual instances only or introducing external knowledge (e.g., hand-crafted features and domain expert knowledge). In contrast, some corpus-level statistical features, like word frequency and distribution, are not well exploited. Our work shows that such simple statistical information can enhance classification performance both efficiently and significantly compared with several baseline models. In this paper, we propose a classifier with a gate mechanism named Adaptive Gate Attention model with Global Information (AGA+GI), in which the adaptive gate mechanism incorporates global statistical features into latent semantic features and the attention layer captures dependency relationships within the sentence. To alleviate the overfitting issue, we propose a novel Leaky Dropout mechanism to improve generalization ability and performance stability. Our experiments show that the proposed method can achieve better accuracy than CNN-based and RNN-based approaches without global information on several benchmarks. |
Tasks | Text Classification |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09673v1 |
https://arxiv.org/pdf/2002.09673v1.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-effective-global-information |
Repo | |
Framework | |
Tree-structured Attention with Hierarchical Accumulation
Title | Tree-structured Attention with Hierarchical Accumulation |
Authors | Xuan-Phi Nguyen, Shafiq Joty, Steven C. H. Hoi, Richard Socher |
Abstract | Incorporating hierarchical structures like constituency trees has been shown to be effective for various natural language processing (NLP) tasks. However, it is evident that state-of-the-art (SOTA) sequence-based models like the Transformer struggle to encode such structures inherently. On the other hand, dedicated models like the Tree-LSTM, while explicitly modeling hierarchical structures, do not perform as efficiently as the Transformer. In this paper, we attempt to bridge this gap with “Hierarchical Accumulation” to encode parse tree structures into self-attention at constant time complexity. Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT’14 English-German translation task. It also yields improvements over Transformer and Tree-LSTM on three text classification tasks. We further demonstrate that using hierarchical priors can compensate for data shortage, and that our model prefers phrase-level attentions over token-level attentions. |
Tasks | Text Classification |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08046v1 |
https://arxiv.org/pdf/2002.08046v1.pdf | |
PWC | https://paperswithcode.com/paper/tree-structured-attention-with-hierarchical-1 |
Repo | |
Framework | |
Deep Learning Tubes for Tube MPC
Title | Deep Learning Tubes for Tube MPC |
Authors | David D. Fan, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou |
Abstract | Learning-based control aims to construct models of a system to use for planning or trajectory optimization, e.g. in model-based reinforcement learning. In order to obtain guarantees of safety in this context, uncertainty must be accurately quantified. This uncertainty may come from errors in learning (due to a lack of data, for example), or may be inherent to the system. Propagating uncertainty in learned dynamics models is a difficult problem. Common approaches rely on restrictive assumptions of how distributions are parameterized or propagated in time. In contrast, in this work we propose using deep learning to obtain expressive and flexible models of how these distributions behave, which we then use for nonlinear Model Predictive Control (MPC). First, we introduce a deep quantile regression framework for control which enforces probabilistic quantile bounds and quantifies epistemic uncertainty. Next, using our method we explore three different approaches for learning tubes which contain the possible trajectories of the system, and demonstrate how to use each of them in a Tube MPC scheme. Furthermore, we prove these schemes are recursively feasible and satisfy constraints with a desired margin of probability. Finally, we present experiments in simulation on a nonlinear quadrotor system, demonstrating the practical efficacy of these ideas. |
Tasks | |
Published | 2020-02-05 |
URL | https://arxiv.org/abs/2002.01587v1 |
https://arxiv.org/pdf/2002.01587v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-tubes-for-tube-mpc |
Repo | |
Framework | |
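The abstract above mentions a deep quantile regression framework but does not state its loss. As a minimal sketch, assuming the standard pinball (quantile) loss is the building block, the following shows how a quantile level `tau` penalizes under- and over-prediction asymmetrically. The function names are illustrative and not taken from the paper.

```python
def pinball_loss(y_true, y_pred, tau):
    """Standard pinball (quantile) loss for a single prediction.

    Assumption: this is the generic quantile-regression loss, not
    necessarily the exact objective used in the paper.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y_true - y_pred
    # Under-prediction (diff > 0) costs tau * diff;
    # over-prediction (diff < 0) costs (1 - tau) * |diff|.
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(targets, predictions, tau):
    """Average pinball loss over paired targets and predictions."""
    losses = [pinball_loss(t, p, tau) for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)
```

With `tau = 0.9`, predicting too low is penalized nine times more heavily than predicting too high by the same amount, which pushes the learned output toward an upper bound on the target.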
FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback
Title | FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback |
Authors | Baicen Xiao, Qifan Lu, Bhaskar Ramasubramanian, Andrew Clark, Linda Bushnell, Radha Poovendran |
Abstract | Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although this has been adapted to multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and then provides feedback on states and actions in the trajectory. In order to generalize feedback signals provided by the human operator to previously unseen states and actions at test-time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that is augmented to the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the arcade learning environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing. |
Tasks | Atari Games |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06781v1 |
https://arxiv.org/pdf/2001.06781v1.pdf | |
PWC | https://paperswithcode.com/paper/fresh-interactive-reward-shaping-in-high |
Repo | |
Framework | |
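The abstract above says that the feedback network's output is converted to a shaping reward that is added to the environment reward, and that an ensemble represents confidence. The conversion rule itself is not given, so the sketch below is purely hypothetical: it averages the ensemble outputs and drops the shaping term when the members disagree too much. The names `shaping_reward` and `shaped_reward`, and the parameters `scale` and `max_disagreement`, are assumptions made for illustration.

```python
from statistics import mean, pstdev


def shaping_reward(ensemble_outputs, scale=1.0, max_disagreement=0.2):
    """Hypothetical conversion of ensemble feedback scores to a shaping reward.

    Assumption: high spread across ensemble members signals low
    confidence, in which case no shaping reward is given.
    """
    if pstdev(ensemble_outputs) > max_disagreement:
        return 0.0
    return scale * mean(ensemble_outputs)


def shaped_reward(env_reward, ensemble_outputs, scale=1.0, max_disagreement=0.2):
    """Environment reward plus the (possibly zero) shaping reward."""
    return env_reward + shaping_reward(ensemble_outputs, scale, max_disagreement)
```

For example, `shaped_reward(1.0, [0.5, 0.5, 0.5])` adds the agreed-upon feedback to the environment reward, while `shaped_reward(1.0, [1.0, -1.0, 0.0])` leaves the environment reward unchanged because the ensemble disagrees.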
Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints
Title | Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints |
Authors | Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Otmar Hilliges, Jan Kautz |
Abstract | Estimating 3D hand pose from 2D images is a difficult, inverse problem due to the inherent scale and depth ambiguities. Current state-of-the-art methods train fully supervised deep neural networks with 3D ground-truth data. However, acquiring 3D annotations is expensive, typically requiring calibrated multi-view setups or labor intensive manual annotations. While annotations of 2D keypoints are much easier to obtain, how to efficiently leverage such weakly-supervised data to improve the task of 3D hand pose prediction remains an important open question. The key difficulty stems from the fact that direct application of additional 2D supervision mostly benefits the 2D proxy objective but does little to alleviate the depth and scale ambiguities. Embracing this challenge we propose a set of novel losses. We show by extensive experiments that our proposed constraints significantly reduce the depth ambiguity and allow the network to more effectively leverage additional 2D annotated images. For example, on the challenging freiHAND dataset using additional 2D annotation without our proposed biomechanical constraints reduces the depth error by only 15%, whereas the error is reduced significantly by 50% when the proposed biomechanical constraints are used. |
Tasks | Hand Pose Estimation, Pose Estimation, Pose Prediction |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09282v1 |
https://arxiv.org/pdf/2003.09282v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-3d-hand-pose-estimation-via |
Repo | |
Framework | |
Attention! A Lightweight 2D Hand Pose Estimation Approach
Title | Attention! A Lightweight 2D Hand Pose Estimation Approach |
Authors | Nicholas Santavas, Ioannis Kansizoglou, Loukas Bampis, Evangelos Karakasis, Antonios Gasteratos |
Abstract | Vision-based human pose estimation is a non-invasive technology for Human-Computer Interaction (HCI). Direct use of the hand as an input device provides an attractive interaction method that needs no specialized sensing equipment, such as exoskeletons or gloves, other than a camera. Traditionally, HCI is employed in various applications in areas including manufacturing, surgery, the entertainment industry and architecture, to mention a few. Deployment of vision-based human pose estimation algorithms can give a breath of innovation to these applications. In this letter, we present a novel Convolutional Neural Network architecture, reinforced with a Self-Attention module, which can be deployed on an embedded system thanks to its lightweight nature, with just 1.9 million parameters. The source code and qualitative results are publicly available. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08047v1 |
https://arxiv.org/pdf/2001.08047v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-a-lightweight-2d-hand-pose |
Repo | |
Framework | |
Representation Learning on Variable Length and Incomplete Wearable-Sensory Time Series
Title | Representation Learning on Variable Length and Incomplete Wearable-Sensory Time Series |
Authors | Xian Wu, Chao Huang, Pablo Roblesgranda, Nitesh Chawla |
Abstract | The prevalence of wearable sensors (e.g., smart wristbands) is enabling an unprecedented opportunity to not only inform health and wellness states of individuals, but also assess and infer demographic information and personality. This can give us deeper personalized insight beyond how many steps we took or what our heart rate is. However, before we can achieve this goal of personalized insight about an individual, we have to resolve a number of shortcomings: 1) wearable-sensory time series are often of variable length and incomplete due to different data collection periods (e.g., wearing behavior varies by person); 2) inter-individual variability in response to external factors like stress and environment. This paper addresses these challenges and brings us closer to the potential of personalized insights about an individual, whether about health, personality, or job performance, by developing a novel representation learning algorithm, HeartSpace. Specifically, HeartSpace is capable of encoding time series data with variable length and missing values via the integration of a time series encoding module and a pattern aggregation network. Additionally, HeartSpace implements a Siamese-triplet network to optimize representations by jointly capturing intra- and inter-series correlations during the embedding learning process. Our empirical evaluation on two different datasets shows significant performance gains over state-of-the-art baselines in a variety of applications, including personality prediction, demographics inference, and user identification. |
Tasks | Representation Learning, Time Series |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03595v1 |
https://arxiv.org/pdf/2002.03595v1.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-on-variable-length |
Repo | |
Framework | |
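The abstract above mentions a Siamese-triplet network for optimizing representations. As a rough illustration only, the generic triplet margin loss can be sketched as follows; the Euclidean distance, the `margin` value, and the function names are assumptions and may differ from the HeartSpace objective.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

Here the anchor and positive would be embeddings of series that should be close (for instance, from the same user), and the negative an embedding that should be far away; that pairing choice is likewise an assumption.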
LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching
Title | LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching |
Authors | Angfan Zhu, Jiaqi Yang, Chen Zhao, Ke Xian, Zhiguo Cao, Xin Li |
Abstract | The local reference frame (LRF) plays a critical role in 3D local shape description and matching. However, most existing LRFs are hand-crafted and suffer from limited repeatability and robustness. This paper presents the first attempt to learn an LRF via a Siamese network that needs weak supervision only. In particular, we argue that each neighboring point in the local surface gives a unique contribution to LRF construction and measure such contributions via learned weights. Extensive analysis and comparative experiments on three public datasets addressing different application scenarios have demonstrated that LRF-Net is more repeatable and robust than several state-of-the-art LRF methods (LRF-Net is only trained on one dataset). In addition, LRF-Net can significantly boost the local shape description and 6-DoF pose estimation performance when matching 3D point clouds. |
Tasks | Pose Estimation |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.07832v1 |
https://arxiv.org/pdf/2001.07832v1.pdf | |
PWC | https://paperswithcode.com/paper/lrf-net-learning-local-reference-frames-for |
Repo | |
Framework | |
SemClinBr – a multi institutional and multi specialty semantically annotated corpus for Portuguese clinical NLP tasks
Title | SemClinBr – a multi institutional and multi specialty semantically annotated corpus for Portuguese clinical NLP tasks |
Authors | Lucas Emanuel Silva e Oliveira, Ana Carolina Peters, Adalniza Moura Pucca da Silva, Caroline P. Gebeluca, Yohan Bonescki Gumiel, Lilian Mie Mukai Cintho, Deborah Ribeiro Carvalho, Sadid A. Hasan, Claudia Maria Cabral Moro |
Abstract | The high volume of research focusing on extracting patient’s information from electronic health records (EHR) has led to an increase in the demand for annotated corpora, which are a very valuable resource for both the development and evaluation of natural language processing (NLP) algorithms. The absence of a multi-purpose clinical corpus outside the scope of the English language, especially in Brazilian Portuguese, is glaring and severely impacts scientific progress in the biomedical NLP field. In this study, we developed a semantically annotated corpus using clinical texts from multiple medical specialties, document types, and institutions. We present the following: (1) a survey listing common aspects and lessons learned from previous research, (2) a fine-grained annotation schema which could be replicated and guide other annotation initiatives, (3) a web-based annotation tool focusing on an annotation suggestion feature, and (4) both intrinsic and extrinsic evaluation of the annotations. The result of this work is the SemClinBr, a corpus that has 1,000 clinical notes, labeled with 65,117 entities and 11,263 relations, and can support a variety of clinical NLP tasks and boost the EHR’s secondary use for the Portuguese language. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.10071v1 |
https://arxiv.org/pdf/2001.10071v1.pdf | |
PWC | https://paperswithcode.com/paper/semclinbr-a-multi-institutional-and-multi |
Repo | |
Framework | |
Combining Machine Learning with Knowledge-Based Modeling for Scalable Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal Systems
Title | Combining Machine Learning with Knowledge-Based Modeling for Scalable Forecasting and Subgrid-Scale Closure of Large, Complex, Spatiotemporal Systems |
Authors | Alexander Wikner, Jaideep Pathak, Brian Hunt, Michelle Girvan, Troy Arcomano, Istvan Szunyogh, Andrew Pomerance, Edward Ott |
Abstract | We consider the commonly encountered situation (e.g., in weather forecasting) where the goal is to predict the time evolution of a large, spatiotemporally chaotic dynamical system when we have access to both time series data of previous system states and an imperfect model of the full system dynamics. Specifically, we attempt to utilize machine learning as the essential tool for integrating the use of past data into predictions. In order to facilitate scalability to the common scenario of interest where the spatiotemporally chaotic system is very large and complex, we propose combining two approaches: (i) a parallel machine learning prediction scheme; and (ii) a hybrid technique, for a composite prediction system composed of a knowledge-based component and a machine-learning-based component. We demonstrate that not only can this method combining (i) and (ii) be scaled to give excellent performance for very large systems, but also that the length of time series data needed to train our multiple, parallel machine learning components is dramatically less than that necessary without parallelization. Furthermore, considering cases where computational realization of the knowledge-based component does not resolve subgrid-scale processes, our scheme is able to use training data to incorporate the effect of the unresolved short-scale dynamics upon the resolved longer-scale dynamics (“subgrid-scale closure”). |
Tasks | Time Series, Weather Forecasting |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.05514v1 |
https://arxiv.org/pdf/2002.05514v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-machine-learning-with-knowledge |
Repo | |
Framework | |
Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers
Title | Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers |
Authors | Swapnil Ashok Jadhav |
Abstract | There have been very few attempts to benchmark the performance of state-of-the-art algorithms for the Neural Machine Translation task on Indian languages. Google, Bing, Facebook and Yandex are among the very few companies that have built translation systems for a few of the Indian languages. Among them, translation results from Google are supposed to be better, based on general inspection. Bing-Translator does not even support the Marathi language, which has around 95 million speakers and ranks 15th in the world in terms of combined primary and secondary speakers. In this exercise, we trained and compared a variety of Marathi-to-English neural machine translators, trained with the BERT tokenizer by huggingface and various Transformer-based architectures using Facebook’s Fairseq platform, with a limited but almost correct parallel corpus, to achieve better BLEU scores than Google on the Tatoeba and Wikimedia open datasets. |
Tasks | Machine Translation |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11643v1 |
https://arxiv.org/pdf/2002.11643v1.pdf | |
PWC | https://paperswithcode.com/paper/marathi-to-english-neural-machine-translation |
Repo | |
Framework | |
Provably efficient reconstruction of policy networks
Title | Provably efficient reconstruction of policy networks |
Authors | Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau |
Abstract | Recent research has shown that learning policies parametrized by large neural networks can achieve significant success on challenging reinforcement learning problems. However, when memory is limited, it is not always possible to store such models exactly for inference, and compressing the policy into a compact representation might be necessary. We propose a general framework for policy representation, which reduces this problem to finding a low-dimensional embedding of a given density function in a separable inner product space. Our framework allows us to derive strong theoretical guarantees, controlling the error of the reconstructed policies. Such guarantees are typically lacking in black-box models, but are very desirable in risk-sensitive tasks. Our experimental results suggest that the reconstructed policies can use less than 10% of the number of parameters in the original networks, while incurring almost no decrease in rewards. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02863v1 |
https://arxiv.org/pdf/2002.02863v1.pdf | |
PWC | https://paperswithcode.com/paper/provably-efficient-reconstruction-of-policy |
Repo | |
Framework | |
Directional Message Passing for Molecular Graphs
Title | Directional Message Passing for Molecular Graphs |
Authors | Johannes Klicpera, Janek Groß, Stephan Günnemann |
Abstract | Graph neural networks have recently achieved great successes in predicting quantum mechanical properties of molecules. These models represent a molecule as a graph using only the distance between atoms (nodes). They do not, however, consider the spatial direction from one atom to another, despite directional information playing a central role in empirical potentials for molecules, e.g. in angular potentials. To alleviate this limitation we propose directional message passing, in which we embed the messages passed between atoms instead of the atoms themselves. Each message is associated with a direction in coordinate space. These directional message embeddings are rotationally equivariant since the associated directions rotate with the molecule. We propose a message passing scheme analogous to belief propagation, which uses the directional information by transforming messages based on the angle between them. Additionally, we use spherical Bessel functions and spherical harmonics to construct theoretically well-founded, orthogonal representations that achieve better performance than the currently prevalent Gaussian radial basis representations while using fewer than 1/4 of the parameters. We leverage these innovations to construct the directional message passing neural network (DimeNet). DimeNet outperforms previous GNNs on average by 76% on MD17 and by 31% on QM9. Our implementation is available online. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03123v1 |
https://arxiv.org/pdf/2003.03123v1.pdf | |
PWC | https://paperswithcode.com/paper/directional-message-passing-for-molecular-1 |
Repo | |
Framework | |
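The abstract above describes transforming messages based on the angle between them. As a small geometric illustration, assuming the angle is measured at a shared atom j between the directions toward atoms k and i (the exact convention is an assumption here), it can be computed from atom coordinates as follows. The function names are illustrative, not from the DimeNet implementation.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero-length vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def message_angle(pos_k, pos_j, pos_i):
    """Angle at atom j between the directions j->k and j->i."""
    to_k = [a - b for a, b in zip(pos_k, pos_j)]
    to_i = [a - b for a, b in zip(pos_i, pos_j)]
    return angle_between(to_k, to_i)
```

Because the angle depends only on relative directions, it is unchanged when the whole molecule is rotated or translated, which is why it is a convenient input for direction-aware message updates.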
Exploring Maximum Entropy Distributions with Evolutionary Algorithms
Title | Exploring Maximum Entropy Distributions with Evolutionary Algorithms |
Authors | Raul Rojas |
Abstract | This paper shows how to evolve numerically the maximum entropy probability distributions for a given set of constraints, which is a variational calculus problem. An evolutionary algorithm can obtain approximations to some well-known analytical results, but is even more flexible and can find distributions for which a closed formula cannot be readily stated. The numerical approach handles distributions over finite intervals. We show that there are two ways of conducting the procedure: by direct optimization of the Lagrangian of the constrained problem, or by optimizing the entropy among the subset of distributions which fulfill the constraints. An incremental evolutionary strategy easily obtains the uniform, the exponential, the Gaussian, the log-normal, the Laplace, among other distributions, once the constrained problem is solved with any of the two methods. Solutions for mixed (“chimera”) distributions can be also found. We explain why many of the distributions are symmetrical and continuous, but some are not. |
Tasks | |
Published | 2020-02-05 |
URL | https://arxiv.org/abs/2002.01973v1 |
https://arxiv.org/pdf/2002.01973v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-maximum-entropy-distributions-with |
Repo | |
Framework | |
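The second procedure named in the abstract above, optimizing the entropy among distributions that fulfill the constraints, can be illustrated with a toy example. The sketch below is not the paper's incremental evolutionary strategy: it assumes a distribution discretized into bins over a finite interval, with normalization as the only constraint, and uses a simple accept-if-not-worse mutation loop. Every mutation moves probability mass between two bins, so the constraint always holds. All names and parameter values are illustrative.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def mutate(p, rng, step):
    """Move a small amount of mass from one bin to another.

    The total mass is preserved, so the candidate stays normalized.
    """
    i, j = rng.sample(range(len(p)), 2)
    delta = min(p[i], rng.uniform(0.0, step))
    q = list(p)
    q[i] -= delta
    q[j] += delta
    return q


def evolve_max_entropy(n_bins=8, iterations=20000, step=0.05, seed=0):
    """Hill-climbing evolution toward the maximum-entropy distribution."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(n_bins)]
    total = sum(raw)
    p = [x / total for x in raw]
    best = entropy(p)
    for _ in range(iterations):
        candidate = mutate(p, rng, step)
        h = entropy(candidate)
        if h >= best:  # keep the mutant only if entropy does not drop
            p, best = candidate, h
    return p, best
```

With normalization as the only constraint, the result approaches the uniform distribution, whose entropy is `math.log(n_bins)`. Further constraints, such as a fixed mean, would need mutations that preserve them or a penalty term, which this sketch does not cover.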
Aerial Imagery based LIDAR Localization for Autonomous Vehicles
Title | Aerial Imagery based LIDAR Localization for Autonomous Vehicles |
Authors | Ankit Vora, Siddharth Agarwal, Gaurav Pandey, James McBride |
Abstract | This paper presents a localization technique using aerial imagery maps and LIDAR based ground reflectivity for autonomous vehicles in urban environments. Traditional localization techniques using LIDAR reflectivity rely on high definition reflectivity maps generated from a mapping vehicle. The cost and effort required to maintain such prior maps are generally very high because maintaining them requires a fleet of expensive mapping vehicles. In this work we propose a localization technique where the vehicle localizes using aerial/satellite imagery, eradicating the need to develop and maintain complex high-definition maps. The proposed technique has been tested on a real world dataset collected from a test track in Ann Arbor, Michigan. This research concludes that aerial imagery based maps provide real-time localization performance similar to state-of-the-art LIDAR based maps for autonomous vehicles in urban environments at reduced costs. |
Tasks | Autonomous Vehicles |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11192v1 |
https://arxiv.org/pdf/2003.11192v1.pdf | |
PWC | https://paperswithcode.com/paper/aerial-imagery-based-lidar-localization-for |
Repo | |
Framework | |