April 2, 2020

Paper Group ANR 292

Predicting the Physical Dynamics of Unseen 3D Objects

Title Predicting the Physical Dynamics of Unseen 3D Objects
Authors Davis Rempe, Srinath Sridhar, He Wang, Leonidas J. Guibas
Abstract Machines that can predict the effect of physical interactions on the dynamics of previously unseen object instances are important for creating better robots and interactive virtual worlds. In this work, we focus on predicting the dynamics of 3D objects on a plane that have just been subjected to an impulsive force. In particular, we predict the changes in state - 3D position, rotation, velocities, and stability. Different from previous work, our approach can generalize dynamics predictions to object shapes and initial conditions that were unseen during training. Our method takes the 3D object’s shape as a point cloud and its initial linear and angular velocities as input. We extract shape features and use a recurrent neural network to predict the full change in state at each time step. Our model can support training with data from both a physics engine or the real world. Experiments show that we can accurately predict the changes in state for unseen object geometries and initial conditions.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2001.06291v1
PDF https://arxiv.org/pdf/2001.06291v1.pdf
PWC https://paperswithcode.com/paper/predicting-the-physical-dynamics-of-unseen-3d
Repo
Framework

TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

Title TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Authors Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
Abstract Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks including sentence classification, machine translation, and question answering. The BERT model architecture is derived primarily from the transformer. Prior to the transformer era, bidirectional Long Short-Term Memory (BLSTM) has been the dominant modeling architecture for neural machine translation and question answering. In this paper, we investigate how these two modeling techniques can be combined to create a more powerful model architecture. We propose a new architecture denoted as Transformer with BLSTM (TRANS-BLSTM) which has a BLSTM layer integrated to each transformer block, leading to a joint modeling framework for transformer and BLSTM. We show that TRANS-BLSTM models consistently lead to improvements in accuracy compared to BERT baselines in GLUE and SQuAD 1.1 experiments. Our TRANS-BLSTM model obtains an F1 score of 94.01% on the SQuAD 1.1 development dataset, which is comparable to the state-of-the-art result.
Tasks Machine Translation, Question Answering, Sentence Classification
Published 2020-03-16
URL https://arxiv.org/abs/2003.07000v1
PDF https://arxiv.org/pdf/2003.07000v1.pdf
PWC https://paperswithcode.com/paper/trans-blstm-transformer-with-bidirectional
Repo
Framework

Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing

Title Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing
Authors Mao V. Ngo, Hakima Chaouchi, Tie Luo, Tony Q. S. Quek
Abstract Advances in deep neural networks (DNN) greatly bolster real-time detection of anomalous IoT data. However, IoT devices can barely afford complex DNN models due to limited computational power and energy supply. While one can offload anomaly detection tasks to the cloud, it incurs long delay and requires large bandwidth when thousands of IoT devices stream data to the cloud concurrently. In this paper, we propose an adaptive anomaly detection approach for hierarchical edge computing (HEC) systems to solve this problem. Specifically, we first construct three anomaly detection DNN models of increasing complexity, and associate them with the three layers of HEC from bottom to top, i.e., IoT devices, edge servers, and cloud. Then, we design an adaptive scheme to select one of the models based on the contextual information extracted from input data, to perform anomaly detection. The selection is formulated as a contextual bandit problem and is characterized by a single-step Markov decision process, with an objective of achieving high detection accuracy and low detection delay simultaneously. We evaluate our proposed approach using a real IoT dataset, and demonstrate that it reduces detection delay by 84% while maintaining almost the same accuracy as compared to offloading detection tasks to the cloud. In addition, our evaluation also shows that it outperforms other baseline schemes.
Tasks Anomaly Detection
Published 2020-01-10
URL https://arxiv.org/abs/2001.03314v1
PDF https://arxiv.org/pdf/2001.03314v1.pdf
PWC https://paperswithcode.com/paper/adaptive-anomaly-detection-for-iot-data-in
Repo
Framework

Granular Learning with Deep Generative Models using Highly Contaminated Data

Title Granular Learning with Deep Generative Models using Highly Contaminated Data
Authors John Just
Abstract An approach to utilize recent advances in deep generative models for anomaly detection in a granular (continuous) sense on a real-world image dataset with quality issues is detailed using recent normalizing flow models, with implications in many other applications/domains/data types. The approach is completely unsupervised (no annotations available) but qualitatively shown to provide accurate semantic labeling for images via heatmaps of the scaled log-likelihood overlaid on the images. When sorted based on the median values per image, clear trends in quality are observed. Furthermore, downstream classification is shown to be possible and effective via a weakly supervised approach using the log-likelihood output from a normalizing flow model as a training signal for a feature-extracting convolutional neural network. The pre-linear dense layer outputs on the CNN are shown to disentangle high level representations and efficiently cluster various quality issues. Thus, an entirely non-annotated (fully unsupervised) approach is shown possible for accurate estimation and classification of quality issues.
Tasks Anomaly Detection
Published 2020-01-06
URL https://arxiv.org/abs/2001.04297v1
PDF https://arxiv.org/pdf/2001.04297v1.pdf
PWC https://paperswithcode.com/paper/granular-learning-with-deep-generative-models
Repo
Framework

Let Me At Least Learn What You Really Like: Dealing With Noisy Humans When Learning Preferences

Title Let Me At Least Learn What You Really Like: Dealing With Noisy Humans When Learning Preferences
Authors Sriram Gopalakrishnan, Utkarsh Soni
Abstract Learning the preferences of a human improves the quality of the interaction with the human. The number of queries available to learn preferences may be limited, especially when interacting with a human, and so active learning is a must. One approach to active learning is to use uncertainty sampling to decide the informativeness of a query. In this paper, we propose a modification to uncertainty sampling which uses the expected output value to help speed up learning of preferences. We compare our approach with the uncertainty sampling baseline, as well as conduct an ablation study to test the validity of each component of our approach.
Tasks Active Learning
Published 2020-02-15
URL https://arxiv.org/abs/2002.06288v1
PDF https://arxiv.org/pdf/2002.06288v1.pdf
PWC https://paperswithcode.com/paper/let-me-at-least-learn-what-you-really-like
Repo
Framework
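
The uncertainty-sampling baseline named in the abstract above can be illustrated with a minimal sketch. It assumes a binary preference model that outputs a probability per candidate query and scores each query by predictive entropy; the function names and the example pool are hypothetical, and the sketch does not include the authors' expected-output-value modification.

```python
import math


def entropy(p):
    """Binary entropy (in nats) of a predicted preference probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain prediction carries no uncertainty
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def most_uncertain_query(probabilities):
    """Return the index of the candidate query with the highest entropy."""
    return max(range(len(probabilities)), key=lambda i: entropy(probabilities[i]))


# Hypothetical pool: model's predicted probability that the human likes each item.
pool = [0.95, 0.52, 0.10, 0.70]
chosen = most_uncertain_query(pool)  # index of the prediction closest to 0.5
```

Entropy peaks at a probability of 0.5, so the selected query is the one the current model is least sure about.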

Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling

Title Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling
Authors Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang
Abstract Despite the wide applications of Adam in reinforcement learning (RL), the theoretical convergence of Adam-type RL algorithms has not been established. This paper provides the first such convergence analysis for two fundamental RL algorithms of policy gradient (PG) and temporal difference (TD) learning that incorporate AMSGrad updates (a standard alternative of Adam in theoretical analysis), referred to as PG-AMSGrad and TD-AMSGrad, respectively. Moreover, our analysis focuses on Markovian sampling for both algorithms. We show that under general nonlinear function approximation, PG-AMSGrad with a constant stepsize converges to a neighborhood of a stationary point at the rate of $\mathcal{O}(1/T)$ (where $T$ denotes the number of iterations), and with a diminishing stepsize converges exactly to a stationary point at the rate of $\mathcal{O}(\log^2 T/\sqrt{T})$. Furthermore, under linear function approximation, TD-AMSGrad with a constant stepsize converges to a neighborhood of the global optimum at the rate of $\mathcal{O}(1/T)$, and with a diminishing stepsize converges exactly to the global optimum at the rate of $\mathcal{O}(\log T/\sqrt{T})$. Our study develops new techniques for analyzing the Adam-type RL algorithms under Markovian sampling.
Tasks
Published 2020-02-15
URL https://arxiv.org/abs/2002.06286v1
PDF https://arxiv.org/pdf/2002.06286v1.pdf
PWC https://paperswithcode.com/paper/non-asymptotic-convergence-of-adam-type
Repo
Framework
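
For readers unfamiliar with the AMSGrad update referenced in the abstract above, a minimal sketch of the generic update rule follows. It is the standard AMSGrad step (Adam's moment estimates plus a running maximum of the second moment, without bias correction), applied here to a toy quadratic rather than to policy gradient or TD learning; the hyperparameter values and helper names are assumptions for illustration, not taken from the paper.

```python
import math


def init_state(n):
    """Zero-initialised first moment, second moment, and running maximum."""
    return {"m": [0.0] * n, "v": [0.0] * n, "v_hat": [0.0] * n}


def amsgrad_step(theta, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update; returns the new parameters and the new state."""
    m = [beta1 * mi + (1.0 - beta1) * g for mi, g in zip(state["m"], grad)]
    v = [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(state["v"], grad)]
    # Key difference from Adam: the denominator uses a non-decreasing maximum.
    v_hat = [max(vh, vi) for vh, vi in zip(state["v_hat"], v)]
    new_theta = [t - lr * mi / (math.sqrt(vh) + eps)
                 for t, mi, vh in zip(theta, m, v_hat)]
    return new_theta, {"m": m, "v": v, "v_hat": v_hat}


# Toy usage: minimise f(x) = x^2, whose gradient is 2x.
theta, state = [1.0], init_state(1)
for _ in range(500):
    theta, state = amsgrad_step(theta, [2.0 * theta[0]], state)
```

Because `v_hat` never decreases, the effective per-coordinate step size never grows, which is the property that makes AMSGrad convenient for convergence analysis.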

Masakhane – Machine Translation For Africa

Title Masakhane – Machine Translation For Africa
Authors Iroro Orife, Julia Kreutzer, Blessing Sibanda, Daniel Whitenack, Kathleen Siminyu, Laura Martinus, Jamiil Toure Ali, Jade Abbott, Vukosi Marivate, Salomon Kabongo, Musie Meressa, Espoir Murhabazi, Orevaoghene Ahia, Elan van Biljon, Arshath Ramkilowan, Adewale Akinfaderin, Alp Öktem, Wole Akin, Ghollah Kioko, Kevin Degila, Herman Kamper, Bonaventure Dossou, Chris Emezue, Kelechi Ogueji, Abdallah Bashir
Abstract Africa has over 2000 languages. Despite this, African languages account for a small portion of available resources and publications in Natural Language Processing (NLP). This is due to multiple factors, including: a lack of focus from government and funding, discoverability, a lack of community, sheer language complexity, difficulty in reproducing papers and no benchmarks to compare techniques. To begin to address the identified problems, MASAKHANE, an open-source, continent-wide, distributed, online research effort for machine translation for African languages, was founded. In this paper, we discuss our methodology for building the community and spurring research from the African continent, as well as outline the success of the community in terms of addressing the identified problems affecting African NLP.
Tasks Machine Translation
Published 2020-03-13
URL https://arxiv.org/abs/2003.11529v1
PDF https://arxiv.org/pdf/2003.11529v1.pdf
PWC https://paperswithcode.com/paper/masakhane-machine-translation-for-africa
Repo
Framework

Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses

Title Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses
Authors Mingda Li, Weitong Ruan, Xinyue Liu, Luca Soldaini, Wael Hamza, Chengwei Su
Abstract In a modern spoken language understanding (SLU) system, the natural language understanding (NLU) module takes interpretations of a speech from the automatic speech recognition (ASR) module as the input. The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification. However, the ASR module might misrecognize some speeches and the first best interpretation could be erroneous and noisy. Solely relying on the first best interpretation could make the performance of downstream tasks non-optimal. To address this issue, we introduce a series of simple yet efficient models for improving the understanding of semantics of the input speeches by collectively exploiting the n-best speech interpretations from the ASR module.
Tasks Intent Classification, Speech Recognition, Spoken Language Understanding
Published 2020-01-11
URL https://arxiv.org/abs/2001.05284v1
PDF https://arxiv.org/pdf/2001.05284v1.pdf
PWC https://paperswithcode.com/paper/improving-spoken-language-understanding-by
Repo
Framework

Latent Space Subdivision: Stable and Controllable Time Predictions for Fluid Flow

Title Latent Space Subdivision: Stable and Controllable Time Predictions for Fluid Flow
Authors Steffen Wiewel, Byungsoo Kim, Vinicius C. Azevedo, Barbara Solenthaler, Nils Thuerey
Abstract We propose an end-to-end trained neural network architecture to robustly predict the complex dynamics of fluid flows with high temporal stability. We focus on single-phase smoke simulations in 2D and 3D based on the incompressible Navier-Stokes (NS) equations, which are relevant for a wide range of practical problems. To achieve stable predictions for long-term flow sequences, a convolutional neural network (CNN) is trained for spatial compression in combination with a temporal prediction network that consists of stacked Long Short-Term Memory (LSTM) layers. Our core contribution is a novel latent space subdivision (LSS) to separate the respective input quantities into individual parts of the encoded latent space domain. This allows us to distinctively alter the encoded quantities without interfering with the remaining latent space values and hence maximizes external control. By selectively overwriting parts of the predicted latent space points, our proposed method is capable of robustly predicting long-term sequences of complex physics problems. In addition, we highlight the benefits of a recurrent training on the latent space creation, which is performed by the spatial compression network.
Tasks
Published 2020-03-12
URL https://arxiv.org/abs/2003.08723v1
PDF https://arxiv.org/pdf/2003.08723v1.pdf
PWC https://paperswithcode.com/paper/latent-space-subdivision-stable-and
Repo
Framework

Real-time 3D Deep Multi-Camera Tracking

Title Real-time 3D Deep Multi-Camera Tracking
Authors Quanzeng You, Hao Jiang
Abstract Tracking a crowd in 3D using multiple RGB cameras is a challenging task. Most previous multi-camera tracking algorithms are designed for the offline setting and have high computational complexity. Robust real-time multi-camera 3D tracking is still an unsolved problem. In this work, we propose a novel end-to-end tracking pipeline, Deep Multi-Camera Tracking (DMCT), which achieves reliable real-time multi-camera people tracking. Our DMCT consists of 1) a fast and novel perspective-aware Deep GroudPoint Network, 2) a fusion procedure for ground-plane occupancy heatmap estimation, 3) a novel Deep Glimpse Network for person detection and 4) a fast and accurate online tracker. Our design fully unleashes the power of deep neural networks to estimate the “ground point” of each person in each color image, which can be optimized to run efficiently and robustly. Our fusion procedure, glimpse network and tracker merge the results from different views, find people candidates using multiple video frames and then track people on the fused heatmap. Our system achieves state-of-the-art tracking results while maintaining real-time performance. Apart from evaluation on the challenging WILDTRACK dataset, we also collect two more tracking datasets with high-quality labels from two different environments and camera settings. Our experimental results confirm that our proposed real-time pipeline gives superior results to previous approaches.
Tasks Human Detection
Published 2020-03-26
URL https://arxiv.org/abs/2003.11753v1
PDF https://arxiv.org/pdf/2003.11753v1.pdf
PWC https://paperswithcode.com/paper/real-time-3d-deep-multi-camera-tracking
Repo
Framework

Predictive Power of Nearest Neighbors Algorithm under Random Perturbation

Title Predictive Power of Nearest Neighbors Algorithm under Random Perturbation
Authors Yue Xing, Qifan Song, Guang Cheng
Abstract We consider a data corruption scenario in the classical $k$ Nearest Neighbors ($k$-NN) algorithm, that is, the testing data are randomly perturbed. Under such a scenario, the impact of corruption level on the asymptotic regret is carefully characterized. In particular, our theoretical analysis reveals a phase transition phenomenon that, when the corruption level $\omega$ is below a critical order (i.e., small-$\omega$ regime), the asymptotic regret remains the same; when it is beyond that order (i.e., large-$\omega$ regime), the asymptotic regret deteriorates polynomially. Surprisingly, we obtain a negative result that the classical noise-injection approach will not help improve the testing performance in the beginning stage of the large-$\omega$ regime, even in the level of the multiplicative constant of asymptotic regret. As a technical by-product, we prove that under different model assumptions, the pre-processed 1-NN proposed in \cite{xue2017achieving} will at most achieve a sub-optimal rate when the data dimension $d>4$ even if $k$ is chosen optimally in the pre-processing step.
Tasks
Published 2020-02-13
URL https://arxiv.org/abs/2002.05304v1
PDF https://arxiv.org/pdf/2002.05304v1.pdf
PWC https://paperswithcode.com/paper/predictive-power-of-nearest-neighbors
Repo
Framework
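
The setting described in the abstract above, a $k$-NN classifier evaluated on randomly perturbed test points, can be explored empirically with a small sketch. Everything specific here is an assumption for illustration: the two-class Gaussian data, isotropic Gaussian noise with standard deviation `omega` as the perturbation model, and the helper names. It does not reproduce the paper's theoretical regimes.

```python
import random


def knn_predict(train, query, k):
    """Majority vote (labels 0/1) over the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def perturb(x, omega, rng):
    """Add independent Gaussian noise of scale omega to each coordinate."""
    return [xi + rng.gauss(0.0, omega) for xi in x]


def accuracy(train, test, k, omega, rng):
    """Test accuracy when every test point is perturbed before prediction."""
    hits = sum(knn_predict(train, perturb(x, omega, rng), k) == y for x, y in test)
    return hits / len(test)


def sample(n, rng):
    """Two 2D Gaussian classes whose means differ along the first axis."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append(([rng.gauss(2.0 * y, 1.0), rng.gauss(0.0, 1.0)], y))
    return data


rng = random.Random(0)
train, test = sample(200, rng), sample(200, rng)
for omega in (0.0, 0.5, 2.0):
    print(omega, accuracy(train, test, k=5, omega=omega, rng=rng))
```

Sweeping `omega` over a finer grid gives a rough empirical view of how accuracy responds to the corruption level on this toy problem.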

A Primer in BERTology: What we know about how BERT works

Title A Primer in BERTology: What we know about how BERT works
Authors Anna Rogers, Olga Kovaleva, Anna Rumshisky
Abstract Transformer-based models are now widely used in NLP, but we still do not understand a lot about their inner workings. This paper describes what is known to date about the famous BERT model (Devlin et al. 2019), synthesizing over 40 analysis studies. We also provide an overview of the proposed modifications to the model and its training regime. We then outline the directions for further research.
Tasks
Published 2020-02-27
URL https://arxiv.org/abs/2002.12327v1
PDF https://arxiv.org/pdf/2002.12327v1.pdf
PWC https://paperswithcode.com/paper/a-primer-in-bertology-what-we-know-about-how
Repo
Framework

A Hybrid Approach to Temporal Pattern Matching

Title A Hybrid Approach to Temporal Pattern Matching
Authors Konstantinos Semertzidis, Evaggelia Pitoura
Abstract The primary objective of graph pattern matching is to find all appearances of an input graph pattern query in a large data graph. Such appearances are called matches. In this paper, we are interested in finding matches of interaction patterns in temporal graphs. To this end, we propose a hybrid approach that achieves effective filtering of potential matches based both on structure and time. Our approach exploits a graph representation where edges are ordered by time. We present experiments with real datasets that illustrate the efficiency of our approach.
Tasks
Published 2020-01-06
URL https://arxiv.org/abs/2001.01661v2
PDF https://arxiv.org/pdf/2001.01661v2.pdf
PWC https://paperswithcode.com/paper/a-hybrid-approach-to-temporal-pattern
Repo
Framework

Human-like Planning for Reaching in Cluttered Environments

Title Human-like Planning for Reaching in Cluttered Environments
Authors Mohamed Hasan, Matthew Warburton, Wisdom C. Agboh, Mehmet R. Dogar, Matteo Leonetti, He Wang, Faisal Mushtaq, Mark Mon-Williams, Anthony G. Cohn
Abstract Humans, in comparison to robots, are remarkably adept at reaching for objects in cluttered environments. The best existing robot planners are based on random sampling of configuration space – which becomes excessively high-dimensional with a large number of objects. Consequently, most planners often fail to efficiently find object manipulation plans in such environments. We addressed this problem by identifying high-level manipulation plans in humans, and transferring these skills to robot planners. We used virtual reality to capture human participants reaching for a target object on a tabletop cluttered with obstacles. From this, we devised a qualitative representation of the task space to abstract the decision making, irrespective of the number of obstacles. Based on this representation, human demonstrations were segmented and used to train decision classifiers. Using these classifiers, our planner produced a list of waypoints in task space. These waypoints provided a high-level plan, which could be transferred to an arbitrary robot model and used to initialise a local trajectory optimiser. We evaluated this approach through testing on unseen human VR data, a physics-based robot simulation, and a real robot (dataset and code are publicly available). We found that the human-like planner outperformed a state-of-the-art standard trajectory optimisation algorithm, and was able to generate effective strategies for rapid planning – irrespective of the number of obstacles in the environment.
Tasks Decision Making
Published 2020-02-28
URL https://arxiv.org/abs/2002.12738v2
PDF https://arxiv.org/pdf/2002.12738v2.pdf
PWC https://paperswithcode.com/paper/introducing-a-human-like-planner-for-reaching
Repo
Framework

Towards Automated Swimming Analytics Using Deep Neural Networks

Title Towards Automated Swimming Analytics Using Deep Neural Networks
Authors Timothy Woinoski, Alon Harell, Ivan V. Bajic
Abstract Methods for creating a system to automate the collection of swimming analytics on a pool-wide scale are considered in this paper. There has not been much work on swimmer tracking or the creation of a swimmer database for machine learning purposes. Consequently, methods for collecting swimmer data from videos of swim competitions are explored and analyzed. The result is a guide to the creation of a comprehensive collection of swimming data suitable for training swimmer detection and tracking systems. With this database in place, systems can then be created to automate the collection of swimming analytics.
Tasks
Published 2020-01-13
URL https://arxiv.org/abs/2001.04433v1
PDF https://arxiv.org/pdf/2001.04433v1.pdf
PWC https://paperswithcode.com/paper/towards-automated-swimming-analytics-using
Repo
Framework