January 25, 2020


Paper Group ANR 1709

Performance Analysis and Characterization of Training Deep Learning Models on Mobile Devices. Streaming Quantiles Algorithms with Small Space and Update Time. Fast Record Linkage for Company Entities. Toward Fairness in AI for People with Disabilities: A Research Roadmap. Model-Based Reinforcement Learning for Atari. Revisiting the probabilistic me …

Performance Analysis and Characterization of Training Deep Learning Models on Mobile Devices


Title	Performance Analysis and Characterization of Training Deep Learning Models on Mobile Devices
Authors	Jie Liu, Jiawen Liu, Wan Du, Dong Li
Abstract	Training deep learning models on mobile devices has recently become possible, thanks to the increasing computation power of mobile hardware and the user-experience advantages that on-device training enables. Most existing work on machine learning on mobile devices focuses on the inference of deep learning models (particularly convolutional and recurrent neural networks), not on training. The performance characterization of training deep learning models on mobile devices is largely unexplored, although understanding it is critical for designing and implementing deep learning models on mobile devices. In this paper, we perform a variety of experiments on a representative mobile device (the NVIDIA TX2) to study the performance of training deep learning models. We introduce a benchmark suite and tools to study the performance of training deep learning models on mobile devices, from the perspectives of memory consumption, hardware utilization, and power consumption. The tools can correlate performance results with fine-grained operations in deep learning models, providing capabilities to capture performance variance and problems at a fine granularity. We reveal interesting performance problems and opportunities, including under-utilization of heterogeneous hardware, large energy consumption of the memory, and high predictability of workload characterization. Based on the performance analysis, we suggest interesting research directions.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04278v2
PDF	https://arxiv.org/pdf/1906.04278v2.pdf
PWC	https://paperswithcode.com/paper/performance-analysis-and-characterization-of
Repo
Framework

Streaming Quantiles Algorithms with Small Space and Update Time


Title	Streaming Quantiles Algorithms with Small Space and Update Time
Authors	Nikita Ivkin, Edo Liberty, Kevin Lang, Zohar Karnin, Vladimir Braverman
Abstract	Approximating quantiles and distributions over streaming data has been studied for roughly two decades now. Recently, Karnin, Lang, and Liberty proposed the first asymptotically optimal algorithm for doing so. This manuscript complements their theoretical result by providing practical variants of their algorithm with improved constants. For a given sketch size, our techniques provably reduce the upper bound on the sketch error by a factor of two. These improvements are verified experimentally. Our modified quantile sketch improves the latency as well by reducing the worst case update time from $O(1/\varepsilon)$ down to $O(\log (1/\varepsilon))$. We also suggest two algorithms for weighted item streams which offer improved asymptotic update times compared to naïve extensions. Finally, we provide a specialized data structure for these sketches which reduces both their memory footprints and update times.
Tasks
Published	2019-06-29
URL	https://arxiv.org/abs/1907.00236v1
PDF	https://arxiv.org/pdf/1907.00236v1.pdf
PWC	https://paperswithcode.com/paper/streaming-quantiles-algorithms-with-small
Repo
Framework
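
The compactor idea behind this family of quantile sketches can be illustrated with a minimal sketch. This is a deliberately simplified version, not the algorithm from the paper: it assumes a uniform, even per-level capacity (the Karnin-Lang-Liberty sketch uses level-dependent capacities), and the class name `CompactorSketch` and its parameters are hypothetical names chosen for illustration. Each level `h` stores items of weight `2**h`; when a level fills up, it is sorted and every other item is promoted to the next level.

```python
import random


class CompactorSketch:
    """Simplified compactor hierarchy: items at level h carry weight 2**h."""

    def __init__(self, capacity=64, seed=0):
        # Assumption: one even capacity shared by all levels.
        self.capacity = capacity
        self.levels = [[]]
        self.rng = random.Random(seed)
        self.n = 0

    def update(self, item):
        self.levels[0].append(item)
        self.n += 1
        h = 0
        # A compaction may fill the next level, so cascade upwards.
        while len(self.levels[h]) >= self.capacity:
            self._compact(h)
            h += 1

    def _compact(self, h):
        if h + 1 == len(self.levels):
            self.levels.append([])
        buf = sorted(self.levels[h])
        offset = self.rng.randrange(2)  # keep even or odd positions at random
        self.levels[h + 1].extend(buf[offset::2])
        self.levels[h] = []

    def rank(self, x):
        """Estimated number of stream items <= x."""
        return sum((1 << h) * sum(1 for v in level if v <= x)
                   for h, level in enumerate(self.levels))

    def quantile(self, q):
        """Smallest stored value whose estimated rank reaches q * n."""
        weighted = sorted((v, 1 << h)
                          for h, level in enumerate(self.levels)
                          for v in level)
        seen = 0
        for v, w in weighted:
            seen += w
            if seen >= q * self.n:
                return v
        return weighted[-1][0]
```

Because each compaction halves the item count while doubling the weight, the total weight always equals the number of items seen. The rank error comes only from the random even/odd choice at each compaction.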

Fast Record Linkage for Company Entities


Title	Fast Record Linkage for Company Entities
Authors	Thomas Gschwind, Christoph Miksovic, Julian Minder, Katsiaryna Mirylenka, Paolo Scotton
Abstract	Record linkage is an essential part of nearly all real-world systems that consume structured and unstructured data coming from different sources. Typically no common key is available for connecting records. Massive data cleaning and data integration processes often have to be completed before any data analytics and further processing can be performed. Although record linkage is frequently regarded as a somewhat tedious but necessary step, it reveals valuable insights into the data at hand. These insights guide further analytic approaches to the data and support data visualization. In this work we focus on company entity matching, where company name, location and industry are taken into account. Our contribution is an end-to-end, highly scalable, enterprise-grade system that uses rule-based linkage algorithms extended with a machine learning approach to account for short company names. Linkage time is greatly reduced by efficient decomposition of the search space using MinHash. High linkage accuracy is achieved by the proposed thorough scoring process of the matching candidates. Based on real-world ground truth datasets, we show that our approach reaches a recall of 91% compared to 73% for baseline approaches. These results are achieved while scaling linearly with the number of nodes used in the system.
Tasks
Published	2019-07-19
URL	https://arxiv.org/abs/1907.08667v3
PDF	https://arxiv.org/pdf/1907.08667v3.pdf
PWC	https://paperswithcode.com/paper/fast-record-linkage-for-company-entities
Repo
Framework
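
As a rough illustration of how MinHash can cut down the search space for linkage, the following sketch groups company names into candidate pairs using banded MinHash signatures over character trigrams. It is not the system described in the abstract: the function names, the trigram tokenization, the SHA-256 based hash family, and the `num_hashes` and `bands` values are all assumptions made for this example, and the scoring stage is omitted entirely.

```python
import hashlib
from collections import defaultdict


def shingles(name, k=3):
    """Character k-grams of a whitespace-normalized, lower-cased name."""
    s = " ".join(name.lower().split())
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def minhash_signature(tokens, num_hashes=32):
    """One minimum per seeded hash function (seed is mixed into the input)."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int.from_bytes(
                hashlib.sha256(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens))
    return tuple(sig)


def candidate_pairs(names, num_hashes=32, bands=16):
    """Index pairs of names that share at least one signature band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for idx, name in enumerate(names):
        sig = minhash_signature(shingles(name), num_hashes)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(idx)
    pairs = set()
    for members in buckets.values():
        ordered = sorted(members)
        for pos, i in enumerate(ordered):
            for j in ordered[pos + 1:]:
                pairs.add((i, j))
    return pairs
```

Only the pairs returned by `candidate_pairs` would need to be passed to a more expensive scoring step, which is what keeps the number of comparisons far below the quadratic all-pairs count.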

Toward Fairness in AI for People with Disabilities: A Research Roadmap


Title	Toward Fairness in AI for People with Disabilities: A Research Roadmap
Authors	Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, Meredith Ringel Morris
Abstract	AI technologies have the potential to dramatically impact the lives of people with disabilities (PWD). Indeed, improving the lives of PWD is a motivator for many state-of-the-art AI systems, such as automated speech recognition tools that can caption videos for people who are deaf and hard of hearing, or language prediction algorithms that can augment communication for people with speech or cognitive disabilities. However, widely deployed AI systems may not work properly for PWD, or worse, may actively discriminate against them. These considerations regarding fairness in AI for PWD have thus far received little attention. In this position paper, we identify potential areas of concern regarding how several AI technology categories may impact particular disability constituencies if care is not taken in their design, development, and testing. We intend for this risk assessment of how various classes of AI might interact with various classes of disability to provide a roadmap for future research that is needed to gather data, test these hypotheses, and build more inclusive algorithms.
Tasks	Speech Recognition
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02227v2
PDF	https://arxiv.org/pdf/1907.02227v2.pdf
PWC	https://paperswithcode.com/paper/toward-fairness-in-ai-for-people-with
Repo
Framework

Model-Based Reinforcement Learning for Atari


Title	Model-Based Reinforcement Learning for Atari
Authors	Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
Abstract	Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction – substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in a low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
Tasks	Atari Games, Video Prediction
Published	2019-03-01
URL	https://arxiv.org/abs/1903.00374v4
PDF	https://arxiv.org/pdf/1903.00374v4.pdf
PWC	https://paperswithcode.com/paper/model-based-reinforcement-learning-for-atari
Repo
Framework

Revisiting the probabilistic method of record linkage


Title	Revisiting the probabilistic method of record linkage
Authors	Abel Dasylva, Arthur Goussanou, David Ajavon, Hanan Abousaleh
Abstract	In theory, the probabilistic linkage method provides two distinct advantages over non-probabilistic methods, including minimal rates of linkage error and accurate measures of these rates for data users. However, implementations can fall short of these expectations either because the conditional independence assumption is made, or because a model with interactions is used but lacks the identification property. In official statistics, this is currently the main challenge to the automated production and use of linked data. To address this challenge, a new methodology is described for proper linkage problems, where matched records may be identified with a probability that is bounded away from zero, regardless of the population size. It models the number of neighbours of a given record, i.e. the number of resembling records. To be specific, the proposed model is a finite mixture where each component is the sum of a Bernoulli variable with an independent Poisson variable. It has the identification property and yields solutions for many longstanding problems, including the evaluation of blocking criteria and the estimation of linkage errors for probabilistic or non-probabilistic linkages, all without clerical reviews or conditional independence assumptions. Thus it also enables unsupervised machine learning solutions for record linkage problems.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01874v1
PDF	https://arxiv.org/pdf/1911.01874v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-the-probabilistic-method-of-record
Repo
Framework
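
The distribution named in the abstract, a finite mixture whose components are each the sum of a Bernoulli variable and an independent Poisson variable, has a simple closed-form probability mass function. The sketch below evaluates it; the function names and the example parameter values are hypothetical, and nothing here reproduces the paper's estimation procedure.

```python
import math


def bernoulli_plus_poisson_pmf(n, p, lam):
    """P(B + N = n) for independent B ~ Bernoulli(p) and N ~ Poisson(lam)."""
    def poisson(k):
        if k < 0:
            return 0.0
        return math.exp(-lam) * lam ** k / math.factorial(k)

    # Either B = 0 and the Poisson part equals n, or B = 1 and it equals n - 1.
    return (1 - p) * poisson(n) + p * poisson(n - 1)


def mixture_pmf(n, components):
    """components: iterable of (weight, p, lam); weights should sum to 1."""
    return sum(w * bernoulli_plus_poisson_pmf(n, p, lam)
               for w, p, lam in components)


# Hypothetical two-component example.
example = [(0.3, 0.9, 0.5), (0.7, 0.05, 2.0)]
probabilities = [mixture_pmf(n, example) for n in range(60)]
```

Each component has mean `p + lam`, so the mixture mean is the weighted sum of those values, which gives a quick sanity check on any fitted parameters.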

Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks


Title	Better Guider Predicts Future Better: Difference Guided Generative Adversarial Networks
Authors	Guohao Ying, Yingtian Zou, Lin Wan, Yiming Hu, Jiashi Feng
Abstract	Predicting the future may sound like fantasy, but it is a practical task. It is a key component of intelligent agents, such as self-driving vehicles, medical monitoring devices and robotics. In this work, we consider generating unseen future frames from previous observations, which is notoriously hard due to the uncertainty in frame dynamics. While recent works based on generative adversarial networks (GANs) have made remarkable progress, there is still an obstacle to making accurate and realistic predictions. In this paper, we propose a novel GAN based on inter-frame difference to circumvent the difficulties. More specifically, our model is a multi-stage generative network, which is named the Difference Guided Generative Adversarial Network (DGGAN). The DGGAN learns to explicitly enforce future-frame predictions that are guided by synthetic inter-frame difference. Given a sequence of frames, DGGAN first uses dual paths to generate meta information. One path, called Coarse Frame Generator, predicts the coarse details about future frames, and the other path, called Difference Guide Generator, generates the difference image, which includes complementary fine details. The coarse details are then refined via guidance of the difference image under the support of GANs. With this model and novel architecture, we achieve state-of-the-art performance for future video prediction on UCF-101 and KITTI.
Tasks	Video Prediction
Published	2019-01-07
URL	http://arxiv.org/abs/1901.01649v1
PDF	http://arxiv.org/pdf/1901.01649v1.pdf
PWC	https://paperswithcode.com/paper/better-guider-predicts-future-better
Repo
Framework

Multi-Task Learning with Contextualized Word Representations for Extented Named Entity Recognition


Title	Multi-Task Learning with Contextualized Word Representations for Extented Named Entity Recognition
Authors	Thai-Hoang Pham, Khai Mai, Nguyen Minh Trung, Nguyen Tuan Duc, Danushka Bolegala, Ryohei Sasano, Satoshi Sekine
Abstract	Fine-Grained Named Entity Recognition (FG-NER) is critical for many NLP applications. While classical named entity recognition (NER) has attracted a substantial amount of research, FG-NER is still an open research domain. The current state-of-the-art (SOTA) model for FG-NER relies heavily on manual efforts for building a dictionary and designing hand-crafted features. The end-to-end framework that achieved the SOTA result for NER is not competitive with the SOTA model for FG-NER. In this paper, we investigate how effective multi-task learning approaches are in an end-to-end framework for FG-NER in different aspects. Our experiments show that using multi-task learning approaches with contextualized word representations can help an end-to-end neural network model achieve SOTA results without using any additional manual effort for creating data and designing features.
Tasks	Multi-Task Learning, Named Entity Recognition
Published	2019-02-26
URL	http://arxiv.org/abs/1902.10118v1
PDF	http://arxiv.org/pdf/1902.10118v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-with-contextualized-word
Repo
Framework

Directional PointNet: 3D Environmental Classification for Wearable Robotics


Title	Directional PointNet: 3D Environmental Classification for Wearable Robotics
Authors	Kuangen Zhang, Jing Wang, Chenglong Fu
Abstract	Environmental information can provide reliable prior information about human motion intent, which can help a subject wearing a robotic device walk in complex environments. Previous researchers have utilized 1D signals and 2D images to classify environments, but these may suffer from self-occlusion. By comparison, a 3D point cloud can be more appropriate for depicting environments, so we propose a directional PointNet to classify 3D point clouds directly. By utilizing the orientation information of the point cloud, the directional PointNet can classify daily terrains, including level ground, up stairs, and down stairs, and the classification accuracy reaches 99% on the test set. Moreover, the directional PointNet is more efficient than the previous PointNet because the T-net, which is utilized to estimate the transformation of the point cloud, is removed in this research and the length of the global feature is optimized. The experimental results demonstrate that the directional PointNet can classify the environments robustly and efficiently.
Tasks
Published	2019-03-16
URL	http://arxiv.org/abs/1903.06846v2
PDF	http://arxiv.org/pdf/1903.06846v2.pdf
PWC	https://paperswithcode.com/paper/directional-pointnet-3d-environmental
Repo
Framework

Multi-Agent Learning in Network Zero-Sum Games is a Hamiltonian System


Title	Multi-Agent Learning in Network Zero-Sum Games is a Hamiltonian System
Authors	James P. Bailey, Georgios Piliouras
Abstract	Zero-sum games are natural, if informal, analogues of closed physical systems where no energy/utility can enter or exit. This analogy can be extended even further if we consider zero-sum network (polymatrix) games where multiple agents interact in a closed economy. Typically, (network) zero-sum games are studied from the perspective of Nash equilibria. Nevertheless, this contrasts with the way we typically think about closed physical systems, e.g., Earth-moon systems which move perpetually along recurrent trajectories of constant energy. We establish a formal and robust connection between multi-agent systems and Hamiltonian dynamics – the same dynamics that describe conservative systems in physics. Specifically, we show that no matter the size or network structure of such closed economies, even if agents use different online learning dynamics from the standard class of Follow-the-Regularized-Leader, they yield Hamiltonian dynamics. This approach generalizes the known connection to Hamiltonians for the special case of replicator dynamics in two-agent zero-sum games developed by Hofbauer. Moreover, our results extend beyond zero-sum settings and provide a type of Rosetta stone (see e.g. Table 1) that helps to translate results and techniques between online optimization, convex analysis, game theory, and physics.
Tasks
Published	2019-03-05
URL	http://arxiv.org/abs/1903.01720v1
PDF	http://arxiv.org/pdf/1903.01720v1.pdf
PWC	https://paperswithcode.com/paper/multi-agent-learning-in-network-zero-sum
Repo
Framework

Not All Parts Are Created Equal: 3D Pose Estimation by Modelling Bi-directional Dependencies of Body Parts


Title	Not All Parts Are Created Equal: 3D Pose Estimation by Modelling Bi-directional Dependencies of Body Parts
Authors	Jue Wang, Shaoli Huang, Xinchao Wang, Dacheng Tao
Abstract	Not all human body parts have the same degree of freedom (DOF) due to the physiological structure. For example, the limbs may move more flexibly and freely than the torso does. Most of the existing 3D pose estimation methods, despite the very promising results achieved, treat the body joints equally and consequently often lead to larger reconstruction errors on the limbs. In this paper, we propose a progressive approach that explicitly accounts for the distinct DOFs among the body parts. We model parts with higher DOFs like the elbows, as dependent components of the corresponding parts with lower DOFs like the torso, of which the 3D locations can be more reliably estimated. Meanwhile, the high-DOF parts may, in turn, impose a constraint on where the low-DOF ones lie. As a result, parts with different DOFs supervise one another, yielding physically constrained and plausible pose-estimation results. To further facilitate the prediction of the high-DOF parts, we introduce a pose-attribute estimation, where the relative location of a limb joint with respect to the torso, which has the least DOF of a human body, is explicitly estimated and further fed to the joint-estimation module. The proposed approach achieves very promising results, outperforming the state of the art on several benchmarks.
Tasks	3D Pose Estimation, Pose Estimation
Published	2019-05-20
URL	https://arxiv.org/abs/1905.07862v1
PDF	https://arxiv.org/pdf/1905.07862v1.pdf
PWC	https://paperswithcode.com/paper/not-all-parts-are-created-equal-3d-pose
Repo
Framework

Benchmarking Contemporary Deep Learning Hardware and Frameworks: A Survey of Qualitative Metrics


Title	Benchmarking Contemporary Deep Learning Hardware and Frameworks: A Survey of Qualitative Metrics
Authors	Wei Dai, Daniel Berleant
Abstract	This paper surveys benchmarking principles, machine learning devices including GPUs, FPGAs, and ASICs, and deep learning software frameworks. It also reviews these technologies with respect to benchmarking from the perspectives of a 6-metric approach to frameworks and an 11-metric approach to hardware platforms. Because MLPerf is a benchmark organization working with industry and academia, and offering deep learning benchmarks that evaluate training and inference on deep learning hardware devices, the survey also mentions MLPerf benchmark results, benchmark metrics, datasets, deep learning frameworks and algorithms. We summarize seven benchmarking principles, differential characteristics of mainstream AI devices, and qualitative comparison of deep learning hardware and frameworks.
Tasks
Published	2019-07-05
URL	https://arxiv.org/abs/1907.03626v4
PDF	https://arxiv.org/pdf/1907.03626v4.pdf
PWC	https://paperswithcode.com/paper/qualitative-benchmarking-of-deep-learning
Repo
Framework

Transcribing Content from Structural Images with Spotlight Mechanism


Title	Transcribing Content from Structural Images with Spotlight Mechanism
Authors	Yu Yin, Zhenya Huang, Enhong Chen, Qi Liu, Fuzheng Zhang, Xing Xie, Guoping Hu
Abstract	Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task as not only the content objects should be recognized, but the internal structure should also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but are not capable of identifying ones with more complex content (e.g., structured symbols), which often follow a fine-grained grammar. To this end, in this paper, we propose a hierarchical Spotlight Transcribing Network (STN) framework followed by a two-stage “where-to-what” solution. Specifically, we first decide “where-to-look” through a novel spotlight mechanism to focus on different areas of the original image following its structure. Then, we decide “what-to-write” by developing a GRU based network with the spotlight areas for transcribing the content accordingly. Moreover, we propose two implementations on the basis of STN, i.e., STNM and STNR, where the spotlight movement follows the Markov property and Recurrent modeling, respectively. We also design a reinforcement method to refine the framework by self-improving the spotlight mechanism. We conduct extensive experiments on many structural image datasets, where the results clearly demonstrate the effectiveness of the STN framework.
Tasks
Published	2019-05-27
URL	https://arxiv.org/abs/1905.10954v1
PDF	https://arxiv.org/pdf/1905.10954v1.pdf
PWC	https://paperswithcode.com/paper/transcribing-content-from-structural-images
Repo
Framework

Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing


Title	Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing
Authors	Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin
Abstract	In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code conditioned on the class environment. Our approach naturally combines a retrieval model and a meta-learner, where the former learns to find similar datapoints from the training data, and the latter considers retrieved datapoints as a pseudo task for fast adaptation. Specifically, our retriever is a context-aware encoder-decoder model with a latent variable which takes context environment into consideration, and our meta-learner learns to utilize retrieved datapoints in a model-agnostic meta-learning paradigm for fast adaptation. We conduct experiments on the CONCODE and CSQA datasets, where the context refers to the class environment in Java code and the conversational history, respectively. We use a sequence-to-action model as the base semantic parser, which achieves state-of-the-art accuracy on both datasets. Results show that both the context-aware retriever and the meta-learning strategy improve accuracy, and our approach performs better than retrieve-and-edit baselines.
Tasks	Meta-Learning, Semantic Parsing
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07108v1
PDF	https://arxiv.org/pdf/1906.07108v1.pdf
PWC	https://paperswithcode.com/paper/coupling-retrieval-and-meta-learning-for
Repo
Framework

Dynamic Term-Modal Logics for Epistemic Planning


Title	Dynamic Term-Modal Logics for Epistemic Planning
Authors	Andreas Achen, Andrés Occhipinti Liberman, Rasmus K. Rendsvig
Abstract	Classical planning frameworks are built on first-order languages. The first-order expressive power is desirable for compactly representing actions via schemas, and for specifying goal formulas such as $\neg\exists x\mathsf{blocks\_door}(x)$. In contrast, several recent epistemic planning frameworks build on propositional modal logic. The modal expressive power is desirable for investigating planning problems with epistemic goals such as $K_{a}\neg\mathsf{problem}$. The present paper presents an epistemic planning framework with the first-order expressiveness of classical planning, but extending fully to the epistemic operators. In this framework, e.g. $\exists xK_{x}\exists y\mathsf{blocks\_door}(y)$ is a formula. Logics with this expressive power are called “term-modal” in the literature. This paper presents a rich but well-behaved semantics for term-modal logic. The semantics are given a dynamic extension using first-order “action models” allowing for epistemic planning, and it is shown how corresponding “action schemas” allow for a very compact action representation. Concerning metatheory, the paper defines axiomatic normal term-modal logics, shows a Canonical Model Theorem-like result, presents non-standard frame characterization formulas, shows decidability for the finite agent case, and shows a general completeness result for the dynamic extension by reduction axioms.
Tasks
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06047v1
PDF	https://arxiv.org/pdf/1906.06047v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-term-modal-logics-for-epistemic
Repo
Framework