April 2, 2020

Paper Group ANR 388

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration. 3D Object Segmentation for Shelf Bin Picking by Humanoid with Deep Learning and Occupancy Voxel Grid Map. Recognition of Smoking Gesture Using Smart Watch Technology. Fact Check-Worthiness Detection as Positive Unlabelled Learning. Active Model Es …

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration

Title Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration
Authors Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Minyi Guo, Bin Yao
Abstract The research interest in specialized hardware accelerators for deep neural networks (DNN) spiked recently owing to their superior performance and efficiency. However, today’s DNN accelerators primarily focus on accelerating specific “kernels” such as convolution and matrix multiplication, which are vital but only part of an end-to-end DNN-enabled application. Meaningful speedups over the entire application often require supporting computations that are, while massively parallel, ill-suited to DNN accelerators. Integrating a general-purpose processor such as a CPU or a GPU incurs significant data movement overhead and leads to resource under-utilization on the DNN accelerators. We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications. The key to SMA is the temporal integration of the systolic execution model with the GPU-like SIMD execution model. The SMA exploits the common components shared between the systolic-array accelerator and the GPU, and provides lightweight reconfiguration capability to switch between the two modes in-situ. The SMA achieves up to 63% performance improvement while consuming 23% less energy than the baseline Volta architecture with TensorCore.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.08326v1
PDF https://arxiv.org/pdf/2002.08326v1.pdf
PWC https://paperswithcode.com/paper/balancing-efficiency-and-flexibility-for-dnn
Repo
Framework

3D Object Segmentation for Shelf Bin Picking by Humanoid with Deep Learning and Occupancy Voxel Grid Map

Title 3D Object Segmentation for Shelf Bin Picking by Humanoid with Deep Learning and Occupancy Voxel Grid Map
Authors Kentaro Wada, Masaki Murooka, Kei Okada, Masayuki Inaba
Abstract Picking objects in a narrow space such as shelf bins is an important task for a humanoid robot that must extract a target object from its environment. In those situations, however, there are many occlusions between the camera and the objects, which makes it difficult to segment the target object in three dimensions because three-dimensional sensor input is lacking. We address this problem by accumulating segmentation results from multiple camera angles and generating a voxel model of the target object. Our approach consists of two components: the first is object probability prediction for the input image with convolutional networks, and the second is generation of a voxel grid map designed for object segmentation. We evaluated the method with a picking-task experiment on target objects in narrow shelf bins. Our method generates dense 3D object segments even with occlusions, and the real robot successfully picked target objects from the narrow space.
Tasks Semantic Segmentation
Published 2020-01-15
URL https://arxiv.org/abs/2001.05406v2
PDF https://arxiv.org/pdf/2001.05406v2.pdf
PWC https://paperswithcode.com/paper/3d-object-segmentation-for-shelf-bin-picking
Repo
Framework
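The accumulation step described in the abstract above can be illustrated with a small sketch. It assumes a standard log-odds occupancy update, which may differ from the authors' voxel grid map; the function name, grid size, and probabilities are hypothetical.

```python
import numpy as np

def accumulate_log_odds(grid, voxel_indices, probs, eps=1e-6):
    """Add per-view object probabilities to a voxel grid in log-odds form.

    grid          : (X, Y, Z) float array of accumulated log-odds (modified in place)
    voxel_indices : (N, 3) int array, the voxel hit by each back-projected pixel
    probs         : (N,) object probabilities predicted for those pixels
    """
    p = np.clip(probs, eps, 1.0 - eps)          # avoid log(0) and division by zero
    log_odds = np.log(p / (1.0 - p))
    # np.add.at sums correctly even when several pixels fall in the same voxel
    np.add.at(grid, tuple(voxel_indices.T), log_odds)
    return grid

# Two hypothetical camera views observing a 2x2x2 grid
grid = np.zeros((2, 2, 2))
accumulate_log_odds(grid, np.array([[0, 0, 0], [1, 1, 1]]), np.array([0.9, 0.2]))
accumulate_log_odds(grid, np.array([[0, 0, 0]]), np.array([0.8]))

# Voxels with positive accumulated log-odds are treated as part of the object
object_voxels = grid > 0
```

Working in log-odds lets evidence from each view be summed, so a voxel occluded in one view can still be confirmed by another.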

Recognition of Smoking Gesture Using Smart Watch Technology

Title Recognition of Smoking Gesture Using Smart Watch Technology
Authors Casey A. Cole, Bethany Janos, Dien Anshari, James F. Thrasher, Scott Strayer, Homayoun Valafar
Abstract Diseases resulting from prolonged smoking are the most common preventable causes of death in the world today. In this report we investigate the success of utilizing accelerometer sensors in smart watches to identify smoking gestures. Early identification of smoking gestures can help to initiate the appropriate intervention method and prevent relapses in smoking. Our experiments indicate 85%-95% success rates in identifying the smoking gesture among other similar gestures using Artificial Neural Networks (ANNs). Our investigations concluded that information obtained from the x-dimension of accelerometers is the best means of identifying the smoking gesture, while the y and z dimensions are helpful in eliminating other gestures such as eating, drinking, and nose scratching. We utilized sensor data from the Apple Watch during the training of the ANN. Using sensor data from another participant collected on a Pebble Steel, we obtained a smoking identification accuracy of greater than 90% when using an ANN trained on data previously collected from the Apple Watch. Finally, we have demonstrated the possibility of using smart watches to perform continuous monitoring of daily activities.
Tasks
Published 2020-03-05
URL https://arxiv.org/abs/2003.02735v1
PDF https://arxiv.org/pdf/2003.02735v1.pdf
PWC https://paperswithcode.com/paper/recognition-of-smoking-gesture-using-smart
Repo
Framework

Fact Check-Worthiness Detection as Positive Unlabelled Learning

Title Fact Check-Worthiness Detection as Positive Unlabelled Learning
Authors Dustin Wright, Isabelle Augenstein
Abstract A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece of information should be checked for veracity. There are multiple isolated lines of research which address this core issue: check-worthiness detection from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. What is still lacking is a structured comparison of these variants of check-worthiness, as well as a unified approach to them. We find that check-worthiness detection is a very challenging task in any domain, because it both hinges upon detecting how factual a sentence is, and how likely a sentence is to be believed without verification. As such, annotators often only mark those instances they judge to be clear-cut check-worthy. Our best-performing method automatically corrects for this, using a variant of positive unlabelled learning, which learns when an instance annotated as not check-worthy should in fact have been annotated as being check-worthy. In applying this, we outperform the state of the art in two of the three domains studied for check-worthiness detection in English.
Tasks Rumour Detection
Published 2020-03-05
URL https://arxiv.org/abs/2003.02736v1
PDF https://arxiv.org/pdf/2003.02736v1.pdf
PWC https://paperswithcode.com/paper/fact-check-worthiness-detection-as-positive
Repo
Framework
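To make the positive unlabelled idea in the abstract above concrete, here is a minimal sketch of the classic Elkan and Noto score correction. This is a swapped-in textbook technique, not the variant the authors propose, and the function name and scores are hypothetical.

```python
import numpy as np

def elkan_noto_correct(scores, labelled_positive):
    """Rescale P(labelled | x) into an estimate of P(check-worthy | x).

    scores            : (N,) outputs of a classifier trained to separate
                        labelled from unlabelled instances
    labelled_positive : (N,) boolean mask of instances annotated as check-worthy
    """
    # c estimates P(labelled | truly positive); in practice it would be
    # computed on held-out labelled positives (an assumption of this sketch)
    c = scores[labelled_positive].mean()
    return np.clip(scores / c, 0.0, 1.0)

scores = np.array([0.4, 0.4, 0.2, 0.1])
labelled = np.array([True, True, False, False])
corrected = elkan_noto_correct(scores, labelled)
```

Unlabelled instances whose corrected score is high are candidates for having been wrongly left unannotated.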

Active Model Estimation in Markov Decision Processes

Title Active Model Estimation in Markov Decision Processes
Authors Jean Tarbouriech, Shubhanshu Shekhar, Matteo Pirotta, Mohammad Ghavamzadeh, Alessandro Lazaric
Abstract We study the problem of efficient exploration in order to learn an accurate model of an environment, modeled as a Markov decision process (MDP). Efficient exploration in this problem requires the agent to identify the regions in which estimating the model is more difficult and then exploit this knowledge to collect more samples there. In this paper, we formalize this problem, introduce the first algorithm to learn an $\epsilon$-accurate estimate of the dynamics, and provide its sample complexity analysis. While this algorithm enjoys strong guarantees in the large-sample regime, it tends to perform poorly in the early stages of exploration. To address this issue, we propose an algorithm that is based on maximum weighted entropy, a heuristic that stems from common sense and our theoretical analysis. The main idea here is to cover the entire state-action space with a weight proportional to the noise in the transitions. Using a number of simple domains with heterogeneous noise in their transitions, we show that our heuristic-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small-sample regime, while achieving asymptotic performance similar to that of the original algorithm.
Tasks Common Sense Reasoning, Efficient Exploration
Published 2020-03-06
URL https://arxiv.org/abs/2003.03297v1
PDF https://arxiv.org/pdf/2003.03297v1.pdf
PWC https://paperswithcode.com/paper/active-model-estimation-in-markov-decision
Repo
Framework

Deep Learning Guided Undersampling Mask Design for MR Image Reconstruction

Title Deep Learning Guided Undersampling Mask Design for MR Image Reconstruction
Authors Shengke Xue, Ruiliang Bai, Xinyu Jin
Abstract In this paper, we propose a cross-domain network that reconstructs undersampled MR images from raw k-space data. We design a 2D probability sampling mask layer to simulate the real undersampling operation. A 2D inverse FFT then reconstructs the MR image from the frequency domain to the spatial domain. By minimizing the Euclidean loss between the ground-truth image and the output, we train the parameters of the probability mask layer. We discover that, under certain undersampling rates, the learned probability shows special patterns that differ markedly from the common assumption that the mask should be Poisson-like. Our analysis indicates that the probability mask follows Gaussian or quadratic distributions, and we argue that this pattern is more accurate and robust than traditional ones. Extensive experiments show that the rules we discovered apply to most cases. This can be useful guidance for future MR reconstruction mask designs.
Tasks Common Sense Reasoning, Image Reconstruction
Published 2020-03-08
URL https://arxiv.org/abs/2003.03797v1
PDF https://arxiv.org/pdf/2003.03797v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-guided-undersampling-mask
Repo
Framework
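The pipeline in the abstract above (probability mask, undersampling, inverse FFT, Euclidean loss) can be sketched in NumPy as follows. This is only the forward pass under simplifying assumptions: the mask is sampled rather than learned, no network is involved, and all names are hypothetical.

```python
import numpy as np

def undersample_and_reconstruct(image, prob_mask, rng):
    """Zero-filled reconstruction from randomly undersampled k-space.

    prob_mask holds, for each (unshifted) k-space location, the probability
    of keeping that sample.
    """
    kspace = np.fft.fft2(image)                      # spatial -> frequency domain
    keep = rng.random(prob_mask.shape) < prob_mask   # Bernoulli draw per location
    recon = np.fft.ifft2(kspace * keep)              # frequency -> spatial domain
    return np.abs(recon), keep

def euclidean_loss(output, target):
    """Sum of squared differences between reconstruction and ground truth."""
    return float(np.sum((output - target) ** 2))

rng = np.random.default_rng(0)
image = rng.random((8, 8))

# Keeping every sample (probability 1 everywhere) recovers the image
recon, keep = undersample_and_reconstruct(image, np.ones((8, 8)), rng)
loss = euclidean_loss(recon, image)
```

Training the mask as the abstract describes would additionally require a differentiable relaxation of the Bernoulli draw, which this sketch does not attempt.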

Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization

Title Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization
Authors Marc Bocquet, Julien Brajard, Alberto Carrassi, Laurent Bertino
Abstract The reconstruction from observations of high-dimensional chaotic dynamics such as geophysical flows is hampered by (i) the partial and noisy observations that can realistically be obtained, (ii) the need to learn from long time series of data, and (iii) the unstable nature of the dynamics. To achieve such inference from the observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descents. In doing so, the model, the state trajectory and model error statistics are estimated all together. Implementations and approximations of these methods are discussed. Finally, we numerically and successfully test the approach on two relevant low-order chaotic models with distinct identifiability.
Tasks Bayesian Inference, Time Series
Published 2020-01-17
URL https://arxiv.org/abs/2001.06270v2
PDF https://arxiv.org/pdf/2001.06270v2.pdf
PWC https://paperswithcode.com/paper/bayesian-inference-of-dynamics-from-partial
Repo
Framework

A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing

Title A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing
Authors Chengchun Shi, Xiaoyu Wang, Shikai Luo, Rui Song, Hongtu Zhu, Jieping Ye
Abstract A/B testing, or online experimentation, is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts the current outcome as well as future outcomes. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing, while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating, so it is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., asymptotic distribution and power) of our testing procedure. Finally, we apply our framework to both synthetic datasets and a real-world data example obtained from a ride-sharing company to illustrate its usefulness.
Tasks
Published 2020-02-05
URL https://arxiv.org/abs/2002.01711v4
PDF https://arxiv.org/pdf/2002.01711v4.pdf
PWC https://paperswithcode.com/paper/a-reinforcement-learning-framework-for-time
Repo
Framework

For2For: Learning to forecast from forecasts

Title For2For: Learning to forecast from forecasts
Authors Shi Zhao, Ying Feng
Abstract This paper presents a time series forecasting framework which combines standard forecasting methods and a machine learning model. The inputs to the machine learning model are not lagged values or regular time series features, but instead forecasts produced by standard methods. The machine learning model can be either a convolutional neural network model or a recurrent neural network model. The intuition behind this approach is that forecasts of a time series are themselves good features characterizing the series, especially when the modelling purpose is forecasting. It can also be viewed as a weighted ensemble method. Tested on the M4 competition dataset, this approach outperforms all submissions for quarterly series, and is more accurate than all but the winning algorithm for monthly series.
Tasks Time Series, Time Series Forecasting
Published 2020-01-14
URL https://arxiv.org/abs/2001.04601v1
PDF https://arxiv.org/pdf/2001.04601v1.pdf
PWC https://paperswithcode.com/paper/for2for-learning-to-forecast-from-forecasts
Repo
Framework
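The core idea in the abstract above, feeding forecasts rather than lagged values to a learned model, can be illustrated by building the feature array. The three base methods below are simple stand-ins chosen for this sketch, not necessarily the standard methods used by the authors, and all names are hypothetical.

```python
import numpy as np

def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return np.full(horizon, history[-1])

def seasonal_naive_forecast(history, horizon, period):
    """Repeat the last observed season."""
    last_season = history[-period:]
    return np.array([last_season[h % period] for h in range(horizon)])

def mean_forecast(history, horizon):
    """Repeat the historical mean."""
    return np.full(horizon, history.mean())

def forecast_features(history, horizon, period):
    """Stack base forecasts into a (n_methods, horizon) feature array."""
    return np.stack([
        naive_forecast(history, horizon),
        seasonal_naive_forecast(history, horizon, period),
        mean_forecast(history, horizon),
    ])

history = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
features = forecast_features(history, horizon=4, period=4)
```

An array like `features` would then serve as the input to the convolutional or recurrent model, which learns how to combine the base forecasts.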

Dynamic and Distributed Online Convex Optimization for Demand Response of Commercial Buildings

Title Dynamic and Distributed Online Convex Optimization for Demand Response of Commercial Buildings
Authors Antoine Lesage-Landry, Duncan S. Callaway
Abstract We extend the regret analysis of the online distributed weighted dual averaging (DWDA) algorithm [1] to the dynamic setting and provide the tightest dynamic regret bound known to date for a distributed online convex optimization (OCO) algorithm. Our bound is linear in the cumulative difference between consecutive optima and does not depend explicitly on the time horizon. We use dynamic-online DWDA (D-ODWDA) and formulate a performance-guaranteed distributed online demand response approach for heating, ventilation, and air-conditioning (HVAC) systems of commercial buildings. We show the performance of our approach for fast timescale demand response in numerical simulations and obtain demand response decisions that closely reproduce the centralized optimal ones.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2002.00099v1
PDF https://arxiv.org/pdf/2002.00099v1.pdf
PWC https://paperswithcode.com/paper/dynamic-and-distributed-online-convex
Repo
Framework

Algorithm-Based Fault Tolerance for Convolutional Neural Networks

Title Algorithm-Based Fault Tolerance for Convolutional Neural Networks
Authors Kai Zhao, Sheng Di, Sihuan Li, Xin Liang, Yujia Zhai, Jieyang Chen, Kaiming Ouyang, Franck Cappello, Zizhong Chen
Abstract Convolutional neural networks (CNNs) are becoming more and more important for solving challenging and critical problems in many fields. CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage. Of critical importance is ensuring the stability of the CNN inference process against soft errors. Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components, instruction duplication techniques incur high overhead, and existing algorithm-based fault tolerance (ABFT) schemes cannot protect all convolution implementations. In this paper, we focus on how to protect the CNN inference process against soft errors as efficiently as possible, with the following three contributions. (1) We propose several systematic ABFT schemes based on checksum techniques and analyze their pros and cons thoroughly. Unlike traditional ABFT based on matrix-matrix multiplication, our schemes support any convolution implementations. (2) We design a novel workflow integrating all the proposed schemes to obtain a high detection/correction ability with limited total runtime overhead. (3) We perform our evaluation using ImageNet with well-known CNN models including AlexNet, VGG-19, ResNet-18, and YOLOv2. Experimental results demonstrate that our implementation can handle soft errors with very limited runtime overhead (4%~8% in both error-free and error-injected situations).
Tasks
Published 2020-03-27
URL https://arxiv.org/abs/2003.12203v1
PDF https://arxiv.org/pdf/2003.12203v1.pdf
PWC https://paperswithcode.com/paper/algorithm-based-fault-tolerance-for
Repo
Framework
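The checksum idea in the abstract above rests on the linearity of convolution: summing the outputs over all filters must equal the convolution with the summed filter. The sketch below shows one such check for detection only; it is an illustration under that assumption, not a reproduction of the authors' schemes, and the function names are hypothetical.

```python
import numpy as np

def conv2d(x, w):
    """Valid cross-correlation of x (C, H, W) with kernels w (K, C, R, S)."""
    K, C, R, S = w.shape
    h_out, w_out = x.shape[1] - R + 1, x.shape[2] - S + 1
    out = np.zeros((K, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + R, j:j + S]
            # Contract channel and spatial axes, leaving one value per kernel
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def filter_checksum_ok(x, w, out):
    """Compare the sum of output channels with the output of the summed kernel."""
    w_sum = w.sum(axis=0, keepdims=True)        # checksum kernel, shape (1, C, R, S)
    expected = conv2d(x, w_sum)[0]
    return bool(np.allclose(out.sum(axis=0), expected))

rng = np.random.default_rng(0)
x = rng.random((3, 6, 6))
w = rng.random((4, 3, 3, 3))

out = conv2d(x, w)
clean_ok = filter_checksum_ok(x, w, out)

corrupted = out.copy()
corrupted[1, 0, 0] += 1.0                        # simulate a soft error in one output
corrupted_ok = filter_checksum_ok(x, w, corrupted)
```

Because the check only uses the inputs and outputs of the layer, it does not depend on how the convolution itself is implemented.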

A Novel Deep Learning Architecture for Decoding Imagined Speech from EEG

Title A Novel Deep Learning Architecture for Decoding Imagined Speech from EEG
Authors Jerrin Thomas Panachakel, A. G. Ramakrishnan, T. V. Ananthapadmanabha
Abstract The recent advances in the field of deep learning have not been fully utilised for decoding imagined speech, primarily because of the unavailability of sufficient training samples to train a deep network. In this paper, we present a novel architecture that employs a deep neural network (DNN) for classifying the words “in” and “cooperate” from the corresponding EEG signals in the ASU imagined speech dataset. Nine EEG channels, which best capture the underlying cortical activity, are chosen using common spatial pattern (CSP) and are treated as independent data vectors. Discrete wavelet transform (DWT) is used for feature extraction. To the best of our knowledge, a DNN has not so far been employed as a classifier in decoding imagined speech. Treating the selected EEG channels corresponding to each imagined word as independent data vectors helps in providing a sufficient number of samples to train a DNN. For each test trial, the final class label is obtained by applying majority voting to the classification results of the individual channels considered in the trial. We have achieved accuracies comparable to the state-of-the-art results. The results can be further improved by using a higher-density EEG acquisition system in conjunction with other deep learning techniques such as long short-term memory.
Tasks EEG
Published 2020-03-19
URL https://arxiv.org/abs/2003.09374v1
PDF https://arxiv.org/pdf/2003.09374v1.pdf
PWC https://paperswithcode.com/paper/a-novel-deep-learning-architecture-for
Repo
Framework

Decoding Imagined Speech using Wavelet Features and Deep Neural Networks

Title Decoding Imagined Speech using Wavelet Features and Deep Neural Networks
Authors Jerrin Thomas Panachakel, A. G. Ramakrishnan
Abstract This paper proposes a novel approach that uses deep neural networks for classifying imagined speech, significantly increasing the classification accuracy. The proposed approach employs only the EEG channels over specific areas of the brain for classification, and derives distinct feature vectors from each of those channels. This gives us more data to train a classifier, enabling us to use deep learning approaches. Wavelet and temporal domain features are extracted from each channel. The final class label of each test trial is obtained by applying majority voting to the classification results of the individual channels considered in the trial. This approach is used for classifying all 11 prompts in the KaraOne dataset of imagined speech. The proposed architecture and the approach of treating the data have resulted in an average classification accuracy of 57.15%, which is an improvement of around 35% over the state-of-the-art results.
Tasks EEG
Published 2020-03-19
URL https://arxiv.org/abs/2003.10433v1
PDF https://arxiv.org/pdf/2003.10433v1.pdf
PWC https://paperswithcode.com/paper/decoding-imagined-speech-using-wavelet
Repo
Framework
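The per-trial decision rule described in the abstract above, majority voting over channel-level predictions, can be sketched as follows. The function name and the example labels are hypothetical, and the tie-breaking rule is an assumption of this sketch.

```python
from collections import Counter

def majority_vote(channel_predictions):
    """Return the label predicted by the most channels in one trial.

    Ties are broken in favour of the label that appears first in the list,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(channel_predictions)
    return counts.most_common(1)[0][0]

# Hypothetical predictions from five channels for a single test trial
trial_label = majority_vote(["a", "b", "a", "a", "b"])
```

Voting across channels lets each channel act as a separate training sample while still producing one label per trial.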

FuDGE: Functional Differential Graph Estimation with fully and discretely observed curves

Title FuDGE: Functional Differential Graph Estimation with fully and discretely observed curves
Authors Boxin Zhao, Y. Samuel Wang, Mladen Kolar
Abstract We consider the problem of estimating the difference between two functional undirected graphical models with shared structures. In many applications, data are naturally regarded as high-dimensional random function vectors rather than multivariate scalars. For example, electroencephalography (EEG) data are more appropriately treated as functions of time. In these problems, not only can the number of functions measured per sample be large, but each function is itself an infinite dimensional object, making estimation of model parameters challenging. In practice, curves are usually discretely observed, which makes graph structure recovery even more challenging. We formally characterize when two functional graphical models are comparable and propose a method that directly estimates the functional differential graph, which we term FuDGE. FuDGE avoids separate estimation of each graph, which allows for estimation in problems where individual graphs are dense, but their difference is sparse. We show that FuDGE consistently estimates the functional differential graph in a high-dimensional setting for both discretely observed and fully observed function paths. We illustrate finite sample properties of our method through simulation studies. In order to demonstrate the benefits of our method, we propose Joint Functional Graphical Lasso as a competitor, which is a generalization of the Joint Graphical Lasso. Finally, we apply our method to EEG data to uncover differences in functional brain connectivity between alcoholics and control subjects.
Tasks EEG
Published 2020-03-11
URL https://arxiv.org/abs/2003.05402v1
PDF https://arxiv.org/pdf/2003.05402v1.pdf
PWC https://paperswithcode.com/paper/fudge-functional-differential-graph
Repo
Framework

Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization

Title Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization
Authors Haiyang Xu, Yun Wang, Kun Han, Baochang Ma, Junwen Chen, Xiangang Li
Abstract Abstractive text summarization is a challenging task, and one needs to design a mechanism to effectively extract salient information from the source text and then generate a summary. Parsing the source text yields critical syntactic or semantic structures, which are useful for generating a more accurate summary. However, modeling a parsing tree for text summarization is not trivial due to its non-linear structure, and it is harder to deal with a document that includes multiple sentences and their parsing trees. In this paper, we propose to use a graph to connect the parsing trees from the sentences in a document and utilize stacked graph convolutional networks (GCNs) to learn the syntactic representation of a document. The selective attention mechanism is used to extract salient information in both semantic and structural aspects and generate an abstractive summary. We evaluate our approach on the CNN/Daily Mail text summarization dataset. The experimental results show that the proposed GCN-based selective attention approach outperforms the baselines and achieves state-of-the-art performance on the dataset.
Tasks Abstractive Text Summarization, Document Summarization, Text Summarization
Published 2020-03-18
URL https://arxiv.org/abs/2003.08004v1
PDF https://arxiv.org/pdf/2003.08004v1.pdf
PWC https://paperswithcode.com/paper/selective-attention-encoders-by-syntactic
Repo
Framework