Paper Group ANR 585
Learning Memory Access Patterns
Title | Learning Memory Access Patterns |
Authors | Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan |
Abstract | The explosion in workload complexity and the recent slow-down in Moore’s law scaling call for new approaches towards efficient computing. Researchers are now beginning to use recent advances in machine learning in software optimizations, augmenting or replacing traditional heuristics and data structures. However, the space of machine learning for computer hardware architecture is only lightly explored. In this paper, we demonstrate the potential of deep learning to address the von Neumann bottleneck of memory performance. We focus on the critical problem of learning memory access patterns, with the goal of constructing accurate and efficient memory prefetchers. We relate contemporary prefetching strategies to n-gram models in natural language processing, and show how recurrent neural networks can serve as a drop-in replacement. On a suite of challenging benchmark datasets, we find that neural networks consistently demonstrate superior performance in terms of precision and recall. This work represents the first step towards practical neural-network based prefetching, and opens a wide range of exciting directions for machine learning in computer architecture research. |
Tasks | |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02329v1 |
http://arxiv.org/pdf/1803.02329v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-memory-access-patterns |
Repo | |
Framework | |
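To make the n-gram/RNN analogy above concrete, here is a minimal PyTorch sketch of an LSTM that predicts the next quantized address delta from a history of deltas. The vocabulary size, dimensions, and class name are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class DeltaPrefetcher(nn.Module):
    """Sketch: treat quantized address deltas as a vocabulary, analogous to an
    n-gram language model, and predict the next delta with an LSTM."""
    def __init__(self, num_deltas=50000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_deltas, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_deltas)

    def forward(self, delta_ids):            # (batch, seq_len) int64 indices
        x = self.embed(delta_ids)
        out, _ = self.lstm(x)
        return self.head(out)                # logits over the next delta

model = DeltaPrefetcher()
logits = model(torch.randint(0, 50000, (8, 32)))
print(logits.shape)  # torch.Size([8, 32, 50000])
```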
End-to-end Deep Learning from Raw Sensor Data: Atrial Fibrillation Detection using Wearables
Title | End-to-end Deep Learning from Raw Sensor Data: Atrial Fibrillation Detection using Wearables |
Authors | Igor Gotlibovych, Stuart Crawford, Dileep Goyal, Jiaqi Liu, Yaniv Kerem, David Benaron, Defne Yilmaz, Gregory Marcus, Yihan Li |
Abstract | We present a convolutional-recurrent neural network architecture with long short-term memory for real-time processing and classification of digital sensor data. The network implicitly performs typical signal processing tasks such as filtering and peak detection, and learns time-resolved embeddings of the input signal. We use a prototype multi-sensor wearable device to collect over 180h of photoplethysmography (PPG) data sampled at 20Hz, of which 36h are during atrial fibrillation (AFib). We use end-to-end learning to achieve state-of-the-art results in detecting AFib from raw PPG data. For classification labels output every 0.8s, we demonstrate an area under ROC curve of 0.9999, with false positive and false negative rates both below $2\times 10^{-3}$. This constitutes a significant improvement on previous results utilising domain-specific feature engineering, such as heart rate extraction, and brings large-scale atrial fibrillation screenings within imminent reach. |
Tasks | Atrial Fibrillation Detection, Feature Engineering, Photoplethysmography (PPG) |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10707v1 |
http://arxiv.org/pdf/1807.10707v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-deep-learning-from-raw-sensor-data |
Repo | |
Framework | |
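As a rough illustration of the convolutional-recurrent idea, the PyTorch sketch below stacks strided 1-D convolutions over raw 20 Hz PPG with an LSTM on top, producing one AFib probability per ~0.8 s (16-sample) window. All layer sizes are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class PPGAFibNet(nn.Module):
    """Conv1d front end over raw 20 Hz PPG, LSTM over the resulting features,
    one AFib probability per output step (total stride 16 -> ~0.8 s per label)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64, 1)

    def forward(self, ppg):                   # (batch, samples)
        x = self.conv(ppg.unsqueeze(1))       # (batch, 32, samples/16)
        x, _ = self.lstm(x.transpose(1, 2))   # (batch, steps, 64)
        return torch.sigmoid(self.head(x)).squeeze(-1)

probs = PPGAFibNet()(torch.randn(4, 20 * 60))  # one minute of PPG per example
print(probs.shape)  # torch.Size([4, 75])
```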
Complex Fully Convolutional Neural Networks for MR Image Reconstruction
Title | Complex Fully Convolutional Neural Networks for MR Image Reconstruction |
Authors | Muneer Ahmad Dedmari, Sailesh Conjeti, Santiago Estrada, Phillip Ehses, Tony Stöcker, Martin Reuter |
Abstract | Undersampling the k-space data is widely adopted for acceleration of Magnetic Resonance Imaging (MRI). Current deep learning based approaches for supervised learning of MRI image reconstruction employ real-valued operations and representations by treating complex valued k-space/spatial-space as real values. In this paper, we propose complex dense fully convolutional neural network ($\mathbb{C}$DFNet) for learning to de-alias the reconstruction artifacts within undersampled MRI images. We fashioned a densely-connected fully convolutional block tailored for complex-valued inputs by introducing dedicated layers such as complex convolution, batch normalization, non-linearities etc. $\mathbb{C}$DFNet leverages the inherently complex-valued nature of input k-space and learns richer representations. We demonstrate improved perceptual quality and recovery of anatomical structures through $\mathbb{C}$DFNet in contrast to its real-valued counterparts. |
Tasks | Image Reconstruction |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03343v1 |
http://arxiv.org/pdf/1807.03343v1.pdf | |
PWC | https://paperswithcode.com/paper/complex-fully-convolutional-neural-networks |
Repo | |
Framework | |
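The core building block is complex convolution, which can be emulated with two real-valued convolutions via $(x_r + i x_i)(W_r + i W_i) = (W_r x_r - W_i x_i) + i(W_r x_i + W_i x_r)$. The sketch below shows this composition; it illustrates the idea rather than the paper's exact $\mathbb{C}$DFNet layer.

```python
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    """Complex convolution via two real convolutions applied to the real and
    imaginary parts of the input (illustrative sketch)."""
    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)  # real weights
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)  # imaginary weights

    def forward(self, x_r, x_i):
        real = self.conv_r(x_r) - self.conv_i(x_i)
        imag = self.conv_r(x_i) + self.conv_i(x_r)
        return real, imag

layer = ComplexConv2d(1, 8, 3, padding=1)
real, imag = layer(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64))
print(real.shape, imag.shape)  # torch.Size([2, 8, 64, 64]) each
```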
Geo-Text Data and Data-Driven Geospatial Semantics
Title | Geo-Text Data and Data-Driven Geospatial Semantics |
Authors | Yingjie Hu |
Abstract | Many datasets nowadays contain links between geographic locations and natural language texts. These links can be geotags, such as geotagged tweets or geotagged Wikipedia pages, in which location coordinates are explicitly attached to texts. These links can also be place mentions, such as those in news articles, travel blogs, or historical archives, in which texts are implicitly connected to the mentioned places. This kind of data is referred to as geo-text data. The availability of large amounts of geo-text data brings both challenges and opportunities. On the one hand, it is challenging to automatically process this kind of data due to the unstructured texts and the complex spatial footprints of some places. On the other hand, geo-text data offers unique research opportunities through the rich information contained in texts and the special links between texts and geography. As a result, geo-text data facilitates various studies especially those in data-driven geospatial semantics. This paper discusses geo-text data and related concepts. With a focus on data-driven research, this paper systematically reviews a large number of studies that have discovered multiple types of knowledge from geo-text data. Based on the literature review, a generalized workflow is extracted and key challenges for future work are discussed. |
Tasks | |
Published | 2018-09-15 |
URL | http://arxiv.org/abs/1809.05636v1 |
http://arxiv.org/pdf/1809.05636v1.pdf | |
PWC | https://paperswithcode.com/paper/geo-text-data-and-data-driven-geospatial |
Repo | |
Framework | |
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Title | On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization |
Authors | Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu |
Abstract | Adaptive gradient methods are workhorses in deep learning. However, the convergence guarantees of adaptive gradient methods for nonconvex optimization have not been sufficiently studied. In this paper, we provide a sharp analysis of a recently proposed adaptive gradient method, namely the partially adaptive momentum estimation method (Padam) (Chen and Gu, 2018), which admits many existing adaptive gradient methods such as RMSProp and AMSGrad as special cases. Our analysis shows that, for smooth nonconvex functions, Padam converges to a first-order stationary point at the rate of $O\big(\big(\sum_{i=1}^d\|\mathbf{g}_{1:T,i}\|_2\big)^{1/2}/T^{3/4} + d/T\big)$, where $T$ is the number of iterations, $d$ is the dimension, $\mathbf{g}_1,\ldots,\mathbf{g}_T$ are the stochastic gradients, and $\mathbf{g}_{1:T,i} = [g_{1,i},g_{2,i},\ldots,g_{T,i}]^\top$. Our theoretical result also suggests that in order to achieve a faster convergence rate, it is necessary to use Padam instead of AMSGrad. This is well-aligned with the empirical results of deep learning reported in Chen and Gu (2018). |
Tasks | |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05671v2 |
http://arxiv.org/pdf/1808.05671v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-of-adaptive-gradient |
Repo | |
Framework | |
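For reference, a minimal NumPy sketch of one Padam step is given below. The partial-adaptivity exponent p ∈ (0, 1/2] recovers AMSGrad at p = 1/2 and behaves more like SGD with momentum as p shrinks; the hyperparameter values and the toy problem are illustrative.

```python
import numpy as np

def padam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999, p=0.125, eps=1e-8):
    """One Padam update (sketch): Adam-style moments, AMSGrad-style max on the
    second moment, and a partially adaptive exponent p on the denominator."""
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    v_hat = np.maximum(v_hat, v)              # keep v_hat monotone (AMSGrad)
    theta = theta - lr * m / (v_hat**p + eps)
    return theta, (m, v, v_hat)

# toy usage: minimize f(x) = x^2, whose gradient is 2x
theta, state = np.array([5.0]), (np.zeros(1), np.zeros(1), np.zeros(1))
for _ in range(200):
    theta, state = padam_step(theta, 2 * theta, state)
print(theta)  # close to 0
```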
The Toybox Dataset of Egocentric Visual Object Transformations
Title | The Toybox Dataset of Egocentric Visual Object Transformations |
Authors | Xiaohan Wang, Tengyu Ma, James Ainooson, Seunghwan Cha, Xiaotian Wang, Azhar Molla, Maithilee Kunda |
Abstract | In object recognition research, many commonly used datasets (e.g., ImageNet and similar) contain relatively sparse distributions of object instances and views, e.g., one might see a thousand different pictures of a thousand different giraffes, mostly taken from a few conventionally photographed angles. These distributional properties constrain the types of computational experiments that can be conducted with such datasets, and also do not reflect naturalistic patterns of embodied visual experience. As a contribution to the small (but growing) number of multi-view object datasets that have been created to bridge this gap, we introduce a new video dataset called Toybox that contains egocentric (i.e., first-person perspective) videos of common household objects and toys being manually manipulated to undergo structured transformations, such as rotation, translation, and zooming. To illustrate potential uses of Toybox, we also present initial neural network experiments that examine 1) how training on different distributions of object instances and views affects recognition performance, and 2) how viewpoint-dependent object concepts are represented within the hidden layers of a trained network. |
Tasks | Object Recognition |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06034v3 |
http://arxiv.org/pdf/1806.06034v3.pdf | |
PWC | https://paperswithcode.com/paper/the-toybox-dataset-of-egocentric-visual |
Repo | |
Framework | |
Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Convolutional Neural Network
Title | Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Convolutional Neural Network |
Authors | Rezoana Bente Arif, Md. Abu Bakr Siddique, Mohammad Mahmudur Rahman Khan, Mahjabin Rahman Oishe |
Abstract | Nowadays, deep learning can be applied to a wide range of fields, including medicine and engineering. In deep learning, the Convolutional Neural Network (CNN) is extensively used in pattern and sequence recognition, video analysis, natural language processing, spam detection, topic categorization, regression analysis, speech recognition, image classification, object detection, segmentation, face recognition, robotics, and control. The benefits associated with its near-human-level accuracy in large applications have led to the growing acceptance of CNNs in recent years. The primary contribution of this paper is to analyze the impact of the pattern of the hidden layers of a CNN on the overall performance of the network. To demonstrate this influence, we applied neural networks with different numbers of layers to the Modified National Institute of Standards and Technology (MNIST) dataset. A further goal is to observe how the accuracy of the network varies with the number of hidden layers and epochs, and to compare and contrast these configurations. The system is trained using stochastic gradient descent with the backpropagation algorithm and tested with the feedforward algorithm. |
Tasks | Face Recognition, Image Classification, Object Detection, Speech Recognition |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06187v3 |
http://arxiv.org/pdf/1809.06187v3.pdf | |
PWC | https://paperswithcode.com/paper/study-and-observation-of-the-variations-of-1 |
Repo | |
Framework | |
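To illustrate the kind of depth sweep the paper describes, the following PyTorch sketch builds MNIST-sized CNNs with a configurable number of hidden convolutional layers; the filter counts and layout are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

def build_mnist_cnn(num_hidden_conv_layers=2):
    """Build a small CNN for 28x28 MNIST digits with a configurable number of
    hidden convolutional layers, so accuracy can be compared across depths."""
    layers, ch = [], 1
    for _ in range(num_hidden_conv_layers):
        layers += [nn.Conv2d(ch, 32, kernel_size=3, padding=1), nn.ReLU()]
        ch = 32
    layers += [nn.Flatten(), nn.Linear(32 * 28 * 28, 10)]
    return nn.Sequential(*layers)

for depth in (1, 2, 3):
    logits = build_mnist_cnn(depth)(torch.randn(16, 1, 28, 28))
    print(depth, logits.shape)  # torch.Size([16, 10]) for each depth
```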
Adaptive Kernel Estimation of the Spectral Density with Boundary Kernel Analysis
Title | Adaptive Kernel Estimation of the Spectral Density with Boundary Kernel Analysis |
Authors | Alexander Sidorenko, Kurt S. Riedel |
Abstract | A hybrid estimator of the log-spectral density of a stationary time series is proposed. First, a multiple taper estimate is performed, followed by kernel smoothing of the log-multitaper estimate. This procedure reduces the expected mean square error by $\left(\frac{\pi^2}{4}\right)^{0.8}$ over simply smoothing the log tapered periodogram. The optimal number of tapers is $O(N^{8/15})$. A data-adaptive implementation of a variable bandwidth kernel smoother is given. When the spectral density is discontinuous, one-sided smoothing estimates are used. |
Tasks | Time Series |
Published | 2018-03-11 |
URL | https://arxiv.org/abs/1803.03906v1 |
https://arxiv.org/pdf/1803.03906v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-kernel-estimation-of-the-spectral |
Repo | |
Framework | |
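A compact NumPy sketch of the hybrid estimator follows: a multitaper spectrum with sine tapers, then kernel smoothing of its logarithm. It uses a fixed Gaussian bandwidth rather than the paper's data-adaptive variable bandwidth and boundary kernels, so it only illustrates the overall pipeline.

```python
import numpy as np

def log_multitaper_smoothed(x, num_tapers=8, bandwidth=0.02):
    """Multitaper estimate with sine tapers, followed by Gaussian-kernel
    smoothing of the log-multitaper estimate (fixed bandwidth, for illustration)."""
    n = len(x)
    t = np.arange(1, n + 1)
    freqs = np.fft.rfftfreq(n)
    spectra = []
    for k in range(1, num_tapers + 1):
        # sine taper h_k(t) = sqrt(2/(n+1)) * sin(pi k t / (n+1))
        taper = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * t / (n + 1))
        spectra.append(np.abs(np.fft.rfft(taper * x)) ** 2)
    log_mt = np.log(np.mean(spectra, axis=0) + 1e-12)
    # kernel smoothing of the log-multitaper estimate across frequency
    weights = np.exp(-0.5 * ((freqs[:, None] - freqs[None, :]) / bandwidth) ** 2)
    return freqs, (weights @ log_mt) / weights.sum(axis=1)

freqs, log_spec = log_multitaper_smoothed(np.random.randn(1024))
print(freqs.shape, log_spec.shape)  # (513,) (513,)
```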
What and Where: A Context-based Recommendation System for Object Insertion
Title | What and Where: A Context-based Recommendation System for Object Insertion |
Authors | Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xin Dong, Dun Liang, Peter Hall, Shi-Min Hu |
Abstract | In this work, we propose a novel topic consisting of two dual tasks: 1) given a scene, recommend objects to insert, 2) given an object category, retrieve suitable background scenes. A bounding box for the inserted object is predicted in both tasks, which helps downstream applications such as semi-automated advertising and video composition. The major challenge lies in the fact that the target object is neither present nor localized at test time, whereas available datasets only provide scenes with existing objects. To tackle this problem, we build an unsupervised algorithm based on object-level contexts, which explicitly models the joint probability distribution of object categories and bounding boxes with a Gaussian mixture model. Experiments on our newly annotated test set demonstrate that our system outperforms existing baselines on all subtasks, and does so under a unified framework. Our contribution promises future extensions and applications. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09783v1 |
http://arxiv.org/pdf/1811.09783v1.pdf | |
PWC | https://paperswithcode.com/paper/what-and-where-a-context-based-recommendation |
Repo | |
Framework | |
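One simple way to realize a joint model over categories and boxes is the factorization p(category, box) = p(category) p(box | category), with a Gaussian mixture over normalized box coordinates per category. The scikit-learn sketch below illustrates this; it ignores scene context and is not necessarily the paper's exact formulation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

class ObjectInsertionModel:
    """Illustrative joint model p(category, box) = p(category) * p(box | category),
    with one Gaussian mixture over normalized (cx, cy, w, h) boxes per category."""
    def __init__(self, n_components=3):
        self.n_components = n_components
        self.priors, self.gmms = {}, {}

    def fit(self, categories, boxes):                 # boxes: (N, 4) in [0, 1]
        categories, boxes = np.asarray(categories), np.asarray(boxes)
        for c in np.unique(categories):
            mask = categories == c
            self.priors[c] = mask.mean()
            self.gmms[c] = GaussianMixture(self.n_components).fit(boxes[mask])

    def recommend(self, top_k=1):
        """Rank categories by prior probability and sample a plausible box for each."""
        best = sorted(self.priors, key=self.priors.get, reverse=True)[:top_k]
        return [(c, self.gmms[c].sample(1)[0][0]) for c in best]

rng = np.random.default_rng(0)
model = ObjectInsertionModel()
model.fit(rng.integers(0, 3, 500), rng.random((500, 4)))  # synthetic annotations
print(model.recommend(top_k=2))
```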
Simple Fusion: Return of the Language Model
Title | Simple Fusion: Return of the Language Model |
Authors | Felix Stahlberg, James Cross, Veselin Stoyanov |
Abstract | Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation. We investigate an alternative simple method to use monolingual data for NMT training: We combine the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch. To achieve that, we train the translation model to predict the residual probability of the training data added to the prediction of the LM. This enables the TM to focus its capacity on modeling the source sentence since it can rely on the LM for fluency. We show that our method outperforms previous approaches to integrate LMs into NMT while the architecture is simpler as it does not require gating networks to balance TM and LM. We observe gains of between +0.24 and +2.36 BLEU on all four test sets (English-Turkish, Turkish-English, Estonian-English, Xhosa-English) on top of ensembles without LM. We compare our method with alternative ways to utilize monolingual data such as backtranslation, shallow fusion, and cold fusion. |
Tasks | Language Modelling, Machine Translation |
Published | 2018-09-01 |
URL | http://arxiv.org/abs/1809.00125v2 |
http://arxiv.org/pdf/1809.00125v2.pdf | |
PWC | https://paperswithcode.com/paper/simple-fusion-return-of-the-language-model |
Repo | |
Framework | |
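The fusion step itself is a one-liner: the trainable TM scores are added to the frozen LM log-probabilities before normalization, so the TM learns a residual on top of the LM. The PyTorch sketch below shows this combination; shapes and the loss are illustrative.

```python
import torch
import torch.nn.functional as F

def simple_fusion_logprobs(tm_logits, lm_logprobs):
    """Simple-fusion sketch: add the translation model's scores to the fixed
    language model's log-probabilities, then renormalize."""
    return torch.log_softmax(tm_logits + lm_logprobs, dim=-1)

vocab = 32000
tm_logits = torch.randn(4, vocab)                               # trainable TM scores
lm_logprobs = torch.log_softmax(torch.randn(4, vocab), dim=-1)  # frozen, pre-trained LM
log_p = simple_fusion_logprobs(tm_logits, lm_logprobs)
loss = F.nll_loss(log_p, torch.randint(0, vocab, (4,)))         # train the TM as usual
print(loss.item())
```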
TOP-GAN: Label-Free Cancer Cell Classification Using Deep Learning with a Small Training Set
Title | TOP-GAN: Label-Free Cancer Cell Classification Using Deep Learning with a Small Training Set |
Authors | Moran Rubin, Omer Stein, Nir A. Turko, Yoav Nygate, Darina Roitshtain, Lidor Karako, Itay Barnea, Raja Giryes, Natan T. Shaked |
Abstract | We propose a new deep learning approach for medical imaging that copes with the problem of a small training set, the main bottleneck of deep learning, and apply it for classification of healthy and cancer cells acquired by quantitative phase imaging. The proposed method, called transferring of pre-trained generative adversarial network (TOP-GAN), is a hybridization between transfer learning and generative adversarial networks (GANs). Healthy cells and cancer cells of different metastatic potential have been imaged by low-coherence off-axis holography. After the acquisition, the optical path delay maps of the cells have been extracted and directly used as an input to the deep networks. In order to cope with the small number of classified images, we have used GANs trained on a large number of unclassified images from another cell type (sperm cells). After this preliminary training, and after replacing the last layers of the network with new ones, we have designed an automatic classifier for the correct cell type (healthy/primary cancer/metastatic cancer) with 90-99% accuracy, although small training sets of down to several images have been used. These results are better in comparison to other classic methods that aim at coping with the same problem of a small training set. We believe that our approach makes the combination of holographic microscopy and deep learning networks more accessible to the medical field by enabling a rapid, automatic and accurate classification in stain-free imaging flow cytometry. Furthermore, our approach is expected to be applicable to many other medical image classification tasks suffering from a small training set. |
Tasks | Image Classification, Transfer Learning |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.11006v1 |
http://arxiv.org/pdf/1812.11006v1.pdf | |
PWC | https://paperswithcode.com/paper/top-gan-label-free-cancer-cell-classification |
Repo | |
Framework | |
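A schematic PyTorch sketch of the transfer step follows: the convolutional features of a GAN discriminator (pre-trained on unlabeled images of another cell type) are kept, the real/fake head is replaced by a 3-way classifier, and only the new layer is trained. The network sizes and names are placeholders, not the paper's.

```python
import torch
import torch.nn as nn

# Stand-in for a discriminator pre-trained (as part of a GAN) on many unlabeled
# images of another cell type; layer sizes are illustrative.
pretrained_discriminator = nn.Sequential(
    nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Flatten(),
    nn.Linear(64 * 16 * 16, 1),            # original real/fake head
)

# TOP-GAN-style transfer (sketch): keep the feature layers, replace the head
# with a 3-way classifier (healthy / primary cancer / metastatic cancer).
classifier = nn.Sequential(*list(pretrained_discriminator[:-1]),
                           nn.Linear(64 * 16 * 16, 3))
for p in classifier[:-1].parameters():
    p.requires_grad = False                # freeze the transferred feature layers

logits = classifier(torch.randn(8, 1, 64, 64))
print(logits.shape)  # torch.Size([8, 3])
```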
K-Beam Minimax: Efficient Optimization for Deep Adversarial Learning
Title | K-Beam Minimax: Efficient Optimization for Deep Adversarial Learning |
Authors | Jihun Hamm, Yung-Kyun Noh |
Abstract | Minimax optimization plays a key role in adversarial training of machine learning algorithms, such as learning generative models, domain adaptation, privacy preservation, and robust learning. In this paper, we demonstrate the failure of alternating gradient descent in minimax optimization problems due to the discontinuity of solutions of the inner maximization. To address this, we propose a new epsilon-subgradient descent algorithm that simultaneously tracks K candidate solutions. Practically, the algorithm can find solutions that previous saddle-point algorithms cannot find, with only a sublinear increase of complexity in K. We analyze the conditions under which the algorithm converges to the true solution in detail. A significant improvement in stability and convergence speed of the algorithm is observed in simple representative problems, GAN training, and domain-adaptation problems. |
Tasks | Domain Adaptation |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11640v2 |
http://arxiv.org/pdf/1805.11640v2.pdf | |
PWC | https://paperswithcode.com/paper/k-beam-minimax-efficient-optimization-for |
Repo | |
Framework | |
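The algorithmic core is easy to state: keep K candidate solutions of the inner maximization, ascend all of them, and take the descent step against the current best candidate. The NumPy sketch below applies this to a toy objective; the objective and step sizes are illustrative, not from the paper.

```python
import numpy as np

def f(x, u):
    """Toy minimax objective: min_x max_u f(x, u)."""
    return np.sin(u) * x - 0.1 * x**2

def grad_x(x, u): return np.sin(u) - 0.2 * x
def grad_u(x, u): return np.cos(u) * x

def k_beam_minimax(k=5, steps=300, lr=0.05, rng=np.random.default_rng(0)):
    """K-beam sketch: track k candidate inner maximizers, ascend all of them,
    then descend x against the candidate currently achieving the maximum."""
    x = rng.normal()
    u = rng.uniform(-np.pi, np.pi, size=k)      # k candidate maximizers
    for _ in range(steps):
        u = u + lr * grad_u(x, u)               # ascent step on every candidate
        best = u[np.argmax(f(x, u))]            # pick the best candidate
        x = x - lr * grad_x(x, best)            # descent step against it
    return x, u

print(k_beam_minimax())
```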
The Effect of Heterogeneous Data for Alzheimer’s Disease Detection from Speech
Title | The Effect of Heterogeneous Data for Alzheimer’s Disease Detection from Speech |
Authors | Aparna Balagopalan, Jekaterina Novikova, Frank Rudzicz, Marzyeh Ghassemi |
Abstract | Speech datasets for identifying Alzheimer’s disease (AD) are generally restricted to participants performing a single task, e.g. describing an image shown to them. As a result, models trained on linguistic features derived from such datasets may not be generalizable across tasks. Building on prior work demonstrating that same-task data of healthy participants helps improve AD detection on a single-task dataset of pathological speech, we augment an AD-specific dataset consisting of subjects describing a picture with multi-task healthy data. We demonstrate that normative data from multiple speech-based tasks helps improve AD detection by up to 9%. Visualization of decision boundaries reveals that models trained on a combination of structured picture descriptions and unstructured conversational speech have the least out-of-task error and show the most potential to generalize to multiple tasks. We analyze the impact of age of the added samples and if they affect fairness in classification. We also provide explanations for a possible inductive bias effect across tasks using model-agnostic feature anchors. This work highlights the need for heterogeneous datasets for encoding changes in multiple facets of cognition and for developing a task-independent AD detection model. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12254v1 |
http://arxiv.org/pdf/1811.12254v1.pdf | |
PWC | https://paperswithcode.com/paper/the-effect-of-heterogeneous-data-for |
Repo | |
Framework | |
High Dimensional Model Representation as a Glass Box in Supervised Machine Learning
Title | High Dimensional Model Representation as a Glass Box in Supervised Machine Learning |
Authors | Caleb Deen Bastian, Herschel Rabitz |
Abstract | Prediction and explanation are key objects in supervised machine learning, where predictive models are known as black boxes and explanatory models are known as glass boxes. Explanation provides the necessary and sufficient information to interpret the model output in terms of the model input. It includes assessments of model output dependence on important input variables and measures of input variable importance to model output. High dimensional model representation (HDMR), also known as the generalized functional ANOVA expansion, provides useful insight into the input-output behavior of supervised machine learning models. This article gives applications of HDMR in supervised machine learning. The first application is characterizing information leakage in "big-data" settings. The second application is reduced-order representation of elementary symmetric polynomials. The third application is analysis of variance with correlated variables. The last application is estimation of HDMR from kernel machine and decision tree black box representations. These results suggest HDMR to have broad utility within machine learning as a glass box representation. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10320v1 |
http://arxiv.org/pdf/1807.10320v1.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-model-representation-as-a |
Repo | |
Framework | |
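As a concrete example of the first-order part of the expansion, the NumPy sketch below estimates $f_0 = E[y]$ and $f_i(x_i) = E[y \mid x_i] - f_0$ by binning Monte Carlo samples; this is a generic functional-ANOVA illustration rather than the article's estimation procedure.

```python
import numpy as np

def first_order_hdmr(X, y, bins=20):
    """First-order HDMR / functional-ANOVA sketch: f0 = E[y], and for each input
    f_i(x_i) = E[y | x_i] - f0, estimated by quantile binning of samples."""
    f0 = y.mean()
    components = []
    for i in range(X.shape[1]):
        edges = np.quantile(X[:, i], np.linspace(0, 1, bins + 1))
        idx = np.clip(np.digitize(X[:, i], edges[1:-1]), 0, bins - 1)
        cond_mean = np.array([y[idx == b].mean() for b in range(bins)])
        components.append(cond_mean - f0)      # component evaluated on the bin grid
    return f0, components

rng = np.random.default_rng(0)
X = rng.random((5000, 3))
y = X[:, 0] + 2 * X[:, 1] ** 2 + 0.1 * rng.normal(size=5000)   # x2 is irrelevant
f0, comps = first_order_hdmr(X, y)
print(f0, [np.ptp(c) for c in comps])   # x0 and x1 components vary, x2 is ~flat
```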
Multimodal Dual Attention Memory for Video Story Question Answering
Title | Multimodal Dual Attention Memory for Video Story Question Answering |
Authors | Kyung-Min Kim, Seong-Ho Choi, Jin-Hwa Kim, Byoung-Tak Zhang |
Abstract | We propose a video story question-answering (QA) architecture, Multimodal Dual Attention Memory (MDAM). The key idea is to use a dual attention mechanism with late fusion. MDAM uses self-attention to learn the latent concepts in scene frames and captions. Given a question, MDAM uses the second attention over these latent concepts. Multimodal fusion is performed after the dual attention processes (late fusion). Using this processing pipeline, MDAM learns to infer a high-level vision-language joint representation from an abstraction of the full video content. We evaluate MDAM on PororoQA and MovieQA datasets which have large-scale QA annotations on cartoon videos and movies, respectively. For both datasets, MDAM achieves new state-of-the-art results with significant margins compared to the runner-up models. We confirm the best performance of the dual attention mechanism combined with late fusion by ablation studies. We also perform qualitative analysis by visualizing the inference mechanisms of MDAM. |
Tasks | Question Answering |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.07999v1 |
http://arxiv.org/pdf/1809.07999v1.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-dual-attention-memory-for-video |
Repo | |
Framework | |
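The processing pipeline can be sketched compactly: self-attention over each modality's features, question-guided pooling of the resulting latent concepts, and fusion only afterwards. The PyTorch module below illustrates this late-fusion layout with assumed dimensions; it is not the authors' MDAM implementation.

```python
import torch
import torch.nn as nn

class DualAttentionLateFusion(nn.Module):
    """Sketch of the MDAM idea: self-attention over frame and caption features,
    question-guided attention over each modality, and fusion only after both
    attention stages (late fusion). Dimensions are illustrative."""
    def __init__(self, dim=128, num_classes=5):
        super().__init__()
        # one attention module shared across modalities, purely for brevity
        self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                        nn.Linear(dim, num_classes))

    def attend(self, feats, question):
        feats, _ = self.self_attn(feats, feats, feats)         # latent concepts
        scores = torch.softmax(feats @ question.unsqueeze(-1), dim=1)
        return (scores * feats).sum(dim=1)                     # question-guided pooling

    def forward(self, frames, captions, question):
        v = self.attend(frames, question)                      # (batch, dim)
        t = self.attend(captions, question)
        return self.classifier(torch.cat([v, t], dim=-1))      # late fusion

model = DualAttentionLateFusion()
out = model(torch.randn(2, 40, 128), torch.randn(2, 20, 128), torch.randn(2, 128))
print(out.shape)  # torch.Size([2, 5])
```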