July 28, 2019

Paper Group ANR 263

Manifold Regularized Slow Feature Analysis for Dynamic Texture Recognition

Title Manifold Regularized Slow Feature Analysis for Dynamic Texture Recognition
Authors Jie Miao, Xiangmin Xu, Xiaofen Xing, Dacheng Tao
Abstract Dynamic textures exist in various forms, e.g., fire, smoke, and traffic jams, but recognizing dynamic texture is challenging due to the complex temporal variations. In this paper, we present a novel approach stemming from slow feature analysis (SFA) for dynamic texture recognition. SFA extracts slowly varying features from fast varying signals. Fortunately, SFA is capable of leaching invariant representations from dynamic textures. However, complex temporal variations require high-level semantic representations to fully achieve temporal slowness, and thus it is impractical to learn a high-level representation from dynamic textures directly by SFA. In order to learn a robust low-level feature to resolve the complexity of dynamic textures, we propose manifold regularized SFA (MR-SFA) by exploring the neighbor relationship of the initial state of each temporal transition and retaining the locality of their variations. Therefore, the learned features are not only slowly varying, but also partly predictable. MR-SFA for dynamic texture recognition is proposed in the following steps: 1) learning feature extraction functions as convolution filters by MR-SFA, 2) extracting local features by convolution and pooling, and 3) employing Fisher vectors to form a video-level representation for classification. Experimental results on dynamic texture and dynamic scene recognition datasets validate the effectiveness of the proposed approach.
Tasks Dynamic Texture Recognition, Scene Recognition
Published 2017-06-09
URL http://arxiv.org/abs/1706.03015v1
PDF http://arxiv.org/pdf/1706.03015v1.pdf
PWC https://paperswithcode.com/paper/manifold-regularized-slow-feature-analysis
Repo
Framework
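The abstract above builds on slow feature analysis, which extracts slowly varying features from fast varying signals. As a rough illustration only, here is a minimal sketch of plain linear SFA in NumPy. It is not the paper's MR-SFA (no manifold regularizer, no convolution filters), and the function name `linear_sfa` and the toy two-source signal are assumptions made for this example.

```python
import numpy as np

def linear_sfa(x, n_components=1):
    """Plain linear SFA (illustrative, not MR-SFA).

    x has shape (time_steps, features). Returns the projection
    matrix and the mean that was subtracted from the signal.
    """
    mean = x.mean(axis=0)
    xc = x - mean
    # Whiten the signal so that every direction has unit variance.
    cov = xc.T @ xc / len(xc)
    evals, evecs = np.linalg.eigh(cov)
    whiten = evecs / np.sqrt(evals)
    z = xc @ whiten
    # Slowness objective: minimise the variance of the temporal difference.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    _, devecs = np.linalg.eigh(dcov)  # eigenvalues ascending: slowest first
    return whiten @ devecs[:, :n_components], mean

# Toy signal (hypothetical): a slow and a fast sinusoid, linearly mixed.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(40 * t)
x = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])

w, mean = linear_sfa(x)
y = (x - mean) @ w
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
```

On this toy input the first extracted feature should track the slow sinusoid up to sign and scale, which is what `corr` measures.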

A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units

Title A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units
Authors Niranjani Prasad, Li-Fang Cheng, Corey Chivers, Michael Draugelis, Barbara E Engelhardt
Abstract The management of invasive mechanical ventilation, and the regulation of sedation and analgesia during ventilation, constitutes a major part of the care of patients admitted to intensive care units. Both prolonged dependence on mechanical ventilation and premature extubation are associated with increased risk of complications and higher hospital costs, but clinical opinion on the best protocol for weaning patients off of a ventilator varies. This work aims to develop a decision support tool that uses available patient information to predict time-to-extubation readiness and to recommend a personalized regime of sedation dosage and ventilator support. To this end, we use off-policy reinforcement learning algorithms to determine the best action at a given patient state from sub-optimal historical ICU data. We compare treatment policies from fitted Q-iteration with extremely randomized trees and with feedforward neural networks, and demonstrate that the policies learnt show promise in recommending weaning protocols with improved outcomes, in terms of minimizing rates of reintubation and regulating physiological stability.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06300v1
PDF http://arxiv.org/pdf/1704.06300v1.pdf
PWC https://paperswithcode.com/paper/a-reinforcement-learning-approach-to-weaning
Repo
Framework
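The abstract above names fitted Q-iteration as the off-policy learner. The sketch below shows only the general shape of that loop on logged transitions. The paper fits extremely randomized trees and feedforward networks; here a per-(state, action) mean stands in as the regressor, which is an assumption made to keep the example dependency-free. The tiny chain environment and all names are hypothetical.

```python
from collections import defaultdict

def fitted_q_iteration(transitions, n_actions, gamma=0.9, n_iters=50):
    """transitions: list of (state, action, reward, next_state, done)."""
    q = defaultdict(float)  # Q-estimate, zero-initialised
    for _ in range(n_iters):
        sums, counts = defaultdict(float), defaultdict(int)
        for s, a, r, s2, done in transitions:
            # Bellman target computed from the previous Q-estimate.
            best_next = 0.0 if done else max(q[(s2, b)] for b in range(n_actions))
            sums[(s, a)] += r + gamma * best_next
            counts[(s, a)] += 1
        # "Fit" step: with one-hot features, least squares is the per-pair mean.
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q

def greedy_action(q, state, n_actions):
    return max(range(n_actions), key=lambda a: q[(state, a)])

# Hypothetical logged data: a 3-state chain, action 1 moves right,
# action 0 moves left, and reaching state 2 ends the episode with reward 1.
logged = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False),
    (1, 1, 1.0, 2, True),
]
q = fitted_q_iteration(logged, n_actions=2)
policy = [greedy_action(q, s, 2) for s in (0, 1)]
```

Swapping the mean for a tree ensemble or a neural network changes only the fit step; the target computation stays the same.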

PVEs: Position-Velocity Encoders for Unsupervised Learning of Structured State Representations

Title PVEs: Position-Velocity Encoders for Unsupervised Learning of Structured State Representations
Authors Rico Jonschkowski, Roland Hafner, Jonathan Scholz, Martin Riedmiller
Abstract We propose position-velocity encoders (PVEs) which learn—without supervision—to encode images to positions and velocities of task-relevant objects. PVEs encode a single image into a low-dimensional position state and compute the velocity state from finite differences in position. In contrast to autoencoders, position-velocity encoders are not trained by image reconstruction, but by making the position-velocity representation consistent with priors about interacting with the physical world. We applied PVEs to several simulated control tasks from pixels and achieved promising preliminary results.
Tasks Image Reconstruction
Published 2017-05-27
URL http://arxiv.org/abs/1705.09805v3
PDF http://arxiv.org/pdf/1705.09805v3.pdf
PWC https://paperswithcode.com/paper/pves-position-velocity-encoders-for
Repo
Framework
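The abstract above states that the velocity state is computed from finite differences in position. A minimal sketch of that one step follows, assuming the encoder has already produced a list of low-dimensional position vectors. The encoder itself, the function name, and the time step `dt` are hypothetical and not taken from the paper.

```python
def finite_difference_velocities(positions, dt=1.0):
    """Velocity between consecutive position states: (p[t+1] - p[t]) / dt."""
    return [
        [(b - a) / dt for a, b in zip(p0, p1)]
        for p0, p1 in zip(positions, positions[1:])
    ]

# Stand-in for encoder outputs on three consecutive frames.
positions = [[0.0, 0.0], [1.0, 2.0], [3.0, 2.0]]
velocities = finite_difference_velocities(positions, dt=0.5)
```

A sequence of n positions yields n - 1 velocities, so the first frame has no velocity estimate under this scheme.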

A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events

Title A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events
Authors Prafulla Kumar Choubey, Ruihong Huang
Abstract We present a sequential model for temporal relation classification between intra-sentence events. The key observation is that the overall syntactic structure and compositional meanings of the multi-word context between events are important for distinguishing among fine-grained temporal relations. Specifically, our approach first extracts a sequence of context words that indicates the temporal relation between two events, which aligns well with the dependency path between two event mentions. The context word sequence, together with a parts-of-speech tag sequence and a dependency relation sequence that are generated corresponding to the word sequence, are then provided as input to bidirectional recurrent neural network (LSTM) models. The neural nets learn compositional syntactic and semantic representations of contexts surrounding the two events and predict the temporal relation between them. Evaluation of the proposed approach on the TimeBank corpus shows that sequential modeling is capable of accurately recognizing temporal relations between events, and that it outperforms a neural net model that uses various discrete features as input to imitate previous feature-based models.
Tasks Relation Classification
Published 2017-07-23
URL http://arxiv.org/abs/1707.07343v1
PDF http://arxiv.org/pdf/1707.07343v1.pdf
PWC https://paperswithcode.com/paper/a-sequential-model-for-classifying-temporal
Repo
Framework

A Theory of Output-Side Unsupervised Domain Adaptation

Title A Theory of Output-Side Unsupervised Domain Adaptation
Authors Tomer Galanti, Lior Wolf
Abstract When learning a mapping from an input space to an output space, the assumption that the sample distribution of the training data is the same as that of the test data is often violated. Unsupervised domain shift methods adapt the learned function in order to correct for this shift. Previous work has focused on utilizing unlabeled samples from the target distribution. We consider the complementary problem in which the unlabeled samples are given post mapping, i.e., we are given the outputs of the mapping of unknown samples from the shifted domain. Two other variants are also studied: the two-sided version, in which unlabeled samples are given from both the input and the output spaces, and the Domain Transfer problem, which was recently formalized. In all cases, we derive generalization bounds that employ discrepancy terms.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2017-03-05
URL http://arxiv.org/abs/1703.01606v1
PDF http://arxiv.org/pdf/1703.01606v1.pdf
PWC https://paperswithcode.com/paper/a-theory-of-output-side-unsupervised-domain
Repo
Framework

A Large-Scale CNN Ensemble for Medication Safety Analysis

Title A Large-Scale CNN Ensemble for Medication Safety Analysis
Authors Liliya Akhtyamova, Andrey Ignatov, John Cardiff
Abstract Revealing Adverse Drug Reactions (ADR) is an essential part of post-marketing drug surveillance, and data from health-related forums and medical communities can be of great significance for estimating such effects. In this paper, we propose an end-to-end CNN-based method for predicting drug safety on user comments from healthcare discussion forums. We present an architecture that is based on a vast ensemble of CNNs with varied structural parameters, where the prediction is determined by the majority vote. To evaluate the performance of the proposed solution, we present a large-scale dataset collected from a medical website that consists of over 50 thousand reviews for more than 4000 drugs. The results demonstrate that our model significantly outperforms conventional approaches and predicts medicine safety with an accuracy of 87.17% for binary and 62.88% for multi-classification tasks.
Tasks
Published 2017-06-17
URL http://arxiv.org/abs/1706.05549v1
PDF http://arxiv.org/pdf/1706.05549v1.pdf
PWC https://paperswithcode.com/paper/a-large-scale-cnn-ensemble-for-medication
Repo
Framework
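The abstract above says the ensemble's prediction is determined by majority vote. A minimal sketch of that aggregation step is shown below, assuming each model has already produced one label per review. The CNNs themselves are out of scope, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of labels per model, all of equal length.

    Returns the most frequent label for each sample.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Three hypothetical models voting on three reviews (1 = safe, 0 = unsafe).
model_outputs = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
ensemble_labels = majority_vote(model_outputs)
```

With an even number of models a tie is possible; in this sketch a tie goes to the label that appears first among the votes, so an odd ensemble size is the simpler choice for the binary task.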

Wikipedia for Smart Machines and Double Deep Machine Learning

Title Wikipedia for Smart Machines and Double Deep Machine Learning
Authors Moshe BenBassat
Abstract Very important breakthroughs in data-centric deep learning algorithms led to impressive performance in transactional point applications of Artificial Intelligence (AI) such as Face Recognition or EKG classification. With all due appreciation, however, knowledge-blind, data-only machine learning algorithms have severe limitations for non-transactional AI applications, such as medical diagnosis beyond the EKG results. Such applications require deeper and broader knowledge in their problem-solving capabilities, e.g. integrating anatomy and physiology knowledge with EKG results and other patient findings. Following a review and illustrations of such limitations for several real-life AI applications, we point at ways to overcome them. The proposed Wikipedia for Smart Machines initiative aims at building repositories of software structures that represent humanity’s science & technology knowledge in various parts of life; knowledge that we all learn in schools, universities and during our professional life. Target readers for these repositories are smart machines, not humans. AI software developers will have these Reusable Knowledge structures readily available, hence the proposed name ReKopedia. Big Data is by now a mature technology; it is time to focus on Big Knowledge. Some will be derived from data, some will be obtained from mankind’s gigantic repository of knowledge. Wikipedia for smart machines, along with the new Double Deep Learning approach, offers a paradigm for integrating data-centric deep learning algorithms with algorithms that leverage deep knowledge, e.g. evidential reasoning and causality reasoning. For illustration, a project is described to produce ReKopedia knowledge modules for medical diagnosis of about 1,000 disorders. Data is important, but knowledge (deep, basic, and commonsense) is equally important.
Tasks Face Recognition, Medical Diagnosis
Published 2017-11-17
URL http://arxiv.org/abs/1711.06517v2
PDF http://arxiv.org/pdf/1711.06517v2.pdf
PWC https://paperswithcode.com/paper/wikipedia-for-smart-machines-and-double-deep
Repo
Framework

Pruning Convolutional Neural Networks for Image Instance Retrieval

Title Pruning Convolutional Neural Networks for Image Instance Retrieval
Authors Gaurav Manek, Jie Lin, Vijay Chandrasekhar, Lingyu Duan, Sateesh Giduthuri, Xiaoli Li, Tomaso Poggio
Abstract In this work, we focus on the problem of image instance retrieval with deep descriptors extracted from pruned Convolutional Neural Networks (CNN). The objective is to heavily prune convolutional edges while maintaining retrieval performance. To this end, we introduce both data-independent and data-dependent heuristics to prune convolutional edges, and evaluate their performance across various compression rates with different deep descriptors over several benchmark datasets. Further, we present an end-to-end framework to fine-tune the pruned network, with a triplet loss function specially designed for the retrieval task. We show that the combination of heuristic pruning and fine-tuning offers a 5x compression rate without considerable loss in retrieval performance.
Tasks Image Instance Retrieval
Published 2017-07-18
URL http://arxiv.org/abs/1707.05455v1
PDF http://arxiv.org/pdf/1707.05455v1.pdf
PWC https://paperswithcode.com/paper/pruning-convolutional-neural-networks-for-1
Repo
Framework
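The abstract above fine-tunes the pruned network with a triplet loss designed for retrieval. The abstract does not give that loss, so the sketch below shows only the standard hinge-style triplet loss on squared Euclidean distances as a reference point. The margin value and function names are assumptions, not the paper's formulation.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )

# Descriptors here are toy 2-D vectors rather than real CNN features.
easy = triplet_loss([0.0, 0.0], [1.0, 0.0], [3.0, 0.0])  # negative is far away
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 1.0])  # negative as close as positive
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only the harder triplets contribute to fine-tuning.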

Single Image Action Recognition by Predicting Space-Time Saliency

Title Single Image Action Recognition by Predicting Space-Time Saliency
Authors Marjaneh Safaei, Hassan Foroosh
Abstract We propose a novel approach based on deep Convolutional Neural Networks (CNN) to recognize human actions in still images by predicting the future motion, and detecting the shape and location of the salient parts of the image. We make the following major contributions to this important area of research: (i) We use the predicted future motion in the static image (Walker et al., 2015) as a means of compensating for the missing temporal information, while using the saliency map to represent the spatial information in the form of location and shape of what is predicted as significant. (ii) We cast action classification in static images as a domain adaptation problem by transfer learning. We first map the input static image to a new domain that we refer to as the Predicted Optical Flow-Saliency Map domain (POF-SM), and then fine-tune the layers of a deep CNN model trained on classifying the ImageNet dataset to perform action classification in the POF-SM domain. (iii) We tested our method on the popular Willow dataset. But unlike existing methods, we also tested on a more realistic and challenging dataset of over 2M still images that we collected and labeled by taking random frames from the UCF-101 video dataset. We call our dataset the UCF Still Image dataset or UCFSI-101 for short. Our results outperform the state of the art.
Tasks Action Classification, Domain Adaptation, Optical Flow Estimation, Temporal Action Localization, Transfer Learning
Published 2017-05-12
URL http://arxiv.org/abs/1705.04641v1
PDF http://arxiv.org/pdf/1705.04641v1.pdf
PWC https://paperswithcode.com/paper/single-image-action-recognition-by-predicting
Repo
Framework

Fatiguing STDP: Learning from Spike-Timing Codes in the Presence of Rate Codes

Title Fatiguing STDP: Learning from Spike-Timing Codes in the Presence of Rate Codes
Authors Timoleon Moraitis, Abu Sebastian, Irem Boybat, Manuel Le Gallo, Tomas Tuma, Evangelos Eleftheriou
Abstract Spiking neural networks (SNNs) could play a key role in unsupervised machine learning applications, by virtue of strengths related to learning from the fine temporal structure of event-based signals. However, some spike-timing-related strengths of SNNs are hindered by the sensitivity of spike-timing-dependent plasticity (STDP) rules to input spike rates, as fine temporal correlations may be obstructed by coarser correlations between firing rates. In this article, we propose a spike-timing-dependent learning rule that allows a neuron to learn from the temporally-coded information despite the presence of rate codes. Our long-term plasticity rule makes use of short-term synaptic fatigue dynamics. We show analytically that, in contrast to conventional STDP rules, our fatiguing STDP (FSTDP) helps learn the temporal code, and we derive the necessary conditions to optimize the learning process. We showcase the effectiveness of FSTDP in learning spike-timing correlations among processes of different rates in synthetic data. Finally, we use FSTDP to detect correlations in real-world weather data from the United States in an experimental realization of the algorithm that uses a neuromorphic hardware platform comprising phase-change memristive devices. Taken together, our analyses and demonstrations suggest that FSTDP paves the way for the exploitation of the spike-based strengths of SNNs in real-world applications.
Tasks
Published 2017-06-17
URL http://arxiv.org/abs/1706.05563v1
PDF http://arxiv.org/pdf/1706.05563v1.pdf
PWC https://paperswithcode.com/paper/fatiguing-stdp-learning-from-spike-timing
Repo
Framework

Computational Model for Predicting Visual Fixations from Childhood to Adulthood

Title Computational Model for Predicting Visual Fixations from Childhood to Adulthood
Authors Olivier Le Meur, Antoine Coutrot, Zhi Liu, Adrien Le Roch, Andrea Helo, Pia Rama
Abstract How people look at visual information reveals fundamental information about themselves, their interests and their state of mind. While previous visual attention models output static 2-dimensional saliency maps, saccadic models aim to predict not only where observers look but also how they move their eyes to explore the scene. Here we demonstrate that saccadic models are a flexible framework that can be tailored to emulate observers’ viewing tendencies. More specifically, we use the eye data from 101 observers split into 5 age groups (adults, 8-10 y.o., 6-8 y.o., 4-6 y.o. and 2 y.o.) to train our saccadic model for different stages of the development of the human visual system. We show that the joint distribution of saccade amplitude and orientation is a visual signature specific to each age group, and can be used to generate age-dependent scanpaths. Our age-dependent saccadic model not only outputs human-like, age-specific visual scanpaths, but also significantly outperforms other state-of-the-art saliency models. In this paper, we demonstrate that the computational modelling of visual attention, through the use of a saccadic model, can be efficiently adapted to emulate the gaze behavior of a specific group of observers.
Tasks
Published 2017-02-15
URL http://arxiv.org/abs/1702.04657v1
PDF http://arxiv.org/pdf/1702.04657v1.pdf
PWC https://paperswithcode.com/paper/computational-model-for-predicting-visual
Repo
Framework

Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation

Title Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation
Authors Lotem Peled, Roi Reichart
Abstract Sarcasm is a form of speech in which speakers say the opposite of what they truly mean in order to convey a strong sentiment. In other words, “Sarcasm is the giant chasm between what I say, and the person who doesn’t get it.” In this paper we present the novel task of sarcasm interpretation, defined as the generation of a non-sarcastic utterance conveying the same message as the original sarcastic one. We introduce a novel dataset of 3000 sarcastic tweets, each interpreted by five human judges. Addressing the task as monolingual machine translation (MT), we experiment with MT algorithms and evaluation measures. We then present SIGN: an MT-based sarcasm interpretation algorithm that targets sentiment words, a defining element of textual sarcasm. We show that while the scores of n-gram-based automatic measures are similar for all interpretation models, SIGN’s interpretations are scored higher by humans for adequacy and sentiment polarity. We conclude with a discussion on future research directions for our new task.
Tasks Machine Translation
Published 2017-04-22
URL http://arxiv.org/abs/1704.06836v1
PDF http://arxiv.org/pdf/1704.06836v1.pdf
PWC https://paperswithcode.com/paper/sarcasm-sign-interpreting-sarcasm-with
Repo
Framework

RGB-D-based Human Motion Recognition with Deep Learning: A Survey

Title RGB-D-based Human Motion Recognition with Deep Learning: A Survey
Authors Pichao Wang, Wanqing Li, Philip Ogunbona, Jun Wan, Sergio Escalera
Abstract Human motion recognition is one of the most important branches of human-centered research activities. In recent years, motion recognition based on RGB-D data has attracted much attention. Along with the development in artificial intelligence, deep learning techniques have gained remarkable success in computer vision. In particular, convolutional neural networks (CNN) have achieved great success for image-based tasks, and recurrent neural networks (RNN) are renowned for sequence-based problems. Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data. In this paper, a detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based. As a survey focused on the application of deep learning to RGB-D-based motion recognition, we explicitly discuss the advantages and limitations of existing techniques. In particular, we highlight the methods of encoding spatial-temporal-structural information inherent in video sequences, and discuss potential directions for future research.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1711.08362v2
PDF http://arxiv.org/pdf/1711.08362v2.pdf
PWC https://paperswithcode.com/paper/rgb-d-based-human-motion-recognition-with
Repo
Framework

A deep architecture for unified aesthetic prediction

Title A deep architecture for unified aesthetic prediction
Authors Naila Murray, Albert Gordo
Abstract Image aesthetics has become an important criterion for visual content curation on social media sites and media content repositories. Previous work on aesthetic prediction models in the computer vision community has focused on aesthetic score prediction or binary image labeling. However, raw aesthetic annotations are in the form of score histograms and provide richer and more precise information than binary labels or mean scores. Consequently, in this work we focus on the rarely-studied problem of predicting aesthetic score distributions and propose a novel architecture and training procedure for our model. Our model achieves state-of-the-art results on the standard AVA large-scale benchmark dataset for three tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction, all while using one model trained only for the distribution prediction task. We also introduce a method to modify an image such that its predicted aesthetics changes, and use this modification to gain insight into our model.
Tasks
Published 2017-08-16
URL http://arxiv.org/abs/1708.04890v1
PDF http://arxiv.org/pdf/1708.04890v1.pdf
PWC https://paperswithcode.com/paper/a-deep-architecture-for-unified-aesthetic
Repo
Framework
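The abstract above uses one model trained for score distribution prediction to also serve score regression and quality classification. A minimal sketch of how a predicted score histogram can be reduced to those two quantities follows. The 1-to-10 score range and the threshold of 5 follow common practice on AVA but are assumptions here, as is the function name.

```python
def summarize_score_histogram(hist, threshold=5.0):
    """Reduce a histogram over scores 1..len(hist) to (mean score, is_high_quality)."""
    total = sum(hist)
    probs = [h / total for h in hist]  # normalise counts to a distribution
    mean = sum(score * p for score, p in enumerate(probs, start=1))
    return mean, mean > threshold

# Hypothetical prediction: votes split evenly between scores 5 and 6.
mean_score, is_high_quality = summarize_score_histogram(
    [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
)
```

Because both outputs are derived from the same distribution, no separate regression or classification head is needed in this sketch.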

Techniques for visualizing LSTMs applied to electrocardiograms

Title Techniques for visualizing LSTMs applied to electrocardiograms
Authors Jos van der Westhuizen, Joan Lasenby
Abstract This paper explores four different visualization techniques for long short-term memory (LSTM) networks applied to continuous-valued time series. On the datasets analysed, we find that the best visualization technique is to learn an input deletion mask that optimally reduces the true class score. With a specific focus on single-lead electrocardiograms from the MIT-BIH arrhythmia dataset, we show that salient input features for the LSTM classifier align well with medical theory.
Tasks Time Series
Published 2017-05-23
URL http://arxiv.org/abs/1705.08153v3
PDF http://arxiv.org/pdf/1705.08153v3.pdf
PWC https://paperswithcode.com/paper/techniques-for-visualizing-lstms-applied-to
Repo
Framework