October 18, 2019

3387 words 16 mins read

Paper Group ANR 527

A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges. Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping. Distributed Convex Optimization With Limited Communications. Disease Classification in Metagenomics with 2D Embeddings and Deep Learning. Medical Exam Question Answering with Large-sca …

A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges

Title A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges
Authors Yee-Hui Oh, John See, Anh Cat Le Ngo, Raphael Chung-Wei Phan, Vishnu Monn Baskaran
Abstract Over the last few years, automatic facial micro-expression analysis has garnered increasing attention from experts across different disciplines because of its potential applications in various fields such as clinical diagnosis, forensic investigation and security systems. Advances in computer algorithms and video acquisition technology have rendered machine analysis of facial micro-expressions possible today, in contrast to decades ago when it was primarily the domain of psychiatrists and analysis was largely manual. Indeed, although the study of facial micro-expressions is a well-established field in psychology, it is still relatively new from the computational perspective, with many interesting problems. In this survey, we present a comprehensive review of state-of-the-art databases and methods for micro-expression spotting and recognition. Individual stages involved in the automation of these tasks are also described and reviewed at length. In addition, we deliberate on the challenges and future directions in this growing field of automatic facial micro-expression analysis.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.05781v1
PDF http://arxiv.org/pdf/1806.05781v1.pdf
PWC https://paperswithcode.com/paper/a-survey-of-automatic-facial-micro-expression
Repo
Framework
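The survey does not tie spotting to any single algorithm, but as a rough, hedged illustration of what the spotting stage does — flag short, transient facial changes inside a sliding temporal window — here is a deliberately naive sketch. The frame-difference feature, window length, and threshold rule are illustrative assumptions, not a method from the survey.

```python
import numpy as np

def spot_micro_expressions(frames, window=9, k=3.0):
    """Naive spotting sketch: flag frames whose feature difference from the
    start of a sliding window exceeds k standard deviations.
    `frames` is an array of shape (T, H, W) of grayscale face crops."""
    T = len(frames)
    diffs = np.array([np.abs(frames[t] - frames[max(t - window, 0)]).mean()
                      for t in range(T)])
    threshold = diffs.mean() + k * diffs.std()
    return np.where(diffs > threshold)[0]        # candidate apex frame indices

# Usage on random data standing in for a face-cropped video clip.
frames = np.random.rand(100, 64, 64)
print(spot_micro_expressions(frames))
```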

Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping

Title Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping
Authors Jonathan Juett, Benjamin Kuipers
Abstract The young infant explores its body, its sensorimotor system, and the immediately accessible parts of its environment, over the course of a few months creating a model of peripersonal space useful for reaching and grasping objects around it. Drawing on constraints from the empirical literature on infant behavior, we present a preliminary computational model of this learning process, implemented and evaluated on a physical robot. The learning agent explores the relationship between the configuration space of the arm, sensing joint angles through proprioception, and its visual perceptions of the hand and grippers. The resulting knowledge is represented as the peripersonal space (PPS) graph, where nodes represent states of the arm, edges represent safe movements, and paths represent safe trajectories from one pose to another. In our model, the learning process is driven by intrinsic motivation. When repeatedly performing an action, the agent learns the typical result, but also detects unusual outcomes, and is motivated to learn how to make those unusual results reliable. Arm motions typically leave the static background unchanged, but occasionally bump an object, changing its static position. The reach action is learned as a reliable way to bump and move an object in the environment. Similarly, once a reliable reach action is learned, it typically makes a quasi-static change in the environment, moving an object from one static position to another. The unusual outcome is that the object is accidentally grasped (thanks to the innate Palmar reflex), and thereafter moves dynamically with the hand. Learning to make grasps reliable is more complex than for reaches, but we demonstrate significant progress. Our current results are steps toward autonomous sensorimotor learning of motion, reaching, and grasping in peripersonal space, based on unguided exploration and intrinsic motivation.
Tasks
Published 2018-09-27
URL http://arxiv.org/abs/1809.10788v1
PDF http://arxiv.org/pdf/1809.10788v1.pdf
PWC https://paperswithcode.com/paper/learning-and-acting-in-peripersonal-space
Repo
Framework
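A minimal sketch of the PPS-graph data structure described above, under some simplifying assumptions: arm configurations are random joint-angle samples, and edges come from joint-space proximity rather than the paper's observed safe movements. A reach then reduces to a shortest path through the graph.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
configs = rng.uniform(-np.pi, np.pi, size=(200, 5))   # 200 sampled 5-DOF arm poses

# Nodes are arm configurations; edges stand for safe movements.  Here edges are
# built from joint-space proximity, whereas the paper derives them from motions
# the robot has actually performed safely.
G = nx.Graph()
for i, q in enumerate(configs):
    G.add_node(i, joint_angles=q)
dists = np.linalg.norm(configs[:, None] - configs[None, :], axis=-1)
for i in range(len(configs)):
    for j in np.argsort(dists[i])[1:6]:               # connect 5 nearest neighbours
        G.add_edge(i, int(j), weight=float(dists[i, j]))

# A reach toward an object is a shortest safe path from the current pose (node 0)
# to a pose whose hand position is near the object (node 42, chosen arbitrarily).
if nx.has_path(G, 0, 42):
    print(nx.shortest_path(G, source=0, target=42, weight="weight"))
```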

Distributed Convex Optimization With Limited Communications

Title Distributed Convex Optimization With Limited Communications
Authors Milind Rao, Stefano Rini, Andrea Goldsmith
Abstract In this paper, a distributed convex optimization algorithm, termed \emph{distributed coordinate dual averaging} (DCDA) algorithm, is proposed. The DCDA algorithm addresses the scenario of a large distributed optimization problem with limited communication among nodes in the network. Currently known distributed subgradient methods, such as the distributed dual averaging or the distributed alternating direction method of multipliers algorithms, assume that nodes can exchange messages of large cardinality. Such network communication capabilities are not valid in many scenarios of practical relevance. In the DCDA algorithm, on the other hand, communication of each coordinate of the optimization variable is restricted over time. For the proposed algorithm, we bound the rate of convergence under different communication protocols and network architectures. We also consider the extensions to the case of imperfect gradient knowledge and the case in which transmitted messages are corrupted by additive noise or are quantized. Relevant numerical simulations are also provided.
Tasks Distributed Optimization
Published 2018-10-29
URL http://arxiv.org/abs/1810.12457v1
PDF http://arxiv.org/pdf/1810.12457v1.pdf
PWC https://paperswithcode.com/paper/distributed-convex-optimization-with-limited
Repo
Framework
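The exact DCDA update is defined in the paper; the sketch below only simulates the general idea of distributed dual averaging under a coordinate-limited communication constraint, on a toy quadratic objective whose optimum is the mean of the nodes' local data. The ring topology, round-robin coordinate schedule, and 1/sqrt(t) primal step are illustrative assumptions.

```python
import numpy as np

# Toy problem: n nodes jointly minimize sum_i 0.5*||x - a_i||^2, whose optimum
# is the mean of the a_i.  Each round, nodes mix their dual variables with ring
# neighbours but exchange only ONE coordinate (round-robin), mimicking the
# limited-communication constraint.
rng = np.random.default_rng(1)
n, d, T = 8, 4, 2000
a = rng.normal(size=(n, d))

P = np.zeros((n, n))                  # doubly-stochastic mixing matrix for a ring
for i in range(n):
    P[i, i] = 0.5
    P[i, (i - 1) % n] = 0.25
    P[i, (i + 1) % n] = 0.25

z = np.zeros((n, d))                  # dual (accumulated-gradient) variables
x = np.zeros((n, d))                  # primal iterates
for t in range(1, T + 1):
    k = t % d                         # the single coordinate communicated this round
    mixed = z.copy()
    mixed[:, k] = P @ z[:, k]         # neighbours' information, coordinate k only
    grads = x - a                     # gradient of 0.5*||x - a_i||^2 at each node
    z = mixed + grads
    x = -z / np.sqrt(t)               # dual-averaging primal step, x_i = -z_i / sqrt(t)

print("consensus estimate:", x.mean(axis=0), "true optimum:", a.mean(axis=0))
```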

Disease Classification in Metagenomics with 2D Embeddings and Deep Learning

Title Disease Classification in Metagenomics with 2D Embeddings and Deep Learning
Authors Thanh Hai Nguyen, Edi Prifti, Yann Chevaleyre, Nataliya Sokolovska, Jean-Daniel Zucker
Abstract Deep learning (DL) techniques have shown unprecedented success when applied to images, waveforms, and text. Generally, when the sample size ($N$) is much bigger than the number of features ($d$), DL often outperforms other machine learning (ML) techniques, often through the use of Convolutional Neural Networks (CNNs). However, in many bioinformatics fields (including metagenomics), we encounter the opposite situation, where $d$ is significantly greater than $N$. In these situations, applying DL techniques would lead to severe overfitting. Here we aim to improve classification of various diseases with metagenomic data through the use of CNNs. For this we propose to represent metagenomic data as images. The proposed Met2Img approach relies on taxonomic and t-SNE embeddings to transform abundance data into “synthetic images”. We applied our approach to twelve benchmark data sets including more than 1400 metagenomic samples. Our results show significant improvements over the state-of-the-art algorithms (Random Forest (RF), Support Vector Machine (SVM)). We observe that integrating phylogenetic information alongside abundance data improves classification. The proposed approach is not only useful in the classification setting but also allows complex metagenomic data to be visualized. Met2Img is implemented in Python.
Tasks
Published 2018-06-23
URL http://arxiv.org/abs/1806.09046v1
PDF http://arxiv.org/pdf/1806.09046v1.pdf
PWC https://paperswithcode.com/paper/disease-classification-in-metagenomics-with
Repo
Framework
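A hedged sketch of the "abundance vector to synthetic image" idea: embed the taxa (features) in 2D with t-SNE, then rasterize each sample's abundances onto a small grid. Met2Img's actual fill-up and embedding variants, colour maps, and image sizes are described in the paper; everything below (grid size, random data) is illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical data: 100 samples x 500 taxa (abundances in [0, 1]).
rng = np.random.default_rng(0)
abundance = rng.random((100, 500))

# Embed the 500 taxa (features, not samples) in 2D once; the map is shared by all samples.
coords = TSNE(n_components=2, perplexity=30, init="random",
              random_state=0).fit_transform(abundance.T)

def to_image(sample, coords, size=24):
    """Rasterize one abundance vector onto a size x size grid ("synthetic image")."""
    img = np.zeros((size, size))
    span = coords.max(axis=0) - coords.min(axis=0) + 1e-9
    xy = (coords - coords.min(axis=0)) / span * (size - 1)
    for (x, y), v in zip(xy.astype(int), sample):
        img[y, x] = max(img[y, x], v)        # keep the strongest taxon landing on a pixel
    return img

images = np.stack([to_image(s, coords) for s in abundance])
print(images.shape)                          # (100, 24, 24), ready for a small CNN
```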

Medical Exam Question Answering with Large-scale Reading Comprehension

Title Medical Exam Question Answering with Large-scale Reading Comprehension
Authors Xiao Zhang, Ji Wu, Zhiyang He, Xien Liu, Ying Su
Abstract Reading and understanding text is one important component in computer aided diagnosis in clinical medicine, also being a major research problem in the field of NLP. In this work, we introduce a question-answering task called MedQA to study answering questions in clinical medicine using knowledge in a large-scale document collection. The aim of MedQA is to answer real-world questions with large-scale reading comprehension. We propose our solution SeaReader–a modular end-to-end reading comprehension model based on LSTM networks and dual-path attention architecture. The novel dual-path attention models information flow from two perspectives and has the ability to simultaneously read individual documents and integrate information across multiple documents. In experiments our SeaReader achieved a large increase in accuracy on MedQA over competing models. Additionally, we develop a series of novel techniques to demonstrate the interpretation of the question answering process in SeaReader.
Tasks Question Answering, Reading Comprehension
Published 2018-02-28
URL http://arxiv.org/abs/1802.10279v1
PDF http://arxiv.org/pdf/1802.10279v1.pdf
PWC https://paperswithcode.com/paper/medical-exam-question-answering-with-large
Repo
Framework
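As a rough illustration of dual-path (two-way) attention between a question and a document, the sketch below computes question-to-document and document-to-question attention over toy encodings. SeaReader's actual LSTM encoders, gating, and cross-document integration layers are not reproduced here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dual_path_attention(Q, D):
    """Two-way attention between question encodings Q (m x h) and document
    encodings D (n x h): each side attends over the other and is concatenated
    with its attended summary."""
    S = Q @ D.T                                   # m x n similarity scores
    q_reads_d = softmax(S, axis=1) @ D            # question-to-document path
    d_reads_q = softmax(S.T, axis=1) @ Q          # document-to-question path
    return (np.concatenate([Q, q_reads_d], axis=1),
            np.concatenate([D, d_reads_q], axis=1))

rng = np.random.default_rng(0)
Q, D = rng.normal(size=(12, 64)), rng.normal(size=(300, 64))   # toy encodings
q_aug, d_aug = dual_path_attention(Q, D)
print(q_aug.shape, d_aug.shape)                   # (12, 128) (300, 128)
```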

Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples

Title Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples
Authors Jacob M. Springer, Charles S. Strauss, Austin M. Thresher, Edward Kim, Garrett T. Kenyon
Abstract Although deep learning has shown great success in recent years, researchers have discovered a critical flaw where small, imperceptible changes in the input to the system can drastically change the output classification. These attacks are exploitable in nearly all of the existing deep learning classification frameworks. However, the susceptibility of deep sparse coding models to adversarial examples has not been examined. Here, we show that classifiers based on a deep sparse coding model whose classification accuracy is competitive with a variety of deep neural network models are robust to adversarial examples that effectively fool those same deep learning models. We demonstrate both quantitatively and qualitatively that the robustness of deep sparse coding models to adversarial examples arises from two key properties. First, because deep sparse coding models learn general features corresponding to generators of the dataset as a whole, rather than highly discriminative features for distinguishing specific classes, the resulting classifiers are less dependent on idiosyncratic features that might be more easily exploited. Second, because deep sparse coding models utilize fixed point attractor dynamics with top-down feedback, it is more difficult to find small changes to the input that drive the resulting representations out of the correct attractor basin.
Tasks
Published 2018-11-17
URL http://arxiv.org/abs/1811.07211v2
PDF http://arxiv.org/pdf/1811.07211v2.pdf
PWC https://paperswithcode.com/paper/classifiers-based-on-deep-sparse-coding
Repo
Framework
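The paper's deep sparse coding models use attractor dynamics with top-down feedback; as a simpler stand-in for the key property — inference as an iterative optimization driven to a fixed point rather than a single feedforward pass — here is a plain ISTA sparse-coding sketch on a random dictionary.

```python
import numpy as np

def ista_sparse_code(x, D, lam=0.1, n_iter=200):
    """Infer a sparse code a minimizing 0.5*||x - D a||^2 + lam*||a||_1 via
    iterative shrinkage-thresholding.  Inference is an iterative dynamical
    process, not one feedforward pass; the paper's models use different,
    LCA-style dynamics with top-down feedback."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)   # soft threshold
    return a

rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256)); D /= np.linalg.norm(D, axis=0)  # random dictionary
x = D[:, rng.choice(256, 5)] @ rng.normal(size=5)               # 5-sparse signal
print(np.count_nonzero(np.abs(ista_sparse_code(x, D)) > 1e-6))  # recovered sparsity
```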

An Optimal Itinerary Generation in a Configuration Space of Large Intellectual Agent Groups with Linear Logic

Title An Optimal Itinerary Generation in a Configuration Space of Large Intellectual Agent Groups with Linear Logic
Authors Dmitry Maximov
Abstract A group of intelligent agents which fulfill a set of tasks in parallel is first represented by the tensor multiplication of the corresponding processes in a linear logic game category. An optimal itinerary in the configuration space of the group states is defined as a play with maximal total reward in the category. Further novelties are that the reward is represented as a degree of certainty (visibility) of an agent's goal, and that the system goals are chosen by the greatest value corresponding to these processes in the system goal lattice.
Tasks
Published 2018-11-06
URL http://arxiv.org/abs/1811.02216v1
PDF http://arxiv.org/pdf/1811.02216v1.pdf
PWC https://paperswithcode.com/paper/an-optimal-itinerary-generation-in-a
Repo
Framework

NullaNet: Training Deep Neural Networks for Reduced-Memory-Access Inference

Title NullaNet: Training Deep Neural Networks for Reduced-Memory-Access Inference
Authors Mahdi Nazemi, Ghasem Pasandi, Massoud Pedram
Abstract Deep neural networks have been successfully deployed in a wide variety of applications including computer vision and speech recognition. However, the computational and storage complexity of these models has forced the majority of computations to be performed on high-end computing platforms or on the cloud. To cope with the computational and storage complexity of these models, this paper presents a training method that enables a radically different approach to realizing deep neural networks through Boolean logic minimization. The aforementioned realization completely removes the energy-hungry step of accessing memory for obtaining model parameters, consumes about two orders of magnitude fewer computing resources compared to realizations that use floating-point operations, and has a substantially lower latency.
Tasks Speech Recognition
Published 2018-07-23
URL http://arxiv.org/abs/1807.08716v2
PDF http://arxiv.org/pdf/1807.08716v2.pdf
PWC https://paperswithcode.com/paper/nullanet-training-deep-neural-networks-for
Repo
Framework
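The core observation behind the Boolean-logic realization described above is that a neuron with binarized inputs and a hard threshold is just a Boolean function, so it can be tabulated and minimized into logic that needs no weight fetches at inference time. A toy illustration (the weights and bias are made up):

```python
import itertools

# Hypothetical binarized neuron: inputs in {0, 1}, integer weights, step activation.
weights, bias = [2, -1, 1, -3], 1

def neuron(bits):
    return int(sum(w * b for w, b in zip(weights, bits)) + bias >= 0)

# Tabulate the neuron as a Boolean function of its 4 inputs; two-level logic
# minimization of this table (e.g. with an Espresso-style minimizer) is what
# would then replace weight storage and multiply-accumulate operations.
truth_table = {bits: neuron(bits) for bits in itertools.product((0, 1), repeat=4)}
minterms = [bits for bits, out in truth_table.items() if out]
print(minterms)
```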

End to End Brain Fiber Orientation Estimation using Deep Learning

Title End to End Brain Fiber Orientation Estimation using Deep Learning
Authors Nandakishore Puttashamachar, Ulas Bagci
Abstract In this work, we explore various brain neuron tracking techniques, one of the most significant applications of Diffusion Tensor Imaging. Tractography provides a non-invasive method to analyze the underlying tissue micro-structure. Understanding the structure and organization of the tissues facilitates diagnosis, identifying aberrations and providing acute information on the occurrence of brain ischemia or stroke and the progression of neurological diseases such as Alzheimer's disease, multiple sclerosis and so on. Time is of the essence, and accurate localization of the aberrations can help save or change a diseased life. Motivated by the limitations of current Tractography techniques, such as computational complexity, reconstruction errors during tensor estimation, and standardization, we aim to address these limitations through our research findings. We introduce an end-to-end Deep Learning framework which can accurately estimate the most probable orientation at each voxel along a neuronal pathway. We use Probabilistic Tractography as our baseline model to obtain the training data, which also serves as a Tractography Gold Standard for our evaluations. Through experiments we show that our Deep Network achieves a significant improvement over current Tractography implementations by substantially reducing the run-time complexity. Our architecture also allows for variable-sized input DWI signals, eliminating the memory issues seen with traditional techniques. The advantage of this architecture is that it is well suited to a cloud setup and can utilize existing multi-GPU frameworks to perform whole-brain Tractography in minutes rather than hours. We evaluate our network against the Gold Standard and benchmark its performance across several parameters.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.03969v1
PDF http://arxiv.org/pdf/1806.03969v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-brain-fiber-orientation-estimation
Repo
Framework
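As a hedged illustration of the underlying regression task — map a per-voxel DWI signal to a unit orientation vector — here is a toy, untrained sketch. The paper's network architecture, loss, and variable-length input handling are not reproduced; the 90-direction signal and the layer sizes are assumptions.

```python
import numpy as np

def predict_orientation(signal, W1, W2):
    """Toy regressor sketch: map a per-voxel DWI signal vector to a unit
    orientation vector via one hidden ReLU layer and output normalization."""
    h = np.maximum(signal @ W1, 0.0)          # hidden ReLU layer
    v = h @ W2                                # 3D direction estimate
    return v / (np.linalg.norm(v) + 1e-9)     # constrain to the unit sphere

def angular_error(pred, true):
    # Fiber orientations are sign-invariant, so compare |cos| of the angle.
    return np.degrees(np.arccos(np.clip(abs(pred @ true), 0.0, 1.0)))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(90, 32)) * 0.1          # untrained weights, for shape only
W2 = rng.normal(size=(32, 3)) * 0.1
signal = rng.random(90)                       # e.g. 90 diffusion gradient directions
true_dir = np.array([0.0, 0.0, 1.0])
print(angular_error(predict_orientation(signal, W1, W2), true_dir))
```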

Capsule networks for low-data transfer learning

Title Capsule networks for low-data transfer learning
Authors Andrew Gritsevskiy, Maksym Korablyov
Abstract We propose a capsule network-based architecture for generalizing learning to new data with few examples. Using both generative and non-generative capsule networks with intermediate routing, we are able to generalize to new information over 25 times faster than a similar convolutional neural network. We train the networks on the multiMNIST dataset lacking one digit. After the networks reach their maximum accuracy, we inject 1-100 examples of the missing digit into the training set, and measure the number of batches needed to return to a comparable level of accuracy. We then discuss the improvement in low-data transfer learning that capsule networks bring, and propose future directions for capsule research.
Tasks Transfer Learning
Published 2018-04-26
URL http://arxiv.org/abs/1804.10172v1
PDF http://arxiv.org/pdf/1804.10172v1.pdf
PWC https://paperswithcode.com/paper/capsule-networks-for-low-data-transfer
Repo
Framework
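The evaluation protocol in the abstract (train with one class held out, inject a few examples of it, then count how many update batches are needed to recover accuracy on that class) can be sketched independently of capsule networks. The sketch below uses scikit-learn's digits data and a linear SGD classifier purely as stand-ins; the dataset, model, and 0.8 accuracy target are assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
held_out = 9                                   # the "missing digit"
known = y != held_out
clf = SGDClassifier(random_state=0)
clf.partial_fit(X[known], y[known], classes=np.arange(10))   # train without class 9

k = 25                                         # number of injected examples
missing = np.where(y == held_out)[0]
inject_idx, test_idx = missing[:k], missing[k:]

batches = 0
while clf.score(X[test_idx], y[test_idx]) < 0.8 and batches < 200:
    clf.partial_fit(X[inject_idx], y[inject_idx])   # one update batch on the new class
    batches += 1
print("update batches needed to recover the held-out class:", batches)
```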

A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates

Title A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates
Authors Kaiwen Zhou, Fanhua Shang, James Cheng
Abstract Recent years have witnessed exciting progress in the study of stochastic variance reduced gradient methods (e.g., SVRG, SAGA), their accelerated variants (e.g., Katyusha) and their extensions to many different settings (e.g., online, sparse, asynchronous, distributed). Among them, accelerated methods enjoy improved convergence rates but have complex coupling structures, which makes them hard to extend to more settings (e.g., sparse and asynchronous) due to the existence of perturbation. In this paper, we introduce a simple stochastic variance reduced algorithm (MiG), which enjoys the best-known convergence rates for both strongly convex and non-strongly convex problems. Moreover, we also present its efficient sparse and asynchronous variants, and theoretically analyze its convergence rates in these settings. Finally, extensive experiments on various machine learning problems such as logistic regression are given to illustrate the practical improvement in both serial and asynchronous settings.
Tasks
Published 2018-06-28
URL http://arxiv.org/abs/1806.11027v1
PDF http://arxiv.org/pdf/1806.11027v1.pdf
PWC https://paperswithcode.com/paper/a-simple-stochastic-variance-reduced
Repo
Framework
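MiG's own coupled update is not reproduced here; as a hedged sketch of the variance-reduction idea the paper builds on, below is plain SVRG (named in the abstract) applied to logistic regression on synthetic data.

```python
import numpy as np

def svrg_logreg(X, y, lr=0.1, epochs=20):
    """SVRG for logistic regression with labels y in {-1, +1}.  Each inner step
    uses grad(w, i) - grad(w_snap, i) + mu, an unbiased gradient estimate with
    reduced variance.  This sketches the family MiG belongs to, not MiG itself."""
    n, d = X.shape
    w = np.zeros(d)
    grad = lambda w, i: -y[i] * X[i] / (1.0 + np.exp(y[i] * (X[i] @ w)))
    full_grad = lambda w: np.mean([grad(w, i) for i in range(n)], axis=0)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        w_snap, mu = w.copy(), full_grad(w)        # snapshot and its full gradient
        for i in rng.integers(0, n, size=n):       # one stochastic inner pass
            w -= lr * (grad(w, i) - grad(w_snap, i) + mu)
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)); w_true = rng.normal(size=5)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=200))
w = svrg_logreg(X, y)
print("train accuracy:", np.mean(np.sign(X @ w) == y))
```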

Montage based 3D Medical Image Retrieval from Traumatic Brain Injury Cohort using Deep Convolutional Neural Network

Title Montage based 3D Medical Image Retrieval from Traumatic Brain Injury Cohort using Deep Convolutional Neural Network
Authors Cailey I. Kerley, Yuankai Huo, Shikha Chaganti, Shunxing Bao, Mayur B. Patel, Bennett A. Landman
Abstract Brain imaging analysis on clinically acquired computed tomography (CT) is essential for the diagnosis, risk prediction of progression, and treatment of the structural phenotypes of traumatic brain injury (TBI). However, in real clinical imaging scenarios, entire-body CT images (e.g., neck, abdomen, chest, pelvis) are typically captured along with whole-brain CT scans. For instance, in a typical sample of a clinical TBI imaging cohort, only ~15% of CT scans actually contain whole-brain CT images suitable for volumetric brain analyses; the remainder are partial-brain or non-brain images. Therefore, a manual image retrieval process is typically required to isolate the whole-brain CT scans from the entire cohort. However, manual image retrieval is time- and resource-consuming and even more difficult for larger cohorts. To alleviate the manual effort, in this paper we propose an automated 3D medical image retrieval pipeline, called deep montage-based image retrieval (dMIR), which performs classification on 2D montage images via a deep convolutional neural network. The novelty of the proposed method is to characterize the medical image retrieval task based on montage images. In a cohort of 2000 clinically acquired TBI scans, 794 scans were used as training data, 206 scans were used as validation data, and the remaining 1000 scans were used as testing data. The proposed method achieved accuracy=1.0, recall=1.0, precision=1.0, f1=1.0 on the validation data, and accuracy=0.988, recall=0.962, precision=0.962, f1=0.962 on the testing data. Thus, the proposed dMIR is able to perform accurate whole-brain CT image retrieval from large-scale clinical cohorts.
Tasks Computed Tomography (CT), Image Retrieval, Medical Image Retrieval
Published 2018-12-10
URL http://arxiv.org/abs/1812.04118v1
PDF http://arxiv.org/pdf/1812.04118v1.pdf
PWC https://paperswithcode.com/paper/montage-based-3d-medical-image-retrieval-from
Repo
Framework
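A small sketch of the montage representation that dMIR classifies: tile evenly spaced axial slices of a 3D scan into a single 2D image that a standard 2D CNN can consume. The grid size and slice selection are illustrative assumptions, not the paper's preprocessing.

```python
import numpy as np

def make_montage(volume, grid=(4, 4)):
    """Tile evenly spaced axial slices of a 3D scan (Z x H x W) into one
    2D montage image of shape (rows*H, cols*W)."""
    rows, cols = grid
    idx = np.linspace(0, volume.shape[0] - 1, rows * cols).astype(int)
    slices = volume[idx]                                  # (rows*cols, H, W)
    h, w = slices.shape[1:]
    return (slices.reshape(rows, cols, h, w)
                  .transpose(0, 2, 1, 3)
                  .reshape(rows * h, cols * w))

ct = np.random.rand(120, 64, 64)        # stand-in for a clinical CT volume
print(make_montage(ct).shape)           # (256, 256): one image per scan for a 2D CNN
```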

Sparsity-based Convolutional Kernel Network for Unsupervised Medical Image Analysis

Title Sparsity-based Convolutional Kernel Network for Unsupervised Medical Image Analysis
Authors Euijoon Ahn, Jinman Kim, Ashnil Kumar, Michael Fulham, Dagan Feng
Abstract The availability of large-scale annotated image datasets coupled with recent advances in supervised deep learning methods are enabling the derivation of representative image features that can potentially impact different image analysis problems. However, such supervised approaches are not feasible in the medical domain where it is challenging to obtain a large volume of labelled data due to the complexity of manual annotation and inter- and intra-observer variability in label assignment. Algorithms designed to work on small annotated datasets are useful but have limited applications. In an effort to address the lack of annotated data in the medical image analysis domain, we propose an algorithm for hierarchical unsupervised feature learning. Our algorithm introduces three new contributions: (i) we use kernel learning to identify and represent invariant characteristics across image sub-patches in an unsupervised manner; (ii) we leverage the sparsity inherent to medical image data and propose a new sparse convolutional kernel network (S-CKN) that can be pre-trained in a layer-wise fashion, thereby providing initial discriminative features for medical data; and (iii) we propose a spatial pyramid pooling framework to capture subtle geometric differences in medical image data. Our experiments evaluate our algorithm in two common application areas of medical image retrieval and classification using two public datasets. Our results demonstrate that the medical image feature representations extracted with our algorithm enable a higher accuracy in both application areas compared to features extracted from other conventional unsupervised methods. Furthermore, our approach achieves an accuracy that is competitive with state-of-the-art supervised CNNs.
Tasks Image Retrieval, Medical Image Retrieval
Published 2018-07-16
URL https://arxiv.org/abs/1807.05648v3
PDF https://arxiv.org/pdf/1807.05648v3.pdf
PWC https://paperswithcode.com/paper/sparsity-based-convolutional-kernel-network
Repo
Framework
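Contribution (iii), spatial pyramid pooling, can be illustrated in isolation: pool a feature map over coarser-to-finer grids and concatenate, so the descriptor length is fixed regardless of spatial size. The 1x1/2x2/4x4 levels below are the common choice and are an assumption, not necessarily the paper's configuration.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a feature map (C x H x W) over a 1x1, 2x2 and 4x4 grid and
    concatenate the results into a fixed-length descriptor."""
    C, H, W = fmap.shape
    pooled = []
    for L in levels:
        hs = np.linspace(0, H, L + 1).astype(int)
        ws = np.linspace(0, W, L + 1).astype(int)
        for i in range(L):
            for j in range(L):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                pooled.append(cell.max(axis=(1, 2)))
    return np.concatenate(pooled)            # length C * (1 + 4 + 16)

fmap = np.random.rand(32, 37, 53)            # arbitrary spatial size
print(spatial_pyramid_pool(fmap).shape)      # (672,) regardless of H and W
```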

BAR: Bayesian Activity Recognition using variational inference

Title BAR: Bayesian Activity Recognition using variational inference
Authors Ranganath Krishnan, Mahesh Subedar, Omesh Tickoo
Abstract Uncertainty estimation in deep neural networks is essential for designing reliable and robust AI systems. Applications such as video surveillance for identifying suspicious activities are designed with deep neural networks (DNNs), but DNNs do not provide uncertainty estimates. Capturing reliable uncertainty estimates in safety- and security-critical applications will help to establish trust in the AI system. Our contribution is to apply a Bayesian deep learning framework to a visual activity recognition application and quantify model uncertainty along with principled confidence. We utilize the stochastic variational inference technique while training the Bayesian DNNs to infer the approximate posterior distribution around the model parameters, and perform Monte Carlo sampling on the posterior of the model parameters to obtain the predictive distribution. We show that Bayesian inference applied to DNNs provides reliable confidence measures for the visual activity recognition task as compared to conventional DNNs. We also show that our method improves the visual activity recognition precision-recall AUC by 6.2% compared to a non-Bayesian baseline. We evaluate our models on the Moments-In-Time (MiT) activity recognition dataset by selecting a subset of in- and out-of-distribution video samples.
Tasks Activity Recognition, Bayesian Inference
Published 2018-11-08
URL http://arxiv.org/abs/1811.03305v2
PDF http://arxiv.org/pdf/1811.03305v2.pdf
PWC https://paperswithcode.com/paper/bar-bayesian-activity-recognition-using
Repo
Framework
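A hedged sketch of the Monte Carlo prediction step described above: draw weights from a mean-field Gaussian approximate posterior (the form typically learned with stochastic variational inference), average the resulting softmax outputs, and use predictive entropy as the uncertainty measure. The toy linear classifier and posterior parameters below are assumptions; the paper applies the same idea to Bayesian DNNs on video features.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mc_predict(x, w_mean, w_std, n_samples=50, seed=0):
    """Monte Carlo predictive distribution for a toy linear classifier with an
    independent-Gaussian (mean-field) weight posterior: sample weights, average
    the softmax outputs, report predictive entropy as uncertainty."""
    rng = np.random.default_rng(seed)
    probs = np.mean([softmax(x @ rng.normal(w_mean, w_std))
                     for _ in range(n_samples)], axis=0)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return probs, entropy

rng = np.random.default_rng(1)
w_mean = rng.normal(size=(16, 5))              # posterior mean (made up)
w_std = 0.3 * np.ones((16, 5))                 # posterior std (made up)
x = rng.normal(size=(3, 16))                   # three toy "clip" feature vectors
p, h = mc_predict(x, w_mean, w_std)
print(p.argmax(axis=1), h)                     # predicted classes and their entropies
```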

A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain

Title A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Authors Shuai Li, Dinei Florencio, Wanqing Li, Yaqin Zhao, Chris Cook
Abstract Detecting camouflaged moving foreground objects has been known to be difficult due to the similarity between the foreground objects and the background. Conventional methods cannot distinguish the foreground from background due to the small differences between them and thus suffer from under-detection of the camouflaged foreground objects. In this paper, we present a fusion framework to address this problem in the wavelet domain. We first show that the small differences in the image domain can be highlighted in certain wavelet bands. Then the likelihood of each wavelet coefficient being foreground is estimated by formulating foreground and background models for each wavelet band. The proposed framework effectively aggregates the likelihoods from different wavelet bands based on the characteristics of the wavelet transform. Experimental results demonstrated that the proposed method significantly outperformed existing methods in detecting camouflaged foreground objects. Specifically, the average F-measure for the proposed algorithm was 0.87, compared to 0.71 to 0.8 for the other state-of-the-art methods.
Tasks
Published 2018-04-16
URL http://arxiv.org/abs/1804.05984v1
PDF http://arxiv.org/pdf/1804.05984v1.pdf
PWC https://paperswithcode.com/paper/a-fusion-framework-for-camouflaged-moving
Repo
Framework
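A small sketch of the first step described above, using PyWavelets: decompose the current frame and the background model and compare them band by band, where small image-domain differences can become more visible. The per-band likelihood models and the fusion rule are the paper's contribution and are not reproduced; the toy scene below is an assumption.

```python
import numpy as np
import pywt   # PyWavelets

# Toy scene: a background plus a camouflaged square that differs only slightly.
rng = np.random.default_rng(0)
background = rng.random((128, 128))
frame = background.copy()
frame[40:60, 40:60] += 0.05            # small, hard-to-see intensity change

# Decompose both images and compare band by band; the paper instead builds
# per-band foreground/background likelihood models and fuses them.
bands_bg = pywt.wavedec2(background, "haar", level=2)
bands_fr = pywt.wavedec2(frame, "haar", level=2)

print("approximation band diff:", np.abs(bands_fr[0] - bands_bg[0]).mean())
for lvl, (bg_detail, fr_detail) in enumerate(zip(bands_bg[1:], bands_fr[1:]), 1):
    for name, b, f in zip("HVD", bg_detail, fr_detail):
        print(f"level {lvl} {name}: {np.abs(f - b).mean():.5f}")
```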