Paper Group ANR 527
A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges
Title | A Survey of Automatic Facial Micro-expression Analysis: Databases, Methods and Challenges |
Authors | Yee-Hui Oh, John See, Anh Cat Le Ngo, Raphael Chung-Wei Phan, Vishnu Monn Baskaran |
Abstract | Over the last few years, automatic facial micro-expression analysis has garnered increasing attention from experts across different disciplines because of its potential applications in various fields such as clinical diagnosis, forensic investigation and security systems. Advances in computer algorithms and video acquisition technology have rendered machine analysis of facial micro-expressions possible today, in contrast to decades ago when it was primarily the domain of psychiatrists where analysis was largely manual. Indeed, although the study of facial micro-expressions is a well-established field in psychology, it is still relatively new from the computational perspective with many interesting problems. In this survey, we present a comprehensive review of state-of-the-art databases and methods for micro-expressions spotting and recognition. Individual stages involved in the automation of these tasks are also described and reviewed at length. In addition, we also deliberate on the challenges and future directions in this growing field of automatic facial micro-expression analysis. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.05781v1 |
http://arxiv.org/pdf/1806.05781v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-automatic-facial-micro-expression |
Repo | |
Framework | |
Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping
Title | Learning and Acting in Peripersonal Space: Moving, Reaching, and Grasping |
Authors | Jonathan Juett, Benjamin Kuipers |
Abstract | The young infant explores its body, its sensorimotor system, and the immediately accessible parts of its environment, over the course of a few months creating a model of peripersonal space useful for reaching and grasping objects around it. Drawing on constraints from the empirical literature on infant behavior, we present a preliminary computational model of this learning process, implemented and evaluated on a physical robot. The learning agent explores the relationship between the configuration space of the arm, sensing joint angles through proprioception, and its visual perceptions of the hand and grippers. The resulting knowledge is represented as the peripersonal space (PPS) graph, where nodes represent states of the arm, edges represent safe movements, and paths represent safe trajectories from one pose to another. In our model, the learning process is driven by intrinsic motivation. When repeatedly performing an action, the agent learns the typical result, but also detects unusual outcomes, and is motivated to learn how to make those unusual results reliable. Arm motions typically leave the static background unchanged, but occasionally bump an object, changing its static position. The reach action is learned as a reliable way to bump and move an object in the environment. Similarly, once a reliable reach action is learned, it typically makes a quasi-static change in the environment, moving an object from one static position to another. The unusual outcome is that the object is accidentally grasped (thanks to the innate Palmar reflex), and thereafter moves dynamically with the hand. Learning to make grasps reliable is more complex than for reaches, but we demonstrate significant progress. Our current results are steps toward autonomous sensorimotor learning of motion, reaching, and grasping in peripersonal space, based on unguided exploration and intrinsic motivation. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10788v1 |
http://arxiv.org/pdf/1809.10788v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-and-acting-in-peripersonal-space |
Repo | |
Framework | |
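The abstract above describes the PPS graph as nodes for arm states, edges for safe movements, and paths for safe trajectories. The following is a minimal sketch of that data structure only, assuming an undirected graph and breadth-first search for path finding; the pose labels and the function name `find_safe_path` are hypothetical and are not taken from the paper.

```python
from collections import deque


def find_safe_path(edges, start, goal):
    """Return a shortest path of poses from start to goal, or None.

    edges: iterable of (pose_a, pose_b) pairs, each a movement assumed safe
    in both directions (an assumption made for this sketch).
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical pose labels standing in for stored arm configurations.
edges = [("home", "p1"), ("p1", "p2"), ("p2", "near_object"), ("home", "p3")]
path = find_safe_path(edges, "home", "near_object")
```

Here `path` lists the intermediate poses the arm would pass through; a pose that never appears in the graph yields `None`.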
Distributed Convex Optimization With Limited Communications
Title | Distributed Convex Optimization With Limited Communications |
Authors | Milind Rao, Stefano Rini, Andrea Goldsmith |
Abstract | In this paper, a distributed convex optimization algorithm, termed the distributed coordinate dual averaging (DCDA) algorithm, is proposed. The DCDA algorithm addresses the scenario of a large distributed optimization problem with limited communication among nodes in the network. Currently known distributed subgradient methods, such as the distributed dual averaging or the distributed alternating direction method of multipliers algorithms, assume that nodes can exchange messages of large cardinality. Such network communication capabilities are not valid in many scenarios of practical relevance. In the DCDA algorithm, on the other hand, communication of each coordinate of the optimization variable is restricted over time. For the proposed algorithm, we bound the rate of convergence under different communication protocols and network architectures. We also consider the extensions to the case of imperfect gradient knowledge and the case in which transmitted messages are corrupted by additive noise or are quantized. Relevant numerical simulations are also provided. |
Tasks | Distributed Optimization |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12457v1 |
http://arxiv.org/pdf/1810.12457v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-convex-optimization-with-limited |
Repo | |
Framework | |
Disease Classification in Metagenomics with 2D Embeddings and Deep Learning
Title | Disease Classification in Metagenomics with 2D Embeddings and Deep Learning |
Authors | Thanh Hai Nguyen, Edi Prifti, Yann Chevaleyre, Nataliya Sokolovska, Jean-Daniel Zucker |
Abstract | Deep learning (DL) techniques have shown unprecedented success when applied to images, waveforms, and text. Generally, when the sample size ($N$) is much bigger than the number of features ($d$), DL often outperforms other machine learning (ML) techniques, often through the use of Convolutional Neural Networks (CNNs). However, in many bioinformatics fields (including metagenomics), we encounter the opposite situation where $d$ is significantly greater than $N$. In these situations, applying DL techniques would lead to severe overfitting. Here we aim to improve classification of various diseases with metagenomic data through the use of CNNs. For this we proposed to represent metagenomic data as images. The proposed Met2Img approach relies on taxonomic and t-SNE embeddings to transform abundance data into “synthetic images”. We applied our approach to twelve benchmark data sets including more than 1400 metagenomic samples. Our results show significant improvements over the state-of-the-art algorithms (Random Forest (RF), Support Vector Machine (SVM)). We observe that the integration of phylogenetic information alongside abundance data improves classification. The proposed approach is not only important in the classification setting but also allows complex metagenomic data to be visualized. Met2Img is implemented in Python. |
Tasks | |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.09046v1 |
http://arxiv.org/pdf/1806.09046v1.pdf | |
PWC | https://paperswithcode.com/paper/disease-classification-in-metagenomics-with |
Repo | |
Framework | |
Medical Exam Question Answering with Large-scale Reading Comprehension
Title | Medical Exam Question Answering with Large-scale Reading Comprehension |
Authors | Xiao Zhang, Ji Wu, Zhiyang He, Xien Liu, Ying Su |
Abstract | Reading and understanding text is one important component in computer aided diagnosis in clinical medicine, also being a major research problem in the field of NLP. In this work, we introduce a question-answering task called MedQA to study answering questions in clinical medicine using knowledge in a large-scale document collection. The aim of MedQA is to answer real-world questions with large-scale reading comprehension. We propose our solution SeaReader, a modular end-to-end reading comprehension model based on LSTM networks and a dual-path attention architecture. The novel dual-path attention models information flow from two perspectives and has the ability to simultaneously read individual documents and integrate information across multiple documents. In experiments our SeaReader achieved a large increase in accuracy on MedQA over competing models. Additionally, we develop a series of novel techniques to demonstrate the interpretation of the question answering process in SeaReader. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10279v1 |
http://arxiv.org/pdf/1802.10279v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-exam-question-answering-with-large |
Repo | |
Framework | |
Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples
Title | Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples |
Authors | Jacob M. Springer, Charles S. Strauss, Austin M. Thresher, Edward Kim, Garrett T. Kenyon |
Abstract | Although deep learning has shown great success in recent years, researchers have discovered a critical flaw where small, imperceptible changes in the input to the system can drastically change the output classification. These attacks are exploitable in nearly all of the existing deep learning classification frameworks. However, the susceptibility of deep sparse coding models to adversarial examples has not been examined. Here, we show that classifiers based on a deep sparse coding model whose classification accuracy is competitive with a variety of deep neural network models are robust to adversarial examples that effectively fool those same deep learning models. We demonstrate both quantitatively and qualitatively that the robustness of deep sparse coding models to adversarial examples arises from two key properties. First, because deep sparse coding models learn general features corresponding to generators of the dataset as a whole, rather than highly discriminative features for distinguishing specific classes, the resulting classifiers are less dependent on idiosyncratic features that might be more easily exploited. Second, because deep sparse coding models utilize fixed point attractor dynamics with top-down feedback, it is more difficult to find small changes to the input that drive the resulting representations out of the correct attractor basin. |
Tasks | |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07211v2 |
http://arxiv.org/pdf/1811.07211v2.pdf | |
PWC | https://paperswithcode.com/paper/classifiers-based-on-deep-sparse-coding |
Repo | |
Framework | |
An Optimal Itinerary Generation in a Configuration Space of Large Intellectual Agent Groups with Linear Logic
Title | An Optimal Itinerary Generation in a Configuration Space of Large Intellectual Agent Groups with Linear Logic |
Authors | Dmitry Maximov |
Abstract | A group of intelligent agents which fulfill a set of tasks in parallel is represented first by the tensor multiplication of corresponding processes in a linear logic game category. An optimal itinerary in the configuration space of the group states is defined as a play with maximal total reward in the category. Two further novel aspects are that the reward is represented as a degree of certainty (visibility) of an agent goal, and that the system goals are chosen by the greatest value corresponding to these processes in the system goal lattice. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02216v1 |
http://arxiv.org/pdf/1811.02216v1.pdf | |
PWC | https://paperswithcode.com/paper/an-optimal-itinerary-generation-in-a |
Repo | |
Framework | |
NullaNet: Training Deep Neural Networks for Reduced-Memory-Access Inference
Title | NullaNet: Training Deep Neural Networks for Reduced-Memory-Access Inference |
Authors | Mahdi Nazemi, Ghasem Pasandi, Massoud Pedram |
Abstract | Deep neural networks have been successfully deployed in a wide variety of applications including computer vision and speech recognition. However, computational and storage complexity of these models has forced the majority of computations to be performed on high-end computing platforms or on the cloud. To cope with computational and storage complexity of these models, this paper presents a training method that enables a radically different approach for realization of deep neural networks through Boolean logic minimization. The aforementioned realization completely removes the energy-hungry step of accessing memory for obtaining model parameters, consumes about two orders of magnitude fewer computing resources compared to realizations that use floating-point operations, and has a substantially lower latency. |
Tasks | Speech Recognition |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08716v2 |
http://arxiv.org/pdf/1807.08716v2.pdf | |
PWC | https://paperswithcode.com/paper/nullanet-training-deep-neural-networks-for |
Repo | |
Framework | |
End to End Brain Fiber Orientation Estimation using Deep Learning
Title | End to End Brain Fiber Orientation Estimation using Deep Learning |
Authors | Nandakishore Puttashamachar, Ulas Bagci |
Abstract | In this work, we explore the various Brain Neuron tracking techniques, which is one of the most significant applications of Diffusion Tensor Imaging. Tractography provides us with a non-invasive method to analyze underlying tissue micro-structure. Understanding the structure and organization of the tissues facilitates us with a diagnosis method to identify any aberrations and provide acute information on the occurrences of brain ischemia or stroke, the mutation of neurological diseases such as Alzheimer, multiple sclerosis and so on. Time is of the essence, and accurate localization of the aberrations can help save or change a diseased life. Following up with the limitations introduced by the current Tractography techniques such as computational complexity, reconstruction errors during tensor estimation and standardization, we aim to elucidate these limitations through our research findings. We introduce an end to end Deep Learning framework which can accurately estimate the most probable likelihood orientation at each voxel along a neuronal pathway. We use Probabilistic Tractography as our baseline model to obtain the training data and which also serve as a Tractography Gold Standard for our evaluations. Through experiments we show that our Deep Network can do a significant improvement over current Tractography implementations by reducing the run-time complexity to a significant new level. Our architecture also allows for variable sized input DWI signals eliminating the need to worry about memory issues as seen with the traditional techniques. The advantage of this architecture is that it is perfectly desirable to be processed on a cloud setup and utilize the existing multi GPU frameworks to perform whole brain Tractography in minutes rather than hours. We evaluate our network with Gold Standard and benchmark its performance across several parameters. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.03969v1 |
http://arxiv.org/pdf/1806.03969v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-brain-fiber-orientation-estimation |
Repo | |
Framework | |
Capsule networks for low-data transfer learning
Title | Capsule networks for low-data transfer learning |
Authors | Andrew Gritsevskiy, Maksym Korablyov |
Abstract | We propose a capsule network-based architecture for generalizing learning to new data with few examples. Using both generative and non-generative capsule networks with intermediate routing, we are able to generalize to new information over 25 times faster than a similar convolutional neural network. We train the networks on the multiMNIST dataset lacking one digit. After the networks reach their maximum accuracy, we inject 1-100 examples of the missing digit into the training set, and measure the number of batches needed to return to a comparable level of accuracy. We then discuss the improvement in low-data transfer learning that capsule networks bring, and propose future directions for capsule research. |
Tasks | Transfer Learning |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10172v1 |
http://arxiv.org/pdf/1804.10172v1.pdf | |
PWC | https://paperswithcode.com/paper/capsule-networks-for-low-data-transfer |
Repo | |
Framework | |
A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates
Title | A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates |
Authors | Kaiwen Zhou, Fanhua Shang, James Cheng |
Abstract | Recent years have witnessed exciting progress in the study of stochastic variance reduced gradient methods (e.g., SVRG, SAGA), their accelerated variants (e.g., Katyusha) and their extensions in many different settings (e.g., online, sparse, asynchronous, distributed). Among them, accelerated methods enjoy improved convergence rates but have complex coupling structures, which makes them hard to extend to more settings (e.g., sparse and asynchronous) due to the existence of perturbation. In this paper, we introduce a simple stochastic variance reduced algorithm (MiG), which enjoys the best-known convergence rates for both strongly convex and non-strongly convex problems. Moreover, we also present its efficient sparse and asynchronous variants, and theoretically analyze its convergence rates in these settings. Finally, extensive experiments for various machine learning problems such as logistic regression are given to illustrate the practical improvement in both serial and asynchronous settings. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.11027v1 |
http://arxiv.org/pdf/1806.11027v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-stochastic-variance-reduced |
Repo | |
Framework | |
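For readers unfamiliar with the variance-reduction idea the abstract above builds on, here is a minimal sketch of plain SVRG, one of the baseline methods the abstract names. It is not the MiG algorithm. The scalar least-squares objective, the step size, the epoch count, and the function name `svrg` are assumptions chosen for illustration.

```python
import random


def svrg(xs, ys, step=0.02, epochs=30, inner_steps=None, seed=0):
    """Plain SVRG on f(w) = (1/n) * sum_i 0.5 * (x_i * w - y_i)^2, scalar w."""
    rng = random.Random(seed)
    n = len(xs)
    inner_steps = inner_steps or 2 * n

    def grad_i(w, i):
        # Gradient of the i-th term of the objective.
        return (xs[i] * w - ys[i]) * xs[i]

    w = 0.0
    for _ in range(epochs):
        snapshot = w
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = sum(grad_i(snapshot, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced estimate: unbiased for the full gradient at w.
            w -= step * (grad_i(w, i) - grad_i(snapshot, i) + full_grad)
    return w


# Toy data generated from y = 2x, so the minimizer is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_hat = svrg(xs, ys)
```

The correction term `grad_i(w, i) - grad_i(snapshot, i)` shrinks as `w` approaches the snapshot, which is what lets SVRG use a constant step size where plain SGD would need a decaying one.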
Montage based 3D Medical Image Retrieval from Traumatic Brain Injury Cohort using Deep Convolutional Neural Network
Title | Montage based 3D Medical Image Retrieval from Traumatic Brain Injury Cohort using Deep Convolutional Neural Network |
Authors | Cailey I. Kerley, Yuankai Huo, Shikha Chaganti, Shunxing Bao, Mayur B. Patel, Bennett A. Landman |
Abstract | Brain imaging analysis on clinically acquired computed tomography (CT) is essential for the diagnosis, risk prediction of progression, and treatment of the structural phenotypes of traumatic brain injury (TBI). However, in real clinical imaging scenarios, entire body CT images (e.g., neck, abdomen, chest, pelvis) are typically captured along with whole brain CT scans. For instance, in a typical sample of clinical TBI imaging cohort, only ~15% of CT scans actually contain whole brain CT images suitable for volumetric brain analyses; the remaining are partial brain or non-brain images. Therefore, a manual image retrieval process is typically required to isolate the whole brain CT scans from the entire cohort. However, the manual image retrieval is time and resource consuming and even more difficult for the larger cohorts. To alleviate the manual efforts, in this paper we propose an automated 3D medical image retrieval pipeline, called deep montage-based image retrieval (dMIR), which performs classification on 2D montage images via a deep convolutional neural network. The novelty of the proposed method for image processing is to characterize the medical image retrieval task based on the montage images. In a cohort of 2000 clinically acquired TBI scans, 794 scans were used as training data, 206 scans were used as validation data, and the remaining 1000 scans were used as testing data. The proposed method achieved accuracy=1.0, recall=1.0, precision=1.0, f1=1.0 for validation data, and achieved accuracy=0.988, recall=0.962, precision=0.962, f1=0.962 for testing data. Thus, the proposed dMIR is able to perform accurate CT whole brain image retrieval from large-scale clinical cohorts. |
Tasks | Computed Tomography (CT), Image Retrieval, Medical Image Retrieval |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.04118v1 |
http://arxiv.org/pdf/1812.04118v1.pdf | |
PWC | https://paperswithcode.com/paper/montage-based-3d-medical-image-retrieval-from |
Repo | |
Framework | |
Sparsity-based Convolutional Kernel Network for Unsupervised Medical Image Analysis
Title | Sparsity-based Convolutional Kernel Network for Unsupervised Medical Image Analysis |
Authors | Euijoon Ahn, Jinman Kim, Ashnil Kumar, Michael Fulham, Dagan Feng |
Abstract | The availability of large-scale annotated image datasets coupled with recent advances in supervised deep learning methods are enabling the derivation of representative image features that can potentially impact different image analysis problems. However, such supervised approaches are not feasible in the medical domain where it is challenging to obtain a large volume of labelled data due to the complexity of manual annotation and inter- and intra-observer variability in label assignment. Algorithms designed to work on small annotated datasets are useful but have limited applications. In an effort to address the lack of annotated data in the medical image analysis domain, we propose an algorithm for hierarchical unsupervised feature learning. Our algorithm introduces three new contributions: (i) we use kernel learning to identify and represent invariant characteristics across image sub-patches in an unsupervised manner; (ii) we leverage the sparsity inherent to medical image data and propose a new sparse convolutional kernel network (S-CKN) that can be pre-trained in a layer-wise fashion, thereby providing initial discriminative features for medical data; and (iii) we propose a spatial pyramid pooling framework to capture subtle geometric differences in medical image data. Our experiments evaluate our algorithm in two common application areas of medical image retrieval and classification using two public datasets. Our results demonstrate that the medical image feature representations extracted with our algorithm enable a higher accuracy in both application areas compared to features extracted from other conventional unsupervised methods. Furthermore, our approach achieves an accuracy that is competitive with state-of-the-art supervised CNNs. |
Tasks | Image Retrieval, Medical Image Retrieval |
Published | 2018-07-16 |
URL | https://arxiv.org/abs/1807.05648v3 |
https://arxiv.org/pdf/1807.05648v3.pdf | |
PWC | https://paperswithcode.com/paper/sparsity-based-convolutional-kernel-network |
Repo | |
Framework | |
BAR: Bayesian Activity Recognition using variational inference
Title | BAR: Bayesian Activity Recognition using variational inference |
Authors | Ranganath Krishnan, Mahesh Subedar, Omesh Tickoo |
Abstract | Uncertainty estimation in deep neural networks is essential for designing reliable and robust AI systems. Applications such as video surveillance for identifying suspicious activities are designed with deep neural networks (DNNs), but DNNs do not provide uncertainty estimates. Capturing reliable uncertainty estimates in safety and security critical applications will help to establish trust in the AI system. Our contribution is to apply a Bayesian deep learning framework to the visual activity recognition application and quantify model uncertainty along with principled confidence. We utilize the stochastic variational inference technique while training the Bayesian DNNs to infer the approximate posterior distribution around model parameters and perform Monte Carlo sampling on the posterior of model parameters to obtain the predictive distribution. We show that Bayesian inference applied to DNNs provides reliable confidence measures for the visual activity recognition task as compared to conventional DNNs. We also show that our method improves the visual activity recognition precision-recall AUC by 6.2% compared to a non-Bayesian baseline. We evaluate our models on the Moments-In-Time (MiT) activity recognition dataset by selecting a subset of in- and out-of-distribution video samples. |
Tasks | Activity Recognition, Bayesian Inference |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03305v2 |
http://arxiv.org/pdf/1811.03305v2.pdf | |
PWC | https://paperswithcode.com/paper/bar-bayesian-activity-recognition-using |
Repo | |
Framework | |
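The Monte Carlo step described in the abstract above (sample model parameters from the approximate posterior, then average the resulting predictions) can be sketched as follows. This is an illustration under strong assumptions, not the authors' implementation: the model is a single linear layer, the posterior is an independent Gaussian per weight, and the names `mc_predict`, `weight_means`, and `weight_stds` are hypothetical.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_predict(x, weight_means, weight_stds, num_samples=200, seed=0):
    """Average class probabilities over weights drawn from a Gaussian posterior.

    weight_means / weight_stds: one row per class, one entry per input feature.
    Returns (predictive distribution, predictive entropy in nats).
    """
    rng = random.Random(seed)
    avg = [0.0] * len(weight_means)
    for _ in range(num_samples):
        logits = []
        for mu_row, sd_row in zip(weight_means, weight_stds):
            w = [rng.gauss(mu, sd) for mu, sd in zip(mu_row, sd_row)]
            logits.append(sum(wi * xi for wi, xi in zip(w, x)))
        probs = softmax(logits)
        avg = [a + p / num_samples for a, p in zip(avg, probs)]
    entropy = -sum(p * math.log(p) for p in avg if p > 0)
    return avg, entropy


means = [[1.0, 0.0], [0.0, 1.0]]
point_probs, _ = mc_predict([2.0, 0.0], means, [[0.0, 0.0], [0.0, 0.0]])
bayes_probs, bayes_entropy = mc_predict([2.0, 0.0], means, [[1.0, 1.0], [1.0, 1.0]])
```

With all standard deviations set to zero the sampler collapses to a single deterministic network, so `point_probs` is an ordinary softmax output; the entropy of the averaged distribution is one common scalar summary of predictive uncertainty.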
A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Title | A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain |
Authors | Shuai Li, Dinei Florencio, Wanqing Li, Yaqin Zhao, Chris Cook |
Abstract | Detecting camouflaged moving foreground objects has been known to be difficult due to the similarity between the foreground objects and the background. Conventional methods cannot distinguish the foreground from background due to the small differences between them and thus suffer from under-detection of the camouflaged foreground objects. In this paper, we present a fusion framework to address this problem in the wavelet domain. We first show that the small differences in the image domain can be highlighted in certain wavelet bands. Then the likelihood of each wavelet coefficient being foreground is estimated by formulating foreground and background models for each wavelet band. The proposed framework effectively aggregates the likelihoods from different wavelet bands based on the characteristics of the wavelet transform. Experimental results demonstrated that the proposed method significantly outperformed existing methods in detecting camouflaged foreground objects. Specifically, the average F-measure for the proposed algorithm was 0.87, compared to 0.71 to 0.8 for the other state-of-the-art methods. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05984v1 |
http://arxiv.org/pdf/1804.05984v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fusion-framework-for-camouflaged-moving |
Repo | |
Framework | |
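The claim in the abstract above that small image-domain differences can stand out in certain wavelet bands can be illustrated with a one-level Haar transform on a single row of pixels. This sketch shows only that general property; the Haar wavelet, the 1D setting, and the pixel values are assumptions, and it does not reproduce the paper's foreground and background models or its fusion step.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


# Background intensity 100, camouflaged object at intensity 102 (a 2% change).
row = [100, 100, 100, 102, 102, 102, 102, 102]
approx, detail = haar_step(row)
```

In the approximation band the object boundary is a small change on top of values near 141, while in the detail band it is the only nonzero coefficient, which makes it much easier to separate from the background.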