Paper Group ANR 42
Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm. Where is my Phone ? Personal Object Retrieval from Egocentric Images. Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup. Consistent Kernel Mean Estimation for Functions of Random Variables. Deep Amortized Inference for Probabilist …
Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm
Title | Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm |
Authors | Seth D. Billings, Ayushi Sinha, Austin Reiter, Simon Leonard, Masaru Ishii, Gregory D. Hager, Russell H. Taylor |
Abstract | Functional endoscopic sinus surgery (FESS) is a surgical procedure used to treat acute cases of sinusitis and other sinus diseases. FESS is fast becoming the preferred choice of treatment due to its minimally invasive nature. However, due to the limited field of view of the endoscope, surgeons rely on navigation systems to guide them within the nasal cavity. State-of-the-art navigation systems report registration accuracy of over 1 mm, which is large compared to the size of the nasal airways. We present an anatomically constrained video-CT registration algorithm that incorporates multiple video features. Our algorithm is robust in the presence of outliers. We evaluate our algorithm on simulated and in-vivo data, and assess its accuracy under degrading initializations. |
Tasks | |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.07931v1 |
http://arxiv.org/pdf/1610.07931v1.pdf | |
PWC | https://paperswithcode.com/paper/anatomically-constrained-video-ct |
Repo | |
Framework | |
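For intuition about registration that remains robust to outliers, here is a minimal sketch of a generic trimmed rigid alignment step (an SVD/Kabsch fit on the best-matching fraction of correspondences). It is not the V-IMLOP algorithm itself; the given correspondences, the trimming fraction, and the toy data are illustrative assumptions.

```python
# Minimal sketch: trimmed rigid registration (Kabsch fit on inlier correspondences).
# Generic illustration of outlier-robust alignment, NOT the V-IMLOP algorithm.
import numpy as np

def kabsch(src, dst):
    """Least-squares rotation R and translation t mapping src -> dst."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

def trimmed_register(src, dst, keep=0.8, iters=10):
    """Iteratively refit on the `keep` fraction of correspondences with smallest residuals."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        residuals = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        k = max(3, int(keep * len(src)))
        inliers = np.argsort(residuals)[:k]        # drop the largest residuals (outliers)
        R, t = kabsch(src[inliers], dst[inliers])
    return R, t

# Toy usage: recover a known rigid motion despite a few corrupted correspondences.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t_true = np.array([0.1, -0.2, 0.3])
tgt = pts @ R_true.T + t_true
tgt[:20] += rng.normal(scale=5.0, size=(20, 3))    # inject outliers
R_est, t_est = trimmed_register(pts, tgt)
print(np.allclose(R_est, R_true, atol=1e-2), np.round(t_est, 3))
```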
Where is my Phone ? Personal Object Retrieval from Egocentric Images
Title | Where is my Phone ? Personal Object Retrieval from Egocentric Images |
Authors | Cristian Reyes, Eva Mohedano, Kevin McGuinness, Noel E. O’Connor, Xavier Giro-i-Nieto |
Abstract | This work presents a retrieval pipeline and evaluation scheme for the problem of finding the last appearance of personal objects in a large dataset of images captured from a wearable camera. Each personal object is modelled by a small set of images that define a query for a visual search engine. The retrieved results are reranked considering the temporal timestamps of the images to increase the relevance of the later detections. Finally, a temporal interleaving of the results is introduced for robustness against false detections. The Mean Reciprocal Rank is proposed as a metric to evaluate this problem. This application could help in developing personal assistants capable of helping users when they do not remember where they left their personal belongings. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08139v2 |
http://arxiv.org/pdf/1608.08139v2.pdf | |
PWC | https://paperswithcode.com/paper/where-is-my-phone-personal-object-retrieval |
Repo | |
Framework | |
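Two steps of the pipeline lend themselves to a compact illustration: re-ranking retrieved images so that later timestamps gain relevance, and scoring the final ranking with Mean Reciprocal Rank. A minimal sketch follows; the linear blend of visual score and recency is an assumption, not the authors' exact formulation.

```python
# Sketch: temporal re-ranking of retrieval results and Mean Reciprocal Rank (MRR).
# The linear blend of visual score and recency is an illustrative assumption.

def temporal_rerank(results, alpha=0.7):
    """results: list of (image_id, visual_score, timestamp). Later timestamps get a boost."""
    t_min = min(t for _, _, t in results)
    t_max = max(t for _, _, t in results)
    span = max(t_max - t_min, 1e-9)
    def blended(item):
        _, score, t = item
        recency = (t - t_min) / span           # 0 = oldest, 1 = most recent
        return alpha * score + (1 - alpha) * recency
    return sorted(results, key=blended, reverse=True)

def mean_reciprocal_rank(rankings, relevant):
    """rankings: {query: [image_id, ...]}; relevant: {query: id of the true last appearance}."""
    rr = []
    for q, ranked in rankings.items():
        try:
            rr.append(1.0 / (ranked.index(relevant[q]) + 1))
        except ValueError:
            rr.append(0.0)                     # relevant image not retrieved at all
    return sum(rr) / len(rr)

# Toy usage
results = [("img_a", 0.9, 100), ("img_b", 0.8, 900), ("img_c", 0.5, 950)]
ranked = [img for img, _, _ in temporal_rerank(results)]
print(ranked)
print(mean_reciprocal_rank({"phone": ranked}, {"phone": "img_b"}))
```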
Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup
Title | Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup |
Authors | Kilho Son, Ming-Yu Liu, Yuichi Taguchi |
Abstract | Range images captured by Time-of-Flight (ToF) cameras are corrupted with multipath distortions due to interaction between modulated light signals and scenes. The interaction is often complicated, which makes a model-based solution elusive. We propose a learning-based approach for removing the multipath distortions for a ToF camera in a robotic arm setup. Our approach is based on deep learning. We use the robotic arm to automatically collect a large number of ToF range images containing various multipath distortions. The training images are automatically labeled by leveraging a high-precision structured light sensor available only at training time. At test time, we apply the learned model to remove the multipath distortions. This allows our robotic arm setup to enjoy the speed and compact form of the ToF camera without being compromised by its range measurement errors. We conduct extensive experimental validations and compare the proposed method to several baseline algorithms. The experimental results show that our method achieves a 55% error reduction in range estimation and largely outperforms the baseline algorithms. |
Tasks | |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01750v3 |
http://arxiv.org/pdf/1601.01750v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-remove-multipath-distortions-in |
Repo | |
Framework | |
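A rough sketch of the supervised setup described above: pairs of distorted ToF depth maps and reference depth maps, with a small network trained to predict a per-pixel residual correction. The tiny architecture, the loss, and the synthetic data are illustrative assumptions, not the authors' model.

```python
# Sketch: learn a per-pixel correction for ToF depth maps from paired ground truth.
# The tiny fully-convolutional net and L1 loss are illustrative assumptions only.
import torch
import torch.nn as nn

class DepthCorrector(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, tof_depth):
        # Predict a residual correction and add it back (residual learning).
        return tof_depth + self.net(tof_depth)

model = DepthCorrector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

# Toy training pairs: (distorted ToF depth, reference depth from a structured light sensor).
tof = torch.rand(8, 1, 64, 64)
reference = tof - 0.05 * torch.rand_like(tof)      # synthetic stand-in for multipath bias

for step in range(20):
    opt.zero_grad()
    loss = loss_fn(model(tof), reference)
    loss.backward()
    opt.step()
print(float(loss))
```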
Consistent Kernel Mean Estimation for Functions of Random Variables
Title | Consistent Kernel Mean Estimation for Functions of Random Variables |
Authors | Carl-Johann Simon-Gabriel, Adam Ścibior, Ilya Tolstikhin, Bernhard Schölkopf |
Abstract | We provide a theoretical foundation for non-parametric estimation of functions of random variables using kernel mean embeddings. We show that for any continuous function $f$, consistent estimators of the mean embedding of a random variable $X$ lead to consistent estimators of the mean embedding of $f(X)$. For Matérn kernels and sufficiently smooth functions we also provide rates of convergence. Our results extend to functions of multiple random variables. If the variables are dependent, we require an estimator of the mean embedding of their joint distribution as a starting point; if they are independent, it is sufficient to have separate estimators of the mean embeddings of their marginal distributions. In either case, our results cover both mean embeddings based on i.i.d. samples and “reduced set” expansions in terms of dependent expansion points. The latter serves as a justification for using such expansions to limit memory resources when applying the approach as a basis for probabilistic programming. |
Tasks | Probabilistic Programming |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.05950v1 |
http://arxiv.org/pdf/1610.05950v1.pdf | |
PWC | https://paperswithcode.com/paper/consistent-kernel-mean-estimation-for |
Repo | |
Framework | |
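The central statement, that a consistent estimator of the embedding of $X$ yields a consistent estimator of the embedding of $f(X)$ when the expansion points are pushed through $f$ and their weights reused, can be illustrated numerically. A minimal sketch with a Gaussian kernel and uniform i.i.d. weights (both assumptions made for the example):

```python
# Sketch: estimate the kernel mean embedding of f(X) from an estimator of the embedding of X
# by applying f to the expansion points and reusing their weights (here uniform i.i.d. weights).
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between 1-D sample arrays a and b."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def mmd2(a, wa, b, wb, gamma=1.0):
    """Squared RKHS distance between two weighted kernel mean estimates."""
    return (wa @ rbf(a, a, gamma) @ wa
            - 2 * wa @ rbf(a, b, gamma) @ wb
            + wb @ rbf(b, b, gamma) @ wb)

rng = np.random.default_rng(0)
f = np.sin

x = rng.normal(size=500)            # samples / expansion points representing mu_X
w = np.full(len(x), 1.0 / len(x))   # their weights

# Embedding of f(X) induced by the estimator of mu_X: sum_i w_i k(f(x_i), .)
y_pushed = f(x)

# Compare against a direct embedding built from fresh samples of f(X).
y_direct = f(rng.normal(size=500))
wd = np.full(len(y_direct), 1.0 / len(y_direct))
print(mmd2(y_pushed, w, y_direct, wd))   # small: the two estimates of mu_{f(X)} agree
```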
Deep Amortized Inference for Probabilistic Programs
Title | Deep Amortized Inference for Probabilistic Programs |
Authors | Daniel Ritchie, Paul Horsfall, Noah D. Goodman |
Abstract | Probabilistic programming languages (PPLs) are a powerful modeling tool, able to represent any computable probability distribution. Unfortunately, probabilistic program inference is often intractable, and existing PPLs mostly rely on expensive, approximate sampling-based methods. To alleviate this problem, one could try to learn from past inferences, so that future inferences run faster. This strategy is known as amortized inference; it has recently been applied to Bayesian networks and deep generative models. This paper proposes a system for amortized inference in PPLs. In our system, amortization comes in the form of a parameterized guide program. Guide programs have similar structure to the original program, but can have richer data flow, including neural network components. These networks can be optimized so that the guide approximately samples from the posterior distribution defined by the original program. We present a flexible interface for defining guide programs and a stochastic gradient-based scheme for optimizing guide parameters, as well as some preliminary results on automatically deriving guide programs. We explore in detail the common machine learning pattern in which a ‘local’ model is specified by ‘global’ random values and used to generate independent observed data points; this gives rise to amortized local inference supporting global model learning. |
Tasks | Probabilistic Programming |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05735v1 |
http://arxiv.org/pdf/1610.05735v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-amortized-inference-for-probabilistic |
Repo | |
Framework | |
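As a toy illustration of the guide-program idea, namely a network that maps observed data to the parameters of an approximate posterior and is trained by stochastic gradients on the ELBO, the sketch below amortizes inference for a conjugate Gaussian-mean model. The model, guide architecture, and training loop are assumptions for illustration; they are not the paper's system.

```python
# Sketch: amortized variational inference with a neural "guide" for a toy Gaussian model.
# Model: z ~ N(0,1);  x_1..x_m | z ~ N(z, 0.5^2).  Guide: q(z | x) = N(mu(x), sigma(x)^2).
# Architecture and training details are illustrative assumptions.
import torch
import torch.nn as nn

m, obs_sigma = 5, 0.5
normal = torch.distributions.Normal

guide = nn.Sequential(nn.Linear(m, 32), nn.Tanh(), nn.Linear(32, 2))  # outputs (mu, log sigma)
opt = torch.optim.Adam(guide.parameters(), lr=1e-2)

for step in range(2000):
    # Sample synthetic datasets from the model itself (self-supervised amortization).
    z = torch.randn(64, 1)
    x = z + obs_sigma * torch.randn(64, m)

    mu, log_sigma = guide(x).chunk(2, dim=1)
    q = normal(mu, log_sigma.exp())
    z_hat = q.rsample()                               # reparameterized sample from the guide

    log_p = normal(0.0, 1.0).log_prob(z_hat) + \
            normal(z_hat, obs_sigma).log_prob(x).sum(dim=1, keepdim=True)
    elbo = (log_p - q.log_prob(z_hat)).mean()
    opt.zero_grad()
    (-elbo).backward()
    opt.step()

# For this conjugate model the analytic posterior mean is sum(x) / (m + obs_sigma**2);
# a trained guide should land close to it on new data.
x_new = torch.tensor([[0.8, 1.1, 0.9, 1.2, 1.0]])
print(guide(x_new)[0, 0].item(), x_new.sum().item() / (m + obs_sigma**2))
```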
Deep Convolutional Neural Network for Inverse Problems in Imaging
Title | Deep Convolutional Neural Network for Inverse Problems in Imaging |
Authors | Kyong Hwan Jin, Michael T. McCann, Emmanuel Froustey, Michael Unser |
Abstract | In this paper, we propose a novel deep convolutional neural network (CNN)-based algorithm for solving ill-posed inverse problems. Regularized iterative algorithms have emerged as the standard approach to ill-posed inverse problems in the past few decades. These methods produce excellent results, but can be challenging to deploy in practice due to factors including the high computational cost of the forward and adjoint operators and the difficulty of hyperparameter selection. The starting point of our work is the observation that unrolled iterative methods have the form of a CNN (filtering followed by point-wise non-linearity) when the normal operator ($H^*H$, the adjoint of $H$ times $H$) of the forward model is a convolution. Based on this observation, we propose using direct inversion followed by a CNN to solve normal-convolutional inverse problems. The direct inversion encapsulates the physical model of the system, but leads to artifacts when the problem is ill-posed; the CNN combines multiresolution decomposition and residual learning in order to learn to remove these artifacts while preserving image structure. We demonstrate the performance of the proposed network in sparse-view reconstruction (down to 50 views) on parallel beam X-ray computed tomography in synthetic phantoms as well as in real experimental sinograms. The proposed network outperforms total variation-regularized iterative reconstruction for the more realistic phantoms and requires less than a second to reconstruct a 512 x 512 image on GPU. |
Tasks | |
Published | 2016-11-11 |
URL | http://arxiv.org/abs/1611.03679v1 |
http://arxiv.org/pdf/1611.03679v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-neural-network-for-inverse |
Repo | |
Framework | |
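The key observation, that the normal operator $H^*H$ is itself a convolution whenever the forward model is one, so each unrolled gradient step amounts to filtering plus pointwise operations, can be checked on a 1-D toy problem. The sketch below runs plain Landweber iterations; replacing their fixed filters with learned ones is what motivates the CNN, which is not reproduced here.

```python
# Sketch: when the forward model H is a convolution, the normal operator H^T H is again a
# convolution (its kernel is the autocorrelation of h), so each unrolled gradient step is
# filtering followed by pointwise operations. 1-D toy example with a well-conditioned blur.
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.1, 0.8, 0.1])                    # convolutional forward model H (mild blur)
x_true = rng.normal(size=128)
y = np.convolve(x_true, h, mode="same")          # measurements  y = H x

print(np.correlate(h, h, mode="full"))           # kernel of the normal operator H^T H

# Plain gradient (Landweber) iterations:  x <- x - eta * H^T (H x - y).
# Unrolling these steps and replacing the fixed filters with learned ones yields a CNN.
x, eta = np.zeros_like(y), 0.5
for _ in range(300):
    residual = np.convolve(x, h, mode="same") - y
    x = x - eta * np.convolve(residual, h[::-1], mode="same")   # h[::-1]: adjoint of the blur

print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))      # tiny for this mild blur
```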
Robust Energy Storage Scheduling for Imbalance Reduction of Strategically Formed Energy Balancing Groups
Title | Robust Energy Storage Scheduling for Imbalance Reduction of Strategically Formed Energy Balancing Groups |
Authors | Shantanu Chakraborty, Toshiya Okabe |
Abstract | Imbalance (the on-line energy gap between contracted supply and actual demand, and its associated cost) reduction is becoming a crucial service for a Power Producer and Supplier (PPS) in the deregulated energy market. A PPS requires forward market interactions to procure energy as precisely as possible in order to reduce imbalance energy. This paper presents 1) (off-line) an effective demand-aggregation strategy for creating a number of balancing groups that leads to higher predictability of group-wise aggregated demand, and 2) (on-line) a robust energy storage scheduling method that minimizes the imbalance for a particular balancing group while accounting for demand prediction uncertainty. The group formation is performed with a Probabilistic Programming approach using a Bayesian Markov Chain Monte Carlo (MCMC) method applied to historical demand statistics. Apart from forming the groups, the aggregation strategy (with the help of Bayesian Inference) also determines the upper limit of the required storage capacity for each group, a fraction of which is utilized in on-line operation. For on-line operation, a robust energy storage scheduling method is proposed that minimizes expected imbalance energy and cost (a non-linear function of imbalance energy) while incorporating the demand uncertainty of a particular group. The proposed methods are applied to real demand data from apartment buildings in Tokyo, Japan. Simulation results are presented to verify the effectiveness of the proposed methods. |
Tasks | Bayesian Inference, Probabilistic Programming |
Published | 2016-08-30 |
URL | http://arxiv.org/abs/1608.08292v1 |
http://arxiv.org/pdf/1608.08292v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-energy-storage-scheduling-for |
Repo | |
Framework | |
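As a deliberately simplified picture of the on-line step, dispatching storage against the expected gap between contracted supply and scenario demand under power and capacity limits, here is a greedy sketch. It is a plain heuristic for intuition, not the paper's robust scheduling; all numbers are illustrative.

```python
# Sketch: greedy storage dispatch against the expected imbalance of a balancing group.
# Expected imbalance = mean over demand scenarios; limits and numbers are illustrative.
import numpy as np

contracted = np.array([10., 10., 10., 10., 10., 10.])        # procured energy per slot
scenarios = np.array([[9.0, 11.0, 12.0, 10.0, 8.0, 10.0],    # demand scenarios from the
                      [10.0, 12.0, 11.0, 9.0, 9.0, 11.0],    # group's predictive model
                      [9.5, 11.5, 12.5, 10.0, 8.5, 10.5]])

capacity, p_max, soc = 4.0, 1.5, 2.0                          # kWh, kW per slot, initial state

dispatch = []
for t in range(contracted.shape[0]):
    gap = scenarios[:, t].mean() - contracted[t]              # expected shortfall (+) / surplus (-)
    if gap > 0:                                               # discharge to cover a shortfall
        u = min(gap, p_max, soc)
    else:                                                     # charge with surplus energy
        u = -min(-gap, p_max, capacity - soc)
    soc -= u
    dispatch.append(u)

print(np.round(dispatch, 2), round(soc, 2))
```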
EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos
Title | EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos |
Authors | Andru P. Twinanda, Sherif Shehata, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy |
Abstract | Surgical workflow recognition has numerous potential medical applications, such as the automatic indexing of surgical video databases and the optimization of real-time operating room scheduling, among others. As a result, phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. In the literature, two types of features are typically used to perform this task: visual features and tool usage signals. However, the visual features used are mostly handcrafted. Furthermore, the tool usage signals are usually collected via a manual annotation process or by using additional equipment. In this paper, we propose a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies uniquely on visual information. In previous studies, it has been shown that the tool signals can provide valuable information in performing the phase recognition task. Thus, we present a novel CNN architecture, called EndoNet, that is designed to carry out the phase recognition and tool presence detection tasks in a multi-task manner. To the best of our knowledge, this is the first work proposing to use a CNN for multiple recognition tasks on laparoscopic videos. Extensive experimental comparisons to other methods show that EndoNet yields state-of-the-art results for both tasks. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.03012v2 |
http://arxiv.org/pdf/1602.03012v2.pdf | |
PWC | https://paperswithcode.com/paper/endonet-a-deep-architecture-for-recognition |
Repo | |
Framework | |
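The multi-task design, a shared visual backbone feeding a phase-recognition head (multi-class) and a tool-presence head (multi-label) trained jointly, can be sketched in a few lines. The toy backbone, head sizes, and unweighted loss sum are assumptions, not the actual EndoNet architecture.

```python
# Sketch: a shared backbone with two heads, trained jointly for phase recognition
# (multi-class) and tool presence detection (multi-label). Sizes and losses are illustrative.
import torch
import torch.nn as nn

NUM_PHASES, NUM_TOOLS = 7, 7

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.phase_head = nn.Linear(32, NUM_PHASES)   # softmax / cross-entropy
        self.tool_head = nn.Linear(32, NUM_TOOLS)     # sigmoid / BCE (multi-label)

    def forward(self, frames):
        feats = self.backbone(frames)
        return self.phase_head(feats), self.tool_head(feats)

model = MultiTaskNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frames = torch.rand(4, 3, 224, 224)
phase_labels = torch.randint(0, NUM_PHASES, (4,))
tool_labels = torch.randint(0, 2, (4, NUM_TOOLS)).float()

phase_logits, tool_logits = model(frames)
loss = nn.functional.cross_entropy(phase_logits, phase_labels) \
     + nn.functional.binary_cross_entropy_with_logits(tool_logits, tool_labels)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```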
Understanding Anatomy Classification Through Attentive Response Maps
Title | Understanding Anatomy Classification Through Attentive Response Maps |
Authors | Devinder Kumar, Vlado Menkovski, Graham W. Taylor, Alexander Wong |
Abstract | One of the main challenges for broad adoption of deep learning based models, such as convolutional neural networks (CNNs), is the lack of understanding of their decisions. In many applications, a simpler, less capable model that can be easily understood is preferable to a black-box model that has superior performance. In this paper, we present an approach for designing CNNs based on visualization of the internal activations of the model. We visualize the model’s response through attentive response maps obtained using a fractional stride convolution technique and compare the results with known imaging landmarks from the medical literature. We show that sufficiently deep and capable models can be successfully trained to use the same medical landmarks a human expert would use. Our approach not only allows the model’s decision process to be communicated clearly, but also offers insight for detecting biases. |
Tasks | |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06284v3 |
http://arxiv.org/pdf/1611.06284v3.pdf | |
PWC | https://paperswithcode.com/paper/understanding-anatomy-classification-through |
Repo | |
Framework | |
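For a rough feel of activation-based inspection, the sketch below hooks an intermediate layer of a toy classifier and upsamples the channel-averaged activation over the input. This is a crude stand-in for illustration only; it is not the fractional-stride (transposed-convolution) technique behind the paper's attentive response maps.

```python
# Sketch: a crude activation-based response map for inspecting what a CNN attends to.
# NOT the paper's fractional-stride visualization; it simply hooks an intermediate
# layer and upsamples the channel-averaged activation to the input resolution.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 5),   # 5 anatomy classes (toy)
)

activations = {}
def hook(_module, _inputs, output):
    activations["feat"] = output.detach()

model[3].register_forward_hook(hook)              # watch the second conv layer

image = torch.rand(1, 1, 128, 128)                # stand-in for a medical image
_ = model(image)

feat = activations["feat"].mean(dim=1, keepdim=True)            # average over channels
response = torch.nn.functional.interpolate(feat, size=image.shape[-2:],
                                            mode="bilinear", align_corners=False)
print(response.shape)                             # (1, 1, 128, 128) map over the input
```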
Multi-Cue Zero-Shot Learning with Strong Supervision
Title | Multi-Cue Zero-Shot Learning with Strong Supervision |
Authors | Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele |
Abstract | Scaling up visual category recognition to large numbers of classes remains challenging. A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes. Ultimately, this may allow the use of the textbook knowledge that humans employ to learn about new classes, transferring knowledge from classes they know well. The most successful zero-shot learning approaches currently require a particular type of auxiliary information – namely attribute annotations performed by humans – that is not readily available for most classes. Our goal is to circumvent this bottleneck by replacing such annotations with multiple pieces of information extracted from multiple unstructured text sources readily available on the web. To compensate for the weaker form of auxiliary information, we incorporate stronger supervision in the form of semantic part annotations on the classes from which we transfer knowledge. We achieve our goal by a joint embedding framework that maps multiple text parts as well as multiple semantic parts into a common space. Our results consistently and significantly improve on the state-of-the-art in zero-shot recognition and retrieval. |
Tasks | Zero-Shot Learning |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08754v1 |
http://arxiv.org/pdf/1603.08754v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-cue-zero-shot-learning-with-strong |
Repo | |
Framework | |
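A common way to realize such a joint embedding is a bilinear compatibility $F(x, y) = x^\top W y$ between an image embedding and a class embedding built from text or part cues, trained with a ranking loss. The sketch below shows one SGD step of that generic recipe; dimensions, margin, and the update rule are illustrative assumptions rather than the authors' exact objective.

```python
# Sketch: bilinear compatibility F(x, y) = x^T W y between an image embedding x and a
# class embedding y (e.g. built from text cues), trained with a ranking hinge loss.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_cls, n_classes = 64, 32, 10
W = 0.01 * rng.normal(size=(d_img, d_cls))
class_embeddings = rng.normal(size=(n_classes, d_cls))     # e.g. aggregated text vectors

def compatibility(x, W, Y):
    return Y @ (W.T @ x)                                   # scores for every class

def sgd_step(x, y_true, W, Y, margin=0.1, lr=0.01):
    scores = compatibility(x, W, Y)
    y_wrong = int(np.argmax(scores + margin * (np.arange(len(Y)) != y_true)))
    if y_wrong == y_true:
        return W                                           # no margin violation
    # Hinge loss  max(0, margin + F(x, y_wrong) - F(x, y_true)); subgradient w.r.t. W:
    grad = np.outer(x, Y[y_wrong] - Y[y_true])
    return W - lr * grad

# Toy usage: one update for a training image whose class index is 3.
x = rng.normal(size=d_img)
W = sgd_step(x, 3, W, class_embeddings)
print(compatibility(x, W, class_embeddings).round(2))
```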
Analysis of the Human-Computer Interaction on the Example of Image-based CAPTCHA by Association Rule Mining
Title | Analysis of the Human-Computer Interaction on the Example of Image-based CAPTCHA by Association Rule Mining |
Authors | Darko Brodić, Alessia Amelio |
Abstract | The paper analyzes the interaction between humans and computers in terms of response time in solving image-based CAPTCHAs. In particular, the analysis focuses on how easily different Internet users solve four types of image-based CAPTCHAs that include facial expressions: animated character, old woman, surprised face, and worried face. To pursue this goal, an experiment is conducted involving 100 Internet users, differentiated by age, Internet experience, and education level, in solving the four types of CAPTCHAs. The response times are collected for each user. Then, association rules are extracted from the user data to evaluate, by statistical analysis, how the response time in solving the CAPTCHA depends on age, education level, and experience in Internet usage. The results implicitly capture the users’ psychological states, showing in which states the users are more sensitive. This constitutes a novel and meaningful analysis with respect to the state of the art. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00203v2 |
http://arxiv.org/pdf/1612.00203v2.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-the-human-computer-interaction-on |
Repo | |
Framework | |
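Association rules are summarized by support and confidence over the user records. A minimal sketch with made-up attributes and thresholds:

```python
# Sketch: support and confidence of association rules over user records, e.g.
# {"age<35", "experienced", "fast_response"}. Attributes and thresholds are illustrative.
from itertools import combinations

records = [
    {"age<35", "experienced", "fast_response"},
    {"age<35", "experienced", "fast_response"},
    {"age>=35", "novice", "slow_response"},
    {"age<35", "novice", "slow_response"},
    {"age>=35", "experienced", "fast_response"},
]

def support(itemset):
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent):
    return support(antecedent | consequent) / support(antecedent)

# Enumerate simple one-item -> one-item rules above minimum support and confidence.
items = sorted(set().union(*records))
for a, b in combinations(items, 2):
    for lhs, rhs in (({a}, {b}), ({b}, {a})):
        if support(lhs | rhs) >= 0.4 and confidence(lhs, rhs) >= 0.8:
            print(f"{lhs} -> {rhs}  support={support(lhs | rhs):.2f} "
                  f"confidence={confidence(lhs, rhs):.2f}")
```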
DisCSPs with Privacy Recast as Planning Problems for Utility-based Agents
Title | DisCSPs with Privacy Recast as Planning Problems for Utility-based Agents |
Authors | Julien Savaux, Julien Vion, Sylvain Piechowiak, René Mandiau, Toshihiro Matsui, Katsutoshi Hirayama, Makoto Yokoo, Shakre Elmane, Marius Silaghi |
Abstract | Privacy has traditionally been a major motivation for decentralized problem solving. However, even though several metrics have been proposed to quantify it, none of them is easily integrated with common solvers. Constraint programming is a fundamental paradigm used to approach various families of problems. We introduce Utilitarian Distributed Constraint Satisfaction Problems (UDisCSP), where the utility of each state is estimated as the difference between the expected rewards for agreements on assignments for shared variables and the expected cost of privacy loss. Therefore, a traditional DisCSP with privacy requirements is viewed as a planning problem. The actions available to agents are communication and local inference. Common decentralized solvers are evaluated here from the point of view of their interpretation as greedy planners. Further, we investigate some simple extensions where these solvers start taking the utility function into account. In these extensions we assume that the planning problem further restricts the set of communication actions to only the communication primitives present in the corresponding solver protocols. The solvers obtained for the new type of problems propose the action (communication/inference) to be performed in each situation, thereby defining the policy. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06790v1 |
http://arxiv.org/pdf/1604.06790v1.pdf | |
PWC | https://paperswithcode.com/paper/discsps-with-privacy-recast-as-planning |
Repo | |
Framework | |
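The utility formulation can be made concrete with a toy comparison: an agent weighs the expected reward of reaching agreement by communicating an assignment against the expected cost of the privacy lost by that communication. The numbers and the decision rule below are illustrative assumptions.

```python
# Sketch: a UDisCSP-style utility comparison between communicating an assignment and
# relying on local inference. All numbers and the decision rule are illustrative.

def utility(p_agreement, reward, privacy_cost):
    return p_agreement * reward - privacy_cost

actions = {
    "reveal_value":    utility(p_agreement=0.9, reward=10.0, privacy_cost=4.0),  # communicate
    "local_inference": utility(p_agreement=0.6, reward=10.0, privacy_cost=0.0),  # keep private
}

best = max(actions, key=actions.get)
print(actions, "->", best)
```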
The happiness paradox: your friends are happier than you
Title | The happiness paradox: your friends are happier than you |
Authors | Johan Bollen, Bruno Gonçalves, Ingrid van de Leemput, Guangchen Ruan |
Abstract | Most individuals in social networks experience a so-called Friendship Paradox: they are less popular than their friends on average. This effect may explain recent findings that widespread social network media use leads to reduced happiness. However, the relation between popularity and happiness is poorly understood. A Friendship Paradox does not necessarily imply a Happiness Paradox, where most individuals are less happy than their friends. Here we report the first direct observation of a significant Happiness Paradox in a large-scale online social network of $39,110$ Twitter users. Our results reveal that popular individuals are indeed happier and that a majority of individuals experience a significant Happiness Paradox. The magnitude of the latter effect is shaped by complex interactions between individual popularity, happiness, and the fact that users cluster assortatively by level of happiness. Our results indicate that the topology of online social networks and the distribution of happiness in some populations can cause widespread psycho-social effects that affect the well-being of billions of individuals. |
Tasks | |
Published | 2016-02-08 |
URL | http://arxiv.org/abs/1602.02665v1 |
http://arxiv.org/pdf/1602.02665v1.pdf | |
PWC | https://paperswithcode.com/paper/the-happiness-paradox-your-friends-are |
Repo | |
Framework | |
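Both paradoxes reduce to a simple per-user comparison: is the mean popularity (or happiness) of my friends higher than my own? A toy sketch on a five-user graph:

```python
# Sketch: measuring friendship and happiness paradoxes in a small social graph.
# A user experiences the paradox when their friends' mean popularity (or happiness)
# exceeds their own. Graph and scores are toy data.
friends = {
    "a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d", "e"},
    "d": {"b", "c"}, "e": {"c"},
}
happiness = {"a": 0.4, "b": 0.7, "c": 0.9, "d": 0.5, "e": 0.3}

def paradox_fraction(score):
    affected = 0
    for user, fs in friends.items():
        friend_mean = sum(score[f] for f in fs) / len(fs)
        affected += score[user] < friend_mean
    return affected / len(friends)

degree = {u: len(fs) for u, fs in friends.items()}
print("friendship paradox:", paradox_fraction(degree))
print("happiness paradox:", paradox_fraction(happiness))
```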
Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Title | Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning |
Authors | Daniel J. Luckett, Eric B. Laber, Anna R. Kahkoska, David M. Maahs, Elizabeth Mayer-Davis, Michael R. Kosorok |
Abstract | The vision for precision medicine is to use individual patient characteristics to inform a personalized treatment plan that leads to the best healthcare possible for each patient. Mobile technologies have an important role to play in this vision as they offer a means to monitor a patient’s health status in real-time and subsequently to deliver interventions if, when, and in the dose that they are needed. Dynamic treatment regimes formalize individualized treatment plans as sequences of decision rules, one per stage of clinical intervention, that map current patient information to a recommended treatment. However, existing methods for estimating optimal dynamic treatment regimes are designed for a small number of fixed decision points occurring on a coarse time-scale. We propose a new reinforcement learning method for estimating an optimal treatment regime that is applicable to data collected using mobile technologies in an outpatient setting. The proposed method accommodates an indefinite time horizon and minute-by-minute decision making that are common in mobile health applications. We show the proposed estimators are consistent and asymptotically normal under mild conditions. The proposed methods are applied to estimate an optimal dynamic treatment regime for controlling blood glucose levels in patients with type 1 diabetes. |
Tasks | Decision Making |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.03531v2 |
http://arxiv.org/pdf/1611.03531v2.pdf | |
PWC | https://paperswithcode.com/paper/estimating-dynamic-treatment-regimes-in |
Repo | |
Framework | |
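The paper's estimator is not spelled out in the abstract; as generic background on value estimation over an indefinite horizon, here is a sketch of TD(0) with linear features evaluating a fixed candidate policy on a toy simulator. It is explicitly not the V-learning method; the dynamics, features, and rewards are made up for illustration.

```python
# Sketch: on-line estimation of a state-value function with linear features via TD(0),
# a generic ingredient of indefinite-horizon value methods. NOT the paper's V-learning
# estimator; the toy simulator loosely mimics minute-by-minute mobile-health updates.
import numpy as np

rng = np.random.default_rng(0)

def features(state):                       # e.g. (deviation from target, bias) as toy features
    return np.array([state, 1.0])

def step(state, action):                   # toy dynamics and reward
    next_state = 0.9 * state - 0.3 * action + 0.1 * rng.normal()
    reward = -abs(next_state)              # penalize deviation from the target (0)
    return next_state, reward

def policy(state):                         # fixed candidate policy to evaluate
    return 1.0 if state > 0 else -1.0

theta = np.zeros(2)                        # value-function weights
gamma, alpha = 0.9, 0.05
state = 1.0
for _ in range(5000):
    action = policy(state)
    next_state, reward = step(state, action)
    td_error = reward + gamma * features(next_state) @ theta - features(state) @ theta
    theta += alpha * td_error * features(state)
    state = next_state

print(theta)                               # estimated value is theta @ features(state)
```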
A Generic Coordinate Descent Framework for Learning from Implicit Feedback
Title | A Generic Coordinate Descent Framework for Learning from Implicit Feedback |
Authors | Immanuel Bayer, Xiangnan He, Bhargav Kanagal, Steffen Rendle |
Abstract | In recent years, interest in recommender research has shifted from explicit feedback towards implicit feedback data. A diversity of complex models has been proposed for a wide variety of applications. Despite this, learning from implicit feedback is still computationally challenging. So far, most work relies on stochastic gradient descent (SGD) solvers which are easy to derive, but in practice challenging to apply, especially for tasks with many items. For the simple matrix factorization model, an efficient coordinate descent (CD) solver has been previously proposed. However, efficient CD approaches have not been derived for more complex models. In this paper, we provide a new framework for deriving efficient CD algorithms for complex recommender models. We identify and introduce the property of k-separable models. We show that k-separability is a sufficient property to allow efficient optimization of implicit recommender problems with CD. We illustrate this framework on a variety of state-of-the-art models including factorization machines and Tucker decomposition. To summarize, our work provides the theory and building blocks to derive efficient implicit CD algorithms for complex recommender models. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04666v1 |
http://arxiv.org/pdf/1611.04666v1.pdf | |
PWC | https://paperswithcode.com/paper/a-generic-coordinate-descent-framework-for |
Repo | |
Framework | |
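The flavor of element-wise coordinate descent for implicit feedback can be shown with the naive (inefficient) version: a weighted squared loss over all user-item cells with closed-form updates of one latent dimension at a time. Making such updates efficient for richer, k-separable models is the paper's contribution; the sketch below only illustrates the update itself, with made-up data and weights.

```python
# Sketch: naive element-wise coordinate descent for implicit-feedback matrix factorization
# (weighted squared loss over ALL user-item cells: weight alpha for observed interactions,
# a small weight w0 for the unobserved rest). This is the plain O(|U||I|k) version.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 30, 40, 4
R = (rng.random((n_users, n_items)) < 0.1).astype(float)    # implicit feedback (0/1)
W = np.where(R > 0, 5.0, 0.1)                               # confidence weights
lam = 0.05

P = 0.1 * rng.normal(size=(n_users, k))
Q = 0.1 * rng.normal(size=(n_items, k))

for sweep in range(20):
    E = R - P @ Q.T                                          # residual matrix
    for f in range(k):
        # Update one latent dimension of P, then of Q, holding everything else fixed.
        E_hat = E + np.outer(P[:, f], Q[:, f])               # residual excluding factor f
        P[:, f] = ((W * E_hat) @ Q[:, f]) / (W @ (Q[:, f] ** 2) + lam)
        E = E_hat - np.outer(P[:, f], Q[:, f])
        E_hat = E + np.outer(P[:, f], Q[:, f])
        Q[:, f] = ((W * E_hat).T @ P[:, f]) / (W.T @ (P[:, f] ** 2) + lam)
        E = E_hat - np.outer(P[:, f], Q[:, f])

print(float(np.sum(W * (R - P @ Q.T) ** 2)))                 # weighted reconstruction loss
```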