Paper Group ANR 42
Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm. Where is my Phone ? Personal Object Retrieval from Egocentric Images. Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup. Consistent Kernel Mean Estimation for Functions of Random Variables. Deep Amortized Inference for Probabilist …
Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm
Title | Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm |
Authors | Seth D. Billings, Ayushi Sinha, Austin Reiter, Simon Leonard, Masaru Ishii, Gregory D. Hager, Russell H. Taylor |
Abstract | Functional endoscopic sinus surgery (FESS) is a surgical procedure used to treat acute cases of sinusitis and other sinus diseases. FESS is fast becoming the preferred choice of treatment due to its minimally invasive nature. However, due to the limited field of view of the endoscope, surgeons rely on navigation systems to guide them within the nasal cavity. State-of-the-art navigation systems report registration accuracy of over 1 mm, which is large compared to the size of the nasal airways. We present an anatomically constrained video-CT registration algorithm that incorporates multiple video features. Our algorithm is robust in the presence of outliers. We evaluate our algorithm on simulated and in-vivo data, and assess its accuracy under degrading initializations. |
Tasks | |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.07931v1 |
http://arxiv.org/pdf/1610.07931v1.pdf | |
PWC | https://paperswithcode.com/paper/anatomically-constrained-video-ct |
Repo | |
Framework | |
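For intuition about registration that remains robust to outliers, here is a minimal sketch of a generic trimmed rigid alignment step (an SVD/Kabsch fit on the best-matching fraction of correspondences). It is not the V-IMLOP algorithm itself; the given correspondences, the trimming fraction, and the toy data are illustrative assumptions.

```python
# Minimal sketch: trimmed rigid registration (Kabsch fit on inlier correspondences).
# Generic illustration of outlier-robust alignment, NOT the V-IMLOP algorithm.
import numpy as np

def kabsch(src, dst):
    """Least-squares rotation R and translation t mapping src -> dst."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

def trimmed_register(src, dst, keep=0.8, iters=10):
    """Iteratively refit on the `keep` fraction of correspondences with smallest residuals."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        residuals = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        k = max(3, int(keep * len(src)))
        inliers = np.argsort(residuals)[:k]        # drop the largest residuals (outliers)
        R, t = kabsch(src[inliers], dst[inliers])
    return R, t

# Toy usage: recover a known rigid motion despite a few corrupted correspondences.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t_true = np.array([0.1, -0.2, 0.3])
tgt = pts @ R_true.T + t_true
tgt[:20] += rng.normal(scale=5.0, size=(20, 3))    # inject outliers
R_est, t_est = trimmed_register(pts, tgt)
print(np.allclose(R_est, R_true, atol=1e-2), np.round(t_est, 3))
```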
Where is my Phone ? Personal Object Retrieval from Egocentric Images
Title | Where is my Phone ? Personal Object Retrieval from Egocentric Images |
Authors | Cristian Reyes, Eva Mohedano, Kevin McGuinness, Noel E. O’Connor, Xavier Giro-i-Nieto |
Abstract | This work presents a retrieval pipeline and evaluation scheme for the problem of finding the last appearance of personal objects in a large dataset of images captured from a wearable camera. Each personal object is modelled by a small set of images that define a query for a visual search engine. The retrieved results are reranked considering the temporal timestamps of the images to increase the relevance of the later detections. Finally, a temporal interleaving of the results is introduced for robustness against false detections. The Mean Reciprocal Rank is proposed as a metric to evaluate this problem. This application could help in developing personal assistants capable of helping users when they do not remember where they left their personal belongings. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08139v2 |
http://arxiv.org/pdf/1608.08139v2.pdf | |
PWC | https://paperswithcode.com/paper/where-is-my-phone-personal-object-retrieval |
Repo | |
Framework | |
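Two steps of the pipeline lend themselves to a compact illustration: re-ranking retrieved images so that later timestamps gain relevance, and scoring the final ranking with Mean Reciprocal Rank. A minimal sketch follows; the linear blend of visual score and recency is an assumption, not the authors' exact formulation.

```python
# Sketch: temporal re-ranking of retrieval results and Mean Reciprocal Rank (MRR).
# The linear blend of visual score and recency is an illustrative assumption.

def temporal_rerank(results, alpha=0.7):
    """results: list of (image_id, visual_score, timestamp). Later timestamps get a boost."""
    t_min = min(t for _, _, t in results)
    t_max = max(t for _, _, t in results)
    span = max(t_max - t_min, 1e-9)
    def blended(item):
        _, score, t = item
        recency = (t - t_min) / span           # 0 = oldest, 1 = most recent
        return alpha * score + (1 - alpha) * recency
    return sorted(results, key=blended, reverse=True)

def mean_reciprocal_rank(rankings, relevant):
    """rankings: {query: [image_id, ...]}; relevant: {query: id of the true last appearance}."""
    rr = []
    for q, ranked in rankings.items():
        try:
            rr.append(1.0 / (ranked.index(relevant[q]) + 1))
        except ValueError:
            rr.append(0.0)                     # relevant image not retrieved at all
    return sum(rr) / len(rr)

# Toy usage
results = [("img_a", 0.9, 100), ("img_b", 0.8, 900), ("img_c", 0.5, 950)]
ranked = [img for img, _, _ in temporal_rerank(results)]
print(ranked)
print(mean_reciprocal_rank({"phone": ranked}, {"phone": "img_b"}))
```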
Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup
Title | Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup |
Authors | Kilho Son, Ming-Yu Liu, Yuichi Taguchi |
Abstract | Range images captured by Time-of-Flight (ToF) cameras are corrupted with multipath distortions due to interaction between modulated light signals and scenes. The interaction is often complicated, which makes a model-based solution elusive. We propose a learning-based approach for removing the multipath distortions for a ToF camera in a robotic arm setup. Our approach is based on deep learning. We use the robotic arm to automatically collect a large number of ToF range images containing various multipath distortions. The training images are automatically labeled by leveraging a high-precision structured light sensor available only at training time. At test time, we apply the learned model to remove the multipath distortions. This allows our robotic arm setup to enjoy the speed and compact form of the ToF camera without being compromised by its range measurement errors. We conduct extensive experimental validations and compare the proposed method to several baseline algorithms. The experimental results show that our method achieves a 55% error reduction in range estimation and largely outperforms the baseline algorithms. |
Tasks | |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01750v3 |
http://arxiv.org/pdf/1601.01750v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-remove-multipath-distortions-in |
Repo | |
Framework | |
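A rough sketch of the supervised setup described above: pairs of distorted ToF depth maps and reference depth maps, with a small network trained to predict a per-pixel residual correction. The tiny architecture, the loss, and the synthetic data are illustrative assumptions, not the authors' model.

```python
# Sketch: learn a per-pixel correction for ToF depth maps from paired ground truth.
# The tiny fully-convolutional net and L1 loss are illustrative assumptions only.
import torch
import torch.nn as nn

class DepthCorrector(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, tof_depth):
        # Predict a residual correction and add it back (residual learning).
        return tof_depth + self.net(tof_depth)

model = DepthCorrector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

# Toy training pairs: (distorted ToF depth, reference depth from a structured light sensor).
tof = torch.rand(8, 1, 64, 64)
reference = tof - 0.05 * torch.rand_like(tof)      # synthetic stand-in for multipath bias

for step in range(20):
    opt.zero_grad()
    loss = loss_fn(model(tof), reference)
    loss.backward()
    opt.step()
print(float(loss))
```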
Consistent Kernel Mean Estimation for Functions of Random Variables
Title | Consistent Kernel Mean Estimation for Functions of Random Variables |
Authors | Carl-Johann Simon-Gabriel, Adam Ścibior, Ilya Tolstikhin, Bernhard Schölkopf |
Abstract | We provide a theoretical foundation for non-parametric estimation of functions of random variables using kernel mean embeddings. We show that for any continuous function $f$, consistent estimators of the mean embedding of a random variable $X$ lead to consistent estimators of the mean embedding of $f(X)$. For Matérn kernels and sufficiently smooth functions we also provide rates of convergence. Our results extend to functions of multiple random variables. If the variables are dependent, we require an estimator of the mean embedding of their joint distribution as a starting point; if they are independent, it is sufficient to have separate estimators of the mean embeddings of their marginal distributions. In either case, our results cover both mean embeddings based on i.i.d. samples and “reduced set” expansions in terms of dependent expansion points. The latter serves as a justification for using such expansions to limit memory resources when applying the approach as a basis for probabilistic programming. |
Tasks | Probabilistic Programming |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.05950v1 |
http://arxiv.org/pdf/1610.05950v1.pdf | |
PWC | https://paperswithcode.com/paper/consistent-kernel-mean-estimation-for |
Repo | |
Framework | |
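The central statement, that a consistent estimator of the embedding of $X$ yields a consistent estimator of the embedding of $f(X)$ when the expansion points are pushed through $f$ and their weights reused, can be illustrated numerically. A minimal sketch with a Gaussian kernel and uniform i.i.d. weights (both assumptions made for the example):

```python
# Sketch: estimate the kernel mean embedding of f(X) from an estimator of the embedding of X
# by applying f to the expansion points and reusing their weights (here uniform i.i.d. weights).
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between 1-D sample arrays a and b."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def mmd2(a, wa, b, wb, gamma=1.0):
    """Squared RKHS distance between two weighted kernel mean estimates."""
    return (wa @ rbf(a, a, gamma) @ wa
            - 2 * wa @ rbf(a, b, gamma) @ wb
            + wb @ rbf(b, b, gamma) @ wb)

rng = np.random.default_rng(0)
f = np.sin

x = rng.normal(size=500)            # samples / expansion points representing mu_X
w = np.full(len(x), 1.0 / len(x))   # their weights

# Embedding of f(X) induced by the estimator of mu_X: sum_i w_i k(f(x_i), .)
y_pushed = f(x)

# Compare against a direct embedding built from fresh samples of f(X).
y_direct = f(rng.normal(size=500))
wd = np.full(len(y_direct), 1.0 / len(y_direct))
print(mmd2(y_pushed, w, y_direct, wd))   # small: the two estimates of mu_{f(X)} agree
```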
Deep Amortized Inference for Probabilistic Programs
Title | Deep Amortized Inference for Probabilistic Programs |
Authors | Daniel Ritchie, Paul Horsfall, Noah D. Goodman |
Abstract | Probabilistic programming languages (PPLs) are a powerful modeling tool, able to represent any computable probability distribution. Unfortunately, probabilistic program inference is often intractable, and existing PPLs mostly rely on expensive, approximate sampling-based methods. To alleviate this problem, one could try to learn from past inferences, so that future inferences run faster. This strategy is known as amortized inference; it has recently been applied to Bayesian networks and deep generative models. This paper proposes a system for amortized inference in PPLs. In our system, amortization comes in the form of a parameterized guide program. Guide programs have similar structure to the original program, but can have richer data flow, including neural network components. These networks can be optimized so that the guide approximately samples from the posterior distribution defined by the original program. We present a flexible interface for defining guide programs and a stochastic gradient-based scheme for optimizing guide parameters, as well as some preliminary results on automatically deriving guide programs. We explore in detail the common machine learning pattern in which a ‘local’ model is specified by ‘global’ random values and used to generate independent observed data points; this gives rise to amortized local inference supporting global model learning. |
Tasks | Probabilistic Programming |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05735v1 |
http://arxiv.org/pdf/1610.05735v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-amortized-inference-for-probabilistic |
Repo | |
Framework | |
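As a toy illustration of the guide-program idea, namely a network that maps observed data to the parameters of an approximate posterior and is trained by stochastic gradients on the ELBO, the sketch below amortizes inference for a conjugate Gaussian-mean model. The model, guide architecture, and training loop are assumptions for illustration; they are not the paper's system.

```python
# Sketch: amortized variational inference with a neural "guide" for a toy Gaussian model.
# Model: z ~ N(0,1);  x_1..x_m | z ~ N(z, 0.5^2).  Guide: q(z | x) = N(mu(x), sigma(x)^2).
# Architecture and training details are illustrative assumptions.
import torch
import torch.nn as nn

m, obs_sigma = 5, 0.5
normal = torch.distributions.Normal

guide = nn.Sequential(nn.Linear(m, 32), nn.Tanh(), nn.Linear(32, 2))  # outputs (mu, log sigma)
opt = torch.optim.Adam(guide.parameters(), lr=1e-2)

for step in range(2000):
    # Sample synthetic datasets from the model itself (self-supervised amortization).
    z = torch.randn(64, 1)
    x = z + obs_sigma * torch.randn(64, m)

    mu, log_sigma = guide(x).chunk(2, dim=1)
    q = normal(mu, log_sigma.exp())
    z_hat = q.rsample()                               # reparameterized sample from the guide

    log_p = normal(0.0, 1.0).log_prob(z_hat) + \
            normal(z_hat, obs_sigma).log_prob(x).sum(dim=1, keepdim=True)
    elbo = (log_p - q.log_prob(z_hat)).mean()
    opt.zero_grad()
    (-elbo).backward()
    opt.step()

# For this conjugate model the analytic posterior mean is sum(x) / (m + obs_sigma**2);
# a trained guide should land close to it on new data.
x_new = torch.tensor([[0.8, 1.1, 0.9, 1.2, 1.0]])
print(guide(x_new)[0, 0].item(), x_new.sum().item() / (m + obs_sigma**2))
```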
Deep Convolutional Neural Network for Inverse Problems in Imaging
Title | Deep Convolutional Neural Network for Inverse Problems in Imaging |
Authors | Kyong Hwan Jin, Michael T. McCann, Emmanuel Froustey, Michael Unser |
Abstract | In this paper, we propose a novel deep convolutional neural network (CNN)-based algorithm for solving ill-posed inverse problems. Regularized iterative algorithms have emerged as the standard approach to ill-posed inverse problems in the past few decades. These methods produce excellent results, but can be challenging to deploy in practice due to factors including the high computational cost of the forward and adjoint operators and the difficulty of hyperparameter selection. The starting point of our work is the observation that unrolled iterative methods have the form of a CNN (filtering followed by point-wise non-linearity) when the normal operator ($H^*H$, the adjoint of $H$ times $H$) of the forward model is a convolution. Based on this observation, we propose using direct inversion followed by a CNN to solve normal-convolutional inverse problems. The direct inversion encapsulates the physical model of the system, but leads to artifacts when the problem is ill-posed; the CNN combines multiresolution decomposition and residual learning in order to learn to remove these artifacts while preserving image structure. We demonstrate the performance of the proposed network in sparse-view reconstruction (down to 50 views) on parallel beam X-ray computed tomography in synthetic phantoms as well as in real experimental sinograms. The proposed network outperforms total variation-regularized iterative reconstruction for the more realistic phantoms and requires less than a second to reconstruct a 512 x 512 image on GPU. |
Tasks | |
Published | 2016-11-11 |
URL | http://arxiv.org/abs/1611.03679v1 |
http://arxiv.org/pdf/1611.03679v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-neural-network-for-inverse |
Repo | |
Framework | |
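The key observation, that the normal operator $H^*H$ is itself a convolution whenever the forward model is one, so each unrolled gradient step amounts to filtering plus pointwise operations, can be checked on a 1-D toy problem. The sketch below runs plain Landweber iterations; replacing their fixed filters with learned ones is what motivates the CNN, which is not reproduced here.

```python
# Sketch: when the forward model H is a convolution, the normal operator H^T H is again a
# convolution (its kernel is the autocorrelation of h), so each unrolled gradient step is
# filtering followed by pointwise operations. 1-D toy example with a well-conditioned blur.
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.1, 0.8, 0.1])                    # convolutional forward model H (mild blur)
x_true = rng.normal(size=128)
y = np.convolve(x_true, h, mode="same")          # measurements  y = H x

print(np.correlate(h, h, mode="full"))           # kernel of the normal operator H^T H

# Plain gradient (Landweber) iterations:  x <- x - eta * H^T (H x - y).
# Unrolling these steps and replacing the fixed filters with learned ones yields a CNN.
x, eta = np.zeros_like(y), 0.5
for _ in range(300):
    residual = np.convolve(x, h, mode="same") - y
    x = x - eta * np.convolve(residual, h[::-1], mode="same")   # h[::-1]: adjoint of the blur

print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))      # tiny for this mild blur
```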
Robust Energy Storage Scheduling for Imbalance Reduction of Strategically Formed Energy Balancing Groups
Title | Robust Energy Storage Scheduling for Imbalance Reduction of Strategically Formed Energy Balancing Groups |
Authors | Shantanu Chakraborty, Toshiya Okabe |
Abstract | Imbalance (the on-line energy gap between contracted supply and actual demand, and its associated cost) reduction is becoming a crucial service for a Power Producer and Supplier (PPS) in the deregulated energy market. A PPS requires forward market interactions to procure energy as precisely as possible in order to reduce imbalance energy. This paper presents 1) (off-line) an effective demand-aggregation strategy for creating a number of balancing groups that leads to higher predictability of group-wise aggregated demand, and 2) (on-line) a robust energy storage scheduling method that minimizes the imbalance for a particular balancing group while accounting for demand prediction uncertainty. The group formation is performed with a Probabilistic Programming approach using a Bayesian Markov Chain Monte Carlo (MCMC) method applied to historical demand statistics. Apart from forming the groups, the aggregation strategy (with the help of Bayesian Inference) also determines the upper limit of the required storage capacity for each group, a fraction of which is utilized in on-line operation. For on-line operation, a robust energy storage scheduling method is proposed that minimizes expected imbalance energy and cost (a non-linear function of imbalance energy) while incorporating the demand uncertainty of a particular group. The proposed methods are applied to real demand data from apartment buildings in Tokyo, Japan. Simulation results are presented to verify the effectiveness of the proposed methods. |
Tasks | Bayesian Inference, Probabilistic Programming |
Published | 2016-08-30 |
URL | http://arxiv.org/abs/1608.08292v1 |
http://arxiv.org/pdf/1608.08292v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-energy-storage-scheduling-for |
Repo | |
Framework | |
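As a deliberately simplified picture of the on-line step, dispatching storage against the expected gap between contracted supply and scenario demand under power and capacity limits, here is a greedy sketch. It is a plain heuristic for intuition, not the paper's robust scheduling; all numbers are illustrative.

```python
# Sketch: greedy storage dispatch against the expected imbalance of a balancing group.
# Expected imbalance = mean over demand scenarios; limits and numbers are illustrative.
import numpy as np

contracted = np.array([10., 10., 10., 10., 10., 10.])        # procured energy per slot
scenarios = np.array([[9.0, 11.0, 12.0, 10.0, 8.0, 10.0],    # demand scenarios from the
                      [10.0, 12.0, 11.0, 9.0, 9.0, 11.0],    # group's predictive model
                      [9.5, 11.5, 12.5, 10.0, 8.5, 10.5]])

capacity, p_max, soc = 4.0, 1.5, 2.0                          # kWh, kW per slot, initial state

dispatch = []
for t in range(contracted.shape[0]):
    gap = scenarios[:, t].mean() - contracted[t]              # expected shortfall (+) / surplus (-)
    if gap > 0:                                               # discharge to cover a shortfall
        u = min(gap, p_max, soc)
    else:                                                     # charge with surplus energy
        u = -min(-gap, p_max, capacity - soc)
    soc -= u
    dispatch.append(u)

print(np.round(dispatch, 2), round(soc, 2))
```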
EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos
Title | EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos |
Authors | Andru P. Twinanda, Sherif Shehata, Didier Mutter, Jacques Marescaux, Michel de Mathelin, Nicolas Padoy |
Abstract | Surgical workflow recognition has numerous potential medical applications, such as the automatic indexing of surgical video databases and the optimization of real-time operating room scheduling, among others. As a result, phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. In the literature, two types of features are typically used to perform this task: visual features and tool usage signals. However, the visual features used are mostly handcrafted. Furthermore, the tool usage signals are usually collected via a manual annotation process or by using additional equipment. In this paper, we propose a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies uniquely on visual information. In previous studies, it has been shown that the tool signals can provide valuable information in performing the phase recognition task. Thus, we present a novel CNN architecture, called EndoNet, that is designed to carry out the phase recognition and tool presence detection tasks in a multi-task manner. To the best of our knowledge, this is the first work proposing to use a CNN for multiple recognition tasks on laparoscopic videos. Extensive experimental comparisons to other methods show that EndoNet yields state-of-the-art results for both tasks. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.03012v2 |
http://arxiv.org/pdf/1602.03012v2.pdf | |
PWC | https://paperswithcode.com/paper/endonet-a-deep-architecture-for-recognition |
Repo | |
Framework | |
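The multi-task design, a shared visual backbone feeding a phase-recognition head (multi-class) and a tool-presence head (multi-label) trained jointly, can be sketched in a few lines. The toy backbone, head sizes, and unweighted loss sum are assumptions, not the actual EndoNet architecture.

```python
# Sketch: a shared backbone with two heads, trained jointly for phase recognition
# (multi-class) and tool presence detection (multi-label). Sizes and losses are illustrative.
import torch
import torch.nn as nn

NUM_PHASES, NUM_TOOLS = 7, 7

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.phase_head = nn.Linear(32, NUM_PHASES)   # softmax / cross-entropy
        self.tool_head = nn.Linear(32, NUM_TOOLS)     # sigmoid / BCE (multi-label)

    def forward(self, frames):
        feats = self.backbone(frames)
        return self.phase_head(feats), self.tool_head(feats)

model = MultiTaskNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frames = torch.rand(4, 3, 224, 224)
phase_labels = torch.randint(0, NUM_PHASES, (4,))
tool_labels = torch.randint(0, 2, (4, NUM_TOOLS)).float()

phase_logits, tool_logits = model(frames)
loss = nn.functional.cross_entropy(phase_logits, phase_labels) \
     + nn.functional.binary_cross_entropy_with_logits(tool_logits, tool_labels)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```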
Understanding Anatomy Classification Through Attentive Response Maps
Title | Understanding Anatomy Classification Through Attentive Response Maps |
Authors | Devinder Kumar, Vlado Menkovski, Graham W. Taylor, Alexander Wong |
Abstract | One of the main challenges for broad adoption of deep learning based models, such as convolutional neural networks (CNNs), is the lack of understanding of their decisions. In many applications, a simpler, less capable model that can be easily understood is preferable to a black-box model that has superior performance. In this paper, we present an approach for designing CNNs based on visualization of the internal activations of the model. We visualize the model’s response through attentive response maps obtained using a fractional stride convolution technique and compare the results with known imaging landmarks from the medical literature. We show that sufficiently deep and capable models can be successfully trained to use the same medical landmarks a human expert would use. Our approach not only allows the model’s decision process to be communicated clearly, but also offers insight for detecting biases. |
Tasks | |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06284v3 |
http://arxiv.org/pdf/1611.06284v3.pdf | |
PWC | https://paperswithcode.com/paper/understanding-anatomy-classification-through |
Repo | |
Framework | |
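For a rough feel of activation-based inspection, the sketch below hooks an intermediate layer of a toy classifier and upsamples the channel-averaged activation over the input. This is a crude stand-in for illustration only; it is not the fractional-stride (transposed-convolution) technique behind the paper's attentive response maps.

```python
# Sketch: a crude activation-based response map for inspecting what a CNN attends to.
# NOT the paper's fractional-stride visualization; it simply hooks an intermediate
# layer and upsamples the channel-averaged activation to the input resolution.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 5),   # 5 anatomy classes (toy)
)

activations = {}
def hook(_module, _inputs, output):
    activations["feat"] = output.detach()

model[3].register_forward_hook(hook)              # watch the second conv layer

image = torch.rand(1, 1, 128, 128)                # stand-in for a medical image
_ = model(image)

feat = activations["feat"].mean(dim=1, keepdim=True)            # average over channels
response = torch.nn.functional.interpolate(feat, size=image.shape[-2:],
                                            mode="bilinear", align_corners=False)
print(response.shape)                             # (1, 1, 128, 128) map over the input
```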
Multi-Cue Zero-Shot Learning with Strong Supervision
Title | Multi-Cue Zero-Shot Learning with Strong Supervision |
Authors | Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele |
Abstract | Scaling up visual category recognition to large numbers of classes remains challenging. A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes. Ultimately, this may allow the use of the textbook knowledge that humans employ to learn about new classes, transferring knowledge from classes they know well. The most successful zero-shot learning approaches currently require a particular type of auxiliary information – namely attribute annotations performed by humans – that is not readily available for most classes. Our goal is to circumvent this bottleneck by replacing such annotations with multiple pieces of information extracted from multiple unstructured text sources readily available on the web. To compensate for the weaker form of auxiliary information, we incorporate stronger supervision in the form of semantic part annotations on the classes from which we transfer knowledge. We achieve our goal by a joint embedding framework that maps multiple text parts as well as multiple semantic parts into a common space. Our results consistently and significantly improve on the state-of-the-art in zero-shot recognition and retrieval. |
Tasks | Zero-Shot Learning |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08754v1 |
http://arxiv.org/pdf/1603.08754v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-cue-zero-shot-learning-with-strong |
Repo | |
Framework | |
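A common way to realize such a joint embedding is a bilinear compatibility $F(x, y) = x^\top W y$ between an image embedding and a class embedding built from text or part cues, trained with a ranking loss. The sketch below shows one SGD step of that generic recipe; dimensions, margin, and the update rule are illustrative assumptions rather than the authors' exact objective.

```python
# Sketch: bilinear compatibility F(x, y) = x^T W y between an image embedding x and a
# class embedding y (e.g. built from text cues), trained with a ranking hinge loss.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_cls, n_classes = 64, 32, 10
W = 0.01 * rng.normal(size=(d_img, d_cls))
class_embeddings = rng.normal(size=(n_classes, d_cls))     # e.g. aggregated text vectors

def compatibility(x, W, Y):
    return Y @ (W.T @ x)                                   # scores for every class

def sgd_step(x, y_true, W, Y, margin=0.1, lr=0.01):
    scores = compatibility(x, W, Y)
    y_wrong = int(np.argmax(scores + margin * (np.arange(len(Y)) != y_true)))
    if y_wrong == y_true:
        return W                                           # no margin violation
    # Hinge loss  max(0, margin + F(x, y_wrong) - F(x, y_true)); subgradient w.r.t. W:
    grad = np.outer(x, Y[y_wrong] - Y[y_true])
    return W - lr * grad

# Toy usage: one update for a training image whose class index is 3.
x = rng.normal(size=d_img)
W = sgd_step(x, 3, W, class_embeddings)
print(compatibility(x, W, class_embeddings).round(2))
```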
Analysis of the Human-Computer Interaction on the Example of Image-based CAPTCHA by Association Rule Mining
Title | Analysis of the Human-Computer Interaction on the Example of Image-based CAPTCHA by Association Rule Mining |
Authors | Darko Brodić, Alessia Amelio |
Abstract | The paper analyzes the interaction between humans and computers in terms of response time in solving image-based CAPTCHAs. In particular, the analysis focuses on how easily different Internet users solve four types of image-based CAPTCHAs that include facial expressions: animated character, old woman, surprised face, and worried face. To pursue this goal, an experiment is conducted involving 100 Internet users, differentiated by age, Internet experience, and education level, in solving the four types of CAPTCHAs. The response times are collected for each user. Then, association rules are extracted from the user data to evaluate, by statistical analysis, how the response time in solving the CAPTCHA depends on age, education level, and experience in Internet usage. The results implicitly capture the users’ psychological states, showing in which states the users are more sensitive. This constitutes a novel and meaningful analysis with respect to the state of the art. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00203v2 |
http://arxiv.org/pdf/1612.00203v2.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-the-human-computer-interaction-on |
Repo | |
Framework | |
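Association rules are summarized by support and confidence over the user records. A minimal sketch with made-up attributes and thresholds:

```python
# Sketch: support and confidence of association rules over user records, e.g.
# {"age<35", "experienced", "fast_response"}. Attributes and thresholds are illustrative.
from itertools import combinations

records = [
    {"age<35", "experienced", "fast_response"},
    {"age<35", "experienced", "fast_response"},
    {"age>=35", "novice", "slow_response"},
    {"age<35", "novice", "slow_response"},
    {"age>=35", "experienced", "fast_response"},
]

def support(itemset):
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent):
    return support(antecedent | consequent) / support(antecedent)

# Enumerate simple one-item -> one-item rules above minimum support and confidence.
items = sorted(set().union(*records))
for a, b in combinations(items, 2):
    for lhs, rhs in (({a}, {b}), ({b}, {a})):
        if support(lhs | rhs) >= 0.4 and confidence(lhs, rhs) >= 0.8:
            print(f"{lhs} -> {rhs}  support={support(lhs | rhs):.2f} "
                  f"confidence={confidence(lhs, rhs):.2f}")
```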
DisCSPs with Privacy Recast as Planning Problems for Utility-based Agents
Title | DisCSPs with Privacy Recast as Planning Problems for Utility-based Agents |
Authors | Julien Savaux, Julien Vion, Sylvain Piechowiak, René Mandiau, Toshihiro Matsui, Katsutoshi Hirayama, Makoto Yokoo, Shakre Elmane, Marius Silaghi |
Abstract | Privacy has traditionally been a major motivation for decentralized problem solving. However, even though several metrics have been proposed to quantify it, none of them is easily integrated with common solvers. Constraint programming is a fundamental paradigm used to approach various families of problems. We introduce Utilitarian Distributed Constraint Satisfaction Problems (UDisCSP), where the utility of each state is estimated as the difference between the expected rewards for agreements on assignments for shared variables and the expected cost of privacy loss. Therefore, a traditional DisCSP with privacy requirements is viewed as a planning problem. The actions available to agents are communication and local inference. Common decentralized solvers are evaluated here from the point of view of their interpretation as greedy planners. Further, we investigate some simple extensions where these solvers start taking the utility function into account. In these extensions we assume that the planning problem further restricts the set of communication actions to only the communication primitives present in the corresponding solver protocols. The solvers obtained for the new type of problems propose the action (communication/inference) to be performed in each situation, thereby defining the policy. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06790v1 |
http://arxiv.org/pdf/1604.06790v1.pdf | |
PWC | https://paperswithcode.com/paper/discsps-with-privacy-recast-as-planning |
Repo | |
Framework | |
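The utility formulation can be made concrete with a toy comparison: an agent weighs the expected reward of reaching agreement by communicating an assignment against the expected cost of the privacy lost by that communication. The numbers and the decision rule below are illustrative assumptions.

```python
# Sketch: a UDisCSP-style utility comparison between communicating an assignment and
# relying on local inference. All numbers and the decision rule are illustrative.

def utility(p_agreement, reward, privacy_cost):
    return p_agreement * reward - privacy_cost

actions = {
    "reveal_value":    utility(p_agreement=0.9, reward=10.0, privacy_cost=4.0),  # communicate
    "local_inference": utility(p_agreement=0.6, reward=10.0, privacy_cost=0.0),  # keep private
}

best = max(actions, key=actions.get)
print(actions, "->", best)
```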
The happiness paradox: your friends are happier than you
Title | The happiness paradox: your friends are happier than you |
Authors | Johan Bollen, Bruno Gonçalves, Ingrid van de Leemput, Guangchen Ruan |
Abstract | Most individuals in social networks experience a so-called Friendship Paradox: they are less popular than their friends on average. This effect may explain recent findings that widespread social network media use leads to reduced happiness. However, the relation between popularity and happiness is poorly understood. A Friendship Paradox does not necessarily imply a Happiness Paradox, where most individuals are less happy than their friends. Here we report the first direct observation of a significant Happiness Paradox in a large-scale online social network of $39,110$ Twitter users. Our results reveal that popular individuals are indeed happier and that a majority of individuals experience a significant Happiness Paradox. The magnitude of the latter effect is shaped by complex interactions between individual popularity, happiness, and the fact that users cluster assortatively by level of happiness. Our results indicate that the topology of online social networks and the distribution of happiness in some populations can cause widespread psycho-social effects that affect the well-being of billions of individuals. |
Tasks | |
Published | 2016-02-08 |
URL | http://arxiv.org/abs/1602.02665v1 |
http://arxiv.org/pdf/1602.02665v1.pdf | |
PWC | https://paperswithcode.com/paper/the-happiness-paradox-your-friends-are |
Repo | |
Framework | |
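Both paradoxes reduce to a simple per-user comparison: is the mean popularity (or happiness) of my friends higher than my own? A toy sketch on a five-user graph:

```python
# Sketch: measuring friendship and happiness paradoxes in a small social graph.
# A user experiences the paradox when their friends' mean popularity (or happiness)
# exceeds their own. Graph and scores are toy data.
friends = {
    "a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d", "e"},
    "d": {"b", "c"}, "e": {"c"},
}
happiness = {"a": 0.4, "b": 0.7, "c": 0.9, "d": 0.5, "e": 0.3}

def paradox_fraction(score):
    affected = 0
    for user, fs in friends.items():
        friend_mean = sum(score[f] for f in fs) / len(fs)
        affected += score[user] < friend_mean
    return affected / len(friends)

degree = {u: len(fs) for u, fs in friends.items()}
print("friendship paradox:", paradox_fraction(degree))
print("happiness paradox:", paradox_fraction(happiness))
```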
Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Title | Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning |
Authors | Daniel J. Luckett, Eric B. Laber, Anna R. Kahkoska, David M. Maahs, Elizabeth Mayer-Davis, Michael R. Kosorok |
Abstract | The vision for precision medicine is to use individual patient characteristics to inform a personalized treatment plan that leads to the best healthcare possible for each patient. Mobile technologies have an important role to play in this vision as they offer a means to monitor a patient’s health status in real-time and subsequently to deliver interventions if, when, and in the dose that they are needed. Dynamic treatment regimes formalize individualized treatment plans as sequences of decision rules, one per stage of clinical intervention, that map current patient information to a recommended treatment. However, existing methods for estimating optimal dynamic treatment regimes are designed for a small number of fixed decision points occurring on a coarse time-scale. We propose a new reinforcement learning method for estimating an optimal treatment regime that is applicable to data collected using mobile technologies in an outpatient setting. The proposed method accommodates an indefinite time horizon and minute-by-minute decision making that are common in mobile health applications. We show the proposed estimators are consistent and asymptotically normal under mild conditions. The proposed methods are applied to estimate an optimal dynamic treatment regime for controlling blood glucose levels in patients with type 1 diabetes. |
Tasks | Decision Making |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.03531v2 |
http://arxiv.org/pdf/1611.03531v2.pdf | |
PWC | https://paperswithcode.com/paper/estimating-dynamic-treatment-regimes-in |
Repo | |
Framework | |
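The paper's estimator is not spelled out in the abstract; as generic background on value estimation over an indefinite horizon, here is a sketch of TD(0) with linear features evaluating a fixed candidate policy on a toy simulator. It is explicitly not the V-learning method; the dynamics, features, and rewards are made up for illustration.

```python
# Sketch: on-line estimation of a state-value function with linear features via TD(0),
# a generic ingredient of indefinite-horizon value methods. NOT the paper's V-learning
# estimator; the toy simulator loosely mimics minute-by-minute mobile-health updates.
import numpy as np

rng = np.random.default_rng(0)

def features(state):                       # e.g. (deviation from target, bias) as toy features
    return np.array([state, 1.0])

def step(state, action):                   # toy dynamics and reward
    next_state = 0.9 * state - 0.3 * action + 0.1 * rng.normal()
    reward = -abs(next_state)              # penalize deviation from the target (0)
    return next_state, reward

def policy(state):                         # fixed candidate policy to evaluate
    return 1.0 if state > 0 else -1.0

theta = np.zeros(2)                        # value-function weights
gamma, alpha = 0.9, 0.05
state = 1.0
for _ in range(5000):
    action = policy(state)
    next_state, reward = step(state, action)
    td_error = reward + gamma * features(next_state) @ theta - features(state) @ theta
    theta += alpha * td_error * features(state)
    state = next_state

print(theta)                               # estimated value is theta @ features(state)
```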
A Generic Coordinate Descent Framework for Learning from Implicit Feedback
Title | A Generic Coordinate Descent Framework for Learning from Implicit Feedback |
Authors | Immanuel Bayer, Xiangnan He, Bhargav Kanagal, Steffen Rendle |
Abstract | In recent years, interest in recommender research has shifted from explicit feedback towards implicit feedback data. A diversity of complex models has been proposed for a wide variety of applications. Despite this, learning from implicit feedback is still computationally challenging. So far, most work relies on stochastic gradient descent (SGD) solvers which are easy to derive, but in practice challenging to apply, especially for tasks with many items. For the simple matrix factorization model, an efficient coordinate descent (CD) solver has been previously proposed. However, efficient CD approaches have not been derived for more complex models. In this paper, we provide a new framework for deriving efficient CD algorithms for complex recommender models. We identify and introduce the property of k-separable models. We show that k-separability is a sufficient property to allow efficient optimization of implicit recommender problems with CD. We illustrate this framework on a variety of state-of-the-art models including factorization machines and Tucker decomposition. To summarize, our work provides the theory and building blocks to derive efficient implicit CD algorithms for complex recommender models. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04666v1 |
http://arxiv.org/pdf/1611.04666v1.pdf | |
PWC | https://paperswithcode.com/paper/a-generic-coordinate-descent-framework-for |
Repo | |
Framework | |
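The flavor of element-wise coordinate descent for implicit feedback can be shown with the naive (inefficient) version: a weighted squared loss over all user-item cells with closed-form updates of one latent dimension at a time. Making such updates efficient for richer, k-separable models is the paper's contribution; the sketch below only illustrates the update itself, with made-up data and weights.

```python
# Sketch: naive element-wise coordinate descent for implicit-feedback matrix factorization
# (weighted squared loss over ALL user-item cells: weight alpha for observed interactions,
# a small weight w0 for the unobserved rest). This is the plain O(|U||I|k) version.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 30, 40, 4
R = (rng.random((n_users, n_items)) < 0.1).astype(float)    # implicit feedback (0/1)
W = np.where(R > 0, 5.0, 0.1)                               # confidence weights
lam = 0.05

P = 0.1 * rng.normal(size=(n_users, k))
Q = 0.1 * rng.normal(size=(n_items, k))

for sweep in range(20):
    E = R - P @ Q.T                                          # residual matrix
    for f in range(k):
        # Update one latent dimension of P, then of Q, holding everything else fixed.
        E_hat = E + np.outer(P[:, f], Q[:, f])               # residual excluding factor f
        P[:, f] = ((W * E_hat) @ Q[:, f]) / (W @ (Q[:, f] ** 2) + lam)
        E = E_hat - np.outer(P[:, f], Q[:, f])
        E_hat = E + np.outer(P[:, f], Q[:, f])
        Q[:, f] = ((W * E_hat).T @ P[:, f]) / (W.T @ (P[:, f] ** 2) + lam)
        E = E_hat - np.outer(P[:, f], Q[:, f])

print(float(np.sum(W * (R - P @ Q.T) ** 2)))                 # weighted reconstruction loss
```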