Paper Group ANR 32
Progressive Tree-like Curvilinear Structure Reconstruction with Structured Ranking Learning and Graph Algorithm
Title | Progressive Tree-like Curvilinear Structure Reconstruction with Structured Ranking Learning and Graph Algorithm |
Authors | Seong-Gyun Jeong, Yuliya Tarabalka, Nicolas Nisse, Josiane Zerubia |
Abstract | We propose a novel tree-like curvilinear structure reconstruction algorithm based on supervised learning and graph theory. In this work we analyze image patches to obtain the local major orientations and the rankings that correspond to the curvilinear structure. To extract local curvilinear features, we compute oriented gradient information using steerable filters. We then employ Structured Support Vector Machine for ordinal regression of the input image patches, where the ordering is determined by shape similarity to latent curvilinear structure. Finally, we progressively reconstruct the curvilinear structure by looking for geodesic paths connecting remote vertices in the graph built on the structured output rankings. Experimental results show that the proposed algorithm faithfully provides topological features of the curvilinear structures using minimal pixels for various datasets. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02631v1 |
http://arxiv.org/pdf/1612.02631v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-tree-like-curvilinear-structure |
Repo | |
Framework | |
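The final step described in the abstract, recovering geodesic paths over a graph whose costs come from the learned rankings, can be illustrated with a minimal sketch. The 4-connected pixel grid, the cost definition, and the Dijkstra routine below are illustrative assumptions, not the authors' implementation.

```python
import heapq
import numpy as np

def geodesic_path(cost, start, goal):
    """Dijkstra shortest path on a 4-connected pixel grid.

    `cost` is a 2D array where low values mark pixels ranked as likely
    curvilinear structure, so the geodesic between two remote vertices
    hugs the structure."""
    h, w = cost.shape
    dist = np.full((h, w), np.inf)
    prev = {}
    dist[start] = 0.0
    pq = [(0.0, start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + 0.5 * (cost[r, c] + cost[nr, nc])
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(pq, (nd, (nr, nc)))
    # Walk back from the goal to the start to recover the path.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# Toy usage: costs are (1 - ranking score), so highly ranked pixels are cheap.
scores = np.random.rand(64, 64)
path = geodesic_path(1.0 - scores, start=(0, 0), goal=(63, 63))
```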
Interactive Storytelling over Document Collections
Title | Interactive Storytelling over Document Collections |
Authors | Dipayan Maiti, Mohammad Raihanul Islam, Scotland Leman, Naren Ramakrishnan |
Abstract | Storytelling algorithms aim to ‘connect the dots’ between disparate documents by linking starting and ending documents through a series of intermediate documents. Existing storytelling algorithms are based on notions of coherence and connectivity, and thus the primary way by which users can steer the story construction is via design of suitable similarity functions. We present an alternative approach to storytelling wherein the user can interactively and iteratively provide ‘must use’ constraints to preferentially support the construction of some stories over others. The three innovations in our approach are distance measures based on (inferred) topic distributions, the use of constraints to define sets of linear inequalities over paths, and the introduction of slack and surplus variables to condition the topic distribution to preferentially emphasize desired terms over others. We describe experimental results to illustrate the effectiveness of our interactive storytelling approach over multiple text datasets. |
Tasks | |
Published | 2016-02-21 |
URL | http://arxiv.org/abs/1602.06566v1 |
http://arxiv.org/pdf/1602.06566v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-storytelling-over-document |
Repo | |
Framework | |
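One building block in the abstract, a distance between documents based on their inferred topic distributions, is easy to sketch. The Jensen-Shannon choice below is an assumption for illustration, not necessarily the exact measure used in the paper.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two topic distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Candidate "next dots" in a story are the documents closest in topic space.
doc_topics = np.random.dirichlet(np.ones(20), size=100)   # 100 docs, 20 topics
current = doc_topics[0]
dists = np.array([js_divergence(current, t) for t in doc_topics])
next_doc = int(np.argsort(dists)[1])   # skip index 0 (the document itself)
```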
Low-dose CT denoising with convolutional neural network
Title | Low-dose CT denoising with convolutional neural network |
Authors | Hu Chen, Yi Zhang, Weihua Zhang, Peixi Liao, Ke Li, Jiliu Zhou, Ge Wang |
Abstract | To reduce the potential radiation risk, low-dose CT has attracted much attention. However, simply lowering the radiation dose will lead to significant deterioration of the image quality. In this paper, we propose a noise reduction method for low-dose CT via a deep neural network without accessing the original projection data. A deep convolutional neural network is trained to transform low-dose CT images towards normal-dose CT images, patch by patch. Visual and quantitative evaluation demonstrates the competitive performance of the proposed method. |
Tasks | Denoising |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00321v1 |
http://arxiv.org/pdf/1610.00321v1.pdf | |
PWC | https://paperswithcode.com/paper/low-dose-ct-denoising-with-convolutional |
Repo | |
Framework | |
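A minimal patch-to-patch denoising CNN in the spirit of the abstract might look as follows (PyTorch); the layer widths and kernel sizes are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Three-layer CNN mapping a low-dose CT patch to a normal-dose estimate.
model = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9, padding=4),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=5, padding=2),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# One training step on a batch of (low-dose, normal-dose) patch pairs.
low_dose = torch.randn(16, 1, 33, 33)      # stand-in for real patches
normal_dose = torch.randn(16, 1, 33, 33)
optimizer.zero_grad()
loss = loss_fn(model(low_dose), normal_dose)
loss.backward()
optimizer.step()
```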
Attribute Recognition from Adaptive Parts
Title | Attribute Recognition from Adaptive Parts |
Authors | Luwei Yang, Ligen Zhu, Yichen Wei, Shuang Liang, Ping Tan |
Abstract | Previous part-based attribute recognition approaches perform part detection and attribute recognition in separate steps. The parts are not optimized for attribute recognition and could therefore be sub-optimal. We present an end-to-end deep learning approach to overcome this limitation. It generates object parts from key points and performs attribute recognition accordingly, allowing adaptive spatial transforms of the parts. Both key point estimation and attribute recognition are learnt jointly in a multi-task setting. Extensive experiments on two datasets verify the efficacy of the proposed end-to-end approach. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01437v2 |
http://arxiv.org/pdf/1607.01437v2.pdf | |
PWC | https://paperswithcode.com/paper/attribute-recognition-from-adaptive-parts |
Repo | |
Framework | |
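The joint training of key point estimation and attribute recognition can be sketched as a simple multi-task loss; the shared backbone, head sizes, and equal loss weighting below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PartAttributeNet(nn.Module):
    """Shared backbone with a keypoint head and an attribute head."""
    def __init__(self, n_keypoints=14, n_attributes=40):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.kpt_head = nn.Linear(32, 2 * n_keypoints)   # (x, y) per keypoint
        self.attr_head = nn.Linear(32, n_attributes)     # multi-label logits

    def forward(self, x):
        f = self.backbone(x)
        return self.kpt_head(f), self.attr_head(f)

net = PartAttributeNet()
images = torch.randn(8, 3, 224, 224)
kpt_gt = torch.rand(8, 28)
attr_gt = torch.randint(0, 2, (8, 40)).float()

kpt_pred, attr_pred = net(images)
# Joint multi-task objective: keypoint regression + attribute classification.
loss = nn.functional.mse_loss(kpt_pred, kpt_gt) \
     + nn.functional.binary_cross_entropy_with_logits(attr_pred, attr_gt)
loss.backward()
```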
Modeling of Item-Difficulty for Ontology-based MCQs
Title | Modeling of Item-Difficulty for Ontology-based MCQs |
Authors | Vinu E. V, Tahani Alsubait, P. Sreenivasa Kumar |
Abstract | Multiple choice questions (MCQs) that can be generated from a domain ontology can significantly reduce the human effort and time required for authoring and administering assessments in an e-Learning environment. Even though there are various methods for generating MCQs from ontologies, methods for determining the difficulty-levels of such MCQs are less explored. In this paper, we study various aspects and factors that are involved in determining the difficulty-score of an MCQ, and propose an ontology-based model for the prediction. This model characterizes the difficulty values associated with the stem and choice set of the MCQs, and describes a measure which combines both scores. Furthermore, the notion of assigning difficulty-scores based on the skill level of the test taker is utilized for predicting the difficulty-score of a stem. We studied the effectiveness of the predicted difficulty-scores with the help of a psychometric model from Item Response Theory, involving real students and domain experts. Our results show that the predicted difficulty-levels of the MCQs correlate highly with their actual difficulty-levels. |
Tasks | |
Published | 2016-07-04 |
URL | http://arxiv.org/abs/1607.00869v1 |
http://arxiv.org/pdf/1607.00869v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-of-item-difficulty-for-ontology |
Repo | |
Framework | |
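The evaluation against a psychometric model from Item Response Theory can be illustrated with the standard two-parameter logistic item response function; the parameter values below are examples, not values from the paper.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability that a test taker with ability `theta`
    answers an item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A harder item (larger b) is answered correctly less often at fixed ability.
print(p_correct(theta=0.0, a=1.2, b=-1.0))   # easy item -> high probability
print(p_correct(theta=0.0, a=1.2, b=1.5))    # hard item -> low probability
```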
Point-wise mutual information-based video segmentation with high temporal consistency
Title | Point-wise mutual information-based video segmentation with high temporal consistency |
Authors | Margret Keuper, Thomas Brox |
Abstract | In this paper, we tackle the problem of temporally consistent boundary detection and hierarchical segmentation in videos. While finding the best high-level reasoning of region assignments in videos is the focus of much recent research, temporal consistency in boundary detection has so far only rarely been tackled. We argue that temporally consistent boundaries are a key component to temporally consistent region assignment. The proposed method is based on the point-wise mutual information (PMI) of spatio-temporal voxels. Temporal consistency is established by an evaluation of PMI-based point affinities in the spectral domain over space and time. Thus, the proposed method is independent of any optical flow computation or previously learned motion models. The proposed low-level video segmentation method outperforms the learning-based state of the art in terms of standard region metrics. |
Tasks | Boundary Detection, Optical Flow Estimation, Video Semantic Segmentation |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02467v1 |
http://arxiv.org/pdf/1606.02467v1.pdf | |
PWC | https://paperswithcode.com/paper/point-wise-mutual-information-based-video |
Repo | |
Framework | |
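The point-wise mutual information affinity at the heart of the method can be sketched from empirical feature-pair statistics; the crude quantisation into a joint histogram below is an illustrative assumption.

```python
import numpy as np

def pmi_affinity(pairs, n_bins=16, rho=1.0):
    """Estimate PMI(a, b) = log( P(a, b)^rho / (P(a) P(b)) ) from sampled
    feature pairs, using a joint histogram over quantised features."""
    a = np.clip((pairs[:, 0] * n_bins).astype(int), 0, n_bins - 1)
    b = np.clip((pairs[:, 1] * n_bins).astype(int), 0, n_bins - 1)
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (a, b), 1)
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    eps = 1e-12
    return rho * np.log(joint + eps) - np.log(np.outer(pa, pb) + eps)

# Affinity between two voxels is read off from their quantised features;
# spectral analysis of this affinity over space and time yields the segments.
pairs = np.random.rand(10000, 2)           # stand-in for sampled voxel pairs
W = pmi_affinity(pairs)
```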
Stochastic Variational Deep Kernel Learning
Title | Stochastic Variational Deep Kernel Learning |
Authors | Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing |
Abstract | Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand-alone deep networks, SVMs, and state-of-the-art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet. |
Tasks | Gaussian Processes, Multi-Task Learning |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00336v2 |
http://arxiv.org/pdf/1611.00336v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variational-deep-kernel-learning |
Repo | |
Framework | |
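One ingredient described above, additive base kernels applied to subsets of the deep network's output features, can be sketched in a few lines; the RBF base kernel and the even feature splitting are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential base kernel."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def additive_deep_kernel(F, n_groups=4):
    """Sum of base kernels, each acting on one slice of the deep features F."""
    groups = np.array_split(np.arange(F.shape[1]), n_groups)
    return sum(rbf_kernel(F[:, g], F[:, g]) for g in groups)

# F would be the output of a deep network on a minibatch; random stand-in here.
F = np.random.randn(32, 16)
K = additive_deep_kernel(F)   # 32 x 32 covariance used in the GP layer
```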
Choquet integral in decision analysis - lessons from the axiomatization
Title | Choquet integral in decision analysis - lessons from the axiomatization |
Authors | Mikhail Timonin |
Abstract | The Choquet integral is a powerful aggregation operator which includes many well-known models as special cases. We look at these special cases and provide their axiomatic analysis. In cases where an axiomatization has been previously given in the literature, we connect the existing results with the framework that we have developed. Next we turn to the question of learning, which is especially important for the practical applications of the model. So far, learning of the Choquet integral has been mostly confined to the learning of the capacity. Such an approach requires making a powerful assumption that all dimensions (e.g. criteria) are evaluated on the same scale, which is rarely justified in practice. Too often categorical data is given arbitrary numerical labels (e.g. AHP), and numerical data is considered cardinally and ordinally commensurate, sometimes after a simple normalization. Such approaches clearly lack scientific rigour, and yet they are commonly seen in all kinds of applications. We discuss the pros and cons of making such an assumption and look at the consequences that the uniqueness results of the axiomatization have for the learning problems. Finally, we review some of the applications of the Choquet integral in decision analysis. Apart from MCDA, which is the main area of interest for our results, we also discuss how the model can be interpreted in the social choice context. We look in detail at the state-dependent utility, and show how comonotonicity, central to the previous axiomatizations, actually implies state-independency in the Choquet integral model. We also discuss the conditions required to have a meaningful state-dependent utility representation and show the novelty of our results compared to the previous methods of building state-dependent models. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09926v1 |
http://arxiv.org/pdf/1611.09926v1.pdf | |
PWC | https://paperswithcode.com/paper/choquet-integral-in-decision-analysis-lessons |
Repo | |
Framework | |
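For reference, the discrete Choquet integral of a criteria vector x with respect to a capacity (monotone set function) mu can be computed as below; the toy two-criterion capacity is an illustrative assumption.

```python
import numpy as np

def choquet_integral(x, mu):
    """Discrete Choquet integral of x w.r.t. the capacity `mu`.

    `mu` maps frozensets of criterion indices to [0, 1], with
    mu(empty set) = 0 and mu(all criteria) = 1; x is assumed non-negative."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)                       # criteria in ascending value
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        upper = frozenset(int(j) for j in order[k:])   # criteria with x_j >= x_i
        total += (x[i] - prev) * mu[upper]
        prev = x[i]
    return total

# Two criteria with a sub-additive interaction (redundancy).
mu = {frozenset(): 0.0, frozenset({0}): 0.6, frozenset({1}): 0.6,
      frozenset({0, 1}): 1.0}
print(choquet_integral([0.4, 0.8], mu))   # 0.4 * 1.0 + 0.4 * 0.6 = 0.64
```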
Will People Like Your Image? Learning the Aesthetic Space
Title | Will People Like Your Image? Learning the Aesthetic Space |
Authors | Katharina Schwarz, Patrick Wieschollek, Hendrik P. A. Lensch |
Abstract | Rating how aesthetically pleasing an image appears is a highly complex matter and depends on a large number of different visual factors. Previous work has tackled the aesthetic rating problem by ranking on a 1-dimensional rating scale, e.g., incorporating handcrafted attributes. In this paper, we propose a rather general approach to automatically map aesthetic pleasingness with all its complexity into an “aesthetic space” to allow for a highly fine-grained resolution. In detail, making use of deep learning, our method directly learns an encoding of a given image into this high-dimensional feature space resembling visual aesthetics. In addition to the visual factors mentioned above, differences in personal judgments have a large impact on the likeableness of a photograph. Nowadays, online platforms allow users to “like” or favor certain content with a single click. To incorporate a huge diversity of people, we make use of such multi-user agreements and assemble a large data set of 380K images (AROD) with associated meta information and derive a score to rate how visually pleasing a given photo is. We validate our derived model of aesthetics in a user study. Further, without any extra data labeling or handcrafted features, we achieve state-of-the-art accuracy on the AVA benchmark data set. Finally, as our approach is able to predict the aesthetic quality of any arbitrary image or video, we demonstrate our results on applications for re-sorting photo collections, capturing the best shot on mobile devices and aesthetic key-frame extraction from videos. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05203v2 |
http://arxiv.org/pdf/1611.05203v2.pdf | |
PWC | https://paperswithcode.com/paper/will-people-like-your-image-learning-the |
Repo | |
Framework | |
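The idea of turning raw platform signals into an aesthetics score can be sketched as below; the particular ratio-of-logs formula is only an assumed placeholder, not necessarily the score derived in the paper.

```python
import math

def aesthetics_score(faves, views):
    """Toy popularity-normalised score: images seen by many but faved by few
    score lower than images faved by a large share of their viewers."""
    return math.log(faves + 1) / math.log(views + 2)

print(aesthetics_score(faves=900, views=1000))    # widely liked   -> high score
print(aesthetics_score(faves=5, views=100000))    # widely ignored -> low score
```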
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Title | Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues |
Authors | Anand Mishra, Karteek Alahari, C. V. Jawahar |
Abstract | Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections from an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate our proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated in our framework to further improve the recognition performance. |
Tasks | Scene Text Recognition |
Published | 2016-01-13 |
URL | http://arxiv.org/abs/1601.03128v1 |
http://arxiv.org/pdf/1601.03128v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-energy-minimization-framework-for |
Repo | |
Framework | |
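The energy minimization over character detections with a lexicon-based pairwise prior can be illustrated with a chain-model Viterbi pass; the toy unary and bigram costs below are assumptions, not the full random field of the paper.

```python
import numpy as np

def viterbi_word(unary, bigram):
    """Minimise E(y) = sum_t unary[t, y_t] + sum_t bigram[y_t, y_{t+1}]
    over character labels y for a chain of detected character windows."""
    T, L = unary.shape
    cost = unary[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        trans = cost[:, None] + bigram          # L x L transition costs
        back[t] = trans.argmin(axis=0)
        cost = trans.min(axis=0) + unary[t]
    labels = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        labels.append(int(back[t, labels[-1]]))
    return labels[::-1]

# 5 character windows, 26 labels; bigram costs would come from lexicon statistics.
rng = np.random.default_rng(0)
unary = rng.random((5, 26))                     # character detection costs
bigram = rng.random((26, 26))                   # e.g. -log bigram probabilities
word = viterbi_word(unary, bigram)
```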
Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition
Title | Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition |
Authors | Guo Qiang, Tu Dan, Li Guohui, Lei Jun |
Abstract | Text recognition in natural scenes is a challenging problem due to the many factors affecting text appearance. In this paper, we present a method that directly transcribes scene text images to text without the need for sophisticated character segmentation. We leverage recent advances in deep neural networks to model the appearance of scene text images with temporal dynamics. Specifically, we integrate a convolutional neural network (CNN) and a recurrent neural network (RNN), motivated by the complementary modeling capabilities of the two models. The main contribution of this work is investigating how temporal memory helps in a segmentation-free fashion for this specific problem. By using long short-term memory (LSTM) blocks as hidden units, our model can retain long-term memory, compared with HMMs which only maintain short-term state dependencies. We conduct experiments on the Street View House Numbers dataset, which contains highly variable number images. The results demonstrate the superiority of the proposed method over traditional HMM-based methods. |
Tasks | Scene Text Recognition |
Published | 2016-01-06 |
URL | http://arxiv.org/abs/1601.01100v1 |
http://arxiv.org/pdf/1601.01100v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-matters-convolutional-recurrent-neural |
Repo | |
Framework | |
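The CNN-plus-LSTM integration described above can be sketched as follows (PyTorch); the feature extractor, image size, and output layer are illustrative assumptions, not the authors' network.

```python
import torch
import torch.nn as nn

class ConvRecurrentReader(nn.Module):
    """Run a small CNN over the image, treat its columns as a sequence, and
    let a bidirectional LSTM emit per-step character logits."""
    def __init__(self, n_classes=11):            # e.g. 10 digits + blank
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(input_size=64 * 8, hidden_size=128,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 128, n_classes)

    def forward(self, x):                        # x: (B, 1, 32, W)
        f = self.cnn(x)                          # (B, 64, 8, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)     # (B, W/4, 64*8) column sequence
        out, _ = self.lstm(f)
        return self.fc(out)                      # per-column character logits

logits = ConvRecurrentReader()(torch.randn(4, 1, 32, 128))   # (4, 32, 11)
```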
Batched Gaussian Process Bandit Optimization via Determinantal Point Processes
Title | Batched Gaussian Process Bandit Optimization via Determinantal Point Processes |
Authors | Tarun Kathuria, Amit Deshpande, Pushmeet Kohli |
Abstract | Gaussian Process bandit optimization has emerged as a powerful tool for optimizing noisy black box functions. One example in machine learning is hyper-parameter optimization where each evaluation of the target function requires training a model which may involve days or even weeks of computation. Most methods for this so-called “Bayesian optimization” only allow sequential exploration of the parameter space. However, it is often desirable to propose batches or sets of parameter values to explore simultaneously, especially when there are large parallel processing facilities at our disposal. Batch methods require modeling the interaction between the different evaluations in the batch, which can be expensive in complex scenarios. In this paper, we propose a new approach for parallelizing Bayesian optimization by modeling the diversity of a batch via Determinantal point processes (DPPs) whose kernels are learned automatically. This allows us to generalize a previous result as well as prove better regret bounds based on DPP sampling. Our experiments on a variety of synthetic and real-world robotics and hyper-parameter optimization tasks indicate that our DPP-based methods, especially those based on DPP sampling, outperform state-of-the-art methods. |
Tasks | Point Processes |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04088v1 |
http://arxiv.org/pdf/1611.04088v1.pdf | |
PWC | https://paperswithcode.com/paper/batched-gaussian-process-bandit-optimization |
Repo | |
Framework | |
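One way to realise the diverse-batch idea above is greedy MAP inference for a DPP, growing the batch so as to maximise the log-determinant of the kernel restricted to it; this greedy variant is a simplification of the sampling schemes analysed in the paper.

```python
import numpy as np

def greedy_dpp_batch(K, batch_size):
    """Greedily grow a batch S maximising log det K[S, S] (DPP MAP heuristic),
    trading off individual quality against diversity."""
    n = K.shape[0]
    selected = []
    for _ in range(batch_size):
        best, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        selected.append(best)
    return selected

# Kernel over candidate hyper-parameter settings (e.g. from the GP posterior).
X = np.random.rand(50, 3)
K = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1)) + 1e-6 * np.eye(50)
batch = greedy_dpp_batch(K, batch_size=5)
```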
A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genetics
Title | A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genetics |
Authors | Keelin Greenlaw, Elena Szefer, Jinko Graham, Mary Lesperance, Farouk S. Nathoo |
Abstract | Motivation: Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. Wang et al. (Bioinformatics, 2012) have developed an approach for the analysis of imaging genomic studies using penalized multi-task regression with regularization based on a novel group $l_{2,1}$-norm penalty which encourages structured sparsity at both the gene level and SNP level. While incorporating a number of useful features, the proposed method only furnishes a point estimate of the regression coefficients; techniques for conducting statistical inference are not provided. A new Bayesian method is proposed here to overcome this limitation. Results: We develop a Bayesian hierarchical modeling formulation where the posterior mode corresponds to the estimator proposed by Wang et al. (Bioinformatics, 2012), and an approach that allows for full posterior inference including the construction of interval estimates for the regression parameters. We show that the proposed hierarchical model can be expressed as a three-level Gaussian scale mixture and this representation facilitates the use of a Gibbs sampling algorithm for posterior simulation. Simulation studies demonstrate that the interval estimates obtained using our approach achieve adequate coverage probabilities that outperform those obtained from the nonparametric bootstrap. Our proposed methodology is applied to the analysis of neuroimaging and genetic data collected as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and this analysis of the ADNI cohort demonstrates clearly the value added of incorporating interval estimation beyond only point estimation when relating SNPs to brain imaging endophenotypes. |
Tasks | |
Published | 2016-05-07 |
URL | http://arxiv.org/abs/1605.02234v2 |
http://arxiv.org/pdf/1605.02234v2.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-group-sparse-multi-task-regression |
Repo | |
Framework | |
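The group l_{2,1} penalty that the Bayesian model mirrors at its posterior mode can be written down directly; the grouping of SNPs into genes below is an illustrative assumption.

```python
import numpy as np

def group_l21_penalty(W, groups):
    """Group l_{2,1} norm: sum over gene groups of the Frobenius norm of the
    coefficient rows for the SNPs in that group (rows of W index SNPs,
    columns index imaging phenotypes)."""
    return sum(np.linalg.norm(W[g, :]) for g in groups)

# 12 SNPs grouped into 3 genes, 5 imaging endophenotypes.
W = np.random.randn(12, 5)
genes = [list(range(0, 4)), list(range(4, 9)), list(range(9, 12))]
print(group_l21_penalty(W, genes))
```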
Active Object Localization in Visual Situations
Title | Active Object Localization in Visual Situations |
Authors | Max H. Quinn, Anthony D. Rhodes, Melanie Mitchell |
Abstract | We describe a method for performing active localization of objects in instances of visual situations. A visual situation is an abstract concept—e.g., “a boxing match”, “a birthday party”, “walking the dog”, “waiting for a bus”—whose image instantiations are linked more by their common spatial and semantic structure than by low-level visual similarity. Our system combines given and learned knowledge of the structure of a particular situation, and adapts that knowledge to a new situation instance as it actively searches for objects. More specifically, the system learns a set of probability distributions describing spatial and other relationships among relevant objects. The system uses those distributions to iteratively sample object proposals on a test image, but also continually uses information from those object proposals to adaptively modify the distributions based on what the system has detected. We test our approach’s ability to efficiently localize objects, using a situation-specific image dataset created by our group. We compare the results with several baselines and variations on our method, and demonstrate the strong benefit of using situation knowledge and active context-driven localization. Finally, we contrast our method with several other approaches that use context as well as active search for object localization in images. |
Tasks | Active Object Localization, Object Localization |
Published | 2016-07-02 |
URL | http://arxiv.org/abs/1607.00548v1 |
http://arxiv.org/pdf/1607.00548v1.pdf | |
PWC | https://paperswithcode.com/paper/active-object-localization-in-visual |
Repo | |
Framework | |
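The iterative loop of sampling an object proposal and then updating the situation model can be caricatured as follows; the Gaussian offset model, stand-in detector, and update rule are illustrative assumptions, not the learned distributions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def score_proposal(offset, true_offset=np.array([55.0, 80.0])):
    """Stand-in detector score: high when the proposed offset is near the
    (hidden) true offset of the target object."""
    return float(np.exp(-np.sum((offset - true_offset) ** 2) / (2 * 20.0 ** 2)))

# Prior over where the target sits relative to an already-detected anchor object.
mean, cov = np.array([40.0, 60.0]), np.diag([30.0 ** 2, 30.0 ** 2])
accepted = []

for step in range(200):
    proposal = rng.multivariate_normal(mean, cov)        # sample a location offset
    if score_proposal(proposal) > 0.5:                   # promising detection
        accepted.append(proposal)
        # Adapt the situation model toward offsets that actually scored well.
        mean = 0.7 * mean + 0.3 * np.mean(accepted, axis=0)

print(mean)   # drifts from the prior toward the true offset as evidence accumulates
```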
Self-organized control for musculoskeletal robots
Title | Self-organized control for musculoskeletal robots |
Authors | Ralf Der, Georg Martius |
Abstract | With the accelerated development of robot technologies, optimal control becomes one of the central themes of research. In traditional approaches, the controller, by its internal functionality, finds appropriate actions on the basis of the history of sensor values, guided by the goals, intentions, objectives, learning schemes, and so on planted into it. The idea is that the controller controls the world—the body plus its environment—as reliably as possible. However, in elastically actuated robots this approach faces severe difficulties. This paper advocates for a new paradigm of self-organized control. The paper presents a solution with a controller that is devoid of any functionalities of its own, given by a fixed, explicit and context-free function of the recent history of the sensor values. When applying this controller to a muscle-tendon driven arm-shoulder system from the Myorobotics toolkit, we observe a vast variety of self-organized behavior patterns: when left alone, the arm realizes pseudo-random sequences of different poses but one can also manipulate the system into definite motion patterns. But most interestingly, after attaching an object, the controller gets in a functional resonance with the object’s internal dynamics: when given a half-filled bottle, the system spontaneously starts shaking the bottle so that maximum response from the dynamics of the water is being generated. After attaching a pendulum to the arm, the controller drives the pendulum into a circular mode. In this way, the robot discovers dynamical affordances of objects its body is interacting with. We also discuss perspectives for using this controller paradigm for intention driven behavior generation. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.02990v2 |
http://arxiv.org/pdf/1602.02990v2.pdf | |
PWC | https://paperswithcode.com/paper/self-organized-control-for-musculoskeletal |
Repo | |
Framework | |
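The key claim above, a controller that is a fixed, explicit, context-free function of the recent sensor history, can be caricatured in a few lines; the specific feedback rule and the toy elastic plant below are assumptions, not the Myorobotics controller.

```python
import numpy as np

def controller(sensor_history, gain=2.0):
    """Fixed, context-free control law: react only to the most recent change
    in the sensor values (no goals, models, or learned parameters)."""
    delta = sensor_history[-1] - sensor_history[-2]
    return np.tanh(gain * delta)

# Toy elastic "plant": the motor command pushes the sensor value, a spring pulls back.
state = np.array([0.1, 0.0])        # position, velocity
history = [np.array([state[0]]), np.array([state[0]])]
for t in range(1000):
    u = controller(history)
    accel = 5.0 * u[0] - 4.0 * state[0] - 0.1 * state[1]
    state = state + 0.01 * np.array([state[1], accel])
    history.append(np.array([state[0]]))
# Through the feedback loop with the elastic plant, oscillatory patterns emerge
# without any explicit goal being specified.
```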