Paper Group ANR 32
Progressive Tree-like Curvilinear Structure Reconstruction with Structured Ranking Learning and Graph Algorithm
Title | Progressive Tree-like Curvilinear Structure Reconstruction with Structured Ranking Learning and Graph Algorithm |
Authors | Seong-Gyun Jeong, Yuliya Tarabalka, Nicolas Nisse, Josiane Zerubia |
Abstract | We propose a novel tree-like curvilinear structure reconstruction algorithm based on supervised learning and graph theory. In this work we analyze image patches to obtain the local major orientations and the rankings that correspond to the curvilinear structure. To extract local curvilinear features, we compute oriented gradient information using steerable filters. We then employ Structured Support Vector Machine for ordinal regression of the input image patches, where the ordering is determined by shape similarity to latent curvilinear structure. Finally, we progressively reconstruct the curvilinear structure by looking for geodesic paths connecting remote vertices in the graph built on the structured output rankings. Experimental results show that the proposed algorithm faithfully provides topological features of the curvilinear structures using minimal pixels for various datasets. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02631v1 |
http://arxiv.org/pdf/1612.02631v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-tree-like-curvilinear-structure |
Repo | |
Framework | |
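The final step described in the abstract, recovering geodesic paths over a graph whose costs come from the learned rankings, can be illustrated with a minimal sketch. The 4-connected pixel grid, the cost definition, and the Dijkstra routine below are illustrative assumptions, not the authors' implementation.

```python
import heapq
import numpy as np

def geodesic_path(cost, start, goal):
    """Dijkstra shortest path on a 4-connected pixel grid.

    `cost` is a 2D array where low values mark pixels ranked as likely
    curvilinear structure, so the geodesic between two remote vertices
    hugs the structure."""
    h, w = cost.shape
    dist = np.full((h, w), np.inf)
    prev = {}
    dist[start] = 0.0
    pq = [(0.0, start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + 0.5 * (cost[r, c] + cost[nr, nc])
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(pq, (nd, (nr, nc)))
    # Walk back from the goal to the start to recover the path.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# Toy usage: costs are (1 - ranking score), so highly ranked pixels are cheap.
scores = np.random.rand(64, 64)
path = geodesic_path(1.0 - scores, start=(0, 0), goal=(63, 63))
```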
Interactive Storytelling over Document Collections
Title | Interactive Storytelling over Document Collections |
Authors | Dipayan Maiti, Mohammad Raihanul Islam, Scotland Leman, Naren Ramakrishnan |
Abstract | Storytelling algorithms aim to ‘connect the dots’ between disparate documents by linking starting and ending documents through a series of intermediate documents. Existing storytelling algorithms are based on notions of coherence and connectivity, and thus the primary way by which users can steer the story construction is via design of suitable similarity functions. We present an alternative approach to storytelling wherein the user can interactively and iteratively provide ‘must use’ constraints to preferentially support the construction of some stories over others. The three innovations in our approach are distance measures based on (inferred) topic distributions, the use of constraints to define sets of linear inequalities over paths, and the introduction of slack and surplus variables to condition the topic distribution to preferentially emphasize desired terms over others. We describe experimental results to illustrate the effectiveness of our interactive storytelling approach over multiple text datasets. |
Tasks | |
Published | 2016-02-21 |
URL | http://arxiv.org/abs/1602.06566v1 |
http://arxiv.org/pdf/1602.06566v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-storytelling-over-document |
Repo | |
Framework | |
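One building block in the abstract, a distance between documents based on their inferred topic distributions, is easy to sketch. The Jensen-Shannon choice below is an assumption for illustration, not necessarily the exact measure used in the paper.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two topic distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Candidate "next dots" in a story are the documents closest in topic space.
doc_topics = np.random.dirichlet(np.ones(20), size=100)   # 100 docs, 20 topics
current = doc_topics[0]
dists = np.array([js_divergence(current, t) for t in doc_topics])
next_doc = int(np.argsort(dists)[1])   # skip index 0 (the document itself)
```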
Low-dose CT denoising with convolutional neural network
Title | Low-dose CT denoising with convolutional neural network |
Authors | Hu Chen, Yi Zhang, Weihua Zhang, Peixi Liao, Ke Li, Jiliu Zhou, Ge Wang |
Abstract | To reduce the potential radiation risk, low-dose CT has attracted much attention. However, simply lowering the radiation dose will lead to significant deterioration of the image quality. In this paper, we propose a noise reduction method for low-dose CT via a deep neural network without accessing the original projection data. A deep convolutional neural network is trained to transform low-dose CT images towards normal-dose CT images, patch by patch. Visual and quantitative evaluation demonstrates the competitive performance of the proposed method. |
Tasks | Denoising |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00321v1 |
http://arxiv.org/pdf/1610.00321v1.pdf | |
PWC | https://paperswithcode.com/paper/low-dose-ct-denoising-with-convolutional |
Repo | |
Framework | |
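A minimal patch-to-patch denoising CNN in the spirit of the abstract might look as follows (PyTorch); the layer widths and kernel sizes are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Three-layer CNN mapping a low-dose CT patch to a normal-dose estimate.
model = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9, padding=4),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=5, padding=2),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# One training step on a batch of (low-dose, normal-dose) patch pairs.
low_dose = torch.randn(16, 1, 33, 33)      # stand-in for real patches
normal_dose = torch.randn(16, 1, 33, 33)
optimizer.zero_grad()
loss = loss_fn(model(low_dose), normal_dose)
loss.backward()
optimizer.step()
```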
Attribute Recognition from Adaptive Parts
Title | Attribute Recognition from Adaptive Parts |
Authors | Luwei Yang, Ligen Zhu, Yichen Wei, Shuang Liang, Ping Tan |
Abstract | Previous part-based attribute recognition approaches perform part detection and attribute recognition in separate steps. The parts are not optimized for attribute recognition and could therefore be sub-optimal. We present an end-to-end deep learning approach to overcome this limitation. It generates object parts from key points and performs attribute recognition accordingly, allowing adaptive spatial transforms of the parts. Both key point estimation and attribute recognition are learnt jointly in a multi-task setting. Extensive experiments on two datasets verify the efficacy of the proposed end-to-end approach. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01437v2 |
http://arxiv.org/pdf/1607.01437v2.pdf | |
PWC | https://paperswithcode.com/paper/attribute-recognition-from-adaptive-parts |
Repo | |
Framework | |
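The joint training of key point estimation and attribute recognition can be sketched as a simple multi-task loss; the shared backbone, head sizes, and equal loss weighting below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PartAttributeNet(nn.Module):
    """Shared backbone with a keypoint head and an attribute head."""
    def __init__(self, n_keypoints=14, n_attributes=40):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.kpt_head = nn.Linear(32, 2 * n_keypoints)   # (x, y) per keypoint
        self.attr_head = nn.Linear(32, n_attributes)     # multi-label logits

    def forward(self, x):
        f = self.backbone(x)
        return self.kpt_head(f), self.attr_head(f)

net = PartAttributeNet()
images = torch.randn(8, 3, 224, 224)
kpt_gt = torch.rand(8, 28)
attr_gt = torch.randint(0, 2, (8, 40)).float()

kpt_pred, attr_pred = net(images)
# Joint multi-task objective: keypoint regression + attribute classification.
loss = nn.functional.mse_loss(kpt_pred, kpt_gt) \
     + nn.functional.binary_cross_entropy_with_logits(attr_pred, attr_gt)
loss.backward()
```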
Modeling of Item-Difficulty for Ontology-based MCQs
Title | Modeling of Item-Difficulty for Ontology-based MCQs |
Authors | Vinu E. V, Tahani Alsubait, P. Sreenivasa Kumar |
Abstract | Multiple choice questions (MCQs) that can be generated from a domain ontology can significantly reduce the human effort and time required for authoring and administering assessments in an e-Learning environment. Even though there are various methods for generating MCQs from ontologies, methods for determining the difficulty-levels of such MCQs are less explored. In this paper, we study various aspects and factors that are involved in determining the difficulty-score of an MCQ, and propose an ontology-based model for the prediction. This model characterizes the difficulty values associated with the stem and choice set of the MCQs, and describes a measure which combines both scores. Furthermore, the notion of assigning difficulty-scores based on the skill level of the test taker is utilized for predicting the difficulty-score of a stem. We studied the effectiveness of the predicted difficulty-scores with the help of a psychometric model from Item Response Theory, involving real students and domain experts. Our results show that the predicted difficulty-levels of the MCQs correlate highly with their actual difficulty-levels. |
Tasks | |
Published | 2016-07-04 |
URL | http://arxiv.org/abs/1607.00869v1 |
http://arxiv.org/pdf/1607.00869v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-of-item-difficulty-for-ontology |
Repo | |
Framework | |
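The evaluation against a psychometric model from Item Response Theory can be illustrated with the standard two-parameter logistic item response function; the parameter values below are examples, not values from the paper.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability that a test taker with ability `theta`
    answers an item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A harder item (larger b) is answered correctly less often at fixed ability.
print(p_correct(theta=0.0, a=1.2, b=-1.0))   # easy item -> high probability
print(p_correct(theta=0.0, a=1.2, b=1.5))    # hard item -> low probability
```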
Point-wise mutual information-based video segmentation with high temporal consistency
Title | Point-wise mutual information-based video segmentation with high temporal consistency |
Authors | Margret Keuper, Thomas Brox |
Abstract | In this paper, we tackle the problem of temporally consistent boundary detection and hierarchical segmentation in videos. While finding the best high-level reasoning of region assignments in videos is the focus of much recent research, temporal consistency in boundary detection has so far only rarely been tackled. We argue that temporally consistent boundaries are a key component to temporally consistent region assignment. The proposed method is based on the point-wise mutual information (PMI) of spatio-temporal voxels. Temporal consistency is established by an evaluation of PMI-based point affinities in the spectral domain over space and time. Thus, the proposed method is independent of any optical flow computation or previously learned motion models. The proposed low-level video segmentation method outperforms the learning-based state of the art in terms of standard region metrics. |
Tasks | Boundary Detection, Optical Flow Estimation, Video Semantic Segmentation |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02467v1 |
http://arxiv.org/pdf/1606.02467v1.pdf | |
PWC | https://paperswithcode.com/paper/point-wise-mutual-information-based-video |
Repo | |
Framework | |
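The point-wise mutual information affinity at the heart of the method can be sketched from empirical feature-pair statistics; the crude quantisation into a joint histogram below is an illustrative assumption.

```python
import numpy as np

def pmi_affinity(pairs, n_bins=16, rho=1.0):
    """Estimate PMI(a, b) = log( P(a, b)^rho / (P(a) P(b)) ) from sampled
    feature pairs, using a joint histogram over quantised features."""
    a = np.clip((pairs[:, 0] * n_bins).astype(int), 0, n_bins - 1)
    b = np.clip((pairs[:, 1] * n_bins).astype(int), 0, n_bins - 1)
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (a, b), 1)
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    eps = 1e-12
    return rho * np.log(joint + eps) - np.log(np.outer(pa, pb) + eps)

# Affinity between two voxels is read off from their quantised features;
# spectral analysis of this affinity over space and time yields the segments.
pairs = np.random.rand(10000, 2)           # stand-in for sampled voxel pairs
W = pmi_affinity(pairs)
```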
Stochastic Variational Deep Kernel Learning
Title | Stochastic Variational Deep Kernel Learning |
Authors | Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing |
Abstract | Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand-alone deep networks, SVMs, and state-of-the-art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet. |
Tasks | Gaussian Processes, Multi-Task Learning |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00336v2 |
http://arxiv.org/pdf/1611.00336v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variational-deep-kernel-learning |
Repo | |
Framework | |
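One ingredient described above, additive base kernels applied to subsets of the deep network's output features, can be sketched in a few lines; the RBF base kernel and the even feature splitting are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential base kernel."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def additive_deep_kernel(F, n_groups=4):
    """Sum of base kernels, each acting on one slice of the deep features F."""
    groups = np.array_split(np.arange(F.shape[1]), n_groups)
    return sum(rbf_kernel(F[:, g], F[:, g]) for g in groups)

# F would be the output of a deep network on a minibatch; random stand-in here.
F = np.random.randn(32, 16)
K = additive_deep_kernel(F)   # 32 x 32 covariance used in the GP layer
```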
Choquet integral in decision analysis - lessons from the axiomatization
Title | Choquet integral in decision analysis - lessons from the axiomatization |
Authors | Mikhail Timonin |
Abstract | The Choquet integral is a powerful aggregation operator which includes many well-known models as special cases. We look at these special cases and provide their axiomatic analysis. In cases where an axiomatization has been previously given in the literature, we connect the existing results with the framework that we have developed. Next we turn to the question of learning, which is especially important for the practical applications of the model. So far, learning of the Choquet integral has been mostly confined to the learning of the capacity. Such an approach requires making a powerful assumption that all dimensions (e.g. criteria) are evaluated on the same scale, which is rarely justified in practice. Too often categorical data is given arbitrary numerical labels (e.g. AHP), and numerical data is considered cardinally and ordinally commensurate, sometimes after a simple normalization. Such approaches clearly lack scientific rigour, and yet they are commonly seen in all kinds of applications. We discuss the pros and cons of making such an assumption and look at the consequences that the uniqueness results of the axiomatization have for the learning problems. Finally, we review some of the applications of the Choquet integral in decision analysis. Apart from MCDA, which is the main area of interest for our results, we also discuss how the model can be interpreted in the social choice context. We look in detail at the state-dependent utility, and show how comonotonicity, central to the previous axiomatizations, actually implies state-independency in the Choquet integral model. We also discuss the conditions required to have a meaningful state-dependent utility representation and show the novelty of our results compared to the previous methods of building state-dependent models. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09926v1 |
http://arxiv.org/pdf/1611.09926v1.pdf | |
PWC | https://paperswithcode.com/paper/choquet-integral-in-decision-analysis-lessons |
Repo | |
Framework | |
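For reference, the discrete Choquet integral of a criteria vector x with respect to a capacity (monotone set function) mu can be computed as below; the toy two-criterion capacity is an illustrative assumption.

```python
import numpy as np

def choquet_integral(x, mu):
    """Discrete Choquet integral of x w.r.t. the capacity `mu`.

    `mu` maps frozensets of criterion indices to [0, 1], with
    mu(empty set) = 0 and mu(all criteria) = 1; x is assumed non-negative."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)                       # criteria in ascending value
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        upper = frozenset(int(j) for j in order[k:])   # criteria with x_j >= x_i
        total += (x[i] - prev) * mu[upper]
        prev = x[i]
    return total

# Two criteria with a sub-additive interaction (redundancy).
mu = {frozenset(): 0.0, frozenset({0}): 0.6, frozenset({1}): 0.6,
      frozenset({0, 1}): 1.0}
print(choquet_integral([0.4, 0.8], mu))   # 0.4 * 1.0 + 0.4 * 0.6 = 0.64
```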
Will People Like Your Image? Learning the Aesthetic Space
Title | Will People Like Your Image? Learning the Aesthetic Space |
Authors | Katharina Schwarz, Patrick Wieschollek, Hendrik P. A. Lensch |
Abstract | Rating how aesthetically pleasing an image appears is a highly complex matter and depends on a large number of different visual factors. Previous work has tackled the aesthetic rating problem by ranking on a 1-dimensional rating scale, e.g., incorporating handcrafted attributes. In this paper, we propose a rather general approach to automatically map aesthetic pleasingness with all its complexity into an “aesthetic space” to allow for a highly fine-grained resolution. In detail, making use of deep learning, our method directly learns an encoding of a given image into this high-dimensional feature space resembling visual aesthetics. In addition to the visual factors mentioned above, differences in personal judgments have a large impact on the likeableness of a photograph. Nowadays, online platforms allow users to “like” or favor certain content with a single click. To incorporate a huge diversity of people, we make use of such multi-user agreements and assemble a large data set of 380K images (AROD) with associated meta information and derive a score to rate how visually pleasing a given photo is. We validate our derived model of aesthetics in a user study. Further, without any extra data labeling or handcrafted features, we achieve state-of-the-art accuracy on the AVA benchmark data set. Finally, as our approach is able to predict the aesthetic quality of any arbitrary image or video, we demonstrate our results on applications for re-sorting photo collections, capturing the best shot on mobile devices and aesthetic key-frame extraction from videos. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05203v2 |
http://arxiv.org/pdf/1611.05203v2.pdf | |
PWC | https://paperswithcode.com/paper/will-people-like-your-image-learning-the |
Repo | |
Framework | |
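The idea of turning raw platform signals into an aesthetics score can be sketched as below; the particular ratio-of-logs formula is only an assumed placeholder, not necessarily the score derived in the paper.

```python
import math

def aesthetics_score(faves, views):
    """Toy popularity-normalised score: images seen by many but faved by few
    score lower than images faved by a large share of their viewers."""
    return math.log(faves + 1) / math.log(views + 2)

print(aesthetics_score(faves=900, views=1000))    # widely liked   -> high score
print(aesthetics_score(faves=5, views=100000))    # widely ignored -> low score
```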
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Title | Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues |
Authors | Anand Mishra, Karteek Alahari, C. V. Jawahar |
Abstract | Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections from an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate our proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated in our framework to further improve the recognition performance. |
Tasks | Scene Text Recognition |
Published | 2016-01-13 |
URL | http://arxiv.org/abs/1601.03128v1 |
http://arxiv.org/pdf/1601.03128v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-energy-minimization-framework-for |
Repo | |
Framework | |
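The energy minimization over character detections with a lexicon-based pairwise prior can be illustrated with a chain-model Viterbi pass; the toy unary and bigram costs below are assumptions, not the full random field of the paper.

```python
import numpy as np

def viterbi_word(unary, bigram):
    """Minimise E(y) = sum_t unary[t, y_t] + sum_t bigram[y_t, y_{t+1}]
    over character labels y for a chain of detected character windows."""
    T, L = unary.shape
    cost = unary[0].copy()
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        trans = cost[:, None] + bigram          # L x L transition costs
        back[t] = trans.argmin(axis=0)
        cost = trans.min(axis=0) + unary[t]
    labels = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        labels.append(int(back[t, labels[-1]]))
    return labels[::-1]

# 5 character windows, 26 labels; bigram costs would come from lexicon statistics.
rng = np.random.default_rng(0)
unary = rng.random((5, 26))                     # character detection costs
bigram = rng.random((26, 26))                   # e.g. -log bigram probabilities
word = viterbi_word(unary, bigram)
```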
Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition
Title | Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition |
Authors | Guo Qiang, Tu Dan, Li Guohui, Lei Jun |
Abstract | Text recognition in natural scenes is a challenging problem due to the many factors affecting text appearance. In this paper, we present a method that directly transcribes scene text images to text without the need for sophisticated character segmentation. We leverage recent advances in deep neural networks to model the appearance of scene text images with temporal dynamics. Specifically, we integrate a convolutional neural network (CNN) and a recurrent neural network (RNN), motivated by the complementary modeling capabilities of the two models. The main contribution of this work is investigating how temporal memory helps in a segmentation-free fashion for this specific problem. By using long short-term memory (LSTM) blocks as hidden units, our model can retain long-term memory, compared with HMMs which only maintain short-term state dependencies. We conduct experiments on the Street View House Numbers dataset, which contains highly variable number images. The results demonstrate the superiority of the proposed method over traditional HMM-based methods. |
Tasks | Scene Text Recognition |
Published | 2016-01-06 |
URL | http://arxiv.org/abs/1601.01100v1 |
http://arxiv.org/pdf/1601.01100v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-matters-convolutional-recurrent-neural |
Repo | |
Framework | |
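The CNN-plus-LSTM integration described above can be sketched as follows (PyTorch); the feature extractor, image size, and output layer are illustrative assumptions, not the authors' network.

```python
import torch
import torch.nn as nn

class ConvRecurrentReader(nn.Module):
    """Run a small CNN over the image, treat its columns as a sequence, and
    let a bidirectional LSTM emit per-step character logits."""
    def __init__(self, n_classes=11):            # e.g. 10 digits + blank
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(input_size=64 * 8, hidden_size=128,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 128, n_classes)

    def forward(self, x):                        # x: (B, 1, 32, W)
        f = self.cnn(x)                          # (B, 64, 8, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)     # (B, W/4, 64*8) column sequence
        out, _ = self.lstm(f)
        return self.fc(out)                      # per-column character logits

logits = ConvRecurrentReader()(torch.randn(4, 1, 32, 128))   # (4, 32, 11)
```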
Batched Gaussian Process Bandit Optimization via Determinantal Point Processes
Title | Batched Gaussian Process Bandit Optimization via Determinantal Point Processes |
Authors | Tarun Kathuria, Amit Deshpande, Pushmeet Kohli |
Abstract | Gaussian Process bandit optimization has emerged as a powerful tool for optimizing noisy black box functions. One example in machine learning is hyper-parameter optimization where each evaluation of the target function requires training a model which may involve days or even weeks of computation. Most methods for this so-called “Bayesian optimization” only allow sequential exploration of the parameter space. However, it is often desirable to propose batches or sets of parameter values to explore simultaneously, especially when there are large parallel processing facilities at our disposal. Batch methods require modeling the interaction between the different evaluations in the batch, which can be expensive in complex scenarios. In this paper, we propose a new approach for parallelizing Bayesian optimization by modeling the diversity of a batch via Determinantal point processes (DPPs) whose kernels are learned automatically. This allows us to generalize a previous result as well as prove better regret bounds based on DPP sampling. Our experiments on a variety of synthetic and real-world robotics and hyper-parameter optimization tasks indicate that our DPP-based methods, especially those based on DPP sampling, outperform state-of-the-art methods. |
Tasks | Point Processes |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04088v1 |
http://arxiv.org/pdf/1611.04088v1.pdf | |
PWC | https://paperswithcode.com/paper/batched-gaussian-process-bandit-optimization |
Repo | |
Framework | |
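One way to realise the diverse-batch idea above is greedy MAP inference for a DPP, growing the batch so as to maximise the log-determinant of the kernel restricted to it; this greedy variant is a simplification of the sampling schemes analysed in the paper.

```python
import numpy as np

def greedy_dpp_batch(K, batch_size):
    """Greedily grow a batch S maximising log det K[S, S] (DPP MAP heuristic),
    trading off individual quality against diversity."""
    n = K.shape[0]
    selected = []
    for _ in range(batch_size):
        best, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        selected.append(best)
    return selected

# Kernel over candidate hyper-parameter settings (e.g. from the GP posterior).
X = np.random.rand(50, 3)
K = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1)) + 1e-6 * np.eye(50)
batch = greedy_dpp_batch(K, batch_size=5)
```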
A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genetics
Title | A Bayesian Group Sparse Multi-Task Regression Model for Imaging Genetics |
Authors | Keelin Greenlaw, Elena Szefer, Jinko Graham, Mary Lesperance, Farouk S. Nathoo |
Abstract | Motivation: Recent advances in technology for brain imaging and high-throughput genotyping have motivated studies examining the influence of genetic variation on brain structure. Wang et al. (Bioinformatics, 2012) have developed an approach for the analysis of imaging genomic studies using penalized multi-task regression with regularization based on a novel group $l_{2,1}$-norm penalty which encourages structured sparsity at both the gene level and SNP level. While incorporating a number of useful features, the proposed method only furnishes a point estimate of the regression coefficients; techniques for conducting statistical inference are not provided. A new Bayesian method is proposed here to overcome this limitation. Results: We develop a Bayesian hierarchical modeling formulation where the posterior mode corresponds to the estimator proposed by Wang et al. (Bioinformatics, 2012), and an approach that allows for full posterior inference including the construction of interval estimates for the regression parameters. We show that the proposed hierarchical model can be expressed as a three-level Gaussian scale mixture and this representation facilitates the use of a Gibbs sampling algorithm for posterior simulation. Simulation studies demonstrate that the interval estimates obtained using our approach achieve adequate coverage probabilities that outperform those obtained from the nonparametric bootstrap. Our proposed methodology is applied to the analysis of neuroimaging and genetic data collected as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and this analysis of the ADNI cohort demonstrates clearly the value added of incorporating interval estimation beyond only point estimation when relating SNPs to brain imaging endophenotypes. |
Tasks | |
Published | 2016-05-07 |
URL | http://arxiv.org/abs/1605.02234v2 |
http://arxiv.org/pdf/1605.02234v2.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-group-sparse-multi-task-regression |
Repo | |
Framework | |
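The group l_{2,1} penalty that the Bayesian model mirrors at its posterior mode can be written down directly; the grouping of SNPs into genes below is an illustrative assumption.

```python
import numpy as np

def group_l21_penalty(W, groups):
    """Group l_{2,1} norm: sum over gene groups of the Frobenius norm of the
    coefficient rows for the SNPs in that group (rows of W index SNPs,
    columns index imaging phenotypes)."""
    return sum(np.linalg.norm(W[g, :]) for g in groups)

# 12 SNPs grouped into 3 genes, 5 imaging endophenotypes.
W = np.random.randn(12, 5)
genes = [list(range(0, 4)), list(range(4, 9)), list(range(9, 12))]
print(group_l21_penalty(W, genes))
```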
Active Object Localization in Visual Situations
Title | Active Object Localization in Visual Situations |
Authors | Max H. Quinn, Anthony D. Rhodes, Melanie Mitchell |
Abstract | We describe a method for performing active localization of objects in instances of visual situations. A visual situation is an abstract concept—e.g., “a boxing match”, “a birthday party”, “walking the dog”, “waiting for a bus”—whose image instantiations are linked more by their common spatial and semantic structure than by low-level visual similarity. Our system combines given and learned knowledge of the structure of a particular situation, and adapts that knowledge to a new situation instance as it actively searches for objects. More specifically, the system learns a set of probability distributions describing spatial and other relationships among relevant objects. The system uses those distributions to iteratively sample object proposals on a test image, but also continually uses information from those object proposals to adaptively modify the distributions based on what the system has detected. We test our approach’s ability to efficiently localize objects, using a situation-specific image dataset created by our group. We compare the results with several baselines and variations on our method, and demonstrate the strong benefit of using situation knowledge and active context-driven localization. Finally, we contrast our method with several other approaches that use context as well as active search for object localization in images. |
Tasks | Active Object Localization, Object Localization |
Published | 2016-07-02 |
URL | http://arxiv.org/abs/1607.00548v1 |
http://arxiv.org/pdf/1607.00548v1.pdf | |
PWC | https://paperswithcode.com/paper/active-object-localization-in-visual |
Repo | |
Framework | |
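The iterative loop of sampling an object proposal and then updating the situation model can be caricatured as follows; the Gaussian offset model, stand-in detector, and update rule are illustrative assumptions, not the learned distributions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def score_proposal(offset, true_offset=np.array([55.0, 80.0])):
    """Stand-in detector score: high when the proposed offset is near the
    (hidden) true offset of the target object."""
    return float(np.exp(-np.sum((offset - true_offset) ** 2) / (2 * 20.0 ** 2)))

# Prior over where the target sits relative to an already-detected anchor object.
mean, cov = np.array([40.0, 60.0]), np.diag([30.0 ** 2, 30.0 ** 2])
accepted = []

for step in range(200):
    proposal = rng.multivariate_normal(mean, cov)        # sample a location offset
    if score_proposal(proposal) > 0.5:                   # promising detection
        accepted.append(proposal)
        # Adapt the situation model toward offsets that actually scored well.
        mean = 0.7 * mean + 0.3 * np.mean(accepted, axis=0)

print(mean)   # drifts from the prior toward the true offset as evidence accumulates
```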
Self-organized control for musculoskeletal robots
Title | Self-organized control for musculoskeletal robots |
Authors | Ralf Der, Georg Martius |
Abstract | With the accelerated development of robot technologies, optimal control becomes one of the central themes of research. In traditional approaches, the controller, by its internal functionality, finds appropriate actions on the basis of the history of sensor values, guided by the goals, intentions, objectives, learning schemes, and so on planted into it. The idea is that the controller controls the world—the body plus its environment—as reliably as possible. However, in elastically actuated robots this approach faces severe difficulties. This paper advocates for a new paradigm of self-organized control. The paper presents a solution with a controller that is devoid of any functionalities of its own, given by a fixed, explicit and context-free function of the recent history of the sensor values. When applying this controller to a muscle-tendon driven arm-shoulder system from the Myorobotics toolkit, we observe a vast variety of self-organized behavior patterns: when left alone, the arm realizes pseudo-random sequences of different poses but one can also manipulate the system into definite motion patterns. But most interestingly, after attaching an object, the controller gets in a functional resonance with the object’s internal dynamics: when given a half-filled bottle, the system spontaneously starts shaking the bottle so that maximum response from the dynamics of the water is being generated. After attaching a pendulum to the arm, the controller drives the pendulum into a circular mode. In this way, the robot discovers dynamical affordances of objects its body is interacting with. We also discuss perspectives for using this controller paradigm for intention driven behavior generation. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.02990v2 |
http://arxiv.org/pdf/1602.02990v2.pdf | |
PWC | https://paperswithcode.com/paper/self-organized-control-for-musculoskeletal |
Repo | |
Framework | |
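The key claim above, a controller that is a fixed, explicit, context-free function of the recent sensor history, can be caricatured in a few lines; the specific feedback rule and the toy elastic plant below are assumptions, not the Myorobotics controller.

```python
import numpy as np

def controller(sensor_history, gain=2.0):
    """Fixed, context-free control law: react only to the most recent change
    in the sensor values (no goals, models, or learned parameters)."""
    delta = sensor_history[-1] - sensor_history[-2]
    return np.tanh(gain * delta)

# Toy elastic "plant": the motor command pushes the sensor value, a spring pulls back.
state = np.array([0.1, 0.0])        # position, velocity
history = [np.array([state[0]]), np.array([state[0]])]
for t in range(1000):
    u = controller(history)
    accel = 5.0 * u[0] - 4.0 * state[0] - 0.1 * state[1]
    state = state + 0.01 * np.array([state[1], accel])
    history.append(np.array([state[0]]))
# Through the feedback loop with the elastic plant, oscillatory patterns emerge
# without any explicit goal being specified.
```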