May 6, 2019

2932 words 14 mins read

Paper Group ANR 412


Segmentation and Classification of Skin Lesions for Disease Diagnosis. Social Behavior Prediction from First Person Videos. Interactive Spoken Content Retrieval by Deep Reinforcement Learning. Ceteris paribus logic in counterfactual reasoning. Fast Task-Specific Target Detection via Graph Based Constraints Representation and Checking. Collaborative …

Segmentation and Classification of Skin Lesions for Disease Diagnosis

Title Segmentation and Classification of Skin Lesions for Disease Diagnosis
Authors Sumithra R, Mahamad Suhil, D. S. Guru
Abstract In this paper, a novel approach for automatic segmentation and classification of skin lesions is proposed. Initially, skin images are filtered to remove unwanted hairs and noise, and then the segmentation process is carried out to extract lesion areas. For segmentation, a region growing method is applied with automatic initialization of seed points. The segmentation performance is measured with different well-known measures, and the results are appreciable. Subsequently, the extracted lesion areas are represented by color and texture features. SVM and k-NN classifiers are used, along with their fusion, for classification using the extracted features. The performance of the system is tested on our own dataset of 726 samples from 141 images consisting of 5 different classes of diseases. The results are promising, with an F-measure of 46.71% using the SVM classifier and 34% using k-NN, rising to 61% for the fusion of SVM and k-NN.
Tasks
Published 2016-09-12
URL http://arxiv.org/abs/1609.03277v1
PDF http://arxiv.org/pdf/1609.03277v1.pdf
PWC https://paperswithcode.com/paper/segmentation-and-classification-of-skin
Repo
Framework
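The seed-initialized region growing described in the abstract can be illustrated with a minimal sketch (pure Python; the 4-connectivity, the intensity tolerance `tol`, and the function name `region_grow` are illustrative assumptions, not the authors' implementation):

```python
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed`, adding 4-connected neighbours whose
    intensity differs from the seed pixel by at most `tol`."""
    h, w = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    mask = [[False] * w for _ in range(h)]
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not mask[nr][nc]
                    and abs(image[nr][nc] - seed_val) <= tol):
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask
```

On a 3x3 toy image with a dark lesion in the top-left corner, growing from seed (0, 0) recovers exactly the connected dark pixels.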

Social Behavior Prediction from First Person Videos

Title Social Behavior Prediction from First Person Videos
Authors Shan Su, Jung Pyo Hong, Jianbo Shi, Hyun Soo Park
Abstract This paper presents a method to predict the future movements (location and gaze direction) of basketball players as a whole from their first person videos. The predicted behaviors reflect an individual physical space that affords taking the next actions while conforming to social behaviors by engaging in joint attention. Our key innovation is to use the 3D reconstruction of multiple first person cameras to automatically annotate the visual semantics of each other’s social configurations. We leverage two learning signals uniquely embedded in first person videos. Individually, a first person video records the visual semantics of a spatial and social layout around a person that allows associating with past similar situations. Collectively, first person videos follow joint attention that can link the individuals to a group. We learn the egocentric visual semantics of group movements using a Siamese neural network to retrieve future trajectories. We consolidate the retrieved trajectories from all players by maximizing a measure of social compatibility—the gaze alignment towards joint attention predicted by their social formation, where the dynamics of joint attention is learned by a long-term recurrent convolutional network. This allows us to characterize which social configuration is more plausible and predict future group trajectories.
Tasks 3D Reconstruction
Published 2016-11-29
URL http://arxiv.org/abs/1611.09464v1
PDF http://arxiv.org/pdf/1611.09464v1.pdf
PWC https://paperswithcode.com/paper/social-behavior-prediction-from-first-person
Repo
Framework
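The retrieval step — matching an egocentric embedding against a database of stored trajectories — can be illustrated with a hedged nearest-neighbour sketch (the Siamese network producing the embeddings is omitted; the function `retrieve` and the cosine scoring are assumptions for illustration, not the paper's pipeline):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_emb, database):
    """Return the key of the stored embedding most similar to the query,
    i.e. the past situation whose future trajectory is reused."""
    return max(database, key=lambda k: cosine(query_emb, database[k]))
```

A query embedding close to a stored situation retrieves that situation's key, from which its associated future trajectory would be read off.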

Interactive Spoken Content Retrieval by Deep Reinforcement Learning

Title Interactive Spoken Content Retrieval by Deep Reinforcement Learning
Authors Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-Yi Lee, Lin-Shan Lee
Abstract User-machine interaction is important for spoken content retrieval. For text content retrieval, the user can easily scan through and select from a list of retrieved items. This is impossible for spoken content retrieval, because the retrieved items are difficult to show on screen. Besides, due to the high degree of uncertainty in speech recognition, the retrieval results can be very noisy. One way to counter such difficulties is through user-machine interaction. The machine can take different actions to interact with the user to obtain better retrieval results before showing them to the user. The suitable actions depend on the retrieval status, for example requesting extra information from the user, returning a list of topics for the user to select from, etc. In our previous work, some hand-crafted states estimated from the present retrieval results are used to determine the proper actions. In this paper, we propose to use Deep-Q-Learning techniques instead to determine the machine actions for interactive spoken content retrieval. Deep-Q-Learning bypasses the need for estimation of the hand-crafted states and directly determines the best action based on the present retrieval status, even without any human knowledge. It is shown to achieve significantly better performance compared with the previous approach based on hand-crafted states.
Tasks Q-Learning, Speech Recognition
Published 2016-09-16
URL http://arxiv.org/abs/1609.05234v1
PDF http://arxiv.org/pdf/1609.05234v1.pdf
PWC https://paperswithcode.com/paper/interactive-spoken-content-retrieval-by-deep
Repo
Framework
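As a stand-in for the deep Q-network, the Q-learning update and greedy action choice underlying the method can be sketched in tabular form (the dialogue states, the machine actions, and the hyper-parameters `alpha`/`gamma`/`epsilon` here are hypothetical, not taken from the paper):

```python
import random

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One-step Q-learning update: move Q(s, a) towards the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[next_state].values()) if Q.get(next_state) else 0.0
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

def choose_action(Q, state, epsilon=0.1):
    """Epsilon-greedy policy: explore with probability epsilon,
    otherwise pick the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.choice(list(Q[state]))
    return max(Q[state], key=Q[state].get)
```

In the paper's setting the "actions" would be interaction moves such as asking the user for extra information or returning a topic list, and the reward would reflect retrieval quality.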

Ceteris paribus logic in counterfactual reasoning

Title Ceteris paribus logic in counterfactual reasoning
Authors Patrick Girard, Marcus Anthony Triplett
Abstract The semantics for counterfactuals due to David Lewis has been challenged on the basis of unlikely, or impossible, events. Such events may skew a given similarity order in favour of those possible worlds which exhibit them. By updating the relational structure of a model according to a ceteris paribus clause one forces out, in a natural manner, those possible worlds which do not satisfy the requirements of the clause. We develop a ceteris paribus logic for counterfactual reasoning capable of performing such actions, and offer several alternative (relaxed) interpretations of ceteris paribus. We apply this framework in a way which allows us to reason counterfactually without having our similarity order skewed by unlikely events. This continues the investigation of formal ceteris paribus reasoning, which has previously been applied to preferences, logics of game forms, and questions in decision-making, among other areas.
Tasks Decision Making
Published 2016-06-24
URL http://arxiv.org/abs/1606.07522v1
PDF http://arxiv.org/pdf/1606.07522v1.pdf
PWC https://paperswithcode.com/paper/ceteris-paribus-logic-in-counterfactual
Repo
Framework
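The effect of a ceteris paribus clause — forcing out possible worlds that do not hold the named propositions fixed — can be mimicked with a toy filter over worlds encoded as dictionaries (this encoding is an illustrative assumption, not the paper's formal semantics):

```python
def ceteris_paribus_filter(worlds, fixed):
    """Keep only the worlds that agree with the clause `fixed` on every
    proposition it mentions; all other worlds are forced out."""
    return [w for w in worlds
            if all(w.get(p) == v for p, v in fixed.items())]
```

For example, when evaluating a counterfactual "all else being equal", a clause fixing an unlikely event to not occur removes the worlds that would otherwise skew the similarity order.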

Fast Task-Specific Target Detection via Graph Based Constraints Representation and Checking

Title Fast Task-Specific Target Detection via Graph Based Constraints Representation and Checking
Authors Wentao Luan, Yezhou Yang, Cornelia Fermuller, John S. Baras
Abstract In this work, we present a fast target detection framework for real-world robotics applications. Considering that an intelligent agent attends to a task-specific object target during execution, our goal is to detect the object efficiently. We propose the concept of early recognition, which influences the candidate proposal process to achieve fast and reliable detection performance. To check the target constraints efficiently, we put forward a novel policy to generate a sub-optimal checking order, and prove that it has bounded time cost compared to the optimal checking sequence, which is not achievable in polynomial time. Experiments on two different scenarios: 1) rigid object and 2) non-rigid body part detection validate our pipeline. To show that our method is widely applicable, we further present a human-robot interaction system based on our non-rigid body part detection.
Tasks
Published 2016-11-14
URL http://arxiv.org/abs/1611.04519v2
PDF http://arxiv.org/pdf/1611.04519v2.pdf
PWC https://paperswithcode.com/paper/fast-task-specific-target-detection-via-graph
Repo
Framework
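The idea of ordering constraint checks so that cheap, highly discriminative checks run first can be sketched with a greedy cost-per-rejection heuristic (a hypothetical stand-in for the paper's policy; the exact policy and its bound relative to the optimal checking sequence are not reproduced here):

```python
def checking_order(constraints):
    """Greedily order constraints so that those most likely to reject a
    candidate per unit checking cost come first. Each constraint is a
    dict with 'name', 'cost', and 'reject_prob' fields (all assumed)."""
    return sorted(constraints, key=lambda c: c["cost"] / c["reject_prob"])
```

Checking a cheap constraint that rejects most candidates first prunes the proposal set early, which is the intuition behind the early-recognition idea in the abstract.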

Collaborative Training of Tensors for Compositional Distributional Semantics

Title Collaborative Training of Tensors for Compositional Distributional Semantics
Authors Tamara Polajnar
Abstract Type-based compositional distributional semantic models present an interesting line of research into functional representations of linguistic meaning. One of the drawbacks of such models, however, is the lack of training data required to train each word-type combination. In this paper we address this by introducing training methods that share parameters between similar words. We show that these methods enable zero-shot learning for words that have no training data at all, as well as enabling construction of high-quality tensors from very few training examples per word.
Tasks Zero-Shot Learning
Published 2016-07-08
URL http://arxiv.org/abs/1607.02310v3
PDF http://arxiv.org/pdf/1607.02310v3.pdf
PWC https://paperswithcode.com/paper/collaborative-training-of-tensors-for
Repo
Framework
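Parameter sharing for words with no training data can be illustrated by backing off to the (here, flattened) tensors of distributionally similar words (the uniform averaging rule and the `neighbors` map are illustrative assumptions, not the paper's training methods):

```python
def backoff_tensor(word, tensors, neighbors):
    """Zero-shot construction: approximate the tensor of an unseen word
    by averaging the tensors of its distributional neighbours.
    Tensors are represented as flat lists of parameters."""
    shared = [tensors[n] for n in neighbors.get(word, []) if n in tensors]
    if not shared:
        raise KeyError(f"no trained neighbours for {word!r}")
    size = len(shared[0])
    return [sum(t[i] for t in shared) / len(shared) for i in range(size)]
```

An unseen adjective like "huge" would then inherit an averaged tensor from trained neighbours such as "big" and "large".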

Algorithmic Composition of Melodies with Deep Recurrent Neural Networks

Title Algorithmic Composition of Melodies with Deep Recurrent Neural Networks
Authors Florian Colombo, Samuel P. Muscinelli, Alexander Seeholzer, Johanni Brea, Wulfram Gerstner
Abstract A big challenge in algorithmic composition is to devise a model that is both easily trainable and able to reproduce the long-range temporal dependencies typical of music. Here we investigate how artificial neural networks can be trained on a large corpus of melodies and turned into automated music composers able to generate new melodies coherent with the style they have been trained on. We employ gated recurrent unit networks that have been shown to be particularly efficient in learning complex sequential activations with arbitrarily long time lags. Our model processes rhythm and melody in parallel while modeling the relation between these two features. Using such an approach, we were able to generate interesting complete melodies or suggest possible continuations of a melody fragment that are coherent with the characteristics of the fragment itself.
Tasks
Published 2016-06-23
URL http://arxiv.org/abs/1606.07251v1
PDF http://arxiv.org/pdf/1606.07251v1.pdf
PWC https://paperswithcode.com/paper/algorithmic-composition-of-melodies-with-deep
Repo
Framework
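The gated recurrent unit at the core of the model can be written out, in scalar form, as its standard gate equations (parameter names are generic; this is the textbook GRU cell, not the paper's full rhythm-and-melody architecture):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU step with parameters p (keys wz, uz, wr, ur, wh, uh)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h)               # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)               # reset gate
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * r * h)   # candidate state
    return (1 - z) * h + z * h_tilde                     # new hidden state
```

The update gate `z` lets the hidden state persist unchanged over many steps, which is what makes the cell suited to the long time lags mentioned in the abstract.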

Iterative Hough Forest with Histogram of Control Points for 6 DoF Object Registration from Depth Images

Title Iterative Hough Forest with Histogram of Control Points for 6 DoF Object Registration from Depth Images
Authors Caner Sahin, Rigas Kouskouridas, Tae-Kyun Kim
Abstract State-of-the-art techniques proposed for 6D object pose recovery depend on occlusion-free point clouds to accurately register objects in 3D space. To reduce this dependency, we introduce a novel architecture called Iterative Hough Forest with Histogram of Control Points that is capable of estimating occluded and cluttered objects’ 6D pose given a candidate 2D bounding box. Our Iterative Hough Forest is learnt using patches extracted only from the positive samples. These patches are represented with Histogram of Control Points (HoCP), a “scale-variant” implicit volumetric description, which we derive from recently introduced Implicit B-Splines (IBS). The rich discriminative information provided by this scale-variance is leveraged during inference, where the initial pose estimation of the object is iteratively refined based on more discriminative control points by using our Iterative Hough Forest. We conduct experiments on several test objects of a publicly available dataset to test our architecture and to compare with the state-of-the-art.
Tasks Pose Estimation
Published 2016-03-08
URL http://arxiv.org/abs/1603.02617v2
PDF http://arxiv.org/pdf/1603.02617v2.pdf
PWC https://paperswithcode.com/paper/iterative-hough-forest-with-histogram-of
Repo
Framework

Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification

Title Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification
Authors Maxime Bucher, Stéphane Herbin, Frédéric Jurie
Abstract This paper addresses the task of zero-shot image classification. The key contribution of the proposed approach is to control the semantic embedding of images – one of the main ingredients of zero-shot learning – by formulating it as a metric learning problem. The optimized empirical criterion associates two types of sub-task constraints: metric discriminating capacity and accurate attribute prediction. This results in a novel expression of zero-shot learning not requiring the notion of class in the training phase: only pairs of image/attributes, augmented with a consistency indicator, are given as ground truth. At test time, the learned model can predict the consistency of a test image with a given set of attributes, allowing flexible ways to produce recognition inferences. Despite its simplicity, the proposed approach gives state-of-the-art results on four challenging datasets used for zero-shot recognition evaluation.
Tasks Image Classification, Metric Learning, Zero-Shot Learning
Published 2016-07-27
URL http://arxiv.org/abs/1607.08085v1
PDF http://arxiv.org/pdf/1607.08085v1.pdf
PWC https://paperswithcode.com/paper/improving-semantic-embedding-consistency-by
Repo
Framework
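At test time the model scores the consistency of an image with a set of attributes; a minimal stand-in is to assign the class whose attribute signature lies closest to the attributes predicted for the image (the L2 scoring below is an assumption for illustration, not the paper's learned metric):

```python
def zero_shot_predict(image_attrs, class_attrs):
    """Assign the class whose attribute signature is closest (squared L2)
    to the attribute vector predicted for the image."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(class_attrs, key=lambda c: dist2(image_attrs, class_attrs[c]))
```

Because classes are described only by attribute vectors, classes unseen during training can be recognized as long as their attribute signatures are known.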

AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge

Title AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge
Authors Michel Valstar, Jonathan Gratch, Bjorn Schuller, Fabien Ringeval, Denis Lalanne, Mercedes Torres Torres, Stefan Scherer, Guiota Stratou, Roddy Cowie, Maja Pantic
Abstract The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) “Depression, Mood and Emotion” will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multi-modal information processing and to bring together the depression and emotion recognition communities, as well as the audio, video and physiological processing communities, to compare the relative merits of the various approaches to depression and emotion recognition under well-defined and strictly comparable conditions and establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks.
Tasks Emotion Recognition
Published 2016-05-05
URL http://arxiv.org/abs/1605.01600v4
PDF http://arxiv.org/pdf/1605.01600v4.pdf
PWC https://paperswithcode.com/paper/avec-2016-depression-mood-and-emotion
Repo
Framework

Reference-Aware Language Models

Title Reference-Aware Language Models
Authors Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling
Abstract We propose a general class of language models that treat reference as an explicit stochastic latent variable. This architecture allows models to create mentions of entities and their attributes by accessing external databases (required by, e.g., dialogue generation and recipe generation) and internal state (required by, e.g., language models which are aware of coreference). This facilitates the incorporation of information that can be accessed in predictable locations in databases or discourse context, even when the targets of the reference may be rare words. Experiments on three tasks show that our model variants outperform models based on deterministic attention.
Tasks Dialogue Generation, Recipe Generation
Published 2016-11-05
URL http://arxiv.org/abs/1611.01628v5
PDF http://arxiv.org/pdf/1611.01628v5.pdf
PWC https://paperswithcode.com/paper/reference-aware-language-models
Repo
Framework

Interpretability in Linear Brain Decoding

Title Interpretability in Linear Brain Decoding
Authors Seyed Mostafa Kia, Andrea Passerini
Abstract Improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of brain decoding models. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, we present a simple definition for interpretability of linear brain decoding models. Then, we propose to combine the interpretability and the performance of the brain decoding into a new multi-objective criterion for model selection. Our preliminary results on the toy data show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative linear models. The presented definition provides the theoretical background for quantitative evaluation of interpretability in linear brain decoding.
Tasks Brain Decoding, Model Selection
Published 2016-06-17
URL http://arxiv.org/abs/1606.05672v1
PDF http://arxiv.org/pdf/1606.05672v1.pdf
PWC https://paperswithcode.com/paper/interpretability-in-linear-brain-decoding
Repo
Framework
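The proposed multi-objective criterion could, in its simplest scalarized form, be a weighted sum of decoding performance and interpretability used to rank hyper-parameter settings (the weight `alpha`, the linear form, and the `select_model` helper are assumptions for illustration, not the paper's criterion):

```python
def criterion(accuracy, interpretability, alpha=0.5):
    """Weighted scalarization of the two objectives (both assumed in [0, 1])."""
    return alpha * accuracy + (1 - alpha) * interpretability

def select_model(models, alpha=0.5):
    """Pick the hyper-parameter setting maximizing the combined criterion.
    `models` maps a setting name to an (accuracy, interpretability) pair."""
    return max(models, key=lambda m: criterion(*models[m], alpha))
```

Under such a criterion, a slightly less accurate but far more interpretable regularization setting can win model selection, which matches the abstract's motivation.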

Interpretability of Multivariate Brain Maps in Brain Decoding: Definition and Quantification

Title Interpretability of Multivariate Brain Maps in Brain Decoding: Definition and Quantification
Authors Seyed Mostafa Kia
Abstract Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed theoretical definition, we formalize a heuristic method for approximating the interpretability of multivariate brain maps in a binary magnetoencephalography (MEG) decoding scenario. Third, we propose to combine the approximated interpretability and the performance of the brain decoding model into a new multi-objective criterion for model selection. Our results for the MEG data show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.
Tasks Brain Decoding, Model Selection
Published 2016-03-29
URL http://arxiv.org/abs/1603.08704v1
PDF http://arxiv.org/pdf/1603.08704v1.pdf
PWC https://paperswithcode.com/paper/interpretability-of-multivariate-brain-maps
Repo
Framework

Incremental Noising and its Fractal Behavior

Title Incremental Noising and its Fractal Behavior
Authors Konstantinos A. Raftopoulos, Marin Ferecatu, Dionyssios D. Sourlas, Stefanos D. Kollias
Abstract This manuscript is about further elucidating the concept of noising. The concept of noising first appeared in \cite{CVPR14}, in the context of curvature estimation and vertex localization on planar shapes. There are indications that noising can play for global methods the role smoothing plays for local methods in this task. This manuscript is about investigating this claim by introducing incremental noising, in a recursive deterministic manner, analogous to how smoothing is extended to progressive smoothing in similar tasks. As investigating the properties and behavior of incremental noising is the purpose of this manuscript, a surprising connection between incremental noising and progressive smoothing is revealed by the experiments. To explain this phenomenon, the fractal and the space filling properties of the two methods respectively, are considered in a unifying context.
Tasks
Published 2016-07-28
URL http://arxiv.org/abs/1607.08362v2
PDF http://arxiv.org/pdf/1607.08362v2.pdf
PWC https://paperswithcode.com/paper/incremental-noising-and-its-fractal-behavior
Repo
Framework

Boltzmann meets Nash: Energy-efficient routing in optical networks under uncertainty

Title Boltzmann meets Nash: Energy-efficient routing in optical networks under uncertainty
Authors Panayotis Mertikopoulos, Aris L. Moustakas, Anna Tzanakaki
Abstract Motivated by the massive deployment of power-hungry data centers for service provisioning, we examine the problem of routing in optical networks with the aim of minimizing traffic-driven power consumption. To tackle this issue, routing must take into account energy efficiency as well as capacity considerations; moreover, in rapidly-varying network environments, this must be accomplished in a real-time, distributed manner that remains robust in the presence of random disturbances and noise. In view of this, we derive a pricing scheme whose Nash equilibria coincide with the network’s socially optimum states, and we propose a distributed learning method based on the Boltzmann distribution of statistical mechanics. Using tools from stochastic calculus, we show that the resulting Boltzmann routing scheme exhibits remarkable convergence properties under uncertainty: specifically, the long-term average of the network’s power consumption converges within $\varepsilon$ of its minimum value in time which is at most $\tilde O(1/\varepsilon^2)$, irrespective of the fluctuations’ magnitude; additionally, if the network admits a strict, non-mixing optimum state, the algorithm converges to it - again, no matter the noise level. Our analysis is supplemented by extensive numerical simulations which show that Boltzmann routing can lead to a significant decrease in power consumption over basic, shortest-path routing schemes in realistic network conditions.
Tasks
Published 2016-05-04
URL http://arxiv.org/abs/1605.01451v1
PDF http://arxiv.org/pdf/1605.01451v1.pdf
PWC https://paperswithcode.com/paper/boltzmann-meets-nash-energy-efficient-routing
Repo
Framework
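The Boltzmann distribution at the heart of the routing scheme assigns route-selection probabilities that decay exponentially with cost, controlled by a temperature (this sketch shows only the distribution itself, not the paper's distributed learning dynamics or convergence analysis):

```python
import math

def boltzmann_weights(costs, temperature=1.0):
    """Route-selection probabilities from the Boltzmann (Gibbs)
    distribution: lower-cost routes get exponentially more weight,
    with the temperature controlling how greedy the choice is."""
    exps = [math.exp(-c / temperature) for c in costs]
    total = sum(exps)
    return [e / total for e in exps]
```

As the temperature tends to zero the distribution concentrates on the cheapest route; at high temperature it approaches uniform exploration, which is the usual trade-off such schemes exploit under uncertainty.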