May 7, 2019

Paper Group ANR 6

Re-presenting a Story by Emotional Factors using Sentimental Analysis Method. Achievements in Answer Set Programming. Soccer Field Localization from a Single Image. Finite Sample Prediction and Recovery Bounds for Ordinal Embedding. A Classifier-guided Approach for Top-down Salient Object Detection. Finite-time Analysis for the Knowledge-Gradient P …

Re-presenting a Story by Emotional Factors using Sentimental Analysis Method


Title	Re-presenting a Story by Emotional Factors using Sentimental Analysis Method
Authors	Hwiyeol Jo, Yohan Moon, Jong In Kim, Jeong Ryu
Abstract	Remembering an event is affected by a person's emotional state. We examined the psychological status and personal factors of undergraduate students (N=64): depression (Center for Epidemiological Studies - Depression, Radloff, 1977), present affect (Positive Affective and Negative Affective Schedule, Watson et al., 1988), life orientation (Life Orient Test, Scheier & Carver, 1985), self-awareness (Core Self Evaluation Scale, Judge et al., 2003), and social factors (Social Support, Sarason et al., 1983). We also collected their summaries of a story, Chronicle of a Death Foretold (Gabriel Garcia Marquez, 1981). We implement a sentimental analysis model based on a convolutional neural network (LeCun & Bengio, 1995) to evaluate each summary. In the same vein as transfer learning (Pan & Yang, 2010), we collected 38,265 movie reviews to train the model and then used it to score each student's summary. The results of CES-D and PANAS show the relationship between emotion and memory retrieval as follows: depressed people tended to represent the story more negatively and seemed less expressive. People full of emotion (high in PANAS) retrieved their memories more expressively than others, using more negative words than others. The contributions of this study can be summarized as follows: first, shedding light on the relationship between emotion and its effect when storing or retrieving a memory; second, suggesting objective methods to evaluate the intensity of emotion in natural language, using a sentimental analysis model.
Tasks	Transfer Learning
Published	2016-07-13
URL	http://arxiv.org/abs/1607.03707v1
PDF	http://arxiv.org/pdf/1607.03707v1.pdf
PWC	https://paperswithcode.com/paper/re-presenting-a-story-by-emotional-factors
Repo
Framework

Achievements in Answer Set Programming


Title	Achievements in Answer Set Programming
Authors	Vladimir Lifschitz
Abstract	This paper describes an approach to the methodology of answer set programming (ASP) that can facilitate the design of encodings that are easy to understand and provably correct. Under this approach, after appending a rule or a small group of rules to the emerging program we include a comment that states what has been “achieved” so far. This strategy allows us to set out our understanding of the design of the program by describing the roles of small parts of the program in a mathematically precise way.
Tasks
Published	2016-08-29
URL	https://arxiv.org/abs/1608.08144v2
PDF	https://arxiv.org/pdf/1608.08144v2.pdf
PWC	https://paperswithcode.com/paper/achievements-in-answer-set-programming
Repo
Framework

Soccer Field Localization from a Single Image


Title	Soccer Field Localization from a Single Image
Authors	Namdar Homayounfar, Sanja Fidler, Raquel Urtasun
Abstract	In this work, we propose a novel way of efficiently localizing a soccer field from a single broadcast image of the game. Related work in this area relies on manually annotating a few key frames and extending the localization to similar images, or installing fixed specialized cameras in the stadium from which the layout of the field can be obtained. In contrast, we formulate this problem as a branch and bound inference in a Markov random field where an energy function is defined in terms of field cues such as grass, lines and circles. Moreover, our approach is fully automatic and depends only on single images from the broadcast video of the game. We demonstrate the effectiveness of our method by applying it to various games and obtain promising results. Finally, we posit that our approach can be applied easily to other sports such as hockey and basketball.
Tasks
Published	2016-04-10
URL	http://arxiv.org/abs/1604.02715v1
PDF	http://arxiv.org/pdf/1604.02715v1.pdf
PWC	https://paperswithcode.com/paper/soccer-field-localization-from-a-single-image
Repo
Framework

Finite Sample Prediction and Recovery Bounds for Ordinal Embedding


Title	Finite Sample Prediction and Recovery Bounds for Ordinal Embedding
Authors	Lalit Jain, Kevin Jamieson, Robert Nowak
Abstract	The goal of ordinal embedding is to represent items as points in a low-dimensional Euclidean space given a set of constraints in the form of distance comparisons like “item $i$ is closer to item $j$ than item $k$”. Ordinal constraints like this often come from human judgments. To account for errors and variation in judgments, we consider the noisy situation in which the given constraints are independently corrupted by reversing the correct constraint with some probability. This paper makes several new contributions to this problem. First, we derive prediction error bounds for ordinal embedding with noise by exploiting the fact that the rank of a distance matrix of points in $\mathbb{R}^d$ is at most $d+2$. These bounds characterize how well a learned embedding predicts new comparative judgments. Second, we investigate the special case of a known noise model and study the Maximum Likelihood estimator. Third, knowledge of the noise model enables us to relate prediction errors to embedding accuracy. This relationship is highly non-trivial since we show that the linear map corresponding to distance comparisons is non-invertible, but there exists a nonlinear map that is invertible. Fourth, two new algorithms for ordinal embedding are proposed and evaluated in experiments.
Tasks
Published	2016-06-22
URL	http://arxiv.org/abs/1606.07081v1
PDF	http://arxiv.org/pdf/1606.07081v1.pdf
PWC	https://paperswithcode.com/paper/finite-sample-prediction-and-recovery-bounds
Repo
Framework
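
The rank fact used in the abstract above can be checked on a small example. The sketch below is illustrative only and is not the authors' code: it assumes the "distance matrix" is the matrix of squared Euclidean distances, and the grid of points and the helper names (`squared_distance_matrix`, `exact_rank`, `is_closer`) are hypothetical. Exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def squared_distance_matrix(points):
    """Return D with D[i][j] = squared Euclidean distance between points i and j."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def exact_rank(matrix):
    """Rank via Gauss-Jordan elimination over exact rationals (no rounding)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_closer(D, i, j, k):
    """Ordinal comparison: is item i closer to item j than to item k?"""
    return D[i][j] < D[i][k]


# Hypothetical data: the 9 integer grid points {0, 1, 2}^2, so d = 2.
d = 2
points = list(product(range(3), repeat=d))
D = squared_distance_matrix(points)
rank_D = exact_rank(D)  # bounded by d + 2 even though D is 9 x 9
```

The bound follows from writing D as a sum of two rank-one terms (built from the squared norms) and a Gram matrix of rank at most d.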

A Classifier-guided Approach for Top-down Salient Object Detection


Title	A Classifier-guided Approach for Top-down Salient Object Detection
Authors	Hisham Cholakkal, Jubin Johnson, Deepu Rajan
Abstract	We propose a framework for top-down salient object detection that incorporates a tightly coupled image classification module. The classifier is trained on novel category-aware sparse codes computed on object dictionaries used for saliency modeling. A misclassification indicates that the corresponding saliency model is inaccurate. Hence, the classifier selects images for which the saliency models need to be updated. The category-aware sparse coding produces better image classification accuracy as compared to conventional sparse coding with a reduced computational complexity. A saliency-weighted max-pooling is proposed to improve image classification, which is further used to refine the saliency maps. Experimental results on Graz-02 and PASCAL VOC-07 datasets demonstrate the effectiveness of salient object detection. Although the role of the classifier is to support salient object detection, we evaluate its performance in image classification and also illustrate the utility of thresholded saliency maps for image segmentation.
Tasks	Image Classification, Object Detection, Salient Object Detection, Semantic Segmentation
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06570v1
PDF	http://arxiv.org/pdf/1604.06570v1.pdf
PWC	https://paperswithcode.com/paper/a-classifier-guided-approach-for-top-down
Repo
Framework

Finite-time Analysis for the Knowledge-Gradient Policy


Title	Finite-time Analysis for the Knowledge-Gradient Policy
Authors	Yingfei Wang, Warren Powell
Abstract	We consider sequential decision problems in which we adaptively choose one of finitely many alternatives and observe a stochastic reward. We offer a new perspective of interpreting Bayesian ranking and selection problems as adaptive stochastic multi-set maximization problems and derive the first finite-time bound of the knowledge-gradient policy for adaptive submodular objective functions. In addition, we introduce the concept of prior-optimality and provide another insight into the performance of the knowledge gradient policy based on the submodular assumption on the value of information. We demonstrate submodularity for the two-alternative case and provide other conditions for more general problems, bringing out the issue and importance of submodularity in learning problems. Empirical experiments are conducted to further illustrate the finite time behavior of the knowledge gradient policy.
Tasks
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04624v1
PDF	http://arxiv.org/pdf/1606.04624v1.pdf
PWC	https://paperswithcode.com/paper/finite-time-analysis-for-the-knowledge
Repo
Framework
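
For readers unfamiliar with the policy, the following is a minimal sketch of the knowledge-gradient factor in its textbook form for independent normal beliefs with known measurement noise. It is not taken from the paper, which analyzes the policy rather than restating it; the function names and the example numbers are assumptions made for illustration.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variance):
    """KG factor of each alternative under independent normal beliefs."""
    factors = []
    for x, (mu, var) in enumerate(zip(means, variances)):
        best_other = max(m for i, m in enumerate(means) if i != x)
        # Standard deviation of the change in the posterior mean of x
        # caused by one more noisy measurement of x.
        sigma_tilde = var / math.sqrt(var + noise_variance)
        zeta = -abs(mu - best_other) / sigma_tilde
        factors.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return factors


def kg_choice(means, variances, noise_variance):
    """Index of the alternative with the largest KG factor."""
    factors = knowledge_gradient(means, variances, noise_variance)
    return max(range(len(factors)), key=factors.__getitem__)


# Hypothetical beliefs: alternative 1 has the best mean but is almost known,
# while alternatives 0 and 2 are still very uncertain.
means = [1.0, 1.2, 0.5]
variances = [4.0, 0.1, 4.0]
factors = knowledge_gradient(means, variances, noise_variance=1.0)
choice = kg_choice(means, variances, noise_variance=1.0)
```

In this example the policy measures alternative 0 rather than the current best, because a measurement there is more likely to change which alternative looks best.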

Calibration of Phone Likelihoods in Automatic Speech Recognition


Title	Calibration of Phone Likelihoods in Automatic Speech Recognition
Authors	David A. van Leeuwen, Joost van Doremalen
Abstract	In this paper we study the probabilistic properties of the posteriors in a speech recognition system that uses a deep neural network (DNN) for acoustic modeling. We do this by reducing Kaldi’s DNN shared pdf-id posteriors to phone likelihoods, and using test set forced alignments to evaluate these using a calibration sensitive metric. Individual frame posteriors are in principle well-calibrated, because the DNN is trained using cross entropy as the objective function, which is a proper scoring rule. When entire phones are assessed, we observe that it is best to average the log likelihoods over the duration of the phone. Further scaling of the average log likelihoods by the logarithm of the duration slightly improves the calibration, and this improvement is retained when tested on independent test data.
Tasks	Calibration, Speech Recognition
Published	2016-06-14
URL	http://arxiv.org/abs/1606.04317v1
PDF	http://arxiv.org/pdf/1606.04317v1.pdf
PWC	https://paperswithcode.com/paper/calibration-of-phone-likelihoods-in-automatic
Repo
Framework
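
The duration averaging described in the abstract can be illustrated with a short sketch. The function names and the example values are hypothetical, and this is not the Kaldi-based pipeline used by the authors. The additional scaling by the logarithm of the duration is left out, because the abstract does not state its exact form.

```python
def summed_log_likelihood(frame_log_likelihoods):
    """Phone score as the plain sum of per-frame log likelihoods."""
    return sum(frame_log_likelihoods)


def averaged_log_likelihood(frame_log_likelihoods):
    """Phone score as the per-frame log likelihood averaged over the phone."""
    if not frame_log_likelihoods:
        raise ValueError("a phone segment needs at least one frame")
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)


# Two hypothetical phones with the same per-frame quality but different durations.
short_phone = [-1.0, -1.0]
long_phone = [-1.0] * 8

sum_scores = (summed_log_likelihood(short_phone), summed_log_likelihood(long_phone))
avg_scores = (averaged_log_likelihood(short_phone), averaged_log_likelihood(long_phone))
```

Summing makes the long phone look much worse than the short one, while averaging gives both the same score, which is one intuition for why duration normalization matters when whole phones are assessed.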

Supervised Transformer Network for Efficient Face Detection


Title	Supervised Transformer Network for Efficient Face Detection
Authors	Dong Chen, Gang Hua, Fang Wen, Jian Sun
Abstract	Large pose variations remain a challenge for real-world face detection. We propose a new cascaded Convolutional Neural Network, dubbed the Supervised Transformer Network, to address this challenge. The first stage is a multi-task Region Proposal Network (RPN), which simultaneously predicts candidate face regions along with associated facial landmarks. The candidate regions are then warped by mapping the detected facial landmarks to their canonical positions to better normalize the face patterns. The second stage, which is an RCNN, then verifies whether the warped candidate regions are valid faces. We conduct end-to-end learning of the cascaded network, including optimizing the canonical positions of the facial landmarks. This supervised learning of the transformations automatically selects the best scale to differentiate face/non-face patterns. By combining feature maps from both stages of the network, we achieve state-of-the-art detection accuracies on several public benchmarks. For real-time performance, we run the cascaded network only on regions of interest produced by a boosting cascade face detector. Our detector runs at 30 FPS on a single CPU core for a VGA-resolution image.
Tasks	Face Detection
Published	2016-07-19
URL	http://arxiv.org/abs/1607.05477v1
PDF	http://arxiv.org/pdf/1607.05477v1.pdf
PWC	https://paperswithcode.com/paper/supervised-transformer-network-for-efficient
Repo
Framework

Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning


Title	Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning
Authors	Soravit Changpinyo, Wei-Lun Chao, Fei Sha
Abstract	Leveraging class semantic descriptions and examples of known objects, zero-shot learning makes it possible to train a recognition model for an object class whose examples are not available. In this paper, we propose a novel zero-shot learning model that takes advantage of clustering structures in the semantic embedding space. The key idea is to impose the structural constraint that semantic representations must be predictive of the locations of their corresponding visual exemplars. To this end, this reduces to training multiple kernel-based regressors from semantic representation-exemplar pairs from labeled data of the seen object categories. Despite its simplicity, our approach significantly outperforms existing zero-shot learning methods on standard benchmark datasets, including the ImageNet dataset with more than 20,000 unseen categories.
Tasks	Zero-Shot Learning
Published	2016-05-26
URL	http://arxiv.org/abs/1605.08151v2
PDF	http://arxiv.org/pdf/1605.08151v2.pdf
PWC	https://paperswithcode.com/paper/predicting-visual-exemplars-of-unseen-classes
Repo
Framework

Bayesian Learning of Kernel Embeddings


Title	Bayesian Learning of Kernel Embeddings
Authors	Seth Flaxman, Dino Sejdinovic, John P. Cunningham, Sarah Filippi
Abstract	Kernel methods are one of the mainstays of machine learning, but the problem of kernel learning remains challenging, with only a few heuristics and very little theory. This is of particular importance in methods based on estimation of kernel mean embeddings of probability measures. For characteristic kernels, which include most commonly used ones, the kernel mean embedding uniquely determines its probability measure, so it can be used to design a powerful statistical testing framework, which includes nonparametric two-sample and independence tests. In practice, however, the performance of these tests can be very sensitive to the choice of kernel and its lengthscale parameters. To address this central issue, we propose a new probabilistic model for kernel mean embeddings, the Bayesian Kernel Embedding model, combining a Gaussian process prior over the Reproducing Kernel Hilbert Space containing the mean embedding with a conjugate likelihood function, thus yielding a closed form posterior over the mean embedding. The posterior mean of our model is closely related to recently proposed shrinkage estimators for kernel mean embeddings, while the posterior uncertainty is a new, interesting feature with various possible applications. Critically for the purposes of kernel learning, our model gives a simple, closed form marginal pseudolikelihood of the observed data given the kernel hyperparameters. This marginal pseudolikelihood can either be optimized to inform the hyperparameter choice or fully Bayesian inference can be used.
Tasks	Bayesian Inference
Published	2016-03-07
URL	http://arxiv.org/abs/1603.02160v2
PDF	http://arxiv.org/pdf/1603.02160v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-learning-of-kernel-embeddings
Repo
Framework
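
The sensitivity to the kernel lengthscale mentioned in the abstract can be seen in a toy two-sample statistic. The sketch below computes the biased estimate of the squared maximum mean discrepancy (the squared RKHS distance between two empirical kernel mean embeddings) with a Gaussian kernel. It is not the Bayesian Kernel Embedding model from the paper; the samples, lengthscales, and function names are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, lengthscale):
    """Gaussian (RBF) kernel on scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def mmd_squared(xs, ys, lengthscale):
    """Biased estimate of squared MMD between two 1-D samples."""
    def mean_k(a, b):
        total = sum(rbf_kernel(u, v, lengthscale) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


# Hypothetical samples drawn from two clearly separated groups.
xs = [0.0, 0.1, 0.2, 0.3]
ys = [1.0, 1.1, 1.2, 1.3]

mmd_moderate = mmd_squared(xs, ys, lengthscale=0.5)
mmd_too_wide = mmd_squared(xs, ys, lengthscale=100.0)
mmd_same = mmd_squared(xs, xs, lengthscale=0.5)
```

With a very wide kernel every pair of points looks alike, so the statistic collapses toward zero even though the samples differ, which is the kind of failure a principled choice of hyperparameters is meant to avoid.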

Poisson intensity estimation with reproducing kernels


Title	Poisson intensity estimation with reproducing kernels
Authors	Seth Flaxman, Yee Whye Teh, Dino Sejdinovic
Abstract	Despite the fundamental nature of the inhomogeneous Poisson process in the theory and application of stochastic processes, and its attractive generalizations (e.g. Cox process), few tractable nonparametric modeling approaches of intensity functions exist, especially when observed points lie in a high-dimensional space. In this paper we develop a new, computationally tractable Reproducing Kernel Hilbert Space (RKHS) formulation for the inhomogeneous Poisson process. We model the square root of the intensity as an RKHS function. Whereas RKHS models used in supervised learning rely on the so-called representer theorem, the form of the inhomogeneous Poisson process likelihood means that the representer theorem does not apply. However, we prove that the representer theorem does hold in an appropriately transformed RKHS, guaranteeing that the optimization of the penalized likelihood can be cast as a tractable finite-dimensional problem. The resulting approach is simple to implement, and readily scales to high dimensions and large-scale datasets.
Tasks
Published	2016-10-27
URL	http://arxiv.org/abs/1610.08623v3
PDF	http://arxiv.org/pdf/1610.08623v3.pdf
PWC	https://paperswithcode.com/paper/poisson-intensity-estimation-with-reproducing
Repo
Framework
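
As a reminder of the likelihood involved, the sketch below evaluates the standard inhomogeneous Poisson process log-likelihood (sum of log intensities at the observed points minus the integral of the intensity) on a 1-D interval, with the intensity written as the square of a function f as in the abstract. It does not implement the RKHS estimator or the transformed representer theorem from the paper; the trapezoid integration, the function name, and the example values are assumptions.

```python
import math


def poisson_log_likelihood(points, f, lower, upper, n_grid=1000):
    """Log-likelihood of 1-D points under intensity f(x)**2 on [lower, upper]."""
    step = (upper - lower) / n_grid
    grid = [lower + i * step for i in range(n_grid + 1)]
    values = [f(x) ** 2 for x in grid]
    # Trapezoid rule for the integral of the intensity over the domain.
    integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))
    return sum(math.log(f(x) ** 2) for x in points) - integral


# Hypothetical example: constant intensity 2 on [0, 1] and two observed points.
def constant_f(x):
    return math.sqrt(2.0)


observed = [0.2, 0.7]
log_lik = poisson_log_likelihood(observed, constant_f, 0.0, 1.0)
```

The integral term is what distinguishes this objective from a standard supervised loss: it depends on f over the whole domain, not only at the observed points.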

Crater Detection via Convolutional Neural Networks


Title	Crater Detection via Convolutional Neural Networks
Authors	Joseph Paul Cohen, Henry Z. Lo, Tingting Lu, Wei Ding
Abstract	Craters are among the most studied geomorphic features in the Solar System because they yield important information about past and present geological processes and about the relative ages of observed geologic formations. We present a method for automatic crater detection using advanced machine learning to deal with the large amount of satellite imagery collected. The challenge of automatically detecting craters comes from their complex surface, because their shape erodes over time to blend into the surrounding terrain. Bandeira provided a seminal dataset that embodies this challenge, which is still an unsolved pattern recognition problem to this day. There has been work to solve this challenge based on extracting shape and contrast features and then applying classification models to those features. The limiting factor in this existing work is the use of hand-crafted filters on the image, such as Gabor or Sobel filters or Haar features. These hand-crafted methods rely on domain knowledge to construct. We would like to learn the optimal filters and features based on training examples. In order to dynamically learn filters and features, we look to Convolutional Neural Networks (CNNs), which have shown their dominance in computer vision. The power of CNNs is that they can learn image filters which generate features for high-accuracy classification.
Tasks
Published	2016-01-05
URL	http://arxiv.org/abs/1601.00978v1
PDF	http://arxiv.org/pdf/1601.00978v1.pdf
PWC	https://paperswithcode.com/paper/crater-detection-via-convolutional-neural
Repo
Framework
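
To make the notion of a hand-crafted filter concrete, the sketch below applies the horizontal Sobel kernel to a tiny image. In a CNN the same sliding-window operation is used, but the kernel weights are learned from training examples instead of being fixed in advance. The toy image and the function name are hypothetical, and the operation shown is cross-correlation (no kernel flip), as in typical CNN layers.

```python
# Fixed, hand-crafted kernel that responds to vertical edges.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def cross_correlate(image, kernel):
    """Slide a square kernel over a 2-D image without padding."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(kernel[a][b] * image[r + a][c + b] for a in range(k) for b in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Toy 4 x 4 image with a dark left half and a bright right half.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = cross_correlate(image, SOBEL_X)
```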

Deep learning for detection of bird vocalisations


Title	Deep learning for detection of bird vocalisations
Authors	Ilyas Potamitis
Abstract	This work focuses on reliable detection of bird sound emissions as recorded in the open field. Acoustic detection of avian sounds can be used for the automated monitoring of multiple bird taxa and for querying long-term recordings for species of interest to researchers, conservation practitioners, and decision makers. Recordings in the wild can be very noisy due to the exposure of the microphones to a large number of audio sources originating from all distances and directions, the number and identity of which cannot be known a priori. The co-existence of the target vocalizations with abiotic interferences in an unconstrained environment is inefficiently treated by current approaches to audio signal enhancement. A technique that would spot only bird vocalizations while ignoring other audio sources is of prime importance. These difficulties are tackled in this work, which presents a deep autoencoder that maps the audio spectrogram of bird vocalizations to its corresponding binary mask that encircles the spectral blobs of vocalizations while suppressing other audio sources. The procedure requires minimal human attention and is very fast during execution, making it suitable for scanning massive volumes of data in order to analyze them, evaluate insights and hypotheses, and identify patterns of bird activity that will hopefully lead to the design of policies on biodiversity issues.
Tasks
Published	2016-09-25
URL	http://arxiv.org/abs/1609.08408v1
PDF	http://arxiv.org/pdf/1609.08408v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-detection-of-bird
Repo
Framework

Sparse Coding for Alpha Matting


Title	Sparse Coding for Alpha Matting
Authors	Jubin Johnson, Ehsan Shahrian Varnousfaderani, Hisham Cholakkal, Deepu Rajan
Abstract	Existing color sampling based alpha matting methods use the compositing equation to estimate alpha at a pixel from pairs of foreground (F) and background (B) samples. The quality of the matte depends on the selected (F,B) pairs. In this paper, the matting problem is reinterpreted as a sparse coding of pixel features, wherein the sum of the codes gives the estimate of the alpha matte from a set of unpaired F and B samples. A non-parametric probabilistic segmentation provides a certainty measure on the pixel belonging to foreground or background, based on which a dictionary is formed for use in sparse coding. By removing the restriction to conform to (F,B) pairs, this method allows for better alpha estimation from multiple F and B samples. The same framework is extended to videos, where the requirement of temporal coherence is handled effectively. Here, the dictionary is formed by samples from multiple frames. A multi-frame graph model, as opposed to a single image as for image matting, is proposed that can be solved efficiently in closed form. Quantitative and qualitative evaluations on a benchmark dataset are provided to show that the proposed method outperforms current state-of-the-art in image and video matting.
Tasks	Image Matting
Published	2016-04-11
URL	http://arxiv.org/abs/1604.02898v1
PDF	http://arxiv.org/pdf/1604.02898v1.pdf
PWC	https://paperswithcode.com/paper/sparse-coding-for-alpha-matting
Repo
Framework
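
The pair-based baseline that the abstract contrasts against can be sketched in a few lines. Given the compositing equation I = alpha * F + (1 - alpha) * B, the usual color sampling estimate projects the observed color onto the line between one foreground and one background sample. This is not the sparse coding method proposed in the paper; the function name and the example colors are assumptions.

```python
def estimate_alpha(pixel, foreground, background):
    """Least-squares alpha for I = alpha * F + (1 - alpha) * B, clipped to [0, 1]."""
    diff_fb = [f - b for f, b in zip(foreground, background)]
    denom = sum(d * d for d in diff_fb)
    if denom == 0:
        raise ValueError("foreground and background samples must differ")
    numer = sum((i - b) * d for i, b, d in zip(pixel, background, diff_fb))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical RGB samples: pure red foreground, pure blue background,
# and a pixel composited with alpha = 0.25.
F = (1.0, 0.0, 0.0)
B = (0.0, 0.0, 1.0)
I = (0.25, 0.0, 0.75)
alpha = estimate_alpha(I, F, B)
```

Because the estimate depends entirely on the chosen (F, B) pair, a poor pair gives a poor alpha, which is the limitation the abstract points to.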

Stereo image de-fencing using smartphones


Title	Stereo image de-fencing using smartphones
Authors	Sankaraganesh Jonna, Sukla Satapathy, Rajiv R. Sahay
Abstract	Conventional approaches to image de-fencing have limited themselves to using only image data in adjacent frames of the captured video of an approximately static scene. In this work, we present a method to harness disparity using a stereo pair of fenced images in order to detect fence pixels. Tourists and amateur photographers commonly carry smartphones/phablets which can be used to capture a short video sequence of the fenced scene. We model the formation of the occluded frames in the captured video. Furthermore, we propose an optimization framework to estimate the de-fenced image using the total variation prior to regularize the ill-posed problem.
Tasks
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01323v1
PDF	http://arxiv.org/pdf/1612.01323v1.pdf
PWC	https://paperswithcode.com/paper/stereo-image-de-fencing-using-smartphones
Repo
Framework