May 7, 2019

2991 words 15 mins read

Paper Group ANR 105


Audio Recording Device Identification Based on Deep Learning. Generic Feature Learning for Wireless Capsule Endoscopy Analysis. Gaussian Process Regression for Out-of-Sample Extension. Unsupervised Dialogue Act Induction using Gaussian Mixtures. The Role of Context Types and Dimensionality in Learning Word Embeddings. Automatic Segmentation of Dynamic Objects from an Image Pair …

Audio Recording Device Identification Based on Deep Learning

Title Audio Recording Device Identification Based on Deep Learning
Authors Simeng Qi, Zheng Huang, Yan Li, Shaopei Shi
Abstract In this paper we present research on the identification of audio recording devices from background noise, thereby providing a method for audio forensics. An audio signal is the sum of a speech signal and a noise signal. Usually, people pay more attention to the speech signal, because it carries the information to be delivered, so a great deal of research has been dedicated to achieving a higher signal-to-noise ratio (SNR): many speech enhancement algorithms improve the quality of the speech, which can be seen as reducing the noise. However, the noise can be regarded as an intrinsic fingerprint of the recording device. These digital traces can be characterized and identified by modern machine learning techniques. Therefore, in our research, we use the noise as the intrinsic feature. For identification, multiple deep learning classifiers are used and compared. The results show that extracting a feature vector from each device's noise and identifying the devices with deep learning techniques is viable and performs well.
Tasks Speech Enhancement
Published 2016-02-18
URL http://arxiv.org/abs/1602.05682v2
PDF http://arxiv.org/pdf/1602.05682v2.pdf
PWC https://paperswithcode.com/paper/audio-recording-device-identification-based
Repo
Framework
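
As a rough illustration of the pipeline this abstract describes (treat the residual noise as a device fingerprint, then classify it), here is a minimal sketch. The noise estimate (signal minus a smoothed copy), the pooled-spectrum features, and the scikit-learn MLP are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d
from sklearn.neural_network import MLPClassifier

def noise_features(audio, n_bins=64):
    """Crude noise estimate: subtract a smoothed copy of the waveform,
    then summarize the residual's magnitude spectrum (an assumed
    feature extractor, not the paper's)."""
    residual = audio - uniform_filter1d(audio, size=31)
    spectrum = np.abs(np.fft.rfft(residual))
    # Pool the spectrum into a fixed-length feature vector.
    bins = np.array_split(spectrum, n_bins)
    return np.array([b.mean() for b in bins])

def train_device_classifier(recordings, device_ids):
    # recordings: list of 1-D float arrays; device_ids: integer labels.
    X = np.stack([noise_features(r) for r in recordings])
    clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)
    clf.fit(X, device_ids)
    return clf
```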

Generic Feature Learning for Wireless Capsule Endoscopy Analysis

Title Generic Feature Learning for Wireless Capsule Endoscopy Analysis
Authors Santi Seguí, Michal Drozdzal, Guillem Pascual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, Jordi Vitrià
Abstract The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer-aided decision (CAD) systems to help physicians with video screening and, ultimately, with diagnosis. Most CAD systems in capsule endoscopy share a common system design but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from scratch, which makes designing new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on deep convolutional neural networks, which avoids the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers built on state-of-the-art hand-crafted features. In particular, the system reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).
Tasks
Published 2016-07-26
URL http://arxiv.org/abs/1607.07604v1
PDF http://arxiv.org/pdf/1607.07604v1.pdf
PWC https://paperswithcode.com/paper/generic-feature-learning-for-wireless-capsule
Repo
Framework
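
For a sense of the shape of such a system, here is a toy convolutional classifier for six motility classes in PyTorch. The layer sizes and input resolution are placeholders; the paper's network is substantially deeper and trained on real WCE frames.

```python
import torch
import torch.nn as nn

class MotilityCNN(nn.Module):
    """Minimal CNN for six intestinal motility classes (illustrative
    architecture, not the paper's)."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):  # x: (batch, 3, H, W) endoscopy frames
        return self.classifier(self.features(x).flatten(1))

logits = MotilityCNN()(torch.randn(4, 3, 128, 128))  # -> shape (4, 6)
```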

Gaussian Process Regression for Out-of-Sample Extension

Title Gaussian Process Regression for Out-of-Sample Extension
Authors Oren Barkan, Jonathan Weill, Amir Averbuch
Abstract Manifold learning methods are useful for high-dimensional data analysis. Many of the existing methods produce a low-dimensional representation that attempts to describe the intrinsic geometric structure of the original data. Typically, this process is computationally expensive and the produced embedding is limited to the training data. In many real-life scenarios, the ability to produce an embedding of unseen samples is essential. In this paper we propose a Bayesian non-parametric approach for out-of-sample extension. The method is based on Gaussian Process Regression and is independent of the manifold learning algorithm. Additionally, the method naturally provides a measure of the degree of abnormality for a newly arrived data point that did not participate in the training process. We derive the mathematical connection between the proposed method and the Nyström extension and show that the latter is a special case of the former. We present extensive experimental results that demonstrate the performance of the proposed method and compare it to other existing out-of-sample extension methods.
Tasks
Published 2016-03-07
URL http://arxiv.org/abs/1603.02194v2
PDF http://arxiv.org/pdf/1603.02194v2.pdf
PWC https://paperswithcode.com/paper/gaussian-process-regression-for-out-of-sample
Repo
Framework
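
The core recipe — regress the low-dimensional embedding on the high-dimensional inputs with a GP, then read off the predictive mean and spread for unseen points — can be sketched with scikit-learn. The kernel choice and the way the predictive spread is aggregated into an abnormality score are assumptions here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_extension(X_train, Y_train):
    # X_train: high-dimensional points; Y_train: their low-dimensional
    # embedding from any manifold learner (Isomap, LLE, ...).
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    return gp.fit(X_train, Y_train)

def extend(gp, X_new):
    # Predictive mean embeds the unseen points; the predictive spread
    # serves as the abnormality measure the abstract mentions.
    mean, std = gp.predict(X_new, return_std=True)
    abnormality = std if std.ndim == 1 else std.mean(axis=1)
    return mean, abnormality
```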

Unsupervised Dialogue Act Induction using Gaussian Mixtures

Title Unsupervised Dialogue Act Induction using Gaussian Mixtures
Authors Tomáš Brychcín, Pavel Král
Abstract This paper introduces a new unsupervised approach to dialogue act induction. Given a sequence of dialogue utterances, the task is to assign them labels representing their function in the dialogue. Utterances are represented as real-valued vectors encoding their meaning. We model the dialogue as a Hidden Markov model with emission probabilities estimated by Gaussian mixtures, and we use Gibbs sampling for posterior inference. We present results on the standard Switchboard-DAMSL corpus. Our algorithm achieves promising results compared with strong supervised baselines and outperforms other unsupervised algorithms.
Tasks
Published 2016-12-20
URL http://arxiv.org/abs/1612.06572v2
PDF http://arxiv.org/pdf/1612.06572v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-dialogue-act-induction-using-1
Repo
Framework
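
A stand-in for the same model class (an HMM with Gaussian-mixture emissions over utterance vectors) can be assembled with hmmlearn. Note the paper infers the posterior with Gibbs sampling, whereas hmmlearn fits by EM; the number of acts and mixture components below are placeholders.

```python
import numpy as np
from hmmlearn.hmm import GMMHMM

def induce_dialogue_acts(utterance_vectors, n_acts=10):
    # utterance_vectors: one (n_utterances_i, dim) array per dialogue,
    # each row a real-valued utterance embedding.
    X = np.vstack(utterance_vectors)
    lengths = [len(d) for d in utterance_vectors]
    # EM-fitted stand-in for the paper's Gibbs-sampled posterior.
    hmm = GMMHMM(n_components=n_acts, n_mix=3, covariance_type="diag")
    hmm.fit(X, lengths)
    return hmm.predict(X, lengths)  # induced act label per utterance
```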

The Role of Context Types and Dimensionality in Learning Word Embeddings

Title The Role of Context Types and Dimensionality in Learning Word Embeddings
Authors Oren Melamud, David McClosky, Siddharth Patwardhan, Mohit Bansal
Abstract We provide the first extensive evaluation of how using different types of context to learn skip-gram word embeddings affects performance on a wide range of intrinsic and extrinsic NLP tasks. Our results suggest that while intrinsic tasks tend to exhibit a clear preference for particular types of contexts and higher dimensionality, more careful tuning is required to find the optimal settings for most of the extrinsic tasks we considered. Furthermore, for these extrinsic tasks we find that once the benefit from increasing the embedding dimensionality is mostly exhausted, simple concatenation of word embeddings learned with different context types can yield further performance gains. As an additional contribution, we propose a new variant of the skip-gram model that learns word embeddings from weighted contexts of substitute words.
Tasks Learning Word Embeddings, Word Embeddings
Published 2016-01-05
URL http://arxiv.org/abs/1601.00893v2
PDF http://arxiv.org/pdf/1601.00893v2.pdf
PWC https://paperswithcode.com/paper/the-role-of-context-types-and-dimensionality
Repo
Framework
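
The concatenation trick is easy to reproduce with gensim: train skip-gram models under different context definitions and join the vectors per word. In this sketch the contexts differ only by window size, a simplification of the context types the paper actually compares (which include dependency and substitute-word contexts).

```python
import numpy as np
from gensim.models import Word2Vec

corpus = [["the", "cat", "sat"], ["dogs", "chase", "cats"]]  # toy corpus

# Two skip-gram models with different context definitions (window sizes).
narrow = Word2Vec(corpus, vector_size=50, window=2, sg=1, min_count=1)
wide = Word2Vec(corpus, vector_size=50, window=10, sg=1, min_count=1)

def concat_embedding(word):
    # Concatenate the context-type-specific vectors; the paper reports
    # this can help extrinsic tasks once raw dimensionality stops paying off.
    return np.concatenate([narrow.wv[word], wide.wv[word]])
```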

Automatic Segmentation of Dynamic Objects from an Image Pair

Title Automatic Segmentation of Dynamic Objects from an Image Pair
Authors Sri Raghu Malireddi, Shanmuganathan Raman
Abstract Automatic segmentation of objects from a single image is a challenging problem which generally requires training on a large number of images. We consider the problem of automatically segmenting only the dynamic objects from a given pair of images of a scene captured from different positions. We exploit dense correspondences along with saliency measures in order to first localize the interest points on the dynamic objects in the two images. We propose a novel approach based on techniques from computational geometry in order to automatically segment the dynamic objects from both images using a top-down segmentation strategy. We discuss how the proposed approach differs from other state-of-the-art segmentation algorithms. We show that the proposed approach is efficient in handling large motions and achieves very good segmentation of the objects across different scenes. We analyse the results against manually marked ground-truth segmentation masks created on our own dataset and provide key observations to guide future work.
Tasks
Published 2016-04-16
URL http://arxiv.org/abs/1604.04724v1
PDF http://arxiv.org/pdf/1604.04724v1.pdf
PWC https://paperswithcode.com/paper/automatic-segmentation-of-dynamic-objects
Repo
Framework
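
As a crude stand-in for the paper's pipeline (which combines dense correspondences, saliency, and computational-geometry machinery), one can threshold dense optical-flow magnitude between the two views after a rough compensation for camera motion. This is illustrative only and ignores the paper's top-down strategy.

```python
import cv2
import numpy as np

def rough_dynamic_mask(gray1, gray2, thresh=2.0):
    # Dense Farneback optical flow between the two views.
    flow = cv2.calcOpticalFlowFarneback(gray1, gray2, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Crude camera-motion compensation: subtract the median flow,
    # treating it as the dominant (background) displacement.
    flow = flow - np.median(flow.reshape(-1, 2), axis=0)
    return np.linalg.norm(flow, axis=2) > thresh  # "dynamic" pixels
```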

Contrastive Entropy: A new evaluation metric for unnormalized language models

Title Contrastive Entropy: A new evaluation metric for unnormalized language models
Authors Kushal Arora, Anand Rangarajan
Abstract Perplexity (per word) is the most widely used metric for evaluating language models. Despite this, there has been no dearth of criticism of this metric. Most of these criticisms center on its lack of correlation with extrinsic metrics like word error rate (WER), its dependence on a shared vocabulary for model comparison, and its unsuitability for evaluating unnormalized language models. In this paper, we address the last problem and propose a new discriminative, entropy-based intrinsic metric that works for both traditional word-level models and unnormalized language models like sentence-level models. We also propose a discriminatively trained sentence-level interpretation of a recurrent neural network (RNN) based language model as an example of an unnormalized sentence-level model. We demonstrate that for word-level models, contrastive entropy shows a strong correlation with perplexity. We also observe that when trained at lower distortion levels, the sentence-level RNN considerably outperforms traditional RNNs on this new metric.
Tasks Language Modelling
Published 2016-01-03
URL http://arxiv.org/abs/1601.00248v2
PDF http://arxiv.org/pdf/1601.00248v2.pdf
PWC https://paperswithcode.com/paper/contrastive-entropy-a-new-evaluation-metric
Repo
Framework
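
The abstract does not spell out the formula, but one plausible reading of the metric is the average score gap a model assigns between each test sentence and a distorted copy of it: a sharper model shows a larger gap. The distortion scheme and the `model_logscore` callable below are illustrative assumptions, not the paper's exact definition.

```python
import random

def distort(sentence, rate=0.3, rng=random):
    # Toy distortion: swap a fraction of adjacent word pairs. The
    # paper's distortion level is a knob; this choice is illustrative.
    words = sentence.split()
    for i in range(0, len(words) - 1, 2):
        if rng.random() < rate:
            words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

def contrastive_score(model_logscore, sentences, rate=0.3):
    """model_logscore: any callable returning an (unnormalized) log
    score for a sentence. Returns the mean score gap between each
    sentence and a distorted copy -- one reading of contrastive
    entropy, assumed here for illustration."""
    gaps = [model_logscore(s) - model_logscore(distort(s, rate))
            for s in sentences]
    return sum(gaps) / len(gaps)
```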

Dimensionality-Dependent Generalization Bounds for $k$-Dimensional Coding Schemes

Title Dimensionality-Dependent Generalization Bounds for $k$-Dimensional Coding Schemes
Authors Tongliang Liu, Dacheng Tao, Dong Xu
Abstract The $k$-dimensional coding schemes refer to a collection of methods that attempt to represent data using a set of representative $k$-dimensional vectors, and include non-negative matrix factorization, dictionary learning, sparse coding, $k$-means clustering and vector quantization as special cases. Previous generalization bounds for the reconstruction error of $k$-dimensional coding schemes are mainly dimensionality-independent. A major advantage of these bounds is that they can be used to analyze the generalization error when data is mapped into an infinite- or high-dimensional feature space. However, many applications use finite-dimensional data features. Can we obtain dimensionality-dependent generalization bounds for $k$-dimensional coding schemes that are tighter than dimensionality-independent bounds when data lies in a finite-dimensional feature space? The answer is positive. In this paper, we address this problem and derive a dimensionality-dependent generalization bound for $k$-dimensional coding schemes by bounding the covering number of the loss function class induced by the reconstruction error. The bound is of order $\mathcal{O}\left(\left(mk\ln(mkn)/n\right)^{\lambda_n}\right)$, where $m$ is the dimension of the features, $k$ is the number of columns in the linear implementation of the coding schemes, $n$ is the sample size, $\lambda_n>0.5$ when $n$ is finite and $\lambda_n=0.5$ when $n$ is infinite. We show that our bound can be tighter than previous results because it avoids inducing the worst-case upper bound on $k$ of the loss function and converges faster. The proposed generalization bound is also applied to some specific coding schemes to demonstrate that the dimensionality-dependent bound is an indispensable complement to dimensionality-independent generalization bounds.
Tasks Dictionary Learning, Quantization
Published 2016-01-03
URL http://arxiv.org/abs/1601.00238v2
PDF http://arxiv.org/pdf/1601.00238v2.pdf
PWC https://paperswithcode.com/paper/dimensionality-dependent-generalization
Repo
Framework
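
The headline rate is easy to evaluate numerically. Ignoring the suppressed constant, a small calculator for $\left(mk\ln(mkn)/n\right)^{\lambda_n}$ looks like this:

```python
import math

def coding_bound(m, k, n, lam=0.5):
    """Evaluate the rate (mk ln(mkn)/n)^lam from the abstract; the
    leading constant is suppressed, and lam = 0.5 corresponds to the
    n -> infinity regime."""
    return (m * k * math.log(m * k * n) / n) ** lam

# The dimensionality-dependent rate tightens as the sample grows:
for n in (10**3, 10**5, 10**7):
    print(n, coding_bound(m=100, k=10, n=n))
```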

Multigrid Neural Architectures

Title Multigrid Neural Architectures
Authors Tsung-Wei Ke, Michael Maire, Stella X. Yu
Abstract We propose a multigrid extension of convolutional neural networks (CNNs). Rather than manipulating representations living on a single spatial grid, our network layers operate across scale space, on a pyramid of grids. They consume multigrid inputs and produce multigrid outputs; convolutional filters themselves have both within-scale and cross-scale extent. This aspect is distinct from simple multiscale designs, which only process the input at different scales. Viewed in terms of information flow, a multigrid network passes messages across a spatial pyramid. As a consequence, receptive field size grows exponentially with depth, facilitating rapid integration of context. Most critically, multigrid structure enables networks to learn internal attention and dynamic routing mechanisms, and use them to accomplish tasks on which modern CNNs fail. Experiments demonstrate wide-ranging performance advantages of multigrid. On CIFAR and ImageNet classification tasks, flipping from a single grid to multigrid within the standard CNN paradigm improves accuracy, while being compute and parameter efficient. Multigrid is independent of other architectural choices; we show synergy in combination with residual connections. Multigrid yields dramatic improvement on a synthetic semantic segmentation dataset. Most strikingly, relatively shallow multigrid networks can learn to directly perform spatial transformation tasks, where, in contrast, current CNNs fail. Together, our results suggest that continuous evolution of features on a multigrid pyramid is a more powerful alternative to existing CNN designs on a flat grid.
Tasks Image Classification, Semantic Segmentation
Published 2016-11-23
URL http://arxiv.org/abs/1611.07661v2
PDF http://arxiv.org/pdf/1611.07661v2.pdf
PWC https://paperswithcode.com/paper/multigrid-neural-architectures
Repo
Framework
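
To make the "within-scale and cross-scale extent" concrete, here is a sketch of a single multigrid convolution layer in PyTorch: each scale sees its own grid plus a downsampled finer neighbor and an upsampled coarser neighbor before convolving. The three-scale pyramid and the channel counts are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultigridConv(nn.Module):
    """One multigrid conv layer over a 3-scale pyramid (sketch)."""
    def __init__(self, channels):
        super().__init__()
        # One conv per scale; input channels grow where neighbors exist.
        self.convs = nn.ModuleList(
            nn.Conv2d(channels * self._fan_in(i, 3), channels, 3, padding=1)
            for i in range(3))

    @staticmethod
    def _fan_in(i, n_scales):
        return 1 + (i > 0) + (i < n_scales - 1)

    def forward(self, pyramid):  # pyramid: [fine, mid, coarse] tensors
        outs = []
        for i, x in enumerate(pyramid):
            parts = [x]
            if i > 0:                     # finer neighbor, downsampled
                parts.append(F.avg_pool2d(pyramid[i - 1], 2))
            if i < len(pyramid) - 1:      # coarser neighbor, upsampled
                parts.append(F.interpolate(pyramid[i + 1], scale_factor=2))
            outs.append(torch.relu(self.convs[i](torch.cat(parts, 1))))
        return outs

pyr = [torch.randn(1, 16, 32, 32), torch.randn(1, 16, 16, 16),
       torch.randn(1, 16, 8, 8)]
fine, mid, coarse = MultigridConv(16)(pyr)
```

Because every layer exchanges information across scales, the receptive field grows exponentially with depth, which is the property the abstract highlights.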

An Analysis of Tournament Structure

Title An Analysis of Tournament Structure
Authors Nhien Pham Hoang Bao, Hiroyuki Iida
Abstract This paper explores a novel way of analyzing tournament structures to find the one best suited to the tournament under consideration. It considers three aspects: tournament conducting cost, competitiveness development, and ranking precision. It then proposes a new method using a progress tree to detect potential throwaway matches. The analysis performed using the proposed method reveals the strengths and weaknesses of the tournament structures. In conclusion, single elimination is best if we want to qualify only one winner, since all matches conducted are exciting in terms of competitiveness. Double elimination with a proper seeding system is a better choice if we want to qualify more winners: a reasonable number of extra matches must be conducted in exchange for being able to qualify the top four winners. Round-robin gives reliable ranking precision for all participants; however, its conducting cost is very high, and it fails to maintain competitiveness development.
Tasks
Published 2016-11-16
URL http://arxiv.org/abs/1611.08499v1
PDF http://arxiv.org/pdf/1611.08499v1.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-tournament-structure
Repo
Framework
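
The conducting-cost comparison rests on standard match counts, which a few lines make explicit (the double-elimination total can grow by one if the bracket final is reset):

```python
def matches_needed(n, structure):
    """Conducting cost (number of matches) for n players. These counts
    are standard: single elimination needs n-1 matches to produce one
    winner, double elimination roughly doubles that, and round-robin
    plays every pair once."""
    if structure == "single_elimination":
        return n - 1
    if structure == "double_elimination":
        return 2 * (n - 1)          # plus possibly one reset final
    if structure == "round_robin":
        return n * (n - 1) // 2
    raise ValueError(structure)

for s in ("single_elimination", "double_elimination", "round_robin"):
    print(s, matches_needed(16, s))   # 15, 30, 120
```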

Adaptive matching pursuit for sparse signal recovery

Title Adaptive matching pursuit for sparse signal recovery
Authors Tiep H. Vu, Hojjat S. Mousavi, Vishal Monga
Abstract Spike-and-slab priors have been of much recent interest in signal processing as a means of inducing sparsity in Bayesian inference. Application domains that benefit from the use of these priors include sparse recovery, regression and classification. It is well known that solving for the sparse coefficient vector that maximizes these priors results in a hard non-convex, mixed integer programming problem. Most existing solutions to this optimization problem either involve simplifying assumptions/relaxations or are computationally expensive. We propose a new greedy adaptive matching pursuit (AMP) algorithm to directly solve this hard problem. Essentially, in each step of the algorithm, the set of active elements is updated by either adding or removing one index, whichever yields the greater improvement. In addition, the intermediate steps of the algorithm are calculated via an inexpensive Cholesky decomposition, which makes the algorithm much faster. Results on simulated data sets as well as real-world image recovery challenges confirm the benefits of the proposed AMP, particularly in providing a superior cost-quality trade-off over existing alternatives.
Tasks Bayesian Inference
Published 2016-09-12
URL http://arxiv.org/abs/1610.08495v1
PDF http://arxiv.org/pdf/1610.08495v1.pdf
PWC https://paperswithcode.com/paper/adaptive-matching-pursuit-for-sparse-signal
Repo
Framework
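
The add-or-remove-one greedy step can be sketched directly. Plain least squares stands in for the paper's incremental Cholesky updates, and the spike-and-slab prior term is omitted, so this shows only the combinatorial skeleton under those assumptions.

```python
import numpy as np

def amp_sketch(A, y, n_steps=20):
    """Greedy add-or-remove-one pursuit for y ~= A x with sparse x. At
    each step, try adding each inactive column and removing each active
    one; keep whichever single change lowers the residual most."""
    active = set()

    def residual(idx):
        if not idx:
            return np.linalg.norm(y)
        cols = sorted(idx)
        x, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
        return np.linalg.norm(y - A[:, cols] @ x)

    best = residual(active)
    for _ in range(n_steps):
        moves = [active | {j} for j in range(A.shape[1]) if j not in active]
        moves += [active - {j} for j in active]
        r, m = min(((residual(m), m) for m in moves), key=lambda t: t[0])
        if r >= best:       # no single change improves the fit
            break
        best, active = r, m
    return sorted(active)
```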

Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories

Title Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Authors Mark Harmon, Patrick Lucey, Diego Klabjan
Abstract In this paper, we predict the likelihood of a player making a shot in basketball from multiagent trajectories. Previous approaches to similar problems center on hand-crafting features to capture domain-specific knowledge. Although intuitive, this approach is prone to missing important predictive features, as recent work in deep learning has shown. To circumvent this issue, we present a convolutional neural network (CNN) approach in which we initially represent the multiagent behavior as an image. To encode the adversarial nature of basketball, we use a multi-channel image, which we then feed into a CNN. Additionally, to capture the temporal aspect of the trajectories, we “fade” the player trajectories. We find that this approach is superior to a traditional feed-forward network (FFN) model. By using gradient ascent to create images from an already trained CNN, we discover what features the CNN filters learn. Last, we find that a combined CNN+FFN is the best-performing network, with an error rate of 39%.
Tasks
Published 2016-09-15
URL http://arxiv.org/abs/1609.04849v4
PDF http://arxiv.org/pdf/1609.04849v4.pdf
PWC https://paperswithcode.com/paper/predicting-shot-making-in-basketball-learnt
Repo
Framework
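
The image encoding is the interesting part: rasterize each group's trajectories into its own channel and let pixel intensity encode recency. A minimal sketch, with the grid size and coordinate normalization assumed:

```python
import numpy as np

def trajectories_to_image(offense, defense, ball, size=64):
    """Rasterize (x, y) trajectories into a 3-channel image, one channel
    per group (offense / defense / ball), with intensity fading for
    older timesteps -- a sketch of the encoding the abstract describes.
    Coordinates are assumed normalized to [0, 1]; each trajectory is a
    (T, 2) array, and `ball` is a list holding one such array."""
    img = np.zeros((3, size, size), dtype=np.float32)
    for ch, group in enumerate((offense, defense, ball)):
        for traj in group:
            T = len(traj)
            for t, (x, y) in enumerate(traj):
                i, j = int(y * (size - 1)), int(x * (size - 1))
                img[ch, i, j] = max(img[ch, i, j], (t + 1) / T)  # fade
    return img
```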

Scale Invariant Interest Points with Shearlets

Title Scale Invariant Interest Points with Shearlets
Authors Miguel A. Duval-Poo, Nicoletta Noceti, Francesca Odone, Ernesto De Vito
Abstract Shearlets are a relatively new directional multi-scale framework for signal analysis, which has been shown to be effective at enhancing signal discontinuities such as edges and corners at multiple scales. In this work we address the problem of detecting and describing blob-like features in the shearlet framework. We derive a measure which is very effective for blob detection and closely related to the Laplacian of Gaussian. We demonstrate that the measure satisfies the perfect scale invariance property in the continuous case. In the discrete setting, we derive algorithms for blob detection and keypoint description. Finally, we provide qualitative justifications of our findings as well as a quantitative evaluation on benchmark data. We also report experimental evidence that our method is well suited to compressed and noisy images, thanks to the sparsity property of shearlets.
Tasks
Published 2016-07-26
URL http://arxiv.org/abs/1607.07639v1
PDF http://arxiv.org/pdf/1607.07639v1.pdf
PWC https://paperswithcode.com/paper/scale-invariant-interest-points-with
Repo
Framework
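
Since the proposed measure is closely related to the Laplacian of Gaussian, the classic multi-scale LoG detector makes a useful point of reference. scikit-image ships one; this is the baseline the measure relates to, not the authors' shearlet code.

```python
from skimage.feature import blob_log

def log_blobs(gray_image):
    # Multi-scale Laplacian-of-Gaussian blob detection; sigma range and
    # threshold are illustrative defaults.
    blobs = blob_log(gray_image, min_sigma=2, max_sigma=30,
                     num_sigma=10, threshold=0.1)
    blobs[:, 2] *= 2 ** 0.5   # convert sigma to an approximate radius
    return blobs              # rows: (row, col, radius)
```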

Training Auto-encoders Effectively via Eliminating Task-irrelevant Input Variables

Title Training Auto-encoders Effectively via Eliminating Task-irrelevant Input Variables
Authors Hui Shen, Dehua Li, Hong Wu, Zhaoxiang Zang
Abstract Auto-encoders are often used as building blocks of deep network classifiers to learn feature extractors, but task-irrelevant information in the input data may lead to bad extractors and result in poor generalization performance of the network. In this paper, we show that dropping the task-irrelevant input variables can markedly improve the performance of auto-encoders. Specifically, an importance-based variable selection method is proposed to find the task-irrelevant input variables and drop them. It first estimates the importance of each variable, and then drops the variables whose importance falls below a threshold. To obtain better performance, the method can be applied to each layer of stacked auto-encoders. Experimental results show that, when combined with our method, stacked denoising auto-encoders achieve significantly improved performance on three challenging datasets.
Tasks Denoising
Published 2016-05-31
URL http://arxiv.org/abs/1605.09458v1
PDF http://arxiv.org/pdf/1605.09458v1.pdf
PWC https://paperswithcode.com/paper/training-auto-encoders-effectively-via
Repo
Framework
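
The selection loop is simple: score each input variable, keep those above a threshold, and train the auto-encoder on the survivors. Mutual information with the task label is an assumed stand-in below; the paper defines its own importance estimate.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def drop_irrelevant_variables(X, y, threshold=0.01):
    """Estimate each input variable's importance and drop those below a
    threshold before auto-encoder training. The importance measure and
    threshold here are illustrative assumptions."""
    importance = mutual_info_classif(X, y)
    keep = np.where(importance >= threshold)[0]
    return X[:, keep], keep   # reduced data + retained column indices
```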

T-CONV: A Convolutional Neural Network For Multi-scale Taxi Trajectory Prediction

Title T-CONV: A Convolutional Neural Network For Multi-scale Taxi Trajectory Prediction
Authors Jianming Lv, Qing Li, Xintong Wang
Abstract Precise destination prediction for taxi trajectories can benefit many intelligent location-based services, such as accurate advertising for passengers. Traditional prediction approaches, which treat trajectories as one-dimensional sequences and process them at a single scale, fail to capture the diverse two-dimensional patterns of trajectories at different spatial scales. In this paper, we propose T-CONV, which models trajectories as two-dimensional images and adopts multi-layer convolutional neural networks to combine multi-scale trajectory patterns for precise prediction. Furthermore, we conduct gradient analysis to visualize the multi-scale spatial patterns captured by T-CONV and extract the areas with distinct influence on the ultimate prediction. Finally, we integrate multiple local enhancement convolutional fields to explore these important areas more deeply for better prediction. Comprehensive experiments based on real trajectory data show that T-CONV can achieve higher accuracy than state-of-the-art methods.
Tasks Trajectory Prediction
Published 2016-11-23
URL http://arxiv.org/abs/1611.07635v3
PDF http://arxiv.org/pdf/1611.07635v3.pdf
PWC https://paperswithcode.com/paper/t-conv-a-convolutional-neural-network-for
Repo
Framework
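
The input encoding can be sketched in a few lines: render a trajectory as occupancy grids at several spatial scales, which is what multi-scale convolutional layers would then consume. The grid resolutions and the [0, 1] coordinate normalization below are placeholders, not T-CONV's exact parameters.

```python
import numpy as np

def trajectory_to_grids(points, cell_sizes=(0.01, 0.04)):
    """Render one taxi trajectory (iterable of (lon, lat) pairs, assumed
    normalized to [0, 1]) as binary occupancy images at several spatial
    scales -- a sketch of the two-dimensional trajectory encoding."""
    grids = []
    for cell in cell_sizes:
        n = int(round(1.0 / cell))
        g = np.zeros((n, n), dtype=np.float32)
        for lon, lat in points:
            g[min(int(lat * n), n - 1), min(int(lon * n), n - 1)] = 1.0
        grids.append(g)
    return grids
```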