May 7, 2019

Paper Group ANR 100

Learning Features of Music from Scratch. Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning. BreakingNews: Article Annotation by Image and Text Processing. Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity. Generation of Near-Optimal Solutions …

Learning Features of Music from Scratch

Title Learning Features of Music from Scratch
Authors John Thickstun, Zaid Harchaoui, Sham Kakade
Abstract This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research. MusicNet consists of hundreds of freely-licensed classical music recordings by 10 composers, written for 11 instruments, together with instrument/note annotations resulting in over 1 million temporal labels on 34 hours of chamber music performances under various studio and microphone conditions. The paper defines a multi-label classification task to predict notes in musical recordings, along with an evaluation protocol, and benchmarks several machine learning architectures for this task: i) learning from spectrogram features; ii) end-to-end learning with a neural net; iii) end-to-end learning with a convolutional neural net. These experiments show that end-to-end models trained for note prediction learn frequency selective filters as a low-level representation of audio.
Tasks Multi-Label Classification
Published 2016-11-29
URL http://arxiv.org/abs/1611.09827v2
PDF http://arxiv.org/pdf/1611.09827v2.pdf
PWC https://paperswithcode.com/paper/learning-features-of-music-from-scratch
Repo
Framework

Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning

Title Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning
Authors Naima Kaabouch, Wen-Chen Hu, Yi Chen
Abstract Asymmetry analysis based on the overlapping of thermal images has proved able to detect inflammation and predict foot ulceration. This technique involves three main steps: segmentation, geometric transformation, and overlapping. However, the overlapping technique, which consists of subtracting the intensity levels of the right foot from those of the left foot, can also detect false abnormal areas if the projections of the left and right feet are not the same. In this paper, we present an alternative technique to asymmetry analysis-based overlapping. The proposed technique, scalable scanning, allows for an effective comparison even if the shapes and sizes of the feet projections appear different in the image. The test results show that asymmetry analysis-based scalable scanning produces fewer false abnormal areas than asymmetry analysis-based overlapping.
Tasks
Published 2016-06-11
URL http://arxiv.org/abs/1606.03578v1
PDF http://arxiv.org/pdf/1606.03578v1.pdf
PWC https://paperswithcode.com/paper/alternative-technique-to-asymmetry-analysis
Repo
Framework

BreakingNews: Article Annotation by Image and Text Processing

Title BreakingNews: Article Annotation by Image and Text Processing
Authors Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk
Abstract Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of computer vision and natural language processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of News articles in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce new deep learning methods that address source detection, popularity prediction, article illustration and geolocation of articles. An adaptive CNN architecture is proposed, that shares most of the structure for all the tasks, and is suitable for multitask and transfer learning. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and popularity metrics). We show this dataset to be appropriate to explore all aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures, and different representations of the textual and visual features. We report very promising results and bring to light several limitations of current state-of-the-art in this kind of domain, which we hope will help spur progress in the field.
Tasks Image Retrieval, Transfer Learning
Published 2016-03-23
URL http://arxiv.org/abs/1603.07141v1
PDF http://arxiv.org/pdf/1603.07141v1.pdf
PWC https://paperswithcode.com/paper/breakingnews-article-annotation-by-image-and
Repo
Framework
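
The abstract above mentions a loss function based on Great Circle Distance for geolocation but does not give its exact form. As a rough illustration only, the sketch below computes the great-circle distance between two latitude/longitude points with the haversine formula. The function name `great_circle_km` and the Earth radius constant are assumptions made for this example, and this is not claimed to be the authors' loss.

```python
import math

# Mean Earth radius in kilometres (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


if __name__ == "__main__":
    # Distance from the equator to the north pole along a meridian.
    print(great_circle_km(0.0, 0.0, 90.0, 0.0))
```

A distance of this kind could serve as a regression penalty between predicted and true coordinates, since it respects the spherical geometry that a plain squared error on latitude and longitude ignores.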

Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity

Title Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity
Authors Jia Xu, Zu-Zhen Huang, Zhi-Rui Wang, Li Xiao, Xiang-Gen Xia, Teng Long
Abstract In this paper, with respect to multichannel synthetic aperture radars (SAR), we first formulate the problems of Doppler ambiguities in the radial velocity (RV) estimation of a ground moving target in the range-compressed domain, range-Doppler domain and image domain, respectively. It is revealed that in these problems, a cascaded time-space Doppler ambiguity (CTSDA) may be encountered, i.e., time domain Doppler ambiguity (TDDA) in each channel arises first and then spatial domain Doppler ambiguity (SDDA) among multi-channels arises second. Accordingly, the multichannel SAR systems with different parameters are investigated in three different cases with diverse Doppler ambiguity properties, and a multi-frequency SAR is then proposed to obtain the RV estimation by solving the ambiguity problem based on the Chinese remainder theorem (CRT). In the first two cases, the ambiguity problem can be solved by the existing closed-form robust CRT. In the third case, it is found that the problem is different from the conventional CRT problems and we call it a double remaindering problem in this paper. We then propose a sufficient condition under which the double remaindering problem, i.e., the CTSDA, can also be solved by the closed-form robust CRT. When the sufficient condition is not satisfied for a multi-channel SAR, a searching based method is proposed. Finally, some results of numerical experiments are provided to demonstrate the effectiveness of the proposed methods.
Tasks
Published 2016-10-01
URL http://arxiv.org/abs/1610.00070v3
PDF http://arxiv.org/pdf/1610.00070v3.pdf
PWC https://paperswithcode.com/paper/radial-velocity-retrieval-for-multichannel
Repo
Framework
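
The abstract above resolves Doppler ambiguity with a robust, closed-form variant of the Chinese remainder theorem (CRT). The sketch below shows only the classical integer CRT with pairwise coprime moduli, to illustrate how a value can be recovered from several "folded" (remaindered) observations. It is not the robust CRT or the double remaindering method of the paper, and the function name `crt` is an assumption for this example.

```python
from math import prod


def crt(remainders, moduli):
    """Classical CRT: find x in [0, M) with x % m_i == r_i for all i.

    Assumes the moduli are pairwise coprime; M is their product.
    """
    M = prod(moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        x += r * Mi * pow(Mi, -1, m)
    return x % M


if __name__ == "__main__":
    # An unknown value observed only modulo 3, 5 and 7.
    print(crt([2, 3, 2], [3, 5, 7]))
```

Each modulus plays the role of one ambiguous measurement: a single remainder leaves the true value undetermined, while several remainders with coprime moduli pin it down uniquely within the product of the moduli.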

Generation of Near-Optimal Solutions Using ILP-Guided Sampling

Title Generation of Near-Optimal Solutions Using ILP-Guided Sampling
Authors Ashwin Srinivasan, Gautam Shroff, Lovekesh Vig, Sarmimala Saikia, Puneet Agarwal
Abstract Our interest in this paper is in optimisation problems that are intractable to solve by direct numerical optimisation, but nevertheless have significant amounts of relevant domain-specific knowledge. The category of heuristic search techniques known as estimation of distribution algorithms (EDAs) seeks to incrementally sample from probability distributions in which optimal (or near-optimal) solutions have increasingly higher probabilities. Can we use domain knowledge to assist the estimation of these distributions? To answer this in the affirmative, we need: (a) a general-purpose technique for the incorporation of domain knowledge when constructing models for optimal values; and (b) a way of using these models to generate new data samples. Here we investigate a combination of the use of Inductive Logic Programming (ILP) for (a), and standard logic-programming machinery to generate new samples for (b). Specifically, on each iteration of distribution estimation, an ILP engine is used to construct a model for good solutions. The resulting theory is then used to guide the generation of new data instances, which are now restricted to those derivable using the ILP model in conjunction with the background knowledge. We demonstrate the approach on two optimisation problems (predicting optimal depth-of-win for the KRK endgame, and job-shop scheduling). Our results are promising: (a) on each iteration of distribution estimation, samples obtained with an ILP theory have a substantially greater proportion of good solutions than samples without a theory; and (b) on termination of distribution estimation, samples obtained with an ILP theory contain more near-optimal samples than samples without a theory. Taken together, these results suggest that the use of ILP-constructed theories could be a useful technique for incorporating complex domain-knowledge into estimation distribution procedures.
Tasks
Published 2016-08-03
URL http://arxiv.org/abs/1608.01093v2
PDF http://arxiv.org/pdf/1608.01093v2.pdf
PWC https://paperswithcode.com/paper/generation-of-near-optimal-solutions-using
Repo
Framework

Rapid Posterior Exploration in Bayesian Non-negative Matrix Factorization

Title Rapid Posterior Exploration in Bayesian Non-negative Matrix Factorization
Authors M. Arjumand Masood, Finale Doshi-Velez
Abstract Non-negative Matrix Factorization (NMF) is a popular tool for data exploration. Bayesian NMF promises to also characterize uncertainty in the factorization. Unfortunately, current inference approaches such as MCMC mix slowly and tend to get stuck on single modes. We introduce a novel approach using rapidly-exploring random trees (RRTs) to asymptotically cover regions of high posterior density. These are placed in a principled Bayesian framework via an online extension to nonparametric variational inference. In experiments on real and synthetic data, we obtain greater coverage of the posterior and higher ELBO values than standard NMF inference approaches.
Tasks
Published 2016-10-27
URL http://arxiv.org/abs/1610.08928v1
PDF http://arxiv.org/pdf/1610.08928v1.pdf
PWC https://paperswithcode.com/paper/rapid-posterior-exploration-in-bayesian-non
Repo
Framework

Teaching natural language to computers

Title Teaching natural language to computers
Authors Joseph Corneli, Miriam Corneli
Abstract “Natural Language,” whether spoken and attended to by humans, or processed and generated by computers, requires networked structures that reflect creative processes in semantic, syntactic, phonetic, linguistic, social, emotional, and cultural modules. Being able to produce novel and useful behavior following repeated practice gets to the root of both artificial intelligence and human language. This paper investigates the modalities involved in language-like applications that computers – and programmers – engage with, and aims to fine tune the questions we ask to better account for context, self-awareness, and embodiment.
Tasks
Published 2016-04-29
URL http://arxiv.org/abs/1604.08781v2
PDF http://arxiv.org/pdf/1604.08781v2.pdf
PWC https://paperswithcode.com/paper/teaching-natural-language-to-computers
Repo
Framework

Beyond Caption To Narrative: Video Captioning With Multiple Sentences

Title Beyond Caption To Narrative: Video Captioning With Multiple Sentences
Authors Andrew Shin, Katsunori Ohnishi, Tatsuya Harada
Abstract Recent advances in the image captioning task have led to increasing interest in the video captioning task. However, most works on video captioning are focused on generating single input of aggregated features, which hardly deviates from the image captioning process and does not take full advantage of the dynamic contents present in videos. We attempt to generate video captions that convey richer contents by temporally segmenting the video with action localization, generating multiple captions from multiple frames, and connecting them with natural language processing techniques, in order to generate a story-like caption. We show that our proposed method can generate captions that are richer in content and can compete with state-of-the-art methods without explicitly using video-level features as input.
Tasks Action Localization, Image Captioning, Video Captioning
Published 2016-05-18
URL http://arxiv.org/abs/1605.05440v1
PDF http://arxiv.org/pdf/1605.05440v1.pdf
PWC https://paperswithcode.com/paper/beyond-caption-to-narrative-video-captioning
Repo
Framework

Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals

Title Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals
Authors Manuel J. A. Eugster, Tuukka Ruotsalo, Michiel M. Spapé, Oswald Barral, Niklas Ravaja, Giulio Jacucci, Samuel Kaski
Abstract Finding relevant information from large document collections such as the World Wide Web is a common task in our daily lives. Estimation of a user’s interest or search intention is necessary to recommend and retrieve relevant information from these collections. We introduce a brain-information interface used for recommending information by relevance inferred directly from brain signals. In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual’s search intent was modeled and successfully used for retrieving new, relevant documents from the whole English Wikipedia corpus. The results show that the users’ interests towards digital content can be modeled from the brain signals evoked by reading. The introduced brain-relevance paradigm enables the recommendation of information without any explicit user interaction, and may be applied across diverse information-intensive applications.
Tasks EEG
Published 2016-07-12
URL http://arxiv.org/abs/1607.03502v1
PDF http://arxiv.org/pdf/1607.03502v1.pdf
PWC https://paperswithcode.com/paper/natural-brain-information-interfaces
Repo
Framework

Spatio-temporal Aware Non-negative Component Representation for Action Recognition

Title Spatio-temporal Aware Non-negative Component Representation for Action Recognition
Authors Jianhong Wang, Tian Lan, Xu Zhang, Limin Luo
Abstract This paper presents a novel mid-level representation for action recognition, named spatio-temporal aware non-negative component representation (STANNCR). The proposed STANNCR is based on action components and incorporates spatial-temporal information. We first introduce a spatial-temporal distribution vector (STDV) to model the distributions of local feature locations in a compact and discriminative manner. Then we employ non-negative matrix factorization (NMF) to learn the action components and encode the video samples. The action component considers the correlations of visual words, which effectively bridge the semantic gap in action recognition. To incorporate the spatial-temporal cues for the final representation, the STDV is used as part of the graph regularization for NMF. The fusion of spatial-temporal information makes the STANNCR more discriminative, and our fusion manner is more compact than the traditional method of concatenating vectors. The proposed approach is extensively evaluated on three public datasets. The experimental results demonstrate the effectiveness of STANNCR for action recognition.
Tasks Temporal Action Localization
Published 2016-08-27
URL http://arxiv.org/abs/1608.07664v1
PDF http://arxiv.org/pdf/1608.07664v1.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-aware-non-negative-component
Repo
Framework

Update Strength in EDAs and ACO: How to Avoid Genetic Drift

Title Update Strength in EDAs and ACO: How to Avoid Genetic Drift
Authors Dirk Sudholt, Carsten Witt
Abstract We provide a rigorous runtime analysis concerning the update strength, a vital parameter in probabilistic model-building GAs such as the step size $1/K$ in the compact Genetic Algorithm (cGA) and the evaporation factor $\rho$ in ACO. While a large update strength is desirable for exploitation, there is a general trade-off: too strong updates can lead to genetic drift and poor performance. We demonstrate this trade-off for the cGA and a simple MMAS ACO algorithm on the OneMax function. More precisely, we obtain lower bounds on the expected runtime of $\Omega(K\sqrt{n} + n \log n)$ and $\Omega(\sqrt{n}/\rho + n \log n)$, respectively, showing that the update strength should be limited to $1/K, \rho = O(1/(\sqrt{n} \log n))$. In fact, when choosing $1/K, \rho \sim 1/(\sqrt{n}\log n)$, both algorithms efficiently optimize OneMax in expected time $O(n \log n)$. Our analyses provide new insights into the stochastic behavior of probabilistic model-building GAs and propose new guidelines for setting the update strength in global optimization.
Tasks
Published 2016-07-14
URL http://arxiv.org/abs/1607.04063v2
PDF http://arxiv.org/pdf/1607.04063v2.pdf
PWC https://paperswithcode.com/paper/update-strength-in-edas-and-aco-how-to-avoid
Repo
Framework
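
To make the role of the step size $1/K$ in the abstract above concrete, here is a minimal sketch of a compact Genetic Algorithm (cGA) on OneMax. It follows the commonly described form of the cGA: sample two bit strings from a frequency vector, then shift each frequency by $1/K$ towards the better string wherever the two differ. Clamping frequencies to $[1/n, 1 - 1/n]$, the function names `onemax` and `cga_onemax`, and the `max_iters` and `seed` parameters are assumptions made for this example rather than details taken from the paper.

```python
import random


def onemax(bits):
    """OneMax fitness: the number of one-bits."""
    return sum(bits)


def cga_onemax(n, K, max_iters=100_000, seed=0):
    """Run a cGA on OneMax; return (iteration the optimum was sampled, frequencies).

    The first element is None if the optimum was not sampled within max_iters.
    """
    rng = random.Random(seed)
    p = [0.5] * n                      # one frequency per bit position
    lo, hi = 1.0 / n, 1.0 - 1.0 / n    # assumed borders for the frequencies
    for t in range(1, max_iters + 1):
        x = [1 if rng.random() < pi else 0 for pi in p]
        y = [1 if rng.random() < pi else 0 for pi in p]
        if onemax(x) < onemax(y):
            x, y = y, x                # x is now the winner
        if onemax(x) == n:
            return t, p
        for i in range(n):
            if x[i] != y[i]:
                # Update strength 1/K: move towards the winner's bit value.
                step = 1.0 / K if x[i] == 1 else -1.0 / K
                p[i] = min(hi, max(lo, p[i] + step))
    return None, p


if __name__ == "__main__":
    hit, freqs = cga_onemax(n=20, K=40)
    print(hit, [round(f, 2) for f in freqs])
```

A small `K` means large steps, so frequencies can drift to a border through random fluctuations before the fitness signal takes hold; a large `K` slows every update. This is the trade-off the abstract analyses.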

Adaptive and Efficient Nonlinear Channel Equalization for Underwater Acoustic Communication

Title Adaptive and Efficient Nonlinear Channel Equalization for Underwater Acoustic Communication
Authors Dariush Kari, Nuri Denizcan Vanli, Suleyman Serdar Kozat
Abstract We investigate underwater acoustic (UWA) channel equalization and introduce hierarchical and adaptive nonlinear channel equalization algorithms that are highly efficient and provide significantly improved bit error rate (BER) performance. Due to the high complexity of nonlinear equalizers and poor performance of linear ones, to equalize highly difficult underwater acoustic channels, we employ piecewise linear equalizers. However, in order to achieve the performance of the best piecewise linear model, we use a tree structure to hierarchically partition the space of the received signal. Furthermore, the equalization algorithm should be completely adaptive, since due to the highly non-stationary nature of the underwater medium, the optimal MSE equalizer as well as the best piecewise linear equalizer changes in time. To this end, we introduce an adaptive piecewise linear equalization algorithm that not only adapts the linear equalizer at each region but also learns the complete hierarchical structure with a computational complexity only polynomial in the number of nodes of the tree. Furthermore, our algorithm is constructed to directly minimize the final squared error without introducing any ad-hoc parameters. We demonstrate the performance of our algorithms through highly realistic experiments performed on accurately simulated underwater acoustic channels.
Tasks
Published 2016-01-06
URL http://arxiv.org/abs/1601.01218v1
PDF http://arxiv.org/pdf/1601.01218v1.pdf
PWC https://paperswithcode.com/paper/adaptive-and-efficient-nonlinear-channel
Repo
Framework

The Movie Graph Argument Revisited

Title The Movie Graph Argument Revisited
Authors Russell K. Standish
Abstract In this paper, we reexamine the Movie Graph Argument, which demonstrates a basic incompatibility between computationalism and materialism. We discover that the incompatibility is only manifest in singular classical-like universes. If we accept that we live in a Multiverse, then the incompatibility goes away, but in that case another line of argument shows that with computationalism, the fundamental, or primitive, materiality has no causal influence on what is observed, which must be derivable from basic arithmetic properties.
Tasks
Published 2016-08-28
URL http://arxiv.org/abs/1608.07764v1
PDF http://arxiv.org/pdf/1608.07764v1.pdf
PWC https://paperswithcode.com/paper/the-movie-graph-argument-revisited
Repo
Framework

Fuzzy Constraints Linear Discriminant Analysis

Title Fuzzy Constraints Linear Discriminant Analysis
Authors Hamid Reza Hassanzadeh, Hadi Sadoghi Yazdi, Abedin Vahedian
Abstract In this paper we introduce a fuzzy constraint linear discriminant analysis (FC-LDA). The FC-LDA tries to minimize misclassification error based on a modified perceptron criterion that helps handle the uncertainty near the decision boundary by means of a fuzzy linear programming approach with fuzzy resources. The proposed method has low computational complexity because of its linear characteristics and the ability to deal with noisy data with different degrees of tolerance. The results obtained verify the success of the algorithm when dealing with different problems. A comparison with LDA shows the superiority of FC-LDA on classification tasks.
Tasks
Published 2016-12-30
URL http://arxiv.org/abs/1612.09593v1
PDF http://arxiv.org/pdf/1612.09593v1.pdf
PWC https://paperswithcode.com/paper/fuzzy-constraints-linear-discriminant
Repo
Framework

Bayesian Reinforcement Learning: A Survey

Title Bayesian Reinforcement Learning: A Survey
Authors Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar
Abstract Bayesian methods for machine learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms. In this survey, we provide an in-depth review of the role of Bayesian methods for the reinforcement learning (RL) paradigm. The major incentives for incorporating Bayesian reasoning in RL are: 1) it provides an elegant approach to action-selection (exploration/exploitation) as a function of the uncertainty in learning; and 2) it provides a machinery to incorporate prior knowledge into the algorithms. We first discuss models and methods for Bayesian inference in the simple single-step Bandit model. We then review the extensive recent literature on Bayesian methods for model-based RL, where prior information can be expressed on the parameters of the Markov model. We also present Bayesian methods for model-free RL, where priors are expressed over the value function or policy class. The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties.
Tasks Bayesian Inference
Published 2016-09-14
URL http://arxiv.org/abs/1609.04436v1
PDF http://arxiv.org/pdf/1609.04436v1.pdf
PWC https://paperswithcode.com/paper/bayesian-reinforcement-learning-a-survey
Repo
Framework
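
The survey abstract above starts from Bayesian inference in the single-step bandit model. One well-known Bayesian strategy in that setting is Thompson sampling, sketched below for Bernoulli rewards with Beta(1, 1) priors. The abstract does not name this algorithm, so its choice here is an assumption, as are the function name `thompson_bernoulli` and the simulated `true_means`.

```python
import random


def thompson_bernoulli(true_means, rounds, seed=0):
    """Thompson sampling on a simulated Bernoulli bandit.

    Each arm keeps a Beta(alpha, beta) posterior over its success rate,
    starting from the uniform prior Beta(1, 1).
    Returns (pulls per arm, alpha per arm, beta per arm).
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k
    beta = [1] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=samples.__getitem__)
        # Simulated environment: reward is 1 with the arm's true probability.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, alpha, beta


if __name__ == "__main__":
    print(thompson_bernoulli([0.2, 0.5, 0.8], rounds=1000))
```

Acting greedily on posterior samples ties exploration to uncertainty: arms with wide posteriors are still tried occasionally, while arms that are confidently worse are pulled less and less. This is the exploration/exploitation behaviour the abstract lists as the first incentive for Bayesian reasoning in RL.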