Paper Group ANR 100
Learning Features of Music from Scratch. Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning. BreakingNews: Article Annotation by Image and Text Processing. Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity. Generation of Near-Optimal Solutions …
Learning Features of Music from Scratch
Title | Learning Features of Music from Scratch |
Authors | John Thickstun, Zaid Harchaoui, Sham Kakade |
Abstract | This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research. MusicNet consists of hundreds of freely-licensed classical music recordings by 10 composers, written for 11 instruments, together with instrument/note annotations resulting in over 1 million temporal labels on 34 hours of chamber music performances under various studio and microphone conditions. The paper defines a multi-label classification task to predict notes in musical recordings, along with an evaluation protocol, and benchmarks several machine learning architectures for this task: i) learning from spectrogram features; ii) end-to-end learning with a neural net; iii) end-to-end learning with a convolutional neural net. These experiments show that end-to-end models trained for note prediction learn frequency selective filters as a low-level representation of audio. |
Tasks | Multi-Label Classification |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09827v2 |
http://arxiv.org/pdf/1611.09827v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-features-of-music-from-scratch |
Repo | |
Framework | |
Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning
Title | Alternative Technique to Asymmetry Analysis-Based Overlapping for Foot Ulcer Examination: Scalable Scanning |
Authors | Naima Kaabouch, Wen-Chen Hu, Yi Chen |
Abstract | Asymmetry analysis based on the overlapping of thermal images has proved able to detect inflammation and predict foot ulceration. This technique involves three main steps: segmentation, geometric transformation, and overlapping. However, the overlapping technique, which consists of subtracting the intensity levels of the right foot from those of the left foot, can also detect false abnormal areas if the projections of the left and right feet are not the same. In this paper, we present an alternative technique to asymmetry analysis-based overlapping. The proposed technique, scalable scanning, allows for an effective comparison even if the shapes and sizes of the feet projections appear different in the image. The test results show that asymmetry analysis-based scalable scanning produces fewer false abnormal areas than asymmetry analysis-based overlapping. |
Tasks | |
Published | 2016-06-11 |
URL | http://arxiv.org/abs/1606.03578v1 |
http://arxiv.org/pdf/1606.03578v1.pdf | |
PWC | https://paperswithcode.com/paper/alternative-technique-to-asymmetry-analysis |
Repo | |
Framework | |
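The overlapping step described in the abstract above (subtracting right-foot intensities from left-foot intensities) can be illustrated with a minimal sketch. It is not the authors' implementation: the horizontal mirroring of the right foot, the `threshold` parameter, and the function name `overlap_difference` are assumptions made for illustration, and the two patches are assumed to be already segmented and of equal size.

```python
def overlap_difference(left, right, threshold):
    """Subtract the mirrored right-foot patch from the left-foot patch.

    `left` and `right` are equal-sized 2D lists of intensity values.
    Mirroring and thresholding are illustrative assumptions, not details
    taken from the paper.
    """
    # Mirror the right foot horizontally so that it overlays the left foot.
    mirrored = [list(reversed(row)) for row in right]
    # Pixel-wise difference: left minus (mirrored) right.
    diff = [[l - r for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, mirrored)]
    # Flag pixels whose absolute difference exceeds the threshold.
    flagged = [[abs(d) > threshold for d in row] for row in diff]
    return diff, flagged


left_foot = [[30, 31], [32, 33]]
right_foot = [[31, 30], [33, 36]]
diff, flagged = overlap_difference(left_foot, right_foot, threshold=2)
```

If the two projections differ in shape or size, pixels near the outline are compared against background and get flagged, which is the false-abnormality problem the abstract attributes to plain overlapping.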
BreakingNews: Article Annotation by Image and Text Processing
Title | BreakingNews: Article Annotation by Image and Text Processing |
Authors | Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk |
Abstract | Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of computer vision and natural language processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of news articles, in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce new deep learning methods that address source detection, popularity prediction, article illustration and geolocation of articles. An adaptive CNN architecture is proposed that shares most of the structure for all the tasks and is suitable for multitask and transfer learning. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and popularity metrics). We show this dataset to be appropriate to explore all aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures and different representations of the textual and visual features. We report very promising results and bring to light several limitations of the current state of the art in this kind of domain, which we hope will help spur progress in the field. |
Tasks | Image Retrieval, Transfer Learning |
Published | 2016-03-23 |
URL | http://arxiv.org/abs/1603.07141v1 |
http://arxiv.org/pdf/1603.07141v1.pdf | |
PWC | https://paperswithcode.com/paper/breakingnews-article-annotation-by-image-and |
Repo | |
Framework | |
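The geolocation loss mentioned in the abstract above is built on the Great Circle Distance. The sketch below only computes that distance with the standard haversine formula. How the paper turns it into a training loss is not stated in the abstract, so nothing about the loss itself is assumed here. The Earth radius constant and the function name are illustrative choices.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Distance between a predicted and a true article location (made-up coordinates).
error_km = great_circle_km(41.39, 2.17, 51.51, -0.13)
```

Unlike a squared error on raw latitude and longitude, this distance is measured on the sphere, so it stays meaningful near the poles and across the 180° meridian.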
Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity
Title | Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity |
Authors | Jia Xu, Zu-Zhen Huang, Zhi-Rui Wang, Li Xiao, Xiang-Gen Xia, Teng Long |
Abstract | In this paper, with respect to multichannel synthetic aperture radars (SAR), we first formulate the problems of Doppler ambiguities on the radial velocity (RV) estimation of a ground moving target in the range-compressed domain, range-Doppler domain and image domain, respectively. It is revealed that in these problems, a cascaded time-space Doppler ambiguity (CTSDA) may be encountered, i.e., time domain Doppler ambiguity (TDDA) in each channel arises first, followed by spatial domain Doppler ambiguity (SDDA) among multi-channels. Accordingly, the multichannel SAR systems with different parameters are investigated in three different cases with diverse Doppler ambiguity properties, and a multi-frequency SAR is then proposed to obtain the RV estimation by solving the ambiguity problem based on the Chinese remainder theorem (CRT). In the first two cases, the ambiguity problem can be solved by the existing closed-form robust CRT. In the third case, it is found that the problem is different from the conventional CRT problems and we call it a double remaindering problem in this paper. We then propose a sufficient condition under which the double remaindering problem, i.e., the CTSDA, can also be solved by the closed-form robust CRT. When the sufficient condition is not satisfied for a multi-channel SAR, a search-based method is proposed. Finally, some results of numerical experiments are provided to demonstrate the effectiveness of the proposed methods. |
Tasks | |
Published | 2016-10-01 |
URL | http://arxiv.org/abs/1610.00070v3 |
http://arxiv.org/pdf/1610.00070v3.pdf | |
PWC | https://paperswithcode.com/paper/radial-velocity-retrieval-for-multichannel |
Repo | |
Framework | |
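The ambiguity resolution in the abstract above rests on the Chinese remainder theorem. As background only, here is a sketch of the classical integer CRT: it recovers a value from its remainders modulo pairwise coprime moduli. It is not the closed-form robust CRT the paper uses (which tolerates errors in the remainders), and the function name is hypothetical. It requires Python 3.8 or later for `math.prod` and modular inverses via `pow`.

```python
from math import prod


def crt(residues, moduli):
    """Classical CRT for pairwise coprime moduli (noise-free integer case)."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m.
        x += r * Mi * pow(Mi, -1, m)
    return x % M


# A value observed only through its "folded" remainders is recovered uniquely
# as long as it is smaller than the product of the moduli.
value = crt([2, 3, 2], [3, 5, 7])  # value == 23
```

The analogy to the radar setting is that each measurement reveals the quantity of interest only modulo some ambiguity interval, and combining several intervals extends the unambiguous range.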
Generation of Near-Optimal Solutions Using ILP-Guided Sampling
Title | Generation of Near-Optimal Solutions Using ILP-Guided Sampling |
Authors | Ashwin Srinivasan, Gautam Shroff, Lovekesh Vig, Sarmimala Saikia, Puneet Agarwal |
Abstract | Our interest in this paper is in optimisation problems that are intractable to solve by direct numerical optimisation, but nevertheless have significant amounts of relevant domain-specific knowledge. The category of heuristic search techniques known as estimation of distribution algorithms (EDAs) seeks to incrementally sample from probability distributions in which optimal (or near-optimal) solutions have increasingly higher probabilities. Can we use domain knowledge to assist the estimation of these distributions? To answer this in the affirmative, we need: (a) a general-purpose technique for the incorporation of domain knowledge when constructing models for optimal values; and (b) a way of using these models to generate new data samples. Here we investigate a combination of the use of Inductive Logic Programming (ILP) for (a), and standard logic-programming machinery to generate new samples for (b). Specifically, on each iteration of distribution estimation, an ILP engine is used to construct a model for good solutions. The resulting theory is then used to guide the generation of new data instances, which are now restricted to those derivable using the ILP model in conjunction with the background knowledge. We demonstrate the approach on two optimisation problems (predicting optimal depth-of-win for the KRK endgame, and job-shop scheduling). Our results are promising: (a) on each iteration of distribution estimation, samples obtained with an ILP theory have a substantially greater proportion of good solutions than samples without a theory; and (b) on termination of distribution estimation, samples obtained with an ILP theory contain more near-optimal samples than samples without a theory. Taken together, these results suggest that the use of ILP-constructed theories could be a useful technique for incorporating complex domain knowledge into estimation of distribution procedures. |
Tasks | |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01093v2 |
http://arxiv.org/pdf/1608.01093v2.pdf | |
PWC | https://paperswithcode.com/paper/generation-of-near-optimal-solutions-using |
Repo | |
Framework | |
Rapid Posterior Exploration in Bayesian Non-negative Matrix Factorization
Title | Rapid Posterior Exploration in Bayesian Non-negative Matrix Factorization |
Authors | M. Arjumand Masood, Finale Doshi-Velez |
Abstract | Non-negative Matrix Factorization (NMF) is a popular tool for data exploration. Bayesian NMF promises to also characterize uncertainty in the factorization. Unfortunately, current inference approaches such as MCMC mix slowly and tend to get stuck on single modes. We introduce a novel approach using rapidly-exploring random trees (RRTs) to asymptotically cover regions of high posterior density. These are placed in a principled Bayesian framework via an online extension to nonparametric variational inference. In experiments on real and synthetic data, we obtain greater coverage of the posterior and higher ELBO values than standard NMF inference approaches. |
Tasks | |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08928v1 |
http://arxiv.org/pdf/1610.08928v1.pdf | |
PWC | https://paperswithcode.com/paper/rapid-posterior-exploration-in-bayesian-non |
Repo | |
Framework | |
Teaching natural language to computers
Title | Teaching natural language to computers |
Authors | Joseph Corneli, Miriam Corneli |
Abstract | “Natural Language,” whether spoken and attended to by humans, or processed and generated by computers, requires networked structures that reflect creative processes in semantic, syntactic, phonetic, linguistic, social, emotional, and cultural modules. Being able to produce novel and useful behavior following repeated practice gets to the root of both artificial intelligence and human language. This paper investigates the modalities involved in language-like applications that computers – and programmers – engage with, and aims to fine tune the questions we ask to better account for context, self-awareness, and embodiment. |
Tasks | |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08781v2 |
http://arxiv.org/pdf/1604.08781v2.pdf | |
PWC | https://paperswithcode.com/paper/teaching-natural-language-to-computers |
Repo | |
Framework | |
Beyond Caption To Narrative: Video Captioning With Multiple Sentences
Title | Beyond Caption To Narrative: Video Captioning With Multiple Sentences |
Authors | Andrew Shin, Katsunori Ohnishi, Tatsuya Harada |
Abstract | Recent advances in the image captioning task have led to increasing interest in the video captioning task. However, most works on video captioning are focused on generating single input of aggregated features, which hardly deviates from the image captioning process and does not fully take advantage of the dynamic contents present in videos. We attempt to generate video captions that convey richer contents by temporally segmenting the video with action localization, generating multiple captions from multiple frames, and connecting them with natural language processing techniques, in order to generate a story-like caption. We show that our proposed method can generate captions that are richer in content and can compete with state-of-the-art methods without explicitly using video-level features as input. |
Tasks | Action Localization, Image Captioning, Video Captioning |
Published | 2016-05-18 |
URL | http://arxiv.org/abs/1605.05440v1 |
http://arxiv.org/pdf/1605.05440v1.pdf | |
PWC | https://paperswithcode.com/paper/beyond-caption-to-narrative-video-captioning |
Repo | |
Framework | |
Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals
Title | Natural brain-information interfaces: Recommending information by relevance inferred from human brain signals |
Authors | Manuel J. A. Eugster, Tuukka Ruotsalo, Michiel M. Spapé, Oswald Barral, Niklas Ravaja, Giulio Jacucci, Samuel Kaski |
Abstract | Finding relevant information from large document collections such as the World Wide Web is a common task in our daily lives. Estimation of a user’s interest or search intention is necessary to recommend and retrieve relevant information from these collections. We introduce a brain-information interface used for recommending information by relevance inferred directly from brain signals. In experiments, participants were asked to read Wikipedia documents about a selection of topics while their EEG was recorded. Based on the prediction of word relevance, the individual’s search intent was modeled and successfully used for retrieving new, relevant documents from the whole English Wikipedia corpus. The results show that the users’ interests towards digital content can be modeled from the brain signals evoked by reading. The introduced brain-relevance paradigm enables the recommendation of information without any explicit user interaction, and may be applied across diverse information-intensive applications. |
Tasks | EEG |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03502v1 |
http://arxiv.org/pdf/1607.03502v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-brain-information-interfaces |
Repo | |
Framework | |
Spatio-temporal Aware Non-negative Component Representation for Action Recognition
Title | Spatio-temporal Aware Non-negative Component Representation for Action Recognition |
Authors | Jianhong Wang, Tian Lan, Xu Zhang, Limin Luo |
Abstract | This paper presents a novel mid-level representation for action recognition, named spatio-temporal aware non-negative component representation (STANNCR). The proposed STANNCR is based on action components and incorporates spatial-temporal information. We first introduce a spatial-temporal distribution vector (STDV) to model the distributions of local feature locations in a compact and discriminative manner. Then we employ non-negative matrix factorization (NMF) to learn the action components and encode the video samples. The action component considers the correlations of visual words, which effectively bridges the semantic gap in action recognition. To incorporate the spatial-temporal cues into the final representation, the STDV is used as part of the graph regularization for NMF. The fusion of spatial-temporal information makes the STANNCR more discriminative, and our fusion manner is more compact than the traditional method of concatenating vectors. The proposed approach is extensively evaluated on three public datasets. The experimental results demonstrate the effectiveness of STANNCR for action recognition. |
Tasks | Temporal Action Localization |
Published | 2016-08-27 |
URL | http://arxiv.org/abs/1608.07664v1 |
http://arxiv.org/pdf/1608.07664v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-aware-non-negative-component |
Repo | |
Framework | |
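The abstract above uses non-negative matrix factorization to learn action components. As a point of reference, the following is a minimal sketch of plain NMF with the well-known multiplicative updates for the squared Frobenius error. It omits the graph regularization and the STDV term that the paper adds. The helper names, the random initialisation, and the small `eps` safeguard are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for vr, xr in zip(V, WH) for v, x in zip(vr, xr))


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


# Toy "visual word" histogram matrix: rows are videos, columns are words.
V = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 1.0, 3.0], [0.0, 2.0, 6.0]]
W, H = nmf(V, rank=2)
```

In this reading, the rows of `H` play the role of components (groups of co-occurring visual words) and each row of `W` is the non-negative encoding of one sample.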
Update Strength in EDAs and ACO: How to Avoid Genetic Drift
Title | Update Strength in EDAs and ACO: How to Avoid Genetic Drift |
Authors | Dirk Sudholt, Carsten Witt |
Abstract | We provide a rigorous runtime analysis concerning the update strength, a vital parameter in probabilistic model-building GAs such as the step size $1/K$ in the compact Genetic Algorithm (cGA) and the evaporation factor $\rho$ in ACO. While a large update strength is desirable for exploitation, there is a general trade-off: too strong updates can lead to genetic drift and poor performance. We demonstrate this trade-off for the cGA and a simple MMAS ACO algorithm on the OneMax function. More precisely, we obtain lower bounds on the expected runtime of $\Omega(K\sqrt{n} + n \log n)$ and $\Omega(\sqrt{n}/\rho + n \log n)$, respectively, showing that the update strength should be limited to $1/K, \rho = O(1/(\sqrt{n} \log n))$. In fact, when choosing $1/K, \rho \sim 1/(\sqrt{n}\log n)$, both algorithms efficiently optimize OneMax in expected time $O(n \log n)$. Our analyses provide new insights into the stochastic behavior of probabilistic model-building GAs and propose new guidelines for setting the update strength in global optimization. |
Tasks | |
Published | 2016-07-14 |
URL | http://arxiv.org/abs/1607.04063v2 |
http://arxiv.org/pdf/1607.04063v2.pdf | |
PWC | https://paperswithcode.com/paper/update-strength-in-edas-and-aco-how-to-avoid |
Repo | |
Framework | |
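To make the role of the step size $1/K$ in the abstract above concrete, here is a minimal sketch of a compact Genetic Algorithm on OneMax. It follows the commonly described form of the cGA. Restricting the frequencies to $[1/n, 1 - 1/n]$, the stopping rule, and the names `cga_onemax`, `max_iters` and `seed` are assumptions made for this illustration, not details taken from the abstract.

```python
import random


def cga_onemax(n, K, max_iters=100000, seed=0):
    """Run a cGA on OneMax; return (iterations used, final frequency vector)."""
    rng = random.Random(seed)
    p = [0.5] * n                      # one frequency per bit position
    lo, hi = 1.0 / n, 1.0 - 1.0 / n    # assumed borders for the frequencies
    for t in range(1, max_iters + 1):
        # Sample two candidate bit strings from the current model.
        x = [1 if rng.random() < pi else 0 for pi in p]
        y = [1 if rng.random() < pi else 0 for pi in p]
        # OneMax fitness is the number of ones; let x be the winner.
        if sum(x) < sum(y):
            x, y = y, x
        # Shift each differing frequency by 1/K towards the winner's bit.
        for i in range(n):
            if x[i] != y[i]:
                p[i] += (1.0 / K) if x[i] == 1 else -(1.0 / K)
                p[i] = min(hi, max(lo, p[i]))
        if sum(x) == n:                # all-ones string sampled
            return t, p
    return max_iters, p


iterations, freqs = cga_onemax(n=20, K=50)
```

A small `K` means a large update strength: frequencies then move quickly, including in the wrong direction through random fluctuations, which is the genetic drift the abstract warns about.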
Adaptive and Efficient Nonlinear Channel Equalization for Underwater Acoustic Communication
Title | Adaptive and Efficient Nonlinear Channel Equalization for Underwater Acoustic Communication |
Authors | Dariush Kari, Nuri Denizcan Vanli, Suleyman Serdar Kozat |
Abstract | We investigate underwater acoustic (UWA) channel equalization and introduce hierarchical and adaptive nonlinear channel equalization algorithms that are highly efficient and provide significantly improved bit error rate (BER) performance. Due to the high complexity of nonlinear equalizers and the poor performance of linear ones, we employ piecewise linear equalizers to equalize highly difficult underwater acoustic channels. However, in order to achieve the performance of the best piecewise linear model, we use a tree structure to hierarchically partition the space of the received signal. Furthermore, the equalization algorithm should be completely adaptive, since, due to the highly non-stationary nature of the underwater medium, the optimal MSE equalizer as well as the best piecewise linear equalizer change over time. To this end, we introduce an adaptive piecewise linear equalization algorithm that not only adapts the linear equalizer at each region but also learns the complete hierarchical structure with a computational complexity only polynomial in the number of nodes of the tree. Furthermore, our algorithm is constructed to directly minimize the final squared error without introducing any ad-hoc parameters. We demonstrate the performance of our algorithms through highly realistic experiments performed on accurately simulated underwater acoustic channels. |
Tasks | |
Published | 2016-01-06 |
URL | http://arxiv.org/abs/1601.01218v1 |
http://arxiv.org/pdf/1601.01218v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-efficient-nonlinear-channel |
Repo | |
Framework | |
The Movie Graph Argument Revisited
Title | The Movie Graph Argument Revisited |
Authors | Russell K. Standish |
Abstract | In this paper, we reexamine the Movie Graph Argument, which demonstrates a basic incompatibility between computationalism and materialism. We discover that the incompatibility is only manifest in singular classical-like universes. If we accept that we live in a Multiverse, then the incompatibility goes away, but in that case another line of argument shows that with computationalism, the fundamental, or primitive, materiality has no causal influence on what is observed, which must be derivable from basic arithmetic properties. |
Tasks | |
Published | 2016-08-28 |
URL | http://arxiv.org/abs/1608.07764v1 |
http://arxiv.org/pdf/1608.07764v1.pdf | |
PWC | https://paperswithcode.com/paper/the-movie-graph-argument-revisited |
Repo | |
Framework | |
Fuzzy Constraints Linear Discriminant Analysis
Title | Fuzzy Constraints Linear Discriminant Analysis |
Authors | Hamid Reza Hassanzadeh, Hadi Sadoghi Yazdi, Abedin Vahedian |
Abstract | In this paper we introduce a fuzzy constraint linear discriminant analysis (FC-LDA). The FC-LDA tries to minimize misclassification error based on a modified perceptron criterion that helps handle the uncertainty near the decision boundary by means of a fuzzy linear programming approach with fuzzy resources. The proposed method has low computational complexity because of its linear characteristics and can deal with noisy data with different degrees of tolerance. The results obtained verify the success of the algorithm when dealing with different problems. Comparing FC-LDA with LDA shows the superiority of FC-LDA in classification tasks. |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09593v1 |
http://arxiv.org/pdf/1612.09593v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-constraints-linear-discriminant |
Repo | |
Framework | |
Bayesian Reinforcement Learning: A Survey
Title | Bayesian Reinforcement Learning: A Survey |
Authors | Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar |
Abstract | Bayesian methods for machine learning have been widely investigated, yielding principled methods for incorporating prior information into inference algorithms. In this survey, we provide an in-depth review of the role of Bayesian methods for the reinforcement learning (RL) paradigm. The major incentives for incorporating Bayesian reasoning in RL are: 1) it provides an elegant approach to action-selection (exploration/exploitation) as a function of the uncertainty in learning; and 2) it provides a machinery to incorporate prior knowledge into the algorithms. We first discuss models and methods for Bayesian inference in the simple single-step Bandit model. We then review the extensive recent literature on Bayesian methods for model-based RL, where prior information can be expressed on the parameters of the Markov model. We also present Bayesian methods for model-free RL, where priors are expressed over the value function or policy class. The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties. |
Tasks | Bayesian Inference |
Published | 2016-09-14 |
URL | http://arxiv.org/abs/1609.04436v1 |
http://arxiv.org/pdf/1609.04436v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-reinforcement-learning-a-survey |
Repo | |
Framework | |
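The survey abstract above starts from Bayesian inference in the single-step bandit model. One standard illustration of that setting is Thompson sampling with Beta priors over Bernoulli arms, sketched below. Whether and how the survey presents this particular algorithm is not stated in the abstract. The uniform Beta(1, 1) priors, the function name, and the simulated arm means are assumptions made for this example.

```python
import random


def thompson_bernoulli(true_means, rounds, seed=0):
    """Thompson sampling on simulated Bernoulli arms with Beta(1, 1) priors."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k   # 1 + observed successes per arm
    beta = [1.0] * k    # 1 + observed failures per arm
    pulls = [0] * k
    for _ in range(rounds):
        # Draw one plausible mean per arm from its posterior, then act greedily.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)
        # Simulated environment: reward is 1 with the arm's true probability.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate posterior update for the pulled arm.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, alpha, beta


pulls, alpha, beta = thompson_bernoulli([0.1, 0.9], rounds=2000)
```

Exploration here comes from posterior uncertainty alone: an arm with few observations has a wide posterior and is occasionally sampled high, which matches the abstract's point that action selection becomes a function of the uncertainty in learning.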