May 5, 2019

Paper Group ANR 530

A Unit Selection Methodology for Music Generation Using Deep Neural Networks


Title	A Unit Selection Methodology for Music Generation Using Deep Neural Networks
Authors	Mason Bretan, Gil Weinberg, Larry Heck
Abstract	Several methods exist for a computer to generate music based on data, including Markov chains, recurrent neural networks, recombinancy, and grammars. We explore the use of unit selection and concatenation as a means of generating music using a procedure based on ranking, where we consider a unit to be a variable-length number of measures of music. We first examine whether a unit selection method that is restricted to a finite-size unit library can be sufficient for encompassing a wide spectrum of music. We do this by developing a deep autoencoder that encodes a musical input and reconstructs the input by selecting from the library. We then describe a generative model that combines a deep structured semantic model (DSSM) with an LSTM to predict the next unit, where units consist of four, two, and one measures of music. We evaluate the generative model using objective metrics including mean rank and accuracy and with a subjective listening test in which expert musicians are asked to complete a forced-choice ranking task. We compare our model to a note-level generative baseline that consists of a stacked LSTM trained to predict forward by one note.
Tasks	Music Generation
Published	2016-12-12
URL	http://arxiv.org/abs/1612.03789v1
PDF	http://arxiv.org/pdf/1612.03789v1.pdf
PWC	https://paperswithcode.com/paper/a-unit-selection-methodology-for-music
Repo
Framework

DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size


Title	DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size
Authors	Maya Kabkab, Azadeh Alavi, Rama Chellappa
Abstract	Large-scale supervised classification algorithms, especially those based on deep convolutional neural networks (DCNNs), require vast amounts of training data to achieve state-of-the-art performance. Decreasing this data requirement would significantly speed up the training process and possibly improve generalization. Motivated by this objective, we consider the task of adaptively finding concise training subsets which will be iteratively presented to the learner. We use convex optimization methods, based on an objective criterion and feedback from the current performance of the classifier, to efficiently identify informative samples to train on. We propose an algorithm to decompose the optimization problem into smaller per-class problems, which can be solved in parallel. We test our approach on standard classification tasks and demonstrate its effectiveness in decreasing the training set size without compromising performance. We also show that our approach can make the classifier more robust in the presence of label noise and class imbalance.
Tasks
Published	2016-06-14
URL	http://arxiv.org/abs/1606.04232v1
PDF	http://arxiv.org/pdf/1606.04232v1.pdf
PWC	https://paperswithcode.com/paper/dcnns-on-a-diet-sampling-strategies-for
Repo
Framework

Submodular Variational Inference for Network Reconstruction


Title	Submodular Variational Inference for Network Reconstruction
Authors	Lin Chen, Forrest W Crawford, Amin Karbasi
Abstract	In real-world and online social networks, individuals receive and transmit information in real time. Cascading information transmissions (e.g. phone calls, text messages, social media posts) may be understood as a realization of a diffusion process operating on the network, and its branching path can be represented by a directed tree. The process only traverses and thus reveals a limited portion of the edges. The network reconstruction/inference problem is to infer the unrevealed connections. Most existing approaches derive a likelihood and attempt to find the network topology maximizing the likelihood, a problem that is highly intractable. In this paper, we focus on the network reconstruction problem for a broad class of real-world diffusion processes, exemplified by a network diffusion scheme called respondent-driven sampling (RDS). We prove that under realistic and general models of network diffusion, the posterior distribution of an observed RDS realization is a Bayesian log-submodular model. We then propose VINE (Variational Inference for Network rEconstruction), a novel, accurate, and computationally efficient variational inference algorithm, for the network reconstruction problem under this model. Crucially, we do not assume any particular probabilistic model for the underlying network. VINE recovers any connected graph with high accuracy as shown by our experimental results on real-life networks.
Tasks
Published	2016-03-29
URL	http://arxiv.org/abs/1603.08616v2
PDF	http://arxiv.org/pdf/1603.08616v2.pdf
PWC	https://paperswithcode.com/paper/submodular-variational-inference-for-network
Repo
Framework

Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods


Title	Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods
Authors	Mehrtash Harandi, Mathieu Salzmann, Richard Hartley
Abstract	Representing images and videos with Symmetric Positive Definite (SPD) matrices, and considering the Riemannian geometry of the resulting space, has been shown to yield high discriminative power in many visual recognition tasks. Unfortunately, computation on the Riemannian manifold of SPD matrices, especially of high-dimensional ones, comes at a high cost that limits the applicability of existing techniques. In this paper, we introduce algorithms able to handle high-dimensional SPD matrices by constructing a lower-dimensional SPD manifold. To this end, we propose to model the mapping from the high-dimensional SPD manifold to the low-dimensional one with an orthonormal projection. This lets us formulate dimensionality reduction as the problem of finding a projection that yields a low-dimensional manifold either with maximum discriminative power in the supervised scenario, or with maximum variance of the data in the unsupervised one. We show that learning can be expressed as an optimization problem on a Grassmann manifold and discuss fast solutions for special cases. Our evaluation on several classification tasks evidences that our approach leads to a significant accuracy gain over state-of-the-art methods.
Tasks	Dimensionality Reduction
Published	2016-05-20
URL	http://arxiv.org/abs/1605.06182v1
PDF	http://arxiv.org/pdf/1605.06182v1.pdf
PWC	https://paperswithcode.com/paper/dimensionality-reduction-on-spd-manifolds-the
Repo
Framework
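
The orthonormal projection described in the abstract above can be illustrated with a small sketch. It assumes the mapping takes the congruence form W^T X W, where W is an n x m matrix with orthonormal columns; the abstract does not spell out this form, so treat it as an assumption. The helper names (`matmul`, `transpose`, `project_spd`, `is_spd_2x2`) are hypothetical and chosen only for this illustration, and no learning of W is attempted here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def project_spd(x, w):
    """Map an n x n SPD matrix x to the m x m matrix W^T X W.

    Assumed form of the projection; w is n x m with orthonormal columns.
    """
    return matmul(matmul(transpose(w), x), w)


def is_spd_2x2(m, tol=1e-12):
    """Check symmetry and positive leading minors of a 2 x 2 matrix."""
    symmetric = abs(m[0][1] - m[1][0]) <= tol
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return symmetric and m[0][0] > 0 and det > 0


# A 3 x 3 SPD matrix and a 3 x 2 matrix with orthonormal columns.
X = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
W = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]

Y = project_spd(X, W)  # 2 x 2 matrix, still SPD
```

Because W has full column rank, W^T X W stays symmetric positive definite whenever X is, which is why the low-dimensional image is again a point on an SPD manifold.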

Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints


Title	Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints
Authors	Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson
Abstract	Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages. In this work, we describe two techniques to refine these probabilistic transcriptions: a noisy-channel model of non-native phone misperception is trained using a recurrent neural network, and decoded using minimally-resourced language-dependent pronunciation constraints. Both innovations improve quality of the transcript, and both innovations reduce phone error rate of a trained ASR, by 7% and 9%, respectively.
Tasks
Published	2016-12-13
URL	http://arxiv.org/abs/1612.03991v1
PDF	http://arxiv.org/pdf/1612.03991v1.pdf
PWC	https://paperswithcode.com/paper/performance-improvements-of-probabilistic
Repo
Framework

Aggressive actions and anger detection from multiple modalities using Kinect


Title	Aggressive actions and anger detection from multiple modalities using Kinect
Authors	Amol Patwardhan, Gerald Knapp
Abstract	Prison facilities, mental correctional institutions, sports bars and places of public protest are prone to sudden violence and conflicts. Surveillance systems play an important role in mitigation of hostile behavior and improvement of security by detecting such provocative and aggressive activities. This research proposed using automatic aggressive behavior and anger detection to improve the effectiveness of the surveillance systems. An emotion and aggression aware component will make the surveillance system highly responsive and capable of alerting the security guards in real time. This research proposed facial expression, head, hand and body movement and speech tracking for detecting anger and aggressive actions. Recognition was achieved using support vector machines and rule based features. The multimodal affect recognition precision rate for anger improved by 15.2% and recall rate improved by 11.7% when behavioral rule based features were used in aggressive action detection.
Tasks	Action Detection
Published	2016-07-05
URL	http://arxiv.org/abs/1607.01076v1
PDF	http://arxiv.org/pdf/1607.01076v1.pdf
PWC	https://paperswithcode.com/paper/aggressive-actions-and-anger-detection-from
Repo
Framework

Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis


Title	Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis
Authors	Valero Laparra, Sandra Jiménez, Gustavo Camps-Valls, Jesús Malo
Abstract	Mechanisms of human color vision are characterized by two phenomenological aspects: the system is nonlinear and adaptive to changing environments. Conventional attempts to derive these features from statistics use separate arguments for each aspect. The few statistical approaches that do consider both phenomena simultaneously follow parametric formulations based on empirical models. Therefore, it may be argued that the behavior does not come directly from the color statistics but from the convenient functional form adopted. In addition, many times the whole statistical analysis is based on simplified databases that disregard relevant physical effects in the input signal, as for instance by assuming flat Lambertian surfaces. Here we address the simultaneous statistical explanation of (i) the nonlinear behavior of achromatic and chromatic mechanisms in a fixed adaptation state, and (ii) the change of such behavior. Both phenomena emerge directly from the samples through a single data-driven method: the Sequential Principal Curves Analysis (SPCA) with local metric. SPCA is a new manifold learning technique to derive a set of sensors adapted to the manifold using different optimality criteria. A new database of colorimetrically calibrated images of natural objects under these illuminants was collected. The results obtained by applying SPCA show that the psychophysical behavior on color discrimination thresholds, discount of the illuminant and corresponding pairs in asymmetric color matching, emerge directly from realistic data regularities assuming no a priori functional form. These results provide stronger evidence for the hypothesis of a statistically driven organization of color sensors. Moreover, the obtained results suggest that color perception at this low abstraction level may be guided by an error minimization strategy rather than by the information maximization principle.
Tasks
Published	2016-01-31
URL	http://arxiv.org/abs/1602.00236v1
PDF	http://arxiv.org/pdf/1602.00236v1.pdf
PWC	https://paperswithcode.com/paper/nonlinearities-and-adaptation-of-color-vision
Repo
Framework

Macro-optimization of email recommendation response rates harnessing individual activity levels and group affinity trends


Title	Macro-optimization of email recommendation response rates harnessing individual activity levels and group affinity trends
Authors	Mohammed Korayem, Khalifeh Aljadda, Trey Grainger
Abstract	Recommendation emails are among the best ways to re-engage with customers after they have left a website. While on-site recommendation systems focus on finding the most relevant items for a user at the moment (right item), email recommendations add two critical additional dimensions: who to send recommendations to (right person) and when to send them (right time). It is critical that a recommendation email system not send too many emails to too many users in too short a time window, as users may unsubscribe from future emails or become desensitized and ignore future emails if they receive too many. Also, email service providers may mark such emails as spam if too many of their users are contacted in a short time window. Optimizing email recommendation systems such that they can yield a maximum response rate for a minimum number of email sends is thus critical for the long-term performance of such a system. In this paper, we present a novel recommendation email system that not only generates recommendations, but also leverages a combination of individual user activity data, as well as the behavior of the group to which they belong, in order to determine each user’s likelihood to respond to any given set of recommendations within a given time period. In doing this, we have effectively created a meta-recommendation system which recommends sets of recommendations in order to optimize the aggregate response rate of the entire system. The proposed technique has been applied successfully within CareerBuilder’s job recommendation email system to generate a 50% increase in total conversions while also decreasing sent emails by 72%.
Tasks	Recommendation Systems
Published	2016-09-20
URL	http://arxiv.org/abs/1609.05989v1
PDF	http://arxiv.org/pdf/1609.05989v1.pdf
PWC	https://paperswithcode.com/paper/macro-optimization-of-email-recommendation
Repo
Framework

Nonparametric Bayesian Storyline Detection from Microtexts


Title	Nonparametric Bayesian Storyline Detection from Microtexts
Authors	Vinodh Krishnan, Jacob Eisenstein
Abstract	News events and social media are composed of evolving storylines, which capture public attention for a limited period of time. Identifying storylines requires integrating temporal and linguistic information, and prior work takes a largely heuristic approach. We present a novel online non-parametric Bayesian framework for storyline detection, using the distance-dependent Chinese Restaurant Process (dd-CRP). To ensure efficient linear-time inference, we employ a fixed-lag Gibbs sampling procedure, which is novel for the dd-CRP. We evaluate on the TREC Twitter Timeline Generation (TTG), obtaining encouraging results: despite using a weak baseline retrieval model, the dd-CRP story clustering method is competitive with the best entries in the 2014 TTG task.
Tasks
Published	2016-01-18
URL	http://arxiv.org/abs/1601.04580v2
PDF	http://arxiv.org/pdf/1601.04580v2.pdf
PWC	https://paperswithcode.com/paper/nonparametric-bayesian-storyline-detection
Repo
Framework

Kernelized LRR on Grassmann Manifolds for Subspace Clustering


Title	Kernelized LRR on Grassmann Manifolds for Subspace Clustering
Authors	Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
Abstract	Low rank representation (LRR) has recently attracted great interest due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data. One of its successful applications is subspace clustering, by which data are clustered according to the subspaces they belong to. In this paper, at a higher level, we intend to cluster subspaces into classes of subspaces. This is naturally described as a clustering problem on Grassmann manifold. The novelty of this paper is to generalize LRR on Euclidean space onto an LRR model on Grassmann manifold in a uniform kernelized LRR framework. The new method has many applications in data analysis in computer vision tasks. The proposed models have been evaluated on a number of practical data analysis applications. The experimental results show that the proposed models outperform a number of state-of-the-art subspace clustering methods.
Tasks
Published	2016-01-09
URL	http://arxiv.org/abs/1601.02124v1
PDF	http://arxiv.org/pdf/1601.02124v1.pdf
PWC	https://paperswithcode.com/paper/kernelized-lrr-on-grassmann-manifolds-for
Repo
Framework

Parking Stall Vacancy Indicator System Based on Deep Convolutional Neural Networks


Title	Parking Stall Vacancy Indicator System Based on Deep Convolutional Neural Networks
Authors	Sepehr Valipour, Mennatullah Siam, Eleni Stroulia, Martin Jagersand
Abstract	Parking management systems, and vacancy-indication services in particular, can play a valuable role in reducing traffic and energy waste in large cities. Visual detection methods represent a cost-effective option, since they can take advantage of hardware usually already available in many parking lots, namely cameras. However, visual detection methods can be fragile and not easily generalizable. In this paper, we present a robust detection algorithm based on deep convolutional neural networks. We implemented and tested our algorithm on a large baseline dataset, and also on a set of image feeds from actual cameras already installed in parking lots. We have developed a fully functional system, from server-side image analysis to front-end user interface, to demonstrate the practicality of our method.
Tasks
Published	2016-06-30
URL	http://arxiv.org/abs/1606.09367v1
PDF	http://arxiv.org/pdf/1606.09367v1.pdf
PWC	https://paperswithcode.com/paper/parking-stall-vacancy-indicator-system-based
Repo
Framework

An Enhanced Harmony Search Method for Bangla Handwritten Character Recognition Using Region Sampling


Title	An Enhanced Harmony Search Method for Bangla Handwritten Character Recognition Using Region Sampling
Authors	Ritesh Sarkhel, Amit K Saha, Nibaran Das
Abstract	Identification of the minimum number of local regions of a handwritten character image, containing well-defined discriminating features which are sufficient for a minimal but complete description of the character, is a challenging task. A new region selection technique based on the idea of an enhanced Harmony Search methodology has been proposed here. The powerful framework of Harmony Search has been utilized to search the region space and detect only the most informative regions for correctly recognizing the handwritten character. The proposed method has been tested on handwritten samples of Bangla Basic, Compound and mixed (Basic and Compound characters) characters separately with an SVM-based classifier using a longest-run-based feature set obtained from the image subregions formed by a CG-based quad-tree partitioning approach. Applying this methodology to the above-mentioned three types of datasets, gains of 43.75%, 12.5% and 37.5%, respectively, have been achieved in terms of region reduction, and gains of 2.3%, 0.6% and 1.2% have been achieved in terms of recognition accuracy. The results show a sizeable reduction in the minimal number of descriptive regions as well as a significant increase in recognition accuracy for all the datasets using the proposed technique. Thus the time and cost related to feature extraction are decreased without dampening the corresponding recognition accuracy.
Tasks
Published	2016-05-02
URL	http://arxiv.org/abs/1605.00420v1
PDF	http://arxiv.org/pdf/1605.00420v1.pdf
PWC	https://paperswithcode.com/paper/an-enhanced-harmony-search-method-for-bangla
Repo
Framework

Escaping Local Optima using Crossover with Emergent or Reinforced Diversity


Title	Escaping Local Optima using Crossover with Emergent or Reinforced Diversity
Authors	Duc-Cuong Dang, Tobias Friedrich, Timo Kötzing, Martin S. Krejca, Per Kristian Lehre, Pietro S. Oliveto, Dirk Sudholt, Andrew M. Sutton
Abstract	Population diversity is essential for avoiding premature convergence in Genetic Algorithms (GAs) and for the effective use of crossover. Yet the dynamics of how diversity emerges in populations are not well understood. We use rigorous runtime analysis to gain insight into population dynamics and GA performance for the ($\mu$+1) GA and the $\text{Jump}_k$ test function. We show that the interplay of crossover and mutation may serve as a catalyst leading to a sudden burst of diversity. This leads to improvements of the expected optimisation time of order $\Omega(n/\log n)$ compared to mutation-only algorithms like (1+1) EA. Moreover, increasing the mutation rate by an arbitrarily small constant factor can facilitate the generation of diversity, leading to speedups of order $\Omega(n)$. We also compare seven commonly used diversity mechanisms and evaluate their impact on runtime bounds for the ($\mu$+1) GA. All previous results in this context only hold for unrealistically low crossover probability $p_c=O(k/n)$, while we give analyses for the setting of constant $p_c < 1$ in all but one case. For the typical case of constant $k > 2$ and constant $p_c$, we can compare the resulting expected runtimes for different diversity mechanisms assuming an optimal choice of $\mu$: $O(n^{k-1})$ for duplicate elimination/minim., $O(n^2\log n)$ for maximising the convex hull, $O(n\log n)$ for deterministic crowding (assuming $p_c = k/n$), $O(n\log n)$ for maximising Hamming distance, $O(n\log n)$ for fitness sharing, $O(n\log n)$ for single-receiver island model. This proves a sizeable advantage of all variants of the ($\mu$+1) GA compared to (1+1) EA, which requires time $\Theta(n^k)$. Experiments complement our theoretical findings and further highlight the benefits of crossover and diversity on $\text{Jump}_k$.
Tasks
Published	2016-08-10
URL	http://arxiv.org/abs/1608.03123v1
PDF	http://arxiv.org/pdf/1608.03123v1.pdf
PWC	https://paperswithcode.com/paper/escaping-local-optima-using-crossover-with
Repo
Framework
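
The $\text{Jump}_k$ test function and the mutation-only (1+1) EA mentioned in the abstract above can be sketched as follows. This is a minimal illustration assuming the standard textbook definition of $\text{Jump}_k$ (fitness $k + |x|_1$ when the number of ones is at most $n-k$ or equals $n$, and $n - |x|_1$ otherwise) and standard bit mutation at rate $1/n$; the function names, the `max_iters` budget and the `seed` parameter are hypothetical. The ($\mu$+1) GA with crossover and the diversity mechanisms analysed in the paper are not implemented here.

```python
import random


def jump(x, k):
    """Jump_k fitness of a bit list x (to be maximised)."""
    n = len(x)
    ones = sum(x)
    if ones <= n - k or ones == n:
        return k + ones
    # Inside the gap of width k just below the optimum, fitness drops.
    return n - ones


def one_plus_one_ea(n, k, max_iters, seed=0):
    """Mutation-only (1+1) EA; returns iterations used, or None on timeout."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    for t in range(max_iters):
        if sum(x) == n:
            return t
        # Standard bit mutation: flip each bit independently with prob. 1/n.
        y = [b ^ 1 if rng.random() < 1.0 / n else b for b in x]
        # Elitist selection: accept the offspring if it is not worse.
        if jump(y, k) >= jump(x, k):
            x = y
    return None
```

To leave the local optimum at $n-k$ ones, the (1+1) EA has to flip the $k$ remaining zero bits in a single mutation, which is the source of the $\Theta(n^k)$ runtime quoted in the abstract.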

Bacterial foraging optimization based brain magnetic resonance image segmentation


Title	Bacterial foraging optimization based brain magnetic resonance image segmentation
Authors	Abdul kayom Md Khairuzzaman
Abstract	Segmentation partitions an image into its constituent parts. It is essentially the pre-processing stage of image analysis and computer vision. In this work, T1- and T2-weighted brain magnetic resonance images are segmented using multilevel thresholding and the bacterial foraging optimization (BFO) algorithm. The thresholds are obtained by maximizing the between-class variance (multilevel Otsu method) of the image. The BFO algorithm is used to optimize the threshold searching process. The edges are then obtained from the thresholded image by comparing the intensity of each pixel with its eight-connected neighbourhood. Post-processing is performed to remove spurious responses in the segmented image. The proposed segmentation technique is evaluated using edge detector evaluation parameters such as figure of merit, Rand Index and variation of information. The proposed brain MR image segmentation technique outperforms traditional edge detectors such as Canny and Sobel.
Tasks	Semantic Segmentation
Published	2016-05-19
URL	http://arxiv.org/abs/1605.05815v1
PDF	http://arxiv.org/pdf/1605.05815v1.pdf
PWC	https://paperswithcode.com/paper/bacterial-foraging-optimization-based-brain
Repo
Framework
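
The between-class variance criterion named in the abstract above can be sketched for the simplest case. This is only an illustration under stated assumptions: it finds a single threshold by exhaustive search over a grey-level histogram, whereas the paper uses multiple thresholds and searches for them with BFO. The function name `otsu_threshold` and the toy histogram are hypothetical.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximises between-class variance.

    hist[i] is the number of pixels with grey level i.
    Class 0 holds levels <= t, class 1 holds levels > t.
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count in class 0
    sum0 = 0  # sum of grey levels in class 0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy bimodal histogram over grey levels 0..6 (made-up data).
toy_hist = [10, 10, 0, 0, 0, 10, 10]
threshold = otsu_threshold(toy_hist)
```

With several thresholds the exhaustive search grows combinatorially, which is the motivation for replacing it with a metaheuristic such as BFO.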

Enhancing ICA Performance by Exploiting Sparsity: Application to FMRI Analysis


Title	Enhancing ICA Performance by Exploiting Sparsity: Application to FMRI Analysis
Authors	Zois Boukouvalas, Yuri Levin-Schwartz, Tulay Adali
Abstract	Independent component analysis (ICA) is a powerful method for blind source separation based on the assumption that sources are statistically independent. Though ICA has proven useful and has been employed in many applications, complete statistical independence can be too restrictive an assumption in practice. Additionally, important prior information about the data, such as sparsity, is usually available. Sparsity is a natural property of the data, a form of diversity, which, if incorporated into the ICA model, can relax the independence assumption, resulting in an improvement in the overall separation performance. In this work, we propose a new variant of ICA by entropy bound minimization (ICA-EBM), a flexible yet parameter-free algorithm, through the direct exploitation of sparsity. Using this new SparseICA-EBM algorithm, we study the synergy of independence and sparsity through simulations on synthetic as well as functional magnetic resonance imaging (fMRI)-like data.
Tasks
Published	2016-10-19
URL	http://arxiv.org/abs/1610.06235v1
PDF	http://arxiv.org/pdf/1610.06235v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-ica-performance-by-exploiting
Repo
Framework