Paper Group ANR 530
A Unit Selection Methodology for Music Generation Using Deep Neural Networks. DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size. Submodular Variational Inference for Network Reconstruction. Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods. Performance Improvements of Probabilistic Transcript-a …
A Unit Selection Methodology for Music Generation Using Deep Neural Networks
Title | A Unit Selection Methodology for Music Generation Using Deep Neural Networks |
Authors | Mason Bretan, Gil Weinberg, Larry Heck |
Abstract | Several methods exist for a computer to generate music based on data including Markov chains, recurrent neural networks, recombinancy, and grammars. We explore the use of unit selection and concatenation as a means of generating music using a procedure based on ranking, where we consider a unit to be a variable-length number of measures of music. We first examine whether a unit selection method that is restricted to a finite-size unit library can be sufficient for encompassing a wide spectrum of music. We do this by developing a deep autoencoder that encodes a musical input and reconstructs the input by selecting from the library. We then describe a generative model that combines a deep structured semantic model (DSSM) with an LSTM to predict the next unit, where units consist of four, two, and one measures of music. We evaluate the generative model using objective metrics including mean rank and accuracy and with a subjective listening test in which expert musicians are asked to complete a forced-choice ranking task. We compare our model to a note-level generative baseline that consists of a stacked LSTM trained to predict forward by one note. |
Tasks | Music Generation |
Published | 2016-12-12 |
URL | http://arxiv.org/abs/1612.03789v1 |
http://arxiv.org/pdf/1612.03789v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unit-selection-methodology-for-music |
Repo | |
Framework | |
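The generative step described in the abstract above is at heart a ranking problem: embed the current musical context, score every unit in the library, and select the top-ranked one. Below is a minimal, illustrative sketch of that ranking step only, using cosine similarity over precomputed embeddings; the embedding model, library, and all names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def rank_units(context_embedding, unit_embeddings):
    """Rank library units by cosine similarity to the current musical context.

    context_embedding: 1-D array produced by some sequence model (e.g. an LSTM).
    unit_embeddings:   2-D array, one row per unit in the library.
    Both are assumed precomputed; this shows only the ranking step.
    """
    ctx = context_embedding / np.linalg.norm(context_embedding)
    lib = unit_embeddings / np.linalg.norm(unit_embeddings, axis=1, keepdims=True)
    scores = lib @ ctx                      # cosine similarity per unit
    return np.argsort(-scores)              # unit indices, best first

# Toy usage: a 16-unit library with 32-dimensional embeddings.
rng = np.random.default_rng(0)
library = rng.normal(size=(16, 32))
context = rng.normal(size=32)
ranking = rank_units(context, library)
print("Selected unit:", ranking[0])
```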
DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size
Title | DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size |
Authors | Maya Kabkab, Azadeh Alavi, Rama Chellappa |
Abstract | Large-scale supervised classification algorithms, especially those based on deep convolutional neural networks (DCNNs), require vast amounts of training data to achieve state-of-the-art performance. Decreasing this data requirement would significantly speed up the training process and possibly improve generalization. Motivated by this objective, we consider the task of adaptively finding concise training subsets which will be iteratively presented to the learner. We use convex optimization methods, based on an objective criterion and feedback from the current performance of the classifier, to efficiently identify informative samples to train on. We propose an algorithm to decompose the optimization problem into smaller per-class problems, which can be solved in parallel. We test our approach on standard classification tasks and demonstrate its effectiveness in decreasing the training set size without compromising performance. We also show that our approach can make the classifier more robust in the presence of label noise and class imbalance. |
Tasks | |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04232v1 |
http://arxiv.org/pdf/1606.04232v1.pdf | |
PWC | https://paperswithcode.com/paper/dcnns-on-a-diet-sampling-strategies-for |
Repo | |
Framework | |
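As a rough illustration of feedback-driven subset selection, the toy sketch below keeps, for each class, the samples the current classifier finds hardest (highest loss). This greedy per-class heuristic only mimics the spirit of the paper's decomposition into per-class subproblems; the actual method solves convex optimization problems and is not reproduced here.

```python
import numpy as np

def select_subset(losses, labels, budget_per_class):
    """Keep, per class, the `budget_per_class` samples with highest loss.
    A crude stand-in for the paper's per-class convex subproblems; the
    losses are the classifier's feedback on the full training set."""
    selected = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        hardest = idx[np.argsort(-losses[idx])[:budget_per_class]]
        selected.extend(hardest.tolist())
    return np.array(selected)

# Toy usage: 100 samples, 4 classes, keep the 5 hardest per class.
rng = np.random.default_rng(1)
losses = rng.random(100)
labels = rng.integers(0, 4, size=100)
subset = select_subset(losses, labels, budget_per_class=5)
print(len(subset), "samples retained for the next training round")
```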
Submodular Variational Inference for Network Reconstruction
Title | Submodular Variational Inference for Network Reconstruction |
Authors | Lin Chen, Forrest W Crawford, Amin Karbasi |
Abstract | In real-world and online social networks, individuals receive and transmit information in real time. Cascading information transmissions (e.g. phone calls, text messages, social media posts) may be understood as a realization of a diffusion process operating on the network, and its branching path can be represented by a directed tree. The process only traverses and thus reveals a limited portion of the edges. The network reconstruction/inference problem is to infer the unrevealed connections. Most existing approaches derive a likelihood and attempt to find the network topology maximizing the likelihood, a problem that is highly intractable. In this paper, we focus on the network reconstruction problem for a broad class of real-world diffusion processes, exemplified by a network diffusion scheme called respondent-driven sampling (RDS). We prove that under realistic and general models of network diffusion, the posterior distribution of an observed RDS realization is a Bayesian log-submodular model. We then propose VINE (Variational Inference for Network rEconstruction), a novel, accurate, and computationally efficient variational inference algorithm, for the network reconstruction problem under this model. Crucially, we do not assume any particular probabilistic model for the underlying network. VINE recovers any connected graph with high accuracy as shown by our experimental results on real-life networks. |
Tasks | |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08616v2 |
http://arxiv.org/pdf/1603.08616v2.pdf | |
PWC | https://paperswithcode.com/paper/submodular-variational-inference-for-network |
Repo | |
Framework | |
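For readers unfamiliar with the term, a Bayesian log-submodular model places a Gibbs-style distribution over subsets whose log-probability is a submodular set function. The sketch below shows the generic shape of such a posterior over candidate edge sets (notation is illustrative, not taken from the paper); the intractable normalizer is what motivates variational inference.

```latex
% Generic log-submodular posterior over a candidate edge set A (a subset of
% the unobserved edges E); notation is illustrative, not the paper's.  F is a
% submodular set function determined by the observed diffusion, and the
% normaliser Z is the intractable sum that variational inference sidesteps.
\[
  P(A \mid \text{observed cascade}) = \frac{\exp\!\big(F(A)\big)}{Z},
  \qquad
  Z = \sum_{A' \subseteq E} \exp\!\big(F(A')\big).
\]
% Submodularity of F is the diminishing-returns property:
\[
  F(A \cup \{e\}) - F(A) \;\ge\; F(B \cup \{e\}) - F(B)
  \qquad \text{for all } A \subseteq B \subseteq E,\; e \notin B.
\]
```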
Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods
Title | Dimensionality Reduction on SPD Manifolds: The Emergence of Geometry-Aware Methods |
Authors | Mehrtash Harandi, Mathieu Salzmann, Richard Hartley |
Abstract | Representing images and videos with Symmetric Positive Definite (SPD) matrices, and considering the Riemannian geometry of the resulting space, has been shown to yield high discriminative power in many visual recognition tasks. Unfortunately, computation on the Riemannian manifold of SPD matrices, especially of high-dimensional ones, comes at a high cost that limits the applicability of existing techniques. In this paper, we introduce algorithms able to handle high-dimensional SPD matrices by constructing a lower-dimensional SPD manifold. To this end, we propose to model the mapping from the high-dimensional SPD manifold to the low-dimensional one with an orthonormal projection. This lets us formulate dimensionality reduction as the problem of finding a projection that yields a low-dimensional manifold either with maximum discriminative power in the supervised scenario, or with maximum variance of the data in the unsupervised one. We show that learning can be expressed as an optimization problem on a Grassmann manifold and discuss fast solutions for special cases. Our evaluation on several classification tasks evidences that our approach leads to a significant accuracy gain over state-of-the-art methods. |
Tasks | Dimensionality Reduction |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06182v1 |
http://arxiv.org/pdf/1605.06182v1.pdf | |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-on-spd-manifolds-the |
Repo | |
Framework | |
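The dimensionality-reduction map itself is simple to state: a high-dimensional SPD matrix X is sent to W^T X W for an orthonormal W, and the result is again SPD. The snippet below only illustrates that projection on random data; learning W on the Grassmann manifold, which is the paper's actual contribution, is not shown, and all sizes and names are illustrative.

```python
import numpy as np

def project_spd(X, W):
    """Map a high-dimensional SPD matrix X (n x n) to a low-dimensional SPD
    matrix (m x m) through an orthonormal projection W (n x m, W^T W = I),
    i.e. the W^T X W mapping described in the abstract."""
    return W.T @ X @ W

# Toy usage: build a random 50x50 SPD matrix and project it to 5x5.
rng = np.random.default_rng(2)
A = rng.normal(size=(50, 50))
X = A @ A.T + 1e-3 * np.eye(50)                  # SPD by construction
W, _ = np.linalg.qr(rng.normal(size=(50, 5)))    # orthonormal columns
Y = project_spd(X, W)
print("low-dim eigenvalues all positive:", np.all(np.linalg.eigvalsh(Y) > 0))
```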
Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints
Title | Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints |
Authors | Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson |
Abstract | Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages. In this work, we describe two techniques to refine these probabilistic transcriptions: a noisy-channel model of non-native phone misperception is trained using a recurrent neural network, and decoded using minimally-resourced language-dependent pronunciation constraints. Both innovations improve the quality of the transcript, and both innovations reduce the phone error rate of a trained ASR, by 7% and 9% respectively. |
Tasks | |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.03991v1 |
http://arxiv.org/pdf/1612.03991v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-improvements-of-probabilistic |
Repo | |
Framework | |
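The noisy-channel view of mismatched transcription can be illustrated with a tiny phone confusion matrix: pick the spoken phone maximizing prior times misperception likelihood, restricted to the phones a language-specific constraint allows. Everything below (phone inventory, matrix values, constraint set) is invented for illustration; the paper trains the misperception model with a recurrent network and uses real pronunciation constraints.

```python
import numpy as np

# Toy noisy-channel decoding of a probabilistic transcript.
phones = ["p", "b", "t", "d"]
# P(annotator heard column | speaker produced row) -- made-up values
confusion = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.2, 0.7, 0.0, 0.1],
    [0.1, 0.0, 0.7, 0.2],
    [0.0, 0.1, 0.2, 0.7],
])
prior = np.array([0.25, 0.25, 0.25, 0.25])    # language model over phones
allowed = {"p", "t", "d"}                     # language-specific constraint

def decode(heard):
    """Return argmax_s P(s) * P(heard | s), restricted to allowed phones."""
    h = phones.index(heard)
    scores = prior * confusion[:, h]
    for i, s in enumerate(phones):
        if s not in allowed:
            scores[i] = 0.0
    return phones[int(np.argmax(scores))]

print([decode(h) for h in ["b", "t", "p"]])   # "b" is mapped to a legal phone
```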
Aggressive actions and anger detection from multiple modalities using Kinect
Title | Aggressive actions and anger detection from multiple modalities using Kinect |
Authors | Amol Patwardhan, Gerald Knapp |
Abstract | Prison facilities, mental correctional institutions, sports bars and places of public protest are prone to sudden violence and conflicts. Surveillance systems play an important role in mitigation of hostile behavior and improvement of security by detecting such provocative and aggressive activities. This research proposed using automatic aggressive behavior and anger detection to improve the effectiveness of the surveillance systems. An emotion and aggression aware component will make the surveillance system highly responsive and capable of alerting the security guards in real time. This research proposed facial expression, head, hand and body movement and speech tracking for detecting anger and aggressive actions. Recognition was achieved using support vector machines and rule-based features. The multimodal affect recognition precision rate for anger improved by 15.2% and recall rate improved by 11.7% when behavioral rule-based features were used in aggressive action detection. |
Tasks | Action Detection |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01076v1 |
http://arxiv.org/pdf/1607.01076v1.pdf | |
PWC | https://paperswithcode.com/paper/aggressive-actions-and-anger-detection-from |
Repo | |
Framework | |
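The classification stage described above (an SVM over concatenated multimodal features plus binary rule-based flags) can be sketched on random stand-in data as follows; the feature definitions, dimensions, and labels are all invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Each row concatenates made-up per-frame descriptors from several modalities
# (face, head, hands, body, speech) plus binary rule-based flags
# (e.g. "arm raised above head").
rng = np.random.default_rng(3)
motion_features = rng.normal(size=(200, 12))          # continuous descriptors
rule_flags = rng.integers(0, 2, size=(200, 4))        # behavioural rules fired
X = np.hstack([motion_features, rule_flags])
y = rng.integers(0, 2, size=200)                      # 1 = aggressive/angry

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy on random toy data:", clf.score(X_te, y_te))
```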
Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis
Title | Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis |
Authors | Valero Laparra, Sandra Jiménez, Gustavo Camps-Valls, Jesús Malo |
Abstract | Mechanisms of human color vision are characterized by two phenomenological aspects: the system is nonlinear and adaptive to changing environments. Conventional attempts to derive these features from statistics use separate arguments for each aspect. The few statistical approaches that do consider both phenomena simultaneously follow parametric formulations based on empirical models. Therefore, it may be argued that the behavior does not come directly from the color statistics but from the convenient functional form adopted. In addition, many times the whole statistical analysis is based on simplified databases that disregard relevant physical effects in the input signal, as for instance by assuming flat Lambertian surfaces. Here we address the simultaneous statistical explanation of (i) the nonlinear behavior of achromatic and chromatic mechanisms in a fixed adaptation state, and (ii) the change of such behavior. Both phenomena emerge directly from the samples through a single data-driven method: the Sequential Principal Curves Analysis (SPCA) with local metric. SPCA is a new manifold learning technique to derive a set of sensors adapted to the manifold using different optimality criteria. A new database of colorimetrically calibrated images of natural objects under different illuminants was collected. The results obtained by applying SPCA show that the psychophysical behavior on color discrimination thresholds, discount of the illuminant and corresponding pairs in asymmetric color matching, emerge directly from realistic data regularities assuming no a priori functional form. These results provide stronger evidence for the hypothesis of a statistically driven organization of color sensors. Moreover, the obtained results suggest that color perception at this low abstraction level may be guided by an error minimization strategy rather than by the information maximization principle. |
Tasks | |
Published | 2016-01-31 |
URL | http://arxiv.org/abs/1602.00236v1 |
http://arxiv.org/pdf/1602.00236v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinearities-and-adaptation-of-color-vision |
Repo | |
Framework | |
Macro-optimization of email recommendation response rates harnessing individual activity levels and group affinity trends
Title | Macro-optimization of email recommendation response rates harnessing individual activity levels and group affinity trends |
Authors | Mohammed Korayem, Khalifeh Aljadda, Trey Grainger |
Abstract | Recommendation emails are among the best ways to re-engage with customers after they have left a website. While on-site recommendation systems focus on finding the most relevant items for a user at the moment (right item), email recommendations add two critical additional dimensions: who to send recommendations to (right person) and when to send them (right time). It is critical that a recommendation email system not send too many emails to too many users in too short of a time-window, as users may unsubscribe from future emails or become desensitized and ignore future emails if they receive too many. Also, email service providers may mark such emails as spam if too many of their users are contacted in a short time-window. Optimizing email recommendation systems such that they can yield a maximum response rate for a minimum number of email sends is thus critical for the long-term performance of such a system. In this paper, we present a novel recommendation email system that not only generates recommendations, but which also leverages a combination of individual user activity data, as well as the behavior of the group to which they belong, in order to determine each user’s likelihood to respond to any given set of recommendations within a given time period. In doing this, we have effectively created a meta-recommendation system which recommends sets of recommendations in order to optimize the aggregate response rate of the entire system. The proposed technique has been applied successfully within CareerBuilder’s job recommendation email system to generate a 50% increase in total conversions while also decreasing sent emails by 72%. |
Tasks | Recommendation Systems |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.05989v1 |
http://arxiv.org/pdf/1609.05989v1.pdf | |
PWC | https://paperswithcode.com/paper/macro-optimization-of-email-recommendation |
Repo | |
Framework | |
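A minimal sketch of the "right person, right time" decision: blend an individual activity signal with a group-level affinity trend into a response likelihood and send only when it clears a threshold. The weighting scheme, field names, and threshold below are illustrative guesses, not the production logic described in the paper.

```python
from dataclasses import dataclass

@dataclass
class User:
    recent_logins: int       # individual activity signal
    group_open_rate: float   # response trend of the user's segment/group

def send_probability(user, w_individual=0.6, w_group=0.4):
    """Blend individual activity and group-level affinity into a rough
    likelihood of responding this period.  Weights and the saturation at
    5 logins are illustrative, not values from the paper."""
    activity = min(user.recent_logins / 5.0, 1.0)
    return w_individual * activity + w_group * user.group_open_rate

def should_email(user, threshold=0.5):
    return send_probability(user) >= threshold

users = [User(recent_logins=0, group_open_rate=0.2),
         User(recent_logins=4, group_open_rate=0.6)]
print([should_email(u) for u in users])   # suppress the dormant user's email
```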
Nonparametric Bayesian Storyline Detection from Microtexts
Title | Nonparametric Bayesian Storyline Detection from Microtexts |
Authors | Vinodh Krishnan, Jacob Eisenstein |
Abstract | News events and social media are composed of evolving storylines, which capture public attention for a limited period of time. Identifying storylines requires integrating temporal and linguistic information, and prior work takes a largely heuristic approach. We present a novel online non-parametric Bayesian framework for storyline detection, using the distance-dependent Chinese Restaurant Process (dd-CRP). To ensure efficient linear-time inference, we employ a fixed-lag Gibbs sampling procedure, which is novel for the dd-CRP. We evaluate on the TREC Twitter Timeline Generation (TTG), obtaining encouraging results: despite using a weak baseline retrieval model, the dd-CRP story clustering method is competitive with the best entries in the 2014 TTG task. |
Tasks | |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04580v2 |
http://arxiv.org/pdf/1601.04580v2.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-bayesian-storyline-detection |
Repo | |
Framework | |
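In a distance-dependent CRP, each item links to another item with probability proportional to a decay function of their distance (here, time difference), or to itself with probability proportional to a concentration parameter; storylines are then the connected components of the link graph. The sketch below samples from that prior only, with a backward-in-time restriction and parameter values chosen for illustration; the paper's fixed-lag Gibbs sampler additionally conditions on the text likelihood.

```python
import numpy as np

def sample_ddcrp_links(timestamps, alpha=1.0, decay=1.0, rng=None):
    """Draw link assignments from a dd-CRP prior: story i links to an earlier
    story j with probability proportional to exp(-decay * |t_i - t_j|), or to
    itself with probability proportional to alpha."""
    rng = rng or np.random.default_rng()
    n = len(timestamps)
    links = np.zeros(n, dtype=int)
    for i in range(n):
        weights = np.exp(-decay * np.abs(timestamps[i] - timestamps[:i]))
        probs = np.append(weights, alpha)       # last slot = self-link
        probs /= probs.sum()
        choice = rng.choice(i + 1, p=probs)
        links[i] = i if choice == i else choice
    return links

def clusters_from_links(links):
    """Follow each link chain back to its representative to get the partition."""
    def root(i):
        while links[i] != i:    # links always point backward, so this terminates
            i = links[i]
        return i
    return [root(i) for i in range(len(links))]

t = np.array([0.0, 0.1, 0.2, 5.0, 5.1])
links = sample_ddcrp_links(t, rng=np.random.default_rng(4))
print(clusters_from_links(links))   # nearby timestamps tend to share a storyline
```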
Kernelized LRR on Grassmann Manifolds for Subspace Clustering
Title | Kernelized LRR on Grassmann Manifolds for Subspace Clustering |
Authors | Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin |
Abstract | Low rank representation (LRR) has recently attracted great interest due to its pleasing efficacy in exploring low-dimensional subspace structures embedded in data. One of its successful applications is subspace clustering, by which data are clustered according to the subspaces they belong to. In this paper, at a higher level, we intend to cluster subspaces into classes of subspaces. This is naturally described as a clustering problem on Grassmann manifold. The novelty of this paper is to generalize LRR on Euclidean space onto an LRR model on Grassmann manifold in a uniform kernelized LRR framework. The new method has many applications in data analysis in computer vision tasks. The proposed models have been evaluated on a number of practical data analysis applications. The experimental results show that the proposed models outperform a number of state-of-the-art subspace clustering methods. |
Tasks | |
Published | 2016-01-09 |
URL | http://arxiv.org/abs/1601.02124v1 |
http://arxiv.org/pdf/1601.02124v1.pdf | |
PWC | https://paperswithcode.com/paper/kernelized-lrr-on-grassmann-manifolds-for |
Repo | |
Framework | |
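For context, the classical Euclidean LRR model that the paper generalizes is the nuclear-norm program below (standard LRR notation, not copied from the paper); the Grassmann version keeps the same low-rank objective but replaces the Euclidean inner products with a kernel between subspaces, i.e. points on the Grassmann manifold.

```latex
% Classical (Euclidean) low rank representation: represent the data X as a
% low-rank combination of itself plus a column-sparse error E, then apply
% spectral clustering to the affinity built from |Z|.
\[
  \min_{Z,\,E} \;\; \|Z\|_{*} + \lambda \,\|E\|_{2,1}
  \qquad \text{s.t.} \qquad X = XZ + E .
\]
```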
Parking Stall Vacancy Indicator System Based on Deep Convolutional Neural Networks
Title | Parking Stall Vacancy Indicator System Based on Deep Convolutional Neural Networks |
Authors | Sepehr Valipour, Mennatullah Siam, Eleni Stroulia, Martin Jagersand |
Abstract | Parking management systems, and vacancy-indication services in particular, can play a valuable role in reducing traffic and energy waste in large cities. Visual detection methods represent a cost-effective option, since they can take advantage of hardware usually already available in many parking lots, namely cameras. However, visual detection methods can be fragile and not easily generalizable. In this paper, we present a robust detection algorithm based on deep convolutional neural networks. We implemented and tested our algorithm on a large baseline dataset, and also on a set of image feeds from actual cameras already installed in parking lots. We have developed a fully functional system, from server-side image analysis to front-end user interface, to demonstrate the practicality of our method. |
Tasks | |
Published | 2016-06-30 |
URL | http://arxiv.org/abs/1606.09367v1 |
http://arxiv.org/pdf/1606.09367v1.pdf | |
PWC | https://paperswithcode.com/paper/parking-stall-vacancy-indicator-system-based |
Repo | |
Framework | |
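At its core the visual detector is a per-stall binary classifier: crop each stall from the camera frame and pass it through a small CNN that outputs vacant vs. occupied. The sketch below is a generic stand-in for such a classifier; the architecture, input size, and untrained random weights are illustrative, not the paper's network.

```python
import torch
import torch.nn as nn

class StallNet(nn.Module):
    """Tiny CNN mapping a cropped 3x64x64 stall image to vacant/occupied logits."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)   # vacant vs. occupied

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

model = StallNet()
patches = torch.randn(8, 3, 64, 64)           # a batch of cropped stall images
logits = model(patches)
print(logits.argmax(dim=1))                   # predicted class per stall
```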
An Enhanced Harmony Search Method for Bangla Handwritten Character Recognition Using Region Sampling
Title | An Enhanced Harmony Search Method for Bangla Handwritten Character Recognition Using Region Sampling |
Authors | Ritesh Sarkhel, Amit K Saha, Nibaran Das |
Abstract | Identification of minimum number of local regions of a handwritten character image, containing well-defined discriminating features which are sufficient for a minimal but complete description of the character is a challenging task. A new region selection technique based on the idea of an enhanced Harmony Search methodology has been proposed here. The powerful framework of Harmony Search has been utilized to search the region space and detect only the most informative regions for correctly recognizing the handwritten character. The proposed method has been tested on handwritten samples of Bangla Basic, Compound and mixed (Basic and Compound characters) characters separately with an SVM-based classifier using a longest-run-based feature set obtained from the image subregions formed by a CG-based quad-tree partitioning approach. Applying this methodology to the above-mentioned three types of datasets, gains of 43.75%, 12.5% and 37.5% respectively have been achieved in terms of region reduction, and gains of 2.3%, 0.6% and 1.2% in terms of recognition accuracy. The results show a sizeable reduction in the minimal number of descriptive regions as well as a significant increase in recognition accuracy for all the datasets using the proposed technique. Thus the time and cost related to feature extraction is decreased without dampening the corresponding recognition accuracy. |
Tasks | |
Published | 2016-05-02 |
URL | http://arxiv.org/abs/1605.00420v1 |
http://arxiv.org/pdf/1605.00420v1.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-harmony-search-method-for-bangla |
Repo | |
Framework | |
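Harmony Search itself is easy to sketch: keep a memory of candidate solutions, improvise new ones by mixing memorized values (with probability HMCR), occasionally pitch-adjusting them (with probability PAR), and replace the worst memory entry when the new harmony scores better. The generic version below searches over binary region-selection masks with a toy fitness; parameter values and the encoding are illustrative, not the paper's enhanced variant.

```python
import random

def harmony_search(num_regions, fitness, memory_size=10, hmcr=0.9, par=0.3,
                   iterations=200, seed=0):
    """Generic Harmony Search over binary region-selection masks.  `fitness`
    scores a mask (e.g. classifier accuracy minus a penalty on the number
    of selected regions)."""
    rng = random.Random(seed)
    def new_random():
        return [rng.randint(0, 1) for _ in range(num_regions)]
    memory = [new_random() for _ in range(memory_size)]
    for _ in range(iterations):
        harmony = []
        for d in range(num_regions):
            if rng.random() < hmcr:                 # take a value from memory...
                bit = rng.choice(memory)[d]
                if rng.random() < par:              # ...and maybe pitch-adjust it
                    bit = 1 - bit
            else:                                   # or improvise a fresh value
                bit = rng.randint(0, 1)
            harmony.append(bit)
        worst = min(range(memory_size), key=lambda i: fitness(memory[i]))
        if fitness(harmony) > fitness(memory[worst]):
            memory[worst] = harmony
    return max(memory, key=fitness)

# Toy fitness: prefer masks that keep regions 0-3 and drop the rest.
toy_fitness = lambda mask: sum(mask[:4]) - 0.5 * sum(mask[4:])
print(harmony_search(num_regions=8, fitness=toy_fitness))
```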
Escaping Local Optima using Crossover with Emergent or Reinforced Diversity
Title | Escaping Local Optima using Crossover with Emergent or Reinforced Diversity |
Authors | Duc-Cuong Dang, Tobias Friedrich, Timo Kötzing, Martin S. Krejca, Per Kristian Lehre, Pietro S. Oliveto, Dirk Sudholt, Andrew M. Sutton |
Abstract | Population diversity is essential for avoiding premature convergence in Genetic Algorithms (GAs) and for the effective use of crossover. Yet the dynamics of how diversity emerges in populations are not well understood. We use rigorous runtime analysis to gain insight into population dynamics and GA performance for the ($\mu$+1) GA and the $\text{Jump}_k$ test function. We show that the interplay of crossover and mutation may serve as a catalyst leading to a sudden burst of diversity. This leads to improvements of the expected optimisation time of order $\Omega(n/\log n)$ compared to mutation-only algorithms like (1+1) EA. Moreover, increasing the mutation rate by an arbitrarily small constant factor can facilitate the generation of diversity, leading to speedups of order $\Omega(n)$. We also compare seven commonly used diversity mechanisms and evaluate their impact on runtime bounds for the ($\mu$+1) GA. All previous results in this context only hold for unrealistically low crossover probability $p_c=O(k/n)$, while we give analyses for the setting of constant $p_c < 1$ in all but one case. For the typical case of constant $k > 2$ and constant $p_c$, we can compare the resulting expected runtimes for different diversity mechanisms assuming an optimal choice of $\mu$: $O(n^{k-1})$ for duplicate elimination/minim., $O(n^2\log n)$ for maximising the convex hull, $O(n\log n)$ for deterministic crowding (assuming $p_c = k/n$), $O(n\log n)$ for maximising Hamming distance, $O(n\log n)$ for fitness sharing, $O(n\log n)$ for single-receiver island model. This proves a sizeable advantage of all variants of the ($\mu$+1) GA compared to (1+1) EA, which requires time $\Theta(n^k)$. Experiments complement our theoretical findings and further highlight the benefits of crossover and diversity on $\text{Jump}_k$. |
Tasks | |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03123v1 |
http://arxiv.org/pdf/1608.03123v1.pdf | |
PWC | https://paperswithcode.com/paper/escaping-local-optima-using-crossover-with |
Repo | |
Framework | |
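The $\text{Jump}_k$ benchmark referenced above is easy to state: fitness grows with the number of ones up to $n-k$ ones, then drops into a gap that only a simultaneous $k$-bit change (or crossover between sufficiently diverse parents) can cross, with the all-ones string as the global optimum. A small sketch of the function, with an example of the gap:

```python
def jump_k(x, k):
    """Jump_k from the runtime-analysis literature: k + |x|_1 if the number
    of ones is at most n - k or exactly n, and n - |x|_1 inside the gap."""
    n, ones = len(x), sum(x)
    if ones <= n - k or ones == n:
        return k + ones
    return n - ones

# With n = 10 and k = 3, a string with 8 ones scores worse than one with 7,
# so a hill-climber stalls at the edge of the gap.
print(jump_k([1] * 7 + [0] * 3, k=3))   # 10
print(jump_k([1] * 8 + [0] * 2, k=3))   # 2
print(jump_k([1] * 10, k=3))            # 13 (global optimum)
```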
Bacterial foraging optimization based brain magnetic resonance image segmentation
Title | Bacterial foraging optimization based brain magnetic resonance image segmentation |
Authors | Abdul kayom Md Khairuzzaman |
Abstract | Segmentation partitions an image into its constituent parts. It is essentially the pre-processing stage of image analysis and computer vision. In this work, T1 and T2 weighted brain magnetic resonance images are segmented using multilevel thresholding and the bacterial foraging optimization (BFO) algorithm. The thresholds are obtained by maximizing the between-class variance (multilevel Otsu method) of the image. The BFO algorithm is used to optimize the threshold searching process. The edges are then obtained from the thresholded image by comparing the intensity of each pixel with its eight connected neighbourhood. Post processing is performed to remove spurious responses in the segmented image. The proposed segmentation technique is evaluated using edge detector evaluation parameters such as figure of merit, Rand Index and variation of information. The proposed brain MR image segmentation technique outperforms the traditional edge detectors such as Canny and Sobel. |
Tasks | Semantic Segmentation |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.05815v1 |
http://arxiv.org/pdf/1605.05815v1.pdf | |
PWC | https://paperswithcode.com/paper/bacterial-foraging-optimization-based-brain |
Repo | |
Framework | |
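The objective that the bacterial foraging search maximizes is the multilevel Otsu criterion: the weighted between-class variance of the grey-level classes induced by a threshold vector. The snippet below computes that criterion on a toy bimodal histogram; the BFO search over threshold vectors, and the edge and post-processing steps, are not shown.

```python
import numpy as np

def between_class_variance(hist, thresholds):
    """Multilevel Otsu objective: weighted between-class variance of the
    grey-level classes induced on a histogram by the given thresholds."""
    hist = hist / hist.sum()
    levels = np.arange(len(hist))
    mu_total = np.sum(levels * hist)
    edges = [0] + sorted(thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = hist[lo:hi].sum()
        if w > 0:
            mu = np.sum(levels[lo:hi] * hist[lo:hi]) / w
            variance += w * (mu - mu_total) ** 2
    return variance

# Toy usage: a bimodal histogram is best split near the valley between modes.
rng = np.random.default_rng(5)
pixels = np.concatenate([rng.normal(60, 10, 5000), rng.normal(180, 15, 5000)])
hist, _ = np.histogram(pixels.clip(0, 255), bins=256, range=(0, 256))
print(between_class_variance(hist, [120]) > between_class_variance(hist, [30]))
```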
Enhancing ICA Performance by Exploiting Sparsity: Application to FMRI Analysis
Title | Enhancing ICA Performance by Exploiting Sparsity: Application to FMRI Analysis |
Authors | Zois Boukouvalas, Yuri Levin-Schwartz, Tulay Adali |
Abstract | Independent component analysis (ICA) is a powerful method for blind source separation based on the assumption that sources are statistically independent. Though ICA has proven useful and has been employed in many applications, complete statistical independence can be too restrictive an assumption in practice. Additionally, important prior information about the data, such as sparsity, is usually available. Sparsity is a natural property of the data, a form of diversity, which, if incorporated into the ICA model, can relax the independence assumption, resulting in an improvement in the overall separation performance. In this work, we propose a new variant of ICA by entropy bound minimization (ICA-EBM), a flexible yet parameter-free algorithm, through the direct exploitation of sparsity. Using this new SparseICA-EBM algorithm, we study the synergy of independence and sparsity through simulations on synthetic as well as functional magnetic resonance imaging (fMRI)-like data. |
Tasks | |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.06235v1 |
http://arxiv.org/pdf/1610.06235v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-ica-performance-by-exploiting |
Repo | |
Framework | |
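As background, a generic way to couple independence and sparsity in ICA is to add a sparsity-promoting term to a standard ICA cost, as in the illustrative objective below; the specific entropy bounds and weighting used by SparseICA-EBM are in the paper, and this only shows the general shape of such a formulation.

```latex
% Noiseless ICA model and an illustrative independence-plus-sparsity cost
% (generic, not the SparseICA-EBM objective): W demixes x, the first two
% terms are the usual maximum-likelihood / mutual-information ICA cost, and
% the last term rewards sparse source estimates.
\[
  \mathbf{x} = A\,\mathbf{s}, \qquad \hat{\mathbf{s}} = W\mathbf{x},
\]
\[
  J(W) \;=\; \sum_{k=1}^{N} H(\hat{s}_k) \;-\; \log\bigl|\det W\bigr|
  \;+\; \lambda \sum_{k=1}^{N} \mathbb{E}\bigl[\,|\hat{s}_k|\,\bigr].
\]
```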