May 5, 2019

Paper Group ANR 551

Joint Sound Source Separation and Speaker Recognition. Neuro-symbolic EDA-based Optimisation using ILP-enhanced DBNs. Relational Similarity Machines. Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications. Application of Statistical Relational Learning to Hybrid Recommendation Systems. A Non-convex One-Pass Framework for Gen …

Joint Sound Source Separation and Speaker Recognition

Title Joint Sound Source Separation and Speaker Recognition
Authors Jeroen Zegers, Hugo Van hamme
Abstract Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source separation for simultaneous speech. This paper explains how NMF can be used to jointly solve the two problems in a multichannel speaker recognizer for simultaneous speech. It is shown how state-of-the-art multichannel NMF for blind source separation can be easily extended to incorporate speaker recognition. Experiments on the CHiME corpus show that this method outperforms the sequential approach of first applying source separation, followed by speaker recognition that uses state-of-the-art i-vector techniques.
Tasks Speaker Recognition
Published 2016-04-29
URL http://arxiv.org/abs/1604.08852v1
PDF http://arxiv.org/pdf/1604.08852v1.pdf
PWC https://paperswithcode.com/paper/joint-sound-source-separation-and-speaker
Repo
Framework
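
The abstract above builds on non-negative matrix factorization. As a rough illustration only, the sketch below runs plain single-channel NMF with the classic multiplicative updates for the squared Frobenius error. It is not the multichannel, speaker-aware model of the paper; the function names, the random initialisation, and the toy matrix are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(r) for r in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start (an arbitrary choice for this sketch).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix of rank 2 (made-up numbers, not data from the paper).
V = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 1.0, 1.0],
     [3.0, 5.0, 7.0]]

W0, H0 = nmf(V, rank=2, iterations=0)   # the random starting point
W, H = nmf(V, rank=2, iterations=200)   # same seed, after 200 updates
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is the property that makes the learned components readable as additive parts.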

Neuro-symbolic EDA-based Optimisation using ILP-enhanced DBNs

Title Neuro-symbolic EDA-based Optimisation using ILP-enhanced DBNs
Authors Sarmimala Saikia, Lovekesh Vig, Ashwin Srinivasan, Gautam Shroff, Puneet Agarwal, Richa Rawat
Abstract We investigate solving discrete optimisation problems using the estimation of distribution (EDA) approach via a novel combination of deep belief networks (DBN) and inductive logic programming (ILP). While DBNs are used to learn the structure of successively better feasible solutions, ILP enables the incorporation of domain-based background knowledge related to the goodness of solutions. Recent work showed that ILP could be an effective way to use domain knowledge in an EDA scenario. However, in a purely ILP-based EDA, sampling successive populations is either inefficient or not straightforward. In our Neuro-symbolic EDA, an ILP engine is used to construct a model for good solutions using domain-based background knowledge. These rules are introduced as Boolean features in the last hidden layer of DBNs used for EDA-based optimization. This incorporation of logical ILP features requires some changes while training and sampling from DBNs: (a) our DBNs need to be trained with data for units at the input layer as well as some units in an otherwise hidden layer, and (b) we would like the samples generated to be drawn from instances entailed by the logical model. We demonstrate the viability of our approach on instances of two optimisation problems: predicting optimal depth-of-win for the KRK endgame, and jobshop scheduling. Our results are promising: (i) On each iteration of distribution estimation, samples obtained with an ILP-assisted DBN have a substantially greater proportion of good solutions than samples generated using a DBN without ILP features, and (ii) On termination of distribution estimation, samples obtained using an ILP-assisted DBN contain more near-optimal samples than samples from a DBN without ILP features. These results suggest that the use of ILP-constructed theories could be useful for incorporating complex domain-knowledge into deep models for estimation of distribution based procedures.
Tasks
Published 2016-12-20
URL http://arxiv.org/abs/1612.06528v1
PDF http://arxiv.org/pdf/1612.06528v1.pdf
PWC https://paperswithcode.com/paper/neuro-symbolic-eda-based-optimisation-using
Repo
Framework

Relational Similarity Machines

Title Relational Similarity Machines
Authors Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
Abstract This paper proposes Relational Similarity Machines (RSM): a fast, accurate, and flexible relational learning framework for supervised and semi-supervised learning tasks. Despite the importance of relational learning, most existing methods are hard to adapt to different settings, due to issues with efficiency, scalability, accuracy, and flexibility for handling a wide variety of classification problems, data, constraints, and tasks. For instance, many existing methods perform poorly for multi-class classification problems, graphs that are sparsely labeled or network data with low relational autocorrelation. In contrast, the proposed relational learning framework is designed to be (i) fast for learning and inference at real-time interactive rates, and (ii) flexible for a variety of learning settings (multi-class problems), constraints (few labeled instances), and application domains. The experiments demonstrate the effectiveness of RSM for a variety of tasks and data.
Tasks Relational Reasoning
Published 2016-08-02
URL http://arxiv.org/abs/1608.00876v1
PDF http://arxiv.org/pdf/1608.00876v1.pdf
PWC https://paperswithcode.com/paper/relational-similarity-machines
Repo
Framework

Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications

Title Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications
Authors Evgeny Levinkov, Jonas Uhrig, Siyu Tang, Mohamed Omran, Eldar Insafutdinov, Alexander Kirillov, Carsten Rother, Thomas Brox, Bernt Schiele, Bjoern Andres
Abstract We state a combinatorial optimization problem whose feasible solutions define both a decomposition and a node labeling of a given graph. This problem offers a common mathematical abstraction of seemingly unrelated computer vision tasks, including instance-separating semantic segmentation, articulated human body pose estimation and multiple object tracking. Conceptually, the problem we state generalizes the unconstrained integer quadratic program and the minimum cost lifted multicut problem, both of which are NP-hard. In order to find feasible solutions efficiently, we define two local search algorithms that converge monotonically to a local optimum, offering a feasible solution at any time. To demonstrate their effectiveness in tackling computer vision tasks, we apply these algorithms to instances of the problem that we construct from published data, using published algorithms. We report state-of-the-art application-specific accuracy for the three above-mentioned applications.
Tasks Combinatorial Optimization, Multiple Object Tracking, Object Tracking, Pose Estimation, Semantic Segmentation
Published 2016-11-14
URL http://arxiv.org/abs/1611.04399v2
PDF http://arxiv.org/pdf/1611.04399v2.pdf
PWC https://paperswithcode.com/paper/joint-graph-decomposition-and-node-labeling
Repo
Framework

Application of Statistical Relational Learning to Hybrid Recommendation Systems

Title Application of Statistical Relational Learning to Hybrid Recommendation Systems
Authors Shuo Yang, Mohammed Korayem, Khalifeh AlJadda, Trey Grainger, Sriraam Natarajan
Abstract Recommendation systems usually involve exploiting the relations among known features and content that describe items (content-based filtering) or the overlap of similar users who interacted with or rated the target item (collaborative filtering). To combine these two filtering approaches, current model-based hybrid recommendation systems typically require extensive feature engineering to construct a user profile. Statistical Relational Learning (SRL) provides a straightforward way to combine the two approaches. However, due to the large scale of the data used in real-world recommendation systems, little research exists on applying SRL models to hybrid recommendation systems, and essentially none of that research has been applied to real big-data-scale systems. In this paper, we propose a way to adapt state-of-the-art SRL learning approaches to construct a real hybrid recommendation system. Furthermore, in order to satisfy a common requirement in recommendation systems (i.e. that false positives are more undesirable and therefore penalized more harshly than false negatives), our approach also allows tuning the trade-off between the precision and recall of the system in a principled way. Our experimental results demonstrate the efficiency of our proposed approach as well as its improved performance on recommendation precision.
Tasks Feature Engineering, Recommendation Systems, Relational Reasoning
Published 2016-07-04
URL http://arxiv.org/abs/1607.01050v1
PDF http://arxiv.org/pdf/1607.01050v1.pdf
PWC https://paperswithcode.com/paper/application-of-statistical-relational
Repo
Framework

A Non-convex One-Pass Framework for Generalized Factorization Machine and Rank-One Matrix Sensing

Title A Non-convex One-Pass Framework for Generalized Factorization Machine and Rank-One Matrix Sensing
Authors Ming Lin, Jieping Ye
Abstract We develop an efficient alternating framework for learning a generalized version of Factorization Machine (gFM) on streaming data with provable guarantees. When the instances are sampled from $d$ dimensional random Gaussian vectors and the target second order coefficient matrix in gFM is of rank $k$, our algorithm converges linearly, achieves $O(\epsilon)$ recovery error after retrieving $O(k^{3}d\log(1/\epsilon))$ training instances, consumes $O(kd)$ memory in one pass over the dataset and only requires matrix-vector product operations in each iteration. The key ingredient of our framework is a construction of an estimation sequence endowed with a so-called Conditionally Independent RIP condition (CI-RIP). As special cases of gFM, our framework can be applied to symmetric or asymmetric rank-one matrix sensing problems, such as inductive matrix completion and phase retrieval.
Tasks Matrix Completion
Published 2016-08-21
URL http://arxiv.org/abs/1608.05995v5
PDF http://arxiv.org/pdf/1608.05995v5.pdf
PWC https://paperswithcode.com/paper/a-non-convex-one-pass-framework-for
Repo
Framework

Image and Video Mining through Online Learning

Title Image and Video Mining through Online Learning
Authors Andrew Gilbert, Richard Bowden
Abstract Within the field of image and video recognition, the traditional approach is a dataset split into fixed training and test partitions. However, the labelling of the training set is time-consuming, especially as datasets grow in size and complexity. Furthermore, this approach is not applicable to the home user, who wants to intuitively group their media without tirelessly labelling the content. Our interactive approach is able to iteratively cluster classes of images and video. Our approach is based around the concept of an image signature which, unlike a standard bag of words model, can express co-occurrence statistics as well as symbol frequency. We efficiently compute metric distances between signatures despite their inherent high dimensionality and provide discriminative feature selection, to allow common and distinctive elements to be identified from a small set of user labelled examples. These elements are then accentuated in the image signature to increase similarity between examples and pull correct classes together. By repeating this process in an online learning framework, the accuracy of similarity increases dramatically despite labelling only a few training examples. To demonstrate that the approach is agnostic to media type and features used, we evaluate on three image datasets (15 scene, Caltech101 and FG-NET), a mixed text and image dataset (ImageTag), a dataset used in active learning (Iris) and on three action recognition datasets (UCF11, KTH and Hollywood2). On the UCF11 video dataset, the accuracy is 86.7% despite using only 90 labelled examples from a dataset of over 1200 videos, instead of the standard 1122 training videos. The approach is both scalable and efficient, with a single iteration over the full UCF11 dataset of around 1200 videos taking approximately 1 minute on a standard desktop machine.
Tasks Active Learning, Feature Selection, Temporal Action Localization, Video Recognition
Published 2016-09-09
URL http://arxiv.org/abs/1609.02770v2
PDF http://arxiv.org/pdf/1609.02770v2.pdf
PWC https://paperswithcode.com/paper/image-and-video-mining-through-online
Repo
Framework

Development of a Real-time Colorectal Tumor Classification System for Narrow-band Imaging zoom-videoendoscopy

Title Development of a Real-time Colorectal Tumor Classification System for Narrow-band Imaging zoom-videoendoscopy
Authors Tsubasa Hirakawa, Toru Tamaki, Bisser Raytchev, Kazufumi Kaneda, Tetsushi Koide, Shigeto Yoshida, Hiroshi Mieno, Shinji Tanaka
Abstract Colorectal endoscopy is important for the early detection and treatment of colorectal cancer and is used worldwide. A computer-aided diagnosis (CAD) system that provides an objective measure to endoscopists during colorectal endoscopic examinations would be of great value. In this study, we describe a newly developed CAD system that provides real-time objective measures. Our system captures the video stream from an endoscopic system and transfers it to a desktop computer. The captured video stream is then classified by a pretrained classifier and the results are displayed on a monitor. The experimental results show that our developed system works efficiently in actual endoscopic examinations and is medically significant.
Tasks
Published 2016-12-15
URL http://arxiv.org/abs/1612.05000v2
PDF http://arxiv.org/pdf/1612.05000v2.pdf
PWC https://paperswithcode.com/paper/development-of-a-real-time-colorectal-tumor
Repo
Framework

MindX: Denoising Mixed Impulse Poisson-Gaussian Noise Using Proximal Algorithms

Title MindX: Denoising Mixed Impulse Poisson-Gaussian Noise Using Proximal Algorithms
Authors Mohamed Aly, Wolfgang Heidrich
Abstract We present a novel algorithm for blind denoising of images corrupted by mixed impulse, Poisson, and Gaussian noise. The algorithm starts by applying the Anscombe variance-stabilizing transformation to convert the Poisson noise into white Gaussian noise. Then it applies a combinatorial optimization technique to denoise the mixed impulse-Gaussian noise using proximal algorithms. The result is then processed by the inverse Anscombe transform. We compare our algorithm to state-of-the-art methods on standard images, and show its superior performance in various noise conditions.
Tasks Combinatorial Optimization, Denoising
Published 2016-08-28
URL http://arxiv.org/abs/1608.07802v1
PDF http://arxiv.org/pdf/1608.07802v1.pdf
PWC https://paperswithcode.com/paper/mindx-denoising-mixed-impulse-poisson
Repo
Framework
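
The first and last steps of the pipeline described above are the Anscombe transform and its inverse. The sketch below shows the standard forward formula, 2 * sqrt(x + 3/8), together with the simple algebraic inverse. Which inverse the paper actually uses is not stated in the abstract, so the choice here is an assumption, as are the function names and the sample counts.

```python
import math


def anscombe(x: float) -> float:
    """Forward Anscombe transform: Poisson counts -> roughly unit-variance values."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


def inverse_anscombe(y: float) -> float:
    """Algebraic inverse of the forward transform.

    It undoes the formula exactly, but it is known to be biased when applied
    to denoised values at low counts.
    """
    return (y / 2.0) ** 2 - 3.0 / 8.0


# Arbitrary example counts (not data from the paper).
counts = [0, 1, 5, 20, 100]
stabilized = [anscombe(c) for c in counts]
recovered = [inverse_anscombe(s) for s in stabilized]
```

In a denoiser, a Gaussian-noise method would operate on `stabilized` between the two calls; here the round trip simply returns the original counts.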

Learning, Visualizing, and Exploiting a Model for the Intrinsic Value of a Batted Ball

Title Learning, Visualizing, and Exploiting a Model for the Intrinsic Value of a Batted Ball
Authors Glenn Healey
Abstract We present an algorithm for learning the intrinsic value of a batted ball in baseball. This work addresses the fundamental problem of separating the value of a batted ball at contact from factors such as the defense, weather, and ballpark that can affect its observed outcome. The algorithm uses a Bayesian model to construct a continuous mapping from a vector of batted ball parameters to an intrinsic measure defined as the expected value of a linear weights representation for run value. A kernel method is used to build nonparametric estimates for the component probability density functions in Bayes theorem from a set of over one hundred thousand batted ball measurements recorded by the HITf/x system during the 2014 major league baseball (MLB) season. Cross-validation is used to determine the optimal vector of smoothing parameters for the density estimates. Properties of the mapping are visualized by considering reduced-dimension subsets of the batted ball parameter space. We use the mapping to derive statistics for intrinsic quality of contact for batters and pitchers which have the potential to improve the accuracy of player models and forecasting systems. We also show that the new approach leads to a simple automated measure of contact-adjusted defense and provides insight into the impact of environmental variables on batted balls.
Tasks
Published 2016-02-21
URL http://arxiv.org/abs/1603.00050v1
PDF http://arxiv.org/pdf/1603.00050v1.pdf
PWC https://paperswithcode.com/paper/learning-visualizing-and-exploiting-a-model
Repo
Framework
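
The abstract above relies on kernel estimates of probability density functions. As a minimal one-dimensional sketch of that idea, the code below builds a Gaussian kernel density estimate with a fixed bandwidth. The paper's estimator is multivariate, with smoothing parameters chosen by cross-validation; the function name, the bandwidth, and the sample values here are assumptions made purely for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, each centred on that sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density


# Arbitrary one-dimensional sample (not HITf/x measurements).
samples = [0.0, 0.5, 1.0]
density = gaussian_kde(samples, bandwidth=0.5)

# Riemann-sum check that the estimate integrates to approximately one.
step = 0.01
total_mass = sum(density(-10.0 + i * step) * step for i in range(2001))
```

A larger bandwidth gives a smoother but flatter estimate, which is why the smoothing parameters are worth tuning rather than fixing by hand.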

Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining

Title Safe Pattern Pruning: An Efficient Approach for Predictive Pattern Mining
Authors Kazuya Nakagawa, Shinya Suzumura, Masayuki Karasuyama, Koji Tsuda, Ichiro Takeuchi
Abstract In this paper we study predictive pattern mining problems where the goal is to construct a predictive model based on a subset of predictive patterns in the database. Our main contribution is to introduce a novel method called safe pattern pruning (SPP) for a class of predictive pattern mining problems. The SPP method allows us to efficiently find a superset of all the predictive patterns in the database that are needed for the optimal predictive model. The advantage of the SPP method over existing boosting-type methods is that the former can find the superset by a single search over the database, while the latter require multiple searches. The SPP method is inspired by recent developments in safe feature screening. In order to extend the idea of safe feature screening to predictive pattern mining, we derive a novel pruning rule called the safe pattern pruning (SPP) rule that can be used for searching over the tree defined among patterns in the database. The SPP rule has the property that, if a node corresponding to a pattern in the database is pruned out by the SPP rule, then it is guaranteed that none of the patterns corresponding to its descendant nodes are needed for the optimal predictive model. We apply the SPP method to graph mining and item-set mining problems, and demonstrate its computational advantage.
Tasks
Published 2016-02-15
URL http://arxiv.org/abs/1602.04548v1
PDF http://arxiv.org/pdf/1602.04548v1.pdf
PWC https://paperswithcode.com/paper/safe-pattern-pruning-an-efficient-approach
Repo
Framework

Optimizing Codes for Source Separation in Color Image Demosaicing and Compressive Video Recovery

Title Optimizing Codes for Source Separation in Color Image Demosaicing and Compressive Video Recovery
Authors Alankar Kotwal, Ajit Rajwade
Abstract There exist several applications in image processing (e.g., video compressed sensing [Hitomi, Y. et al, “Video from a single coded exposure photograph using a learned overcomplete dictionary”] and color image demosaicing [Moghadam, A. A. et al, “Compressive Framework for Demosaicing of Natural Images”]) which require separation of constituent images given measurements in the form of a coded superposition of those images. Physically practical code patterns in these applications are non-negative, systematically structured, and do not always obey the nice incoherence properties of other patterns such as Gaussian codes, which can adversely affect reconstruction performance. The contribution of this paper is to design code patterns for video compressed sensing and demosaicing by minimizing the mutual coherence of the matrix $\boldsymbol{\Phi \Psi}$ where $\boldsymbol{\Phi}$ represents the sensing matrix created from the code, and $\boldsymbol{\Psi}$ is the signal representation matrix. Our main contribution is that we explicitly take into account the special structure of those code patterns as required by these applications: (1)~non-negativity, (2)~block-diagonal nature, and (3)~circular shifting. In particular, the last property enables accurate and seamless patch-wise reconstruction for some important compressed sensing architectures.
Tasks Demosaicking
Published 2016-09-07
URL http://arxiv.org/abs/1609.02135v2
PDF http://arxiv.org/pdf/1609.02135v2.pdf
PWC https://paperswithcode.com/paper/optimizing-codes-for-source-separation-in
Repo
Framework
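
The quantity minimised in the abstract above is the mutual coherence of a matrix: the largest absolute normalised inner product between two distinct columns. The sketch below only evaluates that quantity for a matrix that is already formed (for instance the product of a sensing matrix and a representation matrix). It does not perform the paper's code optimisation, and the function name and example matrices are assumptions.

```python
import math
from itertools import combinations


def mutual_coherence(matrix):
    """Largest |<a_i, a_j>| / (||a_i|| * ||a_j||) over distinct columns i, j.

    `matrix` is a list of rows; every column is assumed to be non-zero.
    """
    columns = list(zip(*matrix))
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    best = 0.0
    for i, j in combinations(range(len(columns)), 2):
        dot = sum(a * b for a, b in zip(columns[i], columns[j]))
        best = max(best, abs(dot) / (norms[i] * norms[j]))
    return best


# Orthogonal columns give zero coherence.
identity = [[1.0, 0.0],
            [0.0, 1.0]]

# Non-negative columns that overlap are necessarily correlated.
overlapping = [[1.0, 1.0],
               [0.0, 1.0]]

coherence_identity = mutual_coherence(identity)
coherence_overlapping = mutual_coherence(overlapping)
```

The second example hints at why non-negative codes are harder: two non-negative columns sharing any support have a positive inner product, so coherence cannot be pushed to zero the way it can for signed codes.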

Translucent Players: Explaining Cooperative Behavior in Social Dilemmas

Title Translucent Players: Explaining Cooperative Behavior in Social Dilemmas
Authors Valerio Capraro, Joseph Y. Halpern
Abstract In the last few decades, numerous experiments have shown that humans do not always behave so as to maximize their material payoff. Cooperative behavior when non-cooperation is a dominant strategy (with respect to the material payoffs) is particularly puzzling. Here we propose a novel approach to explain cooperation, assuming what Halpern and Pass call translucent players. Typically, players are assumed to be opaque, in the sense that a deviation by one player in a normal-form game does not affect the strategies used by other players. But a player may believe that if he switches from one strategy to another, the fact that he chooses to switch may be visible to the other players. For example, if he chooses to defect in Prisoner’s Dilemma, the other player may sense his guilt. We show that by assuming translucent players, we can recover many of the regularities observed in human behavior in well-studied games such as Prisoner’s Dilemma, Traveler’s Dilemma, Bertrand Competition, and the Public Goods game.
Tasks
Published 2016-06-24
URL http://arxiv.org/abs/1606.07533v1
PDF http://arxiv.org/pdf/1606.07533v1.pdf
PWC https://paperswithcode.com/paper/translucent-players-explaining-cooperative
Repo
Framework
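
The puzzle described above is that defection strictly dominates cooperation in the material payoffs of Prisoner's Dilemma, yet mutual cooperation pays both players more than mutual defection. The sketch below checks both facts for one conventional payoff table. The numbers (5, 3, 1, 0) and the function name are illustrative assumptions, and the code models only the standard opaque game, not the translucency mechanism of the paper.

```python
# Row player's material payoff, indexed by (own action, other player's action).
# "C" = cooperate, "D" = defect. Values are conventional textbook numbers.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}
ACTIONS = ("C", "D")


def is_strictly_dominant(action, payoff=PAYOFF, actions=ACTIONS):
    """True if `action` beats every alternative against every opposing action."""
    return all(
        payoff[(action, other)] > payoff[(alternative, other)]
        for alternative in actions if alternative != action
        for other in actions
    )


defect_dominates = is_strictly_dominant("D")
cooperate_dominates = is_strictly_dominant("C")
# Mutual cooperation still pays each player more than mutual defection.
cooperation_gain = PAYOFF[("C", "C")] - PAYOFF[("D", "D")]
```

With opaque players the analysis stops at the dominance check; the translucent-player assumption changes the comparison because a switch to "D" may itself alter what the other player does.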

Neural Document Embeddings for Intensive Care Patient Mortality Prediction

Title Neural Document Embeddings for Intensive Care Patient Mortality Prediction
Authors Paulina Grnarova, Florian Schmidt, Stephanie L. Hyland, Carsten Eickhoff
Abstract We present an automatic mortality prediction scheme based on the unstructured textual content of clinical notes. Proposing a convolutional document embedding approach, our empirical investigation using the MIMIC-III intensive care database shows significant performance gains compared to previously employed methods such as latent topic distributions or generic doc2vec embeddings. These improvements are especially pronounced for the difficult problem of post-discharge mortality prediction.
Tasks Document Embedding, Mortality Prediction
Published 2016-12-01
URL http://arxiv.org/abs/1612.00467v1
PDF http://arxiv.org/pdf/1612.00467v1.pdf
PWC https://paperswithcode.com/paper/neural-document-embeddings-for-intensive-care
Repo
Framework

Single-Model Encoder-Decoder with Explicit Morphological Representation for Reinflection

Title Single-Model Encoder-Decoder with Explicit Morphological Representation for Reinflection
Authors Katharina Kann, Hinrich Schütze
Abstract Morphological reinflection is the task of generating a target form given a source form, a source tag and a target tag. We propose a new way of modeling this task with neural encoder-decoder models. Our approach reduces the amount of required training data for this architecture and achieves state-of-the-art results, making encoder-decoder models applicable to morphological reinflection even for low-resource languages. We further present a new automatic correction method for the outputs based on edit trees.
Tasks
Published 2016-06-02
URL http://arxiv.org/abs/1606.00589v1
PDF http://arxiv.org/pdf/1606.00589v1.pdf
PWC https://paperswithcode.com/paper/single-model-encoder-decoder-with-explicit
Repo
Framework