May 6, 2019

Paper Group ANR 204

Length bias in Encoder Decoder Models and a Case for Global Conditioning

Title Length bias in Encoder Decoder Models and a Case for Global Conditioning
Authors Pavel Sountsov, Sunita Sarawagi
Abstract Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models like CRFs that typically assumed conditional independence among non-adjacent variables. However, in practice, encoder-decoder models exhibit a bias towards short sequences that surprisingly gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy more adversely impacts long sequences, explaining the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates the above problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for beam search during inference, which reduces to an efficient dot-product based search in a vector space.
Tasks
Published 2016-06-10
URL http://arxiv.org/abs/1606.03402v2
PDF http://arxiv.org/pdf/1606.03402v2.pdf
PWC https://paperswithcode.com/paper/length-bias-in-encoder-decoder-models-and-a
Repo
Framework

Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network

Title Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network
Authors Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang
Abstract In this paper, we propose a dictionary update method for Nonnegative Matrix Factorization (NMF) with high-dimensional data in a spectral conversion (SC) task. Voice conversion has been widely studied due to its potential applications such as personalized speech synthesis and speech enhancement. Exemplar-based NMF (ENMF) emerges as an effective and probably the simplest choice among all techniques for SC, as long as a source-target parallel speech corpus is given. ENMF-based SC systems usually need a large number of bases (exemplars) to ensure the quality of the converted speech. However, a small and effective dictionary is desirable but hard to obtain via dictionary update, in particular when high-dimensional features such as STRAIGHT spectra are used. Therefore, we propose a dictionary update framework for NMF by means of an encoder-decoder reformulation. Regarding NMF as an encoder-decoder network makes it possible to exploit the whole parallel corpus more effectively and efficiently when applied to SC. Our experiments demonstrate significant gains of the proposed system with small dictionaries over conventional ENMF-based systems with dictionaries of the same or much larger size.
Tasks Speech Enhancement, Speech Synthesis, Voice Conversion
Published 2016-10-13
URL http://arxiv.org/abs/1610.03988v1
PDF http://arxiv.org/pdf/1610.03988v1.pdf
PWC https://paperswithcode.com/paper/dictionary-update-for-nmf-based-voice
Repo
Framework

Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models

Title Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models
Authors Julius Adebayo, Lalana Kagal
Abstract Predictive models are increasingly deployed for the purpose of determining access to services such as credit, insurance, and employment. Despite potential gains in productivity and efficiency, several potential problems have yet to be addressed, particularly the potential for unintentional discrimination. We present an iterative procedure, based on orthogonal projection of input attributes, for enabling interpretability of black-box predictive models. Through our iterative procedure, one can quantify the relative dependence of a black-box model on its input attributes. The relative significance of the inputs to a predictive model can then be used to assess the fairness (or discriminatory extent) of such a model.
Tasks
Published 2016-11-15
URL http://arxiv.org/abs/1611.04967v1
PDF http://arxiv.org/pdf/1611.04967v1.pdf
PWC https://paperswithcode.com/paper/iterative-orthogonal-feature-projection-for
Repo
Framework

Reweighted Low-Rank Tensor Completion and its Applications in Video Recovery

Title Reweighted Low-Rank Tensor Completion and its Applications in Video Recovery
Authors Baburaj M., Sudhish N. George
Abstract This paper focuses on recovering multi-dimensional data, called tensors, from randomly corrupted, incomplete observations. Inspired by reweighted $l_1$ norm minimization for sparsity enhancement, this paper proposes a reweighted singular value enhancement scheme to improve tensor low tubular rank in the tensor completion process. An efficient iterative decomposition scheme based on t-SVD is proposed which improves low-rank signal recovery significantly. The effectiveness of the proposed method is established by applying it to the video completion problem, and experimental results reveal that the algorithm outperforms its counterparts.
Tasks
Published 2016-11-18
URL http://arxiv.org/abs/1611.05964v4
PDF http://arxiv.org/pdf/1611.05964v4.pdf
PWC https://paperswithcode.com/paper/reweighted-low-rank-tensor-completion-and-its
Repo
Framework
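
As a rough illustration of the reweighting idea in the abstract above, the sketch below shrinks a list of singular values with weights inversely proportional to their magnitude, so that large (signal-carrying) values are penalized less than small ones. This is a minimal sketch under stated assumptions, not the authors' algorithm: the paper works with t-SVD on tensors, while this example operates on a plain list of singular values, and the function name `reweighted_shrink`, the weight rule `1 / (s + eps)`, and the parameters `tau` and `eps` are hypothetical choices borrowed from the usual reweighted $l_1$ scheme.

```python
def reweighted_shrink(singular_values, tau, eps=1e-3):
    """Soft-threshold singular values with magnitude-dependent weights.

    Assumption: weight w = 1 / (s + eps), as in reweighted l1 minimization.
    Large singular values get small weights and are barely shrunk;
    small ones get large weights and are pushed to zero.
    """
    shrunk = []
    for s in singular_values:
        w = 1.0 / (s + eps)                    # hypothetical reweighting rule
        shrunk.append(max(s - tau * w, 0.0))   # weighted soft-thresholding
    return shrunk


if __name__ == "__main__":
    print(reweighted_shrink([10.0, 1.0, 0.1], tau=0.5))
```

With uniform weights, every singular value would be reduced by the same amount; the reweighting is what lets the dominant values survive nearly unchanged while the small ones vanish.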

A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel

Title A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel
Authors Mojtaba Kazemi, Ehsan Mohammadi. P, Parichehr shahidi sadeghi, Mohamad B. Menhaj
Abstract Gaussian noise removal is an interesting area in digital image processing, not only for improving visual quality but also for its impact on other post-processing algorithms like image registration or segmentation. Many state-of-the-art denoising methods are based on self-similarity or patch-based image processing. Specifically, Non-Local Means (NLM) as a patch-based filter has gained increasing attention in recent years. Essentially, this filter tends to obtain the noise-less signal value by computing the Gaussian-weighted Euclidean distance between the patch being processed and other patches inside the image. However, the NLM filter is sensitive to outliers (pixels whose intensity values are far from those of other pixels) inside the patch, meaning that pixels at symmetric locations in the patch are assigned the same weight. This can lead to sub-optimal denoising performance when the destructive nature of noise generates some outliers inside patches. In this paper, we propose a new weighting approach to modify the Gaussian kernel of the NLM filter. Our approach employs the geometric distance between image intensities to come up with new weights for each pixel of a patch, lowering the impact of outliers on the denoising performance. Experiments on a set of standard images and different noise levels show that our proposed method outperforms the other compared denoising filters.
Tasks Denoising, Image Registration
Published 2016-12-03
URL http://arxiv.org/abs/1612.01006v1
PDF http://arxiv.org/pdf/1612.01006v1.pdf
PWC https://paperswithcode.com/paper/a-non-local-means-approach-for-gaussian-noise
Repo
Framework
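
For readers unfamiliar with the baseline filter the abstract above refers to, here is a minimal sketch of plain Non-Local Means on a 1-D signal. It shows only the standard idea (each sample is replaced by an average of all samples, weighted by patch similarity); it does not implement the modified weighting kernel proposed in the paper. The names `patch_weight` and `nlm_1d`, the uniform (rather than Gaussian-weighted) patch distance, the edge-replication padding, and the parameters `half_patch` and `h` are all assumptions made for illustration.

```python
import math


def patch_weight(p, q, h):
    """Similarity weight between two equal-length patches.

    Assumption: mean squared difference with uniform in-patch weighting;
    classic NLM additionally applies a Gaussian kernel inside the patch.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)
    return math.exp(-d2 / (h * h))


def nlm_1d(signal, half_patch=1, h=10.0):
    """Denoise a 1-D signal by patch-similarity-weighted averaging."""
    n = len(signal)

    def patch(i):
        # Replicate edge samples so every position has a full patch.
        return [signal[min(max(j, 0), n - 1)]
                for j in range(i - half_patch, i + half_patch + 1)]

    patches = [patch(i) for i in range(n)]
    out = []
    for i in range(n):
        weights = [patch_weight(patches[i], patches[j], h) for j in range(n)]
        total = sum(weights)  # always >= 1 because the self-weight is 1
        out.append(sum(w * s for w, s in zip(weights, signal)) / total)
    return out


if __name__ == "__main__":
    noisy = [10.0, 12.0, 9.0, 11.0, 50.0, 10.0, 11.0]
    print(nlm_1d(noisy, half_patch=1, h=10.0))
```

Because a single outlier inside a patch inflates the squared distance for every comparison involving that patch, this baseline illustrates the sensitivity that motivates the paper's modified weights.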

Multi-Modal Hybrid Deep Neural Network for Speech Enhancement

Title Multi-Modal Hybrid Deep Neural Network for Speech Enhancement
Authors Zhenzhou Wu, Sunil Sivadas, Yong Kiam Tan, Ma Bin, Rick Siow Mong Goh
Abstract Deep Neural Networks (DNN) have been successful in enhancing noisy speech signals. Enhancement is achieved by learning a nonlinear mapping function from the features of the corrupted speech signal to that of the reference clean speech signal. The quality of predicted features can be improved by providing additional side channel information that is robust to noise, such as visual cues. In this paper we propose a novel deep learning model inspired by insights from human audio visual perception. In the proposed unified hybrid architecture, features from a Convolution Neural Network (CNN) that processes the visual cues and features from a fully connected DNN that processes the audio signal are integrated using a Bidirectional Long Short-Term Memory (BiLSTM) network. The parameters of the hybrid model are jointly learned using backpropagation. We compare the quality of enhanced speech from the hybrid models with those from traditional DNN and BiLSTM models.
Tasks Speech Enhancement
Published 2016-06-15
URL http://arxiv.org/abs/1606.04750v1
PDF http://arxiv.org/pdf/1606.04750v1.pdf
PWC https://paperswithcode.com/paper/multi-modal-hybrid-deep-neural-network-for
Repo
Framework

Single Channel Speech Enhancement Using Outlier Detection

Title Single Channel Speech Enhancement Using Outlier Detection
Authors Eunjoon Cho, Bowon Lee, Ronald Schafer, Bernard Widrow
Abstract Distortion of the underlying speech is a common problem for single-channel speech enhancement algorithms, and hinders such methods from being used more extensively. A dictionary based speech enhancement method that emphasizes preserving the underlying speech is proposed. Spectral patches of clean speech are sampled and clustered to train a dictionary. Given a noisy speech spectral patch, the best matching dictionary entry is selected and used to estimate the noise power at each time-frequency bin. The noise estimation step is formulated as an outlier detection problem, where the noise at each bin is assumed present only if it is an outlier to the corresponding bin of the best matching dictionary entry. This framework assigns higher priority in removing spectral elements that strongly deviate from a typical spoken unit stored in the trained dictionary. Even without the aid of a separate noise model, this method can achieve significant noise reduction for various non-stationary noises, while effectively preserving the underlying speech in more challenging noisy environments.
Tasks Outlier Detection, Speech Enhancement
Published 2016-05-04
URL http://arxiv.org/abs/1605.01329v1
PDF http://arxiv.org/pdf/1605.01329v1.pdf
PWC https://paperswithcode.com/paper/single-channel-speech-enhancement-using
Repo
Framework

Capturing Dynamic Textured Surfaces of Moving Targets

Title Capturing Dynamic Textured Surfaces of Moving Targets
Authors Ruizhe Wang, Lingyu Wei, Etienne Vouga, Qixing Huang, Duygu Ceylan, Gerard Medioni, Hao Li
Abstract We present an end-to-end system for reconstructing complete watertight and textured models of moving subjects such as clothed humans and animals, using only three or four handheld sensors. The heart of our framework is a new pairwise registration algorithm that minimizes, using a particle swarm strategy, an alignment error metric based on mutual visibility and occlusion. We show that this algorithm reliably registers partial scans with as little as 15% overlap without requiring any initial correspondences, and outperforms alternative global registration algorithms. This registration algorithm allows us to reconstruct moving subjects from free-viewpoint video produced by consumer-grade sensors, without extensive sensor calibration, constrained capture volume, expensive arrays of cameras, or templates of the subject geometry.
Tasks Calibration
Published 2016-04-11
URL http://arxiv.org/abs/1604.02801v1
PDF http://arxiv.org/pdf/1604.02801v1.pdf
PWC https://paperswithcode.com/paper/capturing-dynamic-textured-surfaces-of-moving
Repo
Framework

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering

Title Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Authors Mateusz Malinowski, Marcus Rohrbach, Mario Fritz
Abstract We address a question answering task on real-world images that is set up as a Visual Turing Test. By combining latest advances in image representation and natural language processing, we propose Ask Your Neurons, a scalable, jointly trained, end-to-end formulation to this problem. In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language inputs (image and question). We provide additional insights into the problem by analyzing how much information is contained only in the language part for which we provide a new human baseline. To study human consensus, which is related to the ambiguities inherent in this challenging task, we propose two novel metrics and collect additional answers which extend the original DAQUAR dataset to DAQUAR-Consensus. Moreover, we also extend our analysis to VQA, a large-scale question answering about images dataset, where we investigate some particular design choices and show the importance of stronger visual models. At the same time, we achieve strong performance of our model that still uses a global image representation. Finally, based on such analysis, we refine our Ask Your Neurons on DAQUAR, which also leads to a better performance on this challenging task.
Tasks Question Answering, Visual Question Answering
Published 2016-05-09
URL http://arxiv.org/abs/1605.02697v2
PDF http://arxiv.org/pdf/1605.02697v2.pdf
PWC https://paperswithcode.com/paper/ask-your-neurons-a-deep-learning-approach-to
Repo
Framework

Probabilistic Graphical Models on Multi-Core CPUs using Java 8

Title Probabilistic Graphical Models on Multi-Core CPUs using Java 8
Authors Andres R. Masegosa, Ana M. Martinez, Hanen Borchani
Abstract In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms that deal with inference and learning of PGMs from data. Namely, maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and parallel processing of same-size batches of data sets using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions.
Tasks
Published 2016-04-27
URL http://arxiv.org/abs/1604.07990v1
PDF http://arxiv.org/pdf/1604.07990v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-graphical-models-on-multi-core
Repo
Framework

Rapid Prediction of Player Retention in Free-to-Play Mobile Games

Title Rapid Prediction of Player Retention in Free-to-Play Mobile Games
Authors Anders Drachen, Eric Thurston Lundquist, Yungjen Kung, Pranav Simha Rao, Diego Klabjan, Rafet Sifa, Julian Runge
Abstract Predicting and improving player retention is crucial to the success of mobile Free-to-Play games. This paper explores the problem of rapid retention prediction in this context. Heuristic modeling approaches are introduced as a way of building simple rules for predicting short-term retention. Compared to common classification algorithms, our heuristic-based approach achieves reasonable and comparable performance using information from the first session, day, and week of player activity.
Tasks
Published 2016-07-12
URL http://arxiv.org/abs/1607.03202v1
PDF http://arxiv.org/pdf/1607.03202v1.pdf
PWC https://paperswithcode.com/paper/rapid-prediction-of-player-retention-in-free
Repo
Framework

A Theme-Rewriting Approach for Generating Algebra Word Problems

Title A Theme-Rewriting Approach for Generating Algebra Word Problems
Authors Rik Koncel-Kedziorski, Ioannis Konstas, Luke Zettlemoyer, Hannaneh Hajishirzi
Abstract Texts present coherent stories that have a particular theme or overall setting, for example science fiction or western. In this paper, we present a text generation method called rewriting that edits existing human-authored narratives to change their theme without changing the underlying story. We apply the approach to math word problems, where it might help students stay more engaged by quickly transforming all of their homework assignments to the theme of their favorite movie without changing the math concepts that are being taught. Our rewriting method uses a two-stage decoding process, which proposes new words from the target theme and scores the resulting stories according to a number of factors defining aspects of syntactic, semantic, and thematic coherence. Experiments demonstrate that the final stories typically represent the new theme well while still testing the original math concepts, outperforming a number of baselines. We also release a new dataset of human-authored rewrites of math word problems in several themes.
Tasks Text Generation
Published 2016-10-19
URL http://arxiv.org/abs/1610.06210v1
PDF http://arxiv.org/pdf/1610.06210v1.pdf
PWC https://paperswithcode.com/paper/a-theme-rewriting-approach-for-generating
Repo
Framework

Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets, and Complexity

Title Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets, and Complexity
Authors James Cussens, Matti Järvisalo, Janne H. Korhonen, Mark Bartlett
Abstract The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks, arguably the most central class of graphical models, especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the GOBNILP system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP-based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.
Tasks
Published 2016-05-13
URL http://arxiv.org/abs/1605.04071v2
PDF http://arxiv.org/pdf/1605.04071v2.pdf
PWC https://paperswithcode.com/paper/bayesian-network-structure-learning-with
Repo
Framework

Machine olfaction using time scattering of sensor multiresolution graphs

Title Machine olfaction using time scattering of sensor multiresolution graphs
Authors Leonid Gugel, Yoel Shkolnisky, Shai Dekel
Abstract In this paper we construct a learning architecture for high dimensional time series sampled by sensor arrangements. Using a redundant wavelet decomposition on a graph constructed over the sensor locations, our algorithm is able to construct discriminative features that exploit the mutual information between the sensors. The algorithm then applies scattering networks to the time series graphs to create the feature space. We demonstrate our method on a machine olfaction problem, where one needs to classify the gas type and the location where it originates from data sampled by an array of sensors. Our experimental results clearly demonstrate that our method outperforms classical machine learning techniques used in previous studies.
Tasks Time Series
Published 2016-02-13
URL http://arxiv.org/abs/1602.04358v1
PDF http://arxiv.org/pdf/1602.04358v1.pdf
PWC https://paperswithcode.com/paper/machine-olfaction-using-time-scattering-of
Repo
Framework

FSMJ: Feature Selection with Maximum Jensen-Shannon Divergence for Text Categorization

Title FSMJ: Feature Selection with Maximum Jensen-Shannon Divergence for Text Categorization
Authors Bo Tang, Haibo He
Abstract In this paper, we present a new wrapper feature selection approach based on Jensen-Shannon (JS) divergence, termed feature selection with maximum JS-divergence (FSMJ), for text categorization. Unlike most existing feature selection approaches, the proposed FSMJ approach is based on real-valued features which provide more information for discrimination than binary-valued features used in conventional approaches. We show that the FSMJ is a greedy approach and the JS-divergence monotonically increases when more features are selected. We conduct several experiments on real-life data sets, compared with the state-of-the-art feature selection approaches for text categorization. The superior performance of the proposed FSMJ approach demonstrates its effectiveness and further indicates its wide potential applications on data mining.
Tasks Feature Selection, Text Categorization
Published 2016-06-20
URL http://arxiv.org/abs/1606.06366v1
PDF http://arxiv.org/pdf/1606.06366v1.pdf
PWC https://paperswithcode.com/paper/fsmj-feature-selection-with-maximum-jensen
Repo
Framework
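
Since the FSMJ abstract above builds on the Jensen-Shannon divergence, a short sketch of that quantity may help. The code computes the standard two-distribution JS divergence in bits; it is not the paper's feature selection criterion, which may use a multi-class or weighted generalization. The function names `kl_divergence` and `js_divergence` are illustrative, and the inputs are assumed to be discrete probability vectors of equal length that each sum to 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions, in bits."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    print(js_divergence([0.5, 0.5], [0.9, 0.1]))
```

Unlike the KL divergence, this quantity is symmetric and, with base-2 logarithms, bounded between 0 (identical distributions) and 1 (disjoint supports), which makes it convenient as a discrimination score.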