May 6, 2019


Paper Group ANR 204

Length bias in Encoder Decoder Models and a Case for Global Conditioning. Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network. Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models. Reweighted Low-Rank Tensor Completion and its Applications in Video Recovery. A Non-Local Means Approach for Gau …

Length bias in Encoder Decoder Models and a Case for Global Conditioning


Title	Length bias in Encoder Decoder Models and a Case for Global Conditioning
Authors	Pavel Sountsov, Sunita Sarawagi
Abstract	Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models like CRFs that typically assumed conditional independence among non-adjacent variables. However, in practice encoder-decoder models exhibit a bias towards short sequences that surprisingly gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy more adversely impacts long sequences, explaining the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates the above problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for a beam-search during inference, which reduces to an efficient dot-product based search in a vector-space.
Tasks
Published	2016-06-10
URL	http://arxiv.org/abs/1606.03402v2
PDF	http://arxiv.org/pdf/1606.03402v2.pdf
PWC	https://paperswithcode.com/paper/length-bias-in-encoder-decoder-models-and-a
Repo
Framework
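
The abstract above says that, for a closed output set, inference reduces to a dot-product search in a vector space. A minimal sketch of that idea follows, assuming the input and every candidate sequence have already been embedded as fixed-length vectors; the vectors and the function names `dot` and `best_candidate` are hypothetical, and the paper's actual encoders and training objective are not reproduced here.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_candidate(input_vec, candidate_vecs):
    """Return the index of the candidate whose embedding scores highest."""
    scores = [dot(input_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical embeddings of three candidate sequences from the closed set.
candidates = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
# Hypothetical embedding of the input.
query = [0.2, 0.9]

best = best_candidate(query, candidates)
print(best)
```

Because each candidate is scored as a whole, no token-by-token beam search is needed in this sketch; the cost is one dot product per candidate.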

Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network


Title	Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network
Authors	Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang
Abstract	In this paper, we propose a dictionary update method for Nonnegative Matrix Factorization (NMF) with high dimensional data in a spectral conversion (SC) task. Voice conversion has been widely studied due to its potential applications such as personalized speech synthesis and speech enhancement. Exemplar-based NMF (ENMF) emerges as an effective and probably the simplest choice among all techniques for SC, as long as a source-target parallel speech corpus is given. ENMF-based SC systems usually need a large number of bases (exemplars) to ensure the quality of the converted speech. However, a small and effective dictionary is desirable but hard to obtain via dictionary update, in particular when high-dimensional features such as STRAIGHT spectra are used. Therefore, we propose a dictionary update framework for NMF by means of an encoder-decoder reformulation. Regarding NMF as an encoder-decoder network makes it possible to exploit the whole parallel corpus more effectively and efficiently when applied to SC. Our experiments demonstrate significant gains of the proposed system with small dictionaries over conventional ENMF-based systems with dictionaries of the same or much larger size.
Tasks	Speech Enhancement, Speech Synthesis, Voice Conversion
Published	2016-10-13
URL	http://arxiv.org/abs/1610.03988v1
PDF	http://arxiv.org/pdf/1610.03988v1.pdf
PWC	https://paperswithcode.com/paper/dictionary-update-for-nmf-based-voice
Repo
Framework

Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models


Title	Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models
Authors	Julius Adebayo, Lalana Kagal
Abstract	Predictive models are increasingly deployed for the purpose of determining access to services such as credit, insurance, and employment. Despite potential gains in productivity and efficiency, several potential problems have yet to be addressed, particularly the potential for unintentional discrimination. We present an iterative procedure, based on orthogonal projection of input attributes, for enabling interpretability of black-box predictive models. Through our iterative procedure, one can quantify the relative dependence of a black-box model on its input attributes. The relative significance of the inputs to a predictive model can then be used to assess the fairness (or discriminatory extent) of such a model.
Tasks
Published	2016-11-15
URL	http://arxiv.org/abs/1611.04967v1
PDF	http://arxiv.org/pdf/1611.04967v1.pdf
PWC	https://paperswithcode.com/paper/iterative-orthogonal-feature-projection-for
Repo
Framework
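
The procedure in the abstract above builds on orthogonal projection of input attributes. The sketch below shows only that single building block: removing from one attribute column the component that lies along another. The column names `protected` and `income`, their values, and the helper `remove_component` are hypothetical; the paper's iterative procedure and its ranking of attributes are not reproduced.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_component(x, z):
    """Project x onto the orthogonal complement of z (z must be non-zero)."""
    scale = dot(x, z) / dot(z, z)
    return [xi - scale * zi for xi, zi in zip(x, z)]


# Hypothetical attribute columns, one value per sample.
protected = [1.0, 0.0, 1.0, 0.0]
income = [3.0, 1.0, 4.0, 2.0]

# After projection the residual carries no linear component along `protected`.
residual = remove_component(income, protected)
print(residual)
```

Feeding the residual column to a model instead of the original one is a way to probe how much its predictions depend on the projected-out attribute, under the assumption that the dependence is linear.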

Reweighted Low-Rank Tensor Completion and its Applications in Video Recovery


Title	Reweighted Low-Rank Tensor Completion and its Applications in Video Recovery
Authors	Baburaj M., Sudhish N. George
Abstract	This paper focuses on recovering multi-dimensional data, called tensors, from randomly corrupted incomplete observations. Inspired by reweighted $l_1$ norm minimization for sparsity enhancement, this paper proposes a reweighted singular value enhancement scheme to improve tensor low tubular rank in the tensor completion process. An efficient iterative decomposition scheme based on t-SVD is proposed which improves low-rank signal recovery significantly. The effectiveness of the proposed method is established by applying it to the video completion problem, and experimental results reveal that the algorithm outperforms its counterparts.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.05964v4
PDF	http://arxiv.org/pdf/1611.05964v4.pdf
PWC	https://paperswithcode.com/paper/reweighted-low-rank-tensor-completion-and-its
Repo
Framework
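
The reweighting idea mentioned in the abstract above can be illustrated on a plain list of singular values. This is a sketch under assumptions: the weight 1 / (s + eps) is borrowed from the reweighted $l_1$ scheme the abstract names as its inspiration, the function name `reweighted_shrink` and the parameters `tau` and `eps` are hypothetical, and the t-SVD itself is not implemented.

```python
def reweighted_shrink(singular_values, tau, eps=1e-3):
    """Soft-threshold non-negative singular values with per-value weights.

    Large singular values get a small weight and are barely shrunk;
    small ones get a large weight and are pushed to zero.
    """
    shrunk = []
    for s in singular_values:
        weight = 1.0 / (s + eps)  # assumed reweighting rule, not the paper's
        shrunk.append(max(s - tau * weight, 0.0))
    return shrunk


values = [10.0, 1.0, 0.1]
result = reweighted_shrink(values, tau=0.5)
print(result)
```

Compared with a uniform threshold, this keeps the dominant singular values almost intact while still zeroing the small ones, which is the low-rank-promoting effect the reweighting is meant to strengthen.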

A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel


Title	A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel
Authors	Mojtaba Kazemi, Ehsan Mohammadi. P, Parichehr shahidi sadeghi, Mohamad B. Menhaj
Abstract	Gaussian noise removal is an interesting area in digital image processing, not only for improving visual quality but also for its impact on other post-processing algorithms like image registration or segmentation. Many state-of-the-art denoising methods are based on self-similarity or patch-based image processing. Specifically, Non-Local Means (NLM) as a patch-based filter has gained increasing attention in recent years. Essentially, this filter tends to obtain the noise-less signal value by computing the Gaussian-weighted Euclidean distance between the patch under processing and other patches inside the image. However, the NLM filter is sensitive to outliers (pixels whose intensity values are far from those of other pixels) inside the patch, because pixels at symmetric locations in the patch are assigned the same weight. This can lead to sub-optimal denoising performance when the destructive nature of noise generates some outliers inside patches. In this paper, we propose a new weighting approach to modify the Gaussian kernel of the NLM filter. Our approach employs the geometric distance between image intensities to come up with new weights for each pixel of a patch, lowering the impact of outliers on the denoising performance. Experiments on a set of standard images and different noise levels show that our proposed method outperforms the other compared denoising filters.
Tasks	Denoising, Image Registration
Published	2016-12-03
URL	http://arxiv.org/abs/1612.01006v1
PDF	http://arxiv.org/pdf/1612.01006v1.pdf
PWC	https://paperswithcode.com/paper/a-non-local-means-approach-for-gaussian-noise
Repo
Framework
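
For orientation, the baseline NLM filter described in the abstract above can be sketched on a 1-D signal. This is the standard filter with a position-only Gaussian kernel inside the patch, not the modified weighting kernel the paper proposes; the function name `nlm_denoise_1d` and the parameters `half_patch`, `h`, and `sigma` are hypothetical choices for illustration.

```python
import math


def nlm_denoise_1d(signal, half_patch=1, h=0.5, sigma=1.0):
    """Baseline Non-Local Means on a 1-D signal (illustrative sketch)."""
    n = len(signal)
    offsets = range(-half_patch, half_patch + 1)
    # Gaussian kernel over positions inside the patch. It is symmetric,
    # so pixels at mirrored positions always receive the same weight.
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    def patch(i):
        # Clamp indices at the borders so every patch has full length.
        return [signal[min(max(i + k, 0), n - 1)] for k in offsets]

    out = []
    for i in range(n):
        p_i = patch(i)
        num = den = 0.0
        for j in range(n):
            p_j = patch(j)
            # Gaussian-weighted squared Euclidean distance between patches.
            d2 = sum(g * (a - b) ** 2 for g, a, b in zip(kernel, p_i, p_j))
            w = math.exp(-d2 / (h * h))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out


noisy = [0.0, 0.1, -0.1, 0.0, 1.0, 0.9, 1.1, 1.0]
denoised = nlm_denoise_1d(noisy)
print(denoised)
```

The kernel depends only on position within the patch, so an outlier pixel is weighted exactly like a clean pixel at the mirrored position; that is the limitation the abstract attributes to the baseline.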


Multi-Modal Hybrid Deep Neural Network for Speech Enhancement


Title	Multi-Modal Hybrid Deep Neural Network for Speech Enhancement
Authors	Zhenzhou Wu, Sunil Sivadas, Yong Kiam Tan, Ma Bin, Rick Siow Mong Goh
Abstract	Deep Neural Networks (DNN) have been successful in enhancing noisy speech signals. Enhancement is achieved by learning a nonlinear mapping function from the features of the corrupted speech signal to those of the reference clean speech signal. The quality of predicted features can be improved by providing additional side channel information that is robust to noise, such as visual cues. In this paper we propose a novel deep learning model inspired by insights from human audio visual perception. In the proposed unified hybrid architecture, features from a Convolution Neural Network (CNN) that processes the visual cues and features from a fully connected DNN that processes the audio signal are integrated using a Bidirectional Long Short-Term Memory (BiLSTM) network. The parameters of the hybrid model are jointly learned using backpropagation. We compare the quality of enhanced speech from the hybrid models with those from traditional DNN and BiLSTM models.
Tasks	Speech Enhancement
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04750v1
PDF	http://arxiv.org/pdf/1606.04750v1.pdf
PWC	https://paperswithcode.com/paper/multi-modal-hybrid-deep-neural-network-for
Repo
Framework

Single Channel Speech Enhancement Using Outlier Detection


Title	Single Channel Speech Enhancement Using Outlier Detection
Authors	Eunjoon Cho, Bowon Lee, Ronald Schafer, Bernard Widrow
Abstract	Distortion of the underlying speech is a common problem for single-channel speech enhancement algorithms, and hinders such methods from being used more extensively. A dictionary-based speech enhancement method that emphasizes preserving the underlying speech is proposed. Spectral patches of clean speech are sampled and clustered to train a dictionary. Given a noisy speech spectral patch, the best matching dictionary entry is selected and used to estimate the noise power at each time-frequency bin. The noise estimation step is formulated as an outlier detection problem, where the noise at each bin is assumed present only if it is an outlier to the corresponding bin of the best matching dictionary entry. This framework assigns higher priority to removing spectral elements that strongly deviate from a typical spoken unit stored in the trained dictionary. Even without the aid of a separate noise model, this method can achieve significant noise reduction for various non-stationary noises, while effectively preserving the underlying speech in more challenging noisy environments.
Tasks	Outlier Detection, Speech Enhancement
Published	2016-05-04
URL	http://arxiv.org/abs/1605.01329v1
PDF	http://arxiv.org/pdf/1605.01329v1.pdf
PWC	https://paperswithcode.com/paper/single-channel-speech-enhancement-using
Repo
Framework

Capturing Dynamic Textured Surfaces of Moving Targets


Title	Capturing Dynamic Textured Surfaces of Moving Targets
Authors	Ruizhe Wang, Lingyu Wei, Etienne Vouga, Qixing Huang, Duygu Ceylan, Gerard Medioni, Hao Li
Abstract	We present an end-to-end system for reconstructing complete watertight and textured models of moving subjects such as clothed humans and animals, using only three or four handheld sensors. The heart of our framework is a new pairwise registration algorithm that minimizes, using a particle swarm strategy, an alignment error metric based on mutual visibility and occlusion. We show that this algorithm reliably registers partial scans with as little as 15% overlap without requiring any initial correspondences, and outperforms alternative global registration algorithms. This registration algorithm allows us to reconstruct moving subjects from free-viewpoint video produced by consumer-grade sensors, without extensive sensor calibration, constrained capture volume, expensive arrays of cameras, or templates of the subject geometry.
Tasks	Calibration
Published	2016-04-11
URL	http://arxiv.org/abs/1604.02801v1
PDF	http://arxiv.org/pdf/1604.02801v1.pdf
PWC	https://paperswithcode.com/paper/capturing-dynamic-textured-surfaces-of-moving
Repo
Framework

Ask Your Neurons: A Deep Learning Approach to Visual Question Answering


Title	Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Authors	Mateusz Malinowski, Marcus Rohrbach, Mario Fritz
Abstract	We address a question answering task on real-world images that is set up as a Visual Turing Test. By combining the latest advances in image representation and natural language processing, we propose Ask Your Neurons, a scalable, jointly trained, end-to-end formulation to this problem. In contrast to previous efforts, we are facing a multi-modal problem where the language output (answer) is conditioned on visual and natural language inputs (image and question). We provide additional insights into the problem by analyzing how much information is contained only in the language part, for which we provide a new human baseline. To study human consensus, which is related to the ambiguities inherent in this challenging task, we propose two novel metrics and collect additional answers which extend the original DAQUAR dataset to DAQUAR-Consensus. Moreover, we also extend our analysis to VQA, a large-scale dataset for question answering about images, where we investigate some particular design choices and show the importance of stronger visual models. At the same time, we achieve strong performance of our model that still uses a global image representation. Finally, based on such analysis, we refine our Ask Your Neurons on DAQUAR, which also leads to better performance on this challenging task.
Tasks	Question Answering, Visual Question Answering
Published	2016-05-09
URL	http://arxiv.org/abs/1605.02697v2
PDF	http://arxiv.org/pdf/1605.02697v2.pdf
PWC	https://paperswithcode.com/paper/ask-your-neurons-a-deep-learning-approach-to
Repo
Framework

Probabilistic Graphical Models on Multi-Core CPUs using Java 8


Title	Probabilistic Graphical Models on Multi-Core CPUs using Java 8
Authors	Andres R. Masegosa, Ana M. Martinez, Hanen Borchani
Abstract	In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms that deal with inference and learning of PGMs from data, namely maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and parallel processing of same-size batches of data sets using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions.
Tasks
Published	2016-04-27
URL	http://arxiv.org/abs/1604.07990v1
PDF	http://arxiv.org/pdf/1604.07990v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-graphical-models-on-multi-core
Repo
Framework

Rapid Prediction of Player Retention in Free-to-Play Mobile Games


Title	Rapid Prediction of Player Retention in Free-to-Play Mobile Games
Authors	Anders Drachen, Eric Thurston Lundquist, Yungjen Kung, Pranav Simha Rao, Diego Klabjan, Rafet Sifa, Julian Runge
Abstract	Predicting and improving player retention is crucial to the success of mobile Free-to-Play games. This paper explores the problem of rapid retention prediction in this context. Heuristic modeling approaches are introduced as a way of building simple rules for predicting short-term retention. Compared to common classification algorithms, our heuristic-based approach achieves reasonable and comparable performance using information from the first session, day, and week of player activity.
Tasks
Published	2016-07-12
URL	http://arxiv.org/abs/1607.03202v1
PDF	http://arxiv.org/pdf/1607.03202v1.pdf
PWC	https://paperswithcode.com/paper/rapid-prediction-of-player-retention-in-free
Repo
Framework

A Theme-Rewriting Approach for Generating Algebra Word Problems


Title	A Theme-Rewriting Approach for Generating Algebra Word Problems
Authors	Rik Koncel-Kedziorski, Ioannis Konstas, Luke Zettlemoyer, Hannaneh Hajishirzi
Abstract	Texts present coherent stories that have a particular theme or overall setting, for example science fiction or western. In this paper, we present a text generation method called rewriting that edits existing human-authored narratives to change their theme without changing the underlying story. We apply the approach to math word problems, where it might help students stay more engaged by quickly transforming all of their homework assignments to the theme of their favorite movie without changing the math concepts that are being taught. Our rewriting method uses a two-stage decoding process, which proposes new words from the target theme and scores the resulting stories according to a number of factors defining aspects of syntactic, semantic, and thematic coherence. Experiments demonstrate that the final stories typically represent the new theme well while still testing the original math concepts, outperforming a number of baselines. We also release a new dataset of human-authored rewrites of math word problems in several themes.
Tasks	Text Generation
Published	2016-10-19
URL	http://arxiv.org/abs/1610.06210v1
PDF	http://arxiv.org/pdf/1610.06210v1.pdf
PWC	https://paperswithcode.com/paper/a-theme-rewriting-approach-for-generating
Repo
Framework


Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets, and Complexity


Title	Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets, and Complexity
Authors	James Cussens, Matti Järvisalo, Janne H. Korhonen, Mark Bartlett
Abstract	The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks, arguably the most central class of graphical models, especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the GOBNILP system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.
Tasks
Published	2016-05-13
URL	http://arxiv.org/abs/1605.04071v2
PDF	http://arxiv.org/pdf/1605.04071v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-network-structure-learning-with
Repo
Framework

Machine olfaction using time scattering of sensor multiresolution graphs


Title	Machine olfaction using time scattering of sensor multiresolution graphs
Authors	Leonid Gugel, Yoel Shkolnisky, Shai Dekel
Abstract	In this paper we construct a learning architecture for high dimensional time series sampled by sensor arrangements. Using a redundant wavelet decomposition on a graph constructed over the sensor locations, our algorithm is able to construct discriminative features that exploit the mutual information between the sensors. The algorithm then applies scattering networks to the time series graphs to create the feature space. We demonstrate our method on a machine olfaction problem, where one needs to classify the gas type and the location where it originates from data sampled by an array of sensors. Our experimental results clearly demonstrate that our method outperforms classical machine learning techniques used in previous studies.
Tasks	Time Series
Published	2016-02-13
URL	http://arxiv.org/abs/1602.04358v1
PDF	http://arxiv.org/pdf/1602.04358v1.pdf
PWC	https://paperswithcode.com/paper/machine-olfaction-using-time-scattering-of
Repo
Framework

FSMJ: Feature Selection with Maximum Jensen-Shannon Divergence for Text Categorization


Title	FSMJ: Feature Selection with Maximum Jensen-Shannon Divergence for Text Categorization
Authors	Bo Tang, Haibo He
Abstract	In this paper, we present a new wrapper feature selection approach based on Jensen-Shannon (JS) divergence, termed feature selection with maximum JS-divergence (FSMJ), for text categorization. Unlike most existing feature selection approaches, the proposed FSMJ approach is based on real-valued features which provide more information for discrimination than binary-valued features used in conventional approaches. We show that the FSMJ is a greedy approach and the JS-divergence monotonically increases when more features are selected. We conduct several experiments on real-life data sets, compared with the state-of-the-art feature selection approaches for text categorization. The superior performance of the proposed FSMJ approach demonstrates its effectiveness and further indicates its wide potential applications on data mining.
Tasks	Feature Selection, Text Categorization
Published	2016-06-20
URL	http://arxiv.org/abs/1606.06366v1
PDF	http://arxiv.org/pdf/1606.06366v1.pdf
PWC	https://paperswithcode.com/paper/fsmj-feature-selection-with-maximum-jensen
Repo
Framework
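
The Jensen-Shannon divergence that FSMJ is built on can be computed directly from its definition. The sketch below is a generic two-distribution implementation with natural logarithms; the function names `kl` and `js_divergence` and the example distributions are hypothetical, and the paper's greedy feature selection on real-valued features is not reproduced.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    # The mixture m is positive wherever p or q is, so kl() never divides by 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical class-conditional distributions of a feature over three bins.
class_a = [0.7, 0.2, 0.1]
class_b = [0.1, 0.3, 0.6]
print(js_divergence(class_a, class_b))
```

The divergence is symmetric and, with natural logarithms, bounded between 0 (identical distributions) and log 2 (disjoint supports), so a larger value indicates that the feature separates the two classes better.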