January 27, 2020

Paper Group ANR 1091

Bayesian causal inference via probabilistic program synthesis


Title	Bayesian causal inference via probabilistic program synthesis
Authors	Sam Witty, Alexander Lew, David Jensen, Vikash Mansinghka
Abstract	Causal inference can be formalized as Bayesian inference that combines a prior distribution over causal models and likelihoods that account for both observations and interventions. We show that it is possible to implement this approach using a sufficiently expressive probabilistic programming language. Priors are represented using probabilistic programs that generate source code in a domain specific language. Interventions are represented using probabilistic programs that edit this source code to modify the original generative process. This approach makes it straightforward to incorporate data from atomic interventions, as well as shift interventions, variance-scaling interventions, and other interventions that modify causal structure. This approach also enables the use of general-purpose inference machinery for probabilistic programs to infer probable causal structures and parameters from data. This abstract describes a prototype of this approach in the Gen probabilistic programming language.
Tasks	Bayesian Inference, Causal Inference, Probabilistic Programming, Program Synthesis
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14124v1
PDF	https://arxiv.org/pdf/1910.14124v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-causal-inference-via-probabilistic
Repo
Framework
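
The three intervention types named in the abstract (atomic, shift, and variance-scaling) can be illustrated on a toy model. The sketch below is plain Python rather than Gen, and it does not reproduce the authors' program-editing approach. The two-variable model `x -> y`, the function names, and all parameter values are hypothetical choices made for illustration.

```python
import random


def sample_x(rng, intervention=None):
    """Draw x from its natural mechanism N(0, 1), optionally intervened on."""
    x = rng.gauss(0.0, 1.0)
    if intervention is None:
        return x
    kind, value = intervention
    if kind == "atomic":   # do(x = value): the natural mechanism is ignored
        return value
    if kind == "shift":    # x <- x + value
        return x + value
    if kind == "scale":    # x <- value * x, so the variance scales by value**2
        return value * x
    raise ValueError(f"unknown intervention: {kind}")


def sample_y(rng, x, slope=2.0):
    """Hypothetical downstream mechanism: y depends linearly on x plus noise."""
    return slope * x + rng.gauss(0.0, 0.1)


def simulate(n, intervention=None, seed=0):
    """Return n (x, y) pairs drawn under the given intervention."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = sample_x(rng, intervention)
        pairs.append((x, sample_y(rng, x)))
    return pairs
```

Calling `simulate(100, ("shift", 3.0))` and `simulate(100)` with the same seed gives paired samples whose `x` values differ by the shift, which makes the effect of each intervention easy to inspect.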

Visual Confusion Label Tree For Image Classification


Title	Visual Confusion Label Tree For Image Classification
Authors	Yuntao Liu, Yong Dou, Ruochun Jin, Rongchun Li
Abstract	Convolutional neural network models are widely used in image classification tasks. However, the running time of such models is too long to meet the strict real-time requirements of mobile devices. To optimize these models and meet that requirement, we propose a method that replaces the fully-connected layers of convolutional neural network models with a tree classifier. Specifically, we construct a Visual Confusion Label Tree based on the output of the convolutional neural network models, and use a multi-kernel SVM plus classifier with hierarchical constraints to train the tree classifier. Focusing on the confusion subsets instead of the entire set of categories makes the tree classifier more discriminative, and replacing the fully-connected layers reduces the original running time. Experiments show that our tree classifier improves on the state-of-the-art tree classifier by 4.3% and 2.4% in top-1 accuracy on the CIFAR-100 and ImageNet datasets, respectively. Additionally, our method achieves 124x and 115x speedups compared with the fully-connected layers of AlexNet and VGG16, without a decline in accuracy.
Tasks	Image Classification
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02012v1
PDF	https://arxiv.org/pdf/1906.02012v1.pdf
PWC	https://paperswithcode.com/paper/visual-confusion-label-tree-for-image
Repo
Framework

Efficient Toxicity Prediction via Simple Features Using Shallow Neural Networks and Decision Trees


Title	Efficient Toxicity Prediction via Simple Features Using Shallow Neural Networks and Decision Trees
Authors	Abdul Karim, Avinash Mishra, M A Hakim Newton, Abdul Sattar
Abstract	Toxicity prediction of chemical compounds is a grand challenge. Recent work has made significant progress in accuracy, but by using a huge set of features, implementing complex black-box techniques such as deep neural networks, and exploiting enormous computational resources. In this paper, we strongly argue for models and methods that are simple in their machine learning characteristics, efficient in computing resource usage, and powerful enough to achieve very high accuracy levels. To demonstrate this, we develop a single task-based chemical toxicity prediction framework using only 2D features, which are less compute intensive. We use a decision tree to select an optimum number of features from a collection of thousands. We use a shallow neural network and jointly optimize it with the decision tree, taking both network parameters and input features into account. Our model needs only a minute on a single CPU for training, while existing methods using deep neural networks need about 10 minutes on an NVIDIA Tesla K40 GPU. Nevertheless, we obtain similar or better performance on several toxicity benchmark tasks. We also develop a cumulative feature ranking method that enables us to identify features that can help chemists prescreen toxic compounds effectively.
Tasks
Published	2019-01-26
URL	http://arxiv.org/abs/1901.09240v1
PDF	http://arxiv.org/pdf/1901.09240v1.pdf
PWC	https://paperswithcode.com/paper/efficient-toxicity-prediction-via-simple
Repo
Framework

Learning to Rank Broad and Narrow Queries in E-Commerce


Title	Learning to Rank Broad and Narrow Queries in E-Commerce
Authors	Siddhartha Devapujula, Sagar Arora, Sumit Borar
Abstract	Search is a prominent channel for discovering products on an e-commerce platform. Ranking the products retrieved from search is crucial to address customers' needs and optimize for business metrics. While learning to rank (LETOR) models have been extensively studied and have demonstrated efficacy in the context of web search, they are a relatively new research area in e-commerce. In this paper, we present a framework for building a LETOR model for an e-commerce platform. We analyze user queries and propose a mechanism to segment queries into broad and narrow based on the user's intent. We discuss different types of features (query, product, and query-product) and the challenges in using them. We show that sparsity in product features can be tackled through a denoising auto-encoder, while skip-gram based word embeddings help solve the query-product sparsity issues. We also present various target metrics that can be employed for evaluating search results and compare their robustness. Further, we build and compare the performance of both pointwise and pairwise LETOR models on a fashion category data set. We also build and compare distinct models for broad and narrow queries, analyze feature importance across these, and show that these specialized models perform better than a combined model in the fashion world.
Tasks	Denoising, Feature Importance, Learning-To-Rank, Word Embeddings
Published	2019-07-01
URL	https://arxiv.org/abs/1907.01549v2
PDF	https://arxiv.org/pdf/1907.01549v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-rank-broad-and-narrow-queries-in
Repo
Framework

Amortized Rejection Sampling in Universal Probabilistic Programming


Title	Amortized Rejection Sampling in Universal Probabilistic Programming
Authors	Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood
Abstract	Existing approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. An instance of this is importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove that our estimator has finite variance and empirically demonstrate our method’s correctness and efficiency compared to existing alternatives on generative programs containing rejection sampling loops. We also discuss how to implement our method in a generic probabilistic programming framework.
Tasks	Probabilistic Programming
Published	2019-10-20
URL	https://arxiv.org/abs/1910.09056v2
PDF	https://arxiv.org/pdf/1910.09056v2.pdf
PWC	https://paperswithcode.com/paper/amortized-rejection-sampling-in-universal
Repo
Framework

Principal Model Analysis Based on Partial Least Squares


Title	Principal Model Analysis Based on Partial Least Squares
Authors	Qiwei Xie, Liang Tang, Weifu Li, Vijay John, Yong Hu
Abstract	Motivated by the Bagging Partial Least Squares (PLS) and Principal Component Analysis (PCA) algorithms, we propose a Principal Model Analysis (PMA) method in this paper. In the proposed PMA algorithm, the PCA and the PLS are combined. In the method, multiple PLS models are trained on sub-training sets, derived from the original training set based on the random sampling with replacement method. The regression coefficients of all the sub-PLS models are fused in a joint regression coefficient matrix. The final projection direction is then estimated by performing the PCA on the joint regression coefficient matrix. The proposed PMA method is compared with other traditional dimension reduction methods, such as PLS, Bagging PLS, Linear discriminant analysis (LDA) and PLS-LDA. Experimental results on six public datasets show that our proposed method can achieve better classification performance and is usually more stable.
Tasks	Dimensionality Reduction
Published	2019-02-06
URL	http://arxiv.org/abs/1902.02422v1
PDF	http://arxiv.org/pdf/1902.02422v1.pdf
PWC	https://paperswithcode.com/paper/principal-model-analysis-based-on-partial
Repo
Framework
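
The pipeline in the abstract (bootstrap sub-training sets, one PLS model per subset, a joint coefficient matrix, then PCA) can be sketched with NumPy. This is an illustrative reading, not the authors' implementation: the one-component PLS fit, the column centering before the PCA step, and the names `one_component_pls_coefs` and `principal_model_analysis` are all assumptions.

```python
import numpy as np


def one_component_pls_coefs(X, Y):
    """Regression coefficients (p x c) of a single-component PLS fit."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Weight vector: dominant left singular vector of the cross-covariance.
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = U[:, 0]
    t = Xc @ w                      # latent scores of the samples
    q = Yc.T @ t / (t @ t)          # regression of Y on the scores
    return np.outer(w, q)


def principal_model_analysis(X, Y, n_models=20, n_directions=2, seed=0):
    """Fuse bootstrap PLS coefficients and extract principal directions."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    blocks = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)    # random sampling with replacement
        blocks.append(one_component_pls_coefs(X[idx], Y[idx]))
    joint = np.hstack(blocks)               # joint coefficient matrix, p x (c * n_models)
    # PCA treating each coefficient column as one observation in feature space
    # (centering across columns is an assumption of this sketch).
    centered = joint - joint.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_directions]
```

Here `Y` is assumed to be a one-hot label matrix, and the returned columns are orthonormal projection directions; reduced features would then be obtained as `(X - X.mean(axis=0)) @ W`.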

Variational Uncalibrated Photometric Stereo under General Lighting


Title	Variational Uncalibrated Photometric Stereo under General Lighting
Authors	Bjoern Haefner, Zhenzhang Ye, Maolin Gao, Tao Wu, Yvain Quéau, Daniel Cremers
Abstract	Photometric stereo (PS) techniques nowadays remain constrained to an ideal laboratory setup where modeling and calibration of lighting is amenable. To eliminate such restrictions, we propose an efficient principled variational approach to uncalibrated PS under general illumination. To this end, the Lambertian reflectance model is approximated through a spherical harmonic expansion, which preserves the spatial invariance of the lighting. The joint recovery of shape, reflectance and illumination is then formulated as a single variational problem. There the shape estimation is carried out directly in terms of the underlying perspective depth map, thus implicitly ensuring integrability and bypassing the need for a subsequent normal integration. To tackle the resulting nonconvex problem numerically, we undertake a two-phase procedure to initialize a balloon-like perspective depth map, followed by a “lagged” block coordinate descent scheme. The experiments validate efficiency and robustness of this approach. Across a variety of evaluations, we are able to reduce the mean angular error consistently by a factor of 2-3 compared to the state-of-the-art.
Tasks	Calibration
Published	2019-04-08
URL	https://arxiv.org/abs/1904.03942v2
PDF	https://arxiv.org/pdf/1904.03942v2.pdf
PWC	https://paperswithcode.com/paper/variational-uncalibrated-photometric-stereo
Repo
Framework

Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data


Title	Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data
Authors	Davoud Ataee Tarzanagh, George Michailidis
Abstract	We introduce a general tensor model suitable for data analytic tasks for heterogeneous data sets, wherein there are joint low-rank structures within groups of observations, but also discriminative structures across different groups. To capture such complex structures, a double core tensor (DCOT) factorization model is introduced together with a family of smoothing loss functions. By leveraging the proposed smoothing function, the model accurately estimates the model factors, even in the presence of missing entries. A linearized ADMM method is employed to solve regularized versions of DCOT factorizations, that avoid large tensor operations and large memory storage requirements. Further, we establish theoretically its global convergence, together with consistency of the estimates of the model parameters. The effectiveness of the DCOT model is illustrated on several real-world examples including image completion, recommender systems, subspace clustering and detecting modules in heterogeneous Omics multi-modal data, since it provides more insightful decompositions than conventional tensor methods.
Tasks	Recommendation Systems
Published	2019-11-24
URL	https://arxiv.org/abs/1911.10454v1
PDF	https://arxiv.org/pdf/1911.10454v1.pdf
PWC	https://paperswithcode.com/paper/regularized-and-smooth-double-core-tensor
Repo
Framework

Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning


Title	Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning
Authors	Xin Huang, Duan Li, Daniel Zhuoyu Long
Abstract	Stochastic control with both inherent random system noise and lack of knowledge of system parameters constitutes the core and fundamental topic in reinforcement learning (RL), especially in non-episodic situations where online learning is much more demanding. This challenge has been notably addressed in Bayesian RL recently, where some approximation techniques have been developed to find suboptimal policies. While existing approaches mainly focus on approximating the value function or on Thompson sampling, we propose a novel two-layer solution scheme that approximates the optimal policy directly for a type of Bayesian RL problem. The scheme combines time-decomposition based dynamic programming (DP) at the lower layer with a scenario-decomposition based revised progressive hedging algorithm (PHA) at the upper layer. The key feature of our approach is to separate reducible system uncertainty from irreducible uncertainty at two different layers, thus decomposing and conquering. We demonstrate our solution framework in particular on the linear-quadratic-Gaussian problem with unknown gain, which, although seemingly simple, has been a notorious subject in dual control for more than half a century.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09035v1
PDF	https://arxiv.org/pdf/1906.09035v1.pdf
PWC	https://paperswithcode.com/paper/revised-progressive-hedging-algorithm-based
Repo
Framework

Generalized Dilation Neural Networks


Title	Generalized Dilation Neural Networks
Authors	Gavneet Singh Chadha, Jan Niclas Reimann, Andreas Schwung
Abstract	Vanilla convolutional neural networks are known to provide superior performance not only in image recognition tasks but also in natural language processing and time series analysis. One of the strengths of convolutional layers is the ability to learn features about spatial relations in the input domain using various parameterized convolutional kernels. However, in time series analysis, learning such spatial relations is neither necessarily required nor effective. In such cases, kernels that model temporal dependencies or kernels with broader spatial resolutions, as proposed by dilation kernels, are recommended for more efficient training. However, the dilation has to be fixed a priori, which limits the flexibility of the kernels. We propose generalized dilation networks, which generalize the initial dilations in two respects. First, we derive an end-to-end learnable architecture for dilation layers in which the dilation rate can also be learned. Second, we break up the strict dilation structure by developing kernels that operate independently in the input space.
Tasks	Time Series, Time Series Analysis
Published	2019-05-08
URL	https://arxiv.org/abs/1905.02961v1
PDF	https://arxiv.org/pdf/1905.02961v1.pdf
PWC	https://paperswithcode.com/paper/generalized-dilation-neural-networks
Repo
Framework
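
For reference, a conventional dilated convolution with a dilation rate fixed a priori, the baseline that the abstract generalizes, can be sketched in plain Python. This is not the learnable-dilation layer proposed in the paper; the function name `dilated_conv1d` and the valid-mode boundary handling are assumptions of this sketch.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with kernel taps `dilation` steps apart."""
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    span = (len(kernel) - 1) * dilation   # distance from the first to the last tap
    return [
        sum(k * signal[start + j * dilation] for j, k in enumerate(kernel))
        for start in range(len(signal) - span)
    ]
```

With `dilation=1` this reduces to an ordinary sliding-window filter; larger values widen the receptive field without adding kernel parameters.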

GLOSS: Generative Latent Optimization of Sentence Representations


Title	GLOSS: Generative Latent Optimization of Sentence Representations
Authors	Sidak Pal Singh, Angela Fan, Michael Auli
Abstract	We propose a method to learn unsupervised sentence representations in a non-compositional manner based on Generative Latent Optimization. Our approach does not impose any assumptions on how words are to be combined into a sentence representation. We discuss a simple Bag of Words model as well as a variant that models word positions. Both are trained to reconstruct the sentence based on a latent code and our model can be used to generate text. Experiments show large improvements over the related Paragraph Vectors. Compared to uSIF, we achieve a relative improvement of 5% when trained on the same data and our method performs competitively to Sent2vec while trained on 30 times less data.
Tasks
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06385v1
PDF	https://arxiv.org/pdf/1907.06385v1.pdf
PWC	https://paperswithcode.com/paper/gloss-generative-latent-optimization-of
Repo
Framework

Pruning a BERT-based Question Answering Model


Title	Pruning a BERT-based Question Answering Model
Authors	J. S. McCarley
Abstract	We investigate compressing a BERT-based question answering system by pruning parameters from the underlying BERT model. We start from models trained for SQuAD 2.0 and introduce gates that allow selected parts of transformers to be individually eliminated. Specifically, we investigate (1) reducing the number of attention heads in each transformer, (2) reducing the intermediate width of the feed-forward sublayer of each transformer, and (3) reducing the embedding dimension. We compare several approaches for determining the values of these gates. We find that a combination of pruning attention heads and the feed-forward layer almost doubles the decoding speed, with only a 1.5 f-point loss in accuracy.
Tasks	Question Answering
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06360v1
PDF	https://arxiv.org/pdf/1910.06360v1.pdf
PWC	https://paperswithcode.com/paper/pruning-a-bert-based-question-answering-model
Repo
Framework

Matrix denoising for weighted loss functions and heterogeneous signals


Title	Matrix denoising for weighted loss functions and heterogeneous signals
Authors	William Leeb
Abstract	We consider the problem of estimating a low-rank matrix from a noisy observed matrix. Previous work has shown that the optimal method depends crucially on the choice of loss function. In this paper, we use a family of weighted loss functions, which arise naturally in many settings such as heteroscedastic noise, missing data, and submatrix denoising. However, weighted loss functions are challenging to analyze because they are not orthogonally-invariant. We derive optimal spectral denoisers for these weighted loss functions. By combining different weights, we then use these optimal denoisers to construct a new denoiser that exploits heterogeneity in the signal matrix to boost estimation with unweighted loss.
Tasks	Denoising
Published	2019-02-25
URL	https://arxiv.org/abs/1902.09474v2
PDF	https://arxiv.org/pdf/1902.09474v2.pdf
PWC	https://paperswithcode.com/paper/matrix-denoising-for-weighted-loss-functions
Repo
Framework

A Non-Intrusive Method of Face Liveness Detection Using Specular Reflection and Local Binary Patterns


Title	A Non-Intrusive Method of Face Liveness Detection Using Specular Reflection and Local Binary Patterns
Authors	Shivang Bharadwaj, Bhupendra Niranjan, Anant Kumar
Abstract	With the advent of ubiquitous facial recognition technology in our everyday life, face spoofing presents a serious threat to the reliability of system security. A spoofing attack occurs when a person tries to impersonate another person's biometric traits in order to circumvent the biometric security of the system. Much work has been done to create systems, both intrusive and non-intrusive, that counter the ingenious ways in which spoofing attacks try to bypass biometric authorization systems, but at the cost of computation or robustness. In this paper, we propose a robust, computationally swift and non-intrusive method to detect face spoofing attacks consisting of recaptured photographs of faces, using Local Binary Patterns (LBP) and Specular Reflection. We treat the application as a binary classification problem and use a Support Vector Machine (SVM) classifier to classify the photograph as real or fake. Experimental analysis shows competitive results for our method on publicly available datasets when compared to other works.
Tasks
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06540v2
PDF	https://arxiv.org/pdf/1905.06540v2.pdf
PWC	https://paperswithcode.com/paper/a-non-intrusive-method-of-face-liveness
Repo
Framework
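
The Local Binary Patterns feature mentioned in the abstract can be sketched in its basic 3x3 form. The abstract does not say which LBP variant, neighbourhood, or bit ordering the authors use, so the clockwise ordering and the names `lbp_code` and `lbp_histogram` below are assumptions.

```python
# Clockwise neighbour offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit code: bit i is set when neighbour i is >= the centre pixel."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

A histogram like this is the kind of texture descriptor that could be passed to an SVM classifier, as the abstract describes.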

Automatic segmentation of kidney and liver tumors in CT images


Title	Automatic segmentation of kidney and liver tumors in CT images
Authors	Dina B. Efremova, Dmitry A. Konovalov, Thanongchai Siriapisith, Worapan Kusakunniran, Peter Haddawy
Abstract	Automatic segmentation of hepatic lesions in computed tomography (CT) images is a challenging task due to the heterogeneous, diffuse shape of tumors and the complex background. To address the problem, more and more researchers rely on deep convolutional neural networks (CNN) with 2D or 3D architectures, which have proven effective in a wide range of computer vision tasks, including medical image processing. In this technical report, we focus on a more careful approach to the learning process rather than on a complex CNN architecture. We chose the MICCAI 2017 LiTS dataset for training and the public 3DIRCADb dataset for validation of our method. The proposed algorithm reached a Dice score of 78.8% on the 3DIRCADb dataset. The described method was then applied to the 2019 Kidney Tumor Segmentation (KiTS-2019) challenge, where our single submission achieved Dice scores of 96.38% for kidney and 67.38% for tumor.
Tasks	Computed Tomography (CT)
Published	2019-08-04
URL	https://arxiv.org/abs/1908.01279v2
PDF	https://arxiv.org/pdf/1908.01279v2.pdf
PWC	https://paperswithcode.com/paper/automatic-segmentation-of-kidney-and-liver
Repo
Framework
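
The Dice score reported in the abstract measures overlap between a predicted and a reference segmentation mask. A minimal sketch for flattened binary masks follows; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions, and challenge evaluation scripts may differ in such details.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total
```

For example, two four-pixel masks that each mark two pixels and share one of them score 0.5.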