Paper Group ANR 1091
Bayesian causal inference via probabilistic program synthesis
Title | Bayesian causal inference via probabilistic program synthesis |
Authors | Sam Witty, Alexander Lew, David Jensen, Vikash Mansinghka |
Abstract | Causal inference can be formalized as Bayesian inference that combines a prior distribution over causal models and likelihoods that account for both observations and interventions. We show that it is possible to implement this approach using a sufficiently expressive probabilistic programming language. Priors are represented using probabilistic programs that generate source code in a domain specific language. Interventions are represented using probabilistic programs that edit this source code to modify the original generative process. This approach makes it straightforward to incorporate data from atomic interventions, as well as shift interventions, variance-scaling interventions, and other interventions that modify causal structure. This approach also enables the use of general-purpose inference machinery for probabilistic programs to infer probable causal structures and parameters from data. This abstract describes a prototype of this approach in the Gen probabilistic programming language. |
Tasks | Bayesian Inference, Causal Inference, Probabilistic Programming, Program Synthesis |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14124v1 |
https://arxiv.org/pdf/1910.14124v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-causal-inference-via-probabilistic |
Repo | |
Framework | |
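The three intervention types named in the abstract can be illustrated with a toy sketch. This is not the authors' Gen prototype: the dictionary-based model, the helper names (`make_model`, `sample`, `atomic`, `shift`, `scale_variance`), and the linear-Gaussian mechanisms are all assumptions made for illustration. Each intervention is written as a function that edits the model description and returns a modified generative process.

```python
import random

# Toy structural causal model (hypothetical representation):
# variable -> (parents, mechanism, noise_std).
# Insertion order is assumed to be a valid topological order.
def make_model():
    return {
        "x": ((), lambda: 0.0, 1.0),
        "y": (("x",), lambda x: 2.0 * x, 0.5),
    }

def sample(model, rng):
    """Draw one joint sample by running each mechanism and adding Gaussian noise."""
    values = {}
    for name, (parents, mechanism, noise_std) in model.items():
        mean = mechanism(*(values[p] for p in parents))
        values[name] = mean + rng.gauss(0.0, noise_std)
    return values

def atomic(model, target, value):
    """Atomic intervention: cut the target from its parents and fix its value."""
    edited = dict(model)
    edited[target] = ((), lambda: value, 0.0)
    return edited

def shift(model, target, delta):
    """Shift intervention: keep the mechanism but add a constant offset."""
    parents, mechanism, noise_std = model[target]
    edited = dict(model)
    edited[target] = (parents, lambda *args: mechanism(*args) + delta, noise_std)
    return edited

def scale_variance(model, target, factor):
    """Variance-scaling intervention: multiply the noise variance by `factor`."""
    parents, mechanism, noise_std = model[target]
    edited = dict(model)
    edited[target] = (parents, mechanism, noise_std * factor ** 0.5)
    return edited

if __name__ == "__main__":
    base = make_model()
    print(sample(atomic(base, "x", 3.0), random.Random(0)))
```

Because every intervention returns a new model rather than mutating the original, observational and interventional data can be scored against the same base model, which mirrors the role the abstract gives to program edits.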
Visual Confusion Label Tree For Image Classification
Title | Visual Confusion Label Tree For Image Classification |
Authors | Yuntao Liu, Yong Dou, Ruochun Jin, Rongchun Li |
Abstract | Convolutional neural network models are widely used in image classification tasks. However, their running time is too long to meet the strict real-time requirements of mobile devices. To optimize such models and meet these requirements, we propose a method that replaces the fully-connected layers of convolutional neural network models with a tree classifier. Specifically, we construct a Visual Confusion Label Tree based on the output of the convolutional neural network models, and use a multi-kernel SVM plus classifier with hierarchical constraints to train the tree classifier. Focusing on those confusion subsets instead of the entire set of categories makes the tree classifier more discriminative, and replacing the fully-connected layers reduces the original running time. Experiments show that our tree classifier improves on the state-of-the-art tree classifier by 4.3% and 2.4% in top-1 accuracy on the CIFAR-100 and ImageNet datasets, respectively. Additionally, our method achieves 124x and 115x speedups over the fully-connected layers of AlexNet and VGG16 without a decline in accuracy. |
Tasks | Image Classification |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02012v1 |
https://arxiv.org/pdf/1906.02012v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-confusion-label-tree-for-image |
Repo | |
Framework | |
Efficient Toxicity Prediction via Simple Features Using Shallow Neural Networks and Decision Trees
Title | Efficient Toxicity Prediction via Simple Features Using Shallow Neural Networks and Decision Trees |
Authors | Abdul Karim, Avinash Mishra, M A Hakim Newton, Abdul Sattar |
Abstract | Toxicity prediction of chemical compounds is a grand challenge. Recent work has made significant progress in accuracy, but at the cost of using a huge set of features, implementing a complex black-box technique such as a deep neural network, and exploiting enormous computational resources. In this paper, we strongly argue for models and methods that are simple in machine learning characteristics, efficient in computing resource usage, and powerful enough to achieve very high accuracy levels. To demonstrate this, we develop a single task-based chemical toxicity prediction framework using only 2D features, which are less compute-intensive. We use a decision tree to obtain an optimum number of features from a collection of thousands of them. We use a shallow neural network and jointly optimize it with the decision tree, taking both network parameters and input features into account. Our model needs only a minute on a single CPU for its training, while existing methods using deep neural networks need about 10 min on an NVIDIA Tesla K40 GPU. Nevertheless, we obtain similar or better performance on several toxicity benchmark tasks. We also develop a cumulative feature ranking method that enables us to identify features that can help chemists prescreen toxic compounds effectively. |
Tasks | |
Published | 2019-01-26 |
URL | http://arxiv.org/abs/1901.09240v1 |
http://arxiv.org/pdf/1901.09240v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-toxicity-prediction-via-simple |
Repo | |
Framework | |
Learning to Rank Broad and Narrow Queries in E-Commerce
Title | Learning to Rank Broad and Narrow Queries in E-Commerce |
Authors | Siddhartha Devapujula, Sagar Arora, Sumit Borar |
Abstract | Search is a prominent channel for discovering products on an e-commerce platform. Ranking the products retrieved from search is crucial to addressing customers’ needs and optimizing for business metrics. While Learning to Rank (LETOR) models have been extensively studied and have demonstrated efficacy in the context of web search, they remain a relatively new research area in e-commerce. In this paper, we present a framework for building a LETOR model for an e-commerce platform. We analyze user queries and propose a mechanism to segment queries into broad and narrow based on the user’s intent. We discuss different types of features (query, product, and query-product) and the challenges in using them. We show that sparsity in product features can be tackled through a denoising auto-encoder, while skip-gram based word embeddings help solve the query-product sparsity issues. We also present various target metrics that can be employed for evaluating search results and compare their robustness. Further, we build and compare the performance of both pointwise and pairwise LETOR models on a fashion category data set. We also build and compare distinct models for broad and narrow queries, analyze feature importance across these, and show that these specialized models perform better than a combined model in the fashion world. |
Tasks | Denoising, Feature Importance, Learning-To-Rank, Word Embeddings |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.01549v2 |
https://arxiv.org/pdf/1907.01549v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-rank-broad-and-narrow-queries-in |
Repo | |
Framework | |
Amortized Rejection Sampling in Universal Probabilistic Programming
Title | Amortized Rejection Sampling in Universal Probabilistic Programming |
Authors | Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood |
Abstract | Existing approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. An instance of this is importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove finite variance of our estimator and empirically demonstrate our method’s correctness and efficiency compared to existing alternatives on generative programs containing rejection sampling loops and discuss how to implement our method in a generic probabilistic programming framework. |
Tasks | Probabilistic Programming |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.09056v2 |
https://arxiv.org/pdf/1910.09056v2.pdf | |
PWC | https://paperswithcode.com/paper/amortized-rejection-sampling-in-universal |
Repo | |
Framework | |
Principal Model Analysis Based on Partial Least Squares
Title | Principal Model Analysis Based on Partial Least Squares |
Authors | Qiwei Xie, Liang Tang, Weifu Li, Vijay John, Yong Hu |
Abstract | Motivated by the Bagging Partial Least Squares (PLS) and Principal Component Analysis (PCA) algorithms, we propose a Principal Model Analysis (PMA) method in this paper. In the proposed PMA algorithm, the PCA and the PLS are combined. In the method, multiple PLS models are trained on sub-training sets, derived from the original training set based on the random sampling with replacement method. The regression coefficients of all the sub-PLS models are fused in a joint regression coefficient matrix. The final projection direction is then estimated by performing the PCA on the joint regression coefficient matrix. The proposed PMA method is compared with other traditional dimension reduction methods, such as PLS, Bagging PLS, Linear discriminant analysis (LDA) and PLS-LDA. Experimental results on six public datasets show that our proposed method can achieve better classification performance and is usually more stable. |
Tasks | Dimensionality Reduction |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02422v1 |
http://arxiv.org/pdf/1902.02422v1.pdf | |
PWC | https://paperswithcode.com/paper/principal-model-analysis-based-on-partial |
Repo | |
Framework | |
Variational Uncalibrated Photometric Stereo under General Lighting
Title | Variational Uncalibrated Photometric Stereo under General Lighting |
Authors | Bjoern Haefner, Zhenzhang Ye, Maolin Gao, Tao Wu, Yvain Quéau, Daniel Cremers |
Abstract | Photometric stereo (PS) techniques nowadays remain constrained to an ideal laboratory setup where modeling and calibration of lighting is amenable. To eliminate such restrictions, we propose an efficient principled variational approach to uncalibrated PS under general illumination. To this end, the Lambertian reflectance model is approximated through a spherical harmonic expansion, which preserves the spatial invariance of the lighting. The joint recovery of shape, reflectance and illumination is then formulated as a single variational problem. There the shape estimation is carried out directly in terms of the underlying perspective depth map, thus implicitly ensuring integrability and bypassing the need for a subsequent normal integration. To tackle the resulting nonconvex problem numerically, we undertake a two-phase procedure to initialize a balloon-like perspective depth map, followed by a “lagged” block coordinate descent scheme. The experiments validate efficiency and robustness of this approach. Across a variety of evaluations, we are able to reduce the mean angular error consistently by a factor of 2-3 compared to the state-of-the-art. |
Tasks | Calibration |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.03942v2 |
https://arxiv.org/pdf/1904.03942v2.pdf | |
PWC | https://paperswithcode.com/paper/variational-uncalibrated-photometric-stereo |
Repo | |
Framework | |
Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data
Title | Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data |
Authors | Davoud Ataee Tarzanagh, George Michailidis |
Abstract | We introduce a general tensor model suitable for data analytic tasks for heterogeneous data sets, wherein there are joint low-rank structures within groups of observations, but also discriminative structures across different groups. To capture such complex structures, a double core tensor (DCOT) factorization model is introduced together with a family of smoothing loss functions. By leveraging the proposed smoothing function, the model accurately estimates the model factors, even in the presence of missing entries. A linearized ADMM method is employed to solve regularized versions of DCOT factorizations, that avoid large tensor operations and large memory storage requirements. Further, we establish theoretically its global convergence, together with consistency of the estimates of the model parameters. The effectiveness of the DCOT model is illustrated on several real-world examples including image completion, recommender systems, subspace clustering and detecting modules in heterogeneous Omics multi-modal data, since it provides more insightful decompositions than conventional tensor methods. |
Tasks | Recommendation Systems |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10454v1 |
https://arxiv.org/pdf/1911.10454v1.pdf | |
PWC | https://paperswithcode.com/paper/regularized-and-smooth-double-core-tensor |
Repo | |
Framework | |
Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning
Title | Revised Progressive-Hedging-Algorithm Based Two-layer Solution Scheme for Bayesian Reinforcement Learning |
Authors | Xin Huang, Duan Li, Daniel Zhuoyu Long |
Abstract | Stochastic control with both inherent random system noise and a lack of knowledge of system parameters constitutes the core and fundamental topic in reinforcement learning (RL), especially in non-episodic situations where online learning is much more demanding. This challenge has recently been addressed in Bayesian RL, where some approximation techniques have been developed to find suboptimal policies. While existing approaches mainly focus on approximating the value function, or on Thompson sampling, we propose a novel two-layer solution scheme in this paper to approximate the optimal policy directly, by combining time-decomposition based dynamic programming (DP) at the lower layer and the scenario-decomposition based revised progressive hedging algorithm (PHA) at the upper layer, for a type of Bayesian RL problem. The key feature of our approach is to separate reducible system uncertainty from irreducible uncertainty at two different layers, thus decomposing and conquering. We demonstrate our solution framework in particular on the linear-quadratic-Gaussian problem with unknown gain, which, although seemingly simple, has been a notorious subject in dual control for more than half a century. |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09035v1 |
https://arxiv.org/pdf/1906.09035v1.pdf | |
PWC | https://paperswithcode.com/paper/revised-progressive-hedging-algorithm-based |
Repo | |
Framework | |
Generalized Dilation Neural Networks
Title | Generalized Dilation Neural Networks |
Authors | Gavneet Singh Chadha, Jan Niclas Reimann, Andreas Schwung |
Abstract | Vanilla convolutional neural networks are known to provide superior performance not only in image recognition tasks but also in natural language processing and time series analysis. One of the strengths of convolutional layers is the ability to learn features about spatial relations in the input domain using various parameterized convolutional kernels. However, in time series analysis, learning such spatial relations is neither necessarily required nor effective. In such cases, kernels which model temporal dependencies or kernels with broader spatial resolutions are recommended for more efficient training, as proposed by dilation kernels. However, the dilation has to be fixed a priori, which limits the flexibility of the kernels. We propose generalized dilation networks which generalize the initial dilations in two aspects. First, we derive an end-to-end learnable architecture for dilation layers where the dilation rate can also be learned. Second, we break up the strict dilation structure, in that we develop kernels operating independently in the input space. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.02961v1 |
https://arxiv.org/pdf/1905.02961v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-dilation-neural-networks |
Repo | |
Framework | |
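For context, the fixed-dilation baseline that the abstract generalizes can be sketched in a few lines. This is a plain 1D dilated convolution with a dilation rate chosen a priori, not the learnable dilation layer the paper proposes; the function name `dilated_conv1d` and the "valid" (no padding) boundary handling are assumptions. As in most deep learning frameworks, the kernel is not flipped, so this is technically a cross-correlation.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (no kernel flip, no padding).

    Kernel tap k reads signal[i + k * dilation], so a larger dilation widens
    the receptive field without adding parameters.
    """
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]
    print(dilated_conv1d(series, [1, -1], dilation=1))  # adjacent differences
    print(dilated_conv1d(series, [1, -1], dilation=2))  # differences two steps apart
```

With a two-tap kernel, changing `dilation` from 1 to 2 changes which pair of time steps is compared while the parameter count stays at two, which is the rigidity the abstract aims to relax.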
GLOSS: Generative Latent Optimization of Sentence Representations
Title | GLOSS: Generative Latent Optimization of Sentence Representations |
Authors | Sidak Pal Singh, Angela Fan, Michael Auli |
Abstract | We propose a method to learn unsupervised sentence representations in a non-compositional manner based on Generative Latent Optimization. Our approach does not impose any assumptions on how words are to be combined into a sentence representation. We discuss a simple Bag of Words model as well as a variant that models word positions. Both are trained to reconstruct the sentence based on a latent code and our model can be used to generate text. Experiments show large improvements over the related Paragraph Vectors. Compared to uSIF, we achieve a relative improvement of 5% when trained on the same data and our method performs competitively to Sent2vec while trained on 30 times less data. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06385v1 |
https://arxiv.org/pdf/1907.06385v1.pdf | |
PWC | https://paperswithcode.com/paper/gloss-generative-latent-optimization-of |
Repo | |
Framework | |
Pruning a BERT-based Question Answering Model
Title | Pruning a BERT-based Question Answering Model |
Authors | J. S. McCarley |
Abstract | We investigate compressing a BERT-based question answering system by pruning parameters from the underlying BERT model. We start from models trained for SQuAD 2.0 and introduce gates that allow selected parts of transformers to be individually eliminated. Specifically, we investigate (1) reducing the number of attention heads in each transformer, (2) reducing the intermediate width of the feed-forward sublayer of each transformer, and (3) reducing the embedding dimension. We compare several approaches for determining the values of these gates. We find that a combination of pruning attention heads and the feed-forward layer almost doubles the decoding speed, with only a 1.5 f-point loss in accuracy. |
Tasks | Question Answering |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06360v1 |
https://arxiv.org/pdf/1910.06360v1.pdf | |
PWC | https://paperswithcode.com/paper/pruning-a-bert-based-question-answering-model |
Repo | |
Framework | |
Matrix denoising for weighted loss functions and heterogeneous signals
Title | Matrix denoising for weighted loss functions and heterogeneous signals |
Authors | William Leeb |
Abstract | We consider the problem of estimating a low-rank matrix from a noisy observed matrix. Previous work has shown that the optimal method depends crucially on the choice of loss function. In this paper, we use a family of weighted loss functions, which arise naturally in many settings such as heteroscedastic noise, missing data, and submatrix denoising. However, weighted loss functions are challenging to analyze because they are not orthogonally-invariant. We derive optimal spectral denoisers for these weighted loss functions. By combining different weights, we then use these optimal denoisers to construct a new denoiser that exploits heterogeneity in the signal matrix to boost estimation with unweighted loss. |
Tasks | Denoising |
Published | 2019-02-25 |
URL | https://arxiv.org/abs/1902.09474v2 |
https://arxiv.org/pdf/1902.09474v2.pdf | |
PWC | https://paperswithcode.com/paper/matrix-denoising-for-weighted-loss-functions |
Repo | |
Framework | |
A Non-Intrusive Method of Face Liveness Detection Using Specular Reflection and Local Binary Patterns
Title | A Non-Intrusive Method of Face Liveness Detection Using Specular Reflection and Local Binary Patterns |
Authors | Shivang Bharadwaj, Bhupendra Niranjan, Anant Kumar |
Abstract | With the advent of ubiquitous facial recognition technology in our everyday life, face spoofing presents a serious threat to the reliability of system security. A spoofing attack occurs when a person tries to impersonate another person's biometric traits in order to circumvent the biometric security of the system. Much work has been done to create systems, both intrusive and non-intrusive, to tackle the ingenious ways in which spoofing attacks try to bypass biometric authorization systems, but at the cost of computation or robustness. In this paper, we propose a robust, computationally swift and non-intrusive method to detect face spoofing attacks consisting of recaptured photographs of faces using Local Binary Patterns (LBP) and Specular Reflection. We treat the application as a binary classification problem and use a Support Vector Machine (SVM) classifier to classify the photograph as real or fake. Experimental analysis shows competitive results for our method on publicly available datasets when compared to other works. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06540v2 |
https://arxiv.org/pdf/1905.06540v2.pdf | |
PWC | https://paperswithcode.com/paper/a-non-intrusive-method-of-face-liveness |
Repo | |
Framework | |
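The Local Binary Patterns feature mentioned in the abstract can be sketched as follows. This is the basic 3x3, 8-neighbour LBP operator, not necessarily the variant the authors use; the function names, the clockwise bit ordering starting at the top-left neighbour, and the `>=` comparison are assumptions. The resulting 256-bin histogram is the kind of texture descriptor that could be passed to an SVM classifier.

```python
# Neighbour offsets, clockwise from the top-left pixel (ordering is a convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, r, c):
    """8-bit LBP code of the interior pixel at (r, c) of a 2D grayscale image."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # Set the bit when the neighbour is at least as bright as the center.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

if __name__ == "__main__":
    patch = [[9, 0, 0],
             [0, 5, 0],
             [0, 0, 0]]
    print(lbp_code(patch, 1, 1))
```

Border pixels are skipped because they lack a full 3x3 neighbourhood; padding the image first is an equally common choice.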
Automatic segmentation of kidney and liver tumors in CT images
Title | Automatic segmentation of kidney and liver tumors in CT images |
Authors | Dina B. Efremova, Dmitry A. Konovalov, Thanongchai Siriapisith, Worapan Kusakunniran, Peter Haddawy |
Abstract | Automatic segmentation of hepatic lesions in computed tomography (CT) images is a challenging task due to the heterogeneous, diffuse shape of tumors and the complex background. To address the problem, more and more researchers rely on deep convolutional neural networks (CNNs) with 2D or 3D architectures, which have proven effective in a wide range of computer vision tasks, including medical image processing. In this technical report, we focus on a more careful approach to the learning process rather than on a complex CNN architecture. We chose the MICCAI 2017 LiTS dataset for training and the public 3DIRCADb dataset for validation of our method. The proposed algorithm reached a Dice score of 78.8% on the 3DIRCADb dataset. The described method was then applied to the 2019 Kidney Tumor Segmentation (KiTS-2019) challenge, where our single submission achieved Dice scores of 96.38% for kidney and 67.38% for tumor. |
Tasks | Computed Tomography (CT) |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01279v2 |
https://arxiv.org/pdf/1908.01279v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-segmentation-of-kidney-and-liver |
Repo | |
Framework | |
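The Dice score reported in the abstract is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted mask A and a ground-truth mask B. A minimal sketch for flat binary masks follows; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions, and challenge evaluation scripts may handle that edge case differently.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total

if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(f"Dice = {dice_score(predicted, reference):.3f}")
```

A 2D or 3D segmentation can be scored the same way after flattening both masks in the same voxel order.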