February 1, 2020


Paper Group AWR 312

Nonlinear Semi-Parametric Models for Survival Analysis. Brain Tissue Segmentation Using NeuroNet With Different Pre-processing Techniques. Estimating Density Models with Complex Truncation Boundaries. Multilingual is not enough: BERT for Finnish. Deep Hyperspectral Prior: Denoising, Inpainting, Super-Resolution. Dirichlet Simplex Nest and Geometric …

Nonlinear Semi-Parametric Models for Survival Analysis


Title	Nonlinear Semi-Parametric Models for Survival Analysis
Authors	Chirag Nagpal, Rohan Sangave, Amit Chahar, Parth Shah, Artur Dubrawski, Bhiksha Raj
Abstract	Semi-parametric survival analysis methods like the Cox Proportional Hazards (CPH) regression (Cox, 1972) are a popular approach for survival analysis. These methods involve fitting of the log-proportional hazard as a function of the covariates and are convenient as they do not require estimation of the baseline hazard rate. Recent approaches have involved learning non-linear representations of the input covariates and demonstrate improved performance. In this paper we argue against such deep parameterizations for survival analysis and experimentally demonstrate that more interpretable semi-parametric models inspired from mixtures of experts perform equally well or in some cases better than such overly parameterized deep models.
Tasks	Survival Analysis
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05865v1
PDF	https://arxiv.org/pdf/1905.05865v1.pdf
PWC	https://paperswithcode.com/paper/nonlinear-semi-parametric-models-for-survival
Repo	https://github.com/rohansangave/SurvivalAnalysis
Framework	pytorch
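
The abstract's remark that CPH-style methods "do not require estimation of the baseline hazard rate" can be seen from the Cox partial log-likelihood, in which the baseline hazard cancels out. The following is a minimal sketch under stated assumptions and is not taken from the linked repository: the function name `cox_partial_log_likelihood` and its arguments are hypothetical, ties between event times are not given special treatment, and `risk_scores` stands for whatever model (linear or otherwise) produces the log-proportional hazard.

```python
import math


def cox_partial_log_likelihood(times, events, risk_scores):
    """Cox partial log-likelihood (hypothetical helper, no tie correction).

    times[i]       -- observed time for subject i
    events[i]      -- 1 if the event was observed, 0 if censored
    risk_scores[i] -- model output for subject i (log-proportional hazard)
    """
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored subjects only appear inside risk sets
        # Risk set: every subject still under observation at time t_i.
        log_denominator = math.log(
            sum(math.exp(risk_scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        )
        total += risk_scores[i] - log_denominator
    return total


# Toy data (made up for illustration): three subjects, the last one censored.
toy_times = [1.0, 2.0, 3.0]
toy_events = [1, 1, 0]
print(cox_partial_log_likelihood(toy_times, toy_events, [0.2, -0.1, 0.4]))
```

Adding the same constant to every risk score leaves the value unchanged, which is one way to see why no baseline hazard needs to be estimated.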

Brain Tissue Segmentation Using NeuroNet With Different Pre-processing Techniques


Title	Brain Tissue Segmentation Using NeuroNet With Different Pre-processing Techniques
Authors	Fakrul Islam Tushar, Basel Alyafi, Md. Kamrul Hasan, Lavsen Dahal
Abstract	Automatic segmentation of brain Magnetic Resonance Imaging (MRI) images is one of the vital steps for quantitative analysis of the brain for further inspection. In this paper, NeuroNet has been adopted to segment the brain tissues (white matter (WM), grey matter (GM) and cerebrospinal fluid (CSF)); it uses a Residual Network (ResNet) in the encoder and a Fully Convolutional Network (FCN) in the decoder. To achieve the best performance, various hyper-parameters have been tuned, while network parameters (kernel and bias) were initialized using the NeuroNet pre-trained model. Different pre-processing pipelines have also been introduced to get a robust trained model. The model has been trained and tested on the IBSR18 data-set. To validate the research outcome, performance was measured quantitatively using the Dice Similarity Coefficient (DSC) and is reported on average as 0.84 for CSF, 0.94 for GM, and 0.94 for WM. The outcome of the research indicates that, for the IBSR18 data-set, pre-processing and proper tuning of hyper-parameters for the NeuroNet model improve the DSC for brain tissue segmentation.
Tasks
Published	2019-03-29
URL	http://arxiv.org/abs/1904.00068v1
PDF	http://arxiv.org/pdf/1904.00068v1.pdf
PWC	https://paperswithcode.com/paper/brain-tissue-segmentation-using-neuronet-with
Repo	https://github.com/fitushar/Brain-Tissue-Segmentation-Using-Deep-Learning-Pipeline-NeuroNet
Framework	tf
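
The Dice Similarity Coefficient used to report the results above is defined per tissue class as twice the overlap between the predicted and reference masks divided by the sum of their sizes. The sketch below is illustrative only and is not the evaluation code from the linked repository: the function name `dice_coefficient`, the flat label lists, and the label numbering (1 = CSF, 2 = GM, 3 = WM) are assumptions made for the example.

```python
def dice_coefficient(predicted, reference, label):
    """Dice Similarity Coefficient for one class over flattened label maps."""
    pred_mask = [p == label for p in predicted]
    ref_mask = [r == label for r in reference]
    intersection = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    size_sum = sum(pred_mask) + sum(ref_mask)
    if size_sum == 0:
        # Assumed convention: a class absent from both maps counts as a perfect match.
        return 1.0
    return 2.0 * intersection / size_sum


# Toy voxel labels (made up); the label numbering is hypothetical.
predicted = [1, 1, 2, 3, 3, 0]
reference = [1, 2, 2, 3, 0, 0]
for name, label in [("CSF", 1), ("GM", 2), ("WM", 3)]:
    print(name, dice_coefficient(predicted, reference, label))
```

A value of 1.0 means the two masks coincide and 0.0 means they do not overlap at all.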

Estimating Density Models with Complex Truncation Boundaries


Title	Estimating Density Models with Complex Truncation Boundaries
Authors	Song Liu, Takafumi Kanamori
Abstract	Truncated densities are probability density functions defined on truncated input domains. These densities share the same parametric form with their non-truncated counterparts up to a normalization term. However, normalization terms usually cannot be obtained in closed form for these distributions, due to complicated truncation domains. Score Matching is a powerful tool for fitting parameters in unnormalized models. However, it cannot be straightforwardly applied here as boundary conditions used to derive a tractable objective are usually not satisfied by truncated distributions. In this paper, we propose a maximally weighted Score Matching objective function which takes the geometry of the truncation boundary into account when fitting unnormalized density models. We show the weighting function that maximizes the objective function can be constructed easily and the boundary conditions for deriving a tractable objective are satisfied. Experiments on toy datasets and the Chicago crime dataset show promising results.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.03834v1
PDF	https://arxiv.org/pdf/1910.03834v1.pdf
PWC	https://paperswithcode.com/paper/estimating-density-models-with-complex
Repo	https://github.com/anewgithubname/Truncated-Score-Matching
Framework	none

Multilingual is not enough: BERT for Finnish


Title	Multilingual is not enough: BERT for Finnish
Authors	Antti Virtanen, Jenna Kanerva, Rami Ilo, Jouni Luoma, Juhani Luotolahti, Tapio Salakoski, Filip Ginter, Sampo Pyysalo
Abstract	Deep learning-based language models pretrained on large unannotated text corpora have been demonstrated to allow efficient transfer learning for natural language processing, with recent approaches such as the transformer-based BERT model advancing the state of the art across a variety of tasks. While most work on these models has focused on high-resource languages, in particular English, a number of recent efforts have introduced multilingual models that can be fine-tuned to address tasks in a large number of different languages. However, we still lack a thorough understanding of the capabilities of these models, in particular for lower-resourced languages. In this paper, we focus on Finnish and thoroughly evaluate the multilingual BERT model on a range of tasks, comparing it with a new Finnish BERT model trained from scratch. The new language-specific model is shown to systematically and clearly outperform the multilingual model. While the multilingual model largely fails to reach the performance of previously proposed methods, the custom Finnish BERT model establishes new state-of-the-art results on all corpora for all reference tasks: part-of-speech tagging, named entity recognition, and dependency parsing. We release the model and all related resources created for this study with open licenses at https://turkunlp.org/finbert .
Tasks	Dependency Parsing, Named Entity Recognition, Part-Of-Speech Tagging, Transfer Learning
Published	2019-12-15
URL	https://arxiv.org/abs/1912.07076v1
PDF	https://arxiv.org/pdf/1912.07076v1.pdf
PWC	https://paperswithcode.com/paper/multilingual-is-not-enough-bert-for-finnish
Repo	https://github.com/TurkuNLP/FinBERT
Framework	pytorch

Deep Hyperspectral Prior: Denoising, Inpainting, Super-Resolution


Title	Deep Hyperspectral Prior: Denoising, Inpainting, Super-Resolution
Authors	Oleksii Sidorov, Jon Yngve Hardeberg
Abstract	Deep learning algorithms have demonstrated state-of-the-art performance in various tasks of image restoration. This was made possible through the ability of CNNs to learn from large exemplar sets. However, the latter becomes an issue for hyperspectral image processing where datasets commonly consist of just a few images. In this work, we propose a new approach to denoising, inpainting, and super-resolution of hyperspectral image data using intrinsic properties of a CNN without any training. The performance of the given algorithm is shown to be comparable to the performance of trained networks, while its application is not restricted by the availability of training data. This work is an extension of the original “deep prior” algorithm to the HSI domain and 3D-convolutional networks.
Tasks	Denoising, Image Restoration, Super-Resolution
Published	2019-02-01
URL	https://arxiv.org/abs/1902.00301v2
PDF	https://arxiv.org/pdf/1902.00301v2.pdf
PWC	https://paperswithcode.com/paper/deep-hyperspectral-prior-denoising-inpainting
Repo	https://github.com/acecreamu/deep-hs-prior
Framework	pytorch

Dirichlet Simplex Nest and Geometric Inference


Title	Dirichlet Simplex Nest and Geometric Inference
Authors	Mikhail Yurochkin, Aritra Guha, Yuekai Sun, XuanLong Nguyen
Abstract	We propose Dirichlet Simplex Nest, a class of probabilistic models suitable for a variety of data types, and develop fast and provably accurate inference algorithms by accounting for the model’s convex geometry and low dimensional simplicial structure. By exploiting the connection to Voronoi tessellation and properties of Dirichlet distribution, the proposed inference algorithm is shown to achieve consistency and strong error bound guarantees on a range of model settings and data distributions. The effectiveness of our model and the learning algorithm is demonstrated by simulations and by analyses of text and financial data.
Tasks
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11009v1
PDF	https://arxiv.org/pdf/1905.11009v1.pdf
PWC	https://paperswithcode.com/paper/dirichlet-simplex-nest-and-geometric
Repo	https://github.com/moonfolk/VLAD
Framework	none

Bayesian Tensor Network with Polynomial Complexity for Probabilistic Machine Learning


Title	Bayesian Tensor Network with Polynomial Complexity for Probabilistic Machine Learning
Authors	Shi-Ju Ran
Abstract	It is known that describing or calculating the conditional probabilities of multiple events is exponentially expensive. In this work, the Bayesian tensor network (BTN) is proposed to efficiently capture the conditional probabilities of multiple sets of events with polynomial complexity. BTN is a directed acyclic graphical model that forms a subset of TN. To test its validity for exponentially many events, BTN is applied to image recognition, where the classification is mapped to capturing the conditional probabilities in an exponentially large sample space. Competitive performance is achieved by the BTN with simple tree network structures. Analogous to the tensor network simulations of quantum systems, the validity of the simple-tree BTN implies an “area law” of fluctuations in image recognition problems.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12923v2
PDF	https://arxiv.org/pdf/1912.12923v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-tensor-network-and-optimization
Repo	https://github.com/ranshiju/BayesianTN
Framework	pytorch

Nonconvex Regularized Robust Regression with Oracle Properties in Polynomial Time


Title	Nonconvex Regularized Robust Regression with Oracle Properties in Polynomial Time
Authors	Xiaoou Pan, Qiang Sun, Wen-Xin Zhou
Abstract	This paper investigates tradeoffs among optimization errors, statistical rates of convergence and the effect of heavy-tailed errors for high-dimensional robust regression with nonconvex regularization. When the additive errors in linear models have only bounded second moment, our results suggest that adaptive Huber regression with nonconvex regularization yields statistically optimal estimators that satisfy oracle properties as if the true underlying support set were known beforehand. Computationally, we need as many as O(log s + log log d) convex relaxations to reach such oracle estimators, where s and d denote the sparsity and ambient dimension, respectively. Extension to a general class of robust loss functions is also considered. Numerical studies lend strong support to our methodology and theory.
Tasks
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04027v2
PDF	https://arxiv.org/pdf/1907.04027v2.pdf
PWC	https://paperswithcode.com/paper/nonconvex-regularized-robust-regression-with
Repo	https://github.com/XiaoouPan/ILAMM
Framework	tf

Scaling structural learning with NO-BEARS to infer causal transcriptome networks


Title	Scaling structural learning with NO-BEARS to infer causal transcriptome networks
Authors	Hao-Chih Lee, Matteo Danieletto, Riccardo Miotto, Sarah T. Cherng, Joel T. Dudley
Abstract	Constructing gene regulatory networks is a critical step in revealing disease mechanisms from transcriptomic data. In this work, we present NO-BEARS, a novel algorithm for estimating gene regulatory networks. The NO-BEARS algorithm is built on the basis of the NOTEARS algorithm with two improvements. First, we propose a new constraint and its fast approximation to reduce the computational cost of the NOTEARS algorithm. Next, we introduce a polynomial regression loss to handle non-linearity in gene expressions. Our implementation utilizes modern GPU computation that can decrease the time of hours-long CPU computation to seconds. Using synthetic data, we demonstrate improved performance, both in processing time and accuracy, on inferring gene regulatory networks from gene expression data.
Tasks
Published	2019-10-31
URL	https://arxiv.org/abs/1911.00081v1
PDF	https://arxiv.org/pdf/1911.00081v1.pdf
PWC	https://paperswithcode.com/paper/scaling-structural-learning-with-no-bears-to
Repo	https://github.com/howchihlee/BNGPU
Framework	tf
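
For context on the constraint that NO-BEARS sets out to make cheaper, the baseline NOTEARS method characterises acyclicity of a weighted adjacency matrix W by h(W) = tr(exp(W∘W)) − d = 0, where ∘ is the element-wise product and d the number of nodes. The sketch below illustrates only that baseline constraint; the abstract does not spell out the replacement constraint, so none is shown. The function name `acyclicity_penalty` and the truncated power series used in place of a library matrix exponential are assumptions made to keep the example dependency-free.

```python
def acyclicity_penalty(weights, terms=20):
    """Approximate h(W) = tr(exp(W∘W)) - d with a truncated power series.

    weights -- square matrix as a list of lists; weights[i][j] is the edge i -> j
    terms   -- number of series terms kept (an assumption; adequate for small entries)
    """
    d = len(weights)
    squared = [[w * w for w in row] for row in weights]  # element-wise W∘W
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # (W∘W)^0
    total, factorial = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            # Advance to the next matrix power and the matching factorial.
            power = [
                [sum(power[i][m] * squared[m][j] for m in range(d)) for j in range(d)]
                for i in range(d)
            ]
            factorial *= k
        total += sum(power[i][i] for i in range(d)) / factorial
    return total - d


# Toy graphs (made up): a single edge 0 -> 1, and a two-node cycle.
dag = [[0.0, 1.5], [0.0, 0.0]]
cycle = [[0.0, 1.0], [1.0, 0.0]]
print(acyclicity_penalty(dag), acyclicity_penalty(cycle))
```

The penalty is zero for the acyclic graph and strictly positive for the cyclic one. Each evaluation involves repeated dense matrix products, which gives a sense of why a cheaper constraint is attractive for large transcriptome networks.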

Silhouette Guided Point Cloud Reconstruction beyond Occlusion


Title	Silhouette Guided Point Cloud Reconstruction beyond Occlusion
Authors	Chuhang Zou, Derek Hoiem
Abstract	One major challenge in 3D reconstruction is to infer the complete shape geometry from partial foreground occlusions. In this paper, we propose a method to reconstruct the complete 3D shape of an object from a single RGB image, with robustness to occlusion. Given the image and a silhouette of the visible region, our approach completes the silhouette of the occluded region and then generates a point cloud. We show improvements for reconstruction of non-occluded and partially occluded objects by providing the predicted complete silhouette as guidance. We also improve state-of-the-art for 3D shape prediction with a 2D reprojection loss from multiple synthetic views and a surface-based smoothing and refinement step. Experiments demonstrate the efficacy of our approach both quantitatively and qualitatively on synthetic and real scene datasets.
Tasks	3D Reconstruction
Published	2019-07-29
URL	https://arxiv.org/abs/1907.12253v1
PDF	https://arxiv.org/pdf/1907.12253v1.pdf
PWC	https://paperswithcode.com/paper/silhouette-guided-point-cloud-reconstruction
Repo	https://github.com/zouchuhang/Silhouette-Guided-3D
Framework	pytorch

Error Feedback Fixes SignSGD and other Gradient Compression Schemes


Title	Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Authors	Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian U. Stich, Martin Jaggi
Abstract	Sign-based algorithms (e.g. signSGD) have been proposed as a biased gradient compression technique to alleviate the communication bottleneck in training large neural networks across multiple workers. We show simple convex counter-examples where signSGD does not converge to the optimum. Further, even when it does converge, signSGD may generalize poorly when compared with SGD. These issues arise because of the biased nature of the sign compression operator. We then show that using error-feedback, i.e. incorporating the error made by the compression operator into the next step, overcomes these issues. We prove that our algorithm EF-SGD with arbitrary compression operator achieves the same rate of convergence as SGD without any additional assumptions. Thus EF-SGD achieves gradient compression for free. Our experiments thoroughly substantiate the theory and show that error-feedback improves both convergence and generalization. Code can be found at https://github.com/epfml/error-feedback-SGD .
Tasks
Published	2019-01-28
URL	https://arxiv.org/abs/1901.09847v2
PDF	https://arxiv.org/pdf/1901.09847v2.pdf
PWC	https://paperswithcode.com/paper/error-feedback-fixes-signsgd-and-other
Repo	https://github.com/epfml/error-feedback-SGD
Framework	pytorch
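
The error-feedback idea described in the abstract, "incorporating the error made by the compression operator into the next step", can be sketched in a few lines. This is a single-worker illustration under stated assumptions, not the code from the linked repository: the names `scaled_sign` and `ef_sgd` are hypothetical, the compressor is a sign operator rescaled by the mean absolute value, and the gradient is supplied as a plain Python callable.

```python
def scaled_sign(vector):
    """Sign compression rescaled by the mean absolute value of the input."""
    scale = sum(abs(v) for v in vector) / len(vector)
    return [scale * ((v > 0) - (v < 0)) for v in vector]


def ef_sgd(gradient, x0, learning_rate, steps, compress=scaled_sign):
    """SGD with a compressed update and an error-feedback memory (sketch)."""
    x = list(x0)
    error = [0.0] * len(x)  # what compression has dropped so far
    for _ in range(steps):
        g = gradient(x)
        # Add the remembered error back before compressing.
        corrected = [learning_rate * gi + ei for gi, ei in zip(g, error)]
        update = compress(corrected)
        x = [xi - ui for xi, ui in zip(x, update)]
        # Store the part of the step that compression discarded.
        error = [ci - ui for ci, ui in zip(corrected, update)]
    return x, error


# Toy quadratic f(x) = 0.5 * (x0^2 + 10 * x1^2), made up for illustration.
x_final, memory = ef_sgd(lambda x: [x[0], 10.0 * x[1]], [1.0, 1.0], 0.05, 200)
print(x_final, memory)
```

In this sketch the quantity `x - error` moves by exactly `learning_rate * g` at every step, so the compressed iterates shadow an uncompressed SGD trajectory up to the bounded memory term.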

Live Reconstruction of Large-Scale Dynamic Outdoor Worlds


Title	Live Reconstruction of Large-Scale Dynamic Outdoor Worlds
Authors	Ondrej Miksik, Vibhav Vineet
Abstract	Standard 3D reconstruction pipelines assume a stationary world and therefore suffer from ‘ghost artifacts’ whenever dynamic objects are present in the scene. Recent approaches have started tackling this issue; however, they typically either only discard dynamic information, represent it using bounding boxes or per-frame depth, or rely on approaches that are inherently slow and not suitable for online settings. We propose an end-to-end system for live reconstruction of large-scale outdoor dynamic environments. We leverage recent advances in computationally efficient data-driven approaches for 6-DoF object pose estimation to segment the scene into ‘objects’ and ‘stationary background’. This allows us to represent the scene using a time-dependent (dynamic) map, in which each object is explicitly represented as a separate instance and reconstructed in its own volume. For each time step, our dynamic map maintains a relative pose of each volume with respect to the stationary background. Our system operates in an incremental manner, which is essential for on-line reconstruction, handles large-scale environments with objects at large distances, and runs in (near) real-time. We demonstrate the efficacy of our approach on the KITTI dataset, and provide qualitative and quantitative results showing high-quality dense 3D reconstructions of a number of dynamic scenes.
Tasks	3D Reconstruction, Pose Estimation
Published	2019-03-15
URL	http://arxiv.org/abs/1903.06708v2
PDF	http://arxiv.org/pdf/1903.06708v2.pdf
PWC	https://paperswithcode.com/paper/live-reconstruction-of-large-scale-dynamic
Repo	https://github.com/omiksik/dfusion
Framework	none

Message Passing Attention Networks for Document Understanding


Title	Message Passing Attention Networks for Document Understanding
Authors	Giannis Nikolentzos, Antoine J. -P. Tixier, Michalis Vazirgiannis
Abstract	Graph neural networks have recently emerged as a very effective framework for processing graph-structured data. These models have achieved state-of-the-art performance in many tasks. Most graph neural networks can be described in terms of message passing, vertex update, and readout functions. In this paper, we represent documents as word co-occurrence networks and propose an application of the message passing framework to NLP, the Message Passing Attention network for Document understanding (MPAD). We also propose several hierarchical variants of MPAD. Experiments conducted on 10 standard text classification datasets show that our architectures are competitive with the state-of-the-art. Ablation studies reveal further insights about the impact of the different components on performance. Code is publicly available at: https://github.com/giannisnik/mpad .
Tasks	Text Classification
Published	2019-08-17
URL	https://arxiv.org/abs/1908.06267v2
PDF	https://arxiv.org/pdf/1908.06267v2.pdf
PWC	https://paperswithcode.com/paper/message-passing-attention-networks-for
Repo	https://github.com/giannisnik/mpad
Framework	pytorch

Entropy from Machine Learning


Title	Entropy from Machine Learning
Authors	Romuald A. Janik
Abstract	We translate the problem of calculating the entropy of a set of binary configurations/signals into a sequence of supervised classification tasks. Subsequently, one can use virtually any machine learning classification algorithm for computing entropy. This procedure can be used to compute entropy, and consequently the free energy directly from a set of Monte Carlo configurations at a given temperature. As a test of the proposed method, using an off-the-shelf machine learning classifier we reproduce the entropy and free energy of the 2D Ising model from Monte Carlo configurations at various temperatures throughout its phase diagram. Other potential applications include computing the entropy of spiking neurons or any other multidimensional binary signals.
Tasks
Published	2019-09-24
URL	https://arxiv.org/abs/1909.10831v3
PDF	https://arxiv.org/pdf/1909.10831v3.pdf
PWC	https://paperswithcode.com/paper/entropy-from-machine-learning
Repo	https://github.com/rmldj/ml-entropy
Framework	none

Stripe: Tensor Compilation via the Nested Polyhedral Model


Title	Stripe: Tensor Compilation via the Nested Polyhedral Model
Authors	Tim Zerrell, Jeremy Bruestle
Abstract	Hardware architectures and machine learning (ML) libraries evolve rapidly. Traditional compilers often fail to generate high-performance code across the spectrum of new hardware offerings. To mitigate, engineers develop hand-tuned kernels for each ML library update and hardware upgrade. Unfortunately, this approach requires excessive engineering effort to scale or maintain with any degree of state-of-the-art performance. Here we present a Nested Polyhedral Model for representing highly parallelizable computations with limited dependencies between iterations. This model provides an underlying framework for an intermediate representation (IR) called Stripe, amenable to standard compiler techniques while naturally modeling key aspects of modern ML computing. Stripe represents parallelism, efficient memory layout, and multiple compute units at a level of abstraction amenable to automatic optimization. We describe how Stripe enables a compiler for ML in the style of LLVM that allows independent development of algorithms, optimizations, and hardware accelerators. We also discuss the design exploration advantages of Stripe over kernel libraries and schedule-based or schedule-space-based code generation.
Tasks	Code Generation
Published	2019-03-14
URL	http://arxiv.org/abs/1903.06498v1
PDF	http://arxiv.org/pdf/1903.06498v1.pdf
PWC	https://paperswithcode.com/paper/stripe-tensor-compilation-via-the-nested
Repo	https://github.com/plaidml/plaidml
Framework	none