January 28, 2020

2986 words 15 mins read

Paper Group ANR 820

Learning undirected models via query training. Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions. A Holistic Natural Language Generation Framework for the Semantic Web. Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification. Novel diffusion-derived distance measure …

Learning undirected models via query training

Title Learning undirected models via query training
Authors Miguel Lázaro-Gredilla, Wolfgang Lehrach, Dileep George
Abstract Typical amortized inference in variational autoencoders is specialized for a single probabilistic query. Here we propose an inference network architecture that generalizes to unseen probabilistic queries. Instead of an encoder-decoder pair, we can train a single inference network directly from data, using a cost function that is stochastic not only over samples, but also over queries. We can use this network to perform the same inference tasks as we would in an undirected graphical model with hidden variables, without having to deal with the intractable partition function. The results can be mapped to the learning of an actual undirected model, which is a notoriously hard problem. Our network also marginalizes nuisance variables as required. We show that our approach generalizes to unseen probabilistic queries even on unseen test data, providing fast and flexible inference. Experiments show that this approach outperforms or matches PCD and AdVIL on 9 benchmark datasets.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02893v1
PDF https://arxiv.org/pdf/1912.02893v1.pdf
PWC https://paperswithcode.com/paper/learning-undirected-models-via-query-training
Repo
Framework
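
To make the idea concrete, here is a minimal sketch of query training, assuming binary variables and a plain MLP as the inference network (the paper's architecture may differ): variables are masked at random to form a query, and a single network is trained to recover the hidden values from the observed ones.

```python
# Minimal query-training sketch (assumptions: binary variables, MLP network).
import torch
import torch.nn as nn

D = 20                                      # number of binary variables
net = nn.Sequential(                        # inference net: (values, mask) -> marginals
    nn.Linear(2 * D, 256), nn.ReLU(),
    nn.Linear(256, D), nn.Sigmoid(),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(x):
    """x: (batch, D) fully observed binary samples."""
    mask = (torch.rand_like(x) < 0.5).float()    # random query: 1 = observed
    inp = torch.cat([x * mask, mask], dim=-1)    # hide the unobserved values
    probs = net(inp)
    # Score only the variables the query asks us to infer.
    loss = nn.functional.binary_cross_entropy(probs, x, reduction="none")
    loss = (loss * (1 - mask)).sum() / (1 - mask).sum().clamp(min=1)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

data = (torch.rand(64, D) < 0.5).float()    # toy stand-in for a real dataset
for _ in range(10):
    train_step(data)
```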

Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions

Title Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions
Authors Zongchen Chen, Santosh S. Vempala
Abstract We study Hamiltonian Monte Carlo (HMC) for sampling from a strongly logconcave density proportional to $e^{-f}$ where $f:\mathbb{R}^d \to \mathbb{R}$ is $\mu$-strongly convex and $L$-smooth (the condition number is $\kappa = L/\mu$). We show that the relaxation time (inverse of the spectral gap) of ideal HMC is $O(\kappa)$, improving on the previous best bound of $O(\kappa^{1.5})$; we complement this with an example where the relaxation time is $\Omega(\kappa)$. When implemented using a nearly optimal ODE solver, HMC returns an $\varepsilon$-approximate point in $2$-Wasserstein distance using $\widetilde{O}((\kappa d)^{0.5} \varepsilon^{-1})$ gradient evaluations per step and $\widetilde{O}((\kappa d)^{1.5}\varepsilon^{-1})$ total time.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.02313v1
PDF https://arxiv.org/pdf/1905.02313v1.pdf
PWC https://paperswithcode.com/paper/optimal-convergence-rate-of-hamiltonian-monte
Repo
Framework
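
For readers who want to experiment, below is a minimal HMC sampler for a strongly logconcave Gaussian target $e^{-f}$ with $f(x) = \frac{1}{2} x^\top A x$. Leapfrog integration with a Metropolis correction is a standard stand-in for the nearly optimal ODE solver analyzed in the paper.

```python
# Minimal HMC sketch: f(x) = 0.5 * x^T A x, so grad f(x) = A x.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])            # mu = 1, L = 10, so kappa = 10
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x

def hmc_step(x, step=0.1, n_leapfrog=20):
    p = rng.standard_normal(x.shape)          # resample momentum
    x_new, p_new = x.copy(), p.copy()
    p_new -= 0.5 * step * grad_f(x_new)       # opening leapfrog half step
    for _ in range(n_leapfrog - 1):
        x_new += step * p_new
        p_new -= step * grad_f(x_new)
    x_new += step * p_new
    p_new -= 0.5 * step * grad_f(x_new)       # closing half step
    # Accept or reject based on the change in the Hamiltonian.
    dH = f(x_new) - f(x) + 0.5 * (p_new @ p_new - p @ p)
    return x_new if rng.random() < np.exp(-dH) else x

x, samples = np.zeros(2), []
for _ in range(2000):
    x = hmc_step(x)
    samples.append(x)
print(np.cov(np.array(samples).T))  # should approach inv(A) = diag(1, 0.1)
```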

A Holistic Natural Language Generation Framework for the Semantic Web

Title A Holistic Natural Language Generation Framework for the Semantic Web
Authors Axel-Cyrille Ngonga Ngomo, Diego Moussallem, Lorenz Bühmann
Abstract With the ever-growing generation of data for the Semantic Web comes an increasing demand for this data to be made available to non-Semantic-Web experts. One way of achieving this goal is to translate the languages of the Semantic Web into natural language. We present LD2NL, a framework for verbalizing the three key languages of the Semantic Web, i.e., RDF, OWL, and SPARQL. Our framework is based on a bottom-up approach to verbalization. We evaluated LD2NL in an open survey with 86 participants. Our results suggest that our framework can generate verbalizations that are close to natural language and can easily be understood by non-experts. It thereby enables non-domain experts to interpret Semantic Web data with more than 91% of the accuracy of domain experts.
Tasks Text Generation
Published 2019-11-04
URL https://arxiv.org/abs/1911.01248v1
PDF https://arxiv.org/pdf/1911.01248v1.pdf
PWC https://paperswithcode.com/paper/a-holistic-natural-language-generation
Repo
Framework
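
A toy illustration of bottom-up verbalization: derive readable labels from URI fragments and fill a sentence template per RDF triple. The label heuristics and the single template below are hypothetical simplifications; LD2NL composes many such rules across RDF, OWL, and SPARQL.

```python
# Toy bottom-up triple verbalizer (hypothetical template, not LD2NL's rules).
import re

def label(uri: str) -> str:
    """Derive a readable label from the final URI fragment."""
    frag = uri.rsplit("/", 1)[-1].rsplit("#", 1)[-1].replace("_", " ")
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", frag).lower()  # birthPlace -> birth place

def verbalize(subject: str, predicate: str, obj: str) -> str:
    # One generic template; a real system composes many such rules bottom-up.
    return f"{label(subject).title()}'s {label(predicate)} is {label(obj).title()}."

triple = ("http://dbpedia.org/resource/Albert_Einstein",
          "http://dbpedia.org/ontology/birthPlace",
          "http://dbpedia.org/resource/Ulm")
print(verbalize(*triple))  # Albert Einstein's birth place is Ulm.
```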

Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification

Title Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification
Authors Youngmoon Jung, Younggwan Kim, Hyungjun Lim, Yeunju Choi, Hoirin Kim
Abstract In this paper, we propose a new pooling method called spatial pyramid encoding (SPE) to generate speaker embeddings for text-independent speaker verification. We first partition the output feature maps from a deep residual network (ResNet) into increasingly fine sub-regions and extract speaker embeddings from each sub-region through a learnable dictionary encoding layer. These embeddings are concatenated to obtain the final speaker representation. The SPE layer not only generates a fixed-dimensional speaker embedding for a variable-length speech segment, but also aggregates the information of feature distribution from multi-level temporal bins. Furthermore, we apply deep length normalization by augmenting the loss function with ring loss. By applying ring loss, the network gradually learns to normalize the speaker embeddings using model weights themselves while preserving convexity, leading to more robust speaker embeddings. Experiments on the VoxCeleb1 dataset show that the proposed system using the SPE layer and ring loss-based deep length normalization outperforms both i-vector and d-vector baselines.
Tasks Speaker Verification, Text-Independent Speaker Verification
Published 2019-06-19
URL https://arxiv.org/abs/1906.08333v1
PDF https://arxiv.org/pdf/1906.08333v1.pdf
PWC https://paperswithcode.com/paper/spatial-pyramid-encoding-with-convex-length
Repo
Framework
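
A simplified sketch of the two ingredients: pyramid-style pooling over increasingly fine temporal sub-regions, and a ring loss that pulls embedding norms toward a learned radius. Mean pooling per bin stands in for the paper's learnable dictionary encoding layer, so this is an illustration of the structure, not the exact method.

```python
# Pyramid pooling over time + ring loss (mean pooling replaces the paper's
# learnable dictionary encoding layer; dimensions are illustrative).
import torch
import torch.nn as nn

def pyramid_pool(feats, levels=(1, 2, 4)):
    """feats: (batch, channels, time) -> (batch, channels * sum(levels))."""
    pooled = []
    for n_bins in levels:
        for chunk in torch.chunk(feats, n_bins, dim=2):
            pooled.append(chunk.mean(dim=2))      # one vector per sub-region
    return torch.cat(pooled, dim=1)

class RingLoss(nn.Module):
    """Pulls embedding norms toward a learned radius R, keeping the
    normalization term convex in the norm."""
    def __init__(self, init_radius=1.0, weight=0.01):
        super().__init__()
        self.radius = nn.Parameter(torch.tensor(init_radius))
        self.weight = weight
    def forward(self, emb):
        return self.weight * (emb.norm(dim=1) - self.radius).pow(2).mean()

feats = torch.randn(8, 256, 300)   # ResNet output maps over 300 frames
emb = pyramid_pool(feats)          # fixed-dim embedding for any utterance length
print(emb.shape)                   # torch.Size([8, 1792])
loss_reg = RingLoss()(emb)         # added to the classification loss
```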

Novel diffusion-derived distance measures for graphs

Title Novel diffusion-derived distance measures for graphs
Authors C. B. Scott, Eric Mjolsness
Abstract We define a new family of similarity and distance measures on graphs, and explore their theoretical properties in comparison to conventional distance metrics. These measures are defined by the solution(s) to an optimization problem which attempts to find a map minimizing the discrepancy between two graph Laplacian exponential matrices, under norm-preserving and sparsity constraints. Variants of the distance metric are introduced to consider such optimized maps under sparsity constraints as well as fixed time-scaling between the two Laplacians. The objective function of this optimization is multimodal and has discontinuous slope, and is hence difficult for univariate optimizers to solve. We demonstrate a novel procedure for efficiently calculating these optima for two of our distance measure variants. We present numerical experiments demonstrating that (a) upper bounds of our distance metrics can be used to distinguish between lineages of related graphs; (b) our procedure is faster at finding the required optima, by as much as a factor of 10^3; and (c) the upper bounds satisfy the triangle inequality exactly under some assumptions and approximately under others. We also derive an upper bound for the distance between two graph products, in terms of the distance between the two pairs of factors. Additionally, we present several possible applications, including the construction of infinite “graph limits” by means of Cauchy sequences of graphs related to one another by our distance measure.
Tasks
Published 2019-09-10
URL https://arxiv.org/abs/1909.04203v1
PDF https://arxiv.org/pdf/1909.04203v1.pdf
PWC https://paperswithcode.com/paper/novel-diffusion-derived-distance-measures-for
Repo
Framework
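
In the special case of equal-size graphs and an identity map, a closely related quantity compares the heat kernels $e^{-tL_1}$ and $e^{-tL_2}$ directly, optimizing only the diffusion time $t$. The sketch below computes that simplified variant; the paper's full measures additionally optimize over maps between the graphs under norm-preserving and sparsity constraints, which is omitted here.

```python
# Simplified diffusion distance: largest Frobenius gap between heat kernels
# over diffusion time t (the paper's map optimization is omitted).
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize_scalar

def laplacian(adj):
    return np.diag(adj.sum(axis=1)) - adj

def diffusion_distance(adj1, adj2):
    L1, L2 = laplacian(adj1), laplacian(adj2)
    def neg_gap(log_t):
        t = np.exp(log_t)   # bounded scalar search in log-time; the paper
        return -np.linalg.norm(expm(-t * L1) - expm(-t * L2), "fro")
    res = minimize_scalar(neg_gap, bounds=(-5.0, 5.0), method="bounded")
    return -res.fun         # note: such objectives can be multimodal

path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # path graph P3
tri  = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)  # triangle K3
print(diffusion_distance(path, tri))
```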

Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification

Title Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification
Authors Lanhua You, Wu Guo, Lirong Dai, Jun Du
Abstract In this paper, gating mechanisms are applied in deep neural network (DNN) training for x-vector-based text-independent speaker verification. First, a gated convolution neural network (GCNN) is employed for modeling the frame-level embedding layers. Compared with the time-delay DNN (TDNN), the GCNN can obtain more expressive frame-level representations through carefully designed memory cells and gating mechanisms. Moreover, we propose a novel gated-attention statistics pooling strategy in which the attention scores are shared with the output gate. The gated-attention statistics pooling combines both gating and attention mechanisms into one framework; therefore, we can capture more useful information in the temporal pooling layer. Experiments are carried out using the NIST SRE16 and SRE18 evaluation datasets. The results demonstrate the effectiveness of the GCNN and show that the proposed gated-attention statistics pooling can further improve the performance.
Tasks Speaker Verification, Text-Independent Speaker Verification
Published 2019-03-28
URL http://arxiv.org/abs/1903.12092v2
PDF http://arxiv.org/pdf/1903.12092v2.pdf
PWC https://paperswithcode.com/paper/deep-neural-network-embeddings-with-gating
Repo
Framework
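
One plausible reading of gated-attention statistics pooling is sketched below: a single per-frame score is reused both as a sigmoid output gate and, after a softmax over time, as the attention weights for weighted mean and standard deviation statistics. Layer sizes are illustrative, and the exact score sharing in the paper may differ.

```python
# Sketch of gated-attention statistics pooling (an interpretation, not the
# paper's verified implementation).
import torch
import torch.nn as nn

class GatedAttentionStatsPooling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one score per frame, used twice

    def forward(self, h):
        """h: (batch, time, dim) frame-level features."""
        s = self.score(h)
        gate = torch.sigmoid(s)                  # output gate in [0, 1]
        attn = torch.softmax(s, dim=1)           # attention over time
        gated = h * gate                         # gate the frames
        mean = (attn * gated).sum(dim=1)         # attention-weighted stats
        var = (attn * gated.pow(2)).sum(dim=1) - mean.pow(2)
        return torch.cat([mean, var.clamp(min=1e-6).sqrt()], dim=1)

pool = GatedAttentionStatsPooling(512)
print(pool(torch.randn(4, 200, 512)).shape)  # torch.Size([4, 1024])
```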

Automated brain extraction of multi-sequence MRI using artificial neural networks

Title Automated brain extraction of multi-sequence MRI using artificial neural networks
Authors Fabian Isensee, Marianne Schell, Irada Tursunova, Gianluca Brugnara, David Bonekamp, Ulf Neuberger, Antje Wick, Heinz-Peter Schlemmer, Sabine Heiland, Wolfgang Wick, Martin Bendszus, Klaus Hermann Maier-Hein, Philipp Kickingereder
Abstract Brain extraction is a critical preprocessing step in the analysis of MRI neuroimaging studies and influences the accuracy of downstream analyses. The majority of brain extraction algorithms are, however, optimized for processing healthy brains and thus frequently fail in the presence of pathologically altered brain or when applied to heterogeneous MRI datasets. Here we introduce a new, rigorously validated algorithm (termed HD-BET) relying on artificial neural networks that aims to overcome these limitations. We demonstrate that HD-BET outperforms six popular, publicly available brain extraction algorithms in several large-scale neuroimaging datasets, including one from a prospective multicentric trial in neuro-oncology, yielding state-of-the-art performance with median improvements of +1.16 to +2.11 points for the DICE coefficient and -0.66 to -2.51 mm for the Hausdorff distance. Importantly, the HD-BET algorithm shows robust performance in the presence of pathology or treatment-induced tissue alterations, is applicable to a broad range of MRI sequence types and is not influenced by variations in MRI hardware and acquisition parameters encountered in both research and clinical practice. For broader accessibility our HD-BET prediction algorithm is made freely available (http://www.neuroAI-HD.org) and may become an essential component for robust, automated, high-throughput processing of MRI neuroimaging data.
Tasks
Published 2019-01-31
URL https://arxiv.org/abs/1901.11341v2
PDF https://arxiv.org/pdf/1901.11341v2.pdf
PWC https://paperswithcode.com/paper/automated-brain-extraction-of-multi-sequence
Repo
Framework

Tag2Vec: Learning Tag Representations in Tag Networks

Title Tag2Vec: Learning Tag Representations in Tag Networks
Authors Junshan Wang, Zhicong Lu, Guojie Song, Yue Fan, Lun Du, Wei Lin
Abstract Network embedding is a method to learn low-dimensional representation vectors for nodes in complex networks. In real networks, nodes may have multiple tags, but existing methods ignore the abundant semantic and hierarchical information of tags. This information is useful to many network applications and is usually very stable. In this paper, we propose a tag representation learning model, Tag2Vec, which mixes nodes and tags into a hybrid network. Firstly, for tag networks, we define semantic distance as the proximity between tags and design a novel strategy, parameterized random walk, to generate context with semantic and hierarchical information of tags adaptively. Then, we propose a hyperbolic Skip-gram model to express the complex hierarchical structure better with lower output dimensions. We evaluate our model on the NBER U.S. patent dataset and WordNet dataset. The results show that our model can learn tag representations with rich semantic information and it outperforms other baselines.
Tasks Network Embedding, Representation Learning
Published 2019-04-19
URL http://arxiv.org/abs/1905.03041v1
PDF http://arxiv.org/pdf/1905.03041v1.pdf
PWC https://paperswithcode.com/paper/190503041
Repo
Framework
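
A toy version of the parameterized random walk on the hybrid node-tag network: a single parameter p decides whether the walk hops to a neighboring node or detours through one of the current node's tags, injecting tag semantics into the contexts fed to the Skip-gram model. The graph, tag assignments, and single-parameter form are hypothetical; the paper's walk and its hyperbolic Skip-gram training are richer.

```python
# Toy parameterized random walk over a hybrid node-tag network.
import random

node_neighbors = {"n1": ["n2", "n3"], "n2": ["n1"], "n3": ["n1"]}
node_tags = {"n1": ["t_animal"], "n2": ["t_animal", "t_pet"], "n3": ["t_pet"]}
tag_nodes = {"t_animal": ["n1", "n2"], "t_pet": ["n2", "n3"]}

def walk(start, length=10, p=0.3, rng=random.Random(0)):
    """With probability p, hop to a tag of the current node (injecting
    semantic context); otherwise hop to a neighboring node."""
    path, cur = [start], start
    for _ in range(length):
        if cur.startswith("t_"):                  # from a tag, return to a node
            cur = rng.choice(tag_nodes[cur])
        elif rng.random() < p and node_tags[cur]:
            cur = rng.choice(node_tags[cur])
        else:
            cur = rng.choice(node_neighbors[cur])
        path.append(cur)
    return path

print(walk("n1"))  # context sequence fed to a (hyperbolic) Skip-gram model
```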

Deep Speaker Embedding Learning with Multi-Level Pooling for Text-Independent Speaker Verification

Title Deep Speaker Embedding Learning with Multi-Level Pooling for Text-Independent Speaker Verification
Authors Yun Tang, Guohong Ding, Jing Huang, Xiaodong He, Bowen Zhou
Abstract This paper aims to improve the widely used deep speaker embedding x-vector model. We propose the following improvements: (1) a hybrid neural network structure using both time delay neural network (TDNN) and long short-term memory neural networks (LSTM) to generate complementary speaker information at different levels; (2) a multi-level pooling strategy to collect speaker information from both TDNN and LSTM layers; (3) a regularization scheme on the speaker embedding extraction layer to make the extracted embeddings suitable for the following fusion step. The synergy of these improvements is demonstrated on the NIST SRE 2016 eval test (with a 19% EER reduction) and the SRE 2018 dev test (with a 9% EER reduction), as well as by reductions of more than 10% in DCF scores on these two test sets over the x-vector baseline.
Tasks Speaker Verification, Text-Independent Speaker Verification
Published 2019-02-21
URL http://arxiv.org/abs/1902.07821v1
PDF http://arxiv.org/pdf/1902.07821v1.pdf
PWC https://paperswithcode.com/paper/deep-speaker-embedding-learning-with-multi
Repo
Framework
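
A compact sketch of improvements (1) and (2): a TDNN branch (a 1-D convolution over time) feeds an LSTM branch, statistics pooling is applied to both branches, and the pooled vectors are concatenated before the embedding layer. All layer sizes here are illustrative, not the paper's configuration.

```python
# Multi-level statistics pooling over TDNN and LSTM branches (illustrative).
import torch
import torch.nn as nn

def stats_pool(h):
    """h: (batch, time, dim) -> (batch, 2*dim) mean and std over time."""
    return torch.cat([h.mean(dim=1), h.std(dim=1)], dim=1)

class MultiLevelXVector(nn.Module):
    def __init__(self, feat_dim=40, hidden=256, emb_dim=128):
        super().__init__()
        # A TDNN layer is a 1-D convolution over time with temporal context.
        self.tdnn = nn.Conv1d(feat_dim, hidden, kernel_size=5, padding=2)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.embed = nn.Linear(4 * hidden, emb_dim)

    def forward(self, x):
        """x: (batch, time, feat_dim) acoustic features."""
        t = torch.relu(self.tdnn(x.transpose(1, 2))).transpose(1, 2)
        l, _ = self.lstm(t)
        pooled = torch.cat([stats_pool(t), stats_pool(l)], dim=1)
        return self.embed(pooled)   # speaker embedding for back-end scoring

model = MultiLevelXVector()
print(model(torch.randn(4, 300, 40)).shape)  # torch.Size([4, 128])
```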

Forecasting high-dimensional dynamics exploiting suboptimal embeddings

Title Forecasting high-dimensional dynamics exploiting suboptimal embeddings
Authors Shunya Okuno, Kazuyuki Aihara, Yoshito Hirata
Abstract Delay embedding—a method for reconstructing dynamical systems by delay coordinates—is widely used to forecast nonlinear time series as a model-free approach. When multivariate time series are observed, several existing frameworks can be applied to yield a single forecast combining multiple forecasts derived from various embeddings. However, the performance of these frameworks is not always satisfactory because they randomly select embeddings or use brute force and do not consider the diversity of the embeddings to combine. Herein, we develop a forecasting framework that overcomes these existing problems. The framework exploits various “suboptimal embeddings” obtained by minimizing the in-sample error via combinatorial optimization. The framework achieves the best results among existing frameworks for sample toy datasets and a real-world flood dataset. We show that the framework is applicable to a wide range of data lengths and dimensions. Therefore, the framework can be applied to various fields such as neuroscience, ecology, finance, fluid dynamics, weather, and disaster prevention.
Tasks Combinatorial Optimization, Time Series
Published 2019-07-02
URL https://arxiv.org/abs/1907.01552v1
PDF https://arxiv.org/pdf/1907.01552v1.pdf
PWC https://paperswithcode.com/paper/forecasting-high-dimensional-dynamics
Repo
Framework
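
To make the idea concrete, the sketch below builds several delay embeddings of a scalar series, forecasts with a simple nearest-neighbor predictor in each, and averages the results. The embeddings are hand-picked here, whereas the paper selects diverse suboptimal embeddings by minimizing in-sample error via combinatorial optimization.

```python
# Forecast by combining predictions from several delay embeddings
# (hand-picked lags stand in for the paper's optimized embeddings).
import numpy as np

def delay_matrix(x, lags):
    """Delay coordinates, e.g. lags=(0, 1, 3) -> [x_t, x_{t-1}, x_{t-3}]."""
    m = max(lags)
    return np.stack([x[m - l : len(x) - l] for l in lags], axis=1)

def knn_forecast(x, lags, k=3):
    E = delay_matrix(x, lags)
    query, library = E[-1], E[:-1]
    targets = x[max(lags) + 1 :]          # the value following each library row
    idx = np.argsort(np.linalg.norm(library - query, axis=1))[:k]
    return targets[idx].mean()            # average of the k analogues' futures

rng = np.random.default_rng(0)
x = np.sin(0.2 * np.arange(500)) + 0.05 * rng.standard_normal(500)
embeddings = [(0, 1), (0, 2), (0, 1, 3)]  # a (small) diverse embedding set
print(np.mean([knn_forecast(x, lags) for lags in embeddings]))
```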

ScriptNet: Neural Static Analysis for Malicious JavaScript Detection

Title ScriptNet: Neural Static Analysis for Malicious JavaScript Detection
Authors Jack W. Stokes, Rakshit Agrawal, Geoff McDonald, Matthew Hausknecht
Abstract Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes JavaScript files as byte sequences. Lower layers capture the sequential nature of these byte sequences while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our model variants are trained in an end-to-end fashion allowing discriminative training even for the sequential processing layers. Evaluating this model on a large corpus of 212,408 JavaScript files indicates that the best performing CPoLS model offers a 97.20% true positive rate (TPR) for the first 60K byte subsequence at a false positive rate (FPR) of 0.50%. The best performing CPoLS model significantly outperforms several baseline models.
Tasks
Published 2019-04-01
URL http://arxiv.org/abs/1904.01126v1
PDF http://arxiv.org/pdf/1904.01126v1.pdf
PWC https://paperswithcode.com/paper/scriptnet-neural-static-analysis-for
Repo
Framework
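
A minimal CPoLS-style sketch: embed raw bytes, let a recurrent layer capture sequential structure, max-pool the hidden states over time, and classify the pooled embedding. The dimensions and single-LSTM layout are illustrative; the paper evaluates several model variants.

```python
# Byte-sequence classifier sketch in the spirit of CPoLS (illustrative sizes).
import torch
import torch.nn as nn

class ByteSequenceClassifier(nn.Module):
    def __init__(self, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(256, emb)           # one entry per byte value
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, byte_ids):
        """byte_ids: (batch, seq_len) ints in [0, 255]."""
        h, _ = self.lstm(self.embed(byte_ids))
        pooled, _ = h.max(dim=1)                 # max-pool hidden states over time
        return torch.sigmoid(self.out(pooled))   # P(malicious)

model = ByteSequenceClassifier()
js_bytes = torch.randint(0, 256, (2, 1000))  # stand-in for real JavaScript files
print(model(js_bytes))
```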

A Self-Organizing Network with Varying Density Structure for Characterizing Sensorimotor Transformations in Robotic Systems

Title A Self-Organizing Network with Varying Density Structure for Characterizing Sensorimotor Transformations in Robotic Systems
Authors Omar Zahra, David Navarro-Alarcon
Abstract In this work, we present the development of a neuro-inspired approach for characterizing sensorimotor relations in robotic systems. The proposed method has self-organizing and associative properties that enable it to autonomously obtain these relations without any prior knowledge of either the motor (e.g. mechanical structure) or perceptual (e.g. sensor calibration) models. Self-organizing topographic properties are used to build both sensory and motor maps, then the associative properties rule the stability and accuracy of the emerging connections between these maps. Compared to previous works, our method introduces a new varying density self-organizing map (VDSOM) that controls the concentration of nodes in regions with large transformation errors without significantly affecting the computational time. A distortion metric is measured to achieve a self-tuning sensorimotor model that adapts to changes in either motor or sensory models. The obtained sensorimotor maps exhibit lower error than conventional self-organizing methods and show potential for further development.
Tasks Calibration
Published 2019-05-01
URL http://arxiv.org/abs/1905.00249v1
PDF http://arxiv.org/pdf/1905.00249v1.pdf
PWC https://paperswithcode.com/paper/a-self-organizing-network-with-varying
Repo
Framework
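
For context, a standard self-organizing map update is sketched below: the winning node and its grid neighbors move toward each input. The VDSOM extension, which concentrates nodes in regions with large transformation error, is not reproduced here.

```python
# Standard SOM update step (the baseline that VDSOM builds on).
import numpy as np

rng = np.random.default_rng(0)
nodes = rng.uniform(size=(10, 10, 2))       # 10x10 map over 2-D inputs
grid = np.stack(np.meshgrid(np.arange(10), np.arange(10), indexing="ij"), -1)

def som_step(x, lr=0.1, sigma=1.5):
    """Move the winning node and its grid neighbors toward input x."""
    dists = np.linalg.norm(nodes - x, axis=2)
    winner = np.unravel_index(dists.argmin(), dists.shape)
    grid_d2 = ((grid - np.array(winner)) ** 2).sum(axis=2)
    h = np.exp(-grid_d2 / (2 * sigma**2))   # Gaussian neighborhood function
    nodes += lr * h[..., None] * (x - nodes)

for _ in range(1000):
    som_step(rng.uniform(size=2))
```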

Accelerating Primal Solution Findings for Mixed Integer Programs Based on Solution Prediction

Title Accelerating Primal Solution Findings for Mixed Integer Programs Based on Solution Prediction
Authors Jian-Ya Ding, Chao Zhang, Lei Shen, Shengyin Li, Bing Wang, Yinghui Xu, Le Song
Abstract Mixed Integer Programming (MIP) is one of the most widely used modeling techniques for combinatorial optimization problems. In many applications, a similar MIP model is solved on a regular basis, maintaining remarkable similarities in model structures and solution appearances but differing in formulation coefficients. This offers the opportunity for machine learning methods to explore the correlations between model structures and the resulting solution values. To exploit this opportunity, we propose to represent an MIP instance using a tripartite graph, based on which a Graph Convolutional Network (GCN) is constructed to predict solution values for binary variables. The predicted solutions are used to generate a local branching type cut which can be either treated as a global (invalid) inequality in the formulation resulting in a heuristic approach to solve the MIP, or as a root branching rule resulting in an exact approach. Computational evaluations on 8 distinct types of MIP problems show that the proposed framework significantly improves primal solution finding performance on a state-of-the-art open-source MIP solver.
Tasks Combinatorial Optimization
Published 2019-06-23
URL https://arxiv.org/abs/1906.09575v2
PDF https://arxiv.org/pdf/1906.09575v2.pdf
PWC https://paperswithcode.com/paper/optimal-solution-predictions-for-mixed
Repo
Framework
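
The local branching cut itself is easy to state: given a predicted 0-1 point $\bar{x}$, heuristically restrict the search to solutions within Hamming distance $k$ of $\bar{x}$, which is a linear constraint over binary variables. Below is a sketch on a toy model; PuLP is chosen purely for illustration and is not the solver used in the paper.

```python
# Turning a predicted 0-1 point into a local branching cut (toy model, PuLP).
import pulp

prob = pulp.LpProblem("toy", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(5)]
prob += pulp.lpSum(x)                        # stand-in objective

predicted = [1, 0, 1, 1, 0]                  # GCN-predicted binary values
k = 1                                        # search radius around the prediction
# Hamming distance to the predicted point, expressed linearly in x:
dist = pulp.lpSum(1 - xi if p == 1 else xi for xi, p in zip(x, predicted))
prob += dist <= k                            # the (invalid) global inequality
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([pulp.value(xi) for xi in x])
```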

On the Current State of Research in Explaining Ensemble Performance Using Margins

Title On the Current State of Research in Explaining Ensemble Performance Using Margins
Authors Waldyn Martinez, J. Brian Gray
Abstract Empirical evidence shows that ensembles, such as bagging, boosting, random and rotation forests, generally perform better in terms of their generalization error than individual classifiers. To explain this performance, Schapire et al. (1998) developed an upper bound on the generalization error of an ensemble based on the margins of the training data, from which it was concluded that larger margins should lead to lower generalization error, everything else being equal. Many other researchers have backed this assumption and presented tighter bounds on the generalization error based on either the margins or functions of the margins. For instance, Shen and Li (2010) provide evidence suggesting that the generalization error of a voting classifier might be reduced by increasing the mean and decreasing the variance of the margins. In this article we propose several techniques and empirically test whether the current state of research in explaining ensemble performance holds. We evaluate the proposed methods through experiments with real and simulated data sets.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03123v1
PDF https://arxiv.org/pdf/1906.03123v1.pdf
PWC https://paperswithcode.com/paper/on-the-current-state-of-research-in
Repo
Framework
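
For reference, the margin of a training example under a voting classifier is the fraction of votes for the correct label minus the largest fraction for any incorrect label. The sketch below computes per-example margins and the mean/variance statistics of the kind discussed by Shen and Li (2010).

```python
# Computing voting-classifier margins on training data.
import numpy as np

votes = np.array([          # rows: examples; columns: ensemble members' labels
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 2],
    [2, 2, 2, 1, 2],
])
y = np.array([0, 1, 2])     # true labels
n_classes, n_members = 3, votes.shape[1]

def margins(votes, y):
    out = []
    for v, yi in zip(votes, y):
        frac = np.bincount(v, minlength=n_classes) / n_members
        wrong = np.delete(frac, yi)           # vote shares of incorrect labels
        out.append(frac[yi] - wrong.max())
    return np.array(out)

m = margins(votes, y)
print(m, m.mean(), m.var())  # e.g. [0.6 0.4 0.6], mean and variance of margins
```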

Impact of Data Pruning on Machine Learning Algorithm Performance

Title Impact of Data Pruning on Machine Learning Algorithm Performance
Authors Arun Thundyill Saseendran, Lovish Setia, Viren Chhabria, Debrup Chakraborty, Aneek Barman Roy
Abstract Dataset pruning is the process of removing sub-optimal tuples from a dataset to improve the learning of a machine learning model. In this paper, we compared the performance of different algorithms, first on an unpruned dataset and then on an iteratively pruned dataset. The goal was to understand whether, if an algorithm (say A) performs better than another algorithm (say B) on the unpruned dataset, algorithm A also performs better on the pruned dataset, or whether the ordering is reversed. The dataset chosen for our analysis is a subset of the largest movie ratings database publicly available on the internet, IMDb [1]. The learning objective of the model was to predict the categorical rating of a movie among 5 bins: poor, average, good, very good, excellent. The results indicated that an algorithm that performed better on the unpruned dataset also performed better on the pruned dataset.
Tasks
Published 2019-01-11
URL http://arxiv.org/abs/1901.10539v1
PDF http://arxiv.org/pdf/1901.10539v1.pdf
PWC https://paperswithcode.com/paper/impact-of-data-pruning-on-machine-learning
Repo
Framework
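
The abstract does not spell out the pruning rule, so the sketch below uses a hypothetical one (drop training tuples the current model misclassifies, then retrain) purely to make the iterative train-prune-compare loop concrete.

```python
# Iterative pruning loop sketch (hypothetical pruning rule, synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for round_ in range(3):
    model = LogisticRegression().fit(X_tr, y_tr)
    print(f"round {round_}: test acc = {model.score(X_te, y_te):.3f}")
    keep = model.predict(X_tr) == y_tr        # prune misclassified tuples
    X_tr, y_tr = X_tr[keep], y_tr[keep]
```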