Paper Group ANR 372
Big Data Approaches to Knot Theory: Understanding the Structure of the Jones Polynomial. Rumour Detection via News Propagation Dynamics and User Representation Learning. Causal Regularization. Jacobian Policy Optimizations. Reading Comprehension Ability Test-A Turing Test for Reading Comprehension. Identity Connections in Residual Nets Improve Noise Stability. Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization. Connections Between Adaptive Control and Optimization in Machine Learning. Representation Learning: A Statistical Perspective. Convergence of Gradient Methods on Bilinear Zero-Sum Games. Multi-objective Evolutionary Approach to Grey-Box Identification of Buck Converter. Adaptive Iterative Hessian Sketch via A-Optimal Subsampling. Closed-form Expressions for Maximum Mean Discrepancy with Applications to Wasserstein Auto-Encoders. Beta Survival Models. Understanding Social Networks using Transfer Learning.
Big Data Approaches to Knot Theory: Understanding the Structure of the Jones Polynomial
Title | Big Data Approaches to Knot Theory: Understanding the Structure of the Jones Polynomial |
Authors | Jesse S F Levitt, Mustafa Hajij, Radmila Sazdanovic |
Abstract | We examine the structure and dimensionality of the Jones polynomial using manifold learning techniques. Our data set consists of more than 10 million knots with up to 17 crossings and two other special families with up to 2001 crossings. We introduce and describe a method for using filtrations to analyze infinite data sets where representative sampling is impossible or impractical, an essential requirement for working with knots and the data from knot invariants. In particular, this method provides a new approach for analyzing knot invariants using Principal Component Analysis. Applying this approach to the Jones polynomial data, we find that it can be viewed as an approximately 3-dimensional manifold, that this description is surprisingly stable with respect to the filtration by crossing number, and that the results suggest further structures to be examined and understood. |
Tasks | |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10086v1 |
PDF | https://arxiv.org/pdf/1912.10086v1.pdf |
PWC | https://paperswithcode.com/paper/big-data-approaches-to-knot-theory |
Repo | |
Framework | |
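The filtration-and-PCA pipeline described above is straightforward to prototype. Below is a minimal sketch, under the assumption (ours, not the paper's stated encoding) that each Jones polynomial is flattened into a fixed-length vector of its coefficients; the random integer matrix is only a stand-in for real tabulated invariants.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in data: each row holds the coefficients of one knot's Jones
# polynomial, zero-padded to a common span of exponents. The paper's actual
# data set comes from millions of tabulated knots, filtered by crossing number.
coeffs = rng.integers(-5, 6, size=(1000, 40)).astype(float)

pca = PCA()
pca.fit(coeffs)

# If the data were approximately 3-dimensional, the first three principal
# components would capture most of the variance.
cumulative = np.cumsum(pca.explained_variance_ratio_)
print("variance explained by 3 components:", cumulative[2])
```

Repeating the fit on each stage of the filtration (e.g., knots with at most k crossings) is how one would probe the stability the paper reports.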
Rumour Detection via News Propagation Dynamics and User Representation Learning
Title | Rumour Detection via News Propagation Dynamics and User Representation Learning |
Authors | Tien Huu Do, Xiao Luo, Duc Minh Nguyen, Nikos Deligiannis |
Abstract | Rumours have existed for a long time and are known to have serious consequences. The rapid growth of social media platforms has multiplied their negative impact, making early detection important. Many methods have been introduced to detect rumours using the content or the social context of news. However, most existing methods ignore, or do not effectively exploit, the propagation pattern of news in social media, i.e., the sequence of interactions of social media users with news over time. In this work, we propose a novel method for rumour detection based on deep learning. Our method leverages the propagation process of the news by learning the users' representations and the temporal interrelation of users' responses. Experiments conducted on Twitter and Weibo datasets demonstrate the state-of-the-art performance of the proposed method. |
Tasks | Representation Learning, Rumour Detection |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1905.03042v1 |
PDF | http://arxiv.org/pdf/1905.03042v1.pdf |
PWC | https://paperswithcode.com/paper/190503042 |
Repo | |
Framework | |
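The core idea, learning over the temporal sequence of user interactions with a news item, can be illustrated with a generic recurrent classifier. This is a sketch of the general approach, not the authors' architecture; the embedding and hidden sizes are arbitrary.

```python
import torch
import torch.nn as nn

class PropagationClassifier(nn.Module):
    """Toy rumour classifier: embed the users who interacted with a news
    item, run a GRU over the interaction sequence, classify the final state."""
    def __init__(self, n_users, emb_dim=32, hidden=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # rumour vs. non-rumour logits

    def forward(self, user_seqs):          # (batch, seq_len) of user ids
        x = self.user_emb(user_seqs)       # (batch, seq_len, emb_dim)
        _, h = self.gru(x)                 # h: (1, batch, hidden)
        return self.head(h.squeeze(0))     # (batch, 2)

model = PropagationClassifier(n_users=10_000)
logits = model(torch.randint(0, 10_000, (8, 50)))  # 8 items, 50 interactions each
print(logits.shape)  # torch.Size([8, 2])
```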
Causal Regularization
Title | Causal Regularization |
Authors | Dominik Janzing |
Abstract | I argue that regularizing terms in standard regression methods not only help against overfitting finite data, but sometimes also yield better causal models in the infinite sample regime. I first consider a multi-dimensional variable linearly influencing a target variable, with some multi-dimensional unobserved common cause, where the confounding effect can be decreased by keeping the penalizing term in Ridge and Lasso regression even in the population limit. Choosing the size of the penalizing term is, however, challenging, because cross validation is pointless. Here it is done by first estimating the strength of confounding via a method proposed earlier, which yielded reasonable results for simulated and real data. Further, I prove a 'causal generalization bound' which states (subject to a particular model of confounding) that the error made by interpreting any non-linear regression as a causal model can be bounded from above whenever functions are taken from a not too rich class. In other words, the bound guarantees "generalization" from observational to interventional distributions, which is usually not the subject of statistical learning theory (and is only possible due to the underlying symmetries of the confounder model). |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12179v1 |
PDF | https://arxiv.org/pdf/1906.12179v1.pdf |
PWC | https://paperswithcode.com/paper/causal-regularization-1 |
Repo | |
Framework | |
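The claim that the penalty should be kept even in the population limit can be probed with a quick simulation. Here is a minimal one-dimensional sketch under assumed linear confounding (all coefficients are illustrative); note that the best penalty level depends on the confounding strength, which the paper estimates with a separately proposed method rather than cross validation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000                      # large n: effectively the population limit

Z = rng.normal(size=n)             # unobserved common cause
X = 2.0 * Z + rng.normal(size=n)   # confounder drives X strongly
a = 1.0                            # true causal effect of X on Y
Y = a * X + 3.0 * Z + rng.normal(size=n)

# OLS converges to a + Cov(X, Z-part) / Var(X) = 2.2, not to a = 1.
# Ridge shrinkage moves the estimate back toward the causal coefficient.
for lam_per_n in [0.0, 1.0, 3.0, 6.0]:
    b = (X @ Y) / (X @ X + lam_per_n * n)   # 1-D ridge estimate
    print(f"lambda/n = {lam_per_n:3.1f}  b = {b:5.2f}  |b - a| = {abs(b - a):4.2f}")
```

Cross validation would always pick lambda near 0 here, because the confounded regression predicts Y best; that is exactly why the paper calls it pointless for this purpose.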
Jacobian Policy Optimizations
Title | Jacobian Policy Optimizations |
Authors | Arip Asadulaev, Gideon Stein, Igor Kuznetsov, Andrey Filchenkov |
Abstract | Recently, natural policy gradient algorithms gained widespread recognition due to their strong performance in reinforcement learning tasks. However, their major drawback is the need to keep the policy within a 'trust region' while still allowing sufficient exploration. The main objective of this study was to present an approach that models the dynamical isometry of agents' policies by estimating the conditioning of their Jacobians at individual points in the environment space. We present the Jacobian Policy Optimization algorithm, which dynamically adapts the trust interval with respect to policy conditioning. The suggested approach was tested across a range of Atari environments. This paper offers some important insights into improving policy optimization in reinforcement learning tasks. |
Tasks | |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05437v1 |
PDF | https://arxiv.org/pdf/1906.05437v1.pdf |
PWC | https://paperswithcode.com/paper/jacobian-policy-optimizations |
Repo | |
Framework | |
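The central quantity, the conditioning of a policy's Jacobian at a single point of the environment space, is easy to compute for a small network. A hedged sketch with a hypothetical toy policy follows; the paper's actual trust-interval adaptation rule is not reproduced here.

```python
import torch
import torch.nn as nn

# Hypothetical toy policy: 8-D observation -> logits over 4 discrete actions.
policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 4))

def policy_conditioning(obs):
    """Condition number (sigma_max / sigma_min) of the policy Jacobian at one
    observation; values near 1 indicate approximate dynamical isometry there."""
    J = torch.autograd.functional.jacobian(policy, obs)  # shape (4, 8)
    s = torch.linalg.svdvals(J)
    return (s.max() / s.min()).item()

print(policy_conditioning(torch.randn(8)))
```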
Reading Comprehension Ability Test-A Turing Test for Reading Comprehension
Title | Reading Comprehension Ability Test-A Turing Test for Reading Comprehension |
Authors | Yuan Miao, Gongqi Lin, Yidan Hu, Chunyan Miao |
Abstract | Reading comprehension is an important ability of human intelligence. Literacy and numeracy are the two most essential foundations for people to succeed at study, at work and in life, and reading comprehension ability is a core component of literacy. In most education systems, developing reading comprehension ability is compulsory in the curriculum from year one to year 12; it is an indispensable ability in the dissemination of knowledge. With emerging artificial intelligence, computers are starting to be able to read and understand like people in some contexts. They can even read better than human beings for some tasks, but have little clue in others. It would be very beneficial if we could identify the levels of machine comprehension ability, which would direct further improvement. The Turing test is a well-known test of the difference between computer intelligence and human intelligence. In order to compare the difference between people reading and machines reading, we propose a test called the (reading) Comprehension Ability Test (CAT). CAT is similar to the Turing test: passing it means we cannot differentiate people from algorithms in terms of their comprehension ability. CAT has multiple levels reflecting different reading comprehension abilities, from identifying basic facts, to performing inference, to understanding intent and sentiment. |
Tasks | Reading Comprehension |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02399v1 |
PDF | https://arxiv.org/pdf/1909.02399v1.pdf |
PWC | https://paperswithcode.com/paper/reading-comprehension-ability-test-a-turing |
Repo | |
Framework | |
Identity Connections in Residual Nets Improve Noise Stability
Title | Identity Connections in Residual Nets Improve Noise Stability |
Authors | Shuzhi Yu, Carlo Tomasi |
Abstract | Residual Neural Networks (ResNets) achieve state-of-the-art performance in many computer vision problems. Compared to plain networks without residual connections (PlnNets), ResNets train faster, generalize better, and suffer less from the so-called degradation problem. We introduce simplified (but still nonlinear) versions of ResNets and PlnNets for which these discrepancies still hold, although to a lesser degree. We establish a 1-1 mapping between simplified ResNets and simplified PlnNets, and show that they are exactly equivalent to each other in expressive power for the same computational complexity. We conjecture that ResNets generalize better because they have better noise stability, and empirically support it for both simplified and fully-fledged networks. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10944v1 |
PDF | https://arxiv.org/pdf/1905.10944v1.pdf |
PWC | https://paperswithcode.com/paper/identity-connections-in-residual-nets-improve |
Repo | |
Framework | |
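The conjectured noise-stability gap can be measured directly: perturb the input with Gaussian noise and compare the relative output change of a residual stack against a plain stack. A minimal sketch with simplified (linear + ReLU) blocks, in the spirit of the paper's simplified networks; the exact numbers vary with initialization and depth.

```python
import torch
import torch.nn as nn

class PlainBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
    def forward(self, x):
        return self.f(x)

class ResBlock(PlainBlock):
    def forward(self, x):
        return x + self.f(x)                 # identity connection

@torch.no_grad()
def noise_sensitivity(net, x, sigma=0.1, trials=100):
    """Average relative output change under Gaussian input perturbations."""
    y = net(x)
    changes = []
    for _ in range(trials):
        y_noisy = net(x + sigma * torch.randn_like(x))
        changes.append((y_noisy - y).norm() / y.norm())
    return torch.stack(changes).mean()

torch.manual_seed(0)
dim, depth = 64, 20
plain = nn.Sequential(*[PlainBlock(dim) for _ in range(depth)])
resnet = nn.Sequential(*[ResBlock(dim) for _ in range(depth)])
x = torch.randn(dim)
print("plain :", noise_sensitivity(plain, x).item())
print("resnet:", noise_sensitivity(resnet, x).item())
```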
Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization
Title | Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization |
Authors | Filip de Roos, Philipp Hennig |
Abstract | Pre-conditioning is a well-known concept that can significantly improve the convergence of optimization algorithms. For noise-free problems, where good pre-conditioners are not known a priori, iterative linear algebra methods offer one way to efficiently construct them. For the stochastic optimization problems that dominate contemporary machine learning, however, this approach is not readily available. We propose an iterative algorithm inspired by classic iterative linear solvers that uses a probabilistic model to actively infer a pre-conditioner in situations where Hessian-projections can only be constructed with strong Gaussian noise. The algorithm is empirically demonstrated to efficiently construct effective pre-conditioners for stochastic gradient descent and its variants. Experiments on problems of comparably low dimensionality show improved convergence. In very high-dimensional problems, such as those encountered in deep learning, the pre-conditioner effectively becomes an automatic learning-rate adaptation scheme, which we also empirically show to work well. |
Tasks | Stochastic Optimization |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07557v1 |
PDF | http://arxiv.org/pdf/1902.07557v1.pdf |
PWC | https://paperswithcode.com/paper/active-probabilistic-inference-on-matrices |
Repo | |
Framework | |
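To see why pre-conditioning matters for noisy gradients, consider an ill-conditioned quadratic: without a pre-conditioner, the stable step size is capped by the largest curvature, leaving flat directions slow. A sketch with a known Hessian; the paper's contribution, actively inferring such a pre-conditioner from noisy Hessian-projections, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ill-conditioned quadratic f(w) = 0.5 w^T H w, seen only through noisy gradients.
H = np.diag([100.0, 1.0])

def noisy_grad(w):
    return H @ w + 0.1 * rng.normal(size=2)

def sgd(P, lr, steps=50):
    w = np.array([1.0, 1.0])
    for _ in range(steps):
        w -= lr * P @ noisy_grad(w)        # the pre-conditioner P re-scales the step
    return np.linalg.norm(w)

# Without pre-conditioning, stability caps lr near 2/100; the flat direction crawls.
print("identity   :", sgd(np.eye(2), lr=0.015))
# With P ~ H^-1 the curvature is equalized and a far larger lr is stable.
print("P = H^{-1} :", sgd(np.linalg.inv(H), lr=0.5))
```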
Connections Between Adaptive Control and Optimization in Machine Learning
Title | Connections Between Adaptive Control and Optimization in Machine Learning |
Authors | Joseph E. Gaudio, Travis E. Gibson, Anuradha M. Annaswamy, Michael A. Bolender, Eugene Lavretsky |
Abstract | This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning, common to both fields are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem related to higher order learning is solved through insights obtained from these intersections. |
Tasks | |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05856v1 |
PDF | http://arxiv.org/pdf/1904.05856v1.pdf |
PWC | https://paperswithcode.com/paper/connections-between-adaptive-control-and |
Repo | |
Framework | |
Representation Learning: A Statistical Perspective
Title | Representation Learning: A Statistical Perspective |
Authors | Jianwen Xie, Ruiqi Gao, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu |
Abstract | Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning representations from a statistical perspective. In particular, we review the following two themes: (a) unsupervised learning of vector representations and (b) learning of both vector and matrix representations. |
Tasks | Representation Learning |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11374v1 |
PDF | https://arxiv.org/pdf/1911.11374v1.pdf |
PWC | https://paperswithcode.com/paper/representation-learning-a-statistical |
Repo | |
Framework | |
Convergence of Gradient Methods on Bilinear Zero-Sum Games
Title | Convergence of Gradient Methods on Bilinear Zero-Sum Games |
Authors | Guojun Zhang, Yaoliang Yu |
Abstract | Min-max formulations have attracted great attention in the ML community due to the rise of deep generative models and adversarial methods, while understanding the dynamics of gradient algorithms for solving such formulations has remained a grand challenge. As a first step, we restrict to bilinear zero-sum games and give a systematic analysis of popular gradient updates, for both simultaneous and alternating versions. We provide exact conditions for their convergence and find the optimal parameter setup and convergence rates. In particular, our results offer formal evidence that alternating updates converge “better” than simultaneous ones. |
Tasks | |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05699v4 |
PDF | https://arxiv.org/pdf/1908.05699v4.pdf |
PWC | https://paperswithcode.com/paper/convergence-behaviour-of-some-gradient-based |
Repo | |
Framework | |
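The simultaneous-versus-alternating distinction is easy to reproduce on the simplest bilinear game f(x, y) = xy, whose unique equilibrium is the origin. A minimal sketch: simultaneous gradient descent-ascent grows the iterate norm by a factor of sqrt(1 + eta^2) per step and diverges, while the alternating version stays bounded for eta < 2, consistent with the kind of exact conditions the paper derives.

```python
import numpy as np

# min_x max_y  f(x, y) = x * y   (1-D bilinear game; equilibrium at the origin)
eta, steps = 0.1, 500

def simultaneous():
    x, y = 1.0, 1.0
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x   # both players use the old iterate
    return np.hypot(x, y)

def alternating():
    x, y = 1.0, 1.0
    for _ in range(steps):
        x = x - eta * y
        y = y + eta * x                   # the second player sees the fresh x
    return np.hypot(x, y)

print("simultaneous:", simultaneous())   # diverges: ~(1 + eta^2)^(steps/2)
print("alternating :", alternating())    # bounded for eta < 2
```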
Multi-objective Evolutionary Approach to Grey-Box Identification of Buck Converter
Title | Multi-objective Evolutionary Approach to Grey-Box Identification of Buck Converter |
Authors | Faizal Hafiz, Akshya Swain, Eduardo M. A. M. Mendes, Luis Aguirre |
Abstract | The present study proposes a simple grey-box identification approach to model a real DC-DC buck converter operating in continuous conduction mode. The problem associated with the information void in the observed dynamical data, which is often obtained over a relatively narrow input range, is alleviated by exploiting the known static behavior of the buck converter as a priori knowledge. A simple method is developed, based on the concept of term clusters, to determine the static response of the candidate models. The error in the static behavior is then directly embedded into the multi-objective framework for structure selection. In essence, the proposed approach casts the grey-box identification problem into a multi-objective framework to balance the bias-variance dilemma of model building while explicitly integrating a priori knowledge into the structure selection process. The results of the investigation, considering a practical buck converter, demonstrate that it is possible to identify parsimonious models which capture both the dynamic and static behavior of the system over a wide input range. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04320v2 |
PDF | https://arxiv.org/pdf/1909.04320v2.pdf |
PWC | https://paperswithcode.com/paper/multi-objective-evolutionary-approach-to-grey |
Repo | |
Framework | |
Adaptive Iterative Hessian Sketch via A-Optimal Subsampling
Title | Adaptive Iterative Hessian Sketch via A-Optimal Subsampling |
Authors | Aijun Zhang, Hengtao Zhang, Guosheng Yin |
Abstract | Iterative Hessian sketch (IHS) is an effective sketching method for modeling large-scale data. It was originally proposed by Pilanci and Wainwright (2016; JMLR) based on randomized sketching matrices. However, it is computationally intensive due to the iterative sketch process. In this paper, we analyze the IHS algorithm under the unconstrained least squares problem setting, then propose a deterministic approach for improving IHS via A-optimal subsampling. Our contributions are three-fold: (1) a good initial estimator based on the A-optimal design is suggested; (2) a novel ridged preconditioner is developed for repeated sketching; and (3) an exact line search method is proposed for determining the optimal step length adaptively. Extensive experimental results demonstrate that our proposed A-optimal IHS algorithm outperforms the existing accelerated IHS methods. |
Tasks | |
Published | 2019-02-20 |
URL | https://arxiv.org/abs/1902.07627v2 |
PDF | https://arxiv.org/pdf/1902.07627v2.pdf |
PWC | https://paperswithcode.com/paper/adaptive-iterative-hessian-sketch-via-a |
Repo | |
Framework | |
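For reference, the baseline randomized IHS iteration that the paper improves upon works as follows: each round draws a fresh sketch, forms a sketched Hessian, and takes an approximate Newton step using the exact gradient. A sketch with a Gaussian sketching matrix; the paper's A-optimal subsampling, initial estimator, ridged preconditioner, and exact line search are not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, m = 10_000, 20, 200          # m: sketch size (m >> d, m << n)

A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

x = np.zeros(d)
for t in range(10):
    S = rng.normal(size=(m, n)) / np.sqrt(m)   # fresh Gaussian sketch per round
    SA = S @ A                                  # sketched design, m x d
    g = A.T @ (b - A @ x)                       # exact gradient (one pass over data)
    x = x + np.linalg.solve(SA.T @ SA, g)       # sketched Newton step
    print(t, np.linalg.norm(x - x_true))
```

The error contracts linearly per iteration (roughly like sqrt(d/m)) down to the noise floor of the least-squares solution.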
Closed-form Expressions for Maximum Mean Discrepancy with Applications to Wasserstein Auto-Encoders
Title | Closed-form Expressions for Maximum Mean Discrepancy with Applications to Wasserstein Auto-Encoders |
Authors | Raif M. Rustamov |
Abstract | The Maximum Mean Discrepancy (MMD) has found numerous applications in statistics and machine learning, most recently as a penalty in the Wasserstein Auto-Encoder (WAE). In this paper we compute closed-form expressions for estimating the Gaussian kernel based MMD between a given distribution and the standard multivariate normal distribution. We introduce the standardized version of MMD as a penalty for the WAE training objective, allowing for a better interpretability of MMD values and more compatibility across different hyperparameter settings. Next, we propose using a version of batch normalization at the code layer; this has the benefits of making the kernel width selection easier, reducing the training effort, and preventing outliers in the aggregate code distribution. Finally, we discuss the appropriate null distributions and provide thresholds for multivariate normality testing with the standardized MMD, leading to a number of easy rules of thumb for monitoring the progress of WAE training. Curiously, our MMD formula reveals a connection to the Baringhaus-Henze-Epps-Pulley (BHEP) statistic of the Henze-Zirkler test and provides further insights about the MMD. Our experiments on synthetic and real data show that the analytic formulation improves over the commonly used stochastic approximation of the MMD, and demonstrate that code normalization provides significant benefits when training WAEs. |
Tasks | |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.03227v1 |
PDF | http://arxiv.org/pdf/1901.03227v1.pdf |
PWC | https://paperswithcode.com/paper/closed-form-expressions-for-maximum-mean |
Repo | |
Framework | |
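The Gaussian-kernel expectations against a standard normal that make a closed form possible are classical. Below is a sketch of the resulting plain (biased, V-statistic) estimator, assuming the kernel parameterization k(x, y) = exp(-||x - y||^2 / (2 gamma^2)); the paper's standardized MMD and null-distribution thresholds are not reproduced here.

```python
import numpy as np

def mmd2_to_standard_normal(X, gamma=1.0):
    """V-statistic estimate of squared Gaussian-kernel MMD between the sample
    X (n x d) and N(0, I_d), using analytic kernel expectations."""
    n, d = X.shape
    g2 = gamma ** 2
    # E_{z,z' ~ N(0,I)}[k(z, z')] in closed form:
    Ezz = (g2 / (g2 + 2.0)) ** (d / 2.0)
    # E_{z ~ N(0,I)}[k(x_i, z)] in closed form, per sample point:
    Exz = (g2 / (g2 + 1.0)) ** (d / 2.0) * np.exp(
        -np.sum(X ** 2, axis=1) / (2.0 * (g2 + 1.0)))
    # Sample-sample term (stochastic part that remains):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    Exx = np.exp(-sq / (2.0 * g2)).mean()
    return Exx - 2.0 * Exz.mean() + Ezz

rng = np.random.default_rng(4)
print("normal sample :", mmd2_to_standard_normal(rng.normal(size=(500, 2))))
print("shifted sample:", mmd2_to_standard_normal(rng.normal(size=(500, 2)) + 1.0))
```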
Beta Survival Models
Title | Beta Survival Models |
Authors | David Hubbard, Benoit Rostykus, Yves Raimond, Tony Jebara |
Abstract | This article analyzes the problem of estimating the time until an event occurs, also known as survival modeling. We observe through substantial experiments on large real-world datasets and use-cases that populations are largely heterogeneous. Sub-populations have different mean and variance in their survival rates requiring flexible models that capture heterogeneity. We leverage a classical extension of the logistic function into the survival setting to characterize unobserved heterogeneity using the beta distribution. This yields insights into the geometry of the problem as well as efficient estimation methods for linear, tree and neural network models that adjust the beta distribution based on observed covariates. We also show that the additional information captured by the beta distribution leads to interesting ranking implications as we determine who is most-at-risk. We show theoretically that the ranking is variable as we forecast forward in time and prove that pairwise comparisons of survival remain transitive. Empirical results using large-scale datasets across two use-cases (online conversions and retention modeling), demonstrate the competitiveness of the method. The simplicity of the method and its ability to capture skew in the data makes it a viable alternative to standard techniques particularly when we are interested in the time to event and when the underlying probabilities are heterogeneous. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03818v1 |
PDF | https://arxiv.org/pdf/1905.03818v1.pdf |
PWC | https://paperswithcode.com/paper/beta-survival-models |
Repo | |
Framework | |
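The classical construction the paper builds on is the beta-geometric model: each subject's per-period event probability is drawn from a Beta(a, b), which gives the survival function in closed form, S(t) = B(a, b + t) / B(a, b). A minimal sketch of that closed form follows; the paper's covariate-dependent (linear, tree, and neural) parameterizations of a and b are omitted.

```python
import numpy as np
from scipy.special import betaln

def survival(t, a, b):
    """P(T > t) = E_p[(1 - p)^t] for p ~ Beta(a, b), i.e. B(a, b+t) / B(a, b),
    computed in log space for numerical stability."""
    return np.exp(betaln(a, b + t) - betaln(a, b))

t = np.arange(0, 12)
# Same mean hazard a / (a + b) = 0.25, different heterogeneity:
print(survival(t, 1.0, 3.0))    # high variance in p: heavy survival tail
print(survival(t, 10.0, 30.0))  # nearly homogeneous population
```

The comparison illustrates the abstract's point: two populations with the same mean rate but different heterogeneity have visibly different survival curves, which is what drives the ranking behavior over time.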
Understanding Social Networks using Transfer Learning
Title | Understanding Social Networks using Transfer Learning |
Authors | Jun Sun, Steffen Staab, Jérôme Kunegis |
Abstract | A detailed understanding of users contributes to the understanding of the Web’s evolution, and to the development of Web applications. Although for new Web platforms such a study is especially important, it is often jeopardized by the lack of knowledge about novel phenomena due to the sparsity of data. Akin to human transfer of experiences from one domain to the next, transfer learning as a subfield of machine learning adapts knowledge acquired in one domain to a new domain. We systematically investigate how the concept of transfer learning may be applied to the study of users on newly created (emerging) Web platforms, and propose our transfer learning-based approach, TraNet. We show two use cases where TraNet is applied to tasks involving the identification of user trust and roles on different Web platforms. We compare the performance of TraNet with other approaches and find that our approach can best transfer knowledge on users across platforms in the given tasks. |
Tasks | Transfer Learning |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07918v1 |
PDF | https://arxiv.org/pdf/1910.07918v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-social-networks-using-transfer |
Repo | |
Framework | |