October 21, 2019

3076 words 15 mins read

Paper Group AWR 71

Scaling Gaussian Process Regression with Derivatives

Title Scaling Gaussian Process Regression with Derivatives
Authors David Eriksson, Kun Dong, Eric Hans Lee, David Bindel, Andrew Gordon Wilson
Abstract Gaussian processes (GPs) with derivatives are useful in many applications, including Bayesian optimization, implicit surface reconstruction, and terrain reconstruction. Fitting a GP to function values and derivatives at $n$ points in $d$ dimensions requires linear solves and log determinants with an ${n(d+1) \times n(d+1)}$ positive definite matrix, leading to prohibitive $\mathcal{O}(n^3d^3)$ computations for standard direct methods. We propose iterative solvers using fast $\mathcal{O}(nd)$ matrix-vector multiplications (MVMs), together with pivoted Cholesky preconditioning that cuts the iterations to convergence by several orders of magnitude, allowing for fast kernel learning and prediction. Our approaches, together with dimensionality reduction, enable Bayesian optimization with derivatives to scale to high-dimensional problems and large evaluation budgets.
Tasks Dimensionality Reduction, Gaussian Processes
Published 2018-10-29
URL http://arxiv.org/abs/1810.12283v1
PDF http://arxiv.org/pdf/1810.12283v1.pdf
PWC https://paperswithcode.com/paper/scaling-gaussian-process-regression-with
Repo https://github.com/ericlee0803/GP_Derivatives
Framework none
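
As a concrete illustration of the ${n(d+1) \times n(d+1)}$ structure the paper works with, here is a small dense construction of the joint covariance over function values and derivatives for a 1D RBF kernel. This is only a toy sketch (the kernel choice, lengthscale, and variance names are assumptions); the paper's contribution is precisely to avoid forming this matrix by using fast MVM-based iterative solvers, which live in the linked repo.

```python
import numpy as np

def rbf_kernel_with_derivatives(x, lengthscale=1.0, variance=1.0):
    """Joint covariance of [f(x); f'(x)] for a 1D RBF kernel (dense toy version).

    With n points and d=1 this is the 2n x 2n, i.e. n(d+1) x n(d+1), matrix
    that the paper avoids forming explicitly by using fast MVMs.
    """
    r = x[:, None] - x[None, :]                                 # pairwise differences
    K = variance * np.exp(-0.5 * r**2 / lengthscale**2)         # cov(f(x_i), f(x_j))
    K_fd = K * r / lengthscale**2                               # cov(f(x_i), f'(x_j))
    K_df = -K_fd                                                # cov(f'(x_i), f(x_j)) = K_fd.T
    K_dd = K * (1.0 / lengthscale**2 - r**2 / lengthscale**4)   # cov(f'(x_i), f'(x_j))
    return np.block([[K, K_fd], [K_df, K_dd]])

x = np.linspace(0.0, 1.0, 5)
K_joint = rbf_kernel_with_derivatives(x)
print(K_joint.shape)   # (10, 10), i.e. n*(d+1) with n=5, d=1
```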

Where are we now? A large benchmark study of recent symbolic regression methods

Title Where are we now? A large benchmark study of recent symbolic regression methods
Authors Patryk Orzechowski, William La Cava, Jason H. Moore
Abstract In this paper we provide a broad benchmarking of recent genetic programming approaches to symbolic regression in the context of state-of-the-art machine learning approaches. We use a set of nearly 100 regression benchmark problems culled from open source repositories across the web. We conduct a rigorous benchmarking of four recent symbolic regression approaches as well as nine machine learning approaches from scikit-learn. The results suggest that symbolic regression performs strongly compared to state-of-the-art gradient boosting algorithms, although in terms of running time it is among the slowest of the available methodologies. We discuss the results in detail and point to future research directions that may allow symbolic regression to gain wider adoption in the machine learning community.
Tasks
Published 2018-04-25
URL http://arxiv.org/abs/1804.09331v2
PDF http://arxiv.org/pdf/1804.09331v2.pdf
PWC https://paperswithcode.com/paper/where-are-we-now-a-large-benchmark-study-of
Repo https://github.com/EpistasisLab/regression-benchmark
Framework none
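
The repository contains the full benchmark. As a rough, hedged sketch of what such a comparison loop looks like with scikit-learn baselines, one might write something like the following; the dataset and model choices here are stand-ins, not the paper's actual benchmark suite.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Stand-in problem; the paper uses ~100 regression datasets from open repositories.
X, y = make_friedman1(n_samples=500, noise=0.1, random_state=0)

models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "linear": LinearRegression(),
    # Symbolic regression methods would slot in here through their
    # scikit-learn-compatible wrappers (see the benchmark repo).
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name:18s} R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```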

Fast Randomized PCA for Sparse Data

Title Fast Randomized PCA for Sparse Data
Authors Xu Feng, Yuyang Xie, Mingye Song, Wenjian Yu, Jie Tang
Abstract Principal component analysis (PCA) is widely used for dimension reduction and embedding of real data in social network analysis, information retrieval, and natural language processing, etc. In this work we propose a fast randomized PCA algorithm for processing large sparse data. The algorithm has similar accuracy to the basic randomized SVD (rPCA) algorithm (Halko et al., 2011), but is largely optimized for sparse data. It also has good flexibility to trade off runtime against accuracy for practical usage. Experiments on real data show that the proposed algorithm is up to 9.1X faster than the basic rPCA algorithm without accuracy loss, and is up to 20X faster than the svds function in Matlab with little error. The algorithm computes the first 100 principal components of a large information retrieval dataset with 12,869,521 persons and 323,899 keywords in less than 400 seconds on a 24-core machine, while all conventional methods fail due to out-of-memory issues.
Tasks Dimensionality Reduction, Information Retrieval
Published 2018-10-16
URL http://arxiv.org/abs/1810.06825v1
PDF http://arxiv.org/pdf/1810.06825v1.pdf
PWC https://paperswithcode.com/paper/fast-randomized-pca-for-sparse-data
Repo https://github.com/XuFengthucs/frPCA_sparse
Framework none
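
For reference, the baseline the paper compares against is the basic randomized SVD of Halko et al. (2011). A minimal sketch of that baseline on a SciPy sparse matrix is shown below; the paper's own sparsity-oriented optimizations live in the linked repo and are not reproduced here.

```python
import numpy as np
import scipy.sparse as sp

def basic_randomized_svd(A, k, oversample=10, n_power_iter=2, seed=0):
    """Basic randomized SVD (Halko et al., 2011) applied to a sparse matrix.

    This is the baseline the paper compares against, not the paper's
    sparsity-optimized algorithm.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))
    Y = A @ Omega                            # sparse-dense products only
    for _ in range(n_power_iter):            # power iterations sharpen the range estimate
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                   # orthonormal basis for an approximate range of A
    B = (A.T @ Q).T                          # small (k + oversample) x n dense matrix
    U_hat, S, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_hat)[:, :k], S[:k], Vt[:k]

A = sp.random(2000, 1000, density=0.01, format="csr", random_state=0)
U, S, Vt = basic_randomized_svd(A, k=20)
print(S[:5])
```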

Evading classifiers in discrete domains with provable optimality guarantees

Title Evading classifiers in discrete domains with provable optimality guarantees
Authors Bogdan Kulynych, Jamie Hayes, Nikita Samarin, Carmela Troncoso
Abstract Machine-learning models for security-critical applications such as bot, malware, or spam detection, operate in constrained discrete domains. These applications would benefit from having provable guarantees against adversarial examples. The existing literature on provable adversarial robustness of models, however, exclusively focuses on robustness to gradient-based attacks in domains such as images. These attacks model the adversarial cost, e.g., amount of distortion applied to an image, as a $p$-norm. We argue that this approach is not well-suited to model adversarial costs in constrained domains where not all examples are feasible. We introduce a graphical framework that (1) generalizes existing attacks in discrete domains, (2) can accommodate complex cost functions beyond $p$-norms, including financial cost incurred when attacking a classifier, and (3) efficiently produces valid adversarial examples with guarantees of minimal adversarial cost. These guarantees directly translate into a notion of adversarial robustness that takes into account domain constraints and the adversary’s capabilities. We show how our framework can be used to evaluate security by crafting adversarial examples that evade a Twitter-bot detection classifier with a provably minimal number of changes; and to build privacy defenses by crafting adversarial examples that evade a privacy-invasive website-fingerprinting classifier.
Tasks Twitter Bot Detection
Published 2018-10-25
URL https://arxiv.org/abs/1810.10939v3
PDF https://arxiv.org/pdf/1810.10939v3.pdf
PWC https://paperswithcode.com/paper/evading-classifiers-in-discrete-domains-with
Repo https://github.com/bogdan-kulynych/textfool
Framework none
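
The framework casts evasion as search over a graph of discrete transformations. As a hedged sketch of the general idea (not the paper's implementation), a uniform-cost search over user-supplied transformations and costs returns a minimal-cost adversarial example whenever the edge costs are non-negative; the function and argument names below are illustrative assumptions.

```python
import heapq

def minimal_cost_evasion(x0, transformations, cost, is_adversarial, max_nodes=100_000):
    """Uniform-cost (Dijkstra-style) search over discrete transformations.

    x0:              initial example (hashable, e.g. a tuple of features)
    transformations: state -> iterable of successor states
    cost:            (state, successor) -> non-negative edge cost
    is_adversarial:  state -> True once the target classifier is evaded

    With non-negative costs, the first goal state popped from the queue has
    minimal total cost. Returns (total_cost, state) or None.
    """
    frontier = [(0.0, x0)]
    best = {x0: 0.0}
    expanded = 0
    while frontier and expanded < max_nodes:
        c, x = heapq.heappop(frontier)
        if c > best.get(x, float("inf")):
            continue                          # stale queue entry
        if is_adversarial(x):
            return c, x
        expanded += 1
        for nxt in transformations(x):
            nc = c + cost(x, nxt)
            if nc < best.get(nxt, float("inf")):
                best[nxt] = nc
                heapq.heappush(frontier, (nc, nxt))
    return None
```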

Parser Extraction of Triples in Unstructured Text

Title Parser Extraction of Triples in Unstructured Text
Authors Shaun D’Souza
Abstract The web contains vast repositories of unstructured text. We investigate the opportunity for building a knowledge graph from these text sources. We generate a set of triples which can be used in knowledge gathering and integration. We define the architecture of a language compiler for processing subject-predicate-object triples using the OpenNLP parser. We implement a depth-first search traversal on the POS tagged syntactic tree appending predicate and object information. A parser enables higher precision and higher recall extractions of syntactic relationships across conjunction boundaries. We extract 2-2.5 times as many correct extractions as ReVerb. The extractions are used in a variety of semantic web applications and question answering. We verify extraction of 50,000 triples on the ClueWeb dataset.
Tasks Question Answering
Published 2018-11-06
URL http://arxiv.org/abs/1811.05768v1
PDF http://arxiv.org/pdf/1811.05768v1.pdf
PWC https://paperswithcode.com/paper/parser-extraction-of-triples-in-unstructured
Repo https://github.com/shaundsouza/parser-triples
Framework none
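
The paper builds on the OpenNLP parser (a Java toolkit); purely as an illustration of the depth-first traversal idea, the following toy sketch pulls a subject-predicate-object triple out of a simple constituency tree using NLTK. The tree shape and label handling are simplifications, not the paper's extraction rules.

```python
from nltk import Tree

def extract_triple(sentence_tree):
    """Toy (subject, predicate, object) extraction from a simple
    S -> NP VP(V NP) constituency tree via depth-first traversal."""
    def first(label, tree):
        sub = next(tree.subtrees(lambda t: t.label() == label), None)
        return " ".join(sub.leaves()) if sub is not None else None

    subj = first("NP", sentence_tree)
    vp = next(sentence_tree.subtrees(lambda t: t.label() == "VP"), None)
    if vp is None:
        return None
    pred = first("VBZ", vp) or first("VBD", vp) or first("VB", vp)
    obj = first("NP", vp)
    return (subj, pred, obj)

t = Tree.fromstring("(S (NP (NNP Alice)) (VP (VBZ writes) (NP (NN code))))")
print(extract_triple(t))   # ('Alice', 'writes', 'code')
```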

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

Title Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs
Authors W. James Murdoch, Peter J. Liu, Bin Yu
Abstract The driving force behind the recent success of LSTMs has been their ability to learn complex and non-linear relationships. Consequently, our inability to describe these relationships has led to LSTMs being characterized as black boxes. To this end, we introduce contextual decomposition (CD), an interpretation algorithm for analysing individual predictions made by standard LSTMs, without any changes to the underlying model. By decomposing the output of an LSTM, CD captures the contributions of combinations of words or variables to the final prediction of an LSTM. On the task of sentiment analysis with the Yelp and SST data sets, we show that CD is able to reliably identify words and phrases of contrasting sentiment, and how they are combined to yield the LSTM’s final prediction. Using the phrase-level labels in SST, we also demonstrate that CD is able to successfully extract positive and negative negations from an LSTM, something which has not previously been done.
Tasks Sentiment Analysis
Published 2018-01-16
URL http://arxiv.org/abs/1801.05453v2
PDF http://arxiv.org/pdf/1801.05453v2.pdf
PWC https://paperswithcode.com/paper/beyond-word-importance-contextual
Repo https://github.com/jamie-murdoch/ContextualDecomposition
Framework pytorch
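
At the core of CD is a linearization that attributes a nonlinearity's output to the additive parts of its pre-activation (relevant, irrelevant, bias). The sketch below shows one Shapley-style way to compute such an attribution; it is a simplified reading of the idea, not the exact formula or code from the paper (see the repo for the authors' implementation).

```python
import numpy as np
from itertools import permutations

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linearize(act, parts):
    """Attribute act(sum(parts)) to each additive part of a pre-activation by
    averaging its marginal contribution over all orderings of the parts
    (Shapley-style). Exact, but factorial in the number of parts."""
    parts = list(parts)
    contrib = np.zeros(len(parts))
    perms = list(permutations(range(len(parts))))
    for perm in perms:
        running = 0.0
        for idx in perm:
            contrib[idx] += act(running + parts[idx]) - act(running)
            running += parts[idx]
    return contrib / len(perms)

# e.g. a gate pre-activation split into relevant, irrelevant and bias parts
rel, irrel, bias = 1.2, -0.4, 0.3
print(linearize(sigmoid, [rel, irrel, bias]))   # entries sum to sigmoid(1.1) - sigmoid(0)
```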

Understanding the Modeling of Computer Network Delays using Neural Networks

Title Understanding the Modeling of Computer Network Delays using Neural Networks
Authors Albert Mestres, Eduard Alarcón, Yusheng Ji, Albert Cabellos-Aparicio
Abstract Recent trends in networking are proposing the use of Machine Learning (ML) techniques for the control and operation of the network. In this context, ML can be used as a computer network modeling technique to build models that estimate the network performance. Indeed, network modeling is a central technique to many networking functions, for instance in the field of optimization, in which the model is used to search a configuration that satisfies the target policy. In this paper, we aim to provide an answer to the following question: Can neural networks accurately model the delay of a computer network as a function of the input traffic? For this, we treat the network as a black box that takes a traffic matrix as input and produces delays as output. We then train different neural network models and evaluate their accuracy under different fundamental network characteristics: topology, size, traffic intensity and routing. With this, we aim to have a better understanding of computer network modeling with neural nets and ultimately provide practical guidelines on how such models need to be trained.
Tasks
Published 2018-07-23
URL http://arxiv.org/abs/1807.08652v1
PDF http://arxiv.org/pdf/1807.08652v1.pdf
PWC https://paperswithcode.com/paper/understanding-the-modeling-of-computer
Repo https://github.com/knowledgedefinednetworking/Understanding-the-Modeling-of-Network-Delays-using-NN
Framework tf
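
The repository (TensorFlow) contains the actual models and datasets. As a minimal sketch of the kind of regressor the paper studies, the following maps a flattened traffic matrix to per-path delays with a small MLP; the topology size, layer widths, and synthetic data here are placeholders, and PyTorch is used only for brevity.

```python
import torch
import torch.nn as nn

n_nodes = 10                            # toy topology size (assumption)
in_dim = out_dim = n_nodes * n_nodes    # flattened traffic matrix in, per-pair delay out

model = nn.Sequential(
    nn.Linear(in_dim, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, out_dim),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# In the paper these pairs come from simulating traffic on a given topology
# and routing; random tensors stand in for them here.
traffic = torch.rand(256, in_dim)
delay = torch.rand(256, out_dim)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(traffic), delay)
    loss.backward()
    optimizer.step()
    print(epoch, loss.item())
```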

Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation

Title Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation
Authors Hareesh Bahuleyan, Lili Mou, Hao Zhou, Olga Vechtomova
Abstract The variational autoencoder (VAE) imposes a probabilistic distribution (typically Gaussian) on the latent space and penalizes the Kullback–Leibler (KL) divergence between the posterior and prior. In NLP, VAEs are extremely difficult to train due to the problem of the KL term collapsing to zero. One has to implement various heuristics such as KL weight annealing and word dropout in a carefully engineered manner to successfully train a VAE for text. In this paper, we propose to use the Wasserstein autoencoder (WAE) for probabilistic sentence generation, where the encoder could be either stochastic or deterministic. We show theoretically and empirically that, in the original WAE, the stochastically encoded Gaussian distribution tends to become a Dirac-delta function, and we propose a variant of WAE that encourages the stochasticity of the encoder. Experimental results show that the latent space learned by WAE exhibits properties of continuity and smoothness as in VAEs, while simultaneously achieving much higher BLEU scores for sentence reconstruction.
Tasks Text Generation
Published 2018-06-22
URL http://arxiv.org/abs/1806.08462v2
PDF http://arxiv.org/pdf/1806.08462v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-natural-language-generation
Repo https://github.com/HareeshBahuleyan/probabilistic_nlg
Framework tf
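
The defining ingredient of a WAE is replacing the per-sample KL term with a divergence between the aggregated posterior and the prior. A common choice is an MMD penalty with an RBF kernel, sketched below; the paper's encoder/decoder architecture, its stochastic-encoder variant, and the exact penalty it uses are in the repo, so treat this as a generic illustration.

```python
import torch

def rbf_mmd2(z_q, z_p, sigma=1.0):
    """MMD^2 between encoder samples z_q and prior samples z_p with an RBF
    kernel; in a WAE this replaces the per-sample KL term of a VAE."""
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    n, m = z_q.size(0), z_p.size(0)
    k_qq = (kernel(z_q, z_q).sum() - n) / (n * (n - 1))   # drop the diagonal (all ones)
    k_pp = (kernel(z_p, z_p).sum() - m) / (m * (m - 1))
    k_qp = kernel(z_q, z_p).mean()
    return k_qq + k_pp - 2 * k_qp

# Inside a training step, with z from the encoder and recon_loss from the decoder:
#   z_prior = torch.randn_like(z)
#   loss = recon_loss + lambda_mmd * rbf_mmd2(z, z_prior)
```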

Multi-task Deep Reinforcement Learning with PopArt

Title Multi-task Deep Reinforcement Learning with PopArt
Authors Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt
Abstract The reinforcement learning community has made great strides in designing algorithms capable of exceeding human performance on specific tasks. These algorithms are mostly trained on one task at a time, with each new task requiring a brand-new agent instance to be trained. This means the learning algorithm is general, but each solution is not; each agent can only solve the one task it was trained on. In this work, we study the problem of learning to master not one but multiple sequential-decision tasks at once. A general issue in multi-task learning is that a balance must be found between the needs of multiple tasks competing for the limited resources of a single learning system. Many learning algorithms can get distracted by certain tasks in the set of tasks to solve. Such tasks appear more salient to the learning process, for instance because of the density or magnitude of the in-task rewards. This causes the algorithm to focus on those salient tasks at the expense of generality. We propose to automatically adapt the contribution of each task to the agent’s updates, so that all tasks have a similar impact on the learning dynamics. This resulted in state-of-the-art performance on learning to play all games in a set of 57 diverse Atari games. Excitingly, our method learned a single trained policy, with a single set of weights, that exceeds median human performance. To our knowledge, this was the first time a single agent surpassed human-level performance on this multi-task domain. The same approach also demonstrated state-of-the-art performance on a set of 30 tasks in the 3D reinforcement learning platform DeepMind Lab.
Tasks Atari Games, Multi-Task Learning
Published 2018-09-12
URL http://arxiv.org/abs/1809.04474v1
PDF http://arxiv.org/pdf/1809.04474v1.pdf
PWC https://paperswithcode.com/paper/multi-task-deep-reinforcement-learning-with
Repo https://github.com/aluscher/torchbeastpopart
Framework pytorch
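
The mechanism behind PopArt is to normalize value targets with running statistics while rescaling the final linear layer so that unnormalized predictions are preserved whenever the statistics move. Below is a minimal single-task, scalar sketch of that bookkeeping (the paper keeps separate statistics per task); it reflects a reading of the published update, not DeepMind's code.

```python
import numpy as np

class PopArt:
    """Normalized value head with output-preserving statistics updates (scalar toy)."""

    def __init__(self, beta=3e-4):
        self.mu, self.nu, self.beta = 0.0, 1.0, beta   # running mean / second moment
        self.w, self.b = 1.0, 0.0                      # last linear layer of the value head

    @property
    def sigma(self):
        return float(np.sqrt(max(self.nu - self.mu ** 2, 1e-8)))

    def update_stats(self, targets):
        old_mu, old_sigma = self.mu, self.sigma
        for g in targets:                              # EMA of first and second moments
            self.mu += self.beta * (g - self.mu)
            self.nu += self.beta * (g ** 2 - self.nu)
        # Preserve outputs: sigma*(w*x + b) + mu must be unchanged for every x.
        self.w = self.w * old_sigma / self.sigma
        self.b = (old_sigma * self.b + old_mu - self.mu) / self.sigma

    def normalize(self, g):
        return (g - self.mu) / self.sigma

    def unnormalize(self, y):
        return y * self.sigma + self.mu
```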

LEGO: Learning Edge with Geometry all at Once by Watching Videos

Title LEGO: Learning Edge with Geometry all at Once by Watching Videos
Authors Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia
Abstract Learning to estimate 3D geometry in a single image by watching unlabeled videos via deep convolutional network is attracting significant attention. In this paper, we introduce a “3D as-smooth-as-possible (3D-ASAP)” prior inside the pipeline, which enables joint estimation of edges and 3D scene, yielding results with significant improvement in accuracy for fine detailed structures. Specifically, we define the 3D-ASAP prior by requiring that any two points recovered in 3D from an image should lie on an existing planar surface if no other cues are provided. We design an unsupervised framework that Learns Edges and Geometry (depth, normal) all at Once (LEGO). The predicted edges are embedded into depth and surface normal smoothness terms, where pixels without edges in-between are constrained to satisfy the prior. In our framework, the predicted depths, normals and edges are forced to be consistent all the time. We conduct experiments on KITTI to evaluate our estimated geometry and CityScapes to perform edge evaluation. We show that in all of the tasks, i.e., depth, normal and edge, our algorithm vastly outperforms other state-of-the-art (SOTA) algorithms, demonstrating the benefits of our approach.
Tasks
Published 2018-03-15
URL http://arxiv.org/abs/1803.05648v2
PDF http://arxiv.org/pdf/1803.05648v2.pdf
PWC https://paperswithcode.com/paper/lego-learning-edge-with-geometry-all-at-once
Repo https://github.com/zhenheny/LEGO
Framework tf
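
One way to see how predicted edges can gate geometric smoothness is the edge-aware depth-smoothness term sketched below: depth gradients are penalized only where no edge separates neighbouring pixels. This is a simplified stand-in for the 3D-ASAP prior, which additionally couples surface normals and enforces planarity; the real loss is in the repo.

```python
import torch

def edge_aware_smoothness(depth, edges):
    """Penalize depth gradients only where no edge is predicted.

    depth: (B, 1, H, W) predicted depth
    edges: (B, 1, H, W) predicted edge probability in [0, 1]
    """
    dx = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
    dy = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
    # Down-weight the penalty wherever an edge separates the two pixels.
    wx = 1.0 - torch.maximum(edges[..., :, 1:], edges[..., :, :-1])
    wy = 1.0 - torch.maximum(edges[..., 1:, :], edges[..., :-1, :])
    return (wx * dx).mean() + (wy * dy).mean()
```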

Grammar Induction with Neural Language Models: An Unusual Replication

Title Grammar Induction with Neural Language Models: An Unusual Replication
Authors Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman
Abstract A substantial thread of recent work on latent tree learning has attempted to develop neural network models with parse-valued latent variables and train them on non-parsing tasks, in the hope of having them discover interpretable tree structure. In a recent paper, Shen et al. (2018) introduce such a model and report near-state-of-the-art results on the target task of language modeling, and the first strong latent tree learning result on constituency parsing. In an attempt to reproduce these results, we discover issues that make the original results hard to trust, including tuning and even training on what is effectively the test set. Here, we attempt to reproduce these results in a fair experiment and to extend them to two new datasets. We find that the results of this work are robust: All variants of the model under study outperform all latent tree learning baselines, and perform competitively with symbolic grammar induction systems. We find that this model represents the first empirical success for latent tree learning, and that neural network language modeling warrants further study as a setting for grammar induction.
Tasks Constituency Parsing, Language Modelling
Published 2018-08-29
URL http://arxiv.org/abs/1808.10000v1
PDF http://arxiv.org/pdf/1808.10000v1.pdf
PWC https://paperswithcode.com/paper/grammar-induction-with-neural-language-models
Repo https://github.com/nyu-mll/PRPN-Analysis
Framework pytorch

The Elephant in the Room

Title The Elephant in the Room
Authors Amir Rosenfeld, Richard Zemel, John K. Tsotsos
Abstract We showcase a family of common failures of state-of-the-art object detectors. These are obtained by replacing image sub-regions by another sub-image that contains a trained object. We call this “object transplanting”. Modifying an image in this manner is shown to have a non-local impact on object detection. Slight changes in object position can affect its identity according to an object detector as well as that of other objects in the image. We provide some analysis and suggest possible reasons for the reported phenomena.
Tasks Object Detection
Published 2018-08-09
URL http://arxiv.org/abs/1808.03305v1
PDF http://arxiv.org/pdf/1808.03305v1.pdf
PWC https://paperswithcode.com/paper/the-elephant-in-the-room
Repo https://github.com/airalcorn2/strike-with-a-pose
Framework pytorch
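
“Object transplanting” amounts to pasting a crop of a trained object into another image and re-running the detector. A minimal sketch of the paste step is below; the detector call is left abstract, and the function names are illustrative.

```python
import numpy as np

def transplant(image, patch, top, left):
    """Paste an object crop into an image at (top, left), clipping at the borders."""
    out = image.copy()
    h = min(patch.shape[0], out.shape[0] - top)
    w = min(patch.shape[1], out.shape[1] - left)
    out[top:top + h, left:left + w] = patch[:h, :w]
    return out

# Sliding the same patch across positions and re-running a detector on each
# result reproduces the kind of non-local effects the paper reports:
#   for top in range(0, image.shape[0] - patch.shape[0], stride):
#       detections = detector(transplant(image, patch, top, left=50))
```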

Robust Visual Tracking using Multi-Frame Multi-Feature Joint Modeling

Title Robust Visual Tracking using Multi-Frame Multi-Feature Joint Modeling
Authors Peng Zhang, Shujian Yu, Jiamiao Xu, Xinge You, Xiubao Jiang, Xiao-Yuan Jing, Dacheng Tao
Abstract It remains a huge challenge to design effective and efficient trackers under complex scenarios, including occlusions, illumination changes and pose variations. To cope with this problem, a promising solution is to integrate the temporal consistency across consecutive frames and multiple feature cues in a unified model. Motivated by this idea, we propose a novel correlation filter-based tracker in this work, in which the temporal relatedness is reconciled under a multi-task learning framework and the multiple feature cues are modeled using a multi-view learning approach. We demonstrate that the resulting regression model can be efficiently learned by exploiting the structure of the blockwise diagonal matrix. A fast blockwise diagonal matrix inversion algorithm is developed thereafter for efficient online tracking. Meanwhile, we incorporate an adaptive scale estimation mechanism to strengthen the stability of scale variation tracking. We implement our tracker using two types of features and test it on two benchmark datasets. Experimental results demonstrate the superiority of our proposed approach when compared with other state-of-the-art trackers. Project homepage: http://bmal.hust.edu.cn/project/KMF2JMTtracking.html
Tasks Multi-Task Learning, Multi-View Learning, Object Tracking, Visual Object Tracking, Visual Tracking
Published 2018-11-19
URL http://arxiv.org/abs/1811.07498v1
PDF http://arxiv.org/pdf/1811.07498v1.pdf
PWC https://paperswithcode.com/paper/robust-visual-tracking-using-multi-frame
Repo https://github.com/dscv-lab/KMF2JMTtracking
Framework none
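
The efficiency claim hinges on inverting a blockwise diagonal matrix block by block rather than as one dense matrix. The generic NumPy illustration below shows why that helps (cubic cost per block instead of cubic in the full size); it is not the paper's specialized inversion algorithm.

```python
import numpy as np
from scipy.linalg import block_diag

def invert_block_diagonal(blocks):
    """Invert a block-diagonal matrix block by block:
    O(sum k_i^3) instead of O((sum k_i)^3) for the assembled matrix."""
    return block_diag(*[np.linalg.inv(B) for B in blocks])

rng = np.random.default_rng(0)
blocks = [rng.standard_normal((k, k)) + 3.0 * np.eye(k) for k in (4, 6, 5)]
A = block_diag(*blocks)
print(np.allclose(A @ invert_block_diagonal(blocks), np.eye(A.shape[0])))   # True
```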

Eigenvalue Corrected Noisy Natural Gradient

Title Eigenvalue Corrected Noisy Natural Gradient
Authors Juhan Bae, Guodong Zhang, Roger Grosse
Abstract Variational Bayesian neural networks combine the flexibility of deep learning with Bayesian uncertainty estimation. However, inference procedures for flexible variational posteriors are computationally expensive. A recently proposed method, noisy natural gradient, is a surprisingly simple method to fit expressive posteriors by adding weight noise to regular natural gradient updates. Noisy K-FAC is an instance of noisy natural gradient that fits a matrix-variate Gaussian posterior with minor changes to ordinary K-FAC. Nevertheless, a matrix-variate Gaussian posterior does not capture an accurate diagonal variance. In this work, we extend noisy K-FAC to obtain a more flexible posterior distribution called eigenvalue corrected matrix-variate Gaussian. The proposed method computes the full diagonal re-scaling factor in the Kronecker-factored eigenbasis. Empirically, our approach consistently outperforms existing algorithms (e.g., noisy K-FAC) on regression and classification tasks.
Tasks
Published 2018-11-30
URL http://arxiv.org/abs/1811.12565v1
PDF http://arxiv.org/pdf/1811.12565v1.pdf
PWC https://paperswithcode.com/paper/eigenvalue-corrected-noisy-natural-gradient
Repo https://github.com/shwang/NNG
Framework tf
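
The “eigenvalue correction” amounts to estimating a full diagonal rescaling in the eigenbasis of the Kronecker factors rather than relying on the Kronecker product of their eigenvalues. The sketch below follows that recipe as a rough reading of the idea; the noisy-natural-gradient specifics (weight noise, posterior updates) are omitted, and the function and argument names are assumptions.

```python
import numpy as np

def eigenbasis_rescaling(A, G, weight_grads):
    """Diagonal rescaling in the Kronecker-factored eigenbasis.

    A: (in_dim, in_dim) input second-moment factor
    G: (out_dim, out_dim) gradient second-moment factor
    weight_grads: iterable of per-sample gradient matrices dW (out_dim x in_dim)

    Returns the eigenbases and a full diagonal s estimated in that basis,
    instead of the Kronecker product of the factors' eigenvalues.
    """
    _, U_A = np.linalg.eigh(A)
    _, U_G = np.linalg.eigh(G)
    s = np.mean([(U_G.T @ dW @ U_A) ** 2 for dW in weight_grads], axis=0)
    return U_A, U_G, s
```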

Visual Object Tracking: The Initialisation Problem

Title Visual Object Tracking: The Initialisation Problem
Authors George De Ath, Richard Everson
Abstract Model initialisation is an important component of object tracking. Tracking algorithms are generally provided with the first frame of a sequence and a bounding box (BB) indicating the location of the object. This BB may contain a large number of background pixels in addition to the object and can lead to parts-based tracking algorithms initialising their object models in background regions of the BB. In this paper, we tackle this as a missing labels problem, marking pixels sufficiently away from the BB as belonging to the background and learning the labels of the unknown pixels. Three techniques, One-Class SVM (OC-SVM), Sampled-Based Background Model (SBBM) (a novel background model based on pixel samples), and Learning Based Digital Matting (LBDM), are adapted to the problem. These are evaluated with leave-one-video-out cross-validation on the VOT2016 tracking benchmark. Our evaluation shows both OC-SVMs and SBBM are capable of providing a good level of segmentation accuracy but are too parameter-dependent to be used in real-world scenarios. We show that LBDM achieves significantly increased performance with parameters selected by cross validation and we show that it is robust to parameter variation.
Tasks Object Tracking, Visual Object Tracking
Published 2018-05-03
URL http://arxiv.org/abs/1805.01146v2
PDF http://arxiv.org/pdf/1805.01146v2.pdf
PWC https://paperswithcode.com/paper/visual-object-tracking-the-initialisation
Repo https://github.com/georgedeath/initialisation-problem
Framework none
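
As a rough sketch of the OC-SVM variant, one can fit a one-class SVM on pixels well outside an enlarged bounding box (assumed background) and then label the pixels inside the box. The colour-only features, margin, and hyperparameters below are assumptions for illustration; the paper's features and the SBBM/LBDM alternatives are in the repo.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def label_bb_pixels(frame, bb, margin=20, nu=0.1, gamma="scale"):
    """frame: (H, W, 3) first frame; bb: (x, y, w, h) ground-truth bounding box.

    Fits a one-class SVM on colour features of pixels outside an enlarged box
    (assumed background) and labels pixels inside the box:
    +1 = consistent with background, -1 = novel (likely object).
    """
    x, y, w, h = bb
    H, W = frame.shape[:2]
    outside = np.ones((H, W), dtype=bool)
    outside[max(0, y - margin):y + h + margin, max(0, x - margin):x + w + margin] = False

    background = frame[outside].reshape(-1, 3).astype(float)
    inside = frame[y:y + h, x:x + w].reshape(-1, 3).astype(float)

    ocsvm = OneClassSVM(nu=nu, gamma=gamma).fit(background)
    return ocsvm.predict(inside).reshape(h, w)
```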