October 21, 2019

2909 words 14 mins read

Paper Group AWR 88


Learning Binary Latent Variable Models: A Tensor Eigenpair Approach

Title Learning Binary Latent Variable Models: A Tensor Eigenpair Approach
Authors Ariel Jaffe, Roi Weiss, Shai Carmi, Yuval Kluger, Boaz Nadler
Abstract Latent variable models with hidden binary units appear in various applications. Learning such models, in particular in the presence of noise, is a challenging computational problem. In this paper we propose a novel spectral approach to this problem, based on the eigenvectors of both the second order moment matrix and third order moment tensor of the observed data. We prove that under mild non-degeneracy conditions, our method consistently estimates the model parameters at the optimal parametric rate. Our tensor-based method generalizes previous orthogonal tensor decomposition approaches, where the hidden units were assumed to be either statistically independent or mutually exclusive. We illustrate the consistency of our method on simulated data and demonstrate its usefulness in learning a common model for population mixtures in genetics.
Tasks Latent Variable Models
Published 2018-02-27
URL http://arxiv.org/abs/1802.09656v1
PDF http://arxiv.org/pdf/1802.09656v1.pdf
PWC https://paperswithcode.com/paper/learning-binary-latent-variable-models-a
Repo https://github.com/arJaffe/BinaryLatentVariables
Framework none
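The estimator builds on empirical second- and third-order moments of the data and on eigenpairs of the resulting tensor. The sketch below is not the authors' algorithm (see the linked repo for that); it only illustrates, under the assumption of raw, unwhitened moments, how the moment statistics and a single tensor eigenpair can be computed with the standard tensor power method.

```python
import numpy as np

def empirical_moments(X):
    """Raw second-order moment matrix and third-order moment tensor of the
    observations X, shape (n_samples, d). The paper works with carefully
    constructed versions of such moments; this is only the plain empirical
    estimate."""
    n = X.shape[0]
    M2 = X.T @ X / n                                   # (d, d)
    M3 = np.einsum('ni,nj,nk->ijk', X, X, X) / n       # (d, d, d)
    return M2, M3

def tensor_eigenpair(T, n_iter=100, seed=0):
    """One eigenpair of a symmetric 3-way tensor via the standard tensor
    power iteration (illustrative; not the authors' full estimator)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = np.einsum('ijk,j,k->i', T, v, v)           # apply T(I, v, v)
        v /= np.linalg.norm(v)
    lam = np.einsum('ijk,i,j,k->', T, v, v, v)         # associated eigenvalue
    return lam, v
```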

Investigating Capsule Networks with Dynamic Routing for Text Classification

Title Investigating Capsule Networks with Dynamic Routing for Text Classification
Authors Wei Zhao, Jianbo Ye, Min Yang, Zeyang Lei, Suofei Zhang, Zhou Zhao
Abstract In this study, we explore capsule networks with dynamic routing for text classification. We propose three strategies to stabilize the dynamic routing process and alleviate the disturbance of noise capsules that may contain “background” information or have not been successfully trained. A series of experiments is conducted with capsule networks on six text classification benchmarks. Capsule networks achieve state-of-the-art results on 4 out of 6 datasets, which shows the effectiveness of capsule networks for text classification. We additionally show that capsule networks exhibit significant improvements over strong baseline methods when transferring from single-label to multi-label text classification. To the best of our knowledge, this is the first work in which capsule networks have been empirically investigated for text modeling.
Tasks Multi-Label Text Classification, Sentiment Analysis, Subjectivity Analysis, Text Classification
Published 2018-03-29
URL http://arxiv.org/abs/1804.00538v4
PDF http://arxiv.org/pdf/1804.00538v4.pdf
PWC https://paperswithcode.com/paper/investigating-capsule-networks-with-dynamic
Repo https://github.com/andyweizhao/capsule_text_classification
Framework tf
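For reference, here is a minimal NumPy sketch of the routing-by-agreement procedure that the paper builds on (Sabour et al., 2017). The three stabilization strategies proposed in the paper are not shown, and the shapes are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing non-linearity: short vectors shrink toward 0, long ones toward unit length."""
    norm2 = np.sum(s * s, axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement over prediction vectors u_hat of shape
    (n_input_caps, n_output_caps, dim_out); returns output capsules
    of shape (n_output_caps, dim_out)."""
    b = np.zeros(u_hat.shape[:2])                       # routing logits
    for _ in range(n_iters):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)            # coupling coefficients
        s = np.einsum('io,iod->od', c, u_hat)           # weighted sum of predictions
        v = squash(s)
        b = b + np.einsum('iod,od->io', u_hat, v)       # agreement update
    return v

# Example: 6 input capsules routed to 3 output capsules of dimension 8.
v = dynamic_routing(np.random.randn(6, 3, 8))
```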

Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

Title Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift
Authors Stephan Rabanser, Stephan Günnemann, Zachary C. Lipton
Abstract We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. Machine learning (ML) systems, however, which depend strongly on properties of their inputs (e.g. the i.i.d. assumption), tend to fail silently. This paper explores the problem of building ML systems that fail loudly, investigating methods for detecting dataset shift, identifying exemplars that most typify the shift, and quantifying shift malignancy. We focus on several datasets and various perturbations to both covariates and label distributions with varying magnitudes and fractions of data affected. Interestingly, we show that across the dataset shifts that we explore, a two-sample-testing-based approach, using pre-trained classifiers for dimensionality reduction, performs best. Moreover, we demonstrate that domain-discriminating approaches tend to be helpful for characterizing shifts qualitatively and determining if they are harmful.
Tasks Dimensionality Reduction
Published 2018-10-29
URL https://arxiv.org/abs/1810.11953v4
PDF https://arxiv.org/pdf/1810.11953v4.pdf
PWC https://paperswithcode.com/paper/failing-loudly-an-empirical-study-of-methods
Repo https://github.com/steverab/failing-loudly
Framework tf
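The best-performing approach in the study combines a pretrained classifier for dimensionality reduction with univariate two-sample tests. Below is a minimal sketch of that recipe, assuming softmax outputs as the low-dimensional representation and a Bonferroni-corrected Kolmogorov-Smirnov test per dimension.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_shift(source_scores, target_scores, alpha=0.05):
    """Shift detection on low-dimensional representations, e.g. the softmax
    outputs of a classifier pretrained on source data: one two-sample
    Kolmogorov-Smirnov test per dimension, aggregated with a Bonferroni
    correction."""
    k = source_scores.shape[1]
    p_values = np.array([ks_2samp(source_scores[:, j], target_scores[:, j]).pvalue
                         for j in range(k)])
    return p_values.min() < alpha / k, p_values

# Example with synthetic 10-class "softmax" scores.
rng = np.random.default_rng(0)
src, tgt = rng.dirichlet(np.ones(10), 500), rng.dirichlet(2 * np.ones(10), 500)
shifted, _ = detect_shift(src, tgt)
```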

Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks

Title Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks
Authors Victor Dibia, Çağatay Demiralp
Abstract Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper we introduce Data2Vis, a neural translation model for automatically generating visualizations from given datasets. We formulate visualization generation as a sequence to sequence translation problem where data specifications are mapped to visualization specifications in a declarative language (Vega-Lite). To this end, we train a multilayered attention-based recurrent neural network (RNN) with long short-term memory (LSTM) units on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean) and how to use common data selection patterns that occur within data visualizations. Data2Vis generates visualizations that are comparable to manually-created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale.
Tasks
Published 2018-04-09
URL http://arxiv.org/abs/1804.03126v3
PDF http://arxiv.org/pdf/1804.03126v3.pdf
PWC https://paperswithcode.com/paper/data2vis-automatic-generation-of-data
Repo https://github.com/victordibia/data2vis
Framework tf
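As a rough illustration of the sequence-to-sequence formulation, the snippet below builds a single-layer character-level encoder-decoder in tf.keras; the actual Data2Vis model is a multilayer attention-based LSTM network, so treat this only as a sketch with assumed vocabulary sizes.

```python
from tensorflow.keras import layers, Model

def build_char_seq2seq(n_src_chars, n_tgt_chars, latent_dim=256):
    """Single-layer character-level encoder-decoder: source characters come
    from JSON-encoded data rows, target characters from Vega-Lite specs."""
    enc_in = layers.Input(shape=(None, n_src_chars))
    _, h, c = layers.LSTM(latent_dim, return_state=True)(enc_in)   # encoder state

    dec_in = layers.Input(shape=(None, n_tgt_chars))
    dec_seq = layers.LSTM(latent_dim, return_sequences=True)(dec_in, initial_state=[h, c])
    probs = layers.Dense(n_tgt_chars, activation='softmax')(dec_seq)

    model = Model([enc_in, dec_in], probs)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model
```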

Attending to Mathematical Language with Transformers

Title Attending to Mathematical Language with Transformers
Authors Artit Wangperawong
Abstract Mathematical expressions were generated, evaluated and used to train neural network models based on the transformer architecture. The expressions and their targets were analyzed as a character-level sequence transduction task in which the encoder and decoder are built on attention mechanisms. Three models were trained to understand and evaluate symbolic variables and expressions in mathematics: (1) the self-attentive and feed-forward transformer without recurrence or convolution, (2) the universal transformer with recurrence, and (3) the adaptive universal transformer with recurrence and adaptive computation time. The models respectively achieved test accuracies as high as 76.1%, 78.8% and 84.9% in evaluating the expressions to match the target values. For the cases inferred incorrectly, the results differed from the targets by only one or two characters. The models notably learned to add, subtract and multiply both positive and negative decimal numbers of variable digits assigned to symbolic variables.
Tasks
Published 2018-12-05
URL https://arxiv.org/abs/1812.02825v5
PDF https://arxiv.org/pdf/1812.02825v5.pdf
PWC https://paperswithcode.com/paper/attending-to-mathematical-language-with
Repo https://github.com/tensorflow/tensor2tensor
Framework tf
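The training data described in the abstract are generated programmatically: expressions over symbolic variables together with their evaluated targets, treated as character sequences. A small sketch of such a generator follows; the exact expression format is an assumption rather than the paper's specification.

```python
import random

def generate_example(rng=random):
    """One (expression, target) character-sequence pair. The abstract only
    states that expressions over symbolic variables with decimal values are
    generated and evaluated; this formatting is hypothetical."""
    x = round(rng.uniform(-100, 100), rng.randint(0, 2))
    y = round(rng.uniform(-100, 100), rng.randint(0, 2))
    op = rng.choice(['+', '-', '*'])
    source = f"x={x}, y={y}, x{op}y"
    target = str(round(eval(f"({x}){op}({y})"), 4))
    return source, target

pairs = [generate_example() for _ in range(3)]   # e.g. ("x=4.2, y=-7.0, x*y", "-29.4")
```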

Safe Grid Search with Optimal Complexity

Title Safe Grid Search with Optimal Complexity
Authors Eugene Ndiaye, Tam Le, Olivier Fercoq, Joseph Salmon, Ichiro Takeuchi
Abstract Popular machine learning estimators involve regularization parameters that can be challenging to tune, and standard strategies rely on grid search for this task. In this paper, we revisit the techniques for approximating the regularization path up to a predefined tolerance $\epsilon$ in a unified framework and show that its complexity is $O(1/\sqrt[d]{\epsilon})$ for uniformly convex losses of order $d \geq 2$ and $O(1/\sqrt{\epsilon})$ for Generalized Self-Concordant functions. This framework encompasses least squares as well as logistic regression, a case that, as far as we know, was not handled as precisely in previous works. We leverage our technique to provide refined bounds on the validation error as well as a practical algorithm for hyperparameter tuning. The latter has a global convergence guarantee when targeting a prescribed accuracy on the validation set. Last but not least, our approach helps relieve the practitioner from the (often neglected) task of selecting a stopping criterion when optimizing over the training set: our method automatically calibrates this criterion based on the targeted accuracy on the validation set.
Tasks
Published 2018-10-12
URL https://arxiv.org/abs/1810.05471v3
PDF https://arxiv.org/pdf/1810.05471v3.pdf
PWC https://paperswithcode.com/paper/safe-grid-search-with-optimal-complexity
Repo https://github.com/EugeneNdiaye/safe_grid_search
Framework none
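As an illustration of the complexity result for $d = 2$ (not the paper's calibrated algorithm), the sketch below sizes a geometric grid of regularization parameters as $O(1/\sqrt{\epsilon})$ and ties the solver tolerance to the target accuracy, using scikit-learn's Lasso; the `lam_max` formula assumes centred data.

```python
import numpy as np
from sklearn.linear_model import Lasso

def eps_grid_search(X_tr, y_tr, X_val, y_val, eps=1e-2):
    """Illustrative epsilon-grid for a regularization path: the number of
    grid points scales as O(1/sqrt(eps)), matching the proved complexity for
    uniformly convex losses of order d = 2, and the solver tolerance is tied
    to the target accuracy."""
    lam_max = np.max(np.abs(X_tr.T @ y_tr)) / len(y_tr)   # zero-solution threshold (centred data)
    n_grid = int(np.ceil(1.0 / np.sqrt(eps)))
    lambdas = np.geomspace(lam_max, 1e-3 * lam_max, num=n_grid)
    val_err = []
    for lam in lambdas:
        w = Lasso(alpha=lam, tol=eps).fit(X_tr, y_tr)
        val_err.append(np.mean((w.predict(X_val) - y_val) ** 2))
    best = int(np.argmin(val_err))
    return lambdas[best], val_err[best]
```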

CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction

Title CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction
Authors Mohammad Golbabaee, Zhouye Chen, Yves Wiaux, Mike Davies
Abstract Current popular methods for Magnetic Resonance Fingerprint (MRF) recovery are bottlenecked by the heavy computation of a matched-filtering step, due to the growing size and complexity of the fingerprint dictionaries in multi-parametric quantitative MRI applications. We address this shortcoming by arranging dictionary atoms in the form of cover tree structures and adopting the corresponding fast approximate nearest neighbour searches to accelerate matched-filtering. For datasets belonging to smooth low-dimensional manifolds, cover trees offer search complexities that are logarithmic in the data population. With this motivation, we propose an iterative reconstruction algorithm, named CoverBLIP, to address large-size MRF problems where the fingerprint dictionary, i.e. the discrete manifold of Bloch responses, encodes several intrinsic NMR parameters. We study different forms of convergence for this algorithm and show that, provided with a notion of embedding, the inexact and non-convex iterations of CoverBLIP converge linearly toward a near-global solution with the same order of accuracy as exact brute-force searches. Our further examinations on both synthetic and real-world datasets, using different sampling strategies, indicate a reduction of two to three orders of magnitude in total search computations. Cover trees are robust against the curse of dimensionality, and therefore CoverBLIP provides a notion of scalability (a consistent gain in time-accuracy performance) for searching high-dimensional atoms that may not be easily preprocessed (i.e. for dimensionality reduction) due to the increasing degrees of non-linearity appearing in the emerging multi-parametric MRF dictionaries.
Tasks Dimensionality Reduction
Published 2018-10-03
URL http://arxiv.org/abs/1810.01967v1
PDF http://arxiv.org/pdf/1810.01967v1.pdf
PWC https://paperswithcode.com/paper/coverblip-accelerated-and-scalable-iterative
Repo https://github.com/mgolbabaee/CoverBLIP
Framework none
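The core speedup comes from replacing brute-force matched filtering with a tree-based approximate nearest-neighbour search over unit-norm dictionary atoms. The sketch below conveys the idea with scikit-learn's BallTree as a stand-in for the cover tree used in the paper; it is illustrative only.

```python
import numpy as np
from sklearn.neighbors import BallTree

def matched_filter_tree(dictionary, X):
    """Tree-accelerated matched filtering. With unit-norm atoms and signals,
    maximizing the correlation <x, d> is equivalent to a Euclidean nearest-
    neighbour search, which a spatial tree answers quickly. The paper uses
    cover trees with (1+eps)-approximate searches; BallTree is a stand-in.

    dictionary: (n_atoms, T) Bloch-response fingerprints; X: (n_voxels, T) signals.
    """
    D = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    tree = BallTree(D)
    _, idx = tree.query(Xn, k=1)          # best-matching atom per voxel
    return idx.ravel()
```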

ManifoldNet: A Deep Network Framework for Manifold-valued Data

Title ManifoldNet: A Deep Network Framework for Manifold-valued Data
Authors Rudrasis Chakraborty, Jose Bouza, Jonathan Manton, Baba C. Vemuri
Abstract Deep neural networks have become the main workhorse for many tasks involving learning from data in a variety of applications in science and engineering. Traditionally, the inputs to these networks lie in a vector space and the operations employed within the network are well defined on vector spaces. In the recent past, due to technological advances in sensing, it has become possible to acquire manifold-valued data sets either directly or indirectly. Examples include, but are not limited to, data from omnidirectional cameras on automobiles and drones, synthetic aperture radar imaging, and, in the medical imaging domain, diffusion magnetic resonance imaging, elastography and conductance imaging, among others. Thus, there is a need to generalize deep neural networks to cope with input data that reside on curved manifolds where vector space operations are not naturally admissible. In this paper, we present a novel theoretical framework to generalize the widely popular convolutional neural networks (CNNs) to high-dimensional manifold-valued data inputs. We call these networks ManifoldNets. In ManifoldNets, the convolution operation on data residing on Riemannian manifolds is achieved via a provably convergent recursive computation of the weighted Fréchet mean (wFM) of the given data, where the weights making up the convolution mask are to be learned. Further, we prove that the proposed wFM layer yields a contraction mapping and hence ManifoldNet does not need the non-linear ReLU unit used in standard CNNs. We present experiments, using the ManifoldNet framework, to achieve dimensionality reduction by computing the principal linear subspaces that naturally reside on a Grassmannian. The experimental results demonstrate the efficacy of ManifoldNets in the context of classification and reconstruction accuracy.
Tasks Dimensionality Reduction
Published 2018-09-11
URL http://arxiv.org/abs/1809.06211v3
PDF http://arxiv.org/pdf/1809.06211v3.pdf
PWC https://paperswithcode.com/paper/manifoldnet-a-deep-network-framework-for
Repo https://github.com/jjbouza/manifold-net
Framework pytorch
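The wFM "convolution" can be computed with a recursive (inductive) Fréchet-mean estimator. A minimal sketch for unit-sphere-valued data is given below; the paper treats general Riemannian manifolds (e.g. the Grassmannian) and learns the weights, so this is only an illustration of the recursion under those assumptions.

```python
import numpy as np

def slerp(a, b, t):
    """Point at fraction t along the geodesic from a to b on the unit hypersphere."""
    theta = np.arccos(np.clip(a @ b, -1.0, 1.0))
    if theta < 1e-8:
        return a
    return (np.sin((1 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)

def recursive_wfm(points, weights):
    """Recursive (inductive) weighted Frechet mean of unit vectors. In a
    ManifoldNet layer this recursion plays the role of convolution, with the
    normalized weights being the learned mask."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    m, cum = points[0], w[0]
    for x, wk in zip(points[1:], w[1:]):
        cum += wk
        m = slerp(m, x, wk / cum)          # move toward the new point
    return m / np.linalg.norm(m)
```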

Visualization of High-dimensional Scalar Functions Using Principal Parameterizations

Title Visualization of High-dimensional Scalar Functions Using Principal Parameterizations
Authors Rafael Ballester-Ripoll, Renato Pajarola
Abstract Insightful visualization of multidimensional scalar fields, in particular parameter spaces, is key to many fields in computational science and engineering. We propose a principal component-based approach to visualize such fields that accurately reflects their sensitivity to input parameters. The method performs dimensionality reduction on the vast $L^2$ Hilbert space formed by all possible partial functions (i.e., those defined by fixing one or more input parameters to specific values), which are projected to low-dimensional parameterized manifolds such as 3D curves, surfaces, and ensembles thereof. Our mapping provides a direct geometrical and visual interpretation in terms of Sobol’s celebrated method for variance-based sensitivity analysis. We furthermore contribute a practical realization of the proposed method by means of tensor decomposition, which enables accurate yet interactive integration and multilinear principal component analysis of high-dimensional models.
Tasks Dimensionality Reduction
Published 2018-09-11
URL http://arxiv.org/abs/1809.03618v1
PDF http://arxiv.org/pdf/1809.03618v1.pdf
PWC https://paperswithcode.com/paper/visualization-of-high-dimensional-scalar
Repo https://github.com/rballester/ttpca
Framework none
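A simplified, grid-based reading of the idea: collect the partial functions obtained by fixing one input parameter and project them onto a few principal components to obtain a low-dimensional curve indexed by that parameter. The paper does this in an $L^2$ Hilbert space via tensor decomposition; the sketch below is only a discrete approximation.

```python
import numpy as np
from sklearn.decomposition import PCA

def principal_parameterization(F, axis=0, n_components=3):
    """Each slice of the sampled scalar field F obtained by fixing one input
    parameter is a partial function; projecting those slices onto principal
    components yields a low-dimensional curve indexed by that parameter."""
    G = np.moveaxis(F, axis, 0)                 # put the fixed parameter first
    partials = G.reshape(G.shape[0], -1)        # one flattened partial function per row
    return PCA(n_components=n_components).fit_transform(partials)

# Example: a 3-parameter field sampled on a 20x20x20 grid.
grid = np.linspace(-1, 1, 20)
x, y, z = np.meshgrid(grid, grid, grid, indexing='ij')
curve = principal_parameterization(np.sin(x) * y + z**2, axis=0)   # shape (20, 3)
```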

Unsupervised Disentangled Representation Learning with Analogical Relations

Title Unsupervised Disentangled Representation Learning with Analogical Relations
Authors Zejian Li, Yongchuan Tang, Yongxing He
Abstract Learning disentangled representations of the interpretable generative factors of data is one of the foundations that would allow artificial intelligence to think like people. In this paper, we propose an analogical training strategy for unsupervised disentangled representation learning in generative models. Analogy is a typical cognitive process, and our proposed strategy is based on the observation that sample pairs whose members differ in exactly one generative factor exhibit the same analogical relation. Thus, the generator is trained to generate sample pairs from which a designed classifier can identify the underlying analogical relation. In addition, we propose a disentanglement metric called the subspace score, which is inspired by subspace learning methods and does not require supervised information. Experiments show that our proposed training strategy allows generative models to find the disentangled factors, and that our method gives competitive performance compared with state-of-the-art methods.
Tasks Representation Learning
Published 2018-04-25
URL http://arxiv.org/abs/1804.09502v1
PDF http://arxiv.org/pdf/1804.09502v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-disentangled-representation
Repo https://github.com/ZejianLi/analogical-training
Framework pytorch
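A hypothetical sketch of the pair-construction step behind the analogical training strategy: latent codes that differ in exactly one factor are paired with the index of the changed factor, which a classifier is then trained to recover from the generated samples. The latent dimensionality and sampling distribution below are assumptions.

```python
import numpy as np

def sample_analogical_batch(batch_size, n_factors, rng=np.random.default_rng(0)):
    """Pairs of latent codes that differ in exactly one generative factor,
    plus the index of the changed factor (hypothetical latent prior)."""
    z = rng.normal(size=(batch_size, n_factors))
    changed = rng.integers(0, n_factors, size=batch_size)
    z_pair = z.copy()
    z_pair[np.arange(batch_size), changed] = rng.normal(size=batch_size)  # resample one factor
    return z, z_pair, changed
```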

Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns

Title Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns
Authors Kellie Webster, Marta Recasens, Vera Axelrod, Jason Baldridge
Abstract Coreference resolution is an important task for natural language understanding, and the resolution of ambiguous pronouns a longstanding challenge. Nonetheless, existing corpora do not capture ambiguous pronouns in sufficient volume or diversity to accurately indicate the practical utility of models. Furthermore, we find gender bias in existing corpora and systems favoring masculine entities. To address this, we present and release GAP, a gender-balanced labeled corpus of 8,908 ambiguous pronoun-name pairs sampled to provide diverse coverage of challenges posed by real-world text. We explore a range of baselines which demonstrate the complexity of the challenge, the best achieving just 66.9% F1. We show that syntactic structure and continuous neural models provide promising, complementary cues for approaching the challenge.
Tasks Coreference Resolution
Published 2018-10-11
URL http://arxiv.org/abs/1810.05201v1
PDF http://arxiv.org/pdf/1810.05201v1.pdf
PWC https://paperswithcode.com/paper/mind-the-gap-a-balanced-corpus-of-gendered
Repo https://github.com/sattree/gpr_pub
Framework none
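The abstract evaluates systems by F1 and by the gender bias they exhibit. The snippet below is a plain-Python sketch of per-gender F1 and a feminine/masculine F1 ratio computed from (gender, gold, prediction) triples; it is not the official GAP scorer, and the input format is assumed.

```python
def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def gap_scores(examples):
    """examples: iterable of (gender, gold, pred) with gender in {'m', 'f'}
    and gold/pred booleans for one pronoun-name pair. Returns overall F1 and
    a feminine/masculine F1 ratio as a simple bias indicator."""
    counts = {'m': [0, 0, 0], 'f': [0, 0, 0]}          # tp, fp, fn per gender
    for gender, gold, pred in examples:
        c = counts[gender]
        c[0] += gold and pred
        c[1] += pred and not gold
        c[2] += gold and not pred
    overall = f1(*[a + b for a, b in zip(counts['m'], counts['f'])])
    f1_m, f1_f = f1(*counts['m']), f1(*counts['f'])
    return overall, (f1_f / f1_m if f1_m else float('nan'))
```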

Syntax-Directed Variational Autoencoder for Structured Data

Title Syntax-Directed Variational Autoencoder for Structured Data
Authors Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song
Abstract Deep generative models have been enjoying success in modeling continuous data. However, it remains challenging to capture representations for discrete structures with formal grammars and semantics, e.g., computer programs and molecular structures. How to generate data that are both syntactically and semantically correct remains largely an open problem. Inspired by compiler theory, where syntax and semantics checks are done via syntax-directed translation (SDT), we propose a novel syntax-directed variational autoencoder (SD-VAE) by introducing stochastic lazy attributes. This approach converts the offline SDT check into on-the-fly generated guidance for constraining the decoder. Compared to state-of-the-art methods, our approach enforces constraints on the output space so that the output is not only syntactically valid but also semantically reasonable. We evaluate the proposed model on applications in programming languages and molecules, including reconstruction and program/molecule optimization. The results demonstrate the effectiveness of incorporating syntactic and semantic constraints in discrete generative models, which significantly outperforms current state-of-the-art approaches.
Tasks
Published 2018-02-24
URL http://arxiv.org/abs/1802.08786v1
PDF http://arxiv.org/pdf/1802.08786v1.pdf
PWC https://paperswithcode.com/paper/syntax-directed-variational-autoencoder-for
Repo https://github.com/Hanjun-Dai/sdvae
Framework pytorch
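The key mechanism is decoding under grammar guidance so that every generated sequence is syntactically valid. Below is a toy sketch of syntax-directed sampling from a small hypothetical grammar; the SD-VAE replaces the uniform production choice with decoder probabilities and additionally enforces semantic constraints through stochastic lazy attributes.

```python
import numpy as np

# Toy context-free grammar (hypothetical): non-terminal -> list of productions.
GRAMMAR = {
    'S': [['S', '+', 'T'], ['T']],
    'T': [['(', 'S', ')'], ['x'], ['y']],
}

def syntax_directed_sample(max_steps=200, rng=np.random.default_rng(0)):
    """Expand a stack of symbols using only productions whose left-hand side
    matches the current non-terminal, so the sample is syntactically valid by
    construction (whenever the step cap is not reached)."""
    stack, out = ['S'], []
    for _ in range(max_steps):
        if not stack:
            break
        symbol = stack.pop()
        if symbol not in GRAMMAR:          # terminal: emit it
            out.append(symbol)
            continue
        rules = GRAMMAR[symbol]
        rhs = rules[rng.integers(len(rules))]
        stack.extend(reversed(rhs))        # expand the left-most symbol first
    return ''.join(out)

print(syntax_directed_sample())            # a grammar-conforming expression
```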

Cross and Learn: Cross-Modal Self-Supervision

Title Cross and Learn: Cross-Modal Self-Supervision
Authors Nawid Sayed, Biagio Brattoli, Björn Ommer
Abstract In this paper we present a self-supervised method for representation learning that utilizes two different modalities. Based on the observation that cross-modal information carries high semantic meaning, we propose a method to exploit this signal effectively. For our approach we utilize video data, since it is available at large scale and provides easily accessible modalities in the form of RGB and optical flow. We demonstrate state-of-the-art performance on highly contested action recognition datasets in the context of self-supervised learning. We show that our feature representation also transfers to other tasks and conduct extensive ablation studies to validate our core contributions. Code and model can be found at https://github.com/nawidsayed/Cross-and-Learn.
Tasks Optical Flow Estimation, Representation Learning, Temporal Action Localization
Published 2018-11-09
URL http://arxiv.org/abs/1811.03879v3
PDF http://arxiv.org/pdf/1811.03879v3.pdf
PWC https://paperswithcode.com/paper/cross-and-learn-cross-modal-self-supervision
Repo https://github.com/nawidsayed/Cross-and-Learn
Framework pytorch
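A hedged PyTorch sketch of a cross-modal objective in the spirit of the abstract: features of the RGB and optical-flow networks for the same clip are pulled together while features of different clips are pushed apart. The exact loss in the paper (which also includes further terms) may differ from this form.

```python
import torch
import torch.nn.functional as F

def cross_modal_loss(rgb_feat, flow_feat):
    """rgb_feat, flow_feat: (batch, dim) features from the two modality networks.
    Same-clip pairs are encouraged to agree; different-clip pairs to disagree."""
    rgb = F.normalize(rgb_feat, dim=1)
    flow = F.normalize(flow_feat, dim=1)
    pos = (rgb * flow).sum(dim=1).mean()                              # same-clip agreement
    sim = rgb @ flow.t()
    neg = (sim.sum() - sim.diag().sum()) / (sim.numel() - len(sim))   # mean cross-clip similarity
    return neg - pos
```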

GamePad: A Learning Environment for Theorem Proving

Title GamePad: A Learning Environment for Theorem Proving
Authors Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever
Abstract In this paper, we introduce a system called GamePad that can be used to explore the application of machine learning methods to theorem proving in the Coq proof assistant. Interactive theorem provers such as Coq enable users to construct machine-checkable proofs in a step-by-step manner. Hence, they provide an opportunity to explore theorem proving with human supervision. We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in tactic-based theorem proving.
Tasks Automated Theorem Proving
Published 2018-06-02
URL http://arxiv.org/abs/1806.00608v2
PDF http://arxiv.org/pdf/1806.00608v2.pdf
PWC https://paperswithcode.com/paper/gamepad-a-learning-environment-for-theorem
Repo https://github.com/ml4tp/gamepad
Framework none
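As a trivial illustration of the tactic-prediction task (not GamePad's API or its baselines, which use structured proof-state representations), one can fit a bag-of-tokens classifier mapping a textual proof state to the next tactic; the example states and tactics below are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical (proof-state, next-tactic) pairs; GamePad extracts real ones
# from Coq proof traces.
states = ["forall n : nat, n + 0 = n", "n : nat |- n + 0 = n"]
tactics = ["intros", "induction n"]

tactic_predictor = make_pipeline(CountVectorizer(token_pattern=r"\S+"),
                                 LogisticRegression(max_iter=1000))
tactic_predictor.fit(states, tactics)
print(tactic_predictor.predict(["m : nat |- m + 0 = m"]))
```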

State-of-the-art Chinese Word Segmentation with Bi-LSTMs

Title State-of-the-art Chinese Word Segmentation with Bi-LSTMs
Authors Ji Ma, Kuzman Ganchev, David Weiss
Abstract A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation. Surprisingly, we find that a bidirectional LSTM model, when combined with standard deep-learning techniques and best practices, can achieve better accuracy on many of the popular datasets than models based on more complex neural-network architectures. Furthermore, our error analysis shows that out-of-vocabulary words remain challenging for neural-network models, and many of the remaining errors are unlikely to be fixed through architecture changes. Instead, more effort should be put into exploring resources for further improvement.
Tasks Chinese Word Segmentation
Published 2018-08-20
URL http://arxiv.org/abs/1808.06511v2
PDF http://arxiv.org/pdf/1808.06511v2.pdf
PWC https://paperswithcode.com/paper/state-of-the-art-chinese-word-segmentation
Repo https://github.com/efeatikkan/Chinese_Word_Segmenter
Framework tf
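A minimal tf.keras sketch of the kind of model the paper studies: a character-level bidirectional LSTM tagger predicting the standard B/M/E/S segmentation tags. The hyperparameters are assumptions, and the paper's recipe adds further standard techniques (dropout, pretrained embeddings, etc.).

```python
from tensorflow.keras import layers, Model

def build_segmenter(vocab_size, n_tags=4, emb_dim=64, hidden=128):
    """Character-level Bi-LSTM tagger: every character receives one of the
    B/M/E/S tags, from which word boundaries are recovered."""
    chars = layers.Input(shape=(None,), dtype='int32')
    x = layers.Embedding(vocab_size, emb_dim, mask_zero=True)(chars)
    x = layers.Bidirectional(layers.LSTM(hidden, return_sequences=True))(x)
    tags = layers.Dense(n_tags, activation='softmax')(x)
    model = Model(chars, tags)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    return model
```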