Paper Group AWR 88
Learning Binary Latent Variable Models: A Tensor Eigenpair Approach
Title | Learning Binary Latent Variable Models: A Tensor Eigenpair Approach |
Authors | Ariel Jaffe, Roi Weiss, Shai Carmi, Yuval Kluger, Boaz Nadler |
Abstract | Latent variable models with hidden binary units appear in various applications. Learning such models, in particular in the presence of noise, is a challenging computational problem. In this paper we propose a novel spectral approach to this problem, based on the eigenvectors of both the second order moment matrix and third order moment tensor of the observed data. We prove that under mild non-degeneracy conditions, our method consistently estimates the model parameters at the optimal parametric rate. Our tensor-based method generalizes previous orthogonal tensor decomposition approaches, where the hidden units were assumed to be either statistically independent or mutually exclusive. We illustrate the consistency of our method on simulated data and demonstrate its usefulness in learning a common model for population mixtures in genetics. |
Tasks | Latent Variable Models |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09656v1 |
http://arxiv.org/pdf/1802.09656v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-binary-latent-variable-models-a |
Repo | https://github.com/arJaffe/BinaryLatentVariables |
Framework | none |
Investigating Capsule Networks with Dynamic Routing for Text Classification
Title | Investigating Capsule Networks with Dynamic Routing for Text Classification |
Authors | Wei Zhao, Jianbo Ye, Min Yang, Zeyang Lei, Suofei Zhang, Zhou Zhao |
Abstract | In this study, we explore capsule networks with dynamic routing for text classification. We propose three strategies to stabilize the dynamic routing process and alleviate the disturbance of noise capsules, which may contain “background” information or may not have been successfully trained. A series of experiments is conducted with capsule networks on six text classification benchmarks. Capsule networks achieve state-of-the-art results on 4 out of 6 datasets, which shows the effectiveness of capsule networks for text classification. We additionally show that capsule networks exhibit significant improvement over strong baseline methods when transferring from single-label to multi-label text classification. To the best of our knowledge, this is the first work in which capsule networks have been empirically investigated for text modeling. |
Tasks | Multi-Label Text Classification, Sentiment Analysis, Subjectivity Analysis, Text Classification |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1804.00538v4 |
http://arxiv.org/pdf/1804.00538v4.pdf | |
PWC | https://paperswithcode.com/paper/investigating-capsule-networks-with-dynamic |
Repo | https://github.com/andyweizhao/capsule_text_classification |
Framework | tf |
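For the capsule-network entry above, the routing step that the three strategies are meant to stabilize can be sketched as follows. This is a minimal pure-Python sketch of the basic routing-by-agreement loop, not the authors' code and without their stabilization strategies. The names `u_hat`, `dynamic_routing`, and `num_iters` are illustrative assumptions.

```python
import math

def squash(s):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iters=3):
    """u_hat[i][j] is the prediction vector of input capsule i for output capsule j.

    Returns the output capsule vectors and the coupling coefficients that
    produced them. Illustrative only: plain routing, no stabilization.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    logits = [[0.0] * n_out for _ in range(n_in)]
    for _ in range(num_iters):
        # Each input capsule distributes its vote over the output capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            s = [sum(coupling[i][j] * u_hat[i][j][k] for i in range(n_in))
                 for k in range(dim)]
            outputs.append(squash(s))
        # Agreement (dot product) between prediction and output raises the logit.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(u_hat[i][j][k] * outputs[j][k] for k in range(dim))
    return outputs, coupling

# Three input capsules that agree strongly on output capsule 0.
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[0.9, 0.1], [0.0, 0.1]],
    [[1.0, -0.1], [0.1, 0.1]],
]
outputs, coupling = dynamic_routing(u_hat)
```

In this toy input the predictions for output capsule 0 point in nearly the same direction, so the coupling coefficients shift toward it over the iterations.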
Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift
Title | Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift |
Authors | Stephan Rabanser, Stephan Günnemann, Zachary C. Lipton |
Abstract | We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. Machine learning (ML) systems, however, which depend strongly on properties of their inputs (e.g. the i.i.d. assumption), tend to fail silently. This paper explores the problem of building ML systems that fail loudly, investigating methods for detecting dataset shift, identifying exemplars that most typify the shift, and quantifying shift malignancy. We focus on several datasets and various perturbations to both covariates and label distributions with varying magnitudes and fractions of data affected. Interestingly, we show that across the dataset shifts that we explore, a two-sample-testing-based approach, using pre-trained classifiers for dimensionality reduction, performs best. Moreover, we demonstrate that domain-discriminating approaches tend to be helpful for characterizing shifts qualitatively and determining if they are harmful. |
Tasks | Dimensionality Reduction |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.11953v4 |
https://arxiv.org/pdf/1810.11953v4.pdf | |
PWC | https://paperswithcode.com/paper/failing-loudly-an-empirical-study-of-methods |
Repo | https://github.com/steverab/failing-loudly |
Framework | tf |
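The two-sample-testing idea in the abstract above can be illustrated with a small sketch: reduce both samples to a few dimensions, run a univariate Kolmogorov-Smirnov test per dimension, and apply a Bonferroni correction. The following assumes the reduced representations are already available as lists of rows. The function names, the threshold `alpha`, and the asymptotic p-value approximation are choices made for this example and are not taken from the paper.

```python
import math
import random

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_pvalue(d, n, m):
    """Asymptotic p-value from the Kolmogorov distribution (an approximation)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # the alternating series is unreliable here; p is close to 1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, 101))
    return max(0.0, min(1.0, 2.0 * total))

def detect_shift(source, target, alpha=0.05):
    """Flag a shift if any dimension rejects at the Bonferroni-corrected level."""
    num_dims = len(source[0])
    p_values = []
    for k in range(num_dims):
        col_s = [row[k] for row in source]
        col_t = [row[k] for row in target]
        d = ks_statistic(col_s, col_t)
        p_values.append(ks_pvalue(d, len(col_s), len(col_t)))
    return min(p_values) < alpha / num_dims, p_values

# Synthetic stand-in for reduced representations: the first dimension is shifted.
random.seed(0)
source = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
target = [[random.gauss(3, 1), random.gauss(0, 1)] for _ in range(200)]
shift_found, p_values = detect_shift(source, target)
```

Dividing `alpha` by the number of dimensions keeps the overall false-alarm rate at or below `alpha`, at the cost of some power when many dimensions are tested.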
Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks
Title | Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks |
Authors | Victor Dibia, Çağatay Demiralp |
Abstract | Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper we introduce Data2Vis, a neural translation model for automatically generating visualizations from given datasets. We formulate visualization generation as a sequence to sequence translation problem where data specifications are mapped to visualization specifications in a declarative language (Vega-Lite). To this end, we train a multilayered attention-based recurrent neural network (RNN) with long short-term memory (LSTM) units on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean) and how to use common data selection patterns that occur within data visualizations. Data2Vis generates visualizations that are comparable to manually-created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale. |
Tasks | |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.03126v3 |
http://arxiv.org/pdf/1804.03126v3.pdf | |
PWC | https://paperswithcode.com/paper/data2vis-automatic-generation-of-data |
Repo | https://github.com/victordibia/data2vis |
Framework | tf |
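To make the sequence-to-sequence framing in the abstract above concrete, here is a hypothetical source/target pair. The data row, its field names, and the data URL are invented for illustration, and splitting both sides into characters is an assumption about tokenization rather than a detail stated in the abstract. The target is an ordinary Vega-Lite bar-chart specification.

```python
import json

# Hypothetical input record; the field names and values are invented.
data_row = {"country": "Kenya", "population": 53.8, "year": 2020}

# A Vega-Lite specification a model might be trained to emit for such data:
# mean population per country, drawn as bars.
spec = {
    "data": {"url": "data/example.json"},  # placeholder path, not a real file
    "mark": "bar",
    "encoding": {
        "x": {"field": "country", "type": "nominal"},
        "y": {"field": "population", "type": "quantitative", "aggregate": "mean"},
    },
}

# Both sides become plain token sequences (characters in this sketch), so the
# task looks like translation from one "language" to another.
source_seq = list(json.dumps(data_row, sort_keys=True))
target_seq = list(json.dumps(spec, sort_keys=True))

print("".join(source_seq))
print("".join(target_seq))
```

A generated target sequence is only useful if it parses as JSON and names fields that exist in the data, which is why the abstract stresses learning the vocabulary and syntax of valid specifications.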
Attending to Mathematical Language with Transformers
Title | Attending to Mathematical Language with Transformers |
Authors | Artit Wangperawong |
Abstract | Mathematical expressions were generated, evaluated and used to train neural network models based on the transformer architecture. The expressions and their targets were analyzed as a character-level sequence transduction task in which the encoder and decoder are built on attention mechanisms. Three models were trained to understand and evaluate symbolic variables and expressions in mathematics: (1) the self-attentive and feed-forward transformer without recurrence or convolution, (2) the universal transformer with recurrence, and (3) the adaptive universal transformer with recurrence and adaptive computation time. The models respectively achieved test accuracies as high as 76.1%, 78.8% and 84.9% in evaluating the expressions to match the target values. For the cases inferred incorrectly, the results differed from the targets by only one or two characters. The models notably learned to add, subtract and multiply both positive and negative decimal numbers of variable digits assigned to symbolic variables. |
Tasks | |
Published | 2018-12-05 |
URL | https://arxiv.org/abs/1812.02825v5 |
https://arxiv.org/pdf/1812.02825v5.pdf | |
PWC | https://paperswithcode.com/paper/attending-to-mathematical-language-with |
Repo | https://github.com/tensorflow/tensor2tensor |
Framework | tf |
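The character-level setup described in the abstract above can be sketched with a toy generator of expression/target pairs. The textual format (`x=...;y=...;x*y`), the value ranges, and the function name `make_example` are hypothetical; the abstract does not specify the actual data format.

```python
import random
from decimal import Decimal

def make_example(rng):
    """Return (source_chars, target_chars) for one randomly generated expression.

    Hypothetical format: two decimal assignments followed by a binary expression.
    """
    a = Decimal(rng.randint(-999, 999)) / 10
    b = Decimal(rng.randint(-999, 999)) / 10
    op = rng.choice("+-*")
    value = {"+": a + b, "-": a - b, "*": a * b}[op]
    source = f"x={a};y={b};x{op}y"
    # Decimal keeps the arithmetic exact, so the target string has no
    # binary floating-point noise.
    return list(source), list(str(value))

rng = random.Random(0)
for _ in range(3):
    src, tgt = make_example(rng)
    print("".join(src), "->", "".join(tgt))
```

Because the model must emit the result one character at a time, a prediction counts as correct only if every character matches, which is consistent with the abstract's remark that wrong outputs differed from the targets by one or two characters.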
Safe Grid Search with Optimal Complexity
Title | Safe Grid Search with Optimal Complexity |
Authors | Eugene Ndiaye, Tam Le, Olivier Fercoq, Joseph Salmon, Ichiro Takeuchi |
Abstract | Popular machine learning estimators involve regularization parameters that can be challenging to tune, and standard strategies rely on grid search for this task. In this paper, we revisit the techniques of approximating the regularization path up to predefined tolerance $\epsilon$ in a unified framework and show that its complexity is $O(1/\sqrt[d]{\epsilon})$ for uniformly convex loss of order $d \geq 2$ and $O(1/\sqrt{\epsilon})$ for Generalized Self-Concordant functions. This framework encompasses least-squares but also logistic regression, a case that as far as we know was not handled as precisely in previous works. We leverage our technique to provide refined bounds on the validation error as well as a practical algorithm for hyperparameter tuning. The latter has a global convergence guarantee when targeting a prescribed accuracy on the validation set. Last but not least, our approach helps relieve the practitioner of the (often neglected) task of selecting a stopping criterion when optimizing over the training set: our method automatically calibrates this criterion based on the targeted accuracy on the validation set. |
Tasks | |
Published | 2018-10-12 |
URL | https://arxiv.org/abs/1810.05471v3 |
https://arxiv.org/pdf/1810.05471v3.pdf | |
PWC | https://paperswithcode.com/paper/safe-grid-search-with-optimal-complexity |
Repo | https://github.com/EugeneNdiaye/safe_grid_search |
Framework | none |
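As a rough numerical reading of the complexity bound in the abstract above, the sketch below evaluates the growth rate $1/\sqrt[d]{\epsilon}$ for a few tolerances. The hidden constant in the $O(\cdot)$ bound is not given in the abstract, so it is set to 1 here as an assumption; the numbers show scaling only, not actual grid sizes.

```python
def grid_size_scaling(eps, d=2):
    """Growth rate 1 / eps**(1/d), with the unknown constant taken as 1."""
    return eps ** (-1.0 / d)

for eps in (1e-2, 1e-4, 1e-6):
    print(f"eps={eps:g}  d=2: {grid_size_scaling(eps, 2):.1f}  "
          f"d=4: {grid_size_scaling(eps, 4):.1f}")
```

For $d = 2$, tightening the tolerance by a factor of 100 multiplies the rate by 10, and for larger $d$ the growth is slower still.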
CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction
Title | CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction |
Authors | Mohammad Golbabaee, Zhouye Chen, Yves Wiaux, Mike Davies |
Abstract | Current popular methods for Magnetic Resonance Fingerprint (MRF) recovery are bottlenecked by the heavy computations of a matched-filtering step due to the growing size and complexity of the fingerprint dictionaries in multi-parametric quantitative MRI applications. We address this shortcoming by arranging dictionary atoms in the form of cover tree structures and adopting the corresponding fast approximate nearest neighbour searches to accelerate matched-filtering. For datasets belonging to smooth low-dimensional manifolds, cover trees offer search complexities logarithmic in terms of data population. With this motivation we propose an iterative reconstruction algorithm, named CoverBLIP, to address large-size MRF problems where the fingerprint dictionary, i.e. the discrete manifold of Bloch responses, encodes several intrinsic NMR parameters. We study different forms of convergence for this algorithm and we show that, provided with a notion of embedding, the inexact and non-convex iterations of CoverBLIP converge linearly toward a near-global solution with the same order of accuracy as using exact brute-force searches. Our further examinations on both synthetic and real-world datasets, using different sampling strategies, indicate a reduction of 2 to 3 orders of magnitude in total search computations. Cover trees are robust against the curse of dimensionality and therefore CoverBLIP provides a notion of scalability (a consistent gain in time-accuracy performance) for searching high-dimensional atoms which may not be easily preprocessed (i.e. for dimensionality reduction) due to the increasing degrees of non-linearities appearing in the emerging multi-parametric MRF dictionaries. |
Tasks | Dimensionality Reduction |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01967v1 |
http://arxiv.org/pdf/1810.01967v1.pdf | |
PWC | https://paperswithcode.com/paper/coverblip-accelerated-and-scalable-iterative |
Repo | https://github.com/mgolbabaee/CoverBLIP |
Framework | none |
ManifoldNet: A Deep Network Framework for Manifold-valued Data
Title | ManifoldNet: A Deep Network Framework for Manifold-valued Data |
Authors | Rudrasis Chakraborty, Jose Bouza, Jonathan Manton, Baba C. Vemuri |
Abstract | Deep neural networks have become the main workhorse for many tasks involving learning from data in a variety of applications in Science and Engineering. Traditionally, the inputs to these networks lie in a vector space and the operations employed within the network are well defined on vector spaces. In the recent past, due to technological advances in sensing, it has become possible to acquire manifold-valued data sets either directly or indirectly. Examples include but are not limited to data from omnidirectional cameras on automobiles, drones etc., synthetic aperture radar imaging, diffusion magnetic resonance imaging, elastography and conductance imaging in the Medical Imaging domain and others. Thus, there is a need to generalize deep neural networks to cope with input data that reside on curved manifolds where vector space operations are not naturally admissible. In this paper, we present a novel theoretical framework to generalize the widely popular convolutional neural networks (CNNs) to high dimensional manifold-valued data inputs. We call these networks ManifoldNets. In ManifoldNets, the convolution operation on data residing on Riemannian manifolds is achieved via a provably convergent recursive computation of the weighted Fréchet Mean (wFM) of the given data, where the weights make up the convolution mask, to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping and hence ManifoldNet does not need the non-linear ReLU unit used in standard CNNs. We present experiments, using the ManifoldNet framework, to achieve dimensionality reduction by computing the principal linear subspaces that naturally reside on a Grassmannian. The experimental results demonstrate the efficacy of ManifoldNets in the context of classification and reconstruction accuracy. |
Tasks | Dimensionality Reduction |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.06211v3 |
http://arxiv.org/pdf/1809.06211v3.pdf | |
PWC | https://paperswithcode.com/paper/manifoldnet-a-deep-network-framework-for |
Repo | https://github.com/jjbouza/manifold-net |
Framework | pytorch |
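The recursive weighted Fréchet mean mentioned in the abstract above can be illustrated on the simplest curved manifold, the unit sphere. The sketch below uses one common incremental scheme: start at the first point, then move along the geodesic toward each new point by the fraction its weight contributes to the running total. Whether this matches the paper's estimator in every detail is an assumption, and the function names are illustrative. Points are assumed to be unit vectors, no two consecutive points antipodal, and weights positive.

```python
import math

def slerp(a, b, t):
    """Point at fraction t along the shortest great-circle arc from a to b."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)
    if theta < 1e-12:
        return list(a)  # the points coincide, nothing to interpolate
    s = math.sin(theta)  # assumes a and b are not antipodal (s != 0)
    wa = math.sin((1.0 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

def recursive_wfm(points, weights):
    """Incremental weighted mean on the sphere; one geodesic step per point."""
    mean = list(points[0])
    total = weights[0]
    for point, weight in zip(points[1:], weights[1:]):
        total += weight
        # In flat space this update reduces to the usual running weighted average.
        mean = slerp(mean, point, weight / total)
    return mean

points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [0.2, 0.3, 0.5]
mean = recursive_wfm(points, weights)
print(mean, sum(c * c for c in mean))
```

Every intermediate result stays on the sphere, which is the point of replacing vector-space averaging with geodesic steps.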
Visualization of High-dimensional Scalar Functions Using Principal Parameterizations
Title | Visualization of High-dimensional Scalar Functions Using Principal Parameterizations |
Authors | Rafael Ballester-Ripoll, Renato Pajarola |
Abstract | Insightful visualization of multidimensional scalar fields, in particular parameter spaces, is key to many fields in computational science and engineering. We propose a principal component-based approach to visualize such fields that accurately reflects their sensitivity to input parameters. The method performs dimensionality reduction on the vast $L^2$ Hilbert space formed by all possible partial functions (i.e., those defined by fixing one or more input parameters to specific values), which are projected to low-dimensional parameterized manifolds such as 3D curves, surfaces, and ensembles thereof. Our mapping provides a direct geometrical and visual interpretation in terms of Sobol’s celebrated method for variance-based sensitivity analysis. We furthermore contribute a practical realization of the proposed method by means of tensor decomposition, which enables accurate yet interactive integration and multilinear principal component analysis of high-dimensional models. |
Tasks | Dimensionality Reduction |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03618v1 |
http://arxiv.org/pdf/1809.03618v1.pdf | |
PWC | https://paperswithcode.com/paper/visualization-of-high-dimensional-scalar |
Repo | https://github.com/rballester/ttpca |
Framework | none |
Unsupervised Disentangled Representation Learning with Analogical Relations
Title | Unsupervised Disentangled Representation Learning with Analogical Relations |
Authors | Zejian Li, Yongchuan Tang, Yongxing He |
Abstract | Learning the disentangled representation of interpretable generative factors of data is one of the foundations to allow artificial intelligence to think like people. In this paper, we propose the analogical training strategy for the unsupervised disentangled representation learning in generative models. The analogy is one of the typical cognitive processes, and our proposed strategy is based on the observation that sample pairs in which one is different from the other in one specific generative factor show the same analogical relation. Thus, the generator is trained to generate sample pairs from which a designed classifier can identify the underlying analogical relation. In addition, we propose a disentanglement metric called the subspace score, which is inspired by subspace learning methods and does not require supervised information. Experiments show that our proposed training strategy allows the generative models to find the disentangled factors, and that our methods can give competitive performances as compared with the state-of-the-art methods. |
Tasks | Representation Learning |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09502v1 |
http://arxiv.org/pdf/1804.09502v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-disentangled-representation |
Repo | https://github.com/ZejianLi/analogical-training |
Framework | pytorch |
Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns
Title | Mind the GAP: A Balanced Corpus of Gendered Ambiguous Pronouns |
Authors | Kellie Webster, Marta Recasens, Vera Axelrod, Jason Baldridge |
Abstract | Coreference resolution is an important task for natural language understanding, and the resolution of ambiguous pronouns a longstanding challenge. Nonetheless, existing corpora do not capture ambiguous pronouns in sufficient volume or diversity to accurately indicate the practical utility of models. Furthermore, we find gender bias in existing corpora and systems favoring masculine entities. To address this, we present and release GAP, a gender-balanced labeled corpus of 8,908 ambiguous pronoun-name pairs sampled to provide diverse coverage of challenges posed by real-world text. We explore a range of baselines which demonstrate the complexity of the challenge, the best achieving just 66.9% F1. We show that syntactic structure and continuous neural models provide promising, complementary cues for approaching the challenge. |
Tasks | Coreference Resolution |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.05201v1 |
http://arxiv.org/pdf/1810.05201v1.pdf | |
PWC | https://paperswithcode.com/paper/mind-the-gap-a-balanced-corpus-of-gendered |
Repo | https://github.com/sattree/gpr_pub |
Framework | none |
Syntax-Directed Variational Autoencoder for Structured Data
Title | Syntax-Directed Variational Autoencoder for Structured Data |
Authors | Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song |
Abstract | Deep generative models have been enjoying success in modeling continuous data. However, it remains challenging to capture the representations for discrete structures with formal grammars and semantics, e.g., computer programs and molecular structures. How to generate both syntactically and semantically correct data still remains largely an open problem. Inspired by compiler theory, where syntax and semantics checks are done via syntax-directed translation (SDT), we propose a novel syntax-directed variational autoencoder (SD-VAE) by introducing stochastic lazy attributes. This approach converts the offline SDT check into on-the-fly generated guidance for constraining the decoder. Compared to the state-of-the-art methods, our approach enforces constraints on the output space so that the output will be not only syntactically valid, but also semantically reasonable. We evaluate the proposed model with applications in programming languages and molecules, including reconstruction and program/molecule optimization. The results demonstrate the effectiveness of incorporating syntactic and semantic constraints in discrete generative models, which is significantly better than current state-of-the-art approaches. |
Tasks | |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08786v1 |
http://arxiv.org/pdf/1802.08786v1.pdf | |
PWC | https://paperswithcode.com/paper/syntax-directed-variational-autoencoder-for |
Repo | https://github.com/Hanjun-Dai/sdvae |
Framework | pytorch |
Cross and Learn: Cross-Modal Self-Supervision
Title | Cross and Learn: Cross-Modal Self-Supervision |
Authors | Nawid Sayed, Biagio Brattoli, Björn Ommer |
Abstract | In this paper we present a self-supervised method for representation learning utilizing two different modalities. Based on the observation that cross-modal information has a high semantic meaning, we propose a method to effectively exploit this signal. For our approach we utilize video data since it is available on a large scale and provides easily accessible modalities given by RGB and optical flow. We demonstrate state-of-the-art performance on highly contested action recognition datasets in the context of self-supervised learning. We show that our feature representation also transfers to other tasks and conduct extensive ablation studies to validate our core contributions. Code and model can be found at https://github.com/nawidsayed/Cross-and-Learn. |
Tasks | Optical Flow Estimation, Representation Learning, Temporal Action Localization |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03879v3 |
http://arxiv.org/pdf/1811.03879v3.pdf | |
PWC | https://paperswithcode.com/paper/cross-and-learn-cross-modal-self-supervision |
Repo | https://github.com/nawidsayed/Cross-and-Learn |
Framework | pytorch |
GamePad: A Learning Environment for Theorem Proving
Title | GamePad: A Learning Environment for Theorem Proving |
Authors | Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever |
Abstract | In this paper, we introduce a system called GamePad that can be used to explore the application of machine learning methods to theorem proving in the Coq proof assistant. Interactive theorem provers such as Coq enable users to construct machine-checkable proofs in a step-by-step manner. Hence, they provide an opportunity to explore theorem proving with human supervision. We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in tactic-based theorem proving. |
Tasks | Automated Theorem Proving |
Published | 2018-06-02 |
URL | http://arxiv.org/abs/1806.00608v2 |
http://arxiv.org/pdf/1806.00608v2.pdf | |
PWC | https://paperswithcode.com/paper/gamepad-a-learning-environment-for-theorem |
Repo | https://github.com/ml4tp/gamepad |
Framework | none |
State-of-the-art Chinese Word Segmentation with Bi-LSTMs
Title | State-of-the-art Chinese Word Segmentation with Bi-LSTMs |
Authors | Ji Ma, Kuzman Ganchev, David Weiss |
Abstract | A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation. Surprisingly, we find that a bidirectional LSTM model, when combined with standard deep learning techniques and best practices, can achieve better accuracy on many of the popular datasets as compared to models based on more complex neural-network architectures. Furthermore, our error analysis shows that out-of-vocabulary words remain challenging for neural-network models, and many of the remaining errors are unlikely to be fixed through architecture changes. Instead, more effort should be made on exploring resources for further improvement. |
Tasks | Chinese Word Segmentation |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06511v2 |
http://arxiv.org/pdf/1808.06511v2.pdf | |
PWC | https://paperswithcode.com/paper/state-of-the-art-chinese-word-segmentation |
Repo | https://github.com/efeatikkan/Chinese_Word_Segmenter |
Framework | tf |