Paper Group ANR 159
Inducing Regular Grammars Using Recurrent Neural Networks
Title | Inducing Regular Grammars Using Recurrent Neural Networks |
Authors | Mor Cohen, Avi Caciularu, Idan Rejwan, Jonathan Berant |
Abstract | Grammar induction is the task of learning a grammar from a set of examples. Recently, neural networks have been shown to be powerful learning machines that can identify patterns in streams of data. In this work we investigate their effectiveness in inducing a regular grammar from data, without any assumptions about the grammar. We train a recurrent neural network to distinguish between strings that are in or outside a regular language, and utilize an algorithm for extracting the learned finite-state automaton. We apply this method to several regular languages and find unexpected results regarding the connections between the network’s states that may be regarded as evidence for generalization. |
Tasks | |
Published | 2017-10-28 |
URL | http://arxiv.org/abs/1710.10453v2 |
http://arxiv.org/pdf/1710.10453v2.pdf | |
PWC | https://paperswithcode.com/paper/inducing-regular-grammars-using-recurrent |
Repo | |
Framework | |
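The abstract above describes training a network to tell strings inside a regular language from strings outside it. A minimal sketch of how such labeled data could be produced is shown here. The target language (strings over {a, b} with an even number of a's), the state names, and the helper functions are illustrative assumptions, not details taken from the paper.

```python
from itertools import product

# Hypothetical target language: strings over {a, b} with an even number of a's.
# The DFA below is an illustrative assumption, not one of the paper's languages.
TRANSITIONS = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
START = "even"
ACCEPTING = {"even"}


def accepts(string):
    """Run the DFA over the string and report whether it ends in an accepting state."""
    state = START
    for symbol in string:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


def labeled_strings(max_len):
    """Enumerate every string up to max_len together with its membership label."""
    examples = []
    for length in range(max_len + 1):
        for symbols in product("ab", repeat=length):
            string = "".join(symbols)
            examples.append((string, accepts(string)))
    return examples
```

Pairs returned by `labeled_strings` could serve as positive and negative examples for a binary classifier; an automaton extracted from the trained network could then be compared against `TRANSITIONS`.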
Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks
Title | Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks |
Authors | Cicero Nogueira dos Santos, Kahini Wadhawan, Bowen Zhou |
Abstract | We propose discriminative adversarial networks (DAN) for semi-supervised learning and loss function learning. Our DAN approach builds upon generative adversarial networks (GANs) and conditional GANs but includes the key differentiator of using two discriminators instead of a generator and a discriminator. DAN can be seen as a framework to learn loss functions for predictors that also implements semi-supervised learning in a straightforward manner. We propose instantiations of DAN for two different prediction tasks: classification and ranking. Our experimental results on three datasets of different tasks demonstrate that DAN is a promising framework for both semi-supervised learning and learning loss functions for predictors. For all tasks, the semi-supervised capability of DAN can significantly boost the predictor performance for small labeled sets with minor architecture changes across tasks. Moreover, the loss functions automatically learned by DANs are very competitive and usually outperform the standard pairwise and negative log-likelihood loss functions for both semi-supervised and supervised learning. |
Tasks | |
Published | 2017-07-07 |
URL | http://arxiv.org/abs/1707.02198v1 |
http://arxiv.org/pdf/1707.02198v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-loss-functions-for-semi-supervised |
Repo | |
Framework | |
A German Corpus for Text Similarity Detection Tasks
Title | A German Corpus for Text Similarity Detection Tasks |
Authors | Juan-Manuel Torres-Moreno, Gerardo Sierra, Peter Peinl |
Abstract | Text similarity detection aims at measuring the degree of similarity between a pair of texts. Corpora available for text similarity detection are designed to evaluate algorithms that assess the level of paraphrase among documents. In this paper we present a textual German corpus for similarity detection. The purpose of this corpus is to automatically assess the similarity between a pair of texts and to evaluate different similarity measures, either for whole documents or for individual sentences. To this end, we have calculated several simple measures on our corpus based on a library of similarity functions. |
Tasks | |
Published | 2017-03-11 |
URL | http://arxiv.org/abs/1703.03923v1 |
http://arxiv.org/pdf/1703.03923v1.pdf | |
PWC | https://paperswithcode.com/paper/a-german-corpus-for-text-similarity-detection |
Repo | |
Framework | |
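The abstract above mentions computing several simple similarity measures over pairs of texts but does not name them. As a hedged illustration, the sketch below implements two common choices, Jaccard similarity over token sets and cosine similarity over token counts. Both the choice of measures and the whitespace tokenizer are assumptions, not the authors' library.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def jaccard(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine of the angle between the two token-count vectors."""
    count_a, count_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Both functions accept whole documents or single sentences, matching the two granularities the abstract mentions.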
Towards Synthesizing Complex Programs from Input-Output Examples
Title | Towards Synthesizing Complex Programs from Input-Output Examples |
Authors | Xinyun Chen, Chang Liu, Dawn Song |
Abstract | In recent years, deep learning techniques have been developed to improve the performance of program synthesis from input-output examples. Despite significant progress, the programs that can be synthesized by state-of-the-art approaches are still simple in terms of their complexity. In this work, we move a significant step forward along this direction by proposing a new class of challenging tasks in the domain of program synthesis from input-output examples: learning a context-free parser from pairs of input programs and their parse trees. We show that this class of tasks is much more challenging than previously studied tasks, and the test accuracy of existing approaches is almost 0%. We tackle the challenges by developing three novel techniques inspired by three novel observations, which reveal the key ingredients of using deep learning to synthesize a complex program. First, the use of a non-differentiable machine is the key to effectively restricting the search space. Thus our proposed approach learns a neural program operating a domain-specific non-differentiable machine. Second, recursion is the key to achieving generalizability. Thus, we bake the notion of recursion into the design of our non-differentiable machine. Third, reinforcement learning is the key to learning how to operate the non-differentiable machine, but it is also hard to train the model effectively with existing reinforcement learning algorithms from a cold boot. We develop a novel two-phase reinforcement learning-based search algorithm to overcome this issue. In our evaluation, we show that using our novel approach, neural parsing programs can be learned to achieve 100% test accuracy on test inputs that are 500x longer than the training samples. |
Tasks | Program Synthesis |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01284v4 |
http://arxiv.org/pdf/1706.01284v4.pdf | |
PWC | https://paperswithcode.com/paper/towards-synthesizing-complex-programs-from |
Repo | |
Framework | |
Attention Strategies for Multi-Source Sequence-to-Sequence Learning
Title | Attention Strategies for Multi-Source Sequence-to-Sequence Learning |
Authors | Jindřich Libovický, Jindřich Helcl |
Abstract | Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities. We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. We compare the proposed methods with existing techniques and present results of systematic evaluation of those methods on the WMT16 Multimodal Translation and Automatic Post-editing tasks. We show that the proposed methods achieve competitive results on both tasks. |
Tasks | Automatic Post-Editing |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06567v1 |
http://arxiv.org/pdf/1704.06567v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-strategies-for-multi-source |
Repo | |
Framework | |
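The abstract above names two ways of combining attention over several source sequences, flat and hierarchical. The sketch below illustrates the general idea under simplifying assumptions: attention energies are passed in as plain numbers, encoder states of all sources share one dimension, and the second-level energies of the hierarchical variant (`source_scores`) are supplied by the caller instead of being computed by a learned layer. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    """Sum of vectors scaled by their weights, computed per dimension."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def flat_combination(scores_per_source, states_per_source):
    """One softmax over the states of all sources taken together."""
    all_scores = [s for scores in scores_per_source for s in scores]
    all_states = [h for states in states_per_source for h in states]
    return weighted_sum(softmax(all_scores), all_states)


def hierarchical_combination(scores_per_source, states_per_source, source_scores):
    """Attend within each source first, then attend over the per-source contexts."""
    contexts = [
        weighted_sum(softmax(scores), states)
        for scores, states in zip(scores_per_source, states_per_source)
    ]
    return weighted_sum(softmax(source_scores), contexts)
```

With all energies set to zero, a two-state source `[[1.0, 0.0], [3.0, 0.0]]` and a one-state source `[[0.0, 2.0]]` give `[4/3, 2/3]` under the flat scheme and `[1.0, 1.0]` under the hierarchical one. The difference shows that the flat scheme implicitly favors longer sources, while the hierarchical scheme weights each source as a unit.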
Automatic segmentation of trees in dynamic outdoor environments
Title | Automatic segmentation of trees in dynamic outdoor environments |
Authors | Amy Tabb, Henry Medeiros |
Abstract | Segmentation in dynamic outdoor environments can be difficult when the illumination levels and other aspects of the scene cannot be controlled. Specifically in orchard and vineyard automation contexts, a background material is often used to shield a camera’s field of view from other rows of crops. In this paper, we describe a method that uses superpixels to determine low texture regions of the image that correspond to the background material, and then show how this information can be integrated with the color distribution of the image to compute optimal segmentation parameters to segment objects of interest. Quantitative and qualitative experiments demonstrate the suitability of this approach for dynamic outdoor environments, specifically for tree reconstruction and apple flower detection applications. |
Tasks | |
Published | 2017-02-24 |
URL | http://arxiv.org/abs/1702.07611v3 |
http://arxiv.org/pdf/1702.07611v3.pdf | |
PWC | https://paperswithcode.com/paper/automatic-segmentation-of-trees-in-dynamic |
Repo | |
Framework | |
Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure
Title | Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure |
Authors | Beilun Wang, Arshdeep Sekhon, Yanjun Qi |
Abstract | We focus on the problem of estimating the change in the dependency structures of two $p$-dimensional Gaussian Graphical models (GGMs). Previous studies for sparse change estimation in GGMs involve expensive and difficult non-smooth optimization. We propose a novel method, DIFFEE, for estimating DIFFerential networks via an Elementary Estimator under a high-dimensional situation. DIFFEE is solved through a faster, closed-form solution that enables it to work in large-scale settings. We conduct a rigorous statistical analysis showing that, surprisingly, DIFFEE achieves the same asymptotic convergence rates as the state-of-the-art estimators that are much more difficult to compute. Our experimental results on multiple synthetic datasets and one real-world dataset on brain connectivity show strong performance improvements over baselines, as well as significant computational benefits. |
Tasks | |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.11223v3 |
http://arxiv.org/pdf/1710.11223v3.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-scalable-learning-of-sparse-changes |
Repo | |
Framework | |
Riemannian Stein Variational Gradient Descent for Bayesian Inference
Title | Riemannian Stein Variational Gradient Descent for Bayesian Inference |
Authors | Chang Liu, Jun Zhu |
Abstract | We develop Riemannian Stein Variational Gradient Descent (RSVGD), a Bayesian inference method that generalizes Stein Variational Gradient Descent (SVGD) to Riemann manifolds. The benefits are twofold: (i) for inference tasks in Euclidean spaces, RSVGD has the advantage over SVGD of utilizing information geometry, and (ii) for inference tasks on Riemann manifolds, RSVGD brings the unique advantages of SVGD to the Riemannian world. To appropriately transfer to Riemann manifolds, we conceive novel and non-trivial techniques for RSVGD, which are required by the intrinsically different characteristics of general Riemann manifolds from Euclidean spaces. We also discover Riemannian Stein’s Identity and Riemannian Kernelized Stein Discrepancy. Experimental results show the advantages over SVGD of exploring distribution geometry and the advantages of particle-efficiency, iteration-effectiveness and approximation flexibility over other inference methods on Riemann manifolds. |
Tasks | Bayesian Inference |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11216v1 |
http://arxiv.org/pdf/1711.11216v1.pdf | |
PWC | https://paperswithcode.com/paper/riemannian-stein-variational-gradient-descent |
Repo | |
Framework | |
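For context on the method that the entry above generalizes, here is a minimal sketch of a standard Euclidean SVGD update in one dimension. It is not the Riemannian variant proposed in the paper. The RBF kernel with a fixed bandwidth, the step size, and the function name `svgd_step` are assumptions chosen for illustration.

```python
import math


def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One Euclidean SVGD update for scalar particles with an RBF kernel.

    Each particle moves along the average of two terms: a kernel-weighted
    gradient of the log density (attraction toward high probability) and the
    kernel gradient (repulsion that keeps particles spread out).
    """
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            grad_k = -(xj - xi) / bandwidth ** 2 * k  # derivative of k with respect to xj
            phi += k * grad_log_p(xj) + grad_k
        updated.append(xi + step * phi / n)
    return updated


def run_svgd(particles, grad_log_p, steps):
    """Apply svgd_step repeatedly and return the final particle positions."""
    for _ in range(steps):
        particles = svgd_step(particles, grad_log_p)
    return particles
```

For a standard normal target, `grad_log_p` is `lambda x: -x`; particles initialized far from zero should drift toward the origin while the repulsive term prevents them from collapsing onto a single point.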
Pose-driven Deep Convolutional Model for Person Re-identification
Title | Pose-driven Deep Convolutional Model for Person Re-identification |
Authors | Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian |
Abstract | Feature extraction and matching are two crucial components in person Re-Identification (ReID). The large pose deformations and the complex view variations exhibited by the captured person images significantly increase the difficulty of learning and matching of the features from person images. To overcome these difficulties, in this work we propose a Pose-driven Deep Convolutional (PDC) model to learn improved feature extraction and matching models from end to end. Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts. To match the features from global human body and local body parts, a pose driven feature weighting sub-network is further designed to learn adaptive feature fusions. Extensive experimental analyses and results on three popular datasets demonstrate significant performance improvements of our model over all published state-of-the-art methods. |
Tasks | Person Re-Identification |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08325v1 |
http://arxiv.org/pdf/1709.08325v1.pdf | |
PWC | https://paperswithcode.com/paper/pose-driven-deep-convolutional-model-for |
Repo | |
Framework | |
Convolutional neural networks on irregular domains based on approximate vertex-domain translations
Title | Convolutional neural networks on irregular domains based on approximate vertex-domain translations |
Authors | Bastien Pasdeloup, Vincent Gripon, Jean-Charles Vialatte, Dominique Pastor, Pascal Frossard |
Abstract | We propose a generalization of convolutional neural networks (CNNs) to irregular domains, through the use of a translation operator on a graph structure. In regular settings such as images, convolutional layers are designed by translating a convolutional kernel over all pixels, thus enforcing translation equivariance. In the case of general graphs however, translation is not a well-defined operation, which makes shifting a convolutional kernel not straightforward. In this article, we introduce a methodology to allow the design of convolutional layers that are adapted to signals evolving on irregular topologies, even in the absence of a natural translation. Using the designed layers, we build a CNN that we train using the initial set of signals. Contrary to other approaches that aim at extending CNNs to irregular domains, we incorporate the classical settings of CNNs for 2D signals as a particular case of our approach. Designing convolutional layers in the vertex domain directly implies weight sharing, which in other approaches is generally estimated a posteriori using heuristics. |
Tasks | |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10035v2 |
http://arxiv.org/pdf/1710.10035v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-on-irregular |
Repo | |
Framework | |
Linear Discriminant Generative Adversarial Networks
Title | Linear Discriminant Generative Adversarial Networks |
Authors | Zhun Sun, Mete Ozay, Takayuki Okatani |
Abstract | We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN). The discriminator of an LD-GAN is trained to maximize the linear separability between distributions of hidden representations of generated and targeted samples, while the generator is updated based on the decision hyper-planes computed by performing LDA over the hidden representations. LD-GAN provides a concrete metric of separation capacity for the discriminator, and we experimentally show that it is possible to stabilize the training of LD-GAN simply by calibrating the update frequencies between generators and discriminators in the unsupervised case, without employment of normalization methods and constraints on weights. In the class conditional generation tasks, the proposed method shows improved training stability together with better generalization performance compared to WGAN that employs an auxiliary classifier. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.07831v1 |
http://arxiv.org/pdf/1707.07831v1.pdf | |
PWC | https://paperswithcode.com/paper/linear-discriminant-generative-adversarial |
Repo | |
Framework | |
Shape optimization in laminar flow with a label-guided variational autoencoder
Title | Shape optimization in laminar flow with a label-guided variational autoencoder |
Authors | Stephan Eismann, Stefan Bartzsch, Stefano Ermon |
Abstract | Computational design optimization in fluid dynamics usually requires solving non-linear partial differential equations numerically. In this work, we explore a Bayesian optimization approach to minimize an object’s drag coefficient in laminar flow based on predicting drag directly from the object shape. Jointly training an architecture combining a variational autoencoder mapping shapes to latent representations and Gaussian process regression allows us to generate improved shapes in the two-dimensional case we consider. |
Tasks | |
Published | 2017-12-10 |
URL | http://arxiv.org/abs/1712.03599v1 |
http://arxiv.org/pdf/1712.03599v1.pdf | |
PWC | https://paperswithcode.com/paper/shape-optimization-in-laminar-flow-with-a |
Repo | |
Framework | |
PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval
Title | PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval |
Authors | Weixun Zhou, Shawn Newsam, Congmin Li, Zhenfeng Shao |
Abstract | Remote sensing image retrieval (RSIR), which aims to efficiently retrieve data of interest from large collections of remote sensing data, is a fundamental task in remote sensing. Over the past several decades, there has been significant effort to extract powerful feature representations for this task since the retrieval performance depends on the representative strength of the features. Benchmark datasets are also critical for developing, evaluating, and comparing RSIR approaches. Current benchmark datasets are deficient in that 1) they were originally collected for land use/land cover classification and not image retrieval, 2) they are relatively small in terms of the number of classes as well as the number of sample images per class, and 3) the retrieval performance has saturated. These limitations have severely restricted the development of novel feature representations for RSIR, particularly the recent deep-learning based features which require large amounts of training data. We therefore present in this paper a new large-scale remote sensing dataset termed “PatternNet” that was collected specifically for RSIR. PatternNet was collected from high-resolution imagery and contains 38 classes with 800 images per class. We also provide a thorough review of RSIR approaches ranging from traditional handcrafted feature based methods to recent deep learning based ones. We evaluate over 35 methods to establish extensive baseline results for future RSIR research using the PatternNet benchmark. |
Tasks | Image Retrieval |
Published | 2017-06-11 |
URL | http://arxiv.org/abs/1706.03424v2 |
http://arxiv.org/pdf/1706.03424v2.pdf | |
PWC | https://paperswithcode.com/paper/patternnet-a-benchmark-dataset-for |
Repo | |
Framework | |
A Machine-Learning Framework for Design for Manufacturability
Title | A Machine-Learning Framework for Design for Manufacturability |
Authors | Aditya Balu, Sambit Ghadai, Gavin Young, Soumik Sarkar, Adarsh Krishnamurthy |
Abstract | This is a duplicate submission (the original is arXiv:1612.02141). The authors therefore wish to withdraw it. |
Tasks | |
Published | 2017-03-04 |
URL | http://arxiv.org/abs/1703.01499v2 |
http://arxiv.org/pdf/1703.01499v2.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-framework-for-design-for |
Repo | |
Framework | |
ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules
Title | ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules |
Authors | Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg |
Abstract | One of the grand challenges in modern theoretical chemistry is designing and implementing approximations that expedite ab initio methods without loss of accuracy. Machine learning (ML) methods, in particular neural networks, are emerging as a powerful approach to constructing various forms of transferable atomistic potentials. They have been successfully applied in a variety of applications in chemistry, biology, catalysis, and solid-state physics. However, these models are heavily dependent on the quality and quantity of data used in their fitting. Fitting highly flexible ML potentials comes at a cost: a vast amount of reference data is required to properly train these models. We address this need by providing access to a large computational DFT database, which consists of 20M conformations for 57,454 small organic molecules. We believe it will become a new standard benchmark for comparison of current and future methods in the ML potential community. |
Tasks | |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.04987v4 |
http://arxiv.org/pdf/1708.04987v4.pdf | |
PWC | https://paperswithcode.com/paper/ani-1-a-data-set-of-20m-off-equilibrium-dft |
Repo | |
Framework | |