October 15, 2019

2051 words 10 mins read

Paper Group NANR 202

On the limitations of first order approximation in GAN dynamics. What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python. Fast and Accurate Reordering with ITG Transition RNN. Local Density Estimation in High Dimensions. Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages. Pronunciation Variants and …

On the limitations of first order approximation in GAN dynamics

Title On the limitations of first order approximation in GAN dynamics
Authors Jerry Li, Aleksander Madry, John Peebles, Ludwig Schmidt
Abstract Generative Adversarial Networks (GANs) have been proposed as an approach to learning generative models. While GANs have demonstrated promising performance on multiple vision tasks, their learning dynamics are not yet well understood, either in theory or in practice. In particular, work in this domain has so far focused only on understanding the properties of the stationary solutions that these dynamics might converge to, and on the behavior of the dynamics in these solutions' immediate neighborhood. To address this issue, in this work we take a first step towards a principled study of the GAN dynamics itself. To this end, we propose a model that, on the one hand, exhibits several of the common problematic convergence behaviors (e.g., vanishing gradients, mode collapse, diverging or oscillatory behavior), but on the other hand, is sufficiently simple to enable rigorous convergence analysis. This methodology enables us to exhibit an interesting phenomenon: a GAN with an optimal discriminator provably converges, while guiding the GAN training using only a first-order approximation of the discriminator leads to unstable dynamics and mode collapse. This suggests that such use of the first-order approximation of the discriminator, which is the de facto standard in all existing GAN dynamics, might be one of the factors that makes GAN training so challenging in practice. Additionally, our convergence result constitutes the first rigorous analysis of the dynamics of a concrete parametric GAN.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=HJYQLb-RW
PDF https://openreview.net/pdf?id=HJYQLb-RW
PWC https://paperswithcode.com/paper/on-the-limitations-of-first-order
Repo
Framework
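The dichotomy the abstract describes (provable convergence with an optimal discriminator, instability with first-order discriminator updates) can be reproduced in a toy min-max game. The sketch below is illustrative only and is not the paper's model: the value function, step size, and regularizer are invented for the demo.

```python
import math

# Toy min-max game (invented for illustration): the generator parameter
# theta should match the data mean MU; the discriminator is linear with
# weight w.
MU = 3.0
ETA = 0.1      # step size
STEPS = 200

# Case 1: optimal discriminator. For the regularized value function
# f(theta, w) = w*(theta - MU) - 0.5*LAM*w**2, the discriminator's best
# response is w*(theta) = (theta - MU)/LAM, and gradient descent on
# theta then converges to MU.
LAM = 1.0
theta = 0.0
for _ in range(STEPS):
    w_best = (theta - MU) / LAM    # discriminator solved exactly
    theta -= ETA * w_best          # generator gradient step
assert abs(theta - MU) < 1e-6      # converged

# Case 2: first-order discriminator. Both players take simultaneous
# gradient steps on the unregularized bilinear game f = w*(theta - MU).
# Each step multiplies the distance to the equilibrium (MU, 0) by
# sqrt(1 + ETA**2), so the iterates spiral outward instead of converging.
theta, w = 0.0, 1.0
d0 = math.hypot(theta - MU, w)
for _ in range(STEPS):
    # tuple assignment: both updates use the pre-step theta and w
    theta, w = theta - ETA * w, w + ETA * (theta - MU)
d1 = math.hypot(theta - MU, w)
assert d1 > d0                     # unstable: first-order play diverges
```

The bilinear case rotates and expands around the equilibrium, which is the standard minimal picture of why simultaneous gradient play on GAN objectives can fail to converge.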

What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python

Title What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python
Authors Balázs Indig, András Simonyi, Noémi Ligeti-Nagy
Abstract
Tasks Chunking
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1091/
PDF https://www.aclweb.org/anthology/L18-1091
PWC https://paperswithcode.com/paper/whats-wrong-python-a-visual-differ-and-graph
Repo
Framework

Fast and Accurate Reordering with ITG Transition RNN

Title Fast and Accurate Reordering with ITG Transition RNN
Authors Hao Zhang, Axel Ng, Richard Sproat
Abstract Attention-based sequence-to-sequence neural network models learn to jointly align and translate. The quadratic-time attention mechanism is powerful as it is capable of handling arbitrary long-distance reordering, but computationally expensive. In this paper, towards making neural translation both accurate and efficient, we follow the traditional pre-reordering approach to decouple reordering from translation. We add a reordering RNN that shares the input encoder with the decoder. The RNNs are trained jointly with a multi-task loss function and applied sequentially at inference time. The task of the reordering model is to predict the permutation of the input words following the target language word order. After reordering, the attention in the decoder becomes more peaked and monotonic. For reordering, we adopt the Inversion Transduction Grammars (ITG) and propose a transition system to parse input to trees for reordering. We harness the ITG transition system with RNN. With the modeling power of RNN, we achieve superior reordering accuracy without any feature engineering. In experiments, we apply the model to the task of text normalization. Compared to a strong baseline of attention-based RNN, our ITG RNN re-ordering model can reach the same reordering accuracy with only 1/10 of the training data and is 2.5x faster in decoding.
Tasks Feature Engineering, Machine Translation, Morphological Inflection, Speech Recognition
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1123/
PDF https://www.aclweb.org/anthology/C18-1123
PWC https://paperswithcode.com/paper/fast-and-accurate-reordering-with-itg
Repo
Framework
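The ITG transition system in the abstract can be pictured as a small stack machine. The sketch below is a generic ITG-style reorderer, hedged: the action inventory (SHIFT plus straight/inverted reduce) follows the standard ITG formulation and is not claimed to match the paper's exact transitions, and the RNN that scores actions is omitted.

```python
# Execute an ITG-style transition sequence over n input positions.
# The stack holds spans of positions; SHIFT pushes the next position,
# R-STRAIGHT concatenates the top two spans left-to-right, and
# R-INVERTED concatenates them right-to-left.
def itg_reorder(n, actions):
    stack, nxt = [], 0
    for a in actions:
        if a == "SHIFT":
            stack.append([nxt]); nxt += 1
        elif a == "R-STRAIGHT":
            right = stack.pop(); left = stack.pop()
            stack.append(left + right)
        elif a == "R-INVERTED":
            right = stack.pop(); left = stack.pop()
            stack.append(right + left)
    assert nxt == n and len(stack) == 1   # all words consumed, one tree
    return stack[0]   # predicted permutation of input positions

# Swap the two halves of a four-word sentence pairwise:
perm = itg_reorder(4, ["SHIFT", "SHIFT", "R-INVERTED",
                       "SHIFT", "SHIFT", "R-INVERTED", "R-STRAIGHT"])
assert perm == [1, 0, 3, 2]
```

In the full model, an RNN over the shared encoder states would choose each action; here the action sequence is given by hand to show the mechanics.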

Local Density Estimation in High Dimensions

Title Local Density Estimation in High Dimensions
Authors Xian Wu, Moses Charikar, Vishnu Natchu
Abstract An important question that arises in the study of high dimensional vector representations learned from data is: given a set D of vectors and a query q, estimate the number of points within a specified distance threshold of q. Our algorithm uses locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via an unbiased estimator that uses importance sampling. A key innovation is the ability to maintain a small number of hash tables via preprocessing data structures and algorithms that sample from multiple buckets in each hash table. We give bounds on the space requirements and query complexity of our scheme, and demonstrate the effectiveness of our algorithm by experiments on a standard word embedding dataset.
Tasks Density Estimation
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2460
PDF http://proceedings.mlr.press/v80/wu18a/wu18a.pdf
PWC https://paperswithcode.com/paper/local-density-estimation-in-high-dimensions
Repo
Framework
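The unbiased importance-sampling estimator the abstract describes can be sketched with hyperplane LSH, for which the collision probability has a closed form. Everything below (dimensions, thresholds, the clustered dataset) is invented for illustration, and the paper's scheme additionally samples from multiple buckets per table to keep the number of tables small, which this sketch does not do.

```python
import math, random

random.seed(0)

# Goal: estimate how many points of the dataset lie within angle T of
# the query q. For a table of K random hyperplanes, a point at angle
# phi from q shares q's bucket with probability p = (1 - phi/pi)**K,
# so weighting each qualifying bucket member by 1/p makes the
# per-table count an unbiased estimate of the true count.
DIM, K, L, T = 8, 3, 300, 0.5   # dimension, planes/table, tables, threshold

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def angle(u, v):
    return math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v)))))

q = unit([1.0] * DIM)
# A cluster near q plus uniform background points on the sphere.
data = [unit([c + random.gauss(0, 0.15) for c in q]) for _ in range(30)]
data += [unit([random.gauss(0, 1) for _ in range(DIM)]) for _ in range(270)]

phis = [angle(x, q) for x in data]
true_count = sum(phi <= T for phi in phis)   # brute-force ground truth

est = 0.0
for _ in range(L):
    planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]
    sig = lambda v: tuple(sum(a * b for a, b in zip(p, v)) >= 0 for p in planes)
    qsig = sig(q)
    for x, phi in zip(data, phis):
        if phi <= T and sig(x) == qsig:
            est += 1.0 / (1.0 - phi / math.pi) ** K   # importance weight
est /= L
# est is unbiased for true_count; averaging over L tables keeps it tight
assert abs(est - true_count) <= max(3.0, 0.3 * true_count)
```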

Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages

Title Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages
Authors Nicola Ueffing, José G. C. de Souza, Gregor Leusch
Abstract At eBay, we are automatically generating a large amount of natural language titles for eCommerce browse pages using machine translation (MT) technology. While automatic approaches can generate millions of titles very fast, they are prone to errors. We therefore develop quality estimation (QE) methods which can automatically detect titles with low quality in order to prevent them from going live. In this paper, we present different approaches: The first one is a Random Forest (RF) model that explores hand-crafted, robust features, which are a mix of established features commonly used in Machine Translation Quality Estimation (MTQE) and new features developed specifically for our task. The second model is based on Siamese Networks (SNs) which embed the metadata input sequence and the generated title in the same space and do not require hand-crafted features at all. We thoroughly evaluate and compare those approaches on in-house data. While the RF models are competitive for scenarios with smaller amounts of training data and somewhat more robust, they are clearly outperformed by the SN models when the amount of training data is larger.
Tasks Machine Translation
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-3007/
PDF https://www.aclweb.org/anthology/N18-3007
PWC https://paperswithcode.com/paper/quality-estimation-for-automatically
Repo
Framework

Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech

Title Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Authors David Lukeš, Marie Kopřivová, Zuzana Komrsková, Petra Poukarová
Abstract
Tasks Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1428/
PDF https://www.aclweb.org/anthology/L18-1428
PWC https://paperswithcode.com/paper/pronunciation-variants-and-asr-of-colloquial
Repo
Framework

Learning in Reproducing Kernel Kreı̆n Spaces

Title Learning in Reproducing Kernel Kreı̆n Spaces
Authors Dino Oglic, Thomas Gaertner
Abstract We formulate a novel regularized risk minimization problem for learning in reproducing kernel Kreĭn spaces and show that the strong representer theorem applies to it. As a result of the latter, the learning problem can be expressed as the minimization of a quadratic form over a hypersphere of constant radius. We present an algorithm that can find a globally optimal solution to this non-convex optimization problem in time cubic in the number of instances. Moreover, we derive the gradient of the solution with respect to its hyperparameters and, in this way, provide means for efficient hyperparameter tuning. The approach comes with a generalization bound expressed in terms of the Rademacher complexity of the corresponding hypothesis space. The major advantage over standard kernel methods is the ability to learn with various domain-specific similarity measures for which positive definiteness does not hold or is difficult to establish. The approach is evaluated empirically using indefinite kernels defined on structured as well as vectorial data. The empirical results demonstrate superior performance of our approach over state-of-the-art baselines.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2200
PDF http://proceedings.mlr.press/v80/oglic18a/oglic18a.pdf
PWC https://paperswithcode.com/paper/learning-in-reproducing-kernel-kren-spaces
Repo
Framework
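The key reduction in the abstract, minimizing a quadratic form over a hypersphere of constant radius, has a classical solution: scale the unit eigenvector of the smallest eigenvalue to the sphere. A stdlib-only sketch on a toy indefinite matrix (standing in for an indefinite kernel matrix; not the paper's construction), using power iteration on a shifted matrix:

```python
import math, random

random.seed(1)

# min x^T Q x over ||x|| = r is solved globally by r times the unit
# eigenvector of Q's smallest eigenvalue, attainable in cubic time via
# eigendecomposition. Toy symmetric indefinite Q for the demo.
Q = [[ 2.0, -1.0,  0.0],
     [-1.0, -1.0,  0.5],
     [ 0.0,  0.5,  3.0]]
r, shift = 2.0, 10.0      # sphere radius; shift chosen above Q's spectrum

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def quad(x):              # x^T Q x
    return sum(a * b for a, b in zip(x, matvec(Q, x)))

# Power iteration on S = shift*I - Q: S shares eigenvectors with Q, but
# its largest eigenvalue corresponds to Q's smallest one.
S = [[(shift if i == j else 0.0) - Q[i][j] for j in range(3)] for i in range(3)]
x = [1.0, 1.0, 1.0]
for _ in range(500):
    x = matvec(S, x)
    n = math.sqrt(sum(v * v for v in x))
    x = [v / n for v in x]
opt = [r * v for v in x]  # global minimizer on the radius-r sphere

# Sanity check: no random point on the sphere does better.
for _ in range(100):
    y = [random.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(v * v for v in y))
    y = [r * v / n for v in y]
    assert quad(opt) <= quad(y) + 1e-9
```

Because Q is indefinite here (its middle diagonal entry is negative), the minimum of the quadratic form on the sphere is negative, exactly the situation positive-definite kernel machinery cannot handle directly.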

Chat, Chunk and Topic in Casual Conversation

Title Chat, Chunk and Topic in Casual Conversation
Authors Emer Gilmartin, Carl Vogel
Abstract
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-4705/
PDF https://www.aclweb.org/anthology/W18-4705
PWC https://paperswithcode.com/paper/chatchunk-and-topic-in-casual-conversation
Repo
Framework

Grammar Size and Quantitative Restrictions on Movement

Title Grammar Size and Quantitative Restrictions on Movement
Authors Thomas Graf
Abstract
Tasks
Published 2018-01-01
URL https://www.aclweb.org/anthology/W18-0303/
PDF https://www.aclweb.org/anthology/W18-0303
PWC https://paperswithcode.com/paper/grammar-size-and-quantitative-restrictions-on
Repo
Framework

Multivariate Time Series Imputation with Generative Adversarial Networks

Title Multivariate Time Series Imputation with Generative Adversarial Networks
Authors Yonghong Luo, Xiangrui Cai, Ying Zhang, Jun Xu, Yuan Xiaojie
Abstract Multivariate time series usually contain a large number of missing values, which hinders the application of advanced analysis methods to multivariate time series data. Conventional approaches to addressing the challenge of missing values, including mean/zero imputation, case deletion, and matrix factorization-based imputation, are all incapable of modeling the temporal dependencies and the complex distributions found in multivariate time series. In this paper, we treat the problem of missing value imputation as data generation. Inspired by the success of Generative Adversarial Networks (GANs) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with a GAN, which is then used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete due to the nature of the data recording process. A modified Gated Recurrent Unit is employed in the GAN to model the temporal irregularity of the incomplete time series. Experiments on two multivariate time series datasets show that the proposed model outperforms the baselines in terms of imputation accuracy. Experimental results also show that a simple model trained on the imputed data can achieve state-of-the-art results on the prediction tasks, demonstrating the benefits of our model in downstream applications.
Tasks Image Generation, Imputation, Multivariate Time Series Imputation, Time Series
Published 2018-12-01
URL http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks
PDF http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks.pdf
PWC https://paperswithcode.com/paper/multivariate-time-series-imputation-with
Repo
Framework
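The "modified Gated Recurrent Unit" in the abstract conditions on temporal irregularity. One standard ingredient of such cells (used by GRU-D-style models; shown here as the general recipe, not claimed to be the paper's exact cell) is a time-gap matrix recording how long each variable has been unobserved:

```python
# delta[t][d] is the time elapsed since variable d was last observed
# before step t; GRU-style imputation cells use it to decay the
# influence of stale observations.
def time_gaps(timestamps, mask):
    """mask[t][d] = 1 if variable d is observed at step t, else 0."""
    T, D = len(mask), len(mask[0])
    delta = [[0.0] * D for _ in range(T)]
    for t in range(1, T):
        step = timestamps[t] - timestamps[t - 1]
        for d in range(D):
            # reset the gap after an observation, otherwise accumulate
            delta[t][d] = step if mask[t - 1][d] else step + delta[t - 1][d]
    return delta

# Two variables sampled at irregular times; variable 1 is missing twice.
ts   = [0.0, 1.0, 3.0, 4.0]
mask = [[1, 1],
        [1, 0],
        [1, 0],
        [1, 1]]
assert time_gaps(ts, mask) == [[0.0, 0.0],
                               [1.0, 1.0],
                               [2.0, 3.0],
                               [1.0, 4.0]]
```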

Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval

Title Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval
Authors Su-Mei Shiue, Lang-Jyi Huang, Wei-Ho Tsai, Yen-Lin Chen
Abstract
Tasks Information Retrieval, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/O18-1025/
PDF https://www.aclweb.org/anthology/O18-1025
PWC https://paperswithcode.com/paper/smart-vs-solid-solutions-in-computational
Repo
Framework

Formal Restrictions On Multiple Tiers

Title Formal Restrictions On Multiple Tiers
Authors Alëna Aksënova, Sanket Deshmukh
Abstract
Tasks
Published 2018-01-01
URL https://www.aclweb.org/anthology/W18-0307/
PDF https://www.aclweb.org/anthology/W18-0307
PWC https://paperswithcode.com/paper/formal-restrictions-on-multiple-tiers
Repo
Framework

Content-Based Conflict of Interest Detection on Wikipedia

Title Content-Based Conflict of Interest Detection on Wikipedia
Authors Udochukwu Orizu, Yulan He
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1026/
PDF https://www.aclweb.org/anthology/L18-1026
PWC https://paperswithcode.com/paper/content-based-conflict-of-interest-detection
Repo
Framework

Unleashing the Potential of CNNs for Interpretable Few-Shot Learning

Title Unleashing the Potential of CNNs for Interpretable Few-Shot Learning
Authors Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille
Abstract Convolutional neural networks (CNNs) have been generally acknowledged as one of the driving forces for the advancement of computer vision. Despite their promising performances on many tasks, CNNs still face major obstacles on the road to achieving ideal machine intelligence. One is that CNNs are complex and hard to interpret. Another is that standard CNNs require large amounts of annotated data, which is sometimes very hard to obtain, and it is desirable to be able to learn them from few examples. In this work, we address these limitations of CNNs by developing novel, simple, and interpretable models for few-shot learning. Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs. We first adapt the learning of visual concepts to the few-shot setting, and then uncover two key properties of feature encoding using visual concepts, which we call category sensitivity and spatial pattern. Motivated by these properties, we present two intuitive models for the problem of few-shot learning. Experiments show that our models achieve competitive performances, while being much more flexible and interpretable than alternative state-of-the-art few-shot learning methods. We conclude that using visual concepts helps expose the natural capability of CNNs for few-shot learning.
Tasks Few-Shot Learning
Published 2018-01-01
URL https://openreview.net/forum?id=BJ_QxP1AZ
PDF https://openreview.net/pdf?id=BJ_QxP1AZ
PWC https://paperswithcode.com/paper/unleashing-the-potential-of-cnns-for
Repo
Framework

DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings

Title DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings
Authors Dmitry Kravchenko, Lidia Pivovarova
Abstract The paper describes our approach for SemEval-2018 Task 1: Affect Detection in Tweets. We perform experiments with manually compiled sentiment lexicons and word embeddings, and test their performance on the Twitter affect detection task to determine which features produce the most informative representation of a sentence. We demonstrate that general-purpose word embeddings produce a more informative sentence representation than lexicon features. However, combining lexicon features with embeddings yields higher performance than embeddings alone.
Tasks Emotion Classification, Word Embeddings
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1025/
PDF https://www.aclweb.org/anthology/S18-1025
PWC https://paperswithcode.com/paper/dl-team-at-semeval-2018-task-1-tweet-affect
Repo
Framework