Paper Group NANR 202
On the limitations of first order approximation in GAN dynamics
Title | On the limitations of first order approximation in GAN dynamics |
Authors | Jerry Li, Aleksander Madry, John Peebles, Ludwig Schmidt |
Abstract | Generative Adversarial Networks (GANs) have been proposed as an approach to learning generative models. While GANs have demonstrated promising performance on multiple vision tasks, their learning dynamics are not yet well understood, either in theory or in practice. In particular, work in this domain has so far focused only on understanding the properties of the stationary solutions that these dynamics might converge to, and on the behavior of the dynamics in those solutions’ immediate neighborhood. To address this issue, in this work we take a first step towards a principled study of the GAN dynamics itself. To this end, we propose a model that, on the one hand, exhibits several of the common problematic convergence behaviors (e.g., vanishing gradients, mode collapse, diverging or oscillatory behavior), but, on the other hand, is sufficiently simple to enable rigorous convergence analysis. This methodology enables us to exhibit an interesting phenomenon: a GAN with an optimal discriminator provably converges, while guiding the GAN training using only a first-order approximation of the discriminator leads to unstable GAN dynamics and mode collapse. This suggests that such use of the first-order approximation of the discriminator, which is the de facto standard in all existing GAN training, might be one of the factors that makes GAN training so challenging in practice. Additionally, our convergence result constitutes the first rigorous analysis of the dynamics of a concrete parametric GAN. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJYQLb-RW |
https://openreview.net/pdf?id=HJYQLb-RW | |
PWC | https://paperswithcode.com/paper/on-the-limitations-of-first-order |
Repo | |
Framework | |
What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python
Title | What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python |
Authors | Balázs Indig, András Simonyi, Noémi Ligeti-Nagy |
Abstract | |
Tasks | Chunking |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1091/ |
https://www.aclweb.org/anthology/L18-1091 | |
PWC | https://paperswithcode.com/paper/whats-wrong-python-a-visual-differ-and-graph |
Repo | |
Framework | |
Fast and Accurate Reordering with ITG Transition RNN
Title | Fast and Accurate Reordering with ITG Transition RNN |
Authors | Hao Zhang, Axel Ng, Richard Sproat |
Abstract | Attention-based sequence-to-sequence neural network models learn to jointly align and translate. The quadratic-time attention mechanism is powerful, as it can handle arbitrary long-distance reordering, but it is computationally expensive. In this paper, towards making neural translation both accurate and efficient, we follow the traditional pre-reordering approach to decouple reordering from translation. We add a reordering RNN that shares the input encoder with the decoder. The RNNs are trained jointly with a multi-task loss function and applied sequentially at inference time. The task of the reordering model is to predict the permutation of the input words following the target-language word order. After reordering, the attention in the decoder becomes more peaked and monotonic. For reordering, we adopt the Inversion Transduction Grammar (ITG) and propose a transition system to parse the input into trees for reordering. We harness the ITG transition system with an RNN. With the modeling power of RNNs, we achieve superior reordering accuracy without any feature engineering. In experiments, we apply the model to the task of text normalization. Compared to a strong attention-based RNN baseline, our ITG RNN reordering model reaches the same reordering accuracy with only 1/10 of the training data and is 2.5x faster in decoding. |
Tasks | Feature Engineering, Machine Translation, Morphological Inflection, Speech Recognition |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1123/ |
https://www.aclweb.org/anthology/C18-1123 | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-reordering-with-itg |
Repo | |
Framework | |
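An ITG transition system can be sketched with a simple stack machine (an illustrative reconstruction; the paper's exact transition set and oracle may differ): SHIFT pushes the next input word, REDUCE-STRAIGHT concatenates the top two stack items in order, and REDUCE-INVERTED concatenates them swapped, which is what produces reorderings.

```python
# Minimal ITG-style transition reorderer: a stack of word spans plus a
# buffer of remaining input words. Inverted reductions swap the two spans.
def itg_reorder(words, transitions):
    stack, buffer = [], list(words)
    for t in transitions:
        if t == "SHIFT":
            stack.append([buffer.pop(0)])
        elif t == "STRAIGHT":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif t == "INVERTED":
            right, left = stack.pop(), stack.pop()
            stack.append(right + left)
    assert len(stack) == 1 and not buffer, "transition sequence must be complete"
    return stack[0]

# Invert the first two words, then attach the third in order: "b a c".
print(itg_reorder(["a", "b", "c"],
                  ["SHIFT", "SHIFT", "INVERTED", "SHIFT", "STRAIGHT"]))
```

In the paper's setup, an RNN would score which transition to take at each step; here the sequence is given explicitly.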
Local Density Estimation in High Dimensions
Title | Local Density Estimation in High Dimensions |
Authors | Xian Wu, Moses Charikar, Vishnu Natchu |
Abstract | An important question that arises in the study of high dimensional vector representations learned from data is: given a set D of vectors and a query q, estimate the number of points within a specified distance threshold of q. Our algorithm uses locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via an unbiased estimator that uses importance sampling. A key innovation is the ability to maintain a small number of hash tables via preprocessing data structures and algorithms that sample from multiple buckets in each hash table. We give bounds on the space requirements and query complexity of our scheme, and demonstrate the effectiveness of our algorithm by experiments on a standard word embedding dataset. |
Tasks | Density Estimation |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2460 |
http://proceedings.mlr.press/v80/wu18a/wu18a.pdf | |
PWC | https://paperswithcode.com/paper/local-density-estimation-in-high-dimensions |
Repo | |
Framework | |
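The LSH preprocessing step can be sketched with random-hyperplane hashing (a simplified stand-in: the paper's unbiased importance-sampling estimator over sampled buckets is more involved; here we only collect points that collide with the query in at least one table, as a crude density proxy).

```python
import numpy as np

# Random-hyperplane (SimHash) LSH: each table hashes a vector to the sign
# pattern of its projections onto a few random hyperplanes. Nearby vectors
# collide in some table with high probability.
rng = np.random.default_rng(0)

def build_tables(data, n_tables=8, n_bits=6):
    tables, planes = [], []
    for _ in range(n_tables):
        P = rng.standard_normal((n_bits, data.shape[1]))
        keys = data @ P.T > 0                     # sign pattern per point
        table = {}
        for i, k in enumerate(map(tuple, keys)):
            table.setdefault(k, set()).add(i)
        tables.append(table)
        planes.append(P)
    return tables, planes

def candidates(q, tables, planes):
    hits = set()
    for table, P in zip(tables, planes):
        hits |= table.get(tuple(q @ P.T > 0), set())
    return hits

data = rng.standard_normal((500, 16))
q = data[0]
near = candidates(q, *build_tables(data))
print(len(near))  # number of candidate near points for q
```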
Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages
Title | Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages |
Authors | Nicola Ueffing, José G. C. de Souza, Gregor Leusch |
Abstract | At eBay, we are automatically generating a large amount of natural language titles for eCommerce browse pages using machine translation (MT) technology. While automatic approaches can generate millions of titles very fast, they are prone to errors. We therefore develop quality estimation (QE) methods which can automatically detect titles with low quality in order to prevent them from going live. In this paper, we present different approaches: The first one is a Random Forest (RF) model that explores hand-crafted, robust features, which are a mix of established features commonly used in Machine Translation Quality Estimation (MTQE) and new features developed specifically for our task. The second model is based on Siamese Networks (SNs) which embed the metadata input sequence and the generated title in the same space and do not require hand-crafted features at all. We thoroughly evaluate and compare those approaches on in-house data. While the RF models are competitive in scenarios with smaller amounts of training data and are somewhat more robust, they are clearly outperformed by the SN models when the amount of training data is larger. |
Tasks | Machine Translation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3007/ |
https://www.aclweb.org/anthology/N18-3007 | |
PWC | https://paperswithcode.com/paper/quality-estimation-for-automatically |
Repo | |
Framework | |
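The Siamese idea can be sketched in a few lines (a toy, not the paper's trained network): encode the metadata sequence and the generated title with the same shared encoder and use their cosine similarity as the quality score. Here the "encoder" is just an average over a fixed random word-embedding table; the vocabulary and vectors are purely illustrative.

```python
import numpy as np

# Toy shared encoder: average fixed word vectors and L2-normalize, so the
# dot product of two encodings is their cosine similarity in [-1, 1].
rng = np.random.default_rng(42)
vocab = {w: rng.standard_normal(8) for w in
         "blue cotton shirts men s clothing red".split()}

def encode(tokens):
    vecs = [vocab[t] for t in tokens if t in vocab]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

def quality_score(metadata, title):
    return float(encode(metadata) @ encode(title))

print(quality_score(["blue", "cotton", "shirts"], ["blue", "cotton", "shirts"]))
# identical sequences score 1.0; a divergent title scores lower
```

A real SN would learn the encoder end-to-end so that low-quality titles land far from their metadata in the shared space.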
Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Title | Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech |
Authors | David Lukeš, Marie Kopřivová, Zuzana Komrsková, Petra Poukarová |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1428/ |
https://www.aclweb.org/anthology/L18-1428 | |
PWC | https://paperswithcode.com/paper/pronunciation-variants-and-asr-of-colloquial |
Repo | |
Framework | |
Learning in Reproducing Kernel Kreı̆n Spaces
Title | Learning in Reproducing Kernel Kreı̆n Spaces |
Authors | Dino Oglic, Thomas Gaertner |
Abstract | We formulate a novel regularized risk minimization problem for learning in reproducing kernel Kreĭn spaces and show that the strong representer theorem applies to it. As a result of the latter, the learning problem can be expressed as the minimization of a quadratic form over a hypersphere of constant radius. We present an algorithm that can find a globally optimal solution to this non-convex optimization problem in time cubic in the number of instances. Moreover, we derive the gradient of the solution with respect to its hyperparameters and, in this way, provide means for efficient hyperparameter tuning. The approach comes with a generalization bound expressed in terms of the Rademacher complexity of the corresponding hypothesis space. The major advantage over standard kernel methods is the ability to learn with various domain-specific similarity measures for which positive definiteness does not hold or is difficult to establish. The approach is evaluated empirically using indefinite kernels defined on structured as well as vectorial data. The empirical results demonstrate superior performance of our approach over the state-of-the-art baselines. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2200 |
http://proceedings.mlr.press/v80/oglic18a/oglic18a.pdf | |
PWC | https://paperswithcode.com/paper/learning-in-reproducing-kernel-kren-spaces |
Repo | |
Framework | |
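A Kreĭn space is, informally, the difference of two Hilbert spaces, and concretely any symmetric (possibly indefinite) kernel matrix splits as K = K₊ − K₋ with both parts positive semi-definite, via its eigendecomposition. A small numerical illustration of that split (not the paper's learning algorithm):

```python
import numpy as np

# Split a symmetric matrix into positive and negative parts:
# K = K_plus - K_minus, where K_plus keeps the nonnegative eigenvalues
# and K_minus the (negated) negative ones. Both parts are PSD.
def krein_split(K):
    w, V = np.linalg.eigh(K)
    K_plus = (V * np.clip(w, 0, None)) @ V.T
    K_minus = (V * np.clip(-w, 0, None)) @ V.T
    return K_plus, K_minus

K = np.array([[2.0, 3.0], [3.0, -1.0]])   # indefinite: det(K) = -11 < 0
K_plus, K_minus = krein_split(K)
assert np.allclose(K_plus - K_minus, K)
assert np.all(np.linalg.eigvalsh(K_plus) >= -1e-12)
```

This is why indefinite similarity measures remain usable: the learning problem is posed over the difference of the two induced Hilbert spaces rather than forcing K to be positive definite.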
Chat, Chunk and Topic in Casual Conversation
Title | Chat, Chunk and Topic in Casual Conversation |
Authors | Emer Gilmartin, Carl Vogel |
Abstract | |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4705/ |
https://www.aclweb.org/anthology/W18-4705 | |
PWC | https://paperswithcode.com/paper/chatchunk-and-topic-in-casual-conversation |
Repo | |
Framework | |
Grammar Size and Quantitative Restrictions on Movement
Title | Grammar Size and Quantitative Restrictions on Movement |
Authors | Thomas Graf |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0303/ |
https://www.aclweb.org/anthology/W18-0303 | |
PWC | https://paperswithcode.com/paper/grammar-size-and-quantitative-restrictions-on |
Repo | |
Framework | |
Multivariate Time Series Imputation with Generative Adversarial Networks
Title | Multivariate Time Series Imputation with Generative Adversarial Networks |
Authors | Yonghong Luo, Xiangrui Cai, Ying Zhang, Jun Xu, Yuan Xiaojie |
Abstract | Multivariate time series usually contain a large number of missing values, which hinders the application of advanced analysis methods to multivariate time series data. Conventional approaches to addressing the challenge of missing values, including mean/zero imputation, case deletion, and matrix-factorization-based imputation, are all incapable of modeling the temporal dependencies and the complex distributions in multivariate time series. In this paper, we treat the problem of missing-value imputation as data generation. Inspired by the success of Generative Adversarial Networks (GANs) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with a GAN, which is then used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete due to the nature of the data recording process. A modified Gated Recurrent Unit is employed in the GAN to model the temporal irregularity of the incomplete time series. Experiments on two multivariate time series datasets show that the proposed model outperforms the baselines in terms of imputation accuracy. Experimental results also show that a simple model trained on the imputed data can achieve state-of-the-art results on the prediction tasks, demonstrating the benefits of our model in downstream applications. |
Tasks | Image Generation, Imputation, Multivariate Time Series Imputation, Time Series |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks |
http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks.pdf | |
PWC | https://paperswithcode.com/paper/multivariate-time-series-imputation-with |
Repo | |
Framework | |
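The modified recurrent cell can be sketched as a GRU whose hidden state is decayed according to the time gap since the last observation (the exact decay form here is an assumption, in the spirit of GRU-D-style decay; the paper's cell may differ):

```python
import numpy as np

# GRU step with time-gap decay: before the usual gate computations, the
# previous hidden state is shrunk by exp(-max(0, w * delta_t)), so stale
# state contributes less after long unobserved gaps.
rng = np.random.default_rng(0)
H, X = 4, 3
Wz, Wr, Wh = (rng.standard_normal((H, H + X)) * 0.1 for _ in range(3))
w_decay = rng.standard_normal(H) * 0.1

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def decayed_gru_step(h, x, delta_t):
    h = np.exp(-np.maximum(0.0, w_decay * delta_t)) * h   # decay stale state
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                                  # update gate
    r = sigmoid(Wr @ hx)                                  # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))    # candidate state
    return (1 - z) * h + z * h_tilde

h = np.zeros(H)
for x, dt in [(rng.standard_normal(X), 1.0), (rng.standard_normal(X), 5.0)]:
    h = decayed_gru_step(h, x, dt)
print(h.shape)  # (4,)
```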
Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval
Title | Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval |
Authors | Su-Mei Shiue, Lang-Jyi Huang, Wei-Ho Tsai, Yen-Lin Chen |
Abstract | |
Tasks | Information Retrieval, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1025/ |
https://www.aclweb.org/anthology/O18-1025 | |
PWC | https://paperswithcode.com/paper/smart-vs-solid-solutions-in-computational |
Repo | |
Framework | |
Formal Restrictions On Multiple Tiers
Title | Formal Restrictions On Multiple Tiers |
Authors | Alëna Aksënova, Sanket Deshmukh |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0307/ |
https://www.aclweb.org/anthology/W18-0307 | |
PWC | https://paperswithcode.com/paper/formal-restrictions-on-multiple-tiers |
Repo | |
Framework | |
Content-Based Conflict of Interest Detection on Wikipedia
Title | Content-Based Conflict of Interest Detection on Wikipedia |
Authors | Udochukwu Orizu, Yulan He |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1026/ |
https://www.aclweb.org/anthology/L18-1026 | |
PWC | https://paperswithcode.com/paper/content-based-conflict-of-interest-detection |
Repo | |
Framework | |
Unleashing the Potential of CNNs for Interpretable Few-Shot Learning
Title | Unleashing the Potential of CNNs for Interpretable Few-Shot Learning |
Authors | Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille |
Abstract | Convolutional neural networks (CNNs) have been generally acknowledged as one of the driving forces for the advancement of computer vision. Despite their promising performance on many tasks, CNNs still face major obstacles on the road to achieving ideal machine intelligence. One is that CNNs are complex and hard to interpret. Another is that standard CNNs require large amounts of annotated data, which is sometimes very hard to obtain, and it is desirable to be able to learn them from few examples. In this work, we address these limitations of CNNs by developing novel, simple, and interpretable models for few-shot learning. Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs. We first adapt the learning of visual concepts to the few-shot setting, and then uncover two key properties of feature encoding using visual concepts, which we call category sensitivity and spatial pattern. Motivated by these properties, we present two intuitive models for the problem of few-shot learning. Experiments show that our models achieve competitive performance, while being much more flexible and interpretable than alternative state-of-the-art few-shot learning methods. We conclude that using visual concepts helps expose the natural capability of CNNs for few-shot learning. |
Tasks | Few-Shot Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BJ_QxP1AZ |
https://openreview.net/pdf?id=BJ_QxP1AZ | |
PWC | https://paperswithcode.com/paper/unleashing-the-potential-of-cnns-for |
Repo | |
Framework | |
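The interpretable few-shot idea can be sketched with nearest-centroid classification over feature vectors (a deliberately simplified stand-in: the paper's visual-concept encoding is richer, and the feature vectors here are synthetic):

```python
import numpy as np

# Few-shot classification by prototypes: each class prototype is the mean
# feature vector of its few support examples; a query is assigned to the
# nearest prototype in Euclidean distance.
rng = np.random.default_rng(1)

def prototypes(support, labels):
    return {c: support[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(query, protos):
    return min(protos, key=lambda c: np.linalg.norm(query - protos[c]))

# Two classes, 5 "shots" each, drawn around well-separated centers.
support = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(1, 0.1, (5, 8))])
labels = np.array([0] * 5 + [1] * 5)
protos = prototypes(support, labels)
print(classify(rng.normal(1, 0.1, 8), protos))  # a query near center 1
```

The interpretability claim rests on each prototype dimension corresponding to a nameable visual concept rather than an opaque feature.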
DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings
Title | DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings |
Authors | Dmitry Kravchenko, Lidia Pivovarova |
Abstract | The paper describes our approach to SemEval-2018 Task 1: Affect Detection in Tweets. We perform experiments with manually compiled sentiment lexicons and word embeddings. We test their performance on the Twitter affect detection task to determine which features produce the most informative representation of a sentence. We demonstrate that general-purpose word embeddings produce a more informative sentence representation than lexicon features. However, combining lexicon features with embeddings yields higher performance than embeddings alone. |
Tasks | Emotion Classification, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1025/ |
https://www.aclweb.org/anthology/S18-1025 | |
PWC | https://paperswithcode.com/paper/dl-team-at-semeval-2018-task-1-tweet-affect |
Repo | |
Framework | |
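The winning feature combination reported above can be sketched as a simple concatenation (the embedding table and lexicons here are tiny illustrative stand-ins, not the paper's resources):

```python
import numpy as np

# Combine an averaged word-embedding vector with lexicon-count features
# by concatenation, yielding one fixed-length feature vector per tweet.
rng = np.random.default_rng(7)
emb = {w: rng.standard_normal(4) for w in ["happy", "sad", "so", "today"]}
pos_lex, neg_lex = {"happy"}, {"sad"}

def featurize(tokens):
    vecs = [emb[t] for t in tokens if t in emb]
    avg = np.mean(vecs, axis=0) if vecs else np.zeros(4)
    lex = [sum(t in pos_lex for t in tokens),   # positive-word count
           sum(t in neg_lex for t in tokens)]   # negative-word count
    return np.concatenate([avg, lex])

f = featurize(["so", "happy", "today"])
print(f.shape)  # (6,) = 4 embedding dims + 2 lexicon features
```

A downstream classifier trained on this concatenated vector sees both the distributional signal and the explicit lexicon signal, which is the combination the paper finds strongest.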