Paper Group NANR 202
On the limitations of first order approximation in GAN dynamics
Title | On the limitations of first order approximation in GAN dynamics |
Authors | Jerry Li, Aleksander Madry, John Peebles, Ludwig Schmidt |
Abstract | Generative Adversarial Networks (GANs) have been proposed as an approach to learning generative models. While GANs have demonstrated promising performance on multiple vision tasks, their learning dynamics are not yet well understood, either in theory or in practice. In particular, work in this domain has so far focused only on understanding the properties of the stationary solutions that these dynamics might converge to, and on the behavior of the dynamics in those solutions’ immediate neighborhood. To address this issue, in this work we take a first step towards a principled study of the GAN dynamics itself. To this end, we propose a model that, on the one hand, exhibits several of the common problematic convergence behaviors (e.g., vanishing gradients, mode collapse, diverging or oscillatory behavior), but, on the other hand, is sufficiently simple to enable rigorous convergence analysis. This methodology enables us to exhibit an interesting phenomenon: a GAN with an optimal discriminator provably converges, while guiding the GAN training using only a first-order approximation of the discriminator leads to unstable GAN dynamics and mode collapse. This suggests that such use of the first-order approximation of the discriminator, which is the de facto standard in all existing GAN training, might be one of the factors that makes GAN training so challenging in practice. Additionally, our convergence result constitutes the first rigorous analysis of the dynamics of a concrete parametric GAN. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJYQLb-RW |
https://openreview.net/pdf?id=HJYQLb-RW | |
PWC | https://paperswithcode.com/paper/on-the-limitations-of-first-order |
Repo | |
Framework | |
What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python
Title | What’s Wrong, Python? – A Visual Differ and Graph Library for NLP in Python |
Authors | Balázs Indig, András Simonyi, Noémi Ligeti-Nagy |
Abstract | |
Tasks | Chunking |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1091/ |
https://www.aclweb.org/anthology/L18-1091 | |
PWC | https://paperswithcode.com/paper/whats-wrong-python-a-visual-differ-and-graph |
Repo | |
Framework | |
Fast and Accurate Reordering with ITG Transition RNN
Title | Fast and Accurate Reordering with ITG Transition RNN |
Authors | Hao Zhang, Axel Ng, Richard Sproat |
Abstract | Attention-based sequence-to-sequence neural network models learn to jointly align and translate. The quadratic-time attention mechanism is powerful, as it can handle arbitrary long-distance reordering, but it is computationally expensive. In this paper, towards making neural translation both accurate and efficient, we follow the traditional pre-reordering approach to decouple reordering from translation. We add a reordering RNN that shares the input encoder with the decoder. The RNNs are trained jointly with a multi-task loss function and applied sequentially at inference time. The task of the reordering model is to predict the permutation of the input words following the target-language word order. After reordering, the attention in the decoder becomes more peaked and monotonic. For reordering, we adopt the Inversion Transduction Grammar (ITG) and propose a transition system to parse the input into trees for reordering. We harness the ITG transition system with an RNN. With the modeling power of RNNs, we achieve superior reordering accuracy without any feature engineering. In experiments, we apply the model to the task of text normalization. Compared to a strong attention-based RNN baseline, our ITG RNN reordering model reaches the same reordering accuracy with only 1/10 of the training data and is 2.5x faster in decoding. |
Tasks | Feature Engineering, Machine Translation, Morphological Inflection, Speech Recognition |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1123/ |
https://www.aclweb.org/anthology/C18-1123 | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-reordering-with-itg |
Repo | |
Framework | |
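An ITG transition system can be sketched with a simple stack machine (an illustrative reconstruction; the paper's exact transition set and oracle may differ): SHIFT pushes the next input word, REDUCE-STRAIGHT concatenates the top two stack items in order, and REDUCE-INVERTED concatenates them swapped, which is what produces reorderings.

```python
# Minimal ITG-style transition reorderer: a stack of word spans plus a
# buffer of remaining input words. Inverted reductions swap the two spans.
def itg_reorder(words, transitions):
    stack, buffer = [], list(words)
    for t in transitions:
        if t == "SHIFT":
            stack.append([buffer.pop(0)])
        elif t == "STRAIGHT":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif t == "INVERTED":
            right, left = stack.pop(), stack.pop()
            stack.append(right + left)
    assert len(stack) == 1 and not buffer, "transition sequence must be complete"
    return stack[0]

# Invert the first two words, then attach the third in order: "b a c".
print(itg_reorder(["a", "b", "c"],
                  ["SHIFT", "SHIFT", "INVERTED", "SHIFT", "STRAIGHT"]))
```

In the paper's setup, an RNN would score which transition to take at each step; here the sequence is given explicitly.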
Local Density Estimation in High Dimensions
Title | Local Density Estimation in High Dimensions |
Authors | Xian Wu, Moses Charikar, Vishnu Natchu |
Abstract | An important question that arises in the study of high dimensional vector representations learned from data is: given a set D of vectors and a query q, estimate the number of points within a specified distance threshold of q. Our algorithm uses locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via an unbiased estimator that uses importance sampling. A key innovation is the ability to maintain a small number of hash tables via preprocessing data structures and algorithms that sample from multiple buckets in each hash table. We give bounds on the space requirements and query complexity of our scheme, and demonstrate the effectiveness of our algorithm by experiments on a standard word embedding dataset. |
Tasks | Density Estimation |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2460 |
http://proceedings.mlr.press/v80/wu18a/wu18a.pdf | |
PWC | https://paperswithcode.com/paper/local-density-estimation-in-high-dimensions |
Repo | |
Framework | |
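The LSH preprocessing step can be sketched with random-hyperplane hashing (a simplified stand-in: the paper's unbiased importance-sampling estimator over sampled buckets is more involved; here we only collect points that collide with the query in at least one table, as a crude density proxy).

```python
import numpy as np

# Random-hyperplane (SimHash) LSH: each table hashes a vector to the sign
# pattern of its projections onto a few random hyperplanes. Nearby vectors
# collide in some table with high probability.
rng = np.random.default_rng(0)

def build_tables(data, n_tables=8, n_bits=6):
    tables, planes = [], []
    for _ in range(n_tables):
        P = rng.standard_normal((n_bits, data.shape[1]))
        keys = data @ P.T > 0                     # sign pattern per point
        table = {}
        for i, k in enumerate(map(tuple, keys)):
            table.setdefault(k, set()).add(i)
        tables.append(table)
        planes.append(P)
    return tables, planes

def candidates(q, tables, planes):
    hits = set()
    for table, P in zip(tables, planes):
        hits |= table.get(tuple(q @ P.T > 0), set())
    return hits

data = rng.standard_normal((500, 16))
q = data[0]
near = candidates(q, *build_tables(data))
print(len(near))  # number of candidate near points for q
```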
Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages
Title | Quality Estimation for Automatically Generated Titles of eCommerce Browse Pages |
Authors | Nicola Ueffing, José G. C. de Souza, Gregor Leusch |
Abstract | At eBay, we are automatically generating a large amount of natural language titles for eCommerce browse pages using machine translation (MT) technology. While automatic approaches can generate millions of titles very fast, they are prone to errors. We therefore develop quality estimation (QE) methods which can automatically detect titles with low quality in order to prevent them from going live. In this paper, we present different approaches: The first one is a Random Forest (RF) model that explores hand-crafted, robust features, which are a mix of established features commonly used in Machine Translation Quality Estimation (MTQE) and new features developed specifically for our task. The second model is based on Siamese Networks (SNs) which embed the metadata input sequence and the generated title in the same space and do not require hand-crafted features at all. We thoroughly evaluate and compare those approaches on in-house data. While the RF models are competitive in scenarios with smaller amounts of training data and are somewhat more robust, they are clearly outperformed by the SN models when the amount of training data is larger. |
Tasks | Machine Translation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3007/ |
https://www.aclweb.org/anthology/N18-3007 | |
PWC | https://paperswithcode.com/paper/quality-estimation-for-automatically |
Repo | |
Framework | |
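The Siamese idea can be sketched in a few lines (a toy, not the paper's trained network): encode the metadata sequence and the generated title with the same shared encoder and use their cosine similarity as the quality score. Here the "encoder" is just an average over a fixed random word-embedding table; the vocabulary and vectors are purely illustrative.

```python
import numpy as np

# Toy shared encoder: average fixed word vectors and L2-normalize, so the
# dot product of two encodings is their cosine similarity in [-1, 1].
rng = np.random.default_rng(42)
vocab = {w: rng.standard_normal(8) for w in
         "blue cotton shirts men s clothing red".split()}

def encode(tokens):
    vecs = [vocab[t] for t in tokens if t in vocab]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

def quality_score(metadata, title):
    return float(encode(metadata) @ encode(title))

print(quality_score(["blue", "cotton", "shirts"], ["blue", "cotton", "shirts"]))
# identical sequences score 1.0; a divergent title scores lower
```

A real SN would learn the encoder end-to-end so that low-quality titles land far from their metadata in the shared space.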
Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech
Title | Pronunciation Variants and ASR of Colloquial Speech: A Case Study on Czech |
Authors | David Lukeš, Marie Kopřivová, Zuzana Komrsková, Petra Poukarová |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1428/ |
https://www.aclweb.org/anthology/L18-1428 | |
PWC | https://paperswithcode.com/paper/pronunciation-variants-and-asr-of-colloquial |
Repo | |
Framework | |
Learning in Reproducing Kernel Kreı̆n Spaces
Title | Learning in Reproducing Kernel Kreı̆n Spaces |
Authors | Dino Oglic, Thomas Gaertner |
Abstract | We formulate a novel regularized risk minimization problem for learning in reproducing kernel Kreĭn spaces and show that the strong representer theorem applies to it. As a result of the latter, the learning problem can be expressed as the minimization of a quadratic form over a hypersphere of constant radius. We present an algorithm that can find a globally optimal solution to this non-convex optimization problem in time cubic in the number of instances. Moreover, we derive the gradient of the solution with respect to its hyperparameters and, in this way, provide means for efficient hyperparameter tuning. The approach comes with a generalization bound expressed in terms of the Rademacher complexity of the corresponding hypothesis space. The major advantage over standard kernel methods is the ability to learn with various domain-specific similarity measures for which positive definiteness does not hold or is difficult to establish. The approach is evaluated empirically using indefinite kernels defined on structured as well as vectorial data. The empirical results demonstrate superior performance of our approach over the state-of-the-art baselines. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2200 |
http://proceedings.mlr.press/v80/oglic18a/oglic18a.pdf | |
PWC | https://paperswithcode.com/paper/learning-in-reproducing-kernel-kren-spaces |
Repo | |
Framework | |
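A Kreĭn space is, informally, the difference of two Hilbert spaces, and concretely any symmetric (possibly indefinite) kernel matrix splits as K = K₊ − K₋ with both parts positive semi-definite, via its eigendecomposition. A small numerical illustration of that split (not the paper's learning algorithm):

```python
import numpy as np

# Split a symmetric matrix into positive and negative parts:
# K = K_plus - K_minus, where K_plus keeps the nonnegative eigenvalues
# and K_minus the (negated) negative ones. Both parts are PSD.
def krein_split(K):
    w, V = np.linalg.eigh(K)
    K_plus = (V * np.clip(w, 0, None)) @ V.T
    K_minus = (V * np.clip(-w, 0, None)) @ V.T
    return K_plus, K_minus

K = np.array([[2.0, 3.0], [3.0, -1.0]])   # indefinite: det(K) = -11 < 0
K_plus, K_minus = krein_split(K)
assert np.allclose(K_plus - K_minus, K)
assert np.all(np.linalg.eigvalsh(K_plus) >= -1e-12)
```

This is why indefinite similarity measures remain usable: the learning problem is posed over the difference of the two induced Hilbert spaces rather than forcing K to be positive definite.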
Chat, Chunk and Topic in Casual Conversation
Title | Chat, Chunk and Topic in Casual Conversation |
Authors | Emer Gilmartin, Carl Vogel |
Abstract | |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4705/ |
https://www.aclweb.org/anthology/W18-4705 | |
PWC | https://paperswithcode.com/paper/chatchunk-and-topic-in-casual-conversation |
Repo | |
Framework | |
Grammar Size and Quantitative Restrictions on Movement
Title | Grammar Size and Quantitative Restrictions on Movement |
Authors | Thomas Graf |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0303/ |
https://www.aclweb.org/anthology/W18-0303 | |
PWC | https://paperswithcode.com/paper/grammar-size-and-quantitative-restrictions-on |
Repo | |
Framework | |
Multivariate Time Series Imputation with Generative Adversarial Networks
Title | Multivariate Time Series Imputation with Generative Adversarial Networks |
Authors | Yonghong Luo, Xiangrui Cai, Ying Zhang, Jun Xu, Yuan Xiaojie |
Abstract | Multivariate time series usually contain a large number of missing values, which hinders the application of advanced analysis methods to multivariate time series data. Conventional approaches to addressing the challenge of missing values, including mean/zero imputation, case deletion, and matrix-factorization-based imputation, are all incapable of modeling the temporal dependencies and the complex distributions in multivariate time series. In this paper, we treat the problem of missing-value imputation as data generation. Inspired by the success of Generative Adversarial Networks (GANs) in image generation, we propose to learn the overall distribution of a multivariate time series dataset with a GAN, which is then used to generate the missing values for each sample. Unlike image data, time series data are usually incomplete due to the nature of the data recording process. A modified Gated Recurrent Unit is employed in the GAN to model the temporal irregularity of the incomplete time series. Experiments on two multivariate time series datasets show that the proposed model outperforms the baselines in terms of imputation accuracy. Experimental results also show that a simple model trained on the imputed data can achieve state-of-the-art results on the prediction tasks, demonstrating the benefits of our model in downstream applications. |
Tasks | Image Generation, Imputation, Multivariate Time Series Imputation, Time Series |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks |
http://papers.nips.cc/paper/7432-multivariate-time-series-imputation-with-generative-adversarial-networks.pdf | |
PWC | https://paperswithcode.com/paper/multivariate-time-series-imputation-with |
Repo | |
Framework | |
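The modified recurrent cell can be sketched as a GRU whose hidden state is decayed according to the time gap since the last observation (the exact decay form here is an assumption, in the spirit of GRU-D-style decay; the paper's cell may differ):

```python
import numpy as np

# GRU step with time-gap decay: before the usual gate computations, the
# previous hidden state is shrunk by exp(-max(0, w * delta_t)), so stale
# state contributes less after long unobserved gaps.
rng = np.random.default_rng(0)
H, X = 4, 3
Wz, Wr, Wh = (rng.standard_normal((H, H + X)) * 0.1 for _ in range(3))
w_decay = rng.standard_normal(H) * 0.1

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def decayed_gru_step(h, x, delta_t):
    h = np.exp(-np.maximum(0.0, w_decay * delta_t)) * h   # decay stale state
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)                                  # update gate
    r = sigmoid(Wr @ hx)                                  # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h, x]))    # candidate state
    return (1 - z) * h + z * h_tilde

h = np.zeros(H)
for x, dt in [(rng.standard_normal(X), 1.0), (rng.standard_normal(X), 5.0)]:
    h = decayed_gru_step(h, x, dt)
print(h.shape)  # (4,)
```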
Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval
Title | Smart vs. Solid Solutions in Computational Linguistics—Machine Translation or Information Retrieval |
Authors | Su-Mei Shiue, Lang-Jyi Huang, Wei-Ho Tsai, Yen-Lin Chen |
Abstract | |
Tasks | Information Retrieval, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1025/ |
https://www.aclweb.org/anthology/O18-1025 | |
PWC | https://paperswithcode.com/paper/smart-vs-solid-solutions-in-computational |
Repo | |
Framework | |
Formal Restrictions On Multiple Tiers
Title | Formal Restrictions On Multiple Tiers |
Authors | Alëna Aksënova, Sanket Deshmukh |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0307/ |
https://www.aclweb.org/anthology/W18-0307 | |
PWC | https://paperswithcode.com/paper/formal-restrictions-on-multiple-tiers |
Repo | |
Framework | |
Content-Based Conflict of Interest Detection on Wikipedia
Title | Content-Based Conflict of Interest Detection on Wikipedia |
Authors | Udochukwu Orizu, Yulan He |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1026/ |
https://www.aclweb.org/anthology/L18-1026 | |
PWC | https://paperswithcode.com/paper/content-based-conflict-of-interest-detection |
Repo | |
Framework | |
Unleashing the Potential of CNNs for Interpretable Few-Shot Learning
Title | Unleashing the Potential of CNNs for Interpretable Few-Shot Learning |
Authors | Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille |
Abstract | Convolutional neural networks (CNNs) have been generally acknowledged as one of the driving forces for the advancement of computer vision. Despite their promising performance on many tasks, CNNs still face major obstacles on the road to achieving ideal machine intelligence. One is that CNNs are complex and hard to interpret. Another is that standard CNNs require large amounts of annotated data, which is sometimes very hard to obtain, and it is desirable to be able to learn them from few examples. In this work, we address these limitations of CNNs by developing novel, simple, and interpretable models for few-shot learning. Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs. We first adapt the learning of visual concepts to the few-shot setting, and then uncover two key properties of feature encoding using visual concepts, which we call category sensitivity and spatial pattern. Motivated by these properties, we present two intuitive models for the problem of few-shot learning. Experiments show that our models achieve competitive performance, while being much more flexible and interpretable than alternative state-of-the-art few-shot learning methods. We conclude that using visual concepts helps expose the natural capability of CNNs for few-shot learning. |
Tasks | Few-Shot Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BJ_QxP1AZ |
https://openreview.net/pdf?id=BJ_QxP1AZ | |
PWC | https://paperswithcode.com/paper/unleashing-the-potential-of-cnns-for |
Repo | |
Framework | |
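The interpretable few-shot idea can be sketched with nearest-centroid classification over feature vectors (a deliberately simplified stand-in: the paper's visual-concept encoding is richer, and the feature vectors here are synthetic):

```python
import numpy as np

# Few-shot classification by prototypes: each class prototype is the mean
# feature vector of its few support examples; a query is assigned to the
# nearest prototype in Euclidean distance.
rng = np.random.default_rng(1)

def prototypes(support, labels):
    return {c: support[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(query, protos):
    return min(protos, key=lambda c: np.linalg.norm(query - protos[c]))

# Two classes, 5 "shots" each, drawn around well-separated centers.
support = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(1, 0.1, (5, 8))])
labels = np.array([0] * 5 + [1] * 5)
protos = prototypes(support, labels)
print(classify(rng.normal(1, 0.1, 8), protos))  # a query near center 1
```

The interpretability claim rests on each prototype dimension corresponding to a nameable visual concept rather than an opaque feature.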
DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings
Title | DL Team at SemEval-2018 Task 1: Tweet Affect Detection using Sentiment Lexicons and Embeddings |
Authors | Dmitry Kravchenko, Lidia Pivovarova |
Abstract | The paper describes our approach to SemEval-2018 Task 1: Affect Detection in Tweets. We perform experiments with manually compiled sentiment lexicons and word embeddings. We test their performance on the Twitter affect detection task to determine which features produce the most informative representation of a sentence. We demonstrate that general-purpose word embeddings produce a more informative sentence representation than lexicon features. However, combining lexicon features with embeddings yields higher performance than embeddings alone. |
Tasks | Emotion Classification, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1025/ |
https://www.aclweb.org/anthology/S18-1025 | |
PWC | https://paperswithcode.com/paper/dl-team-at-semeval-2018-task-1-tweet-affect |
Repo | |
Framework | |
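The winning feature combination reported above can be sketched as a simple concatenation (the embedding table and lexicons here are tiny illustrative stand-ins, not the paper's resources):

```python
import numpy as np

# Combine an averaged word-embedding vector with lexicon-count features
# by concatenation, yielding one fixed-length feature vector per tweet.
rng = np.random.default_rng(7)
emb = {w: rng.standard_normal(4) for w in ["happy", "sad", "so", "today"]}
pos_lex, neg_lex = {"happy"}, {"sad"}

def featurize(tokens):
    vecs = [emb[t] for t in tokens if t in emb]
    avg = np.mean(vecs, axis=0) if vecs else np.zeros(4)
    lex = [sum(t in pos_lex for t in tokens),   # positive-word count
           sum(t in neg_lex for t in tokens)]   # negative-word count
    return np.concatenate([avg, lex])

f = featurize(["so", "happy", "today"])
print(f.shape)  # (6,) = 4 embedding dims + 2 lexicon features
```

A downstream classifier trained on this concatenated vector sees both the distributional signal and the explicit lexicon signal, which is the combination the paper finds strongest.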