Paper Group NANR 26
The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection
Title | The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection |
Authors | Amir Zeldes, Debopam Das, Erick Galani Maziero, Juliano Antonio, Mikel Iruskieta |
Abstract | In 2019, we organized the first iteration of a shared task dedicated to the underlying units used in discourse parsing across formalisms: the DISRPT Shared Task on Elementary Discourse Unit Segmentation and Connective Detection. In this paper we review the data included in the task, which cover 2.6 million manually annotated tokens from 15 datasets in 10 languages, survey and compare submitted systems and report on system performance on each task for both annotated and plain-tokenized versions of the data. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-2713/ |
https://www.aclweb.org/anthology/W19-2713 | |
PWC | https://paperswithcode.com/paper/the-disrpt-2019-shared-task-on-elementary |
Repo | |
Framework | |
A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction
Title | A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction |
Authors | Mengjie Zhao, Hinrich Schütze |
Abstract | We present a new method for sentiment lexicon induction that is designed to be applicable to the entire range of typological diversity of the world's languages. We evaluate our method on Parallel Bible Corpus+ (PBC+), a parallel corpus of 1593 languages. The key idea is to use Byte Pair Encodings (BPEs) as basic units for multilingual embeddings. Through zero-shot transfer from English sentiment, we learn a seed lexicon for each language in the domain of PBC+. Through domain adaptation, we then generalize the domain-specific lexicon to a general one. We show – across typologically diverse languages in PBC+ – good quality of seed and general-domain sentiment lexicons by intrinsic and extrinsic and by automatic and human evaluation. We make freely available our code, seed sentiment lexicons for all 1593 languages and induced general-domain sentiment lexicons for 200 languages. |
Tasks | Domain Adaptation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1341/ |
https://www.aclweb.org/anthology/P19-1341 | |
PWC | https://paperswithcode.com/paper/a-multilingual-bpe-embedding-space-for |
Repo | |
Framework | |
Learning Joint 2D-3D Representations for Depth Completion
Title | Learning Joint 2D-3D Representations for Depth Completion |
Authors | Yun Chen, Bin Yang, Ming Liang, Raquel Urtasun |
Abstract | In this paper, we tackle the problem of depth completion from RGBD data. Towards this goal, we design a simple yet effective neural network block that learns to extract joint 2D and 3D features. Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points, with their output features fused in image space. We build the depth completion network simply by stacking the proposed block, which has the advantage of learning hierarchical representations that are fully fused between 2D and 3D spaces at multiple levels. We demonstrate the effectiveness of our approach on the challenging KITTI depth completion benchmark and show that our approach outperforms the state-of-the-art. |
Tasks | Depth Completion |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Chen_Learning_Joint_2D-3D_Representations_for_Depth_Completion_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Chen_Learning_Joint_2D-3D_Representations_for_Depth_Completion_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-joint-2d-3d-representations-for |
Repo | |
Framework | |
Detecting Collocations Similarity via Logical-Linguistic Model
Title | Detecting Collocations Similarity via Logical-Linguistic Model |
Authors | Nina Khairova, Svitlana Petrasova, Orken Mamyrbayev, Kuralay Mukhsina |
Abstract | Semantic similarity between collocations, along with words similarity, is one of the main issues of NLP, which must be addressed, in particular, in order to facilitate the automatic thesaurus generation. In the paper, we consider the logical-linguistic model that allows defining the relation of semantic similarity of collocations via the logical-algebraic equations. We provide the model for English, Ukrainian and Russian text corpora. The implementation for each language is slightly different in the equations of the finite predicates algebra and used linguistic resources. As a dataset for our experiment, we use 5801 pairs of sentences of Microsoft Research Paraphrase Corpus for English and more than 1 000 texts of scientific papers for Russian and Ukrainian. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0802/ |
https://www.aclweb.org/anthology/W19-0802 | |
PWC | https://paperswithcode.com/paper/detecting-collocations-similarity-via-logical |
Repo | |
Framework | |
Reconstructing Capsule Networks for Zero-shot Intent Classification
Title | Reconstructing Capsule Networks for Zero-shot Intent Classification |
Authors | Han Liu, Xiaotong Zhang, Lu Fan, Xuandi Fu, Qimai Li, Xiao-Ming Wu, Albert Y.S. Lam |
Abstract | Intent classification is an important building block of dialogue systems. With the burgeoning of conversational AI, existing systems are not capable of handling numerous fast-emerging intents, which motivates zero-shot intent classification. Nevertheless, research on this problem is still in the incipient stage and few methods are available. A recently proposed zero-shot intent classification method, IntentCapsNet, has been shown to achieve state-of-the-art performance. However, it has two unaddressed limitations: (1) it cannot deal with polysemy when extracting semantic capsules; (2) it hardly recognizes the utterances of unseen intents in the generalized zero-shot intent classification setting. To overcome these limitations, we propose to reconstruct capsule networks for zero-shot intent classification. First, we introduce a dimensional attention mechanism to fight against polysemy. Second, we reconstruct the transformation matrices for unseen intents by utilizing abundant latent information of the labeled utterances, which significantly improves the model generalization ability. Experimental results on two task-oriented dialogue datasets in different languages show that our proposed method outperforms IntentCapsNet and other strong baselines. |
Tasks | Intent Classification |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1486/ |
https://www.aclweb.org/anthology/D19-1486 | |
PWC | https://paperswithcode.com/paper/reconstructing-capsule-networks-for-zero-shot |
Repo | |
Framework | |
Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention
Title | Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention |
Authors | Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo |
Abstract | Abstractive Sentence Summarization (ASSUM) targets at grasping the core idea of the source sentence and presenting it as the summary. It is extensively studied using statistical models or neural models based on the large-scale monolingual source-summary parallel corpus. But there is no cross-lingual parallel corpus, whose source sentence language is different to the summary language, to directly train a cross-lingual ASSUM system. We propose to solve this zero-shot problem by using resource-rich monolingual ASSUM system to teach zero-shot cross-lingual ASSUM system on both summary word generation and attention. This teaching process is along with a back-translation process which simulates source-summary pairs. Experiments on cross-lingual ASSUM task show that our proposed method is significantly better than pipeline baselines and previous works, and greatly enhances the cross-lingual performances closer to the monolingual performances. |
Tasks | Abstractive Sentence Summarization |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1305/ |
https://www.aclweb.org/anthology/P19-1305 | |
PWC | https://paperswithcode.com/paper/zero-shot-cross-lingual-abstractive-sentence |
Repo | |
Framework | |
Paving the way towards counterfactual generation in argumentative conversational agents
Title | Paving the way towards counterfactual generation in argumentative conversational agents |
Authors | Ilia Stepin, Alejandro Catala, Martin Pereira-Fariña, Jose M. Alonso |
Abstract | |
Tasks | |
Published | 2019-01-01 |
URL | https://www.aclweb.org/anthology/W19-8405/ |
https://www.aclweb.org/anthology/W19-8405 | |
PWC | https://paperswithcode.com/paper/paving-the-way-towards-counterfactual |
Repo | |
Framework | |
Deep Metric Learning With Tuplet Margin Loss
Title | Deep Metric Learning With Tuplet Margin Loss |
Authors | Baosheng Yu, Dacheng Tao |
Abstract | Deep metric learning, in which the loss function plays a key role, has proven to be extremely useful in visual recognition tasks. However, existing deep metric learning loss functions such as contrastive loss and triplet loss usually rely on delicately selected samples (pairs or triplets) for fast convergence. In this paper, we propose a new deep metric learning loss function, tuplet margin loss, using randomly selected samples from each mini-batch. Specifically, the proposed tuplet margin loss implicitly up-weights hard samples and down-weights easy samples, while a slack margin in angular space is introduced to mitigate the problem of overfitting on the hardest sample. Furthermore, we address the problem of intra-pair variation by disentangling class-specific information to improve the generalizability of tuplet margin loss. Experimental results on three widely used deep metric learning datasets, CARS196, CUB200-2011, and Stanford Online Products, demonstrate significant improvements over existing deep metric learning methods. |
Tasks | Metric Learning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Deep_Metric_Learning_With_Tuplet_Margin_Loss_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Yu_Deep_Metric_Learning_With_Tuplet_Margin_Loss_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-metric-learning-with-tuplet-margin-loss |
Repo | |
Framework | |
A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS
Title | A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS |
Authors | Zichao Wang, Randall Balestriero, Richard Baraniuk |
Abstract | We develop a framework for understanding and improving recurrent neural networks (RNNs) using max-affine spline operators (MASO). We prove that RNNs using piecewise affine and convex nonlinearities can be written as a simple piecewise affine spline operator. The resulting representation provides several new perspectives for analyzing RNNs, three of which we study in this paper. First, we show that an RNN internally partitions the input space during training and that it builds up the partition through time. Second, we show that the affine parameter of an RNN corresponds to an input-specific template, from which we can interpret an RNN as performing a simple template matching (matched filtering) given the input. Third, by closely examining the MASO RNN formula, we prove that injecting Gaussian noise in the initial hidden state in RNNs corresponds to an explicit L2 regularization on the affine parameters, which links to exploding gradient issues and improves generalization. Extensive experiments on several datasets of various modalities demonstrate and validate each of the above analyses. In particular, using initial hidden states elevates simple RNNs to state-of-the-art performance on these datasets. |
Tasks | L2 Regularization |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJej72AqF7 |
https://openreview.net/pdf?id=BJej72AqF7 | |
PWC | https://paperswithcode.com/paper/a-max-affine-spline-perspective-of-recurrent |
Repo | |
Framework | |
Look Harder: A Neural Machine Translation Model with Hard Attention
Title | Look Harder: A Neural Machine Translation Model with Hard Attention |
Authors | Sathish Reddy Indurthi, Insoo Chung, Sangha Kim |
Abstract | Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. These models attend all the words in the source sequence for each target token, which makes them ineffective for long sequence translation. In this work, we propose a hard-attention based NMT model which selects a subset of source tokens for each target token to effectively handle long sequence translation. Due to the discrete nature of the hard-attention mechanism, we design a reinforcement learning algorithm coupled with reward shaping strategy to efficiently train it. Experimental results show that the proposed model performs better on long sequences and thereby achieves significant BLEU score improvement on English-German (EN-DE) and English-French (EN-FR) translation tasks compared to the soft attention based NMT. |
Tasks | Machine Translation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1290/ |
https://www.aclweb.org/anthology/P19-1290 | |
PWC | https://paperswithcode.com/paper/look-harder-a-neural-machine-translation |
Repo | |
Framework | |
Neural Text Style Transfer via Denoising and Reranking
Title | Neural Text Style Transfer via Denoising and Reranking |
Authors | Joseph Lee, Ziang Xie, Cindy Wang, Max Drach, Dan Jurafsky, Andrew Ng |
Abstract | We introduce a simple method for text style transfer that frames style transfer as denoising: we synthesize a noisy corpus and treat the source style as a noisy version of the target style. To control for aspects such as preserving meaning while modifying style, we propose a reranking approach in the data synthesis phase. We evaluate our method on three novel style transfer tasks: transferring between British and American varieties, text genres (formal vs. casual), and lyrics from different musical genres. By measuring style transfer quality, meaning preservation, and the fluency of generated outputs, we demonstrate that our method is able both to produce high-quality output while maintaining the flexibility to suggest syntactically rich stylistic edits. |
Tasks | Denoising, Style Transfer, Text Style Transfer |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-2309/ |
https://www.aclweb.org/anthology/W19-2309 | |
PWC | https://paperswithcode.com/paper/neural-text-style-transfer-via-denoising-and |
Repo | |
Framework | |
Convergence-Rate-Matching Discretization of Accelerated Optimization Flows Through Opportunistic State-Triggered Control
Title | Convergence-Rate-Matching Discretization of Accelerated Optimization Flows Through Opportunistic State-Triggered Control |
Authors | Miguel Vaquero, Jorge Cortes |
Abstract | A recent body of exciting work seeks to shed light on the behavior of accelerated methods in optimization via high-resolution differential equations. These differential equations are continuous counterparts of the discrete-time optimization algorithms, and their convergence properties can be characterized using the powerful tools provided by classical Lyapunov stability analysis. An outstanding question of pivotal importance is how to discretize these continuous flows while maintaining their convergence rates. This paper provides a novel approach through the idea of opportunistic state-triggered control. We take advantage of the Lyapunov functions employed to characterize the rate of convergence of high-resolution differential equations to design variable-stepsize forward-Euler discretizations that preserve the Lyapunov decay of the original dynamics. The philosophy of our approach is not limited to forward-Euler discretizations and may be combined with other integration schemes. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9170-convergence-rate-matching-discretization-of-accelerated-optimization-flows-through-opportunistic-state-triggered-control |
http://papers.nips.cc/paper/9170-convergence-rate-matching-discretization-of-accelerated-optimization-flows-through-opportunistic-state-triggered-control.pdf | |
PWC | https://paperswithcode.com/paper/convergence-rate-matching-discretization-of |
Repo | |
Framework | |
Integrated Steganography and Steganalysis with Generative Adversarial Networks
Title | Integrated Steganography and Steganalysis with Generative Adversarial Networks |
Authors | Chong Yu |
Abstract | Recently, generative adversarial network is the hotspot in research areas and industrial application areas. Its application on data generation in computer vision is most common usage. This paper extends its application to data hiding and security area. In this paper, we propose the novel framework to integrate steganography and steganalysis processes. The proposed framework applies generative adversarial networks as the core structure. The discriminative model simulates the steganalysis process, which can help us understand the sensitivity of cover images to semantic changes. The steganography generative model is to generate stego image which is aligned with the original cover image, and attempts to confuse steganalysis discriminative model. The introduction of cycle discriminative model and inconsistent loss can help to enhance the quality and security of generated stego image in the iterative training process. Training dataset is mixed with intact images as well as intentional attacked images. The mix training process can further improve the robustness and security of new framework. Through the qualitative, quantitative experiments and analysis, this novel framework shows compelling performance and advantages over the current state-of-the-art methods in steganography and steganalysis benchmarks. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=r1Vx_oA5YQ |
https://openreview.net/pdf?id=r1Vx_oA5YQ | |
PWC | https://paperswithcode.com/paper/integrated-steganography-and-steganalysis |
Repo | |
Framework | |
PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation
Title | PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation |
Authors | Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie |
Abstract | This paper describes the models designated for the MEDIQA 2019 shared tasks by the team PANLP. We take advantage of the recent advances in pre-trained bidirectional transformer language models such as BERT (Devlin et al., 2018) and MT-DNN (Liu et al., 2019b). We find that pre-trained language models can significantly outperform traditional deep learning models. Transfer learning from the NLI task to the RQE task is also experimented, which proves to be useful in improving the results of fine-tuning MT-DNN large. A knowledge distillation process is implemented, to distill the knowledge contained in a set of models and transfer it into a single model, whose performance turns out to be comparable with that obtained by the ensemble of that set of models. Finally, for test submissions, model ensemble and a re-ranking process are implemented to boost the performances. Our models participated in all three tasks and ranked the 1st place for the RQE task, and the 2nd place for the NLI task, and also the 2nd place for the QA task. |
Tasks | Transfer Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5040/ |
https://www.aclweb.org/anthology/W19-5040 | |
PWC | https://paperswithcode.com/paper/panlp-at-mediqa-2019-pre-trained-language |
Repo | |
Framework | |
Deep Bayesian Convolutional Networks with Many Channels are Gaussian Processes
Title | Deep Bayesian Convolutional Networks with Many Channels are Gaussian Processes |
Authors | Roman Novak, Lechao Xiao, Yasaman Bahri, Jaehoon Lee, Greg Yang, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein |
Abstract | There is a previously identified equivalence between wide fully connected neural networks (FCNs) and Gaussian processes (GPs). This equivalence enables, for instance, test set predictions that would have resulted from a fully Bayesian, infinitely wide trained FCN to be computed without ever instantiating the FCN, but by instead evaluating the corresponding GP. In this work, we derive an analogous equivalence for multi-layer convolutional neural networks (CNNs) both with and without pooling layers, and achieve state of the art results on CIFAR10 for GPs without trainable kernels. We also introduce a Monte Carlo method to estimate the GP corresponding to a given neural network architecture, even in cases where the analytic form has too many terms to be computationally feasible. Surprisingly, in the absence of pooling layers, the GPs corresponding to CNNs with and without weight sharing are identical. As a consequence, translation equivariance in finite channel CNNs trained with stochastic gradient descent (SGD) has no corresponding property in the Bayesian treatment of the infinite channel limit – a qualitative difference between the two regimes that is not present in the FCN case. We confirm experimentally, that while in some scenarios the performance of SGD-trained finite CNNs approaches that of the corresponding GPs as the channel count increases, with careful tuning SGD-trained CNNs can significantly outperform their corresponding GPs. |
Tasks | Gaussian Processes |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1g30j0qF7 |
https://openreview.net/pdf?id=B1g30j0qF7 | |
PWC | https://paperswithcode.com/paper/deep-bayesian-convolutional-networks-with |
Repo | |
Framework | |