January 25, 2020

Paper Group NANR 26

The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection


Title	The DISRPT 2019 Shared Task on Elementary Discourse Unit Segmentation and Connective Detection
Authors	Amir Zeldes, Debopam Das, Erick Galani Maziero, Juliano Antonio, Mikel Iruskieta
Abstract	In 2019, we organized the first iteration of a shared task dedicated to the underlying units used in discourse parsing across formalisms: the DISRPT Shared Task on Elementary Discourse Unit Segmentation and Connective Detection. In this paper we review the data included in the task, which cover 2.6 million manually annotated tokens from 15 datasets in 10 languages, survey and compare submitted systems and report on system performance on each task for both annotated and plain-tokenized versions of the data.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/W19-2713/
PDF	https://www.aclweb.org/anthology/W19-2713
PWC	https://paperswithcode.com/paper/the-disrpt-2019-shared-task-on-elementary
Repo
Framework

A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction


Title	A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction
Authors	Mengjie Zhao, Hinrich Schütze
Abstract	We present a new method for sentiment lexicon induction that is designed to be applicable to the entire range of typological diversity of the world's languages. We evaluate our method on Parallel Bible Corpus+ (PBC+), a parallel corpus of 1593 languages. The key idea is to use Byte Pair Encodings (BPEs) as basic units for multilingual embeddings. Through zero-shot transfer from English sentiment, we learn a seed lexicon for each language in the domain of PBC+. Through domain adaptation, we then generalize the domain-specific lexicon to a general one. We show – across typologically diverse languages in PBC+ – good quality of seed and general-domain sentiment lexicons by intrinsic and extrinsic and by automatic and human evaluation. We make freely available our code, seed sentiment lexicons for all 1593 languages and induced general-domain sentiment lexicons for 200 languages.
Tasks	Domain Adaptation
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1341/
PDF	https://www.aclweb.org/anthology/P19-1341
PWC	https://paperswithcode.com/paper/a-multilingual-bpe-embedding-space-for
Repo
Framework
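
The abstract above names Byte Pair Encodings as the basic units of the embedding space. As a rough illustration only, here is a minimal sketch of how BPE merges can be learned from a toy word list and then used to segment a word. The function names (`apply_merge`, `learn_bpe`, `segment`), the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for this example; the paper's actual preprocessing is not described in the abstract.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy corpus)."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken lexicographically (assumption).
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Segment a word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


merges = learn_bpe(["abab", "abab", "ab"], num_merges=2)
print(merges)
print(segment("abab", merges))
```

Because the units are learned from character statistics alone, the same procedure applies to any language with a written corpus, which is presumably what makes subword units attractive for a 1593-language setting.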

Learning Joint 2D-3D Representations for Depth Completion


Title	Learning Joint 2D-3D Representations for Depth Completion
Authors	Yun Chen, Bin Yang, Ming Liang, Raquel Urtasun
Abstract	In this paper, we tackle the problem of depth completion from RGBD data. Towards this goal, we design a simple yet effective neural network block that learns to extract joint 2D and 3D features. Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points, with their output features fused in image space. We build the depth completion network simply by stacking the proposed block, which has the advantage of learning hierarchical representations that are fully fused between 2D and 3D spaces at multiple levels. We demonstrate the effectiveness of our approach on the challenging KITTI depth completion benchmark and show that our approach outperforms the state-of-the-art.
Tasks	Depth Completion
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Chen_Learning_Joint_2D-3D_Representations_for_Depth_Completion_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Chen_Learning_Joint_2D-3D_Representations_for_Depth_Completion_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/learning-joint-2d-3d-representations-for
Repo
Framework

Detecting Collocations Similarity via Logical-Linguistic Model


Title	Detecting Collocations Similarity via Logical-Linguistic Model
Authors	Nina Khairova, Svitlana Petrasova, Orken Mamyrbayev, Kuralay Mukhsina
Abstract	Semantic similarity between collocations, along with word similarity, is one of the main issues of NLP, and it must be addressed, in particular, to facilitate automatic thesaurus generation. In the paper, we consider a logical-linguistic model that allows defining the relation of semantic similarity of collocations via logical-algebraic equations. We provide the model for English, Ukrainian and Russian text corpora. The implementation for each language differs slightly in the equations of the finite predicates algebra and in the linguistic resources used. As a dataset for our experiment, we use 5801 pairs of sentences of the Microsoft Research Paraphrase Corpus for English and more than 1 000 texts of scientific papers for Russian and Ukrainian.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2019-05-01
URL	https://www.aclweb.org/anthology/W19-0802/
PDF	https://www.aclweb.org/anthology/W19-0802
PWC	https://paperswithcode.com/paper/detecting-collocations-similarity-via-logical
Repo
Framework

Reconstructing Capsule Networks for Zero-shot Intent Classification


Title	Reconstructing Capsule Networks for Zero-shot Intent Classification
Authors	Han Liu, Xiaotong Zhang, Lu Fan, Xuandi Fu, Qimai Li, Xiao-Ming Wu, Albert Y.S. Lam
Abstract	Intent classification is an important building block of dialogue systems. With the burgeoning of conversational AI, existing systems are not capable of handling numerous fast-emerging intents, which motivates zero-shot intent classification. Nevertheless, research on this problem is still in the incipient stage and few methods are available. A recently proposed zero-shot intent classification method, IntentCapsNet, has been shown to achieve state-of-the-art performance. However, it has two unaddressed limitations: (1) it cannot deal with polysemy when extracting semantic capsules; (2) it hardly recognizes the utterances of unseen intents in the generalized zero-shot intent classification setting. To overcome these limitations, we propose to reconstruct capsule networks for zero-shot intent classification. First, we introduce a dimensional attention mechanism to fight against polysemy. Second, we reconstruct the transformation matrices for unseen intents by utilizing abundant latent information of the labeled utterances, which significantly improves the model generalization ability. Experimental results on two task-oriented dialogue datasets in different languages show that our proposed method outperforms IntentCapsNet and other strong baselines.
Tasks	Intent Classification
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1486/
PDF	https://www.aclweb.org/anthology/D19-1486
PWC	https://paperswithcode.com/paper/reconstructing-capsule-networks-for-zero-shot
Repo
Framework

Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention


Title	Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention
Authors	Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo
Abstract	Abstractive Sentence Summarization (ASSUM) aims to grasp the core idea of the source sentence and present it as the summary. It is extensively studied using statistical models or neural models based on large-scale monolingual source-summary parallel corpora. But there is no cross-lingual parallel corpus, in which the source sentence language differs from the summary language, to directly train a cross-lingual ASSUM system. We propose to solve this zero-shot problem by using a resource-rich monolingual ASSUM system to teach a zero-shot cross-lingual ASSUM system on both summary word generation and attention. This teaching process is accompanied by a back-translation process which simulates source-summary pairs. Experiments on the cross-lingual ASSUM task show that our proposed method is significantly better than pipeline baselines and previous works, and brings cross-lingual performance much closer to monolingual performance.
Tasks	Abstractive Sentence Summarization
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1305/
PDF	https://www.aclweb.org/anthology/P19-1305
PWC	https://paperswithcode.com/paper/zero-shot-cross-lingual-abstractive-sentence
Repo
Framework

Paving the way towards counterfactual generation in argumentative conversational agents


Title	Paving the way towards counterfactual generation in argumentative conversational agents
Authors	Ilia Stepin, Alejandro Catala, Martin Pereira-Fariña, Jose M. Alonso
Abstract
Tasks
Published	2019-01-01
URL	https://www.aclweb.org/anthology/W19-8405/
PDF	https://www.aclweb.org/anthology/W19-8405
PWC	https://paperswithcode.com/paper/paving-the-way-towards-counterfactual
Repo
Framework

Deep Metric Learning With Tuplet Margin Loss


Title	Deep Metric Learning With Tuplet Margin Loss
Authors	Baosheng Yu, Dacheng Tao
Abstract	Deep metric learning, in which the loss function plays a key role, has proven to be extremely useful in visual recognition tasks. However, existing deep metric learning loss functions such as contrastive loss and triplet loss usually rely on delicately selected samples (pairs or triplets) for fast convergence. In this paper, we propose a new deep metric learning loss function, tuplet margin loss, using randomly selected samples from each mini-batch. Specifically, the proposed tuplet margin loss implicitly up-weights hard samples and down-weights easy samples, while a slack margin in angular space is introduced to mitigate the problem of overfitting on the hardest sample. Furthermore, we address the problem of intra-pair variation by disentangling class-specific information to improve the generalizability of tuplet margin loss. Experimental results on three widely used deep metric learning datasets, CARS196, CUB200-2011, and Stanford Online Products, demonstrate significant improvements over existing deep metric learning methods.
Tasks	Metric Learning
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Deep_Metric_Learning_With_Tuplet_Margin_Loss_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Yu_Deep_Metric_Learning_With_Tuplet_Margin_Loss_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/deep-metric-learning-with-tuplet-margin-loss
Repo
Framework

A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS


Title	A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS
Authors	Zichao Wang, Randall Balestriero, Richard Baraniuk
Abstract	We develop a framework for understanding and improving recurrent neural networks (RNNs) using max-affine spline operators (MASO). We prove that RNNs using piecewise affine and convex nonlinearities can be written as a simple piecewise affine spline operator. The resulting representation provides several new perspectives for analyzing RNNs, three of which we study in this paper. First, we show that an RNN internally partitions the input space during training and that it builds up the partition through time. Second, we show that the affine parameter of an RNN corresponds to an input-specific template, from which we can interpret an RNN as performing a simple template matching (matched filtering) given the input. Third, by closely examining the MASO RNN formula, we prove that injecting Gaussian noise in the initial hidden state in RNNs corresponds to an explicit L2 regularization on the affine parameters, which links to exploding gradient issues and improves generalization. Extensive experiments on several datasets of various modalities demonstrate and validate each of the above analyses. In particular, using initial hidden states elevates simple RNNs to state-of-the-art performance on these datasets.
Tasks	L2 Regularization
Published	2019-05-01
URL	https://openreview.net/forum?id=BJej72AqF7
PDF	https://openreview.net/pdf?id=BJej72AqF7
PWC	https://paperswithcode.com/paper/a-max-affine-spline-perspective-of-recurrent
Repo
Framework
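
To make the "piecewise affine" claim in the abstract above concrete, the following sketch checks numerically that one ReLU RNN step coincides with an affine map whose parameters are selected by the activation pattern of the current input. It is an illustration under assumptions, not the paper's MASO formulation: the helper names, the single-layer ReLU cell, and the toy weights are all made up for this example.

```python
def matvec(M, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_step(W, U, b, x, h_prev):
    """One ReLU RNN step: h = relu(W x + U h_prev + b). Returns (h, pre-activation)."""
    pre = [wx + uh + bi for wx, uh, bi in zip(matvec(W, x), matvec(U, h_prev), b)]
    return [max(0.0, p) for p in pre], pre


def local_affine_params(W, U, b, pre):
    """Affine parameters picked out by the sign pattern of `pre`.

    Rows of W, U and entries of b are kept where the unit is active and
    zeroed where it is not, so the step becomes A_x x + A_h h_prev + c.
    """
    mask = [1.0 if p > 0 else 0.0 for p in pre]
    A_x = [[m * w for w in row] for m, row in zip(mask, W)]
    A_h = [[m * u for u in row] for m, row in zip(mask, U)]
    c = [m * bi for m, bi in zip(mask, b)]
    return A_x, A_h, c


# Toy parameters (hypothetical values chosen for illustration).
W = [[1.0, -2.0], [0.5, 1.0]]
U = [[0.0, 1.0], [-1.0, 0.0]]
b = [0.1, -0.3]
x, h_prev = [1.0, 2.0], [0.5, 0.5]

h, pre = rnn_step(W, U, b, x, h_prev)
A_x, A_h, c = local_affine_params(W, U, b, pre)
h_affine = [p + q + r for p, q, r in zip(matvec(A_x, x), matvec(A_h, h_prev), c)]
print(h, h_affine)
```

The masked rows of `W` play the role of the input-specific template mentioned in the abstract: within the region of input space that shares this activation pattern, the cell's output is an inner product of the input with those rows plus an offset.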

Look Harder: A Neural Machine Translation Model with Hard Attention


Title	Look Harder: A Neural Machine Translation Model with Hard Attention
Authors	Sathish Reddy Indurthi, Insoo Chung, Sangha Kim
Abstract	Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. These models attend to all the words in the source sequence for each target token, which makes them ineffective for long sequence translation. In this work, we propose a hard-attention based NMT model which selects a subset of source tokens for each target token to effectively handle long sequence translation. Due to the discrete nature of the hard-attention mechanism, we design a reinforcement learning algorithm coupled with a reward shaping strategy to efficiently train it. Experimental results show that the proposed model performs better on long sequences and thereby achieves significant BLEU score improvement on English-German (EN-DE) and English-French (EN-FR) translation tasks compared to the soft-attention based NMT.
Tasks	Machine Translation
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1290/
PDF	https://www.aclweb.org/anthology/P19-1290
PWC	https://paperswithcode.com/paper/look-harder-a-neural-machine-translation
Repo
Framework

Neural Text Style Transfer via Denoising and Reranking


Title	Neural Text Style Transfer via Denoising and Reranking
Authors	Joseph Lee, Ziang Xie, Cindy Wang, Max Drach, Dan Jurafsky, Andrew Ng
Abstract	We introduce a simple method for text style transfer that frames style transfer as denoising: we synthesize a noisy corpus and treat the source style as a noisy version of the target style. To control for aspects such as preserving meaning while modifying style, we propose a reranking approach in the data synthesis phase. We evaluate our method on three novel style transfer tasks: transferring between British and American varieties, text genres (formal vs. casual), and lyrics from different musical genres. By measuring style transfer quality, meaning preservation, and the fluency of generated outputs, we demonstrate that our method is able to produce high-quality output while maintaining the flexibility to suggest syntactically rich stylistic edits.
Tasks	Denoising, Style Transfer, Text Style Transfer
Published	2019-06-01
URL	https://www.aclweb.org/anthology/W19-2309/
PDF	https://www.aclweb.org/anthology/W19-2309
PWC	https://paperswithcode.com/paper/neural-text-style-transfer-via-denoising-and
Repo
Framework

Convergence-Rate-Matching Discretization of Accelerated Optimization Flows Through Opportunistic State-Triggered Control


Title	Convergence-Rate-Matching Discretization of Accelerated Optimization Flows Through Opportunistic State-Triggered Control
Authors	Miguel Vaquero, Jorge Cortes
Abstract	A recent body of exciting work seeks to shed light on the behavior of accelerated methods in optimization via high-resolution differential equations. These differential equations are continuous counterparts of the discrete-time optimization algorithms, and their convergence properties can be characterized using the powerful tools provided by classical Lyapunov stability analysis. An outstanding question of pivotal importance is how to discretize these continuous flows while maintaining their convergence rates. This paper provides a novel approach through the idea of opportunistic state-triggered control. We take advantage of the Lyapunov functions employed to characterize the rate of convergence of high-resolution differential equations to design variable-stepsize forward-Euler discretizations that preserve the Lyapunov decay of the original dynamics. The philosophy of our approach is not limited to forward-Euler discretizations and may be combined with other integration schemes.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9170-convergence-rate-matching-discretization-of-accelerated-optimization-flows-through-opportunistic-state-triggered-control
PDF	http://papers.nips.cc/paper/9170-convergence-rate-matching-discretization-of-accelerated-optimization-flows-through-opportunistic-state-triggered-control.pdf
PWC	https://paperswithcode.com/paper/convergence-rate-matching-discretization-of
Repo
Framework
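
The abstract above describes variable-stepsize forward-Euler discretizations that preserve Lyapunov decay. The sketch below illustrates that general idea in a much simpler setting and should be read as a stand-in, not the paper's method: it uses plain gradient flow on a diagonal quadratic instead of a high-resolution accelerated ODE, takes V(x) = f(x) as the Lyapunov function, and shrinks the step by halving until a fixed fraction `sigma` of the continuous-time decay is retained. All names and constants are assumptions for the example.

```python
def grad(a, x):
    """Gradient of f(x) = 0.5 * sum(a_i * x_i^2)."""
    return [ai * xi for ai, xi in zip(a, x)]


def V(a, x):
    """Lyapunov function: here simply f(x), whose minimum value is 0."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def variable_step_euler(a, x0, sigma=0.5, h_max=10.0, steps=50):
    """Forward Euler on x' = -grad f(x) with a state-dependent step size.

    Along the continuous flow dV/dt = -|grad f|^2. Each step starts from
    h_max and halves h until the discrete decrease keeps at least a
    fraction sigma of that rate: V(x_new) <= V(x) - sigma * h * |grad f|^2.
    """
    x, history = list(x0), [V(a, x0)]
    for _ in range(steps):
        g = grad(a, x)
        g2 = sum(gi * gi for gi in g)
        if g2 == 0.0:
            break  # already at the minimizer
        h = h_max
        while True:
            x_new = [xi - h * gi for xi, gi in zip(x, g)]
            if V(a, x_new) <= V(a, x) - sigma * h * g2:
                break
            h *= 0.5
        x = x_new
        history.append(V(a, x))
    return x, history


x_final, history = variable_step_euler(a=[1.0, 10.0], x0=[1.0, 1.0])
print(history[0], history[-1])
```

The step size is recomputed from the current state at every iteration, so larger steps are taken whenever the decay condition allows them, rather than fixing one conservative step for the whole run.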

Integrated Steganography and Steganalysis with Generative Adversarial Networks


Title	Integrated Steganography and Steganalysis with Generative Adversarial Networks
Authors	Chong Yu
Abstract	Recently, generative adversarial networks have become a hotspot in research and industrial applications. Their most common use is data generation in computer vision. This paper extends their application to data hiding and security. We propose a novel framework that integrates the steganography and steganalysis processes, with generative adversarial networks as the core structure. The discriminative model simulates the steganalysis process, which can help us understand the sensitivity of cover images to semantic changes. The steganography generative model generates a stego image that is aligned with the original cover image and attempts to confuse the steganalysis discriminative model. The introduction of a cycle discriminative model and an inconsistent loss helps enhance the quality and security of the generated stego image during iterative training. The training dataset mixes intact images with intentionally attacked images, and this mixed training process further improves the robustness and security of the new framework. In qualitative and quantitative experiments and analysis, the framework shows compelling performance and advantages over current state-of-the-art methods on steganography and steganalysis benchmarks.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=r1Vx_oA5YQ
PDF	https://openreview.net/pdf?id=r1Vx_oA5YQ
PWC	https://paperswithcode.com/paper/integrated-steganography-and-steganalysis
Repo
Framework

PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation


Title	PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation
Authors	Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie
Abstract	This paper describes the models designed for the MEDIQA 2019 shared tasks by the team PANLP. We take advantage of the recent advances in pre-trained bidirectional transformer language models such as BERT (Devlin et al., 2018) and MT-DNN (Liu et al., 2019b). We find that pre-trained language models can significantly outperform traditional deep learning models. We also experiment with transfer learning from the NLI task to the RQE task, which proves useful in improving the results of fine-tuning MT-DNN large. A knowledge distillation process is implemented to distill the knowledge contained in a set of models and transfer it into a single model, whose performance turns out to be comparable with that of the ensemble of that set of models. Finally, for test submissions, model ensembling and a re-ranking process are implemented to boost performance. Our models participated in all three tasks and ranked 1st for the RQE task, 2nd for the NLI task, and 2nd for the QA task.
Tasks	Transfer Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5040/
PDF	https://www.aclweb.org/anthology/W19-5040
PWC	https://paperswithcode.com/paper/panlp-at-mediqa-2019-pre-trained-language
Repo
Framework
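
The abstract above mentions distilling a set of models into a single one. A common way to do this is to train the student against temperature-softened, averaged teacher probabilities; the sketch below shows that loss on toy logits. Whether PANLP used this exact formulation is not stated in the abstract, so the temperature, the averaging of teacher outputs, and every function name here are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature):
    """Average the softened class probabilities of several teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / len(probs) for k in range(n_classes)]


def distillation_loss(student_logits, soft_targets, temperature):
    """Cross-entropy between the soft targets and the softened student output."""
    q = softmax(student_logits, temperature)
    return -sum(t * math.log(qk) for t, qk in zip(soft_targets, q))


# Two hypothetical teachers and two candidate students on a 3-class example.
teachers = [[2.0, 0.0, -1.0], [1.0, 1.0, -2.0]]
targets = ensemble_soft_targets(teachers, temperature=2.0)
loss_close = distillation_loss([1.5, 0.5, -1.5], targets, temperature=2.0)
loss_far = distillation_loss([-1.5, 0.5, 1.5], targets, temperature=2.0)
print(targets, loss_close, loss_far)
```

A student whose predictions resemble the ensemble's gets the lower loss, which is the signal that lets a single model approach the ensemble's behavior.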

Deep Bayesian Convolutional Networks with Many Channels are Gaussian Processes


Title	Deep Bayesian Convolutional Networks with Many Channels are Gaussian Processes
Authors	Roman Novak, Lechao Xiao, Yasaman Bahri, Jaehoon Lee, Greg Yang, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein
Abstract	There is a previously identified equivalence between wide fully connected neural networks (FCNs) and Gaussian processes (GPs). This equivalence enables, for instance, test set predictions that would have resulted from a fully Bayesian, infinitely wide trained FCN to be computed without ever instantiating the FCN, but by instead evaluating the corresponding GP. In this work, we derive an analogous equivalence for multi-layer convolutional neural networks (CNNs) both with and without pooling layers, and achieve state of the art results on CIFAR10 for GPs without trainable kernels. We also introduce a Monte Carlo method to estimate the GP corresponding to a given neural network architecture, even in cases where the analytic form has too many terms to be computationally feasible. Surprisingly, in the absence of pooling layers, the GPs corresponding to CNNs with and without weight sharing are identical. As a consequence, translation equivariance in finite channel CNNs trained with stochastic gradient descent (SGD) has no corresponding property in the Bayesian treatment of the infinite channel limit – a qualitative difference between the two regimes that is not present in the FCN case. We confirm experimentally, that while in some scenarios the performance of SGD-trained finite CNNs approaches that of the corresponding GPs as the channel count increases, with careful tuning SGD-trained CNNs can significantly outperform their corresponding GPs.
Tasks	Gaussian Processes
Published	2019-05-01
URL	https://openreview.net/forum?id=B1g30j0qF7
PDF	https://openreview.net/pdf?id=B1g30j0qF7
PWC	https://paperswithcode.com/paper/deep-bayesian-convolutional-networks-with
Repo
Framework
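
The abstract above mentions a Monte Carlo method for estimating the GP that corresponds to a network architecture. The sketch below shows the idea in the simplest case that has a known closed form, and it is a deliberate simplification: a single fully connected ReLU layer with standard Gaussian weights rather than a CNN. The sampled estimate of E[relu(w·x) relu(w·x')] is compared against the analytic arc-cosine expression; the function names, sample count, and seed are assumptions for illustration.

```python
import math
import random


def relu(z):
    return z if z > 0.0 else 0.0


def mc_kernel(x1, x2, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[relu(w.x1) * relu(w.x2)] for w ~ N(0, I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = [rng.gauss(0.0, 1.0) for _ in x1]
        u = sum(wi * a for wi, a in zip(w, x1))
        v = sum(wi * b for wi, b in zip(w, x2))
        total += relu(u) * relu(v)
    return total / n_samples


def analytic_kernel(x1, x2):
    """Closed form of the same expectation (degree-1 arc-cosine kernel, halved)."""
    n1 = math.sqrt(sum(a * a for a in x1))
    n2 = math.sqrt(sum(b * b for b in x2))
    cos_t = sum(a * b for a, b in zip(x1, x2)) / (n1 * n2)
    theta = math.acos(max(-1.0, min(1.0, cos_t)))
    return n1 * n2 * (math.sin(theta) + (math.pi - theta) * math.cos(theta)) / (2.0 * math.pi)


x1, x2 = [1.0, 0.0], [0.6, 0.8]
estimate = mc_kernel(x1, x2)
exact = analytic_kernel(x1, x2)
print(estimate, exact)
```

For architectures where no closed form is tractable, only the sampling half of this comparison remains available, which is the situation the abstract's Monte Carlo method addresses.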