Paper Group NANR 24
Reading Like HER: Human Reading Inspired Extractive Summarization
Title | Reading Like HER: Human Reading Inspired Extractive Summarization |
Authors | Ling Luo, Xiang Ao, Yan Song, Feiyang Pan, Min Yang, Qing He |
Abstract | In this work, we re-examine the problem of extractive text summarization for long documents. We observe that the way humans extract a summary can be divided into two stages: 1) a rough reading stage to look for sketched information, and 2) a subsequent careful reading stage to select key sentences to form the summary. By simulating such a two-stage process, we propose a novel approach for extractive summarization. We formulate the problem as a contextual-bandit problem and solve it with policy gradient. We adopt a convolutional neural network to encode the gist of paragraphs for rough reading, and a decision-making policy with an adapted termination mechanism for careful reading. Experiments on the CNN and DailyMail datasets show that our proposed method can provide high-quality summaries of varied length and significantly outperforms the state-of-the-art extractive methods in terms of ROUGE metrics. |
Tasks | Decision Making, Text Summarization |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1300/ |
https://www.aclweb.org/anthology/D19-1300 | |
PWC | https://paperswithcode.com/paper/reading-like-her-human-reading-inspired |
Repo | |
Framework | |
Improving Multi-label Emotion Classification by Integrating both General and Domain-specific Knowledge
Title | Improving Multi-label Emotion Classification by Integrating both General and Domain-specific Knowledge |
Authors | Wenhao Ying, Rong Xiang, Qin Lu |
Abstract | Deep learning based general language models have achieved state-of-the-art results in many popular tasks such as sentiment analysis and QA tasks. Text in domains like social media has its own salient characteristics. Domain knowledge should be helpful in domain-relevant tasks. In this work, we devise a simple method to obtain domain knowledge and further propose a method to integrate domain knowledge with general knowledge based on deep language models to improve the performance of emotion classification. Experiments on Twitter data show that even though a deep language model fine-tuned on target-domain data attains results comparable to those of previous state-of-the-art models, this fine-tuned model can still benefit from our extracted domain knowledge to obtain further improvement. This highlights the importance of making use of domain knowledge in domain-specific applications. |
Tasks | Emotion Classification, Language Modelling, Sentiment Analysis |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5541/ |
https://www.aclweb.org/anthology/D19-5541 | |
PWC | https://paperswithcode.com/paper/improving-multi-label-emotion-classification-1 |
Repo | |
Framework | |
Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions
Title | Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions |
Authors | Philipp Koehn, Francisco Guzmán, Vishrav Chaudhary, Juan Pino |
Abstract | Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low resource condition of Nepali-English and Sinhala-English. Eleven participants from companies, national research labs, and universities participated in this task. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5404/ |
https://www.aclweb.org/anthology/W19-5404 | |
PWC | https://paperswithcode.com/paper/findings-of-the-wmt-2019-shared-task-on-1 |
Repo | |
Framework | |
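The sub-selection step described in the abstract above can be sketched as follows. This is a minimal illustration only, assuming each sentence pair already carries a quality score. The function name `select_top_fraction`, the selection by number of pairs, and the toy corpus are all assumptions made for the example and are not taken from the shared task's tooling.

```python
import math


def select_top_fraction(scored_pairs, fraction):
    """Return the highest-scoring `fraction` of (score, source, target) tuples."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Keep at least one pair so tiny corpora do not yield an empty selection.
    keep = max(1, math.floor(len(scored_pairs) * fraction))
    ranked = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    return ranked[:keep]


# Hypothetical toy corpus of 50 scored pairs; the scores are made up.
corpus = [(i / 50, f"src {i}", f"tgt {i}") for i in range(50)]
top_2_percent = select_top_fraction(corpus, 0.02)
top_10_percent = select_top_fraction(corpus, 0.10)
```

Ranking once and slicing keeps the two subsets nested, so the 2% selection is always contained in the 10% selection.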
Learning Personalized Modular Network Guided by Structured Knowledge
Title | Learning Personalized Modular Network Guided by Structured Knowledge |
Authors | Xiaodan Liang |
Abstract | The dominant deep learning approaches use a “one-size-fits-all” paradigm with the hope that underlying characteristics of diverse inputs can be captured via a fixed structure. They also overlook the importance of explicitly modeling feature hierarchy. However, complex real-world tasks often require discovering diverse reasoning paths for different inputs to achieve satisfying predictions, especially for challenging large-scale recognition tasks with complex label relations. In this paper, we treat the structured commonsense knowledge (e.g. concept hierarchy) as the guidance of customizing more powerful and explainable network structures for distinct inputs, leading to dynamic and individualized inference paths. Given an off-the-shelf large network configuration, the proposed Personalized Modular Network (PMN) is learned by selectively activating a sequence of network modules where each of them is designated to recognize particular levels of structured knowledge. Learning semantic configurations and activation of modules to align well with structured knowledge can be regarded as a decision-making procedure, which is solved by a new graph-based reinforcement learning algorithm. Experiments on three semantic segmentation tasks and classification tasks show our PMN can achieve superior performance with a reduced number of network modules while discovering personalized and explainable module configurations for each input. |
Tasks | Decision Making, Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Liang_Learning_Personalized_Modular_Network_Guided_by_Structured_Knowledge_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Liang_Learning_Personalized_Modular_Network_Guided_by_Structured_Knowledge_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-personalized-modular-network-guided |
Repo | |
Framework | |
Typography With Decor: Intelligent Text Style Transfer
Title | Typography With Decor: Intelligent Text Style Transfer |
Authors | Wenjing Wang, Jiaying Liu, Shuai Yang, Zongming Guo |
Abstract | Text effects transfer can dramatically make the text visually pleasing. In this paper, we present a novel framework to stylize the text with exquisite decor, which is ignored by the previous text stylization methods. Decorative elements pose a challenge to spontaneously handle basal text effects and decor, which are two different styles. To address this issue, our key idea is to learn to separate, transfer and recombine the decors and the basal text effect. A novel text effect transfer network is proposed to infer the styled version of the target text. The stylized text is finally embellished with decor where the placement of the decor is carefully determined by a novel structure-aware strategy. Furthermore, we propose a domain adaptation strategy for decor detection and a one-shot training strategy for text effects transfer, which greatly enhance the robustness of our network to new styles. We base our experiments on our collected typography dataset including 59,000 professionally styled text and demonstrate the superiority of our method over other state-of-the-art style transfer methods. |
Tasks | Domain Adaptation, Style Transfer, Text Effects Transfer, Text Style Transfer |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Typography_With_Decor_Intelligent_Text_Style_Transfer_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Typography_With_Decor_Intelligent_Text_Style_Transfer_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/typography-with-decor-intelligent-text-style |
Repo | |
Framework | |
Historical Text Normalization with Delayed Rewards
Title | Historical Text Normalization with Delayed Rewards |
Authors | Simon Flachs, Marcel Bollmann, Anders Søgaard |
Abstract | Training neural sequence-to-sequence models with simple token-level log-likelihood is now a standard approach to historical text normalization, albeit often outperformed by phrase-based models. Policy gradient training enables direct optimization for exact matches, and while the small datasets in historical text normalization are prohibitive of from-scratch reinforcement learning, we show that policy gradient fine-tuning leads to significant improvements across the board. Policy gradient training, in particular, leads to more accurate normalizations for long or unseen words. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1157/ |
https://www.aclweb.org/anthology/P19-1157 | |
PWC | https://paperswithcode.com/paper/historical-text-normalization-with-delayed |
Repo | |
Framework | |
Unsupervised Data Augmentation for Less-Resourced Languages with no Standardized Spelling
Title | Unsupervised Data Augmentation for Less-Resourced Languages with no Standardized Spelling |
Authors | Alice Millour, Karën Fort |
Abstract | Building representative linguistic resources and NLP tools for non-standardized languages is challenging: when spelling is not determined by a norm, multiple written forms can be encountered for a given word, inducing a large proportion of out-of-vocabulary (OOV) words. To embrace this diversity, we propose a methodology based on crowdsourced alternative spellings, which we use to extract rules applied to match OOV words with one of their spelling variants. This virtuous process enables the unsupervised augmentation of multi-variant lexicons without expert rule definition. We apply this multilingual methodology on Alsatian, a French regional language, and provide an intrinsic evaluation of the correctness of the variant pairs, and an extrinsic evaluation on a downstream task. We show that in a low-resource scenario, 145 initial pairs can lead to the generation of 876 additional variant pairs, and a diminution of OOV words improving the part-of-speech tagging performance by 1 to 4%. |
Tasks | Data Augmentation, Part-Of-Speech Tagging |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1090/ |
https://www.aclweb.org/anthology/R19-1090 | |
PWC | https://paperswithcode.com/paper/unsupervised-data-augmentation-for-less |
Repo | |
Framework | |
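The idea of extracting rules from known variant pairs and applying them to OOV words can be illustrated with a small sketch. The abstract does not describe how the rules are actually derived, so everything here is an assumption: the single-substitution rule format, the function names `extract_rule` and `match_oov`, and the English toy data used in place of Alsatian.

```python
def extract_rule(form_a, form_b):
    """Return the differing (old, new) substrings of two spelling variants."""
    shortest = min(len(form_a), len(form_b))
    prefix = 0
    while prefix < shortest and form_a[prefix] == form_b[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < shortest - prefix
           and form_a[len(form_a) - 1 - suffix] == form_b[len(form_b) - 1 - suffix]):
        suffix += 1
    return form_a[prefix:len(form_a) - suffix], form_b[prefix:len(form_b) - suffix]


def match_oov(word, rules, lexicon):
    """Rewrite `word` with each rule and return the in-lexicon variants found."""
    matches = set()
    for old, new in rules:
        if not old:
            continue  # skip pure insertions to keep the sketch simple
        start = word.find(old)
        while start != -1:
            candidate = word[:start] + new + word[start + len(old):]
            if candidate in lexicon:
                matches.add(candidate)
            start = word.find(old, start + 1)
    return matches


# Hypothetical seed pairs and lexicon, for illustration only.
seed_pairs = [("realise", "realize"), ("colour", "color")]
rules = [extract_rule(a, b) for a, b in seed_pairs]
lexicon = {"organize", "honor"}
```

With these toy inputs, `match_oov("organise", rules, lexicon)` finds `organize`, and each newly matched pair could in turn be added to the seed pairs to grow the lexicon.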
Regret Bounds for Learning State Representations in Reinforcement Learning
Title | Regret Bounds for Learning State Representations in Reinforcement Learning |
Authors | Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit, Odalric-Ambrym Maillard |
Abstract | We consider the problem of online reinforcement learning when several state representations (mapping histories to a discrete state space) are available to the learning agent. At least one of these representations is assumed to induce a Markov decision process (MDP), and the performance of the agent is measured in terms of cumulative regret against the optimal policy giving the highest average reward in this MDP representation. We propose an algorithm (UCB-MS) with O(sqrt(T)) regret in any communicating Markov decision process. The regret bound shows that UCB-MS automatically adapts to the Markov model. This improves over the currently known best results in the literature that gave regret bounds of order O(T^(2/3)). |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9435-regret-bounds-for-learning-state-representations-in-reinforcement-learning |
http://papers.nips.cc/paper/9435-regret-bounds-for-learning-state-representations-in-reinforcement-learning.pdf | |
PWC | https://paperswithcode.com/paper/regret-bounds-for-learning-state |
Repo | |
Framework | |
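To get a feel for the gap between the two rates quoted in the abstract above, they can be compared directly. The following is a short worked comparison that ignores constants and logarithmic factors, which the abstract does not state; the horizon value is an arbitrary example.

```latex
\[
\frac{T^{2/3}}{T^{1/2}} = T^{2/3 - 1/2} = T^{1/6},
\qquad \text{e.g. } T = 10^{6} \;\Rightarrow\; T^{1/6} = 10 .
\]
```

Under these simplifications, the ratio between the two bounds grows as the sixth root of the horizon, so at a horizon of one million steps the earlier bound is ten times larger.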
Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes
Title | Which Way Are You Going? Imitative Decision Learning for Path Forecasting in Dynamic Scenes |
Authors | Yuke Li |
Abstract | Path forecasting is a pivotal step toward understanding dynamic scenes and an emerging topic in the computer vision field. This task is challenging due to the multimodal nature of the future, namely, given a partial history, there is more than one plausible prediction. Yet, the state-of-the-art methods seem not fully responsive to this innate variability. Hence, how to better foresee the forthcoming trajectory in dynamic scenes has to be more thoroughly pursued. To this end, we propose a novel Imitative Decision Learning (IDL) approach. It delves deeper into the key that inherently characterizes the multimodality: the latent decision. The proposed IDL first infers the distribution of such latent decisions by learning from moving histories. A policy is then generated by taking the sampled latent decision into account to predict the future. A different plausible upcoming path corresponds to each sampled latent decision. This approach significantly differs from the mainstream literature that relies on a predefined latent variable to extrapolate diverse predictions. In order to augment the understanding of the latent decision and resultant multimodal future, we investigate their connection through mutual information optimization. Moreover, the proposed IDL integrates spatial and temporal dependencies into one single framework, in contrast to handling them with two-step settings. As a result, our approach enables simultaneous anticipation of the paths of all pedestrians in the scene. We assess our proposal on the large-scale SAP, ETH and UCY datasets. The experiments show that IDL introduces considerable margin improvements with respect to recent leading studies. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Li_Which_Way_Are_You_Going_Imitative_Decision_Learning_for_Path_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Li_Which_Way_Are_You_Going_Imitative_Decision_Learning_for_Path_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/which-way-are-you-going-imitative-decision |
Repo | |
Framework | |
Anytime Minibatch: Exploiting Stragglers in Online Distributed Optimization
Title | Anytime Minibatch: Exploiting Stragglers in Online Distributed Optimization |
Authors | Nuwan Ferdinand, Haider Al-Lawati, Stark Draper, Matthew Nokleby |
Abstract | Distributed optimization is vital in solving large-scale machine learning problems. A widely-shared feature of distributed optimization techniques is the requirement that all nodes complete their assigned tasks in each computational epoch before the system can proceed to the next epoch. In such settings, slow nodes, called stragglers, can greatly slow progress. To mitigate the impact of stragglers, we propose an online distributed optimization method called Anytime Minibatch. In this approach, all nodes are given a fixed time to compute the gradients of as many data samples as possible. The result is a variable per-node minibatch size. Workers then get a fixed communication time to average their minibatch gradients via several rounds of consensus, which are then used to update primal variables via dual averaging. Anytime Minibatch prevents stragglers from holding up the system without wasting the work that stragglers can complete. We present a convergence analysis and analyze the wall time performance. Our numerical results show that our approach is up to 1.5 times faster in Amazon EC2 and up to five times faster when there is greater variability in compute node performance. |
Tasks | Distributed Optimization |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rkzDIiA5YQ |
https://openreview.net/pdf?id=rkzDIiA5YQ | |
PWC | https://paperswithcode.com/paper/anytime-minibatch-exploiting-stragglers-in |
Repo | |
Framework | |
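The fixed-time, variable-batch idea in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' method: exact averaging stands in for the consensus rounds, a plain gradient step stands in for dual averaging, and the per-node `speeds`, the quadratic loss, and the function names are hypothetical.

```python
import itertools


def node_gradient_sum(w, samples):
    """Sum of per-sample gradients of 0.5 * (w - x)**2 over this node's samples."""
    return sum(w - x for x in samples)


def anytime_minibatch_step(w, stream, speeds, time_budget, step_size):
    """One epoch: every node processes as many samples as its speed allows."""
    total_gradient, total_samples = 0.0, 0
    for speed in speeds:
        # Samples a node finishes within the fixed compute time (hypothetical model).
        batch_size = int(speed * time_budget)
        batch = [next(stream) for _ in range(batch_size)]
        total_gradient += node_gradient_sum(w, batch)
        total_samples += batch_size
    if total_samples == 0:
        return w, 0
    # Exact averaging here replaces the consensus rounds; this is an assumption.
    return w - step_size * total_gradient / total_samples, total_samples


# Toy data stream and node speeds (samples per unit time); the second node is the straggler.
stream = itertools.cycle([1.0, 2.0, 3.0])
speeds = [4, 1, 2]
w, used = anytime_minibatch_step(0.0, stream, speeds, time_budget=1.5, step_size=1.0)
```

Dividing the summed gradient by the total sample count weights each node by the work it actually finished, so the straggler's single sample still contributes instead of being discarded.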
Adversarial Removal of Demographic Attributes Revisited
Title | Adversarial Removal of Demographic Attributes Revisited |
Authors | Maria Barrett, Yova Kementchedjhieva, Yanai Elazar, Desmond Elliott, Anders Søgaard |
Abstract | Elazar and Goldberg (2018) showed that protected attributes can be extracted from the representations of a debiased neural network for mention detection at above-chance levels, by evaluating a diagnostic classifier on a held-out subsample of the data it was trained on. We revisit their experiments and conduct a series of follow-up experiments showing that, in fact, the diagnostic classifier generalizes poorly to both new in-domain samples and new domains, indicating that it relies on correlations specific to their particular data sample. We further show that a diagnostic classifier trained on the biased baseline neural network also does not generalize to new samples. In other words, the biases detected in Elazar and Goldberg (2018) seem restricted to their particular data sample, and would therefore not bias the decisions of the model on new samples, whether in-domain or out-of-domain. In light of this, we discuss better methodologies for detecting bias in our models. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1662/ |
https://www.aclweb.org/anthology/D19-1662 | |
PWC | https://paperswithcode.com/paper/adversarial-removal-of-demographic-attributes-2 |
Repo | |
Framework | |
Analysis and classification of heart diseases using heartbeat features and machine learning algorithms
Title | Analysis and classification of heart diseases using heartbeat features and machine learning algorithms |
Authors | Fajr Ibrahem Alarsan, Mamoon Younes |
Abstract | This study proposes an electrocardiogram (ECG) classification approach using machine learning based on several ECG features. An ECG is a signal that measures the electrical activity of the heart. The proposed approach is implemented using MLlib and the Scala language on the Apache Spark framework; MLlib is Apache Spark’s scalable machine learning library. The key challenge in ECG classification is handling the irregularities in ECG signals, which is very important for detecting the patient's status. Therefore, we propose an efficient approach to classify ECG signals with high accuracy. Each heartbeat is a combination of action impulse waveforms produced by different specialized cardiac tissues. Heartbeat classification faces some difficulties because these waveforms differ from one person to another; they are described by a set of features, which are the inputs of the machine learning algorithm. In general, using Spark–Scala tools simplifies the usage of many algorithms such as machine-learning (ML) algorithms. On the other hand, Spark–Scala is preferred over other tools when the size of the data to be processed is very large. In our case, we used a dataset with 205,146 records to evaluate the performance of our approach. Machine learning libraries in Spark–Scala provide easy ways to implement many classification algorithms (Decision Tree, Random Forests, Gradient-Boosted Trees (GDB), etc.). The proposed method is evaluated and validated on the baseline MIT-BIH Arrhythmia and MIT-BIH Supraventricular Arrhythmia databases. The results show that our approach achieved an overall accuracy of 96.75% using the GDB Tree algorithm and 97.98% using Random Forest for binary classification. For multi-class classification, it achieved 98.03% accuracy using Random Forest; the Gradient Boosting tree supports only binary classification. |
Tasks | ECG Classification, Electrocardiography (ECG), Heartbeat Classification |
Published | 2019-08-31 |
URL | https://doi.org/10.1186/s40537-019-0244-x |
https://journalofbigdata.springeropen.com/track/pdf/10.1186/s40537-019-0244-x | |
PWC | https://paperswithcode.com/paper/analysis-and-classification-of-heart-diseases |
Repo | |
Framework | |
From Research to Production and Back: Ludicrously Fast Neural Machine Translation
Title | From Research to Production and Back: Ludicrously Fast Neural Machine Translation |
Authors | Young Jin Kim, Marcin Junczys-Dowmunt, Hany Hassan, Alham Fikri Aji, Kenneth Heafield, Roman Grundkiewicz, Nikolay Bogoychev |
Abstract | This paper describes the submissions of the “Marian” team to the WNGT 2019 efficiency shared task. Taking our dominating submissions to the previous edition of the shared task as a starting point, we develop improved teacher-student training via multi-agent dual-learning and noisy backward-forward translation for Transformer-based student models. For efficient CPU-based decoding, we propose pre-packed 8-bit matrix products, improved batched decoding, cache-friendly student architectures with parameter sharing and light-weight RNN-based decoder architectures. GPU-based decoding benefits from the same architecture changes, from pervasive 16-bit inference and concurrent streams. These modifications together with profiler-based C++ code optimization allow us to push the Pareto frontier established during the 2018 edition towards 24x (CPU) and 14x (GPU) faster models at comparable or higher BLEU values. Our fastest CPU model is more than 4x faster than last year's fastest submission at more than 3 points higher BLEU. Our fastest GPU model at 1.5 seconds translation time is slightly faster than last year's fastest RNN-based submissions, but outperforms them by more than 4 BLEU and 10 BLEU points respectively. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5632/ |
https://www.aclweb.org/anthology/D19-5632 | |
PWC | https://paperswithcode.com/paper/from-research-to-production-and-back |
Repo | |
Framework | |
Transformer and seq2seq model for Paraphrase Generation
Title | Transformer and seq2seq model for Paraphrase Generation |
Authors | Elozino Egonmwan, Yllias Chali |
Abstract | Paraphrase generation aims to improve the clarity of a sentence by using different wording that conveys similar meaning. For better quality of generated paraphrases, we propose a framework that combines the effectiveness of two models: transformer and sequence-to-sequence (seq2seq). We design a two-layer stack of encoders. The first layer is a transformer model containing 6 stacked identical layers with multi-head self attention, while the second layer is a seq2seq model with gated recurrent units (GRU-RNN). The transformer encoder layer learns to capture long-term dependencies, together with syntactic and semantic properties of the input sentence. This rich vector representation learned by the transformer serves as input to the GRU-RNN encoder responsible for producing the state vector for decoding. Experimental results on two datasets, QUORA and MSCOCO, show that our framework produces a new benchmark for paraphrase generation. |
Tasks | Paraphrase Generation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5627/ |
https://www.aclweb.org/anthology/D19-5627 | |
PWC | https://paperswithcode.com/paper/transformer-and-seq2seq-model-for-paraphrase |
Repo | |
Framework | |
Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection
Title | Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection |
Authors | Hang Xu, Chenhan Jiang, Xiaodan Liang, Liang Lin, Zhenguo Li |
Abstract | In this paper, we address the large-scale object detection problem with thousands of categories, which poses severe challenges due to long-tail data distributions, heavy occlusions, and class ambiguities. However, the dominant object detection paradigm is limited by treating each object region separately without considering crucial semantic dependencies among objects. In this work, we introduce a novel Reasoning-RCNN to endow any detection network with the capability of adaptive global reasoning over all object regions by exploiting diverse human commonsense knowledge. Instead of only propagating the visual features on the image directly, we evolve the high-level semantic representations of all categories globally to avoid distracted or poor visual features in the image. Specifically, built on feature representations of a basic detection network, the proposed network first generates a global semantic pool by collecting the weights of the previous classification layer for each category, and then adaptively enhances each object's features by attending to different semantic contexts in the global semantic pool. Rather than propagating information from all semantic information that may be noisy, our adaptive global reasoning automatically discovers the most relevant categories for feature evolving. Our Reasoning-RCNN is light-weight and flexible enough to enhance any detection backbone network, and extensible for integrating any knowledge resources. Solid experiments on object detection benchmarks show the superiority of our Reasoning-RCNN, e.g. achieving around 16% improvement on VisualGenome, 37% on ADE in terms of mAP and 15% improvement on COCO. |
Tasks | Object Detection |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_Reasoning-RCNN_Unifying_Adaptive_Global_Reasoning_Into_Large-Scale_Object_Detection_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Xu_Reasoning-RCNN_Unifying_Adaptive_Global_Reasoning_Into_Large-Scale_Object_Detection_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/reasoning-rcnn-unifying-adaptive-global |
Repo | |
Framework | |