October 15, 2019

2543 words 12 mins read

Paper Group NANR 100

HitNet: Hybrid Ternary Recurrent Neural Network. Transition-based Neural RST Parsing with Implicit Syntax Features. Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution. Latent Entities Extraction: How to Extract Entities that Do Not Appear in the Text?. Adaptive Multi-pass Decoder for Neural Machine Translation. …

HitNet: Hybrid Ternary Recurrent Neural Network

Title HitNet: Hybrid Ternary Recurrent Neural Network
Authors Peiqi Wang, Xinfeng Xie, Lei Deng, Guoqi Li, Dongsheng Wang, Yuan Xie
Abstract Quantization is a promising technique to reduce the model size, memory footprint, and massive computation of recurrent neural networks (RNNs) for embedded devices with limited resources. Although extreme low-bit quantization has achieved impressive success on convolutional neural networks, it still suffers from huge accuracy degradation on RNNs at the same low-bit precision. In this paper, we first investigate the accuracy degradation of RNN models under different quantization schemes and the distribution of tensor values in the full-precision model. Our observation reveals that, due to the difference between the distributions of weights and activations, different quantization methods are suitable for different parts of the model. Based on this observation, we propose HitNet, a hybrid ternary recurrent neural network, which bridges the accuracy gap between the full-precision model and the quantized model. In HitNet, we develop a hybrid quantization method to quantize weights and activations. Moreover, we introduce a sloping factor, motivated by prior work on Boltzmann machines, into the activation functions, further closing the accuracy gap. Overall, HitNet quantizes RNN models into ternary values, {-1, 0, 1}, significantly outperforming state-of-the-art quantization methods on RNN models. We test it on typical RNN models such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). For example, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Treebank (PTB) corpus from 126 (the state-of-the-art result, to the best of our knowledge) to 110.3 against a full-precision baseline of 97.2, and a ternary GRU from 142 to 113.5 against a full-precision baseline of 102.7. (A hedged sketch of threshold-style ternarization follows this entry.)
Tasks Quantization
Published 2018-12-01
URL http://papers.nips.cc/paper/7341-hitnet-hybrid-ternary-recurrent-neural-network
PDF http://papers.nips.cc/paper/7341-hitnet-hybrid-ternary-recurrent-neural-network.pdf
PWC https://paperswithcode.com/paper/hitnet-hybrid-ternary-recurrent-neural
Repo
Framework
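The hybrid scheme itself is specific to the paper, but the core ternarization step is easy to illustrate. Below is a minimal sketch of threshold-based ternarization in the style of ternary weight networks, not HitNet's exact hybrid method; the 0.7 scale factor and the tensor shape are illustrative assumptions.

```python
import torch

def ternarize(w: torch.Tensor, delta_scale: float = 0.7) -> torch.Tensor:
    """Map a float tensor to {-1, 0, +1} with a magnitude threshold.

    delta_scale * mean(|w|) is the common TWN-style threshold; HitNet's
    hybrid scheme treats weights and activations differently, which this
    sketch does not reproduce.
    """
    delta = delta_scale * w.abs().mean()
    return torch.sign(w) * (w.abs() > delta).float()

# Example: ternarize an LSTM-sized weight matrix (shape is hypothetical).
w = torch.randn(1024, 256)
wq = ternarize(w)
print(sorted(wq.unique().tolist()))  # [-1.0, 0.0, 1.0]
```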

Transition-based Neural RST Parsing with Implicit Syntax Features

Title Transition-based Neural RST Parsing with Implicit Syntax Features
Authors Nan Yu, Meishan Zhang, Guohong Fu
Abstract Syntax has been a useful source of information for statistical RST discourse parsing. In the neural setting, a common approach integrates syntax via a recursive neural network (RNN), which requires discrete output trees produced by a supervised syntax parser. In this paper, we propose an implicit syntax feature extraction approach that uses hidden-layer vectors extracted from a neural syntax parser. In addition, we propose a simple transition-based model as the baseline, further enhancing it with a dynamic oracle. Experiments on the standard dataset show that our baseline model with the dynamic oracle is highly competitive. When implicit syntax features are integrated, we obtain further improvements, better than using an explicit Tree-RNN. (A minimal sketch of the implicit-feature idea follows this entry.)
Tasks Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1047/
PDF https://www.aclweb.org/anthology/C18-1047
PWC https://paperswithcode.com/paper/transition-based-neural-rst-parsing-with
Repo
Framework
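The key idea, reusing a syntax parser's hidden vectors instead of its output tree, can be sketched with a stand-in encoder. The BiLSTM below is a toy substitute for a pretrained neural syntax parser; all dimensions are assumptions, and the real system would feed these vectors into the transition-based RST parser.

```python
import torch
import torch.nn as nn

class ImplicitSyntaxEncoder(nn.Module):
    """Toy stand-in for a pretrained neural syntax parser: instead of its
    discrete output tree, reuse its hidden-layer vectors as features.
    The BiLSTM and all sizes here are illustrative assumptions."""

    def __init__(self, vocab_size=10_000, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)

    def forward(self, token_ids):                 # (batch, seq)
        states, _ = self.bilstm(self.embed(token_ids))
        return states                             # (batch, seq, 2*hidden)

encoder = ImplicitSyntaxEncoder()
tokens = torch.randint(0, 10_000, (2, 12))        # fake batch of 12-token spans
syntax_feats = encoder(tokens)                    # would feed the RST parser
print(syntax_feats.shape)                         # torch.Size([2, 12, 400])
```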

Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution

Title Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution
Authors Longquan Dai, Liang Tang, Yuan Xie, Jinhui Tang
Abstract High-dimensional convolution is widely used across disciplines but suffers from a serious performance problem due to its high computational complexity. Over the decades, fast algorithms for Gaussian convolution were designed by hand. Recently, requirements for various non-Gaussian convolutions have emerged and keep growing, yet the handmade acceleration approach is no longer feasible for so many different convolutions, since it is a time-consuming and painstaking job. Instead, we propose an Acceleration Network (AccNet), which turns the work of designing new fast algorithms into training AccNet. This is done by (1) interpreting the splatting, blurring, and slicing operations as convolutions, and (2) turning these convolutions into $g$CP layers to build AccNet. After training, the activation function $g$ together with the AccNet weights automatically defines the new splatting, blurring, and slicing operations. Experiments demonstrate that AccNet is able to design acceleration algorithms for a wide range of convolutions, including Gaussian and non-Gaussian convolutions, and produces state-of-the-art results. (A sketch of the classical splat-blur-slice pipeline follows this entry.)
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7420-designing-by-training-acceleration-neural-network-for-fast-high-dimensional-convolution
PDF http://papers.nips.cc/paper/7420-designing-by-training-acceleration-neural-network-for-fast-high-dimensional-convolution.pdf
PWC https://paperswithcode.com/paper/designing-by-training-acceleration-neural
Repo
Framework
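AccNet learns the splatting, blurring, and slicing operations; the classical handmade pipeline it generalizes is simple enough to sketch. The following is a 1-D NumPy illustration of splat-blur-slice Gaussian filtering, not the paper's $g$CP-layer construction; grid spacing and kernel width are arbitrary choices.

```python
import numpy as np

def splat_blur_slice(x, values, cell=0.5, sigma=1.0):
    """Approximate Gaussian filtering of scattered 1-D samples by
    splatting onto a coarse grid, blurring the grid, and slicing back.
    This is the classical pipeline the paper reinterprets as convolutions;
    the grid spacing and kernel width here are illustrative choices."""
    lo, hi = x.min(), x.max()
    bins = np.arange(lo, hi + cell, cell)
    idx = np.clip(np.searchsorted(bins, x) - 1, 0, len(bins) - 1)

    # Splat: accumulate values (and weights) into grid cells.
    num = np.bincount(idx, weights=values, minlength=len(bins))
    den = np.bincount(idx, minlength=len(bins)).astype(float)

    # Blur: convolve the grid with a small Gaussian kernel.
    # (Normalization cancels because num and den share the kernel.)
    k = np.arange(-3, 4) * cell
    kern = np.exp(-0.5 * (k / sigma) ** 2)
    num = np.convolve(num, kern, mode="same")
    den = np.convolve(den, kern, mode="same")

    # Slice: read the blurred grid back at the sample positions.
    return num[idx] / np.maximum(den[idx], 1e-12)

x = np.sort(np.random.rand(200) * 10)
noisy = np.sin(x) + 0.3 * np.random.randn(200)
smooth = splat_blur_slice(x, noisy)
```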

Latent Entities Extraction: How to Extract Entities that Do Not Appear in the Text?

Title Latent Entities Extraction: How to Extract Entities that Do Not Appear in the Text?
Authors Eylon Shoshan, Kira Radinsky
Abstract Named-entity recognition (NER) is an important task in NLP and is widely used to solve many challenges. However, in many scenarios not all of the entities are explicitly mentioned in the text; sometimes they can be inferred from the context or from other indicative words. Consider the following sentence: "CMA can easily hydrolyze into free acetic acid." Although water is not mentioned explicitly, one can infer that H2O is an entity involved in the process. In this work, we present the problem of Latent Entities Extraction (LEE). We present several methods for determining whether entities are discussed in a text even though, potentially, they are not explicitly written. Specifically, we design a neural model that handles the extraction of multiple entities jointly. We show that our model, along with a multi-task learning approach and a novel task-grouping algorithm, reaches high performance in identifying latent entities. Our experiments are conducted on a large dataset from the biochemical field, containing text descriptions of biological processes in which all of the involved entities are labeled, including implicitly mentioned ones. We believe LEE is a task that will significantly improve many NER and subsequent applications and improve text understanding and inference. (A minimal multi-label sketch follows this entry.)
Tasks Multi-Task Learning, Named Entity Recognition, Question Answering
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-1020/
PDF https://www.aclweb.org/anthology/K18-1020
PWC https://paperswithcode.com/paper/latent-entities-extraction-how-to-extract
Repo
Framework
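Stripped of the multi-task learning and task-grouping machinery, latent entity extraction reduces to multi-label prediction over a fixed entity inventory. The sketch below shows that reduction only; the encoder, the mean pooling, and all sizes are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class LatentEntityTagger(nn.Module):
    """Minimal sketch of joint latent-entity prediction: encode the text
    and emit one sigmoid score per entity in a fixed inventory. The paper
    adds multi-task learning with a task-grouping algorithm, omitted here."""

    def __init__(self, vocab_size=20_000, emb_dim=128, hidden=256,
                 num_entities=500):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.scorer = nn.Linear(2 * hidden, num_entities)

    def forward(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))
        doc = states.mean(dim=1)                 # crude pooling over the text
        return torch.sigmoid(self.scorer(doc))  # P(entity involved), per entity

model = LatentEntityTagger()
probs = model(torch.randint(0, 20_000, (4, 50)))  # fake batch of 50-token texts
print(probs.shape)  # torch.Size([4, 500])
```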

Adaptive Multi-pass Decoder for Neural Machine Translation

Title Adaptive Multi-pass Decoder for Neural Machine Translation
Authors Xinwei Geng, Xiaocheng Feng, Bing Qin, Ting Liu
Abstract Although end-to-end neural machine translation (NMT) has achieved remarkable progress in recent years, the idea of adopting a multi-pass decoding mechanism in conventional NMT is not well explored. In this paper, we propose a novel architecture called the adaptive multi-pass decoder, which introduces a flexible multi-pass polishing mechanism to extend the capacity of NMT via reinforcement learning. More specifically, we adopt an extra policy network to automatically choose a suitable and effective number of decoding passes according to the complexity of the source sentence and the quality of the generated translation. Extensive experiments on Chinese-English translation demonstrate the effectiveness of the proposed adaptive multi-pass decoder over conventional NMT, with a significant improvement of about 1.55 BLEU. (A sketch of the pass-selection control logic follows this entry.)
Tasks Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1048/
PDF https://www.aclweb.org/anthology/D18-1048
PWC https://paperswithcode.com/paper/adaptive-multi-pass-decoder-for-neural
Repo
Framework
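The distinctive piece is the policy network that picks how many decoding passes to run. The sketch below shows only that control logic, with a stub in place of the real NMT decoders; the feature summary, the maximum of four passes, and decode_pass are all hypothetical placeholders.

```python
import torch
import torch.nn as nn

class PassPolicy(nn.Module):
    """Hedged sketch of the control logic only: a policy scores how many
    decoding passes to run from a feature summary of the source sentence
    and current draft. In RL training, sampled pass counts would be
    rewarded by translation quality; that loop is omitted here."""

    def __init__(self, feat_dim=512, max_passes=4):
        super().__init__()
        self.score = nn.Linear(feat_dim, max_passes)

    def forward(self, feats):                     # (batch, feat_dim)
        return torch.distributions.Categorical(logits=self.score(feats))

def decode_pass(draft, src_feats):
    """Stub: a real system runs a full NMT decoder that polishes the draft."""
    return draft + 0.1 * src_feats

policy = PassPolicy()
src_feats = torch.randn(8, 512)
n_passes = policy(src_feats).sample() + 1        # 1..max_passes per sentence
draft = torch.zeros_like(src_feats)
for _ in range(int(n_passes.max())):             # run up to the sampled maximum
    draft = decode_pass(draft, src_feats)
```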

Fine-Grained Video Captioning for Sports Narrative

Title Fine-Grained Video Captioning for Sports Narrative
Authors Huanyu Yu, Shuo Cheng, Bingbing Ni, Minsi Wang, Jian Zhang, Xiaokang Yang
Abstract Despite the recent emergence of video captioning methods, how to generate fine-grained video descriptions (i.e., long and detailed commentary about individual movements of multiple subjects as well as their frequent interactions) is far from solved, even though it has important applications such as automatic sports narrative. To this end, this work makes the following contributions. First, to facilitate this novel line of research in fine-grained video captioning, we collect a new dataset, the Fine-grained Sports Narrative dataset (FSN), containing 2K sports videos with ground-truth narratives from YouTube.com. Second, we develop a new performance evaluation metric named Fine-grained Captioning Evaluation (FCE). An extension of the widely used METEOR, it measures not only linguistic quality but also whether the action details and their temporal order are correctly described. Third, we propose a new framework for the fine-grained sports narrative task. The network features three branches: (1) a spatio-temporal entity localization and role-discovering sub-network; (2) a fine-grained action modeling sub-network for local skeleton motion description; and (3) a group relationship modeling sub-network to model interactions between players. We further fuse the features and decode them into long narratives with a hierarchically recurrent structure. Extensive experiments on the FSN dataset demonstrate the validity of the proposed framework for fine-grained video captioning. (A skeleton of the fusion-and-decode structure follows this entry.)
Tasks Video Captioning
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Yu_Fine-Grained_Video_Captioning_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Yu_Fine-Grained_Video_Captioning_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/fine-grained-video-captioning-for-sports
Repo
Framework
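The three-branch fuse-and-decode layout can be shown as a skeleton. The module below wires three placeholder branch encoders into a two-level (sentence-then-word) recurrent decoder; every dimension, the linear branch stand-ins, and the fixed narrative length are assumptions, not the paper's actual sub-networks.

```python
import torch
import torch.nn as nn

class ThreeBranchCaptioner(nn.Module):
    """Skeleton only: three feature branches (localization/role, skeleton
    motion, group interaction) fused and decoded by a two-level recurrent
    structure. All sizes and the branch encoders are placeholders."""

    def __init__(self, feat=256, vocab=8_000):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(512, feat) for _ in range(3)])
        self.sent_rnn = nn.GRU(3 * feat, feat, batch_first=True)  # sentence level
        self.word_rnn = nn.GRU(feat, feat, batch_first=True)      # word level
        self.vocab_out = nn.Linear(feat, vocab)

    def forward(self, branch_inputs, n_sents=3, n_words=10):
        fused = torch.cat([b(x) for b, x in zip(self.branches, branch_inputs)],
                          dim=-1)                          # (batch, 3*feat)
        sent_states, _ = self.sent_rnn(
            fused.unsqueeze(1).repeat(1, n_sents, 1))      # one state per sentence
        words, _ = self.word_rnn(
            sent_states.repeat_interleave(n_words, dim=1)) # words within sentences
        return self.vocab_out(words)                       # narrative logits

model = ThreeBranchCaptioner()
inputs = [torch.randn(2, 512) for _ in range(3)]           # fake branch features
print(model(inputs).shape)                                 # torch.Size([2, 30, 8000])
```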

Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN

Title Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN
Authors Liyuan Zhang, Huamin Yang, Zhengang Jiang
Abstract Background: Imbalanced data classification is an inevitable problem in medical intelligent diagnosis. Most real-world biomedical datasets come with limited samples and high-dimensional features, which seriously affects the classification performance of a model and can misguide the diagnosis of diseases. Exploring an effective classification method for imbalanced and limited biomedical datasets is a challenging task. Methods: In this paper, we propose a novel multilayer extreme learning machine (ELM) classification model combined with a dynamic generative adversarial network (GAN) to tackle limited and imbalanced biomedical data. First, principal component analysis is utilized to remove irrelevant and redundant features while extracting more meaningful pathological features. After that, a dynamic GAN is designed to generate realistic-looking minority-class samples, thereby balancing the class distribution and effectively avoiding overfitting. Finally, a self-adaptive multilayer ELM is proposed to classify the balanced dataset. An analytic expression for the numbers of hidden layers and nodes is determined by quantitatively establishing the relationship between changes in the imbalance ratio and the hyper-parameters of the model; reducing interactive parameter adjustment makes the classification model more robust. Results: To evaluate the classification performance of the proposed method, numerical experiments are conducted on four real-world biomedical datasets. The proposed method can generate authentic minority-class samples and self-adaptively select the optimal parameters of the learning model. Comparisons with the W-ELM, SMOTE-ELM, and H-ELM methods demonstrate quantitatively that our method achieves better classification performance and higher computational efficiency in terms of ROC, AUC, G-mean, and F-measure. Conclusions: Our study provides an effective solution for imbalanced biomedical data classification under the condition of limited samples and high-dimensional features. The proposed method offers a theoretical basis for computer-aided diagnosis and has the potential to be applied in biomedical clinical practice. (A minimal single-layer ELM sketch follows this entry.)
Tasks
Published 2018-12-17
URL https://link.springer.com/article/10.1186/s12938-018-0604-3
PDF https://link.springer.com/content/pdf/10.1186%2Fs12938-018-0604-3.pdf
PWC https://paperswithcode.com/paper/imbalanced-biomedical-data-classification
Repo
Framework
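The ELM building block is standard and compact: random hidden weights plus closed-form output weights. The sketch below is a plain single-layer ELM on synthetic data, omitting the paper's PCA step, dynamic-GAN oversampling, and self-adaptive multilayer structure.

```python
import numpy as np

def elm_train(X, y, n_hidden=200, seed=0):
    """Minimal single-layer ELM: random hidden weights, closed-form output
    weights via least squares. The paper's self-adaptive multilayer ELM
    (plus PCA and dynamic-GAN oversampling) is much more involved."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights, one solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Synthetic sanity check on a linearly separable toy problem.
X = np.random.randn(300, 20)
y = (X[:, 0] + X[:, 1] > 0).astype(float)
W, b, beta = elm_train(X, y)
acc = ((elm_predict(X, W, b, beta) > 0.5) == y).mean()
```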

Cross-lingual Decompositional Semantic Parsing

Title Cross-lingual Decompositional Semantic Parsing
Authors Sheng Zhang, Xutai Ma, Rachel Rudinger, Kevin Duh, Benjamin Van Durme
Abstract We introduce the task of cross-lingual decompositional semantic parsing: mapping content provided in a source language into a decompositional semantic analysis based on a target language. We present: (1) a form of decompositional semantic analysis designed to allow systems to target varying levels of structural complexity (shallow to deep analysis), (2) an evaluation metric to measure the similarity between system output and reference semantic analysis, (3) an end-to-end model with a novel annotating mechanism that supports intra-sentential coreference, and (4) an evaluation dataset on which our model outperforms strong baselines by at least 1.75 F1.
Tasks Semantic Parsing
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1194/
PDF https://www.aclweb.org/anthology/D18-1194
PWC https://paperswithcode.com/paper/cross-lingual-decompositional-semantic
Repo
Framework

Language Production Dynamics with Recurrent Neural Networks

Title Language Production Dynamics with Recurrent Neural Networks
Authors Jesús Calvillo, Matthew Crocker
Abstract We present an analysis of the internal mechanism of the recurrent neural model of sentence production presented by Calvillo et al. (2016). The results show clear patterns of computation related to each layer in the network, allowing us to infer an algorithmic account: the semantics activates the semantically related words; each word generated at each time step then activates syntactic and semantic constraints on possible continuations; and the recurrence preserves information through time. We propose that such insights could generalize to other models with similar architectures, including some used in computational linguistics for language modeling, machine translation, and image caption generation.
Tasks Language Modelling, Machine Translation
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2803/
PDF https://www.aclweb.org/anthology/W18-2803
PWC https://paperswithcode.com/paper/language-production-dynamics-with-recurrent
Repo
Framework

Implicational Universals in Stochastic Constraint-Based Phonology

Title Implicational Universals in Stochastic Constraint-Based Phonology
Authors Giorgio Magri
Abstract This paper focuses on the most basic implicational universals in phonological theory, called T-orders after Anttila and Andrus (2006). It shows that the T-orders predicted by stochastic (and partial order) Optimality Theory coincide with those predicted by categorical OT. Analogously, the T-orders predicted by stochastic Harmonic Grammar coincide with those predicted by categorical HG. In other words, these stochastic constraint-based frameworks do not tamper with the typological structure induced by the original categorical frameworks.
Tasks
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1364/
PDF https://www.aclweb.org/anthology/D18-1364
PWC https://paperswithcode.com/paper/implicational-universals-in-stochastic
Repo
Framework

Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data

Title Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data
Authors Minyoung Kim
Abstract The Cox process is a flexible event model that can account for uncertainty in the intensity function of the Poisson process. However, previous approaches make strong time-stationarity assumptions, potentially failing to generalize when the data do not conform to the assumed stationarity conditions. In this paper we take the two most popular Cox models, representing two extremes, and propose a novel semi-stationary Cox process model that benefits from both. Our model has a set of Gaussian process latent functions governed by a latent stationary Markov process, and we provide analytic derivations for the variational inference. Empirical evaluations on several synthetic and real-world event datasets, including football shot attempts and daily earthquakes, demonstrate that the proposed model is promising and can yield improved generalization performance over existing approaches. (A thinning-based simulation sketch follows this entry.)
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2047
PDF http://proceedings.mlr.press/v80/kim18a/kim18a.pdf
PWC https://paperswithcode.com/paper/markov-modulated-gaussian-cox-processes-for
Repo
Framework
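A Cox process is an inhomogeneous Poisson process with a random intensity, which is straightforward to simulate by thinning. The sketch below draws a random sinusoidal intensity as a stand-in for the paper's Markov-modulated Gaussian process intensities and then applies standard Lewis-style thinning.

```python
import numpy as np

def sample_cox(T=10.0, lam_max=5.0, seed=0):
    """Sketch of Cox-process simulation by thinning: draw a random
    intensity function, then thin a homogeneous Poisson process with it.
    The random sinusoidal intensity is an illustrative stand-in for the
    paper's Markov-modulated Gaussian-process intensities."""
    rng = np.random.default_rng(seed)
    a, phi = rng.uniform(0.5, 1.0), rng.uniform(0, 2 * np.pi)
    lam = lambda t: lam_max * (0.5 + 0.5 * a * np.sin(t + phi))  # <= lam_max

    n = rng.poisson(lam_max * T)                   # candidates at rate lam_max
    cand = np.sort(rng.uniform(0, T, n))
    keep = rng.uniform(0, lam_max, n) < lam(cand)  # accept w.p. lam(t)/lam_max
    return cand[keep]

events = sample_cox()
```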

CheckYourMeal!: diet management with NLG

Title CheckYourMeal!: diet management with NLG
Authors Luca Anselma, Simone Donetti, Alessandro Mazzei, Andrea Pirone
Abstract
Tasks Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6710/
PDF https://www.aclweb.org/anthology/W18-6710
PWC https://paperswithcode.com/paper/checkyourmeal-diet-management-with-nlg
Repo
Framework

Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)

Title Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)
Authors
Abstract
Tasks Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6700/
PDF https://www.aclweb.org/anthology/W18-6700
PWC https://paperswithcode.com/paper/proceedings-of-the-workshop-on-intelligent
Repo
Framework

Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy

Title Crowdsourcing-based Annotation of the Accounting Registers of the Italian Comedy
Authors Adeline Granet, Benjamin Hervy, Geoffrey Roman-Jimenez, Marouane Hachicha, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Guillaume Raschia, Françoise Rubellin, Christian Viard-Gaudin
Abstract
Tasks Information Retrieval
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1069/
PDF https://www.aclweb.org/anthology/L18-1069
PWC https://paperswithcode.com/paper/crowdsourcing-based-annotation-of-the
Repo
Framework

Forms of Anaphoric Reference to Organisational Named Entities: Hoping to widen appeal, they diversified

Title Forms of Anaphoric Reference to Organisational Named Entities: Hoping to widen appeal, they diversified
Authors Christian Hardmeier, Luca Bevacqua, Sharid Loáiciga, Hannah Rohde
Abstract Proper names of organisations are a special case of collective nouns. Their meaning can be conceptualised as a collective unit or as a plurality of persons, allowing for different morphological marking of coreferent anaphoric pronouns. This paper explores the variability of references to organisation names with (1) a corpus analysis and (2) two crowd-sourced story-continuation experiments. The first shows that the preference for singular vs. plural conceptualisation depends on the level of formality of a text. In the second, we observe a strong preference for the plural "they" otherwise typical of informal speech; using edited corpus data instead of constructed sentences as stimuli reduces this preference.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2406/
PDF https://www.aclweb.org/anthology/W18-2406
PWC https://paperswithcode.com/paper/forms-of-anaphoric-reference-to
Repo
Framework