October 16, 2019

Paper Group NANR 9

Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew

Title Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew
Authors Adam Amram, Anat Ben David, Reut Tsarfaty
Abstract This paper empirically studies the effects of representation choices on neural sentiment analysis for Modern Hebrew, a morphologically rich language (MRL) for which no sentiment analyzer currently exists. We study two dimensions of representational choices: (i) the granularity of the input signal (token-based vs. morpheme-based), and (ii) the level of encoding of vocabulary items (string-based vs. character-based). We hypothesise that for MRLs, languages where multiple meaning-bearing elements may be carried by a single space-delimited token, these choices will have measurable effects on task performance, and that these effects may vary for different architectural designs: fully-connected, convolutional or recurrent. Specifically, we hypothesize that morpheme-based representations will have advantages in terms of their generalization capacity and task accuracy, due to their better OOV coverage. To empirically study these effects, we develop a new sentiment analysis benchmark for Hebrew, based on 12K social media comments, and provide two instances of these data: in token-based and morpheme-based settings. Our experiments show that the empirical effects of representation choices vary with architecture type. While fully-connected and convolutional networks slightly prefer token-based settings, RNNs benefit from a morpheme-based representation, in accord with the hypothesis that explicit morphological information may help generalize. Our endeavour also delivers the first state-of-the-art broad-coverage sentiment analyzer for Hebrew, with over 89% accuracy, alongside an established benchmark to further study the effects of linguistic representation choices on neural networks' task performance.
Tasks Sentiment Analysis, Text Classification
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1190/
PDF https://www.aclweb.org/anthology/C18-1190
PWC https://paperswithcode.com/paper/representations-and-architectures-in-neural
Repo
Framework

Improved Expressivity Through Dendritic Neural Networks

Title Improved Expressivity Through Dendritic Neural Networks
Authors Xundong Wu, Xiangwen Liu, Wei Li, Qing Wu
Abstract A typical biological neuron, such as a pyramidal neuron of the neocortex, receives thousands of afferent synaptic inputs on its dendrite tree and sends the efferent axonal output downstream. In typical artificial neural networks, dendrite trees are modeled as linear structures that funnel weighted synaptic inputs to the cell bodies. However, numerous experimental and theoretical studies have shown that dendritic arbors are far more than simple linear accumulators. That is, synaptic inputs can actively modulate their neighboring synaptic activities; therefore, the dendritic structures are highly nonlinear. In this study, we model such local nonlinearity of dendritic trees with our dendritic neural network (DENN) structure and apply this structure to typical machine learning tasks. Equipped with localized nonlinearities, DENNs can attain greater model expressivity than regular neural networks while maintaining efficient network inference. Such strength is evidenced by the increased fitting power when we train DENNs with supervised machine learning tasks. We also empirically show that the locality structure can improve the generalization performance of DENNs, as exemplified by DENNs outranking naive deep neural network architectures when tested on 121 classification tasks from the UCI machine learning repository.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/8029-improved-expressivity-through-dendritic-neural-networks
PDF http://papers.nips.cc/paper/8029-improved-expressivity-through-dendritic-neural-networks.pdf
PWC https://paperswithcode.com/paper/improved-expressivity-through-dendritic
Repo
Framework

MCapsNet: Capsule Network for Text with Multi-Task Learning

Title MCapsNet: Capsule Network for Text with Multi-Task Learning
Authors Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang, Yaohui Jin
Abstract Multi-task learning can share knowledge among related tasks and implicitly increase the training data. However, it has long been hampered by interference among tasks. This paper investigates the performance of capsule networks for text and proposes a capsule-based multi-task learning architecture that is unified, simple and effective. With the advantages of capsules for feature clustering, the proposed task routing algorithm can cluster the features for each task in the network, which helps reduce the interference among tasks. Experiments on six text classification datasets demonstrate the effectiveness of our models and their characteristics for feature clustering.
Tasks Multi-Task Learning, Text Classification
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1486/
PDF https://www.aclweb.org/anthology/D18-1486
PWC https://paperswithcode.com/paper/mcapsnet-capsule-network-for-text-with-multi
Repo
Framework

POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION

Title POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION
Authors Prannay Khosla, Preethi Jyothi, Vinay P. Namboodiri, Mukundhan Srinivasan
Abstract In this paper, we propose the generation of accented speech using generative adversarial networks. Through this work we make two main contributions: (a) the ability to condition latent representations while generating realistic speech samples, and (b) the ability to efficiently generate long speech samples by using a novel latent variable transformation module that is trained using policy gradients. Previous methods are limited to generating relatively short samples or are not very efficient at generating long samples. The generated speech samples are validated through several evaluation measures, namely a WGAN critic loss, subjective scores from user evaluations against competitive speech synthesis baselines, and a detailed ablation analysis of the proposed model. The evaluations demonstrate that the model efficiently generates realistic long speech samples conditioned on accent.
Tasks Speech Synthesis
Published 2018-01-01
URL https://openreview.net/forum?id=rJ6iJmWCW
PDF https://openreview.net/pdf?id=rJ6iJmWCW
PWC https://paperswithcode.com/paper/policy-driven-generative-adversarial-networks
Repo
Framework

Algebraic tests of general Gaussian latent tree models

Title Algebraic tests of general Gaussian latent tree models
Authors Dennis Leung, Mathias Drton
Abstract We consider general Gaussian latent tree models in which the observed variables are not restricted to be leaves of the tree. Extending related recent work, we give a full semi-algebraic description of the set of covariance matrices of any such model. In other words, we find polynomial constraints that characterize when a matrix is the covariance matrix of a distribution in a given latent tree model. However, leveraging these constraints to test a given such model is often complicated by the number of constraints being large and by singularities of individual polynomials, which may invalidate standard approximations to relevant probability distributions. Illustrating with the star tree, we propose a new testing methodology that circumvents singularity issues by trading off some statistical estimation efficiency and handles cases with many constraints through recent advances on Gaussian approximation for maxima of sums of high-dimensional random vectors. Our test avoids the need to maximize the possibly multimodal likelihood function of such models and is applicable to models with a larger number of variables. These points are illustrated in numerical experiments.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7867-algebraic-tests-of-general-gaussian-latent-tree-models
PDF http://papers.nips.cc/paper/7867-algebraic-tests-of-general-gaussian-latent-tree-models.pdf
PWC https://paperswithcode.com/paper/algebraic-tests-of-general-gaussian-latent
Repo
Framework

Proceedings of the Workshop on Figurative Language Processing

Title Proceedings of the Workshop on Figurative Language Processing
Authors
Abstract
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0900/
PDF https://www.aclweb.org/anthology/W18-0900
PWC https://paperswithcode.com/paper/proceedings-of-the-workshop-on-figurative
Repo
Framework
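
Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health

Title Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health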
Authors Julia Ive, George Gkotsis, Rina Dutta, Robert Stewart, Sumithra Velupillai
Abstract Mental health problems represent a major public health challenge. Automated analysis of text related to mental health aims to support medical decision-making and public health policies and to improve health care. Such analysis may involve text classification. Traditionally, automated classification has been performed mainly using machine learning methods involving costly feature engineering. Recently, the performance of those methods has been dramatically improved by neural methods. However, mostly convolutional neural networks (CNNs) have been explored so far. In this paper, we apply a hierarchical recurrent neural network (RNN) architecture with an attention mechanism to social media data related to mental health. We show that this architecture improves overall classification results compared to previously reported results on the same data. Benefitting from the attention mechanism, it can also efficiently select text elements crucial for classification decisions, which can also be used for in-depth analysis.
Tasks Decision Making, Feature Engineering, Text Classification
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0607/
PDF https://www.aclweb.org/anthology/W18-0607
PWC https://paperswithcode.com/paper/hierarchical-neural-model-with-attention
Repo
Framework

Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets

Title Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets
Authors Nathan Greenberg, Trapit Bansal, Patrick Verga, Andrew McCallum
Abstract Extracting typed entity mentions from text is a fundamental component of language understanding and reasoning. While there exist substantial labeled text datasets for multiple subsets of biomedical entity types (such as genes and proteins, or chemicals and diseases), it is rare to find large labeled datasets containing labels for all desired entity types together. This paper presents a method for training a single CRF extractor from multiple datasets with disjoint or partially overlapping sets of entity types. Our approach employs marginal likelihood training to insist on labels that are present in the data, while filling in "missing labels". This allows us to leverage all the available data within a single model. In experimental results on the Biocreative V CDR (chemicals/diseases), Biocreative VI ChemProt (chemicals/proteins) and MedMentions (19 entity types) datasets, we show that joint training on multiple datasets improves NER F1 over training in isolation, and our methods achieve state-of-the-art results.
Tasks Named Entity Recognition, Question Answering
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1306/
PDF https://www.aclweb.org/anthology/D18-1306
PWC https://paperswithcode.com/paper/marginal-likelihood-training-of-bilstm-crf
Repo
Framework
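
The marginal-likelihood idea in the abstract above can be sketched for a toy linear-chain CRF. Each token carries a set of allowed labels: a single label where the dataset annotates the token, and several candidates where a label may be "missing" because the dataset does not cover that entity type. The loss is the log-partition over all label paths minus the log-partition over paths consistent with the allowed sets. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the label indices and the score matrices are all hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def constrained_log_z(emissions, transitions, allowed):
    """Forward algorithm restricted to labels in allowed[t] at each step t."""
    n_labels = len(transitions)
    neg_inf = float("-inf")
    alpha = [emissions[0][y] if y in allowed[0] else neg_inf
             for y in range(n_labels)]
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][y]
            + logsumexp([alpha[p] + transitions[p][y] for p in range(n_labels)])
            if y in allowed[t] else neg_inf
            for y in range(n_labels)
        ]
    return logsumexp(alpha)

def marginal_nll(emissions, transitions, allowed):
    """Negative log of the total probability of all paths consistent with `allowed`."""
    everything = [set(range(len(transitions)))] * len(emissions)
    return (constrained_log_z(emissions, transitions, everything)
            - constrained_log_z(emissions, transitions, allowed))

# Hypothetical labels: 0 = O, 1 = Chemical, 2 = Disease.
# In a chemicals-only dataset, a token tagged Chemical allows {1}, while an
# untagged token could be O or an unannotated Disease, so it allows {0, 2}.
emissions = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
transitions = [[0.0] * 3 for _ in range(3)]
loss = marginal_nll(emissions, transitions, [{1}, {0, 2}])
```

With fully annotated data every allowed set has one element and the loss reduces to the usual CRF negative log-likelihood, which is why one model can be trained on several partially labeled datasets at once.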

QuaSE: Sequence Editing under Quantifiable Guidance

Title QuaSE: Sequence Editing under Quantifiable Guidance
Authors Yi Liao, Lidong Bing, Piji Li, Shuming Shi, Wai Lam, Tong Zhang
Abstract We propose the task of Quantifiable Sequence Editing (QuaSE): editing an input sequence to generate an output sequence that satisfies a given numerical outcome value measuring a certain property of the sequence, with the requirement of keeping the main content of the input sequence. For example, an input sequence could be a word sequence, such as a review sentence or an advertisement text. For a review sentence, the outcome could be the review rating; for an advertisement, the outcome could be the click-through rate. The major challenge in performing QuaSE is how to perceive the outcome-related wordings, and only edit them to change the outcome. In this paper, the proposed framework contains two latent factors, namely, outcome factor and content factor, disentangled from the input sentence to allow convenient editing to change the outcome and keep the content. Our framework explores the pseudo-parallel sentences by modeling their content similarity and outcome differences to enable a better disentanglement of the latent factors, which allows generating an output to better satisfy the desired outcome and keep the content. The dual reconstruction structure further enhances the capability of generating expected output by exploiting the couplings of latent factors of pseudo-parallel sentences. For evaluation, we prepared a dataset of Yelp review sentences with the ratings as outcome. Extensive experimental results are reported and discussed to elaborate the peculiarities of our framework.
Tasks Text Generation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1420/
PDF https://www.aclweb.org/anthology/D18-1420
PWC https://paperswithcode.com/paper/quase-sequence-editing-under-quantifiable
Repo
Framework

The MLLP-UPV German-English Machine Translation System for WMT18

Title The MLLP-UPV German-English Machine Translation System for WMT18
Authors Javier Iranzo-Sánchez, Pau Baquero-Arnal, Gonçal V. Garcés Díaz-Munío, Adrià Martínez-Villaronga, Jorge Civera, Alfons Juan
Abstract This paper describes the statistical machine translation system built by the MLLP research group of Universitat Politècnica de València for the German→English news translation shared task of the EMNLP 2018 Third Conference on Machine Translation (WMT18). We used an ensemble of Transformer architecture–based neural machine translation systems. To train our system under "constrained" conditions, we filtered the provided parallel data with a scoring technique using character-based language models, and we added parallel data based on synthetic source sentences generated from the provided monolingual corpora.
Tasks Data Augmentation, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6414/
PDF https://www.aclweb.org/anthology/W18-6414
PWC https://paperswithcode.com/paper/the-mllp-upv-german-english-machine
Repo
Framework

Comparing CRF and LSTM performance on the task of morphosyntactic tagging of non-standard varieties of South Slavic languages

Title Comparing CRF and LSTM performance on the task of morphosyntactic tagging of non-standard varieties of South Slavic languages
Authors Nikola Ljubešić
Abstract This paper presents two systems taking part in the Morphosyntactic Tagging of Tweets shared task on Slovene, Croatian and Serbian data, organized inside the VarDial Evaluation Campaign. While one system relies on the traditional method for sequence labeling (conditional random fields), the other relies on its neural alternative (bidirectional long short-term memory). We investigate the similarities and differences of these two approaches, showing that both methods yield very good and quite similar results, with the neural model outperforming the traditional one more as the level of non-standardness of the text increases. Through an error analysis we show that the neural system is better at long-range dependencies, while the traditional system excels and slightly outperforms the neural system at the local ones. We present in the paper new state-of-the-art results in morphosyntactic annotation of non-standard text for Slovene, Croatian and Serbian.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-3917/
PDF https://www.aclweb.org/anthology/W18-3917
PWC https://paperswithcode.com/paper/comparing-crf-and-lstm-performance-on-the
Repo
Framework

Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation

Title Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation
Authors Yong Zhang, Weiming Dong, Bao-Gang Hu, Qiang Ji
Abstract Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction. Recent works have introduced deep neural networks for AU intensity estimation, but they require a large amount of intensity annotations. AU annotation needs strong domain expertise and it is expensive to construct a large database to learn deep models. We propose a novel knowledge-based semi-supervised deep convolutional neural network for AU intensity estimation with extremely limited AU annotations. Only the intensity annotations of peak and valley frames in training sequences are needed. To provide additional supervision for model learning, we exploit naturally existing constraints on AUs, including relative appearance similarity, temporal intensity ordering, facial symmetry, and contrastive appearance difference. Experimental evaluations are performed on two public benchmark databases. With around 2% of intensity annotations in FERA 2015 and around 1% in DISFA for training, our method can achieve comparable or even better performance than the state-of-the-art methods which use 100% of intensity annotations in the training set.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Weakly-Supervised_Deep_Convolutional_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Weakly-Supervised_Deep_Convolutional_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-deep-convolutional-neural
Repo
Framework

Practical Application of Domain Dependent Confidence Measurement for Spoken Language Understanding Systems

Title Practical Application of Domain Dependent Confidence Measurement for Spoken Language Understanding Systems
Authors Mahnoosh Mehrabani, David Thomson, Benjamin Stern
Abstract Spoken Language Understanding (SLU), which extracts semantic information from speech, is not flawless, especially in practical applications. The reliability of the output of an SLU system can be evaluated using a semantic confidence measure. Confidence measures are a way to improve the quality of spoken dialogue systems by rejecting low-confidence SLU results. In this study we discuss real-world applications of confidence scoring in a customer service scenario. We build confidence models for three major types of dialogue states that are treated as different domains: how may I help you, number capture, and confirmation. Practical challenges in training domain-dependent confidence models, including data limitations, are discussed, and it is shown that feature engineering plays an important role in improving performance. We explore a wide variety of predictor features based on speech recognition, intent classification, and high-level domain knowledge, and find the combined feature set with the best rejection performance for each application.
Tasks Feature Engineering, Intent Classification, Machine Translation, Speech Recognition, Spoken Dialogue Systems, Spoken Language Understanding
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-3023/
PDF https://www.aclweb.org/anthology/N18-3023
PWC https://paperswithcode.com/paper/practical-application-of-domain-dependent
Repo
Framework

Dynamic Task Prioritization for Multitask Learning

Title Dynamic Task Prioritization for Multitask Learning
Authors Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, Li Fei-Fei
Abstract We propose dynamic task prioritization for multitask learning. This allows a model to dynamically prioritize difficult tasks during training, where difficulty is inversely proportional to performance, and where difficulty changes over time. In contrast to curriculum learning, where easy tasks are prioritized above difficult tasks, we present several studies showing the importance of prioritizing difficult tasks first. We observe that imbalances in task difficulty can lead to unnecessary emphasis on easier tasks, thus neglecting and slowing progress on difficult tasks. Motivated by this finding, we introduce a notion of dynamic task prioritization to automatically prioritize more difficult tasks by adaptively adjusting the mixing weight of each task’s loss objective. Additional ablation studies show the impact of the task hierarchy, or the task ordering, when explicitly encoded in the network architecture. Our method outperforms existing multitask methods and demonstrates competitive results with modern single-task models on the COCO and MPII datasets.
Tasks
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Michelle_Guo_Focus_on_the_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Michelle_Guo_Focus_on_the_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/dynamic-task-prioritization-for-multitask
Repo
Framework
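
The abstract above describes adjusting each task's loss weight so that tasks with lower current performance get higher priority. A minimal sketch of that idea follows, assuming a focal-loss-style weight computed from a per-task performance value in (0, 1]. The exact weighting form, the moving-average update, the function names and the task names are assumptions for illustration, not details confirmed by this listing.

```python
import math

def update_kpi(previous, current, alpha=0.9):
    """Exponential moving average of a task's performance metric in (0, 1]."""
    return alpha * previous + (1.0 - alpha) * current

def task_weight(kpi, gamma=1.0, eps=1e-8):
    """Assumed focal-style priority: large when performance is low, zero at kpi == 1."""
    kpi = min(max(kpi, eps), 1.0)
    return -((1.0 - kpi) ** gamma) * math.log(kpi)

def weighted_loss(losses, kpis, gamma=1.0):
    """Mix per-task losses using weights derived from current performance."""
    return sum(task_weight(kpis[name], gamma) * loss
               for name, loss in losses.items())

# Hypothetical two-task setup: the weaker task ("pose") receives the larger weight.
kpis = {"pose": 0.3, "detection": 0.9}
losses = {"pose": 1.2, "detection": 0.4}
total = weighted_loss(losses, kpis)
```

Because the weights are recomputed from a running performance estimate, the priority of a task falls as it improves, so difficulty can change over the course of training as the abstract describes.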

Self-Calibrating Polarising Radiometric Calibration

Title Self-Calibrating Polarising Radiometric Calibration
Authors Daniel Teo, Boxin Shi, Yinqiang Zheng, Sai-Kit Yeung
Abstract We present a self-calibrating polarising radiometric calibration method. From a set of images taken from a single viewpoint under different unknown polarising angles, we recover the inverse camera response function and the polarising angles relative to the first angle. The problem is solved in an integrated manner, recovering both of the unknowns simultaneously. The method exploits the fact that the intensity of polarised light should vary sinusoidally as the polarising filter is rotated, provided that the response is linear. It offers the first solution to demonstrate the possibility of radiometric calibration through polarisation. We evaluate the accuracy of our proposed method using synthetic data and real-world objects captured using different cameras. The self-calibrated results were found to be comparable with those from a multiple-exposure sequence.
Tasks Calibration
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Teo_Self-Calibrating_Polarising_Radiometric_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Teo_Self-Calibrating_Polarising_Radiometric_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/self-calibrating-polarising-radiometric
Repo
Framework
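
The physical fact the abstract above relies on, that intensity behind a rotating polarising filter varies sinusoidally when the camera response is linear, can be illustrated with a small least-squares fit. The sketch below is a simplification under stated assumptions: it takes the filter angles as known and the response as already linear, whereas the paper recovers the unknown angles and the inverse response jointly. The function names and the sample values are hypothetical.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_polarisation_sinusoid(angles, intensities):
    """Least-squares fit of I = a + b*cos(2*theta) + c*sin(2*theta), angles in radians.

    Returns (mean intensity, amplitude, angle of maximum transmission).
    """
    rows = [[1.0, math.cos(2 * t), math.sin(2 * t)] for t in angles]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, intensities)) for i in range(3)]
    a, b, c = solve3(ata, atb)
    return a, math.hypot(b, c), 0.5 * math.atan2(c, b)

# Synthetic linear-response measurements at four filter angles.
angles = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]
intensities = [2.0 + 0.5 * math.cos(2 * (t - 0.3)) for t in angles]
mean, amplitude, phase = fit_polarisation_sinusoid(angles, intensities)
```

If the measured values were passed through a nonlinear camera response first, the residual of this fit would no longer vanish, which is the kind of deviation a calibration method can use as a signal.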