Paper Group NANR 9
Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew. Improved Expressivity Through Dendritic Neural Networks. MCapsNet: Capsule Network for Text with Multi-Task Learning. POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION. Algebraic tests of gen …
Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew
Title | Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew |
Authors | Adam Amram, Anat Ben David, Reut Tsarfaty |
Abstract | This paper empirically studies the effects of representation choices on neural sentiment analysis for Modern Hebrew, a morphologically rich language (MRL) for which no sentiment analyzer currently exists. We study two dimensions of representational choices: (i) the granularity of the input signal (token-based vs. morpheme-based), and (ii) the level of encoding of vocabulary items (string-based vs. character-based). We hypothesise that for MRLs, languages where multiple meaning-bearing elements may be carried by a single space-delimited token, these choices will have measurable effects on task performance, and that these effects may vary for different architectural designs: fully-connected, convolutional or recurrent. Specifically, we hypothesise that morpheme-based representations will have advantages in terms of their generalization capacity and task accuracy, due to their better OOV coverage. To empirically study these effects, we develop a new sentiment analysis benchmark for Hebrew, based on 12K social media comments, and provide two instances of these data: in token-based and morpheme-based settings. Our experiments show that the empirical effects of representation choices vary with architecture type. While fully-connected and convolutional networks slightly prefer token-based settings, RNNs benefit from a morpheme-based representation, in accord with the hypothesis that explicit morphological information may help generalize. Our endeavour also delivers the first state-of-the-art broad-coverage sentiment analyzer for Hebrew, with over 89% accuracy, alongside an established benchmark to further study the effects of linguistic representation choices on neural networks' task performance. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1190/ |
PDF | https://www.aclweb.org/anthology/C18-1190 |
PWC | https://paperswithcode.com/paper/representations-and-architectures-in-neural |
Repo | |
Framework | |
Improved Expressivity Through Dendritic Neural Networks
Title | Improved Expressivity Through Dendritic Neural Networks |
Authors | Xundong Wu, Xiangwen Liu, Wei Li, Qing Wu |
Abstract | A typical biological neuron, such as a pyramidal neuron of the neocortex, receives thousands of afferent synaptic inputs on its dendrite tree and sends the efferent axonal output downstream. In typical artificial neural networks, dendrite trees are modeled as linear structures that funnel weighted synaptic inputs to the cell bodies. However, numerous experimental and theoretical studies have shown that dendritic arbors are far more than simple linear accumulators. That is, synaptic inputs can actively modulate their neighboring synaptic activities; therefore, the dendritic structures are highly nonlinear. In this study, we model such local nonlinearity of dendritic trees with our dendritic neural network (DENN) structure and apply this structure to typical machine learning tasks. Equipped with localized nonlinearities, DENNs can attain greater model expressivity than regular neural networks while maintaining efficient network inference. Such strength is evidenced by the increased fitting power when we train DENNs with supervised machine learning tasks. We also empirically show that the locality structure can improve the generalization performance of DENNs, as exemplified by DENNs outranking naive deep neural network architectures when tested on 121 classification tasks from the UCI machine learning repository. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8029-improved-expressivity-through-dendritic-neural-networks |
PDF | http://papers.nips.cc/paper/8029-improved-expressivity-through-dendritic-neural-networks.pdf |
PWC | https://paperswithcode.com/paper/improved-expressivity-through-dendritic |
Repo | |
Framework | |
MCapsNet: Capsule Network for Text with Multi-Task Learning
Title | MCapsNet: Capsule Network for Text with Multi-Task Learning |
Authors | Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang, Yaohui Jin |
Abstract | Multi-task learning can share knowledge among related tasks and implicitly increase the training data. However, it has long been frustrated by interference among tasks. This paper investigates the performance of capsule networks for text, and proposes a capsule-based multi-task learning architecture, which is unified, simple and effective. Building on the advantages of capsules for feature clustering, the proposed task routing algorithm can cluster the features for each task in the network, which helps reduce the interference among tasks. Experiments on six text classification datasets demonstrate the effectiveness of our models and their characteristics for feature clustering. |
Tasks | Multi-Task Learning, Text Classification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1486/ |
PDF | https://www.aclweb.org/anthology/D18-1486 |
PWC | https://paperswithcode.com/paper/mcapsnet-capsule-network-for-text-with-multi |
Repo | |
Framework | |
POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION
Title | POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION |
Authors | Prannay Khosla, Preethi Jyothi, Vinay P. Namboodiri, Mukundhan Srinivasan |
Abstract | In this paper, we propose the generation of accented speech using generative adversarial networks. Through this work we make two main contributions: (a) the ability to condition latent representations while generating realistic speech samples, and (b) the ability to efficiently generate long speech samples by using a novel latent variable transformation module that is trained using policy gradients. Previous methods are limited in being able to generate only relatively short samples or are not very efficient at generating long samples. The generated speech samples are validated through a number of evaluation measures, viz. a WGAN critic loss, subjective scores on user evaluations against competitive speech synthesis baselines, and a detailed ablation analysis of the proposed model. The evaluations demonstrate that the model efficiently generates realistic long speech samples conditioned on accent. |
Tasks | Speech Synthesis |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJ6iJmWCW |
PDF | https://openreview.net/pdf?id=rJ6iJmWCW |
PWC | https://paperswithcode.com/paper/policy-driven-generative-adversarial-networks |
Repo | |
Framework | |
Algebraic tests of general Gaussian latent tree models
Title | Algebraic tests of general Gaussian latent tree models |
Authors | Dennis Leung, Mathias Drton |
Abstract | We consider general Gaussian latent tree models in which the observed variables are not restricted to be leaves of the tree. Extending related recent work, we give a full semi-algebraic description of the set of covariance matrices of any such model. In other words, we find polynomial constraints that characterize when a matrix is the covariance matrix of a distribution in a given latent tree model. However, leveraging these constraints to test a given such model is often complicated by the number of constraints being large and by singularities of individual polynomials, which may invalidate standard approximations to relevant probability distributions. Illustrating with the star tree, we propose a new testing methodology that circumvents singularity issues by trading off some statistical estimation efficiency and handles cases with many constraints through recent advances on Gaussian approximation for maxima of sums of high-dimensional random vectors. Our test avoids the need to maximize the possibly multimodal likelihood function of such models and is applicable to models with larger number of variables. These points are illustrated in numerical experiments. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7867-algebraic-tests-of-general-gaussian-latent-tree-models |
PDF | http://papers.nips.cc/paper/7867-algebraic-tests-of-general-gaussian-latent-tree-models.pdf |
PWC | https://paperswithcode.com/paper/algebraic-tests-of-general-gaussian-latent |
Repo | |
Framework | |
Proceedings of the Workshop on Figurative Language Processing
Title | Proceedings of the Workshop on Figurative Language Processing |
Authors | |
Abstract | |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0900/ |
PDF | https://www.aclweb.org/anthology/W18-0900 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-workshop-on-figurative |
Repo | |
Framework | |
Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health
Title | Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health |
Authors | Julia Ive, George Gkotsis, Rina Dutta, Robert Stewart, Sumithra Velupillai |
Abstract | Mental health problems represent a major public health challenge. Automated analysis of text related to mental health is aimed to help medical decision-making, public health policies and to improve health care. Such analysis may involve text classification. Traditionally, automated classification has been performed mainly using machine learning methods involving costly feature engineering. Recently, the performance of those methods has been dramatically improved by neural methods. However, mainly Convolutional neural networks (CNNs) have been explored. In this paper, we apply a hierarchical Recurrent neural network (RNN) architecture with an attention mechanism on social media data related to mental health. We show that this architecture improves overall classification results as compared to previously reported results on the same data. Benefitting from the attention mechanism, it can also efficiently select text elements crucial for classification decisions, which can also be used for in-depth analysis. |
Tasks | Decision Making, Feature Engineering, Text Classification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0607/ |
PDF | https://www.aclweb.org/anthology/W18-0607 |
PWC | https://paperswithcode.com/paper/hierarchical-neural-model-with-attention |
Repo | |
Framework | |
Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets
Title | Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets |
Authors | Nathan Greenberg, Trapit Bansal, Patrick Verga, Andrew McCallum |
Abstract | Extracting typed entity mentions from text is a fundamental component to language understanding and reasoning. While there exist substantial labeled text datasets for multiple subsets of biomedical entity types (such as genes and proteins, or chemicals and diseases), it is rare to find large labeled datasets containing labels for all desired entity types together. This paper presents a method for training a single CRF extractor from multiple datasets with disjoint or partially overlapping sets of entity types. Our approach employs marginal likelihood training to insist on labels that are present in the data, while filling in "missing labels". This allows us to leverage all the available data within a single model. In experimental results on the Biocreative V CDR (chemicals/diseases), Biocreative VI ChemProt (chemicals/proteins) and MedMentions (19 entity types) datasets, we show that joint training on multiple datasets improves NER F1 over training in isolation, and our methods achieve state-of-the-art results. |
Tasks | Named Entity Recognition, Question Answering |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1306/ |
PDF | https://www.aclweb.org/anthology/D18-1306 |
PWC | https://paperswithcode.com/paper/marginal-likelihood-training-of-bilstm-crf |
Repo | |
Framework | |
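The marginal likelihood objective described in the abstract above can be illustrated with a small linear-chain CRF sketch. This is not the authors' implementation: the function names, the dense score lists, and the way the per-token `allowed` label sets are built are all assumptions made for illustration. The idea shown is only that a token with an observed label gets a singleton set, a token whose label is "missing" gets a larger set, and the objective is the log-probability mass of all label sequences consistent with those sets.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf entries."""
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))


def constrained_log_z(emissions, transitions, allowed):
    """Forward algorithm restricted to the labels in allowed[t] at each position t.

    emissions[t][y]   : score of label y at position t (hypothetical inputs)
    transitions[p][y] : score of moving from label p to label y
    allowed[t]        : set of label indices permitted at position t
    """
    n_labels = len(transitions)
    neg_inf = float("-inf")
    alpha = [emissions[0][y] if y in allowed[0] else neg_inf
             for y in range(n_labels)]
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][y]
            + logsumexp([alpha[p] + transitions[p][y] for p in range(n_labels)])
            if y in allowed[t] else neg_inf
            for y in range(n_labels)
        ]
    return logsumexp(alpha)


def marginal_log_likelihood(emissions, transitions, allowed):
    """log P(label sequence lies in the allowed sets) under the CRF."""
    everything = [set(range(len(transitions)))] * len(emissions)
    return (constrained_log_z(emissions, transitions, allowed)
            - constrained_log_z(emissions, transitions, everything))
```

When every `allowed[t]` is a singleton, this reduces to the usual fully supervised CRF log-likelihood; when every set contains all labels, the value is 0 and the sentence contributes no training signal.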
QuaSE: Sequence Editing under Quantifiable Guidance
Title | QuaSE: Sequence Editing under Quantifiable Guidance |
Authors | Yi Liao, Lidong Bing, Piji Li, Shuming Shi, Wai Lam, Tong Zhang |
Abstract | We propose the task of Quantifiable Sequence Editing (QuaSE): editing an input sequence to generate an output sequence that satisfies a given numerical outcome value measuring a certain property of the sequence, with the requirement of keeping the main content of the input sequence. For example, an input sequence could be a word sequence, such as a review sentence or an advertisement text. For a review sentence, the outcome could be the review rating; for an advertisement, the outcome could be the click-through rate. The major challenge in performing QuaSE is how to perceive the outcome-related wordings, and only edit them to change the outcome. In this paper, the proposed framework contains two latent factors, namely, outcome factor and content factor, disentangled from the input sentence to allow convenient editing to change the outcome and keep the content. Our framework explores the pseudo-parallel sentences by modeling their content similarity and outcome differences to enable a better disentanglement of the latent factors, which allows generating an output to better satisfy the desired outcome and keep the content. The dual reconstruction structure further enhances the capability of generating expected output by exploiting the couplings of latent factors of pseudo-parallel sentences. For evaluation, we prepared a dataset of Yelp review sentences with the ratings as outcome. Extensive experimental results are reported and discussed to elaborate the peculiarities of our framework. |
Tasks | Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1420/ |
PDF | https://www.aclweb.org/anthology/D18-1420 |
PWC | https://paperswithcode.com/paper/quase-sequence-editing-under-quantifiable |
Repo | |
Framework | |
The MLLP-UPV German-English Machine Translation System for WMT18
Title | The MLLP-UPV German-English Machine Translation System for WMT18 |
Authors | Javier Iranzo-Sánchez, Pau Baquero-Arnal, Gonçal V. Garcés Díaz-Munío, Adrià Martínez-Villaronga, Jorge Civera, Alfons Juan |
Abstract | This paper describes the statistical machine translation system built by the MLLP research group of Universitat Politècnica de València for the German→English news translation shared task of the EMNLP 2018 Third Conference on Machine Translation (WMT18). We used an ensemble of Transformer architecture-based neural machine translation systems. To train our system under "constrained" conditions, we filtered the provided parallel data with a scoring technique using character-based language models, and we added parallel data based on synthetic source sentences generated from the provided monolingual corpora. |
Tasks | Data Augmentation, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6414/ |
PDF | https://www.aclweb.org/anthology/W18-6414 |
PWC | https://paperswithcode.com/paper/the-mllp-upv-german-english-machine |
Repo | |
Framework | |
Comparing CRF and LSTM performance on the task of morphosyntactic tagging of non-standard varieties of South Slavic languages
Title | Comparing CRF and LSTM performance on the task of morphosyntactic tagging of non-standard varieties of South Slavic languages |
Authors | Nikola Ljubešić |
Abstract | This paper presents two systems taking part in the Morphosyntactic Tagging of Tweets shared task on Slovene, Croatian and Serbian data, organized inside the VarDial Evaluation Campaign. While one system relies on the traditional method for sequence labeling (conditional random fields), the other relies on its neural alternative (bidirectional long short-term memory). We investigate the similarities and differences of these two approaches, showing that both methods yield very good and quite similar results, with the neural model outperforming the traditional one more as the level of non-standardness of the text increases. Through an error analysis we show that the neural system is better at long-range dependencies, while the traditional system excels and slightly outperforms the neural system at the local ones. We present in the paper new state-of-the-art results in morphosyntactic annotation of non-standard text for Slovene, Croatian and Serbian. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3917/ |
PDF | https://www.aclweb.org/anthology/W18-3917 |
PWC | https://paperswithcode.com/paper/comparing-crf-and-lstm-performance-on-the |
Repo | |
Framework | |
Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation
Title | Weakly-Supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation |
Authors | Yong Zhang, Weiming Dong, Bao-Gang Hu, Qiang Ji |
Abstract | Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction. Recent works have introduced deep neural networks for AU intensity estimation, but they require a large amount of intensity annotations. AU annotation needs strong domain expertise and it is expensive to construct a large database to learn deep models. We propose a novel knowledge-based semi-supervised deep convolutional neural network for AU intensity estimation with extremely limited AU annotations. Only the intensity annotations of peak and valley frames in training sequences are needed. To provide additional supervision for model learning, we exploit naturally existing constraints on AUs, including relative appearance similarity, temporal intensity ordering, facial symmetry, and contrastive appearance difference. Experimental evaluations are performed on two public benchmark databases. With around 2% of intensity annotations in FERA 2015 and around 1% in DISFA for training, our method can achieve comparable or even better performance than the state-of-the-art methods which use 100% of intensity annotations in the training set. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Weakly-Supervised_Deep_Convolutional_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Weakly-Supervised_Deep_Convolutional_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-deep-convolutional-neural |
Repo | |
Framework | |
Practical Application of Domain Dependent Confidence Measurement for Spoken Language Understanding Systems
Title | Practical Application of Domain Dependent Confidence Measurement for Spoken Language Understanding Systems |
Authors | Mahnoosh Mehrabani, David Thomson, Benjamin Stern |
Abstract | Spoken Language Understanding (SLU), which extracts semantic information from speech, is not flawless, especially in practical applications. The reliability of the output of an SLU system can be evaluated using a semantic confidence measure. Confidence measures are a solution to improve the quality of spoken dialogue systems, by rejecting low-confidence SLU results. In this study we discuss real-world applications of confidence scoring in a customer service scenario. We build confidence models for three major types of dialogue states that are considered as different domains: how may I help you, number capture, and confirmation. Practical challenges to train domain-dependent confidence models, including data limitations, are discussed, and it is shown that feature engineering plays an important role to improve performance. We explore a wide variety of predictor features based on speech recognition, intent classification, and high-level domain knowledge, and find the combined feature set with the best rejection performance for each application. |
Tasks | Feature Engineering, Intent Classification, Machine Translation, Speech Recognition, Spoken Dialogue Systems, Spoken Language Understanding |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3023/ |
PDF | https://www.aclweb.org/anthology/N18-3023 |
PWC | https://paperswithcode.com/paper/practical-application-of-domain-dependent |
Repo | |
Framework | |
Dynamic Task Prioritization for Multitask Learning
Title | Dynamic Task Prioritization for Multitask Learning |
Authors | Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, Li Fei-Fei |
Abstract | We propose dynamic task prioritization for multitask learning. This allows a model to dynamically prioritize difficult tasks during training, where difficulty is inversely proportional to performance, and where difficulty changes over time. In contrast to curriculum learning, where easy tasks are prioritized above difficult tasks, we present several studies showing the importance of prioritizing difficult tasks first. We observe that imbalances in task difficulty can lead to unnecessary emphasis on easier tasks, thus neglecting and slowing progress on difficult tasks. Motivated by this finding, we introduce a notion of dynamic task prioritization to automatically prioritize more difficult tasks by adaptively adjusting the mixing weight of each task’s loss objective. Additional ablation studies show the impact of the task hierarchy, or the task ordering, when explicitly encoded in the network architecture. Our method outperforms existing multitask methods and demonstrates competitive results with modern single-task models on the COCO and MPII datasets. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Michelle_Guo_Focus_on_the_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Michelle_Guo_Focus_on_the_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/dynamic-task-prioritization-for-multitask |
Repo | |
Framework | |
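The abstract above describes adjusting each task's loss mixing weight so that poorly performing (difficult) tasks get more emphasis. A minimal sketch of that idea follows. The focal-style weighting formula, the `gamma` parameter, and the function names are assumptions chosen for illustration; the abstract itself does not specify the weighting function.

```python
import math


def task_weights(performance, gamma=1.0, eps=1e-8):
    """Map per-task performance scores in [0, 1] to normalised mixing weights.

    Assumed form: raw_i = -(1 - k_i)^gamma * log(k_i), so a task with low
    performance k_i (a "difficult" task) receives a larger weight.
    """
    raw = [-((1.0 - k) ** gamma) * math.log(max(k, eps)) for k in performance]
    total = sum(raw)
    if total == 0.0:
        # All tasks are solved perfectly; fall back to uniform weights.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def combined_loss(losses, performance, gamma=1.0):
    """Weighted sum of per-task losses using the dynamic weights."""
    weights = task_weights(performance, gamma)
    return sum(w * l for w, l in zip(weights, losses))
```

In a training loop, `performance` would be re-estimated periodically (for example from a running average of a validation metric), so the weights change over time as tasks become easier for the model.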
Self-Calibrating Polarising Radiometric Calibration
Title | Self-Calibrating Polarising Radiometric Calibration |
Authors | Daniel Teo, Boxin Shi, Yinqiang Zheng, Sai-Kit Yeung |
Abstract | We present a self-calibrating polarising radiometric calibration method. From a set of images taken from a single viewpoint under different unknown polarising angles, we recover the inverse camera response function and the polarising angles relative to the first angle. The problem is solved in an integrated manner, recovering both of the unknowns simultaneously. The method exploits the fact that the intensity of polarised light should vary sinusoidally as the polarising filter is rotated, provided that the response is linear. It offers the first solution to demonstrate the possibility of radiometric calibration through polarisation. We evaluate the accuracy of our proposed method using synthetic data and real-world objects captured using different cameras. The self-calibrated results were found to be comparable with those from a multiple-exposure sequence. |
Tasks | Calibration |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Teo_Self-Calibrating_Polarising_Radiometric_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Teo_Self-Calibrating_Polarising_Radiometric_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/self-calibrating-polarising-radiometric |
Repo | |
Framework | |
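The sinusoidal constraint mentioned in the abstract above can be made concrete with a small fitting sketch. It covers only the simpler sub-problem in which the polariser angles are known and the camera response is already linear; the paper's method recovers unknown angles and the response function jointly, which this sketch does not attempt. The model I(θ) = a + A·cos(2θ − 2φ) and the function name are assumptions for illustration.

```python
import math


def fit_polarisation_sinusoid(angles, intensities):
    """Least-squares fit of I(theta) = a + b*cos(2*theta) + c*sin(2*theta).

    angles are in radians; at least three angles distinct modulo pi are needed.
    Returns (offset a, amplitude A, phase phi) with I = a + A*cos(2*theta - 2*phi).
    """
    rows = [(1.0, math.cos(2 * t), math.sin(2 * t)) for t in angles]
    # Build the normal equations (A^T A) x = A^T y as an augmented 3x4 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, intensities))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    a, b, c = x
    return a, math.hypot(b, c), 0.5 * math.atan2(c, b)
```

If the camera response were nonlinear, measured intensities would deviate from this sinusoid, and that deviation is the signal the abstract says the calibration method exploits.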