Paper Group NANR 166
Scalable Generative Models for Multi-label Learning with Missing Labels
Title | Scalable Generative Models for Multi-label Learning with Missing Labels |
Authors | Vikas Jain, Nirbhay Modhe, Piyush Rai |
Abstract | We present a scalable, generative framework for multi-label learning with missing labels. Our framework consists of a latent factor model for the binary label matrix, which is coupled with an exposure model to account for label missingness (i.e., whether a zero in the label matrix is indeed a zero or denotes a missing observation). The underlying latent factor model also assumes that the low-dimensional embeddings of each label vector are directly conditioned on the respective feature vector of that example. Our generative framework admits a simple inference procedure, such that the parameter estimation reduces to a sequence of simple weighted least-square regression problems, each of which can be solved easily, efficiently, and in parallel. Moreover, inference can also be performed in an online fashion using mini-batches of training examples, which makes our framework scalable for large data sets, even when using moderate computational resources. We report both quantitative and qualitative results for our framework on several benchmark data sets, comparing it with a number of state-of-the-art methods. |
Tasks | Multi-Label Learning |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=708 |
http://proceedings.mlr.press/v70/jain17a/jain17a.pdf | |
PWC | https://paperswithcode.com/paper/scalable-generative-models-for-multi-label |
Repo | |
Framework | |
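The abstract above says parameter estimation reduces to a sequence of weighted least-squares regression problems. As a rough illustration of what one such subproblem looks like, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy data, the per-example `confidence` weights, and the optional `ridge` term are all assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(features, targets, weights, ridge=0.0):
    """Minimise sum_i weights[i] * (targets[i] - features[i] . w)^2 + ridge * |w|^2."""
    d = len(features[0])
    # Normal equations: (X^T C X + ridge * I) w = X^T C y, with C = diag(weights).
    gram = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for x, y, c in zip(features, targets, weights):
        for i in range(d):
            rhs[i] += c * x[i] * y
            for j in range(d):
                gram[i][j] += c * x[i] * x[j]
    return solve_linear_system(gram, rhs)


# Toy data generated from y = 1 + 2 * x; the first column is a bias term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
confidence = [1.0, 0.2, 1.0, 0.2]  # hypothetical per-example weights
w = weighted_least_squares(X, y, confidence)
```

Because each subproblem of this kind has its own small system of normal equations, independent subproblems can be solved in parallel, which matches the scalability argument in the abstract.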
Measuring the Limit of Semantic Divergence for English Tweets.
Title | Measuring the Limit of Semantic Divergence for English Tweets. |
Authors | Dwijen Rudrapal, Amitava Das |
Abstract | In human language, the same expression can be conveyed in many ways by different people. Even the same person may express the same sentence quite differently when addressing different audiences, using different modalities, different syntactic variations, or a different vocabulary. The possibility of such endless surface forms, while the meaning of the text remains almost the same, poses many challenges for Natural Language Processing (NLP) systems such as question answering, machine translation, and text summarization. This research paper is an endeavor to understand the characteristics of such endless semantic divergence. In this work we develop a corpus of 1525 semantically divergent sentences for 200 English tweets. |
Tasks | Machine Translation, Question Answering, Text Summarization |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1080/ |
https://doi.org/10.26615/978-954-452-049-6_080 | |
PWC | https://paperswithcode.com/paper/measuring-the-limit-of-semantic-divergence |
Repo | |
Framework | |
Detecting negation scope is easy, except when it isn’t
Title | Detecting negation scope is easy, except when it isn’t |
Authors | Federico Fancellu, Adam Lopez, Bonnie Webber, Hangfeng He |
Abstract | Several corpora have been annotated with negation scope (the set of words whose meaning is negated by a cue like the word “not”), leading to the development of classifiers that detect negation scope with high accuracy. We show that for nearly all of these corpora, this high accuracy can be attributed to a single fact: they frequently annotate negation scope as a single span of text delimited by punctuation. For negation scopes not of this form, detection accuracy is low and under-sampling the easy training examples does not substantially improve accuracy. We demonstrate that this is partly an artifact of annotation guidelines, and we argue that future negation scope annotation efforts should focus on these more difficult cases. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2010/ |
https://www.aclweb.org/anthology/E17-2010 | |
PWC | https://paperswithcode.com/paper/detecting-negation-scope-is-easy-except-when |
Repo | |
Framework | |
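The abstract above observes that many corpora annotate negation scope as a single span of text delimited by punctuation. To make that "easy" case concrete, here is a crude heuristic sketch in Python. It is only an illustration and not the classifiers studied in the paper: the `PUNCTUATION` set, the function name, and the example sentence are assumptions.

```python
# Hypothetical set of tokens treated as span boundaries.
PUNCTUATION = {".", ",", ";", ":", "!", "?", "(", ")"}


def punctuation_delimited_scope(tokens, cue_index):
    """Return indices of the punctuation-bounded span around the cue, excluding the cue."""
    # Walk left until the previous token is punctuation or the sentence starts.
    start = cue_index
    while start > 0 and tokens[start - 1] not in PUNCTUATION:
        start -= 1
    # Walk right until the next token is punctuation or the sentence ends.
    end = cue_index
    while end < len(tokens) - 1 and tokens[end + 1] not in PUNCTUATION:
        end += 1
    return [i for i in range(start, end + 1) if i != cue_index]


tokens = "I went out , but he did not see the film .".split()
cue = tokens.index("not")
scope = [tokens[i] for i in punctuation_delimited_scope(tokens, cue)]
```

A heuristic like this cannot produce discontinuous scopes or scopes that cross punctuation, which are the harder cases the abstract argues future annotation efforts should focus on.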
Learning to Negate Adjectives with Bilinear Models
Title | Learning to Negate Adjectives with Bilinear Models |
Authors | Laura Rimell, Amandla Mabona, Luana Bulat, Douwe Kiela |
Abstract | We learn a mapping that negates adjectives by predicting an adjective’s antonym in an arbitrary word embedding model. We show that both linear models and neural networks improve on this task when they have access to a vector representing the semantic domain of the input word, e.g. a centroid of temperature words when predicting the antonym of ‘cold’. We introduce a continuous class-conditional bilinear neural network which is able to negate adjectives with high precision. |
Tasks | Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2012/ |
https://www.aclweb.org/anthology/E17-2012 | |
PWC | https://paperswithcode.com/paper/learning-to-negate-adjectives-with-bilinear |
Repo | |
Framework | |
Don’t understand a measure? Learn it: Structured Prediction for Coreference Resolution optimizing its measures
Title | Don’t understand a measure? Learn it: Structured Prediction for Coreference Resolution optimizing its measures |
Authors | Iryna Haponchyk, Alessandro Moschitti |
Abstract | An interesting aspect of structured prediction is the evaluation of an output structure against the gold standard. Especially in the loss-augmented setting, the need of finding the max-violating constraint has severely limited the expressivity of effective loss functions. In this paper, we trade off exact computation for enabling the use and study of more complex loss functions for coreference resolution. Most interestingly, we show that such functions can be (i) automatically learned also from controversial but commonly accepted coreference measures, e.g., MELA, and (ii) successfully used in learning algorithms. The accurate model comparison on the standard CoNLL-2012 setting shows the benefit of more expressive loss functions. |
Tasks | Coreference Resolution, Structured Prediction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1094/ |
https://www.aclweb.org/anthology/P17-1094 | |
PWC | https://paperswithcode.com/paper/dont-understand-a-measure-learn-it-structured |
Repo | |
Framework | |
Small but Mighty: Affective Micropatterns for Quantifying Mental Health from Social Media Language
Title | Small but Mighty: Affective Micropatterns for Quantifying Mental Health from Social Media Language |
Authors | Kate Loveys, Patrick Crutchley, Emily Wyatt, Glen Coppersmith |
Abstract | Many psychological phenomena occur in small time windows, measured in minutes or hours. However, most computational linguistic techniques look at data on the order of weeks, months, or years. We explore micropatterns in sequences of messages occurring over a short time window for their prevalence and power for quantifying psychological phenomena, specifically, patterns in affect. We examine affective micropatterns in social media posts from users with anxiety, eating disorders, panic attacks, schizophrenia, suicidality, and matched controls. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-3110/ |
https://www.aclweb.org/anthology/W17-3110 | |
PWC | https://paperswithcode.com/paper/small-but-mighty-affective-micropatterns-for |
Repo | |
Framework | |
Proceedings of the First Workshop on Neural Machine Translation
Title | Proceedings of the First Workshop on Neural Machine Translation |
Authors | |
Abstract | |
Tasks | Machine Translation |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-3200/ |
https://www.aclweb.org/anthology/W17-3200 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-neural |
Repo | |
Framework | |
Exploiting Structure in Parsing to 1-Endpoint-Crossing Graphs
Title | Exploiting Structure in Parsing to 1-Endpoint-Crossing Graphs |
Authors | Robin Kurtz, Marco Kuhlmann |
Abstract | Deep dependency parsing can be cast as the search for maximum acyclic subgraphs in weighted digraphs. Because this search problem is intractable in the general case, we consider its restriction to the class of 1-endpoint-crossing (1ec) graphs, which has high coverage on standard data sets. Our main contribution is a characterization of 1ec graphs as a subclass of the graphs with pagenumber at most 3. Building on this we show how to extend an existing parsing algorithm for 1-endpoint-crossing trees to the full class. While the runtime complexity of the extended algorithm is polynomial in the length of the input sentence, it features a large constant, which poses a challenge for practical implementations. |
Tasks | Dependency Parsing |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6312/ |
https://www.aclweb.org/anthology/W17-6312 | |
PWC | https://paperswithcode.com/paper/exploiting-structure-in-parsing-to-1-endpoint |
Repo | |
Framework | |
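The abstract above restricts parsing to 1-endpoint-crossing (1ec) graphs without spelling out the property. Under the usual definition from the parsing literature (for every edge, all edges that cross it share a common endpoint), membership can be checked directly. The sketch below is an illustrative checker in Python, not the parsing algorithm from the paper; the function names and the example edge lists are assumptions, and edge direction is ignored because crossing depends only on word positions.

```python
def crosses(e, f):
    """Two edges cross if their endpoints strictly interleave along the sentence."""
    a, b = sorted(e)
    c, d = sorted(f)
    return a < c < b < d or c < a < d < b


def is_one_endpoint_crossing(edges):
    """Check that, for every edge, all edges crossing it share one endpoint."""
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        if not crossing:
            continue
        # Intersect the endpoint sets of all crossing edges.
        shared = set(crossing[0])
        for f in crossing[1:]:
            shared &= set(f)
        if not shared:
            return False
    return True


# Edges are pairs of word positions; both edges crossing (1, 3) touch position 2.
ok_graph = [(1, 3), (2, 4), (2, 5)]
# Here (1, 4) is crossed by (2, 5) and (3, 6), which have no endpoint in common.
bad_graph = [(1, 4), (2, 5), (3, 6)]
```

This brute-force check is quadratic in the number of edges and is only meant to clarify the graph class, not to parse into it.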
Hybrid Approach for Marathi Named Entity Recognition
Title | Hybrid Approach for Marathi Named Entity Recognition |
Authors | Nita Patil, Ajay Patil, B.V. Pawar |
Abstract | |
Tasks | Named Entity Recognition |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-7514/ |
https://www.aclweb.org/anthology/W17-7514 | |
PWC | https://paperswithcode.com/paper/hybrid-approach-for-marathi-named-entity |
Repo | |
Framework | |
Neural Lattice Search for Domain Adaptation in Machine Translation
Title | Neural Lattice Search for Domain Adaptation in Machine Translation |
Authors | Huda Khayrallah, Gaurav Kumar, Kevin Duh, Matt Post, Philipp Koehn |
Abstract | Domain adaptation is a major challenge for neural machine translation (NMT). Given unknown words or new domains, NMT systems tend to generate fluent translations at the expense of adequacy. We present a stack-based lattice search algorithm for NMT and show that constraining its search space with lattices generated by phrase-based machine translation (PBMT) improves robustness. We report consistent BLEU score gains across four diverse domain adaptation tasks involving medical, IT, Koran, or subtitles texts. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2004/ |
https://www.aclweb.org/anthology/I17-2004 | |
PWC | https://paperswithcode.com/paper/neural-lattice-search-for-domain-adaptation |
Repo | |
Framework | |
Joint Prediction of Morphosyntactic Categories for Fine-Grained Arabic Part-of-Speech Tagging Exploiting Tag Dictionary Information
Title | Joint Prediction of Morphosyntactic Categories for Fine-Grained Arabic Part-of-Speech Tagging Exploiting Tag Dictionary Information |
Authors | Go Inoue, Hiroyuki Shindo, Yuji Matsumoto |
Abstract | Part-of-speech (POS) tagging for morphologically rich languages such as Arabic is a challenging problem because of their enormous tag sets. One reason for this is that in the tagging scheme for such languages, a complete POS tag is formed by combining tags from multiple tag sets defined for each morphosyntactic category. Previous approaches in Arabic POS tagging applied one model for each morphosyntactic tagging task, without utilizing shared information between the tasks. In this paper, we propose an approach that utilizes this information by jointly modeling multiple morphosyntactic tagging tasks with a multi-task learning framework. We also propose a method of incorporating tag dictionary information into our neural models by combining word representations with representations of the sets of possible tags. Our experiments showed that the joint model with tag dictionary information results in an accuracy of 91.38% on the Penn Arabic Treebank data set, with an absolute improvement of 2.11% over the current state-of-the-art tagger. |
Tasks | Multi-Task Learning, Part-Of-Speech Tagging, Transliteration |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1042/ |
https://www.aclweb.org/anthology/K17-1042 | |
PWC | https://paperswithcode.com/paper/joint-prediction-of-morphosyntactic |
Repo | |
Framework | |
Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists
Title | Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists |
Authors | Gerhard Jäger, Johann-Mattis List, Pavel Sofroniev |
Abstract | Most current approaches in phylogenetic linguistics require as input multilingual word lists partitioned into sets of etymologically related words (cognates). Cognate identification is so far done manually by experts, which is time consuming and as of yet only available for a small number of well-studied language families. Automatizing this step will greatly expand the empirical scope of phylogenetic methods in linguistics, as raw wordlists (in phonetic transcription) are much easier to obtain than wordlists in which cognate words have been fully identified and annotated, even for under-studied languages. A couple of different methods have been proposed in the past, but they are either disappointing regarding their performance or not applicable to larger datasets. Here we present a new approach that uses support vector machines to unify different state-of-the-art methods for phonetic alignment and cognate detection within a single framework. Training and evaluating this method on a typologically broad collection of gold-standard data shows it to be superior to the existing state of the art. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1113/ |
https://www.aclweb.org/anthology/E17-1113 | |
PWC | https://paperswithcode.com/paper/using-support-vector-machines-and-state-of |
Repo | |
Framework | |
Aspect Extraction from Product Reviews Using Category Hierarchy Information
Title | Aspect Extraction from Product Reviews Using Category Hierarchy Information |
Authors | Yinfei Yang, Cen Chen, Minghui Qiu, Forrest Bao |
Abstract | Aspect extraction abstracts the common properties of objects from corpora discussing them, such as reviews of products. Recent work on aspect extraction leverages the hierarchical relationship between products and their categories. However, such effort focuses on the aspects of child categories but ignores those from parent categories. Hence, we propose an LDA-based generative topic model inducing the two-layer categorical information (CAT-LDA), to balance the aspects of both a parent category and its child categories. Our hypothesis is that child categories inherit aspects from parent categories, controlled by the hierarchy between them. Experimental results on 5 categories of Amazon.com products show that both the common aspects of a parent category and the individual aspects of sub-categories can be extracted to align well with common sense. We further evaluate the manually extracted aspects of 16 products, resulting in an average hit rate of 79.10%. |
Tasks | Aspect Extraction, Common Sense Reasoning, Opinion Mining |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2107/ |
https://www.aclweb.org/anthology/E17-2107 | |
PWC | https://paperswithcode.com/paper/aspect-extraction-from-product-reviews-using |
Repo | |
Framework | |
An Empirical Study on The Properties of Random Bases for Kernel Methods
Title | An Empirical Study on The Properties of Random Bases for Kernel Methods |
Authors | Maximilian Alber, Pieter-Jan Kindermans, Kristof Schütt, Klaus-Robert Müller, Fei Sha |
Abstract | Kernel machines as well as neural networks possess universal function approximation properties. Nevertheless in practice their ways of choosing the appropriate function class differ. Specifically neural networks learn a representation by adapting their basis functions to the data and the task at hand, while kernel methods typically use a basis that is not adapted during training. In this work, we contrast random features of approximated kernel machines with learned features of neural networks. Our analysis reveals how these random and adaptive basis functions affect the quality of learning. Furthermore, we present basis adaptation schemes that allow for a more compact representation, while retaining the generalization properties of kernel machines. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6869-an-empirical-study-on-the-properties-of-random-bases-for-kernel-methods |
http://papers.nips.cc/paper/6869-an-empirical-study-on-the-properties-of-random-bases-for-kernel-methods.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-study-on-the-properties-of |
Repo | |
Framework | |
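The abstract above contrasts the random features of approximated kernel machines with learned neural network features. One widely used instance of such a fixed random basis is random Fourier features for the RBF kernel, sketched below in plain Python. Whether this matches the exact random bases used in the paper is an assumption; the function names, `gamma`, the feature count, and the test points are all illustrative.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def make_random_fourier_features(dim, num_features, gamma, seed=0):
    """Build a fixed random feature map whose inner products approximate the RBF kernel."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma * I), phases uniformly from [0, 2 * pi].
    scale = math.sqrt(2.0 * gamma)
    freqs = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def feature_map(x):
        # The basis is sampled once and never adapted to the data.
        return [norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return feature_map


feature_map = make_random_fourier_features(dim=3, num_features=4000, gamma=0.5)
x = [0.1, -0.4, 0.7]
y = [0.3, 0.2, -0.1]
approx = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
exact = rbf_kernel(x, y, gamma=0.5)
```

The approximation error shrinks roughly with the square root of the number of features, which is why a non-adapted basis typically needs many more basis functions than a learned one; this is the compactness trade-off the abstract refers to.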
Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis
Title | Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis |
Authors | Jian Zhao, Lin Xiong, Karlekar Jayashree, Jianshu Li, Fang Zhao, Zhecan Wang, Sugiri Pranata, Shengmei Shen, Shuicheng Yan, Jiashi Feng |
Abstract | Synthesizing realistic profile faces is promising for more efficiently training deep pose-invariant models for large-scale unconstrained face recognition, by populating samples with extreme poses and avoiding tedious annotations. However, learning from synthetic faces may not achieve the desired performance due to the discrepancy between distributions of the synthetic and real face images. To narrow this gap, we propose a Dual-Agent Generative Adversarial Network (DA-GAN) model, which can improve the realism of a face simulator’s output using unlabeled real faces, while preserving the identity information during the realism refinement. The dual agents are specifically designed for distinguishing real vs. fake and identities simultaneously. In particular, we employ an off-the-shelf 3D face model as a simulator to generate profile face images with varying poses. DA-GAN leverages a fully convolutional network as the generator to generate high-resolution images and an auto-encoder as the discriminator with the dual agents. Besides the novel architecture, we make several key modifications to the standard GAN to preserve pose and texture, preserve identity, and stabilize the training process: (i) a pose perception loss; (ii) an identity perception loss; (iii) an adversarial loss with a boundary equilibrium regularization term. Experimental results show that DA-GAN not only presents compelling perceptual results but also significantly outperforms state-of-the-art methods on the large-scale and challenging NIST IJB-A unconstrained face recognition benchmark. In addition, the proposed DA-GAN is also promising as a new approach for solving generic transfer learning problems more effectively. |
Tasks | Face Generation, Face Recognition, Face Verification, Transfer Learning |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6612-dual-agent-gans-for-photorealistic-and-identity-preserving-profile-face-synthesis |
http://papers.nips.cc/paper/6612-dual-agent-gans-for-photorealistic-and-identity-preserving-profile-face-synthesis.pdf | |
PWC | https://paperswithcode.com/paper/dual-agent-gans-for-photorealistic-and |
Repo | |
Framework | |