Paper Group ANR 95
LightRel SemEval-2018 Task 7: Lightweight and Fast Relation Classification
Title | LightRel SemEval-2018 Task 7: Lightweight and Fast Relation Classification |
Authors | Tyler Renslow, Günter Neumann |
Abstract | We present LightRel, a lightweight and fast relation classifier. Our goal is to develop a high baseline for different relation extraction tasks. By defining only very few data-internal, word-level features and external knowledge sources in the form of word clusters and word embeddings, we train a fast and simple linear classifier. |
Tasks | Relation Classification, Relation Extraction, Word Embeddings |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.08426v1 |
http://arxiv.org/pdf/1804.08426v1.pdf | |
PWC | https://paperswithcode.com/paper/lightrel-semeval-2018-task-7-lightweight-and |
Repo | |
Framework | |
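The abstract above describes a deliberately small recipe: a handful of word-level features plus external word clusters and embeddings, fed to a fast linear classifier. The sketch below illustrates that recipe with scikit-learn; the feature names, toy cluster map, and relation labels are illustrative assumptions, not LightRel's actual feature set.

```python
# A hedged sketch of a "lightweight" linear relation classifier in the spirit
# of LightRel: a few word-level features plus an external word-cluster id,
# fed to a linear model.  Feature names and the toy data are illustrative.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def featurize(entity1, entity2, between_words, cluster_of):
    """Very small, purely word-level feature map (illustrative)."""
    feats = {
        "e1_last": entity1.split()[-1].lower(),
        "e2_last": entity2.split()[-1].lower(),
        "n_between": len(between_words),
        "e1_cluster": cluster_of.get(entity1.split()[-1].lower(), "UNK"),
        "e2_cluster": cluster_of.get(entity2.split()[-1].lower(), "UNK"),
    }
    for w in between_words:
        feats[f"between={w.lower()}"] = 1
    return feats

# Toy cluster map standing in for externally trained word clusters.
clusters = {"corpus": "c12", "parser": "c07", "algorithm": "c07"}

X = [
    featurize("the corpus", "the parser", ["is", "used", "by"], clusters),
    featurize("the algorithm", "the corpus", ["is", "evaluated", "on"], clusters),
]
y = ["USAGE", "EVALUATION"]  # illustrative relation labels

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.predict([featurize("a parser", "a corpus", ["trained", "on"], clusters)]))
```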
Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification
Title | Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification |
Authors | Yizhong Wang, Sujian Li, Jingfeng Yang, Xu Sun, Houfeng Wang |
Abstract | Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text. To tackle this task, recent studies have tried several deep learning methods, but few of them exploited syntactic information. In this work, we explore the idea of incorporating syntactic parse trees into neural networks. Specifically, we employ the Tree-LSTM and Tree-GRU models, which are based on the tree structure, to encode the arguments in a relation. Moreover, we further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks. Experimental results show that our method achieves state-of-the-art performance on the PDTB corpus. |
Tasks | Implicit Discourse Relation Classification, Relation Classification, Semantic Composition |
Published | 2018-03-03 |
URL | http://arxiv.org/abs/1803.01165v1 |
http://arxiv.org/pdf/1803.01165v1.pdf | |
PWC | https://paperswithcode.com/paper/tag-enhanced-tree-structured-neural-networks |
Repo | |
Framework | |
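As a rough illustration of the composition step described above, the sketch below implements a child-sum Tree-LSTM cell whose gates also see an embedding of the node's constituent tag. How the tag enters the gates here is an assumption made for illustration; it is not claimed to match the authors' exact formulation.

```python
# A minimal child-sum Tree-LSTM cell with a constituent-tag embedding mixed
# into the gates -- a hedged sketch of "tag-enhanced" composition.
import torch
import torch.nn as nn

class TagTreeLSTMCell(nn.Module):
    def __init__(self, in_dim, mem_dim, num_tags, tag_dim):
        super().__init__()
        self.tag_emb = nn.Embedding(num_tags, tag_dim)
        # input, output and update gates share one projection; forget gate is per child
        self.iou_x = nn.Linear(in_dim + tag_dim, 3 * mem_dim)
        self.iou_h = nn.Linear(mem_dim, 3 * mem_dim, bias=False)
        self.f_x = nn.Linear(in_dim + tag_dim, mem_dim)
        self.f_h = nn.Linear(mem_dim, mem_dim, bias=False)

    def forward(self, x, tag_id, child_h, child_c):
        # x: (in_dim,)  tag_id: scalar long  child_h/child_c: (n_children, mem_dim)
        xt = torch.cat([x, self.tag_emb(tag_id)], dim=-1)
        h_sum = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.iou_x(xt) + self.iou_h(h_sum), 3, dim=-1)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.f_x(xt).unsqueeze(0) + self.f_h(child_h))  # one forget gate per child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c

cell = TagTreeLSTMCell(in_dim=50, mem_dim=64, num_tags=30, tag_dim=16)
x = torch.randn(50)
children_h, children_c = torch.randn(2, 64), torch.randn(2, 64)
h, c = cell(x, torch.tensor(5), children_h, children_c)
print(h.shape, c.shape)
```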
Scoring Formulation for Multi-Condition Joint PLDA
Title | Scoring Formulation for Multi-Condition Joint PLDA |
Authors | Luciana Ferrer |
Abstract | The joint PLDA model is a generalization of PLDA in which the nuisance variable is no longer considered independent across samples, but potentially shared (tied) across samples that correspond to the same nuisance condition. The original work considered a single nuisance condition, deriving the EM and scoring formulas for that scenario. In this document, we show how to obtain likelihood ratios for scoring when multiple nuisance conditions are allowed in the model. |
Tasks | |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03684v1 |
http://arxiv.org/pdf/1803.03684v1.pdf | |
PWC | https://paperswithcode.com/paper/scoring-formulation-for-multi-condition-joint |
Repo | |
Framework | |
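For orientation, PLDA verification scoring is generally expressed as a log-likelihood ratio between the same-source and different-source hypotheses for an enrollment embedding and a test embedding; the document derives this ratio for the case where nuisance variables are tied across samples sharing a condition. The generic form below is a standard reminder, not the multi-condition formula itself.

```latex
\mathrm{LLR}(\phi_e, \phi_t) =
  \log \frac{p(\phi_e, \phi_t \mid \mathcal{H}_{\mathrm{same}})}
            {p(\phi_e \mid \mathcal{H}_{\mathrm{diff}})\; p(\phi_t \mid \mathcal{H}_{\mathrm{diff}})}
```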
Local Binary Pattern Networks
Title | Local Binary Pattern Networks |
Authors | Jeng-Hau Lin, Yunfan Yang, Rajesh Gupta, Zhuowen Tu |
Abstract | Memory- and computation-efficient deep learning architectures are crucial to the continued proliferation of machine learning capabilities to new platforms and systems. Binarization of operations in convolutional neural networks has shown promising results in reducing model size and improving computational efficiency. In this paper, we tackle the problem using a strategy different from the existing literature by proposing local binary pattern networks, or LBPNet, which is able to learn and perform binary operations in an end-to-end fashion. LBPNet uses local binary comparisons and random projection in place of conventional convolution (or approximation of convolution) operations. These operations can be implemented efficiently on different platforms, including direct hardware implementation. We applied LBPNet and its variants on standard benchmarks. The results are promising across benchmarks while providing an important means to improve memory and speed efficiency that is particularly suited for small-footprint devices and hardware accelerators. |
Tasks | |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.07125v2 |
http://arxiv.org/pdf/1803.07125v2.pdf | |
PWC | https://paperswithcode.com/paper/local-binary-pattern-networks |
Repo | |
Framework | |
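The abstract replaces convolutions with two cheap operations: local binary comparisons against sampled neighbour positions, and a random projection across the resulting bit maps. The NumPy sketch below is a hedged illustration of those two operations only; the offsets, sizes, and projection are toy choices, and the learning procedure is omitted.

```python
# Hedged sketch of the two ingredients named in the abstract: random sampling
# points for local binary comparisons, then a random channel-mixing projection.
import numpy as np

rng = np.random.default_rng(0)

def local_binary_comparison(img, offsets):
    """For each pixel, compare it with a few randomly offset neighbours and
    return one binary map per offset (edge-padded)."""
    H, W = img.shape
    pad = int(np.max(np.abs(offsets)))
    padded = np.pad(img, pad, mode="edge")
    bits = []
    for dy, dx in offsets:
        shifted = padded[pad + dy : pad + dy + H, pad + dx : pad + dx + W]
        bits.append((shifted >= img).astype(np.float32))
    return np.stack(bits, axis=0)                      # (n_offsets, H, W)

def random_projection(bit_maps, out_channels):
    """Mix the binary maps into output channels with a fixed random matrix,
    standing in for the projection used instead of convolution."""
    n, H, W = bit_maps.shape
    P = rng.standard_normal((out_channels, n)).astype(np.float32)
    return np.tensordot(P, bit_maps, axes=([1], [0]))  # (out_channels, H, W)

img = rng.random((28, 28)).astype(np.float32)
offsets = rng.integers(-2, 3, size=(8, 2))             # 8 random comparison points
features = random_projection(local_binary_comparison(img, offsets), out_channels=4)
print(features.shape)                                  # (4, 28, 28)
```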
Generalizable Meta-Heuristic based on Temporal Estimation of Rewards for Large Scale Blackbox Optimization
Title | Generalizable Meta-Heuristic based on Temporal Estimation of Rewards for Large Scale Blackbox Optimization |
Authors | Mingde Zhao, Hongwei Ge, Yi Lian, Kai Zhang |
Abstract | The generalization abilities of heuristic optimizers may deteriorate as the dimensionality of the search space increases. To achieve generalized performance across Large Scale Blackbox Optimization (LSBO) tasks, it is possible to ensemble several heuristics and devise a meta-heuristic to control their initiation. This paper first proposes a methodology for transforming LSBO problems into online decision processes to maximize the efficiency of resource utilization. Then, taking the perspective of multi-armed bandits with non-stationary reward distributions, we propose a meta-heuristic based on Temporal Estimation of Rewards (TER) to address such decision processes. TER uses a window for temporal credit assignment and Boltzmann exploration to balance the exploration-exploitation tradeoff. The prior-free TER generalizes across LSBO tasks with flexibility for different types of limited computational resources (e.g. time, money, etc.) and is easy to adapt to new tasks owing to its simplicity and its straightforward interface for heuristic articulation. Tests on the benchmarks validate the problem formulation and suggest significant effectiveness: when TER is articulated with three heuristics, competitive performance is reported across different sets of benchmark problems with search dimensions up to 10000. |
Tasks | Multi-Armed Bandits |
Published | 2018-12-17 |
URL | https://arxiv.org/abs/1812.06585v2 |
https://arxiv.org/pdf/1812.06585v2.pdf | |
PWC | https://paperswithcode.com/paper/online-decisioning-meta-heuristic-framework |
Repo | |
Framework | |
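The abstract frames heuristic selection as a non-stationary multi-armed bandit solved with windowed reward estimates and Boltzmann exploration. A minimal sketch of that selector follows; the window length, temperature, and toy reward process are illustrative assumptions.

```python
# Hedged sketch: each arm is a heuristic, rewards are credited over a sliding
# window (temporal estimation), and the next arm is drawn with Boltzmann
# exploration over the windowed estimates.
import math
import random
from collections import deque

class WindowedBoltzmannSelector:
    def __init__(self, n_arms, window=20, temperature=0.5):
        self.windows = [deque(maxlen=window) for _ in range(n_arms)]
        self.temperature = temperature

    def estimate(self, arm):
        w = self.windows[arm]
        return sum(w) / len(w) if w else 0.0          # neutral prior for unseen arms

    def select(self):
        prefs = [math.exp(self.estimate(a) / self.temperature)
                 for a in range(len(self.windows))]
        z = sum(prefs)
        r, acc = random.random() * z, 0.0
        for a, p in enumerate(prefs):
            acc += p
            if r <= acc:
                return a
        return len(prefs) - 1

    def update(self, arm, reward):
        self.windows[arm].append(reward)

# Toy non-stationary rewards: the best heuristic changes halfway through.
random.seed(0)
selector = WindowedBoltzmannSelector(n_arms=3)
for t in range(200):
    arm = selector.select()
    best = 0 if t < 100 else 2
    reward = 1.0 if arm == best else random.random() * 0.3
    selector.update(arm, reward)
print([selector.estimate(a) for a in range(3)])
```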
Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns
Title | Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns |
Authors | Youngjun Cho, Nadia Bianchi-Berthouze, Nicolai Marquardt, Simon J. Julier |
Abstract | We introduce Deep Thermal Imaging, a new approach for close-range automatic recognition of materials to enhance the understanding that people and ubiquitous technologies have of their proximal environment. Our approach uses a low-cost mobile thermal camera integrated into a smartphone to capture thermal textures. A deep neural network classifies these textures into material types. This approach works effectively without the need for ambient light sources or direct contact with materials. Furthermore, the use of a deep learning network removes the need to handcraft the set of features for different materials. We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments. Our approach produced recognition accuracies above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584 images of 17 outdoor materials. We conclude by discussing its potential for real-time use in HCI applications and future directions. |
Tasks | |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02310v1 |
http://arxiv.org/pdf/1803.02310v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-thermal-imaging-proximate-material-type |
Repo | |
Framework | |
Named Entity Analysis and Extraction with Uncommon Words
Title | Named Entity Analysis and Extraction with Uncommon Words |
Authors | Xiaoshi Zhong, Erik Cambria, Jagath C. Rajapakse |
Abstract | Most previous research treats named entity extraction and classification as an end-to-end task. We argue that the two sub-tasks should be addressed separately. Entity extraction lies at the level of syntactic analysis, while entity classification lies at the level of semantic analysis. According to Noam Chomsky’s “Syntactic Structures,” pp. 93-94 (Chomsky 1957), syntax does not appeal to semantics and semantics does not affect syntax. We analyze two benchmark datasets for the characteristics of named entities, finding that uncommon words can distinguish named entities from common text, where uncommon words are words that rarely appear in common text and are mainly proper nouns. Experiments validate that lexical and syntactic features achieve state-of-the-art performance on entity extraction and that semantic features do not further improve extraction performance, in both our model and the state-of-the-art baselines. In light of Chomsky’s view, we also explain the failure of joint syntactic and semantic parsing in other works. |
Tasks | Entity Extraction |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.06818v2 |
http://arxiv.org/pdf/1810.06818v2.pdf | |
PWC | https://paperswithcode.com/paper/named-entity-analysis-and-extraction-with |
Repo | |
Framework | |
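The central observation above is that "uncommon" words, i.e. words that rarely occur in ordinary text and are mostly proper nouns, already separate entity candidates from common text. The toy sketch below illustrates that signal with a tiny frequency list; the corpus, threshold, and capitalisation heuristic are assumptions for illustration only.

```python
# Hedged sketch of the "uncommon word" signal: words with near-zero frequency
# in a (toy) common-text corpus, or proper-noun-like capitalisation, are
# flagged as entity candidates.
from collections import Counter

common_text = (
    "yesterday we met some friends at the market and bought bread and milk and went home"
).split()
freq = Counter(common_text)

def is_uncommon(token, threshold=1):
    # Proper-noun-like capitalisation or near-zero frequency in common text.
    return token[0].isupper() or freq[token.lower()] < threshold

sentence = "yesterday we met Erik at Nanyang Technological University and bought some milk".split()
candidates = [tok for tok in sentence if is_uncommon(tok)]
print(candidates)   # ['Erik', 'Nanyang', 'Technological', 'University']
```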
Similarity measure for Public Persons
Title | Similarity measure for Public Persons |
Authors | Andreas Stöckl |
Abstract | For the web portal "Who is in the News!", which provides statistics about the appearance of persons in written news, we developed an extension that measures the relationship between public persons as a function of a time parameter, since the relationship may vary over time. On a training corpus of English and German news articles, we build the measure by extracting person occurrences from the text via pretrained named entity extraction and then constructing a time series of counts for each person. Pearson correlation over a sliding window is then used to measure the relation between two persons. |
Tasks | Entity Extraction, Time Series |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06083v1 |
http://arxiv.org/pdf/1809.06083v1.pdf | |
PWC | https://paperswithcode.com/paper/similarity-measure-for-public-persons |
Repo | |
Framework | |
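The measure described above is concrete enough to sketch: build a per-person time series of daily mention counts, then correlate two series over a sliding window. In the sketch below, plain substring matching stands in for the pretrained named entity extraction, and the articles are toy data.

```python
# Hedged sketch of the similarity measure: per-person daily mention counts,
# then Pearson correlation over a sliding window.
import numpy as np

def mention_series(articles, person, n_days):
    """Count, per day, how many articles mention the person
    (substring matching stands in for pretrained NER)."""
    counts = np.zeros(n_days)
    for day, text in articles:
        if person in text:
            counts[day] += 1
    return counts

def windowed_similarity(series_a, series_b, window=7):
    """Pearson correlation of the two count series over each sliding window."""
    sims = []
    for end in range(window, len(series_a) + 1):
        a, b = series_a[end - window:end], series_b[end - window:end]
        if a.std() == 0 or b.std() == 0:
            sims.append(0.0)                 # undefined correlation -> neutral
        else:
            sims.append(float(np.corrcoef(a, b)[0, 1]))
    return sims

# Toy corpus: (day index, article text)
articles = [(d, "Alice and Bob met at the summit") for d in range(0, 30, 3)]
articles += [(d, "Carol gave a speech") for d in range(1, 30, 5)]
alice = mention_series(articles, "Alice", 30)
bob = mention_series(articles, "Bob", 30)
print(windowed_similarity(alice, bob)[:5])   # close to 1.0: they always co-occur
```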
A Multilingual Information Extraction Pipeline for Investigative Journalism
Title | A Multilingual Information Extraction Pipeline for Investigative Journalism |
Authors | Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann |
Abstract | We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism. The pipeline serves as a new input processor for the upcoming major release of our New/s/leak 2.0 software, which we develop in cooperation with a large German news organization. The use case is that journalists receive a large collection of files up to several Gigabytes containing unknown contents. Collections may originate either from official disclosures of documents, e.g. Freedom of Information Act requests, or unofficial data leaks. Our software prepares a visually-aided exploration of the collection to quickly learn about potential stories contained in the data. It is based on the automatic extraction of entities and their co-occurrence in documents. In contrast to comparable projects, we focus on the following three major requirements particularly serving the use case of investigative journalism in cross-border collaborations: 1) composition of multiple state-of-the-art NLP tools for entity extraction, 2) support of multi-lingual document sets up to 40 languages, 3) fast and easy-to-use extraction of full-text, metadata and entities from various file formats. |
Tasks | Entity Extraction |
Published | 2018-09-01 |
URL | http://arxiv.org/abs/1809.00221v1 |
http://arxiv.org/pdf/1809.00221v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multilingual-information-extraction |
Repo | |
Framework | |
Structure-Preserving Transformation: Generating Diverse and Transferable Adversarial Examples
Title | Structure-Preserving Transformation: Generating Diverse and Transferable Adversarial Examples |
Authors | Dan Peng, Zizhan Zheng, Xiaofeng Zhang |
Abstract | Adversarial examples are perturbed inputs designed to fool machine learning models. Most recent works on adversarial examples for image classification focus on directly modifying pixels with minor perturbations. A common requirement in all these works is that the malicious perturbations should be small enough (measured by an L_p norm for some p) so that they are imperceptible to humans. However, small perturbations can be unnecessarily restrictive and limit the diversity of adversarial examples generated. Further, an L_p norm based distance metric ignores important structure patterns hidden in images that are important to human perception. Consequently, even the minor perturbation introduced in recent works often makes the adversarial examples less natural to humans. More importantly, they often do not transfer well and are therefore less effective when attacking black-box models especially for those protected by a defense mechanism. In this paper, we propose a structure-preserving transformation (SPT) for generating natural and diverse adversarial examples with extremely high transferability. The key idea of our approach is to allow perceptible deviation in adversarial examples while keeping structure patterns that are central to a human classifier. Empirical results on the MNIST and the fashion-MNIST datasets show that adversarial examples generated by our approach can easily bypass strong adversarial training. Further, they transfer well to other target models with no loss or little loss of successful attack rate. |
Tasks | Image Classification |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02786v3 |
http://arxiv.org/pdf/1809.02786v3.pdf | |
PWC | https://paperswithcode.com/paper/structure-preserving-transformation |
Repo | |
Framework | |
Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning
Title | Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning |
Authors | Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy Hospedales |
Abstract | Active learning (AL) aims to enable training high performance classifiers with low annotation cost by predicting which subset of unlabelled instances would be most beneficial to label. The importance of AL has motivated extensive research, proposing a wide variety of manually designed AL algorithms with diverse theoretical and intuitive motivations. In contrast to this body of research, we propose to treat active learning algorithm design as a meta-learning problem and learn the best criterion from data. We model an active learning algorithm as a deep neural network that inputs the base learner state and the unlabelled point set and predicts the best point to annotate next. Training this active query policy network with reinforcement learning produces the best non-myopic policy for a given dataset. The key challenge in achieving a general solution to AL then becomes that of learner generalisation, particularly across heterogeneous datasets. We propose a multi-task dataset-embedding approach that allows dataset-agnostic active learners to be trained. Our evaluation shows that AL algorithms trained in this way can directly generalise across diverse problems. |
Tasks | Active Learning, Meta-Learning |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04798v1 |
http://arxiv.org/pdf/1806.04798v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-transferable-active-learning |
Repo | |
Framework | |
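A minimal sketch of the "active query policy network" idea follows: the policy scores each unlabelled point from its features together with a summary of the base learner's state, and the next point to annotate is sampled from a softmax over those scores. The dimensions, state summary, and sampling step are assumptions, and the reinforcement-learning training loop (e.g. REINFORCE over episodes) is omitted.

```python
# Hedged sketch of an active query policy network: score each unlabelled point
# given the pool features and a base-learner state summary, then sample the
# next point to annotate.
import torch
import torch.nn as nn

class QueryPolicy(nn.Module):
    def __init__(self, point_dim, state_dim, hidden=64):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(point_dim + state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pool, learner_state):
        # pool: (n_unlabelled, point_dim); learner_state: (state_dim,)
        state = learner_state.unsqueeze(0).expand(pool.size(0), -1)
        scores = self.scorer(torch.cat([pool, state], dim=1)).squeeze(-1)
        return torch.softmax(scores, dim=0)        # distribution over the pool

policy = QueryPolicy(point_dim=10, state_dim=6)
pool = torch.randn(50, 10)                         # unlabelled candidates
learner_state = torch.randn(6)                     # e.g. summary of classifier margins
probs = policy(pool, learner_state)
query_index = torch.multinomial(probs, 1).item()   # point to annotate next
print(query_index, float(probs[query_index]))
```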
Performance Impact Caused by Hidden Bias of Training Data for Recognizing Textual Entailment
Title | Performance Impact Caused by Hidden Bias of Training Data for Recognizing Textual Entailment |
Authors | Masatoshi Tsuchiya |
Abstract | The quality of training data is one of the crucial problems when a learning-centered approach is employed. This paper proposes a new method to investigate the quality of a large corpus designed for the recognizing textual entailment (RTE) task. The proposed method, which is inspired by a statistical hypothesis test, consists of two phases: the first phase introduces the predictability of textual entailment labels as a null hypothesis, one that should be firmly rejected if the target corpus has no hidden bias; the second phase tests this null hypothesis using a Naive Bayes model. The experiment on the Stanford Natural Language Inference (SNLI) corpus does not reject the null hypothesis. This indicates that the SNLI corpus has a hidden bias which allows textual entailment labels to be predicted from hypothesis sentences alone, even when no context information is given by a premise sentence. The paper also presents the performance impact of this hidden bias on neural network models for RTE. |
Tasks | Natural Language Inference |
Published | 2018-04-22 |
URL | http://arxiv.org/abs/1804.08117v1 |
http://arxiv.org/pdf/1804.08117v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-impact-caused-by-hidden-bias-of |
Repo | |
Framework | |
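The proposed test amounts to checking whether labels can be predicted from the hypothesis sentence alone. The sketch below shows a hypothesis-only bag-of-words Naive Bayes classifier of the kind the abstract describes; the inline examples are toy stand-ins for SNLI.

```python
# Hedged sketch of the bias check: train Naive Bayes on hypothesis sentences
# only and see whether it predicts entailment labels better than chance.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

hypotheses = [
    "a man is sleeping",                 # contradiction-style cue: "sleeping"
    "nobody is outside",                 # contradiction-style cue: "nobody"
    "a person is outdoors",              # entailment-style generic rewording
    "a woman is outside",
    "the man is waiting for a cab",      # neutral-style added detail
    "the girl is on her way to school",
]
labels = ["contradiction", "contradiction", "entailment",
          "entailment", "neutral", "neutral"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(hypotheses, labels)

# Above-chance accuracy on held-out hypotheses alone would indicate exactly
# the kind of hidden bias the paper reports for SNLI.
print(clf.predict(["nobody is sleeping", "a person is outside"]))
```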
Neural Disease Named Entity Extraction with Character-based BiLSTM+CRF in Japanese Medical Text
Title | Neural Disease Named Entity Extraction with Character-based BiLSTM+CRF in Japanese Medical Text |
Authors | Ken Yano |
Abstract | We propose an ‘end-to-end’ character-based recurrent neural network that extracts disease named entities from Japanese medical text and simultaneously judges their modality as either positive or negative; i.e., whether the mentioned disease or symptom is affirmed or negated. The motivation for adopting neural networks is to learn effective lexical and structural representation features for entity recognition and for positive/negative classification from annotated corpora, without explicitly providing any rule-based or manual feature sets. The results confirm the superiority of our method over previous character-based CRF and SVM methods. |
Tasks | Entity Extraction |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03648v1 |
http://arxiv.org/pdf/1806.03648v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-disease-named-entity-extraction-with |
Repo | |
Framework | |
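A minimal sketch of the character-level tagger described above: character embeddings feed a bidirectional LSTM whose per-character outputs are tagged with entity-plus-modality labels. For brevity a per-character softmax stands in for the CRF output layer used in the paper, and the tag set and sizes are illustrative.

```python
# Hedged sketch of a character-based BiLSTM tagger for entity + modality
# labels (softmax output stands in for the paper's CRF layer).
import torch
import torch.nn as nn

class CharBiLSTMTagger(nn.Module):
    def __init__(self, n_chars, n_tags, char_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.lstm = nn.LSTM(char_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer character indices
        h, _ = self.lstm(self.emb(char_ids))
        return self.out(h)                          # (batch, seq_len, n_tags)

# Toy usage with a made-up tag set: O, B-POS, I-POS, B-NEG, I-NEG
tags = ["O", "B-POS", "I-POS", "B-NEG", "I-NEG"]
model = CharBiLSTMTagger(n_chars=4000, n_tags=len(tags))
char_ids = torch.randint(1, 4000, (2, 40))          # two sequences of 40 characters
logits = model(char_ids)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, len(tags)),
                             torch.randint(0, len(tags), (2 * 40,)))
print(logits.shape, float(loss))
```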
QRFA: A Data-Driven Model of Information-Seeking Dialogues
Title | QRFA: A Data-Driven Model of Information-Seeking Dialogues |
Authors | Svitlana Vakulenko, Kate Revoredo, Claudio Di Ciccio, Maarten de Rijke |
Abstract | Understanding the structure of interaction processes helps us to improve information-seeking dialogue systems. Analyzing an interaction process boils down to discovering patterns in sequences of alternating utterances exchanged between a user and an agent. Process mining techniques have been successfully applied to analyze structured event logs, discovering the underlying process models or evaluating whether the observed behavior is in conformance with the known process. In this paper, we apply process mining techniques to discover patterns in conversational transcripts and extract a new model of information-seeking dialogues, QRFA, for Query, Request, Feedback, Answer. Our results are grounded in an empirical evaluation across multiple conversational datasets from different domains, which was never attempted before. We show that the QRFA model better reflects conversation flows observed in real information-seeking conversations than models proposed previously. Moreover, QRFA allows us to identify malfunctioning in dialogue system transcripts as deviations from the expected conversation flow described by the model via conformance analysis. |
Tasks | |
Published | 2018-12-27 |
URL | http://arxiv.org/abs/1812.10720v1 |
http://arxiv.org/pdf/1812.10720v1.pdf | |
PWC | https://paperswithcode.com/paper/qrfa-a-data-driven-model-of-information |
Repo | |
Framework | |
Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation
Title | Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation |
Authors | Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang |
Abstract | Teacher-student (T/S) learning has been shown to be effective in unsupervised domain adaptation [1]. It is a form of transfer learning that transfers not recognition decisions but knowledge of the posterior probabilities in the source domain, as evaluated by the teacher model. It learns to handle the speaker and environment variability inherent in and restricted to the speech signal in the target domain without proactively addressing the robustness to other likely conditions. Performance degradation may thus ensue. In this work, we advance T/S learning by proposing adversarial T/S learning to explicitly achieve condition-robust unsupervised domain adaptation. In this method, a student acoustic model and a condition classifier are jointly optimized to minimize the Kullback-Leibler divergence between the output distributions of the teacher and student models, and simultaneously, to min-maximize the condition classification loss. A condition-invariant deep feature is learned in the adapted student model through this procedure. We further propose multi-factorial adversarial T/S learning which suppresses condition variabilities caused by multiple factors simultaneously. Evaluated with the noisy CHiME-3 test set, the proposed methods achieve relative word error rate improvements of 44.60% and 5.38%, respectively, over a clean source model and a strong T/S learning baseline model. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00644v2 |
http://arxiv.org/pdf/1804.00644v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-teacher-student-learning-for |
Repo | |
Framework | |
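The objective described above combines a KL term pulling the student's outputs toward the teacher's posteriors with an adversarial condition classifier on the student's deep feature. The sketch below wires these together with a gradient-reversal layer; the network sizes, loss weighting, and toy inputs are assumptions rather than the paper's configuration.

```python
# Hedged sketch of adversarial teacher-student training: KL(teacher || student
# outputs) plus a condition classifier trained adversarially via gradient reversal.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

feat_dim, n_senones, n_conditions = 40, 100, 4
teacher = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, n_senones))
student_encoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
student_head = nn.Linear(128, n_senones)
condition_clf = nn.Linear(128, n_conditions)

x_src = torch.randn(8, feat_dim)     # parallel source-domain input for the teacher
x_tgt = torch.randn(8, feat_dim)     # target-domain input for the student
cond = torch.randint(0, n_conditions, (8,))

with torch.no_grad():
    teacher_post = F.softmax(teacher(x_src), dim=1)   # teacher posteriors (fixed)

deep_feat = student_encoder(x_tgt)
student_logp = F.log_softmax(student_head(deep_feat), dim=1)

kl_loss = F.kl_div(student_logp, teacher_post, reduction="batchmean")
adv_loss = F.cross_entropy(condition_clf(GradReverse.apply(deep_feat, 0.5)), cond)
total = kl_loss + adv_loss            # classifier minimises adv_loss; encoder, via
total.backward()                      # the reversed gradient, maximises it
print(float(kl_loss), float(adv_loss))
```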