Paper Group NANR 10
Make SVM great again with Siamese kernel for few-shot learning. Learning objects from pixels. Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning. Texar: A Modularized, Versatile, and Extensible Toolbox for Text Generation. Report of NEWS 2018 Named Entity Transliteration Shared Task. Why does PairDiff work? - A Mathemat …
Make SVM great again with Siamese kernel for few-shot learning
Title | Make SVM great again with Siamese kernel for few-shot learning |
Authors | Bence Tilk |
Abstract | While deep neural networks have shown outstanding results in a wide range of applications, learning from a very limited number of examples remains a challenging task. Despite the difficulties of few-shot learning, metric-learning techniques have demonstrated the potential of neural networks for this task, yet their results are still not fully satisfactory. In this work, the idea of metric learning is extended with the working mechanism of Support Vector Machines (SVMs), which are well known for their generalization capabilities on small datasets. Furthermore, the paper presents an end-to-end learning framework for training adaptive-kernel SVMs, which eliminates the problem of choosing a correct kernel and good features for the SVM. The one-shot learning problem is also redefined for audio signals, and the model is evaluated on both a vision task (the Omniglot dataset) and a speech task (the TIMIT dataset). On Omniglot, the algorithm improves accuracy from 98.1% to 98.5% on one-shot classification and from 98.9% to 99.3% on few-shot classification. |
Tasks | Few-Shot Learning, Metric Learning, Omniglot, One-Shot Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1EVwkqTW |
PDF | https://openreview.net/pdf?id=B1EVwkqTW |
PWC | https://paperswithcode.com/paper/make-svm-great-again-with-siamese-kernel-for |
Repo | |
Framework | |
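A minimal sketch of the general idea above (not the authors' exact architecture): a Siamese embedding network defines a kernel over its outputs, and an SVM classifies with that kernel. The paper trains the embedding and the SVM end-to-end; here the embedding is left untrained, the RBF form of the kernel and the `siamese_kernel` helper are assumptions, and the few-shot data is synthetic, purely to keep the example short.

```python
# Toy "Siamese kernel" SVM: an embedding network induces a kernel k(x, x'),
# and an SVM with that precomputed kernel does the few-shot classification.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

torch.manual_seed(0)
embed = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))

def siamese_kernel(A, B, gamma=0.5):
    """RBF kernel computed on the Siamese embeddings (assumed kernel form)."""
    with torch.no_grad():
        ea = embed(torch.as_tensor(A, dtype=torch.float32)).numpy()
        eb = embed(torch.as_tensor(B, dtype=torch.float32)).numpy()
    d2 = ((ea[:, None, :] - eb[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# 5-way few-shot toy data: 5 classes, a handful of 64-d feature vectors each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(25, 64)); y_train = np.repeat(np.arange(5), 5)
X_test = rng.normal(size=(10, 64)); y_test = np.repeat(np.arange(5), 2)

svm = SVC(kernel="precomputed")
svm.fit(siamese_kernel(X_train, X_train), y_train)
pred = svm.predict(siamese_kernel(X_test, X_train))
print("toy accuracy:", (pred == y_test).mean())
```

Making the kernel end-to-end trainable, as the paper proposes, would additionally require backpropagating the SVM objective into the embedding network rather than fixing it as done here.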
Learning objects from pixels
Title | Learning objects from pixels |
Authors | David Saxton |
Abstract | We show how discrete objects can be learnt in an unsupervised fashion from pixels, and how to perform reinforcement learning using this object representation. More precisely, we construct a differentiable mapping from an image to a discrete tabular list of objects, where each object consists of a differentiable position, feature vector, and scalar presence value that allows the representation to be learnt using an attention mechanism. Applying this mapping to Atari games, together with an interaction net-style architecture for calculating quantities from objects, we construct agents that can play Atari games using objects learnt in an unsupervised fashion. During training, many natural objects emerge, such as the ball and paddles in Pong, and the submarine and fish in Seaquest. This gives the first reinforcement learning agent for Atari with an interpretable object representation, and opens the avenue for agents that can conduct object-based exploration and generalization. |
Tasks | Atari Games |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJDUjKeA- |
PDF | https://openreview.net/pdf?id=HJDUjKeA- |
PWC | https://paperswithcode.com/paper/learning-objects-from-pixels |
Repo | |
Framework | |
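A minimal sketch of a differentiable image-to-object-list mapping in the spirit described above: each spatial cell of a convolutional feature map emits one candidate object with a position, a feature vector, and a scalar presence value. The layer sizes, the offset-based position parameterisation, and the `PixelsToObjects` class are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class PixelsToObjects(nn.Module):
    def __init__(self, feat_dim=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        # per-cell head: presence logit (1) + position offset (2) + features
        self.head = nn.Conv2d(16, 3 + feat_dim, 1)

    def forward(self, img):
        h = self.head(self.backbone(img))                  # (B, 3+D, H, W)
        B, _, H, W = h.shape
        flat = h.flatten(2).transpose(1, 2)                # (B, HW, 3+D)
        presence = torch.sigmoid(flat[..., :1])            # scalar presence
        offset = torch.tanh(flat[..., 1:3]) / max(H, W)    # differentiable shift
        feats = flat[..., 3:]                              # feature vectors
        ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                                torch.linspace(0, 1, W), indexing="ij")
        grid = torch.stack([xs, ys], -1).reshape(1, -1, 2)
        pos = grid + offset                                # predicted positions
        # tabular object list: [x, y, feature vector, presence]
        return torch.cat([pos, feats, presence], dim=-1)

objs = PixelsToObjects()(torch.rand(2, 3, 84, 84))
print(objs.shape)                                          # (2, 441, 2 + 8 + 1)
```

The presence scalar is what lets downstream attention (or an interaction-net-style module, as in the paper) weight objects softly, keeping the whole mapping differentiable.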
Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning
Title | Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning |
Authors | Pravallika Etoori, Manoj Chinnakotla, Radhika Mamidi |
Abstract | Spelling correction is a well-known task in Natural Language Processing (NLP). Automatic spelling correction is important for many NLP applications such as web search engines, text summarization, and sentiment analysis. Most approaches use parallel data of noisy-to-correct word mappings from different sources as training data for automatic spelling correction. Indic languages are resource-scarce and lack such parallel data due to the low volume of queries and the absence of prior implementations. In this paper, we show how to build an automatic spelling corrector for resource-scarce languages. We propose a sequence-to-sequence deep learning model trained end-to-end. We perform experiments on synthetic datasets created for the Indic languages Hindi and Telugu by injecting spelling mistakes at the character level. A comparative evaluation shows that our model is competitive with existing spell checking and correction techniques for Indic languages. |
Tasks | Machine Translation, Sentiment Analysis, Spelling Correction, Text Summarization |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-3021/ |
PDF | https://www.aclweb.org/anthology/P18-3021 |
PWC | https://paperswithcode.com/paper/automatic-spelling-correction-for-resource |
Repo | |
Framework | |
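A short sketch of the data-construction step the abstract describes: building a synthetic (noisy, clean) parallel corpus by injecting character-level errors, which then feeds a character-level sequence-to-sequence corrector. The `corrupt` helper, the error types, and the error rate below are illustrative guesses, not the paper's exact noise model.

```python
import random

random.seed(0)

def corrupt(word, p=0.3):
    """Apply at most one random character-level edit with probability p."""
    if len(word) < 2 or random.random() > p:
        return word
    i = random.randrange(len(word))
    op = random.choice(["delete", "swap", "repeat"])
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "swap" and i + 1 < len(word):
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word[:i] + word[i] + word[i:]          # repeat a character

vocab = ["saturday", "telugu", "language", "spelling"]
pairs = [(corrupt(w), w) for w in vocab for _ in range(3)]
print(pairs[:4])   # (noisy, clean) training pairs for a char-level seq2seq
```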
Texar: A Modularized, Versatile, and Extensible Toolbox for Text Generation
Title | Texar: A Modularized, Versatile, and Extensible Toolbox for Text Generation |
Authors | Zhiting Hu, Zichao Yang, Tiancheng Zhao, Haoran Shi, Junxian He, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Lianhui Qin, Devendra Singh Chaplot, Bowen Tan, Xingjiang Yu, Eric Xing |
Abstract | We introduce Texar, an open-source toolkit that aims to support a broad set of text generation tasks. Unlike many existing toolkits that are specialized for specific applications (e.g., neural machine translation), Texar is designed to be highly flexible and versatile. This is achieved by abstracting the common patterns underlying diverse tasks and methodologies, creating a library of highly reusable modules and functionalities, and enabling arbitrary model architectures and various algorithmic paradigms. These features make Texar particularly suitable for technique sharing and generalization across different text generation applications. The toolkit places a heavy emphasis on extensibility and modularized system design, so that components can be freely plugged in or swapped out. We conduct extensive experiments and case studies to demonstrate the use and advantages of the toolkit. |
Tasks | Image Captioning, Machine Translation, Text Generation, Text Summarization |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2503/ |
PDF | https://www.aclweb.org/anthology/W18-2503 |
PWC | https://paperswithcode.com/paper/texar-a-modularized-versatile-and-extensible |
Repo | |
Framework | |
Report of NEWS 2018 Named Entity Transliteration Shared Task
Title | Report of NEWS 2018 Named Entity Transliteration Shared Task |
Authors | Nancy Chen, Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li |
Abstract | This report presents the results from the Named Entity Transliteration Shared Task conducted as part of The Seventh Named Entities Workshop (NEWS 2018) held at ACL 2018 in Melbourne, Australia. Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts. A total of 6 teams from 8 different institutions participated in the evaluation, submitting 424 runs, involving different transliteration methodologies. Four performance metrics were used to report the evaluation results. The NEWS shared task on machine transliteration has successfully achieved its objectives by providing a common ground for the research community to conduct comparative evaluations of state-of-the-art technologies that will benefit the future research and development in this area. |
Tasks | Information Retrieval, Machine Translation, Transliteration |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2409/ |
PDF | https://www.aclweb.org/anthology/W18-2409 |
PWC | https://paperswithcode.com/paper/report-of-news-2018-named-entity |
Repo | |
Framework | |
Why does PairDiff work? - A Mathematical Analysis of Bilinear Relational Compositional Operators for Analogy Detection
Title | Why does PairDiff work? - A Mathematical Analysis of Bilinear Relational Compositional Operators for Analogy Detection |
Authors | Huda Hakami, Kohei Hayashi, Danushka Bollegala |
Abstract | Representing the semantic relations that exist between two given words (or entities) is an important first step in a wide range of NLP applications such as analogical reasoning, knowledge base completion and relational information retrieval. A simple, yet surprisingly accurate, method for representing the relation between two words is to compute the vector offset (PairDiff) between their corresponding word embeddings. Despite this empirical success, it remains unclear whether PairDiff is the best operator for obtaining a relational representation from word embeddings. We conduct a theoretical analysis of generalised bilinear operators that can be used to measure the l2 relational distance between two word pairs. We show that, if the word embeddings are standardised and uncorrelated, such an operator will be independent of bilinear terms and can be simplified to a linear form, of which PairDiff is a special case. For numerous word embedding types, we empirically verify the uncorrelation assumption, demonstrating the general applicability of our theoretical result. Moreover, we experimentally discover PairDiff from the bilinear relational compositional operator on several benchmark analogy datasets. |
Tasks | Information Retrieval, Knowledge Base Completion, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1211/ |
PDF | https://www.aclweb.org/anthology/C18-1211 |
PWC | https://paperswithcode.com/paper/why-does-pairdiff-work-a-mathematical |
Repo | |
Framework | |
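A brief worked-equation sketch of the operators discussed above; the notation is illustrative and the paper's exact parameterisation may differ.

```latex
% General bilinear relation composition for a word pair (a, b):
\[
  f(a,b) \;=\; \mathbf{A}\,(a \otimes b) \;+\; \mathbf{B}\,a \;+\; \mathbf{C}\,b .
\]
% Under the standardised/uncorrelated embedding assumption the bilinear term
% drops out, leaving the linear form f(a,b) = \mathbf{B}a + \mathbf{C}b, of
% which PairDiff is the special case \mathbf{B} = -I, \mathbf{C} = I:
\[
  f_{\text{PairDiff}}(a,b) \;=\; b - a,
  \qquad
  d\bigl((a,b),(c,d)\bigr) \;=\; \bigl\lVert f(a,b) - f(c,d) \bigr\rVert_2 .
\]
```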
Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018)
Title | Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018) |
Authors | |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0100/ |
PDF | https://www.aclweb.org/anthology/W18-0100 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-8th-workshop-on-cognitive |
Repo | |
Framework | |
The Automatic Annotation of the Semiotic Type of Hand Gestures in Obama’s Humorous Speeches
Title | The Automatic Annotation of the Semiotic Type of Hand Gestures in Obama’s Humorous Speeches |
Authors | Costanza Navarretta |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1172/ |
PDF | https://www.aclweb.org/anthology/L18-1172 |
PWC | https://paperswithcode.com/paper/the-automatic-annotation-of-the-semiotic-type |
Repo | |
Framework | |
UniMa at SemEval-2018 Task 7: Semantic Relation Extraction and Classification from Scientific Publications
Title | UniMa at SemEval-2018 Task 7: Semantic Relation Extraction and Classification from Scientific Publications |
Authors | Thorsten Keiper, Zhonghao Lyu, Sara Pooladzadeh, Yuan Xu, Jingyi Zhang, Anne Lauscher, Simone Paolo Ponzetto |
Abstract | Large repositories of scientific literature call for the development of robust methods to extract information from scholarly papers. This problem is addressed by the SemEval 2018 Task 7 on extracting and classifying relations found within scientific publications. In this paper, we present a feature-based and a deep learning-based approach to the task and discuss the results of the system runs that we submitted for evaluation. |
Tasks | Relation Classification, Relation Extraction |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1132/ |
PDF | https://www.aclweb.org/anthology/S18-1132 |
PWC | https://paperswithcode.com/paper/unima-at-semeval-2018-task-7-semantic |
Repo | |
Framework | |
Training Neural Machines with Trace-Based Supervision
Title | Training Neural Machines with Trace-Based Supervision |
Authors | Matthew Mirman, Dimitar Dimitrov, Pavle Djordjevic, Timon Gehr, Martin Vechev |
Abstract | We investigate the effectiveness of trace-based supervision methods for training existing neural abstract machines. To define the class of neural machines amenable to trace-based supervision, we introduce the concept of a differential neural computational machine (dNCM) and show that several existing architectures (NTMs, NRAMs) can be described as dNCMs. We performed a detailed experimental evaluation with NTM and NRAM machines, showing that additional supervision on the interpretable portions of these architectures leads to better convergence and generalization capabilities of the learning phase than standard training, in both noise-free and noisy scenarios. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2452 |
PDF | http://proceedings.mlr.press/v80/mirman18a/mirman18a.pdf |
PWC | https://paperswithcode.com/paper/training-neural-machines-with-trace-based |
Repo | |
Framework | |
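A minimal sketch of trace-based supervision in the spirit described above: alongside the end-task loss, an auxiliary loss ties an interpretable head (here, a per-step "operation" distribution) to a given execution trace. The `TinyController` below is a generic stand-in, not an actual NTM or NRAM, and the loss weight is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_OPS, STEPS, HID = 4, 5, 32

class TinyController(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(8, HID, batch_first=True)
        self.op_head = nn.Linear(HID, NUM_OPS)   # interpretable portion
        self.out_head = nn.Linear(HID, 10)       # task output

    def forward(self, x):
        h, _ = self.rnn(x)                       # (B, STEPS, HID)
        return self.op_head(h), self.out_head(h[:, -1])

model = TinyController()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(16, STEPS, 8)
y = torch.randint(0, 10, (16,))                  # task labels
trace = torch.randint(0, NUM_OPS, (16, STEPS))   # which op "should" run per step

op_logits, out = model(x)
task_loss = F.cross_entropy(out, y)
trace_loss = F.cross_entropy(op_logits.reshape(-1, NUM_OPS), trace.reshape(-1))
loss = task_loss + 0.5 * trace_loss              # trade-off weight is assumed
loss.backward()
opt.step()
print(float(task_loss), float(trace_loss))
```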
K-means clustering using random matrix sparsification
Title | K-means clustering using random matrix sparsification |
Authors | Kaushik Sinha |
Abstract | The k-means clustering algorithm with Lloyd’s heuristic is one of the most commonly used tools in data mining and machine learning and shows promising performance. However, it suffers from a high computational cost resulting from pairwise Euclidean distance computations between data points and cluster centers in each iteration of Lloyd’s heuristic. The main contributor to this computational bottleneck is a matrix-vector multiplication step, where the matrix contains all the data points and the vector is a cluster center. In this paper we show that we can randomly sparsify the original data matrix, producing a sparse data matrix that significantly speeds up the above-mentioned matrix-vector multiplication step without significantly affecting cluster quality. In particular, we show that the optimal k-means clustering solution of the sparse data matrix, obtained by applying random matrix sparsification, yields an approximately optimal k-means clustering objective on the original data matrix. Our empirical studies on three real-world datasets corroborate our theoretical findings and demonstrate that the proposed sparsification method can indeed achieve satisfactory clustering performance. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2072 |
PDF | http://proceedings.mlr.press/v80/sinha18a/sinha18a.pdf |
PWC | https://paperswithcode.com/paper/k-means-clustering-using-random-matrix |
Repo | |
Framework | |
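A sketch of the speed-up idea described above: randomly sparsify the data matrix, then run Lloyd's k-means on the sparse copy. Keeping each entry with probability p and rescaling by 1/p (a common unbiased scheme) is assumed here; the paper's exact sampling distribution may differ.

```python
import numpy as np
from scipy import sparse
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 4 well-separated Gaussian clusters of 500 points each in 100 dimensions.
X = rng.normal(size=(2000, 100)) + np.repeat(rng.normal(size=(4, 100)) * 5, 500, axis=0)

def sparsify(X, p=0.2):
    """Keep each entry with probability p, rescale by 1/p (unbiased in expectation)."""
    mask = rng.random(X.shape) < p
    return sparse.csr_matrix(np.where(mask, X / p, 0.0))

X_sparse = sparsify(X)
labels_full = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
labels_sparse = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_sparse)
print("nnz fraction:", X_sparse.nnz / X.size)
# The sparse-run labels should largely agree with the full-run labels (up to a
# permutation of cluster ids), while each distance evaluation now touches only
# about a p-fraction of the coordinates.
```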
Chat Discrimination for Intelligent Conversational Agents with a Hybrid CNN-LMTGRU Network
Title | Chat Discrimination for Intelligent Conversational Agents with a Hybrid CNN-LMTGRU Network |
Authors | Dennis Singh Moirangthem, Minho Lee |
Abstract | Recently, intelligent dialog systems and smart assistants have attracted the attention of many, and the development of novel dialogue agents has become a research challenge. Intelligent agents that can handle both domain-specific task-oriented and open-domain chit-chat dialogs are among the major requirements of current systems. In order to address this issue and realize such smart hybrid dialogue systems, we develop a model that discriminates user utterances between task-oriented and chit-chat conversations. We introduce a hybrid of a convolutional neural network (CNN) and lateral multiple-timescale gated recurrent units (LMTGRU) that can represent multiple temporal-scale dependencies for the discrimination task. With the help of the combined slow and fast units of the LMTGRU, our model effectively determines whether a user will have a chit-chat conversation or a task-specific conversation with the system. We also show that the LMTGRU structure helps the model perform well on longer text inputs. We address the lack of a suitable dataset by constructing one from Twitter and Maluuba Frames data. The experimental results demonstrate that the proposed hybrid network outperforms conventional models on the chat discrimination task and performs comparably to the baselines on various benchmark datasets. |
Tasks | Representation Learning |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3004/ |
PDF | https://www.aclweb.org/anthology/W18-3004 |
PWC | https://paperswithcode.com/paper/chat-discrimination-for-intelligent |
Repo | |
Framework | |
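A rough sketch of a hybrid CNN plus multi-timescale recurrent classifier in the spirit of the model above. The "slow" pathway here simply smooths its GRU states with a large time constant; this is one interpretation of combining slow and fast units, not the authors' exact LMTGRU cell, and the `CNNMultiTimescale` class is hypothetical.

```python
import torch
import torch.nn as nn

class CNNMultiTimescale(nn.Module):
    def __init__(self, vocab=5000, emb=64, hid=64, slow_tau=8.0):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.cnn = nn.Sequential(nn.Conv1d(emb, hid, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveMaxPool1d(1))
        self.fast = nn.GRU(emb, hid, batch_first=True)
        self.slow = nn.GRU(emb, hid, batch_first=True)
        self.tau = slow_tau
        self.cls = nn.Linear(hid * 3, 2)          # chit-chat vs. task-oriented

    def forward(self, tokens):                    # tokens: (B, T) int ids
        e = self.emb(tokens)
        c = self.cnn(e.transpose(1, 2)).squeeze(-1)
        fast_out, _ = self.fast(e)
        slow_out, _ = self.slow(e)
        # leaky integration: the slow pathway changes with time constant tau
        mixed = torch.zeros_like(slow_out[:, 0])
        for t in range(slow_out.size(1)):
            mixed = (1 - 1 / self.tau) * mixed + (1 / self.tau) * slow_out[:, t]
        return self.cls(torch.cat([c, fast_out[:, -1], mixed], dim=-1))

logits = CNNMultiTimescale()(torch.randint(0, 5000, (4, 20)))
print(logits.shape)  # (4, 2)
```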
HL-EncDec: A Hybrid-Level Encoder-Decoder for Neural Response Generation
Title | HL-EncDec: A Hybrid-Level Encoder-Decoder for Neural Response Generation |
Authors | Sixing Wu, Dawei Zhang, Ying Li, Xing Xie, Zhonghai Wu |
Abstract | Recent years have witnessed a surge of interest in response generation for neural conversation systems. Most existing models follow the Encoder-Decoder framework and operate on conversation sentences at the word level. Word-level models suffer from the unknown-words issue and the preference issue, which seriously impact the quality of generated responses; for example, responses may become irrelevant or too general (i.e., safe responses). To address these issues, this paper proposes a hybrid-level Encoder-Decoder model (HL-EncDec), which utilizes not only word-level features but also character-level features. We conduct several experiments to evaluate HL-EncDec on a Chinese corpus; the results show that our model significantly outperforms other non-word-level models on automatic metrics and in human annotation, and that it generates more informative responses. We also conduct experiments with a small-scale English dataset to demonstrate generalization ability. |
Tasks | Abstractive Text Summarization, Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1072/ |
PDF | https://www.aclweb.org/anthology/C18-1072 |
PWC | https://paperswithcode.com/paper/hl-encdec-a-hybrid-level-encoder-decoder-for |
Repo | |
Framework | |
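A sketch of one way to combine word-level and character-level features in an encoder, in the spirit of the hybrid-level model above: each word embedding is concatenated with the final state of a character GRU run over the word's characters, so unknown or rare words still receive a usable representation. This is a plausible reading, not necessarily the authors' exact design, and the `HybridWordEncoder` class is hypothetical.

```python
import torch
import torch.nn as nn

class HybridWordEncoder(nn.Module):
    def __init__(self, word_vocab=10000, char_vocab=128, w_dim=64, c_dim=32):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, w_dim)   # index 0 = <unk>
        self.char_emb = nn.Embedding(char_vocab, c_dim)
        self.char_rnn = nn.GRU(c_dim, c_dim, batch_first=True)
        self.enc = nn.GRU(w_dim + c_dim, 128, batch_first=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (B, T); char_ids: (B, T, L) character ids per word
        B, T, L = char_ids.shape
        _, ch = self.char_rnn(self.char_emb(char_ids.view(B * T, L)))
        ch = ch.squeeze(0).view(B, T, -1)                 # (B, T, c_dim)
        hybrid = torch.cat([self.word_emb(word_ids), ch], dim=-1)
        return self.enc(hybrid)                           # outputs, final state

enc_out, _ = HybridWordEncoder()(torch.randint(0, 10000, (2, 7)),
                                 torch.randint(0, 128, (2, 7, 12)))
print(enc_out.shape)  # (2, 7, 128)
```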
Guess Me if You Can: Acronym Disambiguation for Enterprises
Title | Guess Me if You Can: Acronym Disambiguation for Enterprises |
Authors | Yang Li, Bo Zhao, Ariel Fuxman, Fangbo Tao |
Abstract | Acronyms are abbreviations formed from the initial components of words or phrases. In enterprises, people often use acronyms to make communications more efficient. However, acronyms can be difficult to understand for people who are not familiar with the subject matter (new employees, etc.), thereby affecting productivity. To alleviate such troubles, we study how to automatically resolve the true meanings of acronyms in a given context. Acronym disambiguation for enterprises is challenging for several reasons. First, acronyms may be highly ambiguous, since an acronym used in the enterprise could have multiple internal and external meanings. Second, there are usually no comprehensive knowledge bases such as Wikipedia available in enterprises. Finally, the system should be generic enough to work for any enterprise. In this work we propose an end-to-end framework to tackle all these challenges. The framework takes the enterprise corpus as input and produces a high-quality acronym disambiguation system as output. Our disambiguation models are trained via distant supervised learning, without requiring any manually labeled training examples. Therefore, our proposed framework can be deployed to any enterprise to support high-quality acronym disambiguation. Experimental results on real-world data justify the effectiveness of our system. |
Tasks | Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1121/ |
PDF | https://www.aclweb.org/anthology/P18-1121 |
PWC | https://paperswithcode.com/paper/guess-me-if-you-can-acronym-disambiguation |
Repo | |
Framework | |
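A sketch of the distant-supervision step described above: mine (context, expansion) training pairs wherever the corpus spells out an acronym next to its expansion, so no manual labelling is needed. The regular expression, the initials check, and the context window size below are illustrative choices, not the paper's exact pipeline.

```python
import re

PATTERN = re.compile(r"((?:[A-Z][a-z]+ ){1,6})\(([A-Z]{2,6})\)")

def mine_training_pairs(doc, window=6):
    """Collect distantly supervised (acronym, context, expansion) examples."""
    pairs = []
    for m in PATTERN.finditer(doc):
        expansion, acronym = m.group(1).strip(), m.group(2)
        # keep only expansions whose initials actually match the acronym
        initials = "".join(w[0] for w in expansion.split())
        if initials.upper() == acronym:
            left = doc[:m.start()].split()[-window:]
            pairs.append({"acronym": acronym,
                          "context": " ".join(left),
                          "label": expansion})
    return pairs

doc = ("The release was reviewed by the Product Quality Board (PQB) before "
       "shipping, while the Payroll Query Bot (PQB) handles HR tickets.")
for p in mine_training_pairs(doc):
    print(p)
```

The toy document deliberately reuses one acronym with two expansions: a context classifier trained on such mined pairs is what resolves which meaning is intended in a new sentence.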
CL Scholar: The ACL Anthology Knowledge Graph Miner
Title | CL Scholar: The ACL Anthology Knowledge Graph Miner |
Authors | Mayank Singh, Pradeep Dogga, Sohan Patro, Dhiraj Barnwal, Ritam Dutt, Rajarshi Haldar, Pawan Goyal, Animesh Mukherjee |
Abstract | We present CL Scholar, the ACL Anthology knowledge graph miner to facilitate high-quality search and exploration of current research progress in the computational linguistics community. In contrast to previous works, periodically crawling, indexing and processing of new incoming articles is completely automated in the current system. CL Scholar utilizes both textual and network information for knowledge graph construction. As an additional novel initiative, CL Scholar supports more than 1200 scholarly natural language queries along with standard keyword-based search on constructed knowledge graph. It answers binary, statistical and list based natural language queries. The current system is deployed at \url{http://cnerg.iitkgp.ac.in/aclakg}. We also provide REST API support along with bulk download facility. Our code and data are available at \url{https://github.com/CLScholar}. |
Tasks | Graph Construction, Optical Character Recognition |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-5004/ |
PDF | https://www.aclweb.org/anthology/N18-5004 |
PWC | https://paperswithcode.com/paper/cl-scholar-the-acl-anthology-knowledge-graph |
Repo | |
Framework | |