Paper Group NANR 81
Comparison of Assorted Models for Transliteration. Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning. Quadtree Convolutional Neural Networks. Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks. Deep Learning with Logged Bandit Feedback. Where and Why Are They Looking? Jointly In …
Comparison of Assorted Models for Transliteration
Title | Comparison of Assorted Models for Transliteration |
Authors | Saeed Najafi, Bradley Hauer, Rashed Rubby Riyadh, Leyuan Yu, Grzegorz Kondrak |
Abstract | We report the results of our experiments in the context of the NEWS 2018 Shared Task on Transliteration. We focus on the comparison of several diverse systems, including three neural MT models. A combination of discriminative, generative, and neural models obtains the best results on the development sets. We also put forward ideas for improving the shared task. |
Tasks | Transliteration |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2412/ |
https://www.aclweb.org/anthology/W18-2412 | |
PWC | https://paperswithcode.com/paper/comparison-of-assorted-models-for |
Repo | |
Framework | |
Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning
Title | Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning |
Authors | Xiang Yu, Ngoc Thang Vu, Jonas Kuhn |
Abstract | We present a general approach with reinforcement learning (RL) to approximate dynamic oracles for transition systems where exact dynamic oracles are difficult to derive. We treat oracle parsing as a reinforcement learning problem, design the reward function inspired by the classical dynamic oracle, and use Deep Q-Learning (DQN) techniques to train the oracle with gold trees as features. The combination of a priori knowledge and data-driven methods enables an efficient dynamic oracle, which improves the parser performance over static oracles in several transition systems. |
Tasks | Dependency Parsing, Imitation Learning, Q-Learning, Structured Prediction |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6021/ |
https://www.aclweb.org/anthology/W18-6021 | |
PWC | https://paperswithcode.com/paper/approximate-dynamic-oracle-for-dependency |
Repo | |
Framework | |
Quadtree Convolutional Neural Networks
Title | Quadtree Convolutional Neural Networks |
Authors | Pradeep Kumar Jayaraman, Jianhan Mei, Jianfei Cai, Jianmin Zheng |
Abstract | This paper presents a Quadtree Convolutional Neural Network (QCNN) for efficiently learning from image datasets representing sparse data such as handwriting, pen strokes, freehand sketches, etc. Instead of storing the sparse sketches in regular dense tensors, our method decomposes and represents the image as a linear quadtree that is only refined in the non-empty portions of the image. The actual image data corresponding to non-zero pixels is stored in the finest nodes of the quadtree. Convolution and pooling operations are restricted to the sparse pixels, leading to better efficiency in computation time as well as memory usage. Specifically, the computational and memory costs in QCNN grow linearly in the number of non-zero pixels, as opposed to traditional CNNs where the costs are quadratic in the number of pixels. This enables QCNN to learn from sparse images much faster and process high resolution images without the memory constraints faced by traditional CNNs. We study QCNN on four sparse image datasets for classification and sketch simplification tasks. The results show that QCNN can obtain comparable accuracy with large reduction in computational and memory costs. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Pradeep_Kumar_Jayaraman_Quadtree_Convolutional_Neural_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Pradeep_Kumar_Jayaraman_Quadtree_Convolutional_Neural_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/quadtree-convolutional-neural-networks |
Repo | |
Framework | |
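
The quadtree idea in the abstract above (refine only the non-empty parts of the image and keep pixel data in the finest nodes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_quadtree` is hypothetical, the image is assumed to be square with a power-of-two side, and the naive emptiness scan means this sketch does not reach the linear cost the paper reports.

```python
def build_quadtree(image, x=0, y=0, size=None):
    """Return (x, y, value) leaves for non-zero pixels.

    Hypothetical helper: assumes `image` is a square list of lists whose
    side length is a power of two. Empty quadrants are never subdivided.
    """
    if size is None:
        size = len(image)
    # Stop refining as soon as a region contains no non-zero pixel.
    if not any(image[r][c]
               for r in range(y, y + size)
               for c in range(x, x + size)):
        return []
    # A 1x1 non-empty region is a finest-level leaf holding the pixel value.
    if size == 1:
        return [(x, y, image[y][x])]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(build_quadtree(image, x + dx, y + dy, half))
    return leaves


sketch = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 7],
]
leaves = build_quadtree(sketch)
```

Only the two quadrants that contain ink are subdivided here; the other two are discarded after a single check.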
Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks
Title | Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks |
Authors | Qilong Wang, Zilin Gao, Jiangtao Xie, Wangmeng Zuo, Peihua Li |
Abstract | In most of existing deep convolutional neural networks (CNNs) for classification, global average (first-order) pooling (GAP) has become a standard module to summarize activations of the last convolution layer as final representation for prediction. Recent researches show integration of higher-order pooling (HOP) methods clearly improves performance of deep CNNs. However, both GAP and existing HOP methods assume unimodal distributions, which cannot fully capture statistics of convolutional activations, limiting representation ability of deep CNNs, especially for samples with complex contents. To overcome the above limitation, this paper proposes a global Gated Mixture of Second-order Pooling (GM-SOP) method to further improve representation ability of deep CNNs. To this end, we introduce a sparsity-constrained gating mechanism and propose a novel parametric SOP as component of mixture model. Given a bank of SOP candidates, our method can adaptively choose Top-K (K > 1) candidates for each input sample through the sparsity-constrained gating module, and performs weighted sum of outputs of K selected candidates as representation of the sample. The proposed GM-SOP can flexibly accommodate a large number of personalized SOP candidates in an efficient way, leading to richer representations. The deep networks with our GM-SOP can be end-to-end trained, having potential to characterize complex, multi-modal distributions. The proposed method is evaluated on two large scale image benchmarks (i.e., downsampled ImageNet-1K and Places365), and experimental results show our GM-SOP is superior to its counterparts and achieves very competitive performance. The source code will be available at http://www.peihuali.org/GM-SOP. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7403-global-gated-mixture-of-second-order-pooling-for-improving-deep-convolutional-neural-networks |
http://papers.nips.cc/paper/7403-global-gated-mixture-of-second-order-pooling-for-improving-deep-convolutional-neural-networks.pdf | |
PWC | https://paperswithcode.com/paper/global-gated-mixture-of-second-order-pooling |
Repo | |
Framework | |
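
The gating step described in the abstract above (choose the Top-K candidates per sample, then take a weighted sum of their outputs) can be sketched as follows. This is an illustration under assumptions, not the GM-SOP code: the name `top_k_gated_mixture` is hypothetical, the candidate outputs are plain vectors standing in for second-order pooled features, and renormalizing the selected gate scores with a softmax is an assumption the abstract does not confirm.

```python
import math


def top_k_gated_mixture(gate_scores, candidate_outputs, k):
    """Mix the outputs of the k highest-scoring candidates.

    Hypothetical sketch: weights are a softmax over the selected gate
    scores only, so every unselected candidate gets weight zero.
    """
    # Indices of the k largest gate scores (ties keep original order).
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected candidates.
    peak = max(gate_scores[i] for i in top)
    exps = {i: math.exp(gate_scores[i] - peak) for i in top}
    total = sum(exps.values())
    dim = len(candidate_outputs[0])
    mixed = [0.0] * dim
    for i in top:
        weight = exps[i] / total
        for d in range(dim):
            mixed[d] += weight * candidate_outputs[i][d]
    return mixed, top


mixed, chosen = top_k_gated_mixture(
    gate_scores=[0.0, 2.0, 2.0, -1.0],
    candidate_outputs=[[1.0, 1.0], [2.0, 0.0], [0.0, 4.0], [9.0, 9.0]],
    k=2,
)
```

Because only K candidates contribute to each sample, the size of the candidate bank can grow without every candidate being evaluated for every input.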
Deep Learning with Logged Bandit Feedback
Title | Deep Learning with Logged Bandit Feedback |
Authors | Thorsten Joachims, Adith Swaminathan, Maarten de Rijke |
Abstract | We propose a new output layer for deep neural networks that permits the use of logged contextual bandit feedback for training. Such contextual bandit feedback can be available in huge quantities (e.g., logs of search engines, recommender systems) at little cost, opening up a path for training deep networks on orders of magnitude more data. To this effect, we propose a Counterfactual Risk Minimization (CRM) approach for training deep networks using an equivariant empirical risk estimator with variance regularization, BanditNet, and show how the resulting objective can be decomposed in a way that allows Stochastic Gradient Descent (SGD) training. We empirically demonstrate the effectiveness of the method by showing how deep networks – ResNets in particular – can be trained for object recognition without conventionally labeled images. |
Tasks | Object Recognition, Recommendation Systems |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJaP_-xAb |
https://openreview.net/pdf?id=SJaP_-xAb | |
PWC | https://paperswithcode.com/paper/deep-learning-with-logged-bandit-feedback |
Repo | |
Framework | |
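
To illustrate what an equivariant risk estimator means for logged bandit feedback, the sketch below compares plain inverse propensity scoring (IPS) with its self-normalized variant. It assumes the self-normalized form is the kind of estimator the abstract refers to; the function names, the toy numbers, and the omission of the variance regularizer and of SGD training are all simplifications made here.

```python
def ips_risk(losses, new_probs, logged_probs):
    """Plain IPS estimate of the new policy's risk from logged data."""
    weights = [p / q for p, q in zip(new_probs, logged_probs)]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


def snips_risk(losses, new_probs, logged_probs):
    """Self-normalized IPS: divide by the sum of importance weights."""
    weights = [p / q for p, q in zip(new_probs, logged_probs)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy log (hypothetical values): observed loss, probability of the logged
# action under the new policy, and under the logging policy.
losses = [1.0, 0.0]
new_probs = [0.5, 0.25]
logged_probs = [0.25, 0.5]

shifted = [l + 1.0 for l in losses]  # add a constant to every loss
snips_shift = (snips_risk(shifted, new_probs, logged_probs)
               - snips_risk(losses, new_probs, logged_probs))
ips_shift = (ips_risk(shifted, new_probs, logged_probs)
             - ips_risk(losses, new_probs, logged_probs))
```

Adding a constant to every loss moves the self-normalized estimate by exactly that constant, whereas the plain IPS estimate moves by a different amount on this toy log.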
Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks
Title | Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks |
Authors | Ping Wei, Yang Liu, Tianmin Shu, Nanning Zheng, Song-Chun Zhu |
Abstract | This paper addresses a new problem - jointly inferring human attention, intentions, and tasks from videos. Given an RGB-D video where a human performs a task, we answer three questions simultaneously: 1) where the human is looking - attention prediction; 2) why the human is looking there - intention prediction; and 3) what task the human is performing - task recognition. We propose a hierarchical model of human-attention-object (HAO) which represents tasks, intentions, and attention under a unified framework. A task is represented as sequential intentions which transition to each other. An intention is composed of the human pose, attention, and objects. A beam search algorithm is adopted for inference on the HAO graph to output the attention, intention, and task results. We built a new video dataset of tasks, intentions, and attention. It contains 14 task classes, 70 intention categories, 28 object classes, 809 videos, and approximately 330,000 frames. Experiments show that our approach outperforms existing approaches. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wei_Where_and_Why_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Wei_Where_and_Why_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/where-and-why-are-they-looking-jointly |
Repo | |
Framework | |
Recurrent Relational Networks for complex relational reasoning
Title | Recurrent Relational Networks for complex relational reasoning |
Authors | Rasmus Berg Palm, Ulrich Paquet, Ole Winther |
Abstract | Humans possess an ability to abstractly reason about objects and their interactions, an ability not shared with state-of-the-art deep learning models. Relational networks, introduced by Santoro et al. (2017), add the capacity for relational reasoning to deep neural networks, but are limited in the complexity of the reasoning tasks they can address. We introduce recurrent relational networks which increase the suite of solvable tasks to those that require an order of magnitude more steps of relational reasoning. We use recurrent relational networks to solve Sudoku puzzles and achieve state-of-the-art results by solving 96.6% of the hardest Sudoku puzzles, where relational networks fail to solve any. We also apply our model to the BaBi textual QA dataset solving 19/20 tasks which is competitive with state-of-the-art sparse differentiable neural computers. The recurrent relational network is a general purpose module that can augment any neural network model with the capacity to do many-step relational reasoning. |
Tasks | Relational Reasoning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SkJKHMW0Z |
https://openreview.net/pdf?id=SkJKHMW0Z | |
PWC | https://paperswithcode.com/paper/recurrent-relational-networks-for-complex |
Repo | |
Framework | |
Detecting Code-Switching between Turkish-English Language Pair
Title | Detecting Code-Switching between Turkish-English Language Pair |
Authors | Zeynep Yirmibeşoğlu, Gülşen Eryiğit |
Abstract | Code-switching (usage of different languages within a single conversation context in an alternative manner) is a highly increasing phenomenon in social media and colloquial usage which poses different challenges for natural language processing. This paper introduces the first study for the detection of Turkish-English code-switching and also a small test data collected from social media in order to smooth the way for further studies. The proposed system using character level n-grams and conditional random fields (CRFs) obtains 95.6% micro-averaged F1-score on the introduced test data set. |
Tasks | Information Retrieval |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6115/ |
https://www.aclweb.org/anthology/W18-6115 | |
PWC | https://paperswithcode.com/paper/detecting-code-switching-between-turkish |
Repo | |
Framework | |
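
The character-level n-gram features mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the authors' feature set: the name `char_ngrams`, the boundary markers `<` and `>`, and the default n-gram range are assumptions, and the CRF that would consume these features is omitted.

```python
def char_ngrams(token, n_min=1, n_max=3):
    """Return all character n-grams of a token, shortest first.

    Hypothetical helper: the token is padded with '<' and '>' so that
    prefixes and suffixes become distinct features.
    """
    padded = f"<{token}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


bigrams = char_ngrams("ok", n_min=2, n_max=2)
```

Features like these capture sub-word spelling patterns, which is one reason they are a common choice for token-level language identification.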
A Repository of Corpora for Summarization
Title | A Repository of Corpora for Summarization |
Authors | Franck Dernoncourt, Mohammad Ghassemi, Walter Chang |
Abstract | |
Tasks | Abstractive Text Summarization, Document Summarization, Multi-Document Summarization |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1509/ |
https://www.aclweb.org/anthology/L18-1509 | |
PWC | https://paperswithcode.com/paper/a-repository-of-corpora-for-summarization |
Repo | |
Framework | |
Analogies in Complex Verb Meaning Shifts: the Effect of Affect in Semantic Similarity Models
Title | Analogies in Complex Verb Meaning Shifts: the Effect of Affect in Semantic Similarity Models |
Authors | Maximilian Köper, Sabine Schulte im Walde |
Abstract | We present a computational model to detect and distinguish analogies in meaning shifts between German base and complex verbs. In contrast to corpus-based studies, a novel dataset demonstrates that "regular" shifts represent the smallest class. Classification experiments relying on a standard similarity model successfully distinguish between four types of shifts, with verb classes boosting the performance, and affective features for abstractness, emotion and sentiment representing the most salient indicators. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2024/ |
https://www.aclweb.org/anthology/N18-2024 | |
PWC | https://paperswithcode.com/paper/analogies-in-complex-verb-meaning-shifts-the |
Repo | |
Framework | |
Analyzing Clothing Layer Deformation Statistics of 3D Human Motions
Title | Analyzing Clothing Layer Deformation Statistics of 3D Human Motions |
Authors | Jinlong Yang, Jean-Sebastien Franco, Franck Hetroy-Wheeler, Stefanie Wuhrer |
Abstract | Recent capture technologies and methods allow not only to retrieve 3D model sequence of moving people in clothing, but also to separate and extract the underlying body geometry, motion component and the clothing as a geometric layer. So far this clothing layer has only been used as raw offsets for individual applications such as retargeting a different body capture sequence with the clothing layer of another sequence, with limited scope, e.g. using identical or similar motions. The structured, semantics and motion-correlated nature of the information contained in this layer has yet to be fully understood and exploited. To this purpose we propose a comprehensive analysis of the statistics of this layer with a simple two-component model, based on PCA subspace reduction of the layer information on one hand, and a generic parameter regression model using neural networks on the other hand, designed to regress from any semantic parameter whose variation is observed in a training set, to the layer parameterization space. We show that this model not only allows to reproduce previous retargeting works, but generalizes the data generation capabilities to other semantic parameters such as clothing variation and size, or physical material parameters with synthetically generated training sequence, paving the way for many kinds of capture data-driven creation and augmentation applications. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Jinlong_YANG_Analyzing_Clothing_Layer_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Jinlong_YANG_Analyzing_Clothing_Layer_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-clothing-layer-deformation |
Repo | |
Framework | |
Using Author Embeddings to Improve Tweet Stance Classification
Title | Using Author Embeddings to Improve Tweet Stance Classification |
Authors | Adrian Benton, Mark Dredze |
Abstract | Many social media classification tasks analyze the content of a message, but do not consider the context of the message. For example, in tweet stance classification – where a tweet is categorized according to a viewpoint it espouses – the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pretraining method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. In a tweet gun control stance classification dataset, improvements from pre-training are only apparent when training data is limited. |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6124/ |
https://www.aclweb.org/anthology/W18-6124 | |
PWC | https://paperswithcode.com/paper/using-author-embeddings-to-improve-tweet |
Repo | |
Framework | |
AffecThor at SemEval-2018 Task 1: A cross-linguistic approach to sentiment intensity quantification in tweets
Title | AffecThor at SemEval-2018 Task 1: A cross-linguistic approach to sentiment intensity quantification in tweets |
Authors | Mostafa Abdou, Artur Kulmizev, Joan Ginés i Ametllé |
Abstract | In this paper we describe our submission to SemEval-2018 Task 1: Affects in Tweets. The model which we present is an ensemble of various neural architectures and gradient boosted trees, and employs three different types of vectorial tweet representations. Furthermore, our system is language-independent and ranked first in 5 out of the 12 subtasks in which we participated, while achieving competitive results in the remaining ones. Comparatively remarkable performance is observed on both the Arabic and Spanish languages. |
Tasks | Emotion Classification, Sentiment Analysis, Transfer Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1032/ |
https://www.aclweb.org/anthology/S18-1032 | |
PWC | https://paperswithcode.com/paper/affecthor-at-semeval-2018-task-1-a-cross |
Repo | |
Framework | |
Unsupervised Korean Word Sense Disambiguation using CoreNet
Title | Unsupervised Korean Word Sense Disambiguation using CoreNet |
Authors | Kijong Han, Sangha Nam, Jiseong Kim, Younggyun Hahm, Key-Sun Choi |
Abstract | |
Tasks | Dependency Parsing, Machine Translation, Word Sense Disambiguation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1165/ |
https://www.aclweb.org/anthology/L18-1165 | |
PWC | https://paperswithcode.com/paper/unsupervised-korean-word-sense-disambiguation |
Repo | |
Framework | |
A Dependency Perspective on RST Discourse Parsing and Evaluation
Title | A Dependency Perspective on RST Discourse Parsing and Evaluation |
Authors | Mathieu Morey, Philippe Muller, Nicholas Asher |
Abstract | Computational text-level discourse analysis mostly happens within Rhetorical Structure Theory (RST), whose structures have classically been presented as constituency trees, and relies on data from the RST Discourse Treebank (RST-DT); as a result, the RST discourse parsing community has largely borrowed from the syntactic constituency parsing community. The standard evaluation procedure for RST discourse parsers is thus a simplified variant of PARSEVAL, and most RST discourse parsers use techniques that originated in syntactic constituency parsing. In this article, we isolate a number of conceptual and computational problems with the constituency hypothesis. We then examine the consequences, for the implementation and evaluation of RST discourse parsers, of adopting a dependency perspective on RST structures, a view advocated so far only by a few approaches to discourse parsing. While doing that, we show the importance of the notion of headedness of RST structures. We analyze RST discourse parsing as dependency parsing by adapting to RST a recent proposal in syntactic parsing that relies on head-ordered dependency trees, a representation isomorphic to headed constituency trees. We show how to convert the original trees from the RST corpus, RST-DT, and their binarized versions used by all existing RST parsers to head-ordered dependency trees. We also propose a way to convert existing simple dependency parser output to constituent trees. This allows us to evaluate and to compare approaches from both constituent-based and dependency-based perspectives in a unified framework, using constituency and dependency metrics. We thus propose an evaluation framework to compare extant approaches easily and uniformly, something the RST parsing community has lacked up to now. We can also compare parsers' predictions to each other across frameworks. This allows us to characterize families of parsing strategies across the different frameworks, in particular with respect to the notion of headedness. Our experiments provide evidence for the conceptual similarities between dependency parsers and shift-reduce constituency parsers, and confirm that dependency parsing constitutes a viable approach to RST discourse parsing. |
Tasks | Constituency Parsing, Dependency Parsing |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/J18-2001/ |
https://www.aclweb.org/anthology/J18-2001 | |
PWC | https://paperswithcode.com/paper/a-dependency-perspective-on-rst-discourse |
Repo | |
Framework | |