October 16, 2019

2529 words 12 mins read

Paper Group NAWR 1


Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text. Varying image description tasks: spoken versus written descriptions. Creating a Translation Matrix of the Bible’s Names Across 591 Languages. Encoding Sentiment Information into Word Vectors for Sentiment Analysis. Named Entity Recognition With Parallel Recurrent N …

Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text

Title Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text
Authors Junjie Xing, Kenny Zhu, Shaodian Zhang
Abstract Chinese word segmentation (CWS) models trained on open-source corpora suffer a dramatic performance drop when applied to domain text, especially in domains with many specialized terms and diverse writing styles, such as the biomedical domain. However, building a domain-specific CWS system requires extremely high annotation cost. In this paper, we propose an approach that transfers domain-invariant knowledge from high-resource to low-resource domains. Extensive experiments show that our model achieves consistently higher accuracy than single-task CWS and other transfer learning baselines, especially when there is a large disparity between source and target domains. (A toy multi-task sketch follows this entry.)
Tasks Chinese Word Segmentation, Transfer Learning
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1307/
PDF https://www.aclweb.org/anthology/C18-1307
PWC https://paperswithcode.com/paper/adaptive-multi-task-transfer-learning-for
Repo https://github.com/adapt-sjtu/AMTTL
Framework tf
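
The abstract above describes the transfer setup only at a high level, so here is a minimal multi-task sketch of the general idea: a shared character encoder with one BMES tagging head per domain, trained on alternating batches from the high-resource and the medical domain so that the encoder picks up domain-invariant segmentation knowledge. This is an assumed simplification, not the paper's AMTTL architecture (see the linked repo for that); the class names and toy data are illustrative.

```python
# Minimal multi-task CWS sketch (assumed simplification, not the paper's AMTTL
# model): a shared character BiLSTM with one BMES tagging head per domain.
import torch
import torch.nn as nn

class MultiTaskCWS(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, num_tags=4,
                 domains=("news", "medical")):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # one head per domain; the shared encoder carries domain-invariant knowledge
        self.heads = nn.ModuleDict({d: nn.Linear(2 * hidden, num_tags) for d in domains})

    def forward(self, char_ids, domain):
        h, _ = self.encoder(self.embed(char_ids))      # (batch, seq, 2 * hidden)
        return self.heads[domain](h)                   # BMES scores per character

model = MultiTaskCWS(vocab_size=5000)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# toy training step: alternate batches from the source and the target domain
for domain in ("news", "medical"):
    chars = torch.randint(1, 5000, (8, 30))            # fake character ids
    tags = torch.randint(0, 4, (8, 30))                # fake BMES labels
    loss = loss_fn(model(chars, domain).reshape(-1, 4), tags.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```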

Varying image description tasks: spoken versus written descriptions

Title Varying image description tasks: spoken versus written descriptions
Authors Emiel van Miltenburg, Ruud Koolen, Emiel Krahmer
Abstract Automatic image description systems are commonly trained and evaluated on written image descriptions. At the same time, these systems are often used to provide spoken descriptions (e.g. for visually impaired users) through apps like TapTapSee or Seeing AI. This is not a problem, as long as spoken and written descriptions are very similar. However, linguistic research suggests that spoken language often differs from written language. These differences are not regular, and vary from context to context. Therefore, this paper investigates whether there are differences between written and spoken image descriptions, even if they are elicited through similar tasks. We compare descriptions produced in two languages (English and Dutch), and in both languages observe substantial differences between spoken and written descriptions. Future research should see if users prefer the spoken over the written style and, if so, aim to emulate spoken descriptions.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-3910/
PDF https://www.aclweb.org/anthology/W18-3910
PWC https://paperswithcode.com/paper/varying-image-description-tasks-spoken-versus
Repo https://github.com/cltl/Spoken-versus-Written
Framework none

Creating a Translation Matrix of the Bible’s Names Across 591 Languages

Title Creating a Translation Matrix of the Bible’s Names Across 591 Languages
Authors Winston Wu, Nidhi Vyas, David Yarowsky
Abstract
Tasks Entity Alignment, Machine Translation, Morphological Analysis, Part-Of-Speech Tagging, Transliteration, Word Alignment
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1263/
PDF https://www.aclweb.org/anthology/L18-1263
PWC https://paperswithcode.com/paper/creating-a-translation-matrix-of-the-bibleas
Repo https://github.com/wswu/trabina
Framework none

Encoding Sentiment Information into Word Vectors for Sentiment Analysis

Title Encoding Sentiment Information into Word Vectors for Sentiment Analysis
Authors Zhe Ye, Fang Li, Timothy Baldwin
Abstract General-purpose pre-trained word embeddings have become a mainstay of natural language processing, and more recently, methods have been proposed to encode external knowledge into word embeddings to benefit specific downstream tasks. The goal of this paper is to encode sentiment knowledge into pre-trained word vectors to improve the performance of sentiment analysis. Our proposed method is based on a convolutional neural network (CNN) and an external sentiment lexicon. Experiments on four popular sentiment analysis datasets show that this method improves the accuracy of sentiment analysis compared to a number of benchmark methods. (A toy sketch of the lexicon-injection idea follows this entry.)
Tasks Learning Word Embeddings, Sentiment Analysis, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1085/
PDF https://www.aclweb.org/anthology/C18-1085
PWC https://paperswithcode.com/paper/encoding-sentiment-information-into-word
Repo https://github.com/yezhejack/SentiNet
Framework pytorch
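
As a rough illustration of injecting lexicon knowledge into pre-trained vectors, the sketch below fine-tunes embeddings so that a small classifier can recover each word's lexicon polarity, while a penalty keeps the vectors close to their general-purpose originals. This is an assumed simplification, not the paper's CNN-based method (the linked SentiNet repo holds the actual implementation); the lexicon and vectors are toy stand-ins.

```python
# Toy sketch (assumed, not the paper's CNN-based method): fine-tune pre-trained
# vectors so a linear classifier recovers each word's lexicon polarity, while a
# penalty keeps the vectors close to their general-purpose originals.
import torch
import torch.nn as nn

pretrained = torch.randn(1000, 50)            # stand-in for real pre-trained vectors
lexicon = {3: 1, 17: 0, 42: 1}                # toy lexicon: word id -> polarity

emb = nn.Embedding.from_pretrained(pretrained.clone(), freeze=False)
clf = nn.Linear(50, 2)                        # predicts polarity from a word vector
opt = torch.optim.Adam(list(emb.parameters()) + list(clf.parameters()), lr=1e-2)

ids = torch.tensor(list(lexicon.keys()))
labels = torch.tensor(list(lexicon.values()))

for _ in range(100):
    vecs = emb(ids)
    loss = nn.functional.cross_entropy(clf(vecs), labels)
    loss = loss + 0.1 * (vecs - pretrained[ids]).pow(2).mean()   # stay near originals
    opt.zero_grad(); loss.backward(); opt.step()

# emb.weight now holds sentiment-enriched vectors for downstream sentiment analysis
```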

Named Entity Recognition With Parallel Recurrent Neural Networks

Title Named Entity Recognition With Parallel Recurrent Neural Networks
Authors Andrej Žukov-Gregorič, Yoram Bachrach, Sam Coope
Abstract We present a new architecture for named entity recognition. Our model employs multiple independent bidirectional LSTM units across the same input and promotes diversity among them by employing an inter-model regularization term. By distributing computation across multiple smaller LSTMs we find a significant reduction in the total number of parameters. We find our architecture achieves state-of-the-art performance on the CoNLL 2003 NER dataset. (A toy sketch of the parallel-LSTM idea follows this entry.)
Tasks Feature Engineering, Named Entity Recognition, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2012/
PDF https://www.aclweb.org/anthology/P18-2012
PWC https://paperswithcode.com/paper/named-entity-recognition-with-parallel
Repo https://github.com/speedcell4/UOI-P18-2012
Framework none
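
A minimal sketch of the core idea in the entry above: several small independent BiLSTMs read the same embedded sentence, a pairwise penalty discourages their outputs from becoming redundant, and the concatenated states feed the tag classifier. The specific regularizer used here (mean pairwise cosine similarity) is an assumption for illustration, not necessarily the paper's exact term.

```python
# Sketch of the parallel-BiLSTM idea (the diversity penalty below is an assumed
# form, not necessarily the paper's regularizer): K small BiLSTMs read the same
# input, and their concatenated outputs feed the tag classifier.
import torch
import torch.nn as nn

class ParallelBiLSTMTagger(nn.Module):
    def __init__(self, emb_dim=50, hidden=25, k=4, num_tags=9):
        super().__init__()
        self.lstms = nn.ModuleList(
            [nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True) for _ in range(k)]
        )
        self.out = nn.Linear(k * 2 * hidden, num_tags)

    def forward(self, x):
        hs = [lstm(x)[0] for lstm in self.lstms]       # K tensors of shape (B, T, 2 * hidden)
        reg = 0.0                                      # penalize pairwise output similarity
        for i in range(len(hs)):
            for j in range(i + 1, len(hs)):
                reg = reg + torch.cosine_similarity(hs[i], hs[j], dim=-1).mean()
        return self.out(torch.cat(hs, dim=-1)), reg

model = ParallelBiLSTMTagger()
x = torch.randn(2, 12, 50)                             # fake embedded sentences
logits, reg = model(x)
tags = torch.randint(0, 9, (2 * 12,))
loss = nn.functional.cross_entropy(logits.reshape(-1, 9), tags) + 0.01 * reg
loss.backward()
```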

Frame- and Entity-Based Knowledge for Common-Sense Argumentative Reasoning

Title Frame- and Entity-Based Knowledge for Common-Sense Argumentative Reasoning
Authors Teresa Botschen, Daniil Sorokin, Iryna Gurevych
Abstract Common-sense argumentative reasoning is a challenging task that requires holistic understanding of the argumentation where external knowledge about the world is hypothesized to play a key role. We explore the idea of using event knowledge about prototypical situations from FrameNet and fact knowledge about concrete entities from Wikidata to solve the task. We find that both resources can contribute to an improvement over the non-enriched approach and point out two persisting challenges: first, integration of many annotations of the same type, and second, fusion of complementary annotations. After our explorations, we question the key role of external world knowledge with respect to the argumentative reasoning task and rather point towards a logic-based analysis of the chain of reasoning.
Tasks Argument Mining, Common Sense Reasoning, Dependency Parsing, Natural Language Inference, Question Answering, Relation Classification, Semantic Parsing, Semantic Role Labeling, Word Embeddings
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5211/
PDF https://www.aclweb.org/anthology/W18-5211
PWC https://paperswithcode.com/paper/frame-and-entity-based-knowledge-for-common
Repo https://github.com/UKPLab/emnlp2018-argmin-commonsense-knowledge
Framework none

CATS: A Tool for Customized Alignment of Text Simplification Corpora

Title CATS: A Tool for Customized Alignment of Text Simplification Corpora
Authors Sanja Štajner, Marc Franco-Salvador, Paolo Rosso, Simone Paolo Ponzetto
Abstract
Tasks Text Simplification
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1615/
PDF https://www.aclweb.org/anthology/L18-1615
PWC https://paperswithcode.com/paper/cats-a-tool-for-customized-alignment-of-text
Repo https://github.com/neosyon/SimpTextAlign
Framework tf

ASTER: An Attentional Scene Text Recognizer with Flexible Rectification

Title ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
Authors Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, and Xiang Bai
Abstract Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications. Despite the maturity of Optical Character Recognition (OCR) systems dedicated to document text, scene text recognition remains a challenging problem. The large variations in background, appearance, and layout pose significant challenges, which traditional OCR methods cannot handle effectively. Recent advances in scene text recognition are driven by the success of deep learning-based recognition models. Among them are methods that recognize text character by character using convolutional neural networks (CNN), methods that classify words with CNNs [24], [26], and methods that recognize character sequences using a combination of a CNN and a recurrent neural network (RNN) [54]. In spite of their success, these methods do not explicitly address the problem of irregular text, i.e. text that is not horizontal and frontal, has a curved layout, and so on. Instances of irregular text frequently appear in natural scenes. As exemplified in Figure 1, typical cases include oriented text, perspective text [49], and curved text. Designed without invariance to such irregularities, previous methods often struggle to recognize such text instances.
Tasks Optical Character Recognition, Scene Text Recognition
Published 2018-06-25
URL http://122.205.5.5:8071/UpLoadFiles/Papers/ASTER_PAMI18.pdf
PDF http://122.205.5.5:8071/UpLoadFiles/Papers/ASTER_PAMI18.pdf
PWC https://paperswithcode.com/paper/aster-an-attentional-scene-text-recognizer
Repo https://github.com/bgshih/aster
Framework tf

Ranking-Based Automatic Seed Selection and Noise Reduction for Weakly Supervised Relation Extraction

Title Ranking-Based Automatic Seed Selection and Noise Reduction for Weakly Supervised Relation Extraction
Authors Van-Thuy Phi, Joan Santoso, Masashi Shimbo, Yuji Matsumoto
Abstract This paper addresses the tasks of automatic seed selection for bootstrapping relation extraction, and noise reduction for distantly supervised relation extraction. We first point out that these tasks are related. Then, inspired by ranking relation instances and patterns computed by the HITS algorithm, and selecting cluster centroids using the K-means, LSA, or NMF method, we propose methods for selecting the initial seeds from an existing resource, or reducing the level of noise in the distantly labeled data. Experiments show that our proposed methods achieve better performance than the baseline systems on both tasks. (A toy HITS ranking sketch follows this entry.)
Tasks Relation Extraction, Word Sense Disambiguation
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2015/
PDF https://www.aclweb.org/anthology/P18-2015
PWC https://paperswithcode.com/paper/ranking-based-automatic-seed-selection-and
Repo https://github.com/pvthuy/part-whole-relations
Framework none
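
The abstract mentions ranking relation instances and patterns with the HITS algorithm; the sketch below runs HITS by power iteration on a toy bipartite co-occurrence graph between instances and patterns. The matrix is invented; in the real system it would be built from the distantly labeled corpus.

```python
# HITS by power iteration on a toy bipartite graph between relation instances
# (rows) and extraction patterns (columns); A[i, j] = 1 if they co-occur.
import numpy as np

A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]], dtype=float)

hubs = np.ones(A.shape[0])                # scores for instances
auth = np.ones(A.shape[1])                # scores for patterns

for _ in range(50):                       # power iteration until convergence
    auth = A.T @ hubs
    auth /= np.linalg.norm(auth)
    hubs = A @ auth
    hubs /= np.linalg.norm(hubs)

# the highest-scoring instances become the candidate seeds for bootstrapping
print("instance ranking:", np.argsort(-hubs))
print("pattern ranking:", np.argsort(-auth))
```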

Using Formulaic Expressions in Writing Assistance Systems

Title Using Formulaic Expressions in Writing Assistance Systems
Authors Kenichi Iwatsuki, Akiko Aizawa
Abstract Formulaic expressions (FEs) used in scholarly papers, such as 'there has been little discussion about', are helpful for non-native English speakers. However, it is time-consuming for users to manually search for an appropriate expression every time they want to consult FE dictionaries. For this reason, we tackle the task of semantic searches of FE dictionaries. At the start of our research, we identified two salient difficulties in this task. First, the paucity of example sentences in existing FE dictionaries results in a shortage of context information, which is necessary for acquiring semantic representation of FEs. Second, while a semantic category label is assigned to each FE in many FE dictionaries, it is difficult to predict the labels from user input, forcing users to manually designate the semantic category when searching. To address these difficulties, we propose a new framework for semantic searches of FEs and propose a new method to leverage both existing dictionaries and domain sentence corpora. Further, we expand an existing FE dictionary to consider building a more comprehensive and domain-specific FE dictionary and to verify the effectiveness of our method.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1227/
PDF https://www.aclweb.org/anthology/C18-1227
PWC https://paperswithcode.com/paper/using-formulaic-expressions-in-writing
Repo https://github.com/Alab-NII/FE
Framework none

Differentiable Monte Carlo Ray Tracing through Edge Sampling

Title Differentiable Monte Carlo Ray Tracing through Edge Sampling
Authors Tzu-Mao Li, Miika Aittala, Frédo Durand, Jaakko Lehtinen
Abstract Gradient-based methods are becoming increasingly important for computer graphics, machine learning, and computer vision. The ability to compute gradients is crucial to optimization, inverse problems, and deep learning. In rendering, the gradient is required with respect to variables such as camera parameters, light sources, scene geometry, or material appearance. However, computing the gradient of rendering is challenging because the rendering integral includes visibility terms that are not differentiable. Previous work on differentiable rendering has focused on approximate solutions. They often do not handle secondary effects such as shadows or global illumination, or they do not provide the gradient with respect to variables other than pixel coordinates. We introduce a general-purpose differentiable ray tracer, which, to our knowledge, is the first comprehensive solution that is able to compute derivatives of scalar functions over a rendered image with respect to arbitrary scene parameters such as camera pose, scene geometry, materials, and lighting parameters. The key to our method is a novel edge sampling algorithm that directly samples the Dirac delta functions introduced by the derivatives of the discontinuous integrand. We also develop efficient importance sampling methods based on spatial hierarchies. Our method can generate gradients in times running from seconds to minutes depending on scene complexity and desired precision. We interface our differentiable ray tracer with the deep learning library PyTorch and show prototype applications in inverse rendering and the generation of adversarial examples for neural networks.
Tasks
Published 2018-08-12
URL https://people.csail.mit.edu/tzumao/diffrt/
PDF https://people.csail.mit.edu/tzumao/diffrt/diffrt.pdf
PWC https://paperswithcode.com/paper/differentiable-monte-carlo-ray-tracing
Repo https://github.com/BachiLi/redner
Framework pytorch
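
The central difficulty this paper addresses is that the rendering integral contains discontinuous visibility terms, so differentiating only through interior samples misses the contribution of moving edges. The 1D toy below (not redner's actual API) makes that concrete: the "pixel" is I(theta) = ∫₀¹ f(x; theta) dx with f stepping from 1 to 0 at x = theta, so dI/dtheta = 1, yet the pointwise derivative of the integrand is zero almost everywhere and the missing mass has to come from sampling the edge itself.

```python
# 1D toy: the "pixel" is I(theta) = integral_0^1 f(x; theta) dx with
# f(x; theta) = 1 if x < theta else 0, so I(theta) = theta and dI/dtheta = 1.
import numpy as np

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, 200_000)
theta = 0.4

def pixel(t):
    return np.mean(xs < t)                # Monte Carlo estimate of I(t)

# interior (area) sampling: the integrand's derivative w.r.t. theta is zero
# almost everywhere, so differentiating through the samples reports 0.
interior_grad = 0.0

# edge sampling: the Dirac delta from the moving discontinuity contributes
# (f(theta-) - f(theta+)) * d theta / d theta = (1 - 0) * 1 = 1.
edge_grad = 1.0

# reference via finite differences on the Monte Carlo integral (close to 1)
reference = (pixel(theta + 0.01) - pixel(theta - 0.01)) / 0.02

print(interior_grad, edge_grad, round(reference, 2))
```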

Porter 5: fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes

Title Porter 5: fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes
Authors Mirko Torrisi, Manaz Kaleel, Gianluca Pollastri
Abstract Motivation: Although secondary structure predictors have been developed for decades, current ab initio methods still have some way to go to reach their theoretical limits. Moreover, the continuous effort towards harnessing ever-expanding data sets and more sophisticated, deeper machine learning techniques has not come to an end. Results: Here we present Porter 5, the latest release of one of the best performing ab initio secondary structure predictors. Version 5 achieves 84% accuracy (84% SOV) when tested on 3 classes, and 73% accuracy (77% SOV) on 8 classes, on a large independent set, significantly outperforming all the most recent ab initio predictors we have tested. Availability: The web and standalone versions of Porter 5 are available at http://distilldeep.ucd.ie/porter/.
Tasks Protein Secondary Structure Prediction
Published 2018-10-05
URL https://doi.org/10.1101/289033
PDF https://www.biorxiv.org/content/early/2018/10/05/289033.full.pdf
PWC https://paperswithcode.com/paper/porter-5-fast-state-of-the-art-ab-initio
Repo https://github.com/mircare/Porter5
Framework none

SUNNYNLP at SemEval-2018 Task 10: A Support-Vector-Machine-Based Method for Detecting Semantic Difference using Taxonomy and Word Embedding Features

Title SUNNYNLP at SemEval-2018 Task 10: A Support-Vector-Machine-Based Method for Detecting Semantic Difference using Taxonomy and Word Embedding Features
Authors Sunny Lai, Kwong Sak Leung, Yee Leung
Abstract We present SUNNYNLP, our system for solving SemEval-2018 Task 10: "Capturing Discriminative Attributes". Our Support-Vector-Machine (SVM)-based system combines features extracted from pre-trained embeddings with statistical information from an Is-A taxonomy to detect semantic differences between concept pairs. Our system is demonstrated to be effective in detecting semantic differences and is ranked 1st in the competition in terms of F1 measure. Our code is open-sourced as SUNNYNLP. (A toy feature-plus-SVM sketch follows this entry.)
Tasks Dialogue State Tracking, Question Answering, Semantic Textual Similarity
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1118/
PDF https://www.aclweb.org/anthology/S18-1118
PWC https://paperswithcode.com/paper/sunnynlp-at-semeval-2018-task-10-a-support
Repo https://github.com/Yermouth/sunnynlp
Framework none
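
A rough sketch of an SVM over embedding-similarity and taxonomy features for the discriminative-attribute task described above. The features, stand-in embeddings, Is-A facts, and training triples below are invented for illustration and are not SUNNYNLP's exact feature set.

```python
# Toy SVM for "does this attribute discriminate word1 from word2?"; the
# features, embeddings, and Is-A pairs are invented for illustration.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
words = ["banana", "apple", "yellow", "red", "wheel", "car"]
emb = {w: rng.normal(size=50) for w in words}         # stand-in embeddings
isa = {("banana", "yellow"), ("apple", "red")}        # toy Is-A / attribute facts

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def features(w1, w2, attr):
    return [
        cos(emb[w1], emb[attr]) - cos(emb[w2], emb[attr]),  # attr closer to w1 than w2?
        float((w1, attr) in isa),                           # taxonomy: attr holds for w1
        float((w2, attr) in isa),                           # taxonomy: attr holds for w2
    ]

X = [
    features("banana", "apple", "yellow"),   # yellow discriminates banana from apple
    features("apple", "banana", "red"),      # red discriminates apple from banana
    features("car", "apple", "red"),         # red does not discriminate car from apple
    features("wheel", "car", "yellow"),      # yellow does not discriminate wheel from car
]
y = [1, 1, 0, 0]

clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([features("banana", "car", "yellow")]))
```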

Extremely Randomized CNets for Multi-label Classification

Title Extremely Randomized CNets for Multi-label Classification
Authors Teresa M.A. Basile, Nicola Di Mauro, Floriana Esposito
Abstract Multi-label classification (MLC) is a challenging task in machine learning consisting in the prediction of multiple labels associated with a single instance. Promising approaches for MLC are those able to capture label dependencies by learning a single probabilistic model, unlike other competitive approaches that require learning many models. The model is then exploited to compute the most probable label configuration given the observed attributes. Cutset Networks (CNets) are density estimators leveraging context-specific independencies to provide exact inference in polynomial time. The recently introduced Extremely Randomized CNets (XCNets) reduce the structure-learning complexity, making it possible to learn ensembles of XCNets that outperform state-of-the-art density estimators. In this paper we employ XCNets for MLC by exploiting efficient Most Probable Explanation (MPE) inference. An experimental evaluation on real-world datasets shows that the proposed approach is competitive with other sophisticated methods for MLC.
Tasks Density Estimation, Multi-Label Classification
Published 2018-10-01
URL https://link.springer.com/chapter/10.1007/978-3-030-03840-3_25
PDF http://www.di.uniba.it/~ndm/pubs/basile18aixia.pdf
PWC https://paperswithcode.com/paper/extremely-randomized-cnets-for-multi-label
Repo https://github.com/nicoladimauro/mlxcnet
Framework none

Large Scale Image Segmentation with Structured Loss based Deep Learning for Connectome Reconstruction

Title Large Scale Image Segmentation with Structured Loss based Deep Learning for Connectome Reconstruction
Authors Jan Funke, Fabian David Tschopp, William Grisaitis, Arlo Sheridan, Chandan Singh, Stephan Saalfeld, Srinivas C. Turaga
Abstract We present a method combining affinity prediction with region agglomeration, which improves significantly upon the state of the art of neuron segmentation from electron microscopy (EM) in accuracy and scalability. Our method consists of a 3D U-net, trained to predict affinities between voxels, followed by iterative region agglomeration. We train using a structured loss based on MALIS, encouraging topologically correct segmentations obtained from affinity thresholding. Our extension consists of two parts: First, we present a quasi-linear method to compute the loss gradient, improving over the original quadratic algorithm. Second, we compute the gradient in two separate passes to avoid spurious gradient contributions in early training stages. Our predictions are accurate enough that simple learning-free percentile-based agglomeration outperforms more involved methods used earlier on inferior predictions. We present results on three diverse EM datasets, achieving relative improvements over previous results of 27%, 15%, and 250%. Our findings suggest that a single method can be applied to both nearly isotropic block-face EM data and anisotropic serial sectioned EM data. The runtime of our method scales linearly with the size of the volume and achieves a throughput of ~2.6 seconds per megavoxel, qualifying our method for the processing of very large datasets.
Tasks Brain Image Segmentation, Semantic Segmentation
Published 2018-05-24
URL https://ieeexplore.ieee.org/abstract/document/8364622/authors#authors
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8364622
PWC https://paperswithcode.com/paper/large-scale-image-segmentation-with
Repo https://github.com/funkey/mala
Framework tf
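
The pipeline in the abstract above ends with thresholding the predicted voxel affinities and agglomerating voxels into segments. The toy 2D sketch below shows only that post-processing step, using union-find on a random affinity graph; it does not implement the MALIS training loss or the quasi-linear gradient computation.

```python
# Toy 2D version of the post-processing step: threshold predicted affinities
# between neighbouring voxels and merge them with union-find. The MALIS loss
# and its quasi-linear gradient computation are not shown.
import numpy as np

rng = np.random.default_rng(0)
H, W = 4, 6
aff_x = rng.uniform(size=(H, W - 1))      # affinity between (i, j) and (i, j+1)
aff_y = rng.uniform(size=(H - 1, W))      # affinity between (i, j) and (i+1, j)

parent = list(range(H * W))
def find(a):
    while parent[a] != a:
        parent[a] = parent[parent[a]]     # path halving
        a = parent[a]
    return a
def union(a, b):
    parent[find(a)] = find(b)

threshold = 0.7                           # merge voxels joined by strong affinities
for i in range(H):
    for j in range(W - 1):
        if aff_x[i, j] > threshold:
            union(i * W + j, i * W + j + 1)
for i in range(H - 1):
    for j in range(W):
        if aff_y[i, j] > threshold:
            union(i * W + j, (i + 1) * W + j)

segments = np.array([find(v) for v in range(H * W)]).reshape(H, W)
print(segments)                           # connected components = candidate segments
```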