October 20, 2019

Paper Group AWR 193

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution. EnsNet: Ensconce Text in the Wild. A Graph-to-Sequence Model for AMR-to-Text Generation. Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. RFCDE: Random Forests for Conditional De …

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution

Title The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution
Authors Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung
Abstract We introduce a new benchmark for coreference resolution and NLI, Knowref, that targets common-sense understanding and world knowledge. Previous coreference resolution tasks can largely be solved by exploiting the number and gender of the antecedents, or have been handcrafted and do not reflect the diversity of naturally occurring text. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. These instances are both challenging and realistic. We show that various coreference systems, whether rule-based, feature-rich, or neural, perform significantly worse on the task than humans, who display high inter-annotator agreement. To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision. We then use problem-specific insights to propose a data-augmentation trick called antecedent switching to alleviate this tendency in models. Finally, we show that antecedent switching yields promising results on other tasks as well: we use it to achieve state-of-the-art results on the GAP coreference task.
Tasks Common Sense Reasoning, Coreference Resolution, Data Augmentation
Published 2018-11-02
URL https://arxiv.org/abs/1811.01747v3
PDF https://arxiv.org/pdf/1811.01747v3.pdf
PWC https://paperswithcode.com/paper/the-hard-core-coreference-corpus-removing
Repo https://github.com/aemami1/KnowRef
Framework none
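
The antecedent switching mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each example is a passage with two single-name candidate antecedents and a label giving the index of the correct one. The function names and the example format are hypothetical and are not taken from the authors' repository.

```python
import re

def switch_antecedents(text, cand_a, cand_b):
    """Swap every whole-word occurrence of the two candidate names."""
    pattern = re.compile(
        r"\b(" + re.escape(cand_a) + "|" + re.escape(cand_b) + r")\b"
    )
    swap = {cand_a: cand_b, cand_b: cand_a}
    return pattern.sub(lambda m: swap[m.group(1)], text)

def augment(example):
    """Return the switched copy of an example.

    example = (text, cand_a, cand_b, label), where label is 0 if the
    pronoun refers to cand_a and 1 if it refers to cand_b (assumed format).
    The candidate order is kept, so the correct index flips.
    """
    text, cand_a, cand_b, label = example
    return (switch_antecedents(text, cand_a, cand_b), cand_a, cand_b, 1 - label)

original = ("Kevin yelled at Jim because he was angry.", "Kevin", "Jim", 0)
switched = augment(original)
```

Because the surrounding context is unchanged while the names trade places, a model that keys on the names rather than the context would be expected to get one of the two copies wrong.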

EnsNet: Ensconce Text in the Wild

Title EnsNet: Ensconce Text in the Wild
Authors Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai
Abstract A new method is proposed for removing text from natural images. The challenge is to first accurately localize text on the stroke-level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet is essential to achieve good performance. Moreover, our EnsNet can significantly outperform previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SMBNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrians) removal tasks. EnsNet is extremely fast, running at 333 fps on an i5-8600 CPU device.
Tasks Image Text Removal
Published 2018-12-03
URL http://arxiv.org/abs/1812.00723v1
PDF http://arxiv.org/pdf/1812.00723v1.pdf
PWC https://paperswithcode.com/paper/ensnet-ensconce-text-in-the-wild
Repo https://github.com/HCIILAB/Scene-Text-Removal
Framework mxnet
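
Of the four losses named in the abstract above, total variation is the easiest to show in isolation. The sketch below is the generic anisotropic total variation of an image (the sum of absolute differences between neighbouring pixels), written with NumPy. It is an assumption that this matches the spirit of the paper's term; EnsNet's exact formulation, weighting, and masking may differ.

```python
import numpy as np

def total_variation(image):
    """Anisotropic total variation of an H x W or H x W x C array."""
    image = np.asarray(image, dtype=float)
    # Absolute differences between vertically adjacent pixels.
    dh = np.abs(image[1:, ...] - image[:-1, ...]).sum()
    # Absolute differences between horizontally adjacent pixels.
    dw = np.abs(image[:, 1:, ...] - image[:, :-1, ...]).sum()
    return float(dh + dw)

flat = np.zeros((4, 4))
edge = np.array([[0.0, 1.0], [0.0, 1.0]])
```

A flat patch has zero total variation, while any edge adds to it, which is why penalising this quantity tends to smooth the filled-in region.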

A Graph-to-Sequence Model for AMR-to-Text Generation

Title A Graph-to-Sequence Model for AMR-to-Text Generation
Authors Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Abstract The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph. The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a linearized AMR structure. Although being able to model non-local semantic information, a sequence LSTM can lose information from the AMR graph structure, and thus faces challenges with large graphs, which result in long sequences. We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. On a standard benchmark, our model shows superior results to existing methods in the literature.
Tasks Graph-to-Sequence, Text Generation
Published 2018-05-07
URL http://arxiv.org/abs/1805.02473v3
PDF http://arxiv.org/pdf/1805.02473v3.pdf
PWC https://paperswithcode.com/paper/a-graph-to-sequence-model-for-amr-to-text
Repo https://github.com/freesunshine0316/neural-graph-to-seq-mp
Framework tf

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Title Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Authors Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi
Abstract We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
Tasks Coreference Resolution, graph construction, Joint Entity and Relation Extraction, Named Entity Recognition
Published 2018-08-29
URL http://arxiv.org/abs/1808.09602v1
PDF http://arxiv.org/pdf/1808.09602v1.pdf
PWC https://paperswithcode.com/paper/multi-task-identification-of-entities
Repo https://github.com/danilo-dessi/skg
Framework none

RFCDE: Random Forests for Conditional Density Estimation

Title RFCDE: Random Forests for Conditional Density Estimation
Authors Taylor Pospisil, Ann B. Lee
Abstract Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or classification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Python versions, which call a common C++ library, are available.
Tasks Density Estimation
Published 2018-04-16
URL http://arxiv.org/abs/1804.05753v2
PDF http://arxiv.org/pdf/1804.05753v2.pdf
PWC https://paperswithcode.com/paper/rfcde-random-forests-for-conditional-density
Repo https://github.com/tpospisi/rfcde
Framework none

Low-rank geometric mean metric learning

Title Low-rank geometric mean metric learning
Authors Mukul Bhutani, Pratik Jawanpuria, Hiroyuki Kasai, Bamdev Mishra
Abstract We propose a low-rank approach to learning a Mahalanobis metric from data. Inspired by the recent geometric mean metric learning (GMML) algorithm, we propose a low-rank variant of the algorithm. This allows us to jointly learn a low-dimensional subspace where the data reside and the Mahalanobis metric that appropriately fits the data. Our results show that we compete effectively with GMML at lower ranks.
Tasks Metric Learning
Published 2018-06-14
URL http://arxiv.org/abs/1806.05454v1
PDF http://arxiv.org/pdf/1806.05454v1.pdf
PWC https://paperswithcode.com/paper/low-rank-geometric-mean-metric-learning
Repo https://github.com/muk343/LR-GMML
Framework none
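
To make the "subspace plus metric" idea in the abstract above concrete, here is a minimal sketch of evaluating a low-rank Mahalanobis distance. It assumes the metric is factored as A = U B Uᵀ, with U a d × r matrix with orthonormal columns (the subspace) and B an r × r symmetric positive definite matrix (the metric within it). The factorization and all names are illustrative assumptions; the sketch does not learn U or B and is not the authors' algorithm.

```python
import numpy as np

def lowrank_mahalanobis_sq(x, y, U, B):
    """Squared distance (x - y)^T U B U^T (x - y), computed in r dimensions."""
    z = U.T @ (x - y)          # project the difference onto the subspace
    return float(z @ B @ z)    # apply the small r x r metric

rng = np.random.default_rng(0)
d, r = 6, 2
U, _ = np.linalg.qr(rng.normal(size=(d, r)))   # orthonormal basis, d x r
M = rng.normal(size=(r, r))
B = M @ M.T + np.eye(r)                        # symmetric positive definite
x, y = rng.normal(size=d), rng.normal(size=d)
```

Working in the projected space costs O(dr + r²) per pair instead of O(d²) for the full d × d matrix.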

Scalable Population Synthesis with Deep Generative Modeling

Title Scalable Population Synthesis with Deep Generative Modeling
Authors Stanislav S. Borysov, Jeppe Rich, Francisco C. Pereira
Abstract Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to ‘grow’ pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to the previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics.
Tasks
Published 2018-08-21
URL http://arxiv.org/abs/1808.06910v2
PDF http://arxiv.org/pdf/1808.06910v2.pdf
PWC https://paperswithcode.com/paper/scalable-population-synthesis-with-deep
Repo https://github.com/fredshone/pandamonia
Framework none
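
Iterative Proportional Fitting, one of the baselines named in the abstract above, can be sketched in a few lines for the two-attribute case. This is the textbook procedure, not code from the paper: the seed table, the target margins, and the function name are illustrative assumptions.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, iters=200):
    """Scale a 2D seed table until its margins match the targets."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(iters):
        # Rescale each row so the row sums match.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Rescale each column so the column sums match.
        table *= (col_targets / table.sum(axis=0))[None, :]
    return table

# The (0, 0) cell was never observed in the sample (a "sampling zero").
seed = np.array([[0.0, 2.0], [3.0, 4.0]])
fitted = ipf(seed, row_targets=[40, 60], col_targets=[30, 70])
```

Because IPF only rescales cells, a combination that is absent from the seed stays at zero in the fitted table. This is the sampling-zeros limitation that the abstract says the VAE is meant to address.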

End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features

Title End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Authors Chiori Hori, Huda Alamri, Jue Wang, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh
Abstract Dialog systems need to understand dynamic visual scenes in order to have conversations with users about the objects and events around them. Scene-aware dialog systems for real-world applications could be developed by integrating state-of-the-art technologies from multiple research areas, including: end-to-end dialog technologies, which generate system responses using models trained from dialog data; visual question answering (VQA) technologies, which answer questions about images using learned image features; and video description technologies, in which descriptions/captions are generated from videos using multimodal information. We introduce a new dataset of dialogs about videos of human behaviors. Each dialog is a typed conversation that consists of a sequence of 10 question-and-answer (QA) pairs between two Amazon Mechanical Turk (AMT) workers. In total, we collected dialogs on roughly 9,000 videos. Using this new dataset for Audio Visual Scene-aware dialog (AVSD), we trained an end-to-end conversation model that generates responses in a dialog about a video. Our experiments demonstrate that using multimodal features that were developed for multimodal attention-based video description enhances the quality of generated dialog about dynamic scenes (videos). Our dataset, model code and pretrained models will be publicly available for a new Video Scene-Aware Dialog challenge.
Tasks Question Answering, Video Description, Visual Question Answering
Published 2018-06-21
URL http://arxiv.org/abs/1806.08409v2
PDF http://arxiv.org/pdf/1806.08409v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-audio-visual-scene-aware-dialog
Repo https://github.com/hudaAlamri/DSTC7-Audio-Visual-Scene-Aware-Dialog-AVSD-Challenge
Framework pytorch

Multi-level 3D CNN for Learning Multi-scale Spatial Features

Title Multi-level 3D CNN for Learning Multi-scale Spatial Features
Authors Sambit Ghadai, Xian Lee, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy
Abstract 3D object recognition accuracy can be improved by learning the multi-scale spatial features from 3D spatial geometric representations of objects such as point clouds, 3D models, surfaces, and RGB-D data. Current deep learning approaches learn such features either using structured data representations (voxel grids and octrees) or from unstructured representations (graphs and point clouds). Learning features from such structured representations is limited by the restriction on resolution and tree depth, while unstructured representations create a challenge due to non-uniformity among data samples. In this paper, we propose an end-to-end multi-level learning approach on a multi-level voxel grid to overcome these drawbacks. To demonstrate the utility of the proposed multi-level learning, we use a multi-level voxel representation of 3D objects to perform object recognition. The multi-level voxel representation consists of a coarse voxel grid that contains volumetric information of the 3D object. In addition, each voxel in the coarse grid that contains a portion of the object boundary is subdivided into multiple fine-level voxel grids. The performance of our multi-level learning algorithm for object recognition is comparable to dense voxel representations while using significantly lower memory.
Tasks 3D Object Recognition, Object Recognition
Published 2018-05-30
URL https://arxiv.org/abs/1805.12254v2
PDF https://arxiv.org/pdf/1805.12254v2.pdf
PWC https://paperswithcode.com/paper/multi-resolution-3d-convolutional-neural
Repo https://github.com/idealab-isu/GPView
Framework none
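
The two-level representation described in the abstract above can be sketched as follows. The sketch starts from a dense boolean occupancy grid and keeps a fine grid only for coarse voxels that are partially filled. Treating "partially filled" as "contains the object boundary" is a simplifying assumption, and the function name and data layout are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def multilevel_voxels(dense, fine):
    """Split a dense n^3 occupancy grid into a coarse grid plus fine
    sub-grids for partially filled (boundary) coarse voxels.

    dense: boolean array of shape (n, n, n), with n divisible by fine.
    fine:  edge length of each fine sub-grid.
    """
    n = dense.shape[0]
    c = n // fine
    # blocks[i, j, k] is the fine x fine x fine sub-grid of coarse voxel (i, j, k).
    blocks = dense.reshape(c, fine, c, fine, c, fine).transpose(0, 2, 4, 1, 3, 5)
    flat = blocks.reshape(c, c, c, -1)
    occupied = flat.any(axis=-1)               # coarse voxel touches the object
    boundary = occupied & ~flat.all(axis=-1)   # occupied but not completely full
    fine_grids = {
        tuple(int(i) for i in idx): blocks[tuple(idx)].copy()
        for idx in np.argwhere(boundary)
    }
    return occupied, fine_grids

dense = np.zeros((4, 4, 4), dtype=bool)
dense[0:2, 0:2, 0:2] = True   # one completely filled coarse voxel
dense[2, 2, 2] = True         # one partially filled coarse voxel
coarse, fine_grids = multilevel_voxels(dense, fine=2)
```

Only the partially filled voxel carries a fine grid, so storage grows with the surface of the object rather than with its volume.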

Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training

Title Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Authors Peng Xu, Andrea Madotto, Chien-Sheng Wu, Ji Ho Park, Pascale Fung
Abstract In this paper, we propose Emo2Vec which encodes emotional semantics into vectors. We train Emo2Vec by multi-task learning six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition. Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings with much smaller training corpora. When concatenated with GloVe, Emo2Vec achieves competitive performances to state-of-the-art results on several tasks using a simple logistic regression classifier.
Tasks Multi-Task Learning, Sentiment Analysis
Published 2018-09-12
URL http://arxiv.org/abs/1809.04505v1
PDF http://arxiv.org/pdf/1809.04505v1.pdf
PWC https://paperswithcode.com/paper/emo2vec-learning-generalized-emotion
Repo https://github.com/pxuab/emo2vec_wassa_paper
Framework pytorch

Keep it stupid simple

Title Keep it stupid simple
Authors Erik J Peterson, Necati Alp Müyesser, Timothy Verstynen, Kyle Dunovan
Abstract Deep reinforcement learning can match and exceed human performance, but if even minor changes are introduced to the environment, artificial networks often can’t adapt. Humans meanwhile are quite adaptable. We hypothesize that this is partly because of how humans use heuristics, and partly because humans can imagine new and more challenging environments to learn from. We’ve developed a model of hierarchical reinforcement learning that combines both these elements into a stumbler-strategist network. We test transfer performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that combining imagined play with a heuristic (labeling each position as “good” or “bad”) both accelerates learning and promotes transfer to novel games, while also improving model interpretability.
Tasks Hierarchical Reinforcement Learning
Published 2018-09-10
URL http://arxiv.org/abs/1809.03406v1
PDF http://arxiv.org/pdf/1809.03406v1.pdf
PWC https://paperswithcode.com/paper/keep-it-stupid-simple
Repo https://github.com/CoAxLab/azad
Framework pytorch
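
The "known optimal strategy" of Wythoff's game mentioned in the abstract above can be recovered by brute force on a small board. The sketch below marks a position as cold when the player to move loses under optimal play, using the standard rules (remove any positive number from one pile, or the same number from both; taking the last object wins). How these exact labels relate to the paper's “good”/“bad” heuristic is an assumption; the code is not from the authors' repository.

```python
def cold_positions(n):
    """Return an n x n table where cold[a][b] is True if the player to
    move from piles (a, b) loses under optimal play."""
    cold = [[False] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            moves = [(a - k, b) for k in range(1, a + 1)]                  # first pile
            moves += [(a, b - k) for k in range(1, b + 1)]                 # second pile
            moves += [(a - k, b - k) for k in range(1, min(a, b) + 1)]     # both piles
            # A position is cold exactly when no move reaches a cold position.
            cold[a][b] = not any(cold[x][y] for x, y in moves)
    return cold

table = cold_positions(11)
cold_pairs = [(a, b) for a in range(11) for b in range(a, 11) if table[a][b]]
```

Every other position is hot: the player to move can win by moving to one of the cold positions.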

Improving Super-Resolution Methods via Incremental Residual Learning

Title Improving Super-Resolution Methods via Incremental Residual Learning
Authors Muneeb Aadil, Rafia Rahim, Sibt ul Hussain
Abstract Recently, Convolutional Neural Networks (CNNs) have shown promising performance in super-resolution (SR). However, these methods operate primarily on Low Resolution (LR) inputs for memory efficiency, but this limits, as we demonstrate, their ability to (i) model high frequency information; and (ii) smoothly translate from LR to High Resolution (HR) space. To this end, we propose a novel Incremental Residual Learning (IRL) framework to address these issues. In IRL, we first select a typical pre-trained SR network as a master branch. Next, we sequentially train and add residual branches to the main branch, where each residual branch learns to model the accumulated residuals of all previous branches. We plug state-of-the-art methods into the IRL framework and demonstrate consistent performance improvement on public benchmark datasets, setting a new state of the art for SR at only an approximately 20% increase in training time.
Tasks Super-Resolution
Published 2018-08-21
URL https://arxiv.org/abs/1808.07110v2
PDF https://arxiv.org/pdf/1808.07110v2.pdf
PWC https://paperswithcode.com/paper/improving-super-resolution-methods-via
Repo https://github.com/muneebaadil/sisr-irl
Framework pytorch
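
The sequential scheme in the abstract above, where each new branch is trained on whatever the earlier branches have left unexplained, can be shown on a toy regression problem. This is only an analogy: least-squares fits on hand-picked features stand in for the master network and the residual branches, and none of the names come from the authors' code.

```python
import numpy as np

def incremental_residual_fit(feature_sets, target):
    """Fit one linear branch per feature set, each on the accumulated
    residual left by all previous branches."""
    prediction = np.zeros(len(target))
    stage_errors = []
    for feats in feature_sets:
        residual = target - prediction                       # what is still unexplained
        w, *_ = np.linalg.lstsq(feats, residual, rcond=None)
        prediction = prediction + feats @ w                  # add the new branch's output
        stage_errors.append(float(np.mean((target - prediction) ** 2)))
    return prediction, stage_errors

x = np.linspace(-1.0, 1.0, 50)
target = 1.0 + 2.0 * x + 3.0 * x ** 2
master = np.stack([np.ones_like(x), x], axis=1)   # stand-in for the master branch
branch = (x ** 2)[:, None]                        # stand-in for one residual branch
prediction, stage_errors = incremental_residual_fit([master, branch], target)
```

Each branch can always output zero, so in this toy setting the training error cannot increase as branches are added.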

Few-shot Learning for Named Entity Recognition in Medical Text

Title Few-shot Learning for Named Entity Recognition in Medical Text
Authors Maximilian Hofer, Andrey Kormilitzin, Paul Goldberg, Alejo Nevado-Holgado
Abstract Deep neural network models have recently achieved state-of-the-art performance gains in a variety of natural language processing (NLP) tasks (Young, Hazarika, Poria, & Cambria, 2017). However, these gains rely on the availability of large amounts of annotated examples, without which state-of-the-art performance is rarely achievable. This is especially inconvenient for the many NLP fields where annotated examples are scarce, such as medical text. To improve NLP models in this situation, we evaluate five improvements on named entity recognition (NER) tasks when only ten annotated examples are available: (1) layer-wise initialization with pre-trained weights, (2) hyperparameter tuning, (3) combining pre-training data, (4) custom word embeddings, and (5) optimizing out-of-vocabulary (OOV) words. Experimental results show that the F1 score of 69.3% achievable by state-of-the-art models can be improved to 78.87%.
Tasks Few-Shot Learning, Medical Named Entity Recognition, Named Entity Recognition, Word Embeddings
Published 2018-11-13
URL http://arxiv.org/abs/1811.05468v1
PDF http://arxiv.org/pdf/1811.05468v1.pdf
PWC https://paperswithcode.com/paper/few-shot-learning-for-named-entity
Repo https://github.com/SilverQ/NER_Final
Framework tf

Subsampled Rényi Differential Privacy and Analytical Moments Accountant

Title Subsampled Rényi Differential Privacy and Analytical Moments Accountant
Authors Yu-Xiang Wang, Borja Balle, Shiva Kasiviswanathan
Abstract We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi Differential Privacy (RDP) (Mironov, 2017) parameters for algorithms that: (1) subsample the dataset, and then (2) apply a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. Our results generalize the moments accounting technique, developed by Abadi et al. (2016) for the Gaussian mechanism, to any subsampled RDP mechanism.
Tasks
Published 2018-07-31
URL http://arxiv.org/abs/1808.00087v2
PDF http://arxiv.org/pdf/1808.00087v2.pdf
PWC https://paperswithcode.com/paper/subsampled-renyi-differential-privacy-and
Repo https://github.com/yuxiangw/autodp
Framework none
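
For readers unfamiliar with RDP accounting, the sketch below shows the non-subsampled building blocks that the abstract above refers to: the RDP curve of the Gaussian mechanism with sensitivity 1, additive composition over repeated steps, and the standard conversion from RDP to (ε, δ)-DP (Mironov, 2017). It deliberately does not implement the paper's subsampling bound, and the function names are illustrative rather than the API of the linked repository.

```python
import math

def gaussian_rdp(alpha, sigma):
    """RDP epsilon at order alpha for the Gaussian mechanism
    with L2 sensitivity 1 and noise standard deviation sigma."""
    return alpha / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Convert an (alpha, rdp_eps)-RDP guarantee to (eps, delta)-DP."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)

def best_epsilon(sigma, steps, delta, alphas=range(2, 65)):
    """Compose `steps` Gaussian releases (RDP adds up across steps),
    then take the tightest conversion over a grid of orders."""
    return min(
        rdp_to_dp(steps * gaussian_rdp(a, sigma), a, delta) for a in alphas
    )

eps_10 = best_epsilon(sigma=2.0, steps=10, delta=1e-5)
eps_100 = best_epsilon(sigma=2.0, steps=100, delta=1e-5)
```

The grid of orders is an arbitrary choice for illustration; a finer or wider grid can only tighten the reported ε.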

Density-aware Single Image De-raining using a Multi-stream Dense Network

Title Density-aware Single Image De-raining using a Multi-stream Dense Network
Authors He Zhang, Vishal M. Patel
Abstract Single image rain streak removal is an extremely challenging problem due to the presence of non-uniform rain densities in images. We present a novel density-aware multi-stream densely connected convolutional neural network-based algorithm, called DID-MDN, for joint rain density estimation and de-raining. The proposed method enables the network itself to automatically determine the rain-density information and then efficiently remove the corresponding rain-streaks guided by the estimated rain-density label. To better characterize rain-streaks with different scales and shapes, a multi-stream densely connected de-raining network is proposed which efficiently leverages features from different scales. Furthermore, a new dataset containing images with rain-density labels is created and used to train the proposed density-aware network. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. In addition, an ablation study is performed to demonstrate the improvements obtained by different modules in the proposed method. Code can be found at: https://github.com/hezhangsprinter
Tasks Density Estimation
Published 2018-02-21
URL http://arxiv.org/abs/1802.07412v1
PDF http://arxiv.org/pdf/1802.07412v1.pdf
PWC https://paperswithcode.com/paper/density-aware-single-image-de-raining-using-a
Repo https://github.com/lsy17096535/Single-Image-Deraining
Framework none