October 20, 2019

Paper Group AWR 193

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution. EnsNet: Ensconce Text in the Wild. A Graph-to-Sequence Model for AMR-to-Text Generation. Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. RFCDE: Random Forests for Conditional De …

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution


Title	The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution
Authors	Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung
Abstract	We introduce a new benchmark for coreference resolution and NLI, Knowref, that targets common-sense understanding and world knowledge. Previous coreference resolution tasks can largely be solved by exploiting the number and gender of the antecedents, or have been handcrafted and do not reflect the diversity of naturally occurring text. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. These instances are both challenging and realistic. We show that various coreference systems, whether rule-based, feature-rich, or neural, perform significantly worse on the task than humans, who display high inter-annotator agreement. To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision. We then use problem-specific insights to propose a data-augmentation trick called antecedent switching to alleviate this tendency in models. Finally, we show that antecedent switching yields promising results on other tasks as well: we use it to achieve state-of-the-art results on the GAP coreference task.
Tasks	Common Sense Reasoning, Coreference Resolution, Data Augmentation
Published	2018-11-02
URL	https://arxiv.org/abs/1811.01747v3
PDF	https://arxiv.org/pdf/1811.01747v3.pdf
PWC	https://paperswithcode.com/paper/the-hard-core-coreference-corpus-removing
Repo	https://github.com/aemami1/KnowRef
Framework	none
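
The antecedent switching trick named in the Knowref abstract can be illustrated with a small sketch. It assumes each example has exactly two named candidate antecedents and that swapping them in the text also swaps the correct answer; the helper names `switch_antecedents` and `augment` and the tuple layout are hypothetical, not taken from the authors' repository.

```python
import re

def switch_antecedents(text, cand_a, cand_b):
    """Swap every whole-word occurrence of the two candidate antecedents."""
    pattern = re.compile(r"\b(%s|%s)\b" % (re.escape(cand_a), re.escape(cand_b)))
    swap = {cand_a: cand_b, cand_b: cand_a}
    return pattern.sub(lambda m: swap[m.group(1)], text)

def augment(example):
    """Return the original example plus its antecedent-switched copy.

    `example` is a hypothetical (text, cand_a, cand_b, answer) tuple.
    Because the two names trade places, the correct answer flips too.
    """
    text, cand_a, cand_b, answer = example
    switched_text = switch_antecedents(text, cand_a, cand_b)
    switched_answer = cand_b if answer == cand_a else cand_a
    return [example, (switched_text, cand_a, cand_b, switched_answer)]

if __name__ == "__main__":
    sample = ("Kevin yelled at Jim because he was so upset.", "Kevin", "Jim", "Kevin")
    for item in augment(sample):
        print(item)
```

A model that keys only on surface cues of the names would give the same answer for both copies, so training on the pair is one way to discourage that shortcut.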

EnsNet: Ensconce Text in the Wild


Title	EnsNet: Ensconce Text in the Wild
Authors	Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai
Abstract	A new method is proposed for removing text from natural images. The challenge is to first accurately localize text on the stroke-level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet is essential to achieve a good performance. Moreover, our EnsNet can significantly outperform previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SMBNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrians) removal tasks. EnsNet is extremely fast and can perform at 333 fps on an i5-8600 CPU device.
Tasks	Image Text Removal
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00723v1
PDF	http://arxiv.org/pdf/1812.00723v1.pdf
PWC	https://paperswithcode.com/paper/ensnet-ensconce-text-in-the-wild
Repo	https://github.com/HCIILAB/Scene-Text-Removal
Framework	mxnet
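
Of the four losses listed in the EnsNet abstract, total variation is the simplest to show in isolation. The sketch below computes the plain anisotropic total variation of a single-channel image stored as nested lists; the function name is hypothetical, and the paper's exact formulation (masking, weighting, normalization) may well differ.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D grid.

    Sums the absolute differences between each pixel and its neighbor
    below and its neighbor to the right. Smooth regions contribute
    nothing, so minimizing this term discourages noisy fill-ins.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                tv += abs(image[r + 1][c] - image[r][c])
            if c + 1 < cols:
                tv += abs(image[r][c + 1] - image[r][c])
    return tv

if __name__ == "__main__":
    flat = [[0.5, 0.5], [0.5, 0.5]]
    edge = [[0.0, 1.0], [0.0, 1.0]]
    print(total_variation(flat), total_variation(edge))
```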

A Graph-to-Sequence Model for AMR-to-Text Generation


Title	A Graph-to-Sequence Model for AMR-to-Text Generation
Authors	Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Abstract	The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph. The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a linearized AMR structure. Although being able to model non-local semantic information, a sequence LSTM can lose information from the AMR graph structure, and thus faces challenges with large graphs, which result in long sequences. We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. On a standard benchmark, our model shows superior results to existing methods in the literature.
Tasks	Graph-to-Sequence, Text Generation
Published	2018-05-07
URL	http://arxiv.org/abs/1805.02473v3
PDF	http://arxiv.org/pdf/1805.02473v3.pdf
PWC	https://paperswithcode.com/paper/a-graph-to-sequence-model-for-amr-to-text
Repo	https://github.com/freesunshine0316/neural-graph-to-seq-mp
Framework	tf

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction


Title	Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Authors	Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi
Abstract	We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
Tasks	Coreference Resolution, graph construction, Joint Entity and Relation Extraction, Named Entity Recognition
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09602v1
PDF	http://arxiv.org/pdf/1808.09602v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-identification-of-entities
Repo	https://github.com/danilo-dessi/skg
Framework	none

RFCDE: Random Forests for Conditional Density Estimation


Title	RFCDE: Random Forests for Conditional Density Estimation
Authors	Taylor Pospisil, Ann B. Lee
Abstract	Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or classification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Python versions, which call a common C++ library, are available.
Tasks	Density Estimation
Published	2018-04-16
URL	http://arxiv.org/abs/1804.05753v2
PDF	http://arxiv.org/pdf/1804.05753v2.pdf
PWC	https://paperswithcode.com/paper/rfcde-random-forests-for-conditional-density
Repo	https://github.com/tpospisi/rfcde
Framework	none

Low-rank geometric mean metric learning


Title	Low-rank geometric mean metric learning
Authors	Mukul Bhutani, Pratik Jawanpuria, Hiroyuki Kasai, Bamdev Mishra
Abstract	We propose a low-rank approach to learning a Mahalanobis metric from data. Inspired by the recent geometric mean metric learning (GMML) algorithm, we propose a low-rank variant of the algorithm. This allows us to jointly learn a low-dimensional subspace where the data reside and the Mahalanobis metric that appropriately fits the data. Our results show that we compete effectively with GMML at lower ranks.
Tasks	Metric Learning
Published	2018-06-14
URL	http://arxiv.org/abs/1806.05454v1
PDF	http://arxiv.org/pdf/1806.05454v1.pdf
PWC	https://paperswithcode.com/paper/low-rank-geometric-mean-metric-learning
Repo	https://github.com/muk343/LR-GMML
Framework	none
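
To make "a low-dimensional subspace plus a Mahalanobis metric" concrete, the sketch below evaluates a squared Mahalanobis distance whose matrix is factored as A = U B Uᵀ, with U a d x r basis and B an r x r symmetric positive definite matrix. This factorization is an assumption made for illustration, and the sketch only evaluates a given metric; it does not implement the GMML learning step.

```python
def project(U, v):
    """Compute U^T v, where U is a d x r matrix given as a list of d rows."""
    rank = len(U[0])
    return [sum(U[i][k] * v[i] for i in range(len(v))) for k in range(rank)]

def low_rank_mahalanobis(x, y, U, B):
    """Squared distance (x - y)^T U B U^T (x - y).

    The difference vector is first projected into the r-dimensional
    subspace, then measured with the small r x r metric B, so the full
    d x d matrix is never formed.
    """
    diff = [a - b for a, b in zip(x, y)]
    z = project(U, diff)
    n = len(z)
    return sum(z[i] * B[i][j] * z[j] for i in range(n) for j in range(n))

if __name__ == "__main__":
    U = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # keeps the first two coordinates
    B = [[2.0, 0.0], [0.0, 1.0]]
    print(low_rank_mahalanobis([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], U, B))
```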

Scalable Population Synthesis with Deep Generative Modeling


Title	Scalable Population Synthesis with Deep Generative Modeling
Authors	Stanislav S. Borysov, Jeppe Rich, Francisco C. Pereira
Abstract	Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to ‘grow’ pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to the previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics.
Tasks
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06910v2
PDF	http://arxiv.org/pdf/1808.06910v2.pdf
PWC	https://paperswithcode.com/paper/scalable-population-synthesis-with-deep
Repo	https://github.com/fredshone/pandamonia
Framework	none
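
For readers unfamiliar with Iterative Proportional Fitting, the baseline family named in the abstract above, here is a minimal two-attribute sketch. It illustrates the classical technique only, not the paper's VAE; the function name and the toy margins are made up for the example.

```python
def ipf(seed, row_targets, col_targets, iterations=50):
    """Iterative Proportional Fitting on a 2D contingency table.

    Alternately rescales rows and columns of `seed` so that their sums
    approach the target marginals. Cells that are zero in the seed stay
    zero, which is the "sampling zeros" limitation the abstract mentions.
    """
    table = [row[:] for row in seed]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [value * target / total for value in table[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
    return table

if __name__ == "__main__":
    fitted = ipf([[1.0, 1.0], [1.0, 1.0]], [30.0, 70.0], [40.0, 60.0])
    for row in fitted:
        print(row)
```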

End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features


Title	End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Authors	Chiori Hori, Huda Alamri, Jue Wang, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh
Abstract	Dialog systems need to understand dynamic visual scenes in order to have conversations with users about the objects and events around them. Scene-aware dialog systems for real-world applications could be developed by integrating state-of-the-art technologies from multiple research areas, including: end-to-end dialog technologies, which generate system responses using models trained from dialog data; visual question answering (VQA) technologies, which answer questions about images using learned image features; and video description technologies, in which descriptions/captions are generated from videos using multimodal information. We introduce a new dataset of dialogs about videos of human behaviors. Each dialog is a typed conversation that consists of a sequence of 10 question-and-answer (QA) pairs between two Amazon Mechanical Turk (AMT) workers. In total, we collected dialogs on roughly 9,000 videos. Using this new dataset for Audio Visual Scene-aware dialog (AVSD), we trained an end-to-end conversation model that generates responses in a dialog about a video. Our experiments demonstrate that using multimodal features that were developed for multimodal attention-based video description enhances the quality of generated dialog about dynamic scenes (videos). Our dataset, model code and pretrained models will be publicly available for a new Video Scene-Aware Dialog challenge.
Tasks	Question Answering, Video Description, Visual Question Answering
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08409v2
PDF	http://arxiv.org/pdf/1806.08409v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-audio-visual-scene-aware-dialog
Repo	https://github.com/hudaAlamri/DSTC7-Audio-Visual-Scene-Aware-Dialog-AVSD-Challenge
Framework	pytorch

Multi-level 3D CNN for Learning Multi-scale Spatial Features


Title	Multi-level 3D CNN for Learning Multi-scale Spatial Features
Authors	Sambit Ghadai, Xian Lee, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy
Abstract	3D object recognition accuracy can be improved by learning the multi-scale spatial features from 3D spatial geometric representations of objects such as point clouds, 3D models, surfaces, and RGB-D data. Current deep learning approaches learn such features either using structured data representations (voxel grids and octrees) or from unstructured representations (graphs and point clouds). Learning features from such structured representations is limited by the restriction on resolution and tree depth, while unstructured representations create a challenge due to non-uniformity among data samples. In this paper, we propose an end-to-end multi-level learning approach on a multi-level voxel grid to overcome these drawbacks. To demonstrate the utility of the proposed multi-level learning, we use a multi-level voxel representation of 3D objects to perform object recognition. The multi-level voxel representation consists of a coarse voxel grid that contains volumetric information of the 3D object. In addition, each voxel in the coarse grid that contains a portion of the object boundary is subdivided into multiple fine-level voxel grids. The performance of our multi-level learning algorithm for object recognition is comparable to dense voxel representations while using significantly lower memory.
Tasks	3D Object Recognition, Object Recognition
Published	2018-05-30
URL	https://arxiv.org/abs/1805.12254v2
PDF	https://arxiv.org/pdf/1805.12254v2.pdf
PWC	https://paperswithcode.com/paper/multi-resolution-3d-convolutional-neural
Repo	https://github.com/idealab-isu/GPView
Framework	none

Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training


Title	Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Authors	Peng Xu, Andrea Madotto, Chien-Sheng Wu, Ji Ho Park, Pascale Fung
Abstract	In this paper, we propose Emo2Vec which encodes emotional semantics into vectors. We train Emo2Vec by multi-task learning six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition. Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings with much smaller training corpora. When concatenated with GloVe, Emo2Vec achieves competitive performances to state-of-the-art results on several tasks using a simple logistic regression classifier.
Tasks	Multi-Task Learning, Sentiment Analysis
Published	2018-09-12
URL	http://arxiv.org/abs/1809.04505v1
PDF	http://arxiv.org/pdf/1809.04505v1.pdf
PWC	https://paperswithcode.com/paper/emo2vec-learning-generalized-emotion
Repo	https://github.com/pxuab/emo2vec_wassa_paper
Framework	pytorch

Keep it stupid simple


Title	Keep it stupid simple
Authors	Erik J Peterson, Necati Alp Müyesser, Timothy Verstynen, Kyle Dunovan
Abstract	Deep reinforcement learning can match and exceed human performance, but if even minor changes are introduced to the environment, artificial networks often can’t adapt. Humans, meanwhile, are quite adaptable. We hypothesize that this is partly because of how humans use heuristics, and partly because humans can imagine new and more challenging environments to learn from. We’ve developed a model of hierarchical reinforcement learning that combines both these elements into a stumbler-strategist network. We test transfer performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that combining imagined play with a heuristic (labeling each position as “good” or “bad”) both accelerates learning and promotes transfer to novel games, while also improving model interpretability.
Tasks	Hierarchical Reinforcement Learning
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03406v1
PDF	http://arxiv.org/pdf/1809.03406v1.pdf
PWC	https://paperswithcode.com/paper/keep-it-stupid-simple
Repo	https://github.com/CoAxLab/azad
Framework	pytorch
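
The "known optimal strategy" for Wythoff's game comes from its cold positions: board coordinates from which the player to move loses under perfect play. The sketch below finds them by exhaustive search, assuming the standard rules (reduce one coordinate by any amount, or both by the same amount; whoever reaches (0, 0) wins). Treating cold positions as the ones worth labeling is an illustrative reading, not the paper's code.

```python
def cold_positions(size):
    """Return the set of (x, y), 0 <= x, y < size, that lose for the mover.

    A position is cold if no legal move reaches another cold position.
    Every move strictly decreases a coordinate, so visiting positions in
    increasing order guarantees all successors are already classified.
    """
    cold = set()
    for x in range(size):
        for y in range(size):
            reachable = (
                [(i, y) for i in range(x)]          # reduce x only
                + [(x, j) for j in range(y)]        # reduce y only
                + [(x - k, y - k) for k in range(1, min(x, y) + 1)]  # diagonal
            )
            if not any(pos in cold for pos in reachable):
                cold.add((x, y))
    return cold

if __name__ == "__main__":
    print(sorted(p for p in cold_positions(8) if p[0] <= p[1]))
```

A player who can always move onto a cold position wins, which is what makes a per-position label a compact summary of the strategy.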

Improving Super-Resolution Methods via Incremental Residual Learning


Title	Improving Super-Resolution Methods via Incremental Residual Learning
Authors	Muneeb Aadil, Rafia Rahim, Sibt ul Hussain
Abstract	Recently, Convolutional Neural Networks (CNNs) have shown promising performance in super-resolution (SR). However, these methods operate primarily on Low Resolution (LR) inputs for memory efficiency, but this limits, as we demonstrate, their ability to (i) model high frequency information; and (ii) smoothly translate from LR to High Resolution (HR) space. To this end, we propose a novel Incremental Residual Learning (IRL) framework to address these issues. In IRL, we first select a typical pre-trained SR network as a master branch. Next, we sequentially train and add residual branches to the main branch, where each residual branch learns to model the accumulated residuals of all previous branches. We plug state-of-the-art methods into the IRL framework and demonstrate consistent performance improvement on public benchmark datasets, setting a new state of the art for SR at only an approximately 20% increase in training time.
Tasks	Super-Resolution
Published	2018-08-21
URL	https://arxiv.org/abs/1808.07110v2
PDF	https://arxiv.org/pdf/1808.07110v2.pdf
PWC	https://paperswithcode.com/paper/improving-super-resolution-methods-via
Repo	https://github.com/muneebaadil/sisr-irl
Framework	pytorch

Few-shot Learning for Named Entity Recognition in Medical Text


Title	Few-shot Learning for Named Entity Recognition in Medical Text
Authors	Maximilian Hofer, Andrey Kormilitzin, Paul Goldberg, Alejo Nevado-Holgado
Abstract	Deep neural network models have recently achieved state-of-the-art performance gains in a variety of natural language processing (NLP) tasks (Young, Hazarika, Poria, & Cambria, 2017). However, these gains rely on the availability of large amounts of annotated examples, without which state-of-the-art performance is rarely achievable. This is especially inconvenient for the many NLP fields where annotated examples are scarce, such as medical text. To improve NLP models in this situation, we evaluate five improvements on named entity recognition (NER) tasks when only ten annotated examples are available: (1) layer-wise initialization with pre-trained weights, (2) hyperparameter tuning, (3) combining pre-training data, (4) custom word embeddings, and (5) optimizing out-of-vocabulary (OOV) words. Experimental results show that the F1 score of 69.3% achievable by state-of-the-art models can be improved to 78.87%.
Tasks	Few-Shot Learning, Medical Named Entity Recognition, Named Entity Recognition, Word Embeddings
Published	2018-11-13
URL	http://arxiv.org/abs/1811.05468v1
PDF	http://arxiv.org/pdf/1811.05468v1.pdf
PWC	https://paperswithcode.com/paper/few-shot-learning-for-named-entity
Repo	https://github.com/SilverQ/NER_Final
Framework	tf

Subsampled Rényi Differential Privacy and Analytical Moments Accountant


Title	Subsampled Rényi Differential Privacy and Analytical Moments Accountant
Authors	Yu-Xiang Wang, Borja Balle, Shiva Kasiviswanathan
Abstract	We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi Differential Privacy (RDP) (Mironov, 2017) parameters for algorithms that: (1) subsample the dataset, and then (2) apply a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. Our results generalize the moments accounting technique, developed by Abadi et al. (2016) for the Gaussian mechanism, to any subsampled RDP mechanism.
Tasks
Published	2018-07-31
URL	http://arxiv.org/abs/1808.00087v2
PDF	http://arxiv.org/pdf/1808.00087v2.pdf
PWC	https://paperswithcode.com/paper/subsampled-renyi-differential-privacy-and
Repo	https://github.com/yuxiangw/autodp
Framework	none
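
As background for the abstract above, the following sketch shows basic RDP accounting for the plain Gaussian mechanism, without subsampling. It uses three standard facts from Mironov (2017): a sensitivity-1 Gaussian mechanism with noise scale sigma satisfies (α, α / (2σ²))-RDP, RDP parameters add under composition, and (α, ε)-RDP implies (ε + log(1/δ) / (α - 1), δ)-DP. The paper's subsampling bound is not implemented here, and the function names are hypothetical rather than the `autodp` API.

```python
import math

def gaussian_rdp(alpha, sigma):
    """RDP epsilon at order alpha for a sensitivity-1 Gaussian mechanism."""
    return alpha / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Convert an (alpha, rdp_eps)-RDP guarantee into an (eps, delta)-DP epsilon."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)

def composed_epsilon(sigma, steps, delta, alphas=range(2, 65)):
    """Best (eps, delta)-DP epsilon after `steps` Gaussian releases.

    RDP composes additively at each order, so the per-step value is
    multiplied by `steps`; the tightest conversion over a grid of
    integer orders is returned.
    """
    return min(
        rdp_to_dp(steps * gaussian_rdp(alpha, sigma), alpha, delta)
        for alpha in alphas
    )

if __name__ == "__main__":
    print(composed_epsilon(sigma=4.0, steps=100, delta=1e-5))
```

Tracking the whole curve over α and converting only at the end is the idea behind a moments accountant; a subsampled mechanism would replace `gaussian_rdp` with an amplified bound.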

Density-aware Single Image De-raining using a Multi-stream Dense Network


Title	Density-aware Single Image De-raining using a Multi-stream Dense Network
Authors	He Zhang, Vishal M. Patel
Abstract	Single image rain streak removal is an extremely challenging problem due to the presence of non-uniform rain densities in images. We present a novel density-aware multi-stream densely connected convolutional neural network-based algorithm, called DID-MDN, for joint rain density estimation and de-raining. The proposed method enables the network itself to automatically determine the rain-density information and then efficiently remove the corresponding rain-streaks guided by the estimated rain-density label. To better characterize rain-streaks with different scales and shapes, a multi-stream densely connected de-raining network is proposed which efficiently leverages features from different scales. Furthermore, a new dataset containing images with rain-density labels is created and used to train the proposed density-aware network. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. In addition, an ablation study is performed to demonstrate the improvements obtained by different modules in the proposed method. Code can be found at: https://github.com/hezhangsprinter
Tasks	Density Estimation
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07412v1
PDF	http://arxiv.org/pdf/1802.07412v1.pdf
PWC	https://paperswithcode.com/paper/density-aware-single-image-de-raining-using-a
Repo	https://github.com/lsy17096535/Single-Image-Deraining
Framework	none