Paper Group AWR 193
The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution. EnsNet: Ensconce Text in the Wild. A Graph-to-Sequence Model for AMR-to-Text Generation. Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. RFCDE: Random Forests for Conditional Density Estimation …
The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution
Title | The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution |
Authors | Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung |
Abstract | We introduce a new benchmark for coreference resolution and NLI, Knowref, that targets common-sense understanding and world knowledge. Previous coreference resolution tasks can largely be solved by exploiting the number and gender of the antecedents, or have been handcrafted and do not reflect the diversity of naturally occurring text. We present a corpus of over 8,000 annotated text passages with ambiguous pronominal anaphora. These instances are both challenging and realistic. We show that various coreference systems, whether rule-based, feature-rich, or neural, perform significantly worse on the task than humans, who display high inter-annotator agreement. To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision. We then use problem-specific insights to propose a data-augmentation trick called antecedent switching to alleviate this tendency in models. Finally, we show that antecedent switching yields promising results on other tasks as well: we use it to achieve state-of-the-art results on the GAP coreference task. |
Tasks | Common Sense Reasoning, Coreference Resolution, Data Augmentation |
Published | 2018-11-02 |
URL | https://arxiv.org/abs/1811.01747v3 |
PDF | https://arxiv.org/pdf/1811.01747v3.pdf |
PWC | https://paperswithcode.com/paper/the-hard-core-coreference-corpus-removing |
Repo | https://github.com/aemami1/KnowRef |
Framework | none |
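The antecedent-switching augmentation is easy to picture in code. Below is a minimal sketch, assuming a simple string-level swap of two candidate antecedents with the label flipped accordingly; the example sentence and function layout are illustrative, not the corpus's actual schema.

```python
# Minimal sketch of antecedent switching: swap the two candidate antecedents
# in the passage and flip the label, so a model cannot rely on gender/number
# cues alone. Illustrative only, not the authors' exact procedure.

def antecedent_switch(text, cand_a, cand_b, label):
    """Return an augmented (text, label) pair with the candidates swapped."""
    placeholder = "\u0000"  # temporary token so the two replacements don't collide
    swapped = (text.replace(cand_a, placeholder)
                   .replace(cand_b, cand_a)
                   .replace(placeholder, cand_b))
    flipped = cand_b if label == cand_a else cand_a
    return swapped, flipped

text = "John asked Bill for help because he was struggling."
print(antecedent_switch(text, "John", "Bill", "John"))
# -> ('Bill asked John for help because he was struggling.', 'Bill')
```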
EnsNet: Ensconce Text in the Wild
Title | EnsNet: Ensconce Text in the Wild |
Authors | Shuaitao Zhang, Yuliang Liu, Lianwen Jin, Yaoxiong Huang, Songxuan Lai |
Abstract | A new method is proposed for removing text from natural images. The challenge is to first accurately localize text at the stroke level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of EnsNet is essential to achieving good performance. Moreover, our EnsNet significantly outperforms previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SMBNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrians) removal tasks. EnsNet is extremely fast and can run at 333 fps on an i5-8600 CPU. |
Tasks | Image Text Removal |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00723v1 |
PDF | http://arxiv.org/pdf/1812.00723v1.pdf |
PWC | https://paperswithcode.com/paper/ensnet-ensconce-text-in-the-wild |
Repo | https://github.com/HCIILAB/Scene-Text-Removal |
Framework | mxnet |
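Two of the four losses named in the abstract are standard enough to sketch. The snippet below is a hedged PyTorch illustration of a multiscale L1 regression loss and a total variation loss; the scales, weights, and combination factor are assumptions, not the paper's settings.

```python
# Hedged sketch of two EnsNet-style losses: multiscale L1 regression between
# output and ground truth at several scales, plus total variation smoothness.
import torch
import torch.nn.functional as F

def multiscale_regression_loss(pred, target, scales=(1, 2, 4), weights=(1.0, 0.8, 0.6)):
    loss = 0.0
    for s, w in zip(scales, weights):
        p = F.avg_pool2d(pred, s) if s > 1 else pred      # compare at coarser scales too
        t = F.avg_pool2d(target, s) if s > 1 else target
        loss = loss + w * F.l1_loss(p, t)
    return loss

def total_variation_loss(img):
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()  # vertical gradients
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()  # horizontal gradients
    return dh + dw

pred = torch.rand(2, 3, 64, 64, requires_grad=True)
target = torch.rand(2, 3, 64, 64)
loss = multiscale_regression_loss(pred, target) + 0.1 * total_variation_loss(pred)
loss.backward()
```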
A Graph-to-Sequence Model for AMR-to-Text Generation
Title | A Graph-to-Sequence Model for AMR-to-Text Generation |
Authors | Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea |
Abstract | The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph. The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a linearized AMR structure. Although able to model non-local semantic information, a sequence LSTM can lose information from the AMR graph structure, and thus faces challenges with large graphs, which result in long sequences. We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. On a standard benchmark, our model shows superior results to existing methods in the literature. |
Tasks | Graph-to-Sequence, Text Generation |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02473v3 |
PDF | http://arxiv.org/pdf/1805.02473v3.pdf |
PWC | https://paperswithcode.com/paper/a-graph-to-sequence-model-for-amr-to-text |
Repo | https://github.com/freesunshine0316/neural-graph-to-seq-mp |
Framework | tf |
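The core idea, encoding the graph directly rather than a linearization, can be illustrated with a single message-passing update. The sketch below uses a GRU cell in place of the paper's LSTM gating and a dense adjacency matrix; it is a simplification of the authors' graph-state recurrence, not a reimplementation.

```python
# One graph-state update step in the spirit of a graph encoder: each AMR node
# aggregates its neighbors' hidden states before a gated update.
import torch
import torch.nn as nn

class GraphStateStep(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)  # GRU stands in for the paper's LSTM gates

    def forward(self, h, adj):
        # h: (num_nodes, dim) node states; adj: (num_nodes, num_nodes) 0/1 matrix
        msg = adj @ h / adj.sum(dim=1, keepdim=True).clamp(min=1)  # mean over neighbors
        return self.cell(msg, h)

h = torch.rand(5, 32)
adj = (torch.rand(5, 5) > 0.5).float()
step = GraphStateStep(32)
for _ in range(3):  # a few propagation steps spread information across the graph
    h = step(h, adj)
```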
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Title | Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction |
Authors | Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi |
Abstract | We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature. |
Tasks | Coreference Resolution, Graph Construction, Joint Entity and Relation Extraction, Named Entity Recognition |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09602v1 |
PDF | http://arxiv.org/pdf/1808.09602v1.pdf |
PWC | https://paperswithcode.com/paper/multi-task-identification-of-entities |
Repo | https://github.com/danilo-dessi/skg |
Framework | none |
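The shared-span setup can be sketched compactly: one shared encoder, span representations, and three small task heads. The dimensions, the bilinear scoring functions, and the endpoint-concatenation span representation below are all illustrative assumptions, not SciIE's actual architecture.

```python
# Hedged sketch of multi-task learning over shared span representations:
# one encoder feeds entity, relation, and coreference heads.
import torch
import torch.nn as nn

class SharedSpanModel(nn.Module):
    def __init__(self, vocab, dim=64, n_entity_types=6):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.entity_head = nn.Linear(4 * dim, n_entity_types)   # per-span entity logits
        self.relation_head = nn.Bilinear(4 * dim, 4 * dim, 1)   # span-pair relation score
        self.coref_head = nn.Bilinear(4 * dim, 4 * dim, 1)      # span-pair coref score

    def forward(self, tokens, spans):
        enc, _ = self.encoder(self.embed(tokens))               # shared contextual encoding
        # span representation by endpoint concatenation: [h_start; h_end]
        reps = torch.stack([torch.cat([enc[0, s], enc[0, e]]) for s, e in spans])
        pair = (reps[0:1], reps[1:2])                           # e.g. score spans 0 and 1
        return self.entity_head(reps), self.relation_head(*pair), self.coref_head(*pair)

tokens = torch.randint(0, 100, (1, 12))                         # one 12-token "sentence"
ent, rel, cor = SharedSpanModel(100)(tokens, [(0, 2), (5, 7)])
```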
RFCDE: Random Forests for Conditional Density Estimation
Title | RFCDE: Random Forests for Conditional Density Estimation |
Authors | Taylor Pospisil, Ann B. Lee |
Abstract | The random forest is a common non-parametric regression technique that performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or classification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions, which is useful for propagating uncertainty, and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Python versions, which call a common C++ library, are available. |
Tasks | Density Estimation |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05753v2 |
PDF | http://arxiv.org/pdf/1804.05753v2.pdf |
PWC | https://paperswithcode.com/paper/rfcde-random-forests-for-conditional-density |
Repo | https://github.com/tpospisi/rfcde |
Framework | none |
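For intuition, forest-based conditional density estimation can be approximated with ordinary scikit-learn tools: weight training points by how often they share a leaf with the query, then kernel-smooth their responses. This leaf-weighting sketch is only a stand-in; RFCDE's contribution is choosing splits that directly optimize a CDE loss, which is not reproduced here.

```python
# Conceptual sketch of conditional density estimation p(y | x) via a forest:
# training points that co-occur in leaves with the query get KDE weight.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(500)

forest = RandomForestRegressor(n_estimators=50).fit(X, y)
leaves_train = forest.apply(X)                          # (n_samples, n_trees) leaf ids

def conditional_density(x_query, y_grid, bandwidth=0.2):
    leaves_q = forest.apply(x_query.reshape(1, -1))[0]
    weights = (leaves_train == leaves_q).mean(axis=1)   # leaf co-membership frequency
    weights /= weights.sum()
    kernels = np.exp(-0.5 * ((y_grid[:, None] - y[None, :]) / bandwidth) ** 2)
    return (kernels * weights).sum(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

grid = np.linspace(-2, 2, 200)
density = conditional_density(np.array([0.5]), grid)    # estimated p(y | x=0.5)
```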
Low-rank geometric mean metric learning
Title | Low-rank geometric mean metric learning |
Authors | Mukul Bhutani, Pratik Jawanpuria, Hiroyuki Kasai, Bamdev Mishra |
Abstract | We propose a low-rank approach to learning a Mahalanobis metric from data. Inspired by the recent geometric mean metric learning (GMML) algorithm, we propose a low-rank variant of the algorithm. This allows us to jointly learn a low-dimensional subspace where the data reside and the Mahalanobis metric that appropriately fits the data. Our results show that we compete effectively with GMML at lower ranks. |
Tasks | Metric Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05454v1 |
PDF | http://arxiv.org/pdf/1806.05454v1.pdf |
PWC | https://paperswithcode.com/paper/low-rank-geometric-mean-metric-learning |
Repo | https://github.com/muk343/LR-GMML |
Framework | none |
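The full-rank GMML solution that the paper builds on has a closed form: the learned Mahalanobis matrix is the geometric mean of the inverse similarity matrix and the dissimilarity matrix. The sketch below computes that closed form with SciPy; the low-rank manifold optimization that is this paper's actual contribution is not reproduced.

```python
# Full-rank GMML closed form: M = S^{-1} # D, the geometric mean of the
# inverse similarity scatter S^{-1} and the dissimilarity scatter D.
import numpy as np
from scipy.linalg import sqrtm, inv

def geometric_mean(A, B):
    """A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} for SPD matrices."""
    rA = np.real(sqrtm(A))
    rA_inv = inv(rA)
    return rA @ np.real(sqrtm(rA_inv @ B @ rA_inv)) @ rA

rng = np.random.default_rng(1)
d = 5
sim = [(rng.standard_normal(d), rng.standard_normal(d)) for _ in range(20)]
dis = [(rng.standard_normal(d), rng.standard_normal(d)) for _ in range(20)]
S = sum(np.outer(a - b, a - b) for a, b in sim) + 1e-6 * np.eye(d)
D = sum(np.outer(a - b, a - b) for a, b in dis) + 1e-6 * np.eye(d)
M = geometric_mean(inv(S), D)   # Mahalanobis metric: dist^2 = (x - y)^T M (x - y)
```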
Scalable Population Synthesis with Deep Generative Modeling
Title | Scalable Population Synthesis with Deep Generative Modeling |
Authors | Stanislav S. Borysov, Jeppe Rich, Francisco C. Pereira |
Abstract | Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in transport modeling, where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to ‘grow’ pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics. |
Tasks | |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.06910v2 |
PDF | http://arxiv.org/pdf/1808.06910v2.pdf |
PWC | https://paperswithcode.com/paper/scalable-population-synthesis-with-deep |
Repo | https://github.com/fredshone/pandamonia |
Framework | none |
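A toy version of a VAE-based synthesizer makes the mechanics concrete: encode one-hot agent attributes to a latent Gaussian, decode to categorical logits, and sample fresh agents from the prior. The single-attribute setup and all sizes below are illustrative assumptions, not the paper's model.

```python
# Minimal VAE sketch for categorical agent attributes: train with
# reconstruction + KL, then sample synthetic agents from the prior.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PopVAE(nn.Module):
    def __init__(self, n_cats=8, latent=4):
        super().__init__()
        self.enc = nn.Linear(n_cats, 2 * latent)   # outputs [mu; logvar]
        self.dec = nn.Linear(latent, n_cats)
        self.latent = latent

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        recon = F.cross_entropy(self.dec(z), x.argmax(dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
        return recon + kl

    def sample(self, n):
        z = torch.randn(n, self.latent)            # draw from the prior
        return torch.distributions.Categorical(logits=self.dec(z)).sample()

vae = PopVAE()
x = F.one_hot(torch.randint(0, 8, (64,)), 8).float()
loss = vae(x)
loss.backward()
new_agents = vae.sample(5)    # attribute codes for five synthetic agents
```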
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Title | End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features |
Authors | Chiori Hori, Huda Alamri, Jue Wang, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh |
Abstract | Dialog systems need to understand dynamic visual scenes in order to have conversations with users about the objects and events around them. Scene-aware dialog systems for real-world applications could be developed by integrating state-of-the-art technologies from multiple research areas, including: end-to-end dialog technologies, which generate system responses using models trained from dialog data; visual question answering (VQA) technologies, which answer questions about images using learned image features; and video description technologies, in which descriptions/captions are generated from videos using multimodal information. We introduce a new dataset of dialogs about videos of human behaviors. Each dialog is a typed conversation that consists of a sequence of 10 question-and-answer (QA) pairs between two Amazon Mechanical Turk (AMT) workers. In total, we collected dialogs on roughly 9,000 videos. Using this new dataset for Audio Visual Scene-aware dialog (AVSD), we trained an end-to-end conversation model that generates responses in a dialog about a video. Our experiments demonstrate that using multimodal features that were developed for multimodal attention-based video description enhances the quality of generated dialog about dynamic scenes (videos). Our dataset, model code and pretrained models will be publicly available for a new Video Scene-Aware Dialog challenge. |
Tasks | Question Answering, Video Description, Visual Question Answering |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08409v2 |
PDF | http://arxiv.org/pdf/1806.08409v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-audio-visual-scene-aware-dialog |
Repo | https://github.com/hudaAlamri/DSTC7-Audio-Visual-Scene-Aware-Dialog-AVSD-Challenge |
Framework | pytorch |
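The multimodal attention fusion the abstract credits can be sketched in a few lines: a question/dialog state attends separately over video and audio feature sequences, and the attended vectors are fused before response generation. This is a generic illustration with assumed dimensions, not the authors' architecture.

```python
# Generic sketch of query-conditioned attention over two modalities,
# followed by a simple fusion layer.
import torch
import torch.nn as nn

class MultimodalAttention(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.att_video = nn.Linear(dim, dim)
        self.att_audio = nn.Linear(dim, dim)
        self.fuse = nn.Linear(2 * dim, dim)

    def attend(self, proj, query, feats):
        scores = torch.softmax(feats @ proj(query), dim=0)   # (T,) attention weights
        return (scores.unsqueeze(-1) * feats).sum(dim=0)     # weighted sum over time

    def forward(self, query, video_feats, audio_feats):
        v = self.attend(self.att_video, query, video_feats)
        a = self.attend(self.att_audio, query, audio_feats)
        return torch.tanh(self.fuse(torch.cat([v, a])))      # fused multimodal context

ctx = MultimodalAttention()(torch.rand(64), torch.rand(20, 64), torch.rand(30, 64))
```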
Multi-level 3D CNN for Learning Multi-scale Spatial Features
Title | Multi-level 3D CNN for Learning Multi-scale Spatial Features |
Authors | Sambit Ghadai, Xian Lee, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy |
Abstract | 3D object recognition accuracy can be improved by learning the multi-scale spatial features from 3D spatial geometric representations of objects such as point clouds, 3D models, surfaces, and RGB-D data. Current deep learning approaches learn such features either using structured data representations (voxel grids and octrees) or from unstructured representations (graphs and point clouds). Learning features from such structured representations is limited by the restriction on resolution and tree depth, while unstructured representations create a challenge due to non-uniformity among data samples. In this paper, we propose an end-to-end multi-level learning approach on a multi-level voxel grid to overcome these drawbacks. To demonstrate the utility of the proposed multi-level learning, we use a multi-level voxel representation of 3D objects to perform object recognition. The multi-level voxel representation consists of a coarse voxel grid that contains volumetric information of the 3D object. In addition, each voxel in the coarse grid that contains a portion of the object boundary is subdivided into multiple fine-level voxel grids. The performance of our multi-level learning algorithm for object recognition is comparable to dense voxel representations while using significantly lower memory. |
Tasks | 3D Object Recognition, Object Recognition |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.12254v2 |
PDF | https://arxiv.org/pdf/1805.12254v2.pdf |
PWC | https://paperswithcode.com/paper/multi-resolution-3d-convolutional-neural |
Repo | https://github.com/idealab-isu/GPView |
Framework | none |
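The two-level representation is straightforward to prototype with NumPy: build a coarse occupancy grid, mark occupied voxels that have at least one empty face-neighbor as boundary voxels, and voxelize each boundary voxel's points at a finer resolution. The point-cloud input and grid sizes below are illustrative assumptions.

```python
# Small sketch of a two-level voxel grid: coarse occupancy plus a fine
# sub-grid for each coarse voxel on the object boundary.
import numpy as np

def voxelize(points, res):
    """Occupancy grid of points (assumed inside the unit cube) at resolution res."""
    idx = np.clip((points * res).astype(int), 0, res - 1)
    grid = np.zeros((res, res, res), dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid, idx

points = np.random.rand(5000, 3)                 # toy point cloud in the unit cube
coarse, idx = voxelize(points, 8)

# a coarse voxel lies on the boundary if any of its six face-neighbors is empty
padded = np.pad(coarse, 1)
occupied_neighbors = sum(np.roll(padded, s, axis=ax)[1:-1, 1:-1, 1:-1]
                         for ax in range(3) for s in (-1, 1))
boundary = coarse & (occupied_neighbors < 6)

fine = {}
for v in np.argwhere(boundary):
    local = points[np.all(idx == v, axis=1)] * 8 - v   # rescale into this voxel's cube
    fine[tuple(v)] = voxelize(local, 4)[0]             # 4x4x4 fine grid per boundary voxel
```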
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Title | Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training |
Authors | Peng Xu, Andrea Madotto, Chien-Sheng Wu, Ji Ho Park, Pascale Fung |
Abstract | In this paper, we propose Emo2Vec, which encodes emotional semantics into vectors. We train Emo2Vec via multi-task learning on six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition. Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings, with much smaller training corpora. When concatenated with GloVe, Emo2Vec achieves performance competitive with state-of-the-art results on several tasks using a simple logistic regression classifier. |
Tasks | Multi-Task Learning, Sentiment Analysis |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04505v1 |
PDF | http://arxiv.org/pdf/1809.04505v1.pdf |
PWC | https://paperswithcode.com/paper/emo2vec-learning-generalized-emotion |
Repo | https://github.com/pxuab/emo2vec_wassa_paper |
Framework | pytorch |
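The evaluation recipe in the abstract (concatenate with GloVe, average, then a simple logistic regression) is easy to sketch. The random embedding tables below are stand-ins for the real pretrained GloVe and Emo2Vec lookups.

```python
# Sketch of the evaluation setup: concatenate GloVe and Emo2Vec vectors,
# average over a sentence, and fit logistic regression on the features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
vocab = 1000
glove = rng.standard_normal((vocab, 300))     # stand-in for pretrained GloVe
emo2vec = rng.standard_normal((vocab, 100))   # stand-in for pretrained Emo2Vec

def featurize(token_ids):
    vecs = np.concatenate([glove[token_ids], emo2vec[token_ids]], axis=1)
    return vecs.mean(axis=0)                  # average the concatenated vectors

X = np.stack([featurize(rng.integers(0, vocab, 12)) for _ in range(200)])
y = rng.integers(0, 2, 200)                   # e.g. positive/negative sentiment
clf = LogisticRegression(max_iter=1000).fit(X, y)
```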
Keep it stupid simple
Title | Keep it stupid simple |
Authors | Erik J Peterson, Necati Alp Müyesser, Timothy Verstynen, Kyle Dunovan |
Abstract | Deep reinforcement learning can match and exceed human performance, but if even minor changes are introduced to the environment, artificial networks often can’t adapt. Humans, meanwhile, are quite adaptable. We hypothesize that this is partly because of how humans use heuristics, and partly because humans can imagine new and more challenging environments to learn from. We’ve developed a model of hierarchical reinforcement learning that combines both these elements into a stumbler-strategist network. We test the transfer performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that combining imagined play with a heuristic (labeling each position as “good” or “bad”) both accelerates learning and promotes transfer to novel games, while also improving model interpretability. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03406v1 |
PDF | http://arxiv.org/pdf/1809.03406v1.pdf |
PWC | https://paperswithcode.com/paper/keep-it-stupid-simple |
Repo | https://github.com/CoAxLab/azad |
Framework | pytorch |
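The “good”/“bad” labeling relies on Wythoff’s game having a known optimal structure: the cold (losing) positions are (floor(kφ), floor(kφ²)) and their mirrors, where φ is the golden ratio. Below is a small sketch of that labeling as a plausible stand-in for the paper’s heuristic.

```python
# Label Wythoff's game positions using the known closed-form solution:
# a position is "cold" (losing for the player to move) iff its sorted
# coordinates are (floor(k*phi), floor(k*phi^2)), i.e. a == floor(k*phi)
# with k = b - a.
import math

PHI = (1 + math.sqrt(5)) / 2

def is_cold(x, y):
    """True if (x, y) is a losing position for the player to move."""
    a, b = min(x, y), max(x, y)
    k = b - a
    return a == math.floor(k * PHI)

# moving TO a cold position leaves the opponent losing, so cold squares
# are "good" destinations and the rest are "bad"
labels = {(x, y): "good" if is_cold(x, y) else "bad"
          for x in range(10) for y in range(10)}
print(is_cold(1, 2), is_cold(3, 5), is_cold(2, 2))  # True True False
```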
Improving Super-Resolution Methods via Incremental Residual Learning
Title | Improving Super-Resolution Methods via Incremental Residual Learning |
Authors | Muneeb Aadil, Rafia Rahim, Sibt ul Hussain |
Abstract | Recently, Convolutional Neural Networks (CNNs) have shown promising performance in super-resolution (SR). However, these methods operate primarily on Low Resolution (LR) inputs for memory efficiency, but this limits, as we demonstrate, their ability to (i) model high-frequency information; and (ii) smoothly translate from LR to High Resolution (HR) space. To this end, we propose a novel Incremental Residual Learning (IRL) framework to address these issues. In IRL, we first select a typical pre-trained SR network as the master branch. Next, we sequentially train and add residual branches to the master branch, where each residual branch is learned to model the accumulated residuals of all previous branches. We plug state-of-the-art methods into the IRL framework and demonstrate consistent performance improvements on public benchmark datasets, setting a new state of the art for SR with only an approximately 20% increase in training time. |
Tasks | Super-Resolution |
Published | 2018-08-21 |
URL | https://arxiv.org/abs/1808.07110v2 |
PDF | https://arxiv.org/pdf/1808.07110v2.pdf |
PWC | https://paperswithcode.com/paper/improving-super-resolution-methods-via |
Repo | https://github.com/muneebaadil/sisr-irl |
Framework | pytorch |
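The training recipe reads almost directly as code: freeze the pretrained master branch, then fit a residual branch so that the sum matches the HR target. The one-layer convolutions below are placeholders for real SR networks, and the matched input/output sizes are a simplifying assumption.

```python
# Compact sketch of incremental residual learning: the frozen master
# branch produces a base estimate, and a new branch learns its residual.
import torch
import torch.nn as nn

master = nn.Conv2d(3, 3, 3, padding=1)           # stand-in for a pretrained SR net
for p in master.parameters():
    p.requires_grad = False                      # master branch stays frozen

residual_branch = nn.Conv2d(3, 3, 3, padding=1)  # learns accumulated residuals
opt = torch.optim.Adam(residual_branch.parameters(), lr=1e-4)

lr_img = torch.rand(1, 3, 32, 32)
hr_img = torch.rand(1, 3, 32, 32)                # matching sizes assumed for brevity

base = master(lr_img)
loss = nn.functional.l1_loss(base + residual_branch(lr_img), hr_img)
loss.backward()                                  # only the residual branch updates
opt.step()
```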
Few-shot Learning for Named Entity Recognition in Medical Text
Title | Few-shot Learning for Named Entity Recognition in Medical Text |
Authors | Maximilian Hofer, Andrey Kormilitzin, Paul Goldberg, Alejo Nevado-Holgado |
Abstract | Deep neural network models have recently achieved state-of-the-art performance gains in a variety of natural language processing (NLP) tasks (Young, Hazarika, Poria, & Cambria, 2017). However, these gains rely on the availability of large amounts of annotated examples, without which state-of-the-art performance is rarely achievable. This is especially inconvenient for the many NLP fields where annotated examples are scarce, such as medical text. To improve NLP models in this situation, we evaluate five improvements on named entity recognition (NER) tasks when only ten annotated examples are available: (1) layer-wise initialization with pre-trained weights, (2) hyperparameter tuning, (3) combining pre-training data, (4) custom word embeddings, and (5) optimizing out-of-vocabulary (OOV) words. Experimental results show that the F1 score of 69.3% achievable by state-of-the-art models can be improved to 78.87%. |
Tasks | Few-Shot Learning, Medical Named Entity Recognition, Named Entity Recognition, Word Embeddings |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05468v1 |
PDF | http://arxiv.org/pdf/1811.05468v1.pdf |
PWC | https://paperswithcode.com/paper/few-shot-learning-for-named-entity |
Repo | https://github.com/SilverQ/NER_Final |
Framework | tf |
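Improvement (1), layer-wise initialization with pre-trained weights, looks roughly like the sketch below: copy the embedding and encoder weights from a source tagger and keep a freshly initialized output layer for the new tag set. The module names and shapes are illustrative assumptions, not the authors' model.

```python
# Sketch of layer-wise initialization for few-shot NER: transfer the
# embedding and LSTM layers, reinitialize only the tag head.
import torch
import torch.nn as nn

def make_tagger(vocab=5000, dim=64, n_tags=9):
    return nn.ModuleDict({
        "embed": nn.Embedding(vocab, dim),
        "lstm": nn.LSTM(dim, dim, batch_first=True),
        "head": nn.Linear(dim, n_tags),
    })

source = make_tagger()              # pretend this was trained on a large NER corpus
target = make_tagger(n_tags=5)      # the medical task has a different tag set

# layer-wise transfer: copy embedding and LSTM weights, keep the new head
target["embed"].load_state_dict(source["embed"].state_dict())
target["lstm"].load_state_dict(source["lstm"].state_dict())
# `target` is now fine-tuned on the handful of annotated target examples
```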
Subsampled Rényi Differential Privacy and Analytical Moments Accountant
Title | Subsampled Rényi Differential Privacy and Analytical Moments Accountant |
Authors | Yu-Xiang Wang, Borja Balle, Shiva Kasiviswanathan |
Abstract | We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi Differential Privacy (RDP) (Mironov, 2017) parameters for algorithms that: (1) subsample the dataset, and then (2) apply a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. Our results generalize the moments accounting technique, developed by Abadi et al. (2016) for the Gaussian mechanism, to any subsampled RDP mechanism. |
Tasks | |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1808.00087v2 |
PDF | http://arxiv.org/pdf/1808.00087v2.pdf |
PWC | https://paperswithcode.com/paper/subsampled-renyi-differential-privacy-and |
Repo | https://github.com/yuxiangw/autodp |
Framework | none |
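The accountant bookkeeping that RDP enables is short enough to show directly: RDP orders compose additively across steps and convert to (ε, δ)-DP at the best order. The sketch below uses the plain (non-subsampled) Gaussian-mechanism RDP bound from Mironov (2017); the paper's subject, the tighter subsampled bound, is not reproduced here.

```python
# Moments-accountant-style bookkeeping with RDP: compose additively over
# steps, then convert to (eps, delta)-DP at the best order alpha.
import numpy as np

def gaussian_rdp(alpha, sigma):
    """RDP of the Gaussian mechanism (sensitivity 1, noise multiplier sigma)."""
    return alpha / (2 * sigma ** 2)

def rdp_to_dp(alphas, rdp_eps, delta):
    """Standard conversion: eps = min over alpha of rdp(alpha) + log(1/delta)/(alpha-1)."""
    return min(e + np.log(1 / delta) / (a - 1) for a, e in zip(alphas, rdp_eps))

alphas = np.arange(2, 64)
steps, sigma, delta = 1000, 4.0, 1e-5
total_rdp = [steps * gaussian_rdp(a, sigma) for a in alphas]  # additive composition
print(rdp_to_dp(alphas, total_rdp, delta))                    # final (eps, delta)-DP eps
```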
Density-aware Single Image De-raining using a Multi-stream Dense Network
Title | Density-aware Single Image De-raining using a Multi-stream Dense Network |
Authors | He Zhang, Vishal M. Patel |
Abstract | Single image rain streak removal is an extremely challenging problem due to the presence of non-uniform rain densities in images. We present a novel density-aware multi-stream densely connected convolutional neural network-based algorithm, called DID-MDN, for joint rain density estimation and de-raining. The proposed method enables the network itself to automatically determine the rain-density information and then efficiently remove the corresponding rain-streaks guided by the estimated rain-density label. To better characterize rain-streaks with different scales and shapes, a multi-stream densely connected de-raining network is proposed which efficiently leverages features from different scales. Furthermore, a new dataset containing images with rain-density labels is created and used to train the proposed density-aware network. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. In addition, an ablation study is performed to demonstrate the improvements obtained by different modules in the proposed method. Code can be found at: https://github.com/hezhangsprinter |
Tasks | Density Estimation |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07412v1 |
PDF | http://arxiv.org/pdf/1802.07412v1.pdf |
PWC | https://paperswithcode.com/paper/density-aware-single-image-de-raining-using-a |
Repo | https://github.com/lsy17096535/Single-Image-Deraining |
Framework | none |
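The density-aware guidance can be illustrated generically: a small classifier predicts a rain-density label, which is broadcast to a spatial map and concatenated with the input that the de-raining branch sees. Both toy networks below are placeholders for DID-MDN's multi-stream dense blocks, not the authors' model.

```python
# Generic sketch of density-guided de-raining: predict a density label,
# broadcast it spatially, and condition the de-raining branch on it.
import torch
import torch.nn as nn

class DensityAwareDerain(nn.Module):
    def __init__(self, n_density=3):
        super().__init__()
        self.density_net = nn.Sequential(            # tiny rain-density classifier
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_density))
        self.derain = nn.Conv2d(3 + n_density, 3, 3, padding=1)

    def forward(self, rainy):
        label = self.density_net(rainy).softmax(dim=1)           # (B, n_density)
        maps = label[:, :, None, None].expand(-1, -1, *rainy.shape[2:])
        return self.derain(torch.cat([rainy, maps], dim=1))      # de-rained estimate

out = DensityAwareDerain()(torch.rand(2, 3, 64, 64))
```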