January 29, 2020


Paper Group ANR 684



Generalizing Back-Translation in Neural Machine Translation

Title Generalizing Back-Translation in Neural Machine Translation
Authors Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney
Abstract Back-translation - data augmentation by translating target monolingual data - is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. Our statements are investigated on the WMT 2018 German - English news translation task.
Tasks Data Augmentation, Machine Translation, Synthetic Data Generation
Published 2019-06-17
URL https://arxiv.org/abs/1906.07286v1
PDF https://arxiv.org/pdf/1906.07286v1.pdf
PWC https://paperswithcode.com/paper/generalizing-back-translation-in-neural
Repo
Framework
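
The two remedies named in the abstract above, disabling label smoothing for the target-to-source model and sampling from a restricted search space, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names (`smooth_labels`, `restrict_to_top_k`, `sample_top_k`), uniform smoothing as the smoothing scheme, top-k truncation as one possible way to restrict the search space, and the made-up probabilities. The paper's own formulation may differ.

```python
import random


def smooth_labels(target_index, vocab_size, epsilon):
    """Return a label-smoothed target distribution (uniform smoothing).

    With epsilon == 0.0 this is the plain one-hot target, i.e. label
    smoothing is disabled.
    """
    off_value = epsilon / vocab_size
    dist = [off_value] * vocab_size
    dist[target_index] += 1.0 - epsilon
    return dist


def restrict_to_top_k(probs, k):
    """Keep the k most probable tokens, renormalize, and zero out the rest."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    restricted = [0.0] * len(probs)
    for i in top:
        restricted[i] = probs[i] / mass
    return restricted


def sample_top_k(probs, k, rng):
    """Sample one token index from the top-k restricted distribution."""
    restricted = restrict_to_top_k(probs, k)
    return rng.choices(range(len(probs)), weights=restricted, k=1)[0]


# Toy next-token distribution over a 5-word vocabulary (made-up numbers).
probs = [0.50, 0.30, 0.15, 0.04, 0.01]
rng = random.Random(0)
samples = [sample_top_k(probs, k=2, rng=rng) for _ in range(1000)]
```

With `k=2`, every sampled index is one of the two most probable tokens, so low-probability tokens in the tail can never enter the synthetic source sentence. Smoothing moves probability mass onto exactly that tail, which is one plausible reading of why the two remedies are proposed together.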

Learning an Uncertainty-Aware Object Detector for Autonomous Driving

Title Learning an Uncertainty-Aware Object Detector for Autonomous Driving
Authors Gregory P. Meyer, Niranjan Thakurdesai
Abstract The capability to detect objects is a core part of autonomous driving. Due to sensor noise and incomplete data, perfectly detecting and localizing every object is infeasible. Therefore, it is important for a detector to provide the amount of uncertainty in each prediction. Providing the autonomous system with reliable uncertainties enables the vehicle to react differently based on the level of uncertainty. Previous work has estimated the uncertainty in a detection by predicting a probability distribution over object bounding boxes. In this work, we propose a method to improve the ability to learn the probability distribution by considering the potential noise in the ground-truth labeled data. Our proposed approach improves not only the accuracy of the learned distribution but also the object detection performance.
Tasks Autonomous Driving, Object Detection
Published 2019-10-24
URL https://arxiv.org/abs/1910.11375v2
PDF https://arxiv.org/pdf/1910.11375v2.pdf
PWC https://paperswithcode.com/paper/learning-an-uncertainty-aware-object-detector
Repo
Framework

An Analysis of Deep Neural Networks with Attention for Action Recognition from a Neurophysiological Perspective

Title An Analysis of Deep Neural Networks with Attention for Action Recognition from a Neurophysiological Perspective
Authors Swathikiran Sudhakaran, Oswald Lanz
Abstract We review three recent deep learning based methods for action recognition and present a brief comparative analysis of the methods from a neurophysiological point of view. We posit that there are some analogies between the three presented deep learning based methods and some of the existing hypotheses regarding the functioning of the human brain.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01273v1
PDF https://arxiv.org/pdf/1907.01273v1.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-deep-neural-networks-with
Repo
Framework

DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs

Title DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs
Authors Yi-Lin Tuan, Yun-Nung Chen, Hung-yi Lee
Abstract Data-driven, knowledge-grounded neural conversation models are capable of generating more informative responses. However, these models have not yet demonstrated that they can zero-shot adapt to updated, unseen knowledge graphs. This paper proposes a new task about how to apply dynamic knowledge graphs in neural conversation models and presents a novel TV series conversation corpus (DyKgChat) for the task. Our new task and corpus aid in understanding the influence of dynamic knowledge graphs on response generation. Also, we propose a preliminary model that selects an output from two networks at each time step: a sequence-to-sequence model (Seq2Seq) and a multi-hop reasoning model, in order to support dynamic knowledge graphs. To benchmark this new task and evaluate the capability of adaptation, we introduce several evaluation metrics and the experiments show that our proposed approach outperforms previous knowledge-grounded conversation models. The proposed corpus and model can motivate future research directions.
Tasks Dialogue Generation, Knowledge Graphs
Published 2019-10-01
URL https://arxiv.org/abs/1910.00610v1
PDF https://arxiv.org/pdf/1910.00610v1.pdf
PWC https://paperswithcode.com/paper/dykgchat-benchmarking-dialogue-generation
Repo
Framework

Multimodal and Multi-view Models for Emotion Recognition

Title Multimodal and Multi-view Models for Emotion Recognition
Authors Gustavo Aguilar, Viktor Rozgić, Weiran Wang, Chao Wang
Abstract Studies on emotion recognition (ER) show that combining lexical and acoustic information results in more robust and accurate models. The majority of the studies focus on settings where both modalities are available in training and evaluation. However, in practice, this is not always the case; getting ASR output may represent a bottleneck in a deployment pipeline due to computational complexity or privacy-related constraints. To address this challenge, we study the problem of efficiently combining acoustic and lexical modalities during training while still providing a deployable acoustic model that does not require lexical inputs. We first experiment with multimodal models and two attention mechanisms to assess the extent of the benefits that lexical information can provide. Then, we frame the task as a multi-view learning problem to induce semantic information from a multimodal model into our acoustic-only network using a contrastive loss function. Our multimodal model outperforms the previous state of the art on the USC-IEMOCAP dataset reported on lexical and acoustic information. Additionally, our multi-view-trained acoustic network significantly surpasses models that have been exclusively trained with acoustic features.
Tasks Emotion Recognition, Multi-View Learning
Published 2019-06-24
URL https://arxiv.org/abs/1906.10198v1
PDF https://arxiv.org/pdf/1906.10198v1.pdf
PWC https://paperswithcode.com/paper/multimodal-and-multi-view-models-for-emotion
Repo
Framework
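
The abstract above mentions a contrastive loss that pulls the acoustic-only network toward the multimodal model, but it does not give the loss's exact form. As a rough sketch, the classic margin-based pairwise contrastive loss is shown below. The choice of this particular loss, the Euclidean distance, the `margin` value, and the names `contrastive_loss`, `acoustic_emb`, and `multimodal_emb` are all assumptions for illustration, not the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(acoustic_emb, multimodal_emb, same_utterance, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form).

    Matching pairs (both views of the same utterance) are pulled together;
    mismatched pairs are pushed at least `margin` apart.
    """
    d = euclidean(acoustic_emb, multimodal_emb)
    if same_utterance:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy embeddings (made-up numbers).
matched = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_utterance=True)
far_mismatch = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_utterance=False)
near_mismatch = contrastive_loss([0.0, 0.0], [0.6, 0.0], same_utterance=False)
```

Under this assumed form, a matched pair is penalized by its squared distance, while a mismatched pair is penalized only when it falls inside the margin. At deployment only the acoustic embedding would be needed, which matches the acoustic-only setting the abstract describes.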

Asynchronous Distributed Learning from Constraints

Title Asynchronous Distributed Learning from Constraints
Authors Francesco Farina, Stefano Melacci, Andrea Garulli, Antonio Giannitrapani
Abstract In this paper, the extension of the framework of Learning from Constraints (LfC) to a distributed setting where multiple parties, connected over the network, contribute to the learning process is studied. LfC relies on the generic notion of “constraint” to inject knowledge into the learning problem and, due to its generality, it deals with possibly nonconvex constraints, enforced either in a hard or soft way. Motivated by recent progress in the field of distributed and constrained nonconvex optimization, we apply the (distributed) Asynchronous Method of Multipliers (ASYMM) to LfC. The study shows that such a method allows us to support scenarios where selected constraints (i.e., knowledge), data, and outcomes of the learning process can be locally stored in each computational node without being shared with the rest of the network, opening the road to further investigations into privacy-preserving LfC. Constraints act as a bridge between what is shared over the net and what is private to each node and no central authority is required. We demonstrate the applicability of these ideas in two distributed real-world settings in the context of digit recognition and document classification.
Tasks Document Classification
Published 2019-11-13
URL https://arxiv.org/abs/1911.05473v1
PDF https://arxiv.org/pdf/1911.05473v1.pdf
PWC https://paperswithcode.com/paper/asynchronous-distributed-learning-from
Repo
Framework

DermGAN: Synthetic Generation of Clinical Skin Images with Pathology

Title DermGAN: Synthetic Generation of Clinical Skin Images with Pathology
Authors Amirata Ghorbani, Vivek Natarajan, David Coz, Yuan Liu
Abstract Despite the recent success in applying supervised deep learning to medical imaging tasks, the problem of obtaining large and diverse expert-annotated datasets required for the development of high-performing models remains particularly challenging. In this work, we explore the possibility of using Generative Adversarial Networks (GAN) to synthesize clinical images with skin conditions. We propose DermGAN, an adaptation of the popular Pix2Pix architecture, to create synthetic images for a pre-specified skin condition while being able to vary its size, location and the underlying skin color. We demonstrate that the generated images are of high fidelity using objective GAN evaluation metrics. In a Human Turing test, we note that the synthetic images are not only visually similar to real images, but also embody the respective skin condition in dermatologists’ eyes. Finally, when using the synthetic images as a data augmentation technique for training a skin condition classifier, we observe that the model performs comparably to the baseline model overall while improving on rare but malignant conditions.
Tasks Data Augmentation
Published 2019-11-20
URL https://arxiv.org/abs/1911.08716v1
PDF https://arxiv.org/pdf/1911.08716v1.pdf
PWC https://paperswithcode.com/paper/dermgan-synthetic-generation-of-clinical-skin
Repo
Framework

Multi-dimensional Features for Prediction with Tweets

Title Multi-dimensional Features for Prediction with Tweets
Authors Nupoor Gandhi, Alex Morales, Dolores Albarracin
Abstract With the rise of opioid abuse in the US, there has been a growth of overlapping hotspots for overdose-related and HIV-related deaths in Springfield, Boston, Fall River, New Bedford, and parts of Cape Cod. With a large part of the population, including rural communities, active on social media, it is crucial that we leverage the predictive power of social media as a preventive measure. We explore the predictive power of the micro-blogging social media website Twitter with respect to new HIV diagnosis rates per county. While trending work in Twitter NLP has focused primarily on text-based features, we show that multi-dimensional feature construction can significantly improve on the predictive power of topic features alone with respect to STIs (sexually transmitted infections). By multi-dimensional features, we mean leveraging not only the topical features (text) of a corpus, but also location-based information (counties) about the tweets in feature construction. We develop novel text-location-based smoothing features to predict new diagnoses of HIV.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.09324v1
PDF https://arxiv.org/pdf/1910.09324v1.pdf
PWC https://paperswithcode.com/paper/multi-dimensional-features-for-prediction
Repo
Framework

Attention-based Modeling for Emotion Detection and Classification in Textual Conversations

Title Attention-based Modeling for Emotion Detection and Classification in Textual Conversations
Authors Waleed Ragheb, Jérôme Azé, Sandra Bringay, Maximilien Servajean
Abstract This paper addresses the problem of modeling textual conversations and detecting emotions. Our proposed model makes use of 1) deep transfer learning rather than the classical shallow methods of word embedding; 2) self-attention mechanisms to focus on the most important parts of the texts and 3) turn-based conversational modeling for classifying the emotions. The approach does not rely on any hand-crafted features or lexicons. Our model was evaluated on the data provided by the SemEval-2019 shared task on contextual emotion detection in text. The model shows very competitive results.
Tasks Transfer Learning
Published 2019-06-14
URL https://arxiv.org/abs/1906.07020v1
PDF https://arxiv.org/pdf/1906.07020v1.pdf
PWC https://paperswithcode.com/paper/attention-based-modeling-for-emotion
Repo
Framework

Learning Domain-Independent Planning Heuristics with Hypergraph Networks

Title Learning Domain-Independent Planning Heuristics with Hypergraph Networks
Authors William Shen, Felipe Trevizan, Sylvie Thiébaux
Abstract We present the first approach capable of learning domain-independent planning heuristics entirely from scratch. The heuristics we learn map the hypergraph representation of the delete-relaxation of the planning problem at hand, to a cost estimate that approximates that of the least-cost path from the current state to the goal through the hypergraph. We generalise Graph Networks to obtain a new framework for learning over hypergraphs, which we specialise to learn planning heuristics by training over state/value pairs obtained from optimal cost plans. Our experiments show that the resulting architecture, STRIPS-HGNs, is capable of learning heuristics that are competitive with existing delete-relaxation heuristics including LM-cut. We show that the heuristics we learn are able to generalise across different problems and domains, including to domains that were not seen during training.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.13101v1
PDF https://arxiv.org/pdf/1911.13101v1.pdf
PWC https://paperswithcode.com/paper/learning-domain-independent-planning
Repo
Framework

ACFNet: Attentional Class Feature Network for Semantic Segmentation

Title ACFNet: Attentional Class Feature Network for Semantic Segmentation
Authors Fan Zhang, Yanqin Chen, Zhihang Li, Zhibin Hong, Jingtuo Liu, Feifei Ma, Junyu Han, Errui Ding
Abstract Recent works have made great progress in semantic segmentation by exploiting richer context, most of which are designed from a spatial perspective. In contrast to previous works, we present the concept of class center which extracts the global context from a categorical perspective. This class-level context describes the overall representation of each class in an image. We further propose a novel module, named Attentional Class Feature (ACF) module, to calculate and adaptively combine different class centers according to each pixel. Based on the ACF module, we introduce a coarse-to-fine segmentation network, called Attentional Class Feature Network (ACFNet), which can be composed of an ACF module and any off-the-shelf segmentation network (base network). In this paper, we use two types of base networks to evaluate the effectiveness of ACFNet. We achieve new state-of-the-art performance of 81.85% mIoU on the Cityscapes dataset with only finely annotated data used for training.
Tasks Semantic Segmentation
Published 2019-09-20
URL https://arxiv.org/abs/1909.09408v3
PDF https://arxiv.org/pdf/1909.09408v3.pdf
PWC https://paperswithcode.com/paper/acfnet-attentional-class-feature-network-for
Repo
Framework
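
The class-center idea in the abstract above can be made concrete with a small sketch. It assumes that a class center is the probability-weighted mean of pixel features under a coarse segmentation, and that the per-pixel combination weights the centers by that pixel's class probabilities. These formulas, the function names `class_centers` and `attentional_class_feature`, and the toy numbers are assumptions for illustration; the abstract only states that centers are calculated and adaptively combined per pixel.

```python
def class_centers(features, class_probs):
    """Compute one center per class as a probability-weighted feature mean.

    features:    list of N pixel feature vectors, each of length D
    class_probs: list of N coarse class-probability vectors, each of length C
    """
    num_classes = len(class_probs[0])
    dim = len(features[0])
    centers = []
    for c in range(num_classes):
        weight = sum(p[c] for p in class_probs)
        center = [
            sum(p[c] * f[d] for f, p in zip(features, class_probs)) / weight
            if weight > 0 else 0.0
            for d in range(dim)
        ]
        centers.append(center)
    return centers


def attentional_class_feature(centers, pixel_probs):
    """Combine class centers for one pixel, weighted by its class probabilities."""
    dim = len(centers[0])
    return [
        sum(p * center[d] for p, center in zip(pixel_probs, centers))
        for d in range(dim)
    ]


# Toy example: 3 pixels, 2-dimensional features, 2 classes (made-up numbers).
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
coarse_probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
centers = class_centers(features, coarse_probs)
mixed = attentional_class_feature(centers, [0.5, 0.5])
```

In this toy case the first two pixels belong entirely to class 0, so its center is their mean `[2.0, 0.0]`. A pixel that is equally likely to belong to either class receives the average of the two centers.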

Swapped Face Detection using Deep Learning and Subjective Assessment

Title Swapped Face Detection using Deep Learning and Subjective Assessment
Authors Xinyi Ding, Zohreh Raziei, Eric C. Larson, Eli V. Olinick, Paul Krueger, Michael Hahsler
Abstract The tremendous success of deep learning for imaging applications has resulted in numerous beneficial advances. Unfortunately, this success has also been a catalyst for malicious uses such as photo-realistic face swapping of parties without consent. Transferring one person’s face from a source image to a target image of another person, while keeping the image photo-realistic overall, has become increasingly easy and automatic, even for individuals without much knowledge of image processing. In this study, we use deep transfer learning for face swapping detection, showing true positive rates >96% with very few false alarms. Distinguished from existing methods that only provide detection accuracy, we also provide uncertainty for each prediction, which is critical for trust in the deployment of such detection systems. Moreover, we provide a comparison to human subjects. To capture human recognition performance, we build a website to collect pairwise comparisons of images from human subjects. Based on these comparisons, images are ranked from most real to most fake. We compare this ranking to the outputs from our automatic model, showing good, but imperfect, correspondence with linear correlations >0.75. Overall, the results show the effectiveness of our method. As part of this study, we create a novel, publicly available dataset that is, to the best of our knowledge, the largest public swapped face dataset created using still images. Our goal in this study is to inspire more research in the field of image forensics through the creation of a public dataset and initial analysis.
Tasks Face Detection, Face Swapping, Transfer Learning
Published 2019-09-10
URL https://arxiv.org/abs/1909.04217v1
PDF https://arxiv.org/pdf/1909.04217v1.pdf
PWC https://paperswithcode.com/paper/swapped-face-detection-using-deep-learning
Repo
Framework

UHop: An Unrestricted-Hop Relation Extraction Framework for Knowledge-Based Question Answering

Title UHop: An Unrestricted-Hop Relation Extraction Framework for Knowledge-Based Question Answering
Authors Zi-Yuan Chen, Chih-Hung Chang, Yi-Pei Chen, Jijnasa Nayak, Lun-Wei Ku
Abstract In relation extraction for knowledge-based question answering, searching from one entity to another entity via a single relation is called “one hop”. In related work, an exhaustive search from all one-hop relations, two-hop relations, and so on to the max-hop relations in the knowledge graph is necessary but expensive. Therefore, the number of hops is generally restricted to two or three. In this paper, we propose UHop, an unrestricted-hop framework which relaxes this restriction by use of a transition-based search framework to replace the relation-chain-based search one. We conduct experiments on conventional 1- and 2-hop questions as well as lengthy questions, including datasets such as WebQSP, PathQuestion, and Grid World. Results show that the proposed framework enables the ability to halt, works well with state-of-the-art models, achieves competitive performance without exhaustive searches, and opens the performance gap for long relation paths.
Tasks Question Answering, Relation Extraction
Published 2019-04-02
URL http://arxiv.org/abs/1904.01246v1
PDF http://arxiv.org/pdf/1904.01246v1.pdf
PWC https://paperswithcode.com/paper/uhop-an-unrestricted-hop-relation-extraction
Repo
Framework
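
The transition-based search described in the abstract above, which follows one relation at a time and decides after each hop whether to halt, can be sketched on a toy knowledge graph. The scorer below is a deliberately naive stand-in (keyword match against the question), and the graph, the function names `score` and `unrestricted_hop_search`, and the `max_steps` safety bound are all hypothetical. The framework in the paper uses learned models for both relation selection and the halting decision.

```python
def score(question_tokens, relation):
    """Hypothetical scorer: 1.0 if the relation name occurs in the question."""
    return 1.0 if relation in question_tokens else 0.0


def unrestricted_hop_search(graph, start, question, max_steps=10):
    """Follow one relation per step and halt when no relation scores above zero.

    graph maps an entity to a dict of {relation: next_entity}.
    max_steps is only a safety bound against cycles in this toy version.
    """
    tokens = set(question.lower().split())
    entity, path = start, []
    for _ in range(max_steps):
        candidates = graph.get(entity, {})
        if not candidates:
            break  # dead end: nothing left to follow
        remaining = tokens - set(path)  # do not reuse an already-followed relation
        best = max(candidates, key=lambda r: score(remaining, r))
        if score(remaining, best) <= 0.0:
            break  # halting decision: no outgoing relation matches the question
        path.append(best)
        entity = candidates[best]
    return entity, path


# Toy knowledge graph (made-up facts).
graph = {
    "alice": {"mother": "beth", "employer": "acme"},
    "beth": {"mother": "carol", "birthplace": "paris"},
    "paris": {"country": "france"},
}
answer, path = unrestricted_hop_search(
    graph, "alice", "country of the birthplace of the mother of alice"
)
one_hop = unrestricted_hop_search(graph, "alice", "mother of alice")
```

Because each step only scores the outgoing relations of the current entity, the cost grows with path length rather than with the number of all possible multi-hop relation chains, which is the saving over exhaustive search that the abstract points to.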

Large-Scale Sparse Subspace Clustering Using Landmarks

Title Large-Scale Sparse Subspace Clustering Using Landmarks
Authors Farhad Pourkamali-Anaraki
Abstract Subspace clustering methods based on expressing each data point as a linear combination of all other points in a dataset are popular unsupervised learning techniques. However, existing methods incur high computational complexity on large-scale datasets as they require solving an expensive optimization problem and performing spectral clustering on large affinity matrices. This paper presents an efficient approach to subspace clustering by selecting a small subset of the input data called landmarks. The resulting subspace clustering method in the reduced domain runs in linear time with respect to the size of the original data. Numerical experiments on synthetic and real data demonstrate the effectiveness of our method.
Tasks
Published 2019-08-02
URL https://arxiv.org/abs/1908.00683v1
PDF https://arxiv.org/pdf/1908.00683v1.pdf
PWC https://paperswithcode.com/paper/large-scale-sparse-subspace-clustering-using
Repo
Framework

Event Detection in Noisy Streaming Data with Combination of Corroborative and Probabilistic Sources

Title Event Detection in Noisy Streaming Data with Combination of Corroborative and Probabilistic Sources
Authors Abhijit Suprem, Calton Pu
Abstract Global physical event detection has traditionally relied on dense coverage of physical sensors around the world; while this is an expensive undertaking, there have not been alternatives until recently. The ubiquity of social networks and human sensors in the field provides a tremendous amount of real-time, live data about true physical events from around the world. However, while such human sensor data have been exploited for retrospective large-scale event detection, such as hurricanes or earthquakes, there has been limited to no success in exploiting this rich resource for general physical event detection. Prior implementation approaches have suffered from the concept drift phenomenon, where real-world data exhibits constant, unknown, unbounded changes in its data distribution, making static machine learning models ineffective in the long term. We propose and implement an end-to-end collaborative drift adaptive system that integrates corroborative and probabilistic sources to deliver real-time predictions. Furthermore, our system is adaptive to concept drift and performs automated continuous learning to maintain high performance. We demonstrate our approach in a real-time demo available online for landslide disaster detection, with extensibility to other real-world physical events such as flooding, wildfires, hurricanes, and earthquakes.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09281v1
PDF https://arxiv.org/pdf/1911.09281v1.pdf
PWC https://paperswithcode.com/paper/event-detection-in-noisy-streaming-data-with
Repo
Framework