Paper Group AWR 126
Accurate Visual Localization for Automotive Applications. Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual. Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts. Single Image Deraining: A Comprehensive Benchmark Analysis. Constructive Type-Logical Supertagging with Self-Attention Networks. Improving Robus …
Accurate Visual Localization for Automotive Applications
Title | Accurate Visual Localization for Automotive Applications |
Authors | Eli Brosh, Matan Friedmann, Ilan Kadar, Lev Yitzhak Lavy, Elad Levi, Shmuel Rippa, Yair Lempert, Bruno Fernandez-Ruiz, Roei Herzig, Trevor Darrell |
Abstract | Accurate vehicle localization is a crucial step towards building effective Vehicle-to-Vehicle networks and automotive applications. Yet standard-grade GPS data, such as that provided by mobile phones, is often noisy and exhibits significant localization errors in many urban areas. Approaches for accurate localization from imagery often rely on structure-based techniques, and thus are limited in scale and are expensive to compute. In this paper, we present a scalable visual localization approach geared for real-time performance. We propose a hybrid coarse-to-fine approach that leverages visual and GPS location cues. Our solution uses a self-supervised approach to learn a compact road image representation. This representation enables efficient visual retrieval and provides coarse localization cues, which are fused with vehicle ego-motion to obtain high-accuracy location estimates. As a benchmark to evaluate the performance of our visual localization approach, we introduce a new large-scale driving dataset based on video and GPS data obtained from a large-scale network of connected dash-cams. Our experiments confirm that our approach is highly effective in challenging urban environments, reducing localization error by an order of magnitude. |
Tasks | Visual Localization |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.03706v1 |
http://arxiv.org/pdf/1905.03706v1.pdf | |
PWC | https://paperswithcode.com/paper/190503706 |
Repo | https://github.com/getnexar/Nexar-Visual-Localization |
Framework | none |
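The coarse retrieval step described above can be illustrated with a minimal sketch: embed a query frame, find the most similar geo-tagged reference embeddings, and use their GPS tags as a coarse location prior. The embeddings and GPS arrays below are random stand-ins; the paper's self-supervised representation and ego-motion fusion are not reproduced here.

```python
import numpy as np

def coarse_localize(query_emb, ref_embs, ref_gps, k=5):
    """Coarse GPS estimate from the k nearest reference embeddings.

    query_emb: (d,) embedding of the query frame
    ref_embs:  (n, d) embeddings of geo-tagged reference images
    ref_gps:   (n, 2) lat/lon tags of the reference images
    """
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    sims = r @ q                           # cosine similarity to every reference
    top = np.argsort(-sims)[:k]            # k most similar frames
    weights = sims[top] / sims[top].sum()  # similarity-weighted average of their GPS tags
    return weights @ ref_gps[top]

# Toy usage with random data standing in for learned embeddings
rng = np.random.default_rng(0)
ref_embs = rng.normal(size=(1000, 128))
ref_gps = rng.uniform([32.0, 34.7], [32.1, 34.8], size=(1000, 2))
query = ref_embs[42] + 0.01 * rng.normal(size=128)
print(coarse_localize(query, ref_embs, ref_gps))
```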
Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual
Title | Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual |
Authors | He He, Sheng Zha, Haohan Wang |
Abstract | Statistical natural language inference (NLI) models are susceptible to learning dataset bias: superficial cues that happen to associate with the label on a particular dataset, but are not useful in general, e.g., negation words indicate contradiction. As exposed by several recent challenge datasets, these models perform poorly when such association is absent, e.g., predicting that “I love dogs” contradicts “I don’t love cats”. Our goal is to design learning algorithms that guard against known dataset bias. We formalize the concept of dataset bias under the framework of distribution shift and present a simple debiasing algorithm based on residual fitting, which we call DRiFt. We first learn a biased model that only uses features that are known to relate to dataset bias. Then, we train a debiased model that fits to the residual of the biased model, focusing on examples that cannot be predicted well by biased features only. We use DRiFt to train three high-performing NLI models on two benchmark datasets, SNLI and MNLI. Our debiased models achieve significant gains over baseline models on two challenge test sets, while maintaining reasonable performance on the original test sets. |
Tasks | Natural Language Inference |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10763v2 |
https://arxiv.org/pdf/1908.10763v2.pdf | |
PWC | https://paperswithcode.com/paper/unlearn-dataset-bias-in-natural-language |
Repo | https://github.com/hhexiy/debiased |
Framework | mxnet |
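The two-stage recipe in the abstract, training a biased model on bias-only features and then fitting a debiased model to its residual, can be sketched as a product-of-experts style loss. This is a hedged PyTorch illustration rather than the authors' MXNet code; the frozen biased model, its features, and the toy batch are placeholders.

```python
import torch
import torch.nn.functional as F

def residual_fit_loss(debiased_logits, biased_logprobs, labels):
    """Fit the debiased model to the residual of a frozen biased model:
    the combined prediction is softmax(log p_biased + logits_debiased), so
    examples the biased model already gets right contribute little gradient."""
    combined = biased_logprobs + debiased_logits
    return F.cross_entropy(combined, labels)

# Toy usage: hypothetical 3-class NLI batch
torch.manual_seed(0)
batch, n_classes = 8, 3
biased_logprobs = F.log_softmax(torch.randn(batch, n_classes), dim=-1).detach()  # frozen biased model
debiased_logits = torch.randn(batch, n_classes, requires_grad=True)              # model being trained
labels = torch.randint(0, n_classes, (batch,))
loss = residual_fit_loss(debiased_logits, biased_logprobs, labels)
loss.backward()
print(float(loss))
```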
Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts
Title | Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts |
Authors | Rui Xia, Zixiang Ding |
Abstract | Emotion cause extraction (ECE), the task aimed at extracting the potential causes behind certain emotions in text, has gained much attention in recent years due to its wide applications. However, it suffers from two shortcomings: 1) the emotion must be annotated before cause extraction in ECE, which greatly limits its applications in real-world scenarios; 2) the way to first annotate emotion and then extract the cause ignores the fact that they are mutually indicative. In this work, we propose a new task: emotion-cause pair extraction (ECPE), which aims to extract the potential pairs of emotions and corresponding causes in a document. We propose a 2-step approach to address this new ECPE task, which first performs individual emotion extraction and cause extraction via multi-task learning, and then conducts emotion-cause pairing and filtering. The experimental results on a benchmark emotion cause corpus prove the feasibility of the ECPE task as well as the effectiveness of our approach. |
Tasks | Emotion Recognition, Multi-Task Learning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01267v1 |
https://arxiv.org/pdf/1906.01267v1.pdf | |
PWC | https://paperswithcode.com/paper/emotion-cause-pair-extraction-a-new-task-to |
Repo | https://github.com/NUSTM/ECPE |
Framework | tf |
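The second step of the 2-step approach, emotion-cause pairing and filtering, can be illustrated by forming the Cartesian product of the clauses predicted in step one and keeping the pairs a filter accepts. The `pair_filter` below is a hypothetical distance-based stand-in for the learned pair classifier.

```python
from itertools import product

def extract_pairs(emotion_clauses, cause_clauses, pair_filter):
    """Form all candidate (emotion, cause) clause-index pairs and keep
    the ones the filter accepts."""
    candidates = product(emotion_clauses, cause_clauses)
    return [(e, c) for e, c in candidates if pair_filter(e, c)]

# Clause indices predicted by the first (multi-task) step
emotion_clauses = [2, 5]
cause_clauses = [1, 2, 4]

# Hypothetical filter: the real model uses a learned classifier over pair
# features; here we simply keep causes within a window of the emotion clause.
def pair_filter(e, c, window=2):
    return abs(e - c) <= window

print(extract_pairs(emotion_clauses, cause_clauses, pair_filter))
```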
Single Image Deraining: A Comprehensive Benchmark Analysis
Title | Single Image Deraining: A Comprehensive Benchmark Analysis |
Authors | Siyuan Li, Iago Breno Araujo, Wenqi Ren, Zhangyang Wang, Eric K. Tokuda, Roberto Hirata Junior, Roberto Cesar-Junior, Jiawan Zhang, Xiaojie Guo, Xiaochun Cao |
Abstract | We present a comprehensive study and evaluation of existing single image deraining algorithms, using a new large-scale benchmark consisting of both synthetic and real-world rainy images. This dataset highlights diverse data sources and image contents, and is divided into three subsets (rain streak, rain drop, rain and mist), each serving different training or evaluation purposes. We further provide a rich variety of criteria for deraining algorithm evaluation, ranging from full-reference metrics, to no-reference metrics, to subjective evaluation and the novel task-driven evaluation. Experiments on the dataset shed light on the comparisons and limitations of state-of-the-art deraining algorithms, and suggest promising future directions. |
Tasks | Rain Removal, Single Image Deraining |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08558v1 |
http://arxiv.org/pdf/1903.08558v1.pdf | |
PWC | https://paperswithcode.com/paper/single-image-deraining-a-comprehensive |
Repo | https://github.com/lsy17096535/Single-Image-Deraining |
Framework | none |
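The full-reference criteria mentioned in the abstract, such as PSNR and SSIM between a derained output and its clean ground truth, can be computed with scikit-image as sketched below. The arrays stand in for real image pairs, and `channel_axis` assumes scikit-image 0.19 or newer.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def full_reference_scores(derained, ground_truth):
    """PSNR and SSIM between a derained result and its rain-free reference
    (both HxWx3 uint8 arrays)."""
    psnr = peak_signal_noise_ratio(ground_truth, derained, data_range=255)
    ssim = structural_similarity(ground_truth, derained, channel_axis=-1, data_range=255)
    return psnr, ssim

# Toy usage with synthetic arrays standing in for real images
rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
noisy = np.clip(gt.astype(int) + rng.integers(-10, 10, size=gt.shape), 0, 255).astype(np.uint8)
print(full_reference_scores(noisy, gt))
```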
Constructive Type-Logical Supertagging with Self-Attention Networks
Title | Constructive Type-Logical Supertagging with Self-Attention Networks |
Authors | Konstantinos Kogkalidis, Michael Moortgat, Tejaswini Deoskar |
Abstract | We propose a novel application of self-attention networks towards grammar induction. We present an attention-based supertagger for a refined type-logical grammar, trained on constructing types inductively. In addition to achieving a high overall type accuracy, our model is able to learn the syntax of the grammar’s type system along with its denotational semantics. This lifts the closed world assumption commonly made by lexicalized grammar supertaggers, greatly enhancing its generalization potential. This is evidenced both by its adequate accuracy over sparse word types and its ability to correctly construct complex types never seen during training, which, to the best of our knowledge, had not been accomplished before. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13418v1 |
https://arxiv.org/pdf/1905.13418v1.pdf | |
PWC | https://paperswithcode.com/paper/constructive-type-logical-supertagging-with |
Repo | https://github.com/konstantinosKokos/Lassy-TLG-Supertagging |
Framework | pytorch |
Improving Robustness of Deep Learning Based Knee MRI Segmentation: Mixup and Adversarial Domain Adaptation
Title | Improving Robustness of Deep Learning Based Knee MRI Segmentation: Mixup and Adversarial Domain Adaptation |
Authors | Egor Panfilov, Aleksei Tiulpin, Stefan Klein, Miika T. Nieminen, Simo Saarakkala |
Abstract | Degeneration of articular cartilage (AC) is actively studied in knee osteoarthritis (OA) research via magnetic resonance imaging (MRI). Segmentation of AC tissues from MRI data is an essential step in quantification of their damage. Deep learning (DL) based methods have shown potential in this realm and are the current state of the art; however, their robustness to heterogeneity of MRI acquisition settings remains an open problem. In this study, we investigated two modern regularization techniques – mixup and adversarial unsupervised domain adaptation (UDA) – to improve the robustness of DL-based knee cartilage segmentation to new MRI acquisition settings. Our validation setup included two datasets produced by different MRI scanners and using distinct data acquisition protocols. We assessed the robustness of automatic segmentation by comparing mixup and UDA approaches to a strong baseline method at different OA severity stages and, additionally, in relation to anatomical locations. Our results showed that for moderate changes in knee MRI data acquisition settings both approaches may provide notable improvements in robustness, which are consistent for all stages of the disease and affect the clinically important areas of the knee joint. However, mixup may be considered a recommended approach, since it is more computationally efficient and does not require additional data from the target acquisition setup. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04126v3 |
https://arxiv.org/pdf/1908.04126v3.pdf | |
PWC | https://paperswithcode.com/paper/improving-robustness-of-deep-learning-based |
Repo | https://github.com/MIPT-Oulu/RobustCartilageSegmentation |
Framework | none |
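Of the two regularizers studied, mixup is easy to sketch: each training pair is blended with a Beta-distributed coefficient applied to both the images and their (soft) segmentation masks. The NumPy version below is framework-agnostic, and the shapes and `alpha` value are illustrative rather than the paper's exact settings.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two training examples (images and their label maps) with a
    coefficient drawn from Beta(alpha, alpha), as in mixup regularization."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2   # soft labels / soft masks
    return x, y, lam

# Toy usage: two MRI slices with one-hot cartilage masks (shapes are illustrative)
rng = np.random.default_rng(0)
img_a, img_b = rng.normal(size=(1, 160, 160)), rng.normal(size=(1, 160, 160))
mask_a = np.eye(4)[rng.integers(0, 4, size=(160, 160))]  # (160, 160, 4) one-hot
mask_b = np.eye(4)[rng.integers(0, 4, size=(160, 160))]
x, y, lam = mixup(img_a, mask_a, img_b, mask_b)
print(x.shape, y.shape, round(lam, 3))
```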
Semi-supervised representation learning via dual autoencoders for domain adaptation
Title | Semi-supervised representation learning via dual autoencoders for domain adaptation |
Authors | Shuai Yang, Hao Wang, Yuhong Zhang, Pei-Pei Li, Yi Zhu, Xuegang Hu |
Abstract | Domain adaptation aims to exploit the knowledge in the source domain to promote learning tasks in the target domain, which plays a critical role in real-world applications. Recently, many deep learning approaches based on autoencoders have achieved significant performance in domain adaptation. However, most existing methods focus on minimizing the distribution divergence by putting the source and target data together to learn global feature representations, while they do not consider the local relationship between instances in the same category from different domains. To address this problem, we propose a novel Semi-Supervised Representation Learning framework via Dual Autoencoders for domain adaptation, named SSRLDA. More specifically, we extract richer feature representations by learning the global and local feature representations simultaneously using two novel autoencoders, referred to as the marginalized denoising autoencoder with adaptation distribution (MDAad) and the multi-class marginalized denoising autoencoder (MMDA), respectively. Meanwhile, we make full use of label information to optimize feature representations. Experimental results show that our proposed approach outperforms several state-of-the-art baseline methods. |
Tasks | Denoising, Domain Adaptation, Representation Learning, Unsupervised Domain Adaptation |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01342v4 |
https://arxiv.org/pdf/1908.01342v4.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-representation-learning-via |
Repo | https://github.com/Minminhfut/SSRLDACode |
Framework | none |
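As a hedged illustration of the basic building block named in the abstract, here is a plain masking-noise denoising autoencoder in PyTorch. It is not the paper's MDAad/MMDA formulation, which additionally aligns distributions across domains and exploits label information; the dimensions and data below are placeholders.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Single-layer denoising autoencoder: corrupt the input with masking
    noise and reconstruct the clean input."""
    def __init__(self, dim_in, dim_hidden, drop_prob=0.3):
        super().__init__()
        self.corrupt = nn.Dropout(p=drop_prob)       # masking noise
        self.encoder = nn.Linear(dim_in, dim_hidden)
        self.decoder = nn.Linear(dim_hidden, dim_in)

    def forward(self, x):
        h = torch.tanh(self.encoder(self.corrupt(x)))
        return self.decoder(h), h                    # reconstruction, representation

# Toy usage: reconstruct bag-of-words-like features
torch.manual_seed(0)
x = torch.rand(64, 500)
model = DenoisingAutoencoder(500, 128)
recon, h = model(x)
loss = nn.functional.mse_loss(recon, x)
loss.backward()
print(float(loss), h.shape)
```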
Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness
Title | Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness |
Authors | NhatHai Phan, Minh Vu, Yang Liu, Ruoming Jin, Dejing Dou, Xintao Wu, My T. Thai |
Abstract | In this paper, we propose a novel Heterogeneous Gaussian Mechanism (HGM) to preserve differential privacy in deep neural networks, with provable robustness against adversarial examples. We first relax the constraint of the privacy budget in the traditional Gaussian Mechanism from (0, 1] to (0, ∞), with a new bound of the noise scale to preserve differential privacy. The noise in our mechanism can be arbitrarily redistributed, offering a distinctive ability to address the trade-off between model utility and privacy loss. To derive provable robustness, our HGM is applied to inject Gaussian noise into the first hidden layer. Then, a tighter robustness bound is proposed. Theoretical analysis and thorough evaluations show that our mechanism notably improves the robustness of differentially private deep neural networks, compared with baseline approaches, under a variety of model attacks. |
Tasks | |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.01444v1 |
https://arxiv.org/pdf/1906.01444v1.pdf | |
PWC | https://paperswithcode.com/paper/heterogeneous-gaussian-mechanism-preserving |
Repo | https://github.com/haiphanNJIT/SecureSGD |
Framework | tf |
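For context, the classical Gaussian mechanism that HGM relaxes uses noise scale sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps, valid for eps in (0, 1]. The sketch below implements that baseline and then rescales per-coordinate noise with a normalized weight vector to convey the heterogeneous idea; the weights are illustrative, and this toy redistribution does not by itself carry the paper's relaxed privacy or robustness guarantees.

```python
import numpy as np

def gaussian_mechanism(values, l2_sensitivity, eps, delta, weights=None, rng=None):
    """Classical Gaussian mechanism (valid for eps in (0, 1]):
    sigma = l2_sensitivity * sqrt(2 * ln(1.25 / delta)) / eps.
    `weights` optionally rescales per-coordinate noise (normalized so the mean
    scale stays sigma) purely to illustrate heterogeneous noise; it is not the
    paper's HGM bound."""
    rng = rng or np.random.default_rng()
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    if weights is None:
        weights = np.ones_like(values)
    weights = weights * len(weights) / weights.sum()   # mean weight = 1
    noise = rng.normal(0.0, sigma, size=values.shape) * weights
    return values + noise

# Toy usage: perturb a hidden-layer activation vector
rng = np.random.default_rng(0)
h = rng.normal(size=8)
w = np.array([2.0, 2.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5])  # illustrative redistribution
print(gaussian_mechanism(h, l2_sensitivity=1.0, eps=0.5, delta=1e-5, weights=w))
```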
Source Camera Verification from Strongly Stabilized Videos
Title | Source Camera Verification from Strongly Stabilized Videos |
Authors | Enes Altinisik, Husrev Taha Sencar |
Abstract | The in-camera image stabilization technology deployed by most cameras today poses one of the most significant challenges to photo-response non-uniformity based source camera attribution from videos. When performed digitally, stabilization involves cropping, warping, and inpainting of video frames to eliminate unwanted camera motion. Hence, successful attribution requires inversion of these transformations in a blind manner. To address this challenge, we introduce a source camera verification method for videos that takes into account the spatially variant nature of stabilization transformations and assumes a larger degree of freedom in their search. Our method identifies transformations at a sub-frame level and incorporates a number of constraints to validate their correctness. The method also adopts a holistic approach in countering disruptive effects of other video generation steps, such as video coding and downsizing, for more reliable attribution. Tests performed on one public and two custom datasets show that the proposed method is able to verify the source of 23-40% of videos that underwent stronger stabilization, without a significant impact on the false attribution rate. |
Tasks | Video Generation |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1912.05018v2 |
https://arxiv.org/pdf/1912.05018v2.pdf | |
PWC | https://paperswithcode.com/paper/source-camera-attribution-from-strongly |
Repo | https://github.com/VideoPRNUExtractor/Weighter |
Framework | none |
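The core test behind PRNU-based attribution is a normalized correlation between a video's noise residual and the camera fingerprint; the stabilization-inverting search that is the paper's actual contribution is not shown. The Gaussian-blur denoiser below is a crude stand-in for the usual wavelet filter, and all data is simulated.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(frame, sigma=1.0):
    """Crude noise residual: frame minus a denoised version of itself."""
    frame = frame.astype(np.float64)
    return frame - gaussian_filter(frame, sigma)

def normalized_correlation(a, b):
    """Normalized cross-correlation between a residual and a fingerprint."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy usage: build a fingerprint from a few frames of the same simulated camera
rng = np.random.default_rng(0)
prnu = 0.02 * rng.normal(size=(64, 64))                 # simulated sensor pattern
frames = [np.clip(rng.uniform(0, 255, (64, 64)) * (1 + prnu), 0, 255) for _ in range(20)]
fingerprint = np.mean([noise_residual(f) for f in frames], axis=0)
test_frame = np.clip(rng.uniform(0, 255, (64, 64)) * (1 + prnu), 0, 255)
print(normalized_correlation(noise_residual(test_frame), fingerprint))
```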
CITE: A Corpus of Image-Text Discourse Relations
Title | CITE: A Corpus of Image-Text Discourse Relations |
Authors | Malihe Alikhani, Sreyasi Nag Chowdhury, Gerard de Melo, Matthew Stone |
Abstract | This paper presents a novel crowd-sourced resource for multimodal discourse: our resource characterizes inferences in image-text contexts in the domain of cooking recipes in the form of coherence relations. Like previous corpora annotating discourse structure between text arguments, such as the Penn Discourse Treebank, our new corpus aids in establishing a better understanding of natural communication and common-sense reasoning, while our findings have implications for a wide range of applications, such as understanding and generation of multimodal documents. |
Tasks | Common Sense Reasoning |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06286v2 |
http://arxiv.org/pdf/1904.06286v2.pdf | |
PWC | https://paperswithcode.com/paper/cite-a-corpus-of-image-text-discourse |
Repo | https://github.com/malihealikhani/CITE |
Framework | none |
Time2Vec: Learning a Vector Representation of Time
Title | Time2Vec: Learning a Vector Representation of Time |
Authors | Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, Marcus Brubaker |
Abstract | Time is an important feature in many applications involving events that occur synchronously and/or asynchronously. To effectively consume time information, recent studies have focused on designing new architectures. In this paper, we take an orthogonal but complementary approach by providing a model-agnostic vector representation for time, called Time2Vec, that can be easily imported into many existing and future architectures and improve their performance. We show on a range of models and problems that replacing the notion of time with its Time2Vec representation improves the performance of the final model. |
Tasks | |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05321v1 |
https://arxiv.org/pdf/1907.05321v1.pdf | |
PWC | https://paperswithcode.com/paper/time2vec-learning-a-vector-representation-of |
Repo | https://github.com/avinashbarnwal/Time2Vec |
Framework | none |
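Time2Vec of a scalar time tau is usually written as t2v(tau)[0] = w0*tau + b0 and t2v(tau)[i] = sin(wi*tau + bi) for i >= 1, i.e., one linear component plus k periodic ones. A minimal PyTorch layer implementing that definition (not the authors' code) is sketched below.

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """t2v(tau)[0] = w0 * tau + b0 (linear trend);
    t2v(tau)[i] = sin(wi * tau + bi) for i >= 1 (periodic components)."""
    def __init__(self, k):
        super().__init__()
        self.w = nn.Parameter(torch.randn(k + 1))
        self.b = nn.Parameter(torch.randn(k + 1))

    def forward(self, tau):                      # tau: (batch, 1)
        z = tau * self.w + self.b                # (batch, k + 1)
        return torch.cat([z[:, :1], torch.sin(z[:, 1:])], dim=-1)

# Toy usage: embed a batch of timestamps and feed the result to any downstream model
torch.manual_seed(0)
t2v = Time2Vec(k=15)
tau = torch.arange(0, 8, dtype=torch.float32).unsqueeze(-1)  # (8, 1)
print(t2v(tau).shape)                                        # torch.Size([8, 16])
```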
ProtoAttend: Attention-Based Prototypical Learning
Title | ProtoAttend: Attention-Based Prototypical Learning |
Authors | Sercan O. Arik, Tomas Pfister |
Abstract | We propose a novel inherently interpretable machine learning method that bases decisions on few relevant examples that we call prototypes. Our method, ProtoAttend, can be integrated into a wide range of neural network architectures including pre-trained models. It utilizes an attention mechanism that relates the encoded representations to samples in order to determine prototypes. The resulting model outperforms state of the art in three high impact problems without sacrificing accuracy of the original model: (1) it enables high-quality interpretability that outputs samples most relevant to the decision-making (i.e. a sample-based interpretability method); (2) it achieves state of the art confidence estimation by quantifying the mismatch across prototype labels; and (3) it obtains state of the art in distribution mismatch detection. All this can be achieved with minimal additional test time and a practically viable training time computational cost. |
Tasks | Decision Making, Interpretable Machine Learning |
Published | 2019-02-17 |
URL | https://arxiv.org/abs/1902.06292v4 |
https://arxiv.org/pdf/1902.06292v4.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-prototypical-learning-towards |
Repo | https://github.com/google-research/google-research/tree/master/protoattend |
Framework | tf |
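The attention step described above, relating an encoded query to a database of candidate samples so that the attention weights double as prototype weights, can be sketched as scaled dot-product attention in NumPy. The encoders are omitted and the candidate database is random; the linked TF repo is the reference implementation.

```python
import numpy as np

def prototype_weights(query_enc, candidate_encs, temperature=None):
    """Attention weights over candidate samples; high-weight candidates act
    as prototypes that explain the decision for this query."""
    d = query_enc.shape[-1]
    temperature = temperature or np.sqrt(d)
    scores = candidate_encs @ query_enc / temperature
    scores -= scores.max()                         # numerical stability
    w = np.exp(scores)
    return w / w.sum()

# Toy usage: prediction as a prototype-weighted vote over candidate labels
rng = np.random.default_rng(0)
cand_encs = rng.normal(size=(50, 32))              # encoded candidate database
cand_labels = np.eye(3)[rng.integers(0, 3, 50)]    # one-hot labels of candidates
query = cand_encs[7] + 0.1 * rng.normal(size=32)   # encoded query sample
w = prototype_weights(query, cand_encs)
print(w.argsort()[-3:][::-1], w @ cand_labels)     # top-3 prototypes, soft prediction
```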
Fonts-2-Handwriting: A Seed-Augment-Train framework for universal digit classification
Title | Fonts-2-Handwriting: A Seed-Augment-Train framework for universal digit classification |
Authors | Vinay Uday Prabhu, Sanghyun Han, Dian Ang Yap, Mihail Douhaniaris, Preethi Seshadri, John Whaley |
Abstract | In this paper, we propose a Seed-Augment-Train/Transfer (SAT) framework that contains a synthetic seed image dataset generation procedure for languages with different numeral systems using freely available open font file datasets. This seed dataset of images is then augmented to create a purely synthetic training dataset, which is in turn used to train a deep neural network and test on a held-out real-world handwritten digit dataset spanning five Indic scripts: Kannada, Tamil, Gujarati, Malayalam, and Devanagari. We showcase the efficacy of this approach both qualitatively, by training a Boundary-seeking GAN (BGAN) that generates realistic digit images in the five languages, and also quantitatively by testing a CNN trained on the synthetic data on the real-world datasets. This establishes not only an interesting nexus between the font-datasets-world and transfer learning but also provides a recipe for universal digit classification in any script. |
Tasks | Transfer Learning |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.08633v1 |
https://arxiv.org/pdf/1905.08633v1.pdf | |
PWC | https://paperswithcode.com/paper/190508633 |
Repo | https://github.com/unifyid-labs/DeepGenStruct-Notebooks |
Framework | none |
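The Seed step, rendering digit glyphs from open font files into a synthetic seed dataset, can be sketched with Pillow. The font path and glyph set below are placeholders (any TTF covering the target script would do), and the Augment and Train stages are only indicated in comments.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_digit(glyph, font_path, size=28, font_size=22):
    """Render a single digit glyph from a TTF font into a grayscale array."""
    img = Image.new("L", (size, size), color=0)
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, font_size)
    # anchor="mm" centers the glyph; requires Pillow >= 8.0
    draw.text((size // 2, size // 2), glyph, fill=255, font=font, anchor="mm")
    return np.array(img)

# Toy usage: build a seed set for one hypothetical font file. In the SAT
# framework this loops over many open fonts and scripts, followed by
# augmentation (rotations, distortions, noise) before training a classifier.
FONT = "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"  # placeholder path
seed = np.stack([render_digit(str(d), FONT) for d in range(10)])
print(seed.shape)  # (10, 28, 28)
```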
A Learnable Safety Measure
Title | A Learnable Safety Measure |
Authors | Steve Heim, Alexander von Rohr, Sebastian Trimpe, Alexander Badri-Spröwitz |
Abstract | Failures are challenging for learning to control physical systems since they risk damage, time-consuming resets, and often provide little gradient information. Adding safety constraints to exploration typically requires a lot of prior knowledge and domain expertise. We present a safety measure which implicitly captures how the system dynamics relate to a set of failure states. Not only can this measure be used as a safety function, but also to directly compute the set of safe state-action pairs. Further, we show a model-free approach to learn this measure by active sampling using Gaussian processes. While safety can only be guaranteed after learning the safety measure, we show that failures can already be greatly reduced by using the estimated measure during learning. |
Tasks | Gaussian Processes |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02835v1 |
https://arxiv.org/pdf/1910.02835v1.pdf | |
PWC | https://paperswithcode.com/paper/a-learnable-safety-measure |
Repo | https://github.com/sheim/vibly |
Framework | none |
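A hedged sketch of the underlying idea, learning from sampled state-action outcomes a measure of how close the system is to failure: below, a scikit-learn Gaussian-process classifier estimates the probability of not failing, and the estimated safe set is a super-level set of that probability. The paper's measure and its active-sampling scheme are richer than this toy.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Toy dynamics: a (state, action) pair "fails" when it leaves a band.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))               # sampled state-action pairs
safe = np.abs(X[:, 0] + 0.5 * X[:, 1]) < 0.6        # True = no failure observed

# GP estimate of a safety measure over the state-action space.
gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=0.3), random_state=0)
gp.fit(X, safe)

# Query the estimated measure and threshold it to get an (estimated) safe set.
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5)), -1).reshape(-1, 2)
p_safe = gp.predict_proba(grid)[:, 1]               # column 1 = class True
print((p_safe > 0.8).sum(), "of", len(grid), "grid points estimated safe")
```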
Visual Semantic Reasoning for Image-Text Matching
Title | Visual Semantic Reasoning for Image-Text Matching |
Authors | Kunpeng Li, Yulun Zhang, Kai Li, Yuanyuan Li, Yun Fu |
Abstract | Image-text matching has been a hot research topic bridging the vision and language areas. It remains challenging because the current representation of an image usually lacks the global semantic concepts present in its corresponding text caption. To address this issue, we propose a simple and interpretable reasoning model to generate visual representation that captures key objects and semantic concepts of a scene. Specifically, we first build up connections between image regions and perform reasoning with Graph Convolutional Networks to generate features with semantic relationships. Then, we propose to use the gate and memory mechanism to perform global semantic reasoning on these relationship-enhanced features, select the discriminative information and gradually generate the representation for the whole scene. Experiments validate that our method achieves a new state-of-the-art for image-text matching on the MS-COCO and Flickr30K datasets. It outperforms the current best method by 6.8% relatively for image retrieval and 4.8% relatively for caption retrieval on MS-COCO (Recall@1 using the 1K test set). On Flickr30K, our model improves image retrieval by 12.6% relatively and caption retrieval by 5.8% relatively (Recall@1). Our code is available at https://github.com/KunpengLi1994/VSRN. |
Tasks | Image Retrieval, Text Matching |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02701v1 |
https://arxiv.org/pdf/1909.02701v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-semantic-reasoning-for-image-text |
Repo | https://github.com/KunpengLi1994/VSRN |
Framework | pytorch |
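The first stage described in the abstract, reasoning over connected image regions with graph convolutions to obtain relationship-enhanced features, can be sketched as a single GCN-style layer with an affinity-based soft adjacency. The gate-and-memory global reasoning and the full matching model live in the linked repo; the shapes below follow common 36-region detectors but are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionGCNLayer(nn.Module):
    """One graph-convolution step over region features: build a soft
    adjacency from pairwise affinities, then propagate and transform."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.update = nn.Linear(dim, dim)

    def forward(self, regions):                       # (batch, n_regions, dim)
        affinity = self.query(regions) @ self.key(regions).transpose(1, 2)
        adj = F.softmax(affinity, dim=-1)             # row-normalized soft graph
        propagated = adj @ regions                    # aggregate neighbor features
        return regions + self.update(propagated)      # relationship-enhanced features

# Toy usage: 36 region features per image, as in common bottom-up detectors
torch.manual_seed(0)
regions = torch.randn(2, 36, 512)
enhanced = RegionGCNLayer(512)(regions)
print(enhanced.shape)                                 # torch.Size([2, 36, 512])
```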