April 3, 2020


Paper Group AWR 22

Adapting Grad-CAM for Embedding Networks. Fast Sequence-Based Embedding with Diffusion Graphs. Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU. Learning Deformable Registration of Medical Images with Anatomical Constraints. Estimating Counterfactual Treatment Outcomes over Time Through Advers …

Adapting Grad-CAM for Embedding Networks


Title	Adapting Grad-CAM for Embedding Networks
Authors	Lei Chen, Jianhui Chen, Hossein Hajimirsadeghi, Greg Mori
Abstract	The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight important regions in images for deep model prediction in image classification, image captioning and many other tasks. It uses the gradients in back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embedding networks are trained by millions of dynamically paired examples (e.g. triplets). To overcome these challenges, we propose an adaptation of the Grad-CAM method for embedding networks. First, we aggregate grad-weights from multiple training examples to improve the stability of Grad-CAM. Then, we develop an efficient weight-transfer method to explain decisions for any image without back-propagation. We extensively validate the method on the standard CUB200 dataset in which our method produces more accurate visual attention than the original Grad-CAM method. We also apply the method to a house price estimation application using images. The method produces convincing qualitative results, showcasing the practicality of our approach.
Tasks	Image Captioning, Image Classification
Published	2020-01-17
URL	https://arxiv.org/abs/2001.06538v1
PDF	https://arxiv.org/pdf/2001.06538v1.pdf
PWC	https://paperswithcode.com/paper/adapting-grad-cam-for-embedding-networks
Repo	https://github.com/shinmura0/Faster-Grad-CAM
Framework	tf
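
The grad-weight idea in the abstract can be illustrated with a minimal sketch, assuming the standard Grad-CAM formulation: one weight per channel from global-average-pooled gradients, followed by a ReLU over the weighted sum of activation maps. Averaging the weights over several examples is an assumed stand-in for the aggregation step; the function names are hypothetical and this is not the authors' implementation.

```python
def grad_weights(gradients):
    """Global-average-pool gradients into one weight per channel.

    gradients: list of K channels, each an H x W list of lists.
    """
    weights = []
    for channel in gradients:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        weights.append(total / count)
    return weights


def aggregate_weights(weight_sets):
    """Average per-channel weights over several examples (assumed aggregation)."""
    n = len(weight_sets)
    num_channels = len(weight_sets[0])
    return [sum(ws[k] for ws in weight_sets) / n for k in range(num_channels)]


def cam(activations, weights):
    """ReLU of the weighted sum of the K activation maps."""
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for w, channel in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += w * channel[i][j]
    return [[max(0.0, v) for v in row] for row in heatmap]
```

Once aggregated weights are available, `cam` needs only the activation maps of a new image, which is in the spirit of explaining a decision without a fresh backward pass.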

Fast Sequence-Based Embedding with Diffusion Graphs


Title	Fast Sequence-Based Embedding with Diffusion Graphs
Authors	Benedek Rozemberczki, Rik Sarkar
Abstract	A graph embedding is a representation of graph vertices in a low-dimensional space, which approximately preserves properties such as distances between nodes. Vertex sequence-based embedding procedures use features extracted from linear sequences of nodes to create embeddings using a neural network. In this paper, we propose diffusion graphs as a method to rapidly generate vertex sequences for network embedding. Its computational efficiency is superior to previous methods due to simpler sequence generation, and it produces more accurate results. In experiments, we found that the performance relative to other methods improves with increasing edge density in the graph. In a community detection task, clustering nodes in the embedding space produces better results compared to other sequence-based embedding methods.
Tasks	Community Detection, Graph Embedding, Network Embedding
Published	2020-01-21
URL	https://arxiv.org/abs/2001.07463v1
PDF	https://arxiv.org/pdf/2001.07463v1.pdf
PWC	https://paperswithcode.com/paper/fast-sequence-based-embedding-with-diffusion-1
Repo	https://github.com/benedekrozemberczki/karateclub
Framework	none

Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU


Title	Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU
Authors	Patrick Kidger, Terry Lyons
Abstract	Signatory is a library for calculating signature and logsignature transforms and related functionality. The focus is on making this functionality available for use in machine learning, and as such it includes features such as GPU support and backpropagation. To our knowledge it is the first publicly available GPU-capable library for these operations. It also implements several new algorithmic improvements, and provides several new features not available in previous libraries. The library operates as a Python wrapper around C++, and is compatible with the PyTorch ecosystem. It may be installed directly via pip. Source code, documentation, examples, benchmarks and tests may be found at https://github.com/patrick-kidger/signatory. The license is Apache-2.0.
Tasks
Published	2020-01-03
URL	https://arxiv.org/abs/2001.00706v1
PDF	https://arxiv.org/pdf/2001.00706v1.pdf
PWC	https://paperswithcode.com/paper/signatory-differentiable-computations-of-the
Repo	https://github.com/patrick-kidger/signatory
Framework	pytorch
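
To make the signature transform concrete, here is a minimal pure-Python sketch of the first two signature levels of a piecewise-linear path. It does not use Signatory itself and is neither differentiable nor GPU-capable; `signature_depth2` is a hypothetical helper written only for illustration.

```python
def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    path: list of points, each a list of d floats.
    Returns (level1, level2), where level1[i] is the total increment in
    coordinate i and level2[i][j] is the iterated integral of dX^i dX^j.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for start, end in zip(path, path[1:]):
        dx = [e - s for s, e in zip(start, end)]
        for i in range(d):
            for j in range(d):
                # level1 still holds the increment accumulated before this
                # segment, so this adds the segment's exact contribution.
                level2[i][j] += level1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            level1[i] += dx[i]
    return level1, level2
```

For the path `[[0, 0], [1, 0], [1, 1]]`, the antisymmetric part `0.5 * (level2[0][1] - level2[1][0])` is the signed area between the path and its chord, which is one reason signatures are used as path features.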

Learning Deformable Registration of Medical Images with Anatomical Constraints


Title	Learning Deformable Registration of Medical Images with Anatomical Constraints
Authors	Lucas Mansilla, Diego H. Milone, Enzo Ferrante
Abstract	Deformable image registration is a fundamental problem in the field of medical image analysis. In recent years, we have witnessed the advent of deep learning-based image registration methods which achieve state-of-the-art performance and drastically reduce the required computational time. However, little work has been done on how we can encourage our models to produce results that are not only accurate but also anatomically plausible, which is still an open question in the field. In this work, we argue that incorporating anatomical priors in the form of global constraints into the learning process of these models will further improve their performance and boost the realism of the warped images after registration. We learn global non-linear representations of image anatomy using segmentation masks, and employ them to constrain the registration process. The proposed AC-RegNet architecture is evaluated in the context of chest X-ray image registration using three different datasets, where the high anatomical variability makes the task extremely challenging. Our experiments show that the proposed anatomically constrained registration model produces more realistic and accurate results than state-of-the-art methods, demonstrating the potential of this approach.
Tasks	Image Registration
Published	2020-01-20
URL	https://arxiv.org/abs/2001.07183v2
PDF	https://arxiv.org/pdf/2001.07183v2.pdf
PWC	https://paperswithcode.com/paper/learning-deformable-registration-of-medical
Repo	https://github.com/lucasmansilla/ACRN_Chest_X-ray_IA
Framework	tf

Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations


Title	Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations
Authors	Ioana Bica, Ahmed M. Alaa, James Jordon, Mihaela van der Schaar
Abstract	Identifying when to give treatments to patients and how to select among multiple treatments over time are important medical problems with few existing solutions. In this paper, we introduce the Counterfactual Recurrent Network (CRN), a novel sequence-to-sequence model that leverages the increasingly available patient observational data to estimate treatment effects over time and answer such medical questions. To handle the bias from time-varying confounders, covariates affecting the treatment assignment policy in the observational data, CRN uses domain adversarial training to build balancing representations of the patient history. At each timestep, CRN constructs a treatment-invariant representation which removes the association between patient history and treatment assignments and thus can be reliably used for making counterfactual predictions. On a simulated model of tumour growth, with varying degrees of time-dependent confounding, we show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment than current state-of-the-art methods.
Tasks
Published	2020-02-10
URL	https://arxiv.org/abs/2002.04083v1
PDF	https://arxiv.org/pdf/2002.04083v1.pdf
PWC	https://paperswithcode.com/paper/estimating-counterfactual-treatment-outcomes-1
Repo	https://github.com/ioanabica/Counterfactual-Recurrent-Network
Framework	none

Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation


Title	Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation
Authors	Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Ming-Hsuan Yang
Abstract	Few-shot classification aims to recognize novel categories with only a few labeled images in each class. Existing metric-based few-shot classification algorithms predict categories by comparing the feature embeddings of query images with those from a few labeled images (support examples) using a learned metric function. While promising performance has been demonstrated, these methods often fail to generalize to unseen domains due to the large discrepancy in feature distributions across domains. In this work, we address the problem of few-shot classification under domain shifts for metric-based methods. Our core idea is to use feature-wise transformation layers that augment the image features with affine transforms, simulating various feature distributions under different domains in the training stage. To capture variations of the feature distributions under different domains, we further apply a learning-to-learn approach to search for the hyper-parameters of the feature-wise transformation layers. We conduct extensive experiments and ablation studies under the domain generalization setting using five few-shot classification datasets: mini-ImageNet, CUB, Cars, Places, and Plantae. Experimental results demonstrate that the proposed feature-wise transformation layer is applicable to various metric-based models, and provides consistent improvements in few-shot classification performance under domain shift.
Tasks	Cross-Domain Few-Shot, Domain Generalization
Published	2020-01-23
URL	https://arxiv.org/abs/2001.08735v3
PDF	https://arxiv.org/pdf/2001.08735v3.pdf
PWC	https://paperswithcode.com/paper/cross-domain-few-shot-classification-via-1
Repo	https://github.com/hytseng0509/CrossDomainFewShot
Framework	pytorch
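
The feature-wise transformation described above can be sketched as a randomly sampled per-channel scale and shift. The Gaussian sampling, the softplus parameterisation, and the names `theta_gamma` and `theta_beta` are assumptions made for illustration; the abstract only states that affine transforms with searched hyper-parameters are applied to the features.

```python
import math
import random


def softplus(x):
    """Smooth positive function used here to turn a parameter into a std."""
    return math.log1p(math.exp(x))


def feature_wise_transform(features, theta_gamma, theta_beta, rng):
    """Apply a randomly sampled per-channel affine transform.

    features: one float per channel (a real layer would broadcast the same
    scale and shift over each channel's spatial positions).
    theta_gamma, theta_beta: per-channel hyper-parameters controlling the
    spread of the sampled scale and shift (assumed parameterisation).
    rng: a random.Random instance, passed in for reproducibility.
    """
    out = []
    for z, tg, tb in zip(features, theta_gamma, theta_beta):
        gamma = rng.gauss(1.0, softplus(tg))  # scale centred on 1
        beta = rng.gauss(0.0, softplus(tb))   # shift centred on 0
        out.append(gamma * z + beta)
    return out
```

A call such as `feature_wise_transform([0.5, -1.2], [0.3, 0.3], [0.5, 0.5], random.Random(0))` returns a perturbed copy of the features; very negative hyper-parameters shrink the perturbation towards the identity transform.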

Supporting supervised learning in fungal Biosynthetic Gene Cluster discovery: new benchmark datasets


Title	Supporting supervised learning in fungal Biosynthetic Gene Cluster discovery: new benchmark datasets
Authors	Hayda Almeida, Adrian Tsang, Abdoulaye Baniré Diallo
Abstract	Fungal Biosynthetic Gene Clusters (BGCs) of secondary metabolites are clusters of genes capable of producing natural products, compounds that play an important role in the production of a wide variety of bioactive compounds, including antibiotics and pharmaceuticals. Identifying BGCs can lead to the discovery of novel natural products to benefit human health. Previous work has focused on developing automatic tools to support BGC discovery in plants, fungi, and bacteria. Data-driven methods, as well as probabilistic and supervised learning methods, have been explored in identifying BGCs. Most methods applied to identify fungal BGCs were data-driven and limited in scope. Supervised learning methods have been shown to perform well at identifying BGCs in bacteria, and could be well suited to perform the same task in fungi. However, labeled data instances are needed to perform supervised learning. Openly accessible BGC databases contain only a very small portion of previously curated fungal BGCs. Making new fungal BGC datasets available could motivate the development of supervised learning methods for fungal BGCs and potentially improve prediction performance compared to data-driven methods. In this work, we propose new publicly available fungal BGC datasets to support the BGC discovery task using supervised learning. These datasets are prepared to perform binary classification and predict candidate BGC regions in fungal genomes. In addition, we analyse the performance of a well-supported supervised learning tool developed to predict BGCs.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.03260v1
PDF	https://arxiv.org/pdf/2001.03260v1.pdf
PWC	https://paperswithcode.com/paper/supporting-supervised-learning-in-fungal
Repo	https://github.com/bioinfoUQAM/fungalbgcdata
Framework	none

RSANet: Recurrent Slice-wise Attention Network for Multiple Sclerosis Lesion Segmentation


Title	RSANet: Recurrent Slice-wise Attention Network for Multiple Sclerosis Lesion Segmentation
Authors	Hang Zhang, Jinwei Zhang, Qihao Zhang, Jeremy Kim, Shun Zhang, Susan A. Gauthier, Pascal Spincemaille, Thanh D. Nguyen, Mert R. Sabuncu, Yi Wang
Abstract	Brain lesion volume measured on T2-weighted MRI images is a clinically important disease marker in multiple sclerosis (MS). Manual delineation of MS lesions is a time-consuming and highly operator-dependent task, which is influenced by lesion size, shape and conspicuity. Recently, automated lesion segmentation algorithms based on deep neural networks have been developed with promising results. In this paper, we propose a novel recurrent slice-wise attention network (RSANet), which models 3D MRI images as sequences of slices and captures long-range dependencies in a recurrent manner to utilize contextual information of MS lesions. Experiments on a dataset with 43 patients show that the proposed method outperforms the state-of-the-art approaches. Our implementation is available online at https://github.com/tinymilky/RSANet.
Tasks	Lesion Segmentation
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12470v1
PDF	https://arxiv.org/pdf/2002.12470v1.pdf
PWC	https://paperswithcode.com/paper/rsanet-recurrent-slice-wise-attention-network
Repo	https://github.com/tinymilky/RSANet
Framework	pytorch

NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search


Title	NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search
Authors	Xuanyi Dong, Yi Yang
Abstract	Neural architecture search (NAS) has achieved breakthrough success in a great number of applications in the past few years. It could be time to take a step back and analyze the good and bad aspects in the field of NAS. A variety of algorithms search architectures under different search spaces. These searched architectures are trained using different setups, e.g., hyper-parameters, data augmentation, regularization. This raises a comparability problem when comparing the performance of various NAS algorithms. NAS-Bench-101 has been successful in alleviating this problem. In this work, we propose an extension to NAS-Bench-101: NAS-Bench-201, with a different search space, results on multiple datasets, and more diagnostic information. NAS-Bench-201 has a fixed search space and provides a unified benchmark for almost any up-to-date NAS algorithm. The design of our search space is inspired by the one used in the most popular cell-based searching algorithms, where a cell is represented as a DAG. Each edge here is associated with an operation selected from a predefined operation set. For it to be applicable to all NAS algorithms, the search space defined in NAS-Bench-201 includes all possible architectures generated by 4 nodes and 5 associated operation options, which results in 15,625 candidates in total. The training log and the performance for each architecture candidate are provided for three datasets. This allows researchers to avoid unnecessary repetitive training for selected candidates and focus solely on the search algorithm itself. The training time saved for every candidate also largely improves the efficiency of many methods. We provide additional diagnostic information such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms. In further support, we have analyzed it from many aspects and benchmarked 10 recent NAS algorithms.
Tasks	Data Augmentation, Neural Architecture Search
Published	2020-01-02
URL	https://arxiv.org/abs/2001.00326v2
PDF	https://arxiv.org/pdf/2001.00326v2.pdf
PWC	https://paperswithcode.com/paper/nas-bench-102-extending-the-scope-of
Repo	https://github.com/D-X-Y/NAS-Projects
Framework	pytorch
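
The candidate count in the abstract can be checked with a short enumeration. The sketch assumes the cell is a DAG in which every earlier node feeds every later node, giving 6 edges for 4 nodes, so that 5 choices per edge yield 5^6 = 15,625 cells. The operation names are placeholders, since the abstract only states that there are five.

```python
from itertools import combinations, product

NUM_NODES = 4
# Placeholder operation names; the abstract only states that there are five.
OPERATIONS = ["op_a", "op_b", "op_c", "op_d", "op_e"]

# Assumed cell topology: every earlier node feeds every later node.
EDGES = list(combinations(range(NUM_NODES), 2))


def enumerate_cells():
    """Yield every cell as a dict mapping (src, dst) edges to an operation."""
    for assignment in product(OPERATIONS, repeat=len(EDGES)):
        yield dict(zip(EDGES, assignment))


num_cells = sum(1 for _ in enumerate_cells())
print(len(EDGES), num_cells)
```

Because the space is this small, every candidate can be trained once and stored, which is what lets a search algorithm query results instead of retraining.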

Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-based Modules


Title	Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-based Modules
Authors	Ernesto Jiménez-Ruiz, Asan Agibetov, Jiaoyan Chen, Matthias Samwald, Valerie Cross
Abstract	Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper we present an approach that combines a neural embedding model and logic-based modules to accurately divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed method is adequate in practice and can be integrated within the workflow of systems unable to cope with very large ontologies.
Tasks
Published	2020-02-25
URL	https://arxiv.org/abs/2003.05370v1
PDF	https://arxiv.org/pdf/2003.05370v1.pdf
PWC	https://paperswithcode.com/paper/dividing-the-ontology-alignment-task-with
Repo	https://github.com/ernestojimenezruiz/logmap-matcher
Framework	none

Social Media Mining Toolkit (SMMT)


Title	Social Media Mining Toolkit (SMMT)
Authors	Ramya Tekumalla, Juan M. Banda
Abstract	There has been a dramatic increase in the popularity of utilizing social media data for research purposes within the biomedical community. In PubMed alone, there have been nearly 2,500 publication entries since 2014 that deal with analyzing social media data from Twitter and Reddit. However, the vast majority of those works do not share their code or data for replicating their studies. With minimal exceptions, the few that do place the burden on the researcher to figure out how to fetch the data, how to best format their data, and how to create automatic and manual annotations on the acquired data. In order to address this pressing issue, we introduce the Social Media Mining Toolkit (SMMT), a suite of tools that aims to encapsulate the cumbersome details of acquiring, preprocessing, annotating and standardizing social media data. The purpose of our toolkit is to let researchers focus on answering research questions rather than on the technical aspects of using social media data. By using a standard toolkit, researchers will be able to acquire, use, and release data in a consistent way that is transparent for everybody using the toolkit, thereby simplifying research reproducibility and accessibility in the social media domain.
Tasks
Published	2020-03-31
URL	https://arxiv.org/abs/2003.13894v1
PDF	https://arxiv.org/pdf/2003.13894v1.pdf
PWC	https://paperswithcode.com/paper/social-media-mining-toolkit-smmt
Repo	https://github.com/thepanacealab/SMMT
Framework	none

On Feature Normalization and Data Augmentation


Title	On Feature Normalization and Data Augmentation
Authors	Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger
Abstract	Modern neural network training relies heavily on data augmentation for improved generalization. After the initial success of label-preserving augmentations, there has been a recent surge of interest in label-perturbing approaches, which combine features and labels across training samples to smooth the learned decision surface. In this paper, we propose a new augmentation method that leverages the first and second moments extracted and re-injected by feature normalization. We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels. As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation methods. We demonstrate its efficacy across benchmark data sets in computer vision, speech, and natural language processing, where it consistently improves the generalization performance of highly competitive baseline networks.
Tasks	Data Augmentation, Image Classification
Published	2020-02-25
URL	https://arxiv.org/abs/2002.11102v2
PDF	https://arxiv.org/pdf/2002.11102v2.pdf
PWC	https://paperswithcode.com/paper/on-feature-normalization-and-data
Repo	https://github.com/Boyiliee/MoEx
Framework	pytorch
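
The moment swap described in the abstract can be sketched as follows: one sample's features are normalized with their own mean and standard deviation, then the other sample's moments are re-injected, and the labels are interpolated. Treating the features as a flat list, and the choice of `lam`, are simplifying assumptions; the function names are hypothetical and this is not the authors' code.

```python
import math


def moments(features):
    """Return the mean and (population) standard deviation of a feature list."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return mean, math.sqrt(var)


def moment_exchange(feat_a, feat_b, eps=1e-5):
    """Normalize feat_a with its own moments, then re-inject feat_b's."""
    mean_a, std_a = moments(feat_a)
    mean_b, std_b = moments(feat_b)
    return [std_b * (f - mean_a) / (std_a + eps) + mean_b for f in feat_a]


def mix_labels(label_a, label_b, lam):
    """Interpolate two one-hot label vectors with weight lam on label_a."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
```

The mixed features keep the normalized shape of sample A while carrying sample B's first and second moments, so the interpolated target reflects both sources.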

IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems


Title	IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
Authors	Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen
Abstract	Personal assistant systems, such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana, are becoming ever more widely used. Understanding user intent such as clarification questions, potential answers and user feedback in information-seeking conversations is critical for retrieving good responses. In this paper, we analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model “IART”, which refers to “Intent-Aware Ranking with Transformers”. IART is built on top of the integration of user intent modeling and language representation learning with the Transformer architecture, which relies entirely on a self-attention mechanism instead of recurrent nets. It incorporates intent-aware utterance attention to derive an importance weighting scheme of utterances in conversation context with the aim of better conversation history understanding. We conduct extensive experiments with three information-seeking conversation data sets including both standard benchmarks and commercial data. Our proposed model outperforms all baseline methods with respect to a variety of metrics. We also perform case studies and analysis of learned user intent and its impact on response ranking in information-seeking conversations to provide interpretation of results.
Tasks	Representation Learning
Published	2020-02-03
URL	https://arxiv.org/abs/2002.00571v1
PDF	https://arxiv.org/pdf/2002.00571v1.pdf
PWC	https://paperswithcode.com/paper/iart-intent-aware-response-ranking-with
Repo	https://github.com/yangliuy/Intent-Aware-Ranking-Transformers
Framework	none

Visual Grounding in Video for Unsupervised Word Translation


Title	Visual Grounding in Video for Unsupervised Word Translation
Authors	Gunnar A. Sigurdsson, Jean-Baptiste Alayrac, Aida Nematzadeh, Lucas Smaira, Mateusz Malinowski, João Carreira, Phil Blunsom, Andrew Zisserman
Abstract	There are thousands of actively spoken languages on Earth, but a single visual world. Grounding in this visual world has the potential to bridge the gap between all these languages. Our goal is to use visual grounding to improve unsupervised word mapping between languages. The key idea is to establish a common visual representation between two languages by learning embeddings from unpaired instructional videos narrated in the native language. Given this shared embedding we demonstrate that (i) we can map words between the languages, particularly the ‘visual’ words; (ii) that the shared embedding provides a good initialization for existing unsupervised text-based word translation techniques, forming the basis for our proposed hybrid visual-text mapping algorithm, MUVE; and (iii) our approach achieves superior performance by addressing the shortcomings of text-based methods – it is more robust, handles datasets with less commonality, and is applicable to low-resource languages. We apply these methods to translate words from English to French, Korean, and Japanese – all without any parallel corpora and simply by watching many videos of people speaking while doing things.
Tasks
Published	2020-03-11
URL	https://arxiv.org/abs/2003.05078v2
PDF	https://arxiv.org/pdf/2003.05078v2.pdf
PWC	https://paperswithcode.com/paper/visual-grounding-in-video-for-unsupervised
Repo	https://github.com/gsig/visual-grounding
Framework	none

CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese


Title	CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Authors	Liang Xu, Yu tong, Qianqian Dong, Yixuan Liao, Cong Yu, Yin Tian, Weitang Liu, Lu Li, Caiquan Liu, Xuanwei Zhang
Abstract	In this paper, we introduce the NER dataset from the CLUE organization (CLUENER2020), a well-defined fine-grained dataset for named entity recognition in Chinese. CLUENER2020 contains 10 categories. Apart from common labels like person, organization, and location, it contains more diverse categories. It is more challenging than other current Chinese NER datasets and could better reflect real-world applications. For comparison, we implement several state-of-the-art baselines as sequence labeling tasks and report human performance, together with an analysis of it. To facilitate future work on fine-grained NER for Chinese, we release our dataset, baselines, and leaderboard.
Tasks	Chinese Named Entity Recognition, Named Entity Recognition
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04351v4
PDF	https://arxiv.org/pdf/2001.04351v4.pdf
PWC	https://paperswithcode.com/paper/cluener2020-fine-grained-name-entity
Repo	https://github.com/CLUEbenchmark/CLUENER2020
Framework	pytorch