January 31, 2020

3044 words 15 mins read

Paper Group AWR 412


Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography

Title Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography
Authors Aleksei Tiulpin, Mikko Finnilä, Petri Lehenkari, Heikki J. Nieminen, Simo Saarakkala
Abstract Three-dimensional (3D) semi-quantitative grading of pathological features in articular cartilage (AC) offers significant improvements in basic research of osteoarthritis (OA). We previously developed a 3D protocol for imaging AC and its structures, which includes staining the sample with a contrast agent (phosphotungstic acid, PTA) and subsequent scanning with micro-computed tomography. This protocol was designed to provide X-ray attenuation contrast to visualize AC structure. However, it has one major disadvantage: the loss of contrast at the tidemark (calcified cartilage interface, CCI). Accurate segmentation of the CCI can be very important for understanding the etiology of OA and for ex-vivo evaluation of tidemark condition at early OA stages. In this paper, we present the first application of deep learning to PTA-stained osteochondral samples, enabling fully automatic tidemark segmentation. Our method is based on a U-Net trained with a combination of binary cross-entropy and soft Jaccard loss. On cross-validation, this approach yielded intersection over union (IoU) scores of 0.59, 0.70, 0.79, 0.83 and 0.86 within 15 μm, 30 μm, 45 μm, 60 μm and 75 μm padded zones around the tidemark, respectively. Our code and the dataset of 35 PTA-stained human AC samples, together with the segmentation masks, are made publicly available to facilitate the development of biomedical image segmentation methods.
Tasks Semantic Segmentation
Published 2019-07-11
URL https://arxiv.org/abs/1907.05089v1
PDF https://arxiv.org/pdf/1907.05089v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-tidemark-segmentation-in
Repo https://github.com/MIPT-Oulu/mCTSegmentation
Framework none
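
As an illustration of the loss described above (a minimal sketch, not the authors' released code; the `jaccard_weight` mixing coefficient and `eps` smoothing term are our assumptions), a PyTorch version of the combined binary cross-entropy and soft Jaccard objective could look like this:

```python
import torch
import torch.nn.functional as F

def bce_soft_jaccard_loss(logits, target, jaccard_weight=0.5, eps=1e-7):
    """Combine BCE with a differentiable (soft) Jaccard term.

    logits: raw network outputs; target: binary mask (both float tensors).
    """
    bce = F.binary_cross_entropy_with_logits(logits, target)
    probs = torch.sigmoid(logits)
    intersection = (probs * target).sum()
    union = probs.sum() + target.sum() - intersection
    soft_jaccard = (intersection + eps) / (union + eps)
    # Minimising -log(IoU) penalises low overlap more sharply than 1 - IoU.
    return (1 - jaccard_weight) * bce - jaccard_weight * torch.log(soft_jaccard)
```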

Modelling urban networks using Variational Autoencoders

Title Modelling urban networks using Variational Autoencoders
Authors Kira Kempinska, Roberto Murcio
Abstract A long-standing question for urban and regional planners pertains to the ability to describe urban patterns quantitatively. Cities’ transport infrastructure, particularly street networks, provides an invaluable source of information about the urban patterns generated by people’s movements and their interactions. With the increasing availability of street network datasets and the advancements in deep learning methods, we are presented with an unprecedented opportunity to push the frontiers of urban modelling towards more data-driven and accurate models of urban forms. In this study, we present our initial work on applying deep generative models to urban street network data to create spatially explicit urban models. We base our work on Variational Autoencoders (VAEs), deep generative models that have recently gained popularity due to their ability to generate realistic images. Initial results show that VAEs are capable of capturing key high-level urban network metrics using low-dimensional vectors and of generating new urban forms whose complexity matches the cities captured in the street network data.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.06465v1
PDF https://arxiv.org/pdf/1905.06465v1.pdf
PWC https://paperswithcode.com/paper/modelling-urban-networks-using-variational
Repo https://github.com/kirakowalska/vae-urban-network
Framework pytorch
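
To make the modelling idea above concrete, here is a minimal VAE sketch in PyTorch; the 64×64 rasterised street-network input, the hidden width and the latent dimension are illustrative assumptions, not the paper's settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StreetNetworkVAE(nn.Module):
    """Minimal VAE over 64x64 rasterised street-network patches."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(),
                                     nn.Linear(64 * 64, 512), nn.ReLU())
        self.to_mu = nn.Linear(512, latent_dim)
        self.to_logvar = nn.Linear(512, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, 64 * 64), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        return self.decoder(z).view_as(x), mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Reconstruction term plus KL divergence to the unit Gaussian prior."""
    bce = F.binary_cross_entropy(recon.flatten(1), x.flatten(1),
                                 reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld
```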

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

Title Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Authors Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das
Abstract Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. We adapt the recently proposed ViLBERT (Lu et al., 2019) model for multi-turn visually-grounded conversations. Our model is pretrained on the Conceptual Captions and Visual Question Answering datasets, and finetuned on VisDial. Our best single model outperforms prior published work (including model ensembles) by more than 1% absolute on NDCG and MRR. Next, we find that additional finetuning using “dense” annotations in VisDial leads to even higher NDCG – more than 10% over our base model – but hurts MRR – more than 17% below our base model! This highlights a trade-off between the two primary metrics – NDCG and MRR – which we find is due to dense annotations not correlating well with the original ground-truth answers to questions.
Tasks Language Modelling, Representation Learning, Visual Dialog
Published 2019-12-05
URL https://arxiv.org/abs/1912.02379v2
PDF https://arxiv.org/pdf/1912.02379v2.pdf
PWC https://paperswithcode.com/paper/large-scale-pretraining-for-visual-dialog-a
Repo https://github.com/vmurahari3/visdial-bert
Framework pytorch
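
Since the NDCG/MRR trade-off is central to the result above, a small sketch of how NDCG is computed over a ranked answer list may help (illustrative only, not the authors' evaluation code):

```python
import numpy as np

def ndcg(relevances_in_ranked_order):
    """NDCG of a candidate answer list already sorted by model rank,
    given per-candidate (dense) relevance annotations."""
    gains = np.asarray(relevances_in_ranked_order, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, gains.size + 2))
    dcg = (gains * discounts).sum()
    ideal = (np.sort(gains)[::-1] * discounts).sum()
    return dcg / ideal if ideal > 0 else 0.0
```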

Cognitive Graph for Multi-Hop Reading Comprehension at Scale

Title Cognitive Graph for Multi-Hop Reading Comprehension at Scale
Authors Ming Ding, Chang Zhou, Qibin Chen, Hongxia Yang, Jie Tang
Abstract We propose CogQA, a new framework for multi-hop question answering over web-scale documents. Inspired by the dual process theory in cognitive science, the framework gradually builds a cognitive graph in an iterative process by coordinating an implicit extraction module (System 1) and an explicit reasoning module (System 2). While giving accurate answers, our framework further provides explainable reasoning paths. Specifically, our implementation based on BERT and a graph neural network efficiently handles millions of documents for multi-hop reasoning questions in the HotpotQA fullwiki dataset, achieving a winning joint F1 score of 34.9 on the leaderboard, compared to 23.6 for the best competitor.
Tasks Multi-Hop Reading Comprehension, Question Answering, Reading Comprehension
Published 2019-05-14
URL https://arxiv.org/abs/1905.05460v2
PDF https://arxiv.org/pdf/1905.05460v2.pdf
PWC https://paperswithcode.com/paper/cognitive-graph-for-multi-hop-reading
Repo https://github.com/THUDM/CogQA
Framework pytorch
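
A toy sketch of the iterative System 1 / System 2 loop described above, with both modules stubbed by trivial stand-ins (the real framework uses BERT for extraction and a graph neural network for reasoning):

```python
def system1_extract(node, passages):
    # Hypothetical stand-in for the BERT extractor: return entities that
    # co-occur with `node`. `passages` maps an entity to lists of entities.
    return [e for passage in passages.get(node, []) for e in passage]

def system2_reason(graph):
    # Hypothetical stand-in for the GNN: score each node by how much
    # evidence (in-edges) points at it.
    return {node: len(parents) for node, parents in graph.items()}

def build_cognitive_graph(question_entities, passages, max_hops=2):
    graph = {e: set() for e in question_entities}
    frontier = list(question_entities)
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for cand in system1_extract(node, passages):
                if cand not in graph:
                    next_frontier.append(cand)
                graph.setdefault(cand, set()).add(node)
        frontier = next_frontier
    return graph, system2_reason(graph)
```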

ConvLab: Multi-Domain End-to-End Dialog System Platform

Title ConvLab: Multi-Domain End-to-End Dialog System Platform
Authors Sungjin Lee, Qi Zhu, Ryuichi Takanobu, Xiang Li, Yaoqin Zhang, Zheng Zhang, Jinchao Li, Baolin Peng, Xiujun Li, Minlie Huang, Jianfeng Gao
Abstract We present ConvLab, an open-source multi-domain end-to-end dialog system platform that enables researchers to quickly set up experiments with reusable components and to compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments. ConvLab offers a set of fully annotated datasets and associated pre-trained reference models. As a showcase, we extend the MultiWOZ dataset with user dialog act annotations to train all component models and demonstrate how ConvLab makes it easy to conduct complicated experiments in multi-domain end-to-end dialog settings.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.08637v1
PDF http://arxiv.org/pdf/1904.08637v1.pdf
PWC https://paperswithcode.com/paper/convlab-multi-domain-end-to-end-dialog-system
Repo https://github.com/ConvLab/ConvLab
Framework pytorch
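
The pipeline style of system ConvLab supports can be pictured as a schematic NLU → DST → Policy → NLG turn; the four stub classes below are placeholders for illustration, not ConvLab's actual component classes:

```python
class EchoNLU:
    def predict(self, utterance):
        # Placeholder: pretend every utterance informs one dummy slot.
        return [("inform", "dummy_slot", utterance)]

class DictDST:
    def __init__(self):
        self.state = {}

    def update(self, user_acts):
        for _intent, slot, value in user_acts:
            self.state[slot] = value
        return self.state

class RulePolicy:
    def select_action(self, state):
        # Placeholder: always ask for more information.
        return [("request", "more_info", None)]

class TemplateNLG:
    def generate(self, system_acts):
        return " ".join(f"{intent}({slot})" for intent, slot, _ in system_acts)

def pipeline_turn(nlu, dst, policy, nlg, user_utterance):
    """One NLU -> DST -> Policy -> NLG turn of a pipeline dialog agent."""
    user_acts = nlu.predict(user_utterance)
    state = dst.update(user_acts)
    system_acts = policy.select_action(state)
    return nlg.generate(system_acts)

print(pipeline_turn(EchoNLU(), DictDST(), RulePolicy(), TemplateNLG(),
                    "I need a cheap hotel in the north"))
```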

Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments

Title Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments
Authors Mert Kayhan, Okan Köpüklü, Mhd Hasan Sarhan, Mehmet Yigitsoy, Abouzar Eslami, Gerhard Rigoll
Abstract For many practical problems and applications, it is not feasible to create a vast and accurately labeled dataset, which restricts the application of deep learning in many areas. Semi-supervised learning algorithms intend to improve performance by also leveraging unlabeled data. This is very valuable for the 2D-pose estimation task, where data labeling requires substantial time and is subject to noise. This work investigates whether semi-supervised learning techniques can achieve a performance level that justifies their use during training. To this end, a lightweight network architecture is introduced, and the mean teacher, virtual adversarial training and pseudo-labeling algorithms are evaluated on 2D-pose estimation for surgical instruments. To make the pseudo-labeling algorithm applicable, we propose a novel confidence measure, total variation. Experimental results show that semi-supervised learning drastically improves performance on unseen geometries while maintaining high accuracy on seen geometries. On the RMIT benchmark, our lightweight architecture outperforms the state of the art with supervised learning. On the Endovis benchmark, the pseudo-labeling algorithm improves the supervised baseline, achieving new state-of-the-art performance.
Tasks Deep Attention, Pose Estimation
Published 2019-12-10
URL https://arxiv.org/abs/1912.04618v1
PDF https://arxiv.org/pdf/1912.04618v1.pdf
PWC https://paperswithcode.com/paper/deep-attention-based-semi-supervised-2d-pose
Repo https://github.com/mertkayhan/SSL-2D-Pose
Framework tf
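
A hedged sketch of a total-variation confidence measure for pseudo-label selection; treating sharply peaked heatmaps (high TV) as confident is our assumption about the paper's intent, not its specification:

```python
import torch

def total_variation(heatmap):
    """Anisotropic total variation of a 2D keypoint heatmap."""
    dh = (heatmap[1:, :] - heatmap[:-1, :]).abs().sum()
    dw = (heatmap[:, 1:] - heatmap[:, :-1]).abs().sum()
    return dh + dw

def select_pseudo_labels(heatmaps, threshold):
    # Keep predictions whose heatmaps are sharply peaked (high TV); the
    # direction of this criterion is our assumption.
    return [i for i, h in enumerate(heatmaps) if total_variation(h) > threshold]
```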

Revisiting Self-Supervised Visual Representation Learning

Title Revisiting Self-Supervised Visual Representation Learning
Authors Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer
Abstract Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among the large body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks. A large number of pretext tasks for self-supervised learning have been studied, but other important aspects, such as the choice of convolutional neural network (CNN) architecture, have not received equal attention. Therefore, we revisit numerous previously proposed self-supervised models, conduct a thorough large-scale study and, as a result, uncover multiple crucial insights. We challenge a number of common practices in self-supervised visual representation learning and observe that standard recipes for CNN design do not always translate to self-supervised representation learning. As part of our study, we drastically boost the performance of previously proposed techniques and outperform previously published state-of-the-art results by a large margin.
Tasks Representation Learning, Self-Supervised Image Classification
Published 2019-01-25
URL http://arxiv.org/abs/1901.09005v1
PDF http://arxiv.org/pdf/1901.09005v1.pdf
PWC https://paperswithcode.com/paper/revisiting-self-supervised-visual
Repo https://github.com/rickyHong/Puzzle-tensorflow-latest-repl
Framework tf
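
One of the classic pretext tasks studied in this line of work is rotation prediction; a minimal PyTorch sketch of constructing such a self-supervised batch (the real study evaluates several pretext tasks and architectures):

```python
import torch

def make_rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the rotation index is
    the pretext label. `images` has shape (B, C, H, W) with H == W."""
    rotated = [torch.rot90(images, k, dims=(-2, -1)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return torch.cat(rotated, dim=0), labels
```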

G2SAT: Learning to Generate SAT Formulas

Title G2SAT: Learning to Generate SAT Formulas
Authors Jiaxuan You, Haoze Wu, Clark Barrett, Raghuram Ramanujan, Jure Leskovec
Abstract The Boolean Satisfiability (SAT) problem is the canonical NP-complete problem and is fundamental to computer science, with a wide array of applications in planning, verification, and theorem proving. Developing and evaluating practical SAT solvers relies on extensive empirical testing on a set of real-world benchmark formulas. However, the availability of such real-world SAT formulas is limited. While these benchmark formulas can be augmented with synthetically generated ones, existing approaches for doing so are heavily hand-crafted and fail to simultaneously capture a wide range of characteristics exhibited by real-world SAT instances. In this work, we present G2SAT, the first deep generative framework that learns to generate SAT formulas from a given set of input formulas. Our key insight is that SAT formulas can be transformed into latent bipartite graph representations which we model using a specialized deep generative neural network. We show that G2SAT can generate SAT formulas that closely resemble given real-world SAT instances, as measured by both graph metrics and SAT solver behavior. Further, we show that our synthetic SAT formulas could be used to improve SAT solver performance on real-world benchmarks, which opens up new opportunities for the continued development of SAT solvers and a deeper understanding of their performance.
Tasks Automated Theorem Proving
Published 2019-10-29
URL https://arxiv.org/abs/1910.13445v1
PDF https://arxiv.org/pdf/1910.13445v1.pdf
PWC https://paperswithcode.com/paper/g2sat-learning-to-generate-sat-formulas
Repo https://github.com/JiaxuanYou/G2SAT
Framework pytorch
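
The key transformation above, from a CNF formula to a bipartite graph, is easy to sketch (clause nodes on one side, literal nodes on the other; the node-naming scheme is our convention, not G2SAT's):

```python
def cnf_to_bipartite(clauses):
    """Map a CNF formula (clauses as lists of signed ints, DIMACS style)
    to a bipartite edge list between clause nodes and literal nodes."""
    edges = []
    for c_idx, clause in enumerate(clauses):
        for literal in clause:
            edges.append((f"c{c_idx}", f"lit{literal}"))
    return edges

# (x1 v ~x2) ^ (x2 v x3)
print(cnf_to_bipartite([[1, -2], [2, 3]]))
```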

Graph Neural Network on Electronic Health Records for Predicting Alzheimer’s Disease

Title Graph Neural Network on Electronic Health Records for Predicting Alzheimer’s Disease
Authors Weicheng Zhu, Narges Razavian
Abstract The cause of Alzheimer’s disease (AD) is poorly understood, so forecasting AD remains a hard task in population health. The failure of clinical trials for AD treatments indicates that AD should be intervened upon at the earlier, pre-symptomatic stages. Developing an explainable method for predicting AD is critical for providing better treatment targets, better clinical trial recruitment, and better clinical care for AD patients. In this paper, we present a novel approach for AD prediction based on Electronic Health Records (EHR) and graph neural networks. Our method improves performance on the sparse data that is common in EHR and obtains state-of-the-art results in predicting AD 12 to 24 months in advance on real-world EHR data, compared to baseline methods. Our approach also provides insight into the structural relationships among diagnoses, lab values, and procedures in EHR, via the graph structures learned by our model.
Tasks
Published 2019-12-08
URL https://arxiv.org/abs/1912.03761v1
PDF https://arxiv.org/pdf/1912.03761v1.pdf
PWC https://paperswithcode.com/paper/graph-neural-network-on-electronic-health
Repo https://github.com/NYUMedML/GNN_for_EHR
Framework pytorch
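
As a minimal illustration of message passing over medical codes (the paper learns its graph structure; the fixed adjacency here is our simplification), one dense graph-convolution step looks like:

```python
import torch

def gcn_step(adj, features, weight):
    """One dense graph-convolution step H' = ReLU(normalised(A + I) H W)
    over a code graph; `adj` is a symmetric (N, N) 0/1 tensor."""
    a_hat = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
    return torch.relu(a_norm @ features @ weight)
```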

Cognitive Knowledge Graph Reasoning for One-shot Relational Learning

Title Cognitive Knowledge Graph Reasoning for One-shot Relational Learning
Authors Zhengxiao Du, Chang Zhou, Ming Ding, Hongxia Yang, Jie Tang
Abstract Inferring new facts from existing knowledge graphs (KG) with explainable reasoning processes is a significant problem and has received much attention recently. However, few studies have focused on relation types unseen in the original KG, given only one or a few instances for training. To bridge this gap, we propose CogKR for one-shot KG reasoning. The one-shot relational learning problem is tackled through two modules: the summary module summarizes the underlying relationship of the given instances, based on which the reasoning module infers the correct answers. Motivated by the dual process theory in cognitive science, in the reasoning module, a cognitive graph is built by iteratively coordinating retrieval (System 1, collecting relevant evidence intuitively) and reasoning (System 2, conducting relational reasoning over collected information). The structural information offered by the cognitive graph enables our model to aggregate pieces of evidence from multiple reasoning paths and explain the reasoning process graphically. Experiments show that CogKR substantially outperforms previous state-of-the-art models on one-shot KG reasoning benchmarks, with relative improvements of 24.3%-29.7% on MRR. The source code is available at https://github.com/THUDM/CogKR.
Tasks Knowledge Graphs, Relational Reasoning
Published 2019-06-13
URL https://arxiv.org/abs/1906.05489v1
PDF https://arxiv.org/pdf/1906.05489v1.pdf
PWC https://paperswithcode.com/paper/cognitive-knowledge-graph-reasoning-for-one
Repo https://github.com/THUDM/CogKR
Framework pytorch
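
A very rough sketch of the one-shot setup: summarise the single support pair as a relation vector and score candidate tails against it. The TransE-style difference summary below is our stand-in, not CogKR's learned summary module:

```python
import torch

def relation_summary(head_emb, tail_emb):
    # Stand-in summary module: represent the support pair's relation as
    # the embedding difference (TransE-style); CogKR learns this instead.
    return tail_emb - head_emb

def score_candidate_tails(query_head_emb, candidate_tail_embs, relation_vec):
    # Higher score = candidate closer to query_head + relation vector.
    target = (query_head_emb + relation_vec).unsqueeze(0)
    return -torch.cdist(target, candidate_tail_embs).squeeze(0)
```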

Understanding Deep Networks via Extremal Perturbations and Smooth Masks

Title Understanding Deep Networks via Extremal Perturbations and Smooth Masks
Authors Ruth Fong, Mandela Patrick, Andrea Vedaldi
Abstract The problem of attribution is concerned with identifying the parts of an input that are responsible for a model’s output. An important family of attribution methods is based on measuring the effect of perturbations applied to the input. In this paper, we discuss some of the shortcomings of existing approaches to perturbation analysis and address them by introducing the concept of extremal perturbations, which are theoretically grounded and interpretable. We also introduce a number of technical innovations to compute extremal perturbations, including a new area constraint and a parametric family of smooth perturbations, which allow us to remove all tunable hyper-parameters from the optimization problem. We analyze the effect of perturbations as a function of their area, demonstrating excellent sensitivity to the spatial properties of the deep neural network under stimulation. We also extend perturbation analysis to the intermediate layers of a network. This application allows us to identify the salient channels necessary for classification, which, when visualized using feature inversion, can be used to elucidate model behavior. Lastly, we introduce TorchRay, an interpretability library built on PyTorch.
Tasks Interpretable Machine Learning
Published 2019-10-18
URL https://arxiv.org/abs/1910.08485v1
PDF https://arxiv.org/pdf/1910.08485v1.pdf
PWC https://paperswithcode.com/paper/understanding-deep-networks-via-extremal
Repo https://github.com/facebookresearch/TorchRay
Framework pytorch
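
A heavily simplified sketch of perturbation-based attribution in the spirit described above: optimise a low-resolution (hence smooth) preservation mask under a soft area penalty. TorchRay implements the actual method with a proper smooth parametric mask family and a hard area constraint; everything below (mask resolution, penalty weight, step count) is an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def attribute(model, image, target_class, area=0.1, steps=200, lr=0.05):
    """Optimise a coarse preservation mask so the masked input keeps the
    target score while covering roughly `area` of the image."""
    mask_logits = torch.full((1, 1, 7, 7), 0.0, requires_grad=True)
    optimizer = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(steps):
        mask = torch.sigmoid(mask_logits)
        mask_up = F.interpolate(mask, size=image.shape[-2:],
                                mode="bilinear", align_corners=False)
        score = model(image * mask_up)[0, target_class]
        area_penalty = (mask.mean() - area) ** 2  # soft area constraint
        loss = -score + 10.0 * area_penalty
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return torch.sigmoid(mask_logits).detach()
```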

Learning Multi-level Dependencies for Robust Word Recognition

Title Learning Multi-level Dependencies for Robust Word Recognition
Authors Zhiwei Wang, Hui Liu, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu
Abstract Robust language processing systems are becoming increasingly important given the recent awareness that brittle machine learning models can be easily broken in the presence of noise. In this paper, we introduce a robust word recognition framework that captures multi-level sequential dependencies in noised sentences. The proposed framework employs a sequence-to-sequence model over the characters of each word, whose output is fed to a word-level bi-directional recurrent neural network. We conduct extensive experiments to verify the effectiveness of the framework. The results show that the proposed framework outperforms state-of-the-art methods by a large margin, and they also suggest that character-level dependencies can play an important role in word recognition.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.09789v1
PDF https://arxiv.org/pdf/1911.09789v1.pdf
PWC https://paperswithcode.com/paper/learning-multi-level-dependencies-for-robust
Repo https://github.com/zw-s-github/MUDE
Framework pytorch
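
The two-level architecture can be pictured as a character-level encoder per word feeding a word-level BiLSTM; the encoder below simplifies the paper's sequence-to-sequence character model to a plain GRU, and all sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class HierarchicalWordRecognizer(nn.Module):
    """Character-level GRU per word feeding a word-level BiLSTM."""
    def __init__(self, n_chars, vocab_size, dim=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, dim)
        self.char_rnn = nn.GRU(dim, dim, batch_first=True)
        self.word_rnn = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * dim, vocab_size)

    def forward(self, char_ids):  # (n_words, chars_per_word) of char ids
        _, h = self.char_rnn(self.char_emb(char_ids))
        word_vecs = h.squeeze(0).unsqueeze(0)      # (1, n_words, dim)
        context, _ = self.word_rnn(word_vecs)      # sentence-level context
        return self.out(context).squeeze(0)        # (n_words, vocab_size)
```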

Merge and Label: A novel neural network architecture for nested NER

Title Merge and Label: A novel neural network architecture for nested NER
Authors Joseph Fisher, Andreas Vlachos
Abstract Named entity recognition (NER) is one of the best-studied tasks in natural language processing. However, most approaches are not capable of handling nested structures, which are common in many applications. In this paper we introduce a novel neural network architecture that first merges tokens and/or entities into entities forming nested structures, and then labels each of them independently. Unlike previous work, our merge-and-label approach predicts real-valued instead of discrete segmentation structures, which allows it to combine word and nested entity embeddings while maintaining differentiability. We evaluate our approach on the ACE 2005 Corpus, where it achieves a state-of-the-art F1 of 74.6, further improved with contextual embeddings (BERT) to 82.4, an overall improvement of close to 8 F1 points over previous approaches trained on the same data. Additionally, we compare it against BiLSTM-CRFs, the dominant approach for flat NER structures, demonstrating that its ability to predict nested structures does not impact performance in simpler cases.
Tasks Entity Embeddings, Named Entity Recognition, Nested Mention Recognition
Published 2019-06-30
URL https://arxiv.org/abs/1907.00464v1
PDF https://arxiv.org/pdf/1907.00464v1.pdf
PWC https://paperswithcode.com/paper/merge-and-label-a-novel-neural-network
Repo https://github.com/fishjh2/merge_label
Framework pytorch
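
The real-valued merge idea can be illustrated with a single differentiable merge step, where each position blends with its right neighbour in proportion to a score in [0, 1]; this parameterisation is our guess at the flavour of the mechanism, not the paper's exact formulation:

```python
import torch

def soft_merge_step(token_embs, merge_scores):
    """One differentiable merge layer over a (T, d) embedding sequence.

    merge_scores is a (T,) tensor in [0, 1]; position i blends with its
    right neighbour in proportion to its score. The last position wraps
    around via torch.roll, a simplification acceptable for a sketch.
    """
    right = torch.roll(token_embs, shifts=-1, dims=0)
    s = merge_scores.unsqueeze(-1)
    return (1 - s) * token_embs + s * 0.5 * (token_embs + right)
```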

Lattice CNNs for Matching Based Chinese Question Answering

Title Lattice CNNs for Matching Based Chinese Question Answering
Authors Yuxuan Lai, Yansong Feng, Xiaohan Yu, Zheng Wang, Kun Xu, Dongyan Zhao
Abstract Short text matching often faces the challenge that there is great word mismatch and expression diversity between the two texts, which is further aggravated in languages like Chinese, where there is no natural space to segment words explicitly. In this paper, we propose a novel lattice-based CNN model (LCNs) to utilize the multi-granularity information inherent in the word lattice while maintaining a strong ability to deal with the noisy information it introduces, for matching-based question answering in Chinese. We conduct extensive experiments on both document-based and knowledge-based question answering tasks, and the results show that the LCN models significantly outperform state-of-the-art matching models and strong baselines by taking advantage of their better ability to distill rich but discriminative information from the word lattice input.
Tasks Question Answering, Text Matching
Published 2019-02-25
URL http://arxiv.org/abs/1902.09087v1
PDF http://arxiv.org/pdf/1902.09087v1.pdf
PWC https://paperswithcode.com/paper/lattice-cnns-for-matching-based-chinese
Repo https://github.com/Erutan-pku/LCN-for-Chinese-QA
Framework tf
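
The word lattice the model consumes is typically obtained by dictionary matching over the character sequence; a minimal sketch (the `max_len` cap and the (start, end, word) edge format are our conventions):

```python
def build_word_lattice(sentence, vocab, max_len=4):
    """Enumerate dictionary words over a character sequence; every hit
    becomes a lattice edge (start, end, word). Single characters are
    always kept so the lattice stays connected."""
    edges = []
    for i in range(len(sentence)):
        for j in range(i + 1, min(i + max_len, len(sentence)) + 1):
            piece = sentence[i:j]
            if j == i + 1 or piece in vocab:
                edges.append((i, j, piece))
    return edges
```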

AMF: Aggregated Mondrian Forests for Online Learning

Title AMF: Aggregated Mondrian Forests for Online Learning
Authors Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
Abstract Random Forests (RF) are among the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such methods comes from a combination of several characteristics: remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness to feature scaling, a reasonable computational cost for training and prediction, and suitability in high-dimensional settings. The most commonly used RF variants, however, are “offline” algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online random forest algorithm based on Mondrian Forests. Using a variant of the Context Tree Weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this enables us to obtain a truly online, parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with several strong baselines on a large number of multi-class classification datasets.
Tasks
Published 2019-06-25
URL https://arxiv.org/abs/1906.10529v1
PDF https://arxiv.org/pdf/1906.10529v1.pdf
PWC https://paperswithcode.com/paper/amf-aggregated-mondrian-forests-for-online
Repo https://github.com/stephanegaiffas/AMF
Framework none
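
The aggregation at the heart of AMF is an exponentially weighted average over experts (here, prunings of a Mondrian tree), which AMF computes exactly via a Context-Tree-Weighting-style recursion; a naive dense version of that aggregation, for intuition only:

```python
import numpy as np

def aggregated_prediction(expert_preds, cumulative_losses, eta=1.0):
    """Exponentially weighted average over experts.

    expert_preds: (n_experts, n_classes) per-expert class probabilities;
    cumulative_losses: (n_experts,) losses suffered so far online.
    """
    weights = np.exp(-eta * cumulative_losses)
    weights /= weights.sum()
    return weights @ expert_preds
```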