Paper Group AWR 412
Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography
Title | Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography |
Authors | Aleksei Tiulpin, Mikko Finnilä, Petri Lehenkari, Heikki J. Nieminen, Simo Saarakkala |
Abstract | Three-dimensional (3D) semi-quantitative grading of pathological features in articular cartilage (AC) offers significant improvements in basic research of osteoarthritis (OA). We have previously developed a 3D protocol for imaging AC and its structures, which includes staining the sample with a contrast agent (phosphotungstic acid, PTA) and subsequent scanning with micro-computed tomography. This protocol was designed to provide X-ray attenuation contrast to visualize AC structure. However, it has one major disadvantage: the loss of contrast at the tidemark (calcified cartilage interface, CCI). An accurate segmentation of the CCI can be very important for understanding the etiology of OA and for ex-vivo evaluation of tidemark condition at early OA stages. In this paper, we present the first application of Deep Learning to PTA-stained osteochondral samples that performs tidemark segmentation in a fully automatic manner. Our method is based on U-Net trained using a combination of binary cross-entropy and soft Jaccard loss. On cross-validation, this approach yielded intersection over union of 0.59, 0.70, 0.79, 0.83 and 0.86 within 15 μm, 30 μm, 45 μm, 60 μm and 75 μm padded zones around the tidemark, respectively. Our code and the dataset, which consists of 35 PTA-stained human AC samples, are made publicly available together with the segmentation masks to facilitate the development of biomedical image segmentation methods. |
Tasks | Semantic Segmentation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05089v1 |
https://arxiv.org/pdf/1907.05089v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-tidemark-segmentation-in |
Repo | https://github.com/MIPT-Oulu/mCTSegmentation |
Framework | none |
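
The abstract above names a combination of binary cross-entropy and soft Jaccard loss for binary segmentation; a minimal sketch of such a loss is shown below. The weighting factor `alpha` and the smoothing constant `eps` are assumptions for illustration, not values taken from the paper or the linked repository.

```python
import torch
import torch.nn.functional as F

def bce_soft_jaccard_loss(logits, target, alpha=0.5, eps=1e-6):
    """Combined BCE + soft Jaccard loss for binary segmentation.
    logits, target: tensors of shape (N, 1, H, W); target in {0, 1}."""
    bce = F.binary_cross_entropy_with_logits(logits, target)
    probs = torch.sigmoid(logits)
    intersection = (probs * target).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3)) - intersection
    soft_jaccard = (intersection + eps) / (union + eps)
    # Penalise 1 - soft IoU; alpha balances the two terms (assumed value).
    return alpha * bce + (1 - alpha) * (1 - soft_jaccard).mean()
```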
Modelling urban networks using Variational Autoencoders
Title | Modelling urban networks using Variational Autoencoders |
Authors | Kira Kempinska, Roberto Murcio |
Abstract | A long-standing question for urban and regional planners pertains to the ability to describe urban patterns quantitatively. Cities’ transport infrastructure, particularly street networks, provides an invaluable source of information about the urban patterns generated by people’s movements and their interactions. With the increasing availability of street network datasets and the advancements in deep learning methods, we are presented with an unprecedented opportunity to push the frontiers of urban modelling towards more data-driven and accurate models of urban forms. In this study, we present our initial work on applying deep generative models to urban street network data to create spatially explicit urban models. We base our work on Variational Autoencoders (VAEs), deep generative models that have recently gained popularity due to their ability to generate realistic images. Initial results show that VAEs are capable of capturing key high-level urban network metrics using low-dimensional vectors and generating new urban forms of complexity matching the cities captured in the street network data. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.06465v1 |
https://arxiv.org/pdf/1905.06465v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-urban-networks-using-variational |
Repo | https://github.com/kirakowalska/vae-urban-network |
Framework | pytorch |
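
For readers unfamiliar with VAEs, the sketch below shows the basic encoder / reparameterisation / decoder structure on which such a model would be built. The input size (64x64 rasterised street-network patches), layer widths and latent dimension are assumptions for illustration, not the authors' configuration.

```python
import torch
import torch.nn as nn

class StreetNetworkVAE(nn.Module):
    """Minimal VAE over 64x64 rasterised street-network patches (assumed input size)."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 512), nn.ReLU())
        self.to_mu = nn.Linear(512, latent_dim)
        self.to_logvar = nn.Linear(512, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, 64 * 64), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.decoder(z).view(-1, 1, 64, 64)
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Standard VAE objective: reconstruction term plus KL divergence."""
    rec = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld
```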
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Title | Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline |
Authors | Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das |
Abstract | Prior work in visual dialog has focused on training deep neural models on VisDial in isolation. Instead, we present an approach to leverage pretraining on related vision-language datasets before transferring to visual dialog. We adapt the recently proposed ViLBERT (Lu et al., 2019) model for multi-turn visually-grounded conversations. Our model is pretrained on the Conceptual Captions and Visual Question Answering datasets, and finetuned on VisDial. Our best single model outperforms prior published work (including model ensembles) by more than 1% absolute on NDCG and MRR. Next, we find that additional finetuning using “dense” annotations in VisDial leads to even higher NDCG – more than 10% over our base model – but hurts MRR – more than 17% below our base model! This highlights a trade-off between the two primary metrics – NDCG and MRR – which we find is due to dense annotations not correlating well with the original ground-truth answers to questions. |
Tasks | Language Modelling, Representation Learning, Visual Dialog |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02379v2 |
https://arxiv.org/pdf/1912.02379v2.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-pretraining-for-visual-dialog-a |
Repo | https://github.com/vmurahari3/visdial-bert |
Framework | pytorch |
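
The NDCG/MRR trade-off discussed above is easier to see with the metric definitions in hand. The sketch below implements standard NDCG and MRR; it is a generic illustration, not the official VisDial evaluation code.

```python
import numpy as np

def mrr(ranks):
    """Mean reciprocal rank given the 1-indexed rank of the ground-truth
    answer for each dialog round."""
    return float(np.mean(1.0 / np.asarray(ranks, dtype=float)))

def ndcg(relevances, ranking, k=None):
    """NDCG of a predicted ranking (candidate indices, best first) given
    dense per-candidate relevance scores."""
    relevances = np.asarray(relevances, dtype=float)
    k = k or len(ranking)
    gains = relevances[np.asarray(ranking[:k])]
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    dcg = float(np.sum(gains * discounts))
    ideal_gains = np.sort(relevances)[::-1][: len(gains)]
    ideal = float(np.sum(ideal_gains * discounts))
    return dcg / ideal if ideal > 0 else 0.0

# MRR only rewards the rank of the single ground-truth answer, while NDCG
# rewards placing all densely-relevant candidates high, hence the trade-off.
print(mrr([1, 3]))                       # ~0.667
print(ndcg([0.5, 0.5, 1.0], [2, 0, 1]))  # 1.0 (ranking matches ideal order)
```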
Cognitive Graph for Multi-Hop Reading Comprehension at Scale
Title | Cognitive Graph for Multi-Hop Reading Comprehension at Scale |
Authors | Ming Ding, Chang Zhou, Qibin Chen, Hongxia Yang, Jie Tang |
Abstract | We propose a new CogQA framework for multi-hop question answering in web-scale documents. Inspired by the dual process theory in cognitive science, the framework gradually builds a cognitive graph in an iterative process by coordinating an implicit extraction module (System 1) and an explicit reasoning module (System 2). While giving accurate answers, our framework further provides explainable reasoning paths. Specifically, our implementation based on BERT and graph neural networks efficiently handles millions of documents for multi-hop reasoning questions in the HotpotQA fullwiki dataset, achieving a winning joint F1 score of 34.9 on the leaderboard, compared to 23.6 for the best competitor. |
Tasks | Multi-Hop Reading Comprehension, Question Answering, Reading Comprehension |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05460v2 |
https://arxiv.org/pdf/1905.05460v2.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-graph-for-multi-hop-reading |
Repo | https://github.com/THUDM/CogQA |
Framework | pytorch |
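
The iterative System 1 / System 2 coordination can be summarised as a retrieval-and-expansion loop over a growing graph. The sketch below is schematic: `retrieve_doc`, `extract_spans` and `reason_over_graph` are placeholder callables standing in for the learned BERT extractor and GNN reasoner, and the control flow is a simplification of the paper's algorithm rather than a faithful reimplementation.

```python
import networkx as nx

def cognitive_qa(question, retrieve_doc, extract_spans, reason_over_graph, max_steps=10):
    """Schematic CogQA-style loop: System 1 extracts candidate entities/spans,
    System 2 reasons over the growing cognitive graph. All callables are
    placeholders for learned modules."""
    graph = nx.DiGraph()
    frontier = list(extract_spans(question, None))  # seed entities from the question
    graph.add_nodes_from(frontier)
    for _ in range(max_steps):
        if not frontier:
            break
        node = frontier.pop(0)
        doc = retrieve_doc(node)
        for hop in extract_spans(question, doc):    # System 1: implicit extraction
            if hop not in graph:
                frontier.append(hop)
            graph.add_edge(node, hop)
        reason_over_graph(graph)                    # System 2: update node representations
    return reason_over_graph(graph)                 # final answer scoring over the graph
```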
ConvLab: Multi-Domain End-to-End Dialog System Platform
Title | ConvLab: Multi-Domain End-to-End Dialog System Platform |
Authors | Sungjin Lee, Qi Zhu, Ryuichi Takanobu, Xiang Li, Yaoqin Zhang, Zheng Zhang, Jinchao Li, Baolin Peng, Xiujun Li, Minlie Huang, Jianfeng Gao |
Abstract | We present ConvLab, an open-source multi-domain end-to-end dialog system platform that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments. ConvLab offers a set of fully annotated datasets and associated pre-trained reference models. As a showcase, we extend the MultiWOZ dataset with user dialog act annotations to train all component models and demonstrate how ConvLab makes it easy to conduct complicated experiments in multi-domain end-to-end dialog settings. |
Tasks | |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08637v1 |
http://arxiv.org/pdf/1904.08637v1.pdf | |
PWC | https://paperswithcode.com/paper/convlab-multi-domain-end-to-end-dialog-system |
Repo | https://github.com/ConvLab/ConvLab |
Framework | pytorch |
Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments
Title | Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments |
Authors | Mert Kayhan, Okan Köpüklü, Mhd Hasan Sarhan, Mehmet Yigitsoy, Abouzar Eslami, Gerhard Rigoll |
Abstract | For many practical problems and applications, it is not feasible to create a vast and accurately labeled dataset, which restricts the application of deep learning in many areas. Semi-supervised learning algorithms aim to improve performance by also leveraging unlabeled data. This is very valuable for the 2D-pose estimation task, where data labeling requires substantial time and is subject to noise. This work investigates whether semi-supervised learning techniques can achieve an acceptable performance level that justifies using these algorithms during training. To this end, a lightweight network architecture is introduced, and mean teacher, virtual adversarial training and pseudo-labeling algorithms are evaluated on 2D-pose estimation for surgical instruments. To make the pseudo-labeling algorithm applicable, we propose a novel confidence measure, total variation. Experimental results show that utilizing semi-supervised learning drastically improves performance on unseen geometries while maintaining high accuracy for seen geometries. On the RMIT benchmark, our lightweight architecture outperforms the state of the art with supervised learning. On the Endovis benchmark, the pseudo-labeling algorithm improves on the supervised baseline, achieving new state-of-the-art performance. |
Tasks | Deep Attention, Pose Estimation |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04618v1 |
https://arxiv.org/pdf/1912.04618v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-attention-based-semi-supervised-2d-pose |
Repo | https://github.com/mertkayhan/SSL-2D-Pose |
Framework | tf |
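
Total variation of a predicted heatmap is straightforward to compute; the sketch below shows one plausible way to use it as a confidence filter for pseudo-labeling. The threshold and the selection direction are assumptions for illustration, not the paper's exact rule.

```python
import numpy as np

def total_variation(heatmap):
    """Anisotropic total variation of a 2D heatmap: sum of absolute
    differences between horizontally and vertically adjacent pixels."""
    dx = np.abs(np.diff(heatmap, axis=1)).sum()
    dy = np.abs(np.diff(heatmap, axis=0)).sum()
    return float(dx + dy)

def select_pseudo_labels(heatmaps, threshold):
    """Keep unlabeled samples whose predicted heatmaps fall below a TV
    threshold (threshold value and direction are assumptions, used only
    to illustrate confidence-based filtering)."""
    return [i for i, h in enumerate(heatmaps) if total_variation(h) < threshold]
```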
Revisiting Self-Supervised Visual Representation Learning
Title | Revisiting Self-Supervised Visual Representation Learning |
Authors | Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer |
Abstract | Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a large body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks. A large number of pretext tasks for self-supervised learning have been studied, but other important aspects, such as the choice of convolutional neural network (CNN) architecture, have not received equal attention. Therefore, we revisit numerous previously proposed self-supervised models, conduct a thorough large-scale study and, as a result, uncover multiple crucial insights. We challenge a number of common practices in self-supervised visual representation learning and observe that standard recipes for CNN design do not always translate to self-supervised representation learning. As part of our study, we drastically boost the performance of previously proposed techniques and outperform previously published state-of-the-art results by a large margin. |
Tasks | Representation Learning, Self-Supervised Image Classification |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.09005v1 |
http://arxiv.org/pdf/1901.09005v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-self-supervised-visual |
Repo | https://github.com/rickyHong/Puzzle-tensorflow-latest-repl |
Framework | tf |
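
As background, the sketch below prepares data for rotation prediction, one widely used pretext task of the kind such studies evaluate; it is a generic illustration rather than the authors' training recipe.

```python
import torch

def make_rotation_batch(images):
    """Create a self-supervised rotation-prediction batch: each image is
    rotated by 0/90/180/270 degrees and labeled with the rotation index.
    images: tensor of shape (N, C, H, W)."""
    rotated, labels = [], []
    for k in range(4):
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    # A CNN is then trained to classify the rotation, which forces it to
    # learn useful visual representations without human labels.
    return torch.cat(rotated), torch.cat(labels)
```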
G2SAT: Learning to Generate SAT Formulas
Title | G2SAT: Learning to Generate SAT Formulas |
Authors | Jiaxuan You, Haoze Wu, Clark Barrett, Raghuram Ramanujan, Jure Leskovec |
Abstract | The Boolean Satisfiability (SAT) problem is the canonical NP-complete problem and is fundamental to computer science, with a wide array of applications in planning, verification, and theorem proving. Developing and evaluating practical SAT solvers relies on extensive empirical testing on a set of real-world benchmark formulas. However, the availability of such real-world SAT formulas is limited. While these benchmark formulas can be augmented with synthetically generated ones, existing approaches for doing so are heavily hand-crafted and fail to simultaneously capture a wide range of characteristics exhibited by real-world SAT instances. In this work, we present G2SAT, the first deep generative framework that learns to generate SAT formulas from a given set of input formulas. Our key insight is that SAT formulas can be transformed into latent bipartite graph representations which we model using a specialized deep generative neural network. We show that G2SAT can generate SAT formulas that closely resemble given real-world SAT instances, as measured by both graph metrics and SAT solver behavior. Further, we show that our synthetic SAT formulas could be used to improve SAT solver performance on real-world benchmarks, which opens up new opportunities for the continued development of SAT solvers and a deeper understanding of their performance. |
Tasks | Automated Theorem Proving |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13445v1 |
https://arxiv.org/pdf/1910.13445v1.pdf | |
PWC | https://paperswithcode.com/paper/g2sat-learning-to-generate-sat-formulas |
Repo | https://github.com/JiaxuanYou/G2SAT |
Framework | pytorch |
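
The bipartite view of a CNF formula is easy to construct explicitly. The sketch below builds a literal-clause graph with networkx from DIMACS-style clauses; the node naming and attributes are assumptions for illustration, not the G2SAT preprocessing code.

```python
import networkx as nx

def cnf_to_bipartite(clauses):
    """Turn a CNF formula (list of clauses, each a list of signed ints in
    DIMACS style, e.g. [[1, -2], [2, 3]]) into a literal-clause bipartite
    graph of the kind G2SAT-style models operate on."""
    g = nx.Graph()
    for i, clause in enumerate(clauses):
        clause_node = f"c{i}"
        g.add_node(clause_node, side="clause")
        for lit in clause:
            lit_node = f"x{abs(lit)}" if lit > 0 else f"~x{abs(lit)}"
            g.add_node(lit_node, side="literal")
            g.add_edge(clause_node, lit_node)
    return g

# Example: (x1 OR NOT x2) AND (x2 OR x3)
graph = cnf_to_bipartite([[1, -2], [2, 3]])
print(graph.number_of_nodes(), graph.number_of_edges())  # 6 nodes, 4 edges
```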
Graph Neural Network on Electronic Health Records for Predicting Alzheimer’s Disease
Title | Graph Neural Network on Electronic Health Records for Predicting Alzheimer’s Disease |
Authors | Weicheng Zhu, Narges Razavian |
Abstract | The cause of Alzheimer’s disease (AD) is poorly understood, so forecasting AD remains a hard task in population health. The failure of clinical trials for AD treatments indicates that AD should be addressed at earlier, pre-symptomatic stages. Developing an explainable method for predicting AD is critical for providing better treatment targets, better clinical trial recruitment, and better clinical care for AD patients. In this paper, we present a novel approach for AD prediction based on Electronic Health Records (EHR) and a graph neural network. Our method improves performance on the sparse data common in EHR and obtains state-of-the-art results, compared to baselines, in predicting AD 12 to 24 months in advance on real-world EHR data. Our approach also provides insight into the structural relationships among different diagnoses, lab values, and procedures in the EHR, as captured by the graph structures learned by our model. |
Tasks | |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03761v1 |
https://arxiv.org/pdf/1912.03761v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-neural-network-on-electronic-health |
Repo | https://github.com/NYUMedML/GNN_for_EHR |
Framework | pytorch |
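
A single propagation step over a graph of medical codes illustrates the general mechanism. The sketch below is a generic graph-convolution-style layer over code embeddings and a co-occurrence adjacency matrix; it is not the authors' architecture.

```python
import torch
import torch.nn as nn

class SimpleGraphLayer(nn.Module):
    """One propagation step over medical-code embeddings: each code
    aggregates its neighbours via a row-normalised adjacency matrix."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, code_embeddings, adjacency):
        # adjacency: (num_codes, num_codes), e.g. co-occurrence of diagnoses,
        # lab values and procedures within patient records (assumed input).
        deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1.0)
        propagated = (adjacency / deg) @ code_embeddings
        return torch.relu(self.linear(propagated))
```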
Cognitive Knowledge Graph Reasoning for One-shot Relational Learning
Title | Cognitive Knowledge Graph Reasoning for One-shot Relational Learning |
Authors | Zhengxiao Du, Chang Zhou, Ming Ding, Hongxia Yang, Jie Tang |
Abstract | Inferring new facts from existing knowledge graphs (KG) with explainable reasoning processes is a significant problem and has received much attention recently. However, few studies have focused on relation types unseen in the original KG, given only one or a few instances for training. To bridge this gap, we propose CogKR for one-shot KG reasoning. The one-shot relational learning problem is tackled through two modules: the summary module summarizes the underlying relationship of the given instances, based on which the reasoning module infers the correct answers. Motivated by the dual process theory in cognitive science, in the reasoning module, a cognitive graph is built by iteratively coordinating retrieval (System 1, collecting relevant evidence intuitively) and reasoning (System 2, conducting relational reasoning over collected information). The structural information offered by the cognitive graph enables our model to aggregate pieces of evidence from multiple reasoning paths and explain the reasoning process graphically. Experiments show that CogKR substantially outperforms previous state-of-the-art models on one-shot KG reasoning benchmarks, with relative improvements of 24.3%-29.7% on MRR. The source code is available at https://github.com/THUDM/CogKR. |
Tasks | Knowledge Graphs, Relational Reasoning |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05489v1 |
https://arxiv.org/pdf/1906.05489v1.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-knowledge-graph-reasoning-for-one |
Repo | https://github.com/THUDM/CogKR |
Framework | pytorch |
Understanding Deep Networks via Extremal Perturbations and Smooth Masks
Title | Understanding Deep Networks via Extremal Perturbations and Smooth Masks |
Authors | Ruth Fong, Mandela Patrick, Andrea Vedaldi |
Abstract | The problem of attribution is concerned with identifying the parts of an input that are responsible for a model’s output. An important family of attribution methods is based on measuring the effect of perturbations applied to the input. In this paper, we discuss some of the shortcomings of existing approaches to perturbation analysis and address them by introducing the concept of extremal perturbations, which are theoretically grounded and interpretable. We also introduce a number of technical innovations to compute extremal perturbations, including a new area constraint and a parametric family of smooth perturbations, which allow us to remove all tunable hyper-parameters from the optimization problem. We analyze the effect of perturbations as a function of their area, demonstrating excellent sensitivity to the spatial properties of the deep neural network under stimulation. We also extend perturbation analysis to the intermediate layers of a network. This application allows us to identify the salient channels necessary for classification, which, when visualized using feature inversion, can be used to elucidate model behavior. Lastly, we introduce TorchRay, an interpretability library built on PyTorch. |
Tasks | Interpretable Machine Learning |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08485v1 |
https://arxiv.org/pdf/1910.08485v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-deep-networks-via-extremal |
Repo | https://github.com/facebookresearch/TorchRay |
Framework | pytorch |
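
The core idea of optimising a mask under an area constraint can be sketched in a few lines of plain PyTorch. The code below omits the smooth parametric masks and the exact area constraint introduced in the paper, and it is not the TorchRay API; the penalty weight and step count are assumptions.

```python
import torch

def extremal_perturbation_sketch(model, image, target, area=0.1, steps=200, lr=0.05):
    """Simplified sketch of the extremal-perturbation idea: optimise a mask
    that keeps roughly `area` of the pixels while maximising the target
    class score of the masked input. image: tensor of shape (C, H, W)."""
    mask = torch.full(image.shape[-2:], 0.5, requires_grad=True)
    optimizer = torch.optim.Adam([mask], lr=lr)
    for _ in range(steps):
        m = mask.clamp(0, 1)
        perturbed = image * m                       # broadcast mask over channels
        score = model(perturbed.unsqueeze(0))[0, target]
        area_penalty = (m.mean() - area) ** 2       # soft stand-in for the area constraint
        loss = -score + 10.0 * area_penalty
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return mask.detach().clamp(0, 1)
```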
Learning Multi-level Dependencies for Robust Word Recognition
Title | Learning Multi-level Dependencies for Robust Word Recognition |
Authors | Zhiwei Wang, Hui Liu, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu |
Abstract | Robust language processing systems are becoming increasingly important given the recent awareness of dangerous situations in which brittle machine learning models can be easily broken in the presence of noise. In this paper, we introduce a robust word recognition framework that captures multi-level sequential dependencies in noised sentences. The proposed framework employs a sequence-to-sequence model over the characters of each word, whose output is fed into a word-level bi-directional recurrent neural network. We conduct extensive experiments to verify the effectiveness of the framework. The results show that the proposed framework outperforms state-of-the-art methods by a large margin, and they also suggest that character-level dependencies can play an important role in word recognition. |
Tasks | |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09789v1 |
https://arxiv.org/pdf/1911.09789v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-multi-level-dependencies-for-robust |
Repo | https://github.com/zw-s-github/MUDE |
Framework | pytorch |
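
The two-level dependency structure described above maps naturally onto a character-level encoder feeding a word-level bidirectional RNN. The sketch below is a generic PyTorch rendering of that idea; the dimensions, RNN types and output head are assumptions, not the released MUDE configuration.

```python
import torch
import torch.nn as nn

class CharWordRecognizer(nn.Module):
    """Character-level GRU encodes each (possibly noised) word; a word-level
    bidirectional GRU adds sentence context before predicting the clean word."""
    def __init__(self, n_chars, vocab_size, char_dim=64, word_dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_rnn = nn.GRU(char_dim, word_dim, batch_first=True)
        self.word_rnn = nn.GRU(word_dim, word_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * word_dim, vocab_size)

    def forward(self, char_ids):
        # char_ids: (batch, n_words, n_chars) character indices per word
        b, w, c = char_ids.shape
        chars = self.char_emb(char_ids.view(b * w, c))
        _, word_vecs = self.char_rnn(chars)            # (1, b*w, word_dim)
        word_vecs = word_vecs.squeeze(0).view(b, w, -1)
        context, _ = self.word_rnn(word_vecs)          # (b, w, 2*word_dim)
        return self.out(context)                       # logits over the clean vocabulary
```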
Merge and Label: A novel neural network architecture for nested NER
Title | Merge and Label: A novel neural network architecture for nested NER |
Authors | Joseph Fisher, Andreas Vlachos |
Abstract | Named entity recognition (NER) is one of the best studied tasks in natural language processing. However, most approaches are not capable of handling nested structures which are common in many applications. In this paper we introduce a novel neural network architecture that first merges tokens and/or entities into entities forming nested structures, and then labels each of them independently. Unlike previous work, our merge and label approach predicts real-valued instead of discrete segmentation structures, which allow it to combine word and nested entity embeddings while maintaining differentiability. We evaluate our approach using the ACE 2005 Corpus, where it achieves state-of-the-art F1 of 74.6, further improved with contextual embeddings (BERT) to 82.4, an overall improvement of close to 8 F1 points over previous approaches trained on the same data. Additionally we compare it against BiLSTM-CRFs, the dominant approach for flat NER structures, demonstrating that its ability to predict nested structures does not impact performance in simpler cases. |
Tasks | Entity Embeddings, Named Entity Recognition, Nested Mention Recognition |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00464v1 |
https://arxiv.org/pdf/1907.00464v1.pdf | |
PWC | https://paperswithcode.com/paper/merge-and-label-a-novel-neural-network |
Repo | https://github.com/fishjh2/merge_label |
Framework | pytorch |
Lattice CNNs for Matching Based Chinese Question Answering
Title | Lattice CNNs for Matching Based Chinese Question Answering |
Authors | Yuxuan Lai, Yansong Feng, Xiaohan Yu, Zheng Wang, Kun Xu, Dongyan Zhao |
Abstract | Short text matching often faces the challenge that there is great word mismatch and expression diversity between the two texts, which is further aggravated in languages like Chinese, where there is no natural space to segment words explicitly. In this paper, we propose a novel lattice-based CNN model (LCNs) to utilize the multi-granularity information inherent in the word lattice while maintaining a strong ability to deal with the introduced noisy information for matching-based question answering in Chinese. We conduct extensive experiments on both document-based question answering and knowledge-based question answering tasks, and experimental results show that the LCNs models can significantly outperform the state-of-the-art matching models and strong baselines by taking advantage of their better ability to distill rich but discriminative information from the word lattice input. |
Tasks | Question Answering, Text Matching |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09087v1 |
http://arxiv.org/pdf/1902.09087v1.pdf | |
PWC | https://paperswithcode.com/paper/lattice-cnns-for-matching-based-chinese |
Repo | https://github.com/Erutan-pku/LCN-for-Chinese-QA |
Framework | tf |
AMF: Aggregated Mondrian Forests for Online Learning
Title | AMF: Aggregated Mondrian Forests for Online Learning |
Authors | Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet |
Abstract | Random Forests (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such methods comes from a combination of several characteristics: a remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants, however, are “offline” algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online random forest algorithm based on Mondrian Forests. Using a variant of the Context Tree Weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this enables us to obtain a truly online, parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with respect to several strong baselines on a large number of datasets for multi-class classification. |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10529v1 |
https://arxiv.org/pdf/1906.10529v1.pdf | |
PWC | https://paperswithcode.com/paper/amf-aggregated-mondrian-forests-for-online |
Repo | https://github.com/stephanegaiffas/AMF |
Framework | none |