January 31, 2020

2886 words 14 mins read

Paper Group AWR 375

Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network. HAKE: Human Activity Knowledge Engine. Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare. EQuANt (Enhanced Question Answer Network). Utilizing BERT for Aspect-Based Se …

Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network

Title Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network
Authors Shengyu Zhao, Tingfung Lau, Ji Luo, Eric I-Chao Chang, Yan Xu
Abstract 3D medical image registration is of great clinical importance. However, supervised learning methods require a large amount of accurately annotated corresponding control points (or morphing), which are very difficult to obtain. Unsupervised learning methods ease the burden of manual annotation by exploiting unlabeled data without supervision. In this paper, we propose a new unsupervised learning method using convolutional neural networks under an end-to-end framework, Volume Tweening Network (VTN), for 3D medical image registration. We propose three innovative technical components: (1) an end-to-end cascading scheme that resolves large displacement; (2) an efficient integration of an affine registration network; and (3) an additional invertibility loss that encourages backward consistency. Experiments demonstrate that our algorithm is 880x faster (or 3.3x faster without GPU acceleration) than traditional optimization-based methods and achieves state-of-the-art performance in medical image registration.
Tasks Image Registration, Medical Image Registration
Published 2019-02-13
URL https://arxiv.org/abs/1902.05020v3
PDF https://arxiv.org/pdf/1902.05020v3.pdf
PWC https://paperswithcode.com/paper/unsupervised-3d-end-to-end-medical-image
Repo https://github.com/microsoft/Recursive-Cascaded-Networks
Framework tf
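
A minimal 2D NumPy/SciPy sketch of the invertibility (backward-consistency) idea from the VTN abstract: if the forward and backward displacement fields are true inverses, their composition is roughly zero displacement everywhere. This is illustrative only, with hypothetical helper names, and is not the authors' 3D TensorFlow implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp(field, flow):
    # Sample a 2D scalar field at positions displaced by `flow` (flow[..., 0]=dy, flow[..., 1]=dx).
    h, w = field.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + flow[..., 0], xs + flow[..., 1]])
    return map_coordinates(field, coords, order=1, mode="nearest")

def invertibility_loss(flow_fwd, flow_bwd):
    # If flow_fwd and flow_bwd are inverses, flow_bwd(x) + flow_fwd(x + flow_bwd(x)) ~ 0.
    comp_y = flow_bwd[..., 0] + warp(flow_fwd[..., 0], flow_bwd)
    comp_x = flow_bwd[..., 1] + warp(flow_fwd[..., 1], flow_bwd)
    return float(np.mean(comp_y ** 2 + comp_x ** 2))

# Sanity check: a constant shift and its negation compose to (near) zero loss.
fwd = np.zeros((32, 32, 2)); fwd[..., 1] = 3.0
bwd = -fwd
print(invertibility_loss(fwd, bwd))  # ~0
```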

HAKE: Human Activity Knowledge Engine

Title HAKE: Human Activity Knowledge Engine
Authors Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu
Abstract Human activity understanding is crucial for building automatic intelligent systems. With the help of deep learning, activity understanding has made huge progress recently, but challenges such as imbalanced data distribution, action ambiguity, and complex visual patterns still remain. To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on human body part states. Upon existing activity datasets, we annotate the part states of all the active persons in all images, thus establishing the relationship between instance activity and body part states. Furthermore, we propose a HAKE-based part state recognition model with a knowledge extractor named Activity2Vec and a corresponding part-state-based reasoning network. With HAKE, our method can alleviate the learning difficulty brought by the long-tail data distribution and bring in interpretability. HAKE currently has more than 7 million part state annotations and is still under construction. We first validate our approach on a subset of HAKE in this preliminary paper, where we show a 7.2 mAP performance improvement on Human-Object Interaction recognition, and a 12.38 mAP improvement on the one-shot subsets.
Tasks Human-Object Interaction Detection
Published 2019-04-13
URL https://arxiv.org/abs/1904.06539v5
PDF https://arxiv.org/pdf/1904.06539v5.pdf
PWC https://paperswithcode.com/paper/hake-human-activity-knowledge-engine
Repo https://github.com/DirtyHarryLYL/HAKE
Framework none
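
To make the instance-activity/part-state link concrete, here is a hypothetical annotation record in the spirit of HAKE. The field names and part-state phrases are invented for illustration and do not reflect the repository's actual schema.

```python
# One active person in one image: the instance-level activity is grounded
# in the states of individual body parts (values below are made up).
annotation = {
    "image_id": "hico_00001234",
    "person_bbox": [48, 60, 210, 400],
    "activity": "ride bicycle",
    "part_states": {
        "hip": "sit on something",
        "hand_right": "hold something",
        "hand_left": "hold something",
        "foot_right": "step on something",
        "foot_left": "step on something",
    },
}
```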

Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare

Title Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare
Authors Scott L. Fleming, Kuhan Jeyapragasan, Tony Duan, Daisy Ding, Saurabh Gombar, Nigam Shah, Emma Brunskill
Abstract There is an emerging trend in the reinforcement learning for healthcare literature. In order to prepare longitudinal, irregularly sampled, clinical datasets for reinforcement learning algorithms, many researchers will resample the time series data to short, regular intervals and use last-observation-carried-forward (LOCF) imputation to fill in these gaps. Typically, they will not maintain any explicit information about which values were imputed. In this work, we (1) call attention to this practice and discuss its potential implications; (2) propose an alternative representation of the patient state that addresses some of these issues; and (3) demonstrate in a novel but representative clinical dataset that our alternative representation yields consistently better results for achieving optimal control, as measured by off-policy policy evaluation, compared to representations that do not incorporate missingness information.
Tasks Imputation, Time Series
Published 2019-11-16
URL https://arxiv.org/abs/1911.07084v1
PDF https://arxiv.org/pdf/1911.07084v1.pdf
PWC https://paperswithcode.com/paper/missingness-as-stability-understanding-the
Repo https://github.com/scottfleming/rl-missingness
Framework none
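
A small pandas sketch of the two representations contrasted above: plain LOCF imputation, which discards the missingness pattern, versus LOCF plus explicit observation indicators appended to the patient state. Toy data and column names; this is not the paper's code.

```python
import numpy as np
import pandas as pd

# Toy irregularly observed vitals resampled to hourly bins (hypothetical columns).
vitals = pd.DataFrame(
    {"heart_rate": [80.0, np.nan, np.nan, 95.0], "lactate": [np.nan, 2.1, np.nan, np.nan]},
    index=pd.date_range("2019-11-16", periods=4, freq="H"),
)

locf_only = vitals.ffill()                                  # common practice: imputed, mask discarded
mask = vitals.notna().astype(int).add_suffix("_observed")   # which entries were actually measured
state_with_missingness = pd.concat([locf_only, mask], axis=1)
print(state_with_missingness)
```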

EQuANt (Enhanced Question Answer Network)

Title EQuANt (Enhanced Question Answer Network)
Authors François-Xavier Aubet, Dominic Danks, Yuchen Zhu
Abstract Machine Reading Comprehension (MRC) is an important topic in the domain of automated question answering and in natural language processing more generally. Since the release of the SQuAD 1.1 and SQuAD 2 datasets, progress in the field has been particularly significant, with current state-of-the-art models now exhibiting near-human performance at both answering well-posed questions and detecting questions which are unanswerable given a corresponding context. In this work, we present the Enhanced Question Answer Network (EQuANt), an MRC model which extends the successful QANet architecture of Yu et al. to cope with unanswerable questions. By training and evaluating EQuANt on SQuAD 2, we show that it is indeed possible to extend QANet to the unanswerable domain. We achieve results which are close to 2 times better than our chosen baseline, obtained by evaluating a lightweight version of the original QANet architecture on SQuAD 2. In addition, we report that the performance of EQuANt on SQuAD 1.1 after being trained on SQuAD 2 exceeds that of our lightweight QANet architecture trained and evaluated on SQuAD 1.1, demonstrating the utility of multi-task learning in the MRC context.
Tasks Machine Reading Comprehension, Multi-Task Learning, Question Answering, Reading Comprehension
Published 2019-06-24
URL https://arxiv.org/abs/1907.00708v2
PDF https://arxiv.org/pdf/1907.00708v2.pdf
PWC https://paperswithcode.com/paper/equant-enhanced-question-answer-network
Repo https://github.com/Francois-Aubet/EQuANt
Framework tf
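
A schematic of how an extractive reader can be extended to the unanswerable setting: gate the usual span prediction with an answerability probability and return the empty string when the model abstains. This is a generic illustration of the idea, not EQuANt's actual output head or thresholding.

```python
import numpy as np

def predict(span_scores, p_answerable, context_tokens, threshold=0.5):
    # span_scores[i, j]: score of the answer span from token i to token j (j >= i).
    # p_answerable: probability (e.g. from a sigmoid head) that the question has an answer.
    if p_answerable < threshold:
        return ""                                  # SQuAD 2 convention: empty string = no answer
    start, end = np.unravel_index(np.argmax(span_scores), span_scores.shape)
    return " ".join(context_tokens[start:end + 1])

tokens = "the eiffel tower is in paris".split()
scores = np.zeros((len(tokens), len(tokens))); scores[5, 5] = 1.0
print(predict(scores, 0.9, tokens))   # -> "paris"
print(predict(scores, 0.1, tokens))   # -> "" (abstain)
```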

Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

Title Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
Authors Chi Sun, Luyao Huang, Xipeng Qiu
Abstract Aspect-based sentiment analysis (ABSA), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA). In this paper, we construct an auxiliary sentence from the aspect and convert ABSA to a sentence-pair classification task, such as question answering (QA) and natural language inference (NLI). We fine-tune the pre-trained model from BERT and achieve new state-of-the-art results on SentiHood and SemEval-2014 Task 4 datasets.
Tasks Aspect-Based Sentiment Analysis, Natural Language Inference, Question Answering, Sentiment Analysis
Published 2019-03-22
URL http://arxiv.org/abs/1903.09588v1
PDF http://arxiv.org/pdf/1903.09588v1.pdf
PWC https://paperswithcode.com/paper/utilizing-bert-for-aspect-based-sentiment
Repo https://github.com/HSLCY/ABSA-BERT-pair
Framework pytorch
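
A sketch of the auxiliary-sentence construction that turns ABSA into sentence-pair classification. The templates below paraphrase two of the paper's variants (a QA-style question and a binary pseudo-sentence per candidate label); the exact wording in the paper may differ.

```python
def build_pairs(sentence, target, aspect, labels=("positive", "negative", "none")):
    # "QA-M"-style: one auxiliary question per (target, aspect); the model classifies the sentiment.
    # "NLI-B"-style: one auxiliary pseudo-sentence per candidate label; the model answers yes/no,
    # turning ABSA into binary sentence-pair classification.
    qa_pair = (sentence, f"what do you think of the {aspect} of {target} ?")
    binary_pairs = [(sentence, f"{target} - {aspect} - {label}") for label in labels]
    return qa_pair, binary_pairs

qa_pair, binary_pairs = build_pairs(
    "location-1 is in a safe neighbourhood but transport is poor", "location-1", "safety"
)
print(qa_pair)
for pair in binary_pairs:
    print(pair)
```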

Analytical Moment Regularizer for Gaussian Robust Networks

Title Analytical Moment Regularizer for Gaussian Robust Networks
Authors Modar Alfadly, Adel Bibi, Bernard Ghanem
Abstract Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit behaviours that are not yet well understood. One puzzling behaviour is the subtle sensitivity of DNNs to various noise attacks. This nuisance has strengthened the line of research around developing and training noise-robust networks. In this work, we propose a new training regularizer that aims to minimize the probabilistic expected training loss of a DNN subject to a generic Gaussian input. We provide an efficient and simple approach to approximate such a regularizer for arbitrary deep networks. This is done by leveraging the analytic expression for the output mean of a shallow neural network, avoiding the need for memory- and computation-intensive data augmentation. We conduct extensive experiments with LeNet and AlexNet on various datasets including MNIST, CIFAR10, and CIFAR100, demonstrating the effectiveness of our proposed regularizer. In particular, we show that networks trained with the proposed regularizer benefit from a boost in robustness equivalent to performing 3-21 folds of data augmentation.
Tasks Data Augmentation
Published 2019-04-24
URL http://arxiv.org/abs/1904.11005v1
PDF http://arxiv.org/pdf/1904.11005v1.pdf
PWC https://paperswithcode.com/paper/190411005
Repo https://github.com/ModarTensai/gaussian-regularizer
Framework pytorch
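
The key building block referenced above is the closed-form first moment of an affine-then-ReLU layer under Gaussian input: for a Gaussian pre-activation with mean m and standard deviation s, E[ReLU(z)] = m·Φ(m/s) + s·φ(m/s). Below is a sketch of that expression with a Monte-Carlo check; it is not the paper's full regularizer for deep networks.

```python
import numpy as np
from scipy.stats import norm

def expected_relu_output(W, b, mu, sigma):
    # E[ReLU(W x + b)] for x ~ N(mu, sigma^2 I).
    # Each pre-activation z_i = w_i . x + b_i is Gaussian with mean m_i and std s_i,
    # and E[ReLU(z_i)] = m_i * Phi(m_i / s_i) + s_i * phi(m_i / s_i).
    m = W @ mu + b
    s = sigma * np.linalg.norm(W, axis=1)
    return m * norm.cdf(m / s) + s * norm.pdf(m / s)

# Monte-Carlo check of the closed form on a random layer.
rng = np.random.default_rng(0)
W, b, mu, sigma = rng.normal(size=(4, 8)), rng.normal(size=4), rng.normal(size=8), 0.3
x = mu + sigma * rng.normal(size=(200_000, 8))
print(np.maximum(x @ W.T + b, 0).mean(axis=0))   # empirical
print(expected_relu_output(W, b, mu, sigma))     # analytic
```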

Optimal Multi-view Correction of Local Affine Frames

Title Optimal Multi-view Correction of Local Affine Frames
Authors Ivan Eichhardt, Daniel Barath
Abstract The proposed technique requires the epipolar geometry to be pre-estimated between each image pair. It exploits the constraints which the camera movement implies in order to apply a closed-form correction to the parameters of the input affinities. Also, it is shown that the rotations and scales obtained by partially affine-covariant detectors, e.g., AKAZE or SIFT, can be completed into full affine frames by the proposed algorithm. It is validated both in synthetic experiments and on publicly available real-world datasets that the method always improves the output of the evaluated affine-covariant feature detectors. As a by-product, these detectors are compared and the ones obtaining the most accurate affine frames are reported. To demonstrate applicability, we show that the proposed technique, used as a pre-processing step, improves the accuracy of camera-rig pose estimation, surface normal estimation, and homography estimation.
Tasks Homography Estimation, Pose Estimation
Published 2019-05-01
URL http://arxiv.org/abs/1905.00519v1
PDF http://arxiv.org/pdf/1905.00519v1.pdf
PWC https://paperswithcode.com/paper/optimal-multi-view-correction-of-local-affine
Repo https://github.com/eivan/multiview-LAFs-correction
Framework none

Simple BERT Models for Relation Extraction and Semantic Role Labeling

Title Simple BERT Models for Relation Extraction and Semantic Role Labeling
Authors Peng Shi, Jimmy Lin
Abstract We present simple BERT-based models for relation extraction and semantic role labeling. In recent years, state-of-the-art performance has been achieved using neural models by incorporating lexical and syntactic features such as part-of-speech tags and dependency trees. In this paper, extensive experiments on datasets for these two tasks show that without using any external features, a simple BERT-based model can achieve state-of-the-art performance. To our knowledge, we are the first to successfully apply BERT in this manner. Our models provide strong baselines for future research.
Tasks Relation Extraction, Semantic Role Labeling
Published 2019-04-10
URL http://arxiv.org/abs/1904.05255v1
PDF http://arxiv.org/pdf/1904.05255v1.pdf
PWC https://paperswithcode.com/paper/simple-bert-models-for-relation-extraction
Repo https://github.com/phueb/BabyBertSRL
Framework pytorch

Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds

Title Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds
Authors Huan Lei, Naveed Akhtar, Ajmal Mian
Abstract We propose a spherical kernel for efficient graph convolution of 3D point clouds. Our metric-based kernels systematically quantize the local 3D space to identify distinctive geometric relationships in the data. Similar to the regular grid CNN kernels, the spherical kernel maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning. The proposed kernel is applied to graph neural networks without edge-dependent filter generation, making it computationally attractive for large point clouds. In our graph networks, each vertex is associated with a single point location and edges connect the neighborhood points within a defined range. The graph gets coarsened in the network with farthest point sampling. Analogous to the standard CNNs, we define pooling and unpooling operations for our network. We demonstrate the effectiveness of the proposed spherical kernel with graph neural networks for point cloud classification and semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS datasets. The source code and the trained models can be downloaded from https://github.com/hlei-ziyan/SPH3D-GCN.
Tasks 3D Instance Segmentation, 3D Object Classification, 3D Part Segmentation, Semantic Segmentation
Published 2019-09-20
URL https://arxiv.org/abs/1909.09287v2
PDF https://arxiv.org/pdf/1909.09287v2.pdf
PWC https://paperswithcode.com/paper/spherical-kernel-for-efficient-graph
Repo https://github.com/hlei-ziyan/SPH3D-GCN
Framework tf
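
An illustrative NumPy sketch of the metric-based quantization: each neighbor's relative position is assigned to one bin of an (azimuth × elevation × radius) partition of the local sphere, and that bin index would select the kernel weight. The bin counts and radius here are made up; the real CUDA/TensorFlow operators live in the linked repository.

```python
import numpy as np

def spherical_bin_indices(rel_xyz, n_azim=8, n_elev=2, n_rad=3, max_radius=0.2):
    # rel_xyz: (N, 3) neighbor coordinates relative to the center vertex.
    x, y, z = rel_xyz[:, 0], rel_xyz[:, 1], rel_xyz[:, 2]
    r = np.linalg.norm(rel_xyz, axis=1) + 1e-12
    azim = (np.arctan2(y, x) + np.pi) / (2 * np.pi)          # normalized azimuth in [0, 1)
    elev = (z / r + 1.0) / 2.0                               # normalized elevation in [0, 1]
    a = np.minimum((azim * n_azim).astype(int), n_azim - 1)
    e = np.minimum((elev * n_elev).astype(int), n_elev - 1)
    d = np.minimum((r / max_radius * n_rad).astype(int), n_rad - 1)
    return (d * n_elev + e) * n_azim + a                     # flat kernel-weight index per neighbor

pts = np.array([[0.05, 0.0, 0.0], [0.0, 0.1, 0.05], [-0.1, -0.1, 0.1]])
print(spherical_bin_indices(pts))
```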

Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization

Title Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization
Authors Chao Huang, Haojie Liu, Tong Chen, Qiu Shen, Zhan Ma
Abstract We propose a MultiScale AutoEncoder (MSAE) based extreme image compression framework to offer visually pleasing reconstruction at a very low bitrate. Our method leverages the “priors” at different resolution scales to improve compression efficiency, and also employs a generative adversarial network (GAN) with multiscale discriminators to perform end-to-end trainable rate-distortion optimization. We compare the perceptual quality of our reconstructions with traditional compression algorithms, namely the High-Efficiency Video Coding (HEVC) Intra Profile and JPEG2000, on the public Cityscapes and ADE20K datasets, demonstrating significant subjective quality improvement.
Tasks Image Compression
Published 2019-04-08
URL https://arxiv.org/abs/1904.03851v2
PDF https://arxiv.org/pdf/1904.03851v2.pdf
PWC https://paperswithcode.com/paper/extreme-image-compression-via-multiscale
Repo https://github.com/WikiChao/Extreme-Image-Compression
Framework pytorch

No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement

Title No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement
Authors Jia Yan, Jie Li, Xin Fu
Abstract No-reference image quality assessment (NR-IQA) aims to measure image quality without a reference image. However, contrast distortion has been overlooked in current NR-IQA research. In this paper, we propose a very simple but effective metric for predicting the quality of contrast-altered images, based on the fact that a high-contrast image is often more similar to its contrast-enhanced version. Specifically, we first generate an enhanced image through histogram equalization. We then calculate the similarity between the original image and the enhanced one using the structural similarity index (SSIM) as the first feature. Further, we calculate the histogram-based entropies and cross entropies of the original image and the enhanced one, yielding four more features. Finally, we learn a regression module that fuses the aforementioned five features to infer the quality score. Experiments on four publicly available databases validate the superiority and efficiency of the proposed technique.
Tasks Image Quality Assessment, No-Reference Image Quality Assessment
Published 2019-04-18
URL http://arxiv.org/abs/1904.08879v1
PDF http://arxiv.org/pdf/1904.08879v1.pdf
PWC https://paperswithcode.com/paper/no-reference-quality-assessment-of-contrast
Repo https://github.com/steffensbola/blind_iqa_contrast
Framework none
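
A sketch of the five-feature extraction described in the abstract, assuming a grayscale image with values in [0, 1]: SSIM between the image and its histogram-equalized version, the two histogram entropies, and the two cross entropies. This is my reading of the feature set; the regression stage is omitted and the original code may compute the histograms differently.

```python
import numpy as np
from skimage import exposure
from skimage.metrics import structural_similarity

def contrast_features(img, bins=256):
    # img: 2D grayscale array with values in [0, 1].
    enhanced = exposure.equalize_hist(img)
    ssim = structural_similarity(img, enhanced, data_range=1.0)

    def hist(x):
        h, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
        h = h.astype(float) + 1e-12                # avoid log(0)
        return h / h.sum()

    p, q = hist(img), hist(enhanced)
    entropy_p = -(p * np.log2(p)).sum()
    entropy_q = -(q * np.log2(q)).sum()
    cross_pq = -(p * np.log2(q)).sum()
    cross_qp = -(q * np.log2(p)).sum()
    return [ssim, entropy_p, entropy_q, cross_pq, cross_qp]
```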

Language Models as Knowledge Bases?

Title Language Models as Knowledge Bases?
Authors Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
Abstract Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as “fill-in-the-blank” cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.
Tasks Language Modelling, Open-Domain Question Answering, Question Answering
Published 2019-09-03
URL https://arxiv.org/abs/1909.01066v2
PDF https://arxiv.org/pdf/1909.01066v2.pdf
PWC https://paperswithcode.com/paper/language-models-as-knowledge-bases
Repo https://github.com/facebookresearch/LAMA
Framework pytorch
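
The cloze-style probing described above is easy to reproduce with any masked language model. A minimal example using the Hugging Face `fill-mask` pipeline (not the LAMA codebase itself, which adds the relation templates and the evaluation protocol); the query is the paper's own "Dante was born in [MASK]." example.

```python
from transformers import pipeline

# Query BERT with a "fill-in-the-blank" cloze statement, as in the paper's probing setup.
unmasker = pipeline("fill-mask", model="bert-base-cased")
for pred in unmasker("Dante was born in [MASK].", top_k=3):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```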

And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Title And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Authors Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou
Abstract In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than its weights. The principle of our approach is that it minimizes the loss reconstruction error for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high performing ResNet-50 model to a memory size of 5MB (20x compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification and by compressing a Mask R-CNN with a 26x factor.
Tasks Object Classification, Quantization
Published 2019-07-12
URL https://arxiv.org/abs/1907.05686v4
PDF https://arxiv.org/pdf/1907.05686v4.pdf
PWC https://paperswithcode.com/paper/and-the-bit-goes-down-revisiting-the
Repo https://github.com/facebookresearch/kill-the-bits
Framework pytorch
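
For intuition, here is a naive product-quantization sketch that clusters weight blocks with k-means. Note the important difference flagged in the comments: the paper fits the codebook to minimize the reconstruction error of layer outputs on unlabelled in-domain inputs (and then fine-tunes the codewords), not the weight reconstruction error shown here.

```python
import numpy as np
from sklearn.cluster import KMeans

def product_quantize(W, block=4, k=256, seed=0):
    # Split each row of W into contiguous blocks and cluster all blocks with k-means.
    # NOTE: this naive version minimizes *weight* reconstruction error; the paper instead
    # minimizes the reconstruction error of layer *outputs* (activations) on in-domain inputs.
    out_dim, in_dim = W.shape
    assert in_dim % block == 0
    blocks = W.reshape(out_dim * in_dim // block, block)
    km = KMeans(n_clusters=k, n_init=4, random_state=seed).fit(blocks)
    codebook, codes = km.cluster_centers_, km.labels_    # k x block floats + one code per block
    W_hat = codebook[codes].reshape(out_dim, in_dim)
    return W_hat, codebook, codes

W = np.random.default_rng(0).normal(size=(256, 64))
W_hat, codebook, codes = product_quantize(W, block=4, k=128)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))     # relative weight reconstruction error
```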

Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions

Title Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions
Authors Yasin Yazıcı, Bruno Lecouat, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, Vijay Chandrasekhar
Abstract We propose a GAN design which models multiple distributions effectively and discovers their commonalities and particularities. Each data distribution is modeled with a mixture of $K$ generator distributions. As the generators are partially shared between the modeling of different true data distributions, the shared ones capture the commonality of the distributions, while the non-shared ones capture their unique aspects. We show the effectiveness of our method on various datasets (MNIST, Fashion MNIST, CIFAR-10, Omniglot, CelebA) with compelling results.
Tasks Omniglot
Published 2019-02-09
URL http://arxiv.org/abs/1902.03444v1
PDF http://arxiv.org/pdf/1902.03444v1.pdf
PWC https://paperswithcode.com/paper/venn-gan-discovering-commonalities-and
Repo https://github.com/yasinyazici/Venn_GAN
Framework tf
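
A toy illustration of the sharing pattern described above for two observed distributions and K = 3 generators: one generator is shared (the commonality) and each distribution additionally has a private generator (the assignment and mixture weights below are chosen purely for illustration).

```python
import numpy as np

# Rows: observed distributions p and q. Columns: generators g0, g1, g2.
# g0 is shared between p and q; g1 is private to p; g2 is private to q.
assignment = np.array([[1, 1, 0],
                       [1, 0, 1]])

# Each observed distribution is modeled as a uniform mixture over its assigned generators.
mixture_weights = assignment / assignment.sum(axis=1, keepdims=True)
print(mixture_weights)   # [[0.5 0.5 0. ]
                         #  [0.5 0.  0.5]]
```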

CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke

Title CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke
Authors Hao Yang, Weijian Huang, Kehan Qi, Cheng Li, Xinfeng Liu, Meiyun Wang, Hairong Zheng, Shanshan Wang
Abstract Segmenting stroke lesions from T1-weighted MR images is of great value for large-scale stroke rehabilitation neuroimaging analyses. Nevertheless, the task poses great challenges, such as the large range of stroke lesion scales and tissue intensity similarity. Encoder-decoder convolutional neural networks, despite their great achievements in medical image segmentation, may fail to address these challenges due to insufficient use of multi-scale features and context information. To address these challenges, this paper proposes a Cross-Level fusion and Context Inference Network (CLCI-Net) for chronic stroke lesion segmentation from T1-weighted MR images. Specifically, a Cross-Level feature Fusion (CLF) strategy was developed to make full use of features at different scales across different levels; extending Atrous Spatial Pyramid Pooling (ASPP) with CLF enriches the multi-scale features to handle different lesion sizes; in addition, convolutional long short-term memory (ConvLSTM) is employed to infer context information and thus capture fine structures to address the intensity similarity issue. The proposed approach was evaluated on an open-source dataset, the Anatomical Tracings of Lesions After Stroke (ATLAS), with results showing that our network outperforms five state-of-the-art methods. We make our code and models available at https://github.com/YH0517/CLCI_Net.
Tasks Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2019-07-16
URL https://arxiv.org/abs/1907.07008v2
PDF https://arxiv.org/pdf/1907.07008v2.pdf
PWC https://paperswithcode.com/paper/clci-net-cross-level-fusion-and-context
Repo https://github.com/YH0517/CLCI_Net
Framework tf