January 31, 2020

2886 words 14 mins read

Paper Group AWR 375

Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network. HAKE: Human Activity Knowledge Engine. Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare. EQuANt (Enhanced Question Answer Network). Utilizing BERT for Aspect-Based Se …

Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network

Title Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network
Authors Shengyu Zhao, Tingfung Lau, Ji Luo, Eric I-Chao Chang, Yan Xu
Abstract 3D medical image registration is of great clinical importance. However, supervised learning methods require a large amount of accurately annotated corresponding control points (or morphing), which are very difficult to obtain. Unsupervised learning methods ease the burden of manual annotation by exploiting unlabeled data without supervision. In this paper, we propose a new unsupervised learning method using convolutional neural networks under an end-to-end framework, Volume Tweening Network (VTN), for 3D medical image registration. We propose three innovative technical components: (1) an end-to-end cascading scheme that resolves large displacement; (2) an efficient integration of an affine registration network; and (3) an additional invertibility loss that encourages backward consistency. Experiments demonstrate that our algorithm is 880x faster (or 3.3x faster without GPU acceleration) than traditional optimization-based methods and achieves state-of-the-art performance in medical image registration.
Tasks Image Registration, Medical Image Registration
Published 2019-02-13
URL https://arxiv.org/abs/1902.05020v3
PDF https://arxiv.org/pdf/1902.05020v3.pdf
PWC https://paperswithcode.com/paper/unsupervised-3d-end-to-end-medical-image
Repo https://github.com/microsoft/Recursive-Cascaded-Networks
Framework tf
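
A minimal 2D NumPy/SciPy sketch of the invertibility (backward-consistency) idea from the VTN abstract: if the forward and backward displacement fields are true inverses, their composition is roughly zero displacement everywhere. This is illustrative only, with hypothetical helper names, and is not the authors' 3D TensorFlow implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp(field, flow):
    # Sample a 2D scalar field at positions displaced by `flow` (flow[..., 0]=dy, flow[..., 1]=dx).
    h, w = field.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + flow[..., 0], xs + flow[..., 1]])
    return map_coordinates(field, coords, order=1, mode="nearest")

def invertibility_loss(flow_fwd, flow_bwd):
    # If flow_fwd and flow_bwd are inverses, flow_bwd(x) + flow_fwd(x + flow_bwd(x)) ~ 0.
    comp_y = flow_bwd[..., 0] + warp(flow_fwd[..., 0], flow_bwd)
    comp_x = flow_bwd[..., 1] + warp(flow_fwd[..., 1], flow_bwd)
    return float(np.mean(comp_y ** 2 + comp_x ** 2))

# Sanity check: a constant shift and its negation compose to (near) zero loss.
fwd = np.zeros((32, 32, 2)); fwd[..., 1] = 3.0
bwd = -fwd
print(invertibility_loss(fwd, bwd))  # ~0
```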

HAKE: Human Activity Knowledge Engine

Title HAKE: Human Activity Knowledge Engine
Authors Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu
Abstract Human activity understanding is crucial for building automatic intelligent systems. With the help of deep learning, activity understanding has made huge progress recently, but challenges such as imbalanced data distribution, action ambiguity, and complex visual patterns still remain. To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on human body part states. Upon existing activity datasets, we annotate the part states of all the active persons in all images, thus establishing the relationship between instance activity and body part states. Furthermore, we propose a HAKE-based part state recognition model with a knowledge extractor named Activity2Vec and a corresponding part-state-based reasoning network. With HAKE, our method can alleviate the learning difficulty brought by the long-tail data distribution and bring in interpretability. HAKE currently has more than 7 million part state annotations and is still under construction. We first validate our approach on a subset of HAKE in this preliminary paper, where we show a 7.2 mAP performance improvement on Human-Object Interaction recognition, and a 12.38 mAP improvement on the one-shot subsets.
Tasks Human-Object Interaction Detection
Published 2019-04-13
URL https://arxiv.org/abs/1904.06539v5
PDF https://arxiv.org/pdf/1904.06539v5.pdf
PWC https://paperswithcode.com/paper/hake-human-activity-knowledge-engine
Repo https://github.com/DirtyHarryLYL/HAKE
Framework none
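
To make the instance-activity/part-state link concrete, here is a hypothetical annotation record in the spirit of HAKE. The field names and part-state phrases are invented for illustration and do not reflect the repository's actual schema.

```python
# One active person in one image: the instance-level activity is grounded
# in the states of individual body parts (values below are made up).
annotation = {
    "image_id": "hico_00001234",
    "person_bbox": [48, 60, 210, 400],
    "activity": "ride bicycle",
    "part_states": {
        "hip": "sit on something",
        "hand_right": "hold something",
        "hand_left": "hold something",
        "foot_right": "step on something",
        "foot_left": "step on something",
    },
}
```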

Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare

Title Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare
Authors Scott L. Fleming, Kuhan Jeyapragasan, Tony Duan, Daisy Ding, Saurabh Gombar, Nigam Shah, Emma Brunskill
Abstract There is an emerging trend in the reinforcement learning for healthcare literature. In order to prepare longitudinal, irregularly sampled, clinical datasets for reinforcement learning algorithms, many researchers will resample the time series data to short, regular intervals and use last-observation-carried-forward (LOCF) imputation to fill in these gaps. Typically, they will not maintain any explicit information about which values were imputed. In this work, we (1) call attention to this practice and discuss its potential implications; (2) propose an alternative representation of the patient state that addresses some of these issues; and (3) demonstrate in a novel but representative clinical dataset that our alternative representation yields consistently better results for achieving optimal control, as measured by off-policy policy evaluation, compared to representations that do not incorporate missingness information.
Tasks Imputation, Time Series
Published 2019-11-16
URL https://arxiv.org/abs/1911.07084v1
PDF https://arxiv.org/pdf/1911.07084v1.pdf
PWC https://paperswithcode.com/paper/missingness-as-stability-understanding-the
Repo https://github.com/scottfleming/rl-missingness
Framework none
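
A small pandas sketch of the two representations contrasted above: plain LOCF imputation, which discards the missingness pattern, versus LOCF plus explicit observation indicators appended to the patient state. Toy data and column names; this is not the paper's code.

```python
import numpy as np
import pandas as pd

# Toy irregularly observed vitals resampled to hourly bins (hypothetical columns).
vitals = pd.DataFrame(
    {"heart_rate": [80.0, np.nan, np.nan, 95.0], "lactate": [np.nan, 2.1, np.nan, np.nan]},
    index=pd.date_range("2019-11-16", periods=4, freq="H"),
)

locf_only = vitals.ffill()                                  # common practice: imputed, mask discarded
mask = vitals.notna().astype(int).add_suffix("_observed")   # which entries were actually measured
state_with_missingness = pd.concat([locf_only, mask], axis=1)
print(state_with_missingness)
```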

EQuANt (Enhanced Question Answer Network)

Title EQuANt (Enhanced Question Answer Network)
Authors François-Xavier Aubet, Dominic Danks, Yuchen Zhu
Abstract Machine Reading Comprehension (MRC) is an important topic in the domain of automated question answering and in natural language processing more generally. Since the release of the SQuAD 1.1 and SQuAD 2 datasets, progress in the field has been particularly significant, with current state-of-the-art models now exhibiting near-human performance at both answering well-posed questions and detecting questions which are unanswerable given a corresponding context. In this work, we present the Enhanced Question Answer Network (EQuANt), an MRC model which extends the successful QANet architecture of Yu et al. to cope with unanswerable questions. By training and evaluating EQuANt on SQuAD 2, we show that it is indeed possible to extend QANet to the unanswerable domain. We achieve results which are close to 2 times better than our chosen baseline, obtained by evaluating a lightweight version of the original QANet architecture on SQuAD 2. In addition, we report that the performance of EQuANt on SQuAD 1.1 after being trained on SQuAD 2 exceeds that of our lightweight QANet architecture trained and evaluated on SQuAD 1.1, demonstrating the utility of multi-task learning in the MRC context.
Tasks Machine Reading Comprehension, Multi-Task Learning, Question Answering, Reading Comprehension
Published 2019-06-24
URL https://arxiv.org/abs/1907.00708v2
PDF https://arxiv.org/pdf/1907.00708v2.pdf
PWC https://paperswithcode.com/paper/equant-enhanced-question-answer-network
Repo https://github.com/Francois-Aubet/EQuANt
Framework tf
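
A schematic of how an extractive reader can be extended to the unanswerable setting: gate the usual span prediction with an answerability probability and return the empty string when the model abstains. This is a generic illustration of the idea, not EQuANt's actual output head or thresholding.

```python
import numpy as np

def predict(span_scores, p_answerable, context_tokens, threshold=0.5):
    # span_scores[i, j]: score of the answer span from token i to token j (j >= i).
    # p_answerable: probability (e.g. from a sigmoid head) that the question has an answer.
    if p_answerable < threshold:
        return ""                                  # SQuAD 2 convention: empty string = no answer
    start, end = np.unravel_index(np.argmax(span_scores), span_scores.shape)
    return " ".join(context_tokens[start:end + 1])

tokens = "the eiffel tower is in paris".split()
scores = np.zeros((len(tokens), len(tokens))); scores[5, 5] = 1.0
print(predict(scores, 0.9, tokens))   # -> "paris"
print(predict(scores, 0.1, tokens))   # -> "" (abstain)
```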

Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

Title Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
Authors Chi Sun, Luyao Huang, Xipeng Qiu
Abstract Aspect-based sentiment analysis (ABSA), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA). In this paper, we construct an auxiliary sentence from the aspect and convert ABSA to a sentence-pair classification task, such as question answering (QA) and natural language inference (NLI). We fine-tune the pre-trained model from BERT and achieve new state-of-the-art results on SentiHood and SemEval-2014 Task 4 datasets.
Tasks Aspect-Based Sentiment Analysis, Natural Language Inference, Question Answering, Sentiment Analysis
Published 2019-03-22
URL http://arxiv.org/abs/1903.09588v1
PDF http://arxiv.org/pdf/1903.09588v1.pdf
PWC https://paperswithcode.com/paper/utilizing-bert-for-aspect-based-sentiment
Repo https://github.com/HSLCY/ABSA-BERT-pair
Framework pytorch
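
A sketch of the auxiliary-sentence construction that turns ABSA into sentence-pair classification. The templates below paraphrase two of the paper's variants (a QA-style question and a binary pseudo-sentence per candidate label); the exact wording in the paper may differ.

```python
def build_pairs(sentence, target, aspect, labels=("positive", "negative", "none")):
    # "QA-M"-style: one auxiliary question per (target, aspect); the model classifies the sentiment.
    # "NLI-B"-style: one auxiliary pseudo-sentence per candidate label; the model answers yes/no,
    # turning ABSA into binary sentence-pair classification.
    qa_pair = (sentence, f"what do you think of the {aspect} of {target} ?")
    binary_pairs = [(sentence, f"{target} - {aspect} - {label}") for label in labels]
    return qa_pair, binary_pairs

qa_pair, binary_pairs = build_pairs(
    "location-1 is in a safe neighbourhood but transport is poor", "location-1", "safety"
)
print(qa_pair)
for pair in binary_pairs:
    print(pair)
```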

Analytical Moment Regularizer for Gaussian Robust Networks

Title Analytical Moment Regularizer for Gaussian Robust Networks
Authors Modar Alfadly, Adel Bibi, Bernard Ghanem
Abstract Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit behaviours that are not yet well understood. One puzzling behaviour is the subtle sensitivity of DNNs to various noise attacks. This nuisance has strengthened the line of research around developing and training noise-robust networks. In this work, we propose a new training regularizer that aims to minimize the probabilistic expected training loss of a DNN subject to a generic Gaussian input. We provide an efficient and simple approach to approximate such a regularizer for arbitrary deep networks. This is done by leveraging the analytic expression for the output mean of a shallow neural network, avoiding the need for memory- and computation-intensive data augmentation. We conduct extensive experiments with LeNet and AlexNet on various datasets including MNIST, CIFAR10, and CIFAR100, demonstrating the effectiveness of our proposed regularizer. In particular, we show that networks trained with the proposed regularizer benefit from a boost in robustness equivalent to performing 3-21 folds of data augmentation.
Tasks Data Augmentation
Published 2019-04-24
URL http://arxiv.org/abs/1904.11005v1
PDF http://arxiv.org/pdf/1904.11005v1.pdf
PWC https://paperswithcode.com/paper/190411005
Repo https://github.com/ModarTensai/gaussian-regularizer
Framework pytorch
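
The key building block referenced above is the closed-form first moment of an affine-then-ReLU layer under Gaussian input: for a Gaussian pre-activation with mean m and standard deviation s, E[ReLU(z)] = m·Φ(m/s) + s·φ(m/s). Below is a sketch of that expression with a Monte-Carlo check; it is not the paper's full regularizer for deep networks.

```python
import numpy as np
from scipy.stats import norm

def expected_relu_output(W, b, mu, sigma):
    # E[ReLU(W x + b)] for x ~ N(mu, sigma^2 I).
    # Each pre-activation z_i = w_i . x + b_i is Gaussian with mean m_i and std s_i,
    # and E[ReLU(z_i)] = m_i * Phi(m_i / s_i) + s_i * phi(m_i / s_i).
    m = W @ mu + b
    s = sigma * np.linalg.norm(W, axis=1)
    return m * norm.cdf(m / s) + s * norm.pdf(m / s)

# Monte-Carlo check of the closed form on a random layer.
rng = np.random.default_rng(0)
W, b, mu, sigma = rng.normal(size=(4, 8)), rng.normal(size=4), rng.normal(size=8), 0.3
x = mu + sigma * rng.normal(size=(200_000, 8))
print(np.maximum(x @ W.T + b, 0).mean(axis=0))   # empirical
print(expected_relu_output(W, b, mu, sigma))     # analytic
```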

Optimal Multi-view Correction of Local Affine Frames

Title Optimal Multi-view Correction of Local Affine Frames
Authors Ivan Eichhardt, Daniel Barath
Abstract The proposed technique requires the epipolar geometry to be pre-estimated between each image pair. It exploits the constraints which the camera movement implies in order to apply a closed-form correction to the parameters of the input affinities. Also, it is shown that the rotations and scales obtained by partially affine-covariant detectors, e.g., AKAZE or SIFT, can be completed into full affine frames by the proposed algorithm. It is validated both in synthetic experiments and on publicly available real-world datasets that the method always improves the output of the evaluated affine-covariant feature detectors. As a by-product, these detectors are compared and the ones obtaining the most accurate affine frames are reported. To demonstrate applicability, we show that the proposed technique, used as a pre-processing step, improves the accuracy of camera-rig pose estimation, surface normal estimation, and homography estimation.
Tasks Homography Estimation, Pose Estimation
Published 2019-05-01
URL http://arxiv.org/abs/1905.00519v1
PDF http://arxiv.org/pdf/1905.00519v1.pdf
PWC https://paperswithcode.com/paper/optimal-multi-view-correction-of-local-affine
Repo https://github.com/eivan/multiview-LAFs-correction
Framework none

Simple BERT Models for Relation Extraction and Semantic Role Labeling

Title Simple BERT Models for Relation Extraction and Semantic Role Labeling
Authors Peng Shi, Jimmy Lin
Abstract We present simple BERT-based models for relation extraction and semantic role labeling. In recent years, state-of-the-art performance has been achieved using neural models by incorporating lexical and syntactic features such as part-of-speech tags and dependency trees. In this paper, extensive experiments on datasets for these two tasks show that without using any external features, a simple BERT-based model can achieve state-of-the-art performance. To our knowledge, we are the first to successfully apply BERT in this manner. Our models provide strong baselines for future research.
Tasks Relation Extraction, Semantic Role Labeling
Published 2019-04-10
URL http://arxiv.org/abs/1904.05255v1
PDF http://arxiv.org/pdf/1904.05255v1.pdf
PWC https://paperswithcode.com/paper/simple-bert-models-for-relation-extraction
Repo https://github.com/phueb/BabyBertSRL
Framework pytorch

Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds

Title Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds
Authors Huan Lei, Naveed Akhtar, Ajmal Mian
Abstract We propose a spherical kernel for efficient graph convolution of 3D point clouds. Our metric-based kernels systematically quantize the local 3D space to identify distinctive geometric relationships in the data. Similar to the regular grid CNN kernels, the spherical kernel maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning. The proposed kernel is applied to graph neural networks without edge-dependent filter generation, making it computationally attractive for large point clouds. In our graph networks, each vertex is associated with a single point location and edges connect the neighborhood points within a defined range. The graph gets coarsened in the network with farthest point sampling. Analogous to the standard CNNs, we define pooling and unpooling operations for our network. We demonstrate the effectiveness of the proposed spherical kernel with graph neural networks for point cloud classification and semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS datasets. The source code and the trained models can be downloaded from https://github.com/hlei-ziyan/SPH3D-GCN.
Tasks 3D Instance Segmentation, 3D Object Classification, 3D Part Segmentation, Semantic Segmentation
Published 2019-09-20
URL https://arxiv.org/abs/1909.09287v2
PDF https://arxiv.org/pdf/1909.09287v2.pdf
PWC https://paperswithcode.com/paper/spherical-kernel-for-efficient-graph
Repo https://github.com/hlei-ziyan/SPH3D-GCN
Framework tf
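
An illustrative NumPy sketch of the metric-based quantization: each neighbor's relative position is assigned to one bin of an (azimuth × elevation × radius) partition of the local sphere, and that bin index would select the kernel weight. The bin counts and radius here are made up; the real CUDA/TensorFlow operators live in the linked repository.

```python
import numpy as np

def spherical_bin_indices(rel_xyz, n_azim=8, n_elev=2, n_rad=3, max_radius=0.2):
    # rel_xyz: (N, 3) neighbor coordinates relative to the center vertex.
    x, y, z = rel_xyz[:, 0], rel_xyz[:, 1], rel_xyz[:, 2]
    r = np.linalg.norm(rel_xyz, axis=1) + 1e-12
    azim = (np.arctan2(y, x) + np.pi) / (2 * np.pi)          # normalized azimuth in [0, 1)
    elev = (z / r + 1.0) / 2.0                               # normalized elevation in [0, 1]
    a = np.minimum((azim * n_azim).astype(int), n_azim - 1)
    e = np.minimum((elev * n_elev).astype(int), n_elev - 1)
    d = np.minimum((r / max_radius * n_rad).astype(int), n_rad - 1)
    return (d * n_elev + e) * n_azim + a                     # flat kernel-weight index per neighbor

pts = np.array([[0.05, 0.0, 0.0], [0.0, 0.1, 0.05], [-0.1, -0.1, 0.1]])
print(spherical_bin_indices(pts))
```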

Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization

Title Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization
Authors Chao Huang, Haojie Liu, Tong Chen, Qiu Shen, Zhan Ma
Abstract We propose a MultiScale AutoEncoder (MSAE) based extreme image compression framework to offer visually pleasing reconstruction at a very low bitrate. Our method leverages the “priors” at different resolution scales to improve compression efficiency, and also employs a generative adversarial network (GAN) with multiscale discriminators to perform end-to-end trainable rate-distortion optimization. We compare the perceptual quality of our reconstructions with traditional compression algorithms, namely the High-Efficiency Video Coding (HEVC) Intra Profile and JPEG2000, on the public Cityscapes and ADE20K datasets, demonstrating significant subjective quality improvement.
Tasks Image Compression
Published 2019-04-08
URL https://arxiv.org/abs/1904.03851v2
PDF https://arxiv.org/pdf/1904.03851v2.pdf
PWC https://paperswithcode.com/paper/extreme-image-compression-via-multiscale
Repo https://github.com/WikiChao/Extreme-Image-Compression
Framework pytorch

No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement

Title No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement
Authors Jia Yan, Jie Li, Xin Fu
Abstract No-reference image quality assessment (NR-IQA) aims to measure image quality without a reference image. However, contrast distortion has been overlooked in current NR-IQA research. In this paper, we propose a very simple but effective metric for predicting the quality of contrast-altered images, based on the fact that a high-contrast image is often more similar to its contrast-enhanced version. Specifically, we first generate an enhanced image through histogram equalization. We then calculate the similarity between the original image and the enhanced one using the structural similarity index (SSIM) as the first feature. Further, we calculate the histogram-based entropies and cross entropies of the original image and the enhanced one, yielding four more features. Finally, we learn a regression module that fuses the aforementioned five features to infer the quality score. Experiments on four publicly available databases validate the superiority and efficiency of the proposed technique.
Tasks Image Quality Assessment, No-Reference Image Quality Assessment
Published 2019-04-18
URL http://arxiv.org/abs/1904.08879v1
PDF http://arxiv.org/pdf/1904.08879v1.pdf
PWC https://paperswithcode.com/paper/no-reference-quality-assessment-of-contrast
Repo https://github.com/steffensbola/blind_iqa_contrast
Framework none
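
A sketch of the five-feature extraction described in the abstract, assuming a grayscale image with values in [0, 1]: SSIM between the image and its histogram-equalized version, the two histogram entropies, and the two cross entropies. This is my reading of the feature set; the regression stage is omitted and the original code may compute the histograms differently.

```python
import numpy as np
from skimage import exposure
from skimage.metrics import structural_similarity

def contrast_features(img, bins=256):
    # img: 2D grayscale array with values in [0, 1].
    enhanced = exposure.equalize_hist(img)
    ssim = structural_similarity(img, enhanced, data_range=1.0)

    def hist(x):
        h, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
        h = h.astype(float) + 1e-12                # avoid log(0)
        return h / h.sum()

    p, q = hist(img), hist(enhanced)
    entropy_p = -(p * np.log2(p)).sum()
    entropy_q = -(q * np.log2(q)).sum()
    cross_pq = -(p * np.log2(q)).sum()
    cross_qp = -(q * np.log2(p)).sum()
    return [ssim, entropy_p, entropy_q, cross_pq, cross_qp]
```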

Language Models as Knowledge Bases?

Title Language Models as Knowledge Bases?
Authors Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
Abstract Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as “fill-in-the-blank” cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.
Tasks Language Modelling, Open-Domain Question Answering, Question Answering
Published 2019-09-03
URL https://arxiv.org/abs/1909.01066v2
PDF https://arxiv.org/pdf/1909.01066v2.pdf
PWC https://paperswithcode.com/paper/language-models-as-knowledge-bases
Repo https://github.com/facebookresearch/LAMA
Framework pytorch
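
The cloze-style probing described above is easy to reproduce with any masked language model. A minimal example using the Hugging Face `fill-mask` pipeline (not the LAMA codebase itself, which adds the relation templates and the evaluation protocol); the query is the paper's own "Dante was born in [MASK]." example.

```python
from transformers import pipeline

# Query BERT with a "fill-in-the-blank" cloze statement, as in the paper's probing setup.
unmasker = pipeline("fill-mask", model="bert-base-cased")
for pred in unmasker("Dante was born in [MASK].", top_k=3):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```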

And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Title And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Authors Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou
Abstract In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than its weights. The principle of our approach is that it minimizes the loss reconstruction error for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high performing ResNet-50 model to a memory size of 5MB (20x compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification and by compressing a Mask R-CNN with a 26x factor.
Tasks Object Classification, Quantization
Published 2019-07-12
URL https://arxiv.org/abs/1907.05686v4
PDF https://arxiv.org/pdf/1907.05686v4.pdf
PWC https://paperswithcode.com/paper/and-the-bit-goes-down-revisiting-the
Repo https://github.com/facebookresearch/kill-the-bits
Framework pytorch
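
For intuition, here is a naive product-quantization sketch that clusters weight blocks with k-means. Note the important difference flagged in the comments: the paper fits the codebook to minimize the reconstruction error of layer outputs on unlabelled in-domain inputs (and then fine-tunes the codewords), not the weight reconstruction error shown here.

```python
import numpy as np
from sklearn.cluster import KMeans

def product_quantize(W, block=4, k=256, seed=0):
    # Split each row of W into contiguous blocks and cluster all blocks with k-means.
    # NOTE: this naive version minimizes *weight* reconstruction error; the paper instead
    # minimizes the reconstruction error of layer *outputs* (activations) on in-domain inputs.
    out_dim, in_dim = W.shape
    assert in_dim % block == 0
    blocks = W.reshape(out_dim * in_dim // block, block)
    km = KMeans(n_clusters=k, n_init=4, random_state=seed).fit(blocks)
    codebook, codes = km.cluster_centers_, km.labels_    # k x block floats + one code per block
    W_hat = codebook[codes].reshape(out_dim, in_dim)
    return W_hat, codebook, codes

W = np.random.default_rng(0).normal(size=(256, 64))
W_hat, codebook, codes = product_quantize(W, block=4, k=128)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))     # relative weight reconstruction error
```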

Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions

Title Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions
Authors Yasin Yazıcı, Bruno Lecouat, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, Vijay Chandrasekhar
Abstract We propose a GAN design which models multiple distributions effectively and discovers their commonalities and particularities. Each data distribution is modeled with a mixture of $K$ generator distributions. As the generators are partially shared between the modeling of different true data distributions, the shared ones capture the commonality of the distributions, while the non-shared ones capture their unique aspects. We show the effectiveness of our method on various datasets (MNIST, Fashion MNIST, CIFAR-10, Omniglot, CelebA) with compelling results.
Tasks Omniglot
Published 2019-02-09
URL http://arxiv.org/abs/1902.03444v1
PDF http://arxiv.org/pdf/1902.03444v1.pdf
PWC https://paperswithcode.com/paper/venn-gan-discovering-commonalities-and
Repo https://github.com/yasinyazici/Venn_GAN
Framework tf
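
A toy illustration of the sharing pattern described above for two observed distributions and K = 3 generators: one generator is shared (the commonality) and each distribution additionally has a private generator (the assignment and mixture weights below are chosen purely for illustration).

```python
import numpy as np

# Rows: observed distributions p and q. Columns: generators g0, g1, g2.
# g0 is shared between p and q; g1 is private to p; g2 is private to q.
assignment = np.array([[1, 1, 0],
                       [1, 0, 1]])

# Each observed distribution is modeled as a uniform mixture over its assigned generators.
mixture_weights = assignment / assignment.sum(axis=1, keepdims=True)
print(mixture_weights)   # [[0.5 0.5 0. ]
                         #  [0.5 0.  0.5]]
```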

CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke

Title CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke
Authors Hao Yang, Weijian Huang, Kehan Qi, Cheng Li, Xinfeng Liu, Meiyun Wang, Hairong Zheng, Shanshan Wang
Abstract Segmenting stroke lesions from T1-weighted MR images is of great value for large-scale stroke rehabilitation neuroimaging analyses. Nevertheless, the task poses great challenges, such as the large range of stroke lesion scales and tissue intensity similarity. Encoder-decoder convolutional neural networks, despite their great achievements in medical image segmentation, may fail to address these challenges due to insufficient use of multi-scale features and context information. To address these challenges, this paper proposes a Cross-Level fusion and Context Inference Network (CLCI-Net) for chronic stroke lesion segmentation from T1-weighted MR images. Specifically, a Cross-Level feature Fusion (CLF) strategy was developed to make full use of features at different scales across different levels; extending Atrous Spatial Pyramid Pooling (ASPP) with CLF enriches the multi-scale features to handle different lesion sizes; in addition, convolutional long short-term memory (ConvLSTM) is employed to infer context information and thus capture fine structures to address the intensity similarity issue. The proposed approach was evaluated on an open-source dataset, the Anatomical Tracings of Lesions After Stroke (ATLAS), with results showing that our network outperforms five state-of-the-art methods. We make our code and models available at https://github.com/YH0517/CLCI_Net.
Tasks Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2019-07-16
URL https://arxiv.org/abs/1907.07008v2
PDF https://arxiv.org/pdf/1907.07008v2.pdf
PWC https://paperswithcode.com/paper/clci-net-cross-level-fusion-and-context
Repo https://github.com/YH0517/CLCI_Net
Framework tf