Paper Group AWR 375
Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network
Title | Unsupervised 3D End-to-End Medical Image Registration with Volume Tweening Network |
Authors | Shengyu Zhao, Tingfung Lau, Ji Luo, Eric I-Chao Chang, Yan Xu |
Abstract | 3D medical image registration is of great clinical importance. However, supervised learning methods require a large amount of accurately annotated corresponding control points (or morphing), which are very difficult to obtain. Unsupervised learning methods ease the burden of manual annotation by exploiting unlabeled data without supervision. In this paper, we propose a new unsupervised learning method using convolutional neural networks under an end-to-end framework, Volume Tweening Network (VTN), for 3D medical image registration. We propose three innovative technical components: (1) An end-to-end cascading scheme that resolves large displacement; (2) An efficient integration of affine registration network; and (3) An additional invertibility loss that encourages backward consistency. Experiments demonstrate that our algorithm is 880x faster (or 3.3x faster without GPU acceleration) than traditional optimization-based methods and achieves state-of-the-art performance in medical image registration. |
Tasks | Image Registration, Medical Image Registration |
Published | 2019-02-13 |
URL | https://arxiv.org/abs/1902.05020v3 |
PDF | https://arxiv.org/pdf/1902.05020v3.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-3d-end-to-end-medical-image |
Repo | https://github.com/microsoft/Recursive-Cascaded-Networks |
Framework | tf |
HAKE: Human Activity Knowledge Engine
Title | HAKE: Human Activity Knowledge Engine |
Authors | Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu |
Abstract | Human activity understanding is crucial for building automatic intelligent systems. With the help of deep learning, activity understanding has made huge progress recently. But some challenges, such as imbalanced data distribution, action ambiguity, and complex visual patterns, still remain. To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on human body part states. Upon existing activity datasets, we annotate the part states of all the active persons in all images, thus establishing the relationship between instance activity and body part states. Furthermore, we propose a HAKE-based part state recognition model with a knowledge extractor named Activity2Vec and a corresponding part-state-based reasoning network. With HAKE, our method can alleviate the learning difficulty brought by the long-tail data distribution, and bring in interpretability. HAKE now has more than 7M part state annotations and is still under construction. We first validate our approach on a part of HAKE in this preliminary paper, where we show a 7.2 mAP performance improvement on Human-Object Interaction recognition, and a 12.38 mAP improvement on the one-shot subsets. |
Tasks | Human-Object Interaction Detection |
Published | 2019-04-13 |
URL | https://arxiv.org/abs/1904.06539v5 |
PDF | https://arxiv.org/pdf/1904.06539v5.pdf |
PWC | https://paperswithcode.com/paper/hake-human-activity-knowledge-engine |
Repo | https://github.com/DirtyHarryLYL/HAKE |
Framework | none |
Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare
Title | Missingness as Stability: Understanding the Structure of Missingness in Longitudinal EHR data and its Impact on Reinforcement Learning in Healthcare |
Authors | Scott L. Fleming, Kuhan Jeyapragasan, Tony Duan, Daisy Ding, Saurabh Gombar, Nigam Shah, Emma Brunskill |
Abstract | There is an emerging trend in the reinforcement learning for healthcare literature. In order to prepare longitudinal, irregularly sampled, clinical datasets for reinforcement learning algorithms, many researchers will resample the time series data to short, regular intervals and use last-observation-carried-forward (LOCF) imputation to fill in these gaps. Typically, they will not maintain any explicit information about which values were imputed. In this work, we (1) call attention to this practice and discuss its potential implications; (2) propose an alternative representation of the patient state that addresses some of these issues; and (3) demonstrate in a novel but representative clinical dataset that our alternative representation yields consistently better results for achieving optimal control, as measured by off-policy policy evaluation, compared to representations that do not incorporate missingness information. |
Tasks | Imputation, Time Series |
Published | 2019-11-16 |
URL | https://arxiv.org/abs/1911.07084v1 |
PDF | https://arxiv.org/pdf/1911.07084v1.pdf |
PWC | https://paperswithcode.com/paper/missingness-as-stability-understanding-the |
Repo | https://github.com/scottfleming/rl-missingness |
Framework | none |
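The abstract above contrasts plain LOCF imputation with representations that keep missingness information. As a rough illustration only (the paper's actual state representation is not given in this listing), the sketch below carries the last observation forward while also recording an observed flag and the number of steps since the last real measurement. The function name `locf_with_missingness` and the three-channel output are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def locf_with_missingness(
    series: List[Optional[float]],
) -> Tuple[List[Optional[float]], List[int], List[Optional[int]]]:
    """Apply LOCF to a regularly resampled series and keep missingness info.

    Returns three parallel lists:
      filled      - values after last-observation-carried-forward
      observed    - 1 where the value was really measured, 0 where imputed
      steps_since - steps since the last real measurement (None before the first)
    """
    filled: List[Optional[float]] = []
    observed: List[int] = []
    steps_since: List[Optional[int]] = []
    last_value: Optional[float] = None
    gap: Optional[int] = None  # None until a first measurement is seen

    for value in series:
        if value is not None:
            last_value = value
            gap = 0
            observed.append(1)
        else:
            gap = None if gap is None else gap + 1
            observed.append(0)
        filled.append(last_value)
        steps_since.append(gap)
    return filled, observed, steps_since


# Hypothetical lab value sampled irregularly, resampled to a regular grid.
filled, observed, steps_since = locf_with_missingness([None, 1.0, None, None, 2.0])
```

A downstream model that sees only `filled` cannot tell a fresh measurement from a stale one; concatenating `observed` or `steps_since` to the state is one simple way to expose that difference.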
EQuANt (Enhanced Question Answer Network)
Title | EQuANt (Enhanced Question Answer Network) |
Authors | François-Xavier Aubet, Dominic Danks, Yuchen Zhu |
Abstract | Machine Reading Comprehension (MRC) is an important topic in the domain of automated question answering and in natural language processing more generally. Since the release of the SQuAD 1.1 and SQuAD 2 datasets, progress in the field has been particularly significant, with current state-of-the-art models now exhibiting near-human performance at both answering well-posed questions and detecting questions which are unanswerable given a corresponding context. In this work, we present Enhanced Question Answer Network (EQuANt), an MRC model which extends the successful QANet architecture of Yu et al. to cope with unanswerable questions. By training and evaluating EQuANt on SQuAD 2, we show that it is indeed possible to extend QANet to the unanswerable domain. We achieve results which are close to 2 times better than our chosen baseline, obtained by evaluating a lightweight version of the original QANet architecture on SQuAD 2. In addition, we report that the performance of EQuANt on SQuAD 1.1 after being trained on SQuAD 2 exceeds that of our lightweight QANet architecture trained and evaluated on SQuAD 1.1, demonstrating the utility of multi-task learning in the MRC context. |
Tasks | Machine Reading Comprehension, Multi-Task Learning, Question Answering, Reading Comprehension |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1907.00708v2 |
PDF | https://arxiv.org/pdf/1907.00708v2.pdf |
PWC | https://paperswithcode.com/paper/equant-enhanced-question-answer-network |
Repo | https://github.com/Francois-Aubet/EQuANt |
Framework | tf |
Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
Title | Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence |
Authors | Chi Sun, Luyao Huang, Xipeng Qiu |
Abstract | Aspect-based sentiment analysis (ABSA), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA). In this paper, we construct an auxiliary sentence from the aspect and convert ABSA to a sentence-pair classification task, such as question answering (QA) and natural language inference (NLI). We fine-tune the pre-trained model from BERT and achieve new state-of-the-art results on SentiHood and SemEval-2014 Task 4 datasets. |
Tasks | Aspect-Based Sentiment Analysis, Natural Language Inference, Question Answering, Sentiment Analysis |
Published | 2019-03-22 |
URL | http://arxiv.org/abs/1903.09588v1 |
PDF | http://arxiv.org/pdf/1903.09588v1.pdf |
PWC | https://paperswithcode.com/paper/utilizing-bert-for-aspect-based-sentiment |
Repo | https://github.com/HSLCY/ABSA-BERT-pair |
Framework | pytorch |
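The core idea in the abstract above, turning ABSA into sentence-pair classification by generating an auxiliary sentence from the aspect, can be sketched as follows. The templates and the names `build_auxiliary` and `build_pair` are illustrative assumptions, not necessarily the wording or code the authors use; the resulting pair would then be fed to a BERT sentence-pair classifier, which is not shown here.

```python
from typing import Tuple


def build_auxiliary(target: str, aspect: str, style: str = "qa") -> str:
    """Create an auxiliary sentence for one (target, aspect) combination.

    The two templates below are hypothetical examples of a question-style
    and a pseudo-sentence (NLI-style) formulation.
    """
    if style == "qa":
        return f"what do you think of the {aspect} of {target} ?"
    if style == "nli":
        return f"{target} - {aspect}"
    raise ValueError(f"unknown style: {style!r}")


def build_pair(sentence: str, target: str, aspect: str, style: str = "qa") -> Tuple[str, str]:
    """Return the (original sentence, auxiliary sentence) pair for a classifier."""
    return sentence, build_auxiliary(target, aspect, style)


# One review sentence expands into one pair per aspect of interest.
review = "the food at the bistro was great but the service was slow"
pairs = [build_pair(review, "the bistro", aspect) for aspect in ("food", "service")]
```

Each pair is classified independently, so a single sentence can receive different polarity labels for different aspects.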
Analytical Moment Regularizer for Gaussian Robust Networks
Title | Analytical Moment Regularizer for Gaussian Robust Networks |
Authors | Modar Alfadly, Adel Bibi, Bernard Ghanem |
Abstract | Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit yet-to-understand uncouth behaviours. One puzzling behaviour is the subtle sensitive reaction of DNNs to various noise attacks. Such a nuisance has strengthened the line of research around developing and training noise-robust networks. In this work, we propose a new training regularizer that aims to minimize the probabilistic expected training loss of a DNN subject to a generic Gaussian input. We provide an efficient and simple approach to approximate such a regularizer for arbitrary deep networks. This is done by leveraging the analytic expression of the output mean of a shallow neural network; avoiding the need for the memory and computationally expensive data augmentation. We conduct extensive experiments on LeNet and AlexNet on various datasets including MNIST, CIFAR10, and CIFAR100 demonstrating the effectiveness of our proposed regularizer. In particular, we show that networks that are trained with the proposed regularizer benefit from a boost in robustness equivalent to performing 3-21 folds of data augmentation. |
Tasks | Data Augmentation |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.11005v1 |
PDF | http://arxiv.org/pdf/1904.11005v1.pdf |
PWC | https://paperswithcode.com/paper/190411005 |
Repo | https://github.com/ModarTensai/gaussian-regularizer |
Framework | pytorch |
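The abstract above relies on an analytic expression for the output mean of a shallow network under Gaussian input, which replaces sampling-based noise augmentation. The listing does not give that expression, so the sketch below shows only the simplest related fact: the closed-form mean of a single ReLU unit whose pre-activation is Gaussian, E[max(0, x)] = mu * Phi(mu / sigma) + sigma * phi(mu / sigma), checked against Monte Carlo sampling. The function names are assumptions for this example, and this is not the paper's regularizer.

```python
import math
import random


def gaussian_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(z: float) -> float:
    """Standard normal cumulative distribution Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_relu(mu: float, sigma: float) -> float:
    """Closed-form E[max(0, x)] for x ~ N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = mu / sigma
    return mu * gaussian_cdf(z) + sigma * gaussian_pdf(z)


def monte_carlo_relu(mu: float, sigma: float, n: int = 200_000, seed: int = 0) -> float:
    """Sampling estimate of the same quantity, i.e. the augmentation-style route."""
    rng = random.Random(seed)
    return sum(max(0.0, rng.gauss(mu, sigma)) for _ in range(n)) / n


analytic = expected_relu(1.0, 0.5)
sampled = monte_carlo_relu(1.0, 0.5)
```

The analytic value costs a handful of arithmetic operations, whereas the sampled estimate needs many forward evaluations to reach comparable accuracy, which is the kind of saving the abstract alludes to.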
Optimal Multi-view Correction of Local Affine Frames
Title | Optimal Multi-view Correction of Local Affine Frames |
Authors | Ivan Eichhardt, Daniel Barath |
Abstract | The technique requires the epipolar geometry to be pre-estimated between each image pair. It exploits the constraints which the camera movement implies, in order to apply a closed-form correction to the parameters of the input affinities. Also, it is shown that the rotations and scales obtained by partially affine-covariant detectors, e.g., AKAZE or SIFT, can be completed to be full affine frames by the proposed algorithm. It is validated both in synthetic experiments and on publicly available real-world datasets that the method always improves the output of the evaluated affine-covariant feature detectors. As a by-product, these detectors are compared and the ones obtaining the most accurate affine frames are reported. For demonstrating the applicability, we show that the proposed technique as a pre-processing step improves the accuracy of pose estimation for a camera rig, surface normal and homography estimation. |
Tasks | Homography Estimation, Pose Estimation |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00519v1 |
PDF | http://arxiv.org/pdf/1905.00519v1.pdf |
PWC | https://paperswithcode.com/paper/optimal-multi-view-correction-of-local-affine |
Repo | https://github.com/eivan/multiview-LAFs-correction |
Framework | none |
Simple BERT Models for Relation Extraction and Semantic Role Labeling
Title | Simple BERT Models for Relation Extraction and Semantic Role Labeling |
Authors | Peng Shi, Jimmy Lin |
Abstract | We present simple BERT-based models for relation extraction and semantic role labeling. In recent years, state-of-the-art performance has been achieved using neural models by incorporating lexical and syntactic features such as part-of-speech tags and dependency trees. In this paper, extensive experiments on datasets for these two tasks show that without using any external features, a simple BERT-based model can achieve state-of-the-art performance. To our knowledge, we are the first to successfully apply BERT in this manner. Our models provide strong baselines for future research. |
Tasks | Relation Extraction, Semantic Role Labeling |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05255v1 |
PDF | http://arxiv.org/pdf/1904.05255v1.pdf |
PWC | https://paperswithcode.com/paper/simple-bert-models-for-relation-extraction |
Repo | https://github.com/phueb/BabyBertSRL |
Framework | pytorch |
Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds
Title | Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds |
Authors | Huan Lei, Naveed Akhtar, Ajmal Mian |
Abstract | We propose a spherical kernel for efficient graph convolution of 3D point clouds. Our metric-based kernels systematically quantize the local 3D space to identify distinctive geometric relationships in the data. Similar to the regular grid CNN kernels, the spherical kernel maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning. The proposed kernel is applied to graph neural networks without edge-dependent filter generation, making it computationally attractive for large point clouds. In our graph networks, each vertex is associated with a single point location and edges connect the neighborhood points within a defined range. The graph gets coarsened in the network with farthest point sampling. Analogous to the standard CNNs, we define pooling and unpooling operations for our network. We demonstrate the effectiveness of the proposed spherical kernel with graph neural networks for point cloud classification and semantic segmentation using ModelNet, ShapeNet, RueMonge2014, ScanNet and S3DIS datasets. The source code and the trained models can be downloaded from https://github.com/hlei-ziyan/SPH3D-GCN. |
Tasks | 3D Instance Segmentation, 3D Object Classification, 3D Part Segmentation, Semantic Segmentation |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09287v2 |
PDF | https://arxiv.org/pdf/1909.09287v2.pdf |
PWC | https://paperswithcode.com/paper/spherical-kernel-for-efficient-graph |
Repo | https://github.com/hlei-ziyan/SPH3D-GCN |
Framework | tf |
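The abstract above mentions that the graph is coarsened with farthest point sampling. A minimal pure-Python sketch of the standard greedy version of that procedure is given below; the function name, the `start` parameter, and the O(n * k) loop are choices made for this illustration and are not taken from the SPH3D-GCN code.

```python
import math
from typing import List, Sequence


def farthest_point_sampling(
    points: Sequence[Sequence[float]], k: int, start: int = 0
) -> List[int]:
    """Greedily pick k point indices that are spread out over the cloud.

    Each step selects the point whose distance to the already selected set
    (the minimum distance to any selected point) is largest.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy cloud: two nearby points and two outliers along different axes.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
kept = farthest_point_sampling(cloud, k=3)
```

In the toy cloud the point at (1, 0, 0) is dropped because it lies close to the starting point, which is the behaviour that makes this sampling useful for coarsening.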
Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization
Title | Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization |
Authors | Chao Huang, Haojie Liu, Tong Chen, Qiu Shen, Zhan Ma |
Abstract | We propose a MultiScale AutoEncoder (MSAE) based extreme image compression framework to offer visually pleasing reconstruction at a very low bitrate. Our method leverages the “priors” at different resolution scales to improve the compression efficiency, and also employs a generative adversarial network (GAN) with multiscale discriminators to perform end-to-end trainable rate-distortion optimization. We compare the perceptual quality of our reconstructions with traditional compression algorithms using High-Efficiency Video Coding (HEVC) based Intra Profile and JPEG2000 on the public Cityscapes and ADE20K datasets, demonstrating a significant subjective quality improvement. |
Tasks | Image Compression |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.03851v2 |
PDF | https://arxiv.org/pdf/1904.03851v2.pdf |
PWC | https://paperswithcode.com/paper/extreme-image-compression-via-multiscale |
Repo | https://github.com/WikiChao/Extreme-Image-Compression |
Framework | pytorch |
No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement
Title | No-Reference Quality Assessment of Contrast-Distorted Images using Contrast Enhancement |
Authors | Jia Yan, Jie Li, Xin Fu |
Abstract | No-reference image quality assessment (NR-IQA) aims to measure image quality without a reference image. However, contrast distortion has been overlooked in current NR-IQA research. In this paper, we propose a very simple but effective metric for predicting the quality of contrast-altered images, based on the fact that a high-contrast image is often more similar to its contrast-enhanced image. Specifically, we first generate an enhanced image through histogram equalization. We then calculate the similarity of the original image and the enhanced one by using the structural-similarity index (SSIM) as the first feature. Further, we calculate the histogram-based entropy and cross entropy between the original image and the enhanced one respectively, to obtain a further 4 features. Finally, we learn a regression module to fuse the aforementioned 5 features for inferring the quality score. Experiments on four publicly available databases validate the superiority and efficiency of the proposed technique. |
Tasks | Image Quality Assessment, No-Reference Image Quality Assessment |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08879v1 |
PDF | http://arxiv.org/pdf/1904.08879v1.pdf |
PWC | https://paperswithcode.com/paper/no-reference-quality-assessment-of-contrast |
Repo | https://github.com/steffensbola/blind_iqa_contrast |
Framework | none |
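The feature pipeline described in the abstract above (histogram equalization, then SSIM, entropies and cross entropies) can be sketched on a flat list of 8-bit grey levels. Several parts are assumptions made for this illustration: the SSIM here is a single global window rather than the usual sliding-window average, the four histogram features are taken to be the two entropies plus the two cross entropies, and the final regression step is omitted.

```python
import math
from typing import List

LEVELS = 256  # 8-bit grey levels


def histogram(pixels: List[int]) -> List[float]:
    """Normalised grey-level histogram."""
    counts = [0] * LEVELS
    for p in pixels:
        counts[p] += 1
    return [c / len(pixels) for c in counts]


def equalize(pixels: List[int]) -> List[int]:
    """Histogram equalization using integer cumulative counts."""
    counts = [0] * LEVELS
    for p in pixels:
        counts[p] += 1
    cdf, total = [], 0
    for c in counts:
        total += c
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if cdf_min == n:  # constant image: nothing to stretch
        return list(pixels)
    return [round((cdf[p] - cdf_min) * (LEVELS - 1) / (n - cdf_min)) for p in pixels]


def entropy(p: List[float]) -> float:
    return -sum(x * math.log2(x) for x in p if x > 0)


def cross_entropy(p: List[float], q: List[float], eps: float = 1e-12) -> float:
    # eps avoids log(0) where q has empty bins.
    return -sum(x * math.log2(y + eps) for x, y in zip(p, q) if x > 0)


def global_ssim(a: List[int], b: List[int]) -> float:
    """SSIM computed over the whole image as one window (a simplification)."""
    c1, c2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


def contrast_features(pixels: List[int]) -> List[float]:
    """Five features: SSIM, two entropies, two cross entropies."""
    enhanced = equalize(pixels)
    p, q = histogram(pixels), histogram(enhanced)
    return [global_ssim(pixels, enhanced), entropy(p), entropy(q),
            cross_entropy(p, q), cross_entropy(q, p)]


low_contrast = [100, 101, 102, 103] * 4
high_contrast = [0, 85, 170, 255] * 4
```

On these toy inputs the high-contrast image is left unchanged by equalization and therefore has an SSIM of 1 with its enhanced version, while the low-contrast image scores far lower, which matches the premise stated in the abstract.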
Language Models as Knowledge Bases?
Title | Language Models as Knowledge Bases? |
Authors | Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel |
Abstract | Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as “fill-in-the-blank” cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA. |
Tasks | Language Modelling, Open-Domain Question Answering, Question Answering |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01066v2 |
PDF | https://arxiv.org/pdf/1909.01066v2.pdf |
PWC | https://paperswithcode.com/paper/language-models-as-knowledge-bases |
Repo | https://github.com/facebookresearch/LAMA |
Framework | pytorch |
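The "fill-in-the-blank" cloze querying described in the abstract above can be illustrated without any model. In the sketch below, a relation template with `[X]` and `[Y]` placeholders is turned into a masked statement, and ranked predictions are scored with precision at k. The template syntax, the function names, and the `toy_predict` stand-in (a fixed lookup in place of a pretrained language model) are all assumptions for this example and do not reflect the LAMA code.

```python
from typing import Callable, Dict, List, Sequence, Tuple

MASK = "[MASK]"


def make_cloze(template: str, subject: str) -> str:
    """Fill the subject slot and mask the object slot of a relation template."""
    return template.replace("[X]", subject).replace("[Y]", MASK)


def precision_at_k(ranked: Sequence[str], gold: str, k: int = 1) -> float:
    """1.0 if the gold object is among the top-k predictions, else 0.0."""
    return 1.0 if gold in ranked[:k] else 0.0


def evaluate(
    facts: Sequence[Tuple[str, str]],
    template: str,
    predict: Callable[[str], List[str]],
    k: int = 1,
) -> float:
    """Mean precision at k over (subject, object) facts for one relation."""
    scores = [precision_at_k(predict(make_cloze(template, s)), o, k) for s, o in facts]
    return sum(scores) / len(scores)


# Hypothetical stand-in for a masked language model: fixed ranked guesses.
_TOY_RANKINGS: Dict[str, List[str]] = {
    "Dante was born in [MASK].": ["Florence", "Rome"],
    "Mozart was born in [MASK].": ["Vienna", "Salzburg"],
}


def toy_predict(statement: str) -> List[str]:
    return _TOY_RANKINGS.get(statement, [])


facts = [("Dante", "Florence"), ("Mozart", "Salzburg")]
p_at_1 = evaluate(facts, "[X] was born in [Y].", toy_predict, k=1)
p_at_2 = evaluate(facts, "[X] was born in [Y].", toy_predict, k=2)
```

Replacing `toy_predict` with a function that queries a real masked language model would turn this into an actual probe; no fine-tuning is involved at any point, in line with the setting the abstract describes.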
And the Bit Goes Down: Revisiting the Quantization of Neural Networks
Title | And the Bit Goes Down: Revisiting the Quantization of Neural Networks |
Authors | Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou |
Abstract | In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than its weights. The principle of our approach is that it minimizes the loss reconstruction error for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high performing ResNet-50 model to a memory size of 5MB (20x compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification and by compressing a Mask R-CNN with a 26x factor. |
Tasks | Object Classification, Quantization |
Published | 2019-07-12 |
URL | https://arxiv.org/abs/1907.05686v4 |
PDF | https://arxiv.org/pdf/1907.05686v4.pdf |
PWC | https://paperswithcode.com/paper/and-the-bit-goes-down-revisiting-the |
Repo | https://github.com/facebookresearch/kill-the-bits |
Framework | pytorch |
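To make the storage argument in the abstract above concrete, the sketch below shows the bookkeeping behind codebook-based vector quantization: weights are cut into blocks of `d` values, each block is replaced by the index of its nearest centroid, and with 256 centroids each index fits in exactly one byte. This is a simplified illustration under stated assumptions: centroids are matched in weight space, whereas the abstract says the paper minimizes the reconstruction error of the network outputs, and the 2-byte centroid values and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def assign_codes(weights: Sequence[float], codebook: Sequence[Sequence[float]], d: int) -> List[int]:
    """Replace each block of d weights by the index of its nearest centroid."""
    if len(weights) % d:
        raise ValueError("number of weights must be a multiple of d")
    codes = []
    for start in range(0, len(weights), d):
        block = weights[start:start + d]
        codes.append(min(range(len(codebook)), key=lambda j: math.dist(block, codebook[j])))
    return codes


def reconstruct(codes: Sequence[int], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Look the centroids back up to obtain the approximate weights."""
    return [w for c in codes for w in codebook[c]]


def compressed_bytes(n_weights: int, d: int, k: int = 256, bytes_per_centroid_value: int = 2) -> int:
    """Storage for the indices plus the codebook itself."""
    bits_per_code = max(1, math.ceil(math.log2(k)))
    index_bytes = (n_weights // d) * math.ceil(bits_per_code / 8)
    codebook_bytes = k * d * bytes_per_centroid_value
    return index_bytes + codebook_bytes


# Tiny example: four weights, blocks of two, a two-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0]]
codes = assign_codes([0.1, 0.0, 0.9, 1.1], codebook, d=2)
approx = reconstruct(codes, codebook)

# Size accounting for a hypothetical layer of one million float32 weights.
original = 1_000_000 * 4
compressed = compressed_bytes(1_000_000, d=8)
ratio = original / compressed
```

For the hypothetical layer above the ratio comes out at roughly 31x; the 20x and 26x factors quoted in the abstract are whole-model results and depend on choices this sketch does not model.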
Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions
Title | Venn GAN: Discovering Commonalities and Particularities of Multiple Distributions |
Authors | Yasin Yazıcı, Bruno Lecouat, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, Vijay Chandrasekhar |
Abstract | We propose a GAN design which models multiple distributions effectively and discovers their commonalities and particularities. Each data distribution is modeled with a mixture of $K$ generator distributions. As the generators are partially shared between the modeling of different true data distributions, the shared ones capture the commonality of the distributions, while the non-shared ones capture unique aspects of them. We show the effectiveness of our method on various datasets (MNIST, Fashion MNIST, CIFAR-10, Omniglot, CelebA) with compelling results. |
Tasks | Omniglot |
Published | 2019-02-09 |
URL | http://arxiv.org/abs/1902.03444v1 |
PDF | http://arxiv.org/pdf/1902.03444v1.pdf |
PWC | https://paperswithcode.com/paper/venn-gan-discovering-commonalities-and |
Repo | https://github.com/yasinyazici/Venn_GAN |
Framework | tf |
CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke
Title | CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke |
Authors | Hao Yang, Weijian Huang, Kehan Qi, Cheng Li, Xinfeng Liu, Meiyun Wang, Hairong Zheng, Shanshan Wang |
Abstract | Segmenting stroke lesions from T1-weighted MR images is of great value for large-scale stroke rehabilitation neuroimaging analyses. Nevertheless, this task poses great challenges, such as the large range of stroke lesion scales and tissue intensity similarity. The well-known encoder-decoder convolutional neural network, although it has achieved great success in medical image segmentation, may fail to address these challenges due to insufficient use of multi-scale features and context information. To address these challenges, this paper proposes a Cross-Level fusion and Context Inference Network (CLCI-Net) for chronic stroke lesion segmentation from T1-weighted MR images. Specifically, a Cross-Level feature Fusion (CLF) strategy was developed to make full use of different scale features across different levels; by extending Atrous Spatial Pyramid Pooling (ASPP) with CLF, we have enriched multi-scale features to handle the different lesion sizes; in addition, convolutional long short-term memory (ConvLSTM) is employed to infer context information and thus capture fine structures to address the intensity similarity issue. The proposed approach was evaluated on an open-source dataset, the Anatomical Tracings of Lesions After Stroke (ATLAS), with the results showing that our network outperforms five state-of-the-art methods. We make our code and models available at https://github.com/YH0517/CLCI_Net. |
Tasks | Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07008v2 |
PDF | https://arxiv.org/pdf/1907.07008v2.pdf |
PWC | https://paperswithcode.com/paper/clci-net-cross-level-fusion-and-context |
Repo | https://github.com/YH0517/CLCI_Net |
Framework | tf |