February 1, 2020

Paper Group AWR 134

Domain Knowledge Based Brain Tumor Segmentation and Overall Survival Prediction. Learning Semantic Annotations for Tabular Data. Training convolutional neural networks with cheap convolutions and online distillation. Working women and caste in India: A study of social disadvantage using feature attribution. Bayesian Loss for Crowd Count Estimation …

Domain Knowledge Based Brain Tumor Segmentation and Overall Survival Prediction

Title Domain Knowledge Based Brain Tumor Segmentation and Overall Survival Prediction
Authors Xiaoqing Guo, Chen Yang, Pak Lun Lam, Peter Y. M. Woo, Yixuan Yuan
Abstract Automatically segmenting sub-regions of gliomas (necrosis, edema and enhancing tumor) and accurately predicting overall survival (OS) time from multimodal MRI sequences have important clinical significance in the diagnosis, prognosis and treatment of gliomas. However, because tumor appearance is heterogeneous and the physical state of individual patients varies widely, both the segmentation of sub-regions and OS prediction are very challenging. To deal with these challenges, we utilize a 3D dilated multi-fiber network (DMFNet) with weighted dice loss for brain tumor segmentation, which incorporates prior volume statistic knowledge and obtains a balance between small and large objects in MRI scans. For OS prediction, we propose a DenseNet based 3D neural network with position encoding convolutional layer (PECL) to extract meaningful features from T1 contrast MRI, T2 MRI and previously segmented subregions. Both labeled data and unlabeled data are utilized in a semi-supervised learning scheme to prevent over-fitting. The learned deep features, along with handcrafted features (such as age and tumor volume) and position encoding segmentation features, are fed to a Gradient Boosting Decision Tree (GBDT) to predict a specific OS day.
Tasks Brain Tumor Segmentation
Published 2019-12-16
URL https://arxiv.org/abs/1912.07224v1
PDF https://arxiv.org/pdf/1912.07224v1.pdf
PWC https://paperswithcode.com/paper/domain-knowledge-based-brain-tumor
Repo https://github.com/Guo-Xiaoqing/BraTS_OS
Framework none
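
The weighted dice loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the flat per-class list layout, and the example weights are assumptions made here for illustration, and the abstract does not say how the weights are derived from the volume statistics.

```python
def weighted_dice_loss(probs, targets, weights, eps=1e-6):
    """Class-weighted soft Dice loss on flattened voxels.

    probs[c][i]   -- predicted probability of class c at voxel i
    targets[c][i] -- 1 if voxel i belongs to class c, else 0
    weights[c]    -- per-class weight (hypothetical values; a larger
                     weight makes a small sub-region count for more)
    """
    loss = 0.0
    for p, t, w in zip(probs, targets, weights):
        intersection = sum(pi * ti for pi, ti in zip(p, t))
        denominator = sum(p) + sum(t)
        dice = (2.0 * intersection + eps) / (denominator + eps)
        loss += w * (1.0 - dice)
    # Normalise so the loss stays in [0, 1] regardless of the weights.
    return loss / sum(weights)


# Two classes over four voxels; the small class gets three times the weight.
targets = [[1, 0, 0, 0], [0, 1, 1, 1]]
perfect = weighted_dice_loss(targets, targets, weights=[3.0, 1.0])
inverted = weighted_dice_loss([[0, 1, 1, 1], [1, 0, 0, 0]], targets, weights=[3.0, 1.0])
```

A prediction that matches the labels exactly gives a loss close to 0, and a fully inverted prediction gives a loss close to 1.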

Learning Semantic Annotations for Tabular Data

Title Learning Semantic Annotations for Tabular Data
Authors Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Charles Sutton
Abstract The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any metadata. Unlike traditional lexical matching-based methods, we propose a deep prediction model that can fully exploit a table’s contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN), and inter-column semantics features learned by a knowledge base (KB) lookup and query answering algorithm. It exhibits good performance not only on individual table sets, but also when transferring from one table set to another.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1906.00781v1
PDF https://arxiv.org/pdf/1906.00781v1.pdf
PWC https://paperswithcode.com/paper/190600781
Repo https://github.com/alan-turing-institute/SemAIDA
Framework none

Training convolutional neural networks with cheap convolutions and online distillation

Title Training convolutional neural networks with cheap convolutions and online distillation
Authors Jiao Xie, Shaohui Lin, Yichen Zhang, Linkai Luo
Abstract The large memory and computation consumption in convolutional neural networks (CNNs) has been one of the main barriers for deploying them on resource-limited systems. To this end, cheap convolutions (e.g., group convolution, depth-wise convolution, and shift convolution) have recently been used to reduce memory and computation, but they require specific architecture design. Furthermore, directly replacing the standard convolution with these cheap ones results in low discriminability of the compressed networks. In this paper, we propose to use knowledge distillation to improve the performance of the compact student networks with cheap convolutions. In our case, the teacher is a network with the standard convolution, while the student is a simple transformation of the teacher architecture without complicated redesigning. In particular, we propose a novel online distillation method, which constructs the teacher network online without pre-training and conducts mutual learning between the teacher and student network, to improve the performance of the student model. Extensive experiments demonstrate that the proposed approach achieves superior performance while simultaneously reducing the memory and computation overhead of cutting-edge CNNs on different datasets, including CIFAR-10/100 and ImageNet ILSVRC 2012, compared to the state-of-the-art CNN compression and acceleration methods. The codes are publicly available at https://github.com/EthanZhangYC/OD-cheap-convolution.
Tasks
Published 2019-09-28
URL https://arxiv.org/abs/1909.13063v3
PDF https://arxiv.org/pdf/1909.13063v3.pdf
PWC https://paperswithcode.com/paper/training-convolutional-neural-networks-with-2
Repo https://github.com/EthanZhangYC/OD-cheap-convolution
Framework pytorch
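
As a rough illustration of the knowledge distillation term that such a teacher and student setup relies on, the sketch below computes the commonly used temperature-softened KL divergence between teacher and student outputs. It is a generic formulation, not the online mutual-learning scheme of the paper; the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return kl * temperature ** 2


same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In training, a term like this is usually added to the ordinary cross-entropy loss on the labels. In a mutual-learning setting the roles would be swapped to produce a second term for the other network.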

Working women and caste in India: A study of social disadvantage using feature attribution

Title Working women and caste in India: A study of social disadvantage using feature attribution
Authors Kuhu Joshi, Chaitanya K. Joshi
Abstract Women belonging to the socially disadvantaged caste-groups in India have historically been engaged in labour-intensive, blue-collar work. We study whether there has been any change in the ability to predict a woman’s work-status and work-type based on her caste by interpreting machine learning models using feature attribution. We find that caste is now a less important determinant of work for the younger generation of women compared to the older generation. Moreover, younger women from disadvantaged castes are now more likely to be working in white-collar jobs.
Tasks
Published 2019-04-27
URL https://arxiv.org/abs/1905.03092v2
PDF https://arxiv.org/pdf/1905.03092v2.pdf
PWC https://paperswithcode.com/paper/190503092
Repo https://github.com/chaitjo/working-women
Framework none

Bayesian Loss for Crowd Count Estimation with Point Supervision

Title Bayesian Loss for Crowd Count Estimation with Point Supervision
Authors Zhiheng Ma, Xing Wei, Xiaopeng Hong, Yihong Gong
Abstract In crowd counting datasets, each person is annotated by a point, which is usually the center of the head. And the task is to estimate the total count in a crowd scene. Most of the state-of-the-art methods are based on density map estimation, which converts the sparse point annotations into a “ground truth” density map through a Gaussian kernel, and then uses it as the learning target to train a density map estimator. However, such a “ground-truth” density map is imperfect due to occlusions, perspective effects, variations in object shapes, etc. In contrast, we propose Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations. Instead of constraining the value at every pixel in the density map, the proposed training loss adopts a more reliable supervision on the count expectation at each annotated point. Without bells and whistles, the loss function makes substantial improvements over the baseline loss on all tested datasets. Moreover, our proposed loss function equipped with a standard backbone network, without using any external detectors or multi-scale architectures, performs favourably against the state of the art. Our method outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset. The source code is available at https://github.com/ZhihengCV/Baysian-Crowd-Counting.
Tasks Crowd Counting
Published 2019-08-10
URL https://arxiv.org/abs/1908.03684v1
PDF https://arxiv.org/pdf/1908.03684v1.pdf
PWC https://paperswithcode.com/paper/bayesian-loss-for-crowd-count-estimation-with
Repo https://github.com/ZhihengCV/Bayesian-Crowd-Counting
Framework pytorch
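
The idea of supervising the count expectation at each annotated point can be sketched as follows. This is a simplified reading of the abstract, not the released code: it assumes an isotropic Gaussian likelihood around each annotation, equal priors over annotations, and an L1 penalty, and the function name and `sigma` value are chosen here for illustration.

```python
import math


def bayesian_loss(density, pixel_coords, points, sigma=8.0):
    """Sum over annotated points of |1 - expected count for that point|.

    density[m]      -- estimated density at pixel m
    pixel_coords[m] -- (x, y) position of pixel m
    points[n]       -- (x, y) position of annotated head n
    """
    # Gaussian likelihood of every pixel under every annotated point.
    likelihood = [
        [math.exp(-((px - zx) ** 2 + (py - zy) ** 2) / (2.0 * sigma ** 2))
         for (px, py) in pixel_coords]
        for (zx, zy) in points
    ]
    # Per-pixel normaliser, so posteriors over points sum to 1 at each pixel.
    norms = [sum(row[m] for row in likelihood) for m in range(len(pixel_coords))]
    loss = 0.0
    for row in likelihood:
        expected_count = sum(
            (row[m] / norms[m]) * density[m]
            for m in range(len(pixel_coords)) if norms[m] > 0
        )
        loss += abs(1.0 - expected_count)
    return loss


pixels = [(0.0, 0.0), (100.0, 0.0)]
heads = [(0.0, 0.0), (100.0, 0.0)]
good = bayesian_loss([1.0, 1.0], pixels, heads)   # one unit of density per head
empty = bayesian_loss([0.0, 0.0], pixels, heads)  # no density predicted at all
```

Each annotated point should account for exactly one person, so the loss is near 0 when the density splits into one unit per head and equals the number of heads when nothing is predicted.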

PCC Net: Perspective Crowd Counting via Spatial Convolutional Network

Title PCC Net: Perspective Crowd Counting via Spatial Convolutional Network
Authors Junyu Gao, Qi Wang, Xuelong Li
Abstract Crowd counting from a single image is a challenging task due to high appearance similarity, perspective changes and severe congestion. Many methods only focus on the local appearance features and they cannot handle the aforementioned challenges. In order to tackle them, we propose a Perspective Crowd Counting Network (PCC Net), which consists of three parts: 1) Density Map Estimation (DME) focuses on learning very local features for density map estimation; 2) Random High-level Density Classification (R-HDC) extracts global features to predict the coarse density labels of random patches in images; 3) Fore-/Background Segmentation (FBS) encodes mid-level features to segment the foreground and background. Besides, the DULR module is embedded in PCC Net to encode the perspective changes in four directions (Down, Up, Left and Right). The proposed PCC Net is verified on five mainstream datasets: it achieves state-of-the-art performance on one and attains competitive results on the other four. The source code is available at https://github.com/gjy3035/PCC-Net.
Tasks Crowd Counting
Published 2019-05-24
URL https://arxiv.org/abs/1905.10085v1
PDF https://arxiv.org/pdf/1905.10085v1.pdf
PWC https://paperswithcode.com/paper/pcc-net-perspective-crowd-counting-via
Repo https://github.com/gjy3035/PCC-Net
Framework pytorch

MOSS: End-to-End Dialog System Framework with Modular Supervision

Title MOSS: End-to-End Dialog System Framework with Modular Supervision
Authors Weixin Liang, Youzhi Tian, Chengcai Chen, Zhou Yu
Abstract A major bottleneck in training end-to-end task-oriented dialog systems is the lack of data. To utilize limited training data more efficiently, we propose Modular Supervision Network (MOSS), an encoder-decoder training framework that can incorporate supervision from various intermediate dialog system modules including natural language understanding, dialog state tracking, dialog policy learning, and natural language generation. With only 60% of the training data, MOSS-all (i.e., MOSS with supervision from all four dialog modules) outperforms state-of-the-art models on CamRest676. Moreover, introducing modular supervision has even bigger benefits when the dialog task has a more complex dialog state and action space. With only 40% of the training data, MOSS-all outperforms the state-of-the-art model on a complex laptop network troubleshooting dataset, LaptopNetwork, that we introduced. LaptopNetwork consists of conversations between real customers and customer service agents in Chinese. Moreover, the MOSS framework can accommodate dialogs that have supervision from different dialog modules at both the framework level and model level. Therefore, MOSS is extremely flexible to update in a real-world deployment.
Tasks Text Generation
Published 2019-09-12
URL https://arxiv.org/abs/1909.05528v1
PDF https://arxiv.org/pdf/1909.05528v1.pdf
PWC https://paperswithcode.com/paper/moss-end-to-end-dialog-system-framework-with
Repo https://github.com/YouzhiTian/MOSS-End-to-End-Dialog-System-Framework-with-Modular-Supervision
Framework pytorch

Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning

Title Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning
Authors Sean MacAvaney, Luca Soldaini, Nazli Goharian
Abstract While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages. This is primarily due to a lack of data sets that are suitable to train ranking algorithms. In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. Our model is evaluated in a zero-shot setting, meaning that we use it to predict relevance scores for query-document pairs in languages never seen during training. Our results show that the proposed approach can significantly outperform unsupervised retrieval techniques for Arabic, Chinese Mandarin, and Spanish. We also show that augmenting the English training collection with some examples from the target language can sometimes improve performance.
Tasks Ad-Hoc Information Retrieval, Information Retrieval, Zero-Shot Learning
Published 2019-12-30
URL https://arxiv.org/abs/1912.13080v1
PDF https://arxiv.org/pdf/1912.13080v1.pdf
PWC https://paperswithcode.com/paper/teaching-a-new-dog-old-tricks-resurrecting
Repo https://github.com/Georgetown-IR-Lab/multilingual-neural-ir
Framework none

Predicting Diffusion Reach Probabilities via Representation Learning on Social Networks

Title Predicting Diffusion Reach Probabilities via Representation Learning on Social Networks
Authors Furkan Gursoy, Ahmet Onur Durahim
Abstract Diffusion reach probability between two nodes on a network is defined as the probability of a cascade originating from one node reaching another node. An infinite number of cascades would enable calculation of true diffusion reach probabilities between any two nodes. However, there exists only a finite number of cascades and one usually has access only to a small portion of all available cascades. In this work, we address the problem of estimating diffusion reach probabilities given only a limited number of cascades and partial information about the underlying network structure. Our proposed strategy employs node representation learning to generate and feed node embeddings into machine learning algorithms to create models that predict diffusion reach probabilities. We provide experimental analysis using synthetically generated cascades on two real-world social networks. Results show that the proposed method is superior to using values calculated from available cascades when the portion of cascades is small.
Tasks Representation Learning
Published 2019-01-12
URL http://arxiv.org/abs/1901.03829v1
PDF http://arxiv.org/pdf/1901.03829v1.pdf
PWC https://paperswithcode.com/paper/predicting-diffusion-reach-probabilities-via
Repo https://github.com/furkangursoy/RLforDiffPred
Framework none
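
The baseline that the abstract compares against, values calculated directly from the available cascades, might look like the sketch below. The input layout (a source node plus the collection of nodes it reached) and the function name are assumptions made for illustration; the paper's own data format may differ.

```python
from collections import defaultdict


def empirical_reach_probabilities(cascades):
    """Estimate P(cascade from u reaches v) as a simple observed fraction.

    cascades -- list of (source, reached_nodes) pairs
    Returns a dict mapping (source, target) to the fraction of cascades
    started at `source` that reached `target`. Pairs never observed are
    absent, which is where a learned model would have to fill in.
    """
    started = defaultdict(int)
    reached = defaultdict(int)
    for source, nodes in cascades:
        started[source] += 1
        for target in set(nodes):
            if target != source:
                reached[(source, target)] += 1
    return {pair: count / started[pair[0]] for pair, count in reached.items()}


observed = [("a", ["b", "c"]), ("a", ["b"]), ("b", ["c"])]
estimates = empirical_reach_probabilities(observed)
```

With few cascades these fractions are coarse and most node pairs have no estimate at all, which is the regime where the abstract reports that the embedding-based predictor does better.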

Matrix Nets: A New Deep Architecture for Object Detection

Title Matrix Nets: A New Deep Architecture for Object Detection
Authors Abdullah Rashwan, Agastya Kalra, Pascal Poupart
Abstract We present Matrix Nets (xNets), a new deep architecture for object detection. xNets map objects with different sizes and aspect ratios into layers where the sizes and the aspect ratios of the objects within their layers are nearly uniform. Hence, xNets provide a scale and aspect ratio aware architecture. We leverage xNets to enhance key-points based object detection. Our architecture achieves mAP of 47.8 on MS COCO, which is higher than any other single-shot detector while using half the number of parameters and training 3x faster than the next best architecture.
Tasks Object Detection
Published 2019-08-13
URL https://arxiv.org/abs/1908.04646v2
PDF https://arxiv.org/pdf/1908.04646v2.pdf
PWC https://paperswithcode.com/paper/matrix-nets-a-new-deep-architecture-for
Repo https://github.com/lizhe960118/CenterNet.git
Framework none

Recombinator-k-means: A population based algorithm that exploits k-means++ for recombination

Title Recombinator-k-means: A population based algorithm that exploits k-means++ for recombination
Authors Carlo Baldassi
Abstract We present a simple heuristic algorithm for efficiently optimizing the notoriously hard “minimum sum-of-squares clustering” problem, usually addressed by the classical k-means heuristic and its variants. The algorithm, called recombinator-k-means, is very similar to a genetic algorithmic scheme: it uses populations of configurations that are optimized independently in parallel and then recombined in a next-iteration population batch by exploiting a variant of the k-means++ seeding algorithm. An additional reweighting mechanism ensures that the population eventually coalesces into a single solution. Extensive tests measuring optimization objective vs computational time on synthetic and real-world data show that it is the only choice, among state-of-the-art alternatives (simple restarts, random swap, genetic algorithm with pairwise-nearest-neighbor crossover), that consistently produces good results at all time scales, outperforming competitors on large and complicated datasets. The only parameter that requires tuning is the population size. The scheme is rather general (it could be applied even to k-medians or k-medoids, for example). Our implementation is publicly available at https://github.com/carlobaldassi/RecombinatorKMeans.jl.
Tasks
Published 2019-05-01
URL https://arxiv.org/abs/1905.00531v3
PDF https://arxiv.org/pdf/1905.00531v3.pdf
PWC https://paperswithcode.com/paper/190500531
Repo https://github.com/carlobaldassi/RecombinatorKMeans.jl
Framework none
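
For readers unfamiliar with the seeding step that the recombination builds on, here is a minimal sketch of standard k-means++ seeding. It is the textbook procedure, not the variant used by recombinator-k-means, which per the abstract recombines a population of solutions and adds reweighting. The function name and the fixed default seed are assumptions.

```python
import random


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers; each new center is sampled with probability
    proportional to its squared distance from the nearest chosen center."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        sq_dists = [
            min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
            for p in points
        ]
        total = sum(sq_dists)
        if total == 0:
            # Every point coincides with a center; fall back to uniform choice.
            centers.append(rng.choice(points))
            continue
        threshold = rng.random() * total
        cumulative = 0.0
        for p, w in zip(points, sq_dists):
            cumulative += w
            if cumulative > threshold:
                centers.append(p)
                break
        else:
            # Guard against floating-point rounding at the upper end.
            centers.append(points[-1])
    return centers


data = [(0.0, 0.0), (10.0, 10.0)]
seeds = kmeanspp_seed(data, 2)
```

Points that already sit on a center have zero weight, so with two distinct points and k = 2 the second draw always picks the remaining point.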

Show Your Work: Improved Reporting of Experimental Results

Title Show Your Work: Improved Reporting of Experimental Results
Authors Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
Abstract Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs best. We argue for reporting additional details, especially performance on validation data obtained during model development. We present a novel technique for doing so: expected validation performance of the best-found model as a function of computation budget (i.e., the number of hyperparameter search trials or the overall training time). Using our approach, we find multiple recent model comparisons where authors would have reached a different conclusion if they had used more (or less) computation. Our approach also allows us to estimate the amount of computation required to obtain a given accuracy; applying it to several recently published results yields massive variation across papers, from hours to weeks. We conclude with a set of best practices for reporting experimental results which allow for robust future comparisons, and provide code to allow researchers to use our technique.
Tasks
Published 2019-09-06
URL https://arxiv.org/abs/1909.03004v1
PDF https://arxiv.org/pdf/1909.03004v1.pdf
PWC https://paperswithcode.com/paper/show-your-work-improved-reporting-of
Repo https://github.com/allenai/show-your-work
Framework none
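
The expected validation performance curve described in the abstract can be approximated from the validation scores of a finished random hyperparameter search. The sketch below computes the expected maximum of n draws with replacement from the empirical distribution of those scores; the function name is an assumption, and the authors' released code should be preferred for real reporting.

```python
def expected_max_performance(scores, n):
    """Expected best validation score after n trials, using the empirical
    CDF of the observed scores.

    With the scores sorted ascending as v_1 <= ... <= v_N, the maximum of
    n draws equals v_i with probability (i/N)^n - ((i-1)/N)^n.
    """
    ordered = sorted(scores)
    count = len(ordered)
    return sum(
        v * ((i / count) ** n - ((i - 1) / count) ** n)
        for i, v in enumerate(ordered, start=1)
    )


validation_scores = [0.5, 0.7, 0.9]
budget_one = expected_max_performance(validation_scores, 1)
budget_two = expected_max_performance(validation_scores, 2)
```

With a budget of one trial the value equals the mean score, and it rises toward the best observed score as the budget grows. Plotting it against n for two models gives the budget-aware comparison the paper argues for.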

Decision-Directed Data Decomposition

Title Decision-Directed Data Decomposition
Authors Brent D. Davis, Ethan Jackson, Daniel J. Lizotte
Abstract We present an algorithm, Decision-Directed Data Decomposition (D4), which decomposes a dataset into two components. The first contains most of the useful information for a specified supervised learning task. The second orthogonal component contains little information about the task but retains associations and information that were not targeted. The algorithm is simple and scalable. We illustrate its application in image and text processing domains. Our results show that 1) post-hoc application of D4 to an image representation space can remove information about specified concepts without impacting other concepts, 2) D4 is able to improve predictive generalization in certain settings, and 3) applying D4 to word embedding representations produces state-of-the-art results in debiasing.
Tasks Word Embeddings
Published 2019-09-18
URL https://arxiv.org/abs/1909.08159v2
PDF https://arxiv.org/pdf/1909.08159v2.pdf
PWC https://paperswithcode.com/paper/decision-directed-data-decomposition
Repo https://github.com/bdavis56/DDDD
Framework none

Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation

Title Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation
Authors Alexander Levine, Soheil Feizi
Abstract Recently, techniques have been developed to provably guarantee the robustness of a classifier to adversarial perturbations of bounded L_1 and L_2 magnitudes by using randomized smoothing: the robust classification is a consensus of base classifications on randomly noised samples where the noise is additive. In this paper, we extend this technique to the L_0 threat model. We propose an efficient and certifiably robust defense against sparse adversarial attacks by randomly ablating input features, rather than using additive noise. Experimentally, on MNIST, we can certify the classifications of over 50% of images to be robust to any distortion of at most 8 pixels. This is comparable to the observed empirical robustness of unprotected classifiers on MNIST to modern L_0 attacks, demonstrating the tightness of the proposed robustness certificate. We also evaluate our certificate on ImageNet and CIFAR-10. Our certificates represent an improvement on those provided in a concurrent work (Lee et al. 2019) which uses random noise rather than ablation (median certificates of 8 pixels versus 4 pixels on MNIST; 16 pixels versus 1 pixel on ImageNet.) Additionally, we empirically demonstrate that our classifier is highly robust to modern sparse adversarial attacks on MNIST. Our classifications are robust, in median, to adversarial perturbations of up to 31 pixels, compared to 22 pixels reported as the state-of-the-art defense, at the cost of a slight decrease (around 2.3%) in the classification accuracy. Code is available at https://github.com/alevine0/randomizedAblation/.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09272v1
PDF https://arxiv.org/pdf/1911.09272v1.pdf
PWC https://paperswithcode.com/paper/robustness-certificates-for-sparse
Repo https://github.com/alevine0/randomizedAblation
Framework pytorch
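
The intuition behind certifying against sparse attacks with ablation can be sketched with a counting argument. Assume the smoothed classifier votes over base classifications that each see only k randomly retained pixels out of d. An attacker who changes at most rho pixels can only affect the votes whose retained set overlaps the changed pixels. The function names and the simplified check below are assumptions for illustration; the paper additionally works with statistical confidence bounds on the vote shares, which this sketch omits.

```python
from math import comb


def overlap_probability(d, k, rho):
    """Probability that a uniformly random set of k retained pixels (out of d)
    contains at least one of rho attacked pixels."""
    return 1.0 - comb(d - rho, k) / comb(d, k)


def is_certified(p_top, p_runner_up, d, k, rho):
    """Each class's vote share can move by at most the overlap probability,
    so the top class stays on top if its margin exceeds twice that amount."""
    return p_top - p_runner_up > 2.0 * overlap_probability(d, k, rho)


delta = overlap_probability(d=10, k=1, rho=1)
robust = is_certified(0.9, 0.1, d=10, k=1, rho=1)
fragile = is_certified(0.55, 0.45, d=10, k=1, rho=1)
```

Retaining fewer pixels lowers the overlap probability and therefore certifies larger attacks, at the price of a weaker base classifier.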

3D Ken Burns Effect from a Single Image

Title 3D Ken Burns Effect from a Single Image
Authors Simon Niklaus, Long Mai, Jimei Yang, Feng Liu
Abstract The Ken Burns effect allows animating still images with a virtual camera scan and zoom. Adding parallax, which results in the 3D Ken Burns effect, enables significantly more compelling results. Creating such effects manually is time-consuming and demands sophisticated editing skills. Existing automatic methods, however, require multiple input images from varying viewpoints. In this paper, we introduce a framework that synthesizes the 3D Ken Burns effect from a single image, supporting both a fully automatic mode and an interactive mode with the user controlling the camera. Our framework first leverages a depth prediction pipeline, which estimates scene depth that is suitable for view synthesis tasks. To address the limitations of existing depth estimation methods such as geometric distortions, semantic distortions, and inaccurate depth boundaries, we develop a semantic-aware neural network for depth prediction, couple its estimate with a segmentation-based depth adjustment process, and employ a refinement neural network that facilitates accurate depth predictions at object boundaries. According to this depth estimate, our framework then maps the input image to a point cloud and synthesizes the resulting video frames by rendering the point cloud from the corresponding camera positions. To address disocclusions while maintaining geometrically and temporally coherent synthesis results, we utilize context-aware color- and depth-inpainting to fill in the missing information in the extreme views of the camera path, thus extending the scene geometry of the point cloud. Experiments with a wide variety of image content show that our method enables realistic synthesis results. Our study demonstrates that our system allows users to achieve better results while requiring little effort compared to existing solutions for the 3D Ken Burns effect creation.
Tasks Depth Estimation
Published 2019-09-12
URL https://arxiv.org/abs/1909.05483v1
PDF https://arxiv.org/pdf/1909.05483v1.pdf
PWC https://paperswithcode.com/paper/3d-ken-burns-effect-from-a-single-image
Repo https://github.com/sniklaus/3d-ken-burns
Framework pytorch
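
The step of mapping the input image to a point cloud from the depth estimate can be illustrated with a plain pinhole-camera unprojection. This is a sketch under assumed intrinsics (a single focal length in pixels and a principal point `cx`, `cy`), not the rendering code from the repository.

```python
def unproject_to_point_cloud(depth, focal_length, cx, cy):
    """Turn a depth map (list of rows) into 3D points with a pinhole model.

    A pixel at column u, row v with depth z maps to
    ((u - cx) * z / f, (v - cy) * z / f, z).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / focal_length
            y = (v - cy) * z / focal_length
            points.append((x, y, z))
    return points


depth_map = [[2.0, 2.0], [2.0, 2.0]]
cloud = unproject_to_point_cloud(depth_map, focal_length=1.0, cx=0.0, cy=0.0)
```

Rendering the resulting points from a slightly moved virtual camera produces the parallax. The holes that open up at depth discontinuities are what the inpainting stage described in the abstract fills in.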