Paper Group ANR 972
Exclusive Independent Probability Estimation using Deep 3D Fully Convolutional DenseNets: Application to IsoIntense Infant Brain MRI Segmentation
Title | Exclusive Independent Probability Estimation using Deep 3D Fully Convolutional DenseNets: Application to IsoIntense Infant Brain MRI Segmentation |
Authors | Seyed Raein Hashemi, Sanjay P. Prabhu, Simon K. Warfield, Ali Gholipour |
Abstract | The most recent fast and accurate image segmentation methods are built upon fully convolutional deep neural networks. In this paper, we propose new deep learning strategies for DenseNets to improve the segmentation of images with subtle differences in intensity values and features. We aim to segment brain tissue on infant brain MRI at about 6 months of age, where white matter and gray matter of the developing brain show similar T1 and T2 relaxation times and thus appear to have similar intensity values on both T1- and T2-weighted MRI scans. Brain tissue segmentation at this age is, therefore, very challenging. To this end, we propose an exclusive multi-label training strategy to segment the mutually exclusive brain tissues with similarity loss functions that automatically balance the training based on class prevalence. Using our proposed training strategy based on similarity loss functions and patch prediction fusion, we decrease the number of parameters in the network and reduce the complexity of the training process by focusing attention on fewer tasks, while mitigating the effects of data imbalance between labels and inaccuracies near patch borders. By taking advantage of these strategies we were able to perform fast image segmentation (90 seconds per 3D volume), using a network with fewer parameters than many state-of-the-art networks, overcoming issues such as 3D vs 2D training and large vs small patch size selection, while achieving the top performance in segmenting brain tissue among all methods tested in first and second round submissions of the isointense infant brain MRI segmentation (iSeg) challenge according to the official challenge test results. Our proposed strategy improves the training process through balanced training and by reducing its complexity while providing a trained model that works for any size input image and is fast and more accurate than many state-of-the-art methods. |
Tasks | Infant Brain Mri Segmentation, Semantic Segmentation |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.08168v3 |
http://arxiv.org/pdf/1809.08168v3.pdf | |
PWC | https://paperswithcode.com/paper/exclusive-independent-probability-estimation |
Repo | |
Framework | |
Deep Learning Super-Resolution Enables Rapid Simultaneous Morphological and Quantitative Magnetic Resonance Imaging
Title | Deep Learning Super-Resolution Enables Rapid Simultaneous Morphological and Quantitative Magnetic Resonance Imaging |
Authors | Akshay Chaudhari, Zhongnan Fang, Jin Hyung Lee, Garry Gold, Brian Hargreaves |
Abstract | Obtaining magnetic resonance images (MRI) with high resolution and generating quantitative image-based biomarkers for assessing tissue biochemistry is crucial in clinical and research applications. However, acquiring quantitative biomarkers requires high signal-to-noise ratio (SNR), which is at odds with high-resolution in MRI, especially in a single rapid sequence. In this paper, we demonstrate how super-resolution can be utilized to maintain adequate SNR for accurate quantification of the T2 relaxation time biomarker, while simultaneously generating high-resolution images. We compare the efficacy of resolution enhancement using metrics such as peak SNR and structural similarity. We assess accuracy of cartilage T2 relaxation times by comparing against a standard reference method. Our evaluation suggests that SR can successfully maintain high-resolution and generate accurate biomarkers for accelerating MRI scans and enhancing the value of clinical and research MRI. |
Tasks | Super-Resolution |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.04447v1 |
http://arxiv.org/pdf/1808.04447v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-super-resolution-enables-rapid |
Repo | |
Framework | |
Seq2Slate: Re-ranking and Slate Optimization with RNNs
Title | Seq2Slate: Re-ranking and Slate Optimization with RNNs |
Authors | Irwan Bello, Sayali Kulkarni, Sagar Jain, Craig Boutilier, Ed Chi, Elad Eban, Xiyang Luo, Alan Mackey, Ofer Meshi |
Abstract | Ranking is a central task in machine learning and information retrieval. In this task, it is especially important to present the user with a slate of items that is appealing as a whole. This in turn requires taking into account interactions between items, since intuitively, placing an item on the slate affects the decision of which other items should be placed alongside it. In this work, we propose a sequence-to-sequence model for ranking called seq2slate. At each step, the model predicts the next ‘best’ item to place on the slate given the items already selected. The sequential nature of the model allows complex dependencies between the items to be captured directly in a flexible and scalable way. We show how to learn the model end-to-end from weak supervision in the form of easily obtained click-through data. We further demonstrate the usefulness of our approach in experiments on standard ranking benchmarks as well as in a real-world recommendation system. |
Tasks | Information Retrieval |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02019v3 |
http://arxiv.org/pdf/1810.02019v3.pdf | |
PWC | https://paperswithcode.com/paper/seq2slate-re-ranking-and-slate-optimization |
Repo | |
Framework | |
Large Scale Clustering with Variational EM for Gaussian Mixture Models
Title | Large Scale Clustering with Variational EM for Gaussian Mixture Models |
Authors | Florian Hirschberger, Dennis Forster, Jörg Lücke |
Abstract | How can we efficiently find large numbers of clusters in large data sets with high-dimensional data points? Our aim is to explore the current efficiency and large-scale limits in fitting a parametric model for clustering to data distributions. To do so, we combine recent lines of research which have previously focused on separate specific methods for complexity reduction. We first show theoretically how the clustering objective of variational EM (which reduces complexity for many clusters) can be combined with coreset objectives (which reduce complexity for many data points). Secondly, we realize a concrete highly efficient iterative procedure which combines and translates the theoretical complexity gains of truncated variational EM and coresets into a practical algorithm. For very large scales, the high efficiency of parameter updates then requires (A) highly efficient coreset construction and (B) highly efficient initialization procedures (seeding) in order to avoid computational bottlenecks. Fortunately very efficient coreset construction has become available in the form of light-weight coresets, and very efficient initialization has become available in the form of AFK-MC$^2$ seeding. The resulting algorithm features balanced computational costs across all constituting components. In applications to standard large-scale benchmarks for clustering, we investigate the algorithm’s efficiency/quality trade-off. Compared to the best recent approaches, we observe speedups of up to one order of magnitude, and up to two orders of magnitude compared to the $k$-means++ baseline. To demonstrate that the observed efficiency enables previously considered unfeasible applications, we cluster the entire and unscaled 80 Mio. Tiny Images dataset into up to 32,000 clusters. To the knowledge of the authors, this represents the largest scale fit of a parametric data model for clustering reported so far. |
Tasks | Quantization |
Published | 2018-10-01 |
URL | https://arxiv.org/abs/1810.00803v3 |
https://arxiv.org/pdf/1810.00803v3.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-training-of-large-scale-gaussian |
Repo | |
Framework | |
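The abstract above relies on light-weight coresets to shrink the data before clustering. As a rough illustration only, the sketch below follows the commonly cited light-weight coreset sampling rule (half uniform mass, half mass proportional to squared distance from the data mean). That rule is an assumption taken from outside this listing, not something the abstract states, and the name `lightweight_coreset` is hypothetical. The variational EM step itself is not shown.

```python
import random

def lightweight_coreset(points, m, seed=0):
    """Sample a weighted subset of `points` (a list of equal-length lists).

    Assumed sampling rule: q(x) = 0.5/n + 0.5 * d(x, mean)^2 / sum_x' d(x', mean)^2.
    Each sampled point gets weight 1 / (m * q(x)).
    Returns (coreset, q) where coreset is a list of (point, weight) pairs.
    """
    n = len(points)
    dim = len(points[0])
    # Mean of the full data set.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    # Squared Euclidean distance of every point to the mean.
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)
    if total == 0:
        # All points coincide: fall back to uniform sampling.
        q = [1.0 / n] * n
    else:
        q = [0.5 / n + 0.5 * d / total for d in sq_dist]
    rng = random.Random(seed)
    indices = rng.choices(range(n), weights=q, k=m)
    coreset = [(points[i], 1.0 / (m * q[i])) for i in indices]
    return coreset, q

data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
coreset, q = lightweight_coreset(data, m=3)
```

Points far from the mean are sampled more often but receive smaller weights, so weighted sums over the coreset remain unbiased estimates of sums over the full data.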
Deep Generative Models with Learnable Knowledge Constraints
Title | Deep Generative Models with Learnable Knowledge Constraints |
Authors | Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Xiaodan Liang, Lianhui Qin, Haoye Dong, Eric Xing |
Abstract | The broad set of deep generative models (DGMs) has achieved remarkable advances. However, it is often difficult to incorporate rich structured domain knowledge with the end-to-end DGMs. Posterior regularization (PR) offers a principled framework to impose structured constraints on probabilistic models, but has limited applicability to the diverse DGMs that can lack a Bayesian formulation or even explicit density evaluation. PR also requires constraints to be fully specified a priori, which is impractical or suboptimal for complex knowledge with learnable uncertain parts. In this paper, we establish mathematical correspondence between PR and reinforcement learning (RL), and, based on the connection, expand PR to learn constraints as the extrinsic reward in RL. The resulting algorithm is model-agnostic to apply to any DGMs, and is flexible to adapt arbitrary constraints with the model jointly. Experiments on human image generation and templated sentence generation show models with learned knowledge constraints by our algorithm greatly improve over base generative models. |
Tasks | Image Generation |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09764v2 |
http://arxiv.org/pdf/1806.09764v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-models-with-learnable |
Repo | |
Framework | |
Asymmetric Loss Functions and Deep Densely Connected Networks for Highly Imbalanced Medical Image Segmentation: Application to Multiple Sclerosis Lesion Detection
Title | Asymmetric Loss Functions and Deep Densely Connected Networks for Highly Imbalanced Medical Image Segmentation: Application to Multiple Sclerosis Lesion Detection |
Authors | Seyed Raein Hashemi, Seyed Sadegh Mohseni Salehi, Deniz Erdogmus, Sanjay P. Prabhu, Simon K. Warfield, Ali Gholipour |
Abstract | Fully convolutional deep neural networks have been asserted to be fast and precise frameworks with great potential in image segmentation. One of the major challenges in training such networks arises when data is unbalanced, which is common in many medical imaging applications such as lesion segmentation, where lesion class voxels are often far fewer than non-lesion voxels. A network trained with unbalanced data may make predictions with high precision and low recall, being severely biased towards the non-lesion class, which is particularly undesirable in most medical applications where false negatives (FNs) are more important than false positives (FPs). Various methods have been proposed to address this problem, more recently similarity loss functions and focal loss. In this work we trained fully convolutional deep neural networks using an asymmetric similarity loss function to mitigate the issue of data imbalance and achieve a much better tradeoff between precision and recall. To this end, we developed a 3D FC-DenseNet with large overlapping image patches as input and an asymmetric similarity loss layer based on Tversky index (using Fbeta scores). We used large overlapping image patches as inputs for intrinsic and extrinsic data augmentation, a patch selection algorithm, and a patch prediction fusion strategy using B-spline weighted soft voting to account for the uncertainty of prediction in patch borders. We applied this method to MS lesion segmentation based on two different datasets of MSSEG and ISBI longitudinal MS lesion segmentation challenge, where we achieved top performance in both challenges. Our network trained with focal loss ranked first according to the ISBI challenge overall score and resulted in the lowest reported lesion false positive rate among all submitted methods. Our network trained with the asymmetric similarity loss led to the lowest surface distance and the best lesion true positive rate. |
Tasks | Data Augmentation, Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.11078v4 |
http://arxiv.org/pdf/1803.11078v4.pdf | |
PWC | https://paperswithcode.com/paper/asymmetric-loss-functions-and-deep-densely |
Repo | |
Framework | |
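The asymmetric similarity loss in the abstract above is based on the Tversky index. A minimal sketch of that index follows, assuming the standard definition TP / (TP + alpha * FP + beta * FN) computed on soft predictions. The function names and the default weights `alpha=0.3`, `beta=0.7` are illustrative assumptions, not values taken from the paper.

```python
def tversky_index(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """Soft Tversky index between predicted probabilities and binary labels.

    alpha weights false positives and beta weights false negatives.
    With alpha = beta = 0.5 this reduces to the Dice coefficient.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def tversky_loss(pred, target, alpha=0.3, beta=0.7):
    """Loss to minimise; beta > alpha penalises missed lesion voxels more."""
    return 1.0 - tversky_index(pred, target, alpha, beta)

# One missed lesion voxel versus one spurious lesion voxel.
loss_missed = tversky_loss([1, 0, 0, 0], [1, 1, 0, 0])
loss_spurious = tversky_loss([1, 1, 0, 0], [1, 0, 0, 0])
```

With these weights a false negative costs more than a false positive, which is the bias towards recall that the abstract argues for in lesion segmentation.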
ADARES: Adaptive Resource Management for Virtual Machines
Title | ADARES: Adaptive Resource Management for Virtual Machines |
Authors | Ignacio Cano, Lequn Chen, Pedro Fonseca, Tianqi Chen, Chern Cheah, Karan Gupta, Ramesh Chandra, Arvind Krishnamurthy |
Abstract | Virtual execution environments allow for consolidation of multiple applications onto the same physical server, thereby enabling more efficient use of server resources. However, users often statically configure the resources of virtual machines through guesswork, resulting in either insufficient resource allocations that hinder VM performance, or excessive allocations that waste precious data center resources. In this paper, we first characterize real-world resource allocation and utilization of VMs through the analysis of an extensive dataset, consisting of more than 250k VMs from over 3.6k private enterprise clusters. Our large-scale analysis confirms that VMs are often misconfigured, either overprovisioned or underprovisioned, and that this problem is pervasive across a wide range of private clusters. We then propose ADARES, an adaptive system that dynamically adjusts VM resources using machine learning techniques. In particular, ADARES leverages the contextual bandits framework to effectively manage the adaptations. Our system exploits easily collectible data, at the cluster, node, and VM levels, to make more sensible allocation decisions, and uses transfer learning to safely explore the configuration space and speed up training. Our empirical evaluation shows that ADARES can significantly improve system utilization without sacrificing performance. For instance, when compared to threshold and prediction-based baselines, it achieves more predictable VM-level performance and also reduces the amount of virtual CPUs and memory provisioned by up to 35% and 60% respectively for synthetic workloads on real clusters. |
Tasks | Multi-Armed Bandits, Transfer Learning |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01837v2 |
http://arxiv.org/pdf/1812.01837v2.pdf | |
PWC | https://paperswithcode.com/paper/adares-adaptive-resource-management-for |
Repo | |
Framework | |
A Unified Framework for Clustering Constrained Data without Locality Property
Title | A Unified Framework for Clustering Constrained Data without Locality Property |
Authors | Hu Ding, Jinhui Xu |
Abstract | In this paper, we consider a class of constrained clustering problems of points in $\mathbb{R}^{d}$, where $d$ could be rather high. A common feature of these problems is that their optimal clusterings no longer have the locality property (due to the additional constraints), which is a key property required by many algorithms for their unconstrained counterparts. To overcome the difficulty caused by the loss of locality, we present in this paper a unified framework, called Peeling-and-Enclosing (PnE), to iteratively solve two variants of the constrained clustering problems, constrained $k$-means clustering ($k$-CMeans) and constrained $k$-median clustering ($k$-CMedian). Our framework is based on two standalone geometric techniques, called Simplex Lemma and Weaker Simplex Lemma, for $k$-CMeans and $k$-CMedian, respectively. The simplex lemma (or weaker simplex lemma) enables us to efficiently approximate the mean (or median) point of an unknown set of points by searching a small-size grid, independent of the dimensionality of the space, in a simplex (or the surrounding region of a simplex), and thus can be used to handle high dimensional data. If $k$ and $\frac{1}{\epsilon}$ are fixed numbers, our framework generates, in nearly linear time (i.e., $O(n(\log n)^{k+1}d)$), $O((\log n)^{k})$ $k$-tuple candidates for the $k$ mean or median points, and one of them induces a $(1+\epsilon)$-approximation for $k$-CMeans or $k$-CMedian, where $n$ is the number of points. Combining this unified framework with a problem-specific selection algorithm (which determines the best $k$-tuple candidate), we obtain a $(1+\epsilon)$-approximation for each of the constrained clustering problems. We expect that our technique will be applicable to other constrained clustering problems without locality. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01049v1 |
http://arxiv.org/pdf/1810.01049v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-clustering |
Repo | |
Framework | |
Mirror, Mirror, on the Wall, Who’s Got the Clearest Image of Them All? - A Tailored Approach to Single Image Reflection Removal
Title | Mirror, Mirror, on the Wall, Who’s Got the Clearest Image of Them All? - A Tailored Approach to Single Image Reflection Removal |
Authors | Daniel Heydecker, Georg Maierhofer, Angelica I. Aviles-Rivero, Qingnan Fan, Dongdong Chen, Carola-Bibiane Schönlieb, Sabine Süsstrunk |
Abstract | Removing reflection artefacts from a single image is a problem of both theoretical and practical interest, which still presents challenges because of the massively ill-posed nature of the problem. In this work, we propose a technique based on a novel optimisation problem. Firstly, we introduce a simple user interaction scheme, which helps minimise information loss in reflection-free regions. Secondly, we introduce an $H^2$ fidelity term, which preserves fine detail while enforcing global colour similarity. We show that this combination allows us to mitigate some major drawbacks of the existing methods for reflection removal. We demonstrate, through numerical and visual experiments, that our method is able to outperform the state-of-the-art methods and compete with recent deep-learning approaches. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11589v2 |
http://arxiv.org/pdf/1805.11589v2.pdf | |
PWC | https://paperswithcode.com/paper/mirror-mirror-on-the-wall-whos-got-the |
Repo | |
Framework | |
Grouped Gaussian Processes for Solar Power Prediction
Title | Grouped Gaussian Processes for Solar Power Prediction |
Authors | Astrid Dahl, Edwin V. Bonilla |
Abstract | We consider multi-task regression models where the observations are assumed to be a linear combination of several latent node functions and weight functions, which are both drawn from Gaussian process priors. Driven by the problem of developing scalable methods for forecasting distributed solar and other renewable power generation, we propose coupled priors over groups of (node or weight) processes to exploit spatial dependence between functions. We estimate forecast models for solar power at multiple distributed sites and ground wind speed at multiple proximate weather stations. Our results show that our approach maintains or improves point-prediction accuracy relative to competing solar benchmarks and improves over wind forecast benchmark models on all measures. Our approach consistently dominates the equivalent model without coupled priors, achieving faster gains in forecast accuracy. At the same time our approach provides better quantification of predictive uncertainties. |
Tasks | Gaussian Processes |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02543v3 |
http://arxiv.org/pdf/1806.02543v3.pdf | |
PWC | https://paperswithcode.com/paper/grouped-gaussian-processes-for-solar-power |
Repo | |
Framework | |
Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing
Title | Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing |
Authors | Zehong Hu, Yitao Liang, Yang Liu, Jie Zhang |
Abstract | Incentive mechanisms for crowdsourcing are designed to incentivize financially self-interested workers to generate and report high-quality labels. Existing mechanisms are often developed as one-shot static solutions, assuming a certain level of knowledge about worker models (expertise levels, costs of exerting efforts, etc.). In this paper, we propose a novel inference aided reinforcement mechanism that learns to incentivize high-quality data sequentially and requires no such prior assumptions. Specifically, we first design a Gibbs sampling augmented Bayesian inference algorithm to estimate workers’ labeling strategies from the collected labels at each step. Then we propose a reinforcement incentive learning (RIL) method, building on top of the above estimates, to uncover how workers respond to different payments. RIL dynamically determines the payment without accessing any ground-truth labels. We theoretically prove that RIL is able to incentivize rational workers to provide high-quality labels. Empirical results show that our mechanism performs consistently well under both rational and non-fully rational (adaptive learning) worker models. Besides, the payments offered by RIL are more robust and have lower variances compared to the existing one-shot mechanisms. |
Tasks | Bayesian Inference |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00206v1 |
http://arxiv.org/pdf/1806.00206v1.pdf | |
PWC | https://paperswithcode.com/paper/inference-aided-reinforcement-learning-for |
Repo | |
Framework | |
Variational Semi-supervised Aspect-term Sentiment Analysis via Transformer
Title | Variational Semi-supervised Aspect-term Sentiment Analysis via Transformer |
Authors | Xingyi Cheng, Weidi Xu, Taifeng Wang, Wei Chu |
Abstract | Aspect-term sentiment analysis (ATSA) is a longstanding challenge in natural language understanding. It requires fine-grained semantic reasoning about a target entity appearing in the text. As manual annotation over the aspects is laborious and time-consuming, the amount of labeled data is limited for supervised learning. This paper proposes a semi-supervised method for the ATSA problem by using the Variational Autoencoder based on Transformer (VAET), which models the latent distribution via variational inference. By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier. Our method is classifier agnostic, i.e., the classifier is an independent module and various advanced supervised models can be integrated. Experimental results are obtained on the SemEval 2014 task 4 and show that our method is effective with four classical classifiers. The proposed method outperforms two general semi-supervised methods and achieves state-of-the-art performance. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2018-10-24 |
URL | https://arxiv.org/abs/1810.10437v3 |
https://arxiv.org/pdf/1810.10437v3.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-target-level-sentiment |
Repo | |
Framework | |
Image Registration and Predictive Modeling: Learning the Metric on the Space of Diffeomorphisms
Title | Image Registration and Predictive Modeling: Learning the Metric on the Space of Diffeomorphisms |
Authors | Ayagoz Mussabayeva, Alexey Kroshnin, Anvar Kurmukov, Yulia Dodonova, Li Shen, Shan Cong, Lei Wang, Boris A. Gutman |
Abstract | We present a method for metric optimization in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, by treating the induced Riemannian metric on the space of diffeomorphisms as a kernel in a machine learning context. For simplicity, we choose the kernel Fisher Linear Discriminant Analysis (KLDA) as the framework. Optimizing the kernel parameters in an Expectation-Maximization framework, we define model fidelity via the hinge loss of the decision function. The resulting algorithm optimizes the parameters of the LDDMM norm-inducing differential operator as a solution to a group-wise registration and classification problem. In practice, this may lead to a biology-aware registration, focusing its attention on the predictive task at hand such as identifying the effects of disease. We first tested our algorithm on a synthetic dataset, showing that our parameter selection improves registration quality and classification accuracy. We then tested the algorithm on 3D subcortical shapes from the Schizophrenia cohort Schizconnect. Our Schizophrenia-Control predictive model showed significant improvement in ROC AUC compared to baseline parameters. |
Tasks | Image Registration |
Published | 2018-08-10 |
URL | http://arxiv.org/abs/1808.04439v1 |
http://arxiv.org/pdf/1808.04439v1.pdf | |
PWC | https://paperswithcode.com/paper/image-registration-and-predictive-modeling |
Repo | |
Framework | |
Learning Strict Identity Mappings in Deep Residual Networks
Title | Learning Strict Identity Mappings in Deep Residual Networks |
Authors | Xin Yu, Zhiding Yu, Srikumar Ramalingam |
Abstract | A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep networks naturally pushed researchers to use enormous resources to achieve the best performance. Consequently, in many applications super deep residual networks were employed for just a marginal improvement in performance. In this paper, we propose epsilon-ResNet, which allows us to automatically discard redundant layers, i.e., layers that produce responses smaller than a threshold epsilon, with marginal or no loss in performance. The epsilon-ResNet architecture can be achieved using a few additional rectified linear units in the original ResNet. Our method uses neither additional variables nor numerous trials like other hyper-parameter optimization techniques. The layer selection is achieved using a single training process and the evaluation is performed on CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets. In some instances, we achieve about 80% reduction in the number of parameters. |
Tasks | Object Detection, Semantic Segmentation |
Published | 2018-04-05 |
URL | https://arxiv.org/abs/1804.01661v5 |
https://arxiv.org/pdf/1804.01661v5.pdf | |
PWC | https://paperswithcode.com/paper/resnet-sparsifier-learning-strict-identity |
Repo | |
Framework | |
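The thresholding idea in the abstract above (a residual branch whose responses all fall below epsilon is treated as a strict identity mapping) can be illustrated with a simplified gate. This is a sketch under stated assumptions: it uses a plain conditional rather than the ReLU-based construction the abstract mentions, it works on flat Python lists instead of tensors, and the names `epsilon_gate` and `residual_block` are hypothetical.

```python
def epsilon_gate(residual, eps):
    """Return zeros if every response is smaller than eps in magnitude,
    otherwise pass the residual through unchanged."""
    if all(abs(r) < eps for r in residual):
        return [0.0] * len(residual)
    return list(residual)

def residual_block(x, residual_fn, eps):
    """Compute x + gate(F(x)); the block becomes an exact identity
    whenever F(x) is uniformly below the threshold."""
    gated = epsilon_gate(residual_fn(x), eps)
    return [xi + gi for xi, gi in zip(x, gated)]

x = [1.0, 2.0]
# A branch with tiny responses is dropped: the block returns x unchanged.
identity_out = residual_block(x, lambda v: [0.001 * a for a in v], eps=0.01)
# A branch with large responses is kept.
active_out = residual_block(x, lambda v: [0.5 * a for a in v], eps=0.01)
```

A block that always takes the identity path contributes nothing at inference time, which is why such layers can be discarded after training.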
Popularity-Aware Item Weighting for Long-Tail Recommendation
Title | Popularity-Aware Item Weighting for Long-Tail Recommendation |
Authors | Himan Abdollahpouri, Robin Burke, Bamshad Mobasher |
Abstract | Many recommender systems suffer from the popularity bias problem: popular items are recommended frequently, while less popular, niche products are recommended rarely, if at all. However, those ignored products are exactly the ones that businesses need to find customers for, and recommending them would be more beneficial. In this paper, we examine an item weighting approach to improve long-tail recommendation. Our approach works as a simple yet powerful add-on to existing recommendation algorithms for making a tunable trade-off between accuracy and long-tail coverage. |
Tasks | Recommendation Systems |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05382v3 |
http://arxiv.org/pdf/1802.05382v3.pdf | |
PWC | https://paperswithcode.com/paper/popularity-aware-item-weighting-for-long-tail |
Repo | |
Framework | |