Paper Group ANR 214
Medical Image Synthesis for Data Augmentation and Anonymization using Generative Adversarial Networks
Title | Medical Image Synthesis for Data Augmentation and Anonymization using Generative Adversarial Networks |
Authors | Hoo-Chang Shin, Neil A Tenenholtz, Jameson K Rogers, Christopher G Schwarz, Matthew L Senjem, Jeffrey L Gunter, Katherine Andriole, Mark Michalski |
Abstract | Data diversity is critical to success when training deep learning models. Medical imaging data sets are often imbalanced as pathologic findings are generally rare, which introduces significant challenges when training deep learning models. In this work, we propose a method to generate synthetic abnormal MRI images with brain tumors by training a generative adversarial network using two publicly available data sets of brain MRI. We demonstrate two unique benefits that the synthetic images provide. First, we illustrate improved performance on tumor segmentation by leveraging the synthetic images as a form of data augmentation. Second, we demonstrate the value of generative models as an anonymization tool, achieving comparable tumor segmentation results when trained on the synthetic data versus when trained on real subject data. Together, these results offer a potential solution to two of the largest challenges facing machine learning in medical imaging, namely the small incidence of pathological findings, and the restrictions around sharing of patient data. |
Tasks | Data Augmentation, Image Generation, Medical Image Generation |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10225v2 |
PDF | http://arxiv.org/pdf/1807.10225v2.pdf |
PWC | https://paperswithcode.com/paper/medical-image-synthesis-for-data-augmentation |
Repo | |
Framework | |
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
Title | PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits |
Authors | Bianca Dumitrascu, Karen Feng, Barbara E Engelhardt |
Abstract | We address the problem of regret minimization in logistic contextual bandits, where a learner decides among sequential actions or arms given their respective contexts to maximize binary rewards. Using a fast inference procedure with Polya-Gamma distributed augmentation variables, we propose an improved version of Thompson Sampling, a Bayesian formulation of contextual bandits with near-optimal performance. Our approach, Polya-Gamma augmented Thompson Sampling (PG-TS), achieves state-of-the-art performance on simulated and real data. PG-TS explores the action space efficiently and exploits high-reward arms, quickly converging to solutions of low regret. Its explicit estimation of the posterior distribution of the context feature covariance leads to substantial empirical gains over approximate approaches. PG-TS is the first approach to demonstrate the benefits of Polya-Gamma augmentation in bandits and to propose an efficient Gibbs sampler for approximating the analytically unsolvable integral of logistic contextual bandits. |
Tasks | Multi-Armed Bandits |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07458v1 |
PDF | http://arxiv.org/pdf/1805.07458v1.pdf |
PWC | https://paperswithcode.com/paper/pg-ts-improved-thompson-sampling-for-logistic |
Repo | |
Framework | |
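The sample-then-act loop that Thompson Sampling relies on can be illustrated with a much simpler setting than the one in the abstract. The sketch below is not PG-TS: it assumes a non-contextual Bernoulli bandit with a conjugate Beta posterior per arm, so no Polya-Gamma augmentation or Gibbs sampling is needed. The function name `thompson_bernoulli` and the reward probabilities are hypothetical.

```python
import random


def thompson_bernoulli(true_probs, rounds, seed=0):
    """Beta-Bernoulli Thompson Sampling (illustrative assumption, not PG-TS).

    true_probs: hidden success probability of each arm (hypothetical values).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior plus observed counts).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_bernoulli([0.1, 0.9], rounds=2000)
```

In the logistic contextual setting the posterior over the weights has no closed form, which is the gap the abstract says the Polya-Gamma Gibbs sampler is meant to fill.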
Understanding Behavior of Clinical Models under Domain Shifts
Title | Understanding Behavior of Clinical Models under Domain Shifts |
Authors | Jayaraman J. Thiagarajan, Deepta Rajan, Prasanna Sattigeri |
Abstract | The hypothesis that computational models can be reliable enough to be adopted in prognosis and patient care is revolutionizing healthcare. Deep learning, in particular, has been a game changer in building predictive models, thus leading to community-wide data curation efforts. However, due to inherent variabilities in population characteristics and biological systems, these models are often biased toward the training datasets. This can be limiting when models are deployed in new environments, when there are systematic domain shifts not known a priori. In this paper, we propose to emulate a large class of domain shifts that can occur in clinical settings with a given dataset, and argue that evaluating the behavior of predictive models in light of those shifts is an effective way to quantify their reliability. More specifically, we develop an approach for building realistic scenarios, based on analysis of disease landscapes in multi-label classification. Using the openly available MIMIC-III EHR dataset for phenotyping, for the first time, our work sheds light on data regimes where deep clinical models can fail to generalize. This work emphasizes the need for novel validation mechanisms driven by real-world domain shifts in AI for healthcare. |
Tasks | Domain Adaptation, Multi-Label Classification, Unsupervised Domain Adaptation |
Published | 2018-09-20 |
URL | https://arxiv.org/abs/1809.07806v2 |
PDF | https://arxiv.org/pdf/1809.07806v2.pdf |
PWC | https://paperswithcode.com/paper/can-deep-clinical-models-handle-real-world |
Repo | |
Framework | |
Structure Learning of Markov Random Fields through Grow-Shrink Maximum Pseudolikelihood Estimation
Title | Structure Learning of Markov Random Fields through Grow-Shrink Maximum Pseudolikelihood Estimation |
Authors | Yuya Takashina, Shuyo Nakatani, Masato Inoue |
Abstract | Learning the structure of Markov random fields (MRFs) plays an important role in multivariate analysis. The importance has been increasing with the recent rise of statistical relational models since the MRF serves as a building block of these models such as Markov logic networks. There are two fundamental ways to learn structures of MRFs: methods based on parameter learning and those based on independence tests. The former methods more or less assume certain forms of distribution, so they potentially perform poorly when the assumption is not satisfied. The latter can learn an MRF structure without a strong distributional assumption, but sometimes it is unclear what objective function is maximized/minimized in these methods. In this paper, we follow the latter, but we explicitly define the optimization problem of MRF structure learning as maximum pseudolikelihood estimation (MPLE) with respect to the edge set. As a result, the proposed solution successfully deals with the symmetricity in MRFs, whereas such symmetricity is not taken into account in most existing independence test techniques. The proposed method achieved higher accuracy than previous methods when there were asymmetric dependencies in our experiments. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.00944v1 |
PDF | http://arxiv.org/pdf/1807.00944v1.pdf |
PWC | https://paperswithcode.com/paper/structure-learning-of-markov-random-fields |
Repo | |
Framework | |
Cluster-based Approach to Improve Affect Recognition from Passively Sensed Data
Title | Cluster-based Approach to Improve Affect Recognition from Passively Sensed Data |
Authors | Mawulolo K. Ameko, Lihua Cai, Mehdi Boukhechba, Alexander Daros, Philip I. Chow, Bethany A. Teachman, Matthew S. Gerber, Laura E. Barnes |
Abstract | Negative affect is a proxy for mental health in adults. By being able to predict participants’ negative affect states unobtrusively, researchers and clinicians will be better positioned to deliver targeted, just-in-time mental health interventions via mobile applications. This work attempts to personalize the passive recognition of negative affect states via group-based modeling of user behavior patterns captured from mobility, communication, and activity patterns. Results show that group models outperform generalized models in a dataset based on two weeks of users’ daily lives. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1802.00029v1 |
PDF | http://arxiv.org/pdf/1802.00029v1.pdf |
PWC | https://paperswithcode.com/paper/cluster-based-approach-to-improve-affect |
Repo | |
Framework | |
Winner-Take-All as Basic Probabilistic Inference Unit of Neuronal Circuits
Title | Winner-Take-All as Basic Probabilistic Inference Unit of Neuronal Circuits |
Authors | Zhaofei Yu, Yonghong Tian, Tiejun Huang, Jian K. Liu |
Abstract | Experimental observations in neuroscience suggest that the brain works in a probabilistic way when computing information with uncertainty. This processing could be modeled as Bayesian inference. However, it remains unclear how Bayesian inference could be implemented at the level of neuronal circuits of the brain. In this study, we propose a novel general-purpose neural implementation of probabilistic inference based on a ubiquitous network of cortical microcircuits, termed the winner-take-all (WTA) circuit. We show that each WTA circuit could encode the distribution of states defined on a variable. By connecting multiple WTA circuits together, the joint distribution can be represented for arbitrary probabilistic graphical models. Moreover, we prove that the neural dynamics of the WTA circuit are able to implement one of the most powerful inference methods in probabilistic graphical models, mean-field inference. We show that the synaptic drive of each spiking neuron in the WTA circuit encodes the marginal probability of the variable in each state, and the firing probability (or firing rate) of each neuron is proportional to the marginal probability. Theoretical analysis and experimental results demonstrate that the WTA circuits can achieve inference results comparable to the mean-field approximation. Taken together, our results suggest that the WTA circuit could be seen as the minimal inference unit of neuronal circuits. |
Tasks | Bayesian Inference |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00675v1 |
PDF | http://arxiv.org/pdf/1808.00675v1.pdf |
PWC | https://paperswithcode.com/paper/winner-take-all-as-basic-probabilistic |
Repo | |
Framework | |
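For readers unfamiliar with the mean-field inference mentioned in the abstract, the following is a minimal sketch of the standard naive mean-field update for a pairwise graphical model. It is not the paper's spiking-neuron model: the data layout (`unary`, `pairwise` as log-potentials), the function names, and the toy numbers are all assumptions made for illustration. The softmax normalization over the states of one variable is loosely analogous to the competition inside a single WTA circuit.

```python
import math


def softmax(values):
    """Normalize log-scores into a probability vector."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, pairwise, iters=50):
    """Naive mean-field for a pairwise model (illustrative sketch).

    unary[i][s]: log-potential of variable i being in state s.
    pairwise[(i, j)][s][t]: log-potential of (x_i = s, x_j = t).
    Returns approximate marginals q[i][s].
    """
    q = [[1.0 / len(u)] * len(u) for u in unary]
    for _ in range(iters):
        for i in range(len(unary)):
            drive = list(unary[i])
            for (a, b), table in pairwise.items():
                if a == i:
                    # Expected log-potential under neighbor b's marginal.
                    for s in range(len(drive)):
                        drive[s] += sum(
                            table[s][t] * q[b][t] for t in range(len(q[b]))
                        )
                elif b == i:
                    # Same edge seen from the other endpoint.
                    for s in range(len(drive)):
                        drive[s] += sum(
                            table[t][s] * q[a][t] for t in range(len(q[a]))
                        )
            q[i] = softmax(drive)
    return q


# Toy model: variable 0 prefers state 0; the edge rewards agreement.
marginals = mean_field(
    unary=[[2.0, 0.0], [0.0, 0.0]],
    pairwise={(0, 1): [[1.0, 0.0], [0.0, 1.0]]},
)
```

In this toy model the preference of variable 0 propagates through the coupling, so both approximate marginals end up favoring state 0.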
Unsupervised Deep Representations for Learning Audience Facial Behaviors
Title | Unsupervised Deep Representations for Learning Audience Facial Behaviors |
Authors | Suman Saha, Rajitha Navarathna, Leonhard Helminger, Romann Weber |
Abstract | In this paper, we present an unsupervised learning approach for analyzing facial behavior based on a deep generative model combined with a convolutional neural network (CNN). We jointly train a variational auto-encoder (VAE) and a generative adversarial network (GAN) to learn a powerful latent representation from footage of audiences viewing feature-length movies. We show that the learned latent representation successfully encodes meaningful signatures of behaviors related to audience engagement (smiling & laughing) and disengagement (yawning). Our results provide a proof of concept for a more general methodology for annotating hard-to-label multimedia data featuring sparse examples of signals of interest. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04136v1 |
PDF | http://arxiv.org/pdf/1805.04136v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-deep-representations-for |
Repo | |
Framework | |
Withholding or withdrawing invasive interventions may not accelerate time to death among dying ICU patients
Title | Withholding or withdrawing invasive interventions may not accelerate time to death among dying ICU patients |
Authors | Daniele Ramazzotti, Peter Clardy, Leo Anthony Celi, David J. Stone, Robert S. Rudin |
Abstract | We considered observational data available from the MIMIC-III open-access ICU database and collected within a study period between 2002 and 2011. If a patient had multiple admissions to the ICU during the 30 days before death, only the first stay was analyzed, leading to a final set of 6,436 unique ICU admissions during the study period. We tested two hypotheses: (i) administration of invasive intervention during the ICU stay immediately preceding end-of-life would decrease over the study time period and (ii) time-to-death from ICU admission would also decrease, due to the decrease in invasive intervention administration. To investigate the latter hypothesis, we performed a subgroup analysis by considering patients with lowest and highest severity. To do so, we stratified the patients based on their SAPS I scores, and we considered patients within the first and the third tertiles of the score. We then assessed differences in trends within these groups between years 2002-05 vs. 2008-11. Comparing the period 2002-2005 vs. 2008-2011, we found a reduction in endotracheal ventilation among patients who died within 30 days of ICU admission (120.8 vs. 68.5 hours for the lowest severity patients, p<0.001; 47.7 vs. 46.0 hours for the highest severity patients, p=0.004). This is explained in part by an increase in the use of non-invasive ventilation. Comparing the period 2002-2005 vs. 2008-2011, we found a reduction in the use of vasopressors and inotropes among patients with the lowest severity who died within 30 days of ICU admission (41.8 vs. 36.2 hours, p<0.001) but not among those with the highest severity. Despite a reduction in the use of invasive interventions, we did not find a reduction in the time to death between 2002-2005 vs. 2008-2011 (7.8 days vs. 8.2 days for the lowest severity patients, p=0.32; 2.1 days vs. 2.0 days for the highest severity patients, p=0.74). |
Tasks | |
Published | 2018-08-04 |
URL | http://arxiv.org/abs/1808.02017v2 |
PDF | http://arxiv.org/pdf/1808.02017v2.pdf |
PWC | https://paperswithcode.com/paper/withholding-or-withdrawing-invasive |
Repo | |
Framework | |
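The stratification step described in the abstract (keeping only the first and third tertiles of a severity score) can be sketched as a rank-based split. This is an illustration under assumptions, not the study's code: the function name `extreme_tertiles` is hypothetical, the scores are made up, and ties are broken by input order, which the abstract does not specify.

```python
def extreme_tertiles(scores):
    """Return (lowest, highest) tertiles as lists of indices into scores.

    Rank-based split (an assumption): the bottom third and top third of
    the sorted scores; the middle tertile is discarded.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = len(scores) // 3
    return order[:k], order[len(scores) - k:]


# Hypothetical severity scores for six admissions.
lowest, highest = extreme_tertiles([5, 20, 11, 8, 30, 14])
```

Here `lowest` holds the indices of the two smallest scores and `highest` the indices of the two largest.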
New Losses for Generative Adversarial Learning
Title | New Losses for Generative Adversarial Learning |
Authors | Victor Berger, Michèle Sebag |
Abstract | Generative Adversarial Networks (Goodfellow et al., 2014), a major breakthrough in the field of generative modeling, learn a discriminator to estimate some distance between the target and the candidate distributions. This paper examines mathematical issues regarding the way the gradients for the generative model are computed in this context, and notably how to take into account how the discriminator itself depends on the generator parameters. A unifying methodology is presented to define mathematically sound training objectives for generative models taking this dependency into account in a robust way, covering GAN, VAE and some GAN variants as particular cases. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01290v2 |
PDF | http://arxiv.org/pdf/1807.01290v2.pdf |
PWC | https://paperswithcode.com/paper/new-losses-for-generative-adversarial |
Repo | |
Framework | |
Adaptive, Personalized Diversity for Visual Discovery
Title | Adaptive, Personalized Diversity for Visual Discovery |
Authors | Choon Hui Teo, Houssam Nassif, Daniel Hill, Sriram Srinavasan, Mitchell Goodman, Vijai Mohan, SVN Vishwanathan |
Abstract | Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components: (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top scoring items based on category, and (3) personalized category preferences learned from the user’s behavior. When tested on live traffic, our algorithms show a strong lift in click-through-rate and session duration. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01477v1 |
PDF | http://arxiv.org/pdf/1810.01477v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-personalized-diversity-for-visual |
Repo | |
Framework | |
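One common way to realize category-based submodular re-ranking is a greedy pass over the candidates. The sketch below is an assumption-laden illustration, not the system described in the abstract: the objective (relevance plus a square-root bonus on per-category counts, which gives diminishing returns), the `diversity_weight` parameter, and the item data are all hypothetical.

```python
import math


def greedy_diverse_rerank(items, k, diversity_weight=1.0):
    """Greedily pick k items by marginal gain (illustrative sketch).

    items: list of (item_id, category, relevance) tuples.
    Assumed objective: sum of relevance plus diversity_weight times the
    sum over categories of sqrt(number of picked items in that category).
    """
    selected = []
    counts = {}
    remaining = list(items)

    def gain(item):
        _, category, relevance = item
        c = counts.get(category, 0)
        # The bonus for a category shrinks each time it is picked again.
        return relevance + diversity_weight * (math.sqrt(c + 1) - math.sqrt(c))

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        remaining.remove(best)
        selected.append(best[0])
        counts[best[1]] = counts.get(best[1], 0) + 1
    return selected


# Hypothetical candidates: two highly relevant items share a category.
candidates = [
    ("a", "shoes", 0.9),
    ("b", "shoes", 0.85),
    ("c", "hats", 0.6),
    ("d", "bags", 0.5),
]
diverse = greedy_diverse_rerank(candidates, k=3)
plain = greedy_diverse_rerank(candidates, k=3, diversity_weight=0.0)
```

With the diversity bonus the second "shoes" item is displaced by items from unseen categories; with `diversity_weight=0.0` the ranking falls back to pure relevance order.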
Training Generative Adversarial Networks via Primal-Dual Subgradient Methods: A Lagrangian Perspective on GAN
Title | Training Generative Adversarial Networks via Primal-Dual Subgradient Methods: A Lagrangian Perspective on GAN |
Authors | Xu Chen, Jiang Wang, Hao Ge |
Abstract | We relate the minimax game of generative adversarial networks (GANs) to finding the saddle points of the Lagrangian function for a convex optimization problem, where the discriminator outputs and the distribution of generator outputs play the roles of primal variables and dual variables, respectively. This formulation shows the connection between the standard GAN training process and the primal-dual subgradient methods for convex optimization. The inherent connection not only provides a theoretical convergence proof for training GANs in the function space, but also inspires a novel objective function for training. The modified objective function forces the distribution of generator outputs to be updated along the direction according to the primal-dual subgradient methods. A toy example shows that the proposed method is able to resolve mode collapse, which in this case cannot be avoided by the standard GAN or Wasserstein GAN. Experiments on both Gaussian mixture synthetic data and real-world image datasets demonstrate the performance of the proposed method on generating diverse samples. |
Tasks | |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.01765v1 |
PDF | http://arxiv.org/pdf/1802.01765v1.pdf |
PWC | https://paperswithcode.com/paper/training-generative-adversarial-networks-via-1 |
Repo | |
Framework | |
Semi-supervised Text Regression with Conditional Generative Adversarial Networks
Title | Semi-supervised Text Regression with Conditional Generative Adversarial Networks |
Authors | Tao Li, Xudong Liu, Shihan Su |
Abstract | The enormous volume of online textual information provides intriguing opportunities for understanding social and economic semantics. In this paper, we propose a novel text regression model based on a conditional generative adversarial network (GAN), with an attempt to associate textual data and social outcomes in a semi-supervised manner. Besides its promising predictive potential, the model has two advantages: (i) it works with unbalanced datasets of limited labelled data, which align with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally we point out related datasets for experiments and future research directions. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01165v2 |
PDF | http://arxiv.org/pdf/1810.01165v2.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-text-regression-with |
Repo | |
Framework | |
Deep Recurrent Neural Network for Multi-target Filtering
Title | Deep Recurrent Neural Network for Multi-target Filtering |
Authors | Mehryar Emambakhsh, Alessandro Bay, Eduard Vazquez |
Abstract | This paper addresses the problem of fixed motion and measurement models for multi-target filtering using an adaptive learning framework. This is performed by defining target tuples with random finite set terminology and utilisation of recurrent neural networks with a long short-term memory architecture. A novel data association algorithm compatible with the predicted tracklet tuples is proposed, enabling the update of occluded targets, in addition to assigning birth, survival and death of targets. The algorithm is evaluated over a commonly used filtering simulation scenario, with highly promising results. |
Tasks | |
Published | 2018-06-18 |
URL | http://arxiv.org/abs/1806.06594v2 |
PDF | http://arxiv.org/pdf/1806.06594v2.pdf |
PWC | https://paperswithcode.com/paper/deep-recurrent-neural-network-for-multi |
Repo | |
Framework | |
Probabilistic Model of Visual Segmentation
Title | Probabilistic Model of Visual Segmentation |
Authors | Jonathan Vacher, Pascal Mamassian, Ruben Coen-Cagli |
Abstract | Visual segmentation is a key perceptual function that partitions visual space and allows for detection, recognition and discrimination of objects in complex environments. The processes underlying human segmentation of natural images are still poorly understood. In part, this is because we lack segmentation models consistent with experimental and theoretical knowledge in visual neuroscience. Biological sensory systems have been shown to approximate probabilistic inference to interpret their inputs. This requires a generative model that captures both the statistics of the sensory inputs and expectations about the causes of those inputs. Following this hypothesis, we propose a probabilistic generative model of visual segmentation that combines knowledge about 1) the sensitivity of neurons in the visual cortex to statistical regularities in natural images; and 2) the preference of humans to form contiguous partitions of visual space. We develop an efficient algorithm for training and inference based on expectation-maximization and validate it on synthetic data. Importantly, with the appropriate choice of the prior, we derive an intuitive closed-form update rule for assigning pixels to segments: at each iteration, the probability of assigning a pixel to each segment is the sum of the evidence (i.e. local pixel statistics) and prior (i.e. the assignments of neighboring pixels) weighted by their relative uncertainty. The model performs competitively on natural images from the Berkeley Segmentation Dataset (BSD), and we illustrate how the likelihood and prior components improve segmentation relative to traditional mixture models. Furthermore, our model explains some variability across human subjects as reflecting local uncertainty about the number of segments. Our model thus provides a viable approach to probe human visual segmentation. |
Tasks | Semantic Segmentation |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1806.00111v3 |
PDF | http://arxiv.org/pdf/1806.00111v3.pdf |
PWC | https://paperswithcode.com/paper/an-ideal-observer-model-to-probe-human-visual |
Repo | |
Framework | |
BRIEF: Backward Reduction of CNNs with Information Flow Analysis
Title | BRIEF: Backward Reduction of CNNs with Information Flow Analysis |
Authors | Yu-Hsun Lin, Chun-Nan Chou, Edward Y. Chang |
Abstract | This paper proposes BRIEF, a backward reduction algorithm that explores compact CNN-model designs from the information flow perspective. This algorithm can remove substantial non-zero weighting parameters (redundant neural channels) of a network by considering its dynamic behavior, which traditional model-compaction techniques cannot achieve. With the aid of our proposed algorithm, we achieve significant model reduction on ResNet-34 at ImageNet scale (32.3% reduction), which is 3X better than the previous result (10.8%). Even for highly optimized models such as SqueezeNet and MobileNet, we can achieve an additional 10.81% and 37.56% reduction, respectively, with negligible performance degradation. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05726v3 |
PDF | http://arxiv.org/pdf/1807.05726v3.pdf |
PWC | https://paperswithcode.com/paper/brief-backward-reduction-of-cnns-with |
Repo | |
Framework | |