Paper Group ANR 146
Improved Predictive Models for Acute Kidney Injury with IDEAs: Intraoperative Data Embedded Analytics
Title | Improved Predictive Models for Acute Kidney Injury with IDEAs: Intraoperative Data Embedded Analytics |
Authors | Lasith Adhikari, Tezcan Ozrazgat-Baslanti, Paul Thottakkara, Ashkan Ebadi, Amir Motaei, Parisa Rashidi, Xiaolin Li, Azra Bihorac |
Abstract | Acute kidney injury (AKI) is a common and serious complication after surgery and is associated with morbidity and mortality. The majority of existing perioperative AKI risk score prediction models are limited in their generalizability and do not fully utilize the physiological intraoperative time-series data. Thus, there is a need for intelligent, accurate, and robust systems that can leverage information from large-scale data to predict a patient’s risk of developing postoperative AKI. A retrospective single-center cohort of 2,911 adult patients who underwent surgery at the University of Florida Health was used for this study. We used machine learning and statistical analysis techniques to develop perioperative models to predict the risk of AKI (risk during the first 3 days, 7 days, and until the discharge day) before and after the surgery. In particular, we examined the improvement in risk prediction obtained by incorporating three intraoperative physiologic time series: mean arterial blood pressure, minimum alveolar concentration, and heart rate. For an individual patient, the preoperative model produces a probabilistic AKI risk score, which is then enriched by integrating intraoperative statistical features through a machine learning stacking approach inside a random forest classifier. We compared the performance of our model based on the area under the receiver operating characteristic curve (AUROC), accuracy, and net reclassification improvement (NRI). The predictive performance of the proposed model is better than that of the preoperative-data-only model. For the AKI-7day outcome, the AUROC was 0.86 (accuracy 0.78) in the proposed model, while the preoperative AUROC was 0.84 (accuracy 0.76). Furthermore, with the integration of intraoperative features, we were able to correctly classify patients who were misclassified by the preoperative model. |
Tasks | Time Series |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.05452v1 |
http://arxiv.org/pdf/1805.05452v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-predictive-models-for-acute-kidney |
Repo | |
Framework | |
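The stacking step described in the AKI abstract above (a preoperative risk score enriched with statistical features of the intraoperative time series) can be sketched as a feature-building routine. This is only an illustration under stated assumptions: the choice of summary statistics (mean, standard deviation, minimum, maximum), the function names, and the sample values are hypothetical and are not taken from the paper. The resulting vector is what a second-stage classifier, such as the random forest mentioned in the abstract, would consume.

```python
from statistics import mean, pstdev


def summarize(series):
    """Summary statistics for one intraoperative time series.

    The four statistics are an assumption made for illustration; the
    abstract only says "statistical features".
    """
    return [mean(series), pstdev(series), min(series), max(series)]


def stacked_features(preop_risk, intraop):
    """Concatenate the preoperative risk score with per-signal summaries.

    preop_risk: probability produced by a (hypothetical) preoperative model.
    intraop: dict mapping a signal name to its list of measurements.
    """
    features = [preop_risk]
    for name in sorted(intraop):  # fixed order keeps columns aligned across patients
        features.extend(summarize(intraop[name]))
    return features


# Made-up values for the three signals named in the abstract.
example = stacked_features(
    0.31,
    {
        "map": [70, 72, 65, 68],      # mean arterial blood pressure
        "mac": [0.9, 1.0, 1.1, 1.0],  # minimum alveolar concentration
        "hr": [80, 85, 90, 85],       # heart rate
    },
)
print(len(example))
```

Feeding the first-stage score in as an ordinary feature lets the second-stage model decide how much weight the intraoperative signals should receive relative to the preoperative estimate.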
Train on Validation: Squeezing the Data Lemon
Title | Train on Validation: Squeezing the Data Lemon |
Authors | Guy Tennenholtz, Tom Zahavy, Shie Mannor |
Abstract | Model selection on validation data is an essential step in machine learning. While mixing data between training and validation is considered taboo, practitioners often violate this rule to increase performance. Here, we offer a simple, practical method for using the validation set for training, which allows for a continuous, controlled trade-off between performance and overfitting of model selection. We define on-average-validation-stable algorithms as those for which using small portions of validation data for training does not overfit the model selection process. We then prove that stable algorithms are also validation stable. Finally, we demonstrate our method on the MNIST and CIFAR-10 datasets using stable algorithms as well as state-of-the-art neural networks. Our results show a significant increase in test performance with a minor trade-off in bias admitted to the model selection process. |
Tasks | Model Selection |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05846v1 |
http://arxiv.org/pdf/1802.05846v1.pdf | |
PWC | https://paperswithcode.com/paper/train-on-validation-squeezing-the-data-lemon |
Repo | |
Framework | |
Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching
Title | Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching |
Authors | Xiao Yang, Madian Khabsa, Miaosen Wang, Wei Wang, Ahmed Awadallah, Daniel Kifer, C. Lee Giles |
Abstract | Community-based question answering (CQA) websites represent an important source of information. As a result, the problem of matching the most valuable answers to their corresponding questions has become an increasingly popular research topic. We frame this task as a binary (relevant/irrelevant) classification problem, and present an adversarial training framework to alleviate the label imbalance issue. We employ a generative model to iteratively sample a subset of challenging negative samples to fool our classification model. Both models are alternately optimized using the REINFORCE algorithm. The proposed method differs from previous ones, in which negative samples in the training set are directly used or uniformly down-sampled. Further, we propose using Multi-scale Matching, which explicitly inspects the correlation between words and n-grams at different levels of granularity. We evaluate the proposed method on the SemEval 2016 and SemEval 2017 datasets and achieve state-of-the-art or similar performance. |
Tasks | Answer Selection, Question Answering |
Published | 2018-04-22 |
URL | http://arxiv.org/abs/1804.08058v2 |
http://arxiv.org/pdf/1804.08058v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-training-for-community-question |
Repo | |
Framework | |
An Overview of Blockchain Integration with Robotics and Artificial Intelligence
Title | An Overview of Blockchain Integration with Robotics and Artificial Intelligence |
Authors | Vasco Lopes, Luís A. Alexandre |
Abstract | Blockchain technology is growing every day at a fast-paced rhythm and it’s possible to integrate it with many systems, notably Robotics with AI services. However, this is still a recent field and there isn’t yet a clear understanding of what it could potentially become. In this paper, we conduct an overview of many different methods and platforms that try to bring the power of blockchain to robotic systems, to improve AI services, or to solve problems that are present in the major blockchains, which can make it possible to create robotic systems with increased capabilities and security. We present an overview, discuss the methods and conclude the paper with our view on the future of the integration of these technologies. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00329v1 |
http://arxiv.org/pdf/1810.00329v1.pdf | |
PWC | https://paperswithcode.com/paper/an-overview-of-blockchain-integration-with |
Repo | |
Framework | |
A particle-based variational approach to Bayesian Non-negative Matrix Factorization
Title | A particle-based variational approach to Bayesian Non-negative Matrix Factorization |
Authors | M. Arjumand Masood, Finale Doshi-Velez |
Abstract | Bayesian Non-negative Matrix Factorization (NMF) is a promising approach for understanding uncertainty and structure in matrix data. However, a large volume of applied work optimizes traditional non-Bayesian NMF objectives that fail to provide a principled understanding of the non-identifiability inherent in NMF, an issue ideally addressed by a Bayesian approach. Despite their suitability, current Bayesian NMF approaches have failed to gain popularity in an applied setting; they sacrifice flexibility in modeling for tractable computation, tend to get stuck in local modes, and require many thousands of samples for meaningful uncertainty estimates. We address these issues through a particle-based variational approach to Bayesian NMF that only requires the joint likelihood to be differentiable for tractability, uses a novel initialization technique to identify multiple modes in the posterior, and allows domain experts to inspect a ‘small’ set of factorizations that faithfully represent the posterior. We introduce and employ a class of likelihood and prior distributions for NMF that formulate a Bayesian model using popular non-Bayesian NMF objectives. On several real datasets, we obtain better particle approximations to the Bayesian NMF posterior in less time than baselines and demonstrate the significant role that multimodality plays in NMF-related tasks. |
Tasks | |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06321v1 |
http://arxiv.org/pdf/1803.06321v1.pdf | |
PWC | https://paperswithcode.com/paper/a-particle-based-variational-approach-to |
Repo | |
Framework | |
Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos
Title | Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos |
Authors | Zhaoyang Zhang, Zhanghui Kuang, Ping Luo, Litong Feng, Wei Zhang |
Abstract | Video Analytics Software as a Service (VA SaaS) has been rapidly growing in recent years. VA SaaS is typically accessed by users using a lightweight client. Because the transmission bandwidth between the client and cloud is usually limited and expensive, it is highly beneficial to design cloud video analysis algorithms with a limited data transmission requirement. Although considerable research has been devoted to video analysis, to the best of our knowledge, little of it has addressed the transmission bandwidth limitation in SaaS. As the first attempt in this direction, this work introduces the problem of few-frame action recognition, which aims at maintaining high recognition accuracy when accessing only a few frames during both training and testing. Unlike previous work that processed dense frames, we present Temporal Sequence Distillation (TSD), which distills a long video sequence into a very short one for transmission. By end-to-end training with 3D CNNs for video action recognition, TSD learns a compact and discriminative temporal and spatial representation of video frames. On the Kinetics dataset, TSD+I3D typically requires only 50% of the number of frames compared to I3D, a state-of-the-art video action recognition algorithm, to achieve almost the same accuracies. The proposed TSD has three appealing advantages. Firstly, TSD has a lightweight architecture and can be deployed on the client, e.g., mobile devices, to produce compressed representative frames to save transmission bandwidth. Secondly, TSD significantly reduces the computation needed to run video action recognition with compressed frames on the cloud, while maintaining high recognition accuracies. Thirdly, TSD can be plugged in as a preprocessing module of any existing 3D CNN. Extensive experiments show the effectiveness and characteristics of TSD. |
Tasks | Action Recognition In Videos, Temporal Action Localization |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.05085v1 |
http://arxiv.org/pdf/1808.05085v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-sequence-distillation-towards-few |
Repo | |
Framework | |
The Limitations of Model Uncertainty in Adversarial Settings
Title | The Limitations of Model Uncertainty in Adversarial Settings |
Authors | Kathrin Grosse, David Pfaff, Michael Thomas Smith, Michael Backes |
Abstract | Machine learning models are vulnerable to adversarial examples: minor perturbations to input samples intended to deliberately cause misclassification. While an obvious security threat, adversarial examples also yield insights about the applied model itself. We investigate adversarial examples in the context of the uncertainty measures of Bayesian neural networks (BNNs). As these measures are highly non-smooth, we use a smooth Gaussian process classifier (GPC) as a substitute. We show that both confidence and uncertainty can be unsuspicious even if the output is wrong. Intriguingly, we find subtle differences in the features influencing uncertainty and confidence for most tasks. |
Tasks | Gaussian Processes |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02606v2 |
https://arxiv.org/pdf/1812.02606v2.pdf | |
PWC | https://paperswithcode.com/paper/the-limitations-of-model-uncertainty-in |
Repo | |
Framework | |
Unsupervised representation learning using convolutional and stacked auto-encoders: a domain and cross-domain feature space analysis
Title | Unsupervised representation learning using convolutional and stacked auto-encoders: a domain and cross-domain feature space analysis |
Authors | Gabriel B. Cavallari, Leonardo Sampaio Ferraz Ribeiro, Moacir Antonelli Ponti |
Abstract | A feature learning task involves training models that are capable of inferring good representations (transformations of the original space) from input data alone. When working with limited or unlabelled data, and also when multiple visual domains are considered, methods that rely on large annotated datasets, such as Convolutional Neural Networks (CNNs), cannot be employed. In this paper we investigate different auto-encoder (AE) architectures, which require no labels, and explore training strategies to learn representations from images. The models are evaluated considering both the reconstruction error of the images and the feature spaces in terms of their discriminative power. We study the role of dense and convolutional layers on the results, as well as the depth and capacity of the networks, since those are shown to affect both the dimensionality reduction and the capability of generalising for different visual domains. Classification results with AE features were as discriminative as pre-trained CNN features. Our findings can be used as guidelines for the design of unsupervised representation learning methods within and across domains. |
Tasks | Dimensionality Reduction, Representation Learning, Unsupervised Representation Learning |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00473v1 |
http://arxiv.org/pdf/1811.00473v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-representation-learning-using |
Repo | |
Framework | |
Parameter Estimation for the Single-Look $\mathcal{G}^0$ Distribution
Title | Parameter Estimation for the Single-Look $\mathcal{G}^0$ Distribution |
Authors | Débora Chan, Andrea Rey, Juliana Gambini, Alejandro C. Frery |
Abstract | The statistical properties of Synthetic Aperture Radar (SAR) image texture reveal useful target characteristics. It is well known that these images are affected by speckle and are prone to contamination such as double bounce and corner reflectors. The $\mathcal{G}^0$ distribution is flexible enough to model different degrees of texture in speckled data. It is indexed by three parameters: $\alpha$, related to the texture; $\gamma$, a scale parameter; and $L$, the number of looks, which is related to the signal-to-noise ratio. Quality estimation of $\alpha$ is essential due to its immediate interpretability. In this article, we compare the behavior of a number of parameter estimation techniques in the noisiest case, namely single-look data. We evaluate them using Monte Carlo methods for non-contaminated and contaminated data, considering convergence rate, bias, mean squared error (MSE) and computational cost. The results are verified with simulated and actual SAR images. |
Tasks | |
Published | 2018-09-29 |
URL | http://arxiv.org/abs/1810.00216v1 |
http://arxiv.org/pdf/1810.00216v1.pdf | |
PWC | https://paperswithcode.com/paper/parameter-estimation-for-the-single-look |
Repo | |
Framework | |
Scaling-up Split-Merge MCMC with Locality Sensitive Sampling (LSS)
Title | Scaling-up Split-Merge MCMC with Locality Sensitive Sampling (LSS) |
Authors | Chen Luo, Anshumali Shrivastava |
Abstract | Split-Merge MCMC (Markov chain Monte Carlo) is one of the essential and popular variants of MCMC for problems where an MCMC state consists of an unknown number of components. It is well known that state-of-the-art methods for split-merge MCMC do not scale well. Strategies for rapid mixing require smart and informative proposals to reduce the rejection rate. However, all known smart proposals involve expensive operations to suggest informative transitions. As a result, the cost of each iteration is prohibitive for massive-scale datasets. It is further known that uninformative but computationally efficient proposals, such as random split-merge, lead to extremely slow convergence. This tradeoff between mixing time and per-update cost seems hard to get around. In this paper, we show a sweet spot. We leverage some unique properties of weighted MinHash, which is a popular LSH, to design a novel class of split-merge proposals which are significantly more informative than random sampling but at the same time efficient to compute. Overall, we obtain a superior tradeoff between convergence and per-update cost. As a direct consequence, our proposals are around 6X faster than the state-of-the-art sampling methods on two large real datasets, KDDCUP and PubMed, with several million entities and thousands of clusters. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07444v3 |
http://arxiv.org/pdf/1802.07444v3.pdf | |
PWC | https://paperswithcode.com/paper/scaling-up-split-merge-mcmc-with-locality |
Repo | |
Framework | |
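The abstract above relies on the locality-sensitive property of MinHash: two sets agree on a MinHash value with probability equal to their Jaccard similarity, so collisions cheaply point at similar items. The sketch below shows that property with plain (unweighted) MinHash, which is a simplification swapped in for illustration; the paper uses weighted MinHash, and the function names, the BLAKE2b-based hash family, and the number of hashes are assumptions, not the authors' implementation.

```python
import hashlib


def _hash(seed, item):
    """One member of a seeded hash family (a hypothetical choice)."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """Plain MinHash: keep the minimum hash value under each seeded hash."""
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = {"kidney", "injury", "risk", "model"}
b = {"kidney", "injury", "risk", "score"}
sig_a = minhash_signature(a)
sig_b = minhash_signature(b)
print(estimated_jaccard(sig_a, sig_b))  # an estimate of |a & b| / |a | b| = 3/5
```

In a split-merge sampler, one plausible use is to propose merging components whose signatures collide, which tends to be more informative than picking a pair uniformly at random while still costing only a hash lookup.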
MoNet: Moments Embedding Network
Title | MoNet: Moments Embedding Network |
Authors | Mengran Gou, Fei Xiong, Octavia Camps, Mario Sznaier |
Abstract | Bilinear pooling has been recently proposed as a feature encoding layer, which can be used after the convolutional layers of a deep network, to improve performance in multiple vision tasks. Unlike conventional global average pooling or fully connected layers, bilinear pooling gathers 2nd order information in a translation invariant fashion. However, a serious drawback of this family of pooling layers is their dimensionality explosion. Approximate pooling methods with compact properties have been explored towards resolving this weakness. Additionally, recent results have shown that significant performance gains can be achieved by adding 1st order information and applying matrix normalization to regularize unstable higher order information. However, combining compact pooling with matrix normalization and other order information has not been explored until now. In this paper, we unify bilinear pooling and the global Gaussian embedding layers through the empirical moment matrix. In addition, we propose a novel sub-matrix square-root layer, which can be used to normalize the output of the convolution layer directly and mitigate the dimensionality problem with off-the-shelf compact pooling methods. Our experiments on three widely used fine-grained classification datasets illustrate that our proposed architecture, MoNet, can achieve similar or better performance than the state-of-the-art G2DeNet. Furthermore, when combined with a compact pooling technique, MoNet obtains comparable performance with encoded features that have 96% fewer dimensions. |
Tasks | |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07303v2 |
http://arxiv.org/pdf/1802.07303v2.pdf | |
PWC | https://paperswithcode.com/paper/monet-moments-embedding-network |
Repo | |
Framework | |
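The empirical moment matrix mentioned in the MoNet abstract above can be illustrated with a small sketch. It assumes the standard construction in which each local feature is augmented with a constant 1 before averaging the outer products, so that one matrix holds both first-order information (the mean) and second-order information (the bilinear-pooling term). The function name and toy features are hypothetical, and the sub-matrix square-root normalization and compact pooling from the paper are not shown.

```python
def moment_matrix(features):
    """Average outer product of homogeneous features [1, x].

    Layout of the result, under the assumption stated above:
      M[0][0]      == 1
      M[0][1:]     == mean of the features (1st order)
      M[1:][1:]    == mean of x x^T (2nd order, as in bilinear pooling)
    """
    n = len(features)
    dim = len(features[0]) + 1
    m = [[0.0] * dim for _ in range(dim)]
    for x in features:
        aug = [1.0] + list(x)  # homogeneous coordinate
        for i in range(dim):
            for j in range(dim):
                m[i][j] += aug[i] * aug[j] / n
    return m


# Two toy 2-dimensional local features (made-up values).
feats = [[1.0, 2.0], [3.0, 4.0]]
M = moment_matrix(feats)
for row in M:
    print(row)
```

Reading off the blocks of `M` makes the unification concrete: dropping the first row and column leaves the plain bilinear-pooling matrix, while keeping them adds the mean that a Gaussian embedding would also need.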
Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies
Title | Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies |
Authors | Anjalie Field, Doron Kliger, Shuly Wintner, Jennifer Pan, Dan Jurafsky, Yulia Tsvetkov |
Abstract | Amidst growing concern over media manipulation, NLP attention has focused on overt strategies like censorship and “fake news”. Here, we draw on two concepts from the political science literature to explore subtler strategies for government media manipulation: agenda-setting (selecting what topics to cover) and framing (deciding how topics are covered). We analyze 13 years (100K articles) of the Russian newspaper Izvestia and identify a strategy of distraction: articles mention the U.S. more frequently in the month directly following an economic downturn in Russia. We introduce embedding-based methods for cross-lingually projecting English frames to Russian, and discover that these articles emphasize U.S. moral failings and threats to the U.S. Our work offers new ways to identify subtle media manipulation strategies at the intersection of agenda-setting and framing. |
Tasks | |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09386v2 |
http://arxiv.org/pdf/1808.09386v2.pdf | |
PWC | https://paperswithcode.com/paper/framing-and-agenda-setting-in-russian-news-a |
Repo | |
Framework | |
Dynamic Online Gradient Descent with Improved Query Complexity: A Theoretical Revisit
Title | Dynamic Online Gradient Descent with Improved Query Complexity: A Theoretical Revisit |
Authors | Yawei Zhao, En Zhu, Xinwang Liu, Jianping Yin |
Abstract | We provide a new theoretical analysis framework to investigate online gradient descent in the dynamic environment. Compared with previous work, the new framework recovers the state-of-the-art dynamic regret, but does not require extra gradient queries at every iteration. Specifically, when functions are $\alpha$ strongly convex and $\beta$ smooth, previous work requires $O(\kappa)$ gradient queries at every iteration to achieve the state-of-the-art dynamic regret, where $\kappa = \frac{\beta}{\alpha}$. Our framework shows that the query complexity can be improved to $O(1)$, which does not depend on $\kappa$. The improvement is significant for ill-conditioned problems because their objective functions usually have a large $\kappa$. |
Tasks | |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10186v3 |
http://arxiv.org/pdf/1812.10186v3.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-online-gradient-descent-with-improved |
Repo | |
Framework | |
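The $O(1)$ query setting in the abstract above corresponds to plain online gradient descent that asks for a single gradient per round. The toy sketch below illustrates that setting on drifting one-dimensional quadratics $f_t(x) = \frac{1}{2}(x - c_t)^2$, for which $\alpha = \beta = 1$. The loss family, the drift of the minimizers, and the step size 0.5 are assumptions chosen for illustration; this is not the paper's analysis or its experimental setup.

```python
def online_gradient_descent(grad_fns, x0, step):
    """Play x_t, then update with exactly one gradient query of f_t."""
    x = x0
    iterates = []
    for grad in grad_fns:
        iterates.append(x)        # decision played at round t
        x = x - step * grad(x)    # the single gradient query of this round
    return iterates


def make_quadratic_grad(center):
    """Gradient of f(x) = 0.5 * (x - center)**2."""
    return lambda x: x - center


# Minimizers drift by 0.1 per round (a made-up dynamic environment).
centers = [0.1 * t for t in range(50)]
grads = [make_quadratic_grad(c) for c in centers]
iterates = online_gradient_descent(grads, x0=0.0, step=0.5)

# Dynamic regret against the per-round minimizers, whose loss is 0.
dynamic_regret = sum(0.5 * (x - c) ** 2 for x, c in zip(iterates, centers))
print(dynamic_regret)
```

In this toy run the gap between the iterate and the moving minimizer stays bounded, so the dynamic regret grows with the total drift of the minimizers rather than with the number of gradient queries per round.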
Alternating Segmentation and Simulation for Contrast Adaptive Tissue Classification
Title | Alternating Segmentation and Simulation for Contrast Adaptive Tissue Classification |
Authors | Dzung L. Pham, Snehashis Roy |
Abstract | A key feature of magnetic resonance (MR) imaging is its ability to manipulate how the intrinsic tissue parameters of the anatomy ultimately contribute to the contrast properties of the final, acquired image. This flexibility, however, can lead to substantial challenges for segmentation algorithms, particularly supervised methods. These methods require atlases or training data, which are composed of MR image and labeled image pairs. In most cases, the training data are obtained with a fixed acquisition protocol, leading to suboptimal performance when an input data set that requires segmentation has differing contrast properties. This drawback is increasingly significant with the recent movement towards multi-center research studies involving multiple scanners and acquisition protocols. In this work, we propose a new framework for supervised segmentation approaches that is robust to contrast differences between the training MR image and the input image. Our approach uses a generative simulation model within the segmentation process to compensate for the contrast differences. We allow the contrast of the MR image in the training data to vary by simulating a new contrast from the corresponding label image. The model parameters are optimized by a cost function measuring the consistency between the input MR image and its simulation based on a current estimate of the segmentation labels. We provide a proof of concept of this approach by combining a supervised classifier with a simple simulation model, and apply the resulting algorithm to synthetic images and actual MR images. |
Tasks | |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07087v1 |
http://arxiv.org/pdf/1811.07087v1.pdf | |
PWC | https://paperswithcode.com/paper/alternating-segmentation-and-simulation-for |
Repo | |
Framework | |
Benchmarks of ResNet Architecture for Atrial Fibrillation Classification
Title | Benchmarks of ResNet Architecture for Atrial Fibrillation Classification |
Authors | Roman Khudorozhkov, Dmitry Podvyaznikov |
Abstract | In this work we apply variations of the ResNet architecture to the task of atrial fibrillation classification. The variations differ in the number of filters after the first convolution, the ResNet block layout, the number of filters in block convolutions, and the number of ResNet blocks between downsampling operations. We have found a range of model sizes in which models with quite different configurations show similar performance. It is likely that the overall number of parameters plays a dominant role in model performance. However, configuration parameters like layout have values that consistently lead to better results, which suggests that these parameters should be defined and fixed first, while others may be varied within a reasonable range to satisfy any existing constraints. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00396v1 |
http://arxiv.org/pdf/1810.00396v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmarks-of-resnet-architecture-for-atrial |
Repo | |
Framework | |