January 31, 2020


Paper Group ANR 55


Predicting colorectal polyp recurrence using time-to-event analysis of medical records


Title	Predicting colorectal polyp recurrence using time-to-event analysis of medical records
Authors	Lia X. Harrington, Jason W. Wei, Arief A. Suriawinata, Todd A. Mackenzie, Saeed Hassanpour
Abstract	Identifying patient characteristics that influence the rate of colorectal polyp recurrence can provide important insights into which patients are at higher risk for recurrence. We used natural language processing to extract polyp morphological characteristics from 953 polyp-presenting patients’ electronic medical records. We used subsequent colonoscopy reports to examine how the time to polyp recurrence (731 patients experienced recurrence) is influenced by these characteristics as well as anthropometric features using Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. We found that the rate of recurrence differed significantly by polyp size, number, and location and patient smoking status. Additionally, right-sided colon polyps increased recurrence risk by 30% compared to left-sided polyps. History of tobacco use increased polyp recurrence risk by 20% compared to never-users. A random survival forest model showed an AUC of 0.65 and identified several other predictive variables, which can inform development of personalized polyp surveillance plans.
Tasks
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07368v1
PDF	https://arxiv.org/pdf/1911.07368v1.pdf
PWC	https://paperswithcode.com/paper/predicting-colorectal-polyp-recurrence-using
Repo
Framework
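
The abstract names Kaplan-Meier curves as one of its time-to-event tools. As a rough illustration of what such a curve computes, here is a minimal product-limit estimator in plain Python. The function name `kaplan_meier` and the toy follow-up data are assumptions made for this sketch; they are not the authors' code or data.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events: 1 if recurrence was observed, 0 if the patient was censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        d = event_counts.get(t, 0)
        if d:
            # product-limit update: multiply by the fraction that did not recur at t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # everyone whose follow-up ends at t (event or censored) leaves the risk set
        at_risk -= sum(1 for x in durations if x == t)
    return curve

# Hypothetical toy data (months of follow-up); not taken from the paper.
durations = [6, 12, 12, 18, 24, 30]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times.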

Why Does My Model Fail? Contrastive Local Explanations for Retail Forecasting


Title	Why Does My Model Fail? Contrastive Local Explanations for Retail Forecasting
Authors	Ana Lucic, Hinda Haned, Maarten de Rijke
Abstract	In various business settings, there is an interest in using more complex machine learning techniques for sales forecasting. It is difficult to convince analysts, along with their superiors, to adopt these techniques since the models are considered to be “black boxes,” even if they perform better than current models in use. We examine the impact of contrastive explanations about large errors on users’ attitudes towards a “black-box” model. We propose an algorithm, Monte Carlo Bounds for Reasonable Predictions. Given a large error, MC-BRP determines (1) feature values that would result in a reasonable prediction, and (2) general trends between each feature and the target, both based on Monte Carlo simulations. We evaluate on a real dataset with real users by conducting a user study with 75 participants to determine if explanations generated by MC-BRP help users understand why a prediction results in a large error, and if this promotes trust in an automatically-learned model. Our study shows that users are able to answer objective questions about the model’s predictions with overall 81.1% accuracy when provided with these contrastive explanations. We show that users who saw MC-BRP explanations understand why the model makes large errors in predictions significantly more than users in the control group. We also conduct an in-depth analysis on the difference in attitudes between Practitioners and Researchers, and confirm that our results hold when conditioning on the users’ background.
Tasks
Published	2019-07-17
URL	https://arxiv.org/abs/1908.00085v2
PDF	https://arxiv.org/pdf/1908.00085v2.pdf
PWC	https://paperswithcode.com/paper/contrastive-explanations-for-large-errors-in
Repo
Framework

Towards robust word embeddings for noisy texts


Title	Towards robust word embeddings for noisy texts
Authors	Yerai Doval, Jesús Vilares, Carlos Gómez-Rodríguez
Abstract	Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media. In this work, we propose a simple extension to the skipgram model in which we introduce the concept of bridge-words, which are artificial words added to the model to strengthen the similarity between standard words and their noisy variants. Our new embeddings outperform the state of the art on noisy texts on a wide range of evaluation tasks, both intrinsic and extrinsic, while retaining a good performance on standard texts. To the best of our knowledge, this is the first explicit approach at dealing with this type of noisy texts at the word embedding level that goes beyond the support for out-of-vocabulary words.
Tasks	Word Embeddings
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10876v3
PDF	https://arxiv.org/pdf/1911.10876v3.pdf
PWC	https://paperswithcode.com/paper/towards-robust-word-embeddings-for-noisy
Repo
Framework

Towards Diverse Paraphrase Generation Using Multi-Class Wasserstein GAN


Title	Towards Diverse Paraphrase Generation Using Multi-Class Wasserstein GAN
Authors	Zhecheng An, Sicong Liu
Abstract	Paraphrase generation is an important and challenging natural language processing (NLP) task. In this work, we propose a deep generative model to generate paraphrase with diversity. Our model is based on an encoder-decoder architecture. An additional transcoder is used to convert a sentence into its paraphrasing latent code. The transcoder takes an explicit pattern embedding variable as condition, so diverse paraphrase can be generated by sampling on the pattern embedding variable. We use a Wasserstein GAN to align the distributions of the real and generated paraphrase samples. We propose a multi-class extension to the Wasserstein GAN, which allows our generative model to learn from both positive and negative samples. The generated paraphrase distribution is forced to get closer to the positive real distribution, and be pushed away from the negative distribution in Wasserstein distance. We test our model in two datasets with both automatic metrics and human evaluation. Results show that our model can generate fluent and reliable paraphrase samples that outperform the state-of-art results, while also provides reasonable variability and diversity.
Tasks	Paraphrase Generation
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13827v1
PDF	https://arxiv.org/pdf/1909.13827v1.pdf
PWC	https://paperswithcode.com/paper/towards-diverse-paraphrase-generation-using
Repo
Framework

A Multiple Continuous Signal Alignment Algorithm with Gaussian Process Profiles and an Application to Paleoceanography


Title	A Multiple Continuous Signal Alignment Algorithm with Gaussian Process Profiles and an Application to Paleoceanography
Authors	Taehee Lee, Lorraine E. Lisiecki, Devin Rand, Geoffrey Gebbie, Charles E. Lawrence
Abstract	Aligning signals is essential for integrating fragmented knowledge in each signal or resolving signal classification problems. Motif finding, or profile analysis, is a preferred method for multiple signal alignments and can be classified into two categories, depending on whether the profile is constructive or latent. Existing methods in these categories have some limitations: constructive profiles are defined over finite sets and inferred latent profiles are often too abstract to represent the integrated information. Here we present a novel alignment method, the multiple continuous Signal Alignment algorithm with Gaussian Process Regression profiles (SA-GPR), which addresses the limitations of currently available methods. We present a novel stack construction algorithm as an example of our SA-GPR in the field of paleoceanography. Specifically, we create a dual-proxy stack of six high-resolution sediment cores from the Northeast Atlantic using alignments based on both radiocarbon age estimates and the oxygen isotope ratio of benthic foraminifera, which is a proxy for changes in global ice volume and deep-water temperature.
Tasks
Published	2019-07-20
URL	https://arxiv.org/abs/1907.08738v3
PDF	https://arxiv.org/pdf/1907.08738v3.pdf
PWC	https://paperswithcode.com/paper/dual-proxy-gaussian-process-stack-integrating
Repo
Framework

Thompson Sampling in Non-Episodic Restless Bandits


Title	Thompson Sampling in Non-Episodic Restless Bandits
Authors	Young Hun Jung, Marc Abeille, Ambuj Tewari
Abstract	Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging. We study learning algorithms over the unknown reward distributions and prove a sub-linear, $O(\sqrt{T}\log T)$, regret bound for a variant of Thompson sampling. Our analysis applies in the infinite time horizon setting, resolving the open question raised by Jung and Tewari (2019) whose analysis is limited to the episodic case. We adopt their policy mapping framework, which allows our algorithm to be efficient and simultaneously keeps the regret meaningful. Our algorithm adapts the TSDE algorithm of Ouyang et al. (2017) in a non-trivial manner to account for the special structure of restless bandits. We test our algorithm on a simulated dynamic channel access problem with several policy mappings, and the empirical regrets agree with the theoretical bound regardless of the choice of the policy mapping.
Tasks
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05654v1
PDF	https://arxiv.org/pdf/1910.05654v1.pdf
PWC	https://paperswithcode.com/paper/thompson-sampling-in-non-episodic-restless
Repo
Framework
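
For readers unfamiliar with the base technique, the sketch below shows classic Thompson sampling for a stationary Bernoulli bandit with Beta posteriors. It is deliberately not the paper's algorithm: the restless setting, the policy mapping framework, and the TSDE adaptation are all left out. The function name, the arm means, and the horizon are assumptions chosen for illustration.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Play a stationary Bernoulli bandit and return the pull count per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # sample a plausible mean for every arm from its Beta(successes+1, failures+1) posterior
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        # act greedily with respect to the sampled means
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical two-armed problem; the second arm is clearly better.
pulls = thompson_sampling([0.2, 0.8], horizon=2000)
```

As the posteriors sharpen, the weaker arm is sampled less and less often, which is the behaviour a sub-linear regret bound formalises.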

A Data Mining Approach to Flight Arrival Delay Prediction for American Airlines


Title	A Data Mining Approach to Flight Arrival Delay Prediction for American Airlines
Authors	Navoneel Chakrabarty
Abstract	In the present scenario of domestic flights in USA, there have been numerous instances of flight delays and cancellations. In the United States, the American Airlines, Inc. have been one of the most entrusted and the world’s largest airline in terms of number of destinations served. But when it comes to domestic flights, AA has not lived up to the expectations in terms of punctuality or on-time performance. Flight Delays also result in airline companies operating commercial flights to incur huge losses. So, they are trying their best to prevent or avoid Flight Delays and Cancellations by taking certain measures. This study aims at analyzing flight information of US domestic flights operated by American Airlines, covering top 5 busiest airports of US and predicting possible arrival delay of the flight using Data Mining and Machine Learning Approaches. The Gradient Boosting Classifier Model is deployed by training and hyper-parameter tuning it, achieving a maximum accuracy of 85.73%. Such an Intelligent System is very essential in foretelling flights’ on-time performance.
Tasks
Published	2019-03-15
URL	http://arxiv.org/abs/1903.06740v1
PDF	http://arxiv.org/pdf/1903.06740v1.pdf
PWC	https://paperswithcode.com/paper/a-data-mining-approach-to-flight-arrival
Repo
Framework

Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets


Title	Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets
Authors	Hongren Mao, Hung-yi Lee
Abstract	Paraphrase generation is an interesting and challenging NLP task which has numerous practical applications. In this paper, we analyze datasets commonly used for paraphrase generation research, and show that simply parroting input sentences surpasses state-of-the-art models in the literature when evaluated on standard metrics. Our findings illustrate that a model could be seemingly adept at generating paraphrases, despite only making trivial changes to the input sentence or even none at all.
Tasks	Paraphrase Generation
Published	2019-08-19
URL	https://arxiv.org/abs/1908.07831v1
PDF	https://arxiv.org/pdf/1908.07831v1.pdf
PWC	https://paperswithcode.com/paper/190807831
Repo
Framework
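
The parroting effect described in the abstract can be seen on a toy example. The sketch below computes clipped n-gram precision, a building block of BLEU rather than full BLEU, for an unchanged copy of the input and for a genuine rewording. The sentences and the helper name `ngram_precision` are assumptions made up for illustration; they do not come from the paper or from any benchmark.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=2):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # each candidate n-gram is credited at most as often as it occurs in the reference
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return overlap / total

# Hypothetical example pair.
source = "how can i learn to play the guitar quickly"
reference = "how can i learn to play guitar fast"
rewrite = "what is the fastest way to master guitar"

parrot_score = ngram_precision(source, reference)    # copy the input unchanged
rewrite_score = ngram_precision(rewrite, reference)  # a genuine rewording
```

Because the reference paraphrase shares most of its wording with the input, the copy scores well on surface overlap while the real rewording scores nothing.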

Federated Learning with Personalization Layers


Title	Federated Learning with Personalization Layers
Authors	Manoj Ghuhan Arivazhagan, Vinay Aggarwal, Aaditya Kumar Singh, Sunav Choudhary
Abstract	The emerging paradigm of federated learning strives to enable collaborative training of machine learning models on the network edge without centrally aggregating raw data and hence, improving data privacy. This sharply deviates from traditional machine learning and necessitates the design of algorithms robust to various sources of heterogeneity. Specifically, statistical heterogeneity of data across user devices can severely degrade the performance of standard federated averaging for traditional machine learning applications like personalization with deep learning. This paper proposes FedPer, a base + personalization layer approach for federated training of deep feedforward neural networks, which can combat the ill-effects of statistical heterogeneity. We demonstrate effectiveness of FedPer for non-identical data partitions of CIFAR datasets and on a personalized image aesthetics dataset from Flickr.
Tasks
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00818v1
PDF	https://arxiv.org/pdf/1912.00818v1.pdf
PWC	https://paperswithcode.com/paper/federated-learning-with-personalization
Repo
Framework

A Framework for Detecting Event related Sentiments of a Community


Title	A Framework for Detecting Event related Sentiments of a Community
Authors	Muhammad Aslam Jarwar
Abstract	Social media has revolutionized human communication and styles of interaction. Due to its easiness and effective medium, people share and exchange information, carry out discussion on various events, and express their opinions. For effective policy making and understanding the response of a community on different events, we need to monitor and analyze the social media. In social media, there are some users who are more influential, for example, a famous politician may have more influence than a common person. These influential users belong to specific communities. The main object of this research is to know the sentiments of a specific community on various events. For detecting the event based sentiments of a community we propose a generic framework. Our framework identifies the users of a specific community on twitter. After identifying the users of a community, we fetch their tweets and identify tweets belonging to specific events. The event based tweets are pre-processed. Pre-processed tweets are then analyzed for detecting sentiments of a community for specific events. Qualitative and quantitative evaluation confirms the effectiveness and usefulness of our proposed framework.
Tasks
Published	2019-03-01
URL	http://arxiv.org/abs/1903.00232v2
PDF	http://arxiv.org/pdf/1903.00232v2.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-detecting-event-related
Repo
Framework

Privacy Enhanced Multimodal Neural Representations for Emotion Recognition


Title	Privacy Enhanced Multimodal Neural Representations for Emotion Recognition
Authors	Mimansa Jaiswal, Emily Mower Provost
Abstract	Many mobile applications and virtual conversational agents now aim to recognize and adapt to emotions. To enable this, data are transmitted from users’ devices and stored on central servers. Yet, these data contain sensitive information that could be used by mobile applications without user’s consent or, maliciously, by an eavesdropping adversary. In this work, we show how multimodal representations trained for a primary task, here emotion recognition, can unintentionally leak demographic information, which could override a selected opt-out option by the user. We analyze how this leakage differs in representations obtained from textual, acoustic, and multimodal data. We use an adversarial learning paradigm to unlearn the private information present in a representation and investigate the effect of varying the strength of the adversarial component on the primary task and on the privacy metric, defined here as the inability of an attacker to predict specific demographic information. We evaluate this paradigm on multiple datasets and show that we can improve the privacy metric while not significantly impacting the performance on the primary task. To the best of our knowledge, this is the first work to analyze how the privacy metric differs across modalities and how multiple privacy concerns can be tackled while still maintaining performance on emotion recognition.
Tasks	Emotion Recognition
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13212v1
PDF	https://arxiv.org/pdf/1910.13212v1.pdf
PWC	https://paperswithcode.com/paper/191013212
Repo
Framework

Camera Adversarial Transfer for Unsupervised Person Re-Identification


Title	Camera Adversarial Transfer for Unsupervised Person Re-Identification
Authors	Guillaume Delorme, Xavier Alameda-Pineda, Stephane Lathuilière, Radu Horaud
Abstract	Unsupervised person re-identification (Re-ID) methods consist of training with a carefully labeled source dataset, followed by generalization to an unlabeled target dataset, i.e. person-identity information is unavailable. Inspired by domain adaptation techniques, these methods avoid a costly, tedious and often unaffordable labeling process. This paper investigates the use of camera-index information, namely which camera captured which image, for unsupervised person Re-ID. More precisely, inspired by domain adaptation adversarial approaches, we develop an adversarial framework in which the output of the feature extractor should be useful for person Re-ID and in the same time should fool a camera discriminator. We refer to the proposed method as camera adversarial transfer (CAT). We evaluate adversarial variants and, alongside, the camera robustness achieved for each variant. We report cross-dataset ReID performance and we compare the variants of our method with several state-of-the-art methods, thus showing the interest of exploiting camera-index information within an adversarial framework for the unsupervised person Re-ID.
Tasks	Domain Adaptation, Person Re-Identification, Unsupervised Person Re-Identification
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01308v1
PDF	http://arxiv.org/pdf/1904.01308v1.pdf
PWC	https://paperswithcode.com/paper/camera-adversarial-transfer-for-unsupervised
Repo
Framework

Degrees of freedom for off-the-grid sparse estimation


Title	Degrees of freedom for off-the-grid sparse estimation
Authors	Clarice Poon, Gabriel Peyré
Abstract	A central question in modern machine learning and imaging sciences is to quantify the number of effective parameters of vastly over-parameterized models. The degrees of freedom is a mathematically convenient way to define this number of parameters. Its computation and properties are well understood when dealing with discretized linear models, possibly regularized using sparsity. In this paper, we argue that this way of thinking is plagued when dealing with models having very large parameter spaces. In this case it makes more sense to consider “off-the-grid” approaches, using a continuous parameter space. This type of approach is the one favoured when training multi-layer perceptrons, and is also becoming popular to solve super-resolution problems in imaging. Training these off-the-grid models with a sparsity inducing prior can be achieved by solving a convex optimization problem over the space of measures, which is often called the Beurling Lasso (Blasso), and is the continuous counterpart of the celebrated Lasso parameter selection method. In previous works, the degrees of freedom for the Lasso was shown to coincide with the size of the smallest solution support. Our main contribution is a proof of a continuous counterpart to this result for the Blasso. Our findings suggest that discretized methods actually vastly over-estimate the number of intrinsic continuous degrees of freedom. Our second contribution is a detailed study of the case of sampling Fourier coefficients in 1D, which corresponds to a super-resolution problem. We show that our formula for the degrees of freedom is valid outside of a set of measure zero of observations, which in turn justifies its use to compute an unbiased estimator of the prediction risk using the Stein Unbiased Risk Estimator (SURE).
Tasks	Super-Resolution
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03577v1
PDF	https://arxiv.org/pdf/1911.03577v1.pdf
PWC	https://paperswithcode.com/paper/degrees-of-freedom-for-off-the-grid-sparse
Repo
Framework
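
For orientation, the standard textbook quantities behind the abstract can be written out as follows. The notation is an assumption of this note, not taken from the paper: observations $y = \mu + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n)$, and fitted values $\hat{y}$.

```latex
\mathrm{df} \;=\; \sum_{i=1}^{n} \frac{\operatorname{cov}(\hat{y}_i,\, y_i)}{\sigma^2},
\qquad
\mathrm{SURE} \;=\; \lVert y - \hat{y} \rVert_2^2 \;-\; n\sigma^2 \;+\; 2\sigma^2\,\widehat{\mathrm{df}}
```

Here $\widehat{\mathrm{df}}$ is an unbiased estimate of $\mathrm{df}$, and $\mathrm{SURE}$ is then an unbiased estimate of the prediction risk $\mathbb{E}\lVert \hat{y} - \mu \rVert_2^2$. For the discrete Lasso, the abstract recalls that $\widehat{\mathrm{df}}$ can be taken as the size of the smallest solution support; the paper's contribution is the continuous counterpart for the Blasso.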

Addressing Data Bias Problems for Chest X-ray Image Report Generation


Title	Addressing Data Bias Problems for Chest X-ray Image Report Generation
Authors	Philipp Harzig, Yan-Ying Chen, Francine Chen, Rainer Lienhart
Abstract	Automatic medical report generation from chest X-ray images is one possibility for assisting doctors to reduce their workload. However, the different patterns and data distribution of normal and abnormal cases can bias machine learning models. Previous attempts did not focus on isolating the generation of the abnormal and normal sentences in order to increase the variability of generated paragraphs. To address this, we propose to separate abnormal and normal sentence generation by using two different word LSTMs in a hierarchical LSTM model. We conduct an analysis on the distinctiveness of generated sentences compared to the BLEU score, which increases when less distinct reports are generated. We hope our findings will help to encourage the development of new metrics to better verify methods of automatic medical report generation.
Tasks	Medical Report Generation
Published	2019-08-06
URL	https://arxiv.org/abs/1908.02123v1
PDF	https://arxiv.org/pdf/1908.02123v1.pdf
PWC	https://paperswithcode.com/paper/addressing-data-bias-problems-for-chest-x-ray
Repo
Framework

Relation extraction between the clinical entities based on the shortest dependency path based LSTM


Title	Relation extraction between the clinical entities based on the shortest dependency path based LSTM
Authors	Dhanachandra Ningthoujam, Shweta Yadav, Pushpak Bhattacharyya, Asif Ekbal
Abstract	Owing to the exponential rise in the electronic medical records, information extraction in this domain is becoming an important area of research in recent years. Relation extraction between the medical concepts such as medical problem, treatment, and test etc. is also one of the most important tasks in this area. In this paper, we present an efficient relation extraction system based on the shortest dependency path (SDP) generated from the dependency parsed tree of the sentence. Instead of relying on many handcrafted features and the whole sequence of tokens present in a sentence, our system relies only on the SDP between the target entities. For every pair of entities, the system takes only the words in the SDP, their dependency labels, Part-of-Speech information and the types of the entities as the input. We develop a dependency parser for extracting dependency information. We perform our experiments on the benchmark i2b2 dataset for clinical relation extraction challenge 2010. Experimental results show that our system outperforms the existing systems.
Tasks	Relation Extraction
Published	2019-03-24
URL	http://arxiv.org/abs/1903.09941v1
PDF	http://arxiv.org/pdf/1903.09941v1.pdf
PWC	https://paperswithcode.com/paper/relation-extraction-between-the-clinical
Repo
Framework
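
To make the shortest dependency path idea concrete, the sketch below finds the path between two entity words with a breadth-first search over an undirected view of a dependency tree. The toy parse, the function name, and the use of bare words as node identifiers (which assumes every token in the sentence is unique) are assumptions for illustration. The paper's parser and its LSTM model are not reproduced here.

```python
from collections import deque

def shortest_dependency_path(edges, source, target):
    """Breadth-first search over an undirected view of the dependency tree."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # walk the parent links back to the source, then reverse
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None  # the two words are not connected

# Hypothetical parse of "The patient was given aspirin for chest pain",
# written as (head, dependent) pairs.
edges = [
    ("given", "patient"), ("patient", "The"), ("given", "was"),
    ("given", "aspirin"), ("given", "pain"), ("pain", "for"), ("pain", "chest"),
]
path = shortest_dependency_path(edges, "aspirin", "pain")
```

Only the words on this path, together with their dependency labels, part-of-speech tags, and entity types, would be passed to the downstream model, instead of the whole token sequence.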