February 2, 2020

Paper Group AWR 58

Title TIGS: An Inference Algorithm for Text Infilling with Gradient Search
Authors Dayiheng Liu, Jie Fu, Pengfei Liu, Jiancheng Lv
Abstract Text infilling is defined as a task for filling in the missing part of a sentence or paragraph, which is suitable for many real-world natural language generation scenarios. However, given a well-trained sequential generative model, generating missing symbols conditioned on the context is challenging for existing greedy approximate inference algorithms. In this paper, we propose an iterative inference algorithm based on gradient search, which is the first inference algorithm that can be broadly applied to any neural sequence generative models for text infilling tasks. We compare the proposed method with strong baselines on three text infilling tasks with various mask ratios and different mask strategies. The results show that our proposed method is effective and efficient for fill-in-the-blank tasks, consistently outperforming all baselines.
Tasks Text Generation, Text Infilling
Published 2019-05-26
URL https://arxiv.org/abs/1905.10752v1
PDF https://arxiv.org/pdf/1905.10752v1.pdf
PWC https://paperswithcode.com/paper/190510752
Repo https://github.com/dayihengliu/Text-Infilling-Gradient-Search
Framework tf

Estimating the effective dimension of large biological datasets using Fisher separability analysis

Title Estimating the effective dimension of large biological datasets using Fisher separability analysis
Authors Luca Albergante, Jonathan Bac, Andrei Zinovyev
Abstract Modern large-scale datasets are frequently said to be high-dimensional. However, their data point clouds frequently possess structures, significantly decreasing their intrinsic dimensionality (ID) due to the presence of clusters, points being located close to low-dimensional varieties or fine-grained lumping. We test a recently introduced dimensionality estimator, based on analysing the separability properties of data points, on several benchmarks and real biological datasets. We show that the introduced measure of ID has performance competitive with state-of-the-art measures, being efficient across a wide range of dimensions and performing better in the case of noisy samples. Moreover, it allows estimating the intrinsic dimension in situations where the intrinsic manifold assumption is not valid.
Tasks
Published 2019-01-18
URL http://arxiv.org/abs/1901.06328v1
PDF http://arxiv.org/pdf/1901.06328v1.pdf
PWC https://paperswithcode.com/paper/estimating-the-effective-dimension-of-large
Repo https://github.com/auranic/FisherSeparabilityAnalysis
Framework none

Adversarial Attacks on Deep Neural Networks for Time Series Classification

Title Adversarial Attacks on Deep Neural Networks for Time Series Classification
Authors Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller
Abstract Time Series Classification (TSC) problems are encountered in many real life data mining tasks ranging from medicine and security to human activity recognition and food safety. With the recent success of deep neural networks in various domains such as computer vision and natural language processing, researchers started adopting these techniques for solving time series data mining problems. However, to the best of our knowledge, no previous work has considered the vulnerability of deep learning models to adversarial time series examples, which could potentially make them unreliable in situations where the decision taken by the classifier is crucial such as in medicine and security. For computer vision problems, such attacks have been shown to be very easy to perform by altering the image and adding an imperceptible amount of noise to trick the network into wrongly classifying the input image. Following this line of work, we propose to leverage existing adversarial attack mechanisms to add a special noise to the input time series in order to decrease the network’s confidence when classifying instances at test time. Our results reveal that current state-of-the-art deep learning time series classifiers are vulnerable to adversarial attacks which can have major consequences in multiple domains such as food safety and quality assurance.
Tasks Activity Recognition, Adversarial Attack, Human Activity Recognition, Time Series, Time Series Classification
Published 2019-03-17
URL http://arxiv.org/abs/1903.07054v2
PDF http://arxiv.org/pdf/1903.07054v2.pdf
PWC https://paperswithcode.com/paper/adversarial-attacks-on-deep-neural-networks
Repo https://github.com/hfawaz/ijcnn19attacks
Framework tf
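
The abstract above describes adding a small, targeted noise to an input series to lower the classifier's confidence. A minimal sketch of one well-known existing attack of this kind, the fast gradient sign method, is given below. It assumes a toy logistic-regression classifier rather than a deep network, and the names `predict`, `fgsm`, and `eps` are illustrative choices, not taken from the paper or its repository.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a toy logistic model (an assumption)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """Perturb series x by eps in the direction that increases the loss.

    For binary cross-entropy with a logistic model, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


w, b = [1.0, -2.0, 0.5], 0.0
x, y = [0.5, -0.5, 1.0], 1
x_adv = fgsm(w, b, x, y, eps=0.1)
print(predict(w, b, x), predict(w, b, x_adv))
```

Each time step moves by at most `eps`, so the perturbed series stays close to the original while the predicted probability of the true class drops.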

Feature Relevance Bounds for Ordinal Regression

Title Feature Relevance Bounds for Ordinal Regression
Authors Lukas Pfannschmidt, Jonathan Jakob, Michael Biehl, Peter Tino, Barbara Hammer
Abstract The increasing occurrence of ordinal data, mainly sociodemographic, led to a renewed research interest in ordinal regression, i.e. the prediction of ordered classes. Besides model accuracy, the interpretation of these models itself is of high relevance, and existing approaches therefore enforce e.g. model sparsity. For high dimensional or highly correlated data, however, this might be misleading due to strong variable dependencies. In this contribution, we aim for an identification of feature relevance bounds which - besides identifying all relevant features - explicitly differentiates between strongly and weakly relevant features.
Tasks
Published 2019-02-20
URL http://arxiv.org/abs/1902.07662v1
PDF http://arxiv.org/pdf/1902.07662v1.pdf
PWC https://paperswithcode.com/paper/feature-relevance-bounds-for-ordinal
Repo https://github.com/lpfann/fri
Framework none

What to Expect of Classifiers? Reasoning about Logistic Regression with Missing Features

Title What to Expect of Classifiers? Reasoning about Logistic Regression with Missing Features
Authors Pasha Khosravi, Yitao Liang, YooJung Choi, Guy Van den Broeck
Abstract While discriminative classifiers often yield strong predictive performance, missing feature values at prediction time can still be a challenge. Classifiers may not behave as expected under certain ways of substituting the missing values, since they inherently make assumptions about the data distribution they were trained on. In this paper, we propose a novel framework that classifies examples with missing features by computing the expected prediction with respect to a feature distribution. Moreover, we use geometric programming to learn a naive Bayes distribution that embeds a given logistic regression classifier and can efficiently take its expected predictions. Empirical evaluations show that our model achieves the same performance as the logistic regression with all features observed, and outperforms standard imputation techniques when features go missing during prediction time. Furthermore, we demonstrate that our method can be used to generate “sufficient explanations” of logistic regression classifications, by removing features that do not affect the classification.
Tasks Imputation
Published 2019-03-05
URL https://arxiv.org/abs/1903.01620v2
PDF https://arxiv.org/pdf/1903.01620v2.pdf
PWC https://paperswithcode.com/paper/what-to-expect-of-classifiers-reasoning-about
Repo https://github.com/UCLA-StarAI/NaCL
Framework none

Benchmarking HillVallEA for the GECCO 2019 Competition on Multimodal Optimization

Title Benchmarking HillVallEA for the GECCO 2019 Competition on Multimodal Optimization
Authors S. C. Maree, T. Alderliesten, P. A. N. Bosman
Abstract This report presents benchmarking results of the Hill-Valley Evolutionary Algorithm version 2019 (HillVallEA19) on the CEC2013 niching benchmark suite under the restrictions of the GECCO 2019 niching competition on multimodal optimization. Performance is compared to algorithms that participated in previous editions of the niching competition.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.10988v1
PDF https://arxiv.org/pdf/1907.10988v1.pdf
PWC https://paperswithcode.com/paper/benchmarking-hillvallea-for-the-gecco-2019
Repo https://github.com/scmaree/HillVallEA
Framework none

Sinkhorn Divergence of Topological Signature Estimates for Time Series Classification

Title Sinkhorn Divergence of Topological Signature Estimates for Time Series Classification
Authors Colin Stephen
Abstract Distinguishing between classes of time series sampled from dynamic systems is a common challenge in systems and control engineering, for example in the context of health monitoring, fault detection, and quality control. The challenge is increased when no underlying model of a system is known, measurement noise is present, and long signals need to be interpreted. In this paper we address these issues with a new non parametric classifier based on topological signatures. Our model learns classes as weighted kernel density estimates (KDEs) over persistent homology diagrams and predicts new trajectory labels using Sinkhorn divergences on the space of diagram KDEs to quantify proximity. We show that this approach accurately discriminates between states of chaotic systems that are close in parameter space, and its performance is robust to noise.
Tasks Fault Detection, Time Series, Time Series Classification
Published 2019-02-14
URL http://arxiv.org/abs/1902.05326v1
PDF http://arxiv.org/pdf/1902.05326v1.pdf
PWC https://paperswithcode.com/paper/sinkhorn-divergence-of-topological-signature
Repo https://github.com/colinstephen/icmla2018
Framework none

Coherent Point Drift Networks: Unsupervised Learning of Non-Rigid Point Set Registration

Title Coherent Point Drift Networks: Unsupervised Learning of Non-Rigid Point Set Registration
Authors Lingjing Wang, Xiang Li, Jianchun Chen, Yi Fang
Abstract Given new pairs of source and target point sets, standard point set registration methods often repeatedly conduct the independent iterative search of desired geometric transformation to align the source point set with the target one. This limits their use in applications to handle the real-time point set registration with large volume dataset. This paper presents a novel method, named coherent point drift networks (CPD-Net), for the unsupervised learning of geometric transformation towards real-time non-rigid point set registration. In contrast to previous efforts (e.g. coherent point drift), CPD-Net can learn displacement field function to estimate geometric transformation from a training dataset, consequently, to predict the desired geometric transformation for the alignment of previously unseen pairs without any additional iterative optimization process. Furthermore, CPD-Net leverages the power of deep neural networks to fit an arbitrary function, that adaptively accommodates different levels of complexity of the desired geometric transformation. Particularly, CPD-Net is proved with a theoretical guarantee to learn a continuous displacement vector function that could further avoid imposing additional parametric smoothness constraint as in previous works. Our experiments verify the impressive performance of CPD-Net for non-rigid point set registration on various 2D/3D datasets, even in the presence of significant displacement noise, outliers, and missing points. Our code will be available at https://github.com/nyummvc/CPD-Net.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03039v5
PDF https://arxiv.org/pdf/1906.03039v5.pdf
PWC https://paperswithcode.com/paper/coherent-point-drift-networks-unsupervised
Repo https://github.com/Lingjing324/CPD-Net
Framework pytorch

Early Recognition of Sepsis with Gaussian Process Temporal Convolutional Networks and Dynamic Time Warping

Title Early Recognition of Sepsis with Gaussian Process Temporal Convolutional Networks and Dynamic Time Warping
Authors Michael Moor, Max Horn, Bastian Rieck, Damian Roqueiro, Karsten Borgwardt
Abstract Sepsis is a life-threatening host response to infection associated with high mortality, morbidity, and health costs. Its management is highly time-sensitive since each hour of delayed treatment increases mortality due to irreversible organ damage. Meanwhile, despite decades of clinical research, robust biomarkers for sepsis are missing. Therefore, detecting sepsis early by utilizing the affluence of high-resolution intensive care records has become a challenging machine learning problem. Recent advances in deep learning and data mining promise to deliver a powerful set of tools to efficiently address this task. This empirical study proposes two novel approaches for the early detection of sepsis: a deep learning model and a lazy learner based on time series distances. Our deep learning model employs a temporal convolutional network that is embedded in a Multi-task Gaussian Process Adapter framework, making it directly applicable to irregularly-spaced time series data. Our lazy learner, by contrast, is an ensemble approach that employs dynamic time warping. We frame the timely detection of sepsis as a supervised time series classification task. For this, we derive the most recent sepsis definition in an hourly resolution to provide the first fully accessible early sepsis detection environment. Seven hours before sepsis onset, our methods improve area under the precision–recall curve from 0.25 to 0.35/0.40 over the state of the art. This demonstrates that they are well-suited for detecting sepsis in the crucial earlier stages when management is most effective.
Tasks Time Series, Time Series Classification
Published 2019-02-05
URL http://arxiv.org/abs/1902.01659v3
PDF http://arxiv.org/pdf/1902.01659v3.pdf
PWC https://paperswithcode.com/paper/temporal-convolutional-networks-and-dynamic
Repo https://github.com/BorgwardtLab/mgp-tcn
Framework tf
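
The lazy learner mentioned in the abstract above relies on dynamic time warping. As a rough illustration only, the sketch below computes the textbook DTW distance between two one-dimensional sequences. It is not the authors' ensemble, and the function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, or step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

A nearest-neighbour classifier built on such a distance assigns a new series the label of its closest training series, which is what makes the approach "lazy": no model is fitted ahead of time.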

Text Infilling

Title Text Infilling
Authors Wanrong Zhu, Zhiting Hu, Eric Xing
Abstract Recent years have seen remarkable progress of text generation in different contexts, such as the most common setting of generating text from scratch, and the emerging paradigm of retrieval-and-rewriting. Text infilling, which fills missing text portions of a sentence or paragraph, is also of numerous use in real life, yet is under-explored. Previous work has focused on restricted settings by either assuming single word per missing portion or limiting to a single missing portion to the end of the text. This paper studies the general task of text infilling, where the input text can have an arbitrary number of portions to be filled, each of which may require an arbitrary unknown number of tokens. We study various approaches for the task, including a self-attention model with segment-aware position encoding and bidirectional context modeling. We create extensive supervised data by masking out text with varying strategies. Experiments show the self-attention model greatly outperforms others, creating a strong baseline for future research.
Tasks Text Generation, Text Infilling
Published 2019-01-01
URL http://arxiv.org/abs/1901.00158v2
PDF http://arxiv.org/pdf/1901.00158v2.pdf
PWC https://paperswithcode.com/paper/text-infilling
Repo https://github.com/VegB/Text_Infilling
Framework tf
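
The abstract above describes creating supervised data by masking out an arbitrary number of portions, each covering an unknown number of tokens. One hypothetical masking strategy is sketched below. The placeholder string `__m__`, the parameters `mask_prob` and `max_span_len`, and both function names are assumptions for illustration and are not drawn from the paper or its repository.

```python
import random


def mask_spans(tokens, mask_prob=0.3, max_span_len=3, placeholder="__m__", rng=random):
    """Replace random spans with one placeholder each; return template and answers."""
    template, answers = [], []
    i = 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            length = rng.randint(1, max_span_len)
            answers.append(tokens[i:i + length])
            template.append(placeholder)
            i += length
            # keep the next token so two blanks never sit side by side
            if i < len(tokens):
                template.append(tokens[i])
                i += 1
        else:
            template.append(tokens[i])
            i += 1
    return template, answers


def fill(template, answers, placeholder="__m__"):
    """Invert mask_spans by substituting each placeholder with its answer span."""
    remaining = iter(answers)
    out = []
    for tok in template:
        if tok == placeholder:
            out.extend(next(remaining))
        else:
            out.append(tok)
    return out


tokens = "the quick brown fox jumps over the lazy dog".split()
template, answers = mask_spans(tokens, rng=random.Random(1))
print(template, answers)
```

Because a single placeholder can stand for one or several tokens, the model has to infer the length of each missing portion as well as its content.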

Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space

Title Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space
Authors Matthew C. Fontaine, Julian Togelius, Stefanos Nikolaidis, Amy K. Hoover
Abstract Quality Diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are a new class of population-based stochastic algorithms designed to generate a diverse collection of quality solutions. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the dynamic self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments with standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains. Code is available for both the continuous optimization benchmark (https://github.com/tehqin/QualDivBenchmark) and Hearthstone (https://github.com/tehqin/EvoStone) domains.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02400v1
PDF https://arxiv.org/pdf/1912.02400v1.pdf
PWC https://paperswithcode.com/paper/covariance-matrix-adaptation-for-the-rapid
Repo https://github.com/tehqin/QualDivBenchmark
Framework none

Keep Calm and Switch On! Preserving Sentiment and Fluency in Semantic Text Exchange

Title Keep Calm and Switch On! Preserving Sentiment and Fluency in Semantic Text Exchange
Authors Steven Y. Feng, Aaron W. Li, Jesse Hoey
Abstract In this paper, we present a novel method for measurably adjusting the semantics of text while preserving its sentiment and fluency, a task we call semantic text exchange. This is useful for text data augmentation and the semantic correction of text generated by chatbots and virtual assistants. We introduce a pipeline called SMERTI that combines entity replacement, similarity masking, and text infilling. We measure our pipeline’s success by its Semantic Text Exchange Score (STES): the ability to preserve the original text’s sentiment and fluency while adjusting semantic content. We propose to use masking (replacement) rate threshold as an adjustable parameter to control the amount of semantic change in the text. Our experiments demonstrate that SMERTI can outperform baseline models on Yelp reviews, Amazon reviews, and news headlines.
Tasks Data Augmentation, Text Infilling
Published 2019-08-30
URL https://arxiv.org/abs/1909.00088v1
PDF https://arxiv.org/pdf/1909.00088v1.pdf
PWC https://paperswithcode.com/paper/keep-calm-and-switch-on-preserving-sentiment
Repo https://github.com/styfeng/SMERTI
Framework none

Tuning parameter calibration for prediction in personalized medicine

Title Tuning parameter calibration for prediction in personalized medicine
Authors Shih-Ting Huang, Yannick Düren, Kristoffer H. Hellton, Johannes Lederer
Abstract Personalized medicine has become an important part of medicine, for instance predicting individual drug responses based on genomic information. However, many current statistical methods are not tailored to this task, because they overlook the individual heterogeneity of patients. In this paper, we look at personalized medicine from a linear regression standpoint. We introduce an alternative version of the ridge estimator and target individuals by establishing a tuning parameter calibration scheme that minimizes prediction errors of individual patients. In stark contrast, classical schemes such as cross-validation minimize prediction errors only on average. We show that our pipeline is optimal in terms of oracle inequalities, fast, and highly effective both in simulations and on real data.
Tasks Calibration
Published 2019-09-23
URL https://arxiv.org/abs/1909.10635v3
PDF https://arxiv.org/pdf/1909.10635v3.pdf
PWC https://paperswithcode.com/paper/tuning-parameter-calibration-for-prediction
Repo https://github.com/LedererLab/personalized_medicine
Framework none

Scalable and Efficient Hypothesis Testing with Random Forests

Title Scalable and Efficient Hypothesis Testing with Random Forests
Authors Tim Coleman, Wei Peng, Lucas Mentch
Abstract Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods. While their black-box nature has made their mathematical analysis difficult, recent work has established important statistical properties like consistency and asymptotic normality by considering subsampling in lieu of bootstrapping. Though such results open the door to traditional inference procedures, all formal methods suggested thus far place severe restrictions on the testing framework and their computational overhead precludes their practical scientific use. Here we propose a permutation-style testing approach to formally assess feature significance. We establish asymptotic validity of the test via exchangeability arguments and show that the test maintains high power with orders of magnitude fewer computations. As importantly, the procedure scales easily to big data settings where large training and testing sets may be employed without the need to construct additional models. Simulations and applications to ecological data where random forests have recently shown promise are provided.
Tasks
Published 2019-04-16
URL https://arxiv.org/abs/1904.07830v3
PDF https://arxiv.org/pdf/1904.07830v3.pdf
PWC https://paperswithcode.com/paper/scalable-and-efficient-hypothesis-testing
Repo https://github.com/tim-coleman/RFtest
Framework none

On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

Title On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks
Authors Sunil Thulasidasan, Gopinath Chennupati, Jeff Bilmes, Tanmoy Bhattacharya, Sarah Michalak
Abstract Mixup (Zhang et al., 2017) is a recently proposed method for training deep neural networks where additional samples are generated during training by convexly combining random pairs of images and their associated labels. While simple to implement, it has been shown to be a surprisingly effective method of data augmentation for image classification: DNNs trained with mixup show noticeable gains in classification performance on a number of image classification benchmarks. In this work, we discuss a hitherto untouched aspect of mixup training – the calibration and predictive uncertainty of models trained with mixup. We find that DNNs trained with mixup are significantly better calibrated – i.e., the predicted softmax scores are much better indicators of the actual likelihood of a correct prediction – than DNNs trained in the regular fashion. We conduct experiments on a number of image classification architectures and datasets – including large-scale datasets like ImageNet – and find this to be the case. Additionally, we find that merely mixing features does not result in the same calibration benefit and that the label smoothing in mixup training plays a significant role in improving calibration. Finally, we also observe that mixup-trained DNNs are less prone to over-confident predictions on out-of-distribution and random-noise data. We conclude that the typical overconfidence seen in neural networks, even on in-distribution data, is likely a consequence of training with hard labels, suggesting that mixup be employed for classification tasks where predictive uncertainty is a significant concern.
Tasks Calibration, Data Augmentation, Image Classification
Published 2019-05-27
URL https://arxiv.org/abs/1905.11001v5
PDF https://arxiv.org/pdf/1905.11001v5.pdf
PWC https://paperswithcode.com/paper/on-mixup-training-improved-calibration-and
Repo https://github.com/MacroMayhem/OnMixup
Framework pytorch
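
The convex combination of input pairs and labels described in the abstract above can be sketched in a few lines. This is an illustrative version on plain Python lists rather than image tensors. The mixing weight is drawn from a Beta(alpha, alpha) distribution as in the original mixup formulation, while the default `alpha=0.4` and the name `mixup_pair` are assumptions made here.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Return a convex combination of two examples and their one-hot labels."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy "images" with one-hot labels for classes 0 and 1.
x, y, lam = mixup_pair([0.0, 0.0], [1, 0], [1.0, 1.0], [0, 1], rng=random.Random(0))
print(x, y, lam)
```

The mixed label is soft: it puts weight `lam` on the first class and `1 - lam` on the second, which is the label-smoothing effect the abstract credits for better calibration.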