April 3, 2020

3225 words 16 mins read

Paper Group ANR 77

Half-empty or half-full? A Hybrid Approach to Predict Recycling Behavior of Consumers to Increase Reverse Vending Machine Uptime. A Neural Architecture for Person Ontology population. HAMLET – A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection. Comparing Different Deep Learning Architectures for Classification of Chest Radiographs …

Half-empty or half-full? A Hybrid Approach to Predict Recycling Behavior of Consumers to Increase Reverse Vending Machine Uptime

Title Half-empty or half-full? A Hybrid Approach to Predict Recycling Behavior of Consumers to Increase Reverse Vending Machine Uptime
Authors Jannis Walk, Robin Hirt, Niklas Kühl, Erik R. Hersløv
Abstract Reverse Vending Machines (RVMs) are a proven instrument for facilitating closed-loop plastic packaging recycling. A good customer experience at the RVM is crucial for a further proliferation of this technology. Bin full events are the major reason for Reverse Vending Machine (RVM) downtime at the world leader in the RVM market. The paper at hand develops and evaluates an approach based on machine learning and statistical approximation to foresee bin full events and thus increase the uptime of RVMs. Our approach relies on forecasting the hourly time series of returned beverage containers at a given RVM. We contribute by developing and evaluating an approach for hourly forecasts in a retail setting - this combination of application domain and forecast granularity is novel. A trace-driven simulation confirms that the forecasting-based approach leads to less downtime and lower costs than naive emptying strategies.
Tasks Time Series
Published 2020-03-30
URL https://arxiv.org/abs/2003.13304v1
PDF https://arxiv.org/pdf/2003.13304v1.pdf
PWC https://paperswithcode.com/paper/half-empty-or-half-full-a-hybrid-approach-to
Repo
Framework
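
The core of the approach is an hourly forecast of returned containers that is then turned into a bin-full prediction. Below is a minimal sketch of that idea using a simple hour-of-week seasonal average in place of the paper's hybrid machine-learning/statistical model; the synthetic data and the `BIN_CAPACITY` constant are illustrative assumptions, not values from the paper.

```python
import numpy as np
import pandas as pd

# Synthetic hourly counts of returned containers for one RVM (assumption:
# a weekly pattern plus Poisson noise stands in for real telemetry).
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=6 * 7 * 24, freq="H")
weekly = 20 + 15 * np.sin(2 * np.pi * (idx.hour + 24 * idx.dayofweek) / (7 * 24))
counts = pd.Series(rng.poisson(np.clip(weekly, 1, None)), index=idx)

train, test = counts.iloc[:-7 * 24], counts.iloc[-7 * 24:]

# Seasonal-average forecaster: predict each hour from the mean of the same
# hour-of-week in the training history.
hour_of_week = train.index.dayofweek * 24 + train.index.hour
profile = train.groupby(hour_of_week).mean()
forecast = pd.Series(
    profile[test.index.dayofweek * 24 + test.index.hour].to_numpy(),
    index=test.index,
)

# Turn the forecast into a bin-full prediction: accumulate predicted returns
# since the last emptying and flag the first hour the bin is expected to be full.
BIN_CAPACITY = 800  # containers; illustrative value, not from the paper
fill_level = forecast.cumsum()
predicted_full = fill_level[fill_level >= BIN_CAPACITY].index.min()
print("predicted bin-full time:", predicted_full)
```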

A Neural Architecture for Person Ontology population

Title A Neural Architecture for Person Ontology population
Authors Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald
Abstract A person ontology comprising concepts, attributes and relationships of people has a number of applications in data protection, de-identification, population of knowledge graphs for business intelligence and fraud prevention. While artificial neural networks have led to improvements in Entity Recognition, Entity Classification, and Relation Extraction, creating an ontology largely remains a manual process, because it requires a fixed set of semantic relations between concepts. In this work, we present a system for automatically populating a person ontology graph from unstructured data using neural models for Entity Classification and Relation Extraction. We introduce a new dataset for these tasks and discuss our results.
Tasks Knowledge Graphs, Relation Extraction
Published 2020-01-22
URL https://arxiv.org/abs/2001.08013v1
PDF https://arxiv.org/pdf/2001.08013v1.pdf
PWC https://paperswithcode.com/paper/a-neural-architecture-for-person-ontology
Repo
Framework
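
Once neural Entity Classification and Relation Extraction have produced typed entities and relations, populating the ontology graph itself is a mechanical step. A minimal sketch of that last step follows; the triples and type labels are made up for illustration, and the neural extraction models themselves are not reproduced here.

```python
import networkx as nx

# Hypothetical outputs of the neural extraction pipeline:
# (entity, fine-grained type) pairs and (head, relation, tail) triples.
entities = [
    ("Ada Lovelace", "person/scientist"),
    ("Charles Babbage", "person/scientist"),
    ("London", "location/city"),
]
relations = [
    ("Ada Lovelace", "colleague_of", "Charles Babbage"),
    ("Ada Lovelace", "born_in", "London"),
]

# Populate the person ontology as a typed, directed multigraph.
ontology = nx.MultiDiGraph()
for name, etype in entities:
    ontology.add_node(name, type=etype)
for head, rel, tail in relations:
    ontology.add_edge(head, tail, relation=rel)

for head, tail, data in ontology.edges(data=True):
    print(f"{head} --{data['relation']}--> {tail}")
```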

HAMLET – A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection

Title HAMLET – A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection
Authors Mischa Schmidt, Julia Gastinger, Sébastien Nicolas, Anett Schülke
Abstract Automated algorithm selection and hyperparameter tuning facilitates the application of machine learning. Traditional multi-armed bandit strategies look to the history of observed rewards to identify the most promising arms for optimizing expected total reward in the long run. When considering limited time budgets and computational resources, this backward view of rewards is inappropriate as the bandit should look into the future for anticipating the highest final reward at the end of a specified time budget. This work addresses that insight by introducing HAMLET, which extends the bandit approach with learning curve extrapolation and computation time-awareness for selecting among a set of machine learning algorithms. Results show that the HAMLET Variants 1-3 exhibit equal or better performance than other bandit-based algorithm selection strategies in experiments with recorded hyperparameter tuning traces for the majority of considered time budgets. The best performing HAMLET Variant 3 combines learning curve extrapolation with the well-known upper confidence bound exploration bonus. That variant performs better than all non-HAMLET policies with statistical significance at the 95% level for 1,485 runs.
Tasks
Published 2020-01-30
URL https://arxiv.org/abs/2001.11261v1
PDF https://arxiv.org/pdf/2001.11261v1.pdf
PWC https://paperswithcode.com/paper/hamlet-a-learning-curve-enabled-multi-armed
Repo
Framework
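
A minimal sketch of the underlying idea: score each algorithm (arm) by an extrapolated final accuracy plus a UCB-style exploration bonus, rather than by its average reward so far. The linear extrapolation and the toy learning curves below are simplifying assumptions for illustration, not the HAMLET variants themselves.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "true" learning curves for three algorithms: accuracy as a function of
# the number of tuning iterations spent on each (assumption for illustration).
def true_curve(arm, t):
    plateaus = [0.80, 0.88, 0.85]
    speeds = [0.30, 0.05, 0.15]
    return plateaus[arm] * (1 - np.exp(-speeds[arm] * t))

n_arms, budget = 3, 60
pulls = [[] for _ in range(n_arms)]

def extrapolated_reward(history, steps_ahead):
    """Crude learning-curve extrapolation: fit a line to the recent observations
    and predict the value at the end of the remaining budget, capped at 1."""
    if len(history) < 2:
        return 1.0  # optimism for barely-explored arms
    recent = history[-5:]
    slope = (recent[-1] - recent[0]) / max(len(recent) - 1, 1)
    return min(recent[-1] + slope * steps_ahead, 1.0)

for t in range(1, budget + 1):
    remaining = budget - t
    scores = []
    for arm in range(n_arms):
        bonus = np.sqrt(2 * np.log(t) / max(len(pulls[arm]), 1))  # UCB bonus
        scores.append(extrapolated_reward(pulls[arm], remaining) + bonus)
    arm = int(np.argmax(scores))
    obs = true_curve(arm, len(pulls[arm]) + 1) + rng.normal(0, 0.01)
    pulls[arm].append(obs)

print("pulls per arm:", [len(p) for p in pulls])
print("best observed accuracy:", max(max(p) for p in pulls))
```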

Comparing Different Deep Learning Architectures for Classification of Chest Radiographs

Title Comparing Different Deep Learning Architectures for Classification of Chest Radiographs
Authors Keno K. Bressem, Lisa Adams, Christoph Erxleben, Bernd Hamm, Stefan Niehues, Janis Vahldiek
Abstract Chest radiographs are among the most frequently acquired images in radiology and are often the subject of computer vision research. However, most of the models used to classify chest radiographs are derived from openly available deep neural networks trained on large image datasets. These datasets routinely differ from chest radiographs in that they are mostly color images and contain several possible image classes, while radiographs are greyscale images and often contain fewer image classes. Therefore, very deep neural networks, which can represent more complex relationships in image features, might not be required for the comparatively simpler task of classifying grayscale chest radiographs. We compared fifteen different architectures of artificial neural networks regarding training time and performance on the openly available CheXpert dataset to identify the most suitable models for deep learning tasks on chest radiographs. We showed that smaller networks such as ResNet-34, AlexNet or VGG-16 have the potential to classify chest radiographs as precisely as deeper neural networks such as DenseNet-201 or ResNet-151, while being less computationally demanding.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08991v1
PDF https://arxiv.org/pdf/2002.08991v1.pdf
PWC https://paperswithcode.com/paper/comparing-different-deep-learning
Repo
Framework
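
A sketch of how one of the smaller architectures from the comparison, ResNet-34, can be adapted to single-channel chest radiographs with torchvision. The five-label output head, input size and optimizer settings are illustrative assumptions, not the authors' exact CheXpert training setup.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # assumption: e.g. the five CheXpert competition labels

model = models.resnet34(pretrained=True)

# Radiographs are greyscale, so replace the first convolution (3 input
# channels -> 1) and the final fully connected layer (1000 classes -> 5).
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Multi-label classification: one sigmoid/BCE output per finding.
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(2, 1, 320, 320)   # dummy batch of greyscale radiographs
y = torch.randint(0, 2, (2, num_classes)).float()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```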

Data-based design of stabilizing switching signals for discrete-time switched linear systems

Title Data-based design of stabilizing switching signals for discrete-time switched linear systems
Authors Atreyee Kundu
Abstract This paper deals with stabilization of discrete-time switched linear systems when explicit knowledge of the state-space models of their subsystems is not available. Given the sets of indices of the stable and unstable subsystems, the set of admissible switches between the subsystems, the admissible dwell times on the subsystems and a simulation model from which finite traces of state trajectories of the switched system can be collected, we devise an algorithm that designs periodic switching signals which preserve stability of the resulting switched system. For this purpose, we combine two ingredients: (a) data-based stability analysis of discrete-time linear systems and (b) multiple Lyapunov-like functions and graph walks based design of stabilizing switching signals. A numerical example is presented to demonstrate the proposed algorithm.
Tasks
Published 2020-03-11
URL https://arxiv.org/abs/2003.05774v1
PDF https://arxiv.org/pdf/2003.05774v1.pdf
PWC https://paperswithcode.com/paper/data-based-design-of-stabilizing-switching
Repo
Framework
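
A small numerical sketch of the setting: a periodic switching signal applied to a mix of stable and unstable subsystems, with stability judged empirically from simulated state trajectories. The random matrices and the specific period are illustrative assumptions; the paper's contribution is designing such a signal from data, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two subsystems: index 0 is (Schur) stable, index 1 is unstable.
A = [np.array([[0.5, 0.1], [0.0, 0.6]]),
     np.array([[1.1, 0.0], [0.2, 1.05]])]

# Candidate periodic switching signal: dwell 3 steps on the stable subsystem,
# then 1 step on the unstable one, repeated.
period = [0, 0, 0, 1]

def simulate(x0, steps=200):
    x = x0.copy()
    norms = []
    for k in range(steps):
        x = A[period[k % len(period)]] @ x
        norms.append(np.linalg.norm(x))
    return norms

norms = simulate(rng.normal(size=2))
print("state norm after 200 steps:", norms[-1])
print("trajectory decays:", norms[-1] < norms[0])
```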

Topological Data Analysis in Text Classification: Extracting Features with Additive Information

Title Topological Data Analysis in Text Classification: Extracting Features with Additive Information
Authors Shafie Gholizadeh, Ketki Savle, Armin Seyeditabari, Wlodek Zadrozny
Abstract While the strength of Topological Data Analysis has been explored in many studies on high dimensional numeric data, it is still a challenging task to apply it to text. As the primary goal in topological data analysis is to define and quantify the shapes in numeric data, defining shapes in text is much more challenging, even though the geometries of vector spaces and conceptual spaces are clearly relevant for information retrieval and semantics. In this paper, we examine two different methods of extracting topological features from text, using as the underlying representations of words the two most popular methods, namely word embeddings and TF-IDF vectors. To extract topological features from the word embedding space, we interpret the embedding of a text document as a high dimensional time series, and we analyze the topology of the underlying graph where the vertices correspond to different embedding dimensions. For topological data analysis with the TF-IDF representations, we analyze the topology of the graph whose vertices come from the TF-IDF vectors of different blocks in the textual document. In both cases, we apply homological persistence to reveal the geometric structures under different distance resolutions. Our results show that these topological features carry some exclusive information that is not captured by conventional text mining methods. In our experiments we observe that adding topological features to the conventional features in ensemble models improves the classification results (up to 5%). On the other hand, as expected, topological features by themselves may not be sufficient for effective classification. It is an open problem whether TDA features from word embeddings might be sufficient, as they seem to perform within a few points of the top results obtained with a linear support vector classifier.
Tasks Information Retrieval, Text Classification, Time Series, Topological Data Analysis, Word Embeddings
Published 2020-03-29
URL https://arxiv.org/abs/2003.13138v1
PDF https://arxiv.org/pdf/2003.13138v1.pdf
PWC https://paperswithcode.com/paper/topological-data-analysis-in-text
Repo
Framework
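
A rough sketch of the word-embedding variant: treat a document's embedding matrix as a multivariate time series, build a distance matrix between embedding dimensions from their correlations, and feed it to persistent homology. It relies on the `ripser` package and summarizes each diagram by total persistence; the random embedding matrix and that particular feature summary are assumptions, not the paper's exact pipeline.

```python
import numpy as np
from ripser import ripser  # pip install ripser

rng = np.random.default_rng(3)

# Stand-in for a document: 120 tokens, each with a 50-dimensional embedding
# (in the paper these would come from pretrained word embeddings).
doc_embeddings = rng.normal(size=(120, 50))

# Interpret each embedding dimension as a time series over token positions and
# turn their pairwise correlations into a distance matrix.
corr = np.corrcoef(doc_embeddings.T)          # (50, 50)
dist = np.sqrt(np.clip(1.0 - corr, 0.0, None))

# Persistent homology of the resulting weighted graph / point cloud.
diagrams = ripser(dist, maxdim=1, distance_matrix=True)["dgms"]

def total_persistence(dgm):
    finite = dgm[np.isfinite(dgm[:, 1])]
    return float(np.sum(finite[:, 1] - finite[:, 0]))

# Two topological features for this document: total persistence in H0 and H1.
features = [total_persistence(d) for d in diagrams]
print("topological features:", features)
```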

On the Convergence of Adam and Adagrad

Title On the Convergence of Adam and Adagrad
Authors Alexandre Défossez, Léon Bottou, Francis Bach, Nicolas Usunier
Abstract We provide a simple proof of the convergence of the optimization algorithms Adam and Adagrad with the assumptions of smooth gradients and almost sure uniform bound on the $\ell_\infty$ norm of the gradients. This work builds on the techniques introduced by Ward et al. (2019) and extends them to the Adam optimizer. We show that in expectation, the squared norm of the objective gradient averaged over the trajectory has an upper-bound which is explicit in the constants of the problem, parameters of the optimizer and the total number of iterations N. This bound can be made arbitrarily small. In particular, Adam with a learning rate $\alpha=1/\sqrt{N}$ and a momentum parameter on squared gradients $\beta_2=1 - 1/N$ achieves the same rate of convergence $O(\ln(N)/\sqrt{N})$ as Adagrad. Thus, it is possible to use Adam as a finite horizon version of Adagrad, much like constant step size SGD can be used instead of its asymptotically converging decaying step size version.
Tasks
Published 2020-03-05
URL https://arxiv.org/abs/2003.02395v1
PDF https://arxiv.org/pdf/2003.02395v1.pdf
PWC https://paperswithcode.com/paper/on-the-convergence-of-adam-and-adagrad
Repo
Framework
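
The practical takeaway, running Adam as a finite-horizon version of Adagrad, amounts to a specific hyperparameter choice. A minimal PyTorch sketch of that setting follows; the toy model, the value of N, and the default beta_1 are illustrative assumptions.

```python
import torch
import torch.nn as nn

N = 10_000                      # total number of iterations (finite horizon)
model = nn.Linear(20, 1)        # toy model; any parameters work here

# Learning rate alpha = 1/sqrt(N) and beta_2 = 1 - 1/N, the setting for which
# Adam matches Adagrad's O(ln(N)/sqrt(N)) rate; beta_1 is left at a typical default.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1.0 / N ** 0.5,
    betas=(0.9, 1.0 - 1.0 / N),
)

x, y = torch.randn(64, 20), torch.randn(64, 1)
for _ in range(100):            # a short slice of the N-step run
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
print("final loss:", loss.item())
```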

Anomalous Instance Detection in Deep Learning: A Survey

Title Anomalous Instance Detection in Deep Learning: A Survey
Authors Saikiran Bulusu, Bhavya Kailkhura, Bo Li, Pramod K. Varshney, Dawn Song
Abstract Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples resulting in incorrect outputs. To make DL more robust, several posthoc anomaly detection techniques to detect (and discard) these anomalous samples have been proposed in the recent past. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for DL based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide an easier yet better understanding of the techniques belonging to different categories in which research has been done on this topic. Finally, we highlight the unsolved research challenges while applying anomaly detection techniques in DL systems and present some high-impact future research directions.
Tasks Anomaly Detection
Published 2020-03-16
URL https://arxiv.org/abs/2003.06979v1
PDF https://arxiv.org/pdf/2003.06979v1.pdf
PWC https://paperswithcode.com/paper/anomalous-instance-detection-in-deep-learning
Repo
Framework
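
As a concrete instance of the post-hoc techniques the survey covers, the maximum-softmax-probability baseline flags inputs whose top class probability falls below a threshold. A minimal sketch follows; the toy classifier, inputs and threshold value are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
classifier = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

def max_softmax_score(x):
    """Post-hoc anomaly score: confidence of the most likely class."""
    with torch.no_grad():
        probs = torch.softmax(classifier(x), dim=-1)
    return probs.max(dim=-1).values

threshold = 0.5                       # illustrative; usually tuned on held-out data
in_dist = torch.randn(8, 16)          # stand-ins for in-distribution inputs
shifted = torch.randn(8, 16) * 5 + 3  # stand-ins for anomalous inputs

for name, batch in [("in-distribution", in_dist), ("shifted", shifted)]:
    scores = max_softmax_score(batch)
    flagged = (scores < threshold).sum().item()
    print(f"{name}: flagged {flagged}/{len(batch)} as anomalous")
```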

A copula-based visualization technique for a neural network

Title A copula-based visualization technique for a neural network
Authors Yusuke Kubo, Yuto Komori, Toyonobu Okuyama, Hiroshi Tokieda
Abstract Interpretability of machine learning is defined as the extent to which humans can comprehend the reason for a decision. However, a neural network is not considered interpretable due to the ambiguity in its decision-making process. Therefore, in this study, we propose a new algorithm that reveals which feature values the trained neural network considers important and which paths are mainly traced in the process of decision-making. In the proposed algorithm, we define a score based on the correlation coefficients between neural network layers, which can be calculated by applying the concept of a pair copula. In our experiments we compared the estimated score with the feature importance values of Random Forest, which is sometimes regarded as a highly interpretable algorithm, and confirmed that the results were consistent with each other. This algorithm also suggests an approach for compressing a neural network and tuning its parameters, because it identifies the paths that contribute to the classification or prediction results.
Tasks Decision Making, Feature Importance
Published 2020-03-27
URL https://arxiv.org/abs/2003.12317v1
PDF https://arxiv.org/pdf/2003.12317v1.pdf
PWC https://paperswithcode.com/paper/a-copula-based-visualization-technique-for-a
Repo
Framework
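
A simplified sketch of the measurement the method builds on: capture activations of successive layers with forward hooks and compute rank correlations between units of different layers. Spearman correlation stands in here for the pair-copula-based score, and the toy network is an assumption; this is not the paper's algorithm.

```python
import torch
import torch.nn as nn
from scipy.stats import spearmanr

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(8, 6), nn.Tanh(), nn.Linear(6, 4), nn.Tanh(), nn.Linear(4, 2)
)

# Record the activations of each Tanh layer with forward hooks.
activations = {}
def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for i, layer in enumerate(net):
    if isinstance(layer, nn.Tanh):
        layer.register_forward_hook(make_hook(f"tanh_{i}"))

x = torch.randn(256, 8)
net(x)

# Rank correlation between every unit of layer 1 and every unit of layer 3,
# a crude stand-in for the pair-copula-derived importance score.
a, b = activations["tanh_1"].numpy(), activations["tanh_3"].numpy()
rho, _ = spearmanr(a, b)                # joint correlation matrix over both blocks
cross = rho[: a.shape[1], a.shape[1]:]  # rows: layer-1 units, cols: layer-3 units
print("strongest unit-to-unit path:", abs(cross).max())
```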

Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach

Title Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach
Authors Yaoshu Wang, Chuan Xiao, Jianbin Qin, Xin Cao, Yifang Sun, Wei Wang, Makoto Onizuka
Abstract Due to the outstanding capability of capturing underlying data distributions, deep learning techniques have been recently utilized for a series of traditional database problems. In this paper, we investigate the possibilities of utilizing deep learning for cardinality estimation of similarity selection. Answering this problem accurately and efficiently is essential to many data management applications, especially for query optimization. Moreover, in some applications the estimated cardinality is supposed to be consistent and interpretable. Hence a monotonic estimation w.r.t. the query threshold is preferred. We propose a novel and generic method that can be applied to any data type and distance function. Our method consists of a feature extraction model and a regression model. The feature extraction model transforms original data and threshold to a Hamming space, in which a deep learning-based regression model is utilized to exploit the incremental property of cardinality w.r.t. the threshold for both accuracy and monotonicity. We develop a training strategy tailored to our model as well as techniques for fast estimation. We also discuss how to handle updates. We demonstrate the accuracy and the efficiency of our method through experiments, and show how it improves the performance of a query optimizer.
Tasks
Published 2020-02-15
URL https://arxiv.org/abs/2002.06442v3
PDF https://arxiv.org/pdf/2002.06442v3.pdf
PWC https://paperswithcode.com/paper/monotonic-cardinality-estimation-of
Repo
Framework
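
The monotonicity requirement, that estimated cardinality must not decrease as the query threshold grows, can be guaranteed by construction. Below is a generic sketch of one such construction (non-negative increments accumulated over thresholds); it illustrates the incremental property the paper exploits, not the paper's actual feature-extraction-plus-regression architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicCardinalityEstimator(nn.Module):
    """Predicts cardinality at integer thresholds 0..max_threshold as a cumulative
    sum of non-negative, query-dependent increments, so the estimate is monotone
    in the threshold by construction."""

    def __init__(self, query_dim, max_threshold):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(query_dim, 64), nn.ReLU(),
            nn.Linear(64, max_threshold + 1),
        )

    def forward(self, query_features):
        increments = F.softplus(self.net(query_features))   # >= 0
        return torch.cumsum(increments, dim=-1)             # monotone in threshold

model = MonotonicCardinalityEstimator(query_dim=32, max_threshold=16)
q = torch.randn(4, 32)                       # hypothetical query feature vectors
est = model(q)
print("monotone along threshold axis:", bool((est[:, 1:] >= est[:, :-1]).all()))
```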

Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks

Title Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks
Authors Ioana Bica, James Jordon, Mihaela van der Schaar
Abstract While much attention has been given to the problem of estimating the effect of discrete interventions from observational data, relatively little work has been done in the setting of continuous-valued interventions, such as treatments associated with a dosage parameter. In this paper, we tackle this problem by building on a modification of the generative adversarial networks (GANs) framework. Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions. The key idea is to use a significantly modified GAN model to learn to generate counterfactual outcomes, which can then be used to learn an inference model, using standard supervised methods, capable of estimating these counterfactuals for a new sample. To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator - we build a hierarchical discriminator that leverages the structure of the continuous intervention setting. Moreover, we provide theoretical results to support our use of the GAN framework and of the hierarchical discriminator. In the experiments section, we introduce a new semi-synthetic data simulation for use in the continuous intervention setting and demonstrate improvements over the existing benchmark models.
Tasks
Published 2020-02-27
URL https://arxiv.org/abs/2002.12326v1
PDF https://arxiv.org/pdf/2002.12326v1.pdf
PWC https://paperswithcode.com/paper/estimating-the-effects-of-continuous-valued
Repo
Framework

Visual Camera Re-Localization from RGB and RGB-D Images Using DSAC

Title Visual Camera Re-Localization from RGB and RGB-D Images Using DSAC
Authors Eric Brachmann, Carsten Rother
Abstract We describe a learning-based system that estimates the camera position and orientation from a single input image relative to a known environment. The system is flexible w.r.t. the amount of information available at test and at training time, catering to different applications. Input images can be RGB-D or RGB, and a 3D model of the environment can be utilized for training but is not necessary. In the minimal case, our system requires only RGB images and ground truth poses at training time, and it requires only a single RGB image at test time. The framework consists of a deep neural network and fully differentiable pose optimization. The neural network predicts so-called scene coordinates, i.e. dense correspondences between the input image and the 3D scene space of the environment. The pose optimization implements robust fitting of pose parameters using differentiable RANSAC (DSAC) to facilitate end-to-end training. The system, an extension of DSAC++ and referred to as DSAC*, achieves state-of-the-art accuracy on various public datasets for RGB-based re-localization, and competitive accuracy for RGB-D based re-localization.
Tasks
Published 2020-02-27
URL https://arxiv.org/abs/2002.12324v1
PDF https://arxiv.org/pdf/2002.12324v1.pdf
PWC https://paperswithcode.com/paper/visual-camera-re-localization-from-rgb-and
Repo
Framework
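
The final stage of such a pipeline fits a camera pose to the predicted 2D-3D scene-coordinate correspondences with RANSAC. The sketch below uses OpenCV's classical, non-differentiable PnP-RANSAC in place of DSAC; the synthetic correspondences and camera intrinsics are assumptions for illustration.

```python
import numpy as np
import cv2

rng = np.random.default_rng(4)

# Hypothetical network output: for 200 pixels, a predicted 3D scene coordinate.
scene_coords = rng.uniform(-1, 1, size=(200, 3)).astype(np.float32)
scene_coords[:, 2] += 4.0                     # push points in front of the camera

# Pinhole camera (illustrative intrinsics) and a ground-truth pose to project with.
K = np.array([[525.0, 0, 320.0], [0, 525.0, 240.0], [0, 0, 1.0]])
rvec_gt = np.array([0.05, -0.02, 0.01])
tvec_gt = np.array([0.1, -0.1, 0.3])
pixels, _ = cv2.projectPoints(scene_coords, rvec_gt, tvec_gt, K, None)
pixels = pixels.reshape(-1, 2).astype(np.float32)

# Simulate noisy predictions plus a fraction of gross outliers.
pixels += rng.normal(0, 0.5, size=pixels.shape).astype(np.float32)
scene_coords[:40] = rng.uniform(-1, 1, size=(40, 3)).astype(np.float32)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    scene_coords, pixels, K, None, reprojectionError=3.0
)
print("pose recovered:", ok, "| inliers:", len(inliers))
print("translation estimate:", tvec.ravel())
```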

Understanding the robustness of deep neural network classifiers for breast cancer screening

Title Understanding the robustness of deep neural network classifiers for breast cancer screening
Authors Witold Oleszkiewicz, Taro Makino, Stanisław Jastrzębski, Tomasz Trzciński, Linda Moy, Kyunghyun Cho, Laura Heacock, Krzysztof J. Geras
Abstract Deep neural networks (DNNs) show promise in breast cancer screening, but their robustness to input perturbations must be better understood before they can be clinically implemented. There exists extensive literature on this subject in the context of natural images that can potentially be built upon. However, it cannot be assumed that conclusions about robustness will transfer from natural images to mammogram images, due to significant differences between the two image modalities. In order to determine whether conclusions will transfer, we measure the sensitivity of a radiologist-level screening mammogram image classifier to four commonly studied input perturbations that natural image classifiers are sensitive to. We find that mammogram image classifiers are also sensitive to these perturbations, which suggests that we can build on the existing literature. We also perform a detailed analysis on the effects of low-pass filtering, and find that it degrades the visibility of clinically meaningful features called microcalcifications. Since low-pass filtering removes semantically meaningful information that is predictive of breast cancer, we argue that it is undesirable for mammogram image classifiers to be invariant to it. This is in contrast to natural images, where we do not want DNNs to be sensitive to low-pass filtering due to its tendency to remove information that is human-incomprehensible.
Tasks
Published 2020-03-23
URL https://arxiv.org/abs/2003.10041v1
PDF https://arxiv.org/pdf/2003.10041v1.pdf
PWC https://paperswithcode.com/paper/understanding-the-robustness-of-deep-neural
Repo
Framework
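
A minimal sketch of the kind of perturbation study described: apply a Gaussian low-pass filter of increasing strength to an image and measure how much the classifier's output moves. The toy CNN and random image stand in for the radiologist-level classifier and real mammograms, which are not available here.

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.ndimage import gaussian_filter

torch.manual_seed(0)
# Stand-in for the screening classifier: a tiny CNN with a single malignancy logit.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
)
model.eval()

image = np.random.rand(256, 256).astype(np.float32)   # stand-in mammogram

def predict(img):
    with torch.no_grad():
        x = torch.from_numpy(img)[None, None]          # (1, 1, H, W)
        return torch.sigmoid(model(x)).item()

baseline = predict(image)
for sigma in [1, 2, 4, 8]:
    blurred = gaussian_filter(image, sigma=sigma)      # low-pass filter
    shift = abs(predict(blurred) - baseline)
    print(f"sigma={sigma}: prediction shift = {shift:.4f}")
```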

Convolutional Support Vector Machine

Title Convolutional Support Vector Machine
Authors Wei-Chang Yeh
Abstract The support vector machine (SVM) and deep learning (e.g., convolutional neural networks (CNNs)) are the two most famous algorithms in small and big data, respectively. Nonetheless, smaller datasets may be very important, costly, and not easy to obtain in a short time. This paper proposes a novel convolutional SVM (CSVM) that has both advantages of CNN and SVM to improve the accuracy and effectiveness of mining smaller datasets. The proposed CSVM adapts the convolution product from CNN to learn new information hidden deeply in the datasets. In addition, it uses a modified simplified swarm optimization (SSO) to help train the CSVM to update classifiers, and then the traditional SVM is implemented as the fitness for the SSO to estimate the accuracy. To evaluate the performance of the proposed CSVM, experiments were conducted to test five well-known benchmark databases for the classification problem. Numerical experiments compared favorably with those obtained using SVM, 3-layer artificial NN (ANN), and 4-layer ANN. The results of these experiments verify that the proposed CSVM with the proposed SSO can effectively increase classification accuracy.
Tasks
Published 2020-02-11
URL https://arxiv.org/abs/2002.07221v1
PDF https://arxiv.org/pdf/2002.07221v1.pdf
PWC https://paperswithcode.com/paper/convolutional-support-vector-machine
Repo
Framework
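
A rough sketch of the combination the paper exploits, convolutional feature extraction feeding a support vector machine, using a fixed random convolution kernel and scikit-learn's SVC in place of the paper's SSO-trained kernels. The digits dataset, the kernel, and the train/test split are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(5)
digits = load_digits()                      # small 8x8 image dataset
images, labels = digits.images, digits.target

# One fixed 3x3 convolution kernel; in the paper the kernels are learned with
# simplified swarm optimization rather than fixed at random.
kernel = rng.normal(size=(3, 3))

def conv_features(img):
    fmap = np.maximum(convolve2d(img, kernel, mode="valid"), 0)  # conv + ReLU
    return fmap.ravel()

X = np.array([conv_features(img) for img in images])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=0
)

svm = SVC(kernel="rbf", C=10.0)
svm.fit(X_train, y_train)
print("test accuracy with conv features + SVM:", svm.score(X_test, y_test))
```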

Active Learning for Identification of Linear Dynamical Systems

Title Active Learning for Identification of Linear Dynamical Systems
Authors Andrew Wagenmaker, Kevin Jamieson
Abstract We propose an algorithm to actively estimate the parameters of a linear dynamical system. Given complete control over the system’s input, our algorithm adaptively chooses the inputs to accelerate estimation. We show a finite time bound quantifying the estimation rate our algorithm attains and prove matching upper and lower bounds which guarantee its asymptotic optimality, up to constants. In addition, we show that this optimal rate is unattainable when using Gaussian noise to excite the system, even with optimally tuned covariance, and analyze several examples where our algorithm provably improves over rates obtained by playing noise. Our analysis critically relies on a novel result quantifying the error in estimating the parameters of a dynamical system when arbitrary periodic inputs are being played. We conclude with numerical examples that illustrate the effectiveness of our algorithm in practice.
Tasks Active Learning
Published 2020-02-02
URL https://arxiv.org/abs/2002.00495v1
PDF https://arxiv.org/pdf/2002.00495v1.pdf
PWC https://paperswithcode.com/paper/active-learning-for-identification-of-linear
Repo
Framework
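
The estimation step at the heart of the problem, recovering (A, B) from input/state data, is a least-squares fit; the algorithm's contribution is choosing the inputs that feed it. Below is a sketch of that estimation step driven by a periodic (sinusoidal) input, which is illustrative rather than the paper's optimized excitation.

```python
import numpy as np

rng = np.random.default_rng(6)

# True system x_{t+1} = A x_t + B u_t + noise (unknown to the estimator).
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])

T = 500
x = np.zeros((T + 1, 2))
u = np.sin(0.3 * np.arange(T)).reshape(-1, 1)        # periodic excitation input
for t in range(T):
    x[t + 1] = A_true @ x[t] + (B_true @ u[t]).ravel() + 0.01 * rng.normal(size=2)

# Stack regressors z_t = [x_t, u_t] and solve x_{t+1} ~ [A B] z_t by least squares.
Z = np.hstack([x[:-1], u])                            # (T, 3)
Theta, *_ = np.linalg.lstsq(Z, x[1:], rcond=None)     # (3, 2)
A_hat, B_hat = Theta[:2].T, Theta[2:].T

print("A estimation error:", np.linalg.norm(A_hat - A_true))
print("B estimation error:", np.linalg.norm(B_hat - B_true))
```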