January 26, 2020

Paper Group ANR 1604

Unit Impulse Response as an Explainer of Redundancy in a Deep Convolutional Neural Network

Title Unit Impulse Response as an Explainer of Redundancy in a Deep Convolutional Neural Network
Authors Rachana Sathish, Debdoot Sheet
Abstract Convolutional neural networks (CNN) are generally designed with a heuristic initialization of network architecture and trained for a certain task. This often leads to overparametrization after learning and induces redundancy in the information flow paths within the network. The resulting robustness and reliability come at the increased cost of redundant computations. Several methods have been proposed which leverage metrics that quantify the redundancy in each layer. However, layer-wise evaluation in these methods disregards the long-range redundancy which exists across depth on account of the distributed nature of the features learned by the model. In this paper, we propose (i) a mechanism to empirically demonstrate the robustness in performance of a CNN on account of redundancy across its depth, and (ii) a method to identify the systemic redundancy in the response of a CNN across depth using the understanding of unit impulse response. We subsequently demonstrate the use of these methods to interpret redundancy in a few networks as examples. These techniques provide better insights into the internal dynamics of a CNN.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.03986v1
PDF https://arxiv.org/pdf/1906.03986v1.pdf
PWC https://paperswithcode.com/paper/unit-impulse-response-as-an-explainer-of
Repo
Framework
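
As a rough illustration of what a unit impulse response is, the sketch below feeds a unit impulse through a stack of purely linear 1D convolutions. This is not the authors' method: real CNNs contain nonlinearities and multiple channels, and the function name `convolve_full` and the example kernels are assumptions made only for this example.

```python
def convolve_full(signal, kernel):
    """Full discrete 1D convolution of two sequences (pure Python)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def impulse_response(kernels):
    """Response of stacked linear convolutions to a unit impulse."""
    response = [1.0]  # the unit impulse
    for kernel in kernels:
        response = convolve_full(response, kernel)
    return response


# Hypothetical two-layer linear stack.
layer_kernels = [[1.0, 2.0], [1.0, -1.0]]
print(impulse_response(layer_kernels))
```

For a single linear layer the impulse response is simply the kernel itself; for a stack it is the convolution of all kernels, which is one way to view the end-to-end behaviour across depth under the linearity assumption.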

Student Performance Prediction with Optimum Multilabel Ensemble Model

Title Student Performance Prediction with Optimum Multilabel Ensemble Model
Authors Ephrem Admasu Yekun, Abrahaley Teklay
Abstract One of the important measures of the quality of education is the performance of students in academic settings. Nowadays, educational institutions store abundant data about students, which can help to discover insights into how students are learning and how to improve their performance ahead of time using data mining techniques. In this paper, we developed a student performance prediction model that predicts the performance of high school students for the next semester for five courses. We modeled our prediction system as a multi-label classification task and used support vector machine (SVM), Random Forest (RF), K-nearest Neighbors (KNN), and Multi-layer perceptron (MLP) as base classifiers to train our model. We further improved the performance of the prediction model by using state-of-the-art partitioning schemes to divide the label space into smaller spaces and the Label Powerset (LP) transformation method to transform each labelset into a multi-class classification task. The proposed model achieved better performance in terms of different evaluation metrics when compared to other multi-label learning methods such as binary relevance and classifier chains.
Tasks Multi-Label Classification, Multi-Label Learning
Published 2019-09-06
URL https://arxiv.org/abs/1909.07444v1
PDF https://arxiv.org/pdf/1909.07444v1.pdf
PWC https://paperswithcode.com/paper/student-performance-prediction-with-optimum
Repo
Framework
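
The Label Powerset transformation mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general technique, not the paper's pipeline: the function names and the toy label matrix are assumptions, and the label-space partitioning step is omitted.

```python
def label_powerset_fit(label_rows):
    """Assign an integer class id to each distinct labelset."""
    classes = {}
    for row in label_rows:
        key = tuple(row)
        if key not in classes:
            classes[key] = len(classes)
    return classes


def label_powerset_transform(label_rows, classes):
    """Turn multi-label rows into single multi-class targets."""
    return [classes[tuple(row)] for row in label_rows]


def label_powerset_inverse(class_ids, classes):
    """Map predicted class ids back to their labelsets."""
    inverse = {cid: list(key) for key, cid in classes.items()}
    return [inverse[cid] for cid in class_ids]


# Toy example: three students, three binary labels each (hypothetical data).
Y = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
classes = label_powerset_fit(Y)
targets = label_powerset_transform(Y, classes)
print(targets)
```

Any multi-class classifier can then be trained on `targets`, and its predictions mapped back with `label_powerset_inverse`. Labelsets never seen during fitting cannot be predicted, which is one motivation for splitting the label space into smaller subspaces first.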

Avoiding hashing and encouraging visual semantics in referential emergent language games

Title Avoiding hashing and encouraging visual semantics in referential emergent language games
Authors Daniela Mihai, Jonathon Hare
Abstract There has been an increasing interest in the area of emergent communication between agents which learn to play referential signalling games with realistic images. In this work, we consider the signalling game setting of Havrylov and Titov and investigate the effect of the feature extractor’s weights and of the task being solved on the visual semantics learned or captured by the models. We impose various augmentations on the input images and additional tasks in the game with the aim of inducing visual representations which capture conceptual properties of images. Through our set of experiments, we demonstrate that communication systems which capture visual semantics can be learned in a completely self-supervised manner by playing the right types of game.
Tasks
Published 2019-11-13
URL https://arxiv.org/abs/1911.05546v1
PDF https://arxiv.org/pdf/1911.05546v1.pdf
PWC https://paperswithcode.com/paper/avoiding-hashing-and-encouraging-visual
Repo
Framework

Convolutional STN for Weakly Supervised Object Localization and Beyond

Title Convolutional STN for Weakly Supervised Object Localization and Beyond
Authors Akhil Meethal, Marco Pedersoli, Soufiane Belharbi, Eric Granger
Abstract Weakly-supervised object localization is a challenging task in which the object of interest should be localized while learning its appearance. State-of-the-art methods recycle the architecture of a standard CNN by using the activation maps of the last layer for localizing the object. While this approach is simple and works relatively well, object localization relies on different features than classification; thus, a specialized localization mechanism is required during training to improve performance. In this paper we propose a convolutional, multi-scale spatial localization network that provides accurate localization for the object of interest. Experimental results on the CUB-200-2011 and ImageNet datasets show the improvements of our proposed approach w.r.t. state-of-the-art methods.
Tasks Object Localization, Weakly-Supervised Object Localization
Published 2019-12-03
URL https://arxiv.org/abs/1912.01522v1
PDF https://arxiv.org/pdf/1912.01522v1.pdf
PWC https://paperswithcode.com/paper/convolutional-stn-for-weakly-supervised
Repo
Framework

Cost-efficient segmentation of electron microscopy images using active learning

Title Cost-efficient segmentation of electron microscopy images using active learning
Authors Joris Roels, Yvan Saeys
Abstract Over the last decade, electron microscopy has improved up to a point that generating high quality gigavoxel sized datasets only requires a few hours. Automated image analysis, particularly image segmentation, however, has not evolved at the same pace. Even though state-of-the-art methods such as U-Net and DeepLab have improved segmentation performance substantially, the required amount of labels remains too expensive. Active learning is the subfield in machine learning that aims to mitigate this burden by selecting the samples that require labeling in a smart way. Many techniques have been proposed, particularly for image classification, to increase the steepness of learning curves. In this work, we extend these techniques to deep CNN based image segmentation. Our experiments on three different electron microscopy datasets show that active learning can improve segmentation quality by 10 to 15% in terms of Jaccard score compared to standard randomized sampling.
Tasks Active Learning, Image Classification, Semantic Segmentation
Published 2019-11-13
URL https://arxiv.org/abs/1911.05548v1
PDF https://arxiv.org/pdf/1911.05548v1.pdf
PWC https://paperswithcode.com/paper/cost-efficient-segmentation-of-electron
Repo
Framework
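
For readers unfamiliar with the metric quoted in the abstract, the Jaccard score (intersection over union) of two binary segmentation masks can be computed as in the sketch below. The flattened-list mask representation and the convention of returning 1.0 when both masks are empty are assumptions of this example, not details taken from the paper.

```python
def jaccard_score(predicted, truth):
    """Intersection over union of two equally sized binary masks."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(predicted, truth) if p and t)
    union = sum(1 for p, t in zip(predicted, truth) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Flattened toy masks (hypothetical values).
predicted_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
print(jaccard_score(predicted_mask, true_mask))
```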

Machine learning for molecular simulation

Title Machine learning for molecular simulation
Authors Frank Noé, Alexandre Tkatchenko, Klaus-Robert Müller, Cecilia Clementi
Abstract Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for a machine learning revolution and have already been profoundly impacted by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, coarse-grained molecular dynamics, the extraction of free energy surfaces and kinetics and generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into machine learning structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
Tasks
Published 2019-11-07
URL https://arxiv.org/abs/1911.02792v1
PDF https://arxiv.org/pdf/1911.02792v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-molecular-simulation
Repo
Framework

Outlier Detection in High Dimensional Data

Title Outlier Detection in High Dimensional Data
Authors Firuz Kamalov, Ho Hon Leung
Abstract High-dimensional data poses unique challenges in the outlier detection process. Most of the existing algorithms fail to properly address the issues stemming from a large number of features. In particular, outlier detection algorithms perform poorly on data sets of small size with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis and kernel density estimation. The proposed method is designed to address the challenges of dealing with high-dimensional data by projecting the original data onto a smaller space and using the innate structure of the data to calculate anomaly scores for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by the $F_1$-score. Our method also produces better-than-average execution times compared to the benchmark methods.
Tasks Density Estimation, Outlier Detection
Published 2019-09-09
URL https://arxiv.org/abs/1909.03681v1
PDF https://arxiv.org/pdf/1909.03681v1.pdf
PWC https://paperswithcode.com/paper/outlier-detection-in-high-dimensional-data
Repo
Framework
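
The general idea of combining a PCA projection with kernel density estimation can be sketched in a few lines. This is not the authors' algorithm: it assumes a projection onto only the first principal component (found by power iteration), a Gaussian kernel with a hand-picked bandwidth, and the negative log-density as the anomaly score. All names and the toy data are hypothetical.

```python
import math


def project_on_first_component(rows, iterations=200):
    """Center the data and project it onto its first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration; the start vector is an arbitrary assumption and would
    # fail if it happened to be orthogonal to the leading eigenvector.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def kde_anomaly_scores(values, bandwidth=1.0):
    """Negative log of a Gaussian kernel density estimate at each value."""
    n = len(values)
    scale = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    scores = []
    for x in values:
        density = scale * sum(
            math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in values)
        scores.append(-math.log(density))
    return scores


# Toy data: five points on a rough line plus one distant point.
data = [[0, 0.0], [1, 0.1], [2, -0.1], [3, 0.0], [4, 0.1], [30, 0.2]]
scores = kde_anomaly_scores(project_on_first_component(data))
print(scores.index(max(scores)))
```

Higher scores mark points lying in low-density regions of the projected space. A practical version would keep several components and choose the bandwidth from the data.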

Towards Diverse and Accurate Image Captions via Reinforcing Determinantal Point Process

Title Towards Diverse and Accurate Image Captions via Reinforcing Determinantal Point Process
Authors Qingzhong Wang, Antoni B. Chan
Abstract Although significant progress has been made in the field of automatic image captioning, it is still a challenging task. Previous works normally pay much attention to improving the quality of the generated captions but ignore the diversity of captions. In this paper, we combine determinantal point process (DPP) and reinforcement learning (RL) and propose a novel reinforcing DPP (R-DPP) approach to generate a set of captions with high quality and diversity for an image. We show that R-DPP performs better on accuracy and diversity than using noise as a control signal (GANs, VAEs). Moreover, R-DPP is able to preserve the modes of the learned distribution. Hence, the beam search algorithm can be applied to generate a single accurate caption, which performs better than other RL-based models.
Tasks Image Captioning
Published 2019-08-14
URL https://arxiv.org/abs/1908.04919v1
PDF https://arxiv.org/pdf/1908.04919v1.pdf
PWC https://paperswithcode.com/paper/towards-diverse-and-accurate-image-captions
Repo
Framework

The Cost of a Reductions Approach to Private Fair Optimization

Title The Cost of a Reductions Approach to Private Fair Optimization
Authors Daniel Alabi
Abstract We examine a reductions approach to fair optimization and learning where a black-box optimizer is used to learn a fair model for classification or regression [Alabi et al., 2018, Agarwal et al., 2018] and explore the creation of such fair models that adhere to data privacy guarantees (specifically differential privacy). For this approach, we consider two suites of use cases: the first is for optimizing convex performance measures of the confusion matrix (such as those derived from the $G$-mean and $H$-mean); the second is for satisfying statistical definitions of algorithmic fairness (such as equalized odds, demographic parity, and the Gini index of inequality). The reductions approach to fair optimization can be abstracted as the constrained group-objective optimization problem where we aim to optimize an objective that is a function of losses of individual groups, subject to some constraints. We present two generic differentially private algorithms to solve this problem: an $(\epsilon, 0)$ exponential sampling algorithm and an $(\epsilon, \delta)$ algorithm that uses an approximate linear optimizer to incrementally move toward the best decision. Compared to a previous method for ensuring differential privacy subject to a relaxed form of the equalized odds fairness constraint, the $(\epsilon, \delta)$ differentially private algorithm we present provides asymptotically better sample complexity guarantees in certain parameter regimes. The technique of using an approximate linear optimizer oracle to achieve privacy might be applicable to other problems not considered in this paper. Finally, we show an algorithm-agnostic information-theoretic lower bound on the excess risk (or equivalently, the sample complexity) of any solution to the problem of $(\epsilon, 0)$ or $(\epsilon, \delta)$ private constrained group-objective optimization.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09613v3
PDF https://arxiv.org/pdf/1906.09613v3.pdf
PWC https://paperswithcode.com/paper/the-cost-of-a-reductions-approach-to-private
Repo
Framework
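
The abstract does not spell out its $(\epsilon, 0)$ exponential sampling algorithm. As background only, the sketch below shows the standard exponential mechanism from the differential privacy literature, which selects a candidate with probability proportional to $\exp(\epsilon u / (2\Delta))$ for utility $u$ and sensitivity $\Delta$. Whether the paper's algorithm matches this form is an assumption; the function names and example utilities are hypothetical.

```python
import math
import random


def exponential_mechanism_probabilities(utilities, epsilon, sensitivity):
    """Selection probabilities proportional to exp(eps * u / (2 * sens))."""
    scale = epsilon / (2.0 * sensitivity)
    top = max(utilities)  # shift by the maximum for numerical stability
    weights = [math.exp(scale * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism_sample(candidates, utilities, epsilon, sensitivity):
    """Draw one candidate according to the exponential mechanism."""
    probs = exponential_mechanism_probabilities(utilities, epsilon, sensitivity)
    return random.choices(candidates, weights=probs, k=1)[0]


# Hypothetical candidate models scored by a utility such as negative loss.
models = ["model_a", "model_b", "model_c"]
scores = [1.0, 2.0, 3.0]
print(exponential_mechanism_probabilities(scores, epsilon=1.0, sensitivity=1.0))
print(exponential_mechanism_sample(models, scores, epsilon=1.0, sensitivity=1.0))
```

Smaller values of `epsilon` flatten the distribution toward uniform sampling (more privacy, less utility), while larger values concentrate probability on the highest-utility candidate.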

(When) Is Truth-telling Favored in AI Debate?

Title (When) Is Truth-telling Favored in AI Debate?
Authors Vojtěch Kovařík, Ryan Carey
Abstract For some problems, humans may not be able to accurately judge the goodness of AI-proposed solutions. Irving et al. (2018) propose that in such cases, we may use a debate between two AI systems to amplify the problem-solving capabilities of a human judge. We introduce a mathematical framework that can model debates of this type and propose that the quality of debate designs should be measured by the accuracy of the most persuasive answer. We describe a simple instance of the debate framework called feature debate and analyze the degree to which such debates track the truth. We argue that despite being very simple, feature debates nonetheless capture many aspects of practical debates such as the incentives to confuse the judge or stall to prevent losing. We then outline how these models should be generalized to analyze a wider range of debate phenomena.
Tasks
Published 2019-11-11
URL https://arxiv.org/abs/1911.04266v2
PDF https://arxiv.org/pdf/1911.04266v2.pdf
PWC https://paperswithcode.com/paper/when-is-truth-telling-favored-in-ai-debate
Repo
Framework

Frustratingly Easy Natural Question Answering

Title Frustratingly Easy Natural Question Answering
Authors Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil
Abstract Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa. Additionally, many systems on the QA leaderboards lack the associated research documentation needed to successfully replicate their experiments. In this paper, we outline algorithmic components such as Attention-over-Attention, coupled with data augmentation and ensembling strategies, that have been shown to yield state-of-the-art results on benchmark datasets like SQuAD, even achieving super-human performance. Contrary to these prior results, when we evaluate on the recently proposed Natural Questions benchmark dataset, we find that an incredibly simple approach of transfer learning from BERT outperforms the previous state-of-the-art system trained on 4 million more examples than ours by 1.9 F1 points. Adding ensembling strategies further improves that number by 2.3 F1 points.
Tasks Data Augmentation, Question Answering, Transfer Learning
Published 2019-09-11
URL https://arxiv.org/abs/1909.05286v1
PDF https://arxiv.org/pdf/1909.05286v1.pdf
PWC https://paperswithcode.com/paper/frustratingly-easy-natural-question-answering
Repo
Framework

Deep Actor-Critic Reinforcement Learning for Anomaly Detection

Title Deep Actor-Critic Reinforcement Learning for Anomaly Detection
Authors Chen Zhong, M. Cenk Gursoy, Senem Velipasalar
Abstract Anomaly detection is widely applied in a variety of domains, including, for instance, smart home systems, network traffic monitoring, IoT applications and sensor networks. In this paper, we study deep reinforcement learning based active sequential testing for anomaly detection. We assume that there is an unknown number of abnormal processes at a time and the agent can only check with one sensor in each sampling step. To maximize the confidence level of the decision and minimize the stopping time concurrently, we propose a deep actor-critic reinforcement learning framework that can dynamically select the sensor based on the posterior probabilities. We provide simulation results for both the training phase and testing phase, and compare the proposed framework with the Chernoff test in terms of claim delay and loss.
Tasks Anomaly Detection
Published 2019-08-28
URL https://arxiv.org/abs/1908.10755v1
PDF https://arxiv.org/pdf/1908.10755v1.pdf
PWC https://paperswithcode.com/paper/deep-actor-critic-reinforcement-learning-for
Repo
Framework

Forward-Backward Decoding for Regularizing End-to-End TTS

Title Forward-Backward Decoding for Regularizing End-to-End TTS
Authors Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao
Abstract Neural end-to-end TTS can generate very high-quality synthesized speech, even close to human recordings for text within a similar domain. However, it performs unsatisfactorily when scaled to challenging test sets. One concern is that the attention-based encoder-decoder network adopts an autoregressive generative sequence model, which suffers from “exposure bias”. To address this issue, we propose two novel methods, which learn to predict the future by improving agreement between the forward and backward decoding sequences. The first one is achieved by introducing divergence regularization terms into the model training objective to reduce the mismatch between two directional models, namely L2R and R2L (which generate targets from left-to-right and right-to-left, respectively). The second one operates at the decoder level and exploits future information during decoding. In addition, we employ a joint training strategy to allow forward and backward decoding to improve each other in an interactive process. Experimental results show that our proposed methods, especially the second one (bidirectional decoder regularization), lead to a significant improvement in both robustness and overall naturalness, outperforming the baseline (the revised version of Tacotron2) with a MOS gap of 0.14 on a challenging test set and achieving close to human quality (4.42 vs. 4.49 in MOS) on a general test set.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.09006v1
PDF https://arxiv.org/pdf/1907.09006v1.pdf
PWC https://paperswithcode.com/paper/forward-backward-decoding-for-regularizing
Repo
Framework

Joint Learning of Geometric and Probabilistic Constellation Shaping

Title Joint Learning of Geometric and Probabilistic Constellation Shaping
Authors Maximilian Stark, Fayçal Ait Aoudia, Jakob Hoydis
Abstract The choice of constellations largely affects the performance of communication systems. When designing constellations, both the locations and probability of occurrence of the points can be optimized. These approaches are referred to as geometric and probabilistic shaping, respectively. Usually, the geometry of the constellation is fixed, e.g., quadrature amplitude modulation (QAM) is used. In such cases, the achievable information rate can still be improved by probabilistic shaping. In this work, we show how autoencoders can be leveraged to perform probabilistic shaping of constellations. We devise an information-theoretical description of autoencoders, which allows learning of capacity-achieving symbol distributions and constellations. Recently, machine learning techniques to perform geometric shaping were proposed. However, probabilistic shaping is more challenging as it requires the optimization of discrete distributions. Furthermore, the proposed method enables joint probabilistic and geometric shaping of constellations over any channel model. Simulation results show that the learned constellations achieve information rates very close to capacity on an additive white Gaussian noise (AWGN) channel and outperform existing approaches on both AWGN and fading channels.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07748v3
PDF https://arxiv.org/pdf/1906.07748v3.pdf
PWC https://paperswithcode.com/paper/joint-learning-of-geometric-and-probabilistic
Repo
Framework

Regularizing Black-box Models for Improved Interpretability

Title Regularizing Black-box Models for Improved Interpretability
Authors Gregory Plumb, Maruan Al-Shedivat, Angel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar
Abstract Most of the work on interpretable machine learning has focused on designing either inherently interpretable models, which typically trade off accuracy for interpretability, or post-hoc explanation systems, which tend to lack guarantees about the quality of their explanations. We explore a hybridization of these approaches by directly regularizing a black-box model for interpretability at training time - a method we call ExpO. We find that post-hoc explanations of an ExpO-regularized model are consistently more stable and of higher fidelity, which we show theoretically and support empirically. Critically, we also find ExpO leads to explanations that are more actionable, significantly more useful, and more intuitive, as supported by a user study.
Tasks Interpretable Machine Learning
Published 2019-02-18
URL https://arxiv.org/abs/1902.06787v4
PDF https://arxiv.org/pdf/1902.06787v4.pdf
PWC https://paperswithcode.com/paper/regularizing-black-box-models-for-improved
Repo
Framework