Paper Group ANR 1037
DeepPINK: reproducible feature selection in deep neural networks
Title | DeepPINK: reproducible feature selection in deep neural networks |
Authors | Yang Young Lu, Yingying Fan, Jinchi Lv, William Stafford Noble |
Abstract | Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility. |
Tasks | Feature Selection |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.01185v2 |
http://arxiv.org/pdf/1809.01185v2.pdf | |
PWC | https://paperswithcode.com/paper/deeppink-reproducible-feature-selection-in |
Repo | |
Framework | |
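The abstract above relies on the knockoffs framework to control the error rate of the selected features. As a rough illustration only, the sketch below implements the generic knockoff+ selection step from that framework, assuming the per-feature statistics W_j have already been computed (in DeepPINK they would come from the trained network, which is not reproduced here). The function name `knockoff_plus_select` and the sample values are hypothetical.

```python
def knockoff_plus_select(w, q):
    """Select features with the knockoff+ threshold at target FDR level q.

    w: list of knockoff statistics W_j (large positive values suggest a
       real signal; the sign is symmetric under the null).
    Returns the indices j with W_j >= T, where T is the smallest threshold
    whose estimated false discovery proportion is at most q.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Features with W_j <= -t estimate how many false positives pass t.
        false_estimate = 1 + sum(1 for x in w if x <= -t)
        selected = sum(1 for x in w if x >= t)
        if false_estimate / max(1, selected) <= q:
            return [j for j, x in enumerate(w) if x >= t]
    # No threshold meets the target level: select nothing.
    return []


# Hypothetical statistics for ten features.
w = [5.0, 4.0, 3.0, 2.5, 2.0, -0.5, 0.3, -0.2, 0.1, 1.5]
print(knockoff_plus_select(w, q=0.2))
```

A stricter target level can leave no admissible threshold, in which case the sketch returns an empty selection rather than relaxing the level.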
Few-Shot Adaptation for Multimedia Semantic Indexing
Title | Few-Shot Adaptation for Multimedia Semantic Indexing |
Authors | Nakamasa Inoue, Koichi Shinoda |
Abstract | We propose a few-shot adaptation framework, which bridges zero-shot learning and supervised many-shot learning, for semantic indexing of image and video data. Few-shot adaptation provides robust parameter estimation with few training examples, by optimizing the parameters of zero-shot learning and supervised many-shot learning simultaneously. In this method, first we build a zero-shot detector, and then update it by using the few examples. Our experiments show the effectiveness of the proposed framework on three datasets: TRECVID Semantic Indexing 2010, 2014, and ImageNET. On the ImageNET dataset, we show that our method outperforms recent few-shot learning methods. On the TRECVID 2014 dataset, we achieve 15.19% and 35.98% in Mean Average Precision under the zero-shot condition and the supervised condition, respectively. To the best of our knowledge, these are the best results on this dataset. |
Tasks | Few-Shot Learning, Zero-Shot Learning |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07203v1 |
http://arxiv.org/pdf/1807.07203v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-adaptation-for-multimedia-semantic |
Repo | |
Framework | |
Value Propagation Networks
Title | Value Propagation Networks |
Authors | Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier |
Abstract | We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, have the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments. We show that the modules enable learning to plan when the environment also includes stochastic elements, providing a cost-efficient learning system to build low-level size-invariant planners for a variety of interactive navigation problems. We evaluate on static and dynamic configurations of MazeBase grid-worlds, with randomly generated environments of several different sizes, and on a StarCraft navigation scenario, with more complex dynamics, and pixels as input. |
Tasks | Starcraft |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11199v2 |
http://arxiv.org/pdf/1805.11199v2.pdf | |
PWC | https://paperswithcode.com/paper/value-propagation-networks |
Repo | |
Framework | |
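The VProp modules above are described as being built on Value Iteration. For orientation, here is a minimal sketch of classic tabular value iteration on a small grid world. It is not the differentiable VProp module itself; the grid encoding (0 for free cells, 1 for walls), the reward of 1 for stepping onto the goal, and the function name `value_iteration` are assumptions made for the example.

```python
def value_iteration(grid, goal, gamma=0.9, iters=50):
    """Tabular value iteration on a 4-connected grid.

    grid: list of rows, 0 = free cell, 1 = wall (assumed encoding).
    goal: (row, col) of the goal cell; entering it yields reward 1.
    Returns a table of state values of the same shape as grid.
    """
    rows, cols = len(grid), len(grid[0])
    values = [[0.0] * cols for _ in range(rows)]
    for _ in range(iters):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Walls and the terminal goal cell keep value 0.
                if grid[r][c] == 1 or (r, c) == goal:
                    continue
                best = None
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                        reward = 1.0 if (nr, nc) == goal else 0.0
                        candidate = reward + gamma * values[nr][nc]
                        if best is None or candidate > best:
                            best = candidate
                # A cell with no free neighbour stays at 0.
                new_values[r][c] = 0.0 if best is None else best
        values = new_values
    return values


# A 1x3 corridor with the goal at the right end.
print(value_iteration([[0, 0, 0]], goal=(0, 2)))
```

Because the update at each cell only looks at its neighbours, the same rule applies unchanged to larger maps, which is the property the abstract refers to as size invariance.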
Information Flow in Pregroup Models of Natural Language
Title | Information Flow in Pregroup Models of Natural Language |
Authors | Peter M. Hines |
Abstract | This paper is about pregroup models of natural languages, and how they relate to the explicitly categorical use of pregroups in Compositional Distributional Semantics and Natural Language Processing. These categorical interpretations make certain assumptions about the nature of natural languages that, when stated formally, may be seen to impose strong restrictions on pregroup grammars for natural languages. We formalize this as a hypothesis about the form that pregroup models of natural languages must take, and demonstrate by an artificial language example that these restrictions are not imposed by the pregroup axioms themselves. We compare and contrast the artificial language examples with natural languages (using Welsh, a language where the ‘noun’ type cannot be taken as primitive, as an illustrative example). The hypothesis is simply that there must exist a causal connection, or information flow, between the words of a sentence in a language whose purpose is to communicate information. This is not necessarily the case with formal languages that are simply generated by a series of ‘meaning-free’ rules. This imposes restrictions on the types of pregroup grammars that we expect to find in natural languages; we formalize this in algebraic, categorical, and graphical terms. We take some preliminary steps in providing conditions that ensure pregroup models satisfy these conjectured properties, and discuss the more general forms this hypothesis may take. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03273v1 |
http://arxiv.org/pdf/1811.03273v1.pdf | |
PWC | https://paperswithcode.com/paper/information-flow-in-pregroup-models-of |
Repo | |
Framework | |
Iterative Delegations in Liquid Democracy with Restricted Preferences
Title | Iterative Delegations in Liquid Democracy with Restricted Preferences |
Authors | Bruno Escoffier, Hugo Gilbert, Adèle Pass-Lanneau |
Abstract | In this paper, we study liquid democracy, a collective decision making paradigm which lies between direct and representative democracy. One main feature of liquid democracy is that voters can delegate their votes in a transitive manner so that: A delegates to B and B delegates to C leads to A delegates to C. Unfortunately, this process may not converge as there may not even exist a stable state (also called equilibrium). In this paper, we investigate the stability of the delegation process in liquid democracy when voters have restricted types of preference on the agent representing them (e.g., single-peaked preferences). We show that various natural structures of preferences guarantee the existence of an equilibrium and we obtain both tractability and hardness results for the problem of computing several equilibria with some desirable properties. |
Tasks | Decision Making |
Published | 2018-09-12 |
URL | https://arxiv.org/abs/1809.04362v2 |
https://arxiv.org/pdf/1809.04362v2.pdf | |
PWC | https://paperswithcode.com/paper/the-convergence-of-iterative-delegations-in |
Repo | |
Framework | |
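The transitive delegation rule in the abstract above (A delegates to B, B delegates to C, so A's vote reaches C) can be illustrated by following each delegation chain to its end. This is a minimal sketch under assumptions that are not taken from the paper: a voter who votes directly is modeled as delegating to themselves, and a voter caught in a delegation cycle is mapped to `None`. The function name `resolve_gurus` is hypothetical.

```python
def resolve_gurus(delegates):
    """Map every voter to the agent who finally casts their vote.

    delegates: dict voter -> chosen delegate; a voter who votes directly
    maps to themselves (assumed convention). Voters whose chain ends in a
    cycle get None, since no one in the cycle actually votes.
    """
    gurus = {}
    for voter in delegates:
        seen = set()
        current = voter
        # Follow the chain until someone votes directly or a cycle appears.
        while delegates[current] != current and current not in seen:
            seen.add(current)
            current = delegates[current]
        gurus[voter] = current if delegates[current] == current else None
    return gurus


print(resolve_gurus({"A": "B", "B": "C", "C": "C"}))
print(resolve_gurus({"X": "Y", "Y": "X"}))
```

The second call shows the degenerate case in which every voter in a cycle delegates and nobody votes, which is one reason a delegation process may fail to reach a stable state.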
Policy Optimization with Second-Order Advantage Information
Title | Policy Optimization with Second-Order Advantage Information |
Authors | Jiajin Li, Baoxiang Wang |
Abstract | Policy optimization on high-dimensional continuous control tasks is difficult because of the large variance of the policy gradient estimators. We present the action subspace dependent gradient (ASDG) estimator, which incorporates the Rao-Blackwell theorem (RB) and Control Variates (CV) into a unified framework to reduce the variance. To invoke RB, our proposed algorithm (POSA) learns the underlying factorization structure of the action space based on the second-order advantage information. POSA captures the quadratic information explicitly and efficiently by utilizing the wide & deep architecture. Empirical studies show that our proposed approach improves performance on high-dimensional synthetic settings and OpenAI Gym’s MuJoCo continuous control tasks. |
Tasks | Continuous Control |
Published | 2018-05-09 |
URL | https://arxiv.org/abs/1805.03586v2 |
https://arxiv.org/pdf/1805.03586v2.pdf | |
PWC | https://paperswithcode.com/paper/policy-optimization-with-second-order |
Repo | |
Framework | |
Learning Treatment Regimens from Electronic Medical Records
Title | Learning Treatment Regimens from Electronic Medical Records |
Authors | Khanh-Hung Hoang, Tu-Bao Ho |
Abstract | Appropriate treatment regimens play a vital role in improving patient health status. Although some achievements have been made, few of the recent studies of learning treatment regimens have exploited different kinds of patient information due to the difficulty in adopting heterogeneous data to many data mining methods. Moreover, current studies seem too rigid with fixed intervals of treatment periods corresponding to the varying lengths of hospital stay. To this end, this work proposes a generic data-driven framework which can derive group-treatment regimens from electronic medical records by utilizing a mixed-variate restricted Boltzmann machine and incorporating medical domain knowledge. We conducted experiments on coronary artery disease as a case study. The obtained results show that the framework is promising and capable of assisting physicians in making clinical decisions. |
Tasks | |
Published | 2018-06-16 |
URL | http://arxiv.org/abs/1806.07461v1 |
http://arxiv.org/pdf/1806.07461v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-treatment-regimens-from-electronic |
Repo | |
Framework | |
Ensemble of Convolutional Neural Networks for Automatic Grading of Diabetic Retinopathy and Macular Edema
Title | Ensemble of Convolutional Neural Networks for Automatic Grading of Diabetic Retinopathy and Macular Edema |
Authors | Avinash Kori, Sai Saketh Chennamsetty, Mohammed Safwan K. P., Varghese Alex |
Abstract | In this manuscript, we automate the grading of diabetic retinopathy (DR) and macular edema from fundus images using an ensemble of convolutional neural networks. The limited amount of labeled data available for supervised learning was circumvented by using a transfer learning approach. The models in the ensemble were pre-trained on a large dataset comprising natural images and were later fine-tuned with the limited data for the task of choice. For an image, the ensemble of classifiers generates multiple predictions, and a max-voting based approach was utilized to attain the final grade of the anomaly in the image. For the task of grading DR, on the test data (n=56), the ensemble achieved an accuracy of 83.9%, while for the task of grading macular edema the network achieved an accuracy of 95.45% (n=44). |
Tasks | Transfer Learning |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04228v1 |
http://arxiv.org/pdf/1809.04228v1.pdf | |
PWC | https://paperswithcode.com/paper/ensemble-of-convolutional-neural-networks-for |
Repo | |
Framework | |
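The max-voting step mentioned in the abstract above can be sketched in a few lines. This is only an illustration of majority voting over per-model grades; the tie-breaking rule (prefer the lowest grade) and the function name `max_vote` are assumptions, since the abstract does not say how ties are handled.

```python
from collections import Counter


def max_vote(predictions):
    """Return the grade predicted by the most ensemble members.

    predictions: list of integer grades, one per model.
    Ties are broken in favour of the lowest grade (assumed rule).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


# Three hypothetical models grading the same fundus image.
print(max_vote([2, 2, 3]))
```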
EREL Selection using Morphological Relation
Title | EREL Selection using Morphological Relation |
Authors | Yuying Li, Mehdi Faraji |
Abstract | This work concentrates on Extremal Regions of Extremum Level (EREL) selection. EREL is a recently proposed feature detector aiming at detecting regions from a set of extremal regions. This is a branching problem derived from segmentation of arterial wall boundaries from Intravascular Ultrasound (IVUS) images. For each IVUS frame, a set of EREL regions is generated to describe the luminal area of human coronary. Each EREL is then fitted by an ellipse to represent the luminal border. The goal is to assign the most appropriate EREL as the lumen. In this work, EREL selection is carried out in two rounds. In the first round, the pattern in a set of EREL regions is analyzed and used to generate an approximate luminal region. Then, the two-dimensional (2D) correlation coefficients are computed between this approximate region and each EREL to keep the ones with tightest relevance. In the second round, a compactness measure is calculated for each EREL and its fitted ellipse to guarantee that the resulting EREL is not affected by common artifacts such as bifurcations, shadows, and side branches. We evaluated the selected ERELs in terms of Hausdorff Distance (HD) and Jaccard Measure (JM) on the train and test set of a publicly available dataset. The results show that our selection strategy outperforms the current state-of-the-art. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03580v1 |
http://arxiv.org/pdf/1806.03580v1.pdf | |
PWC | https://paperswithcode.com/paper/erel-selection-using-morphological-relation |
Repo | |
Framework | |
Not quite unreasonable effectiveness of machine learning algorithms
Title | Not quite unreasonable effectiveness of machine learning algorithms |
Authors | Egor Illarionov, Roman Khudorozhkov |
Abstract | State-of-the-art machine learning algorithms demonstrate close to absolute performance in selected challenges. We provide arguments that the reason may lie in the low variability of the samples and the high effectiveness of learning typical patterns. Because of this, standard performance metrics do not reveal model capacity, and new metrics are required for a better understanding of the state of the art. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02543v1 |
http://arxiv.org/pdf/1804.02543v1.pdf | |
PWC | https://paperswithcode.com/paper/not-quite-unreasonable-effectiveness-of |
Repo | |
Framework | |
Nonlinear Online Learning with Adaptive Nyström Approximation
Title | Nonlinear Online Learning with Adaptive Nyström Approximation |
Authors | Si Si, Sanjiv Kumar, Yang Li |
Abstract | Use of nonlinear feature maps via kernel approximation has led to success in many online learning tasks. As a popular kernel approximation method, Nyström approximation has been well investigated, and various landmark point selection methods have been proposed to improve the approximation quality. However, these improved Nyström methods cannot be directly applied to the online learning setting as they need to access the entire dataset to learn the landmark points, while we need to update the model on the fly in the online setting. To address this challenge, we propose Adaptive Nyström approximation for solving nonlinear online learning problems. The key idea is to adaptively modify the landmark points via online k-means and adjust the model accordingly by solving a least-squares problem followed by a gradient descent step. We show that the resulting algorithm outperforms state-of-the-art online learning methods under the same budget. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07887v2 |
http://arxiv.org/pdf/1802.07887v2.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-online-learning-with-adaptive |
Repo | |
Framework | |
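The abstract above adapts the Nyström landmark points with online k-means. The sketch below shows one step of the textbook sequential k-means update, in which the landmark nearest to an incoming sample moves toward it with a step size of one over its visit count. The paper's exact variant, and its subsequent least-squares and gradient steps, are not reproduced; the function name `online_kmeans_step` and the sample data are hypothetical.

```python
def online_kmeans_step(landmarks, counts, x):
    """Update the landmark nearest to sample x in place.

    landmarks: list of points (lists of floats).
    counts: number of samples assigned to each landmark so far.
    Returns the index of the landmark that was moved.
    """
    # Find the nearest landmark by squared Euclidean distance.
    nearest = min(
        range(len(landmarks)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(landmarks[i], x)),
    )
    counts[nearest] += 1
    step = 1.0 / counts[nearest]
    # Move the landmark toward x so it remains the running mean.
    landmarks[nearest] = [a + step * (b - a) for a, b in zip(landmarks[nearest], x)]
    return nearest


landmarks = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]
moved = online_kmeans_step(landmarks, counts, [2.0, 0.0])
print(moved, landmarks, counts)
```

Each step touches a single landmark and needs only the current sample, so the update fits a setting where the full dataset is never available at once.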
Towards Discrete Solution: A Sparse Preserving Method for Correspondence Problem
Title | Towards Discrete Solution: A Sparse Preserving Method for Correspondence Problem |
Authors | Bo Jiang |
Abstract | Many problems of interest in computer vision can be formulated as a problem of finding consistent correspondences between two feature sets. The feature correspondence (matching) problem with a one-to-one mapping constraint is usually formulated as an Integral Quadratic Programming (IQP) problem with a permutation (or orthogonal) constraint. Since it is NP-hard, relaxation models are required. One main challenge in optimizing the IQP matching problem is how to incorporate the discrete one-to-one mapping (permutation) constraint in its quadratic objective optimization. In this paper, we present a new relaxation model, called Sparse Constraint Preserving Matching (SPM), for the IQP matching problem. SPM is motivated by our observation that the discrete permutation constraint can be well encoded via a sparse constraint. Compared with traditional relaxation models, SPM can incorporate the discrete one-to-one mapping constraint directly via a sparse constraint and thus provides a tighter relaxation for the original IQP matching problem. A simple yet effective update algorithm has been derived to solve the proposed SPM model. Experimental results on several feature matching tasks demonstrate the effectiveness and efficiency of the SPM method. |
Tasks | |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07456v1 |
http://arxiv.org/pdf/1809.07456v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-discrete-solution-a-sparse-preserving |
Repo | |
Framework | |
Domain Robust Feature Extraction for Rapid Low Resource ASR Development
Title | Domain Robust Feature Extraction for Rapid Low Resource ASR Development |
Authors | Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black |
Abstract | Developing a practical speech recognizer for a low resource language is challenging, not only because of the (potentially unknown) properties of the language, but also because test data may not be from the same domain as the available training data. In this paper, we focus on the latter challenge, i.e. domain mismatch, for systems trained using a sequence-based criterion. We demonstrate the effectiveness of using a pre-trained English recognizer, which is robust to such mismatched conditions, as a domain normalizing feature extractor on a low resource language. In our example, we use Turkish Conversational Speech and Broadcast News data. This enables rapid development of speech recognizers for new languages which can easily adapt to any domain. Testing in various cross-domain scenarios, we achieve relative improvements of around 25% in phoneme error rate, with improvements being around 50% for some domains. |
Tasks | |
Published | 2018-07-28 |
URL | http://arxiv.org/abs/1807.10984v2 |
http://arxiv.org/pdf/1807.10984v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-robust-feature-extraction-for-rapid |
Repo | |
Framework | |
The Many Faces of Exponential Weights in Online Learning
Title | The Many Faces of Exponential Weights in Online Learning |
Authors | Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski |
Abstract | A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting Exponential Weights (EW) first. We show that many standard methods and their regret bounds then follow as a special case by plugging in suitable surrogate losses and playing the EW posterior mean. For instance, we easily recover Online Gradient Descent by using EW with a Gaussian prior on linearized losses, and, more generally, all instances of Online Mirror Descent based on regular Bregman divergences also correspond to EW with a prior that depends on the mirror map. Furthermore, appropriate quadratic surrogate losses naturally give rise to Online Gradient Descent for strongly convex losses and to Online Newton Step. We further interpret several recent adaptive methods (iProd, Squint, and a variation of Coin Betting for experts) as a series of closely related reductions to exp-concave surrogate losses that are then handled by Exponential Weights. Finally, a benefit of our EW interpretation is that it opens up the possibility of sampling from the EW posterior distribution instead of playing the mean. As already observed by Bubeck and Eldan, this recovers the best-known rate in Online Bandit Linear Optimization. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07543v2 |
http://arxiv.org/pdf/1802.07543v2.pdf | |
PWC | https://paperswithcode.com/paper/the-many-faces-of-exponential-weights-in |
Repo | |
Framework | |
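As background for the abstract above, the basic Exponential Weights rule in the experts setting assigns each expert a weight proportional to exp(-eta * cumulative loss). The sketch below shows only this standard rule with a uniform prior; the surrogate losses and continuous priors discussed in the paper are not covered, and the function name `exponential_weights` and the example losses are hypothetical.

```python
import math


def exponential_weights(loss_rounds, eta):
    """Return the EW distribution over experts after the given rounds.

    loss_rounds: list of rounds, each a list with one loss per expert.
    eta: learning rate (positive float).
    Assumes a uniform prior over experts.
    """
    n_experts = len(loss_rounds[0])
    cumulative = [0.0] * n_experts
    for losses in loss_rounds:
        for i, loss in enumerate(losses):
            cumulative[i] += loss
    # Subtract the minimum before exponentiating for numerical stability;
    # this does not change the normalized weights.
    smallest = min(cumulative)
    unnormalized = [math.exp(-eta * (c - smallest)) for c in cumulative]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two experts; the first never errs, the second loses 1 in each of two rounds.
print(exponential_weights([[0.0, 1.0], [0.0, 1.0]], eta=math.log(2)))
```

With eta = ln 2 the second expert's weight halves for every unit of loss, so after two rounds the unnormalized weights are 1 and 1/4.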
CIoTA: Collaborative IoT Anomaly Detection via Blockchain
Title | CIoTA: Collaborative IoT Anomaly Detection via Blockchain |
Authors | Tomer Golomb, Yisroel Mirsky, Yuval Elovici |
Abstract | Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives. However, they tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices. However, an anomaly detection model must be trained for a long time in order to capture all benign behaviors. This approach is vulnerable to adversarial attacks since all observations are assumed to be benign while training the anomaly detection model. In this paper, we propose CIoTA, a lightweight framework that utilizes the blockchain concept to perform distributed and collaborative anomaly detection for devices with limited resources. CIoTA uses blockchain to incrementally update a trusted anomaly detection model via self-attestation and consensus among IoT devices. We evaluate CIoTA on our own distributed IoT simulation platform, which consists of 48 Raspberry Pis, to demonstrate CIoTA’s ability to enhance the security of each device and the security of the network as a whole. |
Tasks | Anomaly Detection |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03807v2 |
http://arxiv.org/pdf/1803.03807v2.pdf | |
PWC | https://paperswithcode.com/paper/ciota-collaborative-iot-anomaly-detection-via |
Repo | |
Framework | |