October 16, 2019


Paper Group ANR 1037


DeepPINK: reproducible feature selection in deep neural networks


Title	DeepPINK: reproducible feature selection in deep neural networks
Authors	Yang Young Lu, Yingying Fan, Jinchi Lv, William Stafford Noble
Abstract	Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
Tasks	Feature Selection
Published	2018-09-04
URL	http://arxiv.org/abs/1809.01185v2
PDF	http://arxiv.org/pdf/1809.01185v2.pdf
PWC	https://paperswithcode.com/paper/deeppink-reproducible-feature-selection-in
Repo
Framework
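
The abstract's "feature selection with a controlled error rate" relies on the knockoffs framework. As a rough illustration of the final selection step only, the sketch below applies the standard knockoff+ threshold to a vector of feature statistics W (large positive values suggest a real feature, negative values suggest its knockoff copy won). How DeepPINK derives W from its network is not shown; the function names and the input vector are assumptions made for this example.

```python
from typing import List, Sequence


def knockoff_plus_threshold(w: Sequence[float], q: float) -> float:
    """Smallest t > 0 whose estimated false discovery proportion is <= q."""
    # Only the nonzero magnitudes of W need to be tried as thresholds.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # The "+1" in the numerator is what distinguishes knockoff+.
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")  # no threshold meets the target, so select nothing


def select_features(w: Sequence[float], q: float) -> List[int]:
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Hypothetical statistics for ten features, target FDR level q = 0.2.
    w_stats = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.3, 2.0, -0.2, 1.5]
    print(select_features(w_stats, q=0.2))
```

With a stricter target level the procedure may select no features at all, which is the expected conservative behaviour rather than a failure.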

Few-Shot Adaptation for Multimedia Semantic Indexing


Title	Few-Shot Adaptation for Multimedia Semantic Indexing
Authors	Nakamasa Inoue, Koichi Shinoda
Abstract	We propose a few-shot adaptation framework, which bridges zero-shot learning and supervised many-shot learning, for semantic indexing of image and video data. Few-shot adaptation provides robust parameter estimation with few training examples, by optimizing the parameters of zero-shot learning and supervised many-shot learning simultaneously. In this method, first we build a zero-shot detector, and then update it by using the few examples. Our experiments show the effectiveness of the proposed framework on three datasets: TRECVID Semantic Indexing 2010, 2014, and ImageNET. On the ImageNET dataset, we show that our method outperforms recent few-shot learning methods. On the TRECVID 2014 dataset, we achieve 15.19% and 35.98% in Mean Average Precision under the zero-shot condition and the supervised condition, respectively. To the best of our knowledge, these are the best results on this dataset.
Tasks	Few-Shot Learning, Zero-Shot Learning
Published	2018-07-19
URL	http://arxiv.org/abs/1807.07203v1
PDF	http://arxiv.org/pdf/1807.07203v1.pdf
PWC	https://paperswithcode.com/paper/few-shot-adaptation-for-multimedia-semantic
Repo
Framework

Value Propagation Networks


Title	Value Propagation Networks
Authors	Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier
Abstract	We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, have the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments. We show that the modules enable learning to plan when the environment also includes stochastic elements, providing a cost-efficient learning system to build low-level size-invariant planners for a variety of interactive navigation problems. We evaluate on static and dynamic configurations of MazeBase grid-worlds, with randomly generated environments of several different sizes, and on a StarCraft navigation scenario, with more complex dynamics, and pixels as input.
Tasks	Starcraft
Published	2018-05-28
URL	http://arxiv.org/abs/1805.11199v2
PDF	http://arxiv.org/pdf/1805.11199v2.pdf
PWC	https://paperswithcode.com/paper/value-propagation-networks
Repo
Framework
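
VProp is described as being built on Value Iteration. For readers unfamiliar with that base algorithm, here is a minimal sketch of classical tabular value iteration on a small grid world. It is not VProp itself: the dynamics are known and deterministic, nothing is learned, and the grid encoding, reward of 1 for entering the goal, and discount factor are assumptions chosen for the example.

```python
from typing import List

Grid = List[str]  # '.' free cell, '#' wall, 'G' goal (assumed encoding)


def value_iteration(grid: Grid, gamma: float = 0.9, sweeps: int = 100) -> List[List[float]]:
    """Run a fixed number of synchronous Bellman backups over the grid."""
    rows, cols = len(grid), len(grid[0])
    values = [[0.0] * cols for _ in range(rows)]
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(sweeps):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] in "#G":
                    continue  # walls and the absorbing goal keep value 0
                best = float("-inf")
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                        nr, nc = r, c  # blocked moves leave the agent in place
                    reward = 1.0 if grid[nr][nc] == "G" else 0.0
                    best = max(best, reward + gamma * values[nr][nc])
                new_values[r][c] = best
        values = new_values
    return values


if __name__ == "__main__":
    for row in value_iteration([".#G", "..."]):
        print([round(v, 3) for v in row])
```

Each sweep only combines a cell's value with those of its four neighbours, which is the kind of local update that convolutional planning modules can imitate.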

Information Flow in Pregroup Models of Natural Language


Title	Information Flow in Pregroup Models of Natural Language
Authors	Peter M. Hines
Abstract	This paper is about pregroup models of natural languages, and how they relate to the explicitly categorical use of pregroups in Compositional Distributional Semantics and Natural Language Processing. These categorical interpretations make certain assumptions about the nature of natural languages that, when stated formally, may be seen to impose strong restrictions on pregroup grammars for natural languages. We formalize this as a hypothesis about the form that pregroup models of natural languages must take, and demonstrate by an artificial language example that these restrictions are not imposed by the pregroup axioms themselves. We compare and contrast the artificial language examples with natural languages (using Welsh, a language where the ‘noun’ type cannot be taken as primitive, as an illustrative example). The hypothesis is simply that there must exist a causal connection, or information flow, between the words of a sentence in a language whose purpose is to communicate information. This is not necessarily the case with formal languages that are simply generated by a series of ‘meaning-free’ rules. This imposes restrictions on the types of pregroup grammars that we expect to find in natural languages; we formalize this in algebraic, categorical, and graphical terms. We take some preliminary steps in providing conditions that ensure pregroup models satisfy these conjectured properties, and discuss the more general forms this hypothesis may take.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03273v1
PDF	http://arxiv.org/pdf/1811.03273v1.pdf
PWC	https://paperswithcode.com/paper/information-flow-in-pregroup-models-of
Repo
Framework

Iterative Delegations in Liquid Democracy with Restricted Preferences


Title	Iterative Delegations in Liquid Democracy with Restricted Preferences
Authors	Bruno Escoffier, Hugo Gilbert, Adèle Pass-Lanneau
Abstract	In this paper, we study liquid democracy, a collective decision making paradigm which lies between direct and representative democracy. One main feature of liquid democracy is that voters can delegate their votes in a transitive manner so that: A delegates to B and B delegates to C leads to A delegates to C. Unfortunately, this process may not converge as there may not even exist a stable state (also called equilibrium). In this paper, we investigate the stability of the delegation process in liquid democracy when voters have restricted types of preference on the agent representing them (e.g., single-peaked preferences). We show that various natural structures of preferences guarantee the existence of an equilibrium and we obtain both tractability and hardness results for the problem of computing several equilibria with some desirable properties.
Tasks	Decision Making
Published	2018-09-12
URL	https://arxiv.org/abs/1809.04362v2
PDF	https://arxiv.org/pdf/1809.04362v2.pdf
PWC	https://paperswithcode.com/paper/the-convergence-of-iterative-delegations-in
Repo
Framework
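
The transitive delegation rule in the abstract (A delegates to B, B delegates to C, so A is represented by C) can be sketched as a walk along delegation chains. The example below is only an illustration of that rule and of the non-convergent case where delegations form a cycle; it does not model preferences or the equilibrium computation studied in the paper. The function names and the dictionary format are assumptions, and every delegate is assumed to appear as a key.

```python
from typing import Dict, Optional


def resolve_delegations(delegates_to: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Map each voter to the voter who finally casts their vote, or None on a cycle."""
    gurus: Dict[str, Optional[str]] = {}
    for voter in delegates_to:
        seen = set()
        current = voter
        while delegates_to[current] is not None and current not in seen:
            seen.add(current)
            current = delegates_to[current]
        # Stopping on an already visited voter means the chain loops forever.
        gurus[voter] = current if delegates_to[current] is None else None
    return gurus


def voting_power(gurus: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Count how many votes each final representative carries."""
    power: Dict[str, int] = {}
    for guru in gurus.values():
        if guru is not None:
            power[guru] = power.get(guru, 0) + 1
    return power


if __name__ == "__main__":
    # None means the voter votes directly; D and E delegate to each other.
    delegations = {"A": "B", "B": "C", "C": None, "D": "E", "E": "D"}
    gurus = resolve_delegations(delegations)
    print(gurus)
    print(voting_power(gurus))
```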

Policy Optimization with Second-Order Advantage Information


Title	Policy Optimization with Second-Order Advantage Information
Authors	Jiajin Li, Baoxiang Wang
Abstract	Policy optimization on high-dimensional continuous control tasks is difficult because of the large variance of the policy gradient estimators. We present the action subspace dependent gradient (ASDG) estimator, which incorporates the Rao-Blackwell theorem (RB) and Control Variates (CV) into a unified framework to reduce the variance. To invoke RB, our proposed algorithm (POSA) learns the underlying factorization structure of the action space based on the second-order advantage information. POSA captures the quadratic information explicitly and efficiently by utilizing the wide & deep architecture. Empirical studies show that our proposed approach improves performance on high-dimensional synthetic settings and OpenAI Gym’s MuJoCo continuous control tasks.
Tasks	Continuous Control
Published	2018-05-09
URL	https://arxiv.org/abs/1805.03586v2
PDF	https://arxiv.org/pdf/1805.03586v2.pdf
PWC	https://paperswithcode.com/paper/policy-optimization-with-second-order
Repo
Framework

Learning Treatment Regimens from Electronic Medical Records


Title	Learning Treatment Regimens from Electronic Medical Records
Authors	Khanh-Hung Hoang, Tu-Bao Ho
Abstract	Appropriate treatment regimens play a vital role in improving patient health status. Although some achievements have been made, few of the recent studies of learning treatment regimens have exploited different kinds of patient information due to the difficulty in adopting heterogeneous data to many data mining methods. Moreover, current studies seem too rigid with fixed intervals of treatment periods corresponding to the varying lengths of hospital stay. To this end, this work proposes a generic data-driven framework which can derive group-treatment regimens from electronic medical records by utilizing a mixed-variate restricted Boltzmann machine and incorporating medical domain knowledge. We conducted experiments on coronary artery disease as a case study. The obtained results show that the framework is promising and capable of assisting physicians in making clinical decisions.
Tasks
Published	2018-06-16
URL	http://arxiv.org/abs/1806.07461v1
PDF	http://arxiv.org/pdf/1806.07461v1.pdf
PWC	https://paperswithcode.com/paper/learning-treatment-regimens-from-electronic
Repo
Framework

Ensemble of Convolutional Neural Networks for Automatic Grading of Diabetic Retinopathy and Macular Edema


Title	Ensemble of Convolutional Neural Networks for Automatic Grading of Diabetic Retinopathy and Macular Edema
Authors	Avinash Kori, Sai Saketh Chennamsetty, Mohammed Safwan K. P., Varghese Alex
Abstract	In this manuscript, we automate the procedure of grading of diabetic retinopathy and macular edema from fundus images using an ensemble of convolutional neural networks. The limited amount of labeled data available for supervised learning was circumvented by using a transfer learning approach. The models in the ensemble were pre-trained on a large dataset comprising natural images and were later fine-tuned with the limited data for the task of choice. For an image, the ensemble of classifiers generates multiple predictions, and a max-voting based approach was utilized to attain the final grade of the anomaly in the image. For the task of grading DR, on the test data (n=56), the ensemble achieved an accuracy of 83.9%, while for the task of grading macular edema the network achieved an accuracy of 95.45% (n=44).
Tasks	Transfer Learning
Published	2018-09-12
URL	http://arxiv.org/abs/1809.04228v1
PDF	http://arxiv.org/pdf/1809.04228v1.pdf
PWC	https://paperswithcode.com/paper/ensemble-of-convolutional-neural-networks-for
Repo
Framework
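
The max-voting step mentioned in the abstract can be sketched in a few lines: each model in the ensemble proposes a grade per image, and the most frequent grade wins. The abstract does not say how ties are broken, so the rule used here (prefer the lowest grade) is an assumption, as are the function names and the sample predictions.

```python
from collections import Counter
from typing import List, Sequence


def max_vote(predictions: Sequence[int]) -> int:
    """Return the most frequent grade among one image's predictions."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Tie-break rule (an assumption): prefer the lowest of the most-voted grades.
    return min(grade for grade, n in counts.items() if n == top)


def ensemble_grades(per_model: Sequence[Sequence[int]]) -> List[int]:
    """per_model[m][i] is model m's grade for image i; zip regroups by image."""
    return [max_vote(votes) for votes in zip(*per_model)]


if __name__ == "__main__":
    # Hypothetical grades from three models for three images.
    model_outputs = [
        [0, 2, 3],
        [0, 1, 3],
        [1, 2, 4],
    ]
    print(ensemble_grades(model_outputs))
```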

EREL Selection using Morphological Relation


Title	EREL Selection using Morphological Relation
Authors	Yuying Li, Mehdi Faraji
Abstract	This work concentrates on Extremal Regions of Extremum Level (EREL) selection. EREL is a recently proposed feature detector aiming at detecting regions from a set of extremal regions. This is a branching problem derived from segmentation of arterial wall boundaries from Intravascular Ultrasound (IVUS) images. For each IVUS frame, a set of EREL regions is generated to describe the luminal area of human coronary. Each EREL is then fitted by an ellipse to represent the luminal border. The goal is to assign the most appropriate EREL as the lumen. In this work, EREL selection is carried out in two rounds. In the first round, the pattern in a set of EREL regions is analyzed and used to generate an approximate luminal region. Then, the two-dimensional (2D) correlation coefficients are computed between this approximate region and each EREL to keep the ones with the tightest relevance. In the second round, a compactness measure is calculated for each EREL and its fitted ellipse to guarantee that the resulting EREL is not affected by common artifacts such as bifurcations, shadows, and side branches. We evaluated the selected ERELs in terms of Hausdorff Distance (HD) and Jaccard Measure (JM) on the training and test sets of a publicly available dataset. The results show that our selection strategy outperforms the current state-of-the-art.
Tasks
Published	2018-06-10
URL	http://arxiv.org/abs/1806.03580v1
PDF	http://arxiv.org/pdf/1806.03580v1.pdf
PWC	https://paperswithcode.com/paper/erel-selection-using-morphological-relation
Repo
Framework
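
The two evaluation metrics named in the abstract, Hausdorff Distance and Jaccard Measure, have standard definitions that a short sketch can make concrete. The version below works on sets of pixel coordinates and assumes both sets are non-empty when computing the Hausdorff distance; whether the paper measures distances on full regions or only on boundary points is not stated in the abstract, so that choice is left to the caller.

```python
import math
from typing import Set, Tuple

Point = Tuple[int, int]


def jaccard(a: Set[Point], b: Set[Point]) -> float:
    """Intersection over union of two pixel sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hausdorff(a: Set[Point], b: Set[Point]) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src: Set[Point], dst: Set[Point]) -> float:
        # Farthest any source point lies from its nearest destination point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


if __name__ == "__main__":
    # Hypothetical predicted and reference pixel sets.
    predicted = {(0, 0), (0, 1)}
    reference = {(0, 1), (0, 3)}
    print(jaccard(predicted, reference), hausdorff(predicted, reference))
```

This brute-force Hausdorff computation is quadratic in the number of points, which is fine for illustration but slow for large regions.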

Not quite unreasonable effectiveness of machine learning algorithms


Title	Not quite unreasonable effectiveness of machine learning algorithms
Authors	Egor Illarionov, Roman Khudorozhkov
Abstract	State-of-the-art machine learning algorithms demonstrate close to absolute performance in selected challenges. We provide arguments that the reason can be in low variability of the samples and high effectiveness in learning typical patterns. Due to this fact, standard performance metrics do not reveal model capacity and new metrics are required for the better understanding of state-of-the-art.
Tasks
Published	2018-04-07
URL	http://arxiv.org/abs/1804.02543v1
PDF	http://arxiv.org/pdf/1804.02543v1.pdf
PWC	https://paperswithcode.com/paper/not-quite-unreasonable-effectiveness-of
Repo
Framework

Nonlinear Online Learning with Adaptive Nyström Approximation


Title	Nonlinear Online Learning with Adaptive Nyström Approximation
Authors	Si Si, Sanjiv Kumar, Yang Li
Abstract	Use of nonlinear feature maps via kernel approximation has led to success in many online learning tasks. As a popular kernel approximation method, Nyström approximation has been well investigated, and various landmark points selection methods have been proposed to improve the approximation quality. However, these improved Nyström methods cannot be directly applied to the online learning setting as they need to access the entire dataset to learn the landmark points, while we need to update the model on-the-fly in the online setting. To address this challenge, we propose Adaptive Nyström approximation for solving nonlinear online learning problems. The key idea is to adaptively modify the landmark points via online k-means and adjust the model accordingly by solving a least-squares problem followed by a gradient descent step. We show that the resulting algorithm outperforms state-of-the-art online learning methods under the same budget.
Tasks
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07887v2
PDF	http://arxiv.org/pdf/1802.07887v2.pdf
PWC	https://paperswithcode.com/paper/nonlinear-online-learning-with-adaptive
Repo
Framework

Towards Discrete Solution: A Sparse Preserving Method for Correspondence Problem


Title	Towards Discrete Solution: A Sparse Preserving Method for Correspondence Problem
Authors	Bo Jiang
Abstract	Many problems of interest in computer vision can be formulated as a problem of finding consistent correspondences between two feature sets. The feature correspondence (matching) problem with a one-to-one mapping constraint is usually formulated as an Integral Quadratic Programming (IQP) problem with a permutation (or orthogonal) constraint. Since it is NP-hard, relaxation models are required. One main challenge in optimizing the IQP matching problem is how to incorporate the discrete one-to-one mapping (permutation) constraint in its quadratic objective optimization. In this paper, we present a new relaxation model, called Sparse Constraint Preserving Matching (SPM), for the IQP matching problem. SPM is motivated by our observation that the discrete permutation constraint can be well encoded via a sparse constraint. Compared with traditional relaxation models, SPM can incorporate the discrete one-to-one mapping constraint directly via a sparse constraint and thus provides a tighter relaxation of the original IQP matching problem. A simple yet effective update algorithm has been derived to solve the proposed SPM model. Experimental results on several feature matching tasks demonstrate the effectiveness and efficiency of the SPM method.
Tasks
Published	2018-09-20
URL	http://arxiv.org/abs/1809.07456v1
PDF	http://arxiv.org/pdf/1809.07456v1.pdf
PWC	https://paperswithcode.com/paper/towards-discrete-solution-a-sparse-preserving
Repo
Framework

Domain Robust Feature Extraction for Rapid Low Resource ASR Development


Title	Domain Robust Feature Extraction for Rapid Low Resource ASR Development
Authors	Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black
Abstract	Developing a practical speech recognizer for a low resource language is challenging, not only because of the (potentially unknown) properties of the language, but also because test data may not be from the same domain as the available training data. In this paper, we focus on the latter challenge, i.e. domain mismatch, for systems trained using a sequence-based criterion. We demonstrate the effectiveness of using a pre-trained English recognizer, which is robust to such mismatched conditions, as a domain normalizing feature extractor on a low resource language. In our example, we use Turkish Conversational Speech and Broadcast News data. This enables rapid development of speech recognizers for new languages which can easily adapt to any domain. Testing in various cross-domain scenarios, we achieve relative improvements of around 25% in phoneme error rate, with improvements being around 50% for some domains.
Tasks
Published	2018-07-28
URL	http://arxiv.org/abs/1807.10984v2
PDF	http://arxiv.org/pdf/1807.10984v2.pdf
PWC	https://paperswithcode.com/paper/domain-robust-feature-extraction-for-rapid
Repo
Framework

The Many Faces of Exponential Weights in Online Learning


Title	The Many Faces of Exponential Weights in Online Learning
Authors	Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski
Abstract	A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting Exponential Weights (EW) first. We show that many standard methods and their regret bounds then follow as a special case by plugging in suitable surrogate losses and playing the EW posterior mean. For instance, we easily recover Online Gradient Descent by using EW with a Gaussian prior on linearized losses, and, more generally, all instances of Online Mirror Descent based on regular Bregman divergences also correspond to EW with a prior that depends on the mirror map. Furthermore, appropriate quadratic surrogate losses naturally give rise to Online Gradient Descent for strongly convex losses and to Online Newton Step. We further interpret several recent adaptive methods (iProd, Squint, and a variation of Coin Betting for experts) as a series of closely related reductions to exp-concave surrogate losses that are then handled by Exponential Weights. Finally, a benefit of our EW interpretation is that it opens up the possibility of sampling from the EW posterior distribution instead of playing the mean. As already observed by Bubeck and Eldan, this recovers the best-known rate in Online Bandit Linear Optimization.
Tasks
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07543v2
PDF	http://arxiv.org/pdf/1802.07543v2.pdf
PWC	https://paperswithcode.com/paper/the-many-faces-of-exponential-weights-in
Repo
Framework
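
For context on the algorithm the abstract puts at the center, here is a minimal sketch of Exponential Weights in its simplest form: a finite set of experts, a uniform prior, and a fixed learning rate. The paper's reductions use continuous priors (for example Gaussian) and surrogate losses, none of which are shown; the function name, learning rate, and loss sequence are assumptions for the example.

```python
import math
from typing import List, Sequence


def exponential_weights(losses: Sequence[Sequence[float]], eta: float) -> List[List[float]]:
    """Return the distribution over experts played in each round."""
    n_experts = len(losses[0])
    cumulative = [0.0] * n_experts  # total loss of each expert so far
    played = []
    for round_losses in losses:
        # Subtract the minimum before exponentiating for numerical stability.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        played.append([w / total for w in weights])
        # Losses are revealed only after the distribution has been played.
        cumulative = [c + l for c, l in zip(cumulative, round_losses)]
    return played


if __name__ == "__main__":
    # Two experts; the first never suffers loss, the second always does.
    for distribution in exponential_weights([[0.0, 1.0]] * 3, eta=1.0):
        print([round(p, 4) for p in distribution])
```

The weight on the better expert grows every round, and the learning rate `eta` controls how quickly the distribution concentrates.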

CIoTA: Collaborative IoT Anomaly Detection via Blockchain


Title	CIoTA: Collaborative IoT Anomaly Detection via Blockchain
Authors	Tomer Golomb, Yisroel Mirsky, Yuval Elovici
Abstract	Due to their rapid growth and deployment, Internet of things (IoT) devices have become a central aspect of our daily lives. However, they tend to have many vulnerabilities which can be exploited by an attacker. Unsupervised techniques, such as anomaly detection, can help us secure the IoT devices. However, an anomaly detection model must be trained for a long time in order to capture all benign behaviors. This approach is vulnerable to adversarial attacks since all observations are assumed to be benign while training the anomaly detection model. In this paper, we propose CIoTA, a lightweight framework that utilizes the blockchain concept to perform distributed and collaborative anomaly detection for devices with limited resources. CIoTA uses blockchain to incrementally update a trusted anomaly detection model via self-attestation and consensus among IoT devices. We evaluate CIoTA on our own distributed IoT simulation platform, which consists of 48 Raspberry Pis, to demonstrate CIoTA’s ability to enhance the security of each device and the security of the network as a whole.
Tasks	Anomaly Detection
Published	2018-03-10
URL	http://arxiv.org/abs/1803.03807v2
PDF	http://arxiv.org/pdf/1803.03807v2.pdf
PWC	https://paperswithcode.com/paper/ciota-collaborative-iot-anomaly-detection-via
Repo
Framework