October 17, 2019

3250 words 16 mins read

Paper Group ANR 946

IcoRating: A Deep-Learning System for Scam ICO Identification

Title IcoRating: A Deep-Learning System for Scam ICO Identification
Authors Shuqing Bian, Zhenpeng Deng, Fei Li, Will Monroe, Peng Shi, Zijun Sun, Wei Wu, Sikuang Wang, William Yang Wang, Arianna Yuan, Tianwei Zhang, Jiwei Li
Abstract Cryptocurrencies (or digital tokens, digital currencies, e.g., BTC, ETH, XRP, NEO) have been rapidly gaining ground in use, value, and understanding among the public, bringing astonishing profits to investors. Unlike other money and banking systems, most digital tokens do not require central authorities. Being decentralized poses significant challenges for credit rating. Most ICOs are currently not subject to government regulations, which makes a reliable credit rating system for ICO projects necessary and urgent. In this paper, we introduce IcoRating, the first learning-based cryptocurrency rating system. We exploit natural-language processing techniques to analyze various aspects of 2,251 digital currencies to date, such as white paper content, founding teams, GitHub repositories, websites, etc. Supervised learning models are used to correlate the life span and the price change of cryptocurrencies with these features. For the best setting, the proposed system is able to identify scam ICO projects with 0.83 precision. We hope this work will help investors identify scam ICOs and attract more efforts in automatically evaluating and analyzing ICO projects.
Tasks
Published 2018-03-08
URL http://arxiv.org/abs/1803.03670v1
PDF http://arxiv.org/pdf/1803.03670v1.pdf
PWC https://paperswithcode.com/paper/icorating-a-deep-learning-system-for-scam-ico
Repo
Framework

Extending classical surrogate modelling to high-dimensions through supervised dimensionality reduction: a data-driven approach

Title Extending classical surrogate modelling to high-dimensions through supervised dimensionality reduction: a data-driven approach
Authors C. Lataniotis, S. Marelli, B. Sudret
Abstract Thanks to their versatility, ease of deployment and high performance, surrogate models have become staple tools in the arsenal of uncertainty quantification (UQ). From local interpolants to global spectral decompositions, surrogates are characterised by their ability to efficiently emulate complex computational models based on a small set of model runs used for training. An inherent limitation of many surrogate models is their susceptibility to the curse of dimensionality, which traditionally limits their applicability to a maximum of $\mathcal{O}(10^2)$ input dimensions. We present a novel approach to high-dimensional surrogate modelling that is model-, dimensionality-reduction- and surrogate-model-agnostic (black box), and can enable the solution of high-dimensional (i.e. up to $\mathcal{O}(10^4)$) problems. After introducing the general algorithm, we demonstrate its performance by combining Kriging and polynomial chaos expansion surrogates with kernel principal component analysis. In particular, we compare the generalisation performance that the resulting surrogates achieve to the classical sequential application of dimensionality reduction followed by surrogate modelling on several benchmark applications, comprising an analytical function and two engineering applications of increasing dimensionality and complexity.
Tasks Dimensionality Reduction
Published 2018-12-15
URL https://arxiv.org/abs/1812.06309v3
PDF https://arxiv.org/pdf/1812.06309v3.pdf
PWC https://paperswithcode.com/paper/extending-classical-surrogate-modelling-to
Repo
Framework
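
The sequential baseline the paper benchmarks against — reduce dimensionality first, then fit a surrogate in the reduced space — can be sketched briefly. This is a minimal illustration on invented data with invented hyperparameters, using plain PCA (via SVD) and a kernel ridge surrogate rather than the paper's kernel PCA with Kriging or polynomial chaos expansions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional problem: inputs vary along a few latent directions.
d, d_latent, n = 200, 3, 80
A = rng.standard_normal((d, d_latent))
Z = rng.standard_normal((n, d_latent))
X = Z @ A.T                          # n training samples in R^d
y = np.sin(Z[:, 0]) + Z[:, 1] ** 2   # stand-in for an expensive model response

# Step 1: unsupervised dimensionality reduction (PCA via SVD).
mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
W = Vt[:d_latent].T                  # projection onto the leading subspace
Zr = (X - mu) @ W
Zr_std = Zr.std(axis=0)
Zr = Zr / Zr_std                     # standardise the reduced coordinates

# Step 2: kernel ridge surrogate in the reduced space (RBF kernel).
def rbf(P, Q, ell=1.0):
    sq = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * ell ** 2))

lam = 1e-6
alpha = np.linalg.solve(rbf(Zr, Zr) + lam * np.eye(n), y)

def surrogate(X_new):
    Zn = ((X_new - mu) @ W) / Zr_std
    return rbf(Zn, Zr) @ alpha

err = np.max(np.abs(surrogate(X) - y))   # training-set reproduction error
print(f"max training error: {err:.2e}")
```

The paper's contribution is to couple the reduction and the surrogate rather than applying them sequentially as above; the sketch only shows the decoupled approach it improves upon.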

Optimizing Non-decomposable Measures with Deep Networks

Title Optimizing Non-decomposable Measures with Deep Networks
Authors Amartya Sanyal, Pawan Kumar, Purushottam Kar, Sanjay Chawla, Fabrizio Sebastiani
Abstract We present a class of algorithms capable of directly training deep neural networks with respect to large families of task-specific performance measures such as the F-measure and the Kullback-Leibler divergence that are structured and non-decomposable. This presents a departure from standard deep learning techniques that typically use squared or cross-entropy loss functions (that are decomposable) to train neural networks. We demonstrate that directly training with task-specific loss functions yields much faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations have several novel features including (i) convergence to first order stationary points despite optimizing complex objective functions; (ii) use of fewer training samples to achieve a desired level of convergence, (iii) a substantial reduction in training time, and (iv) a seamless integration of our implementation into existing symbolic gradient frameworks. We implement our techniques on a variety of deep architectures including multi-layer perceptrons and recurrent neural networks and show that on a variety of benchmark and real data sets, our algorithms outperform traditional approaches to training deep networks, as well as some recent approaches to task-specific training of neural networks.
Tasks
Published 2018-01-31
URL http://arxiv.org/abs/1802.00086v1
PDF http://arxiv.org/pdf/1802.00086v1.pdf
PWC https://paperswithcode.com/paper/optimizing-non-decomposable-measures-with
Repo
Framework
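
The non-decomposability that motivates the paper is easy to demonstrate: a measure such as the F1 score over a dataset is not an average of per-example (or per-batch) terms, which is what standard SGD with pointwise losses implicitly assumes. A small illustration with invented labels:

```python
import numpy as np

def f1(y_true, y_pred):
    # F1 score computed from set-level counts, not per-example losses.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 2 * tp / (2 * tp + fp + fn)

y_true = np.array([1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 0, 0, 0, 0, 0, 1, 0])

full = f1(y_true, y_pred)                      # F1 over the whole set
halves = 0.5 * (f1(y_true[:4], y_pred[:4]) +   # average of F1 over two
                f1(y_true[4:], y_pred[4:]))    # disjoint halves

print(full, halves)   # 0.5 vs ~0.333: the measure does not decompose
```

Because the two values differ, no per-example loss whose batch average equals F1 can exist, which is why direct optimisation requires the kind of specialised algorithms the paper develops.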

A Survey on Deep Learning Toolkits and Libraries for Intelligent User Interfaces

Title A Survey on Deep Learning Toolkits and Libraries for Intelligent User Interfaces
Authors Jan Zacharias, Michael Barz, Daniel Sonntag
Abstract This paper provides an overview of prominent deep learning toolkits and, in particular, reports on recent publications that contributed open source software for implementing tasks that are common in intelligent user interfaces (IUI). We provide a scientific reference for researchers and software engineers who plan to utilise deep learning techniques within their IUI research and development projects.
Tasks
Published 2018-03-13
URL http://arxiv.org/abs/1803.04818v2
PDF http://arxiv.org/pdf/1803.04818v2.pdf
PWC https://paperswithcode.com/paper/a-survey-on-deep-learning-toolkits-and
Repo
Framework

Asynchronous Stochastic Variational Inference

Title Asynchronous Stochastic Variational Inference
Authors Saad Mohamad, Abdelhamid Bouchachia, Moamar Sayed-Mouchaweh
Abstract Stochastic variational inference (SVI) employs stochastic optimization to scale up Bayesian computation to massive data. Since SVI is at its core a stochastic gradient-based algorithm, horizontal parallelism can be harnessed to allow larger scale inference. We propose a lock-free parallel implementation for SVI which allows distributed computations over multiple slaves in an asynchronous style. We show that our implementation leads to linear speed-up while guaranteeing an asymptotic ergodic convergence rate $O(1/\sqrt{T})$ given that the number of slaves is bounded by $\sqrt{T}$ ($T$ is the total number of iterations). The implementation is done in a high-performance computing (HPC) environment using message passing interface (MPI) for python (MPI4py). The extensive empirical evaluation shows that our parallel SVI is lossless, performing comparably well to its counterpart serial SVI with linear speed-up.
Tasks Stochastic Optimization
Published 2018-01-12
URL http://arxiv.org/abs/1801.04289v1
PDF http://arxiv.org/pdf/1801.04289v1.pdf
PWC https://paperswithcode.com/paper/asynchronous-stochastic-variational-inference
Repo
Framework
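
The lock-free asynchronous pattern the abstract describes — several workers applying stochastic updates to shared parameters without coordination — can be illustrated on a toy problem. This is not SVI itself, just the update pattern: a one-parameter squared-error objective stands in for the global variational parameter, and the data, step size, and thread count are invented for the example.

```python
import threading

import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(4.0, 1.0, 20_000)   # toy dataset; true mean is 4.0
theta = np.zeros(1)                   # shared parameter, updated without locks

def worker(chunk, lr=0.05):
    # Each worker applies SGD steps to the shared theta; reads and writes
    # may interleave with other workers by design (Hogwild-style).
    for x in chunk:
        grad = theta[0] - x           # gradient of 0.5 * (theta - x)^2
        theta[0] -= lr * grad

threads = [threading.Thread(target=worker, args=(c,))
           for c in np.array_split(data, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"asynchronous estimate: {theta[0]:.2f} (true mean 4.0)")
```

The paper's point is that, under a bound on the number of workers, this kind of uncoordinated updating still converges at the usual stochastic-gradient rate; the sketch only shows the mechanics, not the proof conditions.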

Population and Empirical PR Curves for Assessment of Ranking Algorithms

Title Population and Empirical PR Curves for Assessment of Ranking Algorithms
Authors Jacqueline M. Hughes-Oliver
Abstract The ROC curve is widely used to assess the quality of prediction/classification/ranking algorithms, and its properties have been extensively studied. The precision-recall (PR) curve has become the de facto replacement for the ROC curve in the presence of imbalance, namely where one class is far more likely than the other class. While the PR and ROC curves tend to be used interchangeably, they have some very different properties. Properties of the PR curve are the focus of this paper. We consider: (1) population PR curves, where complete distributional assumptions are specified for scores from both classes; and (2) empirical estimators of the PR curve, where we observe scores and no distributional assumptions are made. The properties have direct consequence on how the PR curve should, and should not, be used. For example, the empirical PR curve is not consistent when scores in the class of primary interest come from discrete distributions. On the other hand, a normal approximation can fit quite well for points on the empirical PR curve from continuously-defined scores, but convergence can be heavily influenced by the distributional setting, the amount of imbalance, and the point of interest on the PR curve.
Tasks
Published 2018-10-19
URL http://arxiv.org/abs/1810.08635v1
PDF http://arxiv.org/pdf/1810.08635v1.pdf
PWC https://paperswithcode.com/paper/population-and-empirical-pr-curves-for
Repo
Framework
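
An empirical PR curve of the kind the paper studies is computed directly from scores by sweeping a cut-off over the ranked examples. A minimal sketch on invented, imbalanced Gaussian scores (ties between scores are ignored for simplicity):

```python
import numpy as np

def pr_curve(scores, labels):
    """Empirical precision/recall at every cut-off in the score ranking."""
    order = np.argsort(-scores)   # rank examples from highest score down
    labels = labels[order]
    tp = np.cumsum(labels)        # positives retrieved at each cut-off
    fp = np.cumsum(1 - labels)    # negatives retrieved at each cut-off
    precision = tp / (tp + fp)
    recall = tp / labels.sum()
    return precision, recall

rng = np.random.default_rng(1)
# Imbalanced setting: rare, mildly separated positives.
pos, neg = rng.normal(1.0, 1.0, 30), rng.normal(0.0, 1.0, 300)
scores = np.concatenate([pos, neg])
labels = np.concatenate([np.ones(30), np.zeros(300)])

p, r = pr_curve(scores, labels)
print(f"precision at full recall: {p[-1]:.3f}")   # class prevalence, 30/330
```

Note the rightmost point always equals the positive-class prevalence, one of the structural properties that distinguishes PR from ROC curves under imbalance.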

Generating Redundant Features with Unsupervised Multi-Tree Genetic Programming

Title Generating Redundant Features with Unsupervised Multi-Tree Genetic Programming
Authors Andrew Lensen, Bing Xue, Mengjie Zhang
Abstract Recently, feature selection has become an increasingly important area of research due to the surge in high-dimensional datasets in all areas of modern life. A plethora of feature selection algorithms have been proposed, but it is difficult to truly analyse the quality of a given algorithm. Ideally, an algorithm would be evaluated by measuring how well it removes known bad features. Acquiring datasets with such features is inherently difficult, and so a common technique is to add synthetic bad features to an existing dataset. While adding noisy features is an easy task, it is very difficult to automatically add complex, redundant features. This work proposes one of the first approaches to generating redundant features, using a novel genetic programming approach. Initial experiments show that our proposed method can automatically create difficult, redundant features which have the potential to be used for creating high-quality feature selection benchmark datasets.
Tasks Feature Selection
Published 2018-02-02
URL http://arxiv.org/abs/1802.00554v2
PDF http://arxiv.org/pdf/1802.00554v2.pdf
PWC https://paperswithcode.com/paper/generating-redundant-features-with
Repo
Framework
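
The kind of feature the proposed GP search aims to generate can be written by hand for illustration: a column fully determined by existing features (hence redundant) yet nearly uncorrelated with each of them individually, which is what makes nonlinear redundancy hard to flag. The particular function below is invented, not one evolved by the paper's method:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((2000, 3))   # source features

# A redundant feature is a function of existing columns; this hand-written
# nonlinear example mimics what a genetic-programming search might produce.
redundant = np.tanh(X[:, 0] * X[:, 1]) - 0.5 * X[:, 2] ** 2

# It carries zero information beyond the source columns, yet its *linear*
# correlation with each individual source feature is near zero.
corrs = [abs(np.corrcoef(X[:, j], redundant)[0, 1]) for j in range(3)]
print([f"{c:.3f}" for c in corrs])

# Benchmark dataset with a known-bad column a good selector should discard.
X_aug = np.column_stack([X, redundant])
```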

The Partially Observable Games We Play for Cyber Deception

Title The Partially Observable Games We Play for Cyber Deception
Authors Mohamadreza Ahmadi, Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
Abstract Progressively intricate cyber infiltration mechanisms have made conventional means of defense, such as firewalls and malware detectors, incompetent. These sophisticated infiltration mechanisms can study the defender’s behavior, identify security caveats, and modify their actions adaptively. To tackle these security challenges, cyber-infrastructures require active defense techniques that incorporate cyber deception, in which the defender (deceiver) implements a strategy to mislead the infiltrator. To this end, we use a two-player partially observable stochastic game (POSG) framework, wherein the deceiver has full observability over the states of the POSG, and the infiltrator has partial observability. Then, the deception problem is to compute a strategy for the deceiver that minimizes the expected cost of deception against all strategies of the infiltrator. We first show that the underlying problem is a robust mixed-integer linear program, which is intractable to solve in general. Towards a scalable approach, we compute optimal finite-memory strategies for the infiltrator by a reduction to a series of synthesis problems for parametric Markov decision processes. We use these infiltration strategies to find robust strategies for the deceiver using mixed-integer linear programming. We illustrate the performance of our technique on a POSG model for network security. Our experiments demonstrate that the proposed approach handles scenarios considerably larger than those of the state-of-the-art methods.
Tasks
Published 2018-09-28
URL http://arxiv.org/abs/1810.00092v1
PDF http://arxiv.org/pdf/1810.00092v1.pdf
PWC https://paperswithcode.com/paper/the-partially-observable-games-we-play-for
Repo
Framework

SOC: hunting the underground inside story of the ethereum Social-network Opinion and Comment

Title SOC: hunting the underground inside story of the ethereum Social-network Opinion and Comment
Authors TonTon Hsien-De Huang, Po-Wei Hong, Ying-Tse Lee, Yi-Lun Wang, Chi-Leong Lok, Hung-Yu Kao
Abstract Cryptocurrencies are attracting more and more attention because of blockchain technology. Ethereum is gaining significant popularity in the blockchain community, mainly because it is designed in a way that enables developers to write smart contracts and decentralized applications (Dapps). There are many kinds of cryptocurrency information on social networks, and the risks and fraud problems behind it have pushed many countries, including the United States, South Korea, and China, to issue warnings and set up corresponding regulations. However, the security of Ethereum smart contracts has not gained much attention. Through a deep learning approach, we propose a method of sentiment analysis for Ethereum’s community comments. In this research, we first collected users’ cryptocurrency comments from social networks and then fed them to our LSTM + CNN model for training. Then we made predictions through sentiment analysis. With our research results, we have demonstrated that both the precision and the recall of sentiment analysis can achieve 0.80+. More importantly, we deploy our sentiment analysis on RatingToken and Coin Master (mobile applications of the Cheetah Mobile Blockchain Security Center). We can effectively provide detailed information to help resolve the risks of fake and fraudulent projects.
Tasks Sentiment Analysis
Published 2018-11-27
URL http://arxiv.org/abs/1811.11136v1
PDF http://arxiv.org/pdf/1811.11136v1.pdf
PWC https://paperswithcode.com/paper/soc-hunting-the-underground-inside-story-of
Repo
Framework

Hunting for Discriminatory Proxies in Linear Regression Models

Title Hunting for Discriminatory Proxies in Linear Regression Models
Authors Samuel Yeom, Anupam Datta, Matt Fredrikson
Abstract A machine learning model may exhibit discrimination when used to make decisions involving people. One potential cause for such outcomes is that the model uses a statistical proxy for a protected demographic attribute. In this paper we formulate a definition of proxy use for the setting of linear regression and present algorithms for detecting proxies. Our definition follows recent work on proxies in classification models, and characterizes a model’s constituent behavior that: 1) correlates closely with a protected random variable, and 2) is causally influential in the overall behavior of the model. We show that proxies in linear regression models can be efficiently identified by solving a second-order cone program, and further extend this result to account for situations where the use of a certain input variable is justified as a ‘business necessity’. Finally, we present empirical results on two law enforcement datasets that exhibit varying degrees of racial disparity in prediction outcomes, demonstrating that proxies shed useful light on the causes of discriminatory behavior in models.
Tasks
Published 2018-10-16
URL http://arxiv.org/abs/1810.07155v3
PDF http://arxiv.org/pdf/1810.07155v3.pdf
PWC https://paperswithcode.com/paper/hunting-for-discriminatory-proxies-in-linear
Repo
Framework

Learning to Defense by Learning to Attack

Title Learning to Defense by Learning to Attack
Authors Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao
Abstract Adversarial training is a principled approach for training robust neural networks. From an optimization perspective, adversarial training is solving a bilevel optimization problem (a general form of minimax approaches): The leader problem targets learning a robust classifier; the follower problem tries to generate adversarial samples. Unfortunately, such a bilevel problem is very challenging to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying hand-designed algorithms for the follower problem, we learn an optimizer, which is parametrized by a convolutional neural network. Meanwhile, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Our experiments over CIFAR datasets demonstrate that L2L improves upon existing methods in both robust accuracy and computational efficiency. Moreover, the L2L framework can be extended to other popular bilevel problems in machine learning.
Tasks Adversarial Attack, Adversarial Defense, bilevel optimization
Published 2018-11-03
URL https://arxiv.org/abs/1811.01213v4
PDF https://arxiv.org/pdf/1811.01213v4.pdf
PWC https://paperswithcode.com/paper/learning-to-defense-by-learning-to-attack
Repo
Framework

Statistical Neurodynamics of Deep Networks: Geometry of Signal Spaces

Title Statistical Neurodynamics of Deep Networks: Geometry of Signal Spaces
Authors Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi
Abstract Statistical neurodynamics studies macroscopic behaviors of randomly connected neural networks. We consider a deep layered feedforward network where input signals are processed layer by layer. The manifold of input signals is embedded in a higher dimensional manifold of the next layer as a curved submanifold, provided the number of neurons is larger than that of inputs. We show geometrical features of the embedded manifold, proving that the manifold enlarges or shrinks locally isotropically so that it is always embedded conformally. We study the curvature of the embedded manifold. The scalar curvature converges to a constant or diverges to infinity slowly. The distance between two signals also changes, converging eventually to a stable fixed value, provided both the number of neurons in a layer and the number of layers tend to infinity. This causes a problem, since when we consider a curve in the input space, it is mapped as a continuous curve of fractal nature, but our theory contradictorily suggests that the curve eventually converges to a discrete set of equally spaced points. In reality, the numbers of neurons and layers are finite and thus, it is expected that the finite size effect causes the discrepancies between our theory and reality. We need to further study the discrepancies to understand their implications on information processing.
Tasks
Published 2018-08-22
URL http://arxiv.org/abs/1808.07169v1
PDF http://arxiv.org/pdf/1808.07169v1.pdf
PWC https://paperswithcode.com/paper/statistical-neurodynamics-of-deep-networks
Repo
Framework

A Deep Learning Mechanism for Efficient Information Dissemination in Vehicular Floating Content

Title A Deep Learning Mechanism for Efficient Information Dissemination in Vehicular Floating Content
Authors Gaetano Manzo, Juan Sebastian Otálora Montenegro, Gianluca Rizzo
Abstract Handling the tremendous amount of network data produced by the explosive growth of mobile traffic volume is becoming a main priority for achieving desired performance targets efficiently. Opportunistic communication, such as Floating Content (FC), can be used to offload part of the cellular traffic volume to vehicle-to-vehicle communication (V2V), leaving the infrastructure the task of coordinating the communication. Existing FC dimensioning approaches have limitations, mainly due to unrealistic assumptions and to a coarse partitioning of users, which results in over-dimensioning. Shaping the opportunistic communication area is a crucial task for achieving desired application performance efficiently. In this work, we propose a solution for this open challenge. In particular, the broadcasting areas, called Anchor Zones (AZ), are selected via a deep learning approach to minimize communication resources while achieving the desired message availability. No assumptions are required to fit the classifier to both synthetic and real mobility. A numerical study is made to validate the effectiveness and efficiency of the proposed method. The predicted AZ configuration can achieve an accuracy of 89.7% within a 98% confidence level. Owing to the learning approach, the method performs even better in real scenarios, saving up to 27% of resources compared to previous, analytically modelled work.
Tasks
Published 2018-10-24
URL http://arxiv.org/abs/1810.10425v1
PDF http://arxiv.org/pdf/1810.10425v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-mechanism-for-efficient
Repo
Framework

A Distributed Framework for the Construction of Transport Maps

Title A Distributed Framework for the Construction of Transport Maps
Authors Diego A. Mesa, Justin Tantiongloc, Marcela Mendoza, Todd P. Coleman
Abstract The need to reason about uncertainty in large, complex, and multi-modal datasets has become increasingly common across modern scientific environments. The ability to transform samples from one distribution $P$ to another distribution $Q$ enables the solution to many problems in machine learning (e.g. Bayesian inference, generative modeling) and has been actively pursued from theoretical, computational, and application perspectives across the fields of information theory, computer science, and biology. Performing such transformations, in general, still leads to computational difficulties, especially in high dimensions. Here, we consider the problem of computing such “measure transport maps” with efficient and parallelizable methods. Under the mild assumptions that $P$ need not be known but can be sampled from, and that the density of $Q$ is known up to a proportionality constant, and that $Q$ is log-concave, we provide in this work a convex optimization problem pertaining to relative entropy minimization. We show how an empirical minimization formulation and polynomial chaos map parameterization can allow for learning a transport map between $P$ and $Q$ with distributed and scalable methods. We also leverage findings from nonequilibrium thermodynamics to represent the transport map as a composition of simpler maps, each of which is learned sequentially with a transport cost regularized version of the aforementioned problem formulation. We provide examples of our framework within the context of Bayesian inference for the Boston housing dataset and generative modeling for handwritten digit images from the MNIST dataset.
Tasks Bayesian Inference
Published 2018-01-25
URL http://arxiv.org/abs/1801.08454v3
PDF http://arxiv.org/pdf/1801.08454v3.pdf
PWC https://paperswithcode.com/paper/a-distributed-framework-for-the-construction
Repo
Framework
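
The paper's distributed algorithm minimises relative entropy over polynomial chaos map parameterisations; as a much simpler, self-contained illustration of what a transport map does, the one-dimensional case has a closed form: the optimal monotone map is $F_Q^{-1} \circ F_P$. The exponential source and logistic target below are invented for the example (the logistic density is log-concave, matching the paper's assumption on $Q$), and $F_P$ is replaced by empirical ranks since $P$ is known only through samples:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 5000)   # samples from P (only samples available)

# Empirical CDF ranks stand in for F_P; the logistic quantile function
# log(u / (1 - u)) is F_Q^{-1} in closed form.
u = (np.argsort(np.argsort(x)) + 0.5) / len(x)
y = np.log(u / (1 - u))          # transported samples, now ~ logistic

# The map is monotone: the ordering of the samples is preserved.
order = np.argsort(x)
assert np.all(np.diff(y[order]) > 0)

print(f"transported mean: {y.mean():.4f} (logistic mean is 0)")
```

In higher dimensions no such closed form exists, which is what motivates the paper's convex-optimisation construction.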

The 2018 DAVIS Challenge on Video Object Segmentation

Title The 2018 DAVIS Challenge on Video Object Segmentation
Authors Sergi Caelles, Alberto Montes, Kevis-Kokitsi Maninis, Yuhua Chen, Luc Van Gool, Federico Perazzi, Jordi Pont-Tuset
Abstract We present the 2018 DAVIS Challenge on Video Object Segmentation, a public competition specifically designed for the task of video object segmentation. It builds upon the DAVIS 2017 dataset, which was presented in the previous edition of the DAVIS Challenge, and added 100 videos with multiple objects per sequence to the original DAVIS 2016 dataset. Motivated by the analysis of the results of the 2017 edition, the main track of the competition will be the same as in the previous edition (segmentation given the full mask of the objects in the first frame – semi-supervised scenario). This edition, however, also adds an interactive segmentation teaser track, where the participants will interact with a web service simulating the input of a human that provides scribbles to iteratively improve the result.
Tasks Interactive Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2018-03-01
URL http://arxiv.org/abs/1803.00557v2
PDF http://arxiv.org/pdf/1803.00557v2.pdf
PWC https://paperswithcode.com/paper/the-2018-davis-challenge-on-video-object
Repo
Framework