Paper Group ANR 679
Adaptive Statistical Learning with Bayesian Differential Privacy
Title | Adaptive Statistical Learning with Bayesian Differential Privacy |
Authors | Jun Zhao |
Abstract | In statistical learning, a dataset is often partitioned into two parts: the training set and the holdout (i.e., testing) set. For instance, the training set is used to learn a predictor, and then the holdout set is used to estimate the accuracy of the predictor on the true distribution. In practice, however, the holdout dataset is often reused, and the estimates tested on it are chosen adaptively based on the results of prior estimates, so the predictor may become dependent on the holdout set. Hence, overfitting may occur, and the learned models may not generalize well to unseen data. Prior studies have established connections between the stability of a learning algorithm and its ability to generalize, but the traditional notion of generalization is not robust to adaptive composition. Recently, Dwork et al. (NIPS, STOC, and Science, 2015) showed that a holdout dataset of i.i.d. samples can be reused in adaptive statistical learning if the estimates are perturbed and coordinated using techniques developed for differential privacy, a widely used notion for quantifying privacy. However, the results of Dwork et al. apply only to i.i.d. samples, whereas in practice correlations between data samples arise from various behavioral, social, and genetic relationships between users. Our results in adaptive statistical learning generalize those of Dwork et al. from i.i.d. data samples to arbitrarily correlated data. Specifically, we show that a holdout dataset of correlated samples can be reused in adaptive statistical learning if the estimates are perturbed and coordinated using techniques developed for Bayesian differential privacy, a privacy notion recently introduced by Yang et al. in SIGMOD 2015 to broaden the application scenarios of differential privacy when data records are correlated. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00765v1 |
https://arxiv.org/pdf/1911.00765v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-statistical-learning-with-bayesian |
Repo | |
Framework | |
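The perturb-and-coordinate idea the abstract builds on can be illustrated with a minimal sketch of a Thresholdout-style mechanism in the spirit of Dwork et al. This is the i.i.d. variant, not the paper's Bayesian-DP mechanism, and the threshold and noise scale are illustrative:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise by inverting the CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def thresholdout(train_vals, holdout_vals, threshold=0.04, scale=0.01, rng=None):
    """Answer one adaptive query (sketch): release the training-set estimate
    while it agrees with the holdout; otherwise release a noise-perturbed
    holdout estimate, limiting the information leaked about the holdout set."""
    rng = rng or random.Random(0)
    train_mean = sum(train_vals) / len(train_vals)
    holdout_mean = sum(holdout_vals) / len(holdout_vals)
    if abs(train_mean - holdout_mean) > threshold + laplace_noise(scale, rng):
        return holdout_mean + laplace_noise(scale, rng)
    return train_mean
```

Because the analyst mostly sees training-set values, repeated adaptive queries reveal little about the holdout set, which is what makes its reuse safe.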
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
Title | Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions |
Authors | Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley |
Abstract | While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process. The Paired Open-Ended Trailblazer (POET) algorithm introduced in this paper does just that: it pairs the generation of environmental challenges and the optimization of agents to solve those challenges. It simultaneously explores many different paths through the space of possible problems and solutions and, critically, allows these stepping-stone solutions to transfer between problems if better, catalyzing innovation. The term open-ended signifies the intriguing potential for algorithms like POET to continue to create novel and increasingly complex capabilities without bound. Our results show that POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved by direct optimization alone, or even through a direct-path curriculum-building control algorithm introduced to highlight the critical role of open-endedness in solving ambitious challenges. The ability to transfer solutions from one environment to another proves essential to unlocking the full potential of the system as a whole, demonstrating the unpredictable nature of fortuitous stepping stones. We hope that POET will inspire a new push towards open-ended discovery across many domains, where algorithms like POET can blaze a trail through their interesting possible manifestations and solutions. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.01753v3 |
http://arxiv.org/pdf/1901.01753v3.pdf | |
PWC | https://paperswithcode.com/paper/paired-open-ended-trailblazer-poet-endlessly |
Repo | |
Framework | |
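The pair-optimize-transfer loop described above can be sketched in a toy domain. Everything here is a hypothetical stand-in: environments are scalar difficulty targets and agents are scalar skills, not the paper's obstacle-course environments and neural-network controllers:

```python
import random

def poet_sketch(steps=50, seed=0):
    """Toy POET-style loop: each (environment, agent) pair holds a scalar
    difficulty and a scalar skill; an agent is best when skill == difficulty."""
    rng = random.Random(seed)
    pairs = [(0.1, 0.0)]                      # (difficulty, skill)

    def score(skill, diff):
        return -abs(skill - diff)

    for _ in range(steps):
        # 1. occasionally mutate an environment into a harder child
        if rng.random() < 0.3 and len(pairs) < 8:
            env, agent = pairs[rng.randrange(len(pairs))]
            pairs.append((env + rng.uniform(0.0, 0.5), agent))
        # 2. one local optimization step for every paired agent
        pairs = [(env, agent + 0.2 * (env - agent)) for env, agent in pairs]
        # 3. transfer: each environment adopts the best agent found anywhere
        agents = [agent for _, agent in pairs]
        pairs = [(env, max(agents, key=lambda a: score(a, env)))
                 for env, _ in pairs]
    return pairs
```

The transfer step is the key ingredient: a skill refined in one environment can seed progress in a harder one that local optimization alone would not reach.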
Accelerated Target Updates for Q-learning
Title | Accelerated Target Updates for Q-learning |
Authors | Bowen Weng, Huaqing Xiong, Wei Zhang |
Abstract | This paper studies accelerations in Q-learning algorithms. We propose an accelerated target update scheme by incorporating the historical iterates of Q functions. The idea is conceptually inspired by the momentum-based accelerated methods in the optimization theory. Conditions under which the proposed accelerated algorithms converge are established. The algorithms are validated using commonly adopted testing problems in reinforcement learning, including the FrozenLake grid world game, two discrete-time LQR problems from the Deepmind Control Suite, and the Atari 2600 games. Simulation results show that the proposed accelerated algorithms can improve the convergence performance compared with the vanilla Q-learning algorithm. |
Tasks | Atari Games, Q-Learning |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02841v2 |
https://arxiv.org/pdf/1905.02841v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-target-updates-for-q-learning |
Repo | |
Framework | |
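The target-update idea can be sketched in a tabular toy problem. This is a hedged sketch: the extrapolated target below is momentum-inspired but is not the paper's exact scheme, and the chain environment is illustrative:

```python
import random

def chain_step(s, a):
    """Toy 3-state chain: action 1 moves right; reaching state 2 pays 1 and ends."""
    if a == 1:
        s2 = s + 1
        return s2, (1.0 if s2 == 2 else 0.0), s2 == 2
    return s, 0.0, False

def momentum_q_learning(env_step, n_states, n_actions, episodes=500,
                        alpha=0.5, gamma=0.9, beta=0.3, eps=0.2, seed=0):
    """Tabular Q-learning whose bootstrap target extrapolates the current
    Q iterate with the previous one, in the spirit of momentum acceleration."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    Q_prev = [row[:] for row in Q]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = env_step(s, a)
            if done:
                target = r
            else:
                # momentum-style target: extrapolate using the previous iterate
                target = r + gamma * max(Q[s2][b] + beta * (Q[s2][b] - Q_prev[s2][b])
                                         for b in range(n_actions))
            Q_prev[s][a] = Q[s][a]          # remember the historical iterate
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

With discount 0.9 the optimal values on the chain are Q*(1, right) = 1 and Q*(0, right) = 0.9, which the iterates approach.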
An Essay on Optimization Mystery of Deep Learning
Title | An Essay on Optimization Mystery of Deep Learning |
Authors | Eugene Golikov |
Abstract | Despite the huge empirical success of deep learning, a theoretical understanding of the learning process of neural networks is still lacking. This is why some of its features seem “mysterious”. We emphasize two mysteries of deep learning: the generalization mystery and the optimization mystery. In this essay we review and draw connections between several selected works concerning the latter. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07187v1 |
https://arxiv.org/pdf/1905.07187v1.pdf | |
PWC | https://paperswithcode.com/paper/an-essay-on-optimization-mystery-of-deep |
Repo | |
Framework | |
Improved Adversarial Robustness by Reducing Open Space Risk via Tent Activations
Title | Improved Adversarial Robustness by Reducing Open Space Risk via Tent Activations |
Authors | Andras Rozsa, Terrance E. Boult |
Abstract | Adversarial examples contain small perturbations that can remain imperceptible to human observers but alter the behavior of even the best performing deep learning models and yield incorrect outputs. Since their discovery, adversarial examples have drawn significant attention in machine learning: researchers try to reveal the reasons for their existence and improve the robustness of machine learning models to adversarial perturbations. The state-of-the-art defense is adversarial training via projected gradient descent (PGD), which is computationally expensive and very time-consuming. We hypothesize that adversarial attacks exploit the open space risk of classic monotonic activation functions. This paper introduces the tent activation function with bounded open space risk and shows that tents make deep learning models more robust to adversarial attacks. We demonstrate on the MNIST dataset that a classifier with tents yields an average accuracy of 91.8% against six white-box adversarial attacks, which is more than 15 percentage points above the state of the art. On the CIFAR-10 dataset, our approach improves the average accuracy against the six white-box adversarial attacks to 73.5% from 41.8% achieved by adversarial training via PGD. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02435v1 |
https://arxiv.org/pdf/1908.02435v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-adversarial-robustness-by-reducing |
Repo | |
Framework | |
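The bounded activation at the core of the paper fits in a line. A common one-parameter form of the tent function is shown below (a sketch with a single width parameter delta; the exact parameterization used in the paper may differ), contrasted with a monotonic activation:

```python
def tent(x, delta=1.0):
    """Tent activation (sketch): peaks at x = 0 and is exactly zero for
    |x| >= delta, so inputs far from the training data produce no response,
    bounding open space risk."""
    return max(0.0, delta - abs(x))

def relu(x):
    """Monotonic baseline for comparison: response grows without bound."""
    return max(0.0, x)
```

An adversarial perturbation that pushes an input far from the data manifold drives every tent unit to zero, whereas ReLU units can respond arbitrarily strongly there.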
On the Metrics and Adaptation Methods for Domain Divergences of sEMG-based Gesture Recognition
Title | On the Metrics and Adaptation Methods for Domain Divergences of sEMG-based Gesture Recognition |
Authors | István Ketykó, Ferenc Kovács |
Abstract | We propose a new metric to measure domain divergence and a new domain adaptation method for time-series classification. The metric belongs to the class of probability-distribution-based metrics, is transductive, and does not assume the presence of source data samples. The two-stage method utilizes an improved autoregressive, RNN-based architecture with a deep/non-linear transformation. We assess our metric and the performance of our model in the context of sEMG/EMG-based gesture recognition under inter-session and inter-subject domain shifts. |
Tasks | Domain Adaptation, Gesture Recognition, Time Series, Time Series Classification |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08914v1 |
https://arxiv.org/pdf/1912.08914v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-metrics-and-adaptation-methods-for |
Repo | |
Framework | |
Classical Chinese Sentence Segmentation for Tomb Biographies of Tang Dynasty
Title | Classical Chinese Sentence Segmentation for Tomb Biographies of Tang Dynasty |
Authors | Chao-Lin Liu, Yi Chang |
Abstract | Tomb biographies of the Tang dynasty provide invaluable information about Chinese history. The original biographies are classical Chinese texts which contain neither word boundaries nor sentence boundaries. Relying on three published books of tomb biographies of the Tang dynasty, we investigated the effectiveness of employing machine-learning methods for algorithmically identifying the pauses and terminals of sentences in the biographies. We consider the segmentation task as a classification problem: Chinese characters that are and are not followed by a punctuation mark are classified into two categories. We applied a machine-learning mechanism, conditional random fields (CRF), to classify the characters (and words) in the texts, and we studied the contributions of selected types of lexical information to the resulting quality of the segmentation recommendations. This proposal, presented at the DH 2018 conference, discussed some of the basic experiments and their evaluations. By considering contextual information and employing heuristics provided by experts of Chinese literature, we achieved F1 measures better than 80%. More complex experiments that employ deep neural networks helped us further improve the results in recent work. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10606v1 |
https://arxiv.org/pdf/1908.10606v1.pdf | |
PWC | https://paperswithcode.com/paper/classical-chinese-sentence-segmentation-for |
Repo | |
Framework | |
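Casting segmentation as per-character classification can be made concrete with a small feature-extraction sketch. The feature set below is hypothetical, standing in for the lexical features studied in the paper; a sequence labeler such as a CRF would consume these features to predict whether a punctuation mark follows each character:

```python
def char_features(text, i, window=2):
    """Features (hypothetical set) for deciding whether the character at
    position i in unpunctuated text should be followed by a sentence
    pause or terminal: the character itself plus its neighbors."""
    feats = {"char": text[i]}
    for k in range(1, window + 1):
        feats["prev%d" % k] = text[i - k] if i - k >= 0 else "<s>"
        feats["next%d" % k] = text[i + k] if i + k < len(text) else "</s>"
    return feats
```

Each character position then becomes one training instance, labeled by whether the original punctuated edition places a mark after it.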
Classifying Pattern and Feature Properties to Get a $Θ(n)$ Checker and Reformulation for Sliding Time-Series Constraints
Title | Classifying Pattern and Feature Properties to Get a $Θ(n)$ Checker and Reformulation for Sliding Time-Series Constraints |
Authors | Nicolas Beldiceanu, Mats Carlsson, Claude-Guy Quimper, Maria-Isabel Restrepo-Ruiz |
Abstract | Given a sequence $\mathcal{X}$ of $n$ variables, a time-series constraint ctr using the Sum aggregator, and a sliding time-series constraint enforcing ctr on each sliding window of $m$ consecutive variables of $\mathcal{X}$, we describe a $\Theta(n)$ time-complexity checker, as well as a $\Theta(n)$ space-complexity reformulation, for such a sliding constraint. |
Tasks | Time Series |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01532v1 |
https://arxiv.org/pdf/1912.01532v1.pdf | |
PWC | https://paperswithcode.com/paper/classifying-pattern-and-feature-properties-to |
Repo | |
Framework | |
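The Θ(n) idea for a Sum-based sliding constraint can be sketched with a plain running window sum. This is a simplification: the paper's ctr sums a feature over pattern occurrences, while here each window's values are summed directly, with an illustrative interval bound:

```python
def check_sliding_sum(x, m, low, high):
    """Θ(n) checker (sketch): verify that the sum of every window of m
    consecutive values lies in [low, high]. A running sum updated in O(1)
    per position avoids recomputing each window from scratch."""
    if len(x) < m:
        return True
    s = sum(x[:m])
    if not (low <= s <= high):
        return False
    for i in range(m, len(x)):
        s += x[i] - x[i - m]   # slide the window by one position
        if not (low <= s <= high):
            return False
    return True
```

The naive checker recomputes each window in O(m), giving O(nm) overall; the running sum brings this down to the Θ(n) bound the paper targets.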
Hashtags are (not) judgemental: The untold story of Lok Sabha elections 2019
Title | Hashtags are (not) judgemental: The untold story of Lok Sabha elections 2019 |
Authors | Saurabh Gupta, Asmit Kumar Singh, Arun Balaji Buduru, Ponnurangam Kumaraguru |
Abstract | Hashtags in online social media have become a way for users to build communities around topics, promote opinions, and categorize messages. In the political context, users employ hashtags on Twitter to campaign for their parties, spread news, gain followers, or get a general sense of a discussion by following its hashtag. In the past, researchers have studied certain types and specific properties of hashtags by collecting large amounts of data around them. In this paper, we perform a large-scale empirical analysis of elections using only the hashtags shared on Twitter during the 2019 Lok Sabha elections in India. We study the trends and events that unfolded on the ground, the latent topics that uncover representative hashtags, and semantic similarity to relate hashtags to the election outcomes. We collect over 24 million hashtags and perform extensive experiments. First, we identify trending hashtags and cross-reference them with the tweets in our dataset to list notable events. Second, we use Latent Dirichlet Allocation to find topic patterns in the dataset. Finally, we use a skip-gram word-embedding model to find semantically similar hashtags. We propose a popularity metric and an influence metric to predict election outcomes using just the hashtags. Empirical results show that influence is a good measure for predicting the election outcome. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07151v1 |
https://arxiv.org/pdf/1909.07151v1.pdf | |
PWC | https://paperswithcode.com/paper/hashtags-are-not-judgemental-the-untold-story |
Repo | |
Framework | |
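The first step, finding trending hashtags, reduces to extraction and frequency ranking. A minimal sketch (the regex and the ranking-by-frequency rule are stand-ins for the paper's pipeline):

```python
import re
from collections import Counter

def trending_hashtags(tweets, k=3):
    """Extract hashtags from raw tweet text (case-folded) and return the
    k most frequent ones with their counts."""
    tags = [t.lower() for tweet in tweets for t in re.findall(r"#\w+", tweet)]
    return Counter(tags).most_common(k)
```

The ranked hashtags can then be cross-referenced against tweet timestamps to associate spikes with on-the-ground events.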
Image Classification on IoT Edge Devices: Profiling and Modeling
Title | Image Classification on IoT Edge Devices: Profiling and Modeling |
Authors | Salma Abdel Magid, Francesco Petrini, Behnam Dezfouli |
Abstract | With the advent of powerful, low-cost IoT systems, processing data closer to where the data originates, known as edge computing, has become an increasingly viable option. In addition to lowering the cost of networking infrastructures, edge computing reduces edge-cloud delay, which is essential for mission-critical applications. In this paper, we show the feasibility and study the performance of image classification using IoT devices. Specifically, we explore the relationships between various factors of image classification algorithms that may affect energy consumption, such as dataset size, image resolution, algorithm type, algorithm phase, and device hardware. Our experiments show a strong, positive linear relationship between three predictor variables, namely model complexity, image resolution, and dataset size, with respect to energy consumption. In addition, in order to provide a means of predicting the energy consumption of an edge device performing image classification, we investigate the use of three machine learning algorithms on the data generated from our experiments. The performance as well as the trade-offs of using linear regression, Gaussian process, and random forests are discussed and validated. Our results indicate that the random forest model outperforms the other two algorithms, with R-squared values of 0.95 and 0.79 on two different validation datasets. |
Tasks | Image Classification |
Published | 2019-02-24 |
URL | https://arxiv.org/abs/1902.11119v2 |
https://arxiv.org/pdf/1902.11119v2.pdf | |
PWC | https://paperswithcode.com/paper/image-classification-on-iot-edge-devices |
Repo | |
Framework | |
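The reported linear relationship can be modeled with the simplest of the three regressors. A one-predictor ordinary-least-squares sketch (the variable roles, e.g. image resolution as the predictor of energy, are illustrative):

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit y ≈ a*x + b for one predictor (sketch),
    e.g. modeling energy consumption as a linear function of image resolution.
    Returns the slope a and intercept b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx
```

The paper's better-performing Gaussian process and random forest models capture the interactions between several predictors that a single linear fit cannot.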
Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data
Title | Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data |
Authors | Mona Diab, Mahmoud Ghoneim, Abdelati Hawwari, Fahad AlGhamdi, Nada AlMarwani, Mohamed Al-Badrashiny |
Abstract | We present our effort to create a large multi-layered representational repository of linguistic code-switched Arabic data. The process involves developing clear annotation standards and guidelines, streamlining the annotation process, and implementing quality control measures. We used two main protocols for annotation: in-lab gold annotations and crowdsourced annotations. We developed a web-based annotation tool to facilitate the management of the annotation process. The current version of the repository contains a total of 886,252 tokens, each tagged with one of sixteen code-switching tags. The data exhibits code switching between Modern Standard Arabic and Egyptian Dialectal Arabic and represents three genres: tweets, commentaries, and discussion fora. The overall inter-annotator agreement is 93.1%. |
Tasks | |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.13009v1 |
https://arxiv.org/pdf/1909.13009v1.pdf | |
PWC | https://paperswithcode.com/paper/creating-a-large-multi-layered-1 |
Repo | |
Framework | |
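The overall agreement figure corresponds to a per-token comparison. A minimal percent-agreement sketch (the abstract does not state whether a chance-corrected measure was used, and the tag names below are hypothetical):

```python
def percent_agreement(tags_a, tags_b):
    """Per-token inter-annotator agreement (sketch): the fraction of tokens
    to which two annotators assign the same code-switching tag."""
    if len(tags_a) != len(tags_b):
        raise ValueError("annotations must cover the same tokens")
    return sum(a == b for a, b in zip(tags_a, tags_b)) / len(tags_a)
```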
Graph Kernels: A Survey
Title | Graph Kernels: A Survey |
Authors | Giannis Nikolentzos, Giannis Siglidis, Michalis Vazirgiannis |
Abstract | Graph kernels have attracted a lot of attention during the last decade, and have evolved into a rapidly developing branch of learning on structured data. During the past 20 years, the considerable research activity that occurred in the field resulted in the development of dozens of graph kernels, each focusing on specific structural properties of graphs. Graph kernels have proven successful in a wide range of domains, ranging from social networks to bioinformatics. The goal of this survey is to provide a unifying view of the literature on graph kernels. In particular, we present a comprehensive overview of a wide range of graph kernels. Furthermore, we perform an experimental evaluation of several of those kernels on publicly available datasets, and provide a comparative study. Finally, we discuss key applications of graph kernels, and outline some challenges that remain to be addressed. |
Tasks | Graph Classification |
Published | 2019-04-27 |
URL | http://arxiv.org/abs/1904.12218v1 |
http://arxiv.org/pdf/1904.12218v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-kernels-a-survey |
Repo | |
Framework | |
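As a flavor of the family the survey covers, the simplest graph kernels compare node-label histograms. A minimal sketch (the kernels surveyed in the paper exploit much richer structure, such as walks, subtrees, and shortest paths):

```python
from collections import Counter

def vertex_histogram_kernel(labels1, labels2):
    """Vertex histogram kernel (sketch): compare two labeled graphs by the
    dot product of their node-label histograms. It is a valid positive
    semi-definite kernel, usable directly with kernel methods such as SVMs."""
    h1, h2 = Counter(labels1), Counter(labels2)
    return sum(h1[label] * h2[label] for label in h1)
```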
Partial Label Learning with Self-Guided Retraining
Title | Partial Label Learning with Self-Guided Retraining |
Authors | Lei Feng, Bo An |
Abstract | Partial label learning deals with the problem where each training instance is assigned a set of candidate labels, only one of which is correct. This paper provides the first attempt to leverage the idea of self-training for dealing with partially labeled examples. Specifically, we propose a unified formulation with proper constraints to train the desired model and perform pseudo-labeling jointly. For pseudo-labeling, unlike traditional self-training, which manually identifies the ground-truth label with sufficiently high confidence, we introduce a maximum infinity-norm regularization on the model outputs to achieve this desideratum automatically, which results in a convex-concave optimization problem. We show that optimizing this convex-concave problem is equivalent to solving a set of quadratic programming (QP) problems. By proposing an upper-bound surrogate objective function, we turn to solving only one QP problem, improving the optimization efficiency. Extensive experiments on synthesized and real-world datasets demonstrate that the proposed approach significantly outperforms the state-of-the-art partial label learning approaches. |
Tasks | |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.03045v1 |
http://arxiv.org/pdf/1902.03045v1.pdf | |
PWC | https://paperswithcode.com/paper/partial-label-learning-with-self-guided |
Repo | |
Framework | |
There is no Artificial General Intelligence
Title | There is no Artificial General Intelligence |
Authors | J. Landgrebe, B. Smith |
Abstract | The goal of creating Artificial General Intelligence (AGI) – or in other words of creating Turing machines (modern computers) that can behave in a way that mimics human intelligence – has occupied AI researchers ever since the idea of AI was first proposed. One common theme in these discussions is the thesis that the ability of a machine to conduct convincing dialogues with human beings can serve as at least a sufficient criterion of AGI. We argue that this very ability should be accepted also as a necessary condition of AGI, and we provide a description of the nature of human dialogue in particular and of human language in general against this background. We then argue that it is for mathematical reasons impossible to program a machine in such a way that it could master human dialogue behaviour in its full generality. This is (1) because there are no traditional explicitly designed mathematical models that could be used as a starting point for creating such programs; and (2) because even the sorts of automated models generated by using machine learning, which have been used successfully in areas such as machine translation, cannot be extended to cope with human dialogue. If this is so, then we can conclude that a Turing machine also cannot possess AGI, because it fails to fulfil a necessary condition thereof. At the same time, however, we acknowledge the potential of Turing machines to master dialogue behaviour in highly restricted contexts, where what is called “narrow” AI can still be of considerable utility. |
Tasks | Machine Translation |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.05833v2 |
https://arxiv.org/pdf/1906.05833v2.pdf | |
PWC | https://paperswithcode.com/paper/there-is-no-general-ai-why-turing-machines |
Repo | |
Framework | |
Nonparametric feature extraction based on Minimax distance
Title | Nonparametric feature extraction based on Minimax distance |
Authors | Morteza Haghir Chehreghani |
Abstract | We investigate the use of Minimax distances to extract, in a nonparametric way, the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose framework to employ Minimax distances with many machine learning methods that operate on numerical data. For this purpose, we first compute the pairwise Minimax distances between the objects, using the equivalence of Minimax distances over a graph and over a minimum spanning tree constructed on it. Then, we embed the pairwise Minimax distances into a new vector space such that their squared Euclidean distances in the new space equal the pairwise Minimax distances in the original space. Next, we study the case of multiple pairwise Minimax matrices instead of a single one, and propose an embedding that first sums the centered matrices and then performs an eigenvalue decomposition. Finally, we perform several experimental studies to illustrate the effectiveness of our framework. |
Tasks | |
Published | 2019-04-27 |
URL | http://arxiv.org/abs/1904.13223v1 |
http://arxiv.org/pdf/1904.13223v1.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-feature-extraction-based-on |
Repo | |
Framework | |
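The first step of the framework rests on the equivalence between Minimax distances on a graph and on its minimum spanning tree. A sketch of that computation (Prim's algorithm over the complete graph defined by a symmetric distance matrix, then the largest edge on each MST path):

```python
def minimax_distances(D):
    """Pairwise Minimax distances (sketch) from a symmetric distance matrix D.
    The Minimax distance between two objects equals the largest edge weight
    on the path joining them in a minimum spanning tree of the complete graph."""
    n = len(D)
    # Prim's algorithm on the complete graph, collecting MST adjacency lists
    in_tree = [False] * n
    parent = [-1] * n
    cheapest = [float("inf")] * n   # cheapest edge connecting each node to the tree
    cheapest[0] = 0.0
    adj = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: cheapest[i])
        in_tree[u] = True
        if parent[u] >= 0:
            w = D[u][parent[u]]
            adj[u].append((parent[u], w))
            adj[parent[u]].append((u, w))
        for v in range(n):
            if not in_tree[v] and D[u][v] < cheapest[v]:
                cheapest[v], parent[v] = D[u][v], u
    # Minimax distance = max edge weight on the unique MST path (DFS per source)
    M = [[0.0] * n for _ in range(n)]
    for s in range(n):
        stack, seen = [(s, 0.0)], {s}
        while stack:
            u, m = stack.pop()
            M[s][u] = m
            for v, w in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, max(m, w)))
    return M
```

The resulting matrix M is what the framework then embeds into a vector space whose squared Euclidean distances reproduce the Minimax distances.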