April 2, 2020

3090 words 15 mins read

Paper Group ANR 111



When are Non-Parametric Methods Robust?

Title When are Non-Parametric Methods Robust?
Authors Robi Bhattacharjee, Kamalika Chaudhuri
Abstract A growing body of research has shown that many classifiers are susceptible to adversarial examples – small strategic modifications to test inputs that lead to misclassification. In this work, we study general non-parametric methods, with a view towards understanding when they are robust to these modifications. We establish general conditions under which non-parametric methods are r-consistent – in the sense that they converge to optimally robust and accurate classifiers in the large sample limit. Concretely, our results show that when data is well-separated, nearest neighbors and kernel classifiers are r-consistent, while histograms are not. For general data distributions, we prove that preprocessing by Adversarial Pruning (Yang et al., 2019) – which makes data well-separated – followed by nearest neighbors or kernel classifiers also leads to r-consistency.
Published 2020-03-13
URL https://arxiv.org/abs/2003.06121v1
PDF https://arxiv.org/pdf/2003.06121v1.pdf
PWC https://paperswithcode.com/paper/when-are-non-parametric-methods-robust
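The robustness notion studied above can be illustrated empirically. The following is a minimal sketch (not the paper's construction): a 1-nearest-neighbor classifier on well-separated data, probed along axis directions to check that its prediction is stable within a radius. The data, the radius, and the probing strategy are all illustrative.

```python
import math

def nn_predict(train, x):
    """1-nearest-neighbor prediction: label of the closest training point."""
    return min(train, key=lambda p: math.dist(p[0], x))[1]

def is_robust_at(train, x, r, steps=8):
    """Empirically check that the prediction is unchanged for
    perturbations of x up to radius r (probed along axis directions)."""
    base = nn_predict(train, x)
    for i in range(len(x)):
        for sign in (-1.0, 1.0):
            for k in range(1, steps + 1):
                x_adv = list(x)
                x_adv[i] += sign * r * k / steps
                if nn_predict(train, x_adv) != base:
                    return False
    return True

# Well-separated two-class data: the classes sit about 2.8 units apart.
train = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((2.0, 2.0), 1), ((2.1, 1.9), 1)]
robust = is_robust_at(train, (0.1, 0.0), r=0.5)
```

With well-separated classes the probe finds no label flip inside the radius; with overlapping classes it typically would, matching the separation condition in the abstract.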

Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching

Title Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Authors Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu
Abstract Transformer has been successfully applied to many natural language processing tasks. However, for textual sequence matching, simple matching between the representation of a pair of sequences might bring in unnecessary noise. In this paper, we propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels. Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks that rely only on pre-computed sequence-vector-representation, such as SNLI, MNLI-match, MNLI-mismatch, QQP, and SQuAD-binary.
Published 2020-01-20
URL https://arxiv.org/abs/2001.07234v1
PDF https://arxiv.org/pdf/2001.07234v1.pdf
PWC https://paperswithcode.com/paper/multi-level-head-wise-match-and-aggregation
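The head-wise matching idea can be sketched without a full Transformer. Below is a toy stand-in, assuming nothing about the paper's actual layers: each sequence vector is split into heads, per-head match features (difference and elementwise product) are computed, and heads are aggregated by concatenation in place of the paper's learned multi-level aggregation.

```python
def split_heads(vec, n_heads):
    """Split a flat representation vector into n_heads equal sub-vectors."""
    d = len(vec) // n_heads
    return [vec[i * d:(i + 1) * d] for i in range(n_heads)]

def headwise_match(a, b, n_heads):
    """For each head, build match features (difference and product),
    then aggregate heads by concatenation -- a toy stand-in for the
    paper's learned multi-level aggregation."""
    feats = []
    for ha, hb in zip(split_heads(a, n_heads), split_heads(b, n_heads)):
        diff = [x - y for x, y in zip(ha, hb)]
        prod = [x * y for x, y in zip(ha, hb)]
        feats.extend(diff + prod)
    return feats

a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, 3.0, 1.0]
match = headwise_match(a, b, n_heads=2)
```

The point of matching per head rather than on the pooled vectors is that each head can specialize, so its match features carry less of the "unnecessary noise" the abstract mentions.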

Adaptive Graph Convolutional Network with Attention Graph Clustering for Co-saliency Detection

Title Adaptive Graph Convolutional Network with Attention Graph Clustering for Co-saliency Detection
Authors Kaihua Zhang, Tengpeng Li, Shiwen Shen, Bo Liu, Jin Chen, Qingshan Liu
Abstract Co-saliency detection aims to discover the common and salient foregrounds from a group of relevant images. For this task, we present a novel adaptive graph convolutional network with attention graph clustering (GCAGC). Three major contributions have been made, and are experimentally shown to have substantial practical merits. First, we propose a graph convolutional network design to extract information cues to characterize the intra- and inter-image correspondence. Second, we develop an attention graph clustering algorithm to discriminate the common objects from all the salient foreground objects in an unsupervised fashion. Third, we present a unified framework with an encoder-decoder structure to jointly train and optimize the graph convolutional network, the attention graph clustering module, and the co-saliency detection decoder in an end-to-end manner. We evaluate our proposed GCAGC method on three co-saliency detection benchmark datasets (iCoseg, Cosal2015 and COCO-SEG). Our GCAGC method obtains significant improvements over the state of the art on most of them.
Tasks Co-Saliency Detection, Graph Clustering, Saliency Detection
Published 2020-03-13
URL https://arxiv.org/abs/2003.06167v1
PDF https://arxiv.org/pdf/2003.06167v1.pdf
PWC https://paperswithcode.com/paper/adaptive-graph-convolutional-network-with
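The propagation step at the heart of a graph convolutional network is simple to sketch. This is a generic single GCN layer (row-normalized adjacency with self-loops, linear map, ReLU), not the paper's specific architecture; all matrices are toy values.

```python
def matmul(A, B):
    """Naive matrix product for small dense matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: add self-loops, row-normalize the
    adjacency, propagate node features, apply a linear map and ReLU."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    norm = [[v / sum(row) for v in row] for row in a_hat]
    h = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in h]

adj = [[0, 1], [1, 0]]            # two connected nodes
feats = [[1.0, 0.0], [0.0, 1.0]]  # one-hot node features
weight = [[1.0, -1.0], [1.0, 1.0]]
out = gcn_layer(adj, feats, weight)
```

After one layer the two connected nodes share the same representation, which is exactly the smoothing that lets a GCN encode inter-image correspondence between nodes.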

A Group Norm Regularized LRR Factorization Model for Spectral Clustering

Title A Group Norm Regularized LRR Factorization Model for Spectral Clustering
Authors Xishun Wang, Zhouwang Yang, Xingye Yue, Hui Wang
Abstract Spectral clustering is an important and classic graph clustering method whose results depend heavily on the affinity matrix produced from the data. Solving Low-Rank Representation~(LRR) problems is a very effective way to obtain an affinity matrix. This paper proposes an LRR factorization model based on group norm regularization and uses the Augmented Lagrangian Method~(ALM) to solve it. We adopt group norm regularization to make the columns of the factor matrix sparse, thereby achieving low rank. Since no Singular Value Decomposition~(SVD) is required, the computational complexity of each step is greatly reduced. We obtain the affinity matrix from different LRR models and then perform clustering tests on synthetic noisy data and real data~(Hopkins155 and EYaleB). Compared to traditional models and algorithms, ours solves for the affinity matrix faster, is more robust to noise, and yields better final clustering results. Notably, the numerical results show that our algorithm converges very quickly, satisfying the convergence condition in only about ten steps. The group norm regularized LRR factorization model, together with the algorithm designed for it, is an effective and fast way to obtain a better affinity matrix.
Tasks Graph Clustering
Published 2020-01-08
URL https://arxiv.org/abs/2001.02568v1
PDF https://arxiv.org/pdf/2001.02568v1.pdf
PWC https://paperswithcode.com/paper/a-group-norm-regularized-lrr-factorization
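The column-sparsity mechanism behind the group norm can be sketched in isolation. The proximal operator of the sum of column l2-norms shrinks each column's norm and zeroes columns below the threshold; this is one building block an ALM solver would call repeatedly, not the paper's full algorithm, and the matrix and threshold here are illustrative.

```python
import math

def prox_group_norm(M, tau):
    """Proximal operator of tau * (sum of column l2-norms): shrink each
    column's norm by tau, zeroing columns whose norm is below tau."""
    cols = list(zip(*M))
    shrunk = []
    for col in cols:
        nrm = math.sqrt(sum(v * v for v in col))
        scale = max(0.0, 1.0 - tau / nrm) if nrm > 0 else 0.0
        shrunk.append([scale * v for v in col])
    return [list(row) for row in zip(*shrunk)]

M = [[3.0, 0.1], [4.0, 0.1]]
out = prox_group_norm(M, tau=1.0)
```

The strong first column survives (shrunk) while the weak second column is zeroed entirely; zeroed columns of the factor matrix are what reduce the effective rank without any SVD.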

Information Newton’s flow: second-order optimization method in probability space

Title Information Newton’s flow: second-order optimization method in probability space
Authors Yifei Wang, Wuchen Li
Abstract We introduce a framework for Newton’s flows in probability space with information metrics, named information Newton’s flows. Here two information metrics are considered, including both the Fisher-Rao metric and the Wasserstein-2 metric. Several examples of information Newton’s flows for learning objective/loss functions are provided, such as Kullback-Leibler (KL) divergence, Maximum mean discrepancy (MMD), and cross entropy. The asymptotic convergence results of the proposed Newton’s methods are provided. A known fact is that overdamped Langevin dynamics correspond to Wasserstein gradient flows of KL divergence. Extending this fact to Wasserstein Newton’s flows of KL divergence, we derive Newton’s Langevin dynamics. We provide examples of Newton’s Langevin dynamics in both one-dimensional space and Gaussian families. For the numerical implementation, we design sampling-efficient variational methods to approximate Wasserstein Newton’s directions. Several numerical examples in Gaussian families and Bayesian logistic regression are shown to demonstrate the effectiveness of the proposed method.
Published 2020-01-13
URL https://arxiv.org/abs/2001.04341v2
PDF https://arxiv.org/pdf/2001.04341v2.pdf
PWC https://paperswithcode.com/paper/information-newtons-flow-second-order
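The fast convergence Newton's methods buy can be shown with a toy finite-dimensional analogue (not the paper's Wasserstein Newton's flow): Newton iteration on the KL divergence between a parametric Gaussian N(m, s²) and the standard normal, where the gradient and Hessian in (m, s) have closed forms.

```python
import math

def kl_to_std_normal(m, s):
    """KL( N(m, s^2) || N(0, 1) ) in closed form."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

def newton_step(m, s):
    """One Newton step on (m, s); the Hessian of this KL is diagonal:
    grad = (m, s - 1/s), hess = diag(1, 1 + 1/s^2)."""
    gm, gs = m, s - 1.0 / s
    hm, hs = 1.0, 1.0 + 1.0 / (s * s)
    return m - gm / hm, s - gs / hs

m, s = 2.0, 3.0
for _ in range(10):
    m, s = newton_step(m, s)
```

The mean is corrected in a single step and the scale converges quadratically to 1, echoing the abstract's point that second-order information accelerates flows toward the target distribution.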

Composite Monte Carlo Decision Making under High Uncertainty of Novel Coronavirus Epidemic Using Hybridized Deep Learning and Fuzzy Rule Induction

Title Composite Monte Carlo Decision Making under High Uncertainty of Novel Coronavirus Epidemic Using Hybridized Deep Learning and Fuzzy Rule Induction
Authors Simon James Fong, Gloria Li, Nilanjan Dey, Ruben Gonzalez Crespo, Enrique Herrera-Viedma
Abstract Since the advent of the novel coronavirus epidemic in December 2019, governments and authorities have been struggling to make critical decisions under high uncertainty. Composite Monte-Carlo (CMC) simulation is a forecasting method that extrapolates available data, broken down from multiple correlated/causal micro-data sources, into many possible future outcomes by drawing random samples from probability distributions. For instance, the overall trend and propagation of infected cases in China are influenced by the temporal-spatial data of the cities around Wuhan (where the virus originated), in terms of population density, travel mobility, medical resources such as hospital beds, and the timeliness of quarantine control in each city. Hence a CMC is only as reliable as its underlying statistical distributions, which are supposed to represent the behaviour of future events, and the correctness of the composite data relationships. In this paper, we present a case study of using a CMC, enhanced by a deep learning network and fuzzy rule induction, to gain better stochastic insights into the development of the epidemic. Instead of applying the simplistic and uniform assumptions common in MC practice, a deep-learning-based CMC is used in conjunction with fuzzy rule induction techniques. As a result, decision makers benefit from better-fitted MC outputs complemented by min-max rules that foretell the extreme ranges of future possibilities with respect to the epidemic.
Tasks Decision Making
Published 2020-03-22
URL https://arxiv.org/abs/2003.09868v1
PDF https://arxiv.org/pdf/2003.09868v1.pdf
PWC https://paperswithcode.com/paper/composite-monte-carlo-decision-making-under
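The composite structure described above (several correlated micro-factors sampled and combined into one outcome distribution) can be sketched generically. All distributions and numbers below are illustrative placeholders, not the paper's fitted models.

```python
import random

def composite_mc(n_runs=10000, seed=0):
    """Toy composite Monte Carlo: combine two random micro-factors
    (a growth rate and a reporting delay) into a distribution of
    projected case counts, then report min/mean/max -- the min-max
    extreme ranges mentioned in the abstract."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_runs):
        growth = rng.gauss(1.10, 0.05)   # daily growth factor (illustrative)
        delay = rng.randint(2, 6)        # reporting delay in days
        cases = 100 * growth ** (14 - delay)
        outcomes.append(cases)
    return min(outcomes), sum(outcomes) / n_runs, max(outcomes)

lo, mean, hi = composite_mc()
```

The paper's enhancement replaces the hand-picked distributions above with deep-learning fits and post-processes the output range with fuzzy min-max rules; the sampling skeleton stays the same.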

Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study

Title Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study
Authors Xue Han, Lianxue Hu, Yabin Dang, Shivali Agarwal, Lijun Mei, Shaochun Li, Xin Zhou
Abstract Automatic process discovery from textual process documentations is highly desirable to reduce the time and cost of Business Process Management (BPM) implementation in organizations. However, existing automatic process discovery approaches mainly focus on identifying activities in the documentations. Deriving the structural relationships between activities, which is important in the whole process discovery scope, remains a challenge. In fact, a business process has a latent semantic hierarchical structure which defines different levels of detail to reflect the complex business logic. Recent findings in neural machine learning show that meaningful linguistic structure can be induced by joint language modeling and structure learning. Inspired by these findings, we propose to retrieve the latent hierarchical structure present in textual business process documents by building a neural network that leverages a novel recurrent architecture, Ordered Neurons LSTM (ON-LSTM), with a process-level language model objective. We tested the proposed approach on a data set of Process Description Documents (PDD) from our practical Robotic Process Automation (RPA) projects. Preliminary experiments showed promising results.
Tasks Language Modelling
Published 2020-01-05
URL https://arxiv.org/abs/2001.01243v1
PDF https://arxiv.org/pdf/2001.01243v1.pdf
PWC https://paperswithcode.com/paper/automatic-business-process-structure
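ON-LSTM's structural bias comes from its "master" gates, built with a cumulative softmax (cumax) so that higher-ranked neurons update less often than lower-ranked ones, inducing a hierarchy. The following sketches just the cumax primitive, with illustrative inputs; the full ON-LSTM cell is omitted.

```python
import math

def cumax(xs):
    """Cumulative softmax: cumsum(softmax(xs)). ON-LSTM uses this to
    build monotone 'master' gates whose components rise from ~0 to 1,
    imposing an ordering (hierarchy) over neurons."""
    mx = max(xs)                       # shift for numerical stability
    exps = [math.exp(x - mx) for x in xs]
    z = sum(exps)
    out, acc = [], 0.0
    for e in exps:
        acc += e / z
        out.append(acc)
    return out

gate = cumax([2.0, 0.5, 0.1, -1.0])
```

The resulting gate is monotone non-decreasing and ends at 1, so a split point in the gate cleanly separates "high-level" neurons (rarely overwritten) from "low-level" ones, which is what lets the model mirror a document's hierarchy.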

Weak Detection in the Spiked Wigner Model with General Rank

Title Weak Detection in the Spiked Wigner Model with General Rank
Authors Ji Hyung Jung, Hye Won Chung, Ji Oon Lee
Abstract We study the statistical decision process of detecting the presence of signal from a ‘signal+noise’ type matrix model with an additive Wigner noise. We derive the error of the likelihood ratio test, which minimizes the sum of the Type-I and Type-II errors, under the Gaussian noise for the signal matrix with arbitrary finite rank. We propose a hypothesis test based on the linear spectral statistics of the data matrix, which is optimal and does not depend on the distribution of the signal or the noise. We also introduce a test for rank estimation that does not require the prior information on the rank of the signal.
Published 2020-01-16
URL https://arxiv.org/abs/2001.05676v1
PDF https://arxiv.org/pdf/2001.05676v1.pdf
PWC https://paperswithcode.com/paper/weak-detection-in-the-spiked-wigner-model
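A linear spectral statistic can separate a spiked matrix from a pure-noise one without eigendecomposition. The sketch below uses the sum of squared eigenvalues (the squared Frobenius norm) as a stand-in statistic; the paper's test uses a different, optimal combination of spectral statistics, and the sizes and spike strength here are illustrative.

```python
import random

def sym_wigner(n, rng):
    """Symmetric Gaussian Wigner matrix with entry variance 1/n."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = rng.gauss(0.0, (1.0 / n) ** 0.5)
            m[i][j] = m[j][i] = v
    return m

def trace_sq(m):
    """A simple linear spectral statistic: the sum of squared
    eigenvalues equals the squared Frobenius norm, so no
    eigendecomposition is needed."""
    return sum(v * v for row in m for v in row)

rng = random.Random(1)
n, lam = 50, 3.0
h = sym_wigner(n, rng)
x = [1.0 / n ** 0.5] * n   # unit-norm spike direction
spiked = [[h[i][j] + lam * x[i] * x[j] for j in range(n)] for i in range(n)]
stat_null, stat_alt = trace_sq(h), trace_sq(spiked)
```

The rank-one spike shifts the statistic by roughly lambda squared, which is what a threshold test exploits; "weak detection" concerns the regime where this shift is comparable to the statistic's own fluctuations.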

SafeNet: An Assistive Solution to Assess Incoming Threats for Premises

Title SafeNet: An Assistive Solution to Assess Incoming Threats for Premises
Authors Shahinur Alam, Md Sultan Mahmud, Mohammed Yeasin
Abstract An assistive solution to assess incoming threats (e.g., robbery, burglary, gun violence) for homes will enhance the safety of people with or without disabilities. This paper presents “SafeNet” - an integrated assistive system to generate context-oriented image descriptions to assess incoming threats. The key functionality of the system includes the detection and identification of humans and the generation of image descriptions from real-time video streams obtained from cameras placed in strategic locations around the house. In this paper, we focus on developing a robust model called “SafeNet” to generate image descriptions. To interact with the system, we implemented a dialog-enabled interface for creating a personalized profile from face images or videos of friends/families. To improve computational efficiency, we apply change detection to filter out frames that do not have any activity, use Faster-RCNN to detect human presence, and extract faces using Multitask Cascaded Convolutional Networks (MTCNN). Subsequently, we apply LBP/FaceNet to identify a person. SafeNet sends users an MMS containing the person’s name if a match is found (or “Unknown” otherwise), the scene image, a facial description, and contextual information. SafeNet identifies friends/families/caregivers versus intruders/unknowns with an average F-score of 0.97 and generates image descriptions from 10 classes with an average F-measure of 0.97.
Published 2020-01-27
URL https://arxiv.org/abs/2002.04405v1
PDF https://arxiv.org/pdf/2002.04405v1.pdf
PWC https://paperswithcode.com/paper/safenet-an-assistive-solution-to-assess
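The change-detection pre-filter in the pipeline above is easy to sketch on its own: frames that barely differ from the last kept frame are dropped before any expensive detection runs. This is a generic frame-differencing sketch, not SafeNet's implementation; the tiny 2x2 "frames" and threshold are illustrative.

```python
def frame_diff(f1, f2):
    """Mean absolute pixel difference between two grayscale frames."""
    n = len(f1) * len(f1[0])
    return sum(abs(a - b)
               for r1, r2 in zip(f1, f2)
               for a, b in zip(r1, r2)) / n

def active_frames(frames, threshold=5.0):
    """Keep only frames that differ enough from the last kept frame --
    the change-detection pre-filter; detection and face recognition
    would run downstream on the survivors only."""
    kept = [0]
    for i in range(1, len(frames)):
        if frame_diff(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept

still = [[10, 10], [10, 10]]
moved = [[10, 10], [10, 200]]
frames = [still, still, moved, moved]
kept = active_frames(frames)
```

Only the first frame and the frame where motion appears survive, so the heavy Faster-RCNN/MTCNN stages process a fraction of the stream.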

Detecting Network Anomalies using Rule-based machine learning within SNMP-MIB dataset

Title Detecting Network Anomalies using Rule-based machine learning within SNMP-MIB dataset
Authors Abdalrahman Hwoij, Mouhammd Al-kasassbeh, Mustafa Al-Fayoumi
Abstract One of the most effective threats used by cybercriminals to degrade network performance is the Denial of Service (DOS) attack, which can greatly damage data security, completeness, and efficiency. This paper develops a network traffic system that relies on an adopted dataset to differentiate DOS attacks from normal traffic. The detection model is built with five rule-based machine learning classifiers (DecisionTable, JRip, OneR, PART and ZeroR). The findings show that the ICMP variables enable the identification of ICMP attacks, HTTP flood attacks, and Slowloris at a high accuracy of approximately 99.7% using the PART classifier. In addition, the PART classifier succeeds in distinguishing normal traffic from the different DOS attacks with 100% accuracy.
Published 2020-01-18
URL https://arxiv.org/abs/2002.02368v1
PDF https://arxiv.org/pdf/2002.02368v1.pdf
PWC https://paperswithcode.com/paper/detecting-network-anomalies-using-rule-based
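Of the five classifiers listed, OneR is the simplest to sketch: it builds a rule table on every single attribute and keeps the one attribute whose table makes the fewest training errors. The toy traffic records below are illustrative, not drawn from the SNMP-MIB dataset.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """OneR: for each attribute, map every observed value to its
    majority label; keep the single attribute whose rule table errs
    least on the training data."""
    best = None
    for attr in range(len(rows[0])):
        table = defaultdict(Counter)
        for row, y in zip(rows, labels):
            table[row[attr]][y] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in table.items()}
        errors = sum(rule[row[attr]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    return best  # (training errors, chosen attribute index, value->label rule)

# Toy traffic records: (protocol, rate); labels: attack or normal.
rows = [("icmp", "high"), ("icmp", "high"), ("tcp", "low"), ("tcp", "high")]
labels = ["attack", "attack", "normal", "normal"]
errors, attr, rule = one_r(rows, labels)
```

Here protocol alone classifies the toy data perfectly, so OneR picks it; richer learners like PART generalize this idea to multi-condition rule lists.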

On the comparability of Pre-trained Language Models

Title On the comparability of Pre-trained Language Models
Authors Matthias Aßenmacher, Christian Heumann
Abstract Recent developments in unsupervised representation learning have successfully established the concept of transfer learning in NLP. Mainly three forces are driving the improvements in this area of research: More elaborate architectures are making better use of contextual information. Instead of simply plugging in static pre-trained representations, these are learned from surrounding context in end-to-end trainable models with more intelligently designed language modelling objectives. Along with this, larger corpora are used as resources for pre-training large language models in a self-supervised fashion, which are afterwards fine-tuned on supervised tasks. Advances in parallel and cloud computing have made it possible to train these models with growing capacities in the same or even shorter time than previously established models. These three developments agglomerate in new state-of-the-art (SOTA) results being revealed at an ever higher frequency. It is not always obvious where these improvements originate from, as it is not possible to completely disentangle the contributions of the three driving forces. We set out to provide a clear and concise overview of several large pre-trained language models, which achieved SOTA results in the last two years, with respect to their use of new architectures and resources. We want to clarify for the reader where the differences between the models are, and we furthermore attempt to gain some insight into the individual contributions of lexical/computational improvements as well as of architectural changes. We explicitly do not intend to quantify these contributions, but rather see our work as an overview that identifies potential starting points for benchmark comparisons. Furthermore, we tentatively point out potential avenues for improvement in the field of open-sourcing and reproducible research.
Tasks Language Modelling, Representation Learning, Transfer Learning, Unsupervised Representation Learning
Published 2020-01-03
URL https://arxiv.org/abs/2001.00781v1
PDF https://arxiv.org/pdf/2001.00781v1.pdf
PWC https://paperswithcode.com/paper/on-the-comparability-of-pre-trained-language

Consensus-Based Optimization on the Sphere I: Well-Posedness and Mean-Field Limit

Title Consensus-Based Optimization on the Sphere I: Well-Posedness and Mean-Field Limit
Authors Massimo Fornasier, Hui Huang, Lorenzo Pareschi, Philippe Sünnen
Abstract We introduce a new stochastic Kuramoto-Vicsek-type model for global optimization of nonconvex functions on the sphere. This model belongs to the class of Consensus-Based Optimization methods. In fact, particles move on the sphere driven by a drift towards an instantaneous consensus point, computed as a convex combination of the particle locations weighted by the cost function according to Laplace’s principle. The consensus point represents an approximation to a global minimizer. The dynamics is further perturbed by a random vector field to favor exploration, whose variance is a function of the distance of the particles to the consensus point. In particular, as soon as consensus is reached, the stochastic component vanishes. In this paper, we study the well-posedness of the model and we rigorously derive its mean-field approximation in the large-particle limit.
Published 2020-01-31
URL https://arxiv.org/abs/2001.11994v3
PDF https://arxiv.org/pdf/2001.11994v3.pdf
PWC https://paperswithcode.com/paper/consensus-based-optimization-on-the-sphere-i
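The drift-to-consensus dynamics can be sketched on the unit circle (the 1-sphere). The sketch below drops the stochastic exploration term so the run is deterministic, and the cost, weights, and step size are illustrative: particles repeatedly move toward the Laplace-principle consensus point and are projected back onto the circle.

```python
import math

def cost(p):
    """Objective on the unit circle: squared distance to a fixed target."""
    tx, ty = math.cos(1.0), math.sin(1.0)
    return (p[0] - tx) ** 2 + (p[1] - ty) ** 2

def normalize(p):
    """Project a point back onto the unit circle."""
    n = math.hypot(p[0], p[1])
    return (p[0] / n, p[1] / n)

def consensus(pts, beta=100.0):
    """Laplace-principle consensus: exponentially cost-weighted average,
    concentrating on the current best particles as beta grows."""
    ws = [math.exp(-beta * cost(p)) for p in pts]
    z = sum(ws)
    return (sum(w * p[0] for w, p in zip(ws, pts)) / z,
            sum(w * p[1] for w, p in zip(ws, pts)) / z)

# Particles at a few angles on the circle; noise term omitted here.
pts = [normalize((math.cos(t), math.sin(t))) for t in (0.0, 0.5, 2.0, 3.0)]
best0 = min(cost(p) for p in pts)
for _ in range(50):
    c = consensus(pts)
    pts = [normalize((p[0] + 0.3 * (c[0] - p[0]),
                      p[1] + 0.3 * (c[1] - p[1]))) for p in pts]
final = cost(consensus(pts))
```

The particles collapse onto a low-cost consensus point; in the full model the vanishing noise term plays exactly the role the abstract describes, exploring while consensus has not yet formed.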

Incremental Monoidal Grammars

Title Incremental Monoidal Grammars
Authors Dan Shiebler, Alexis Toumi, Mehrnoosh Sadrzadeh
Abstract In this work we define formal grammars in terms of free monoidal categories, along with a functor from the category of formal grammars to the category of automata. Generalising from the Booleans to arbitrary semirings, we extend our construction to weighted formal grammars and weighted automata. This allows us to link the categorical viewpoint on natural language to the standard machine learning notion of probabilistic language model.
Tasks Language Modelling
Published 2020-01-02
URL https://arxiv.org/abs/2001.02296v2
PDF https://arxiv.org/pdf/2001.02296v2.pdf
PWC https://paperswithcode.com/paper/incremental-monoidal-grammars
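The generalization from Booleans to arbitrary semirings can be made concrete with a weighted automaton whose evaluation is parameterized by the semiring operations. This is a generic sketch of that idea, not the paper's categorical construction; the two-state automaton below is illustrative.

```python
def run_automaton(word, init, trans, final, add, mul, zero):
    """Evaluate a weighted automaton over an arbitrary semiring:
    combine the weights of all paths with (add, mul)."""
    weights = dict(init)  # state -> accumulated weight
    for sym in word:
        nxt = {}
        for (q, a, r), w in trans.items():
            if a == sym and q in weights:
                nxt[r] = add(nxt.get(r, zero), mul(weights[q], w))
        weights = nxt
    total = zero
    for q, w in final.items():
        if q in weights:
            total = add(total, mul(weights[q], w))
    return total

# Automaton for words over {a, b} ending in 'b': states 0 and 1.
trans_bool = {(0, "a", 0): True, (0, "b", 1): True,
              (1, "a", 0): True, (1, "b", 1): True}
accepts = run_automaton("aab", {0: True}, trans_bool, {1: True},
                        lambda x, y: x or y, lambda x, y: x and y, False)

# The same automaton shape over the probability semiring gives a
# toy probabilistic language model.
trans_prob = {(0, "a", 0): 0.6, (0, "b", 1): 0.4,
              (1, "a", 0): 0.6, (1, "b", 1): 0.4}
prob = run_automaton("aab", {0: 1.0}, trans_prob, {1: 1.0},
                     lambda x, y: x + y, lambda x, y: x * y, 0.0)
```

Swapping only the semiring turns the Boolean recognizer into a string-probability computation, which is the link to probabilistic language modeling the abstract draws.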

Enhancing the Monte Carlo Tree Search Algorithm for Video Game Testing

Title Enhancing the Monte Carlo Tree Search Algorithm for Video Game Testing
Authors Sinan Ariyurek, Aysu Betin-Can, Elif Surer
Abstract In this paper, we study the effects of several Monte Carlo Tree Search (MCTS) modifications for video game testing. Although MCTS modifications are well studied in game playing, their impact on bug finding has not been explored. We focused on bug finding in our previous study, where we introduced synthetic and human-like test goals and used these test goals in Sarsa and MCTS agents to find bugs. In this study, we extend the MCTS agent with several modifications for game testing purposes. Furthermore, we present a novel tree reuse strategy. We experiment with these modifications by testing them on three testbed games, four levels each, that contain 45 bugs in total. We use the General Video Game Artificial Intelligence (GVG-AI) framework to create the testbed games and collect 427 human tester trajectories using the GVG-AI framework. We analyze the proposed modifications in three parts: we evaluate their effects on the bug finding performance of the agents, we measure their success under two different computational budgets, and we assess their effects on the human-likeness of the human-like agent. Our results show that MCTS modifications improve the bug finding performance of the agents.
Published 2020-03-17
URL https://arxiv.org/abs/2003.07813v1
PDF https://arxiv.org/pdf/2003.07813v1.pdf
PWC https://paperswithcode.com/paper/enhancing-the-monte-carlo-tree-search
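The selection step that MCTS modifications typically tune is UCB1, which trades off a child's average reward against how rarely it has been visited. This is the standard UCB1 formula, not one of the paper's specific modifications; the statistics below are illustrative.

```python
import math

def ucb1(node_visits, child_stats, c=1.4):
    """UCB1 child selection used in MCTS: mean reward (exploitation)
    plus an exploration bonus that grows for rarely-visited children.
    child_stats is a list of (total_reward, visits); an unvisited
    child is selected outright."""
    best, best_score = None, float("-inf")
    for i, (reward, visits) in enumerate(child_stats):
        if visits == 0:
            return i
        score = reward / visits + c * math.sqrt(math.log(node_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best

# A well-explored promising child vs an under-explored one.
pick_explore = ucb1(100, [(50.0, 60), (1.0, 1)])
# Two well-explored children: the higher mean wins.
pick_exploit = ucb1(100, [(50.0, 60), (10.0, 40)])
```

For game testing, tilting this balance toward exploration matters because bugs tend to hide in rarely-visited states rather than on reward-maximizing paths.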

Ensemble Grammar Induction For Detecting Anomalies in Time Series

Title Ensemble Grammar Induction For Detecting Anomalies in Time Series
Authors Yifeng Gao, Jessica Lin, Constantin Brif
Abstract Time series anomaly detection is an important task, with applications in a broad variety of domains. Many approaches have been proposed in recent years, but often they require that the length of the anomalies be known in advance and provided as an input parameter. This limits the practicality of the algorithms, as such information is often unknown in advance, or anomalies with different lengths might co-exist in the data. To address this limitation, a linear-time anomaly detection algorithm based on grammar induction was previously proposed. While the algorithm can find variable-length patterns, it still requires preselecting values for at least two parameters at the discretization step. How to choose these parameter values properly is still an open problem. In this paper, we introduce a grammar-induction-based anomaly detection method utilizing ensemble learning. Instead of using a particular choice of parameter values for anomaly detection, the method generates the final result based on a set of results obtained using different parameter values. We demonstrate that the proposed ensemble approach can outperform existing grammar-induction-based approaches with different criteria for selection of parameter values. We also show that the proposed approach can achieve performance similar to that of the state-of-the-art distance-based anomaly detection algorithm.
Tasks Anomaly Detection, Time Series
Published 2020-01-29
URL https://arxiv.org/abs/2001.11102v1
PDF https://arxiv.org/pdf/2001.11102v1.pdf
PWC https://paperswithcode.com/paper/ensemble-grammar-induction-for-detecting
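The ensemble-over-parameters idea can be sketched with a much simpler scorer than grammar induction: discretize the series under several (window, alphabet) settings, score each window by how rare its symbol word is, and average the scores instead of committing to one discretization. The series, parameter grid, and rarity scorer are all illustrative stand-ins.

```python
from collections import Counter

def discretize(series, alphabet):
    """Map each value to one of `alphabet` equal-width symbol bins."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / alphabet or 1.0
    return [min(int((v - lo) / width), alphabet - 1) for v in series]

def rarity_scores(series, window, alphabet):
    """Score each sliding window by how rarely its discretized word
    occurs -- a toy stand-in for a grammar-induction-based scorer."""
    syms = discretize(series, alphabet)
    words = [tuple(syms[i:i + window]) for i in range(len(syms) - window + 1)]
    counts = Counter(words)
    return [1.0 / counts[w] for w in words]

def ensemble_scores(series, params):
    """Average the scores across several (window, alphabet) settings
    instead of preselecting one -- the ensemble idea."""
    all_scores = [rarity_scores(series, w, a) for w, a in params]
    n = min(len(s) for s in all_scores)
    return [sum(s[i] for s in all_scores) / len(all_scores)
            for i in range(n)]

# Periodic signal with one planted anomaly at index 8.
series = [0, 1, 0, 1, 0, 1, 0, 1, 9, 1, 0, 1, 0, 1, 0, 1]
scores = ensemble_scores(series, [(2, 3), (3, 3)])
anomaly_at = scores.index(max(scores))
```

No single parameter setting has to be right: settings that miss the anomaly contribute flat scores, while those that catch it dominate the average, so the planted spike still surfaces.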