Paper Group ANR 993
Symbolic regression based genetic approximations of the Colebrook equation for flow friction
Title | Symbolic regression based genetic approximations of the Colebrook equation for flow friction |
Authors | Pavel Praks, Dejan Brkic |
Abstract | Widely used in hydraulics, the Colebrook equation for flow friction implicitly relates the input parameters, the Reynolds number and the relative roughness of the inner pipe surface, to the unknown output parameter, the flow friction factor. In this paper, a few explicit approximations to the Colebrook equation are generated using the ability of artificial intelligence to build internal patterns connecting input and output parameters explicitly, without knowing their nature or the physical law that relates them, but only the raw numbers. The fact that the genetic programming tool used does not know the structure of the Colebrook equation, which is based on the computationally expensive logarithmic law, is exploited to obtain approximations whose structure is less demanding to compute yet still sufficiently accurate. All generated approximations have low computational cost because they contain only a limited number of logarithmic forms, used either for normalization of the input parameters or for acceleration. The relative error of the friction factor is, in the best case, up to 0.13% with only two logarithmic forms used. Because the second logarithm can be accurately approximated by a Padé approximation, practically the same error is obtained using only one logarithm. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.10394v1 |
http://arxiv.org/pdf/1808.10394v1.pdf | |
PWC | https://paperswithcode.com/paper/symbolic-regression-based-genetic |
Repo | |
Framework | |
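For reference, the implicit Colebrook-White relation that the paper approximates can be solved numerically by plain fixed-point iteration. The sketch below does exactly that; it only illustrates the implicit form and its per-iteration logarithm cost, and does not reproduce the paper's genetic approximations; variable names are illustrative.

```python
import math

def colebrook_friction_factor(reynolds: float, rel_roughness: float,
                              tol: float = 1e-10, max_iter: int = 100) -> float:
    """Solve 1/sqrt(f) = -2*log10(eps/3.7 + 2.51/(Re*sqrt(f))) by fixed-point
    iteration, where `reynolds` is the Reynolds number and `rel_roughness`
    the relative roughness eps/D of the inner pipe surface."""
    x = 0.02 ** -0.5  # initial guess for 1/sqrt(f)
    for _ in range(max_iter):
        x_new = -2.0 * math.log10(rel_roughness / 3.7 + 2.51 * x / reynolds)
        if abs(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return 1.0 / (x * x)  # friction factor f

# Example: turbulent flow with Re = 1e5 and relative roughness 1e-4
print(colebrook_friction_factor(1e5, 1e-4))  # approx. 0.0185
```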
Log Skeletons: A Classification Approach to Process Discovery
Title | Log Skeletons: A Classification Approach to Process Discovery |
Authors | H. M. W. Verbeek, R. Medeiros de Carvalho |
Abstract | To test the effectiveness of process discovery algorithms, a Process Discovery Contest (PDC) has been set up. This PDC uses a classification approach to measure this effectiveness: The better the discovered model can classify whether or not a new trace conforms to the event log, the better the discovery algorithm is supposed to be. Unfortunately, even the state-of-the-art fully-automated discovery algorithms score poorly on this classification. Even the best of these algorithms, the Inductive Miner, scored only 147 correctly classified traces out of 200 on the PDC of 2017. This paper introduces the rule-based log skeleton model, which is closely related to the Declare constraint model, together with a way to classify traces using this model. This classification using log skeletons is shown to score better on the PDC of 2017 than state-of-the-art discovery algorithms: 194 out of 200. As a result, one can argue that the fully-automated algorithm to construct (or: discover) a log skeleton from an event log outperforms existing state-of-the-art fully-automated discovery algorithms. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08247v1 |
http://arxiv.org/pdf/1806.08247v1.pdf | |
PWC | https://paperswithcode.com/paper/log-skeletons-a-classification-approach-to |
Repo | |
Framework | |
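To make the classification idea concrete, here is a rough Python sketch that mines a trivial "eventually follows" rule set from a training log and rejects traces that violate it. It is only a stand-in for the actual log skeleton relations (which are richer and Declare-like); the helper names and the toy log are hypothetical.

```python
def mine_response_constraints(log):
    """For each activity a, find the activities b such that in every training
    trace, every occurrence of a is eventually followed by an occurrence of b.
    A crude stand-in for the richer log-skeleton relations."""
    activities = {a for trace in log for a in trace}
    response = {a: set(activities) - {a} for a in activities}
    for trace in log:
        for i, a in enumerate(trace):
            response[a] &= set(trace[i + 1:])
    return activities, response

def conforms(trace, activities, response):
    """Classify a trace: reject unknown activities or violated response rules."""
    if any(a not in activities for a in trace):
        return False
    for i, a in enumerate(trace):
        if not response[a] <= set(trace[i + 1:]):
            return False
    return True

log = [["a", "b", "c"], ["a", "c", "b", "c"]]
acts, resp = mine_response_constraints(log)
print(conforms(["a", "b", "c"], acts, resp))  # True
print(conforms(["a", "b"], acts, resp))       # False: 'a' must be followed by 'c'
```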
Gradient Band-based Adversarial Training for Generalized Attack Immunity of A3C Path Finding
Title | Gradient Band-based Adversarial Training for Generalized Attack Immunity of A3C Path Finding |
Authors | Tong Chen, Wenjia Niu, Yingxiao Xiang, Xiaoxuan Bai, Jiqiang Liu, Zhen Han, Gang Li |
Abstract | As adversarial attacks pose a serious threat to the security of AI systems in practice, such attacks have been extensively studied in the context of computer vision applications. However, little attention has been paid to adversarial research on automatic path finding. In this paper, we show that dominant adversarial examples are effective when targeting A3C path finding, and design a Common Dominant Adversarial Examples Generation Method (CDG) to generate dominant adversarial examples against any given map. In addition, we propose Gradient Band-based Adversarial Training, which is trained with a single randomly chosen dominant adversarial example, without any modification, to realize “1:N” attack immunity for generalized dominant adversarial examples. Extensive experimental results show that the lowest generation precision of the CDG algorithm is 91.91%, and the lowest immune precision of Gradient Band-based Adversarial Training is 93.89%, which demonstrates that our method can realize generalized attack immunity of A3C path finding with high confidence. |
Tasks | |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06752v1 |
http://arxiv.org/pdf/1807.06752v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-band-based-adversarial-training-for |
Repo | |
Framework | |
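Neither CDG nor the Gradient Band construction is spelled out in the abstract, so the sketch below only illustrates the general pattern of adversarial training (perturb inputs along the loss gradient, then mix the perturbed copies back into training) on a toy logistic model. It is a generic stand-in, not the paper's method, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_example(x, y, w, eps=0.1):
    """Perturb one input along the sign of the input-gradient of the logistic
    loss (an FGSM-style step; a generic stand-in, not the paper's CDG)."""
    grad_x = (sigmoid(x @ w) - y) * w  # d(loss)/d(x) for logistic regression
    return x + eps * np.sign(grad_x)

def train(X, y, adversarial=False, lr=0.1, epochs=200, eps=0.1):
    """Plain logistic regression; optionally augment each epoch with
    adversarially perturbed copies of the training points."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        X_train, y_train = X, y
        if adversarial:
            X_adv = np.array([adversarial_example(xi, yi, w, eps)
                              for xi, yi in zip(X, y)])
            X_train = np.vstack([X, X_adv])
            y_train = np.concatenate([y, y])
        grad_w = X_train.T @ (sigmoid(X_train @ w) - y_train) / len(y_train)
        w -= lr * grad_w
    return w

# Toy data: two Gaussian blobs in 2-D
X = np.vstack([rng.normal(-1.0, 0.3, (50, 2)), rng.normal(1.0, 0.3, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
print("weights after adversarial training:", train(X, y, adversarial=True))
```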
What are the biases in my word embedding?
Title | What are the biases in my word embedding? |
Authors | Nathaniel Swinger, Maria De-Arteaga, Neil Thomas Heffernan IV, Mark DM Leiserson, Adam Tauman Kalai |
Abstract | This paper presents an algorithm for enumerating biases in word embeddings. The algorithm exposes a large number of offensive associations related to sensitive features such as race and gender on publicly available embeddings, including a supposedly “debiased” embedding. These biases are concerning in light of the widespread use of word embeddings. The associations are identified by geometric patterns in word embeddings that run parallel between people’s names and common lower-case tokens. The algorithm is highly unsupervised: it does not even require the sensitive features to be pre-specified. This is desirable because: (a) many forms of discrimination–such as racial discrimination–are linked to social constructs that may vary depending on the context, rather than to categories with fixed definitions; and (b) it makes it easier to identify biases against intersectional groups, which depend on combinations of sensitive features. The inputs to our algorithm are a list of target tokens, e.g. names, and a word embedding. It outputs a number of Word Embedding Association Tests (WEATs) that capture various biases present in the data. We illustrate the utility of our approach on publicly available word embeddings and lists of names, and evaluate its output using crowdsourcing. We also show how removing names may not remove potential proxy bias. |
Tasks | Word Embeddings |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08769v4 |
https://arxiv.org/pdf/1812.08769v4.pdf | |
PWC | https://paperswithcode.com/paper/what-are-the-biases-in-my-word-embedding |
Repo | |
Framework | |
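The algorithm's output is a set of Word Embedding Association Tests (WEATs). As a reminder of what a WEAT measures, here is a small sketch of the standard WEAT effect size (following the usual Caliskan et al. formulation) on toy vectors; with a real embedding the arrays would be looked-up word vectors.

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    """s(w, A, B): mean cosine similarity of w with attribute set A minus B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Standardized difference of mean associations of the target sets X and Y
    with the attribute sets A and B."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)

# Toy 2-D "embedding": two target sets (e.g. groups of names) and two attribute sets
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
A = [np.array([1.0, 0.0])]   # e.g. "pleasant" attribute words
B = [np.array([0.0, 1.0])]   # e.g. "unpleasant" attribute words
print(weat_effect_size(X, Y, A, B))
```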
The Limits of Post-Selection Generalization
Title | The Limits of Post-Selection Generalization |
Authors | Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman |
Abstract | While statistics and machine learning offer numerous methods for ensuring generalization, these methods often fail in the presence of adaptivity—the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general purpose algorithms that ensure post hoc generalization (also called robust or post-selection generalization), which says that, given the output of the algorithm, it is hard to find any statistic for which the data differs significantly from the population it came from. In this work we show several limitations on the power of algorithms satisfying post hoc generalization. First, we show a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, showing a strong barrier to progress in post-selection data analysis. Second, we show that post hoc generalization is not closed under composition, despite many examples of such algorithms exhibiting strong composition properties. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06100v1 |
http://arxiv.org/pdf/1806.06100v1.pdf | |
PWC | https://paperswithcode.com/paper/the-limits-of-post-selection-generalization |
Repo | |
Framework | |
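Informally (this is a paraphrase for orientation, not the paper's exact definition), an algorithm $M$ satisfies $(\varepsilon, \delta)$-post hoc generalization if no statistic chosen after seeing its output distinguishes the sample from the population by much:

$$
\Pr_{S \sim \mathcal{P}^n}\Bigl[\,\bigl|q(S) - q(\mathcal{P})\bigr| > \varepsilon\,\Bigr] \le \delta
\quad \text{for every statistic } q \text{ chosen by an analyst who sees only } M(S),
$$

where $q(S)$ is the empirical mean of $q$ on the sample $S$ and $q(\mathcal{P})$ its mean on the population $\mathcal{P}$.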
Subjective Annotations for Vision-Based Attention Level Estimation
Title | Subjective Annotations for Vision-Based Attention Level Estimation |
Authors | Andrea Coifman, Péter Rohoska, Miklas S. Kristoffersen, Sven E. Shepstone, Zheng-Hua Tan |
Abstract | Attention level estimation systems have a high potential in many use cases, such as human-robot interaction, driver modeling and smart home systems, since being able to measure a person’s attention level opens the possibility of natural interaction between humans and computers. The topic of estimating a human’s visual focus of attention has been actively addressed recently in the field of HCI. However, most of these previous works do not consider attention as a subjective, cognitive attentive state. New research within the field also faces the problem of the lack of annotated datasets regarding attention level in a certain context. The novelty of our work is two-fold: first, we introduce a new annotation framework that tackles the subjective nature of attention level and use it to annotate more than 100,000 images with three attention levels; and second, we introduce a novel method to estimate attention levels, relying purely on geometric features extracted from RGB and depth images, and evaluate it with a deep learning fusion framework. The system achieves an overall accuracy of 80.02%. Our framework and attention level annotations are made publicly available. |
Tasks | |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.04949v2 |
http://arxiv.org/pdf/1812.04949v2.pdf | |
PWC | https://paperswithcode.com/paper/subjective-annotations-for-vision-based |
Repo | |
Framework | |
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints
Title | KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints |
Authors | Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz |
Abstract | In the context of K-armed stochastic bandits with distribution only assumed to be supported by [0,1], we introduce the first algorithm, called KL-UCB-switch, that enjoys simultaneously a distribution-free regret bound of optimal order $\sqrt{KT}$ and a distribution-dependent regret bound of optimal order as well, that is, matching the $\kappa\ln T$ lower bound by Lai & Robbins (1985) and Burnetas & Katehakis (1996). This self-contained contribution simultaneously presents state-of-the-art techniques for regret minimization in bandit models, and an elementary construction of non-asymptotic confidence bounds based on the empirical likelihood method for bounded distributions. |
Tasks | |
Published | 2018-05-14 |
URL | https://arxiv.org/abs/1805.05071v2 |
https://arxiv.org/pdf/1805.05071v2.pdf | |
PWC | https://paperswithcode.com/paper/kl-ucb-switch-optimal-regret-bounds-for |
Repo | |
Framework | |
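As background for the index policies the abstract refers to, here is a minimal sketch of a kl-UCB-style upper confidence index for rewards in $[0,1]$, using the Bernoulli KL divergence and a bisection search. It illustrates the family of indices involved, not the exact KL-UCB-switch switching rule; names and constants are illustrative.

```python
import math

def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence kl(p, q) between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean: float, pulls: int, t: int, precision: float = 1e-6) -> float:
    """Largest q >= mean with pulls * kl(mean, q) <= log(t): the kl-UCB upper
    confidence bound for an arm with empirical mean `mean` after `pulls` pulls."""
    if pulls == 0:
        return 1.0
    level = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    while hi - lo > precision:      # bisection: kl(mean, q) increases in q >= mean
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo

# Example: arm with empirical mean 0.6 after 20 pulls, at round t = 100
print(kl_ucb_index(0.6, 20, 100))
```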
Counterexamples for Robotic Planning Explained in Structured Language
Title | Counterexamples for Robotic Planning Explained in Structured Language |
Authors | Lu Feng, Mahsa Ghasemi, Kai-Wei Chang, Ufuk Topcu |
Abstract | Automated techniques such as model checking have been used to verify models of robotic mission plans based on Markov decision processes (MDPs) and generate counterexamples that may help diagnose requirement violations. However, such artifacts may be too complex for humans to understand, because existing representations of counterexamples typically include a large number of paths or a complex automaton. To help improve the interpretability of counterexamples, we define a notion of explainable counterexample, which includes a set of structured natural language sentences describing the robotic behavior that leads to a requirement violation in an MDP model of a robotic mission plan. We propose an approach based on mixed-integer linear programming for generating explainable counterexamples that are minimal, sound and complete. We demonstrate the usefulness of the proposed approach via a case study of warehouse robot planning. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08966v1 |
http://arxiv.org/pdf/1803.08966v1.pdf | |
PWC | https://paperswithcode.com/paper/counterexamples-for-robotic-planning |
Repo | |
Framework | |
Knowledge-aware Autoencoders for Explainable Recommender Systems
Title | Knowledge-aware Autoencoders for Explainable Recommender Systems |
Authors | Vito Bellini, Angelo Schiavone, Tommaso Di Noia, Azzurra Ragone, Eugenio Di Sciascio |
Abstract | Recommender Systems have been widely used to help users find what they are looking for, thus tackling the information overload problem. After several years of research and industrial findings on better algorithms to improve accuracy and diversity metrics, explanation services for recommendation are gaining momentum as a tool to provide human-understandable feedback on results computed, in most cases, by black-box machine learning techniques. As a matter of fact, explanations may guarantee users’ satisfaction, trust, and loyalty in a system. In this paper, we evaluate how different kinds of information encoded in a Knowledge Graph are perceived by users when they are used to present an explanation. More precisely, we compare how the use of categorical information, factual information, or a mixture of both in building explanations affects explanatory criteria for a recommender system. Experimental results are validated through an A/B testing platform which uses a recommendation engine based on a Semantics-Aware Autoencoder to build user profiles, which are in turn exploited to compute recommendation lists and to provide an explanation. |
Tasks | Recommendation Systems |
Published | 2018-07-17 |
URL | https://arxiv.org/abs/1807.06300v1 |
https://arxiv.org/pdf/1807.06300v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-aware-autoencoders-for-explainable |
Repo | |
Framework | |
On the Convergence Rate of Training Recurrent Neural Networks
Title | On the Convergence Rate of Training Recurrent Neural Networks |
Authors | Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song |
Abstract | How can local-search methods such as stochastic gradient descent (SGD) avoid bad local minima in training multi-layer neural networks? Why can they fit random labels even given non-convex and non-smooth architectures? Most existing theory only covers networks with one hidden layer, so can we go deeper? In this paper, we focus on recurrent neural networks (RNNs), which are multi-layer networks widely used in natural language processing. They are harder to analyze than feedforward neural networks, because the $\textit{same}$ recurrent unit is repeatedly applied across the entire time horizon of length $L$, which is analogous to feedforward networks of depth $L$. We show that when the number of neurons is sufficiently large, meaning polynomial in the training data size and in $L$, then SGD is capable of minimizing the regression loss at a linear convergence rate. This gives theoretical evidence of how RNNs can memorize data. More importantly, in this paper we build general toolkits to analyze multi-layer networks with ReLU activations. For instance, we prove why ReLU activations can prevent exponential gradient explosion or vanishing, and build a perturbation theory to analyze the first-order approximation of multi-layer networks. |
Tasks | |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.12065v4 |
https://arxiv.org/pdf/1810.12065v4.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-rate-of-training-recurrent |
Repo | |
Framework | |
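For orientation, the recursion behind the setting described above, written in the standard Elman form with ReLU activation $\phi$ (notation here is generic and may differ from the paper's):

$$
h_0 = 0, \qquad h_\ell = \phi\bigl(W h_{\ell-1} + A x_\ell\bigr), \qquad y_\ell = B h_\ell, \qquad \ell = 1, \dots, L,
$$

and the result stated in the abstract is that, when the width is polynomially large in the number of training sequences and in $L$, SGD on the squared regression loss over the outputs $y_\ell$ converges at a linear rate, i.e. the loss decays geometrically with the number of iterations.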
Adaptive MCMC via Combining Local Samplers
Title | Adaptive MCMC via Combining Local Samplers |
Authors | Kiarash Shaloudegi, András György |
Abstract | Markov chain Monte Carlo (MCMC) methods are widely used in machine learning. One of the major problems with MCMC is the question of how to design chains that mix fast over the whole state space; in particular, how to select the parameters of an MCMC algorithm. Here we take a different approach and, similarly to parallel MCMC methods, instead of trying to find a single chain that samples from the whole distribution, we combine samples from several chains run in parallel, each exploring only parts of the state space (e.g., a few modes only). The chains are prioritized based on kernel Stein discrepancy, which provides a good measure of performance locally. The samples from the independent chains are combined using a novel technique for estimating the probability of different regions of the sample space. Experimental results demonstrate that the proposed algorithm may provide significant speedups in different sampling problems. Most importantly, when combined with the state-of-the-art NUTS algorithm as the base MCMC sampler, our method remained competitive with NUTS on sampling from unimodal distributions, while significantly outperforming state-of-the-art competitors on synthetic multimodal problems as well as on a challenging sensor localization task. |
Tasks | |
Published | 2018-06-11 |
URL | https://arxiv.org/abs/1806.03816v6 |
https://arxiv.org/pdf/1806.03816v6.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-mcmc-via-combining-local-samplers |
Repo | |
Framework | |
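Chains are prioritized by kernel Stein discrepancy (KSD). As a rough, self-contained sketch (the standard V-statistic form with an RBF base kernel, not the paper's full chain-combination scheme), the estimate below needs only the samples and the score function $\nabla \log p$ of the target; function names are illustrative.

```python
import numpy as np

def ksd_vstat(samples: np.ndarray, score, bandwidth: float = 1.0) -> float:
    """V-statistic estimate of the squared KSD with an RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 h^2)).

    `samples` has shape (n, d); `score(x)` returns grad log p(x) of the target."""
    n, d = samples.shape
    h2 = bandwidth ** 2
    scores = np.array([score(x) for x in samples])        # (n, d)
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = samples[i] - samples[j]
            sq = diff @ diff
            k = np.exp(-sq / (2 * h2))
            grad_x_k = -diff / h2 * k                     # d k / d x_i
            grad_y_k = diff / h2 * k                      # d k / d x_j
            trace_term = (d / h2 - sq / h2 ** 2) * k      # trace of cross Hessian
            total += (scores[i] @ scores[j] * k
                      + scores[i] @ grad_y_k
                      + scores[j] @ grad_x_k
                      + trace_term)
    return total / n ** 2

# Example: samples compared against a standard normal target, score(x) = -x
rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, (200, 1))
bad = rng.normal(2.0, 1.0, (200, 1))
print(ksd_vstat(good, lambda x: -x), ksd_vstat(bad, lambda x: -x))  # good << bad
```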
Propagation of spiking moments in linear Hawkes networks
Title | Propagation of spiking moments in linear Hawkes networks |
Authors | Matthieu Gilson, Jean-Pascal Pfister |
Abstract | The present paper provides exact mathematical expressions for the high-order moments of spiking activity in a recurrently-connected network of linear Hawkes processes. It extends previous studies that have explored the case of a (linear) Hawkes network driven by deterministic intensity functions to the case of a stimulation by external inputs (rate functions or spike trains) with arbitrary correlation structure. Our approach describes the spatio-temporal filtering induced by the afferent and recurrent connectivities (with arbitrary synaptic response kernels) using operators acting on the input moments. This algebraic viewpoint provides intuition about how the network ingredients shape the input-output mapping for moments, as well as cumulants. We also show using numerical simulation that our results hold for neurons with refractoriness implemented by self-inhibition, provided the corresponding negative feedback for each neuron only mildly alters its mean firing probability. |
Tasks | |
Published | 2018-09-18 |
URL | https://arxiv.org/abs/1810.09520v3 |
https://arxiv.org/pdf/1810.09520v3.pdf | |
PWC | https://paperswithcode.com/paper/propagation-of-spiking-moments-in-linear |
Repo | |
Framework | |
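For reference, the intensity of a linear multivariate Hawkes network in its standard form (generic notation, not necessarily the paper's): the rate of neuron $i$ is its external drive plus the recurrently filtered spikes of the network,

$$
\lambda_i(t) = \mu_i(t) + \sum_{j} \int_{0}^{t} g_{ij}(t - s)\, \mathrm{d}N_j(s),
$$

where $\mu_i$ is the (possibly input-driven) baseline rate, $g_{ij}$ the synaptic response kernel from neuron $j$ to neuron $i$, and $N_j$ the spike counting process of neuron $j$.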
Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection
Title | Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection |
Authors | Jiale Cao, Yanwei Pang, Xuelong Li |
Abstract | To better detect pedestrians of various scales, deep multi-scale methods usually detect pedestrians of different scales by different in-network layers. However, the semantic levels of features from different layers are usually inconsistent. In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches. As a result, the different branches have the same depth and the output features of different branches have similarly high-level semantics. Due to the difference of receptive fields, the different branches are suitable to detect pedestrians of different scales. Meanwhile, the multi-branch network does not introduce additional parameters, because the convolutional weights are shared across branches. To further improve detection performance, skip-layer connections among different branches are used to add context to the branch with a relatively small receptive field, and dilated convolution is incorporated into some of the branches to enlarge the resolutions of output feature maps. When embedded into the Faster RCNN architecture, weighted scores of the proposal generation network and the proposal classification network are further proposed. Experiments on the KITTI dataset, Caltech pedestrian dataset, and Citypersons dataset demonstrate the effectiveness of the proposed method. On these pedestrian datasets, the proposed method achieves state-of-the-art detection performance. Moreover, experiments on the COCO benchmark show the proposed method is also suitable for general object detection. |
Tasks | Object Detection, Pedestrian Detection |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.00872v1 |
http://arxiv.org/pdf/1804.00872v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-multi-branch-and-high-level |
Repo | |
Framework | |
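As a loose sketch of the weight-sharing idea described in the abstract (parallel branches that reuse the same convolutional weights but differ in receptive field via dilation), here is a minimal PyTorch module; it illustrates that single ingredient only, not the paper's full detection network, and all names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedWeightBranches(nn.Module):
    """Applies one shared 3x3 convolution with several dilation rates, so the
    branches differ in receptive field without adding parameters."""
    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 2, 4)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.dilations = dilations

    def forward(self, x):
        # Same weights in every branch; padding = dilation keeps the spatial size.
        return [F.conv2d(x, self.weight, self.bias, padding=d, dilation=d)
                for d in self.dilations]

feat = torch.randn(1, 64, 32, 32)
branches = SharedWeightBranches(64, 128)
print([b.shape for b in branches(feat)])  # three maps, identical shape, shared params
```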
Privately Learning High-Dimensional Distributions
Title | Privately Learning High-Dimensional Distributions |
Authors | Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman |
Abstract | We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance. The sample complexity of our algorithms nearly matches the sample complexity of the optimal non-private learners for these tasks in a wide range of parameters, showing that privacy comes essentially for free for these problems. In particular, in contrast to previous approaches, our algorithm for learning Gaussians does not require strong a priori bounds on the range of the parameters. Our algorithms introduce a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning. |
Tasks | |
Published | 2018-05-01 |
URL | https://arxiv.org/abs/1805.00216v3 |
https://arxiv.org/pdf/1805.00216v3.pdf | |
PWC | https://paperswithcode.com/paper/privately-learning-high-dimensional |
Repo | |
Framework | |
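As a small illustration of the differential-privacy machinery involved (the basic Gaussian mechanism for a clipped mean, which, unlike the paper's recursive private preconditioning, does require an a priori norm bound), the sketch below clips samples to a ball so the mean has bounded sensitivity and then adds calibrated Gaussian noise; names and constants are illustrative.

```python
import numpy as np

def private_mean(X: np.ndarray, radius: float, eps: float, delta: float,
                 rng=np.random.default_rng(0)) -> np.ndarray:
    """(eps, delta)-DP estimate of the mean of the rows of X via the Gaussian
    mechanism: clip rows to L2 norm `radius`, so replacing one row changes the
    mean by at most 2*radius/n, then add noise with the standard calibration."""
    n, d = X.shape
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    clipped = X * np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    sensitivity = 2.0 * radius / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)

# Example: 10,000 samples from N(mu, I) in 5 dimensions
rng = np.random.default_rng(1)
X = rng.normal(loc=[1, 2, 3, 4, 5], scale=1.0, size=(10_000, 5))
print(private_mean(X, radius=10.0, eps=1.0, delta=1e-5))
```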
Chinese User Service Intention Classification Based on Hybrid Neural Network
Title | Chinese User Service Intention Classification Based on Hybrid Neural Network |
Authors | Shengbin Jia, Yang Xiang |
Abstract | In order to satisfy consumers’ increasing demand for personalized service, intelligent services have arisen. User service intention recognition is an important challenge for an intelligent service system to provide precise service. Because of the noise in user requirement descriptions, it is difficult for the intelligent system to understand the semantics of user demands, which leads to poor recognition performance. Therefore, a hybrid neural network classification model based on BiLSTM and CNN is proposed to recognize users’ service intentions. The model can fuse the temporal semantics and spatial semantics of the user descriptions. The experimental results show that our model achieves a better effect compared with other models, reaching an F1 score of 0.94. |
Tasks | Intent Detection, Relation Extraction |
Published | 2018-09-25 |
URL | https://arxiv.org/abs/1809.09408v2 |
https://arxiv.org/pdf/1809.09408v2.pdf | |
PWC | https://paperswithcode.com/paper/supervised-neural-models-revitalize-the-open |
Repo | |
Framework | |
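The abstract above describes a hybrid BiLSTM/CNN text classifier. A minimal Keras sketch of one plausible arrangement is given below; the layer sizes, ordering, and hyperparameters are assumptions for illustration, not the paper's exact architecture.

```python
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_INTENTS = 20_000, 50, 10  # illustrative values

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 128),
    # BiLSTM captures the temporal semantics of the token sequence
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    # CNN over the BiLSTM states captures local (spatial) n-gram patterns
    layers.Conv1D(128, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dropout(0.5),
    layers.Dense(NUM_INTENTS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```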