Paper Group AWR 349
Dynamic Path-Decomposed Tries
Title | Dynamic Path-Decomposed Tries |
Authors | Shunsuke Kanda, Dominik Köppl, Yasuo Tabei, Kazuhiro Morita, Masao Fuketa |
Abstract | A keyword dictionary is an associative array whose keys are strings. Recent applications handling massive keyword dictionaries in main memory have a need for a space-efficient implementation. When limited to static applications, there are a number of highly-compressed keyword dictionaries based on the advancements of practical succinct data structures. However, as most succinct data structures are only efficient in the static case, it is still difficult to implement a keyword dictionary that is space efficient and dynamic. In this article, we propose such a keyword dictionary. Our main idea is to embrace the path decomposition technique, which was proposed for constructing cache-friendly tries. To store the path-decomposed trie in small memory, we design data structures based on recent compact hash trie representations. Exhaustive experiments on real-world datasets reveal that our dynamic keyword dictionary needs up to 68% less space than the existing smallest ones. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06015v1 |
https://arxiv.org/pdf/1906.06015v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-path-decomposed-tries |
Repo | https://github.com/kampersanda/poplar-trie |
Framework | none |
Learning Explainable Models Using Attribution Priors
Title | Learning Explainable Models Using Attribution Priors |
Authors | Gabriel Erion, Joseph D. Janizek, Pascal Sturmfels, Scott Lundberg, Su-In Lee |
Abstract | Two important topics in deep learning both involve incorporating humans into the modeling process: Model priors transfer information from humans to a model by constraining the model’s parameters; Model attributions transfer information from a model to humans by explaining the model’s behavior. We propose connecting these topics with attribution priors (https://github.com/suinleelab/attributionpriors), which allow humans to use the common language of attributions to enforce prior expectations about a model’s behavior during training. We develop a differentiable axiomatic feature attribution method called expected gradients and show how to directly regularize these attributions during training. We demonstrate the broad applicability of attribution priors ($\Omega$) by presenting three distinct examples that regularize models to behave more intuitively in three different domains: 1) on image data, $\Omega_{\textrm{pixel}}$ encourages models to have piecewise smooth attribution maps; 2) on gene expression data, $\Omega_{\textrm{graph}}$ encourages models to treat functionally related genes similarly; 3) on a health care dataset, $\Omega_{\textrm{sparse}}$ encourages models to rely on fewer features. In all three domains, attribution priors produce models with more intuitive behavior and better generalization performance by encoding constraints that would otherwise be very difficult to encode using standard model priors. |
Tasks | Interpretable Machine Learning |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10670v1 |
https://arxiv.org/pdf/1906.10670v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-explainable-models-using-attribution |
Repo | https://github.com/suinleelab/attributionpriors |
Framework | tf |
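The expected gradients attribution named in the abstract above can be illustrated with a small sketch. It assumes the common formulation in which gradients are integrated along straight paths from reference inputs to the input of interest and then averaged over the references. The toy model `f`, its hand-written gradient `grad_f`, and the reference set are hypothetical and chosen so the result can be checked by hand. The loop averages deterministically over all references and a midpoint grid, which is a simplification for illustration and not the authors' implementation (see the Repo link for that).

```python
from typing import Callable, List, Sequence

Vector = List[float]


def expected_gradients(
    grad_f: Callable[[Sequence[float]], Vector],
    x: Sequence[float],
    references: Sequence[Sequence[float]],
    steps: int = 50,
) -> Vector:
    """Average path-integrated gradients over a set of reference inputs."""
    attributions = [0.0] * len(x)
    for ref in references:
        for k in range(steps):
            alpha = (k + 0.5) / steps  # midpoint rule on [0, 1]
            # Point on the straight line from the reference to x.
            point = [r + alpha * (xi - r) for xi, r in zip(x, ref)]
            grad = grad_f(point)
            for i, (xi, r) in enumerate(zip(x, ref)):
                attributions[i] += (xi - r) * grad[i]
    scale = 1.0 / (len(references) * steps)
    return [a * scale for a in attributions]


# Hypothetical toy model (an assumption for illustration): f(x) = x0 * x1.
def f(x: Sequence[float]) -> float:
    return x[0] * x[1]


def grad_f(x: Sequence[float]) -> Vector:
    return [x[1], x[0]]


x = [2.0, 3.0]
references = [[0.0, 0.0], [1.0, 1.0]]
attributions = expected_gradients(grad_f, x, references)
mean_reference_output = sum(f(r) for r in references) / len(references)
```

For this toy model the attributions sum to `f(x) - mean_reference_output`, the completeness property that axiomatic attribution methods of this kind are expected to satisfy. An attribution prior would then add a penalty computed from `attributions` to the training loss.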
Likelihood-free MCMC with Amortized Approximate Ratio Estimators
Title | Likelihood-free MCMC with Amortized Approximate Ratio Estimators |
Authors | Joeri Hermans, Volodimir Begy, Gilles Louppe |
Abstract | Posterior inference with an intractable likelihood is becoming an increasingly common task in scientific domains which rely on sophisticated computer simulations. Typically, these forward models do not admit tractable densities forcing practitioners to rely on approximations. This work introduces a novel approach to address the intractability of the likelihood and the marginal model. We achieve this by learning a flexible amortized estimator which approximates the likelihood-to-evidence ratio. We demonstrate that the learned ratio estimator can be embedded in MCMC samplers to approximate likelihood-ratios between consecutive states in the Markov chain, allowing us to draw samples from the intractable posterior. Techniques are presented to improve the numerical stability and to measure the quality of an approximation. The accuracy of our approach is demonstrated on a variety of benchmarks against well-established techniques. Scientific applications in physics show its applicability. |
Tasks | |
Published | 2019-03-10 |
URL | https://arxiv.org/abs/1903.04057v4 |
https://arxiv.org/pdf/1903.04057v4.pdf | |
PWC | https://paperswithcode.com/paper/likelihood-free-mcmc-with-approximate |
Repo | https://github.com/mackelab/sbi |
Framework | pytorch |
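As a rough illustration of how a ratio estimator can drive an MCMC sampler, the sketch below runs Metropolis-Hastings with a stand-in for the learned log likelihood-to-evidence ratio. Everything concrete here is an assumption made for the example: the Gaussian toy simulator, the standard normal prior, the function names, and the use of an analytic ratio in place of a trained network.

```python
import math
import random
from typing import List


def log_ratio(theta: float, x: float) -> float:
    """Stand-in for a learned estimate of log p(x | theta) - log p(x).

    Hypothetical toy: x ~ Normal(theta, 1). The evidence term is constant
    in theta and cancels in the acceptance ratio, so it is dropped here.
    """
    return -0.5 * (x - theta) ** 2


def log_prior(theta: float) -> float:
    """Standard normal prior over theta, up to an additive constant."""
    return -0.5 * theta ** 2


def metropolis_hastings(x: float, n_samples: int = 20000,
                        step: float = 1.0, seed: int = 0) -> List[float]:
    """Random-walk Metropolis-Hastings driven only by the prior and the ratio."""
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        # Posterior ratio between consecutive states = prior ratio * estimated ratio.
        log_accept = (log_prior(proposal) + log_ratio(proposal, x)) \
            - (log_prior(theta) + log_ratio(theta, x))
        if rng.random() < math.exp(min(0.0, log_accept)):
            theta = proposal
        samples.append(theta)
    return samples


samples = metropolis_hastings(x=1.0)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

In this toy setting the exact posterior is Normal(0.5, 0.5), so `posterior_mean` should land close to 0.5. In practice `log_ratio` would be replaced by the output of a trained classifier-based estimator.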
On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning
Title | On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning |
Authors | Aritra Dutta, El Houcine Bergou, Ahmed M. Abdelmoniem, Chen-Yu Ho, Atal Narayan Sahu, Marco Canini, Panos Kalnis |
Abstract | Compressed communication, in the form of sparsification or quantization of stochastic gradients, is employed to reduce communication costs in distributed data-parallel training of deep neural networks. However, there exists a discrepancy between theory and practice: while theoretical analysis of most existing compression methods assumes compression is applied to the gradients of the entire model, many practical implementations operate individually on the gradients of each layer of the model. In this paper, we prove that layer-wise compression is, in theory, better, because the convergence rate is upper bounded by that of entire-model compression for a wide range of biased and unbiased compression methods. However, despite the theoretical bound, our experimental study of six well-known methods shows that convergence, in practice, may or may not be better, depending on the actual trained model and compression ratio. Our findings suggest that it would be advantageous for deep learning frameworks to include support for both layer-wise and entire-model compression. |
Tasks | Model Compression, Quantization |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08250v1 |
https://arxiv.org/pdf/1911.08250v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-discrepancy-between-the-theoretical |
Repo | https://github.com/sands-lab/layer-wise-aaai20 |
Framework | pytorch |
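The difference between layer-wise and entire-model compression described in the abstract above can be seen in a small sketch. It assumes Top-k sparsification as the compressor (one possible choice, not confirmed by the abstract as the paper's focus) and uses made-up gradient values and hypothetical function names.

```python
from typing import Dict, List

Gradients = Dict[str, List[float]]


def top_k_mask(values: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def compress_layer_wise(grads: Gradients, ratio: float) -> Gradients:
    """Apply Top-k separately to each layer's gradient."""
    return {name: top_k_mask(g, max(1, round(ratio * len(g))))
            for name, g in grads.items()}


def compress_entire_model(grads: Gradients, ratio: float) -> Gradients:
    """Apply Top-k once to the concatenation of all layers' gradients."""
    flat = [v for g in grads.values() for v in g]
    sparse = top_k_mask(flat, max(1, round(ratio * len(flat))))
    out, offset = {}, 0
    for name, g in grads.items():
        out[name] = sparse[offset:offset + len(g)]
        offset += len(g)
    return out


# Made-up gradients: one layer with large values, one with small values.
grads = {"conv": [0.9, -0.8, 0.7, 0.6], "fc": [0.05, -0.02, 0.01, 0.03]}
layer_wise = compress_layer_wise(grads, ratio=0.5)
entire_model = compress_entire_model(grads, ratio=0.5)
```

With these values, layer-wise compression keeps two entries in each layer, while entire-model compression keeps all four `conv` entries and zeroes the whole `fc` gradient. Both transmit the same number of values but select different ones, which is the gap between analysis and practice that the paper studies.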
Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction
Title | Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction |
Authors | Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang |
Abstract | Knowledge graph embedding, which aims to represent entities and relations as low dimensional vectors (or matrices, tensors, etc.), has been shown to be a powerful technique for predicting missing links in knowledge graphs. Existing knowledge graph embedding models mainly focus on modeling relation patterns such as symmetry/antisymmetry, inversion, and composition. However, many existing approaches fail to model semantic hierarchies, which are common in real-world applications. To address this challenge, we propose a novel knowledge graph embedding model—namely, Hierarchy-Aware Knowledge Graph Embedding (HAKE)—which maps entities into the polar coordinate system. HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy. Specifically, the radial coordinate aims to model entities at different levels of the hierarchy, and entities with smaller radii are expected to be at higher levels; the angular coordinate aims to distinguish entities at the same level of the hierarchy, and these entities are expected to have roughly the same radii but different angles. Experiments demonstrate that HAKE can effectively model the semantic hierarchies in knowledge graphs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the link prediction task. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09419v2 |
https://arxiv.org/pdf/1911.09419v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-hierarchy-aware-knowledge-graph |
Repo | https://github.com/MIRALab-USTC/KGE-HAKE |
Framework | pytorch |
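The polar-coordinate idea in the abstract above can be sketched as a distance with a radial (modulus) term and an angular (phase) term. The exact form of the distance, the weight `lam`, the relation encoding, and the example embeddings are assumptions made for illustration. The Repo link holds the authors' actual model.

```python
import math
from typing import List, Tuple

# Each embedding is a (modulus part, phase part) pair of equal-length vectors.
Embedding = Tuple[List[float], List[float]]


def polar_distance(head: Embedding, relation: Embedding, tail: Embedding,
                   lam: float = 0.5) -> float:
    """Smaller distance means the triple (head, relation, tail) is more plausible."""
    h_m, h_p = head
    r_m, r_p = relation
    t_m, t_p = tail
    # Radial term: the relation rescales the head's radius to reach the tail's level.
    modulus = math.sqrt(sum((hm * rm - tm) ** 2
                            for hm, rm, tm in zip(h_m, r_m, t_m)))
    # Angular term: the relation rotates the head's angle toward the tail's angle.
    phase = sum(abs(math.sin((hp + rp - tp) / 2.0))
                for hp, rp, tp in zip(h_p, r_p, t_p))
    return modulus + lam * phase


# Hypothetical one-dimensional embeddings: a smaller radius is a higher level.
mammal = ([1.0], [0.0])
dog = ([2.0], [0.3])
cat = ([2.0], [1.5])
animal_like = ([1.0], [0.3])          # same level as "mammal"
has_subtype = ([2.0], [0.3])          # doubles the radius, rotates by 0.3

d_dog = polar_distance(mammal, has_subtype, dog)
d_cat = polar_distance(mammal, has_subtype, cat)
d_wrong_level = polar_distance(mammal, has_subtype, animal_like)
```

Here `dog` matches both the radius and the angle, `cat` sits at the right level but a different angle, and `animal_like` sits at the wrong level, so the radial term separates hierarchy levels and the angular term separates entities within a level.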
ColorNet – Estimating Colorfulness in Natural Images
Title | ColorNet – Estimating Colorfulness in Natural Images |
Authors | Emin Zerman, Aakanksha Rana, Aljosa Smolic |
Abstract | Measuring the colorfulness of a natural or virtual scene is critical for many applications in the image processing field, ranging from capture to display. In this paper, we propose the first deep learning-based colorfulness estimation metric. For this purpose, we develop a color rating model which simultaneously learns to extract the pertinent characteristic color features and the mapping from feature space to the ideal colorfulness scores for a variety of natural colored images. Additionally, we propose to overcome the lack of an adequate annotated dataset by combining and aligning two publicly available colorfulness databases using the results of a new subjective test which employs a common subset of both databases. Using the resulting subjectively annotated dataset of 180 colored images, we finally demonstrate the efficacy of our proposed model over the traditional methods, both quantitatively and qualitatively. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08505v1 |
https://arxiv.org/pdf/1908.08505v1.pdf | |
PWC | https://paperswithcode.com/paper/colornet-estimating-colorfulness-in-natural |
Repo | https://github.com/V-Sense/colornet-estimating-colorfulness |
Framework | pytorch |
MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks
Title | MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks |
Authors | Dan Li, Dacheng Chen, Lei Shi, Baihong Jin, Jonathan Goh, See-Kiong Ng |
Abstract | The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world CPS: the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results show that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions in these complex real-world systems. |
Tasks | Anomaly Detection, Time Series |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.04997v1 |
http://arxiv.org/pdf/1901.04997v1.pdf | |
PWC | https://paperswithcode.com/paper/mad-gan-multivariate-anomaly-detection-for |
Repo | https://github.com/LiDan456/MAD-GANs |
Framework | tf |
Multi-task Learning for Low-resource Second Language Acquisition Modeling
Title | Multi-task Learning for Low-resource Second Language Acquisition Modeling |
Authors | Yong Hu, Heyan Huang, Tian Lan, Xiaochi Wei, Yuxiang Nie, Jiarui Qi, Liner Yang, Xian-Ling Mao |
Abstract | Second language acquisition (SLA) modeling aims to predict whether second language learners can correctly answer questions based on what they have learned. It is a fundamental building block of personalized learning systems and has attracted more and more attention recently. However, as far as we know, almost all existing methods cannot work well in low-resource scenarios because of a lack of training data. Fortunately, there are some latent common patterns among different language-learning tasks, which give us an opportunity to solve the low-resource SLA modeling problem. Inspired by this idea, in this paper, we propose a novel SLA modeling method that learns the latent common patterns among different language-learning datasets by multi-task learning and applies them to improve prediction performance in low-resource scenarios. Extensive experiments show that the proposed method performs much better than the state-of-the-art baselines in the low-resource scenario. Meanwhile, it also obtains a slight improvement in the non-low-resource scenario. |
Tasks | Language Acquisition, Multi-Task Learning |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09283v2 |
https://arxiv.org/pdf/1908.09283v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-low-resource-second |
Repo | https://github.com/nghuyong/MTL-SLAM |
Framework | none |
MSNM-Sensor: An Applied Network Monitoring Tool for Anomaly Detection in Complex Networks and Systems
Title | MSNM-Sensor: An Applied Network Monitoring Tool for Anomaly Detection in Complex Networks and Systems |
Authors | Roberto Magán-Carrión, José Camacho, Gabriel Maciá-Fernández, Ángel Ruíz-Zafra |
Abstract | Technology evolves quickly. Low-cost and ready-to-connect devices are designed to provide new services and applications. Smart grids or smart healthcare systems are some examples of these applications, all of which are in the context of smart cities. In this total-connectivity scenario, some security issues arise, since the larger the number of connected devices, the larger the attack surface. New solutions for monitoring and detecting security events are therefore needed to address the challenges brought about by this scenario, among others the large number of devices to monitor, the large amount of data to manage, and the real-time requirement to provide quick security event detection and, consequently, quick response to attacks. In this work, a practical and ready-to-use tool for monitoring and detecting security events in these environments is developed and introduced. The tool is based on the Multivariate Statistical Network Monitoring (MSNM) methodology for monitoring and anomaly detection, and we call it MSNM-Sensor. Although it is in its early development stages, experimental results based on the detection of well-known attacks in hierarchical network systems prove the suitability of this tool for more complex scenarios, such as those found in smart cities or IoT ecosystems. |
Tasks | Anomaly Detection |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13612v2 |
https://arxiv.org/pdf/1907.13612v2.pdf | |
PWC | https://paperswithcode.com/paper/msnm-s-an-applied-network-monitoring-tool-for |
Repo | https://github.com/nesg-ugr/msnm-sensor |
Framework | none |
Motion Planning Explorer: Visualizing Local Minima using a Local-Minima Tree
Title | Motion Planning Explorer: Visualizing Local Minima using a Local-Minima Tree |
Authors | Andreas Orthey, Benjamin Frész, Marc Toussaint |
Abstract | Motion planning problems often have many local minima. Those minima are important to visualize to let a user guide, prevent or predict motions. Towards this goal, we develop the motion planning explorer, an algorithm to let users interactively explore a tree of local minima. Following ideas from Morse theory, we define local minima as paths invariant under minimization of a cost functional. The local minima are grouped into a local-minima tree using lower-dimensional projections specified by a user. The user can then interactively explore the local-minima tree, thereby visualizing the problem structure and guiding or preventing motions. We show that the motion planning explorer faithfully captures local minima in four realistic scenarios, both for holonomic and certain non-holonomic robots. |
Tasks | Motion Planning |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05035v2 |
https://arxiv.org/pdf/1909.05035v2.pdf | |
PWC | https://paperswithcode.com/paper/motion-planning-explorer-visualizing-local |
Repo | https://github.com/aorthey/MotionPlanningExplorerGUI |
Framework | none |
FAVAE: Sequence Disentanglement using Information Bottleneck Principle
Title | FAVAE: Sequence Disentanglement using Information Bottleneck Principle |
Authors | Masanori Yamada, Heecheol Kim, Kosuke Miyoshi, Hiroshi Yamakawa |
Abstract | We propose the factorized action variational autoencoder (FAVAE), a state-of-the-art generative model for learning disentangled and interpretable representations from sequential data via the information bottleneck without supervision. The purpose of disentangled representation learning is to obtain interpretable and transferable representations from data. We focus on the disentangled representation of sequential data, since there is a wide range of potential applications if disentangled representation is extended to sequential data such as video, speech, and stock market data. Sequential data are characterized by dynamic and static factors: dynamic factors are time dependent, and static factors are independent of time. Previous models disentangle static and dynamic factors by explicitly modeling the priors of latent variables to distinguish between these factors. However, these models cannot disentangle representations between dynamic factors, such as disentangling “picking up” and “throwing” in robotic tasks. FAVAE can disentangle multiple dynamic factors. Since it does not require modeling priors, it can disentangle “between” dynamic factors. We conducted experiments to show that FAVAE can extract disentangled dynamic factors. |
Tasks | Representation Learning |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08341v2 |
https://arxiv.org/pdf/1902.08341v2.pdf | |
PWC | https://paperswithcode.com/paper/favae-sequence-disentanglement-using |
Repo | https://github.com/favae/favae_ijcai2019 |
Framework | pytorch |
Convolutional neural networks with fractional order gradient method
Title | Convolutional neural networks with fractional order gradient method |
Authors | Dian Sheng, Yiheng Wei, Yuquan Chen, Yong Wang |
Abstract | This paper proposes a fractional order gradient method for the backward propagation of convolutional neural networks. To overcome the problem that the fractional order gradient method cannot converge to the real extreme point, a simplified fractional order gradient method is designed based on Caputo’s definition. The parameters within layers are updated by the designed gradient method, but the propagation between layers still uses integer order gradients, so the complicated derivatives of composite functions are avoided and the chain rule is preserved. By connecting all layers in series and adding loss functions, the proposed convolutional neural networks can be trained smoothly for various tasks. Finally, practical experiments are carried out to demonstrate fast convergence, high accuracy, and the ability to escape local optima. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05336v2 |
https://arxiv.org/pdf/1905.05336v2.pdf | |
PWC | https://paperswithcode.com/paper/190505336 |
Repo | https://github.com/IQ250/FOCNN |
Framework | none |
Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media
Title | Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media |
Authors | Shi Zong, Alan Ritter, Graham Mueller, Evan Wright |
Abstract | Breaking cybersecurity events are shared across a range of websites, including security blogs (FireEye, Kaspersky, etc.), in addition to social media platforms such as Facebook and Twitter. In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online. A corpus of 6,000 tweets describing software vulnerabilities is annotated with authors’ opinions toward their severity. We show that our corpus supports the development of automatic classifiers with high precision for this task. Furthermore, we demonstrate the value of analyzing users’ opinions about the severity of threats reported online as an early indicator of important software vulnerabilities. We present a simple yet effective method for linking software vulnerabilities reported in tweets to Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database (NVD). Using our predicted severity scores, we show that it is possible to achieve a Precision@50 of 0.86 when forecasting high severity vulnerabilities, significantly outperforming a baseline that is based on tweet volume. Finally, we show how reports of severe vulnerabilities online are predictive of real-world exploits. |
Tasks | |
Published | 2019-02-27 |
URL | https://arxiv.org/abs/1902.10680v3 |
https://arxiv.org/pdf/1902.10680v3.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-the-perceived-severity-of |
Repo | https://github.com/viczong/cybersecurity_threat_severity_analysis |
Framework | none |
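The Precision@50 figure quoted in the abstract above is a ranking metric: the fraction of the top-k scored items that are truly positive. A minimal sketch follows, with made-up scores and labels and a hypothetical function name. It is not taken from the authors' code.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scored items whose label is 1."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in ranked) / k


# Made-up predicted severity scores and ground-truth labels (1 = high severity).
scores = [0.95, 0.10, 0.80, 0.40, 0.70]
labels = [1, 0, 1, 1, 0]
p_at_3 = precision_at_k(scores, labels, k=3)
```

Here the three highest-scored items carry labels 1, 1 and 0, so `p_at_3` is 2/3. A Precision@50 of 0.86 would correspond to 43 of the 50 top-ranked vulnerabilities being truly high severity.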
Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Title | Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction |
Authors | Bin Liu, Ruiming Tang, Yingzhi Chen, Jinkai Yu, Huifeng Guo, Yuzhou Zhang |
Abstract | Easy-to-use, modular and extendible package of deep-learning based CTR models: DeepFM, Deep Interest Network (DIN), Deep Interest Evolution Network (DIEN), Deep Cross Network (DCN), Attentional Factorization Machine (AFM), Neural Factorization Machine (NFM), AutoInt, Deep Session Interest Network (DSIN). |
Tasks | Click-Through Rate Prediction, Recommendation Systems |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04447v1 |
http://arxiv.org/pdf/1904.04447v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-generation-by-convolutional-neural |
Repo | https://github.com/shenweichen/DeepCTR-PyTorch |
Framework | pytorch |
Cyberthreat Detection from Twitter using Deep Neural Networks
Title | Cyberthreat Detection from Twitter using Deep Neural Networks |
Authors | Nuno Dionísio, Fernando Alves, Pedro M. Ferreira, Alysson Bessani |
Abstract | To be prepared against cyberattacks, most organizations resort to security information and event management systems to monitor their infrastructures. These systems depend on the timeliness and relevance of the latest updates, patches and threats provided by cyberthreat intelligence feeds. Open source intelligence platforms, namely social media networks such as Twitter, are capable of aggregating a vast amount of cybersecurity-related sources. To process such information streams, we require scalable and efficient tools capable of identifying and summarizing relevant information for specified assets. This paper presents the processing pipeline of a novel tool that uses deep neural networks to process cybersecurity information received from Twitter. A convolutional neural network identifies tweets containing security-related information relevant to assets in an IT infrastructure. Then, a bidirectional long short-term memory network extracts named entities from these tweets to form a security alert or to fill an indicator of compromise. The proposed pipeline achieves an average 94% true positive rate and 91% true negative rate for the classification task and an average F1-score of 92% for the named entity recognition task, across three case study infrastructures. |
Tasks | Named Entity Recognition |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.01127v1 |
http://arxiv.org/pdf/1904.01127v1.pdf | |
PWC | https://paperswithcode.com/paper/cyberthreat-detection-from-twitter-using-deep |
Repo | https://github.com/ndionysus/twitter-cyberthreat-detection |
Framework | none |