February 1, 2020

Paper Group AWR 349

Dynamic Path-Decomposed Tries. Learning Explainable Models Using Attribution Priors. Likelihood-free MCMC with Amortized Approximate Ratio Estimators. On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning. Learning Hierarchy-Aware Knowledge Graph Embeddings for Li …

Dynamic Path-Decomposed Tries


Title	Dynamic Path-Decomposed Tries
Authors	Shunsuke Kanda, Dominik Köppl, Yasuo Tabei, Kazuhiro Morita, Masao Fuketa
Abstract	A keyword dictionary is an associative array whose keys are strings. Recent applications handling massive keyword dictionaries in main memory have a need for a space-efficient implementation. When limited to static applications, there are a number of highly-compressed keyword dictionaries based on the advancements of practical succinct data structures. However, as most succinct data structures are only efficient in the static case, it is still difficult to implement a keyword dictionary that is space efficient and dynamic. In this article, we propose such a keyword dictionary. Our main idea is to embrace the path decomposition technique, which was proposed for constructing cache-friendly tries. To store the path-decomposed trie in small memory, we design data structures based on recent compact hash trie representations. Exhaustive experiments on real-world datasets reveal that our dynamic keyword dictionary needs up to 68% less space than the existing smallest ones.
Tasks
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06015v1
PDF	https://arxiv.org/pdf/1906.06015v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-path-decomposed-tries
Repo	https://github.com/kampersanda/poplar-trie
Framework	none

Learning Explainable Models Using Attribution Priors


Title	Learning Explainable Models Using Attribution Priors
Authors	Gabriel Erion, Joseph D. Janizek, Pascal Sturmfels, Scott Lundberg, Su-In Lee
Abstract	Two important topics in deep learning both involve incorporating humans into the modeling process: Model priors transfer information from humans to a model by constraining the model’s parameters; Model attributions transfer information from a model to humans by explaining the model’s behavior. We propose connecting these topics with attribution priors (https://github.com/suinleelab/attributionpriors), which allow humans to use the common language of attributions to enforce prior expectations about a model’s behavior during training. We develop a differentiable axiomatic feature attribution method called expected gradients and show how to directly regularize these attributions during training. We demonstrate the broad applicability of attribution priors ($\Omega$) by presenting three distinct examples that regularize models to behave more intuitively in three different domains: 1) on image data, $\Omega_{\textrm{pixel}}$ encourages models to have piecewise smooth attribution maps; 2) on gene expression data, $\Omega_{\textrm{graph}}$ encourages models to treat functionally related genes similarly; 3) on a health care dataset, $\Omega_{\textrm{sparse}}$ encourages models to rely on fewer features. In all three domains, attribution priors produce models with more intuitive behavior and better generalization performance by encoding constraints that would otherwise be very difficult to encode using standard model priors.
Tasks	Interpretable Machine Learning
Published	2019-06-25
URL	https://arxiv.org/abs/1906.10670v1
PDF	https://arxiv.org/pdf/1906.10670v1.pdf
PWC	https://paperswithcode.com/paper/learning-explainable-models-using-attribution
Repo	https://github.com/suinleelab/attributionpriors
Framework	tf
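
The expected gradients idea mentioned in the abstract above can be sketched roughly as follows. This is a minimal Monte Carlo illustration, not the authors' implementation: it assumes the attribution for feature i is the average of (x_i - x'_i) times the gradient of the model at a random point on the line between a sampled reference x' and the input x. The names `expected_gradients`, `f_grad`, and `baselines` are hypothetical, and a linear model with a known gradient stands in for a neural network.

```python
import random


def expected_gradients(f_grad, x, baselines, n_samples=200, seed=0):
    """Monte Carlo estimate of expected-gradients style attributions.

    f_grad    -- function returning the gradient of the model at a point
    x         -- input to explain (list of floats)
    baselines -- list of reference inputs to sample from (assumed background set)
    """
    rng = random.Random(seed)
    d = len(x)
    attributions = [0.0] * d
    for _ in range(n_samples):
        ref = rng.choice(baselines)      # sample a reference x'
        alpha = rng.random()             # sample an interpolation coefficient in [0, 1)
        point = [r + alpha * (xi - r) for xi, r in zip(x, ref)]
        grad = f_grad(point)
        for i in range(d):
            attributions[i] += (x[i] - ref[i]) * grad[i]
    return [a / n_samples for a in attributions]


# Toy stand-in model (an assumption for illustration): f(x) = w . x
WEIGHTS = [2.0, -1.0, 0.5]


def linear_model(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x))


def linear_model_grad(_point):
    # The gradient of a linear model is constant.
    return list(WEIGHTS)


attr = expected_gradients(linear_model_grad, [1.0, 2.0, 4.0], [[0.0, 0.0, 0.0]])
print(attr)
```

For a linear model and a single reference, the attributions sum to f(x) - f(x'), which is a convenient sanity check. An attribution prior would then add a penalty computed from these attributions to the training loss; the form of that penalty is left out here.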

Likelihood-free MCMC with Amortized Approximate Ratio Estimators


Title	Likelihood-free MCMC with Amortized Approximate Ratio Estimators
Authors	Joeri Hermans, Volodimir Begy, Gilles Louppe
Abstract	Posterior inference with an intractable likelihood is becoming an increasingly common task in scientific domains which rely on sophisticated computer simulations. Typically, these forward models do not admit tractable densities forcing practitioners to rely on approximations. This work introduces a novel approach to address the intractability of the likelihood and the marginal model. We achieve this by learning a flexible amortized estimator which approximates the likelihood-to-evidence ratio. We demonstrate that the learned ratio estimator can be embedded in MCMC samplers to approximate likelihood-ratios between consecutive states in the Markov chain, allowing us to draw samples from the intractable posterior. Techniques are presented to improve the numerical stability and to measure the quality of an approximation. The accuracy of our approach is demonstrated on a variety of benchmarks against well-established techniques. Scientific applications in physics show its applicability.
Tasks
Published	2019-03-10
URL	https://arxiv.org/abs/1903.04057v4
PDF	https://arxiv.org/pdf/1903.04057v4.pdf
PWC	https://paperswithcode.com/paper/likelihood-free-mcmc-with-approximate
Repo	https://github.com/mackelab/sbi
Framework	pytorch
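
To make the idea of embedding a ratio estimator in an MCMC sampler concrete, here is a minimal Metropolis-Hastings sketch. It is not the authors' code: the function `log_ratio` is a hypothetical stand-in for a trained network, replaced here by the exact log likelihood-to-evidence ratio of a toy Gaussian model (prior theta ~ N(0, 1), x given theta ~ N(theta, 1)), so the result can be checked against the known posterior N(x/2, 1/2).

```python
import math
import random


def log_ratio(theta, x):
    """Stand-in for a learned estimator of log p(x | theta) - log p(x).

    Assumption: toy model with theta ~ N(0, 1) and x | theta ~ N(theta, 1),
    so the evidence is p(x) = N(0, 2) and the ratio is available exactly.
    """
    log_lik = -0.5 * (x - theta) ** 2 - 0.5 * math.log(2.0 * math.pi)
    log_evidence = -0.25 * x ** 2 - 0.5 * math.log(4.0 * math.pi)
    return log_lik - log_evidence


def log_prior(theta):
    # Standard normal prior, up to an additive constant.
    return -0.5 * theta ** 2


def metropolis_hastings(x_obs, n_steps=20000, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings that only queries the ratio and the prior."""
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        # Ratio of posteriors between consecutive states: the evidence cancels.
        log_alpha = (log_ratio(proposal, x_obs) + log_prior(proposal)
                     - log_ratio(theta, x_obs) - log_prior(theta))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            theta = proposal
        samples.append(theta)
    return samples


samples = metropolis_hastings(x_obs=1.0)
kept = samples[2000:]  # discard burn-in
print(sum(kept) / len(kept))
```

The sampler never evaluates the likelihood itself, only the ratio and the prior, which is why a learned ratio estimator can be swapped in for `log_ratio` when the likelihood is intractable.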

On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning


Title	On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning
Authors	Aritra Dutta, El Houcine Bergou, Ahmed M. Abdelmoniem, Chen-Yu Ho, Atal Narayan Sahu, Marco Canini, Panos Kalnis
Abstract	Compressed communication, in the form of sparsification or quantization of stochastic gradients, is employed to reduce communication costs in distributed data-parallel training of deep neural networks. However, there exists a discrepancy between theory and practice: while theoretical analysis of most existing compression methods assumes compression is applied to the gradients of the entire model, many practical implementations operate individually on the gradients of each layer of the model. In this paper, we prove that layer-wise compression is, in theory, better, because the convergence rate is upper bounded by that of entire-model compression for a wide range of biased and unbiased compression methods. However, despite the theoretical bound, our experimental study of six well-known methods shows that convergence, in practice, may or may not be better, depending on the actual trained model and compression ratio. Our findings suggest that it would be advantageous for deep learning frameworks to include support for both layer-wise and entire-model compression.
Tasks	Model Compression, Quantization
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08250v1
PDF	https://arxiv.org/pdf/1911.08250v1.pdf
PWC	https://paperswithcode.com/paper/on-the-discrepancy-between-the-theoretical
Repo	https://github.com/sands-lab/layer-wise-aaai20
Framework	pytorch
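
The difference between layer-wise and entire-model compression described above can be illustrated with Top-k sparsification, used here as one example of a sparsifying compressor. This is a small sketch with hypothetical function names and made-up gradient values, not the code from the linked repository.

```python
def top_k(values, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def layerwise_top_k(layers, ratio):
    """Apply Top-k separately to the gradient of each layer."""
    return [top_k(layer, max(1, int(ratio * len(layer)))) for layer in layers]


def entire_model_top_k(layers, ratio):
    """Apply Top-k once to the concatenated gradient of the whole model."""
    flat = [v for layer in layers for v in layer]
    sparse = top_k(flat, max(1, int(ratio * len(flat))))
    out, pos = [], 0
    for layer in layers:
        out.append(sparse[pos:pos + len(layer)])
        pos += len(layer)
    return out


# Two "layers" with very different gradient magnitudes (illustrative values).
grads = [[0.1, -0.2], [5.0, -4.0, 3.0, 0.05]]
print(layerwise_top_k(grads, 0.5))
print(entire_model_top_k(grads, 0.5))
```

With these values, layer-wise compression always sends something from the first layer, while entire-model compression spends its whole budget on the second layer because its gradients are larger. Both keep the same total number of entries, but they keep different ones.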

Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction


Title	Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction
Authors	Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang
Abstract	Knowledge graph embedding, which aims to represent entities and relations as low dimensional vectors (or matrices, tensors, etc.), has been shown to be a powerful technique for predicting missing links in knowledge graphs. Existing knowledge graph embedding models mainly focus on modeling relation patterns such as symmetry/antisymmetry, inversion, and composition. However, many existing approaches fail to model semantic hierarchies, which are common in real-world applications. To address this challenge, we propose a novel knowledge graph embedding model—namely, Hierarchy-Aware Knowledge Graph Embedding (HAKE)—which maps entities into the polar coordinate system. HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy. Specifically, the radial coordinate aims to model entities at different levels of the hierarchy, and entities with smaller radii are expected to be at higher levels; the angular coordinate aims to distinguish entities at the same level of the hierarchy, and these entities are expected to have roughly the same radii but different angles. Experiments demonstrate that HAKE can effectively model the semantic hierarchies in knowledge graphs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the link prediction task.
Tasks	Graph Embedding, Knowledge Graph Embedding, Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09419v2
PDF	https://arxiv.org/pdf/1911.09419v2.pdf
PWC	https://paperswithcode.com/paper/learning-hierarchy-aware-knowledge-graph
Repo	https://github.com/MIRALab-USTC/KGE-HAKE
Framework	pytorch
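
The abstract describes a radial coordinate for the hierarchy level and an angular coordinate for entities at the same level, but does not state a scoring function. The sketch below assumes a distance made of a modulus term plus a weighted phase term; the exact form, the weight `lam`, and the function name `hake_distance` are assumptions for illustration and should be checked against the paper and the linked repository.

```python
import math


def hake_distance(h_m, h_p, r_m, r_p, t_m, t_p, lam=1.0):
    """Illustrative polar-coordinate distance between head and tail under a relation.

    *_m are modulus (radial) parts, *_p are phase (angular) parts in radians.
    Assumed form: ||h_m * r_m - t_m||_2 + lam * ||sin((h_p + r_p - t_p) / 2)||_1
    """
    modulus_term = math.sqrt(
        sum((hm * rm - tm) ** 2 for hm, rm, tm in zip(h_m, r_m, t_m))
    )
    phase_term = sum(
        abs(math.sin((hp + rp - tp) / 2.0)) for hp, rp, tp in zip(h_p, r_p, t_p)
    )
    return modulus_term + lam * phase_term


# The relation scales the radius by 2 and rotates by pi: the tail matches exactly.
print(hake_distance([1.0], [0.0], [2.0], [math.pi], [2.0], [math.pi]))

# Same radius (same level) but opposite angle: only the phase term is non-zero.
print(hake_distance([1.0], [0.0], [1.0], [0.0], [1.0], [math.pi]))
```

In this form, two entities with equal radii can still be told apart by their angles, which matches the abstract's description of entities at the same level of the hierarchy.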

ColorNet – Estimating Colorfulness in Natural Images


Title	ColorNet – Estimating Colorfulness in Natural Images
Authors	Emin Zerman, Aakanksha Rana, Aljosa Smolic
Abstract	Measuring the colorfulness of a natural or virtual scene is critical for many applications in the image processing field, ranging from capture to display. In this paper, we propose the first deep learning-based colorfulness estimation metric. For this purpose, we develop a color rating model which simultaneously learns to extract the pertinent characteristic color features and the mapping from feature space to the ideal colorfulness scores for a variety of natural colored images. Additionally, we propose to overcome the lack of an adequately annotated dataset by combining and aligning two publicly available colorfulness databases using the results of a new subjective test which employs a common subset of both databases. Using the resulting subjectively annotated dataset of 180 colored images, we finally demonstrate the efficacy of our proposed model over traditional methods, both quantitatively and qualitatively.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08505v1
PDF	https://arxiv.org/pdf/1908.08505v1.pdf
PWC	https://paperswithcode.com/paper/colornet-estimating-colorfulness-in-natural
Repo	https://github.com/V-Sense/colornet-estimating-colorfulness
Framework	pytorch

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks


Title	MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks
Authors	Dan Li, Dacheng Chen, Lei Shi, Baihong Jin, Jonathan Goh, See-Kiong Ng
Abstract	The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world CPS: the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results showed that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions in these complex real-world systems.
Tasks	Anomaly Detection, Time Series
Published	2019-01-15
URL	http://arxiv.org/abs/1901.04997v1
PDF	http://arxiv.org/pdf/1901.04997v1.pdf
PWC	https://paperswithcode.com/paper/mad-gan-multivariate-anomaly-detection-for
Repo	https://github.com/LiDan456/MAD-GANs
Framework	tf
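
The abstract says the DR-score detects anomalies "by discrimination and reconstruction" but does not give its formula. The sketch below shows only one plausible way to combine the two signals, as a weighted sum of a reconstruction residual and a discriminator-based term. The weighting `lam`, the use of `1 - d_real_prob`, and the function name `dr_score` are all assumptions for illustration and should not be read as the paper's definition.

```python
def dr_score(x, x_reconstructed, d_real_prob, lam=0.5):
    """Illustrative anomaly score mixing reconstruction and discrimination.

    x               -- observed multivariate sample (list of floats)
    x_reconstructed -- the generator's best reconstruction of x (assumed given)
    d_real_prob     -- discriminator's probability that x is normal data
    lam             -- hypothetical weight between the two terms, in [0, 1]
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    # Mean absolute residual between the sample and its reconstruction.
    residual = sum(abs(a - b) for a, b in zip(x, x_reconstructed)) / len(x)
    # Low discriminator confidence in "real" counts as evidence of an anomaly.
    discrimination = 1.0 - d_real_prob
    return lam * residual + (1.0 - lam) * discrimination


# A perfectly reconstructed sample that the discriminator accepts scores 0.
print(dr_score([1.0, 2.0], [1.0, 2.0], d_real_prob=1.0))

# A poorly reconstructed sample that the discriminator doubts scores higher.
print(dr_score([1.0, 2.0], [0.0, 2.0], d_real_prob=0.5))
```

A threshold on such a score would then flag anomalous time windows; how that threshold is chosen is outside the scope of this sketch.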

Multi-task Learning for Low-resource Second Language Acquisition Modeling


Title	Multi-task Learning for Low-resource Second Language Acquisition Modeling
Authors	Yong Hu, Heyan Huang, Tian Lan, Xiaochi Wei, Yuxiang Nie, Jiarui Qi, Liner Yang, Xian-Ling Mao
Abstract	Second language acquisition (SLA) modeling aims to predict whether second language learners can correctly answer questions based on what they have learned. It is a fundamental building block of personalized learning systems and has attracted increasing attention recently. However, as far as we know, almost all existing methods work poorly in low-resource scenarios because of a lack of training data. Fortunately, there are some latent common patterns among different language-learning tasks, which give us an opportunity to solve the low-resource SLA modeling problem. Inspired by this idea, in this paper, we propose a novel SLA modeling method, which learns the latent common patterns among different language-learning datasets by multi-task learning and applies them to improve prediction performance in low-resource scenarios. Extensive experiments show that the proposed method performs much better than the state-of-the-art baselines in the low-resource scenario. It also obtains a slight improvement in the non-low-resource scenario.
Tasks	Language Acquisition, Multi-Task Learning
Published	2019-08-25
URL	https://arxiv.org/abs/1908.09283v2
PDF	https://arxiv.org/pdf/1908.09283v2.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-for-low-resource-second
Repo	https://github.com/nghuyong/MTL-SLAM
Framework	none

MSNM-Sensor: An Applied Network Monitoring Tool for Anomaly Detection in Complex Networks and Systems


Title	MSNM-Sensor: An Applied Network Monitoring Tool for Anomaly Detection in Complex Networks and Systems
Authors	Roberto Magán-Carrión, José Camacho, Gabriel Maciá-Fernández, Ángel Ruíz-Zafra
Abstract	Technology evolves quickly. Low-cost and ready-to-connect devices are designed to provide new services and applications. Smart grids or smart healthcare systems are some examples of these applications, all of which are in the context of smart cities. In this total-connectivity scenario, some security issues arise, since the larger the number of connected devices, the larger the attack surface. New solutions for monitoring and detecting security events are therefore needed to address the challenges brought about by this scenario, among them the large number of devices to monitor, the large amount of data to manage, and the real-time requirement to provide quick security event detection and, consequently, quick response to attacks. In this work, a practical and ready-to-use tool for monitoring and detecting security events in these environments is developed and introduced. The tool is based on the Multivariate Statistical Network Monitoring (MSNM) methodology for monitoring and anomaly detection and we call it MSNM-Sensor. Although it is in its early development stages, experimental results based on the detection of well-known attacks in hierarchical network systems prove the suitability of this tool for more complex scenarios, such as those found in smart cities or IoT ecosystems.
Tasks	Anomaly Detection
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13612v2
PDF	https://arxiv.org/pdf/1907.13612v2.pdf
PWC	https://paperswithcode.com/paper/msnm-s-an-applied-network-monitoring-tool-for
Repo	https://github.com/nesg-ugr/msnm-sensor
Framework	none

Motion Planning Explorer: Visualizing Local Minima using a Local-Minima Tree


Title	Motion Planning Explorer: Visualizing Local Minima using a Local-Minima Tree
Authors	Andreas Orthey, Benjamin Frész, Marc Toussaint
Abstract	Motion planning problems often have many local minima. Visualizing those minima is important because it lets a user guide, prevent, or predict motions. Towards this goal, we develop the motion planning explorer, an algorithm that lets users interactively explore a tree of local minima. Following ideas from Morse theory, we define local minima as paths invariant under minimization of a cost functional. The local minima are grouped into a local-minima tree using lower-dimensional projections specified by a user. The user can then interactively explore the local-minima tree, thereby visualizing the problem structure and guiding or preventing motions. We show that the motion planning explorer faithfully captures local minima in four realistic scenarios, for both holonomic and certain non-holonomic robots.
Tasks	Motion Planning
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05035v2
PDF	https://arxiv.org/pdf/1909.05035v2.pdf
PWC	https://paperswithcode.com/paper/motion-planning-explorer-visualizing-local
Repo	https://github.com/aorthey/MotionPlanningExplorerGUI
Framework	none

FAVAE: Sequence Disentanglement using Information Bottleneck Principle


Title	FAVAE: Sequence Disentanglement using Information Bottleneck Principle
Authors	Masanori Yamada, Heecheol Kim, Kosuke Miyoshi, Hiroshi Yamakawa
Abstract	We propose the factorized action variational autoencoder (FAVAE), a state-of-the-art generative model for learning disentangled and interpretable representations from sequential data via the information bottleneck without supervision. The purpose of disentangled representation learning is to obtain interpretable and transferable representations from data. We focus on the disentangled representation of sequential data, since there is a wide range of potential applications if disentangled representation is extended to sequential data such as video, speech, and stock market data. Sequential data are characterized by dynamic and static factors: dynamic factors are time dependent, and static factors are independent of time. Previous models disentangle static and dynamic factors by explicitly modeling the priors of latent variables to distinguish between these factors. However, these models cannot disentangle representations between dynamic factors, such as disentangling “picking up” and “throwing” in robotic tasks. FAVAE can disentangle multiple dynamic factors. Since it does not require modeling priors, it can disentangle “between” dynamic factors. We conducted experiments to show that FAVAE can extract disentangled dynamic factors.
Tasks	Representation Learning
Published	2019-02-22
URL	https://arxiv.org/abs/1902.08341v2
PDF	https://arxiv.org/pdf/1902.08341v2.pdf
PWC	https://paperswithcode.com/paper/favae-sequence-disentanglement-using
Repo	https://github.com/favae/favae_ijcai2019
Framework	pytorch

Convolutional neural networks with fractional order gradient method


Title	Convolutional neural networks with fractional order gradient method
Authors	Dian Sheng, Yiheng Wei, Yuquan Chen, Yong Wang
Abstract	This paper proposes a fractional order gradient method for the backward propagation of convolutional neural networks. To overcome the problem that the fractional order gradient method cannot converge to the real extreme point, a simplified fractional order gradient method is designed based on Caputo’s definition. The parameters within layers are updated by the designed gradient method, but the propagation between layers still uses integer order gradients, so the complicated derivatives of composite functions are avoided and the chain rule is preserved. By connecting all layers in series and adding loss functions, the proposed convolutional neural networks can be trained smoothly for various tasks. Practical experiments are carried out to demonstrate fast convergence, high accuracy, and the ability to escape local optimal points.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05336v2
PDF	https://arxiv.org/pdf/1905.05336v2.pdf
PWC	https://paperswithcode.com/paper/190505336
Repo	https://github.com/IQ250/FOCNN
Framework	none

Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media


Title	Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media
Authors	Shi Zong, Alan Ritter, Graham Mueller, Evan Wright
Abstract	Breaking cybersecurity events are shared across a range of websites, including security blogs (FireEye, Kaspersky, etc.), in addition to social media platforms such as Facebook and Twitter. In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online. A corpus of 6,000 tweets describing software vulnerabilities is annotated with authors’ opinions toward their severity. We show that our corpus supports the development of automatic classifiers with high precision for this task. Furthermore, we demonstrate the value of analyzing users’ opinions about the severity of threats reported online as an early indicator of important software vulnerabilities. We present a simple, yet effective method for linking software vulnerabilities reported in tweets to Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database (NVD). Using our predicted severity scores, we show that it is possible to achieve a Precision@50 of 0.86 when forecasting high severity vulnerabilities, significantly outperforming a baseline that is based on tweet volume. Finally we show how reports of severe vulnerabilities online are predictive of real-world exploits.
Tasks
Published	2019-02-27
URL	https://arxiv.org/abs/1902.10680v3
PDF	https://arxiv.org/pdf/1902.10680v3.pdf
PWC	https://paperswithcode.com/paper/analyzing-the-perceived-severity-of
Repo	https://github.com/viczong/cybersecurity_threat_severity_analysis
Framework	none
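
The Precision@50 figure quoted in the abstract above is a ranking metric: the fraction of the 50 highest-scored items that are truly relevant (here, truly high-severity vulnerabilities). A minimal sketch of the metric follows; the function name and the toy data are hypothetical and unrelated to the paper's corpus.

```python
def precision_at_k(scores, is_relevant, k):
    """Fraction of the k highest-scored items that are relevant.

    scores      -- predicted scores, one per item (higher means more severe)
    is_relevant -- booleans of the same length, True for truly relevant items
    k           -- cutoff; the denominator is k even if fewer items exist
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(is_relevant):
        raise ValueError("scores and is_relevant must have the same length")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(1 for i in ranked[:k] if is_relevant[i])
    return hits / k


# Toy example with four items (illustrative values only).
toy_scores = [0.9, 0.1, 0.8, 0.4]
toy_labels = [True, True, False, True]
print(precision_at_k(toy_scores, toy_labels, 2))
```

Comparing a model against a volume-based baseline, as the abstract describes, amounts to calling this function twice with the same labels and two different score lists.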

Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction


Title	Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Authors	Bin Liu, Ruiming Tang, Yingzhi Chen, Jinkai Yu, Huifeng Guo, Yuzhou Zhang
Abstract	Easy-to-use, modular and extendible package of deep-learning based CTR models: DeepFM, Deep Interest Network (DIN), Deep Interest Evolution Network (DIEN), Deep Cross Network (DCN), Attentional Factorization Machine (AFM), Neural Factorization Machine (NFM), AutoInt, Deep Session Interest Network (DSIN).
Tasks	Click-Through Rate Prediction, Recommendation Systems
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04447v1
PDF	http://arxiv.org/pdf/1904.04447v1.pdf
PWC	https://paperswithcode.com/paper/feature-generation-by-convolutional-neural
Repo	https://github.com/shenweichen/DeepCTR-PyTorch
Framework	pytorch

Cyberthreat Detection from Twitter using Deep Neural Networks


Title	Cyberthreat Detection from Twitter using Deep Neural Networks
Authors	Nuno Dionísio, Fernando Alves, Pedro M. Ferreira, Alysson Bessani
Abstract	To be prepared against cyberattacks, most organizations resort to security information and event management systems to monitor their infrastructures. These systems depend on the timeliness and relevance of the latest updates, patches and threats provided by cyberthreat intelligence feeds. Open source intelligence platforms, namely social media networks such as Twitter, are capable of aggregating a vast amount of cybersecurity-related sources. To process such information streams, we require scalable and efficient tools capable of identifying and summarizing relevant information for specified assets. This paper presents the processing pipeline of a novel tool that uses deep neural networks to process cybersecurity information received from Twitter. A convolutional neural network identifies tweets containing security-related information relevant to assets in an IT infrastructure. Then, a bidirectional long short-term memory network extracts named entities from these tweets to form a security alert or to fill an indicator of compromise. The proposed pipeline achieves an average 94% true positive rate and 91% true negative rate for the classification task and an average F1-score of 92% for the named entity recognition task, across three case study infrastructures.
Tasks	Named Entity Recognition
Published	2019-04-01
URL	http://arxiv.org/abs/1904.01127v1
PDF	http://arxiv.org/pdf/1904.01127v1.pdf
PWC	https://paperswithcode.com/paper/cyberthreat-detection-from-twitter-using-deep
Repo	https://github.com/ndionysus/twitter-cyberthreat-detection
Framework	none