January 30, 2020

Paper Group ANR 250

Deep Hashing using Entropy Regularised Product Quantisation Network


Title	Deep Hashing using Entropy Regularised Product Quantisation Network
Authors	Jo Schlemper, Jose Caballero, Andy Aitken, Joost van Amersfoort
Abstract	In large-scale systems, approximate nearest neighbour search is a crucial algorithm to enable efficient data retrieval. Recently, deep learning-based hashing algorithms have been proposed as a promising paradigm to enable data-dependent schemes. Often their efficacy is only demonstrated on data sets with fixed, limited numbers of classes. In practical scenarios, those labels are not always available or one requires a method that can handle a higher input variability, as well as a higher granularity. To fulfil those requirements, we look at more flexible similarity measures. In this work, we present a novel, flexible, end-to-end trainable network for large-scale data hashing. Our method works by transforming the data distribution to behave as a uniform distribution on a product of spheres. The transformed data is subsequently hashed to a binary form in a way that maximises the entropy of the output (i.e. fully utilises the available bit-rate capacity) while maintaining correctness (i.e. close items hash to the same key in the map). We show that the method outperforms baseline approaches such as locality-sensitive hashing and product quantisation in the limited capacity regime.
Tasks
Published	2019-02-11
URL	http://arxiv.org/abs/1902.03876v1
PDF	http://arxiv.org/pdf/1902.03876v1.pdf
PWC	https://paperswithcode.com/paper/deep-hashing-using-entropy-regularised
Repo
Framework
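
The abstract above names locality-sensitive hashing as a baseline and describes the goal of maximising the entropy of the binary output. The following is a minimal sketch of that baseline idea only, not the paper's network: it hashes vectors with random hyperplanes and measures how much of the bit budget the resulting codes actually use. The function names, the 4-bit code length, and the Gaussian test data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def random_hyperplanes(num_bits, dim, rng):
    """Draw one Gaussian normal vector per output bit."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """Hash a vector to a bit tuple: one sign test per hyperplane."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0.0 else 0
        for plane in hyperplanes
    )


def code_entropy_bits(codes):
    """Empirical Shannon entropy (in bits) of the distribution over hash codes."""
    counts = Counter(codes)
    total = len(codes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative data: 2000 random 8-dimensional vectors, hashed to 4 bits each.
rng = random.Random(0)
planes = random_hyperplanes(num_bits=4, dim=8, rng=rng)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(2000)]
codes = [lsh_code(v, planes) for v in data]
entropy = code_entropy_bits(codes)
print(f"entropy = {entropy:.3f} bits out of a 4-bit budget")
```

An entropy close to the number of bits means the hash buckets are used almost uniformly; a much lower value means some codes are rarely or never produced, which is the waste an entropy regulariser is meant to discourage.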

Text Modeling with Syntax-Aware Variational Autoencoders


Title	Text Modeling with Syntax-Aware Variational Autoencoders
Authors	Yijun Xiao, William Yang Wang
Abstract	Syntactic information contains structures and rules about how text sentences are arranged. Incorporating syntax into text modeling methods can potentially benefit both representation learning and generation. Variational autoencoders (VAEs) are deep generative models that provide a probabilistic way to describe observations in the latent space. When applied to text data, the latent representations are often unstructured. We propose syntax-aware variational autoencoders (SAVAEs) that dedicate a subspace in the latent dimensions dubbed syntactic latent to represent syntactic structures of sentences. SAVAEs are trained to infer syntactic latent from either text inputs or parsed syntax results as well as reconstruct original text with inferred latent variables. Experiments show that SAVAEs are able to achieve lower reconstruction loss on four different data sets. Furthermore, they are capable of generating examples with modified target syntax.
Tasks	Representation Learning
Published	2019-08-27
URL	https://arxiv.org/abs/1908.09964v1
PDF	https://arxiv.org/pdf/1908.09964v1.pdf
PWC	https://paperswithcode.com/paper/text-modeling-with-syntax-aware-variational
Repo
Framework

Zero-Shot Cross-Lingual Opinion Target Extraction


Title	Zero-Shot Cross-Lingual Opinion Target Extraction
Authors	Soufian Jebbara, Philipp Cimiano
Abstract	Aspect-based sentiment analysis involves the recognition of so-called opinion target expressions (OTEs). To automatically extract OTEs, supervised learning algorithms are usually employed which are trained on manually annotated corpora. The creation of these corpora is labor-intensive and sufficiently large datasets are therefore usually only available for a very narrow selection of languages and domains. In this work, we address the lack of available annotated data for specific languages by proposing a zero-shot cross-lingual approach for the extraction of opinion target expressions. We leverage multilingual word embeddings that share a common vector space across various languages and incorporate these into a convolutional neural network architecture for OTE extraction. Our experiments with 5 languages give promising results: we can successfully train a model on annotated data of a source language and perform accurate prediction on a target language without ever using any annotated samples in that target language. Depending on the source and target language pairs, we reach performances in a zero-shot regime of up to 77% of a model trained on target language data. Furthermore, we can increase this performance up to 87% of a baseline model trained on target language data by performing cross-lingual learning from multiple source languages.
Tasks	Aspect-Based Sentiment Analysis, Multilingual Word Embeddings, Sentiment Analysis, Word Embeddings
Published	2019-04-19
URL	http://arxiv.org/abs/1904.09122v1
PDF	http://arxiv.org/pdf/1904.09122v1.pdf
PWC	https://paperswithcode.com/paper/zero-shot-cross-lingual-opinion-target
Repo
Framework

Language Independent Sequence Labelling for Opinion Target Extraction


Title	Language Independent Sequence Labelling for Opinion Target Extraction
Authors	Rodrigo Agerri, German Rigau
Abstract	In this research note we present a language independent system to model Opinion Target Extraction (OTE) as a sequence labelling task. The system consists of a combination of clustering features implemented on top of a simple set of shallow local features. Experiments on the well known Aspect Based Sentiment Analysis (ABSA) benchmarks show that our approach is very competitive across languages, obtaining best results for six languages in seven different datasets. Furthermore, the results provide further insights into the behaviour of clustering features for sequence labelling tasks. The system and models generated in this work are available for public use and to facilitate reproducibility of results.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2019-01-28
URL	http://arxiv.org/abs/1901.09755v1
PDF	http://arxiv.org/pdf/1901.09755v1.pdf
PWC	https://paperswithcode.com/paper/language-independent-sequence-labelling-for
Repo
Framework

LuNet: A Deep Neural Network for Network Intrusion Detection


Title	LuNet: A Deep Neural Network for Network Intrusion Detection
Authors	Peilun Wu, Hui Guo
Abstract	Network attacks are a significant security issue for modern society. From small mobile devices to large cloud platforms, almost all computing products used in our daily life are networked and potentially under the threat of network intrusion. With the fast-growing number of network users, network intrusions become more and more frequent, volatile and advanced. Being able to capture intrusions in time for such a large-scale network is critical and very challenging. To this end, machine learning (or AI) based network intrusion detection (NID), due to its intelligent capability, has drawn increasing attention in recent years. Compared to the traditional signature-based approaches, the AI-based solutions are more capable of detecting variants of advanced network attacks. However, the high detection rate achieved by the existing designs is usually accompanied by a high rate of false alarms, which may significantly discount the overall effectiveness of the intrusion detection system. In this paper, we consider the existence of spatial and temporal features in the network traffic data and propose a hierarchical CNN+RNN neural network, LuNet. In LuNet, the convolutional neural network (CNN) and the recurrent neural network (RNN) learn input traffic data in sync with a gradually increasing granularity such that both spatial and temporal features of the data can be effectively extracted. Our experiments on two network traffic datasets show that, compared to state-of-the-art network intrusion detection techniques, LuNet not only offers a high level of detection capability but also has a much lower rate of false-positive alarms.
Tasks	Intrusion Detection, Network Intrusion Detection
Published	2019-09-22
URL	https://arxiv.org/abs/1909.10031v2
PDF	https://arxiv.org/pdf/1909.10031v2.pdf
PWC	https://paperswithcode.com/paper/190910031
Repo
Framework

Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation


Title	Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation
Authors	Zeju Li, Konstantinos Kamnitsas, Ben Glocker
Abstract	Overfitting in deep learning has been the focus of a number of recent works, yet its exact impact on the behavior of neural networks is not well understood. This study analyzes overfitting by examining how the distribution of logits alters in relation to how much the model overfits. Specifically, we find that when training with few data samples, the distribution of logit activations when processing unseen test samples of an under-represented class tends to shift towards and even across the decision boundary, while the over-represented class seems unaffected. In image segmentation, foreground samples are often heavily under-represented. We observe that sensitivity of the model drops as a result of overfitting, while precision remains mostly stable. Based on our analysis, we derive asymmetric modifications of existing loss functions and regularizers including a large margin loss, focal loss, adversarial training and mixup, which specifically aim at reducing the shift observed when embedding unseen samples of the under-represented class. We study the case of binary segmentation of brain tumor core and show that our proposed simple modifications lead to significantly improved segmentation performance over the symmetric variants.
Tasks	Semantic Segmentation
Published	2019-07-25
URL	https://arxiv.org/abs/1907.10982v2
PDF	https://arxiv.org/pdf/1907.10982v2.pdf
PWC	https://paperswithcode.com/paper/overfitting-of-neural-nets-under-class
Repo
Framework
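
The abstract above mentions asymmetric modifications of existing losses, including focal loss. As a rough sketch of what "asymmetric" can mean here, the block below contrasts the standard binary focal loss with a variant that applies the focal down-weighting only to the over-represented background class. This particular variant is an assumption made for illustration and is not claimed to be the paper's exact formulation; the function names and the default `gamma` are likewise hypothetical.

```python
import math


def focal_loss(p_foreground, label, gamma=2.0):
    """Standard binary focal loss for one pixel.

    p_foreground: predicted probability of the foreground class, in (0, 1).
    label: 1 for foreground, 0 for background.
    """
    p_true = p_foreground if label == 1 else 1.0 - p_foreground
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def asymmetric_focal_loss(p_foreground, label, gamma=2.0):
    """Illustrative asymmetric variant (an assumption, not the paper's exact form).

    The focal down-weighting is applied only to the over-represented background
    class; foreground pixels keep the plain cross-entropy term.
    """
    if label == 1:
        return -math.log(p_foreground)
    return focal_loss(p_foreground, label, gamma)
```

For a foreground pixel predicted at 0.8, the standard focal loss scales the cross-entropy term by (1 - 0.8)^2 = 0.04, whereas the asymmetric sketch leaves it unscaled, so well-classified samples of the rare class keep contributing to the gradient.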

Towards Self-Explainable Cyber-Physical Systems


Title	Towards Self-Explainable Cyber-Physical Systems
Authors	Mathias Blumreiter, Joel Greenyer, Francisco Javier Chiyah Garcia, Verena Klös, Maike Schwammberger, Christoph Sommer, Andreas Vogelsang, Andreas Wortmann
Abstract	With the increasing complexity of CPSs, their behavior and decisions become increasingly difficult to understand and comprehend for users and other stakeholders. Our vision is to build self-explainable systems that can, at run-time, answer questions about the system’s past, current, and future behavior. As hitherto no design methodology or reference framework exists for building such systems, we propose the MAB-EX framework for building self-explainable systems that leverage requirements- and explainability models at run-time. The basic idea of MAB-EX is to first Monitor and Analyze a certain behavior of a system, then Build an explanation from explanation models and convey this EXplanation in a suitable way to a stakeholder. We also take into account that new explanations can be learned, by updating the explanation models, should new and yet un-explainable behavior be detected by the system.
Tasks
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04698v1
PDF	https://arxiv.org/pdf/1908.04698v1.pdf
PWC	https://paperswithcode.com/paper/towards-self-explainable-cyber-physical
Repo
Framework
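
The Monitor, Analyze, Build, EXplain sequence described in the abstract above can be pictured as a four-stage pipeline. The sketch below is only a toy rendering of that sequence: the signal names, the distance threshold, and the dictionary-based explanation model are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    signal: str
    value: float


# Hypothetical explanation model: maps a detected behaviour to a template.
EXPLANATION_MODEL = {
    "speed_reduced": "Speed was reduced because the distance to the obstacle fell to {value} m.",
}


def monitor(raw_events):
    """Monitor: turn raw (signal, value) events into observations."""
    return [Observation(signal, value) for signal, value in raw_events]


def analyze(observations, min_distance=5.0):
    """Analyze: flag behaviour that needs explaining (the threshold is made up)."""
    return [
        ("speed_reduced", obs.value)
        for obs in observations
        if obs.signal == "obstacle_distance" and obs.value < min_distance
    ]


def build(findings, model):
    """Build: instantiate an explanation from the model, or report a gap."""
    explanations = []
    for behaviour, value in findings:
        template = model.get(behaviour)
        if template is None:
            explanations.append(f"No explanation known for '{behaviour}' yet.")
        else:
            explanations.append(template.format(value=value))
    return explanations


def explain(explanations):
    """EXplain: convey the explanations to a stakeholder (here: as one string)."""
    return "\n".join(explanations)


events = [("obstacle_distance", 3.2), ("obstacle_distance", 12.0)]
message = explain(build(analyze(monitor(events)), EXPLANATION_MODEL))
print(message)
```

The "no explanation known" branch marks the point where, following the abstract, a system would update its explanation models once new, not yet explainable behaviour is detected.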

Network with Sub-Networks


Title	Network with Sub-Networks
Authors	Ninnart Fuengfusin, Hakaru Tamukoh
Abstract	We introduce network with sub-networks, a neural network whose weight layers can be detached into sub-neural networks during inference. To develop weights and biases which can be inserted in both base and sub-neural networks, the parameters are first copied from the sub-model to the base model. Each model is forward-propagated separately. Gradients from the pair of networks are averaged and used to update both networks. Our base model achieves a test accuracy comparable to that of regularly trained models, while maintaining the ability to detach weight layers.
Tasks
Published	2019-08-02
URL	https://arxiv.org/abs/1908.00763v2
PDF	https://arxiv.org/pdf/1908.00763v2.pdf
PWC	https://paperswithcode.com/paper/network-with-sub-networks
Repo
Framework

POI Semantic Model with a Deep Convolutional Structure


Title	POI Semantic Model with a Deep Convolutional Structure
Authors	Ji Zhao, Meiyu Yu, Huan Chen, Boning Li, Lingyu Zhang, Qi Song, Li Ma, Hua Chai, Jieping Ye
Abstract	When using an electronic map, POI retrieval is the initial and important step, whose quality directly affects the user experience. Similarity between the user query and POI information is the most critical feature in POI retrieval. An accurate similarity calculation is challenging since a mismatch between a query and a retrieval text may exist in the case of a mistyped query or an alias inquiry. In this paper, we propose a POI latent semantic model based on deep networks, which can effectively extract query features and POI information features for the similarity calculation. Our model describes the semantic information of complex texts at multiple layers, and achieves multi-field matches by modeling a POI's name and detailed address separately. Our model is evaluated on POI retrieval ranking datasets, including labeled relevance data and real-world user click data in POI retrieval. Results show that our model significantly outperforms our competitors in POI retrieval ranking tasks. The proposed algorithm has become a critical component of an online system serving millions of people every day.
Tasks
Published	2019-03-18
URL	http://arxiv.org/abs/1903.07279v1
PDF	http://arxiv.org/pdf/1903.07279v1.pdf
PWC	https://paperswithcode.com/paper/poi-semantic-model-with-a-deep-convolutional
Repo
Framework

Using positive spanning sets to achieve d-stationarity with the Boosted DC Algorithm


Title	Using positive spanning sets to achieve d-stationarity with the Boosted DC Algorithm
Authors	Francisco J. Aragón Artacho, Rubén Campoy, Phan T. Vuong
Abstract	The Difference of Convex functions Algorithm (DCA) is widely used for minimizing the difference of two convex functions. A recently proposed accelerated version, termed BDCA for Boosted DC Algorithm, incorporates a line search step to achieve a larger decrease of the objective value at each iteration. Thanks to this step, BDCA usually converges much faster than DCA in practice. The solutions found by DCA are guaranteed to be critical points of the problem, but these may not be local minima. Although BDCA tends to improve the objective value of the solutions it finds, these are frequently just critical points as well. In this paper we combine BDCA with a simple Derivative-Free Optimization (DFO) algorithm to force d-stationarity (lack of descent direction) at the point obtained. The potential of this approach is illustrated through some computational experiments on a Minimum-Sum-of-Squares clustering problem. Our numerical results demonstrate that the new method provides better solutions while still remaining faster than DCA in the majority of test cases.
Tasks
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11471v2
PDF	https://arxiv.org/pdf/1907.11471v2.pdf
PWC	https://paperswithcode.com/paper/using-positive-spanning-sets-to-achieve
Repo
Framework
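
The abstract above notes that DCA is only guaranteed to reach critical points, which need not be local minima. A small one-dimensional example makes this concrete. The sketch below uses the toy objective f(x) = x^2 - |x| (chosen here for illustration, not taken from the paper) and plain DCA without the boosted line search. The derivative-free check along the directions +1 and -1, which form a positive spanning set of the real line, is a deliberately crude stand-in for the DFO step the abstract describes.

```python
def objective(x):
    """f(x) = g(x) - h(x) with g(x) = x**2 and h(x) = |x| (both convex)."""
    return x * x - abs(x)


def subgradient_h(x):
    """A subgradient of h(x) = |x|; at x = 0 we pick 0, which is a valid choice."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def dca_step(x):
    """One DCA step: minimise g(y) - s*y over y, where s is a subgradient of h at x.

    For g(y) = y**2 the minimiser has the closed form y = s / 2.
    """
    return subgradient_h(x) / 2.0


def run_dca(x0, iterations=10):
    """Iterate the DCA step a fixed number of times from the starting point x0."""
    x = x0
    for _ in range(iterations):
        x = dca_step(x)
    return x


def has_descent_direction(x, step=1e-3):
    """Crude derivative-free check along the positive spanning set {+1, -1}."""
    return any(objective(x + step * d) < objective(x) for d in (1.0, -1.0))


stuck = run_dca(0.0)   # stays at the critical point x = 0
good = run_dca(0.3)    # moves to the minimiser x = 0.5
print(stuck, has_descent_direction(stuck))
print(good, has_descent_direction(good))
```

Started at 0 with the subgradient choice 0, DCA never moves, even though stepping in either direction lowers the objective; started at 0.3 it reaches 0.5, where neither direction gives descent. Probing a positive spanning set is what exposes the first point as not d-stationary.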

SynGAN: Towards Generating Synthetic Network Attacks using GANs


Title	SynGAN: Towards Generating Synthetic Network Attacks using GANs
Authors	Jeremy Charlier, Aman Singh, Gaston Ormazabal, Radu State, Henning Schulzrinne
Abstract	The rapid digital transformation without security considerations has resulted in the rise of global-scale cyberattacks. The first line of defense against these attacks is Network Intrusion Detection Systems (NIDS). Once deployed, however, these systems work as black boxes with a high rate of false positives and no measurable effectiveness. There is a need to continuously test and improve these systems by emulating real-world network attack mutations. We present SynGAN, a framework that generates adversarial network attacks using Generative Adversarial Networks (GANs). SynGAN generates malicious packet flow mutations using real attack traffic, which can improve NIDS attack detection rates. As a first step, we compare two public datasets, NSL-KDD and CICIDS2017, for generating synthetic Distributed Denial of Service (DDoS) network attacks. We evaluate the attack quality (real vs. synthetic) using a gradient boosting classifier.
Tasks	Intrusion Detection, Network Intrusion Detection
Published	2019-08-26
URL	https://arxiv.org/abs/1908.09899v1
PDF	https://arxiv.org/pdf/1908.09899v1.pdf
PWC	https://paperswithcode.com/paper/syngan-towards-generating-synthetic-network
Repo
Framework

Generalized Planning: Non-Deterministic Abstractions and Trajectory Constraints


Title	Generalized Planning: Non-Deterministic Abstractions and Trajectory Constraints
Authors	Blai Bonet, Giuseppe De Giacomo, Hector Geffner, Sasha Rubin
Abstract	We study the characterization and computation of general policies for families of problems that share a structure characterized by a common reduction into a single abstract problem. Policies $\mu$ that solve the abstract problem P have been shown to solve all problems Q that reduce to P provided that $\mu$ terminates in Q. In this work, we shed light on why this termination condition is needed and how it can be removed. The key observation is that the abstract problem P captures the common structure among the concrete problems Q that is local (Markovian) but misses common structure that is global. We show how such global structure can be captured by means of trajectory constraints that in many cases can be expressed as LTL formulas, thus reducing generalized planning to LTL synthesis. Moreover, for a broad class of problems that involve integer variables that can be increased or decreased, trajectory constraints can be compiled away, reducing generalized planning to fully observable non-deterministic planning.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12135v1
PDF	https://arxiv.org/pdf/1909.12135v1.pdf
PWC	https://paperswithcode.com/paper/generalized-planning-non-deterministic
Repo
Framework

Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning


Title	Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning
Authors	Gang Chen, Yiming Peng
Abstract	We propose a new policy iteration theory as an important extension of soft policy iteration and Soft Actor-Critic (SAC), one of the most efficient model free algorithms for deep reinforcement learning. Supported by the new theory, arbitrary entropy measures that generalize Shannon entropy, such as Tsallis entropy and Renyi entropy, can be utilized to properly randomize action selection while fulfilling the goal of maximizing expected long-term rewards. Our theory gives birth to two new algorithms, i.e., Tsallis entropy Actor-Critic (TAC) and Renyi entropy Actor-Critic (RAC). Theoretical analysis shows that these algorithms can be more effective than SAC. Moreover, they pave the way for us to develop a new Ensemble Actor-Critic (EAC) algorithm in this paper that features the use of a bootstrap mechanism for deep environment exploration as well as a new value-function based mechanism for high-level action selection. Empirically we show that TAC, RAC and EAC can achieve state-of-the-art performance on a range of benchmark control tasks, outperforming SAC and several cutting-edge learning algorithms in terms of both sample efficiency and effectiveness.
Tasks
Published	2019-02-14
URL	http://arxiv.org/abs/1902.05551v1
PDF	http://arxiv.org/pdf/1902.05551v1.pdf
PWC	https://paperswithcode.com/paper/off-policy-actor-critic-in-an-ensemble
Repo
Framework
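
The abstract above refers to Tsallis and Renyi entropy as generalisations of Shannon entropy. For a discrete distribution these have simple closed forms, sketched below using the standard textbook definitions; the function names and the handling of the limit at order 1 are choices made for this illustration, and nothing here reproduces the TAC, RAC or EAC algorithms themselves.

```python
import math


def shannon_entropy(probs):
    """H(p) = -sum_i p_i * ln(p_i), in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def tsallis_entropy(probs, q):
    """S_q(p) = (1 - sum_i p_i**q) / (q - 1); tends to Shannon entropy as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(probs)
    return (1.0 - sum(p ** q for p in probs if p > 0.0)) / (q - 1.0)


def renyi_entropy(probs, alpha):
    """H_alpha(p) = ln(sum_i p_i**alpha) / (1 - alpha); tends to Shannon as alpha -> 1."""
    if abs(alpha - 1.0) < 1e-12:
        return shannon_entropy(probs)
    return math.log(sum(p ** alpha for p in probs if p > 0.0)) / (1.0 - alpha)


# Illustrative action distribution of a stochastic policy over three actions.
policy = [0.7, 0.2, 0.1]
print(shannon_entropy(policy), tsallis_entropy(policy, 2.0), renyi_entropy(policy, 2.0))
```

All three measures are largest for a uniform action distribution and zero for a deterministic one, which is why each can serve as a bonus that encourages randomised action selection.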

An Analysis of Emotion Communication Channels in Fan Fiction: Towards Emotional Storytelling


Title	An Analysis of Emotion Communication Channels in Fan Fiction: Towards Emotional Storytelling
Authors	Evgeny Kim, Roman Klinger
Abstract	Centrality of emotion for the stories told by humans is underpinned by numerous studies in literature and psychology. The research in automatic storytelling has recently turned towards emotional storytelling, in which characters’ emotions play an important role in the plot development. However, these studies mainly use emotion to generate propositional statements in the form “A feels affection towards B” or “A confronts B”. At the same time, emotional behavior does not boil down to such propositional descriptions, as humans display complex and highly variable patterns in communicating their emotions, both verbally and non-verbally. In this paper, we analyze how emotions are expressed non-verbally in a corpus of fan fiction short stories. Our analysis shows that stories written by humans convey character emotions along various non-verbal channels. We find that some non-verbal channels, such as facial expressions and voice characteristics of the characters, are more strongly associated with joy, while gestures and body postures are more likely to occur with trust. Based on our analysis, we argue that automatic storytelling systems should take variability of emotion into account when generating descriptions of characters’ emotions.
Tasks
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02402v1
PDF	https://arxiv.org/pdf/1906.02402v1.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-emotion-communication-channels
Repo
Framework

On Geometric Structure of Activation Spaces in Neural Networks


Title	On Geometric Structure of Activation Spaces in Neural Networks
Authors	Yuting Jia, Haiwen Wang, Shuo Shao, Huan Long, Yunsong Zhou, Xinbing Wang
Abstract	In this paper, we investigate the geometric structure of activation spaces of fully connected layers in neural networks and then show applications of this study. We propose an efficient approximation algorithm to characterize the convex hull of massive point sets in high-dimensional space. Based on this new algorithm, we identify four common geometric properties shared by the activation spaces, which give a rather clear description of these spaces. We then propose an alternative classification method grounded in this geometric structure description, which works better than neural networks alone. Surprisingly, this data classification method can be an indicator of overfitting in neural networks. We believe our work reveals several critical intrinsic properties of modern neural networks and further gives a new metric for evaluating them.
Tasks
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01399v1
PDF	http://arxiv.org/pdf/1904.01399v1.pdf
PWC	https://paperswithcode.com/paper/on-geometric-structure-of-activation-spaces
Repo
Framework