October 21, 2019


Paper Group AWR 126


Generating Music using an LSTM Network


Title	Generating Music using an LSTM Network
Authors	Nikhil Kotecha, Paul Young
Abstract	A model of music needs to have the ability to recall past details and have a clear, coherent understanding of musical structure. Detailed in the paper is a neural network architecture that predicts and generates polyphonic music aligned with musical rules. The probabilistic model presented is a Bi-axial LSTM trained with a kernel reminiscent of a convolutional kernel. When analyzed quantitatively and qualitatively, this approach performs well in composing polyphonic music. Link to the code is provided.
Tasks
Published	2018-04-18
URL	http://arxiv.org/abs/1804.07300v1
PDF	http://arxiv.org/pdf/1804.07300v1.pdf
PWC	https://paperswithcode.com/paper/generating-music-using-an-lstm-network
Repo	https://github.com/nikhil-kotecha/Generating_Music
Framework	none

A Call for Clarity in Reporting BLEU Scores


Title	A Call for Clarity in Reporting BLEU Scores
Authors	Matt Post
Abstract	The field of machine translation faces an under-recognized problem because of inconsistency in the reporting of scores from its dominant metric. Although people refer to “the” BLEU score, BLEU is in fact a parameterized metric whose values can vary wildly with changes to these parameters. These parameters are often not reported or are hard to find, and consequently, BLEU scores between papers cannot be directly compared. I quantify this variation, finding differences as high as 1.8 between commonly used configurations. The main culprit is different tokenization and normalization schemes applied to the reference. Pointing to the success of the parsing community, I suggest machine translation researchers settle upon the BLEU scheme used by the annual Conference on Machine Translation (WMT), which does not allow for user-supplied reference processing, and provide a new tool, SacreBLEU, to facilitate this.
Tasks	Machine Translation, Tokenization
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08771v2
PDF	http://arxiv.org/pdf/1804.08771v2.pdf
PWC	https://paperswithcode.com/paper/a-call-for-clarity-in-reporting-bleu-scores
Repo	https://github.com/mjpost/sacreBLEU
Framework	none
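
The abstract's point that BLEU depends on preprocessing can be illustrated with a small sketch. This is not SacreBLEU and not the WMT scheme: `toy_bleu`, `whitespace_tok` and `punct_tok` are hypothetical helpers implementing a simplified single-reference BLEU (clipped n-gram precisions up to bigrams, with a brevity penalty). The same sentence pair is scored under two tokenization and normalization choices, applied here to both hypothesis and reference.

```python
import math
import re
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def toy_bleu(hyp_tokens, ref_tokens, max_n=2):
    """Simplified BLEU: geometric mean of clipped precisions times brevity penalty."""
    if not hyp_tokens:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp = ngrams(hyp_tokens, n)
        ref = ngrams(ref_tokens, n)
        total = sum(hyp.values())
        matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing in this toy version
        log_precision += math.log(matched / total) / max_n
    if len(hyp_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(ref_tokens) / len(hyp_tokens))
    return brevity_penalty * math.exp(log_precision)


def whitespace_tok(text):
    """Scheme A (assumed): split on whitespace, keep case and attached punctuation."""
    return text.split()


def punct_tok(text):
    """Scheme B (assumed): lowercase and split punctuation into separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


hypothesis = "the cat sat on the mat."
reference = "The cat sat on a mat."

score_a = toy_bleu(whitespace_tok(hypothesis), whitespace_tok(reference))
score_b = toy_bleu(punct_tok(hypothesis), punct_tok(reference))
print(f"whitespace tokens: {score_a:.3f}")
print(f"lowercased, punctuation split: {score_b:.3f}")
```

The two printed scores differ although the underlying strings are identical, which is why a score is hard to compare across papers unless the preprocessing is reported with it.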

Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition


Title	Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition
Authors	Federico Magliani, Tomaso Fontanini, Andrea Prati
Abstract	Landmark recognition has achieved excellent results on small-scale datasets. When dealing with large-scale retrieval, issues that were irrelevant with small amounts of data quickly become fundamental for an efficient retrieval phase. In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It drastically reduces the query time and achieves higher accuracy than state-of-the-art methods for large-scale landmark recognition. It has been demonstrated that this family of algorithms can be applied to different embedding techniques like VLAD and R-MAC, obtaining excellent results in very short times on different public datasets: Holidays+Flickr1M, Oxford105k and Paris106k.
Tasks
Published	2018-06-15
URL	http://arxiv.org/abs/1806.05946v1
PDF	http://arxiv.org/pdf/1806.05946v1.pdf
PWC	https://paperswithcode.com/paper/efficient-nearest-neighbors-search-for-large
Repo	https://github.com/fmaglia/BoI
Framework	none

CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning


Title	CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Authors	Jerome Abdelnour, Giampiero Salvi, Jean Rouat
Abstract	We introduce the task of acoustic question answering (AQA) in the area of acoustic reasoning. In this task an agent learns to answer questions on the basis of acoustic context. In order to promote research in this area, we propose a data generation paradigm adapted from CLEVR (Johnson et al. 2017). We generate acoustic scenes by leveraging a bank of elementary sounds. We also provide a number of functional programs that can be used to compose questions and answers that exploit the relationships between the attributes of the elementary sounds in each scene. We provide AQA datasets of various sizes as well as the data generation code. As a preliminary experiment to validate our data, we report the accuracy of current state of the art visual question answering models when they are applied to the AQA task without modifications. Although there is a plethora of question answering tasks based on text, image or video data, to our knowledge, we are the first to propose answering questions directly on audio streams. We hope this contribution will facilitate the development of research in the area.
Tasks	Acoustic Question Answering, Question Answering, Visual Question Answering
Published	2018-11-26
URL	http://arxiv.org/abs/1811.10561v1
PDF	http://arxiv.org/pdf/1811.10561v1.pdf
PWC	https://paperswithcode.com/paper/clear-a-dataset-for-compositional-language
Repo	https://github.com/IGLU-CHISTERA/CLEAR-dataset-generation
Framework	none

Simple random search provides a competitive approach to reinforcement learning


Title	Simple random search provides a competitive approach to reinforcement learning
Authors	Horia Mania, Aurelia Guy, Benjamin Recht
Abstract	A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks. Our method also finds a nearly optimal controller for a challenging instance of the Linear Quadratic Regulator, a classical problem in control theory, when the dynamics are not known. Computationally, our random search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks. We take advantage of this computational efficiency to evaluate the performance of our method over hundreds of random seeds and many different hyperparameter configurations for each benchmark task. Our simulations highlight a high variability in performance in these benchmark tasks, suggesting that commonly used estimations of sample efficiency do not adequately evaluate the performance of RL algorithms.
Tasks	Continuous Control
Published	2018-03-19
URL	http://arxiv.org/abs/1803.07055v1
PDF	http://arxiv.org/pdf/1803.07055v1.pdf
PWC	https://paperswithcode.com/paper/simple-random-search-provides-a-competitive
Repo	https://github.com/kayuksel/pytorch-ars
Framework	pytorch
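
The idea of random search in the parameter space of a policy can be sketched as follows. This is a plain two-sided finite-difference random search, not the authors' exact algorithm and not the linked repository's code. `rollout_return` is a hypothetical stand-in for a simulator rollout (a toy quadratic rather than a MuJoCo task), and `TARGET`, `step`, `noise` and `directions` are illustrative assumptions.

```python
import random

# Hypothetical optimal policy weights; a real task would hide these inside a simulator.
TARGET = [0.5, -1.0, 2.0]


def rollout_return(theta):
    """Black-box episodic return of a policy with weights theta (toy quadratic)."""
    return -sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def random_search(theta, iterations=200, directions=8, step=0.5, noise=0.1, seed=0):
    """Perturb the weights in random directions and move along the better ones."""
    rng = random.Random(seed)
    dim = len(theta)
    for _ in range(iterations):
        update = [0.0] * dim
        for _ in range(directions):
            delta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            # Evaluate the policy at theta + noise*delta and theta - noise*delta.
            plus = rollout_return([t + noise * d for t, d in zip(theta, delta)])
            minus = rollout_return([t - noise * d for t, d in zip(theta, delta)])
            for i in range(dim):
                update[i] += (plus - minus) * delta[i]
        # Average the reward-weighted directions and take a step.
        theta = [t + step / directions * u for t, u in zip(theta, update)]
    return theta


initial = [0.0, 0.0, 0.0]
learned = random_search(initial)
print("return before:", rollout_return(initial))
print("return after: ", rollout_return(learned))
```

Only returns of whole rollouts are used, with no gradients through the policy or the environment, which is what makes each update cheap to compute.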

Quickshift++: Provably Good Initializations for Sample-Based Mean Shift


Title	Quickshift++: Provably Good Initializations for Sample-Based Mean Shift
Authors	Heinrich Jiang, Jennifer Jang, Samory Kpotufe
Abstract	We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification. We then show strong clustering performance on real datasets as well as promising applications to image segmentation.
Tasks	Semantic Segmentation
Published	2018-05-21
URL	http://arxiv.org/abs/1805.07909v1
PDF	http://arxiv.org/pdf/1805.07909v1.pdf
PWC	https://paperswithcode.com/paper/quickshift-provably-good-initializations-for
Repo	https://github.com/google/quickshift
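Framework	none

Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention


Title	Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention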
Authors	Khanh Nguyen, Debadeepta Dey, Chris Brockett, Bill Dolan
Abstract	We present Vision-based Navigation with Language-based Assistance (VNLA), a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments. The task emulates a real-world scenario in that (a) the requester may not know how to navigate to the target objects and thus makes requests by only specifying high-level end-goals, and (b) the agent is capable of sensing when it is lost and querying an advisor, who is more qualified at the task, to obtain language subgoals to make progress. To model language-based assistance, we develop a general framework termed Imitation Learning with Indirect Intervention (I3L), and propose a solution that is effective on the VNLA task. Empirical results show that this approach significantly improves the success rate of the learning agent over other baselines in both seen and unseen environments. Our code and data are publicly available at https://github.com/debadeepta/vnla .
Tasks	Imitation Learning, Vision-based navigation with language-based assistance, VNLA
Published	2018-12-10
URL	http://arxiv.org/abs/1812.04155v4
PDF	http://arxiv.org/pdf/1812.04155v4.pdf
PWC	https://paperswithcode.com/paper/vision-based-navigation-with-language-based
Repo	https://github.com/debadeepta/vnla
Framework	pytorch

Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus


Title	Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus
Authors	Jianxiong Dong, Jim Huang
Abstract	The Ubuntu dialogue corpus is the largest publicly available dialogue corpus, making it feasible to build end-to-end deep neural network models directly from the conversation data. One challenge of the Ubuntu dialogue corpus is the large number of out-of-vocabulary words. In this paper we propose a method which combines the general pre-trained word embedding vectors with those generated on the task-specific training set to address this issue. We integrate character embedding into Chen et al.’s Enhanced LSTM method (ESIM) and use it to evaluate the effectiveness of our proposed method. For the task of next utterance selection, the proposed method demonstrates a significant performance improvement over the original ESIM, and the new model achieves state-of-the-art results on both the Ubuntu dialogue corpus and the Douban conversation corpus. In addition, we investigate the performance impact of end-of-utterance and end-of-turn token tags.
Tasks
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02614v2
PDF	http://arxiv.org/pdf/1802.02614v2.pdf
PWC	https://paperswithcode.com/paper/enhance-word-representation-for-out-of
Repo	https://github.com/jdongca2003/next_utterance_selection
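Framework	tf

Forecasting the presence and intensity of hostility on Instagram using linguistic and social features


Title	Forecasting the presence and intensity of hostility on Instagram using linguistic and social features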
Authors	Ping Liu, Joshua Guberman, Libby Hemphill, Aron Culotta
Abstract	Online antisocial behavior, such as cyberbullying, harassment, and trolling, is a widespread problem that threatens free discussion and has negative physical and mental health consequences for victims and communities. While prior work has proposed automated methods to identify hostile comments in online discussions, these methods work retrospectively on comments that have already been posted, making it difficult to intervene before an interaction escalates. In this paper we instead consider the problem of forecasting future hostilities in online discussions, which we decompose into two tasks: (1) given an initial sequence of non-hostile comments in a discussion, predict whether some future comment will contain hostility; and (2) given the first hostile comment in a discussion, predict whether this will lead to an escalation of hostility in subsequent comments. Thus, we aim to forecast both the presence and intensity of hostile comments based on linguistic and social features from earlier comments. To evaluate our approach, we introduce a corpus of over 30K annotated Instagram comments from over 1,100 posts. Our approach is able to predict the appearance of a hostile comment on an Instagram post ten or more hours in the future with an AUC of .82 (task 1), and can furthermore distinguish between high and low levels of future hostility with an AUC of .91 (task 2).
Tasks
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06759v1
PDF	http://arxiv.org/pdf/1804.06759v1.pdf
PWC	https://paperswithcode.com/paper/forecasting-the-presence-and-intensity-of
Repo	https://github.com/tapilab/icwsm-2018-hostility
Framework	none

Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection


Title	Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Authors	Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, Asaf Shabtai
Abstract	Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors makes them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and router devices, which could potentially host an NIDS, simply do not have the memory or processing power to train and sometimes even execute such models. More importantly, the existing neural network solutions are trained in a supervised manner, meaning that an expert must label the network traffic and update the model manually from time to time. In this paper, we present Kitsune: a plug and play NIDS which can learn to detect attacks on the local network, without supervision, and in an efficient online manner. Kitsune’s core algorithm (KitNET) uses an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET is supported by a feature extraction framework which efficiently tracks the patterns of every network channel. Our evaluations show that Kitsune can detect various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry Pi. This demonstrates that Kitsune can be a practical and economic NIDS.
Tasks	Intrusion Detection, Network Intrusion Detection
Published	2018-02-25
URL	http://arxiv.org/abs/1802.09089v2
PDF	http://arxiv.org/pdf/1802.09089v2.pdf
PWC	https://paperswithcode.com/paper/kitsune-an-ensemble-of-autoencoders-for
Repo	https://github.com/ymirsky/Kitsune-py
Framework	none

A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets


Title	A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets
Authors	Hanan Hindy, David Brosset, Ethan Bayne, Amar Seeam, Christos Tachtatzis, Robert Atkinson, Xavier Bellekens
Abstract	With the world moving towards being increasingly dependent on computers and automation, one of the main challenges in the current decade has been to build secure applications, systems and networks. Alongside these challenges, the number of threats is rising exponentially due to the attack surface increasing through numerous interfaces offered for each service. To alleviate the impact of these threats, researchers have proposed numerous solutions; however, current tools often fail to adapt to ever-changing architectures, associated threats and 0-days. This manuscript aims to provide researchers with a taxonomy and survey of current dataset composition and current Intrusion Detection Systems (IDS) capabilities and assets. These taxonomies and surveys aim to improve both the efficiency of IDS and the creation of datasets to build the next generation IDS as well as to reflect network threats more accurately in future datasets. To this end, this manuscript also provides a taxonomy and survey of network threats and associated tools. The manuscript highlights that current IDS only cover 25% of our threat taxonomy, while current datasets demonstrate a clear lack of real-network threats and attack representation, but rather include a large number of deprecated threats, hence limiting the accuracy of current machine learning IDS. Moreover, the taxonomies are open-sourced to allow public contributions through a GitHub repository.
Tasks	Intrusion Detection
Published	2018-06-09
URL	http://arxiv.org/abs/1806.03517v1
PDF	http://arxiv.org/pdf/1806.03517v1.pdf
PWC	https://paperswithcode.com/paper/a-taxonomy-and-survey-of-intrusion-detection
Repo	https://github.com/AbertayMachineLearningGroup/network-threats-taxonomy
Framework	none

Domain Adaptation with Randomized Expectation Maximization


Title	Domain Adaptation with Randomized Expectation Maximization
Authors	Twan van Laarhoven, Elena Marchiori
Abstract	Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state of the art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both source and target domain. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.
Tasks	Domain Adaptation, Transfer Learning
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07634v1
PDF	http://arxiv.org/pdf/1803.07634v1.pdf
PWC	https://paperswithcode.com/paper/domain-adaptation-with-randomized-expectation
Repo	https://github.com/twanvl/adrem
Framework	none

Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization


Title	Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization
Authors	Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir
Abstract	Adversarial self-play in two-player games has delivered impressive results when used with reinforcement learning algorithms that combine deep neural networks and tree search. Algorithms like AlphaZero and Expert Iteration learn tabula-rasa, producing highly informative training data on the fly. However, the self-play training strategy is not directly applicable to single-player games. Recently, several practically important combinatorial optimisation problems, such as the travelling salesman problem and the bin packing problem, have been reformulated as reinforcement learning problems, increasing the importance of enabling the benefits of self-play beyond two-player games. We present the Ranked Reward (R2) algorithm which accomplishes this by ranking the rewards obtained by a single agent over multiple games to create a relative performance metric. Results from applying the R2 algorithm to instances of two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms and integer programming solvers. We also present an analysis of the ranked reward mechanism, in particular, the effects of problem instances with varying difficulty and different ranking thresholds.
Tasks	Combinatorial Optimization
Published	2018-07-04
URL	http://arxiv.org/abs/1807.01672v3
PDF	http://arxiv.org/pdf/1807.01672v3.pdf
PWC	https://paperswithcode.com/paper/ranked-reward-enabling-self-play
Repo	https://github.com/karl-hajjar/InstaDeep-internship-Deep-RL
Framework	none
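
One plausible reading of "ranking the rewards obtained by a single agent over multiple games" is sketched below. The details are assumptions rather than the paper's specification: the class name `RankedReward`, the nearest-rank percentile, the buffer size, the binary +1/-1 output and the tie handling (ties count as wins) are all illustrative choices.

```python
import math
from collections import deque


class RankedReward:
    """Turn raw single-player rewards into a relative win/loss signal."""

    def __init__(self, percentile=75.0, buffer_size=250):
        self.percentile = percentile
        # Only the most recent rewards are kept, so the bar rises as the agent improves.
        self.buffer = deque(maxlen=buffer_size)

    def threshold(self):
        """Nearest-rank percentile of the rewards currently in the buffer."""
        ordered = sorted(self.buffer)
        rank = max(1, math.ceil(self.percentile / 100.0 * len(ordered)))
        return ordered[rank - 1]

    def reshape(self, reward):
        """Record the reward and return +1 if it meets the threshold, else -1."""
        self.buffer.append(reward)
        return 1.0 if reward >= self.threshold() else -1.0


ranker = RankedReward(percentile=50.0, buffer_size=4)
signals = [ranker.reshape(r) for r in [1, 2, 3, 4, 0]]
print(signals)
```

Under this reading the agent is effectively playing against its own recent performance, which gives a single-player problem the win/loss structure that self-play methods expect.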

Structured Uncertainty Prediction Networks


Title	Structured Uncertainty Prediction Networks
Authors	Garoe Dorta, Sara Vicente, Lourdes Agapito, Neill D. F. Campbell, Ivor Simpson
Abstract	This paper is the first work to propose a network to predict a structured uncertainty distribution for a synthesized image. Previous approaches have been mostly limited to predicting diagonal covariance matrices. Our novel model learns to predict a full Gaussian covariance matrix for each reconstruction, which permits efficient sampling and likelihood evaluation. We demonstrate that our model can accurately reconstruct ground truth correlated residual distributions for synthetic datasets and generate plausible high frequency samples for real face images. We also illustrate the use of these predicted covariances for structure preserving image denoising.
Tasks	Denoising, Image Denoising
Published	2018-02-20
URL	http://arxiv.org/abs/1802.07079v2
PDF	http://arxiv.org/pdf/1802.07079v2.pdf
PWC	https://paperswithcode.com/paper/structured-uncertainty-prediction-networks
Repo	https://github.com/Garoe/tf_mvg
Framework	tf
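
Why a full covariance can still permit efficient sampling and likelihood evaluation is easiest to see with a triangular factor. The sketch below assumes the covariance is available as a lower-triangular Cholesky factor `chol` with `chol @ chol.T` equal to the covariance. How the network in the paper parameterizes and predicts its matrix is not shown here, and the function names and the 2x2 example values are hypothetical.

```python
import math
import random


def sample(mean, chol, rng):
    """Draw x = mean + L z with z ~ N(0, I), where L is lower triangular."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][j] * z[j] for j in range(i + 1))
            for i, m in enumerate(mean)]


def log_likelihood(x, mean, chol):
    """Gaussian log-density computed from the Cholesky factor of the covariance."""
    d = len(mean)
    # Forward substitution solves L y = x - mean without inverting any matrix.
    y = []
    for i in range(d):
        acc = x[i] - mean[i] - sum(chol[i][j] * y[j] for j in range(i))
        y.append(acc / chol[i][i])
    # log det(covariance) is twice the sum of the log-diagonal of L.
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(d))
    quad = sum(v * v for v in y)
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + quad)


mean = [0.0, 0.0]
chol = [[2.0, 0.0],
        [1.0, 1.0]]  # covariance = [[4, 2], [2, 2]], so the two pixels are correlated

draw = sample(mean, chol, random.Random(0))
print("sample:", draw)
print("log-likelihood of [2, 3]:", log_likelihood([2.0, 3.0], mean, chol))
```

A diagonal covariance corresponds to a `chol` with zeros off the diagonal; the off-diagonal entries are what let neighbouring residuals vary together.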

MobileFace: 3D Face Reconstruction with Efficient CNN Regression


Title	MobileFace: 3D Face Reconstruction with Efficient CNN Regression
Authors	Nikolai Chinaev, Alexander Chigorin, Ivan Laptev
Abstract	Estimation of facial shapes plays a central role for face transfer and animation. Accurate 3D face reconstruction, however, often deploys iterative and costly methods preventing real-time applications. In this work we design a compact and fast CNN model enabling real-time face reconstruction on mobile devices. For this purpose, we first study more traditional but slow morphable face models and use them to automatically annotate a large set of images for CNN training. We then investigate a class of efficient MobileNet CNNs and adapt such models for the task of shape regression. Our evaluation on three datasets demonstrates significant improvements in the speed and the size of our model while maintaining state-of-the-art reconstruction accuracy.
Tasks	3D Face Reconstruction, Face Reconstruction, Face Transfer
Published	2018-09-24
URL	http://arxiv.org/abs/1809.08809v1
PDF	http://arxiv.org/pdf/1809.08809v1.pdf
PWC	https://paperswithcode.com/paper/mobileface-3d-face-reconstruction-with
Repo	https://github.com/nchinaev/MobileFace
Framework	none