October 21, 2019

2975 words · 14 min read

Paper Group AWR 126

Generating Music using an LSTM Network. A Call for Clarity in Reporting BLEU Scores. Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition. CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning. Simple random search provides a competitive approach to reinforcement learning. Quickshift++: Provably Good Initi …

Generating Music using an LSTM Network

Title Generating Music using an LSTM Network
Authors Nikhil Kotecha, Paul Young
Abstract A model of music needs to have the ability to recall past details and have a clear, coherent understanding of musical structure. Detailed in the paper is a neural network architecture that predicts and generates polyphonic music aligned with musical rules. The probabilistic model presented is a Bi-axial LSTM trained with a kernel reminiscent of a convolutional kernel. When analyzed quantitatively and qualitatively, this approach performs well in composing polyphonic music. A link to the code is provided.
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.07300v1
PDF http://arxiv.org/pdf/1804.07300v1.pdf
PWC https://paperswithcode.com/paper/generating-music-using-an-lstm-network
Repo https://github.com/nikhil-kotecha/Generating_Music
Framework none
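
The linked repo holds the authors' code; as an illustration of the bi-axial idea only (one recurrence along the time axis of a piano roll, a second along the note axis), here is a minimal PyTorch sketch. The layer sizes, 88-note range and two-logit (play/articulate) output are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BiAxialLSTM(nn.Module):
    """Minimal sketch of a bi-axial LSTM: recur over time, then over notes."""
    def __init__(self, feat_dim=32, time_hidden=64, note_hidden=32):
        super().__init__()
        self.time_lstm = nn.LSTM(feat_dim, time_hidden, batch_first=True)
        self.note_lstm = nn.LSTM(time_hidden, note_hidden, batch_first=True)
        self.out = nn.Linear(note_hidden, 2)   # (play, articulate) logits per note

    def forward(self, x):
        # x: (batch, time, notes, feat_dim) per-note input features
        b, t, n, f = x.shape
        # Time axis: one LSTM per note (fold the note axis into the batch).
        xt = x.permute(0, 2, 1, 3).reshape(b * n, t, f)
        ht, _ = self.time_lstm(xt)
        ht = ht.reshape(b, n, t, -1).permute(0, 2, 1, 3)
        # Note axis: one LSTM per timestep (fold the time axis into the batch).
        hn, _ = self.note_lstm(ht.reshape(b * t, n, -1))
        return self.out(hn).reshape(b, t, n, 2)

x = torch.randn(4, 16, 88, 32)    # dummy batch of piano-roll features
print(BiAxialLSTM()(x).shape)     # torch.Size([4, 16, 88, 2])
```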

A Call for Clarity in Reporting BLEU Scores

Title A Call for Clarity in Reporting BLEU Scores
Authors Matt Post
Abstract The field of machine translation faces an under-recognized problem because of inconsistency in the reporting of scores from its dominant metric. Although people refer to “the” BLEU score, BLEU is in fact a parameterized metric whose values can vary wildly with changes to these parameters. These parameters are often not reported or are hard to find, and consequently, BLEU scores between papers cannot be directly compared. I quantify this variation, finding differences as high as 1.8 between commonly used configurations. The main culprit is different tokenization and normalization schemes applied to the reference. Pointing to the success of the parsing community, I suggest machine translation researchers settle upon the BLEU scheme used by the annual Conference on Machine Translation (WMT), which does not allow for user-supplied reference processing, and provide a new tool, SacreBLEU, to facilitate this.
Tasks Machine Translation, Tokenization
Published 2018-04-23
URL http://arxiv.org/abs/1804.08771v2
PDF http://arxiv.org/pdf/1804.08771v2.pdf
PWC https://paperswithcode.com/paper/a-call-for-clarity-in-reporting-bleu-scores
Repo https://github.com/mjpost/sacreBLEU
Framework none
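
SacreBLEU ships as both a command-line tool and a Python package. A minimal usage sketch follows (names are from the 2.x Python API; older releases expose a module-level corpus_bleu instead):

```python
# pip install sacrebleu
from sacrebleu.metrics import BLEU

hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]   # one inner list per reference stream

bleu = BLEU()                          # WMT-style tokenization, no user preprocessing
result = bleu.corpus_score(hypotheses, references)
print(result.score)                    # corpus-level BLEU
print(bleu.get_signature())            # the reproducibility signature the paper argues for
```

Reporting the signature alongside the score is the paper's central recommendation, since it pins down tokenization, casing, smoothing and the SacreBLEU version.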

Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition

Title Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition
Authors Federico Magliani, Tomaso Fontanini, Andrea Prati
Abstract The problem of landmark recognition has achieved excellent results on small-scale datasets. When dealing with large-scale retrieval, issues that were irrelevant with a small amount of data quickly become fundamental for an efficient retrieval phase. In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It drastically reduces query time and outperforms state-of-the-art methods in accuracy for large-scale landmark recognition. We demonstrate that this family of algorithms can be applied to different embedding techniques, such as VLAD and R-MAC, obtaining excellent results in very short times on several public datasets: Holidays+Flickr1M, Oxford105k and Paris106k.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.05946v1
PDF http://arxiv.org/pdf/1806.05946v1.pdf
PWC https://paperswithcode.com/paper/efficient-nearest-neighbors-search-for-large
Repo https://github.com/fmaglia/BoI
Framework none
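
The BoI code is in the repo above; the sketch below only illustrates the generic multi-index voting idea it builds on: several hash tables vote for candidates, and the top-voted candidates are re-ranked by exact distance. The random-hyperplane LSH, table count and re-ranking depth are assumptions, not the paper's exact scheme.

```python
import numpy as np

def build_lsh_tables(db, n_tables=10, n_bits=16, seed=0):
    """Random-hyperplane LSH: one hash table per set of random projections."""
    rng = np.random.default_rng(seed)
    bit_weights = 1 << np.arange(n_bits)
    projections, tables = [], []
    for _ in range(n_tables):
        P = rng.standard_normal((db.shape[1], n_bits))
        keys = (db @ P > 0).astype(np.int64) @ bit_weights   # pack sign bits into ints
        table = {}
        for idx, key in enumerate(keys):
            table.setdefault(int(key), []).append(idx)
        projections.append(P)
        tables.append(table)
    return projections, tables

def query(q, db, projections, tables, top_votes=100, top_k=5):
    """Accumulate votes across tables, then re-rank the best candidates exactly."""
    votes = np.zeros(len(db), dtype=np.int64)
    for P, table in zip(projections, tables):
        key = int((q @ P > 0).astype(np.int64) @ (1 << np.arange(P.shape[1])))
        votes[table.get(key, [])] += 1
    candidates = np.argsort(-votes)[:top_votes]
    dists = np.linalg.norm(db[candidates] - q, axis=1)
    return candidates[np.argsort(dists)[:top_k]]

db = np.random.randn(10000, 128).astype(np.float32)
proj, tabs = build_lsh_tables(db)
print(query(db[42] + 0.01 * np.random.randn(128), db, proj, tabs))
```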

CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning

Title CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Authors Jerome Abdelnour, Giampiero Salvi, Jean Rouat
Abstract We introduce the task of acoustic question answering (AQA) in the area of acoustic reasoning. In this task an agent learns to answer questions on the basis of acoustic context. In order to promote research in this area, we propose a data generation paradigm adapted from CLEVR (Johnson et al. 2017). We generate acoustic scenes by leveraging a bank of elementary sounds. We also provide a number of functional programs that can be used to compose questions and answers that exploit the relationships between the attributes of the elementary sounds in each scene. We provide AQA datasets of various sizes as well as the data generation code. As a preliminary experiment to validate our data, we report the accuracy of current state-of-the-art visual question answering models when they are applied to the AQA task without modifications. Although there is a plethora of question answering tasks based on text, image or video data, to our knowledge, we are the first to propose answering questions directly on audio streams. We hope this contribution will facilitate the development of research in the area.
Tasks Acoustic Question Answering, Question Answering, Visual Question Answering
Published 2018-11-26
URL http://arxiv.org/abs/1811.10561v1
PDF http://arxiv.org/pdf/1811.10561v1.pdf
PWC https://paperswithcode.com/paper/clear-a-dataset-for-compositional-language
Repo https://github.com/IGLU-CHISTERA/CLEAR-dataset-generation
Framework none
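
The full generator is in the repo; the toy sketch below only conveys the CLEVR-style recipe the abstract describes: sample a scene of elementary sounds with attributes, then instantiate question templates (simple functional programs) whose answers are computed from those attributes. The attribute values and templates here are made up for illustration.

```python
import random

INSTRUMENTS = ["cello", "flute", "trumpet"]
LOUDNESS = ["quiet", "loud"]
NOTES = ["A", "C#", "G"]

def sample_scene(n_sounds=5, seed=None):
    rng = random.Random(seed)
    return [{"instrument": rng.choice(INSTRUMENTS),
             "loudness": rng.choice(LOUDNESS),
             "note": rng.choice(NOTES),
             "position": i} for i in range(n_sounds)]

# A "functional program" is just a chain of filters/queries over scene attributes.
def count_question(scene, instrument):
    answer = sum(1 for s in scene if s["instrument"] == instrument)
    return f"How many {instrument} sounds are there?", answer

def attribute_question(scene, position):
    answer = scene[position]["loudness"]
    return f"What is the loudness of sound number {position + 1}?", answer

scene = sample_scene(seed=0)
print(count_question(scene, "flute"))
print(attribute_question(scene, 2))
```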

Simple random search provides a competitive approach to reinforcement learning

Title Simple random search provides a competitive approach to reinforcement learning
Authors Horia Mania, Aurelia Guy, Benjamin Recht
Abstract A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks. Our method also finds a nearly optimal controller for a challenging instance of the Linear Quadratic Regulator, a classical problem in control theory, when the dynamics are not known. Computationally, our random search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks. We take advantage of this computational efficiency to evaluate the performance of our method over hundreds of random seeds and many different hyperparameter configurations for each benchmark task. Our simulations highlight a high variability in performance in these benchmark tasks, suggesting that commonly used estimations of sample efficiency do not adequately evaluate the performance of RL algorithms.
Tasks Continuous Control
Published 2018-03-19
URL http://arxiv.org/abs/1803.07055v1
PDF http://arxiv.org/pdf/1803.07055v1.pdf
PWC https://paperswithcode.com/paper/simple-random-search-provides-a-competitive
Repo https://github.com/kayuksel/pytorch-ars
Framework pytorch
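
The linked repo is a PyTorch reimplementation of the paper's Augmented Random Search. The NumPy sketch below shows only the basic random-search update (perturb a linear policy in random directions, evaluate both signs, step along the reward-weighted average direction); the paper's state normalization, reward-std scaling and top-direction selection are omitted. It assumes an environment exposing the classic gym reset/step interface.

```python
import numpy as np

def rollout(env, W):
    """Total episode reward of the deterministic linear policy a = W @ s."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(W @ obs)
        total += reward
    return total

def basic_random_search(env, obs_dim, act_dim, n_iters=100,
                        n_directions=8, step_size=0.02, noise=0.03):
    W = np.zeros((act_dim, obs_dim))
    for _ in range(n_iters):
        deltas = np.random.randn(n_directions, act_dim, obs_dim)
        r_plus = np.array([rollout(env, W + noise * d) for d in deltas])
        r_minus = np.array([rollout(env, W - noise * d) for d in deltas])
        # Step along the average direction, weighted by the reward differences.
        W = W + step_size * ((r_plus - r_minus)[:, None, None] * deltas).mean(axis=0)
    return W
```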

Quickshift++: Provably Good Initializations for Sample-Based Mean Shift

Title Quickshift++: Provably Good Initializations for Sample-Based Mean Shift
Authors Heinrich Jiang, Jennifer Jang, Samory Kpotufe
Abstract We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification. We then show strong clustering performance on real datasets as well as promising applications to image segmentation.
Tasks Semantic Segmentation
Published 2018-05-21
URL http://arxiv.org/abs/1805.07909v1
PDF http://arxiv.org/pdf/1805.07909v1.pdf
PWC https://paperswithcode.com/paper/quickshift-provably-good-initializations-for
Repo https://github.com/google/quickshift
Framework none
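
The repo above contains the real Quickshift++ implementation. The sketch below shows only the underlying Quick Shift step it improves upon (a k-NN density estimate plus hill-climbing to the nearest denser neighbour); the paper's contribution, replacing the singleton modes with provably good cluster-cores, is omitted here.

```python
import numpy as np
from scipy.spatial import cKDTree

def quickshift_sketch(X, k=20):
    """Sketch of plain Quick Shift: estimate density from the k-th neighbour
    distance, point each sample at its nearest denser neighbour, and follow
    these links to a mode. Quickshift++ replaces the singleton modes with
    cluster-cores, which this sketch omits."""
    tree = cKDTree(X)
    dists, nbrs = tree.query(X, k=k + 1)     # column 0 is the point itself
    density = 1.0 / (dists[:, -1] + 1e-12)   # k-NN density surrogate
    parent = np.arange(len(X))
    for i in range(len(X)):
        denser = [j for j in nbrs[i] if density[j] > density[i]]
        if denser:                            # nbrs are sorted by distance,
            parent[i] = denser[0]             # so this is the nearest denser one
    labels = parent.copy()
    for i in range(len(X)):                   # follow links up to a mode
        while labels[i] != parent[labels[i]]:
            labels[i] = parent[labels[i]]
    return labels                             # cluster id = index of the mode

X = np.vstack([np.random.randn(200, 2), np.random.randn(200, 2) + 6])
print(np.unique(quickshift_sketch(X)).size, "modes found")
```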

Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention

Title Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention
Authors Khanh Nguyen, Debadeepta Dey, Chris Brockett, Bill Dolan
Abstract We present Vision-based Navigation with Language-based Assistance (VNLA), a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments. The task emulates a real-world scenario in that (a) the requester may not know how to navigate to the target objects and thus makes requests by only specifying high-level end-goals, and (b) the agent is capable of sensing when it is lost and querying an advisor, who is more qualified at the task, to obtain language subgoals to make progress. To model language-based assistance, we develop a general framework termed Imitation Learning with Indirect Intervention (I3L), and propose a solution that is effective on the VNLA task. Empirical results show that this approach significantly improves the success rate of the learning agent over other baselines in both seen and unseen environments. Our code and data are publicly available at https://github.com/debadeepta/vnla .
Tasks Imitation Learning, Vision-based navigation with language-based assistance, VNLA
Published 2018-12-10
URL http://arxiv.org/abs/1812.04155v4
PDF http://arxiv.org/pdf/1812.04155v4.pdf
PWC https://paperswithcode.com/paper/vision-based-navigation-with-language-based
Repo https://github.com/debadeepta/vnla
Framework pytorch

Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus

Title Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus
Authors Jianxiong Dong, Jim Huang
Abstract The Ubuntu dialogue corpus is the largest publicly available dialogue corpus, making it feasible to build end-to-end deep neural network models directly from conversation data. One challenge of the Ubuntu dialogue corpus is its large number of out-of-vocabulary words. In this paper we propose a method that combines general pre-trained word embedding vectors with those generated on the task-specific training set to address this issue. We integrated character embedding into Chen et al.'s Enhanced LSTM method (ESIM) and used it to evaluate the effectiveness of our proposed method. For the task of next utterance selection, the proposed method demonstrates a significant performance improvement over the original ESIM, and the new model achieves state-of-the-art results on both the Ubuntu dialogue corpus and the Douban conversation corpus. In addition, we investigated the performance impact of end-of-utterance and end-of-turn token tags.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.02614v2
PDF http://arxiv.org/pdf/1802.02614v2.pdf
PWC https://paperswithcode.com/paper/enhance-word-representation-for-out-of
Repo https://github.com/jdongca2003/next_utterance_selection
Framework tf
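
A sketch of the central idea: concatenate a general pre-trained word vector with one trained on the task corpus, so domain terms missing from the general vocabulary (frequent in Ubuntu logs) still get useful representations. The paper additionally integrates character embeddings inside ESIM, which is omitted here; the dictionaries below are illustrative stand-ins for real embedding files.

```python
import numpy as np

def combine_embeddings(vocab, general, task_specific, dim_g=300, dim_t=100):
    """Concatenate general and task-specific vectors, zero-filling whichever is missing."""
    combined = {}
    for word in vocab:
        g = general.get(word, np.zeros(dim_g))        # OOV in the general embedding
        t = task_specific.get(word, np.zeros(dim_t))  # OOV in the task embedding
        combined[word] = np.concatenate([g, t])
    return combined

# Toy example: "apt-get" is absent from the general embedding but present in
# vectors trained on the Ubuntu corpus itself.
general = {"install": np.random.randn(300)}
task = {"install": np.random.randn(100), "apt-get": np.random.randn(100)}
vectors = combine_embeddings(["install", "apt-get"], general, task)
print(vectors["apt-get"].shape)   # (400,)
```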

Forecasting the presence and intensity of hostility on Instagram using linguistic and social features

Title Forecasting the presence and intensity of hostility on Instagram using linguistic and social features
Authors Ping Liu, Joshua Guberman, Libby Hemphill, Aron Culotta
Abstract Online antisocial behavior, such as cyberbullying, harassment, and trolling, is a widespread problem that threatens free discussion and has negative physical and mental health consequences for victims and communities. While prior work has proposed automated methods to identify hostile comments in online discussions, these methods work retrospectively on comments that have already been posted, making it difficult to intervene before an interaction escalates. In this paper we instead consider the problem of forecasting future hostilities in online discussions, which we decompose into two tasks: (1) given an initial sequence of non-hostile comments in a discussion, predict whether some future comment will contain hostility; and (2) given the first hostile comment in a discussion, predict whether this will lead to an escalation of hostility in subsequent comments. Thus, we aim to forecast both the presence and intensity of hostile comments based on linguistic and social features from earlier comments. To evaluate our approach, we introduce a corpus of over 30K annotated Instagram comments from over 1,100 posts. Our approach is able to predict the appearance of a hostile comment on an Instagram post ten or more hours in the future with an AUC of .82 (task 1), and can furthermore distinguish between high and low levels of future hostility with an AUC of .91 (task 2).
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.06759v1
PDF http://arxiv.org/pdf/1804.06759v1.pdf
PWC https://paperswithcode.com/paper/forecasting-the-presence-and-intensity-of
Repo https://github.com/tapilab/icwsm-2018-hostility
Framework none
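
A sketch of the task-1 setup: text from the initial (non-hostile) comments of a post feeds a binary classifier that is scored with AUC. The toy data, TF-IDF features and logistic regression below are stand-ins to make the pipeline runnable; the paper combines richer linguistic and social features.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy stand-in: each "document" is the concatenated early comments of one post,
# and the label marks whether a hostile comment appeared later.
docs = ["love this photo so cute", "great shot where was this",
        "why do you even bother posting", "everyone is tired of your drama",
        "beautiful colours nice edit", "this account gets worse every week"] * 20
labels = [0, 0, 1, 1, 0, 1] * 20

X_train, X_test, y_train, y_test = train_test_split(
    docs, labels, test_size=0.25, random_state=0, stratify=labels)

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

scores = clf.predict_proba(vectorizer.transform(X_test))[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
```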

Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

Title Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Authors Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, Asaf Shabtai
Abstract Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors makes them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and routers, which could potentially host an NIDS, simply do not have the memory or processing power to train and sometimes even execute such models. More importantly, existing neural network solutions are trained in a supervised manner, meaning that an expert must label the network traffic and update the model manually from time to time. In this paper, we present Kitsune: a plug-and-play NIDS which can learn to detect attacks on the local network, without supervision, and in an efficient online manner. Kitsune’s core algorithm (KitNET) uses an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET is supported by a feature extraction framework which efficiently tracks the patterns of every network channel. Our evaluations show that Kitsune can detect various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry Pi. This demonstrates that Kitsune can be a practical and economical NIDS.
Tasks Intrusion Detection, Network Intrusion Detection
Published 2018-02-25
URL http://arxiv.org/abs/1802.09089v2
PDF http://arxiv.org/pdf/1802.09089v2.pdf
PWC https://paperswithcode.com/paper/kitsune-an-ensemble-of-autoencoders-for
Repo https://github.com/ymirsky/Kitsune-py
Framework none
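
The repo above contains the full Kitsune/KitNET implementation, which learns its feature map and trains incrementally online. The PyTorch sketch below shows only the ensemble structure described in the abstract: small autoencoders over feature subsets whose reconstruction RMSEs feed an output autoencoder, whose own reconstruction error is the anomaly score. The fixed feature split and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    """One small autoencoder over a subset of features."""
    def __init__(self, n_in, hidden=3):
        super().__init__()
        self.enc = nn.Linear(n_in, hidden)
        self.dec = nn.Linear(hidden, n_in)

    def forward(self, x):
        return self.dec(torch.sigmoid(self.enc(x)))

class AEEnsemble(nn.Module):
    """Ensemble of autoencoders over feature groups; their reconstruction RMSEs
    feed an output autoencoder whose reconstruction error is the anomaly score."""
    def __init__(self, n_features, group_size=5):
        super().__init__()
        self.groups = [list(range(i, min(i + group_size, n_features)))
                       for i in range(0, n_features, group_size)]
        self.members = nn.ModuleList([TinyAE(len(g)) for g in self.groups])
        self.output_ae = TinyAE(len(self.groups))

    def forward(self, x):
        rmses = torch.stack([
            torch.sqrt(((ae(x[:, g]) - x[:, g]) ** 2).mean(dim=1))
            for ae, g in zip(self.members, self.groups)], dim=1)
        recon = self.output_ae(rmses)
        return torch.sqrt(((recon - rmses) ** 2).mean(dim=1))   # anomaly score

model = AEEnsemble(n_features=23)
print(model(torch.randn(8, 23)).shape)   # torch.Size([8]), one score per packet
```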

A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets

Title A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets
Authors Hanan Hindy, David Brosset, Ethan Bayne, Amar Seeam, Christos Tachtatzis, Robert Atkinson, Xavier Bellekens
Abstract With the world moving towards being increasingly dependent on computers and automation, one of the main challenges in the current decade has been to build secure applications, systems and networks. Alongside these challenges, the number of threats is rising exponentially due to the attack surface increasing through numerous interfaces offered for each service. To alleviate the impact of these threats, researchers have proposed numerous solutions; however, current tools often fail to adapt to ever-changing architectures, associated threats and 0-days. This manuscript aims to provide researchers with a taxonomy and survey of current dataset composition and current Intrusion Detection Systems (IDS) capabilities and assets. These taxonomies and surveys aim to improve both the efficiency of IDS and the creation of datasets to build the next generation of IDS, as well as to reflect network threats more accurately in future datasets. To this end, this manuscript also provides a taxonomy and survey of network threats and associated tools. The manuscript highlights that current IDS only cover 25% of our threat taxonomy, while current datasets demonstrate a clear lack of real-network threat and attack representation, instead including a large number of deprecated threats, hence limiting the accuracy of current machine learning IDS. Moreover, the taxonomies are open-sourced to allow public contributions through a GitHub repository.
Tasks Intrusion Detection
Published 2018-06-09
URL http://arxiv.org/abs/1806.03517v1
PDF http://arxiv.org/pdf/1806.03517v1.pdf
PWC https://paperswithcode.com/paper/a-taxonomy-and-survey-of-intrusion-detection
Repo https://github.com/AbertayMachineLearningGroup/network-threats-taxonomy
Framework none

Domain Adaptation with Randomized Expectation Maximization

Title Domain Adaptation with Randomized Expectation Maximization
Authors Twan van Laarhoven, Elena Marchiori
Abstract Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state-of-the-art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which use logistic regression and a support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both the source and target domains. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.
Tasks Domain Adaptation, Transfer Learning
Published 2018-03-20
URL http://arxiv.org/abs/1803.07634v1
PDF http://arxiv.org/pdf/1803.07634v1.pdf
PWC https://paperswithcode.com/paper/domain-adaptation-with-randomized-expectation
Repo https://github.com/twanvl/adrem
Framework none
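
The repo above contains the authors' implementation. The sketch below only illustrates the EM-style self-training loop with a linear classifier that the abstract points to (pseudo-label a randomly chosen, growing subset of the target, then refit); the paper's exact randomization scheme and its SVM variant are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def randomized_em_da(X_src, y_src, X_tgt, n_iters=10, seed=0):
    """Generic EM-style domain adaptation sketch with a linear classifier.
    E step: pseudo-label a random target subset; M step: refit on source
    plus pseudo-labeled target. Illustrative only, not the paper's algorithm."""
    rng = np.random.default_rng(seed)
    clf = LogisticRegression(max_iter=1000).fit(X_src, y_src)
    for it in range(n_iters):
        frac = (it + 1) / n_iters                       # grow the target subset
        subset = rng.choice(len(X_tgt), size=max(1, int(frac * len(X_tgt))),
                            replace=False)
        pseudo = clf.predict(X_tgt[subset])             # E step
        X = np.vstack([X_src, X_tgt[subset]])           # M step
        y = np.concatenate([y_src, pseudo])
        clf = LogisticRegression(max_iter=1000).fit(X, y)
    return clf
```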

Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization

Title Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization
Authors Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir
Abstract Adversarial self-play in two-player games has delivered impressive results when used with reinforcement learning algorithms that combine deep neural networks and tree search. Algorithms like AlphaZero and Expert Iteration learn tabula rasa, producing highly informative training data on the fly. However, the self-play training strategy is not directly applicable to single-player games. Recently, several practically important combinatorial optimisation problems, such as the travelling salesman problem and the bin packing problem, have been reformulated as reinforcement learning problems, increasing the importance of enabling the benefits of self-play beyond two-player games. We present the Ranked Reward (R2) algorithm which accomplishes this by ranking the rewards obtained by a single agent over multiple games to create a relative performance metric. Results from applying the R2 algorithm to instances of two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms and integer programming solvers. We also present an analysis of the ranked reward mechanism, in particular the effects of problem instances with varying difficulty and of different ranking thresholds.
Tasks Combinatorial Optimization
Published 2018-07-04
URL http://arxiv.org/abs/1807.01672v3
PDF http://arxiv.org/pdf/1807.01672v3.pdf
PWC https://paperswithcode.com/paper/ranked-reward-enabling-self-play
Repo https://github.com/karl-hajjar/InstaDeep-internship-Deep-RL
Framework none
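
A sketch of the reward relabelling at the heart of R2: keep a buffer of recent episode rewards and map each new reward to +1 if it beats a chosen percentile of that buffer, otherwise -1. The percentile, buffer size and tie handling below are illustrative choices, not necessarily the paper's defaults.

```python
import numpy as np
from collections import deque

class RankedReward:
    """Relabel raw episode rewards as +1 / -1 relative to recent performance."""
    def __init__(self, percentile=75, buffer_size=250, seed=0):
        self.percentile = percentile
        self.buffer = deque(maxlen=buffer_size)
        self.rng = np.random.default_rng(seed)

    def __call__(self, episode_reward):
        self.buffer.append(episode_reward)
        threshold = np.percentile(self.buffer, self.percentile)
        if episode_reward > threshold:
            return 1.0
        if episode_reward < threshold:
            return -1.0
        return float(self.rng.choice([1.0, -1.0]))   # tie at the threshold

r2 = RankedReward()
for raw in [10.0, 12.0, 8.0, 15.0, 11.0]:
    print(raw, "->", r2(raw))
```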

Structured Uncertainty Prediction Networks

Title Structured Uncertainty Prediction Networks
Authors Garoe Dorta, Sara Vicente, Lourdes Agapito, Neill D. F. Campbell, Ivor Simpson
Abstract This paper is the first work to propose a network to predict a structured uncertainty distribution for a synthesized image. Previous approaches have been mostly limited to predicting diagonal covariance matrices. Our novel model learns to predict a full Gaussian covariance matrix for each reconstruction, which permits efficient sampling and likelihood evaluation. We demonstrate that our model can accurately reconstruct ground truth correlated residual distributions for synthetic datasets and generate plausible high-frequency samples for real face images. We also illustrate the use of these predicted covariances for structure preserving image denoising.
Tasks Denoising, Image Denoising
Published 2018-02-20
URL http://arxiv.org/abs/1802.07079v2
PDF http://arxiv.org/pdf/1802.07079v2.pdf
PWC https://paperswithcode.com/paper/structured-uncertainty-prediction-networks
Repo https://github.com/Garoe/tf_mvg
Framework tf
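
Predicting a full covariance while keeping sampling and likelihood evaluation cheap is usually done through a Cholesky factor. The PyTorch sketch below does this densely for a small output; the paper predicts structured, sparse factors to scale to real images, so treat this purely as an illustration of the parameterization, not the authors' model.

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Predict a full Gaussian via a Cholesky factor L, so the covariance L @ L.T
    is positive definite and sampling / log-likelihood are cheap."""
    def __init__(self, feat_dim, out_dim):
        super().__init__()
        self.out_dim = out_dim
        self.mean = nn.Linear(feat_dim, out_dim)
        self.diag = nn.Linear(feat_dim, out_dim)
        self.offdiag = nn.Linear(feat_dim, out_dim * (out_dim - 1) // 2)

    def forward(self, h):
        mu = self.mean(h)
        lower = h.new_zeros(h.shape[0], self.out_dim, self.out_dim)
        idx = torch.tril_indices(self.out_dim, self.out_dim, offset=-1)
        lower[:, idx[0], idx[1]] = self.offdiag(h)            # strict lower triangle
        L = lower + torch.diag_embed(nn.functional.softplus(self.diag(h)) + 1e-4)
        return torch.distributions.MultivariateNormal(mu, scale_tril=L)

head = GaussianHead(feat_dim=64, out_dim=16)     # e.g. a 4x4 patch, flattened
dist = head(torch.randn(8, 64))
nll = -dist.log_prob(torch.randn(8, 16)).mean()  # training loss on a reconstruction
sample = dist.rsample()                          # correlated residual sample
print(nll.item(), sample.shape)
```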

MobileFace: 3D Face Reconstruction with Efficient CNN Regression

Title MobileFace: 3D Face Reconstruction with Efficient CNN Regression
Authors Nikolai Chinaev, Alexander Chigorin, Ivan Laptev
Abstract Estimation of facial shapes plays a central role for face transfer and animation. Accurate 3D face reconstruction, however, often deploys iterative and costly methods preventing real-time applications. In this work we design a compact and fast CNN model enabling real-time face reconstruction on mobile devices. For this purpose, we first study more traditional but slow morphable face models and use them to automatically annotate a large set of images for CNN training. We then investigate a class of efficient MobileNet CNNs and adapt such models for the task of shape regression. Our evaluation on three datasets demonstrates significant improvements in the speed and the size of our model while maintaining state-of-the-art reconstruction accuracy.
Tasks 3D Face Reconstruction, Face Reconstruction, Face Transfer
Published 2018-09-24
URL http://arxiv.org/abs/1809.08809v1
PDF http://arxiv.org/pdf/1809.08809v1.pdf
PWC https://paperswithcode.com/paper/mobileface-3d-face-reconstruction-with
Repo https://github.com/nchinaev/MobileFace
Framework none
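
The general recipe (a MobileNet backbone regressing morphable-model coefficients) can be sketched in a few lines of PyTorch; the coefficient count, input size and use of torchvision's MobileNetV2 are assumptions rather than the authors' exact architecture, and the real training signal comes from images annotated with a slower morphable-model fitter.

```python
import torch
import torch.nn as nn
from torchvision import models

class MobileFaceRegressor(nn.Module):
    """Illustrative MobileNet-based regressor for 3DMM shape/expression/pose coefficients."""
    def __init__(self, n_coeffs=62):
        super().__init__()
        backbone = models.mobilenet_v2(weights=None)   # torchvision >= 0.13 API
        backbone.classifier = nn.Linear(backbone.last_channel, n_coeffs)
        self.net = backbone

    def forward(self, images):                         # images: (batch, 3, 224, 224)
        return self.net(images)

model = MobileFaceRegressor()
coeffs = model(torch.randn(2, 3, 224, 224))
print(coeffs.shape)                                    # torch.Size([2, 62])
```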