Paper Group AWR 126
Generating Music using an LSTM Network
Title | Generating Music using an LSTM Network |
Authors | Nikhil Kotecha, Paul Young |
Abstract | A model of music needs to have the ability to recall past details and have a clear, coherent understanding of musical structure. Detailed in the paper is a neural network architecture that predicts and generates polyphonic music aligned with musical rules. The probabilistic model presented is a Bi-axial LSTM trained with a kernel reminiscent of a convolutional kernel. When analyzed quantitatively and qualitatively, this approach performs well in composing polyphonic music. Link to the code is provided. |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.07300v1 |
http://arxiv.org/pdf/1804.07300v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-music-using-an-lstm-network |
Repo | https://github.com/nikhil-kotecha/Generating_Music |
Framework | none |
A Call for Clarity in Reporting BLEU Scores
Title | A Call for Clarity in Reporting BLEU Scores |
Authors | Matt Post |
Abstract | The field of machine translation faces an under-recognized problem because of inconsistency in the reporting of scores from its dominant metric. Although people refer to “the” BLEU score, BLEU is in fact a parameterized metric whose values can vary wildly with changes to these parameters. These parameters are often not reported or are hard to find, and consequently, BLEU scores between papers cannot be directly compared. I quantify this variation, finding differences as high as 1.8 between commonly used configurations. The main culprit is different tokenization and normalization schemes applied to the reference. Pointing to the success of the parsing community, I suggest machine translation researchers settle upon the BLEU scheme used by the annual Conference on Machine Translation (WMT), which does not allow for user-supplied reference processing, and provide a new tool, SacreBLEU, to facilitate this. |
Tasks | Machine Translation, Tokenization |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08771v2 |
http://arxiv.org/pdf/1804.08771v2.pdf | |
PWC | https://paperswithcode.com/paper/a-call-for-clarity-in-reporting-bleu-scores |
Repo | https://github.com/mjpost/sacreBLEU |
Framework | none |
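The abstract's point that BLEU is parameterized by tokenization can be illustrated with a small sketch. The code below is a simplified, unsmoothed, single-reference BLEU written for illustration only; it is not SacreBLEU and does not reproduce the WMT scheme. The function names (`sentence_bleu`, `split_on_space`, `split_punctuation`) and the example sentences are assumptions introduced here.

```python
import math
import re
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp_tokens, ref_tokens, max_n=4):
    """Unsmoothed single-reference BLEU on pre-tokenized input (0..1 scale)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp_tokens, n)
        ref_counts = ngrams(ref_tokens, n)
        total = sum(hyp_counts.values())
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty for hypotheses shorter than the reference.
    if len(hyp_tokens) >= len(ref_tokens):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref_tokens) / len(hyp_tokens))
    return bp * math.exp(sum(log_precisions) / max_n)


def split_on_space(text):
    """Hypothetical tokenizer 1: punctuation stays attached to words."""
    return text.split()


def split_punctuation(text):
    """Hypothetical tokenizer 2: punctuation marks become separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


hyp = "The cat sat on the mat, quietly."
ref = "The cat sat on the mat, calmly."

score_whitespace = sentence_bleu(split_on_space(hyp), split_on_space(ref))
score_punct_split = sentence_bleu(split_punctuation(hyp), split_punctuation(ref))

print(f"whitespace tokens:  {score_whitespace:.4f}")
print(f"punctuation split:  {score_punct_split:.4f}")
```

The same hypothesis and reference receive different scores under the two tokenizers, which is the kind of silent variation the abstract describes.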
Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition
Title | Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition |
Authors | Federico Magliani, Tomaso Fontanini, Andrea Prati |
Abstract | Landmark recognition methods have achieved excellent results on small-scale datasets. When dealing with large-scale retrieval, issues that were irrelevant with small amounts of data quickly become fundamental for an efficient retrieval phase. In particular, computational time needs to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. In this paper we propose a novel multi-index hashing method called Bag of Indexes (BoI) for Approximate Nearest Neighbors (ANN) search. It drastically reduces the query time and outperforms state-of-the-art methods in accuracy for large-scale landmark recognition. It has been demonstrated that this family of algorithms can be applied to different embedding techniques like VLAD and R-MAC, obtaining excellent results in very short times on different public datasets: Holidays+Flickr1M, Oxford105k and Paris106k. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.05946v1 |
http://arxiv.org/pdf/1806.05946v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-nearest-neighbors-search-for-large |
Repo | https://github.com/fmaglia/BoI |
Framework | none |
CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Title | CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning |
Authors | Jerome Abdelnour, Giampiero Salvi, Jean Rouat |
Abstract | We introduce the task of acoustic question answering (AQA) in the area of acoustic reasoning. In this task an agent learns to answer questions on the basis of acoustic context. In order to promote research in this area, we propose a data generation paradigm adapted from CLEVR (Johnson et al. 2017). We generate acoustic scenes by leveraging a bank of elementary sounds. We also provide a number of functional programs that can be used to compose questions and answers that exploit the relationships between the attributes of the elementary sounds in each scene. We provide AQA datasets of various sizes as well as the data generation code. As a preliminary experiment to validate our data, we report the accuracy of current state-of-the-art visual question answering models when they are applied to the AQA task without modifications. Although there is a plethora of question answering tasks based on text, image or video data, to our knowledge, we are the first to propose answering questions directly on audio streams. We hope this contribution will facilitate the development of research in the area. |
Tasks | Acoustic Question Answering, Question Answering, Visual Question Answering |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10561v1 |
http://arxiv.org/pdf/1811.10561v1.pdf | |
PWC | https://paperswithcode.com/paper/clear-a-dataset-for-compositional-language |
Repo | https://github.com/IGLU-CHISTERA/CLEAR-dataset-generation |
Framework | none |
Simple random search provides a competitive approach to reinforcement learning
Title | Simple random search provides a competitive approach to reinforcement learning |
Authors | Horia Mania, Aurelia Guy, Benjamin Recht |
Abstract | A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks. Our method also finds a nearly optimal controller for a challenging instance of the Linear Quadratic Regulator, a classical problem in control theory, when the dynamics are not known. Computationally, our random search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks. We take advantage of this computational efficiency to evaluate the performance of our method over hundreds of random seeds and many different hyperparameter configurations for each benchmark task. Our simulations highlight a high variability in performance in these benchmark tasks, suggesting that commonly used estimations of sample efficiency do not adequately evaluate the performance of RL algorithms. |
Tasks | Continuous Control |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.07055v1 |
http://arxiv.org/pdf/1803.07055v1.pdf | |
PWC | https://paperswithcode.com/paper/simple-random-search-provides-a-competitive |
Repo | https://github.com/kayuksel/pytorch-ars |
Framework | pytorch |
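To make the idea of random search in parameter space concrete, here is a minimal sketch of a basic random search update with paired (plus and minus) perturbations. It is not the authors' algorithm or the linked repository's code: the toy quadratic `reward`, the function name `random_search`, and all hyperparameter values are assumptions chosen for illustration, and the sketch optimizes a parameter vector directly rather than a control policy on MuJoCo.

```python
import random


def reward(theta, target):
    """Toy objective (assumption): negative squared distance to a target."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def random_search(target, iterations=200, n_directions=8,
                  step_size=0.5, noise=0.1, seed=0):
    """Basic random search: perturb parameters along random directions,
    compare rewards at theta + noise*delta and theta - noise*delta, and
    move along each direction in proportion to the reward difference."""
    rng = random.Random(seed)
    dim = len(target)
    theta = [0.0] * dim
    for _iteration in range(iterations):
        update = [0.0] * dim
        for _direction in range(n_directions):
            delta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            plus = [t + noise * d for t, d in zip(theta, delta)]
            minus = [t - noise * d for t, d in zip(theta, delta)]
            diff = reward(plus, target) - reward(minus, target)
            for i in range(dim):
                update[i] += diff * delta[i]
        # Average the contributions of all sampled directions.
        theta = [t + step_size / n_directions * u
                 for t, u in zip(theta, update)]
    return theta


target = [1.0, -2.0, 0.5]
initial_reward = reward([0.0, 0.0, 0.0], target)
theta = random_search(target)
final_reward = reward(theta, target)
print(f"reward before: {initial_reward:.4f}, after: {final_reward:.6f}")
```

No gradients of the objective are computed; only reward evaluations at perturbed parameters are used, which is what makes this family of methods cheap per step.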
Quickshift++: Provably Good Initializations for Sample-Based Mean Shift
Title | Quickshift++: Provably Good Initializations for Sample-Based Mean Shift |
Authors | Heinrich Jiang, Jennifer Jang, Samory Kpotufe |
Abstract | We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification. We then show strong clustering performance on real datasets as well as promising applications to image segmentation. |
Tasks | Semantic Segmentation |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07909v1 |
http://arxiv.org/pdf/1805.07909v1.pdf | |
PWC | https://paperswithcode.com/paper/quickshift-provably-good-initializations-for |
Repo | https://github.com/google/quickshift |
Framework | none |
Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention
Title | Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention |
Authors | Khanh Nguyen, Debadeepta Dey, Chris Brockett, Bill Dolan |
Abstract | We present Vision-based Navigation with Language-based Assistance (VNLA), a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments. The task emulates a real-world scenario in that (a) the requester may not know how to navigate to the target objects and thus makes requests by only specifying high-level end-goals, and (b) the agent is capable of sensing when it is lost and querying an advisor, who is more qualified at the task, to obtain language subgoals to make progress. To model language-based assistance, we develop a general framework termed Imitation Learning with Indirect Intervention (I3L), and propose a solution that is effective on the VNLA task. Empirical results show that this approach significantly improves the success rate of the learning agent over other baselines in both seen and unseen environments. Our code and data are publicly available at https://github.com/debadeepta/vnla . |
Tasks | Imitation Learning, Vision-based navigation with language-based assistance, VNLA |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.04155v4 |
http://arxiv.org/pdf/1812.04155v4.pdf | |
PWC | https://paperswithcode.com/paper/vision-based-navigation-with-language-based |
Repo | https://github.com/debadeepta/vnla |
Framework | pytorch |
Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus
Title | Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus |
Authors | Jianxiong Dong, Jim Huang |
Abstract | The Ubuntu dialogue corpus is the largest publicly available dialogue corpus, making it feasible to build end-to-end deep neural network models directly from the conversation data. One challenge of the Ubuntu dialogue corpus is the large number of out-of-vocabulary words. In this paper we propose a method which combines the general pre-trained word embedding vectors with those generated on the task-specific training set to address this issue. We integrated character embedding into Chen et al.'s Enhanced LSTM method (ESIM) and used it to evaluate the effectiveness of our proposed method. For the task of next utterance selection, the proposed method has demonstrated a significant performance improvement against the original ESIM, and the new model has achieved state-of-the-art results on both the Ubuntu dialogue corpus and the Douban conversation corpus. In addition, we investigated the performance impact of end-of-utterance and end-of-turn token tags. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02614v2 |
http://arxiv.org/pdf/1802.02614v2.pdf | |
PWC | https://paperswithcode.com/paper/enhance-word-representation-for-out-of |
Repo | https://github.com/jdongca2003/next_utterance_selection |
Framework | tf |
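One way to picture combining general pre-trained vectors with task-specific ones is sketched below. The abstract above only says the two sources are combined; concatenation with zero-padding for words missing from one source is an assumption made here for illustration, as are the function name `combine_embeddings` and the tiny example vectors.

```python
def combine_embeddings(pretrained, task_specific, vocabulary):
    """Concatenate two embedding tables word by word (assumed scheme).

    A word missing from one table gets a zero vector for that half, so
    task-specific words that are out-of-vocabulary for the pre-trained
    table still receive a non-trivial representation.
    """
    dim_a = len(next(iter(pretrained.values())))
    dim_b = len(next(iter(task_specific.values())))
    combined = {}
    for word in vocabulary:
        a = pretrained.get(word, [0.0] * dim_a)
        b = task_specific.get(word, [0.0] * dim_b)
        combined[word] = list(a) + list(b)
    return combined


# Hypothetical toy tables: "apt-get" is unknown to the general embeddings.
pretrained = {"install": [0.1, 0.2], "the": [0.3, 0.4]}
task_specific = {"install": [0.5], "apt-get": [0.9]}

combined = combine_embeddings(
    pretrained, task_specific, ["install", "apt-get", "the"]
)
for word, vector in combined.items():
    print(word, vector)
```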
Forecasting the presence and intensity of hostility on Instagram using linguistic and social features
Title | Forecasting the presence and intensity of hostility on Instagram using linguistic and social features |
Authors | Ping Liu, Joshua Guberman, Libby Hemphill, Aron Culotta |
Abstract | Online antisocial behavior, such as cyberbullying, harassment, and trolling, is a widespread problem that threatens free discussion and has negative physical and mental health consequences for victims and communities. While prior work has proposed automated methods to identify hostile comments in online discussions, these methods work retrospectively on comments that have already been posted, making it difficult to intervene before an interaction escalates. In this paper we instead consider the problem of forecasting future hostilities in online discussions, which we decompose into two tasks: (1) given an initial sequence of non-hostile comments in a discussion, predict whether some future comment will contain hostility; and (2) given the first hostile comment in a discussion, predict whether this will lead to an escalation of hostility in subsequent comments. Thus, we aim to forecast both the presence and intensity of hostile comments based on linguistic and social features from earlier comments. To evaluate our approach, we introduce a corpus of over 30K annotated Instagram comments from over 1,100 posts. Our approach is able to predict the appearance of a hostile comment on an Instagram post ten or more hours in the future with an AUC of .82 (task 1), and can furthermore distinguish between high and low levels of future hostility with an AUC of .91 (task 2). |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06759v1 |
http://arxiv.org/pdf/1804.06759v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-the-presence-and-intensity-of |
Repo | https://github.com/tapilab/icwsm-2018-hostility |
Framework | none |
Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Title | Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection |
Authors | Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, Asaf Shabtai |
Abstract | Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors makes them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and routers, which could potentially host an NIDS, simply do not have the memory or processing power to train and sometimes even execute such models. More importantly, the existing neural network solutions are trained in a supervised manner, meaning that an expert must label the network traffic and update the model manually from time to time. In this paper, we present Kitsune: a plug and play NIDS which can learn to detect attacks on the local network, without supervision, and in an efficient online manner. Kitsune’s core algorithm (KitNET) uses an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET is supported by a feature extraction framework which efficiently tracks the patterns of every network channel. Our evaluations show that Kitsune can detect various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry PI. This demonstrates that Kitsune can be a practical and economic NIDS. |
Tasks | Intrusion Detection, Network Intrusion Detection |
Published | 2018-02-25 |
URL | http://arxiv.org/abs/1802.09089v2 |
http://arxiv.org/pdf/1802.09089v2.pdf | |
PWC | https://paperswithcode.com/paper/kitsune-an-ensemble-of-autoencoders-for |
Repo | https://github.com/ymirsky/Kitsune-py |
Framework | none |
A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets
Title | A Taxonomy and Survey of Intrusion Detection System Design Techniques, Network Threats and Datasets |
Authors | Hanan Hindy, David Brosset, Ethan Bayne, Amar Seeam, Christos Tachtatzis, Robert Atkinson, Xavier Bellekens |
Abstract | With the world moving towards being increasingly dependent on computers and automation, one of the main challenges in the current decade has been to build secure applications, systems and networks. Alongside these challenges, the number of threats is rising exponentially due to the attack surface increasing through numerous interfaces offered for each service. To alleviate the impact of these threats, researchers have proposed numerous solutions; however, current tools often fail to adapt to ever-changing architectures, associated threats and 0-days. This manuscript aims to provide researchers with a taxonomy and survey of current dataset composition and current Intrusion Detection Systems (IDS) capabilities and assets. These taxonomies and surveys aim to improve both the efficiency of IDS and the creation of datasets to build the next generation IDS as well as to reflect network threats more accurately in future datasets. To this end, this manuscript also provides a taxonomy and survey of network threats and associated tools. The manuscript highlights that current IDS only cover 25% of our threat taxonomy, while current datasets demonstrate a clear lack of real-network threats and attack representation, but rather include a large number of deprecated threats, hence limiting the accuracy of current machine learning IDS. Moreover, the taxonomies are open-sourced to allow public contributions through a Github repository. |
Tasks | Intrusion Detection |
Published | 2018-06-09 |
URL | http://arxiv.org/abs/1806.03517v1 |
http://arxiv.org/pdf/1806.03517v1.pdf | |
PWC | https://paperswithcode.com/paper/a-taxonomy-and-survey-of-intrusion-detection |
Repo | https://github.com/AbertayMachineLearningGroup/network-threats-taxonomy |
Framework | none |
Domain Adaptation with Randomized Expectation Maximization
Title | Domain Adaptation with Randomized Expectation Maximization |
Authors | Twan van Laarhoven, Elena Marchiori |
Abstract | Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state of the art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both source and target domain. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods. |
Tasks | Domain Adaptation, Transfer Learning |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07634v1 |
http://arxiv.org/pdf/1803.07634v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-with-randomized-expectation |
Repo | https://github.com/twanvl/adrem |
Framework | none |
Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization
Title | Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization |
Authors | Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir |
Abstract | Adversarial self-play in two-player games has delivered impressive results when used with reinforcement learning algorithms that combine deep neural networks and tree search. Algorithms like AlphaZero and Expert Iteration learn tabula-rasa, producing highly informative training data on the fly. However, the self-play training strategy is not directly applicable to single-player games. Recently, several practically important combinatorial optimisation problems, such as the travelling salesman problem and the bin packing problem, have been reformulated as reinforcement learning problems, increasing the importance of enabling the benefits of self-play beyond two-player games. We present the Ranked Reward (R2) algorithm which accomplishes this by ranking the rewards obtained by a single agent over multiple games to create a relative performance metric. Results from applying the R2 algorithm to instances of two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms and integer programming solvers. We also present an analysis of the ranked reward mechanism, in particular, the effects of problem instances with varying difficulty and different ranking thresholds. |
Tasks | Combinatorial Optimization |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01672v3 |
http://arxiv.org/pdf/1807.01672v3.pdf | |
PWC | https://paperswithcode.com/paper/ranked-reward-enabling-self-play |
Repo | https://github.com/karl-hajjar/InstaDeep-internship-Deep-RL |
Framework | none |
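The ranking idea in the abstract above, turning a single agent's raw rewards into a relative win or loss signal against its own recent games, can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the class name `RankedReward`, the buffer size, the nearest-rank percentile rule, and the choice to count ties as wins are all assumptions.

```python
import math
from collections import deque


class RankedReward:
    """Convert raw single-player rewards into +1 / -1 signals by comparing
    each reward with a percentile threshold over recent games."""

    def __init__(self, buffer_size=250, percentile=75):
        self.buffer = deque(maxlen=buffer_size)  # most recent raw rewards
        self.percentile = percentile

    def threshold(self):
        """Nearest-rank percentile of the buffered rewards (assumed rule)."""
        ordered = sorted(self.buffer)
        rank = max(1, math.ceil(self.percentile / 100 * len(ordered)))
        return ordered[rank - 1]

    def rank(self, raw_reward):
        """Record the reward, then return +1 if it reaches the threshold,
        otherwise -1. Ties count as wins here (an assumption)."""
        self.buffer.append(raw_reward)
        return 1 if raw_reward >= self.threshold() else -1


ranker = RankedReward(buffer_size=4, percentile=75)
signals = [ranker.rank(r) for r in [0.2, 0.5, 0.4, 0.9, 0.3]]
print(signals)
```

Because the threshold moves with the agent's own recent performance, a reward that counted as a win early in training can count as a loss later, which mimics the moving target that an opponent provides in two-player self-play.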
Structured Uncertainty Prediction Networks
Title | Structured Uncertainty Prediction Networks |
Authors | Garoe Dorta, Sara Vicente, Lourdes Agapito, Neill D. F. Campbell, Ivor Simpson |
Abstract | This paper is the first work to propose a network to predict a structured uncertainty distribution for a synthesized image. Previous approaches have been mostly limited to predicting diagonal covariance matrices. Our novel model learns to predict a full Gaussian covariance matrix for each reconstruction, which permits efficient sampling and likelihood evaluation. We demonstrate that our model can accurately reconstruct ground truth correlated residual distributions for synthetic datasets and generate plausible high frequency samples for real face images. We also illustrate the use of these predicted covariances for structure preserving image denoising. |
Tasks | Denoising, Image Denoising |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07079v2 |
http://arxiv.org/pdf/1802.07079v2.pdf | |
PWC | https://paperswithcode.com/paper/structured-uncertainty-prediction-networks |
Repo | https://github.com/Garoe/tf_mvg |
Framework | tf |
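As background for the abstract's claim that a full covariance matrix can still permit efficient sampling and likelihood evaluation, the standard identities for a Gaussian with a Cholesky-factored covariance are written out below. The factorization $\Sigma = L L^{\top}$ (with $L$ lower triangular and positive diagonal) and the symbols $\mu$, $L$, $n$ are assumptions for illustration; the abstract does not state which parameterization the network predicts.

```latex
\begin{aligned}
x &= \mu + L\,\varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I), \qquad \Sigma = L L^{\top}, \\
\log \mathcal{N}(x \mid \mu, \Sigma)
  &= -\tfrac{1}{2}\,\bigl\lVert L^{-1}(x - \mu) \bigr\rVert_2^2
     \;-\; \sum_{i=1}^{n} \log L_{ii}
     \;-\; \tfrac{n}{2}\,\log(2\pi).
\end{aligned}
```

Sampling needs one matrix-vector product, and the likelihood needs one triangular solve plus the diagonal of $L$, since $\log\det\Sigma = 2\sum_{i}\log L_{ii}$.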
MobileFace: 3D Face Reconstruction with Efficient CNN Regression
Title | MobileFace: 3D Face Reconstruction with Efficient CNN Regression |
Authors | Nikolai Chinaev, Alexander Chigorin, Ivan Laptev |
Abstract | Estimation of facial shapes plays a central role for face transfer and animation. Accurate 3D face reconstruction, however, often deploys iterative and costly methods preventing real-time applications. In this work we design a compact and fast CNN model enabling real-time face reconstruction on mobile devices. For this purpose, we first study more traditional but slow morphable face models and use them to automatically annotate a large set of images for CNN training. We then investigate a class of efficient MobileNet CNNs and adapt such models for the task of shape regression. Our evaluation on three datasets demonstrates significant improvements in the speed and the size of our model while maintaining state-of-the-art reconstruction accuracy. |
Tasks | 3D Face Reconstruction, Face Reconstruction, Face Transfer |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08809v1 |
http://arxiv.org/pdf/1809.08809v1.pdf | |
PWC | https://paperswithcode.com/paper/mobileface-3d-face-reconstruction-with |
Repo | https://github.com/nchinaev/MobileFace |
Framework | none |