July 29, 2019

Paper Group AWR 200

Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM

Title Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM
Authors Lai Jiang, Mai Xu, Zulin Wang
Abstract Over the past few years, deep neural networks (DNNs) have exhibited great success in predicting the saliency of images. However, there are few works that apply DNNs to predict the saliency of generic videos. In this paper, we propose a novel DNN-based video saliency prediction method. Specifically, we establish a large-scale eye-tracking database of videos (LEDOV), which provides sufficient data to train the DNN models for predicting video saliency. Through the statistical analysis of our LEDOV database, we find that human attention is normally attracted by objects, particularly moving objects or the moving parts of objects. Accordingly, we propose an object-to-motion convolutional neural network (OM-CNN) to learn spatio-temporal features for predicting the intra-frame saliency via exploring the information of both objectness and object motion. We further find from our database that there exists a temporal correlation of human attention with a smooth saliency transition across video frames. Therefore, we develop a two-layer convolutional long short-term memory (2C-LSTM) network in our DNN-based method, using the extracted features of OM-CNN as the input. Consequently, the inter-frame saliency maps of videos can be generated, which consider the transition of attention across video frames. Finally, the experimental results show that our method advances the state-of-the-art in video saliency prediction.
Tasks Eye Tracking, Saliency Prediction
Published 2017-09-19
URL http://arxiv.org/abs/1709.06316v3
PDF http://arxiv.org/pdf/1709.06316v3.pdf
PWC https://paperswithcode.com/paper/predicting-video-saliency-with-object-to
Repo https://github.com/remega/LEDOV-eye-tracking-database
Framework none

A Benchmark Environment Motivated by Industrial Control Problems

Title A Benchmark Environment Motivated by Industrial Control Problems
Authors Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
Abstract In the research area of reinforcement learning (RL), novel and promising methods are frequently developed and introduced to the RL community. However, although many researchers are keen to apply their methods to real-world problems, implementing such methods in real industry environments is often a frustrating and tedious process. Generally, academic research groups have only limited access to real industrial data and applications. For this reason, new methods are usually developed, evaluated and compared by using artificial software benchmarks. On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand. On the other hand, they usually do not share much similarity with industrial real-world applications. For this reason we used our industry experience to design a benchmark which bridges the gap between freely available, documented, and motivated artificial benchmarks and properties of real industrial problems. The resulting industrial benchmark (IB) has been made publicly available to the RL community by publishing its Java and Python code, including an OpenAI Gym wrapper, on GitHub. In this paper we motivate and describe in detail the IB’s dynamics and identify prototypic experimental settings that capture common situations in real-world industry control problems.
Tasks
Published 2017-09-27
URL http://arxiv.org/abs/1709.09480v2
PDF http://arxiv.org/pdf/1709.09480v2.pdf
PWC https://paperswithcode.com/paper/a-benchmark-environment-motivated-by
Repo https://github.com/siemens/industrialbenchmark
Framework none

Unsupervised Steganalysis Based on Artificial Training Sets

Title Unsupervised Steganalysis Based on Artificial Training Sets
Authors Daniel Lerch-Hostalot, David Megías
Abstract In this paper, an unsupervised steganalysis method that combines artificial training sets and supervised classification is proposed. We provide a formal framework for unsupervised classification of stego and cover images in the typical situation of targeted steganalysis (i.e., for a known algorithm and approximate embedding bit rate). We also present a complete set of experiments using 1) eight different image databases, 2) image features based on Rich Models, and 3) three different embedding algorithms: Least Significant Bit (LSB) matching, Highly undetectable steganography (HUGO) and Wavelet Obtained Weights (WOW). We show that the experimental results outperform previous methods based on Rich Models in the majority of the tested cases. At the same time, the proposed approach bypasses the problem of Cover Source Mismatch (when the embedding algorithm and bit rate are known), since it removes the need of a training database when we have a large enough testing set. Furthermore, we provide a generic proof of the proposed framework in the machine learning context. Hence, the results of this paper could be extended to other classification problems similar to steganalysis.
Tasks
Published 2017-03-02
URL http://arxiv.org/abs/1703.00796v1
PDF http://arxiv.org/pdf/1703.00796v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-steganalysis-based-on-artificial
Repo https://github.com/daniellerch/papers
Framework none

Optical Music Recognition with Convolutional Sequence-to-Sequence Models

Title Optical Music Recognition with Convolutional Sequence-to-Sequence Models
Authors Eelco van der Wel, Karen Ullrich
Abstract Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-end trainable OMR pipeline, and apply a learning process that trains on full sentences of sheet music instead of individually labeled symbols. The model is trained and evaluated on a human generated data set, with various image augmentations based on real-world scenarios. This data set is the first publicly available set in OMR research with sufficient size to train and evaluate deep learning models. With the introduced augmentations a pitch recognition accuracy of 81% and a duration accuracy of 94% are achieved, resulting in a note level accuracy of 80%. Finally, the model is compared to commercially available methods, showing large improvements over these applications.
Tasks Information Retrieval, Music Information Retrieval
Published 2017-07-16
URL http://arxiv.org/abs/1707.04877v1
PDF http://arxiv.org/pdf/1707.04877v1.pdf
PWC https://paperswithcode.com/paper/optical-music-recognition-with-convolutional
Repo https://github.com/apacha/OMR-Datasets
Framework none

Repeatability Is Not Enough: Learning Affine Regions via Discriminability

Title Repeatability Is Not Enough: Learning Affine Regions via Discriminability
Authors Dmytro Mishkin, Filip Radenovic, Jiri Matas
Abstract A method for learning local affine-covariant regions is presented. We show that maximizing geometric repeatability does not lead to local regions, a.k.a. features, that are reliably matched and this necessitates descriptor-based learning. We explore factors that influence such learning and registration: the loss function, descriptor type, geometric parametrization and the trade-off between matchability and geometric accuracy and propose a novel hard negative-constant loss function for learning of affine regions. The affine shape estimator – AffNet – trained with the hard negative-constant loss outperforms the state-of-the-art in bag-of-words image retrieval and wide baseline stereo. The proposed training process does not require precisely geometrically aligned patches. The source codes and trained weights are available at https://github.com/ducha-aiki/affnet
Tasks Image Retrieval
Published 2017-11-17
URL http://arxiv.org/abs/1711.06704v4
PDF http://arxiv.org/pdf/1711.06704v4.pdf
PWC https://paperswithcode.com/paper/repeatability-is-not-enough-learning-affine
Repo https://github.com/ducha-aiki/affnet
Framework pytorch

Learning with Biased Complementary Labels

Title Learning with Biased Complementary Labels
Authors Xiyu Yu, Tongliang Liu, Mingming Gong, Dacheng Tao
Abstract In this paper, we study the classification problem in which we have access to an easily obtainable surrogate for true labels, namely complementary labels, which specify classes that observations do not belong to. Let $Y$ and $\bar{Y}$ be the true and complementary labels, respectively. We first model the annotation of complementary labels via transition probabilities $P(\bar{Y}=i \mid Y=j), i\neq j\in\{1,\cdots,c\}$, where $c$ is the number of classes. Previous methods implicitly assume that $P(\bar{Y}=i \mid Y=j), \forall i\neq j$, are identical, which is not true in practice because humans are biased toward their own experience. For example, as shown in Figure 1, if an annotator is more familiar with monkeys than prairie dogs when providing complementary labels for meerkats, she is more likely to employ “monkey” as a complementary label. We therefore reason that the transition probabilities will be different. In this paper, we propose a framework that contributes three main innovations to learning with biased complementary labels: (1) It estimates transition probabilities with no bias. (2) It provides a general method to modify traditional loss functions and extends standard deep neural network classifiers to learn with biased complementary labels. (3) It theoretically ensures that the classifier learned with complementary labels converges to the optimal one learned with true labels. Comprehensive experiments on several benchmark datasets validate the superiority of our method to current state-of-the-art methods.
Tasks
Published 2017-11-27
URL http://arxiv.org/abs/1711.09535v3
PDF http://arxiv.org/pdf/1711.09535v3.pdf
PWC https://paperswithcode.com/paper/learning-with-biased-complementary-labels
Repo https://github.com/takashiishida/comp
Framework pytorch
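
Note The abstract above models biased annotation through transition probabilities $P(\bar{Y}=i \mid Y=j)$ and says traditional losses are modified accordingly. A minimal sketch of one way such a modification could look is given here, assuming a "forward correction": the classifier's class probabilities are mapped through the transition matrix to get the probability of each complementary label, and cross-entropy is taken on that. The correction style, the 3-class matrix Q, and all function names are illustrative assumptions, not taken from the paper or the linked repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def complementary_probs(class_probs, transition):
    """P(Ybar = i | x) = sum_j P(Y = j | x) * P(Ybar = i | Y = j).

    transition[j][i] holds P(Ybar = i | Y = j); the diagonal is zero because
    a complementary label never equals the true label.
    """
    c = len(class_probs)
    return [sum(class_probs[j] * transition[j][i] for j in range(c))
            for i in range(c)]


def forward_complementary_loss(logits, comp_label, transition):
    """Cross-entropy between the predicted complementary distribution
    and the observed complementary label."""
    q = complementary_probs(softmax(logits), transition)
    return -math.log(max(q[comp_label], 1e-12))


# Hypothetical biased transition matrix for 3 classes (rows sum to 1).
Q = [
    [0.0, 0.7, 0.3],
    [0.4, 0.0, 0.6],
    [0.5, 0.5, 0.0],
]
logits = [2.0, 0.5, -1.0]  # made-up classifier outputs for one sample
q = complementary_probs(softmax(logits), Q)
loss = forward_complementary_loss(logits, comp_label=1, transition=Q)
print(q, loss)
```

With a uniform off-diagonal Q this reduces to the unbiased setting that the abstract attributes to previous methods; a non-uniform Q is what encodes annotator bias.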

Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis

Title Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis
Authors Luke de Oliveira, Michela Paganini, Benjamin Nachman
Abstract We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in High Energy Particle Physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images – 2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span over many orders of magnitude and exhibit the desired low-dimensional physical properties (i.e., jet mass, n-subjettiness, etc.). We shed light on limitations, and provide a novel empirical validation of image quality and validity of GAN-produced simulations of the natural world. This work provides a base for further explorations of GANs for use in faster simulation in High Energy Particle Physics.
Tasks
Published 2017-01-20
URL http://arxiv.org/abs/1701.05927v2
PDF http://arxiv.org/pdf/1701.05927v2.pdf
PWC https://paperswithcode.com/paper/learning-particle-physics-by-example-location
Repo https://github.com/hep-lbdl/adversarial-jets
Framework none

Metrical-accent Aware Vocal Onset Detection in Polyphonic Audio

Title Metrical-accent Aware Vocal Onset Detection in Polyphonic Audio
Authors Georgi Dzhambazov, Andre Holzapfel, Ajay Srinivasamurthy, Xavier Serra
Abstract The goal of this study is the automatic detection of onsets of the singing voice in polyphonic audio recordings. Starting with a hypothesis that the knowledge of the current position in a metrical cycle (i.e. metrical accent) can improve the accuracy of vocal note onset detection, we propose a novel probabilistic model to jointly track beats and vocal note onsets. The proposed model extends a state of the art model for beat and meter tracking, in which a-priori probability of a note at a specific metrical accent interacts with the probability of observing a vocal note onset. We carry out an evaluation on a varied collection of multi-instrument datasets from two music traditions (English popular music and Turkish makam) with different types of metrical cycles and singing styles. Results confirm that the proposed model reasonably improves vocal note onset detection accuracy compared to a baseline model that does not take metrical position into account.
Tasks
Published 2017-07-19
URL http://arxiv.org/abs/1707.06163v1
PDF http://arxiv.org/pdf/1707.06163v1.pdf
PWC https://paperswithcode.com/paper/metrical-accent-aware-vocal-onset-detection
Repo https://github.com/georgid/lakh_vocal_segments_dataset
Framework none

A Syntactic Neural Model for General-Purpose Code Generation

Title A Syntactic Neural Model for General-Purpose Code Generation
Authors Pengcheng Yin, Graham Neubig
Abstract We consider the problem of parsing natural language descriptions into source code written in a general-purpose programming language like Python. Existing data-driven methods treat this problem as a language generation task without considering the underlying syntax of the target programming language. Informed by previous work in semantic parsing, in this paper we propose a novel neural architecture powered by a grammar model to explicitly capture the target syntax as prior knowledge. Experiments find this an effective way to scale up to generation of complex programs from natural language descriptions, achieving state-of-the-art results that well outperform previous code generation and semantic parsing approaches.
Tasks Code Generation, Semantic Parsing, Text Generation
Published 2017-04-06
URL http://arxiv.org/abs/1704.01696v1
PDF http://arxiv.org/pdf/1704.01696v1.pdf
PWC https://paperswithcode.com/paper/a-syntactic-neural-model-for-general-purpose
Repo https://github.com/zimengq/PyTorch-ReCode
Framework pytorch
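
Note The abstract above contrasts treating code as a flat token sequence with generation that respects the target language's syntax. As a small illustration of the structure such a grammar-aware model has to respect, the sketch below uses Python's standard `ast` module to list the syntax-tree node types of a snippet in pre-order. This only shows the tree view of a program; it is not the paper's model, and the helper name `node_sequence` is hypothetical.

```python
import ast


def node_sequence(source):
    """Return (depth, node type name) pairs for a pre-order walk of the AST."""
    tree = ast.parse(source)
    sequence = []

    def visit(node, depth):
        sequence.append((depth, type(node).__name__))
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)

    visit(tree, 0)
    return sequence


seq = node_sequence("total = sorted(values)[0]")
for depth, name in seq:
    print("  " * depth + name)
```

A string that is not valid Python never yields such a tree (`ast.parse` raises `SyntaxError`), which is the kind of constraint a flat token-level generator does not enforce by construction.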

Nematus: a Toolkit for Neural Machine Translation

Title Nematus: a Toolkit for Neural Machine Translation
Authors Rico Sennrich, Orhan Firat, Kyunghyun Cho, Alexandra Birch, Barry Haddow, Julian Hitschler, Marcin Junczys-Dowmunt, Samuel Läubli, Antonio Valerio Miceli Barone, Jozef Mokry, Maria Nădejde
Abstract We present Nematus, a toolkit for Neural Machine Translation. The toolkit prioritizes high translation accuracy, usability, and extensibility. Nematus has been used to build top-performing submissions to shared translation tasks at WMT and IWSLT, and has been used to train systems for production environments.
Tasks Machine Translation
Published 2017-03-13
URL http://arxiv.org/abs/1703.04357v1
PDF http://arxiv.org/pdf/1703.04357v1.pdf
PWC https://paperswithcode.com/paper/nematus-a-toolkit-for-neural-machine
Repo https://github.com/Avmb/code-docstring-corpus
Framework none

IKBT: solving closed-form Inverse Kinematics with Behavior Tree

Title IKBT: solving closed-form Inverse Kinematics with Behavior Tree
Authors Dianmu Zhang, Blake Hannaford
Abstract Serial robot arms have complicated kinematic equations which must be solved to write effective arm planning and control software (the Inverse Kinematics Problem). Existing software packages for inverse kinematics often rely on numerical methods which have significant shortcomings. Here we report a new symbolic inverse kinematics solver which overcomes the limitations of numerical methods, and the shortcomings of previous symbolic software packages. We integrate Behavior Trees, an execution planning framework previously used for controlling intelligent robot behavior, to organize the equation solving process, and a modular architecture for each solution technique. The system successfully solved, generated a LaTeX report, and generated a Python code template for 18 out of 19 example robots of 4-6 DOF. The system is readily extensible, maintainable, and multi-platform with few dependencies. The complete package is available with a Modified BSD license on GitHub.
Tasks
Published 2017-11-15
URL http://arxiv.org/abs/1711.05412v3
PDF http://arxiv.org/pdf/1711.05412v3.pdf
PWC https://paperswithcode.com/paper/ikbt-solving-closed-form-inverse-kinematics
Repo https://github.com/uw-biorobotics/IKBT
Framework none

One pixel attack for fooling deep neural networks

Title One pixel attack for fooling deep neural networks
Authors Jiawei Su, Danilo Vasconcellos Vargas, Sakurai Kouichi
Abstract Recent research has revealed that the output of Deep Neural Networks (DNN) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE). It requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE. The results show that 67.97% of the natural images in Kaggle CIFAR-10 test dataset and 16.04% of the ImageNet (ILSVRC 2012) test images can be perturbed to at least one target class by modifying just one pixel with 74.03% and 22.91% confidence on average. We also show the same vulnerability on the original CIFAR-10 dataset. Thus, the proposed attack explores a different take on adversarial machine learning in an extremely limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks. Besides, we also illustrate an important application of DE (or broadly speaking, evolutionary computation) in the domain of adversarial machine learning: creating tools that can effectively generate low-cost adversarial attacks against neural networks for evaluating robustness.
Tasks
Published 2017-10-24
URL https://arxiv.org/abs/1710.08864v7
PDF https://arxiv.org/pdf/1710.08864v7.pdf
PWC https://paperswithcode.com/paper/one-pixel-attack-for-fooling-deep-neural
Repo https://github.com/Axeleik/pixel_attack
Framework pytorch
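
Note The abstract above describes a black-box attack that searches for a single-pixel perturbation with differential evolution (DE). The sketch below shows the general shape of such a search on a toy problem: each candidate is an (x, y, value) triple, children are formed as a + F * (b - c) from three population members, and a child replaces its parent only if it lowers the model's confidence. The 4x4 grayscale image, the `toy_model` scoring function, and the population size, generation count and scale factor are all made-up assumptions for illustration; they are not the paper's networks or settings.

```python
import math
import random

SIZE = 4  # toy image is SIZE x SIZE grayscale with pixel values in [0, 1]


def toy_model(image):
    """Hypothetical black-box classifier: returns P(class 1) for a toy image."""
    score = sum(sum(row) for row in image) / (SIZE * SIZE) - 0.5
    score += 2.0 * (image[1][2] - 0.5)  # one pixel matters far more than the rest
    return 1.0 / (1.0 + math.exp(-4.0 * score))


def apply_pixel(image, candidate):
    """Return a copy of the image with exactly one pixel overwritten."""
    x, y, value = candidate
    perturbed = [list(row) for row in image]
    perturbed[int(round(y))][int(round(x))] = value
    return perturbed


def clip(candidate):
    """Keep coordinates inside the image and the value inside [0, 1]."""
    x, y, value = candidate
    return (min(SIZE - 1.0, max(0.0, x)),
            min(SIZE - 1.0, max(0.0, y)),
            min(1.0, max(0.0, value)))


def one_pixel_attack(image, pop_size=20, generations=30, scale=0.5, seed=0):
    """Minimise the model's confidence using only queries to the model."""
    rng = random.Random(seed)
    population = [(rng.uniform(0, SIZE - 1), rng.uniform(0, SIZE - 1), rng.random())
                  for _ in range(pop_size)]
    fitness = [toy_model(apply_pixel(image, cand)) for cand in population]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample(population, 3)
            child = clip(tuple(a[k] + scale * (b[k] - c[k]) for k in range(3)))
            child_fitness = toy_model(apply_pixel(image, child))
            if child_fitness < fitness[i]:  # greedy selection: keep improvements only
                population[i], fitness[i] = child, child_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return population[best], fitness[best]


image = [[0.6] * SIZE for _ in range(SIZE)]
candidate, confidence = one_pixel_attack(image)
adversarial = apply_pixel(image, candidate)
print("confidence before:", toy_model(image), "after:", confidence)
```

The search never looks at gradients or weights, only at the returned confidence, which is what makes this style of attack black-box.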

A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques

Title A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques
Authors Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut
Abstract The amount of text that is generated every day is increasing dramatically. This tremendous volume of mostly unstructured text cannot be simply processed and perceived by computers. Therefore, efficient and effective techniques and algorithms are required to discover useful patterns. Text mining is the task of extracting meaningful information from text, which has gained significant attention in recent years. In this paper, we describe several of the most fundamental text mining tasks and techniques including text pre-processing, classification and clustering. Additionally, we briefly explain text mining in biomedical and health care domains.
Tasks
Published 2017-07-10
URL http://arxiv.org/abs/1707.02919v2
PDF http://arxiv.org/pdf/1707.02919v2.pdf
PWC https://paperswithcode.com/paper/a-brief-survey-of-text-mining-classification
Repo https://github.com/RAJAT--PALIWAL/research_AI
Framework none

Graph Embedding Techniques, Applications, and Performance: A Survey

Title Graph Embedding Techniques, Applications, and Performance: A Survey
Authors Palash Goyal, Emilio Ferrara
Abstract Graphs, such as social networks, word co-occurrence networks, and communication networks, occur naturally in various real-world applications. Analyzing them yields insight into the structure of society, language, and different patterns of communication. Many approaches have been proposed to perform the analysis. Recently, methods which use the representation of graph nodes in vector space have gained traction from the research community. In this survey, we provide a comprehensive and structured analysis of various graph embedding techniques proposed in the literature. We first introduce the embedding task and its challenges such as scalability, choice of dimensionality, and features to be preserved, and their possible solutions. We then present three categories of approaches based on factorization methods, random walks, and deep learning, with examples of representative algorithms in each category and analysis of their performance on various tasks. We evaluate these state-of-the-art methods on a few common datasets and compare their performance against one another. Our analysis concludes by suggesting some potential applications and future directions. We finally present the open-source Python library we developed, named GEM (Graph Embedding Methods, available at https://github.com/palash1992/GEM), which provides all presented algorithms within a unified interface to foster and facilitate research on the topic.
Tasks Graph Embedding
Published 2017-05-08
URL http://arxiv.org/abs/1705.02801v4
PDF http://arxiv.org/pdf/1705.02801v4.pdf
PWC https://paperswithcode.com/paper/graph-embedding-techniques-applications-and
Repo https://github.com/olekscode/Power2TheWiki
Framework none
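
Note Of the three families named in the abstract above (factorization, random walks, deep learning), the random-walk family is easy to illustrate. The sketch below generates fixed-length random walks over a small adjacency list and turns them into (node, context) pairs, which is the kind of corpus a skip-gram style embedding model would then be trained on. The graph, the function names, and the parameters are hypothetical; this is not the GEM library's interface, and the embedding training step itself is omitted.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


def cooccurrence_pairs(walks, window):
    """Emit (node, context) pairs for nodes within `window` steps on a walk."""
    pairs = []
    for walk in walks:
        for i, node in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    pairs.append((node, walk[j]))
    return pairs


# Hypothetical undirected toy graph as an adjacency list.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
pairs = cooccurrence_pairs(walks, window=1)
print(walks[0], len(pairs))
```

Nodes that frequently co-occur on walks end up with similar embeddings once such pairs are fed to an embedding model, which is how this family preserves neighbourhood structure.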

Sockeye: A Toolkit for Neural Machine Translation

Title Sockeye: A Toolkit for Neural Machine Translation
Authors Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post
Abstract We describe Sockeye (version 1.12), an open-source sequence-to-sequence toolkit for Neural Machine Translation (NMT). Sockeye is a production-ready framework for training and applying models as well as an experimental platform for researchers. Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attentional recurrent neural networks, self-attentional transformers, and fully convolutional networks. Sockeye also supports a wide range of optimizers, normalization and regularization techniques, and inference improvements from current NMT literature. Users can easily run standard training recipes, explore different model settings, and incorporate new ideas. In this paper, we highlight Sockeye’s features and benchmark it against other NMT toolkits on two language arcs from the 2017 Conference on Machine Translation (WMT): English-German and Latvian-English. We report competitive BLEU scores across all three architectures, including an overall best score for Sockeye’s transformer implementation. To facilitate further comparison, we release all system outputs and training scripts used in our experiments. The Sockeye toolkit is free software released under the Apache 2.0 license.
Tasks Machine Translation
Published 2017-12-15
URL http://arxiv.org/abs/1712.05690v2
PDF http://arxiv.org/pdf/1712.05690v2.pdf
PWC https://paperswithcode.com/paper/sockeye-a-toolkit-for-neural-machine
Repo https://github.com/Izecson/saml-nmt
Framework mxnet