October 17, 2019

Paper Group ANR 697

AtDelfi: Automatically Designing Legible, Full Instructions For Games. Better Runtime Guarantees Via Stochastic Domination. Uncovering the Social Interaction in Swarm Intelligence with Network Science. Generalized Hadamard-Product Fusion Operators for Visual Question Answering. MLCapsule: Guarded Offline Deployment of Machine Learning as a Service. …

AtDelfi: Automatically Designing Legible, Full Instructions For Games


Title	AtDelfi: Automatically Designing Legible, Full Instructions For Games
Authors	Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros, Tiago Machado, Andy Nealen, Julian Togelius
Abstract	This paper introduces a fully automatic method for generating video game tutorials. The AtDELFI system (AuTomatically DEsigning Legible, Full Instructions for games) was created to investigate procedural generation of instructions that teach players how to play video games. We present a representation of game rules and mechanics using a graph system as well as a tutorial generation method that uses said graph representation. We demonstrate the concept by testing it on games within the General Video Game Artificial Intelligence (GVG-AI) framework; the paper discusses tutorials generated for eight different games. Our findings suggest that a graph representation scheme works well for simple arcade style games such as Space Invaders and Pacman, but it appears that tutorials for more complex games might require higher-level understanding of the game than just single mechanics.
Tasks
Published	2018-07-11
URL	http://arxiv.org/abs/1807.04375v2
PDF	http://arxiv.org/pdf/1807.04375v2.pdf
PWC	https://paperswithcode.com/paper/atdelfi-automatically-designing-legible-full
Repo
Framework

Better Runtime Guarantees Via Stochastic Domination


Title	Better Runtime Guarantees Via Stochastic Domination
Authors	Benjamin Doerr
Abstract	Apart from few exceptions, the mathematical runtime analysis of evolutionary algorithms is mostly concerned with expected runtimes. In this work, we argue that stochastic domination is a notion that should be used more frequently in this area. Stochastic domination allows to formulate much more informative performance guarantees, it allows to decouple the algorithm analysis into the true algorithmic part of detecting a domination statement and the probability-theoretical part of deriving the desired probabilistic guarantees from this statement, and it helps finding simpler and more natural proofs. As particular results, we prove a fitness level theorem which shows that the runtime is dominated by a sum of independent geometric random variables, we prove the first tail bounds for several classic runtime problems, and we give a short and natural proof for Witt’s result that the runtime of any $(\mu,p)$ mutation-based algorithm on any function with unique optimum is subdominated by the runtime of a variant of the (1+1) EA on the OneMax function. As side-products, we determine the fastest unbiased (1+1) algorithm for the LeadingOnes benchmark problem, both in the general case and when restricted to static mutation operators, and we prove a Chernoff-type tail bound for sums of independent coupon collector distributions.
Tasks
Published	2018-01-13
URL	http://arxiv.org/abs/1801.04487v5
PDF	http://arxiv.org/pdf/1801.04487v5.pdf
PWC	https://paperswithcode.com/paper/better-runtime-guarantees-via-stochastic
Repo
Framework
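
The fitness level statement in the abstract above (runtime dominated by a sum of independent geometric random variables) can be illustrated numerically. The following is a minimal sketch, not code from the paper: it assumes the (1+1) EA with standard bit mutation at rate 1/n on OneMax, uses the usual "flip one missing bit, keep the rest" lower bound as the per-level success probability, and all function names are hypothetical.

```python
import random


def level_success_prob(n: int, i: int) -> float:
    """Lower bound on the chance that standard bit mutation (rate 1/n)
    improves a bit string with i ones: flip one of the n - i zero bits
    and leave the other n - 1 bits unchanged. (Assumed bound.)"""
    return (n - i) * (1.0 / n) * (1.0 - 1.0 / n) ** (n - 1)


def expected_bound(n: int) -> float:
    """Expectation of the dominating sum of geometric variables."""
    return sum(1.0 / level_success_prob(n, i) for i in range(n))


def sample_geometric(p: float, rng: random.Random) -> int:
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials


def sample_bound(n: int, rng: random.Random) -> int:
    """One draw from the dominating distribution: a sum of independent
    geometric variables, one per fitness level."""
    return sum(sample_geometric(level_success_prob(n, i), rng) for i in range(n))


def run_one_plus_one_ea(n: int, rng: random.Random) -> int:
    """Iterations the (1+1) EA needs to reach the all-ones string on OneMax."""
    x = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while sum(x) < n:
        # Flip each bit independently with probability 1/n.
        y = [bit ^ (rng.random() < 1.0 / n) for bit in x]
        if sum(y) >= sum(x):  # accept if not worse
            x = y
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20
    ea = sorted(run_one_plus_one_ea(n, rng) for _ in range(500))
    bound = sorted(sample_bound(n, rng) for _ in range(500))
    # Compare empirical medians and 90% quantiles of both distributions.
    print("EA   :", ea[250], ea[450])
    print("bound:", bound[250], bound[450])
```

Because the statement concerns whole distributions rather than expectations alone, comparing quantiles of the two samples (as in the `__main__` block) is closer in spirit to a domination claim than comparing means.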


Uncovering the Social Interaction in Swarm Intelligence with Network Science


Title	Uncovering the Social Interaction in Swarm Intelligence with Network Science
Authors	Marcos Oliveira, Diego Pinheiro, Mariana Macedo, Carmelo Bastos-Filho, Ronaldo Menezes
Abstract	Swarm intelligence is the collective behavior emerging in systems with locally interacting components. Because of their self-organization capabilities, swarm-based systems show essential properties for handling real-world problems, such as robustness, scalability, and flexibility. Yet we do not know why swarm-based algorithms work well, nor can we compare the different approaches in the literature. The lack of a common framework capable of characterizing these swarm-based algorithms, transcending their particularities, has led to a stream of publications inspired by different aspects of nature without a systematic comparison with existing approaches. Here, we address this gap by introducing a network-based framework, the interaction network, to examine computational swarm-based systems through the lens of the social dynamics of that network; a clear example of network science being applied to bring further clarity to a complicated field within artificial intelligence. We discuss the social interactions of four well-known swarm-based algorithms and provide an in-depth case study of Particle Swarm Optimization. The interaction network enables researchers to study swarm algorithms as systems, removing the algorithm particularities from the analyses while focusing on the structure of the social interactions.
Tasks
Published	2018-11-08
URL	https://arxiv.org/abs/1811.03539v2
PDF	https://arxiv.org/pdf/1811.03539v2.pdf
PWC	https://paperswithcode.com/paper/unveiling-swarm-intelligence-with-network
Repo
Framework

Generalized Hadamard-Product Fusion Operators for Visual Question Answering


Title	Generalized Hadamard-Product Fusion Operators for Visual Question Answering
Authors	Brendan Duke, Graham W. Taylor
Abstract	We propose a generalized class of multimodal fusion operators for the task of visual question answering (VQA). We identify generalizations of existing multimodal fusion operators based on the Hadamard product, and show that specific non-trivial instantiations of this generalized fusion operator exhibit superior performance in terms of OpenEnded accuracy on the VQA task. In particular, we introduce Nonlinearity Ensembling, Feature Gating, and post-fusion neural network layers as fusion operator components, culminating in an absolute percentage point improvement of $1.1\%$ on the VQA 2.0 test-dev set over baseline fusion operators, which use the same features as input. We use our findings as evidence that our generalized class of fusion operators could lead to the discovery of even superior task-specific operators when used as a search space in an architecture search over fusion operators.
Tasks	Neural Architecture Search, Question Answering, Visual Question Answering
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09374v2
PDF	http://arxiv.org/pdf/1803.09374v2.pdf
PWC	https://paperswithcode.com/paper/generalized-hadamard-product-fusion-operators
Repo
Framework
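
For readers unfamiliar with Hadamard-product fusion, the basic idea behind the baseline operators mentioned in the abstract above can be sketched as follows. This is an illustrative sketch only, not the operators proposed in the paper: the tanh projections, the sigmoid gate driven by the question features, and all names (`hadamard_fusion`, `gated_hadamard_fusion`, `w_gate`) are assumptions made for the example.

```python
import math


def linear(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def hadamard_fusion(q, v, w_q, w_v):
    """Project question features q and image features v to a shared size,
    squash with tanh, then multiply element-wise (the Hadamard product)."""
    q_proj = [math.tanh(a) for a in linear(w_q, q)]
    v_proj = [math.tanh(a) for a in linear(w_v, v)]
    return [a * b for a, b in zip(q_proj, v_proj)]


def gated_hadamard_fusion(q, v, w_q, w_v, w_gate):
    """Hypothetical gating variant: a sigmoid gate computed from the
    question features scales each fused feature."""
    fused = hadamard_fusion(q, v, w_q, w_v)
    gate = [1.0 / (1.0 + math.exp(-a)) for a in linear(w_gate, q)]
    return [g * f for g, f in zip(gate, fused)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    question = [0.5, -1.0]   # toy question embedding
    image = [2.0, 0.25]      # toy image embedding
    print(hadamard_fusion(question, image, identity, identity))
    print(gated_hadamard_fusion(question, image, identity, identity, identity))
```

The element-wise product keeps the fused vector the same size as the projections, which is what makes it cheap enough to vary and combine components such as gates or extra layers around it.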

MLCapsule: Guarded Offline Deployment of Machine Learning as a Service


Title	MLCapsule: Guarded Offline Deployment of Machine Learning as a Service
Authors	Lucjan Hanzlik, Yang Zhang, Kathrin Grosse, Ahmed Salem, Max Augustin, Michael Backes, Mario Fritz
Abstract	With the widespread use of machine learning (ML) techniques, ML as a service has become increasingly popular. In this setting, an ML model resides on a server and users can query it with their data via an API. However, if the user’s input is sensitive, sending it to the server is undesirable and sometimes even legally not possible. Equally, the service provider does not want to share the model by sending it to the client for protecting its intellectual property and pay-per-query business model. In this paper, we propose MLCapsule, a guarded offline deployment of machine learning as a service. MLCapsule executes the model locally on the user’s side and therefore the data never leaves the client. Meanwhile, MLCapsule offers the service provider the same level of control and security of its model as the commonly used server-side execution. In addition, MLCapsule is applicable to offline applications that require local execution. Beyond protecting against direct model access, we couple the secure offline deployment with defenses against advanced attacks on machine learning models such as model stealing, reverse engineering, and membership inference.
Tasks
Published	2018-08-01
URL	http://arxiv.org/abs/1808.00590v2
PDF	http://arxiv.org/pdf/1808.00590v2.pdf
PWC	https://paperswithcode.com/paper/mlcapsule-guarded-offline-deployment-of
Repo
Framework

A Dynamic Network and Representation Learning Approach for Quantifying Economic Growth from Satellite Imagery


Title	A Dynamic Network and Representation Learning Approach for Quantifying Economic Growth from Satellite Imagery
Authors	Jiqian Dong, Gopaljee Atulya, Kartikeya Bhardwaj, Radu Marculescu
Abstract	Quantifying the improvement in human living standard, as well as the city growth in developing countries, is a challenging problem due to the lack of reliable economic data. Therefore, there is a fundamental need for alternate, largely unsupervised, computational methods that can estimate the economic conditions in the developing regions. To this end, we propose a new network science- and representation learning-based approach that can quantify economic indicators and visualize the growth of various regions. More precisely, we first create a dynamic network drawn out of high-resolution nightlight satellite images. We then demonstrate that using representation learning to mine the resulting network, our proposed approach can accurately predict spatial gross economic expenditures over large regions. Our method, which requires only nightlight images and limited survey data, can capture city-growth, as well as how people’s living standard is changing; this can ultimately facilitate the decision makers’ understanding of growth without heavily relying on expensive and time-consuming surveys.
Tasks	Representation Learning
Published	2018-12-01
URL	http://arxiv.org/abs/1812.00141v1
PDF	http://arxiv.org/pdf/1812.00141v1.pdf
PWC	https://paperswithcode.com/paper/a-dynamic-network-and-representation
Repo
Framework

Evaluating Semantic Rationality of a Sentence: A Sememe-Word-Matching Neural Network based on HowNet


Title	Evaluating Semantic Rationality of a Sentence: A Sememe-Word-Matching Neural Network based on HowNet
Authors	Shu Liu, Jingjing Xu, Xuancheng Ren, Xu Sun
Abstract	Automatic evaluation of semantic rationality is an important yet challenging task, and current automatic techniques cannot reliably identify whether a sentence is semantically rational. Methods based on language models measure a sentence not by rationality but by commonness. Methods based on similarity to human-written sentences fail if human-written references are not available. In this paper, we propose a novel model called Sememe-Word-Matching Neural Network (SWM-NN) to tackle semantic rationality evaluation by taking advantage of the sememe knowledge base HowNet. The advantage is that our model can utilize a proper combination of sememes to represent the fine-grained semantic meanings of a word within the specific contexts. We use the fine-grained semantic representation to help the model learn the semantic dependency among words. To evaluate the effectiveness of the proposed model, we build a large-scale rationality evaluation dataset. Experimental results on this dataset show that the proposed model outperforms the competitive baselines with a 5.4% improvement in accuracy.
Tasks	Language Modelling
Published	2018-09-11
URL	http://arxiv.org/abs/1809.03999v1
PDF	http://arxiv.org/pdf/1809.03999v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-semantic-rationality-of-a-sentence
Repo
Framework

Multi-scale prediction for robust hand detection and classification


Title	Multi-scale prediction for robust hand detection and classification
Authors	Ding Lu, Yong Wang, Robert Laganiere, Xinbin Luo, Shan Fu
Abstract	In this paper, we present a multi-scale fully convolutional network (MSP-RFCN) to robustly detect and classify human hands under various challenging conditions. In our approach, the input image is passed through the proposed network to generate score maps, based on multi-scale predictions. The network has been specifically designed to deal with small objects. It uses an architecture based on region proposals generated at multiple scales. Our method is evaluated on challenging hand datasets, namely the Vision for Intelligent Vehicles and Applications (VIVA) Challenge and the Oxford hand dataset. It is compared against recent hand detection algorithms. The experimental results demonstrate that our proposed method achieves state-of-the-art detection for hands of various sizes.
Tasks
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08220v1
PDF	http://arxiv.org/pdf/1804.08220v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-prediction-for-robust-hand
Repo
Framework

Multiresolution Tree Networks for 3D Point Cloud Processing


Title	Multiresolution Tree Networks for 3D Point Cloud Processing
Authors	Matheus Gadelha, Rui Wang, Subhransu Maji
Abstract	We present multiresolution tree-structured networks to process point clouds for 3D shape understanding and generation tasks. Our network represents a 3D shape as a set of locality-preserving 1D ordered lists of points at multiple resolutions. This allows efficient feed-forward processing through 1D convolutions and coarse-to-fine analysis through a multi-grid architecture, and it leads to faster convergence and a small memory footprint during training. The proposed tree-structured encoders can be used to classify shapes and outperform existing point-based architectures on shape classification benchmarks, while tree-structured decoders can be used for generating point clouds directly and they outperform existing approaches for image-to-shape inference tasks learned using the ShapeNet dataset. Our model also allows unsupervised learning of point-cloud based shapes by using a variational autoencoder, leading to higher-quality generated shapes.
Tasks
Published	2018-07-10
URL	http://arxiv.org/abs/1807.03520v2
PDF	http://arxiv.org/pdf/1807.03520v2.pdf
PWC	https://paperswithcode.com/paper/multiresolution-tree-networks-for-3d-point
Repo
Framework

ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio


Title	ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio
Authors	Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau
Abstract	Adversarial machine learning research has recently demonstrated the feasibility to confuse automatic speech recognition (ASR) models by introducing acoustically imperceptible perturbations to audio samples. To help researchers and practitioners gain better understanding of the impact of such attacks, and to provide them with tools to help them more easily evaluate and craft strong defenses for their models, we present ADAGIO, the first tool designed to allow interactive experimentation with adversarial attacks and defenses on an ASR model in real time, both visually and aurally. ADAGIO incorporates AMR and MP3 audio compression techniques as defenses, which users can interactively apply to attacked audio samples. We show that these techniques, which are based on psychoacoustic principles, effectively eliminate targeted attacks, reducing the attack success rate from 92.5% to 0%. We will demonstrate ADAGIO and invite the audience to try it on the Mozilla Common Voice dataset.
Tasks	Adversarial Attack, Speech Recognition
Published	2018-05-30
URL	http://arxiv.org/abs/1805.11852v1
PDF	http://arxiv.org/pdf/1805.11852v1.pdf
PWC	https://paperswithcode.com/paper/adagio-interactive-experimentation-with
Repo
Framework

Machine Learning Harnesses Molecular Dynamics to Discover New $μ$ Opioid Chemotypes


Title	Machine Learning Harnesses Molecular Dynamics to Discover New $μ$ Opioid Chemotypes
Authors	Evan N. Feinberg, Amir Barati Farimani, Rajendra Uprety, Amanda Hunkele, Gavril W. Pasternak, Susruta Majumdar, Vijay S. Pande
Abstract	Computational chemists typically assay drug candidates by virtually screening compounds against crystal structures of a protein despite the fact that some targets, like the $\mu$ Opioid Receptor and other members of the GPCR family, traverse many non-crystallographic states. We discover new conformational states of $\mu$OR with molecular dynamics simulation and then machine learn ligand-structure relationships to predict opioid ligand function. These artificial intelligence models identified a novel $\mu$ opioid chemotype.
Tasks
Published	2018-03-12
URL	http://arxiv.org/abs/1803.04479v1
PDF	http://arxiv.org/pdf/1803.04479v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-harnesses-molecular-dynamics
Repo
Framework

Hierarchical Long-term Video Prediction without Supervision


Title	Hierarchical Long-term Video Prediction without Supervision
Authors	Nevan Wichers, Ruben Villegas, Dumitru Erhan, Honglak Lee
Abstract	Much of recent research has been devoted to video prediction and generation, yet most of the previous works have demonstrated only limited success in generating videos on short-term horizons. The hierarchical video prediction method by Villegas et al. (2017) is an example of a state-of-the-art method for long-term video prediction, but their method is limited because it requires ground truth annotation of high-level structures (e.g., human joint landmarks) at training time. Our network encodes the input frame, predicts a high-level encoding into the future, and then a decoder with access to the first frame produces the predicted image from the predicted encoding. The decoder also produces a mask that outlines the predicted foreground object (e.g., person) as a by-product. Unlike Villegas et al. (2017), we develop a novel training method that jointly trains the encoder, the predictor, and the decoder together without high-level supervision; we further improve upon this by using an adversarial loss in the feature space to train the predictor. Our method can predict about 20 seconds into the future and provides better results compared to Denton and Fergus (2018) and Finn et al. (2016) on the Human 3.6M dataset.
Tasks	Video Prediction
Published	2018-06-12
URL	http://arxiv.org/abs/1806.04768v1
PDF	http://arxiv.org/pdf/1806.04768v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-long-term-video-prediction
Repo
Framework

Compositional Attention Networks for Interpretability in Natural Language Question Answering


Title	Compositional Attention Networks for Interpretability in Natural Language Question Answering
Authors	Muru Selvakumar, Suriyadeepan Ramamoorthy, Vaidheeswaran Archana, Malaikannan Sankarasubbu
Abstract	MAC Net is a compositional attention network designed for Visual Question Answering. We propose a modified MAC net architecture for Natural Language Question Answering. Question Answering typically requires Language Understanding and multi-step Reasoning. MAC net’s unique architecture - the separation between memory and control, facilitates data-driven iterative reasoning. This makes it an ideal candidate for solving tasks that involve logical reasoning. Our experiments with 20 bAbI tasks demonstrate the value of MAC net as a data-efficient and interpretable architecture for Natural Language Question Answering. The transparent nature of MAC net provides a highly granular view of the reasoning steps taken by the network in answering a query.
Tasks	Question Answering, Visual Question Answering
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12698v1
PDF	http://arxiv.org/pdf/1810.12698v1.pdf
PWC	https://paperswithcode.com/paper/compositional-attention-networks-for
Repo
Framework

Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server


Title	Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server
Authors	Cheng Daning, Xia Fen, Li Shigang, Zhang Yunquan
Abstract	In AI research and industry, machine learning is the most widely used tool. One of the most important machine learning algorithms is the Gradient Boosting Decision Tree (GBDT), whose training process needs considerable computational resources and time. To shorten GBDT training time, many works have tried to run GBDT on a Parameter Server. However, those GBDT algorithms are synchronous parallel algorithms, which fail to make full use of the Parameter Server. In this paper, we examine the possibility of using asynchronous parallel methods to train a GBDT model and name this algorithm asynch-SGBDT (asynchronous parallel stochastic gradient boosting decision tree). Our theoretical and experimental results indicate that the scalability of asynch-SGBDT is influenced by the sample diversity of datasets, sampling rate, step length and the setting of GBDT tree. Experimental results also show that the asynch-SGBDT training process reaches a linear speedup in an asynchronous parallel manner when datasets and GBDT trees meet high scalability requirements.
Tasks
Published	2018-04-12
URL	https://arxiv.org/abs/1804.04659v4
PDF	https://arxiv.org/pdf/1804.04659v4.pdf
PWC	https://paperswithcode.com/paper/asynch-sgbdt-asynchronous-parallel-stochastic
Repo
Framework

Star Shape Prior in Fully Convolutional Networks for Skin Lesion Segmentation


Title	Star Shape Prior in Fully Convolutional Networks for Skin Lesion Segmentation
Authors	Zahra Mirikharaji, Ghassan Hamarneh
Abstract	Semantic segmentation is an important preliminary step towards automatic medical image interpretation. Recently deep convolutional neural networks have become the first choice for the task of pixel-wise class prediction. While incorporating prior knowledge about the structure of target objects has proven effective in traditional energy-based segmentation approaches, there has not been a clear way for encoding prior knowledge into deep learning frameworks. In this work, we propose a new loss term that encodes the star shape prior into the loss function of an end-to-end trainable fully convolutional network (FCN) framework. We penalize non-star shape segments in FCN prediction maps to guarantee a global structure in segmentation results. Our experiments demonstrate the advantage of regularizing FCN parameters by the star shape prior and our results on the ISBI 2017 skin segmentation challenge data set achieve the first rank in the segmentation task among $21$ participating teams.
Tasks	Lesion Segmentation, Semantic Segmentation
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08437v1
PDF	http://arxiv.org/pdf/1806.08437v1.pdf
PWC	https://paperswithcode.com/paper/star-shape-prior-in-fully-convolutional
Repo
Framework
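
To make the star shape property from the abstract above concrete: a region is star-shaped with respect to a center if every point of the region can be joined to the center by a straight segment that stays inside the region. The sketch below checks that property on a binary mask. It is not the differentiable loss term proposed in the paper; the choice of center, the nearest-pixel sampling of segments, and the function name `is_star_shaped` are assumptions made for illustration.

```python
def is_star_shaped(mask, center):
    """Return True if every foreground pixel can 'see' the center through
    foreground pixels only, sampling the straight segment between them."""
    rows, cols = len(mask), len(mask[0])
    cr, cc = center
    if not mask[cr][cc]:
        return False  # the center itself must belong to the segment
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            steps = max(abs(r - cr), abs(c - cc))
            for k in range(1, steps):
                # Nearest pixel to the k-th sample on the segment.
                pr = cr + round((r - cr) * k / steps)
                pc = cc + round((c - cc) * k / steps)
                if not mask[pr][pc]:
                    return False
    return True


if __name__ == "__main__":
    plus = [
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
    ]
    print(is_star_shaped(plus, (2, 2)))        # plus sign seen from its middle
    print(is_star_shaped([[1, 0, 1]], (0, 0)))  # gap between the two pixels
```

A loss-based formulation would penalize such violations softly on predicted probabilities instead of returning a hard yes or no, but the geometric condition being encouraged is the same.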