Paper Group ANR 346
Discretization-free Knowledge Gradient Methods for Bayesian Optimization
Title | Discretization-free Knowledge Gradient Methods for Bayesian Optimization |
Authors | Jian Wu, Peter I. Frazier |
Abstract | This paper studies Bayesian ranking and selection (R&S) problems with correlated prior beliefs and continuous domains, i.e. Bayesian optimization (BO). Knowledge gradient methods [Frazier et al., 2008, 2009], which sample the one-step Bayes-optimal point, have been widely studied for discrete R&S problems. When used over continuous domains, previous work on the knowledge gradient [Scott et al., 2011, Wu and Frazier, 2016, Wu et al., 2017] often relies on a discretized finite approximation. However, the discretization introduces error and scales poorly as the dimension of the domain grows. In this paper, we develop a fast discretization-free knowledge gradient method for Bayesian optimization. Our method is not restricted to the fully sequential setting, but is useful in all settings where the knowledge gradient can be used over continuous domains. We show how our method can be generalized to handle (i) the suggestion of batches of points (parallel knowledge gradient); (ii) the setting where derivative information is available in the optimization process (derivative-enabled knowledge gradient). In numerical experiments, we demonstrate that the discretization-free knowledge gradient method finds global optima significantly faster than previous Bayesian optimization algorithms on both synthetic test functions and real-world applications, especially when function evaluations are noisy; and derivative-enabled knowledge gradient can further improve performance, even outperforming gradient-based optimizers such as BFGS when derivative information is available. |
Tasks | |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06541v2 |
http://arxiv.org/pdf/1707.06541v2.pdf | |
PWC | https://paperswithcode.com/paper/discretization-free-knowledge-gradient |
Repo | |
Framework | |
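
To make the knowledge-gradient idea concrete, here is a minimal Monte Carlo estimator of the one-step knowledge gradient at a candidate point, using scikit-learn's GaussianProcessRegressor. It is a sketch of the classical discretized acquisition the paper improves on (note the finite grid `X_grid`), not the authors' discretization-free algorithm; the function name and defaults are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def knowledge_gradient(gp, X_obs, y_obs, x_cand, X_grid, n_fantasy=64, noise=1e-2, rng=None):
    """Monte Carlo estimate of the one-step knowledge gradient at x_cand:
    KG(x) = E[max_x' mu_{n+1}(x')] - max_x' mu_n(x'), with the expectation
    taken over the fantasized noisy observation at x."""
    rng = np.random.default_rng() if rng is None else rng
    best_now = gp.predict(X_grid).max()                 # current best posterior mean
    mu, sigma = gp.predict(x_cand.reshape(1, -1), return_std=True)
    gains = []
    for _ in range(n_fantasy):
        y_fant = rng.normal(mu[0], np.sqrt(sigma[0] ** 2 + noise))   # fantasized observation
        gp_fant = GaussianProcessRegressor(kernel=gp.kernel_, alpha=noise, optimizer=None)
        gp_fant.fit(np.vstack([X_obs, x_cand]), np.append(y_obs, y_fant))
        gains.append(gp_fant.predict(X_grid).max() - best_now)
    return float(np.mean(gains))
```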
Machine vs Machine: Minimax-Optimal Defense Against Adversarial Examples
Title | Machine vs Machine: Minimax-Optimal Defense Against Adversarial Examples |
Authors | Jihun Hamm, Akshay Mehra |
Abstract | Recently, researchers have discovered that the state-of-the-art object classifiers can be fooled easily by small perturbations in the input unnoticeable to human eyes. It is also known that an attacker can generate strong adversarial examples if she knows the classifier parameters. Conversely, a defender can robustify the classifier by retraining if she has access to the adversarial examples. We explain and formulate this adversarial example problem as a two-player continuous zero-sum game, and demonstrate the fallacy of evaluating a defense or an attack as a static problem. To find the best worst-case defense against whitebox attacks, we propose a continuous minimax optimization algorithm. We demonstrate the minimax defense with two types of attack classes – gradient-based and neural network-based attacks. Experiments with the MNIST and the CIFAR-10 datasets demonstrate that the defense found by numerical minimax optimization is indeed more robust than non-minimax defenses. We discuss directions for improving the result toward achieving robustness against multiple types of attack classes. |
Tasks | |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04368v3 |
http://arxiv.org/pdf/1711.04368v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-vs-machine-minimax-optimal-defense |
Repo | |
Framework | |
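
The abstract frames defense as a continuous zero-sum game between attacker and defender. A minimal sketch of that inner-max/outer-min structure is shown below, using a one-step gradient-sign attack against a NumPy logistic-regression defender; it illustrates alternating best responses under the assumption of inputs in [0, 1], not the paper's actual minimax optimization algorithm or attack classes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(w, b, X, y, eps):
    """Inner maximization: one-step gradient-sign attack on the logistic loss."""
    p = sigmoid(X @ w + b)
    grad_x = np.outer(p - y, w)                    # d(loss)/dx for each sample
    return np.clip(X + eps * np.sign(grad_x), 0.0, 1.0)

def minimax_train(X, y, eps=0.1, lr=0.1, epochs=200):
    """Outer minimization: fit the classifier on worst-case (attacked) inputs."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        X_adv = fgsm_perturb(w, b, X, y, eps)      # attacker's best response
        p = sigmoid(X_adv @ w + b)
        w -= lr * X_adv.T @ (p - y) / len(y)       # defender's gradient step
        b -= lr * np.mean(p - y)
    return w, b
```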
AFIF4: Deep Gender Classification based on AdaBoost-based Fusion of Isolated Facial Features and Foggy Faces
Title | AFIF4: Deep Gender Classification based on AdaBoost-based Fusion of Isolated Facial Features and Foggy Faces |
Authors | Mahmoud Afifi, Abdelrahman Abdelhamed |
Abstract | Gender classification aims at recognizing a person’s gender. Despite the high accuracy achieved by state-of-the-art methods for this task, there is still room for improvement in generalized and unrestricted datasets. In this paper, we advocate a new strategy inspired by the behavior of humans in gender recognition. Instead of dealing with the face image as a sole feature, we rely on the combination of isolated facial features and a holistic feature which we call the foggy face. Then, we use these features to train deep convolutional neural networks followed by an AdaBoost-based score fusion to infer the final gender class. We evaluate our method on four challenging datasets to demonstrate its efficacy in achieving better or on-par accuracy with state-of-the-art methods. In addition, we present a new face dataset that intensifies the challenges of occluded faces and illumination changes, which we believe to be a much-needed resource for gender classification research. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.04277v5 |
http://arxiv.org/pdf/1706.04277v5.pdf | |
PWC | https://paperswithcode.com/paper/afif4-deep-gender-classification-based-on |
Repo | |
Framework | |
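
As a rough sketch of the score-fusion stage, one can stack the per-feature classification scores and let AdaBoost learn the fusion, as below. The stream names and the use of scikit-learn's AdaBoostClassifier are assumptions for illustration; the paper's exact feature streams and fusion setup may differ.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def fuse_scores(score_blocks, labels):
    """AdaBoost-style score fusion (illustrative, not the paper's exact setup).

    score_blocks: list of (n_samples, n_classes) score matrices, one per
    feature stream (e.g. eyes, nose, mouth, foggy face).
    """
    X = np.hstack(score_blocks)                 # stack per-stream scores side by side
    fuser = AdaBoostClassifier(n_estimators=100)
    fuser.fit(X, labels)
    return fuser

# usage sketch: gender = fuse_scores([eye_scores, mouth_scores, foggy_scores], y).predict(X_new)
```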
Overview: Generalizations of Multi-Agent Path Finding to Real-World Scenarios
Title | Overview: Generalizations of Multi-Agent Path Finding to Real-World Scenarios |
Authors | Hang Ma, Sven Koenig, Nora Ayanian, Liron Cohen, Wolfgang Hoenig, T. K. Satish Kumar, Tansel Uras, Hong Xu, Craig Tovey, Guni Sharon |
Abstract | Multi-agent path finding (MAPF) is well-studied in artificial intelligence, robotics, theoretical computer science and operations research. We discuss issues that arise when generalizing MAPF methods to real-world scenarios and four research directions that address them. We emphasize the importance of addressing these issues as opposed to developing faster methods for the standard formulation of the MAPF problem. |
Tasks | Multi-Agent Path Finding |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05515v1 |
http://arxiv.org/pdf/1702.05515v1.pdf | |
PWC | https://paperswithcode.com/paper/overview-generalizations-of-multi-agent-path |
Repo | |
Framework | |
Comprehensive Data Set for Automatic Single Camera Visual Speed Measurement
Title | Comprehensive Data Set for Automatic Single Camera Visual Speed Measurement |
Authors | Jakub Sochor, Roman Juránek, Jakub Špaňhel, Lukáš Maršík, Adam Široký, Adam Herout, Pavel Zemčík |
Abstract | In this paper, we focus on traffic camera calibration and visual speed measurement from a single monocular camera, an important task in visual traffic surveillance. Existing methods addressing this problem are difficult to compare due to the lack of a common data set with reliable ground truth. Therefore, it is not clear how the methods compare in various aspects and what factors affect their performance. We captured a new data set of 18 full-HD videos, each around 1 hour long, recorded at six different locations. Vehicles in the videos (20,865 instances in total) are annotated with precise speed measurements from optical gates using LiDAR and verified with several reference GPS tracks. We have made the data set available for download; it contains the videos and metadata (calibration, lengths of features in the image, annotations, and so on) for future comparison and evaluation. Camera calibration is the most crucial part of the speed measurement; therefore, we provide a brief overview of existing methods, analyze a recently published method for fully automatic camera calibration and vehicle speed measurement, and report its results on this data set in detail. |
Tasks | Calibration |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06441v2 |
http://arxiv.org/pdf/1702.06441v2.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-data-set-for-automatic-single |
Repo | |
Framework | |
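
A simplified sketch of how speed follows from calibration: once a homography maps image points onto metric road-plane coordinates, speed is the ground-plane distance travelled per frame times the frame rate. The homography `H` is a hypothetical pre-computed calibration in metres; the paper's full calibration pipeline is not reproduced here.

```python
import numpy as np

def image_to_ground(H, pt):
    """Project an image point onto the road plane via a 3x3 homography H."""
    x = H @ np.array([pt[0], pt[1], 1.0])
    return x[:2] / x[2]

def vehicle_speed_kmh(H, track, fps):
    """Median speed over a track of image points (one point per frame).

    Assumes H already maps pixels to metres on the road plane, which is
    exactly what the calibration step has to provide.
    """
    ground = np.array([image_to_ground(H, p) for p in track])
    dists = np.linalg.norm(np.diff(ground, axis=0), axis=1)   # metres per frame
    return float(np.median(dists) * fps * 3.6)                # convert m/s to km/h
```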
Memoisation: Purely, Left-recursively, and with (Continuation Passing) Style
Title | Memoisation: Purely, Left-recursively, and with (Continuation Passing) Style |
Authors | Samer Abdallah |
Abstract | Memoisation, or tabling, is a well-known technique that yields large improvements in the performance of some recursive computations. Tabled resolution in Prologs such as XSB and B-Prolog can transform so-called left-recursive predicates from non-terminating computations into finite and well-behaved ones. In the functional programming literature, memoisation has usually been implemented in a way that does not handle left-recursion, requiring supplementary mechanisms to prevent non-termination. A notable exception is Johnson’s (1995) continuation passing approach in Scheme. This, however, relies on mutation of a memo table data structure and coding in explicit continuation passing style. We show how Johnson’s approach can be implemented purely functionally in a modern, strongly typed functional language (OCaml), presented via a monadic interface that hides the implementation details, yet provides a way to return a compact representation of the memo tables at the end of the computation. |
Tasks | |
Published | 2017-07-15 |
URL | http://arxiv.org/abs/1707.04724v1 |
http://arxiv.org/pdf/1707.04724v1.pdf | |
PWC | https://paperswithcode.com/paper/memoisation-purely-left-recursively-and-with |
Repo | |
Framework | |
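
For contrast, the conventional functional-style memoisation the abstract refers to can be sketched as a dictionary-backed decorator (here in Python rather than the paper's OCaml). Note that this standard approach still fails to terminate on left-recursive definitions, which is exactly the gap the continuation-passing formulation closes.

```python
import functools

def memoise(fn):
    """Plain table-based memoisation: cache results keyed by the arguments.

    This is the conventional approach the abstract contrasts with: it speeds
    up repeated calls but still loops forever on left-recursive definitions.
    """
    table = {}
    @functools.wraps(fn)
    def wrapped(*args):
        if args not in table:
            table[args] = fn(*args)
        return table[args]
    return wrapped

@memoise
def fib(n):
    # Exponential without memoisation, linear with it.
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```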
Validation of Enhanced Emotion Enabled Cognitive Agent Using Virtual Overlay Multi-Agent System Approach
Title | Validation of Enhanced Emotion Enabled Cognitive Agent Using Virtual Overlay Multi-Agent System Approach |
Authors | Faisal Riaz, Muaz A. Niazi |
Abstract | Making roads safer by avoiding road collisions is one of the main reasons for inventing autonomous vehicles (AVs). In this context, designing agent-based collision avoidance components of AVs which truly represent human cognition and emotions is a more feasible approach, as agents can replace human drivers. However, to the best of our knowledge, very few human emotion and cognition-inspired agent-based studies have previously been conducted in this domain. Furthermore, these agent-based solutions have not been validated using any key validation technique. In view of this lack of validation practices, we have selected the state-of-the-art Emotion Enabled Cognitive Agent (EEC_Agent), which was proposed to avoid lateral collisions between semi-AVs. The architecture of EEC_Agent has been revised using the Exploratory Agent Based Modeling (EABM) level of the Cognitive Agent Based Computing (CABC) framework, and a real-time fear-emotion generation mechanism based on the Ortony, Clore & Collins (OCC) model has also been introduced. The proposed fear generation mechanism has then been validated at the Validated Agent Based Modeling level of the CABC framework using a Virtual Overlay MultiAgent System (VOMAS). Extensive simulation and practical experiments demonstrate that the Enhanced EEC_Agent exhibits the capability to feel different levels of fear according to different traffic situations, and also needs a smaller Stopping Sight Distance (SSD) and Overtaking Sight Distance (OSD) than human drivers. |
Tasks | Autonomous Vehicles |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01628v1 |
http://arxiv.org/pdf/1708.01628v1.pdf | |
PWC | https://paperswithcode.com/paper/validation-of-enhanced-emotion-enabled |
Repo | |
Framework | |
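
Since the evaluation compares Stopping Sight Distance (SSD) against human drivers, a standard textbook SSD computation (reaction distance plus braking distance) is sketched below. The formula and default parameter values are the common AASHTO-style ones, not necessarily those used in the paper.

```python
def stopping_sight_distance(speed_kmh, reaction_time_s=2.5, friction=0.35, grade=0.0):
    """Stopping sight distance in metres: reaction distance plus braking distance.

    `grade` is the road gradient (positive uphill). Parameter defaults are
    common textbook values, not values taken from the paper.
    """
    reaction = 0.278 * speed_kmh * reaction_time_s          # distance covered before braking
    braking = speed_kmh ** 2 / (254.0 * (friction + grade)) # distance covered while braking
    return reaction + braking

print(stopping_sight_distance(80))   # approximately 128 m at 80 km/h with the defaults
```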
Deep Contextual Recurrent Residual Networks for Scene Labeling
Title | Deep Contextual Recurrent Residual Networks for Scene Labeling |
Authors | T. Hoang Ngan Le, Chi Nhan Duong, Ligong Han, Khoa Luu, Marios Savvides, Dipan Pal |
Abstract | Designed as extremely deep architectures, deep residual networks, which provide rich visual representations and robust convergence behavior, have recently achieved exceptional performance on numerous computer vision problems. When applied directly to scene labeling, however, they are limited in capturing long-range contextual dependence, which is a critical aspect of the task. To address this issue, we propose a novel approach, Contextual Recurrent Residual Networks (CRRN), which simultaneously handles rich visual representation learning and long-range context modeling within a fully end-to-end deep network. Furthermore, our proposed end-to-end CRRN is trained entirely from scratch, without using any pre-trained models, in contrast to most existing methods, which are usually fine-tuned from state-of-the-art pre-trained models, e.g. VGG-16, ResNet, etc. Experiments are conducted on four challenging scene labeling datasets, i.e. SiftFlow, CamVid, Stanford Background and SUN, and compared against various state-of-the-art scene labeling methods. |
Tasks | Representation Learning, Scene Labeling |
Published | 2017-04-12 |
URL | http://arxiv.org/abs/1704.03594v1 |
http://arxiv.org/pdf/1704.03594v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-contextual-recurrent-residual-networks |
Repo | |
Framework | |
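
For readers unfamiliar with the building block, a generic pre-activation residual unit in PyTorch is sketched below; CRRN augments this kind of unit with recurrent contextual connections, which are not reproduced here.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Generic pre-activation residual block (the kind of unit a residual
    network stacks); the paper's contextual recurrent connections are not
    reproduced in this sketch."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # identity shortcut carries the representation through
```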
DiffuserCam: Lensless Single-exposure 3D Imaging
Title | DiffuserCam: Lensless Single-exposure 3D Imaging |
Authors | Nick Antipa, Grace Kuo, Reinhard Heckel, Ben Mildenhall, Emrah Bostan, Ren Ng, Laura Waller |
Abstract | We demonstrate a compact and easy-to-build computational camera for single-shot 3D imaging. Our lensless system consists solely of a diffuser placed in front of a standard image sensor. Every point within the volumetric field-of-view projects a unique pseudorandom pattern of caustics on the sensor. By using a physical approximation and simple calibration scheme, we solve the large-scale inverse problem in a computationally efficient way. The caustic patterns enable compressed sensing, which exploits sparsity in the sample to solve for more 3D voxels than pixels on the 2D sensor. Our 3D voxel grid is chosen to match the experimentally measured two-point optical resolution across the field-of-view, resulting in 100 million voxels being reconstructed from a single 1.3 megapixel image. However, the effective resolution varies significantly with scene content. Because this effect is common to a wide range of computational cameras, we provide new theory for analyzing resolution in such systems. |
Tasks | Calibration |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02134v1 |
http://arxiv.org/pdf/1710.02134v1.pdf | |
PWC | https://paperswithcode.com/paper/diffusercam-lensless-single-exposure-3d |
Repo | |
Framework | |
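
The "large-scale inverse problem" is a sparsity-regularized reconstruction; a minimal ISTA solver for the generic problem min_x 0.5||Ax - b||^2 + lam*||x||_1 is sketched below. The dense matrix `A` stands in for the diffuser's caustic forward model, which in practice is applied matrix-free and at far larger scale than a small matrix allows.

```python
import numpy as np

def ista(A, b, lam=0.1, step=None, iters=200):
    """Minimal ISTA for min_x 0.5*||A x - b||^2 + lam*||x||_1.

    A is a generic measurement matrix standing in for the caustic forward
    model; the real system solves for many more voxels than sensor pixels.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                      # gradient of the quadratic term
        x = x - step * grad
        x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)   # soft-thresholding
    return x
```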
Interpreting Outliers: Localized Logistic Regression for Density Ratio Estimation
Title | Interpreting Outliers: Localized Logistic Regression for Density Ratio Estimation |
Authors | Makoto Yamada, Song Liu, Samuel Kaski |
Abstract | We propose an inlier-based outlier detection method capable of both identifying the outliers and explaining why they are outliers, by identifying the outlier-specific features. Specifically, we employ an inlier-based outlier detection criterion, which uses the ratio of inlier and test probability densities as a measure of plausibility of being an outlier. For estimating the density ratio function, we propose a localized logistic regression algorithm. Thanks to the locality of the model, variable selection can be outlier-specific, and will help interpret why points are outliers in a high-dimensional space. Through synthetic experiments, we show that the proposed algorithm can successfully detect the important features for outliers. Moreover, we show that the proposed algorithm tends to outperform existing algorithms in benchmark datasets. |
Tasks | Outlier Detection |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06354v1 |
http://arxiv.org/pdf/1702.06354v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-outliers-localized-logistic |
Repo | |
Framework | |
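
A standard (global, non-localized) way to estimate the inlier/test density ratio is to train a probabilistic classifier to separate the two samples and convert its probabilities into a ratio, as sketched below. The paper's contribution is to make the logistic-regression weights local to each point so that the selected features can differ per outlier; that localization is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_scores(X_inlier, X_test):
    """Estimate r(x) = p_inlier(x) / p_test(x) at the test points via a
    global logistic-regression density-ratio estimator (illustrative sketch)."""
    X = np.vstack([X_inlier, X_test])
    y = np.r_[np.ones(len(X_inlier)), np.zeros(len(X_test))]   # 1 = inlier, 0 = test
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    p = clf.predict_proba(X_test)[:, 1]                        # P(inlier | x)
    prior = len(X_test) / len(X_inlier)                        # class-prior correction
    return prior * p / (1.0 - p)        # small r(x) -> x is plausibly an outlier
```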
Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval
Title | Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval |
Authors | Daiguo Deng, Ruomei Wang, Hefeng Wu, Huayong He, Qi Li, Xiaonan Luo |
Abstract | Fabric image retrieval is beneficial to many applications including clothing searching, online shopping and cloth modeling. Learning pairwise image similarity is of great importance to an image retrieval task. With the resurgence of Convolutional Neural Networks (CNNs), recent works have achieved significant progress via deep representation learning with metric embedding, which drives similar examples close to each other in a feature space, and dissimilar ones apart from each other. In this paper, we propose a novel embedding method termed focus ranking that can be easily unified into a CNN for jointly learning image representations and metrics in the context of fine-grained fabric image retrieval. Focus ranking aims to rank similar examples higher than all dissimilar ones by penalizing ranking disorders via the minimization of the overall cost attributed to similar samples being ranked below dissimilar ones. At the training stage, training samples are organized into focus ranking units for efficient optimization. We build a large-scale fabric image retrieval dataset (FIRD) with about 25,000 images of 4,300 fabrics, and test the proposed model on the FIRD dataset. Experimental results show the superiority of the proposed model over existing metric embedding models. |
Tasks | Image Retrieval, Representation Learning |
Published | 2017-12-29 |
URL | http://arxiv.org/abs/1712.10211v1 |
http://arxiv.org/pdf/1712.10211v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-similarity-models-with-focus |
Repo | |
Framework | |
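
A rough PyTorch sketch of the ranking objective described above: every similar example should score above every dissimilar one by a margin, and violations are penalized. The exact focus-ranking formulation and the construction of ranking units in the paper may differ from this illustrative loss.

```python
import torch

def focus_ranking_loss(anchor, positives, negatives, margin=0.2):
    """Ranking loss in the spirit of focus ranking (illustrative formulation):
    every similar (positive) sample should score higher than every dissimilar
    (negative) sample by at least `margin`.

    anchor: (d,) embedding; positives: (P, d); negatives: (N, d).
    """
    pos = positives @ anchor          # similarity to similar fabrics
    neg = negatives @ anchor          # similarity to dissimilar fabrics
    # pairwise ranking violations: a negative scoring within `margin` of a positive
    violations = neg.unsqueeze(0) - pos.unsqueeze(1) + margin   # shape (P, N)
    return torch.clamp(violations, min=0).mean()
```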
Burn-In Demonstrations for Multi-Modal Imitation Learning
Title | Burn-In Demonstrations for Multi-Modal Imitation Learning |
Authors | Alex Kuefler, Mykel J. Kochenderfer |
Abstract | Recent work on imitation learning has generated policies that reproduce expert behavior from multi-modal data. However, past approaches have focused only on recreating a small number of distinct, expert maneuvers, or have relied on supervised learning techniques that produce unstable policies. This work extends InfoGAIL, an algorithm for multi-modal imitation learning, to reproduce behavior over an extended period of time. Our approach involves reformulating the typical imitation learning setting to include “burn-in demonstrations” upon which policies are conditioned at test time. We demonstrate that our approach outperforms standard InfoGAIL in maximizing the mutual information between predicted and unseen style labels in road scene simulations, and we show that our method leads to policies that imitate expert autonomous driving systems over long time horizons. |
Tasks | Autonomous Driving, Imitation Learning |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05090v1 |
http://arxiv.org/pdf/1710.05090v1.pdf | |
PWC | https://paperswithcode.com/paper/burn-in-demonstrations-for-multi-modal |
Repo | |
Framework | |
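
A minimal sketch of conditioning a policy on a burn-in demonstration: the observed trajectory prefix is summarized by a recurrent encoder and concatenated with the current state. This is a generic stand-in that assumes a GRU summary of the burn-in, not the InfoGAIL-based architecture used in the paper.

```python
import torch
import torch.nn as nn

class BurnInConditionedPolicy(nn.Module):
    """Policy conditioned on a burn-in demonstration (generic sketch: the
    burn-in prefix is summarised by a GRU, standing in for a latent style code)."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(state_dim, hidden, batch_first=True)
        self.policy = nn.Sequential(
            nn.Linear(state_dim + hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, burn_in, state):
        # burn_in: (B, T, state_dim) observed prefix; state: (B, state_dim) current state
        _, h = self.encoder(burn_in)
        return self.policy(torch.cat([state, h[-1]], dim=1))
```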
AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training
Title | AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training |
Authors | Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan |
Abstract | Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms (offering 100s of TeraOps/s of computational capacity) is expected to be severely communication constrained. To overcome this limitation, new gradient compression techniques are needed that are computationally friendly, applicable to a wide variety of layers seen in Deep Neural Networks and adaptable to variations in network architectures as well as their hyper-parameters. In this paper we introduce a novel technique - the Adaptive Residual Gradient Compression (AdaComp) scheme. AdaComp is based on localized selection of gradient residues and automatically tunes the compression rate depending on local activity. We show excellent results on a wide spectrum of state-of-the-art Deep Learning models in multiple domains (vision, speech, language), datasets (MNIST, CIFAR10, ImageNet, BN50, Shakespeare), optimizers (SGD with momentum, Adam) and network parameters (number of learners, minibatch size, etc.). Exploiting both sparsity and quantization, we demonstrate end-to-end compression rates of ~200X for fully-connected and recurrent layers, and ~40X for convolutional layers, without any noticeable degradation in model accuracies. |
Tasks | Quantization |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02679v1 |
http://arxiv.org/pdf/1712.02679v1.pdf | |
PWC | https://paperswithcode.com/paper/adacomp-adaptive-residual-gradient |
Repo | |
Framework | |
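
A simplified sketch of the residual-accumulation and local-selection idea: gradients accumulate into a residual, and within each local bin only the entries close to that bin's current maximum are sent, while the rest stay behind for later rounds. The bin size and slack factor are illustrative assumptions; the actual AdaComp scheme also quantizes the sent values and differs in its exact selection rule.

```python
import numpy as np

def compress_gradient(grad, residual, bin_size=256, slack=2.0):
    """Simplified AdaComp-style selection (illustrative, not the exact scheme).

    Accumulate gradients into the residual, then within each local bin send
    only entries whose magnitude is within a factor `slack` of the bin's
    maximum; everything else remains in the residual for later rounds.
    """
    residual = residual + grad
    send = np.zeros_like(residual)
    for start in range(0, residual.size, bin_size):
        window = residual[start:start + bin_size]
        keep = np.abs(window) * slack >= np.abs(window).max()   # locally large entries
        send[start:start + bin_size][keep] = window[keep]
        window[keep] = 0.0                                      # remove what was sent
    return send, residual
```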
AlignGAN: Learning to Align Cross-Domain Images with Conditional Generative Adversarial Networks
Title | AlignGAN: Learning to Align Cross-Domain Images with Conditional Generative Adversarial Networks |
Authors | Xudong Mao, Qing Li, Haoran Xie |
Abstract | Recently, several methods based on generative adversarial networks (GANs) have been proposed for the task of aligning cross-domain images or learning a joint distribution of cross-domain images. One of these methods is to use a conditional GAN for alignment. However, previous attempts to adopt conditional GANs do not perform as well as other methods. In this work we present an approach for improving the capability of methods based on conditional GANs. We evaluate the proposed method on numerous tasks and the experimental results show that it is able to align the cross-domain images successfully in the absence of paired samples. Furthermore, we also propose another model which conditions on multiple kinds of information, such as domain information and label information. Conditioning on domain information and label information, we are able to conduct label propagation from the source domain to the target domain. A 2-step alternating training algorithm is proposed to learn this model. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01400v1 |
http://arxiv.org/pdf/1707.01400v1.pdf | |
PWC | https://paperswithcode.com/paper/aligngan-learning-to-align-cross-domain |
Repo | |
Framework | |
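
The conditioning idea can be sketched by appending a one-hot domain/label code to the inputs of both the generator and the discriminator, as in a standard conditional GAN (below, in PyTorch). This is a generic sketch, not AlignGAN's architecture or its 2-step alternating training.

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """Generator conditioned on a one-hot domain/label code (generic sketch)."""
    def __init__(self, z_dim, cond_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim), nn.Tanh(),
        )

    def forward(self, z, cond):
        return self.net(torch.cat([z, cond], dim=1))   # condition appended to the noise

class CondDiscriminator(nn.Module):
    """Discriminator that sees the same condition, so it judges (sample, condition) pairs."""
    def __init__(self, in_dim, cond_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + cond_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x, cond):
        return self.net(torch.cat([x, cond], dim=1))
```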
Identity-Aware Textual-Visual Matching with Latent Co-attention
Title | Identity-Aware Textual-Visual Matching with Latent Co-attention |
Authors | Shuang Li, Tong Xiao, Hongsheng Li, Wei Yang, Xiaogang Wang |
Abstract | Textual-visual matching aims at measuring similarities between sentence descriptions and images. Most existing methods tackle this problem without effectively utilizing identity-level annotations. In this paper, we propose an identity-aware two-stage framework for the textual-visual matching problem. Our stage-1 CNN-LSTM network learns to embed cross-modal features with a novel Cross-Modal Cross-Entropy (CMCE) loss. The stage-1 network efficiently screens out easy incorrect matchings and also provides an initial training point for the stage-2 training. The stage-2 CNN-LSTM network refines the matching results with a latent co-attention mechanism. The spatial attention relates each word with corresponding image regions, while the latent semantic attention aligns different sentence structures to make the matching results more robust to sentence structure variations. Extensive experiments on three datasets with identity-level annotations show that our framework outperforms state-of-the-art approaches by large margins. |
Tasks | |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.01988v1 |
http://arxiv.org/pdf/1708.01988v1.pdf | |
PWC | https://paperswithcode.com/paper/identity-aware-textual-visual-matching-with |
Repo | |
Framework | |
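
As a rough illustration of the stage-1 idea, image and sentence embeddings can both be trained to classify the correct identity against a table of per-identity features, as sketched below. This is a generic cross-modal identity loss; the actual CMCE formulation and feature-buffer mechanics in the paper may differ.

```python
import torch
import torch.nn.functional as F

def cross_modal_identity_loss(img_emb, txt_emb, identities, id_centres):
    """Identity-level matching loss in the spirit of the stage-1 objective
    (generic sketch, not the paper's exact CMCE formulation).

    img_emb, txt_emb: (B, d) embeddings of images and sentences.
    identities: (B,) identity labels. id_centres: (num_ids, d) lookup table
    playing the role of a per-identity feature buffer.
    """
    img_logits = img_emb @ id_centres.t()     # similarity of each image to every identity
    txt_logits = txt_emb @ id_centres.t()     # same for each sentence
    return F.cross_entropy(img_logits, identities) + F.cross_entropy(txt_logits, identities)
```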