January 29, 2020

Paper Group ANR 556

Self-regularizing restricted Boltzmann machines. Fast Universal Style Transfer for Artistic and Photorealistic Rendering. Reconfigurable Interaction for MAS Modelling. Learning Near-optimal Convex Combinations of Basis Models with Generalization Guarantees. Graph based adaptive evolutionary algorithm for continuous optimization. Argoverse: 3D Track …

Self-regularizing restricted Boltzmann machines


Title	Self-regularizing restricted Boltzmann machines
Authors	Orestis Loukas
Abstract	Focusing on the grand-canonical extension of the ordinary restricted Boltzmann machine, we suggest an energy-based model for feature extraction that uses a layer of hidden units with varying size. By an appropriate choice of the chemical potential and given a sufficiently large number of hidden resources, the generative model is able to efficiently deduce the optimal number of hidden units required to learn the target data with exceedingly small generalization error. The formal simplicity of the grand-canonical ensemble combined with a rapidly converging ansatz in mean-field theory enables us to recycle well-established numerical algorithms during training, like contrastive divergence, with only minor changes. As a proof of principle and to demonstrate the novel features of grand-canonical Boltzmann machines, we train our generative models on data from the Ising theory and MNIST.
Tasks
Published	2019-12-09
URL	https://arxiv.org/abs/1912.05634v1
PDF	https://arxiv.org/pdf/1912.05634v1.pdf
PWC	https://paperswithcode.com/paper/self-regularizing-restricted-boltzmann
Repo
Framework

Fast Universal Style Transfer for Artistic and Photorealistic Rendering


Title	Fast Universal Style Transfer for Artistic and Photorealistic Rendering
Authors	Jie An, Haoyi Xiong, Jiebo Luo, Jun Huan, Jinwen Ma
Abstract	Universal style transfer is an image editing task that renders an input content image using the visual style of arbitrary reference images, including both artistic and photorealistic stylization. Given a pair of images as the source of content and the reference of style, existing solutions usually first train an auto-encoder (AE) to reconstruct the image using deep features and then embed pre-defined style transfer modules into the AE reconstruction procedure to transfer the style of the reconstructed image by modifying the deep features. While existing methods typically need multiple rounds of time-consuming AE reconstruction for better stylization, our work designs novel neural network architectures on top of the AE for fast style transfer with fewer artifacts and distortions, all in one pass of end-to-end inference. To this end, we propose two network architectures named ArtNet and PhotoNet to improve artistic and photorealistic stylization, respectively. Extensive experiments demonstrate that ArtNet generates images with fewer artifacts and distortions than the state-of-the-art artistic transfer algorithms, while PhotoNet improves the photorealistic stylization results by creating sharp images that faithfully preserve rich details of the input content. Moreover, ArtNet and PhotoNet can achieve 3X to 100X speed-up over the state-of-the-art algorithms, which is a major advantage for large content images.
Tasks	Style Transfer
Published	2019-07-06
URL	https://arxiv.org/abs/1907.03118v1
PDF	https://arxiv.org/pdf/1907.03118v1.pdf
PWC	https://paperswithcode.com/paper/fast-universal-style-transfer-for-artistic
Repo
Framework

Reconfigurable Interaction for MAS Modelling


Title	Reconfigurable Interaction for MAS Modelling
Authors	Yehia Abd Alrahman, Giuseppe Perelli, Nir Piterman
Abstract	We propose a formalism to model and reason about multi-agent systems. We allow agents to interact and communicate in different modes so that they can pursue joint tasks; agents may dynamically synchronize, exchange data, adapt their behaviour, and reconfigure their communication interfaces. The formalism defines a local behaviour based on shared variables and a global one based on message passing. We extend LTL to be able to reason explicitly about the intentions of the different agents and their interaction protocols. We also study the complexity of satisfiability and model-checking of this extension.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.10793v2
PDF	https://arxiv.org/pdf/1906.10793v2.pdf
PWC	https://paperswithcode.com/paper/a-computational-framework-for-adaptive
Repo
Framework

Learning Near-optimal Convex Combinations of Basis Models with Generalization Guarantees


Title	Learning Near-optimal Convex Combinations of Basis Models with Generalization Guarantees
Authors	Tan Nguyen, Nan Ye, Peter L. Bartlett
Abstract	The problem of learning an optimal convex combination of basis models has been studied in a number of works, with a focus on the theoretical analysis, but little investigation on the empirical performance of the approach. In this paper, we present some new theoretical insights, and empirical results that demonstrate the effectiveness of the approach. Theoretically, we first consider whether we can replace convex combinations by linear combinations, and obtain convergence results similar to existing results for learning from a convex hull. We present a negative result showing that the linear hull of very simple basis functions can have unbounded capacity, and is thus prone to overfitting. On the other hand, convex hulls are still rich but have bounded capacities. In addition, we obtain a generalization bound for a general class of Lipschitz loss functions. Empirically, we first discuss how a convex combination can be greedily learned with early stopping, and how a convex combination can be non-greedily learned when the number of basis models is known a priori. Our experiments suggest that the greedy scheme is competitive with or better than several baselines, including boosting and random forests. The greedy algorithm requires little effort in hyper-parameter tuning, and also seems to adapt to the underlying complexity of the problem.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.03742v1
PDF	https://arxiv.org/pdf/1910.03742v1.pdf
PWC	https://paperswithcode.com/paper/learning-near-optimal-convex-combinations-of
Repo
Framework
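As a rough illustration of the greedy scheme mentioned in the abstract above, the sketch below builds a convex combination of basis models one step at a time. It is not the authors' algorithm: the squared loss, the step-size schedule `2 / (t + 1)`, and the function name `greedy_convex_combination` are all assumptions chosen for brevity, and early stopping is reduced to a fixed number of steps.

```python
from typing import List, Tuple


def mse(pred: List[float], y: List[float]) -> float:
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def greedy_convex_combination(
    basis_preds: List[List[float]], y: List[float], n_steps: int
) -> Tuple[List[float], List[float]]:
    """Greedily mix basis-model predictions; returns (weights, combined predictions).

    basis_preds[j][i] is the prediction of basis model j on example i.
    The step size 2 / (t + 1) is an assumed schedule, not taken from the paper.
    """
    f = [0.0] * len(y)
    weights = [0.0] * len(basis_preds)
    for t in range(1, n_steps + 1):
        alpha = 2.0 / (t + 1)  # alpha = 1 at t = 1, so the first step picks one model
        best_j, best_loss, best_f = 0, float("inf"), f
        for j, g in enumerate(basis_preds):
            cand = [(1 - alpha) * fi + alpha * gi for fi, gi in zip(f, g)]
            loss = mse(cand, y)
            if loss < best_loss:
                best_j, best_loss, best_f = j, loss, cand
        f = best_f
        # Shrink the old weights and give the chosen model the new mass alpha,
        # which keeps the weights non-negative and summing to one.
        weights = [(1 - alpha) * w for w in weights]
        weights[best_j] += alpha
    return weights, f


# Hypothetical toy data: two constant basis models and a constant target.
weights, combined = greedy_convex_combination([[0.0, 0.0], [2.0, 2.0]], [1.0, 1.0], 3)
print(weights, combined)
```

Because every update is itself a convex combination of the current mixture and one basis model, the result stays inside the convex hull regardless of how many steps are taken, which is the property the abstract contrasts with unconstrained linear combinations.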

Graph based adaptive evolutionary algorithm for continuous optimization


Title	Graph based adaptive evolutionary algorithm for continuous optimization
Authors	Asmaa Ghoumari, Amir Nakib
Abstract	The greatest weakness of evolutionary algorithms, widely used today, is premature convergence due to the loss of population diversity over generations. To overcome this problem, several algorithms have been proposed, such as the Graph-based Evolutionary Algorithm (GEA) \cite{1}, which uses graphs to model the structure of the population; memetic or differential evolution algorithms \cite{2,3} and diversity-based ones \cite{4,5} have also been designed. These algorithms are based on multi-populations, or often focus on self-tuning parameters; however, they become complex to tune because of their high number of parameters. In this paper, our approach consists of an evolutionary algorithm that allows a dynamic adaptation of the search operators based on a graph in order to limit the loss of diversity and reduce the design complexity.
Tasks
Published	2019-08-05
URL	https://arxiv.org/abs/1908.08014v1
PDF	https://arxiv.org/pdf/1908.08014v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-adaptive-evolutionary-algorithm
Repo
Framework

Argoverse: 3D Tracking and Forecasting with Rich Maps


Title	Argoverse: 3D Tracking and Forecasting with Rich Maps
Authors	Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, James Hays
Abstract	We present Argoverse – two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Argoverse was collected by a fleet of autonomous vehicles in Pittsburgh and Miami. The Argoverse 3D Tracking dataset includes 360 degree images from 7 cameras with overlapping fields of view, 3D point clouds from long range LiDAR, 6-DOF pose, and 3D track annotations. Notably, it is the only modern AV dataset that provides forward-facing stereo imagery. The Argoverse Motion Forecasting dataset includes more than 300,000 5-second tracked scenarios with a particular vehicle identified for trajectory forecasting. Argoverse is the first autonomous vehicle dataset to include “HD maps” with 290 km of mapped lanes with geometric and semantic metadata. All data is released under a Creative Commons license at www.argoverse.org. In our baseline experiments, we illustrate how detailed map information such as lane direction, driveable area, and ground height improves the accuracy of 3D object tracking and motion forecasting. Our tracking and forecasting experiments represent only an initial exploration of the use of rich maps in robotic perception. We hope that Argoverse will enable the research community to explore these problems in greater depth.
Tasks	Autonomous Vehicles, Motion Forecasting, Object Tracking
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02620v1
PDF	https://arxiv.org/pdf/1911.02620v1.pdf
PWC	https://paperswithcode.com/paper/argoverse-3d-tracking-and-forecasting-with-1
Repo
Framework

Fast Task Inference with Variational Intrinsic Successor Features


Title	Fast Task Inference with Variational Intrinsic Successor Features
Authors	Steven Hansen, Will Dabney, Andre Barreto, Tom Van de Wiele, David Warde-Farley, Volodymyr Mnih
Abstract	It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies \citep{gregor2016variational, eysenbach2018diversity, warde2018unsupervised}. However, one limitation of this formulation is generalizing behaviors beyond the finite set being explicitly learned, as is needed for use on subsequent tasks. Successor features \citep{dayan93improving, barreto2017successor} provide an appealing solution to this generalization problem, but require defining the reward function as linear in some grounded feature space. In this paper, we show that these two techniques can be combined, and that each method solves the other’s primary limitation. To do so we introduce Variational Intrinsic Successor FeatuRes (VISR), a novel algorithm which learns controllable features that can be leveraged to provide enhanced generalization and fast task inference through the successor feature framework. We empirically validate VISR on the full Atari suite, in a novel setup wherein the rewards are only exposed briefly after a long unsupervised phase. Achieving human-level performance on 14 games and beating all baselines, we believe VISR represents a step towards agents that rapidly learn from limited feedback.
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05030v2
PDF	https://arxiv.org/pdf/1906.05030v2.pdf
PWC	https://paperswithcode.com/paper/fast-task-inference-with-variational
Repo
Framework

Building High-Quality Auction Fraud Dataset


Title	Building High-Quality Auction Fraud Dataset
Authors	Sulaf Elshaar, Samira Sadaoui
Abstract	Given the magnitude of online auction transactions, it is difficult to safeguard consumers from dishonest sellers, such as shill bidders. To date, the application of Machine Learning Techniques (MLTs) to auction fraud has been limited, unlike their applications for combatting other types of fraud. Shill Bidding (SB) is a severe auction fraud, which is driven by modern-day technologies and clever scammers. The difficulty of identifying the behavior of sophisticated fraudsters and the unavailability of training datasets hinder the research on SB detection. In this study, we developed a high-quality SB dataset. To do so, first, we crawled a large number of commercial auctions together with the bidders’ history. We thoroughly preprocessed both datasets to make them usable for the computation of the SB metrics. Nevertheless, this operation requires a deep understanding of the behavior of auctions and bidders. Second, we introduced two new SB patterns and implemented other existing SB patterns. Finally, we removed outliers to improve the quality of training SB data.
Tasks	Fraud Detection
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04272v3
PDF	https://arxiv.org/pdf/1906.04272v3.pdf
PWC	https://paperswithcode.com/paper/building-high-quality-auction-fraud-dataset
Repo
Framework

Decision Making with Machine Learning and ROC Curves


Title	Decision Making with Machine Learning and ROC Curves
Authors	Kai Feng, Han Hong, Ke Tang, Jingyuan Wang
Abstract	The Receiver Operating Characteristic (ROC) curve is a representation of the statistical information discovered in binary classification problems and is a key concept in machine learning and data science. This paper studies the statistical properties of ROC curves and their implications for model selection. We analyze the implications of different models of incentive heterogeneity and information asymmetry on the relation between human decisions and the ROC curves. Our theoretical discussion is illustrated in the context of a large data set of pregnancy outcomes and doctor diagnoses from the Pre-Pregnancy Checkups of reproductive age couples in Henan Province provided by the Chinese Ministry of Health.
Tasks	Decision Making, Model Selection
Published	2019-05-05
URL	https://arxiv.org/abs/1905.02810v1
PDF	https://arxiv.org/pdf/1905.02810v1.pdf
PWC	https://paperswithcode.com/paper/decision-making-with-machine-learning-and-roc
Repo
Framework

Relative Afferent Pupillary Defect Screening through Transfer Learning


Title	Relative Afferent Pupillary Defect Screening through Transfer Learning
Authors	Dogancan Temel, Melvin J. Mathew, Ghassan AlRegib, Yousuf M. Khalifa
Abstract	Abnormalities in pupillary light reflex can indicate optic nerve disorders that may lead to permanent visual loss if not diagnosed at an early stage. In this study, we focus on relative afferent pupillary defect (RAPD), which is based on the difference between the reactions of the eyes when they are exposed to light stimuli. Incumbent RAPD assessment methods are based on subjective practices that can lead to unreliable measurements. To eliminate subjectivity and obtain reliable measurements, we introduced an automated framework to detect RAPD. For validation, we conducted a clinical study with lab-on-a-headset, which can perform an automated light reflex test. In addition to benchmarking handcrafted algorithms, we proposed a transfer learning-based approach that transformed a deep learning-based generic object recognition algorithm into a pupil detector. Based on the conducted experiments, the proposed algorithm, RAPDNet, can achieve a sensitivity and a specificity of 90.6% over 64 test cases in a balanced set, which corresponds to an AUC of 0.929 in ROC analysis. According to our benchmark with three handcrafted algorithms and nine performance metrics, RAPDNet outperforms all other algorithms in every performance category.
Tasks	Object Recognition, Transfer Learning
Published	2019-08-06
URL	https://arxiv.org/abs/1908.02300v1
PDF	https://arxiv.org/pdf/1908.02300v1.pdf
PWC	https://paperswithcode.com/paper/relative-afferent-pupillary-defect-screening
Repo
Framework
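The sensitivity and specificity figures quoted in the abstract above can be reproduced with simple arithmetic. The sketch below is only an illustration: the abstract gives 64 test cases in a balanced set and 90.6% for both metrics, and the confusion counts used here (32 positive and 32 negative cases, 29 correct in each group) are an assumption that happens to be consistent with those numbers, since 29/32 ≈ 0.906.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: fraction of positive cases that are detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: fraction of negative cases that are correctly cleared."""
    return tn / (tn + fp)


# Assumed confusion counts (not reported in the abstract): a balanced set of
# 64 cases split 32/32, with 29 correct decisions in each group.
tp, fn = 29, 3
tn, fp = 29, 3

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```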

Robustness Verification of Support Vector Machines


Title	Robustness Verification of Support Vector Machines
Authors	Francesco Ranzato, Marco Zanella
Abstract	We study the problem of formally verifying the robustness to adversarial examples of support vector machines (SVMs), a major machine learning model for classification and regression tasks. Following a recent stream of works on formal robustness verification of (deep) neural networks, our approach relies on a sound abstract version of a given SVM classifier to be used for checking its robustness. This methodology is parametric on a given numerical abstraction of real values and, analogously to the case of neural networks, needs neither abstract least upper bounds nor widening operators on this abstraction. The standard interval domain provides a simple instantiation of our abstraction technique, which is enhanced with the domain of reduced affine forms, which is an efficient abstraction of the zonotope abstract domain. This robustness verification technique has been fully implemented and experimentally evaluated on SVMs based on linear and nonlinear (polynomial and radial basis function) kernels, which have been trained on the popular MNIST dataset of images and on the recent and more challenging Fashion-MNIST dataset. The experimental results of our prototype SVM robustness verifier appear to be encouraging: this automated verification is fast, scalable and shows significantly high percentages of provable robustness on the test set of MNIST, in particular compared to the analogous provable robustness of neural networks.
Tasks
Published	2019-04-26
URL	http://arxiv.org/abs/1904.11803v1
PDF	http://arxiv.org/pdf/1904.11803v1.pdf
PWC	https://paperswithcode.com/paper/robustness-verification-of-support-vector
Repo
Framework
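To give a feel for the interval abstraction mentioned in the abstract above, the sketch below bounds the decision value of a linear SVM over an L∞ ball around an input. It is a simplified illustration, not the authors' verifier: it covers only the linear-kernel case, the function names are hypothetical, and the polynomial and RBF kernels discussed in the abstract need the richer abstractions it describes.

```python
from typing import Sequence, Tuple


def linear_score_bounds(
    w: Sequence[float], b: float, x: Sequence[float], eps: float
) -> Tuple[float, float]:
    """Bounds of w.x' + b over all x' with max_i |x'_i - x_i| <= eps.

    Each coordinate lies in [x_i - eps, x_i + eps], so the score can move
    away from its central value by at most eps * sum_i |w_i|.
    """
    center = sum(wi * xi for wi, xi in zip(w, x)) + b
    radius = eps * sum(abs(wi) for wi in w)
    return center - radius, center + radius


def is_provably_robust(
    w: Sequence[float], b: float, x: Sequence[float], eps: float
) -> bool:
    """True if the sign of the decision value cannot change inside the ball."""
    lo, hi = linear_score_bounds(w, b, x, eps)
    return lo > 0 or hi < 0


# Hypothetical two-feature classifier and input.
w, b, x = [1.0, -2.0], 0.5, [3.0, 1.0]
print(is_provably_robust(w, b, x, eps=0.1))
print(is_provably_robust(w, b, x, eps=1.0))
```

For a linear decision function these bounds are attained by some point in the ball, so the check is exact; with nonlinear kernels an interval analysis generally over-approximates, which is why a sound but incomplete verifier can fail to prove robustness of an input that is in fact robust.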

End-to-end training of time domain audio separation and recognition


Title	End-to-end training of time domain audio separation and recognition
Authors	Thilo von Neumann, Keisuke Kinoshita, Lukas Drude, Christoph Boeddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach
Abstract	The rising interest in single-channel multi-speaker speech separation sparked development of End-to-End (E2E) approaches to multi-speaker speech recognition. However, up until now, state-of-the-art neural network-based time domain source separation has not yet been combined with E2E speech recognition. We here demonstrate how to combine a separation module based on a Convolutional Time domain Audio Separation Network (Conv-TasNet) with an E2E speech recognizer and how to train such a model jointly by distributing it over multiple GPUs or by approximating truncated back-propagation for the convolutional front-end. To put this work into perspective and illustrate the complexity of the design space, we provide a compact overview of single-channel multi-speaker recognition systems. Our experiments show a word error rate of 11.0% on WSJ0-2mix and indicate that our joint time domain model can yield substantial improvements over cascade DNN-HMM and monolithic E2E frequency domain systems proposed so far.
Tasks	Speaker Recognition, Speech Recognition, Speech Separation
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08462v2
PDF	https://arxiv.org/pdf/1912.08462v2.pdf
PWC	https://paperswithcode.com/paper/ene-to-end-training-of-time-domain-audio
Repo
Framework

FH-GAN: Face Hallucination and Recognition using Generative Adversarial Network


Title	FH-GAN: Face Hallucination and Recognition using Generative Adversarial Network
Authors	Bayram Bayramli, Usman Ali, Te Qi, Hongtao Lu
Abstract	There are many factors affecting visual face recognition, such as low resolution images, aging, illumination and pose variance. One of the most important problems is low resolution face images, which can result in bad performance on face recognition. Most general face recognition algorithms assume a sufficient resolution for the face images. However, in practice many applications do not have sufficient image resolutions. Modern face hallucination models demonstrate reasonable performance in reconstructing high-resolution images from their corresponding low resolution images. However, they do not consider identity level information during hallucination, which directly affects the results of the recognition of low resolution faces. To address this issue, we propose a Face Hallucination Generative Adversarial Network (FH-GAN) which improves the quality of low resolution face images and accurately recognizes those low quality images. Concretely, we make the following contributions: 1) We propose the FH-GAN network, an end-to-end system that improves both face hallucination and face recognition simultaneously. The novelty of this proposed network lies in incorporating identity information in a GAN-based face hallucination algorithm by combining it with a face recognition network for identity preservation. 2) We also propose a new face hallucination network, namely Dense Sparse Network (DSNet), which improves upon the state-of-the-art in face hallucination. 3) We demonstrate the benefits of training the face recognition network and the GAN-based DSNet jointly by reporting good results on face hallucination and recognition.
Tasks	Face Hallucination, Face Recognition
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06537v1
PDF	https://arxiv.org/pdf/1905.06537v1.pdf
PWC	https://paperswithcode.com/paper/fh-gan-face-hallucination-and-recognition
Repo
Framework

Preference Neural Network


Title	Preference Neural Network
Authors	Ayman Elgharabawy
Abstract	This paper proposes a preference neural network (PNN) to address the problem of indifference preference orders with a new activation function. PNN also solves the multi-label ranking problem, where labels may have indifference preference orders or subgroups are equally ranked. PNN follows a multi-layer feedforward architecture with fully connected neurons. Each neuron contains a novel smooth stairstep activation function based on the number of preference orders. PNN inputs represent data features and output neurons represent label indexes. The proposed PNN is evaluated using a new preference mining dataset that contains repeated label values, which has not been experimented with before. PNN outperforms five previously proposed methods for strict label ranking in terms of accurate results with high computational efficiency.
Tasks
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02345v1
PDF	http://arxiv.org/pdf/1904.02345v1.pdf
PWC	https://paperswithcode.com/paper/preference-neural-network
Repo
Framework

Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning


Title	Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning
Authors	Yukai Shi, Guanbin Li, Qingxing Cao, Keze Wang, Liang Lin
Abstract	Face hallucination is a domain-specific super-resolution problem that aims to generate a high-resolution (HR) face image from a low-resolution (LR) input. In contrast to the existing patch-wise super-resolution models that divide a face image into regular patches and independently apply LR to HR mapping to each patch, we implement deep reinforcement learning and develop a novel attention-aware face hallucination (Attention-FH) framework, which recurrently learns to attend a sequence of patches and performs facial part enhancement by fully exploiting the global interdependency of the image. Specifically, our proposed framework incorporates two components: a recurrent policy network for dynamically specifying a new attended region at each time step based on the status of the super-resolved image and the past attended region sequence, and a local enhancement network for selected patch hallucination and global state updating. The Attention-FH model jointly learns the recurrent policy network and local enhancement network through maximizing a long-term reward that reflects the hallucination result with respect to the whole HR image. Extensive experiments demonstrate that our Attention-FH significantly outperforms the state-of-the-art methods on in-the-wild face images with large pose and illumination variations.
Tasks	Face Hallucination, Super-Resolution
Published	2019-05-04
URL	https://arxiv.org/abs/1905.01509v1
PDF	https://arxiv.org/pdf/1905.01509v1.pdf
PWC	https://paperswithcode.com/paper/face-hallucination-by-attentive-sequence
Repo
Framework