October 17, 2019

Paper Group ANR 905

Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments. Fast Face Image Synthesis with Minimal Training. Evaluation of Feature Detector-Descriptor for Real Object Matching under Various Conditions of Ilumination and Affine Transformation. A Theory of Statistical Inference for Ensuring the Robustness of Scienti …

Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments


Title	Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments
Authors	Yan Zheng, Jianye Hao, Zongzhang Zhang
Abstract	Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention. Existing multiagent DRL algorithms are inefficient when facing the non-stationarity caused by agents updating their policies simultaneously in stochastic cooperative environments. This paper extends the recently proposed weighted double estimator to the multiagent domain and proposes a multiagent DRL framework, named weighted double deep Q-network (WDDQN). By utilizing the weighted double estimator and the deep neural network, WDDQN can not only reduce the bias effectively but also be extended to scenarios with raw visual inputs. To achieve efficient cooperation in the multiagent domain, we introduce the lenient reward network and the scheduled replay strategy. Experiments show that WDDQN outperforms the existing DRL and multiagent DRL algorithms, i.e., double DQN and lenient Q-learning, in terms of the average reward and the convergence rate in stochastic cooperative environments.
Tasks	Q-Learning
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08534v2
PDF	http://arxiv.org/pdf/1802.08534v2.pdf
PWC	https://paperswithcode.com/paper/weighted-double-deep-multiagent-reinforcement
Repo
Framework
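
To make the idea of a weighted double estimator concrete, the following is a minimal sketch, not the authors' WDDQN implementation. It blends the single estimator (the maximum of one value table, which tends to overestimate) with the double estimator (a second table evaluated at the first table's greedy action, which tends to underestimate). The function name, the two value lists and the fixed weight `beta` are assumptions made for illustration; the abstract does not say how the weight is chosen.

```python
def weighted_double_target(q_u, q_v, reward, gamma=0.99, beta=0.5):
    """Blend single and double estimates of the next-state value.

    q_u, q_v: next-state action values from two independent estimators
              (hypothetical inputs for this sketch).
    beta:     assumed fixed weight in [0, 1]; beta=1 gives the single
              estimator, beta=0 gives the double estimator.
    """
    # Greedy action according to the first estimator.
    a_star = max(range(len(q_u)), key=lambda a: q_u[a])
    single_estimate = q_u[a_star]   # max over q_u
    double_estimate = q_v[a_star]   # q_v evaluated at q_u's greedy action
    blended = beta * single_estimate + (1.0 - beta) * double_estimate
    return reward + gamma * blended
```

Setting `beta` between 0 and 1 trades the positive bias of the single estimator against the negative bias of the double estimator.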

Fast Face Image Synthesis with Minimal Training


Title	Fast Face Image Synthesis with Minimal Training
Authors	Sandipan Banerjee, Walter J. Scheirer, Kevin W. Bowyer, Patrick J. Flynn
Abstract	We propose an algorithm to generate realistic face images of both real and synthetic identities (people who do not exist) with different facial yaw, shape and resolution. The synthesized images can be used to augment datasets to train CNNs or as massive distractor sets for biometric verification experiments without any privacy concerns. Additionally, law enforcement can make use of this technique to train forensic experts to recognize faces. Our method samples face components from a pool of multiple face images of real identities to generate the synthetic texture. Then, a real 3D head model compatible with the generated texture is used to render it under different facial yaw transformations. We perform multiple quantitative experiments to assess the effectiveness of our synthesis procedure in CNN training and its potential use to generate distractor face images. Additionally, we compare our method with popular GAN models in terms of visual quality and execution time.
Tasks	Image Generation
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01474v3
PDF	http://arxiv.org/pdf/1811.01474v3.pdf
PWC	https://paperswithcode.com/paper/fast-face-image-synthesis-with-minimal
Repo
Framework

Evaluation of Feature Detector-Descriptor for Real Object Matching under Various Conditions of Ilumination and Affine Transformation


Title	Evaluation of Feature Detector-Descriptor for Real Object Matching under Various Conditions of Ilumination and Affine Transformation
Authors	Novanto Yudistira, Achmad Ridok, Ali Fauzi
Abstract	This study attempts to provide explanations, descriptions and evaluations of some of the most popular current combinations of detector and descriptor frameworks, namely SIFT, SURF, MSER, and BRISK for keypoint extractors and SIFT, SURF, BRISK, and FREAK for descriptors. Evaluations are based on the number of keypoint matches and on repeatability under various image variations. These are used as the main parameters to assess how well combinations of algorithms perform in matching objects under different variations. Many papers compare detection and description features for detecting objects in images under various conditions, but the combinations of algorithms have not been much discussed. The problem domain is limited to different illumination levels and affine transformations from different perspectives. To evaluate the robustness of all combinations of algorithms, we use a stereo image matching case.
Tasks
Published	2018-04-28
URL	http://arxiv.org/abs/1804.10855v2
PDF	http://arxiv.org/pdf/1804.10855v2.pdf
PWC	https://paperswithcode.com/paper/evaluation-of-feature-detector-descriptor-for
Repo
Framework

A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results


Title	A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results
Authors	Beau Coker, Cynthia Rudin, Gary King
Abstract	Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong, but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one’s own hypotheses. Since the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. We derive “hacking intervals,” which are the range of a summary statistic one may obtain given a class of possible endogenous manipulations of the data. Hacking intervals require no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval, and is often easier to interpret than a classical confidence interval. Some versions of hacking intervals turn out to be equivalent to classical confidence intervals, which means they may also provide a more intuitive and potentially more useful interpretation of classical confidence intervals.
Tasks
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08646v1
PDF	http://arxiv.org/pdf/1804.08646v1.pdf
PWC	https://paperswithcode.com/paper/a-theory-of-statistical-inference-for
Repo
Framework
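
The definition of a hacking interval in the abstract (the range of a summary statistic over a class of manipulations) can be illustrated with a small brute-force sketch. The class of manipulations used here, dropping at most `max_removed` observations, and the choice of the mean as the summary statistic are assumptions made for illustration only; they are not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def hacking_interval_mean(data, max_removed=1):
    """Range of the sample mean when up to max_removed points may be dropped.

    The 'drop a few observations' manipulation class is a hypothetical
    example of a researcher degree of freedom.
    """
    values = []
    for k in range(max_removed + 1):
        # Enumerate every way of dropping exactly k observations.
        for dropped in combinations(range(len(data)), k):
            kept = [x for i, x in enumerate(data) if i not in dropped]
            if kept:  # skip the degenerate case of dropping everything
                values.append(mean(kept))
    return min(values), max(values)
```

For example, `hacking_interval_mean([1.0, 2.0, 3.0, 10.0], max_removed=1)` spans the means obtained from the full sample and from each leave-one-out sample; a narrow interval would indicate a result that this particular manipulation cannot move much.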

Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy


Title	Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy
Authors	En Li, Zhi Zhou, Xu Chen
Abstract	As the backbone technology of machine learning, deep neural networks (DNNs) have quickly ascended to the spotlight. Running DNNs on resource-constrained mobile devices is, however, by no means trivial, since it incurs high performance and energy overhead. Offloading DNNs to the cloud for execution, meanwhile, suffers from unpredictable performance due to the uncontrolled, long wide-area network latency. To address these challenges, in this paper, we propose Edgent, a collaborative and on-demand DNN co-inference framework with device-edge synergy. Edgent pursues two design knobs: (1) DNN partitioning that adaptively partitions DNN computation between device and edge, in order to leverage hybrid computation resources in proximity for real-time DNN inference. (2) DNN right-sizing that accelerates DNN inference through early-exit at a proper intermediate DNN layer to further reduce the computation latency. The prototype implementation and extensive evaluations based on Raspberry Pi demonstrate Edgent’s effectiveness in enabling on-demand low-latency edge intelligence.
Tasks
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07840v4
PDF	http://arxiv.org/pdf/1806.07840v4.pdf
PWC	https://paperswithcode.com/paper/edge-intelligence-on-demand-deep-learning
Repo
Framework
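
The partitioning knob described in the abstract can be pictured as a search over cut points in a layer sequence. The sketch below is a simplified illustration under stated assumptions, not Edgent's algorithm: it assumes a purely sequential network, known per-layer latencies on each side, a known transfer cost at each cut, and that one side (called "device" here) runs the first segment. All names and numbers are hypothetical.

```python
def best_partition(device_ms, edge_ms, transfer_ms):
    """Pick the cut point that minimizes end-to-end latency.

    device_ms[i]:   assumed latency of layer i on the first side.
    edge_ms[i]:     assumed latency of layer i on the second side.
    transfer_ms[p]: assumed cost of shipping the tensor at cut p, where
                    p = 0 ships the raw input and p = n ships nothing.
    Returns (cut_index, total_latency_ms).
    """
    n = len(device_ms)
    assert len(edge_ms) == n and len(transfer_ms) == n + 1
    best_cut, best_total = None, float("inf")
    for p in range(n + 1):
        # Layers [0, p) run on the first side, layers [p, n) on the second.
        total = sum(device_ms[:p]) + transfer_ms[p] + sum(edge_ms[p:])
        if total < best_total:
            best_cut, best_total = p, total
    return best_cut, best_total
```

With slow local layers and a large raw input, the cheapest cut often falls after an early layer whose output is small, which is the trade-off the partitioning knob is meant to exploit.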

Investigating the Effect of Music and Lyrics on Spoken-Word Recognition


Title	Investigating the Effect of Music and Lyrics on Spoken-Word Recognition
Authors	Odette Scharenborg, Martha Larson
Abstract	Background music in social interaction settings can hinder conversation. Yet, little is known of how specific properties of music impact speech processing. This paper addresses this knowledge gap by investigating 1) whether the masking effect of background music with lyrics is larger than that of music without lyrics, and 2) whether the masking effect is larger for more complex music. To answer these questions, a word identification experiment was run in which Dutch participants listened to Dutch CVC words embedded in stretches of background music in two conditions, with and without lyrics, and at three SNRs. Three songs were used of different genres and complexities. Music stretches with and without lyrics were sampled from the same song in order to control for factors beyond the presence of lyrics. The results showed a clear negative impact of the presence of lyrics in background music on spoken-word recognition. This impact is independent of complexity. The results suggest that social spaces (e.g., restaurants, cafés and bars) should make careful choices of music to promote conversation, and open a path for future work.
Tasks
Published	2018-03-13
URL	http://arxiv.org/abs/1803.05058v1
PDF	http://arxiv.org/pdf/1803.05058v1.pdf
PWC	https://paperswithcode.com/paper/investigating-the-effect-of-music-and-lyrics
Repo
Framework

A Bayesian model of acquisition and clearance of bacterial colonization


Title	A Bayesian model of acquisition and clearance of bacterial colonization
Authors	Marko Järvenpää, Mohamad R. Abdul Sater, Georgia K. Lagoudas, Paul C. Blainey, Loren G. Miller, James A. McKinnell, Susan S. Huang, Yonatan H. Grad, Pekka Marttinen
Abstract	Bacterial populations that colonize a host play important roles in host health, including serving as a reservoir that transmits to other hosts and from which invasive strains emerge, thus emphasizing the importance of understanding rates of acquisition and clearance of colonizing populations. Studies of colonization dynamics have been based on assessment of whether serial samples represent a single population or distinct colonization events. A common solution to estimate acquisition and clearance rates is to use a fixed genetic distance threshold. However, this approach is often inadequate to account for the diversity of the underlying within-host evolving population, the time intervals between consecutive measurements, and the uncertainty in the estimated acquisition and clearance rates. Here, we summarize recently submitted work \cite{jarvenpaa2018named} and present a Bayesian model that provides probabilities of whether two strains should be considered the same, making it possible to determine bacterial clearance and acquisition from genomes sampled over time. We explicitly model the within-host variation using population genetic simulation, and the inference is done by combining information from multiple data sources by using a combination of Approximate Bayesian Computation (ABC) and Markov Chain Monte Carlo (MCMC). We use the method to analyse a collection of methicillin-resistant Staphylococcus aureus (MRSA) isolates.
Tasks
Published	2018-11-27
URL	http://arxiv.org/abs/1811.10958v1
PDF	http://arxiv.org/pdf/1811.10958v1.pdf
PWC	https://paperswithcode.com/paper/a-bayesian-model-of-acquisition-and-clearance
Repo
Framework

Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications


Title	Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications
Authors	Jongsoo Park, Maxim Naumov, Protonu Basu, Summer Deng, Aravind Kalaiah, Daya Khudia, James Law, Parth Malani, Andrey Malevich, Satish Nadathur, Juan Pino, Martin Schatz, Alexander Sidorov, Viswanath Sivakumar, Andrew Tulloch, Xiaodong Wang, Yiming Wu, Hector Yuen, Utku Diril, Dmytro Dzhulgakov, Kim Hazelwood, Bill Jia, Yangqing Jia, Lin Qiao, Vijay Rao, Nadav Rotem, Sungjoo Yoo, Mikhail Smelyanskiy
Abstract	The application of deep learning techniques has resulted in remarkable improvement of machine learning models. This paper provides detailed characterizations of deep learning models used in many Facebook social network services. We present computational characteristics of our models, describe high performance optimizations targeting existing systems, point out their limitations and make suggestions for the future general-purpose/accelerated inference hardware. Also, we highlight the need for better co-design of algorithms, numerics and computing platforms to address the challenges of workloads often run in data centers.
Tasks
Published	2018-11-24
URL	http://arxiv.org/abs/1811.09886v2
PDF	http://arxiv.org/pdf/1811.09886v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-inference-in-facebook-data
Repo
Framework

Q-CP: Learning Action Values for Cooperative Planning


Title	Q-CP: Learning Action Values for Cooperative Planning
Authors	Francesco Riccio, Roberto Capobianco, Daniele Nardi
Abstract	Research on multi-robot systems has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and large state dimensionality (e.g., hyper-redundant robots and groups of robots). To alleviate this problem, we present Q-CP, a cooperative model-based reinforcement learning algorithm, which exploits action values to both (1) guide the exploration of the state space and (2) generate effective policies. Specifically, we exploit Q-learning to attack the curse-of-dimensionality in the iterations of a Monte-Carlo Tree Search. We implement and evaluate Q-CP on different stochastic cooperative (general-sum) games: (1) a simple cooperative navigation problem among 3 robots, (2) a cooperation scenario between a pair of KUKA YouBots performing hand-overs, and (3) a coordination task between two mobile robots entering a door. The obtained results show the effectiveness of Q-CP in the chosen applications, where action values drive the exploration and reduce the computational demand of the planning process while achieving good performance.
Tasks	Q-Learning
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00297v1
PDF	http://arxiv.org/pdf/1803.00297v1.pdf
PWC	https://paperswithcode.com/paper/q-cp-learning-action-values-for-cooperative
Repo
Framework

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation


Title	DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation
Authors	Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain
Abstract	The human auditory cortex excels at selectively suppressing background noise to focus on a target speaker. The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises. In this study, we propose a novel deep neural network (DNN) based audiovisual (AV) mask estimation model. The proposed AV mask estimation model contextually integrates the temporal dynamics of both audio and noise-immune visual features for improved mask estimation and speech separation. For optimal AV feature extraction and ideal binary mask (IBM) estimation, a hybrid DNN architecture is exploited to leverage the complementary strengths of a stacked long short-term memory (LSTM) and convolutional LSTM network. The comparative simulation results in terms of speech quality and intelligibility demonstrate significant performance improvement of our proposed AV mask estimation model as compared to audio-only and visual-only mask estimation approaches for both speaker dependent and independent scenarios.
Tasks	Speech Separation
Published	2018-07-31
URL	http://arxiv.org/abs/1808.00060v1
PDF	http://arxiv.org/pdf/1808.00060v1.pdf
PWC	https://paperswithcode.com/paper/dnn-driven-speaker-independent-audio-visual
Repo
Framework
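
The ideal binary mask (IBM) that the model above is trained to estimate has a standard definition: a time-frequency cell is kept when its local signal-to-noise ratio exceeds a threshold. The sketch below computes that oracle mask from separately known speech and noise energies. The function name, the 0 dB default threshold and the small `eps` guard are assumptions for illustration; the paper's exact settings are not given in the abstract.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, threshold_db=0.0, eps=1e-12):
    """Oracle mask: 1 where the local SNR exceeds threshold_db, else 0.

    speech_energy, noise_energy: 2D lists (time x frequency) of
    non-negative energies; both are hypothetical inputs for this sketch.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            # eps avoids division by zero and log of zero in silent cells.
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask
```

Multiplying a noisy spectrogram by this mask keeps speech-dominated cells and zeroes the rest; a learned model has to predict the mask without access to the clean speech.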

Semantic Analysis of (Reflectional) Visual Symmetry: A Human-Centred Computational Model for Declarative Explainability


Title	Semantic Analysis of (Reflectional) Visual Symmetry: A Human-Centred Computational Model for Declarative Explainability
Authors	Jakob Suchan, Mehul Bhatt, Srikrishna Vardarajan, Seyed Ali Amirshahi, Stella Yu
Abstract	We present a computational model for the semantic interpretation of symmetry in naturalistic scenes. Key features include a human-centred representation, and a declarative, explainable interpretation model supporting deep semantic question-answering founded on an integration of methods in knowledge representation and deep learning based computer vision. In the backdrop of the visual arts, we showcase the framework’s capability to generate human-centred, queryable, relational structures, also evaluating the framework with an empirical study on the human perception of visual symmetry. Our framework represents and is driven by the application of foundational, integrated Vision and Knowledge Representation and Reasoning methods for applications in the arts, and the psychological and social sciences.
Tasks	Question Answering
Published	2018-05-31
URL	http://arxiv.org/abs/1806.07376v2
PDF	http://arxiv.org/pdf/1806.07376v2.pdf
PWC	https://paperswithcode.com/paper/semantic-analysis-of-reflectional-visual
Repo
Framework

Open Source Dataset and Machine Learning Techniques for Automatic Recognition of Historical Graffiti


Title	Open Source Dataset and Machine Learning Techniques for Automatic Recognition of Historical Graffiti
Authors	Nikita Gordienko, Peng Gang, Yuri Gordienko, Wei Zeng, Oleg Alienin, Oleksandr Rokovyi, Sergii Stirenko
Abstract	Machine learning techniques are presented for automatic recognition of the historical letters (XI-XVIII centuries) carved on the stone walls of St. Sophia cathedral in Kyiv (Ukraine). A new image dataset of these carved Glagolitic and Cyrillic letters (CGCL) was assembled and pre-processed for recognition and prediction by machine learning methods. The dataset consists of more than 4000 images for 34 types of letters. The exploratory data analysis of the CGCL and notMNIST datasets showed that the carved letters can hardly be differentiated by dimensionality reduction methods, for example, by t-distributed stochastic neighbor embedding (tSNE), because stone carving represents letters less clearly than handwriting. Multinomial logistic regression (MLR) and 2D convolutional neural network (CNN) models were applied. With the MLR model, the area under the curve (AUC) values for the receiver operating characteristic (ROC) were not lower than 0.92 and 0.60 for notMNIST and CGCL, respectively. The CNN model gave AUC values close to 0.99 for both notMNIST and CGCL (despite the much smaller size and quality of CGCL in comparison to notMNIST) under the condition of highly lossy data augmentation. The CGCL dataset was published as an open source resource for the data science community.
Tasks	Data Augmentation, Dimensionality Reduction
Published	2018-08-31
URL	http://arxiv.org/abs/1808.10862v1
PDF	http://arxiv.org/pdf/1808.10862v1.pdf
PWC	https://paperswithcode.com/paper/open-source-dataset-and-machine-learning
Repo
Framework

TopRank: A practical algorithm for online stochastic ranking


Title	TopRank: A practical algorithm for online stochastic ranking
Authors	Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvari
Abstract	Online learning to rank is a sequential decision-making problem where in each round the learning agent chooses a list of items and receives feedback in the form of clicks from the user. Many sample-efficient algorithms have been proposed for this problem that assume a specific click model connecting rankings and user behavior. We propose a generalized click model that encompasses many existing models, including the position-based and cascade models. Our generalization motivates a novel online learning algorithm based on topological sort, which we call TopRank. TopRank (a) is more natural than existing algorithms, (b) has stronger regret guarantees than existing algorithms with comparable generality, (c) has a more insightful proof that leaves the door open to many generalizations, and (d) outperforms existing algorithms empirically.
Tasks	Decision Making, Learning-To-Rank
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02248v2
PDF	http://arxiv.org/pdf/1806.02248v2.pdf
PWC	https://paperswithcode.com/paper/toprank-a-practical-algorithm-for-online
Repo
Framework
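
The abstract says TopRank is built on topological sort. The sketch below shows only that primitive, using Kahn's algorithm: given pairs stating that one item is believed better than another, it returns an ordering consistent with every pair. It is not the TopRank algorithm itself, which also has to learn the pairs from clicks; the function name and the input format are assumptions made for illustration.

```python
from collections import deque


def topological_order(items, better_than):
    """Order items so that each (a, b) in better_than places a before b.

    better_than: iterable of (a, b) pairs, a hypothetical encoding of
    'a is believed more attractive than b'.
    """
    successors = {item: [] for item in items}
    indegree = {item: 0 for item in items}
    for a, b in better_than:
        successors[a].append(b)
        indegree[b] += 1

    # Start from items that nothing is known to beat.
    ready = deque(item for item in items if indegree[item] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(items):
        raise ValueError("preference relation contains a cycle")
    return order
```

Items with no known relation to each other may appear in any relative order, which leaves room for an online learner to keep exploring among them.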

KRISM — Krylov Subspace-based Optical Computing of Hyperspectral Images


Title	KRISM — Krylov Subspace-based Optical Computing of Hyperspectral Images
Authors	Vishwanath Saragadam, Aswin C. Sankaranarayanan
Abstract	We present an adaptive imaging technique that optically computes a low-rank approximation of a scene’s hyperspectral image, conceptualized as a matrix. Central to the proposed technique is the optical implementation of two measurement operators: a spectrally-coded imager and a spatially-coded spectrometer. By iterating between the two operators, we show that the top singular vectors and singular values of a hyperspectral image can be adaptively and optically computed with only a few iterations. We present an optical design that uses pupil plane coding for implementing the two operations and show several compelling results using a lab prototype to demonstrate the effectiveness of the proposed hyperspectral imager.
Tasks
Published	2018-01-26
URL	https://arxiv.org/abs/1801.09343v4
PDF	https://arxiv.org/pdf/1801.09343v4.pdf
PWC	https://paperswithcode.com/paper/krism-krylov-subspace-based-optical-computing
Repo
Framework
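
The abstract describes alternating between two measurement operators to recover the top singular vectors of a matrix. A numerical analogue is to alternate multiplication by a matrix and by its transpose. The sketch below uses plain power iteration as a simplified stand-in for the Krylov subspace method named in the title, and it recovers only the leading singular triplet. The helper names are hypothetical, and nothing here models the optical hardware.

```python
import math


def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def normalize(x):
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def top_singular_triplet(a, iters=100):
    """Estimate (u, sigma, v) for the largest singular value of a.

    Assumes a is nonzero and the all-ones start vector is not
    orthogonal to the leading right singular vector.
    """
    a_t = [list(col) for col in zip(*a)]
    v = normalize([1.0] * len(a[0]))
    for _ in range(iters):
        u = normalize(matvec(a, v))    # apply the matrix
        v = normalize(matvec(a_t, u))  # apply its transpose
    av = matvec(a, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return u, sigma, v
```

Each pass needs only two matrix-vector products, which is why a method of this family suits a setting where each product is an optical measurement rather than a full readout of the matrix.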

Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration


Title	Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration
Authors	Xinyuan Zhang, Xin Yuan, Lawrence Carin
Abstract	Low-rank signal modeling has been widely leveraged to capture non-local correlation in image processing applications. We propose a new method that employs low-rank tensor factor analysis for tensors generated by grouped image patches. The low-rank tensors are fed into the alternating direction method of multipliers (ADMM) to further improve image reconstruction. The motivating application is compressive sensing (CS), and a deep convolutional architecture is adopted to approximate the expensive matrix inversion in CS applications. An iterative algorithm based on this low-rank tensor factorization strategy, called NLR-TFA, is presented in detail. Experimental results on noiseless and noisy CS measurements demonstrate the superiority of the proposed approach, especially at low CS sampling rates.
Tasks	Compressive Sensing, Image Reconstruction, Image Restoration
Published	2018-03-19
URL	http://arxiv.org/abs/1803.06795v1
PDF	http://arxiv.org/pdf/1803.06795v1.pdf
PWC	https://paperswithcode.com/paper/nonlocal-low-rank-tensor-factor-analysis-for
Repo
Framework