January 29, 2020


Paper Group ANR 520


NEARBY Platform: Algorithm for Automated Asteroids Detection in Astronomical Images


Title	NEARBY Platform: Algorithm for Automated Asteroids Detection in Astronomical Images
Authors	T. Stefanut, V. Bacu, C. Nandra, D. Balasz, D. Gorgan, O. Vaduvescu
Abstract	In the past two decades, an increasing interest in discovering Near Earth Objects (NEOs) has been noted in the astronomical community. Dedicated surveys have been operated for data acquisition and processing, resulting in the present discovery of over 18,000 objects that are within 30 million miles of Earth. Nevertheless, recent events have shown that there are still many undiscovered asteroids that may be on a collision course with Earth. This article presents an original NEO detection algorithm developed in the NEARBY research project, which has been integrated into an automated MOPS processing pipeline aimed at identifying moving space objects based on the blink method. The proposed solution can be considered an approach to Big Data processing and analysis, implementing visual analytics techniques for rapid human data validation.
Tasks
Published	2019-01-08
URL	http://arxiv.org/abs/1901.02545v1
PDF	http://arxiv.org/pdf/1901.02545v1.pdf
PWC	https://paperswithcode.com/paper/nearby-platform-algorithm-for-automated
Repo
Framework

On the Efficiency of the Sinkhorn and Greenkhorn Algorithms and Their Acceleration for Optimal Transport


Title	On the Efficiency of the Sinkhorn and Greenkhorn Algorithms and Their Acceleration for Optimal Transport
Authors	Tianyi Lin, Nhat Ho, Michael I. Jordan
Abstract	We present new complexity results for several algorithms that approximately solve the regularized optimal transport (OT) problem between two discrete probability measures with at most $n$ atoms. First, we show that a greedy variant of the classical Sinkhorn algorithm, known as the \textit{Greenkhorn} algorithm, achieves the complexity bound of $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$, which improves the best known bound $\widetilde{\mathcal{O}}(n^2\varepsilon^{-3})$. Notably, this matches the best known complexity bound of the Sinkhorn algorithm and explains the superior performance of the Greenkhorn algorithm in practice. Furthermore, we generalize an adaptive primal-dual accelerated gradient descent (APDAGD) algorithm with mirror mapping $\phi$ and show that the resulting \textit{adaptive primal-dual accelerated mirror descent} (APDAMD) algorithm achieves the complexity bound of $\widetilde{\mathcal{O}}(n^2\sqrt{\delta}\varepsilon^{-1})$ where $\delta>0$ depends on $\phi$. We point out that an existing complexity bound for the APDAGD algorithm is not valid in general using a simple counterexample and then establish the complexity bound of $\widetilde{\mathcal{O}}(n^{5/2}\varepsilon^{-1})$ by exploiting the connection between the APDAMD and APDAGD algorithms. Moreover, we introduce accelerated Sinkhorn and Greenkhorn algorithms that achieve the complexity bound of $\widetilde{\mathcal{O}}(n^{7/3}\varepsilon^{-1})$, which improves on the complexity bounds $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$ of Sinkhorn and Greenkhorn algorithms in terms of $\varepsilon$. Experimental results on synthetic and real datasets demonstrate the favorable performance of new algorithms in practice.
Tasks
Published	2019-06-01
URL	https://arxiv.org/abs/1906.01437v6
PDF	https://arxiv.org/pdf/1906.01437v6.pdf
PWC	https://paperswithcode.com/paper/on-the-acceleration-of-the-sinkhorn-and
Repo
Framework
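
To make the Sinkhorn iteration named in the abstract above concrete, here is a minimal sketch in plain Python. It assumes the standard entropic-regularization form with Gibbs kernel exp(-eta * C); the names `sinkhorn`, `eta`, and `iters` are illustrative choices, not taken from the paper, and the sketch omits the stopping criterion and rounding step that the complexity bounds rely on. Greenkhorn differs by greedily updating only one row or column per step, which is not shown here.

```python
import math

def sinkhorn(cost, r, c, eta=5.0, iters=500):
    """Illustrative Sinkhorn scaling for entropic-regularized OT (not the paper's code)."""
    n, m = len(r), len(c)
    # Gibbs kernel K_ij = exp(-eta * C_ij); eta is an assumed regularization parameter
    K = [[math.exp(-eta * cost[i][j]) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows so the plan's row sums match r
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match c
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy problem: two atoms on each side, unit cost for moving mass across
cost = [[0.0, 1.0], [1.0, 0.0]]
r = [0.5, 0.5]
c = [0.3, 0.7]
plan = sinkhorn(cost, r, c)
```

On this toy problem the returned plan should have row sums close to `r` and column sums close to `c`; a fixed iteration count is used only to keep the sketch short.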

DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System


Title	DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System
Authors	Ankur Handa, Karl Van Wyk, Wei Yang, Jacky Liang, Yu-Wei Chao, Qian Wan, Stan Birchfield, Nathan Ratliff, Dieter Fox
Abstract	Teleoperation offers the possibility of imparting robotic systems with sophisticated reasoning skills, intuition, and creativity to perform tasks. However, current teleoperation solutions for high degree-of-actuation (DoA), multi-fingered robots are generally cost-prohibitive, while low-cost offerings usually provide reduced degrees of control. Herein, a low-cost, vision based teleoperation system, DexPilot, was developed that allows for complete control over the full 23 DoA robotic system by merely observing the bare human hand. DexPilot enables operators to carry out a variety of complex manipulation tasks that go beyond simple pick-and-place operations. This allows for collection of high dimensional, multi-modality, state-action data that can be leveraged in the future to learn sensorimotor policies for challenging manipulation tasks. The system performance was measured through speed and reliability metrics across two human demonstrators on a variety of tasks. The videos of the experiments can be found at https://sites.google.com/view/dex-pilot.
Tasks
Published	2019-10-07
URL	https://arxiv.org/abs/1910.03135v2
PDF	https://arxiv.org/pdf/1910.03135v2.pdf
PWC	https://paperswithcode.com/paper/dexpilot-vision-based-teleoperation-of
Repo
Framework

Analysis of the $(μ/μ_I,λ)$-CSA-ES with Repair by Projection Applied to a Conically Constrained Problem


Title	Analysis of the $(μ/μ_I,λ)$-CSA-ES with Repair by Projection Applied to a Conically Constrained Problem
Authors	Patrick Spettel, Hans-Georg Beyer
Abstract	Theoretical analyses of evolution strategies are indispensable for gaining a deep understanding of their inner workings. For constrained problems, rather simple problems are of interest in the current research. This work presents a theoretical analysis of a multi-recombinative evolution strategy with cumulative step size adaptation applied to a conically constrained linear optimization problem. The state of the strategy is modeled by random variables and a stochastic iterative mapping is introduced. For the analytical treatment, fluctuations are neglected and the mean value iterative system is considered. Non-linear difference equations are derived based on one-generation progress rates. Based on that, expressions for the steady state of the mean value iterative system are derived. By comparison with real algorithm runs, it is shown that for the considered assumptions, the theoretical derivations are able to predict the dynamics and the steady state values of the real runs.
Tasks
Published	2019-01-23
URL	https://arxiv.org/abs/1901.07871v2
PDF	https://arxiv.org/pdf/1901.07871v2.pdf
PWC	https://paperswithcode.com/paper/analysis-of-the-_i-csa-es-with-repair-by
Repo
Framework

A Novel Approach to OCR using Image Recognition based Classification for Ancient Tamil Inscriptions in Temples


Title	A Novel Approach to OCR using Image Recognition based Classification for Ancient Tamil Inscriptions in Temples
Authors	Lalitha Giridhar, Aishwarya Dharani, Velmathi Guruviah
Abstract	Recognition of ancient Tamil characters has always been a challenge for epigraphers. This is primarily because the language has evolved over several centuries, and the character set has both expanded and diversified over this time. This proposed work focuses on improving optical character recognition techniques for the ancient Tamil script that was in use between the 7th and 12th centuries. While comprehensively curating a functional data set for ancient Tamil characters is an arduous task, in this work a data set has been curated using cropped images of characters found on certain temple inscriptions specific to this period, as a case study. After using the Otsu thresholding method for binarization of the image, a two-dimensional convolutional neural network is defined and used to train, classify, and recognize the ancient Tamil characters. To implement the optical character recognition techniques, the neural network is linked to Tesseract using the pytesseract library for Python. As an added feature, the work also incorporates Google’s text-to-speech voice engine to produce an audio output of the digitized text. Various samples for both modern and ancient Tamil were collected and passed through the system. It is found that for Tamil inscriptions studied over the considered time period, a combined efficiency of 77.7 percent can be achieved.
Tasks	Optical Character Recognition
Published	2019-07-04
URL	https://arxiv.org/abs/1907.04917v1
PDF	https://arxiv.org/pdf/1907.04917v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-approach-to-ocr-using-image
Repo
Framework
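
The binarization step mentioned in the abstract above can be illustrated with a small sketch of Otsu's method in plain Python. This is the textbook formulation (pick the gray level that maximizes between-class variance), not the authors' pipeline; the function name `otsu_threshold` and the toy pixel list are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance
    for the split (pixels <= t) versus (pixels > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # intensity sum of the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break  # bright class empty; no further valid splits
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor of 1 / total**2
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy "image": four dark pixels and four bright pixels
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

In practice a library routine operating on a 2-D image array would normally be used; the flat list here only keeps the example dependency-free.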

Decentralized Dynamic Task Allocation in Swarm Robotic Systems for Disaster Response


Title	Decentralized Dynamic Task Allocation in Swarm Robotic Systems for Disaster Response
Authors	Payam Ghassemi, David DePauw, Souma Chowdhury
Abstract	Multiple robotic systems, working together, can provide important solutions to different real-world applications (e.g., disaster response), among which task allocation problems feature prominently. Very few existing decentralized multi-robotic task allocation (MRTA) methods simultaneously offer the following capabilities: consideration of task deadlines, consideration of robot range and task completion capacity limitations, and allowing asynchronous decision-making under dynamic task spaces. To provision these capabilities, this paper presents a computationally efficient algorithm that involves novel construction and matching of bipartite graphs. Its performance is tested on a multi-UAV flood response application.
Tasks	Decision Making
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04394v1
PDF	https://arxiv.org/pdf/1907.04394v1.pdf
PWC	https://paperswithcode.com/paper/decentralized-dynamic-task-allocation-in
Repo
Framework

Algorithmic Rewriting of Health-Related Ads to Improve their Performance


Title	Algorithmic Rewriting of Health-Related Ads to Improve their Performance
Authors	Brit Youngmann, Ran Gilad-Bachrach, Danny Karmon, Elad Yom-Tov
Abstract	Search advertising is one of the most commonly-used methods of advertising. Past work has shown that search advertising can be employed to improve health by eliciting positive behavioral change. However, writing effective advertisements requires expertise and (possibly expensive) experimentation, both of which may not be available to public health authorities wishing to elicit such behavioral changes, especially when dealing with public health crises such as epidemic outbreaks. Here we develop an algorithm which builds on past advertising data to train a sequence-to-sequence Deep Neural Network which “translates” advertisements into optimized ads that are more likely to be clicked. The network is trained using more than 114 thousand ads shown on Microsoft Advertising. We apply this translator to two health-related domains: Medical Symptoms (MS) and Preventative Healthcare (PH) and measure the improvements in click-through rates (CTR). Our experiments show that the generated ads are predicted to have higher CTR in 81% of MS ads and 76% of PH ads. To understand the differences between the generated ads and the original ones we develop estimators for the affective attributes of the ads. We show that the generated ads contain more calls-to-action and that they reflect higher valence (36% increase) and higher arousal (87%) on a sample of 1000 ads. Finally, we run an advertising campaign where 10 random ads and their rephrased versions from each of the domains are run in parallel. We show an average improvement in CTR of 68% for the generated ads compared to the original ads. Our results demonstrate the ability to automatically optimize advertisements for the health domain. We believe that our work offers health authorities an improved ability to help nudge people towards healthier behaviors while saving the time and cost needed to optimize advertising campaigns.
Tasks
Published	2019-10-27
URL	https://arxiv.org/abs/1910.12274v2
PDF	https://arxiv.org/pdf/1910.12274v2.pdf
PWC	https://paperswithcode.com/paper/the-automated-copywriter-algorithmic
Repo
Framework

Domain Adaptation for Semantic Segmentation with Maximum Squares Loss


Title	Domain Adaptation for Semantic Segmentation with Maximum Squares Loss
Authors	Minghao Chen, Hongyang Xue, Deng Cai
Abstract	Deep neural networks for semantic segmentation always require a large number of samples with pixel-level labels, which becomes the major difficulty in their real-world applications. To reduce the labeling cost, unsupervised domain adaptation (UDA) approaches are proposed to transfer knowledge from labeled synthesized datasets to unlabeled real-world datasets. Recently, some semi-supervised learning methods have been applied to UDA and achieved state-of-the-art performance. One of the most popular approaches in semi-supervised learning is the entropy minimization method. However, when applying the entropy minimization to UDA for semantic segmentation, the gradient of the entropy is biased towards samples that are easy to transfer. To balance the gradient of well-classified target samples, we propose the maximum squares loss. Our maximum squares loss prevents the training process from being dominated by easy-to-transfer samples in the target domain. Besides, we introduce the image-wise weighting ratio to alleviate the class imbalance in the unlabeled target domain. Both synthetic-to-real and cross-city adaptation experiments demonstrate the effectiveness of our proposed approach. The code is released at https://github.com/ZJULearning/MaxSquareLoss.
Tasks	Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13589v1
PDF	https://arxiv.org/pdf/1909.13589v1.pdf
PWC	https://paperswithcode.com/paper/domain-adaptation-for-semantic-segmentation-1
Repo
Framework
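
The contrast between entropy minimization and a squares-based loss in the abstract above can be sketched numerically. The abstract does not state the formula, so this example assumes the maximum squares loss is the negative sum of squared class probabilities averaged over pixels and scaled by 1/2; the function names and the toy probability vectors are hypothetical. Under that assumption the per-class gradient with respect to p is simply -p, which is linear, whereas the entropy gradient involves log p.

```python
import math

def entropy_loss(probs):
    """Mean per-pixel entropy: -(1/N) * sum_n sum_c p * log(p)."""
    n = len(probs)
    return -sum(p * math.log(p) for row in probs for p in row if p > 0) / n

def max_squares_loss(probs):
    """Assumed form of the maximum squares loss: -(1/(2N)) * sum_n sum_c p**2."""
    n = len(probs)
    return -sum(p * p for row in probs for p in row) / (2 * n)

# One confidently classified pixel and one uncertain pixel (3 classes each)
confident = [[0.98, 0.01, 0.01]]
uncertain = [[0.4, 0.3, 0.3]]

ent_conf, ent_unc = entropy_loss(confident), entropy_loss(uncertain)
msl_conf, msl_unc = max_squares_loss(confident), max_squares_loss(uncertain)
```

Both losses are lower for the confident prediction, so minimizing either pushes predictions toward confident outputs; they differ in how strongly already-confident pixels contribute to the gradient.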

Learning from Videos with Deep Convolutional LSTM Networks


Title	Learning from Videos with Deep Convolutional LSTM Networks
Authors	Logan Courtney, Ramavarapu Sreenivas
Abstract	This paper explores the use of convolution LSTMs to simultaneously learn spatial- and temporal-information in videos. A deep network of convolutional LSTMs allows the model to access the entire range of temporal information at all spatial scales of the data. We describe our experiments involving convolution LSTMs for lipreading that demonstrate the model is capable of selectively choosing which spatiotemporal scales are most relevant for a particular dataset. The proposed deep architecture also holds promise in other applications where spatiotemporal features play a vital role without having to specifically cater the design of the network for the particular spatiotemporal features existent within the problem. For the Lip Reading in the Wild (LRW) dataset, our model slightly outperforms the previous state of the art (83.4% vs. 83.0%) and sets the new state of the art at 85.2% when the model is pretrained on the Lip Reading Sentences (LRS2) dataset.
Tasks	Lipreading
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04817v1
PDF	http://arxiv.org/pdf/1904.04817v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-videos-with-deep-convolutional
Repo
Framework

Dually Supervised Feature Pyramid for Object Detection and Segmentation


Title	Dually Supervised Feature Pyramid for Object Detection and Segmentation
Authors	Fan Yang, Cheng Lu, Yandong Guo, Longin Jan Latecki, Haibin Ling
Abstract	Feature pyramid architecture has been broadly adopted in object detection and segmentation to deal with the multi-scale problem. However, in this paper we show that the capacity of the architecture has not been fully explored due to the inadequate utilization of the supervision information. Such insufficient utilization is caused by the supervision signal degradation in back propagation. Thus inspired, we propose a dually supervised method, named dually supervised FPN (DSFPN), to enhance the supervision signal when training the feature pyramid network (FPN). In particular, DSFPN is constructed by attaching extra prediction (i.e., detection or segmentation) heads to the bottom-up subnet of FPN. Hence, the features can be optimized by the additional heads before being forwarded to subsequent networks. Further, the auxiliary heads can serve as a regularization term to facilitate the model training. In addition, to strengthen the capability of the detection heads in DSFPN for handling two inhomogeneous tasks, i.e., classification and regression, the originally shared hidden feature space is separated by decoupling classification and regression subnets. To demonstrate the generalizability, effectiveness, and efficiency of the proposed method, DSFPN is integrated into four representative detectors (Faster RCNN, Mask RCNN, Cascade RCNN, and Cascade Mask RCNN) and assessed on the MS COCO dataset. Promising precision improvement, state-of-the-art performance, and negligible additional computational cost are demonstrated through extensive experiments. Code will be provided.
Tasks	Object Detection
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03730v2
PDF	https://arxiv.org/pdf/1912.03730v2.pdf
PWC	https://paperswithcode.com/paper/dually-supervised-feature-pyramid-for-object
Repo
Framework

Context-Gated Convolution


Title	Context-Gated Convolution
Authors	Xudong Lin, Lin Ma, Wei Liu, Shih-Fu Chang
Abstract	As the basic building block of Convolutional Neural Networks (CNNs), the convolutional layer is designed to extract local patterns and by its nature lacks the ability to model global context. Many efforts have recently been devoted to complementing CNNs with global modeling ability, especially by a family of works on global feature interaction. In these works, the global context information is incorporated into local features before they are fed into convolutional layers. However, research in neuroscience reveals that neurons’ ability to modify their functions dynamically according to context is essential for perceptual tasks, which has been overlooked in most CNNs. Motivated by this, we propose a novel Context-Gated Convolution (CGC) to explicitly modify the weights of convolutional layers adaptively under the guidance of global context. As such, being aware of the global context, the modulated convolution kernel of our proposed CGC can better extract representative local patterns and compose discriminative features. Moreover, our proposed CGC is lightweight and applicable to modern CNN architectures, and consistently improves the performance of CNNs according to extensive experiments on image classification, action recognition, and machine translation.
Tasks	Image Classification, Machine Translation
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05577v3
PDF	https://arxiv.org/pdf/1910.05577v3.pdf
PWC	https://paperswithcode.com/paper/context-gated-convolution
Repo
Framework

Leveraging Text Repetitions and Denoising Autoencoders in OCR Post-correction


Title	Leveraging Text Repetitions and Denoising Autoencoders in OCR Post-correction
Authors	Kai Hakala, Aleksi Vesanto, Niko Miekka, Tapio Salakoski, Filip Ginter
Abstract	A common approach for improving OCR quality is a post-processing step based on models correcting misdetected characters and tokens. These models are typically trained on aligned pairs of OCR read text and their manually corrected counterparts. In this paper we show that the requirement of manually corrected training data can be alleviated by estimating the OCR errors from repeating text spans found in large OCR read text corpora and generating synthetic training examples following this error distribution. We use the generated data for training a character-level neural seq2seq model and evaluate the performance of the suggested model on a manually corrected corpus of Finnish newspapers mostly from the 19th century. The results show that a clear improvement over the underlying OCR system as well as previously suggested models utilizing uniformly generated noise can be achieved.
Tasks	Denoising, Optical Character Recognition
Published	2019-06-26
URL	https://arxiv.org/abs/1906.10907v1
PDF	https://arxiv.org/pdf/1906.10907v1.pdf
PWC	https://paperswithcode.com/paper/leveraging-text-repetitions-and-denoising
Repo
Framework

A Generalized Training Approach for Multiagent Learning


Title	A Generalized Training Approach for Multiagent Learning
Authors	Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Perolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Remi Munos
Abstract	This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have been focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, $\alpha$-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and applies readily to general-sum, many-player settings. We establish convergence guarantees in several game classes, and identify links between Nash equilibria and $\alpha$-Rank. We demonstrate the competitive performance of $\alpha$-Rank-based PSRO against an exact Nash solver-based PSRO in 2-player Kuhn and Leduc Poker. We then go beyond the reach of prior PSRO applications by considering 3- to 5-player poker games, yielding instances where $\alpha$-Rank achieves faster convergence than approximate Nash solvers, thus establishing it as a favorable general games solver. We also carry out an initial empirical validation in MuJoCo soccer, illustrating the feasibility of the proposed approach in another complex domain.
Tasks
Published	2019-09-27
URL	https://arxiv.org/abs/1909.12823v2
PDF	https://arxiv.org/pdf/1909.12823v2.pdf
PWC	https://paperswithcode.com/paper/a-generalized-training-approach-for-1
Repo
Framework

MMD-Bayes: Robust Bayesian Estimation via Maximum Mean Discrepancy


Title	MMD-Bayes: Robust Bayesian Estimation via Maximum Mean Discrepancy
Authors	Badr-Eddine Chérief-Abdellatif, Pierre Alquier
Abstract	In some misspecified settings, the posterior distribution in Bayesian statistics may lead to inconsistent estimates. To fix this issue, it has been suggested to replace the likelihood by a pseudo-likelihood, that is the exponential of a loss function enjoying suitable robustness properties. In this paper, we build a pseudo-likelihood based on the Maximum Mean Discrepancy, defined via an embedding of probability distributions into a reproducing kernel Hilbert space. We show that this MMD-Bayes posterior is consistent and robust to model misspecification. As the posterior obtained in this way might be intractable, we also prove that reasonable variational approximations of this posterior enjoy the same properties. We provide details on a stochastic gradient algorithm to compute these variational approximations. Numerical simulations indeed suggest that our estimator is more robust to misspecification than the ones based on the likelihood.
Tasks
Published	2019-09-29
URL	https://arxiv.org/abs/1909.13339v2
PDF	https://arxiv.org/pdf/1909.13339v2.pdf
PWC	https://paperswithcode.com/paper/mmd-bayes-robust-bayesian-estimation-via
Repo
Framework
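
As a rough illustration of the discrepancy underlying the abstract above, the following sketch estimates the squared Maximum Mean Discrepancy between two one-dimensional samples. It assumes a Gaussian kernel and the simple biased (V-statistic) estimator; the names `gaussian_kernel`, `mmd_squared`, and `bandwidth` are illustrative, and the sketch covers only the discrepancy itself, not the pseudo-likelihood, posterior, or variational approximation studied in the paper.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars; the bandwidth value is an assumption."""
    return math.exp(-((x - y) ** 2) / (2 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between samples xs and ys."""
    def mean_k(a, b):
        # Average kernel value over all pairs drawn from a and b
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2 * mean_k(xs, ys)

# Identical samples versus a sample shifted far away
same = mmd_squared([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
shifted = mmd_squared([0.0, 0.5, 1.0], [3.0, 3.5, 4.0])
```

Because the Gaussian kernel is bounded, a single far-away outlier can change each kernel average only by a bounded amount, which gives some intuition for the robustness the abstract describes.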

Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies


Title	Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies
Authors	Muhammad A. Masood, Finale Doshi-Velez
Abstract	Standard reinforcement learning methods aim to master one way of solving a task whereas there may exist multiple near-optimal policies. Being able to identify this collection of near-optimal policies can allow a domain expert to efficiently explore the space of reasonable solutions. Unfortunately, existing approaches that quantify uncertainty over policies are not ultimately relevant to finding policies with qualitatively distinct behaviors. In this work, we formalize the difference between policies as a difference between the distribution of trajectories induced by each policy, which encourages diversity with respect to both state visitation and action choices. We derive a gradient-based optimization technique that can be combined with existing policy gradient methods to now identify diverse collections of well-performing policies. We demonstrate our approach on benchmarks and a healthcare task.
Tasks	Policy Gradient Methods
Published	2019-05-31
URL	https://arxiv.org/abs/1906.00088v1
PDF	https://arxiv.org/pdf/1906.00088v1.pdf
PWC	https://paperswithcode.com/paper/190600088
Repo
Framework