January 30, 2020

Paper Group ANR 229

Automatic microscopic cell counting by use of deeply-supervised density regression model. Visual search and recognition for robot task execution and monitoring. Multi-Agent Reinforcement Learning with Multi-Step Generative Models. On Convex Duality in Linear Inverse Problems. SR-LSTM: State Refinement for LSTM towards Pedestrian Trajectory Predicti …

Automatic microscopic cell counting by use of deeply-supervised density regression model


Title	Automatic microscopic cell counting by use of deeply-supervised density regression model
Authors	Shenghua He, Kyaw Thu Minn, Lilianna Solnica-Krezel, Mark Anastasio, Hua Li
Abstract	Accurately counting cells in microscopic images is important for medical diagnoses and biological studies, but manual cell counting is very tedious, time-consuming, and prone to subjective errors, and automatic counting can be less accurate than desired. To improve the accuracy of automatic cell counting, we propose here a novel method that employs deeply-supervised density regression. A fully convolutional neural network (FCNN) serves as the primary FCNN for density map regression. Innovatively, a set of auxiliary FCNNs are employed to provide additional supervision for learning the intermediate layers of the primary CNN to improve network performance. In addition, the primary CNN is designed as a concatenating framework to integrate multi-scale features through shortcut connections in the network, which improves the granularity of the features extracted from the intermediate CNN layers and further supports the final density map estimation.
Tasks
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01084v3
PDF	http://arxiv.org/pdf/1903.01084v3.pdf
PWC	https://paperswithcode.com/paper/automatic-microscopic-cell-counting-by-use-of-1
Repo
Framework
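As a rough illustration of the density map regression idea in the abstract above (not the authors' network), the sketch below builds a target density map from dot annotations by placing a unit-mass Gaussian at each cell centre, so that the cell count is simply the sum over the map. The function name, image size, annotation coordinates, and `sigma` are assumptions introduced here.

```python
import math

def density_map(points, height, width, sigma=2.0):
    """Sum of unit-mass Gaussians, one per annotated cell centre (row, col)."""
    dmap = [[0.0] * width for _ in range(height)]
    for r0, c0 in points:
        kernel = [
            [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        mass = sum(map(sum, kernel))
        # Normalise inside the image so each cell contributes exactly 1.
        for r in range(height):
            for c in range(width):
                dmap[r][c] += kernel[r][c] / mass
    return dmap

cells = [(10, 12), (30, 40), (31, 42)]  # hypothetical dot annotations
dmap = density_map(cells, 64, 64)
estimated_count = sum(map(sum, dmap))   # integrating the map recovers the count
```

A regression network trained on such maps would predict `dmap` from the image; summing its prediction gives the estimated count even when cells overlap.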

Visual search and recognition for robot task execution and monitoring


Title	Visual search and recognition for robot task execution and monitoring
Authors	Lorenzo Mauro, Francesco Puja, Simone Grazioso, Valsamis Ntouskos, Marta Sanzari, Edoardo Alati, Fiora Pirri
Abstract	Visual search of relevant targets in the environment is a crucial robot skill. We propose a preliminary framework for the execution monitor of a robot task, taking care of the robot's attitude to visually search the environment for targets involved in the task. Visual search is also relevant to recover from a failure. The framework exploits deep reinforcement learning to acquire a “common sense” scene structure and it takes advantage of a deep convolutional network to detect objects and relevant relations holding between them. The framework builds on these methods to introduce a vision-based execution monitoring, which uses classical planning as a backbone for task execution. Experiments show that with the proposed vision-based execution monitor the robot can complete simple tasks and can recover from failures autonomously.
Tasks	Common Sense Reasoning
Published	2019-02-07
URL	http://arxiv.org/abs/1902.02870v1
PDF	http://arxiv.org/pdf/1902.02870v1.pdf
PWC	https://paperswithcode.com/paper/visual-search-and-recognition-for-robot-task
Repo
Framework

Multi-Agent Reinforcement Learning with Multi-Step Generative Models


Title	Multi-Agent Reinforcement Learning with Multi-Step Generative Models
Authors	Orr Krupnik, Igor Mordatch, Aviv Tamar
Abstract	We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems – an important domain for robots interacting with other agents in the same workspace. For non-trivial dynamical systems, MBRL typically suffers from accumulating errors. Several recent studies have addressed this problem by learning latent variable models for trajectory segments and optimizing over behavior in the latent space. In this work, we investigate whether this approach can be extended to 2-agent competitive and cooperative settings. The fundamental challenge is how to learn models that capture interactions between agents, yet are disentangled to allow for optimization of each agent's behavior separately. We propose such models based on a disentangled variational auto-encoder, and demonstrate our approach on a simulated 2-robot manipulation task, where one robot can either help or distract the other. We show that our approach has better sample efficiency than a strong model-free RL baseline, and can learn both cooperative and adversarial behavior from the same data.
Tasks	Continuous Control, Decision Making, Latent Variable Models, Multi-agent Reinforcement Learning
Published	2019-01-29
URL	https://arxiv.org/abs/1901.10251v3
PDF	https://arxiv.org/pdf/1901.10251v3.pdf
PWC	https://paperswithcode.com/paper/multi-agent-reinforcement-learning-with-multi
Repo
Framework

On Convex Duality in Linear Inverse Problems


Title	On Convex Duality in Linear Inverse Problems
Authors	Mohammed Rayyan Sheriff, Debasish Chatterjee
Abstract	In this article we delve into the class of so-called ill-posed Linear Inverse Problems (LIP) in machine learning, which has become almost a classic in recent times. The fundamental task in an LIP is to recover the entire signal / data from its relatively few random linear measurements. Such problems arise in a variety of settings, with applications ranging from medical image processing to recommender systems. We provide an exposition of the convex duality of linear inverse problems, and obtain a novel and equivalent convex-concave min-max reformulation that gives rise to simple ascent-descent type algorithms to solve an LIP. Moreover, such a reformulation is crucial in developing methods to solve the dictionary learning problem with almost sure recovery constraints.
Tasks	Denoising, Dictionary Learning, Recommendation Systems
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06065v3
PDF	https://arxiv.org/pdf/1908.06065v3.pdf
PWC	https://paperswithcode.com/paper/convex-geometry-of-the-coding-problem-for
Repo
Framework
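For orientation, a standard textbook instance of the kind of duality mentioned in the abstract above is basis pursuit. The paper's own reformulation may well differ; the symbols $A$, $b$, $x$, and $\lambda$ are assumptions introduced here and are not taken from the source.

```latex
\begin{aligned}
\text{(primal)}\quad & \min_{x \in \mathbb{R}^n} \; \|x\|_1 \quad \text{s.t. } Ax = b, \\
\text{(saddle point)}\quad & \min_{x \in \mathbb{R}^n} \; \max_{\lambda \in \mathbb{R}^m} \; \|x\|_1 + \lambda^\top (b - Ax), \\
\text{(dual)}\quad & \max_{\lambda \in \mathbb{R}^m} \; b^\top \lambda \quad \text{s.t. } \|A^\top \lambda\|_\infty \le 1 .
\end{aligned}
```

The saddle-point form is convex in $x$ and linear (hence concave) in $\lambda$, which is the general setting in which alternating a descent step in $x$ with an ascent step in $\lambda$ makes sense.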


SR-LSTM: State Refinement for LSTM towards Pedestrian Trajectory Prediction


Title	SR-LSTM: State Refinement for LSTM towards Pedestrian Trajectory Prediction
Authors	Pu Zhang, Wanli Ouyang, Pengfei Zhang, Jianru Xue, Nanning Zheng
Abstract	In crowd scenarios, reliable trajectory prediction of pedestrians requires insightful understanding of their social behaviors. These behaviors have been well investigated by plenty of studies, but they are hard to express fully with hand-crafted rules. Recent studies based on LSTM networks have shown great ability to learn social behaviors. However, many of these methods rely on previous neighboring hidden states but ignore the important current intention of the neighbors. In order to address this issue, we propose a data-driven state refinement module for LSTM networks (SR-LSTM), which activates the utilization of the current intention of neighbors, and jointly and iteratively refines the current states of all participants in the crowd through a message passing mechanism. To effectively extract the social effect of neighbors, we further introduce a social-aware information selection mechanism consisting of an element-wise motion gate and a pedestrian-wise attention to select useful messages from neighboring pedestrians. Experimental results on two public datasets, i.e. ETH and UCY, demonstrate the effectiveness of our proposed SR-LSTM, and we achieve state-of-the-art results.
Tasks	Trajectory Prediction
Published	2019-03-07
URL	http://arxiv.org/abs/1903.02793v1
PDF	http://arxiv.org/pdf/1903.02793v1.pdf
PWC	https://paperswithcode.com/paper/sr-lstm-state-refinement-for-lstm-towards
Repo
Framework

An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation


Title	An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation
Authors	Wanyu Du, Yangfeng Ji
Abstract	Generating paraphrases from given sentences involves decoding words step by step from a large vocabulary. To learn a decoder, supervised learning which maximizes the likelihood of tokens always suffers from the exposure bias. Although both reinforcement learning (RL) and imitation learning (IL) have been widely used to alleviate the bias, the lack of direct comparison leads to only a partial picture of their benefits. In this work, we present an empirical study on how RL and IL can help boost the performance of generating paraphrases, with the pointer-generator as a base model. Experiments on the benchmark datasets show that (1) imitation learning is consistently better than reinforcement learning; and (2) the pointer-generator models with imitation learning outperform the state-of-the-art methods by a large margin.
Tasks	Imitation Learning, Paraphrase Generation
Published	2019-08-28
URL	https://arxiv.org/abs/1908.10835v1
PDF	https://arxiv.org/pdf/1908.10835v1.pdf
PWC	https://paperswithcode.com/paper/an-empirical-comparison-on-imitation-learning
Repo
Framework

Constructing High Precision Knowledge Bases with Subjective and Factual Attributes


Title	Constructing High Precision Knowledge Bases with Subjective and Factual Attributes
Authors	Ari Kobren, Pablo Barrio, Oksana Yakhnenko, Johann Hibschman, Ian Langmore
Abstract	Knowledge bases (KBs) are the backbone of many ubiquitous applications and are thus required to exhibit high precision. However, for KBs that store subjective attributes of entities, e.g., whether a movie is “kid friendly”, simply estimating precision is complicated by the inherent ambiguity in measuring subjective phenomena. In this work, we develop a method for constructing KBs with tunable precision–i.e., KBs that can be made to operate at a specific false positive rate, despite storing both difficult-to-evaluate subjective attributes and more traditional factual attributes. The key to our approach is probabilistically modeling user consensus with respect to each entity-attribute pair, rather than modeling each pair as either True or False. Uncertainty in the model is explicitly represented and used to control the KB’s precision. We propose three neural networks for fitting the consensus model and evaluate each one on data from Google Maps–a large KB of locations and their subjective and factual attributes. The results demonstrate that our learned models are well-calibrated and thus can successfully be used to control the KB’s precision. Moreover, when constrained to maintain 95% precision, the best consensus model matches the F-score of a baseline that models each entity-attribute pair as a binary variable and does not support tunable precision. When unconstrained, our model dominates the same baseline by 12% F-score. Finally, we perform an empirical analysis of attribute-attribute correlations and show that leveraging them effectively contributes to reduced uncertainty and better performance in attribute prediction.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.12807v3
PDF	https://arxiv.org/pdf/1905.12807v3.pdf
PWC	https://paperswithcode.com/paper/constructing-high-precision-knowledge-bases
Repo
Framework
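One way to picture "tunable precision" as described in the abstract above: if each entity-attribute pair carries a calibrated probability of being correct, the expected precision of any accepted set is the mean of its probabilities. The sketch below is not the paper's consensus model; the function name and the example scores are hypothetical, and it assumes the scores are well calibrated.

```python
def select_at_precision(scores, target):
    """Indices of the largest top-scored prefix whose expected precision
    (mean calibrated probability) is at least `target`."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen, total = [], 0.0
    for k, i in enumerate(order, start=1):
        total += scores[i]
        # Scores are visited in descending order, so the running mean never
        # increases; the first failure marks the end of the longest valid prefix.
        if total / k < target:
            break
        chosen.append(i)
    return chosen

scores = [0.99, 0.60, 0.97, 0.20, 0.95]  # hypothetical calibrated probabilities
chosen = select_at_precision(scores, target=0.95)
```

Lowering `target` admits more pairs at the cost of expected precision, which is the trade-off a tunable KB exposes.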

Fast Fourier Color Constancy and Grayness Index for ISPA Illumination Estimation Challenge


Title	Fast Fourier Color Constancy and Grayness Index for ISPA Illumination Estimation Challenge
Authors	Yanlin Qian, Ke Chen, Huanglin Yu
Abstract	We briefly introduce two submissions to the Illumination Estimation Challenge, in the Int’l Workshop on Color Vision, affiliated to the 11th Int’l Symposium on Image and Signal Processing and Analysis. The Fourier-transform-based submission is ranked 3rd, and the statistical Gray-pixel-based one ranked 6th.
Tasks	Color Constancy
Published	2019-08-06
URL	https://arxiv.org/abs/1908.02076v2
PDF	https://arxiv.org/pdf/1908.02076v2.pdf
PWC	https://paperswithcode.com/paper/fast-fourier-color-constancy-and-grayness
Repo
Framework

Color Cerberus


Title	Color Cerberus
Authors	A. Savchik, E. Ershov, S. Karpenko
Abstract	A simple convolutional neural network was able to win the ISISPA color constancy competition. A partial reimplementation of the (Bianco, 2017) neural architecture would have shown even better results in this setup.
Tasks	Color Constancy
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06483v1
PDF	https://arxiv.org/pdf/1907.06483v1.pdf
PWC	https://paperswithcode.com/paper/color-cerberus
Repo
Framework

Neural Embedding for Physical Manipulations


Title	Neural Embedding for Physical Manipulations
Authors	Lingzhi Zhang, Andong Cao, Rui Li, Jianbo Shi
Abstract	In common real-world robotic operations, action and state spaces can be vast and sometimes unknown, and observations are often relatively sparse. How do we learn the full topology of action and state spaces when given only few and sparse observations? Inspired by the properties of grid cells in mammalian brains, we build a generative model that enforces a normalized pairwise distance constraint between the latent space and output space to achieve data-efficient discovery of output spaces. This method achieves substantially better results than prior generative models, such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs). Prior models have the common issue of mode collapse and thus fail to explore the full topology of output space. We demonstrate the effectiveness of our model on various datasets both qualitatively and quantitatively.
Tasks
Published	2019-07-13
URL	https://arxiv.org/abs/1907.06143v1
PDF	https://arxiv.org/pdf/1907.06143v1.pdf
PWC	https://paperswithcode.com/paper/neural-embedding-for-physical-manipulations
Repo
Framework
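The abstract above mentions a normalized pairwise distance constraint between latent and output space without giving its form. The sketch below shows one plausible reading, assumed here and not confirmed by the source: pairwise distances in each space are divided by their mean, and the loss penalizes mismatches between the two normalized sets. All names and sample points are hypothetical.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distances over all unordered pairs, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def normalized_distance_loss(latents, outputs):
    """Mean squared gap between mean-normalized pairwise distances.
    Assumes the points in each space are not all identical (non-zero mean)."""
    dz = pairwise_distances(latents)
    dy = pairwise_distances(outputs)
    mean_z = sum(dz) / len(dz)
    mean_y = sum(dy) / len(dy)
    return sum((a / mean_z - b / mean_y) ** 2 for a, b in zip(dz, dy)) / len(dz)

latents = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
scaled = [(3.0 * x, 3.0 * y) for x, y in latents]  # same layout, larger scale
squashed = [(0.0, 0.0), (1.0, 0.0), (1.1, 0.0)]    # two outputs nearly collapsed
loss_scaled = normalized_distance_loss(latents, scaled)
loss_squashed = normalized_distance_loss(latents, squashed)
```

Under this reading a uniformly rescaled output incurs almost no penalty, while outputs that bunch together (the mode-collapse pattern the abstract describes) are penalized.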

Efficient Transfer Bayesian Optimization with Auxiliary Information


Title	Efficient Transfer Bayesian Optimization with Auxiliary Information
Authors	Tomoharu Iwata, Takuma Otsuka
Abstract	We propose an efficient transfer Bayesian optimization method, which finds the maximum of an expensive-to-evaluate black-box function by using data on related optimization tasks. Our method uses auxiliary information that represents the task characteristics to effectively transfer knowledge for estimating a distribution over target functions. In particular, we use a Gaussian process, in which the mean and covariance functions are modeled with neural networks that simultaneously take both the auxiliary information and feature vectors as input. With a neural network mean function, we can estimate the target function even without evaluations. By using the neural network covariance function, we can extract nonlinear correlation among feature vectors that are shared across related tasks. Our Gaussian process-based formulation not only enables an analytic calculation of the posterior distribution but also swiftly adapts the target function to observations. Our method is also advantageous because the computational costs scale linearly with the number of source tasks. Through experiments using a synthetic dataset and datasets for finding the optimal pedestrian traffic regulations and optimal machine learning algorithms, we demonstrate that our method identifies the optimal points with fewer target function evaluations than existing methods.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07670v1
PDF	https://arxiv.org/pdf/1909.07670v1.pdf
PWC	https://paperswithcode.com/paper/efficient-transfer-bayesian-optimization-with
Repo
Framework

FVA: Modeling Perceived Friendliness of Virtual Agents Using Movement Characteristics


Title	FVA: Modeling Perceived Friendliness of Virtual Agents Using Movement Characteristics
Authors	Tanmay Randhavane, Aniket Bera, Kyra Kapsaskis, Kurt Gray, Dinesh Manocha
Abstract	We present a new approach for improving the friendliness and warmth of a virtual agent in an AR environment by generating appropriate movement characteristics. Our algorithm is based on a novel data-driven friendliness model that is computed using a user-study and psychological characteristics. We use our model to control the movements corresponding to the gaits, gestures, and gazing of friendly virtual agents (FVAs) as they interact with the user’s avatar and other agents in the environment. We have integrated FVA agents with an AR environment using a Microsoft HoloLens. Our algorithm can generate plausible movements at interactive rates to increase the social presence. We also investigate the perception of a user in an AR setting and observe that an FVA has a statistically significant improvement in terms of the perceived friendliness and social presence of a user compared to an agent without the friendliness modeling. We observe an increment of 5.71% in the mean responses to a friendliness measure and an improvement of 4.03% in the mean responses to a social presence measure.
Tasks
Published	2019-06-30
URL	https://arxiv.org/abs/1907.00377v1
PDF	https://arxiv.org/pdf/1907.00377v1.pdf
PWC	https://paperswithcode.com/paper/fva-modeling-perceived-friendliness-of
Repo
Framework

Optimally Scheduling CNN Convolutions for Efficient Memory Access


Title	Optimally Scheduling CNN Convolutions for Efficient Memory Access
Authors	Arthur Stoutchinin, Francesco Conti, Luca Benini
Abstract	Embedded inference engines for convolutional networks must be parsimonious in memory bandwidth and buffer sizing to meet power and cost constraints. We present an analytical memory bandwidth model for loop-nest optimization targeting architectures with application managed buffers. We applied this model to optimize the CNN convolution loop-nest. We show that our model is more accurate than previously published models. Using this model we can identify non-trivial dataflow schedules that result in lowest communication bandwidth given tight local buffering constraints. We show that optimal dataflow schedules are implementable in practice and that our model is accurate with respect to a real implementation; moreover, we introduce an accelerator architecture, named Hardware Convolution Block (HWC), which implements the optimal schedules, and we show it achieves up to 14x memory bandwidth reduction compared to a previously published accelerator with a similar memory interface, but implementing a non-optimal schedule.
Tasks
Published	2019-02-04
URL	http://arxiv.org/abs/1902.01492v1
PDF	http://arxiv.org/pdf/1902.01492v1.pdf
PWC	https://paperswithcode.com/paper/optimally-scheduling-cnn-convolutions-for
Repo
Framework
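For readers unfamiliar with the term, the "CNN convolution loop-nest" in the abstract above refers to the nested loops of a direct convolution. The sketch below is a plain reference loop nest (stride 1, no padding), not the paper's bandwidth model or its HWC accelerator; the function name and the tiny example tensors are assumptions.

```python
def conv2d_loop_nest(inputs, weights):
    """inputs[c][y][x], weights[k][c][i][j] -> outputs[k][y][x]."""
    in_ch, in_h, in_w = len(inputs), len(inputs[0]), len(inputs[0][0])
    out_ch, k_h, k_w = len(weights), len(weights[0][0]), len(weights[0][0][0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    outputs = [[[0.0] * out_w for _ in range(out_h)] for _ in range(out_ch)]
    # Six nested loops: reordering or tiling them changes which operand can
    # stay in a small local buffer, but not the computed result.
    for k in range(out_ch):
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                for c in range(in_ch):
                    for i in range(k_h):
                        for j in range(k_w):
                            acc += inputs[c][y + i][x + j] * weights[k][c][i][j]
                outputs[k][y][x] = acc
    return outputs

image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]  # one 3x3 input channel
kernel = [[[[1, 1], [1, 1]]]]                # one 2x2 all-ones filter
result = conv2d_loop_nest(image, kernel)
```

A dataflow schedule in the sense of the abstract corresponds to a particular ordering and blocking of these loops under a buffer-size limit.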

A cascaded dual-domain deep learning reconstruction method for sparsely spaced multidetector helical CT


Title	A cascaded dual-domain deep learning reconstruction method for sparsely spaced multidetector helical CT
Authors	Ao Zheng, Hewei Gao, Li Zhang, Yuxiang Xing
Abstract	Helical CT has been widely used in clinical diagnosis. A sparsely spaced multidetector in the z direction can increase the coverage of the detector given a limited number of detector rows. It can speed up volumetric CT scans, lower the radiation dose and reduce motion artifacts. However, it leads to insufficient data for reconstruction. That means reconstructions from general analytical methods will have severe artifacts. Iterative reconstruction methods might be able to deal with this situation but at the cost of a huge computational load. In this work, we propose a cascaded dual-domain deep learning method that completes both data transformation in projection domain and error reduction in image domain. First, a convolutional neural network (CNN) in projection domain is constructed to estimate missing helical projection data and convert helical projection data to 2D fan-beam projection data. This step is to suppress helical artifacts and reduce the following computational cost. Then, an analytical linear operator follows to transfer the data from projection domain to image domain. Finally, an image domain CNN is added to improve image quality further. These three steps work as a whole and can be trained end to end. The overall network is trained using a simulated lung CT dataset with Poisson noise from 25 patients. We evaluate the trained network on another three patients and obtain very encouraging results with both visual examination and quantitative comparison. The resulting RRMSE is 6.56% and the SSIM is 99.60%. In addition, we test the trained network on the lung CT dataset with different noise levels and a new dental CT dataset to demonstrate the generalization and robustness of our method.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.03746v2
PDF	https://arxiv.org/pdf/1910.03746v2.pdf
PWC	https://paperswithcode.com/paper/a-cascaded-dual-domain-deep-learning
Repo
Framework

Lexicographically Ordered Multi-Objective Clustering


Title	Lexicographically Ordered Multi-Objective Clustering
Authors	Sainyam Galhotra, Sandhya Saisubramanian, Shlomo Zilberstein
Abstract	We introduce a rich model for multi-objective clustering with lexicographic ordering over objectives and a slack. The slack denotes the allowed multiplicative deviation from the optimal objective value of the higher priority objective to facilitate improvement in lower-priority objectives. We then propose an algorithm called Zeus to solve this class of problems, which is characterized by a makeshift function. The makeshift fine tunes the clusters formed by the processed objectives so as to improve the clustering with respect to the unprocessed objectives, given the slack. We present makeshift for solving three different classes of objectives and analyze their solution guarantees. Finally, we empirically demonstrate the effectiveness of our approach on three applications using real-world data.
Tasks
Published	2019-03-02
URL	http://arxiv.org/abs/1903.00750v1
PDF	http://arxiv.org/pdf/1903.00750v1.pdf
PWC	https://paperswithcode.com/paper/lexicographically-ordered-multi-objective
Repo
Framework
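The role of the slack in the abstract above can be illustrated on a toy selection problem. This is not the Zeus algorithm or its makeshift function; it only shows lexicographic choice among already-computed candidate clusterings, assuming both objectives are minimized and that a slack of `s` allows the primary objective to reach `(1 + s)` times its optimum. The candidate values are made up.

```python
def lexicographic_pick(candidates, slack):
    """candidates: (primary_cost, secondary_cost) pairs, both minimized.
    Keep those within the slack of the best primary cost, then take the
    one with the lowest secondary cost (ties broken by primary cost)."""
    best_primary = min(c[0] for c in candidates)
    feasible = [c for c in candidates if c[0] <= (1.0 + slack) * best_primary]
    return min(feasible, key=lambda c: (c[1], c[0]))

candidates = [(10.0, 9.0), (10.5, 4.0), (12.0, 1.0)]  # hypothetical clusterings
strict = lexicographic_pick(candidates, slack=0.0)
relaxed = lexicographic_pick(candidates, slack=0.1)
```

With no slack the primary objective alone decides; a 10% slack lets a slightly worse primary cost buy a much better secondary cost, while the third candidate stays out of reach.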