October 18, 2019


Paper Group ANR 579


SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection


Title	SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
Authors	Yonghyun Kim, Bong-Nam Kang, Daijin Kim
Abstract	Most recent successful methods for accurate object detection build on convolutional neural networks (CNNs). However, due to the lack of scale normalization in CNN-based detection methods, the activated channels in the feature space can be completely different depending on the scale, and this difference makes it hard for the classifier to learn samples. We propose a Scale Aware Network (SAN) that maps the convolutional features from different scales onto a scale-invariant subspace to make CNN-based detection methods more robust to scale variation, and we also construct a unique learning method that considers purely the relationship between channels, without the spatial information, for efficient learning of SAN. To show the validity of our method, we visualize how convolutional features change according to the scale through a channel activation matrix and experimentally show that SAN reduces the feature differences in the scale space. We evaluate our method on the PASCAL VOC and MS COCO datasets. We demonstrate SAN by conducting several experiments on structures and parameters. The proposed SAN can be generally applied to many CNN-based detection methods to enhance the detection accuracy with a slight increase in the computing time.
Tasks	Object Detection
Published	2018-08-15
URL	http://arxiv.org/abs/1808.04974v1
PDF	http://arxiv.org/pdf/1808.04974v1.pdf
PWC	https://paperswithcode.com/paper/san-learning-relationship-between
Repo
Framework

Contingency-Aware Exploration in Reinforcement Learning


Title	Contingency-Aware Exploration in Reinforcement Learning
Authors	Jongwook Choi, Yijie Guo, Marcin Moczulski, Junhyuk Oh, Neal Wu, Mohammad Norouzi, Honglak Lee
Abstract	This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning. To investigate this question, we consider an instantiation of this hypothesis evaluated on the Arcade Learning Environment (ALE). In this study, we develop an attentive dynamics model (ADM) that discovers controllable elements of the observations, which are often associated with the location of the character in Atari games. The ADM is trained in a self-supervised fashion to predict the actions taken by the agent. The learned contingency information is used as a part of the state representation for exploration purposes. We demonstrate that combining an actor-critic algorithm with count-based exploration using our representation achieves impressive results on a set of Atari games that are notoriously challenging due to sparse rewards. For example, we report a state-of-the-art score of >11,000 points on Montezuma’s Revenge without using expert demonstrations, explicit high-level information (e.g., RAM states), or supervisory data. Our experiments confirm that contingency-awareness is indeed an extremely powerful concept for tackling exploration problems in reinforcement learning and opens up interesting research questions for further investigations.
Tasks	Atari Games, Montezuma’s Revenge
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01483v3
PDF	http://arxiv.org/pdf/1811.01483v3.pdf
PWC	https://paperswithcode.com/paper/contingency-aware-exploration-in
Repo
Framework
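
The count-based exploration mentioned in the abstract can be illustrated with a minimal sketch. It assumes a discretized state key (here a hypothetical `(room, x, y)` tuple standing in for the learned contingency information) and the common bonus form `beta / sqrt(N(s))`. The class name, `beta`, and the key layout are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import defaultdict


class CountBonus:
    """Count-based exploration bonus beta / sqrt(N(s)) (illustrative sketch)."""

    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)  # visit count per discretized state key

    def bonus(self, state_key):
        # Increment the visit count first, then compute the bonus from it.
        self.counts[state_key] += 1
        return self.beta / math.sqrt(self.counts[state_key])


explorer = CountBonus(beta=1.0)
first = explorer.bonus(("room0", 3, 5))      # first visit to this state
for _ in range(3):
    last = explorer.bonus(("room0", 3, 5))   # repeated visits shrink the bonus
novel = explorer.bonus(("room1", 0, 0))      # an unseen state gets the full bonus
```

In an actor-critic setup, a bonus like this would typically be added to the environment reward before computing returns, so rarely visited states look more attractive to the agent.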

Skin Lesion Analysis Towards Melanoma Detection via End-to-end Deep Learning of Convolutional Neural Networks


Title	Skin Lesion Analysis Towards Melanoma Detection via End-to-end Deep Learning of Convolutional Neural Networks
Authors	Katherine M. Li, Evelyn C. Li
Abstract	This article presents the design, experiments and results of our solution submitted to the 2018 ISIC challenge: Skin Lesion Analysis Towards Melanoma Detection. We design a pipeline using state-of-the-art Convolutional Neural Network (CNN) models for a Lesion Boundary Segmentation task and a Lesion Diagnosis task.
Tasks
Published	2018-07-22
URL	http://arxiv.org/abs/1807.08332v1
PDF	http://arxiv.org/pdf/1807.08332v1.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-analysis-towards-melanoma-1
Repo
Framework

Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent’s Demonstration


Title	Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent’s Demonstration
Authors	Zhaodong Wang, Matthew E. Taylor
Abstract	Reinforcement learning has enjoyed multiple successes in recent years. However, these successes typically require very large amounts of data before an agent achieves acceptable performance. This paper introduces a novel way of combating such requirements by leveraging existing (human or agent) knowledge. In particular, this paper uses demonstrations from agents and humans, allowing an untrained agent to quickly achieve high performance. We empirically compare with, and highlight the weakness of, HAT and CHAT, methods of transferring knowledge from a source agent/human to a target agent. This paper introduces an effective transfer approach, DRoP, combining the offline knowledge (demonstrations recorded before learning) with online confidence-based performance analysis. DRoP dynamically involves the demonstrator’s knowledge, integrating it into the reinforcement learning agent’s online learning loop to achieve efficient and robust learning.
Tasks
Published	2018-05-11
URL	http://arxiv.org/abs/1805.04493v1
PDF	http://arxiv.org/pdf/1805.04493v1.pdf
PWC	https://paperswithcode.com/paper/interactive-reinforcement-learning-with
Repo
Framework

Networks for Nonlinear Diffusion Problems in Imaging


Title	Networks for Nonlinear Diffusion Problems in Imaging
Authors	Simon Arridge, Andreas Hauptmann
Abstract	A multitude of imaging and vision tasks have recently seen a major transformation by deep learning methods and in particular by the application of convolutional neural networks. These methods achieve impressive results, even for applications where it is not apparent that convolutions are suited to capture the underlying physics. In this work we develop a network architecture based on nonlinear diffusion processes, named DiffNet. By design, we obtain a nonlinear network architecture that is well suited for diffusion related problems in imaging. Furthermore, the performed updates are explicit, by which we obtain better interpretability and generalisability compared to classical convolutional neural network architectures. The performance of DiffNet is tested on the inverse problem of nonlinear diffusion with the Perona-Malik filter on the STL-10 image dataset. We obtain results competitive with the established U-Net architecture, with a fraction of the parameters and necessary training data.
Tasks
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12084v1
PDF	http://arxiv.org/pdf/1811.12084v1.pdf
PWC	https://paperswithcode.com/paper/networks-for-nonlinear-diffusion-problems-in
Repo
Framework
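
To make the "explicit update" idea concrete, here is a minimal sketch of one explicit step of classical Perona-Malik diffusion on a 1D signal. This is the hand-crafted filter the abstract refers to, not DiffNet itself; the function name, the step size `dt`, the contrast parameter `kappa`, and the reflecting boundary handling are assumptions chosen for illustration.

```python
def perona_malik_step(u, dt=0.2, kappa=0.1):
    """One explicit step of 1D Perona-Malik diffusion with reflecting boundaries."""

    def g(d):
        # Edge-stopping diffusivity: near 1 for small differences, near 0 at edges.
        return 1.0 / (1.0 + (d / kappa) ** 2)

    n = len(u)
    out = list(u)
    for i in range(n):
        right = u[min(i + 1, n - 1)] - u[i]  # forward difference (0 at the boundary)
        left = u[i] - u[max(i - 1, 0)]       # backward difference (0 at the boundary)
        out[i] = u[i] + dt * (g(right) * right - g(left) * left)
    return out


# A noisy step edge: small wiggles diffuse, the large jump is largely preserved.
signal = [0.0, 0.02, -0.01, 0.01, 1.0, 0.99, 1.02, 1.0]
smoothed = signal
for _ in range(20):
    smoothed = perona_malik_step(smoothed)
```

Because the update is written in flux form, the sum of the signal is conserved, and with `dt * 2 <= 1` each new value is a convex combination of its neighbours, so no new extrema appear.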

ASVRG: Accelerated Proximal SVRG


Title	ASVRG: Accelerated Proximal SVRG
Authors	Fanhua Shang, Licheng Jiao, Kaiwen Zhou, James Cheng, Yan Ren, Yufei Jin
Abstract	This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick. Unlike most existing accelerated stochastic variance reduction methods such as Katyusha, ASVRG has only one additional variable and one momentum parameter. Thus, ASVRG is much simpler than those methods, and has much lower per-iteration complexity. We prove that ASVRG achieves the best known oracle complexities for both strongly convex and non-strongly convex objectives. In addition, we extend ASVRG to mini-batch and non-smooth settings. We also empirically verify our theoretical results and show that the performance of ASVRG is comparable with, and sometimes even better than, that of state-of-the-art stochastic methods.
Tasks
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03105v2
PDF	http://arxiv.org/pdf/1810.03105v2.pdf
PWC	https://paperswithcode.com/paper/asvrg-accelerated-proximal-svrg
Repo
Framework
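
For orientation, the following is a minimal sketch of plain proximal SVRG, the non-accelerated baseline that ASVRG builds on. It deliberately omits the paper's momentum step, which the abstract does not specify. The scalar lasso-style objective, the function names, and all parameter values are assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Proximal operator of t * |x| (soft-thresholding)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def prox_svrg(a, b, lam=0.1, step=0.01, epochs=30, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a_i * x - b_i)^2 + lam * |x| over scalar x."""
    rng = random.Random(seed)
    n = len(a)

    def grad_i(x, i):
        # Gradient of the i-th smooth component.
        return a[i] * (a[i] * x - b[i])

    x_snap = 0.0
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = sum(grad_i(x_snap, i) for i in range(n)) / n
        x = x_snap
        for _ in range(2 * n):
            i = rng.randrange(n)
            # Variance-reduced stochastic gradient estimate.
            v = grad_i(x, i) - grad_i(x_snap, i) + full_grad
            # Gradient step followed by the proximal step for the L1 term.
            x = soft_threshold(x - step * v, step * lam)
        x_snap = x
    return x_snap


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.5]
x_hat = prox_svrg(a, b)
```

For this scalar problem the minimizer has a closed form, `soft_threshold(mean(a*b), lam) / mean(a*a)`, which makes it easy to check that the iterates converge.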

Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database


Title	Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database
Authors	Qijie Zhao, Feng Ni, Yang Song, Yongtao Wang, Zhi Tang
Abstract	Digital signs (such as barcodes or QR codes) are widely used in our daily life, and for many applications, we need to localize them in images. However, difficult cases such as targets with small scales, half-occlusion, shape deformation and large illumination changes pose challenges for conventional methods. In this paper, we address this problem by producing a large-scale dataset and adopting a deep learning based semantic segmentation approach. Specifically, a synthesizing method is proposed to generate well-annotated images containing barcode and QR code labels, which greatly reduces the annotation time. Through the synthesis strategy, we introduce Barcode-30k, a dataset that contains 30000 images with barcodes and QR codes. Moreover, we further propose BarcodeNet, a dual pyramid structure based segmentation network, which is mainly formed with two novel modules, the Prior Pyramid Pooling Module (P3M) and the Pyramid Refine Module (PRM). We validate the effectiveness of BarcodeNet on the proposed synthetic dataset, where it yields an mIoU of 95.36% on the validation set. Additional segmentation results on real images show that accurate segmentation performance is achieved.
Tasks	Semantic Segmentation
Published	2018-07-31
URL	http://arxiv.org/abs/1807.11886v1
PDF	http://arxiv.org/pdf/1807.11886v1.pdf
PWC	https://paperswithcode.com/paper/deep-dual-pyramid-network-for-barcode
Repo
Framework
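
The mIoU figure quoted in the abstract is a mean intersection-over-union across classes. A minimal sketch of that metric on flattened label masks follows; the class ids (0 = background, 1 = barcode, 2 = QR code) and the choice to skip classes absent from both masks are assumptions, and the paper's exact evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither mask
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened 2x4 masks with hypothetical label ids.
target = [0, 1, 1, 0, 2, 2, 0, 0]
pred = [0, 1, 0, 0, 2, 2, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 3/5, 1/2 and 2/3, so `score` is their average.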

A convex formulation for high-dimensional sparse sliced inverse regression


Title	A convex formulation for high-dimensional sparse sliced inverse regression
Authors	Kean Ming Tan, Zhaoran Wang, Tong Zhang, Han Liu, R. Dennis Cook
Abstract	Sliced inverse regression is a popular tool for sufficient dimension reduction, which replaces covariates with a minimal set of their linear combinations without loss of information on the conditional distribution of the response given the covariates. The estimated linear combinations include all covariates, making results difficult to interpret and perhaps unnecessarily variable, particularly when the number of covariates is large. In this paper, we propose a convex formulation for fitting sparse sliced inverse regression in high dimensions. Our proposal estimates the subspace of the linear combinations of the covariates directly and performs variable selection simultaneously. We solve the resulting convex optimization problem via the linearized alternating direction method of multipliers algorithm, and establish an upper bound on the subspace distance between the estimated and the true subspaces. Through numerical studies, we show that our proposal is able to identify the correct covariates in the high-dimensional setting.
Tasks	Dimensionality Reduction
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06024v1
PDF	http://arxiv.org/pdf/1809.06024v1.pdf
PWC	https://paperswithcode.com/paper/a-convex-formulation-for-high-dimensional
Repo
Framework

Liquid Pouring Monitoring via Rich Sensory Inputs


Title	Liquid Pouring Monitoring via Rich Sensory Inputs
Authors	Tz-Ying Wu, Juan-Ting Lin, Tsun-Hsuang Wang, Chan-Wei Hu, Juan Carlos Niebles, Min Sun
Abstract	Humans have the amazing ability to perform very subtle manipulation tasks using a closed-loop control system with imprecise mechanics (i.e., our body parts) but rich sensory information (e.g., vision, tactile, etc.). In the closed-loop system, the ability to monitor the state of the task via rich sensory information is important but often less studied. In this work, we take liquid pouring as a concrete example and aim at learning to continuously monitor whether liquid pouring is successful (e.g., no spilling) or not via rich sensory inputs. We mimic humans’ rich senses using synchronized observations from a chest-mounted camera and a wrist-mounted IMU sensor. Given many success and failure demonstrations of liquid pouring, we train a hierarchical LSTM with late fusion for monitoring. To improve the robustness of the system, we propose two auxiliary tasks during training: (1) inferring the initial state of containers and (2) forecasting the one-step future 3D trajectory of the hand with an adversarial training procedure. These tasks encourage our method to learn a representation sensitive to container states and how objects are manipulated in 3D. With these novel components, our method achieves ~8% and ~11% better monitoring accuracy than the baseline method without auxiliary tasks on unseen containers and unseen users respectively.
Tasks
Published	2018-08-06
URL	http://arxiv.org/abs/1808.01725v1
PDF	http://arxiv.org/pdf/1808.01725v1.pdf
PWC	https://paperswithcode.com/paper/liquid-pouring-monitoring-via-rich-sensory
Repo
Framework

Random Dictators with a Random Referee: Constant Sample Complexity Mechanisms for Social Choice


Title	Random Dictators with a Random Referee: Constant Sample Complexity Mechanisms for Social Choice
Authors	Brandon Fain, Ashish Goel, Kamesh Munagala, Nina Prabhu
Abstract	We study social choice mechanisms in an implicit utilitarian framework with a metric constraint, where the goal is to minimize Distortion, the worst case social cost of an ordinal mechanism relative to underlying cardinal utilities. We consider two additional desiderata: Constant sample complexity and Squared Distortion. Constant sample complexity means that the mechanism (potentially randomized) only uses a constant number of ordinal queries regardless of the number of voters and alternatives. Squared Distortion is a measure of variance of the Distortion of a randomized mechanism. Our primary contribution is the first social choice mechanism with constant sample complexity and constant Squared Distortion (which also implies constant Distortion). We call the mechanism Random Referee, because it uses a random agent to compare two alternatives that are the favorites of two other random agents. We prove that the use of a comparison query is necessary: no mechanism that only elicits the top-k preferred alternatives of voters (for constant k) can have Squared Distortion that is sublinear in the number of alternatives. We also prove that unlike any top-k only mechanism, the Distortion of Random Referee meaningfully improves on benign metric spaces, using the Euclidean plane as a canonical example. Finally, among top-1 only mechanisms, we introduce Random Oligarchy. The mechanism asks just 3 queries and is essentially optimal among the class of such mechanisms with respect to Distortion. In summary, we demonstrate the surprising power of constant sample complexity mechanisms generally, and just three random voters in particular, to provide some of the best known results in the implicit utilitarian framework.
Tasks
Published	2018-11-12
URL	http://arxiv.org/abs/1811.04786v2
PDF	http://arxiv.org/pdf/1811.04786v2.pdf
PWC	https://paperswithcode.com/paper/random-dictators-with-a-random-referee
Repo
Framework
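
The Random Referee mechanism described in the abstract is simple enough to sketch directly: two random voters each name their favorite, and a third random voter picks between the two. The sketch below represents each voter as a ranking list from most to least preferred. Sampling the three voters independently with replacement, and the function and variable names, are assumptions for illustration rather than the paper's exact specification.

```python
import random


def random_referee(rankings, rng=random):
    """Pick a winner: a random referee compares the favorites of two random voters."""
    # Assumption: the three voters are drawn independently, with replacement.
    voter_a, voter_b, referee = (rng.choice(rankings) for _ in range(3))
    cand_a, cand_b = voter_a[0], voter_b[0]  # one top-1 query per sampled voter
    # Comparison query: the referee returns whichever candidate it ranks higher.
    return cand_a if referee.index(cand_a) <= referee.index(cand_b) else cand_b


rankings = [
    ["x", "y", "z"],
    ["y", "x", "z"],
    ["y", "z", "x"],
]
winner = random_referee(rankings, random.Random(0))
```

Only three ordinal queries are issued regardless of how many voters or alternatives exist, which is the constant sample complexity the abstract highlights. An alternative that is nobody's favorite, such as `"z"` here, can never win.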

Scene-Aware Audio for 360° Videos


Title	Scene-Aware Audio for 360° Videos
Authors	Dingzeyu Li, Timothy R. Langlois, Changxi Zheng
Abstract	Although 360° cameras ease the capture of panoramic footage, it remains challenging to add realistic 360° audio that blends into the captured scene and is synchronized with the camera motion. We present a method for adding scene-aware spatial audio to 360° videos in typical indoor scenes, using only a conventional mono-channel microphone and a speaker. We observe that the late reverberation of a room’s impulse response is usually diffuse spatially and directionally. Exploiting this fact, we propose a method that synthesizes the directional impulse response between any source and listening locations by combining a synthesized early reverberation part and a measured late reverberation tail. The early reverberation is simulated using a geometric acoustic simulation and then enhanced using a frequency modulation method to capture room resonances. The late reverberation is extracted from a recorded impulse response, with a carefully chosen time duration that separates out the late reverberation from the early reverberation. In our validations, we show that our synthesized spatial audio matches closely with recordings using ambisonic microphones. Lastly, we demonstrate the strength of our method in several applications.
Tasks
Published	2018-05-12
URL	http://arxiv.org/abs/1805.04792v1
PDF	http://arxiv.org/pdf/1805.04792v1.pdf
PWC	https://paperswithcode.com/paper/scene-aware-audio-for-360textdegree-videos
Repo
Framework

Re-Weighted Learning for Sparsifying Deep Neural Networks


Title	Re-Weighted Learning for Sparsifying Deep Neural Networks
Authors	Igor Fedorov, Bhaskar D. Rao
Abstract	This paper addresses the topic of sparsifying deep neural networks (DNNs). While DNNs are powerful models that achieve state-of-the-art performance on a large number of tasks, the large number of model parameters poses serious storage and computational challenges. To combat these difficulties, a growing line of work focuses on pruning network weights without sacrificing performance. We propose a general affine scaling transformation (AST) algorithm to sparsify DNNs. Our approach follows in the footsteps of popular sparse recovery techniques, which have yet to be explored in the context of DNNs. We describe a principled framework for transforming densely connected DNNs into sparsely connected ones without sacrificing network performance. Unlike existing methods, our approach is able to learn sparse connections at each layer simultaneously, and achieves comparable pruning results on the architecture tested.
Tasks
Published	2018-02-05
URL	http://arxiv.org/abs/1802.01616v1
PDF	http://arxiv.org/pdf/1802.01616v1.pdf
PWC	https://paperswithcode.com/paper/re-weighted-learning-for-sparsifying-deep
Repo
Framework

Meta-modeling game for deriving theoretical-consistent, micro-structural-based traction-separation laws via deep reinforcement learning


Title	Meta-modeling game for deriving theoretical-consistent, micro-structural-based traction-separation laws via deep reinforcement learning
Authors	Kun Wang, WaiChing Sun
Abstract	This paper presents a new meta-modeling framework to employ deep reinforcement learning (DRL) to generate mechanical constitutive models for interfaces. The constitutive models are conceptualized as information flow in directed graphs. The process of writing constitutive models is simplified as a sequence of forming graph edges with the goal of maximizing the model score (a function of accuracy, robustness and forward prediction quality). Thus meta-modeling can be formulated as a Markov decision process with well-defined states, actions, rules, objective functions, and rewards. By using neural networks to estimate policies and state values, the computer agent is able to efficiently self-improve the constitutive model it generated through self-play, in the same way AlphaGo Zero (the algorithm that outplayed the world champion in the game of Go) improves its gameplay. Our numerical examples show that this automated meta-modeling framework not only produces models which outperform existing cohesive models on benchmark traction-separation data but is also capable of detecting hidden mechanisms among micro-structural features and incorporating them in constitutive models to improve the forward prediction accuracy, which are difficult tasks to do manually.
Tasks	Game of Go
Published	2018-10-24
URL	http://arxiv.org/abs/1810.10535v1
PDF	http://arxiv.org/pdf/1810.10535v1.pdf
PWC	https://paperswithcode.com/paper/meta-modeling-game-for-deriving-theoretical
Repo
Framework

Error-Robust Multi-View Clustering


Title	Error-Robust Multi-View Clustering
Authors	Mehrnaz Najafi, Lifang He, Philip S. Yu
Abstract	In the era of big data, data may come from multiple sources, known as multi-view data. Multi-view clustering aims at generating better clusters by exploiting complementary and consistent information from multiple views rather than relying on an individual view. Due to inevitable system errors caused by data-capturing sensors or other sources, the data in each view may be erroneous. Various types of errors behave differently and inconsistently in each view. More precisely, in practice errors can appear as noise and corruptions. Unfortunately, none of the existing multi-view clustering approaches handle all of these error types. Consequently, their clustering performance is dramatically degraded. In this paper, we propose a novel Markov chain method for Error-Robust Multi-View Clustering (EMVC). By decomposing each view into a shared transition probability matrix and error matrix and imposing structured sparsity-inducing norms on error matrices, we characterize and handle typical types of errors explicitly. To solve the challenging optimization problem, we propose a new efficient algorithm based on Augmented Lagrangian Multipliers and prove its convergence rigorously. Experimental results on various synthetic and real-world datasets show the superiority of the proposed EMVC method over the baseline methods and its robustness against different types of errors.
Tasks
Published	2018-01-01
URL	http://arxiv.org/abs/1801.00384v1
PDF	http://arxiv.org/pdf/1801.00384v1.pdf
PWC	https://paperswithcode.com/paper/error-robust-multi-view-clustering
Repo
Framework

Neural Machine Translation into Language Varieties


Title	Neural Machine Translation into Language Varieties
Authors	Surafel M. Lakew, Aliia Erofeeva, Marcello Federico
Abstract	Both research and commercial machine translation have so far neglected the importance of properly handling the spelling, lexical and grammar divergences occurring among language varieties. Notable cases are standard national varieties such as Brazilian and European Portuguese, and Canadian and European French, which popular online machine translation services are not keeping distinct. We show that an evident side effect of modeling such varieties as unique classes is the generation of inconsistent translations. In this work, we investigate the problem of training neural machine translation from English to specific pairs of language varieties, assuming both labeled and unlabeled parallel texts, and low-resource conditions. We report experiments from English to two pairs of dialects, European-Brazilian Portuguese and European-Canadian French, and two pairs of standardized varieties, Croatian-Serbian and Indonesian-Malay. We show significant BLEU score improvements over baseline systems when translation into similar languages is learned as a multilingual task with shared representations.
Tasks	Machine Translation
Published	2018-11-02
URL	http://arxiv.org/abs/1811.01064v1
PDF	http://arxiv.org/pdf/1811.01064v1.pdf
PWC	https://paperswithcode.com/paper/neural-machine-translation-into-language
Repo
Framework