October 20, 2019


Paper Group AWR 282


Deep Multi-Agent Reinforcement Learning with Relevance Graphs


Title	Deep Multi-Agent Reinforcement Learning with Relevance Graphs
Authors	Aleksandra Malysheva, Tegg Taekyong Sung, Chae-Bong Sohn, Daniel Kudenko, Aleksei Shpilman
Abstract	Over recent years, deep reinforcement learning has shown strong successes in complex single-agent tasks, and more recently this approach has also been applied to multi-agent domains. In this paper, we propose a novel approach, called MAGnet, to multi-agent reinforcement learning (MARL) that utilizes a relevance graph representation of the environment obtained by a self-attention mechanism, and a message-generation technique inspired by the NerveNet architecture. We applied our MAGnet approach to the Pommerman game and the results show that it significantly outperforms state-of-the-art MARL solutions, including DQN, MADDPG, and MCTS.
Tasks	Multi-agent Reinforcement Learning
Published	2018-11-30
URL	http://arxiv.org/abs/1811.12557v1
PDF	http://arxiv.org/pdf/1811.12557v1.pdf
PWC	https://paperswithcode.com/paper/deep-multi-agent-reinforcement-learning-with
Repo	https://github.com/tegg89/DLCamp_Jeju2018
Framework	tf

Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions


Title	Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions
Authors	Zheng Qin, Zhaoning Zhang, Dongsheng Li, Yiming Zhang, Yuxing Peng
Abstract	Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity. To address this problem, in this paper we present an efficient method (called diagonalwise refactorization) for accelerating the training of depthwise convolution layers. Our key idea is to rearrange the weight vectors of a depthwise convolution into a large diagonal weight matrix so as to convert the depthwise convolution into one single standard convolution, which is well supported by the cuDNN library that is highly-optimized for GPU computations. We have implemented our training method in five popular deep learning frameworks. Evaluation results show that our proposed method gains $15.4\times$ training speedup on Darknet, $8.4\times$ on Caffe, $5.4\times$ on PyTorch, $3.5\times$ on MXNet, and $1.4\times$ on TensorFlow, compared to their original implementations of depthwise convolutions.
Tasks
Published	2018-03-27
URL	http://arxiv.org/abs/1803.09926v1
PDF	http://arxiv.org/pdf/1803.09926v1.pdf
PWC	https://paperswithcode.com/paper/diagonalwise-refactorization-an-efficient
Repo	https://github.com/clavichord93/diagonalwise-refactorization-tensorflow
Framework	tf
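
The key idea in the abstract, placing each depthwise filter on the diagonal of a larger weight tensor so that one standard convolution reproduces the depthwise result, can be checked on a toy input. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the helper names (`conv2d`, `depthwise_conv2d`, `diagonalize`) are hypothetical, padding and stride are ignored, and no claim is made about speed.

```python
def conv2d(x, w):
    """Standard 'valid' convolution (cross-correlation).
    x: [C_in][H][W], w: [C_out][C_in][k][k] -> [C_out][H-k+1][W-k+1]."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    out = []
    for filt in w:
        plane = []
        for i in range(h - k + 1):
            row = []
            for j in range(wd - k + 1):
                s = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            s += filt[c][di][dj] * x[c][i + di][j + dj]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out


def depthwise_conv2d(x, dw):
    """Depthwise convolution: channel c is filtered only by dw[c] ([k][k])."""
    return [conv2d([x[c]], [[dw[c]]])[0] for c in range(len(x))]


def diagonalize(dw):
    """Put filter c at position (c, c) of a C x C grid of k x k filters; zeros elsewhere."""
    c, k = len(dw), len(dw[0])
    zero = [[0.0] * k for _ in range(k)]
    return [[dw[o] if o == i else zero for i in range(c)] for o in range(c)]


# Toy data (made up for illustration): 2 channels, 3x3 input, 2x2 filters.
x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
     [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]
dw = [[[1.0, 0.0], [0.0, -1.0]],
      [[2.0, 1.0], [1.0, 2.0]]]
same = conv2d(x, diagonalize(dw)) == depthwise_conv2d(x, dw)
```

The off-diagonal zero filters contribute nothing to each output channel, which is why the two results agree; the extra multiplications by zero are the price paid for using a single dense convolution call.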

A quantum-inspired classical algorithm for recommendation systems


Title	A quantum-inspired classical algorithm for recommendation systems
Authors	Ewin Tang
Abstract	We give a classical analogue to Kerenidis and Prakash’s quantum recommendation system, previously believed to be one of the strongest candidates for provably exponential speedups in quantum machine learning. Our main result is an algorithm that, given an $m \times n$ matrix in a data structure supporting certain $\ell^2$-norm sampling operations, outputs an $\ell^2$-norm sample from a rank-$k$ approximation of that matrix in time $O(\text{poly}(k)\log(mn))$, only polynomially slower than the quantum algorithm. As a consequence, Kerenidis and Prakash’s algorithm does not in fact give an exponential speedup over classical algorithms. Further, under strong input assumptions, the classical recommendation system resulting from our algorithm produces recommendations exponentially faster than previous classical systems, which run in time linear in $m$ and $n$. The main insight of this work is the use of simple routines to manipulate $\ell^2$-norm sampling distributions, which play the role of quantum superpositions in the classical setting. This correspondence indicates a potentially fruitful framework for formally comparing quantum machine learning algorithms to classical machine learning algorithms.
Tasks	Quantum Machine Learning, Recommendation Systems
Published	2018-07-10
URL	https://arxiv.org/abs/1807.04271v3
PDF	https://arxiv.org/pdf/1807.04271v3.pdf
PWC	https://paperswithcode.com/paper/a-quantum-inspired-classical-algorithm-for
Repo	https://github.com/nkmjm/qiML
Framework	none
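
The abstract relies on $\ell^2$-norm sampling: drawing an index with probability proportional to the squared magnitude of the corresponding entry. The sketch below illustrates only that sampling primitive. It is not the paper's algorithm, and it recomputes norms in linear time, whereas the paper assumes a data structure that supports these operations efficiently. The function names are hypothetical.

```python
import random


def l2_sample_index(v, rng):
    """Return index i with probability v[i]**2 / ||v||**2."""
    weights = [a * a for a in v]
    return rng.choices(range(len(v)), weights=weights, k=1)[0]


def l2_sample_entry(A, rng):
    """Sample (i, j) with probability A[i][j]**2 / ||A||_F**2:
    first pick a row by its squared norm, then a column within that row."""
    row_norms_sq = [sum(a * a for a in row) for row in A]
    i = rng.choices(range(len(A)), weights=row_norms_sq, k=1)[0]
    j = l2_sample_index(A[i], rng)
    return i, j


# Toy matrix (made up): entry (1, 1) carries 16/25 of the squared Frobenius norm.
rng = random.Random(0)
A = [[3.0, 0.0],
     [0.0, 4.0]]
draws = [l2_sample_entry(A, rng) for _ in range(1000)]
```

Sampling the row first and the column second yields the joint distribution over entries because the row probability times the within-row probability equals $A_{ij}^2 / \|A\|_F^2$.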

Deep Learning for Generic Object Detection: A Survey


Title	Deep Learning for Generic Object Detection: A Survey
Authors	Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, Matti Pietikäinen
Abstract	Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.
Tasks	Object Detection, Object Proposal Generation
Published	2018-09-06
URL	https://arxiv.org/abs/1809.02165v4
PDF	https://arxiv.org/pdf/1809.02165v4.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-generic-object-detection-a
Repo	https://github.com/TuoniTuoni/causal-inference
Framework	tf

Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks


Title	Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks
Authors	Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, Martin Grohe
Abstract	In recent years, graph neural networks (GNNs) have emerged as a powerful neural architecture to learn vector representations of nodes and graphs in a supervised, end-to-end fashion. Up to now, GNNs have only been evaluated empirically—showing promising results. The following work investigates GNNs from a theoretical point of view and relates them to the $1$-dimensional Weisfeiler-Leman graph isomorphism heuristic ($1$-WL). We show that GNNs have the same expressiveness as the $1$-WL in terms of distinguishing non-isomorphic (sub-)graphs. Hence, both algorithms also have the same shortcomings. Based on this, we propose a generalization of GNNs, so-called $k$-dimensional GNNs ($k$-GNNs), which can take higher-order graph structures at multiple scales into account. These higher-order structures play an essential role in the characterization of social networks and molecule graphs. Our experimental evaluation confirms our theoretical findings as well as confirms that higher-order information is useful in the task of graph classification and regression.
Tasks	Graph Classification
Published	2018-10-04
URL	https://arxiv.org/abs/1810.02244v3
PDF	https://arxiv.org/pdf/1810.02244v3.pdf
PWC	https://paperswithcode.com/paper/weisfeiler-and-leman-go-neural-higher-order
Repo	https://github.com/toshi-k/kaggle-champs-scalar-coupling
Framework	none
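
The $1$-WL heuristic mentioned in the abstract is colour refinement: every node repeatedly replaces its colour with a label built from its own colour and the multiset of its neighbours' colours. A minimal sketch follows, assuming unlabelled graphs given as adjacency dictionaries; the function name and the shared relabelling table are illustrative choices, not taken from the paper.

```python
from collections import Counter


def wl_histogram(adj, rounds, table):
    """Run `rounds` iterations of 1-WL colour refinement.
    adj: dict node -> list of neighbours.
    table: dict shared across graphs so equal signatures get equal colour ids.
    Returns a Counter of final colours (the graph's colour histogram)."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        colors = {
            v: table.setdefault(sig, len(table) + 1)
            for v, sig in signatures.items()
        }
    return Counter(colors.values())


# A 6-cycle and two disjoint triangles: non-isomorphic, but both 2-regular.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
table = {}
indistinguishable = (
    wl_histogram(hexagon, 3, table) == wl_histogram(two_triangles, 3, table)
)
```

Every node in both graphs sees the same neighbourhood colours in every round, so the histograms coincide. This is the kind of shortcoming that, per the abstract, GNNs share with $1$-WL and that motivates the higher-order $k$-GNNs.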

ShelfNet for Fast Semantic Segmentation


Title	ShelfNet for Fast Semantic Segmentation
Authors	Juntang Zhuang, Junlin Yang, Lin Gu, Nicha Dvornek
Abstract	In this paper, we present ShelfNet, a novel architecture for accurate fast semantic segmentation. Different from the single encoder-decoder structure, ShelfNet has multiple encoder-decoder branch pairs with skip connections at each spatial level, which looks like a shelf with multiple columns. The shelf-shaped structure can be viewed as an ensemble of multiple deep and shallow paths, thus improving accuracy. We significantly reduce computation burden by reducing channel number, at the same time achieving high accuracy with this unique structure. In addition, we propose a shared-weight strategy in the residual block which reduces parameter number without sacrificing performance. Compared with popular non real-time methods such as PSPNet, our ShelfNet achieves 4$\times$ faster inference speed with similar accuracy on PASCAL VOC dataset. Compared with real-time segmentation models such as BiSeNet, our model achieves higher accuracy at comparable speed on the Cityscapes Dataset, enabling the application in speed-demanding tasks such as street-scene understanding for autonomous driving. Furthermore, our ShelfNet achieves 79.0% mIoU on Cityscapes Dataset with ResNet34 backbone, outperforming PSPNet and BiSeNet with large backbones such as ResNet101. Through extensive experiments, we validated the superior performance of ShelfNet. We provide link to the implementation https://github.com/juntang-zhuang/ShelfNet-lw-cityscapes.
Tasks	Autonomous Driving, Real-Time Semantic Segmentation, Scene Understanding, Semantic Segmentation
Published	2018-11-27
URL	https://arxiv.org/abs/1811.11254v6
PDF	https://arxiv.org/pdf/1811.11254v6.pdf
PWC	https://paperswithcode.com/paper/multi-path-segmentation-network
Repo	https://github.com/juntang-zhuang/ShelfNet-lw-cityscapes
Framework	pytorch

Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures


Title	Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures
Authors	Martin Mundt, Sagnik Majumder, Tobias Weis, Visvanathan Ramesh
Abstract	We characterize convolutional neural networks with respect to the relative amount of features per layer. Using a skew normal distribution as a parametrized framework, we investigate the common assumption of monotonically increasing feature counts with higher layers of architecture designs. Our evaluation on models with VGG-type layers on the MNIST, Fashion-MNIST and CIFAR-10 image classification benchmarks provides evidence that motivates rethinking of our common assumption: architectures that favor larger early layers seem to yield better accuracy.
Tasks	Image Classification
Published	2018-12-14
URL	http://arxiv.org/abs/1812.05836v1
PDF	http://arxiv.org/pdf/1812.05836v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-layer-wise-feature-amounts-in
Repo	https://github.com/MrtnMndt/Rethinking_CNN_Layerwise_Feature_Amounts
Framework	pytorch

Context Encoding for Semantic Segmentation


Title	Context Encoding for Semantic Segmentation
Authors	Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal
Abstract	Recent work has made significant progress in improving spatial resolution for pixelwise labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous convolution, utilizing multi-scale features and refining boundaries. In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures the semantic context of scenes and selectively highlights class-dependent featuremaps. The proposed Context Encoding Module significantly improves semantic segmentation results with only marginal extra computation cost over FCN. Our approach has achieved new state-of-the-art results of 51.7% mIoU on PASCAL-Context and 85.9% mIoU on PASCAL VOC 2012. Our single model achieves a final score of 0.5567 on ADE20K test set, which surpasses the winning entry of COCO-Place Challenge in 2017. In addition, we also explore how the Context Encoding Module can improve the feature representation of relatively shallow networks for the image classification on CIFAR-10 dataset. Our 14 layer network has achieved an error rate of 3.45%, which is comparable with state-of-the-art approaches with over 10 times more layers. The source code for the complete system is publicly available.
Tasks	Image Classification, Semantic Segmentation
Published	2018-03-23
URL	http://arxiv.org/abs/1803.08904v1
PDF	http://arxiv.org/pdf/1803.08904v1.pdf
PWC	https://paperswithcode.com/paper/context-encoding-for-semantic-segmentation
Repo	https://github.com/kmaninis/pytorch-encoding
Framework	pytorch

μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching


Title	μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching
Authors	Yosuke Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka
Abstract	NVIDIA cuDNN is a low-level library that provides GPU kernels frequently used in deep learning. Specifically, cuDNN implements several equivalent convolution algorithms, whose performance and memory footprint may vary considerably, depending on the layer dimensions. When an algorithm is automatically selected by cuDNN, the decision is performed on a per-layer basis, and thus it often resorts to slower algorithms that fit the workspace size constraints. We present μ-cuDNN, a transparent wrapper library for cuDNN, which divides layers’ mini-batch computation into several micro-batches. Based on Dynamic Programming and Integer Linear Programming, μ-cuDNN enables faster algorithms by decreasing the workspace requirements. At the same time, μ-cuDNN keeps the computational semantics unchanged, so that it decouples statistical efficiency from the hardware efficiency safely. We demonstrate the effectiveness of μ-cuDNN over two frameworks, Caffe and TensorFlow, achieving speedups of 1.63x for AlexNet and 1.21x for ResNet-18 on P100-SXM2 GPU. These results indicate that using micro-batches can seamlessly increase the performance of deep learning, while maintaining the same memory footprint.
Tasks
Published	2018-04-13
URL	http://arxiv.org/abs/1804.04806v1
PDF	http://arxiv.org/pdf/1804.04806v1.pdf
PWC	https://paperswithcode.com/paper/-cudnn-accelerating-deep-learning-frameworks
Repo	https://github.com/spcl/ucudnn
Framework	tf
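
The abstract describes splitting a mini-batch into micro-batches so that faster kernels fit within the workspace limit. One way to picture the dynamic-programming part is as a search for the cheapest way to cover the mini-batch with micro-batch sizes, given a time estimate per size. The sketch below is an illustration under that assumption only: the cost table is made up, `best_split` is a hypothetical name, and the real library benchmarks cuDNN kernels rather than reading a fixed table.

```python
def best_split(batch_size, time_for):
    """time_for: dict micro_batch_size -> hypothetical time of the fastest
    kernel that fits in the workspace at that size.
    Returns (total_time, list_of_micro_batch_sizes)."""
    inf = float("inf")
    best = [0.0] + [inf] * batch_size      # best[n] = min time to process n samples
    choice = [0] * (batch_size + 1)        # last micro-batch size used for n samples
    for n in range(1, batch_size + 1):
        for b, t in time_for.items():
            if b <= n and best[n - b] + t < best[n]:
                best[n] = best[n - b] + t
                choice[n] = b
    sizes = []
    n = batch_size
    while n > 0 and choice[n]:
        sizes.append(choice[n])
        n -= choice[n]
    return best[batch_size], sizes


# Made-up timings: at size 8 only a slow low-workspace kernel fits (6.0),
# while at size 4 a faster kernel fits (2.5 per micro-batch).
timings = {1: 1.0, 4: 2.5, 8: 6.0}
total, sizes = best_split(8, timings)
```

With these numbers two micro-batches of four samples beat a single pass over all eight. Because each sample is still processed once by an equivalent convolution, the layer's output is unchanged.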

Understanding and Accelerating Particle-Based Variational Inference


Title	Understanding and Accelerating Particle-Based Variational Inference
Authors	Chang Liu, Jingwei Zhuo, Pengyu Cheng, Ruiyi Zhang, Jun Zhu, Lawrence Carin
Abstract	Particle-based variational inference methods (ParVIs) have gained attention in the Bayesian inference literature, for their capacity to yield flexible and accurate approximations. We explore ParVIs from the perspective of Wasserstein gradient flows, and make both theoretical and practical contributions. We unify various finite-particle approximations that existing ParVIs use, and recognize that the approximation is essentially a compulsory smoothing treatment, in either of two equivalent forms. This novel understanding reveals the assumptions and relations of existing ParVIs, and also inspires new ParVIs. We propose an acceleration framework and a principled bandwidth-selection method for general ParVIs; these are based on the developed theory and leverage the geometry of the Wasserstein space. Experimental results show the improved convergence by the acceleration framework and enhanced sample accuracy by the bandwidth-selection method.
Tasks	Bayesian Inference
Published	2018-07-04
URL	https://arxiv.org/abs/1807.01750v4
PDF	https://arxiv.org/pdf/1807.01750v4.pdf
PWC	https://paperswithcode.com/paper/accelerated-first-order-methods-on-the
Repo	https://github.com/chang-ml-thu/AWGF
Framework	tf
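
For readers unfamiliar with particle-based variational inference, the sketch below shows one well-known existing ParVI, Stein variational gradient descent (SVGD), on a one-dimensional standard normal target. It is not the acceleration framework or the bandwidth-selection method proposed in the paper. The fixed bandwidth `h`, the step size, and the function names are assumptions chosen for illustration.

```python
import math


def svgd_step(xs, grad_log_p, step=0.1, h=1.0):
    """One SVGD update for 1-D particles with kernel k(a, b) = exp(-(a - b)**2 / h)."""
    n = len(xs)
    new_xs = []
    for xi in xs:
        phi = 0.0
        for xj in xs:
            k = math.exp(-((xj - xi) ** 2) / h)
            drive = k * grad_log_p(xj)           # pulls particles toward high density
            repulse = 2.0 * (xi - xj) / h * k    # d k(xj, xi) / d xj, keeps particles apart
            phi += drive + repulse
        new_xs.append(xi + step * phi / n)
    return new_xs


def run_svgd(xs, grad_log_p, iters=500):
    """Apply svgd_step repeatedly and return the final particle positions."""
    for _ in range(iters):
        xs = svgd_step(xs, grad_log_p)
    return xs


# Target N(0, 1): grad log p(x) = -x. Particles start far from the mode.
particles = run_svgd([5.0 + 0.1 * i for i in range(10)], lambda x: -x)
```

The kernel smoothing in the update is the kind of finite-particle approximation the abstract refers to, and the bandwidth `h` controls how strongly particles repel one another.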

SketchyScene: Richly-Annotated Scene Sketches


Title	SketchyScene: Richly-Annotated Scene Sketches
Authors	Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang
Abstract	We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level. The dataset is created through a novel and carefully designed crowdsourcing pipeline, enabling users to efficiently generate large quantities of realistic and diverse scene sketches. SketchyScene contains more than 29,000 scene-level sketches, 7,000+ pairs of scene templates and photos, and 11,000+ object sketches. All objects in the scene sketches have ground-truth semantic and instance masks. The dataset is also highly scalable and extensible, easily allowing augmenting and/or changing scene composition. We demonstrate the potential impact of SketchyScene by training new computational models for semantic segmentation of scene sketches and showing how the new dataset enables several applications including image retrieval, sketch colorization, editing, and captioning, etc. The dataset and code can be found at https://github.com/SketchyScene/SketchyScene.
Tasks	Colorization, Image Retrieval, Semantic Segmentation
Published	2018-08-07
URL	http://arxiv.org/abs/1808.02473v1
PDF	http://arxiv.org/pdf/1808.02473v1.pdf
PWC	https://paperswithcode.com/paper/sketchyscene-richly-annotated-scene-sketches
Repo	https://github.com/SketchyScene/SketchyScene
Framework	tf

Disease Progression Timeline Estimation for Alzheimer’s Disease using Discriminative Event Based Modeling


Title	Disease Progression Timeline Estimation for Alzheimer’s Disease using Discriminative Event Based Modeling
Authors	Vikram Venkatraghavan, Esther E. Bron, Wiro J. Niessen, Stefan Klein, for the Alzheimer’s Disease Neuroimaging Initiative
Abstract	Alzheimer’s Disease (AD) is characterized by a cascade of biomarkers becoming abnormal, the pathophysiology of which is very complex and largely unknown. Event-based modeling (EBM) is a data-driven technique to estimate the sequence in which biomarkers for a disease become abnormal based on cross-sectional data. It can help in understanding the dynamics of disease progression and facilitate early diagnosis and prognosis. In this work we propose a novel discriminative approach to EBM, which is shown to be more accurate than existing state-of-the-art EBM methods. The method first estimates for each subject an approximate ordering of events. Subsequently, the central ordering over all subjects is estimated by fitting a generalized Mallows model to these approximate subject-specific orderings. We also introduce the concept of relative distance between events which helps in creating a disease progression timeline. Subsequently, we propose a method to stage subjects by placing them on the estimated disease progression timeline. We evaluated the proposed method on Alzheimer’s Disease Neuroimaging Initiative (ADNI) data and compared the results with existing state-of-the-art EBM methods. We also performed extensive experiments on synthetic data simulating the progression of Alzheimer’s disease. The event orderings obtained on ADNI data seem plausible and are in agreement with the current understanding of progression of AD. The proposed patient staging algorithm performed consistently better than that of state-of-the-art EBM methods. Event orderings obtained in simulation experiments were more accurate than those of other EBM methods and the estimated disease progression timeline was observed to correlate with the timeline of actual disease progression. The results of these experiments are encouraging and suggest that discriminative EBM is a promising approach to disease progression modeling.
Tasks
Published	2018-08-10
URL	http://arxiv.org/abs/1808.03604v1
PDF	http://arxiv.org/pdf/1808.03604v1.pdf
PWC	https://paperswithcode.com/paper/disease-progression-timeline-estimation-for
Repo	https://github.com/88vikram/pyebm
Framework	none
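
The abstract estimates a central ordering from subject-specific orderings by fitting a generalized Mallows model. As a simplified illustration, the sketch below uses the plain Mallows setting, where the maximum-likelihood central ordering is the one minimising the total Kendall tau distance to the observed orderings. It is a brute-force toy, not the paper's method or the `pyebm` API; the function names and the biomarker labels are hypothetical.

```python
from itertools import combinations, permutations


def kendall_tau(a, b):
    """Number of item pairs that orderings a and b rank in opposite order."""
    pos = {item: i for i, item in enumerate(b)}
    return sum(1 for x, y in combinations(a, 2) if pos[x] > pos[y])


def central_ordering(orderings):
    """Ordering minimising the summed Kendall tau distance (brute force,
    feasible only for a handful of events)."""
    items = sorted(orderings[0])
    return min(
        permutations(items),
        key=lambda p: sum(kendall_tau(p, o) for o in orderings),
    )


# Made-up subject-specific event orderings over three hypothetical biomarkers.
subject_orderings = [
    ["amyloid", "tau", "atrophy"],
    ["amyloid", "tau", "atrophy"],
    ["tau", "amyloid", "atrophy"],
]
consensus = central_ordering(subject_orderings)
```

The generalized model in the paper additionally allows a separate spread parameter per position, which this plain-distance sketch does not capture.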

Reinforcement Learning for Improving Agent Design


Title	Reinforcement Learning for Improving Agent Design
Authors	David Ha
Abstract	In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. The design of the agent’s physical structure is rarely optimized for the task at hand. In this work, we explore the possibility of learning a version of the agent’s design that is better suited for its task, jointly with the policy. We propose an alteration to the popular OpenAI Gym framework, where we parameterize parts of an environment, and allow an agent to jointly learn to modify these environment parameters along with its policy. We demonstrate that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning. Joint learning of policy and structure may even uncover design principles that are useful for assisted-design applications. Videos of results at https://designrl.github.io/
Tasks
Published	2018-10-09
URL	https://arxiv.org/abs/1810.03779v3
PDF	https://arxiv.org/pdf/1810.03779v3.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-for-improving-agent
Repo	https://github.com/hardmaru/astool
Framework	none

Image Reconstruction with Predictive Filter Flow


Title	Image Reconstruction with Predictive Filter Flow
Authors	Shu Kong, Charless Fowlkes
Abstract	We propose a simple, interpretable framework for solving a wide range of image reconstruction problems such as denoising and deconvolution. Given a corrupted input image, the model synthesizes a spatially varying linear filter which, when applied to the input image, reconstructs the desired output. The model parameters are learned using supervised or self-supervised training. We test this model on three tasks: non-uniform motion blur removal, lossy-compression artifact reduction and single image super resolution. We demonstrate that our model substantially outperforms state-of-the-art methods on all these tasks and is significantly faster than optimization-based approaches to deconvolution. Unlike models that directly predict output pixel values, the predicted filter flow is controllable and interpretable, which we demonstrate by visualizing the space of predicted filters for different tasks.
Tasks	Deblurring, Denoising, Image Reconstruction, Image Super-Resolution, Lossy-Compression Artifact Reduction, Super-Resolution
Published	2018-11-28
URL	http://arxiv.org/abs/1811.11482v1
PDF	http://arxiv.org/pdf/1811.11482v1.pdf
PWC	https://paperswithcode.com/paper/image-reconstruction-with-predictive-filter
Repo	https://github.com/bestaar/predictiveFilterFlow
Framework	none
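
The abstract's central operation is applying a spatially varying linear filter to the input image: each output pixel has its own small kernel. The sketch below shows only that application step with hand-written kernels. In the paper the kernels are predicted by a learned model, which is not reproduced here. The function name and the zero-padding choice are assumptions.

```python
def apply_filter_flow(image, filters):
    """image: [H][W]; filters: [H][W][k][k], one k x k kernel per output pixel (k odd).
    Pixels outside the image are treated as zero."""
    h, w = len(image), len(image[0])
    k = len(filters[0][0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < h and 0 <= x < w:
                        s += filters[i][j][di][dj] * image[y][x]
            out[i][j] = s
    return out


# Two hand-written kernels: keep the pixel, or copy the left neighbour.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
from_left = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
# First column keeps its value; the other pixels copy their left neighbour.
filters = [[identity if j == 0 else from_left for j in range(3)] for _ in range(2)]
shifted = apply_filter_flow(image, filters)
```

Because each kernel is an explicit set of weights over neighbouring input pixels, it can be inspected directly, which is the sense in which the abstract calls the predicted filter flow interpretable.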

Phrase-Based & Neural Unsupervised Machine Translation


Title	Phrase-Based & Neural Unsupervised Machine Translation
Authors	Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc’Aurelio Ranzato
Abstract	Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. This work investigates how to learn to translate when having access to only large monolingual corpora in each language. We propose two model variants, a neural and a phrase-based model. Both versions leverage a careful initialization of the parameters, the denoising effect of language models and automatic generation of parallel data by iterative back-translation. These models are significantly better than methods from the literature, while being simpler and having fewer hyper-parameters. On the widely used WMT’14 English-French and WMT’16 German-English benchmarks, our models respectively obtain 28.1 and 25.2 BLEU points without using a single parallel sentence, outperforming the state of the art by more than 11 BLEU points. On low-resource languages like English-Urdu and English-Romanian, our methods achieve even better results than semi-supervised and supervised approaches leveraging the paucity of available bitexts. Our code for NMT and PBSMT is publicly available.
Tasks	Machine Translation, Unsupervised Machine Translation
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07755v2
PDF	http://arxiv.org/pdf/1804.07755v2.pdf
PWC	https://paperswithcode.com/paper/phrase-based-neural-unsupervised-machine
Repo	https://github.com/Helsinki-NLP/shared-info
Framework	none