Paper Group AWR 282
Deep Multi-Agent Reinforcement Learning with Relevance Graphs
Title | Deep Multi-Agent Reinforcement Learning with Relevance Graphs |
Authors | Aleksandra Malysheva, Tegg Taekyong Sung, Chae-Bong Sohn, Daniel Kudenko, Aleksei Shpilman |
Abstract | Over recent years, deep reinforcement learning has shown strong successes in complex single-agent tasks, and more recently this approach has also been applied to multi-agent domains. In this paper, we propose a novel approach, called MAGnet, to multi-agent reinforcement learning (MARL) that utilizes a relevance graph representation of the environment obtained by a self-attention mechanism, and a message-generation technique inspired by the NerveNet architecture. We applied our MAGnet approach to the Pommerman game and the results show that it significantly outperforms state-of-the-art MARL solutions, including DQN, MADDPG, and MCTS. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12557v1 |
PDF | http://arxiv.org/pdf/1811.12557v1.pdf |
PWC | https://paperswithcode.com/paper/deep-multi-agent-reinforcement-learning-with |
Repo | https://github.com/tegg89/DLCamp_Jeju2018 |
Framework | tf |
Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions
Title | Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions |
Authors | Zheng Qin, Zhaoning Zhang, Dongsheng Li, Yiming Zhang, Yuxing Peng |
Abstract | Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity. To address this problem, in this paper we present an efficient method (called diagonalwise refactorization) for accelerating the training of depthwise convolution layers. Our key idea is to rearrange the weight vectors of a depthwise convolution into a large diagonal weight matrix so as to convert the depthwise convolution into one single standard convolution, which is well supported by the cuDNN library that is highly-optimized for GPU computations. We have implemented our training method in five popular deep learning frameworks. Evaluation results show that our proposed method gains $15.4\times$ training speedup on Darknet, $8.4\times$ on Caffe, $5.4\times$ on PyTorch, $3.5\times$ on MXNet, and $1.4\times$ on TensorFlow, compared to their original implementations of depthwise convolutions. |
Tasks | |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.09926v1 |
PDF | http://arxiv.org/pdf/1803.09926v1.pdf |
PWC | https://paperswithcode.com/paper/diagonalwise-refactorization-an-efficient |
Repo | https://github.com/clavichord93/diagonalwise-refactorization-tensorflow |
Framework | tf |
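The key idea in the abstract above, placing the per-channel filters on the diagonal of a larger weight tensor so that one standard convolution reproduces the depthwise result, can be checked on a toy case. The following is a minimal 1-D sketch in pure Python, not the authors' implementation; the function names are hypothetical, and the real method targets 2-D convolutions backed by cuDNN.

```python
def depthwise_conv1d(x, w):
    """Depthwise (per-channel) valid cross-correlation.

    x: list of C channels, each a list of L samples.
    w: list of C filters, each a list of K taps.
    """
    C, L, K = len(x), len(x[0]), len(w[0])
    return [
        [sum(w[c][k] * x[c][i + k] for k in range(K)) for i in range(L - K + 1)]
        for c in range(C)
    ]


def standard_conv1d(x, W):
    """Standard valid cross-correlation; W has shape C_out x C_in x K."""
    C_in, L, K = len(x), len(x[0]), len(W[0][0])
    return [
        [
            sum(W[o][c][k] * x[c][i + k] for c in range(C_in) for k in range(K))
            for i in range(L - K + 1)
        ]
        for o in range(len(W))
    ]


def diagonalize(w):
    """Place filter c at position (c, c) of a C x C x K tensor; zeros elsewhere."""
    C, K = len(w), len(w[0])
    return [
        [list(w[o]) if o == c else [0.0] * K for c in range(C)]
        for o in range(C)
    ]


x = [[1, 2, 3, 4], [0, 1, 0, 1]]   # 2 channels, 4 samples each
w = [[1, -1], [2, 3]]              # one 2-tap filter per channel
assert depthwise_conv1d(x, w) == standard_conv1d(x, diagonalize(w))
```

The off-diagonal zeros mean the standard convolution does redundant arithmetic; the abstract's claim is that the better-optimized standard kernel more than compensates for this on GPUs.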
A quantum-inspired classical algorithm for recommendation systems
Title | A quantum-inspired classical algorithm for recommendation systems |
Authors | Ewin Tang |
Abstract | We give a classical analogue to Kerenidis and Prakash’s quantum recommendation system, previously believed to be one of the strongest candidates for provably exponential speedups in quantum machine learning. Our main result is an algorithm that, given an $m \times n$ matrix in a data structure supporting certain $\ell^2$-norm sampling operations, outputs an $\ell^2$-norm sample from a rank-$k$ approximation of that matrix in time $O(\text{poly}(k)\log(mn))$, only polynomially slower than the quantum algorithm. As a consequence, Kerenidis and Prakash’s algorithm does not in fact give an exponential speedup over classical algorithms. Further, under strong input assumptions, the classical recommendation system resulting from our algorithm produces recommendations exponentially faster than previous classical systems, which run in time linear in $m$ and $n$. The main insight of this work is the use of simple routines to manipulate $\ell^2$-norm sampling distributions, which play the role of quantum superpositions in the classical setting. This correspondence indicates a potentially fruitful framework for formally comparing quantum machine learning algorithms to classical machine learning algorithms. |
Tasks | Quantum Machine Learning, Recommendation Systems |
Published | 2018-07-10 |
URL | https://arxiv.org/abs/1807.04271v3 |
PDF | https://arxiv.org/pdf/1807.04271v3.pdf |
PWC | https://paperswithcode.com/paper/a-quantum-inspired-classical-algorithm-for |
Repo | https://github.com/nkmjm/qiML |
Framework | none |
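For readers unfamiliar with the $\ell^2$-norm sampling mentioned in the abstract above: an $\ell^2$-norm sample from a vector $v$ returns index $i$ with probability $v_i^2 / \|v\|^2$. The sketch below only illustrates that definition in pure Python; the function names are hypothetical, and it does not reproduce the paper's data structure or its sampling routines.

```python
import random


def l2_norm_probabilities(v):
    """Return p with p[i] = v[i]^2 / ||v||^2."""
    total = sum(x * x for x in v)
    if total == 0:
        raise ValueError("cannot sample from the zero vector")
    return [x * x / total for x in v]


def l2_norm_sample(v, rng, k=1):
    """Draw k indices of v, each with probability proportional to v[i]^2."""
    probs = l2_norm_probabilities(v)
    return rng.choices(range(len(v)), weights=probs, k=k)


v = [3.0, -4.0, 0.0]
print(l2_norm_probabilities(v))            # entries 9/25, 16/25 and 0
print(l2_norm_sample(v, random.Random(0), k=5))
```

The sign of an entry does not matter, and a zero entry is never drawn, which mirrors how measuring a quantum state yields a basis index with probability equal to its squared amplitude.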
Deep Learning for Generic Object Detection: A Survey
Title | Deep Learning for Generic Object Detection: A Survey |
Authors | Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, Xinwang Liu, Matti Pietikäinen |
Abstract | Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research. |
Tasks | Object Detection, Object Proposal Generation |
Published | 2018-09-06 |
URL | https://arxiv.org/abs/1809.02165v4 |
PDF | https://arxiv.org/pdf/1809.02165v4.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-generic-object-detection-a |
Repo | https://github.com/TuoniTuoni/causal-inference |
Framework | tf |
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks
Title | Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks |
Authors | Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, Martin Grohe |
Abstract | In recent years, graph neural networks (GNNs) have emerged as a powerful neural architecture to learn vector representations of nodes and graphs in a supervised, end-to-end fashion. Up to now, GNNs have only been evaluated empirically—showing promising results. The following work investigates GNNs from a theoretical point of view and relates them to the $1$-dimensional Weisfeiler-Leman graph isomorphism heuristic ($1$-WL). We show that GNNs have the same expressiveness as the $1$-WL in terms of distinguishing non-isomorphic (sub-)graphs. Hence, both algorithms also have the same shortcomings. Based on this, we propose a generalization of GNNs, so-called $k$-dimensional GNNs ($k$-GNNs), which can take higher-order graph structures at multiple scales into account. These higher-order structures play an essential role in the characterization of social networks and molecule graphs. Our experimental evaluation confirms our theoretical findings as well as confirms that higher-order information is useful in the task of graph classification and regression. |
Tasks | Graph Classification |
Published | 2018-10-04 |
URL | https://arxiv.org/abs/1810.02244v3 |
PDF | https://arxiv.org/pdf/1810.02244v3.pdf |
PWC | https://paperswithcode.com/paper/weisfeiler-and-leman-go-neural-higher-order |
Repo | https://github.com/toshi-k/kaggle-champs-scalar-coupling |
Framework | none |
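The shortcoming of $1$-WL that the abstract above refers to can be seen on a standard textbook pair: a 6-cycle and two disjoint triangles are not isomorphic, yet colour refinement assigns them identical colour histograms. The sketch below is a plain-Python illustration of $1$-WL colour refinement with hypothetical names; it is not code from the paper.

```python
from collections import Counter


def wl_color_histogram(adj, rounds=3):
    """Run 1-WL colour refinement and return the multiset of final colours.

    adj maps each node to a list of its neighbours. Colours are kept as
    nested tuples so that histograms of different graphs are comparable.
    """
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # New colour = (own colour, sorted multiset of neighbour colours).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

# Both graphs are 2-regular on six nodes, so 1-WL cannot separate them.
assert wl_color_histogram(hexagon) == wl_color_histogram(two_triangles)
# Graphs with different degree sequences are separated after one round.
assert wl_color_histogram(path3) != wl_color_histogram(triangle)
```

Nested-tuple colours grow quickly with the number of rounds, so this form suits only tiny graphs; practical implementations compress each signature to a fresh integer label per round.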
ShelfNet for Fast Semantic Segmentation
Title | ShelfNet for Fast Semantic Segmentation |
Authors | Juntang Zhuang, Junlin Yang, Lin Gu, Nicha Dvornek |
Abstract | In this paper, we present ShelfNet, a novel architecture for accurate fast semantic segmentation. Different from the single encoder-decoder structure, ShelfNet has multiple encoder-decoder branch pairs with skip connections at each spatial level, which looks like a shelf with multiple columns. The shelf-shaped structure can be viewed as an ensemble of multiple deep and shallow paths, thus improving accuracy. We significantly reduce the computation burden by reducing the number of channels, while still achieving high accuracy with this unique structure. In addition, we propose a shared-weight strategy in the residual block which reduces the number of parameters without sacrificing performance. Compared with popular non-real-time methods such as PSPNet, our ShelfNet achieves 4$\times$ faster inference speed with similar accuracy on the PASCAL VOC dataset. Compared with real-time segmentation models such as BiSeNet, our model achieves higher accuracy at comparable speed on the Cityscapes dataset, enabling its application in speed-demanding tasks such as street-scene understanding for autonomous driving. Furthermore, our ShelfNet achieves 79.0% mIoU on the Cityscapes dataset with a ResNet34 backbone, outperforming PSPNet and BiSeNet with large backbones such as ResNet101. Through extensive experiments, we validated the superior performance of ShelfNet. The implementation is available at https://github.com/juntang-zhuang/ShelfNet-lw-cityscapes. |
Tasks | Autonomous Driving, Real-Time Semantic Segmentation, Scene Understanding, Semantic Segmentation |
Published | 2018-11-27 |
URL | https://arxiv.org/abs/1811.11254v6 |
PDF | https://arxiv.org/pdf/1811.11254v6.pdf |
PWC | https://paperswithcode.com/paper/multi-path-segmentation-network |
Repo | https://github.com/juntang-zhuang/ShelfNet-lw-cityscapes |
Framework | pytorch |
Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures
Title | Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures |
Authors | Martin Mundt, Sagnik Majumder, Tobias Weis, Visvanathan Ramesh |
Abstract | We characterize convolutional neural networks with respect to the relative amount of features per layer. Using a skew normal distribution as a parametrized framework, we investigate the common assumption of monotonically increasing feature counts with higher layers of architecture designs. Our evaluation on models with VGG-type layers on the MNIST, Fashion-MNIST and CIFAR-10 image classification benchmarks provides evidence that motivates rethinking of our common assumption: architectures that favor larger early layers seem to yield better accuracy. |
Tasks | Image Classification |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.05836v1 |
PDF | http://arxiv.org/pdf/1812.05836v1.pdf |
PWC | https://paperswithcode.com/paper/rethinking-layer-wise-feature-amounts-in |
Repo | https://github.com/MrtnMndt/Rethinking_CNN_Layerwise_Feature_Amounts |
Framework | pytorch |
Context Encoding for Semantic Segmentation
Title | Context Encoding for Semantic Segmentation |
Authors | Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal |
Abstract | Recent work has made significant progress in improving spatial resolution for pixelwise labeling with the Fully Convolutional Network (FCN) framework by employing Dilated/Atrous convolution, utilizing multi-scale features and refining boundaries. In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures the semantic context of scenes and selectively highlights class-dependent featuremaps. The proposed Context Encoding Module significantly improves semantic segmentation results with only marginal extra computation cost over FCN. Our approach has achieved new state-of-the-art results of 51.7% mIoU on PASCAL-Context and 85.9% mIoU on PASCAL VOC 2012. Our single model achieves a final score of 0.5567 on the ADE20K test set, which surpasses the winning entry of the COCO-Place Challenge in 2017. In addition, we also explore how the Context Encoding Module can improve the feature representation of relatively shallow networks for image classification on the CIFAR-10 dataset. Our 14-layer network has achieved an error rate of 3.45%, which is comparable with state-of-the-art approaches with over 10 times more layers. The source code for the complete system is publicly available. |
Tasks | Image Classification, Semantic Segmentation |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08904v1 |
PDF | http://arxiv.org/pdf/1803.08904v1.pdf |
PWC | https://paperswithcode.com/paper/context-encoding-for-semantic-segmentation |
Repo | https://github.com/kmaninis/pytorch-encoding |
Framework | pytorch |
μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching
Title | μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching |
Authors | Yosuke Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka |
Abstract | NVIDIA cuDNN is a low-level library that provides GPU kernels frequently used in deep learning. Specifically, cuDNN implements several equivalent convolution algorithms, whose performance and memory footprint may vary considerably, depending on the layer dimensions. When an algorithm is automatically selected by cuDNN, the decision is performed on a per-layer basis, and thus it often resorts to slower algorithms that fit the workspace size constraints. We present μ-cuDNN, a transparent wrapper library for cuDNN, which divides layers’ mini-batch computation into several micro-batches. Based on Dynamic Programming and Integer Linear Programming, μ-cuDNN enables faster algorithms by decreasing the workspace requirements. At the same time, μ-cuDNN keeps the computational semantics unchanged, so that it decouples statistical efficiency from the hardware efficiency safely. We demonstrate the effectiveness of μ-cuDNN over two frameworks, Caffe and TensorFlow, achieving speedups of 1.63x for AlexNet and 1.21x for ResNet-18 on P100-SXM2 GPU. These results indicate that using micro-batches can seamlessly increase the performance of deep learning, while maintaining the same memory footprint. |
Tasks | |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.04806v1 |
PDF | http://arxiv.org/pdf/1804.04806v1.pdf |
PWC | https://paperswithcode.com/paper/-cudnn-accelerating-deep-learning-frameworks |
Repo | https://github.com/spcl/ucudnn |
Framework | tf |
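The trade-off described in the abstract above, where a tight workspace limit rules out the fastest algorithm for the full mini-batch but not for smaller micro-batches, can be illustrated with a small dynamic program. This is a sketch in the spirit of the idea only: the function name, the cost table, and its numbers are all made up for illustration, and the formulation is an assumption rather than the one used by the library.

```python
def plan_micro_batches(batch_size, options, workspace_limit):
    """Split batch_size into micro-batches minimising total estimated time.

    options maps a micro-batch size to a list of (time, workspace) pairs,
    one per hypothetical convolution algorithm. Returns (total_time, plan),
    or None if no split fits the workspace limit.
    """
    inf = float("inf")
    # Fastest algorithm per micro-batch size that fits in the workspace.
    cheapest = {
        b: min((t for t, ws in algos if ws <= workspace_limit), default=inf)
        for b, algos in options.items()
    }
    best = [0.0] + [inf] * batch_size   # best[n]: min time to process n samples
    choice = [0] * (batch_size + 1)     # last micro-batch size used for best[n]
    for n in range(1, batch_size + 1):
        for b, t in cheapest.items():
            if b <= n and best[n - b] + t < best[n]:
                best[n] = best[n - b] + t
                choice[n] = b
    if best[batch_size] == inf:
        return None
    plan, n = [], batch_size
    while n > 0:
        plan.append(choice[n])
        n -= choice[n]
    return best[batch_size], plan


# Made-up (time, workspace) figures for a slow and a fast algorithm.
options = {
    8: [(10.0, 100), (4.0, 400)],
    4: [(5.5, 50), (2.5, 200)],
    2: [(3.0, 25), (1.5, 100)],
}
print(plan_micro_batches(8, options, workspace_limit=200))  # (5.0, [4, 4])
print(plan_micro_batches(8, options, workspace_limit=400))  # (4.0, [8])
```

With the smaller limit the fast algorithm no longer fits at batch size 8, and two micro-batches of 4 beat the slow full-batch algorithm; with the larger limit the undivided batch wins.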
Understanding and Accelerating Particle-Based Variational Inference
Title | Understanding and Accelerating Particle-Based Variational Inference |
Authors | Chang Liu, Jingwei Zhuo, Pengyu Cheng, Ruiyi Zhang, Jun Zhu, Lawrence Carin |
Abstract | Particle-based variational inference methods (ParVIs) have gained attention in the Bayesian inference literature, for their capacity to yield flexible and accurate approximations. We explore ParVIs from the perspective of Wasserstein gradient flows, and make both theoretical and practical contributions. We unify various finite-particle approximations that existing ParVIs use, and recognize that the approximation is essentially a compulsory smoothing treatment, in either of two equivalent forms. This novel understanding reveals the assumptions and relations of existing ParVIs, and also inspires new ParVIs. We propose an acceleration framework and a principled bandwidth-selection method for general ParVIs; these are based on the developed theory and leverage the geometry of the Wasserstein space. Experimental results show the improved convergence by the acceleration framework and enhanced sample accuracy by the bandwidth-selection method. |
Tasks | Bayesian Inference |
Published | 2018-07-04 |
URL | https://arxiv.org/abs/1807.01750v4 |
PDF | https://arxiv.org/pdf/1807.01750v4.pdf |
PWC | https://paperswithcode.com/paper/accelerated-first-order-methods-on-the |
Repo | https://github.com/chang-ml-thu/AWGF |
Framework | tf |
SketchyScene: Richly-Annotated Scene Sketches
Title | SketchyScene: Richly-Annotated Scene Sketches |
Authors | Changqing Zou, Qian Yu, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen, Hao Zhang |
Abstract | We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level. The dataset is created through a novel and carefully designed crowdsourcing pipeline, enabling users to efficiently generate large quantities of realistic and diverse scene sketches. SketchyScene contains more than 29,000 scene-level sketches, 7,000+ pairs of scene templates and photos, and 11,000+ object sketches. All objects in the scene sketches have ground-truth semantic and instance masks. The dataset is also highly scalable and extensible, easily allowing augmenting and/or changing scene composition. We demonstrate the potential impact of SketchyScene by training new computational models for semantic segmentation of scene sketches and showing how the new dataset enables several applications including image retrieval, sketch colorization, editing, and captioning, etc. The dataset and code can be found at https://github.com/SketchyScene/SketchyScene. |
Tasks | Colorization, Image Retrieval, Semantic Segmentation |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02473v1 |
PDF | http://arxiv.org/pdf/1808.02473v1.pdf |
PWC | https://paperswithcode.com/paper/sketchyscene-richly-annotated-scene-sketches |
Repo | https://github.com/SketchyScene/SketchyScene |
Framework | tf |
Disease Progression Timeline Estimation for Alzheimer’s Disease using Discriminative Event Based Modeling
Title | Disease Progression Timeline Estimation for Alzheimer’s Disease using Discriminative Event Based Modeling |
Authors | Vikram Venkatraghavan, Esther E. Bron, Wiro J. Niessen, Stefan Klein, for the Alzheimer’s Disease Neuroimaging Initiative |
Abstract | Alzheimer’s Disease (AD) is characterized by a cascade of biomarkers becoming abnormal, the pathophysiology of which is very complex and largely unknown. Event-based modeling (EBM) is a data-driven technique to estimate the sequence in which biomarkers for a disease become abnormal based on cross-sectional data. It can help in understanding the dynamics of disease progression and facilitate early diagnosis and prognosis. In this work we propose a novel discriminative approach to EBM, which is shown to be more accurate than existing state-of-the-art EBM methods. The method first estimates for each subject an approximate ordering of events. Subsequently, the central ordering over all subjects is estimated by fitting a generalized Mallows model to these approximate subject-specific orderings. We also introduce the concept of relative distance between events which helps in creating a disease progression timeline. Subsequently, we propose a method to stage subjects by placing them on the estimated disease progression timeline. We evaluated the proposed method on Alzheimer’s Disease Neuroimaging Initiative (ADNI) data and compared the results with existing state-of-the-art EBM methods. We also performed extensive experiments on synthetic data simulating the progression of Alzheimer’s disease. The event orderings obtained on ADNI data seem plausible and are in agreement with the current understanding of progression of AD. The proposed patient staging algorithm performed consistently better than that of state-of-the-art EBM methods. Event orderings obtained in simulation experiments were more accurate than those of other EBM methods and the estimated disease progression timeline was observed to correlate with the timeline of actual disease progression. The results of these experiments are encouraging and suggest that discriminative EBM is a promising approach to disease progression modeling. |
Tasks | |
Published | 2018-08-10 |
URL | http://arxiv.org/abs/1808.03604v1 |
PDF | http://arxiv.org/pdf/1808.03604v1.pdf |
PWC | https://paperswithcode.com/paper/disease-progression-timeline-estimation-for |
Repo | https://github.com/88vikram/pyebm |
Framework | none |
Reinforcement Learning for Improving Agent Design
Title | Reinforcement Learning for Improving Agent Design |
Authors | David Ha |
Abstract | In many reinforcement learning tasks, the goal is to learn a policy to manipulate an agent, whose design is fixed, to maximize some notion of cumulative reward. The design of the agent’s physical structure is rarely optimized for the task at hand. In this work, we explore the possibility of learning a version of the agent’s design that is better suited for its task, jointly with the policy. We propose an alteration to the popular OpenAI Gym framework, where we parameterize parts of an environment, and allow an agent to jointly learn to modify these environment parameters along with its policy. We demonstrate that an agent can learn a better structure of its body that is not only better suited for the task, but also facilitates policy learning. Joint learning of policy and structure may even uncover design principles that are useful for assisted-design applications. Videos of results at https://designrl.github.io/ |
Tasks | |
Published | 2018-10-09 |
URL | https://arxiv.org/abs/1810.03779v3 |
PDF | https://arxiv.org/pdf/1810.03779v3.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-improving-agent |
Repo | https://github.com/hardmaru/astool |
Framework | none |
Image Reconstruction with Predictive Filter Flow
Title | Image Reconstruction with Predictive Filter Flow |
Authors | Shu Kong, Charless Fowlkes |
Abstract | We propose a simple, interpretable framework for solving a wide range of image reconstruction problems such as denoising and deconvolution. Given a corrupted input image, the model synthesizes a spatially varying linear filter which, when applied to the input image, reconstructs the desired output. The model parameters are learned using supervised or self-supervised training. We test this model on three tasks: non-uniform motion blur removal, lossy-compression artifact reduction and single image super resolution. We demonstrate that our model substantially outperforms state-of-the-art methods on all these tasks and is significantly faster than optimization-based approaches to deconvolution. Unlike models that directly predict output pixel values, the predicted filter flow is controllable and interpretable, which we demonstrate by visualizing the space of predicted filters for different tasks. |
Tasks | Deblurring, Denoising, Image Reconstruction, Image Super-Resolution, Lossy-Compression Artifact Reduction, Super-Resolution |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11482v1 |
PDF | http://arxiv.org/pdf/1811.11482v1.pdf |
PWC | https://paperswithcode.com/paper/image-reconstruction-with-predictive-filter |
Repo | https://github.com/bestaar/predictiveFilterFlow |
Framework | none |
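The reconstruction step in the abstract above, applying a spatially varying linear filter to the input, amounts to using a different set of filter taps at every output position. The sketch below shows that step on a 1-D signal in pure Python; the hand-written filters stand in for what the model would predict, and the function name and zero-padding choice are assumptions made for illustration.

```python
def apply_filter_flow(signal, filters):
    """Apply a different linear filter at every position of a 1-D signal.

    filters[i] holds the K taps (K odd) that produce output sample i.
    Samples outside the signal are treated as zero.
    """
    radius = len(filters[0]) // 2
    n = len(signal)
    out = []
    for i, taps in enumerate(filters):
        acc = 0.0
        for k, weight in enumerate(taps):
            j = i + k - radius          # input index read by tap k
            if 0 <= j < n:
                acc += weight * signal[j]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0]
filters = [
    [0.0, 1.0, 0.0],   # copy the sample at the same position
    [1.0, 0.0, 0.0],   # copy the left neighbour
    [0.0, 0.0, 1.0],   # copy the right neighbour
    [0.5, 0.0, 0.5],   # average both neighbours (right one is padding)
]
print(apply_filter_flow(signal, filters))  # [1.0, 1.0, 4.0, 1.5]
```

Because each output sample is an explicit weighted sum of nearby inputs, the taps themselves can be inspected, which is the sense in which the abstract calls the predicted filter flow interpretable.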
Phrase-Based & Neural Unsupervised Machine Translation
Title | Phrase-Based & Neural Unsupervised Machine Translation |
Authors | Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc’Aurelio Ranzato |
Abstract | Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. This work investigates how to learn to translate when having access to only large monolingual corpora in each language. We propose two model variants, a neural and a phrase-based model. Both versions leverage a careful initialization of the parameters, the denoising effect of language models and automatic generation of parallel data by iterative back-translation. These models are significantly better than methods from the literature, while being simpler and having fewer hyper-parameters. On the widely used WMT’14 English-French and WMT’16 German-English benchmarks, our models respectively obtain 28.1 and 25.2 BLEU points without using a single parallel sentence, outperforming the state of the art by more than 11 BLEU points. On low-resource languages like English-Urdu and English-Romanian, our methods achieve even better results than semi-supervised and supervised approaches leveraging the paucity of available bitexts. Our code for NMT and PBSMT is publicly available. |
Tasks | Machine Translation, Unsupervised Machine Translation |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07755v2 |
PDF | http://arxiv.org/pdf/1804.07755v2.pdf |
PWC | https://paperswithcode.com/paper/phrase-based-neural-unsupervised-machine |
Repo | https://github.com/Helsinki-NLP/shared-info |
Framework | none |