Paper Group AWR 319
ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance. Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM. Learning Convolutional Transforms for Lossy Point Cloud Geometry Compression. Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inferen …
ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance
Title | ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance |
Authors | Rahul Krishna, Chong Tang, Kevin Sullivan, Baishakhi Ray |
Abstract | Configuration space complexity makes big-data software systems hard to configure well. Consider Hadoop: with over nine hundred parameters, developers often just use the default configurations provided with Hadoop distributions. The opportunity costs in lost performance are significant. Popular learning-based approaches to auto-tuning software do not scale well for big-data systems because of the high cost of collecting training data. We present a new method based on a combination of Evolutionary Markov Chain Monte Carlo (EMCMC) sampling and cost reduction techniques to cost-effectively find better-performing configurations for big data systems. For cost reduction, we developed and experimentally validated two approaches: using scaled-up big data jobs as proxies for the objective function for larger jobs, and using a dynamic job similarity measure to infer that results obtained for one kind of big data problem will work well for similar problems. Our experimental results suggest that our approach significantly and cost-effectively improves the performance of big data systems, outperforming competing approaches based on random sampling, basic genetic algorithms (GA), and predictive model learning. |
Tasks | Efficient Exploration |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.09644v1 |
https://arxiv.org/pdf/1910.09644v1.pdf | |
PWC | https://paperswithcode.com/paper/conex-efficient-exploration-of-big-data |
Repo | https://github.com/ARiSE-Lab/ConEX__Replication_Package |
Framework | none |
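
To make the sampling idea concrete, below is a minimal, self-contained sketch of an evolutionary MCMC search over a discrete configuration space. It is not the authors' implementation: the parameter names, the `measure_cost` objective, and the 50/50 mutation/crossover split are illustrative assumptions; in ConEx the objective would be the measured runtime of a proxy benchmark job.

```python
import math
import random

def emcmc_search(space, measure_cost, iters=200, pop_size=8, temperature=0.1):
    """Toy evolutionary-MCMC search over a discrete configuration space.
    `measure_cost(config)` is a hypothetical stand-in for running a proxy
    benchmark job and returning its cost (e.g. runtime)."""
    population = [{k: random.choice(v) for k, v in space.items()} for _ in range(pop_size)]
    costs = [measure_cost(c) for c in population]
    best = min(zip(costs, population), key=lambda t: t[0])
    for _ in range(iters):
        i = random.randrange(pop_size)
        proposal = dict(population[i])
        if random.random() < 0.5:                      # mutation move
            k = random.choice(list(space))
            proposal[k] = random.choice(space[k])
        else:                                          # crossover with another chain
            j = random.randrange(pop_size)
            for k in space:
                if random.random() < 0.5:
                    proposal[k] = population[j][k]
        cost = measure_cost(proposal)
        # Metropolis acceptance: always accept improvements, sometimes accept worse
        if cost < costs[i] or random.random() < math.exp((costs[i] - cost) / temperature):
            population[i], costs[i] = proposal, cost
            if cost < best[0]:
                best = (cost, proposal)
    return best

# usage with a made-up space and objective (parameter names are illustrative only)
space = {"mapreduce.task.io.sort.mb": [64, 128, 256], "dfs.blocksize.mb": [64, 128, 256]}
best_cost, best_config = emcmc_search(space, lambda c: sum(c.values()) / 100.0)
```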
Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM
Title | Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM |
Authors | Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang |
Abstract | Weight pruning and weight quantization are two important categories of DNN model compression. Prior work on these techniques is mainly based on heuristics. A recent work developed a systematic framework of DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results. In this work, we first extend this one-shot ADMM-based framework to guarantee solution feasibility and provide a fast convergence rate, and generalize it to weight quantization as well. We further develop a multi-step, progressive DNN weight pruning and quantization framework, with the dual benefits of (i) achieving further weight pruning/quantization thanks to the special property of ADMM regularization, and (ii) reducing the search space within each step. Extensive experimental results demonstrate superior performance compared with prior work. Some highlights: (i) we achieve 246x, 36x, and 8x weight pruning on LeNet-5, AlexNet, and ResNet-50 models, respectively, with (almost) zero accuracy loss; (ii) even a significant 61x weight pruning in AlexNet (ImageNet) results in only minor degradation in actual accuracy compared with prior work; (iii) we are among the first to derive notable weight pruning results for ResNet and MobileNet models; (iv) we derive the first lossless, fully binarized (for all layers) LeNet-5 for MNIST and VGG-16 for CIFAR-10; and (v) we derive the first fully binarized (for all layers) ResNet for ImageNet with reasonable accuracy loss. |
Tasks | Model Compression, Quantization |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.09769v2 |
http://arxiv.org/pdf/1903.09769v2.pdf | |
PWC | https://paperswithcode.com/paper/progressive-dnn-compression-a-key-to-achieve |
Repo | https://github.com/yeshaokai/Robustness-Aware-Pruning-ADMM |
Framework | pytorch |
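
The core of the ADMM-based pruning loop can be sketched in a few lines of PyTorch: augment the task loss with a quadratic term that pulls the weights toward their sparse projection Z, and periodically refresh Z and the dual variable U. This is a toy illustration on a linear model, not the authors' code; the sparsity level, update schedule, and rho value are arbitrary assumptions.

```python
import torch

def project_sparse(w, sparsity=0.9):
    """Euclidean projection onto the set of tensors with at most a
    (1 - sparsity) fraction of nonzeros: keep the largest-magnitude entries."""
    k = max(1, int(w.numel() * (1.0 - sparsity)))
    thresh = torch.topk(w.abs().flatten(), k).values.min()
    return w * (w.abs() >= thresh).float()

# one ADMM-regularized training phase (sketch on a toy regression problem)
model = torch.nn.Linear(100, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
rho = 1e-2
Z = {n: project_sparse(p.detach().clone()) for n, p in model.named_parameters()}
U = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

x, y = torch.randn(256, 100), torch.randn(256, 10)
for step in range(100):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # ADMM regularizer pulls W toward its sparse projection Z
    loss = loss + (rho / 2) * sum(((p - Z[n] + U[n]) ** 2).sum()
                                  for n, p in model.named_parameters())
    opt.zero_grad(); loss.backward(); opt.step()
    if (step + 1) % 20 == 0:                      # periodic Z- and U-updates
        for n, p in model.named_parameters():
            Z[n] = project_sparse(p.detach() + U[n])
            U[n] = U[n] + p.detach() - Z[n]

# final hard-pruning step: commit to the sparsity pattern found by ADMM
with torch.no_grad():
    for n, p in model.named_parameters():
        p.copy_(project_sparse(p))
```

A progressive scheme would repeat this phase several times, each time starting from the previous phase's pruned model and increasing the target sparsity.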
Learning Convolutional Transforms for Lossy Point Cloud Geometry Compression
Title | Learning Convolutional Transforms for Lossy Point Cloud Geometry Compression |
Authors | Maurice Quach, Giuseppe Valenzise, Frederic Dufaux |
Abstract | Efficient point cloud compression is fundamental to enable the deployment of virtual and mixed reality applications, since the number of points to code can range in the order of millions. In this paper, we present a novel data-driven geometry compression method for static point clouds based on learned convolutional transforms and uniform quantization. We perform joint optimization of both rate and distortion using a trade-off parameter. In addition, we cast the decoding process as a binary classification of the point cloud occupancy map. Our method outperforms the MPEG reference solution in terms of rate-distortion on the Microsoft Voxelized Upper Bodies dataset with 51.5% BDBR savings on average. Moreover, while octree-based methods face exponential diminution of the number of points at low bitrates, our method still produces high resolution outputs even at low bitrates. Code and supplementary material are available at https://github.com/mauriceqch/pcc_geo_cnn . |
Tasks | Quantization |
Published | 2019-03-20 |
URL | https://arxiv.org/abs/1903.08548v2 |
https://arxiv.org/pdf/1903.08548v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-convolutional-transforms-for-lossy |
Repo | https://github.com/mauriceqch/pcc_geo_cnn |
Framework | tf |
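
A rough PyTorch sketch of the pipeline described in the abstract follows: a 3D convolutional encoder over a voxelized occupancy grid, uniform-noise quantization during training (hard rounding at test time), and decoding cast as binary classification of occupancy. The released code uses TensorFlow, and the rate term below is only a placeholder; a learned entropy model would replace it.

```python
import torch
import torch.nn.functional as F

class PCGeoCodec(torch.nn.Module):
    """3D convolutional autoencoder over voxelized occupancy grids (sketch only)."""
    def __init__(self, filters=32):
        super().__init__()
        self.enc = torch.nn.Sequential(
            torch.nn.Conv3d(1, filters, 5, stride=2, padding=2), torch.nn.ReLU(),
            torch.nn.Conv3d(filters, filters, 5, stride=2, padding=2))
        self.dec = torch.nn.Sequential(
            torch.nn.ConvTranspose3d(filters, filters, 5, stride=2, padding=2, output_padding=1),
            torch.nn.ReLU(),
            torch.nn.ConvTranspose3d(filters, 1, 5, stride=2, padding=2, output_padding=1))

    def forward(self, occupancy, lmbda=0.01):      # occupancy: (B, 1, D, H, W), float in {0, 1}
        y = self.enc(occupancy)
        if self.training:                          # additive uniform noise approximates rounding
            y_hat = y + torch.empty_like(y).uniform_(-0.5, 0.5)
        else:
            y_hat = torch.round(y)
        logits = self.dec(y_hat)
        # decoding cast as binary classification of the occupancy map
        distortion = F.binary_cross_entropy_with_logits(logits, occupancy)
        rate_proxy = y_hat.abs().mean()            # placeholder for a learned entropy model
        return rate_proxy + lmbda * distortion, torch.sigmoid(logits)

codec = PCGeoCodec()
occ = (torch.rand(1, 1, 64, 64, 64) > 0.95).float()
loss, recon = codec(occ)
```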
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
Title | Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks |
Authors | Sambhav R. Jain, Albert Gural, Michael Wu, Chris H. Dick |
Abstract | We propose a method of training quantization thresholds (TQT) for uniform symmetric quantizers using standard backpropagation and gradient descent. Contrary to prior work, we show that a careful analysis of the straight-through estimator for threshold gradients allows for a natural range-precision trade-off leading to better optima. Our quantizers are constrained to use power-of-2 scale factors and per-tensor scaling of weights and activations to make them amenable to hardware implementations. We present analytical support for the general robustness of our methods and empirically validate them on various CNNs for ImageNet classification. We are able to achieve near-floating-point accuracy on traditionally difficult networks such as MobileNets with less than 5 epochs of quantized (8-bit) retraining. Finally, we present Graffitist, a framework that enables automatic quantization of TensorFlow graphs for TQT (available at https://github.com/Xilinx/graffitist ). |
Tasks | Quantization |
Published | 2019-03-19 |
URL | https://arxiv.org/abs/1903.08066v3 |
https://arxiv.org/pdf/1903.08066v3.pdf | |
PWC | https://paperswithcode.com/paper/trained-uniform-quantization-for-accurate-and |
Repo | https://github.com/Xilinx/graffitist |
Framework | tf |
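
The following is a minimal sketch of a TQT-style fake-quantizer: a trained log2 threshold, a power-of-2 per-tensor scale, and straight-through estimators (STE) for both the rounding and the threshold's ceiling. The exact gradient expressions and clipping convention in the paper and in Graffitist may differ; this only illustrates the mechanism.

```python
import torch

class TQTQuantizer(torch.nn.Module):
    """Uniform symmetric fake-quantizer with a trained threshold and a
    power-of-2, per-tensor scale (sketch of the TQT idea)."""
    def __init__(self, bits=8):
        super().__init__()
        self.bits = bits
        self.log2_t = torch.nn.Parameter(torch.tensor(0.0))   # log2 of the clipping threshold

    def forward(self, x):
        qmax = 2.0 ** (self.bits - 1) - 1
        # round the threshold exponent up with an STE so the scale stays a
        # power of two while gradients still reach log2_t
        log2_t = self.log2_t + (torch.ceil(self.log2_t) - self.log2_t).detach()
        scale = 2.0 ** log2_t / qmax
        q = torch.clamp(x / scale, -qmax - 1, qmax)
        q = q + (torch.round(q) - q).detach()                  # STE through rounding
        return q * scale

quant = TQTQuantizer()
w = torch.randn(256) * 0.1
w_q = quant(w)        # fake-quantized values; gradients flow to both w and log2_t
```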
Recon-GLGAN: A Global-Local context based Generative Adversarial Network for MRI Reconstruction
Title | Recon-GLGAN: A Global-Local context based Generative Adversarial Network for MRI Reconstruction |
Authors | Balamurali Murugesan, Vijaya Raghavan S, Kaushik Sarveswaran, Keerthi Ram, Mohanasankar Sivaprakasam |
Abstract | Magnetic resonance imaging (MRI) is one of the best medical imaging modalities, as it offers excellent spatial resolution and soft-tissue contrast. However, the use of MRI is limited by its slow acquisition time, which makes it expensive and causes patient discomfort. In order to accelerate the acquisition, multiple deep learning networks have been proposed. Recently, Generative Adversarial Networks (GANs) have shown promising results in MRI reconstruction. The drawback of these GAN-based methods is that they do not incorporate prior information about the end goal, which could help achieve better reconstruction. For instance, in the case of cardiac MRI, the physician would be interested in the heart region, which is of diagnostic relevance, rather than the peripheral regions. In this work, we show that incorporating prior information about a region of interest in the model offers better performance. We therefore propose a novel GAN-based architecture, Reconstruction Global-Local GAN (Recon-GLGAN), for MRI reconstruction. The proposed model contains a generator and a context discriminator which incorporates global and local contextual information from images. Our model offers significant performance improvement over the baseline models. Our experiments show that the concept of a context discriminator can be extended to existing GAN-based reconstruction models to offer better performance. We also demonstrate that the reconstructions from the proposed method give segmentation results similar to fully sampled images. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09262v1 |
https://arxiv.org/pdf/1908.09262v1.pdf | |
PWC | https://paperswithcode.com/paper/recon-glgan-a-global-local-context-based |
Repo | https://github.com/Bala93/Recon-GLGAN |
Framework | pytorch |
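
A compact sketch of the context-discriminator idea is given below: one branch scores the whole reconstructed image, a second branch scores a region-of-interest crop (e.g. around the heart), and the two feature vectors are fused before the real/fake decision. Layer widths and depths are placeholders, not the architecture from the paper.

```python
import torch

class ContextDiscriminator(torch.nn.Module):
    """Discriminator with a global branch over the full image and a local
    branch over a region-of-interest crop (sketch of the Recon-GLGAN idea)."""
    def __init__(self, feat=32):
        super().__init__()
        def branch():
            return torch.nn.Sequential(
                torch.nn.Conv2d(1, feat, 4, stride=2, padding=1), torch.nn.LeakyReLU(0.2),
                torch.nn.Conv2d(feat, feat * 2, 4, stride=2, padding=1), torch.nn.LeakyReLU(0.2),
                torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten())
        self.global_branch = branch()
        self.local_branch = branch()
        self.cls = torch.nn.Linear(feat * 4, 1)

    def forward(self, image, roi):
        g = self.global_branch(image)   # whole reconstruction
        l = self.local_branch(roi)      # crop around the diagnostically relevant region
        return self.cls(torch.cat([g, l], dim=1))

d = ContextDiscriminator()
score = d(torch.randn(2, 1, 160, 160), torch.randn(2, 1, 64, 64))
```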
Deep Log-Likelihood Ratio Quantization
Title | Deep Log-Likelihood Ratio Quantization |
Authors | Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath |
Abstract | In this work, a deep learning-based method for log-likelihood ratio (LLR) lossy compression and quantization is proposed, with emphasis on a single-input single-output uncorrelated fading communication setting. A deep autoencoder network is trained to compress, quantize and reconstruct the bit log-likelihood ratios corresponding to a single transmitted symbol. Specifically, the encoder maps to a latent space with dimension equal to the number of sufficient statistics required to recover the inputs - equal to three in this case - while the decoder aims to reconstruct a noisy version of the latent representation with the purpose of modeling quantization effects in a differentiable way. Simulation results show that, when applied to a standard rate-1/2 low-density parity-check (LDPC) code, a finite precision compression factor of nearly three times is achieved when storing an entire codeword, with an incurred loss of performance lower than 0.1 dB compared to straightforward scalar quantization of the log-likelihood ratios. |
Tasks | Quantization |
Published | 2019-03-11 |
URL | https://arxiv.org/abs/1903.04656v2 |
https://arxiv.org/pdf/1903.04656v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-log-likelihood-ratio-quantization |
Repo | https://github.com/mariusarvinte/deep-llr-quantization |
Framework | tf |
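
The scheme can be sketched as a small autoencoder whose bottleneck has three dimensions, with Gaussian noise injected on the latent during training as a differentiable stand-in for quantization. Layer sizes, noise level, and the MSE objective below are assumptions for illustration; the paper's exact architecture and training setup differ in detail.

```python
import torch

class LLRAutoencoder(torch.nn.Module):
    """Compresses per-symbol bit LLRs to a 3-dim latent; noise injected on the
    latent during training models quantization effects (sketch)."""
    def __init__(self, bits_per_symbol=4, latent_dim=3, hidden=64, noise_std=0.1):
        super().__init__()
        self.noise_std = noise_std
        self.enc = torch.nn.Sequential(
            torch.nn.Linear(bits_per_symbol, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, latent_dim))
        self.dec = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, bits_per_symbol))

    def forward(self, llr):
        z = self.enc(llr)
        if self.training:
            z = z + self.noise_std * torch.randn_like(z)   # differentiable quantization proxy
        return self.dec(z)

# usage sketch: reconstruct the bit LLRs of a 4-bit (e.g. 16-QAM) symbol
model = LLRAutoencoder()
llr = torch.randn(128, 4) * 5.0                 # stand-in for demodulator LLRs
loss = torch.nn.functional.mse_loss(model(llr), llr)
```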
Order Matters: Shuffling Sequence Generation for Video Prediction
Title | Order Matters: Shuffling Sequence Generation for Video Prediction |
Authors | Junyan Wang, Bingzhang Hu, Yang Long, Yu Guan |
Abstract | Predicting future frames in natural video sequences is a new challenge that is receiving increasing attention in the computer vision community. However, existing models suffer from severe loss of temporal information when the predicted sequence is long. Compared to previous methods focusing on generating more realistic contents, this paper extensively studies the importance of sequential order information for video generation. A novel Shuffling sEquence gEneration network (SEE-Net) is proposed that can learn to discriminate unnatural sequential orders by shuffling the video frames and comparing them to the real video sequence. Systematic experiments on three datasets with both synthetic and real-world videos manifest the effectiveness of shuffling sequence generation for video prediction in our proposed model and demonstrate state-of-the-art performance by both qualitative and quantitative evaluations. The source code is available at https://github.com/andrewjywang/SEENet. |
Tasks | Video Generation, Video Prediction |
Published | 2019-07-20 |
URL | https://arxiv.org/abs/1907.08845v1 |
https://arxiv.org/pdf/1907.08845v1.pdf | |
PWC | https://paperswithcode.com/paper/order-matters-shuffling-sequence-generation |
Repo | https://github.com/andrewjywang/SEENet |
Framework | tf |
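
One way to realize the shuffling idea is a sequence discriminator trained to separate naturally ordered clips from randomly permuted ones, as sketched below. The frame encoder and GRU here are illustrative placeholders, not the SEE-Net architecture.

```python
import torch

class OrderDiscriminator(torch.nn.Module):
    """Classifies whether a frame sequence is in its natural temporal order (sketch)."""
    def __init__(self, channels=3, feat=64):
        super().__init__()
        self.frame_enc = torch.nn.Sequential(
            torch.nn.Conv2d(channels, feat, 4, stride=2, padding=1), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1))
        self.rnn = torch.nn.GRU(feat, feat, batch_first=True)
        self.cls = torch.nn.Linear(feat, 1)

    def forward(self, video):                        # video: (B, T, C, H, W)
        b, t, c, h, w = video.shape
        f = self.frame_enc(video.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, hidden = self.rnn(f)
        return self.cls(hidden[-1])                  # logit: natural order vs. shuffled

def shuffled(video):
    """Negative examples: permute the frames along the time axis."""
    perm = torch.randperm(video.size(1))
    return video[:, perm]

disc = OrderDiscriminator()
real = torch.rand(2, 8, 3, 64, 64)
real_logit, fake_logit = disc(real), disc(shuffled(real))
```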
Video Generation from Single Semantic Label Map
Title | Video Generation from Single Semantic Label Map |
Authors | Junting Pan, Chengyu Wang, Xu Jia, Jing Shao, Lu Sheng, Junjie Yan, Xiaogang Wang |
Abstract | This paper proposes the novel task of video generation conditioned on a SINGLE semantic label map, which provides a good balance between flexibility and quality in the generation process. Different from typical end-to-end approaches, which model both scene content and dynamics in a single step, we propose to decompose this difficult task into two sub-problems. As current image generation methods do better than video generation in terms of detail, we synthesize high quality content by only generating the first frame. Then we animate the scene based on its semantic meaning to obtain the temporally coherent video, giving us excellent results overall. We employ a cVAE for predicting optical flow as a beneficial intermediate step to generate a video sequence conditioned on the initial single frame. A semantic label map is integrated into the flow prediction module to achieve major improvements in the image-to-video generation process. Extensive experiments on the Cityscapes dataset show that our method outperforms all competing methods. |
Tasks | Image Generation, Optical Flow Estimation, Video Generation |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04480v1 |
http://arxiv.org/pdf/1903.04480v1.pdf | |
PWC | https://paperswithcode.com/paper/video-generation-from-single-semantic-label |
Repo | https://github.com/junting/seg2vid |
Framework | pytorch |
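
Central to the flow-based animation step is warping the generated first frame by a predicted dense flow field; a minimal PyTorch version of such a warp is sketched below. It assumes flow given as pixel displacements and bilinear sampling; the paper's cVAE flow predictor and semantic-label conditioning are not shown.

```python
import torch
import torch.nn.functional as F

def warp_by_flow(frame, flow):
    """Warp a frame with a dense flow field (the "animate the first frame" step, sketch).
    frame: (B, C, H, W); flow: (B, 2, H, W) pixel displacements (dx, dy)."""
    b, _, h, w = frame.shape
    ys = torch.arange(h, device=frame.device, dtype=frame.dtype).view(1, h, 1).expand(b, h, w)
    xs = torch.arange(w, device=frame.device, dtype=frame.dtype).view(1, 1, w).expand(b, h, w)
    x_new = xs + flow[:, 0]
    y_new = ys + flow[:, 1]
    # grid_sample expects sampling locations normalized to [-1, 1], x before y
    grid = torch.stack((2.0 * x_new / (w - 1) - 1.0,
                        2.0 * y_new / (h - 1) - 1.0), dim=-1)    # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

# usage: a zero flow field returns (approximately) the original frame
frame = torch.rand(1, 3, 64, 128)
flow = torch.zeros(1, 2, 64, 128)
assert torch.allclose(warp_by_flow(frame, flow), frame, atol=1e-5)
```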
Neural reparameterization improves structural optimization
Title | Neural reparameterization improves structural optimization |
Authors | Stephan Hoyer, Jascha Sohl-Dickstein, Sam Greydanus |
Abstract | Structural optimization is a popular method for designing objects such as bridge trusses, airplane wings, and optical devices. Unfortunately, the quality of solutions depends heavily on how the problem is parameterized. In this paper, we propose using the implicit bias over functions induced by neural networks to improve the parameterization of structural optimization. Rather than directly optimizing densities on a grid, we instead optimize the parameters of a neural network which outputs those densities. This reparameterization leads to different and often better solutions. On a selection of 116 structural optimization tasks, our approach produces the best design 50% more often than the best baseline method. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04240v2 |
https://arxiv.org/pdf/1909.04240v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-reparameterization-improves-structural |
Repo | https://github.com/google-research/neural-structural-optimization |
Framework | tf |
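
The reparameterization itself is simple to sketch: instead of optimizing a density grid directly, optimize the parameters of a network that emits the grid. In the toy example below, a placeholder smoothness-plus-volume objective stands in for the differentiable structural (compliance) solver used in the paper.

```python
import torch

class DensityNet(torch.nn.Module):
    """Small network whose output is the density grid being optimized (sketch)."""
    def __init__(self, h=32, w=32, hidden=128):
        super().__init__()
        self.seed = torch.nn.Parameter(torch.randn(hidden))
        self.net = torch.nn.Sequential(
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, h * w))
        self.h, self.w = h, w

    def forward(self):
        return torch.sigmoid(self.net(self.seed)).view(self.h, self.w)

def objective_proxy(rho, target_volume=0.4):
    # placeholder standing in for a differentiable FEM compliance solver
    smooth = ((rho[1:, :] - rho[:-1, :]) ** 2).mean() + ((rho[:, 1:] - rho[:, :-1]) ** 2).mean()
    return smooth + (rho.mean() - target_volume) ** 2

model = DensityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):
    loss = objective_proxy(model())     # gradients flow through the grid into the network
    opt.zero_grad(); loss.backward(); opt.step()
```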
Reversible GANs for Memory-efficient Image-to-Image Translation
Title | Reversible GANs for Memory-efficient Image-to-Image Translation |
Authors | Tycho F. A. van der Ouderaa, Daniel E. Worrall |
Abstract | The Pix2pix and CycleGAN losses have vastly improved the qualitative and quantitative visual quality of results in image-to-image translation tasks. We extend this framework by exploring approximately invertible architectures which are well suited to these losses. These architectures are approximately invertible by design and thus partially satisfy cycle-consistency before training even begins. Furthermore, since invertible architectures have constant memory complexity in depth, these models can be built arbitrarily deep. We are able to demonstrate superior quantitative output on the Cityscapes and Maps datasets at near constant memory budget. |
Tasks | Image-to-Image Translation |
Published | 2019-02-07 |
URL | http://arxiv.org/abs/1902.02729v1 |
http://arxiv.org/pdf/1902.02729v1.pdf | |
PWC | https://paperswithcode.com/paper/reversible-gans-for-memory-efficient-image-to |
Repo | https://github.com/silvandeleemput/memcnn |
Framework | pytorch |
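
The memory argument rests on invertible blocks whose inputs can be recomputed from their outputs; an additive coupling block of this kind is sketched below. It illustrates the invertibility property only and is not the RevGAN generator.

```python
import torch

def _conv_block(c):
    return torch.nn.Sequential(
        torch.nn.Conv2d(c, c, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(c, c, 3, padding=1))

class AdditiveCoupling(torch.nn.Module):
    """Invertible block: earlier activations can be recomputed from the output,
    so memory cost is constant in depth (RevNet-style sketch)."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.f, self.g = _conv_block(half), _conv_block(half)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return torch.cat([y1, y2], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return torch.cat([x1, x2], dim=1)

# round-trip check
block = AdditiveCoupling(8)
x = torch.randn(2, 8, 16, 16)
assert torch.allclose(block.inverse(block(x)), x, atol=1e-5)
```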
Pairwise Learning to Rank by Neural Networks Revisited: Reconstruction, Theoretical Analysis and Practical Performance
Title | Pairwise Learning to Rank by Neural Networks Revisited: Reconstruction, Theoretical Analysis and Practical Performance |
Authors | Marius Köppel, Alexander Segner, Martin Wagener, Lukas Pensel, Andreas Karwath, Stefan Kramer |
Abstract | We present a pairwise learning to rank approach based on a neural net, called DirectRanker, that generalizes the RankNet architecture. We show mathematically that our model is reflexive, antisymmetric, and transitive allowing for simplified training and improved performance. Experimental results on the LETOR MSLR-WEB10K, MQ2007 and MQ2008 datasets show that our model outperforms numerous state-of-the-art methods, while being inherently simpler in structure and using a pairwise approach only. |
Tasks | Learning-To-Rank |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02768v1 |
https://arxiv.org/pdf/1909.02768v1.pdf | |
PWC | https://paperswithcode.com/paper/pairwise-learning-to-rank-by-neural-networks |
Repo | https://github.com/kramerlab/direct-ranker |
Framework | tf |
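
The reflexivity and antisymmetry properties can be obtained by construction: pass both documents through a shared feature extractor and feed the feature difference through a bias-free output layer with an odd activation, as in the sketch below (layer sizes are illustrative, not the paper's).

```python
import torch

class DirectRankerSketch(torch.nn.Module):
    """Pairwise ranker that is antisymmetric and reflexive by construction (sketch)."""
    def __init__(self, num_features, hidden=64):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Linear(num_features, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU())
        self.out = torch.nn.Linear(hidden, 1, bias=False)   # no bias, so o(x, x) = 0

    def forward(self, x1, x2):
        # antisymmetric feature difference + bias-free linear map + odd activation
        # guarantees o(x1, x2) = -o(x2, x1)
        diff = self.features(x1) - self.features(x2)
        return torch.tanh(self.out(diff))

ranker = DirectRankerSketch(num_features=20)
x1, x2 = torch.randn(32, 20), torch.randn(32, 20)
assert torch.allclose(ranker(x1, x2), -ranker(x2, x1), atol=1e-6)
```

Training then reduces to pushing the output toward +1 for pairs where the first document should rank above the second.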
Deep Modular Co-Attention Networks for Visual Question Answering
Title | Deep Modular Co-Attention Networks for Visual Question Answering |
Authors | Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian |
Abstract | Visual Question Answering (VQA) requires a fine-grained and simultaneous understanding of both the visual content of images and the textual content of questions. Therefore, designing an effective ‘co-attention’ model to associate key words in questions with key objects in images is central to VQA performance. So far, most successful attempts at co-attention learning have been achieved by using shallow models, and deep co-attention models show little improvement over their shallow counterparts. In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth. Each MCA layer models the self-attention of questions and images, as well as the guided-attention of images, jointly using a modular composition of two basic attention units. We quantitatively and qualitatively evaluate MCAN on the benchmark VQA-v2 dataset and conduct extensive ablation studies to explore the reasons behind MCAN’s effectiveness. Experimental results demonstrate that MCAN significantly outperforms the previous state-of-the-art. Our best single model delivers 70.63% overall accuracy on the test-dev set. Code is available at https://github.com/MILVLG/mcan-vqa. |
Tasks | Question Answering, Visual Question Answering |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10770v1 |
https://arxiv.org/pdf/1906.10770v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-modular-co-attention-networks-for-visual-1 |
Repo | https://github.com/MILVLG/mcan-vqa |
Framework | pytorch |
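
The two basic attention units can be sketched directly with PyTorch's multi-head attention: a self-attention (SA) unit, and a guided-attention (GA) unit in which image features query question features. The real MCA layer also contains feed-forward sublayers and specific cascading schemes omitted here.

```python
import torch

class SA(torch.nn.Module):
    """Self-attention unit: features attend to themselves (sketch)."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = torch.nn.LayerNorm(dim)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)

class GA(torch.nn.Module):
    """Guided-attention unit: image features attend to question features (sketch)."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = torch.nn.LayerNorm(dim)

    def forward(self, x, y):
        out, _ = self.attn(x, y, y)    # queries from x (image), keys/values from y (question)
        return self.norm(x + out)

# one step of a cascaded MCA layer (feed-forward sublayers omitted)
q = torch.randn(2, 14, 512)            # question token features
v = torch.randn(2, 100, 512)           # image region features
q = SA()(q)
v = GA()(SA()(v), q)
```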
ModelicaGym: Applying Reinforcement Learning to Modelica Models
Title | ModelicaGym: Applying Reinforcement Learning to Modelica Models |
Authors | Oleh Lukianykhin, Tetiana Bogodorova |
Abstract | This paper presents the ModelicaGym toolbox, developed to employ Reinforcement Learning (RL) for solving optimization and control tasks in Modelica models. The developed tool allows connecting models using the Functional Mock-up Interface (FMI) to the OpenAI Gym toolkit in order to exploit Modelica equation-based modelling and co-simulation together with RL algorithms. Thus, ModelicaGym facilitates fast and convenient development of RL algorithms and their comparison when solving optimal control problems for Modelica dynamic models. The inheritance structure of the ModelicaGym toolbox’s classes and the implemented methods are discussed in detail. The toolbox functionality is validated on the Cart-Pole balancing problem. This includes a description of the physical system model and its integration using the toolbox, and experiments on the selection and influence of the model parameters (i.e. force magnitude, cart-pole mass ratio, reward ratio, and simulation time step) on the learning process of the Q-learning algorithm, supported by a discussion of the simulation results. |
Tasks | Q-Learning |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08604v1 |
https://arxiv.org/pdf/1909.08604v1.pdf | |
PWC | https://paperswithcode.com/paper/modelicagym-applying-reinforcement-learning |
Repo | https://github.com/ucuapps/modelicagym |
Framework | none |
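
Conceptually, the toolbox wraps an FMU-exported Modelica model behind the classic Gym interface; a sketch of such a wrapper for the Cart-Pole example is shown below. The FMU calls (`reset`, `set`, `simulate`, `get`) and variable names are hypothetical stand-ins for an FMI co-simulation API, not the toolbox's actual classes or methods.

```python
import gym
from gym import spaces
import numpy as np

class CartPoleFMUEnv(gym.Env):
    """Sketch of a Gym environment backed by an FMU-exported Modelica model.
    `self.fmu` and its reset/set/simulate/get calls are hypothetical stand-ins
    for an FMI co-simulation interface."""
    def __init__(self, fmu, force=17.0, time_step=0.05):
        super().__init__()
        self.fmu, self.force, self.dt, self.t = fmu, force, time_step, 0.0
        self.action_space = spaces.Discrete(2)                    # push left / push right
        high = np.array([2.4, np.inf, 0.21, np.inf], dtype=np.float32)
        self.observation_space = spaces.Box(-high, high)

    def reset(self):
        self.t = 0.0
        self.fmu.reset()                                           # hypothetical FMU call
        return self._observe()

    def step(self, action):
        u = self.force if action == 1 else -self.force
        self.fmu.set("f", u)                                       # hypothetical input name
        self.fmu.simulate(self.t, self.t + self.dt)                # one co-simulation step
        self.t += self.dt
        obs = self._observe()
        done = bool(abs(obs[0]) > 2.4 or abs(obs[2]) > 0.21)
        return obs, 1.0, done, {}

    def _observe(self):
        return np.array([self.fmu.get(n) for n in
                         ("x", "x_dot", "theta", "theta_dot")], dtype=np.float32)
```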
On Exploring Undetermined Relationships for Visual Relationship Detection
Title | On Exploring Undetermined Relationships for Visual Relationship Detection |
Authors | Yibing Zhan, Jun Yu, Ting Yu, Dacheng Tao |
Abstract | In visual relationship detection, human-annotated relationships can be regarded as determinate relationships. However, there is still a large amount of unlabeled data, such as object pairs with less significant relationships or even with no relationships. We refer to these unlabeled but potentially useful data as undetermined relationships. Although a vast body of literature exists, few methods exploit these undetermined relationships for visual relationship detection. In this paper, we explore the beneficial effect of undetermined relationships on visual relationship detection. We propose a novel multi-modal feature based undetermined relationship learning network (MF-URLN) and achieve great improvements in relationship detection. In detail, our MF-URLN automatically generates undetermined relationships by comparing object pairs with human-annotated data according to a designed criterion. Then, the MF-URLN extracts and fuses features of object pairs from three complementary modalities: visual, spatial, and linguistic. Furthermore, the MF-URLN uses two correlated subnetworks: one subnetwork decides the determinate confidence, and the other predicts the relationships. We evaluate the MF-URLN on two datasets: the Visual Relationship Detection (VRD) and the Visual Genome (VG) datasets. The experimental results, compared with state-of-the-art methods, verify the significant improvements made by the undetermined relationships, e.g., the top-50 relation detection recall improves from 19.5% to 23.9% on the VRD dataset. |
Tasks | |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01595v1 |
https://arxiv.org/pdf/1905.01595v1.pdf | |
PWC | https://paperswithcode.com/paper/on-exploring-undetermined-relationships-for |
Repo | https://github.com/pranoyr/visual-relationship-detection |
Framework | pytorch |
A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image
Title | A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image |
Authors | Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, Junsong Yuan |
Abstract | For the task of 3D hand and body pose estimation from a single depth image, a novel anchor-based approach termed Anchor-to-Joint regression network (A2J), with end-to-end learning ability, is proposed. Within A2J, anchor points able to capture global-local spatial context information are densely set on the depth image as local regressors for the joints. They contribute to predicting the positions of the joints in an ensemble way to enhance generalization ability. The proposed 3D articulated pose estimation paradigm differs from the state-of-the-art encoder-decoder based FCN, 3D CNN, and point-set based manners. To discover informative anchor points for a certain joint, an anchor proposal procedure is also proposed for A2J. Meanwhile, a 2D CNN (i.e., ResNet-50) is used as the backbone network to drive A2J, without using time-consuming 3D convolutional or deconvolutional layers. Experiments on 3 hand datasets and 2 body datasets verify A2J’s superiority. Meanwhile, A2J runs at a high speed of around 100 FPS on a single NVIDIA 1080Ti GPU. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.09999v1 |
https://arxiv.org/pdf/1908.09999v1.pdf | |
PWC | https://paperswithcode.com/paper/a2j-anchor-to-joint-regression-network-for-3d |
Repo | https://github.com/zhangboshen/A2J |
Framework | none |
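
The anchor-to-joint ensemble can be written as a weighted vote: every anchor predicts an offset to every joint plus an informativeness score, and each joint is the softmax-weighted sum of the anchor estimates. The sketch below shows only this aggregation step; the ResNet-50 backbone and the in-plane/depth prediction branches are omitted, and the tensor layout is an assumption.

```python
import torch

def a2j_predict(anchor_xy, offsets, logits):
    """Weighted ensemble of dense anchors voting for each joint (sketch).
    anchor_xy: (A, 2) in-plane anchor positions
    offsets:   (B, A, J, 2) predicted anchor-to-joint offsets
    logits:    (B, A, J) informativeness of each anchor for each joint
    returns    (B, J, 2) estimated joint positions
    """
    weights = torch.softmax(logits, dim=1).unsqueeze(-1)   # normalize over anchors
    votes = anchor_xy[None, :, None, :] + offsets          # each anchor's estimate per joint
    return (weights * votes).sum(dim=1)

# usage with made-up shapes: 64 anchors, 15 joints, batch of 2
anchors = torch.rand(64, 2) * 176
offsets = torch.randn(2, 64, 15, 2)
logits = torch.randn(2, 64, 15)
joints = a2j_predict(anchors, offsets, logits)             # (2, 15, 2)
```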