Paper Group ANR 406
Structured GANs. Breaking hypothesis testing for failure rates. Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition. Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing. Multiplex Word Embeddings for Selectional Preference Acquisition. Robust Quantization: One Model to Rule Them All. Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations. Efficient Matrix Multiplication: The Sparse Power-of-2 Factorization. BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization. Random VLAD based Deep Hashing for Efficient Image Retrieval. Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem. Free-breathing Cardiovascular MRI Using a Plug-and-Play Method with Learned Denoiser. FocalMix: Semi-Supervised Learning for 3D Medical Image Detection. Estimation of high frequency nutrient concentrations from water quality surrogates using machine learning methods. 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation.
Structured GANs
Title | Structured GANs |
Authors | Irad Peleg, Lior Wolf |
Abstract | We present Generative Adversarial Networks (GANs) in which the symmetry of the generated images is controlled. This is obtained through the generator network's architecture, while the training procedure and the loss remain the same. The symmetric GANs are applied to face image synthesis in order to generate novel faces with varying amounts of symmetry. We also present an unsupervised face rotation capability, which is based on the novel notion of one-shot fine-tuning. |
Tasks | Image Generation |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05216v1 |
PDF | https://arxiv.org/pdf/2001.05216v1.pdf |
PWC | https://paperswithcode.com/paper/structured-gans |
Repo | |
Framework | |
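The abstract doesn't spell out how symmetry enters the generator. As a rough illustration only (not the authors' architecture), one way to expose a controllable symmetry knob is to blend the generator output with its horizontal mirror:

```python
import torch

def symmetrize(x: torch.Tensor, alpha: float) -> torch.Tensor:
    """Blend a batch of images with its horizontal mirror.

    x: (N, C, H, W) generator output; alpha in [0, 1] controls the
    amount of symmetry (alpha=0 leaves x unchanged, alpha=1 yields a
    perfectly mirror-symmetric image).
    """
    mirrored = torch.flip(x, dims=[3])  # flip along the width axis
    return (1.0 - 0.5 * alpha) * x + 0.5 * alpha * mirrored
```

At alpha = 1 the result equals (x + flip(x)) / 2, which is invariant under flipping.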
Breaking hypothesis testing for failure rates
Title | Breaking hypothesis testing for failure rates |
Authors | Rohit Pandey, Yingnong Dang, Gil Lapid Shafriri, Murali Chintalapati, Aerin Kim |
Abstract | We describe the utility of point processes for modeling failure rates, focusing on the most common such process, the Poisson point process. Next, we describe the uniformly most powerful test for comparing the rates of two Poisson point processes for a one-sided alternative (henceforth referred to as the “rate test”). A common argument against using this test is that real-world data rarely follow the Poisson point process. We thus investigate what happens when the distributional assumptions of tests like these are violated and the test is still applied. We find a non-pathological example (using the rate test on a compound Poisson distribution with binomial compounding) where violating the distributional assumptions of the rate test makes it perform better (lower error rates). We also find that if we replace the distribution of the test statistic under the null hypothesis with any other arbitrary distribution, the performance of the test (described in terms of the false-negative to false-positive rate trade-off) remains exactly the same. Next, we compare the performance of the rate test to a version of the Wald test customized to the negative binomial point process and find it to perform very similarly, while being much more general and versatile. Finally, we discuss applications to Microsoft Azure. The code for all experiments performed is open source and linked in the introduction. |
Tasks | Point Processes |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04045v1 |
PDF | https://arxiv.org/pdf/2001.04045v1.pdf |
PWC | https://paperswithcode.com/paper/breaking-hypothesis-testing-for-failure-rates |
Repo | |
Framework | |
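For context, the one-sided comparison of two Poisson rates admits a classical conditional form: given the total count, the first process's count is binomial under the null, so the uniformly most powerful test reduces to an exact binomial test. A minimal SciPy sketch of that standard construction (consistent with the abstract, though not necessarily the paper's exact code):

```python
from scipy.stats import binomtest

def poisson_rate_test(k1: int, t1: float, k2: int, t2: float) -> float:
    """One-sided test of H0: rate1 <= rate2 against H1: rate1 > rate2.

    k1 failures observed over exposure t1, k2 over t2. Conditional on
    n = k1 + k2, k1 ~ Binomial(n, t1 / (t1 + t2)) under the null, so
    the rate comparison is an exact binomial test.
    """
    return binomtest(k1, k1 + k2, t1 / (t1 + t2),
                     alternative="greater").pvalue

# Example: 12 failures in 100 hours vs. 5 failures in 120 hours.
print(poisson_rate_test(12, 100.0, 5, 120.0))
```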
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Title | Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition |
Authors | Ziyu Liu, Hongwen Zhang, Zhenghao Chen, Zhiyong Wang, Wanli Ouyang |
Abstract | Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400. |
Tasks | Skeleton Based Action Recognition |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14111v1 |
PDF | https://arxiv.org/pdf/2003.14111v1.pdf |
PWC | https://paperswithcode.com/paper/disentangling-and-unifying-graph-convolutions |
Repo | |
Framework | |
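The disentangling idea can be made concrete: rather than raising the adjacency matrix to powers, which re-counts nearby nodes and biases aggregation toward close neighbors, define one adjacency per exact hop distance. A simplified NumPy sketch (normalization and the temporal G3D window omitted):

```python
import numpy as np

def k_adjacency(A: np.ndarray, k: int) -> np.ndarray:
    """Adjacency over node pairs at shortest-path distance exactly k.

    A is a binary (V, V) skeleton adjacency without self-loops. A pair
    is at distance k iff it is reachable within k hops but not k - 1.
    """
    I = np.eye(A.shape[0])
    if k == 0:
        return I
    reach_k = np.linalg.matrix_power(A + I, k) > 0
    reach_km1 = np.linalg.matrix_power(A + I, k - 1) > 0
    return (reach_k & ~reach_km1).astype(float)

def multi_scale_gcn(X: np.ndarray, A: np.ndarray, weights) -> np.ndarray:
    """Disentangled multi-scale aggregation: one graph conv per scale.

    X: (V, C) joint features; weights: list of (C, C') arrays, one per
    hop distance k = 0, 1, 2, ...
    """
    return sum(k_adjacency(A, k) @ X @ W for k, W in enumerate(weights))
```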
Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing
Title | Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing |
Authors | Ge Liu, Rui Wu, Heng-Tze Cheng, Jing Wang, Jayden Ooi, Lihong Li, Ang Li, Wai Lok Sibon Li, Craig Boutilier, Ed Chi |
Abstract | Deep Reinforcement Learning (RL) has proven powerful for decision making in simulated environments. However, training deep RL models is challenging in real-world applications such as production-scale health-care or recommender systems because of the expense of interaction and the limited budget at deployment. One aspect of the data inefficiency comes from expensive hyper-parameter tuning when optimizing deep neural networks. We propose Adaptive Behavior Policy Sharing (ABPS), a data-efficient training algorithm that allows sharing of experience collected by a behavior policy that is adaptively selected from a pool of agents trained with an ensemble of hyper-parameters. We further extend ABPS to evolve hyper-parameters during training by hybridizing ABPS with an adapted version of Population Based Training (ABPS-PBT). We conduct experiments with multiple Atari games with up to 16 hyper-parameter/architecture setups. ABPS achieves superior overall performance, reduced variance among the top 25% of agents, and performance on the best agent equivalent to conventional hyper-parameter tuning with independent training, even though ABPS only requires the same number of environment interactions as training a single agent. We also show that ABPS-PBT further improves convergence speed and reduces variance. |
Tasks | Atari Games, Decision Making, Recommendation Systems |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05229v1 |
PDF | https://arxiv.org/pdf/2002.05229v1.pdf |
PWC | https://paperswithcode.com/paper/data-efficient-training-for-reinforcement |
Repo | |
Framework | |
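A rough sketch of the sharing mechanism, under assumptions: agents expose a hypothetical rollout() and replay_buffer API, and behavior selection is a simple epsilon-greedy bandit over recent returns (the paper's selection rule is more refined):

```python
import random
from collections import deque

class ABPSPool:
    """Sketch of adaptive behavior policy sharing over an agent pool.

    Each episode, one agent acts (picked epsilon-greedily by recent
    return); its transitions are shared with every agent's replay
    buffer, so the whole pool trains at single-agent interaction cost.
    """

    def __init__(self, agents, epsilon: float = 0.1, window: int = 20):
        self.agents = agents
        self.epsilon = epsilon
        self.recent = [deque(maxlen=window) for _ in agents]

    def select_behavior(self) -> int:
        if random.random() < self.epsilon:
            return random.randrange(len(self.agents))
        scores = [sum(r) / len(r) if r else float("inf")
                  for r in self.recent]
        return max(range(len(self.agents)), key=scores.__getitem__)

    def run_episode(self, env) -> float:
        i = self.select_behavior()
        transitions, ep_return = self.agents[i].rollout(env)  # assumed API
        self.recent[i].append(ep_return)
        for agent in self.agents:  # share the behavior experience
            agent.replay_buffer.extend(transitions)
        return ep_return
```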
Multiplex Word Embeddings for Selectional Preference Acquisition
Title | Multiplex Word Embeddings for Selectional Preference Acquisition |
Authors | Hongming Zhang, Jiaxin Bai, Yan Song, Kun Xu, Changlong Yu, Yangqiu Song, Wilfred Ng, Dong Yu |
Abstract | Conventional word embeddings represent words with fixed vectors, which are usually trained on co-occurrence patterns among words. In doing so, however, the power of such representations is limited, since the same word may function differently under different syntactic relations. To address this limitation, one solution is to incorporate the relational dependencies of different words into their embeddings. Therefore, in this paper, we propose a multiplex word embedding model, which can be easily extended according to various relations among words. As a result, each word has a center embedding to represent its overall semantics, and several relational embeddings to represent its relational dependencies. Compared to existing models, our model can effectively distinguish words with respect to different relations without introducing unnecessary sparseness. Moreover, to accommodate various relations, we use a small dimension for the relational embeddings, and our model is able to preserve their effectiveness. Experiments on selectional preference acquisition and word similarity demonstrate the effectiveness of the proposed model, and a further study of scalability shows that our embeddings need only 1/20 of the original embedding size to achieve better performance. |
Tasks | Word Embeddings |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.02836v1 |
PDF | https://arxiv.org/pdf/2001.02836v1.pdf |
PWC | https://paperswithcode.com/paper/multiplex-word-embeddings-for-selectional-1 |
Repo | |
Framework | |
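One plausible parameterization of the center-plus-relational scheme, shown for selectional-preference scoring. The toy sizes and the lift matrices that map the small relational space back to the center space are illustrative assumptions, not the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(0)
V, R = 10_000, 5   # toy vocabulary and relation counts
D, d = 300, 16     # center dimension vs. small relational dimension

center = rng.normal(size=(V, D)) / np.sqrt(D)         # overall semantics
relational = rng.normal(size=(R, V, d)) / np.sqrt(d)  # per-relation, small
lift = rng.normal(size=(R, d, D)) / np.sqrt(d)        # small -> center space

def embed(word: int, rel: int) -> np.ndarray:
    """Center semantics plus a relation-specific low-dim correction."""
    return center[word] + relational[rel, word] @ lift[rel]

def sp_score(head: int, rel: int, tail: int) -> float:
    """Plausibility of (head, rel, tail), e.g. a verb-object preference."""
    return float(embed(head, rel) @ embed(tail, rel))
```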
Robust Quantization: One Model to Rule Them All
Title | Robust Quantization: One Model to Rule Them All |
Authors | Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yuri Nahshan, Alex Bronstein, Uri Weiser |
Abstract | Neural network quantization methods often involve simulating the quantization process during training. This makes the trained model highly dependent on the precise way quantization is performed. Since low-precision accelerators differ in their quantization policies and their supported mix of data-types, a model trained for one accelerator may not be suitable for another. To address this issue, we propose KURE, a method that provides intrinsic robustness to the model against a broad range of quantization implementations. We show that KURE yields a generic model that may be deployed on numerous inference accelerators without a significant loss in accuracy. |
Tasks | Quantization |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07686v1 |
PDF | https://arxiv.org/pdf/2002.07686v1.pdf |
PWC | https://paperswithcode.com/paper/robust-quantization-one-model-to-rule-them |
Repo | |
Framework | |
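The abstract leaves the mechanism implicit. Reading KURE as kurtosis regularization that pushes each layer's weight distribution toward a uniform-like shape (sample kurtosis 1.8), a penalty of the following form is one possibility; this is an assumption-laden sketch, not a verified port of the method:

```python
import torch

def kurtosis(w: torch.Tensor) -> torch.Tensor:
    """Sample kurtosis E[((w - mu) / sigma)^4] of a weight tensor."""
    z = (w - w.mean()) / (w.std() + 1e-8)
    return (z ** 4).mean()

def kure_penalty(weights, target: float = 1.8) -> torch.Tensor:
    """Penalize deviation of each layer's kurtosis from the uniform
    distribution's value (1.8, an assumed target); flatter weight
    distributions tend to survive many quantizer choices."""
    return sum((kurtosis(w) - target) ** 2 for w in weights)

# total_loss = task_loss + lam * kure_penalty(
#     w for w in model.parameters() if w.dim() > 1)
```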
Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
Title | Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations |
Authors | Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang |
Abstract | We propose precision gating (PG), an end-to-end trainable dynamic dual-precision quantization technique for deep neural networks. PG computes most features at low precision and only a small proportion of important features at higher precision to preserve accuracy. The proposed approach is applicable to a variety of DNN architectures and significantly reduces the computational cost of DNN execution with almost no accuracy loss. Our experiments indicate that PG achieves excellent results on CNNs, including statically compressed mobile-friendly networks such as ShuffleNet. Compared to state-of-the-art prediction-based quantization schemes, PG achieves the same or higher accuracy with 2.4$\times$ less compute on ImageNet. PG furthermore applies to RNNs: compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with a 2.7$\times$ reduction in computational cost on an LSTM on the Penn Tree Bank dataset. |
Tasks | Quantization |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.07136v1 |
PDF | https://arxiv.org/pdf/2002.07136v1.pdf |
PWC | https://paperswithcode.com/paper/precision-gating-improving-neural-network-1 |
Repo | |
Framework | |
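A dense sketch of the gating decision. Note that it recomputes the high-precision path rather than reusing low-precision partial sums, so it illustrates which outputs get upgraded, not how the compute savings are realized; in the paper's end-to-end setting the threshold would be learned:

```python
import torch

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform fake quantization of x to a given bitwidth."""
    scale = x.abs().max() / (2 ** (bits - 1) - 1) + 1e-8
    return torch.round(x / scale) * scale

def precision_gate(x, w, delta: float, lo: int = 4, hi: int = 8):
    """Dual-precision matmul sketch: compute everything at `lo` bits,
    then upgrade to `hi` bits only the outputs whose low-precision
    magnitude clears the threshold `delta` (the "important" features)."""
    y_lo = fake_quant(x, lo) @ fake_quant(w, lo)
    important = y_lo.abs() > delta
    y_hi = fake_quant(x, hi) @ fake_quant(w, hi)
    return torch.where(important, y_hi, y_lo)
```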
Efficient Matrix Multiplication: The Sparse Power-of-2 Factorization
Title | Efficient Matrix Multiplication: The Sparse Power-of-2 Factorization |
Authors | Ralf R. Müller, Bernhard Gäde, Ali Bereyhi |
Abstract | We present an algorithm to reduce the computational effort for the multiplication of a given matrix with an unknown column vector. The algorithm decomposes the given matrix into a product of matrices whose entries are either zero or integer powers of two utilizing the principles of sparse recovery. While classical low resolution quantization achieves an accuracy of 6 dB per bit, our method can achieve many times more than that for large matrices. Numerical evidence suggests that the improvement actually grows unboundedly with matrix size. Due to sparsity, the algorithm even allows for quantization levels below 1 bit per matrix entry while achieving highly accurate approximations for large matrices. Applications include, but are not limited to, neural networks, as well as fully digital beam-forming for massive MIMO and millimeter wave applications. |
Tasks | Quantization |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04002v1 |
PDF | https://arxiv.org/pdf/2002.04002v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-matrix-multiplication-the-sparse |
Repo | |
Framework | |
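The single-stage building block is easy to sketch: map entries to signed powers of two (or zero) so that multiplication reduces to shifts and adds. The paper goes further and factors the matrix into a product of such sparse matrices, which is where the gains beyond 6 dB per bit come from:

```python
import numpy as np

def pow2_quantize(A: np.ndarray, zero_below: float = 1e-3) -> np.ndarray:
    """Replace each entry by 0 or +/- the nearest power of two
    (nearest in the log domain), so A @ x needs only shifts and adds."""
    mag = np.abs(A)
    q = np.sign(A) * 2.0 ** np.round(np.log2(np.maximum(mag, 1e-12)))
    return np.where(mag < zero_below, 0.0, q)

rng = np.random.default_rng(0)
A = rng.normal(size=(256, 256))
x = rng.normal(size=256)
err = np.linalg.norm(A @ x - pow2_quantize(A) @ x) / np.linalg.norm(A @ x)
print(f"relative error of a single power-of-2 stage: {err:.3f}")
```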
BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization
Title | BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization |
Authors | Miloš Nikolić, Ghouthi Boukli Hacene, Ciaran Bannon, Alberto Delmas Lascorz, Matthieu Courbariaux, Yoshua Bengio, Vincent Gripon, Andreas Moshovos |
Abstract | Neural networks have demonstrably achieved state-of-the-art accuracy using low-bitlength integer quantization, yielding both execution-time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity while maintaining accuracy. Furthermore, we propose a regularizer that penalizes large bitlength representations throughout the architecture and show how it can be modified to minimize other quantifiable criteria, such as the number of operations or the memory footprint. We demonstrate that our method learns thrifty representations while maintaining accuracy. On ImageNet, the method produces average per-layer bitlengths of 4.13 and 3.76 bits on AlexNet and ResNet18 respectively, remaining within 2.0% and 0.5% of the baseline top-1 accuracy. |
Tasks | Quantization |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03090v1 |
PDF | https://arxiv.org/pdf/2002.03090v1.pdf |
PWC | https://paperswithcode.com/paper/bitpruning-learning-bitlengths-for-aggressive |
Repo | |
Framework | |
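A minimal sketch of the core idea: a continuous, learnable bitlength trained with a straight-through quantizer plus an additive penalty on the bitlength itself. The paper's regularizer and granularity options are more general than this per-module scalar:

```python
import torch

class LearnedBitQuant(torch.nn.Module):
    """Activation quantizer whose bitlength is a trainable parameter."""

    def __init__(self, init_bits: float = 8.0):
        super().__init__()
        self.bits = torch.nn.Parameter(torch.tensor(init_bits))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        levels = 2.0 ** self.bits.clamp(min=1.0) - 1.0
        scale = x.abs().max().detach() / levels
        q = torch.round(x / scale)
        # straight-through: quantized value forward, identity gradient to
        # x, while `bits` still receives a gradient through `scale`
        return (q - x / scale).detach() * scale + x

    def penalty(self) -> torch.Tensor:
        return self.bits.clamp(min=1.0)

# total_loss = task_loss + lam * sum(m.penalty() for m in quantizers)
```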
Random VLAD based Deep Hashing for Efficient Image Retrieval
Title | Random VLAD based Deep Hashing for Efficient Image Retrieval |
Authors | Li Weng, Lingzhi Ye, Jiangmin Tian, Jiuwen Cao, Jianzhong Wang |
Abstract | Image hash algorithms generate compact binary representations that can be quickly matched by Hamming distance, and have thus become an efficient solution for large-scale image retrieval. This paper proposes RV-SSDH, a deep image hash algorithm that incorporates the classical VLAD (vector of locally aggregated descriptors) architecture into neural networks. Specifically, a novel neural network component is formed by coupling a random VLAD layer with a latent hash layer through a transform layer. This component can be combined with convolutional layers to realize a hash algorithm. We implement RV-SSDH as a point-wise algorithm that can be efficiently trained by minimizing classification error and quantization loss. Comprehensive experiments show this new architecture significantly outperforms baselines such as NetVLAD and SSDH, and offers a cost-effective trade-off relative to the state of the art. In addition, the proposed random VLAD layer achieves satisfactory accuracy with low complexity, and thus shows promise as an alternative to NetVLAD. |
Tasks | Image Retrieval, Quantization |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02333v1 |
PDF | https://arxiv.org/pdf/2002.02333v1.pdf |
PWC | https://paperswithcode.com/paper/random-vlad-based-deep-hashing-for-efficient |
Repo | |
Framework | |
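A sketch of the random VLAD idea (NetVLAD-style soft assignment and residual aggregation, but over fixed random centers) followed by a stand-in for the latent hash layer. Shapes and the random projection are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, BITS = 16, 128, 48
centers = rng.normal(size=(K, D))        # fixed random VLAD centers
W_hash = rng.normal(size=(BITS, K * D))  # stand-in latent hash layer

def random_vlad(X: np.ndarray, alpha: float = 10.0) -> np.ndarray:
    """X: (N, D) local descriptors. Soft-assign to the random centers
    and aggregate residuals into one (K*D,) vector, NetVLAD-style but
    without learning the centers."""
    logits = alpha * (X @ centers.T)                       # (N, K)
    a = np.exp(logits - logits.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)                      # soft assignment
    resid = X[:, None, :] - centers[None, :, :]            # (N, K, D)
    v = (a[:, :, None] * resid).sum(axis=0).ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def hash_code(v: np.ndarray) -> np.ndarray:
    """Binarize a VLAD vector into a compact code for Hamming matching."""
    return (np.tanh(W_hash @ v) > 0).astype(np.uint8)
```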
Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem
Title | Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem |
Authors | Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf |
Abstract | In order to anticipate rare and impactful events, we propose to quantify the worst-case risk under distributional ambiguity using a recent development in kernel methods – the kernel mean embedding. Specifically, we formulate the generalized moment problem whose ambiguity set (i.e., the moment constraint) is described by constraints in the associated reproducing kernel Hilbert space in a nonparametric manner. We then present the tractable optimization formulation and its theoretical justification. As a concrete application, we numerically test the proposed method in characterizing the worst-case constraint violation probability in the context of a constrained stochastic control system. |
Tasks | |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2004.00166v1 |
PDF | https://arxiv.org/pdf/2004.00166v1.pdf |
PWC | https://paperswithcode.com/paper/worst-case-risk-quantification-under |
Repo | |
Framework | |
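The basic object here is the empirical kernel mean embedding, and the ambiguity set is, roughly, an RKHS-norm ball around it. A short sketch of the associated squared MMD between two samples, the distance such a ball measures; the paper's tractable worst-case optimization is left out:

```python
import numpy as np

def rbf(X: np.ndarray, Y: np.ndarray, gamma: float = 0.5) -> np.ndarray:
    """Gaussian kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(X: np.ndarray, Y: np.ndarray, gamma: float = 0.5) -> float:
    """Squared RKHS distance between the empirical kernel mean
    embeddings of samples X and Y (biased estimator)."""
    return float(rbf(X, X, gamma).mean()
                 - 2.0 * rbf(X, Y, gamma).mean()
                 + rbf(Y, Y, gamma).mean())
```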
Free-breathing Cardiovascular MRI Using a Plug-and-Play Method with Learned Denoiser
Title | Free-breathing Cardiovascular MRI Using a Plug-and-Play Method with Learned Denoiser |
Authors | Sizhuo Liu, Edward Reehorst, Philip Schniter, Rizwan Ahmad |
Abstract | Cardiac magnetic resonance imaging (CMR) is a noninvasive imaging modality that provides a comprehensive evaluation of the cardiovascular system. The clinical utility of CMR is hampered by long acquisition times, however. In this work, we propose and validate a plug-and-play (PnP) method for CMR reconstruction from undersampled multi-coil data. To fully exploit the rich image structure inherent in CMR, we pair the PnP framework with a deep learning (DL)-based denoiser that is trained using spatiotemporal patches from high-quality, breath-held cardiac cine images. The resulting “PnP-DL” method iterates over data consistency and denoising subroutines. We compare the reconstruction performance of PnP-DL to that of compressed sensing (CS) using eight breath-held and ten real-time (RT) free-breathing cardiac cine datasets. We find that, for breath-held datasets, PnP-DL offers more than one dB advantage over commonly used CS methods. For RT free-breathing datasets, where ground truth is not available, PnP-DL receives higher scores in qualitative evaluation. The results highlight the potential of PnP-DL to accelerate RT CMR. |
Tasks | Denoising |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03226v1 |
PDF | https://arxiv.org/pdf/2002.03226v1.pdf |
PWC | https://paperswithcode.com/paper/free-breathing-cardiovascular-mri-using-a |
Repo | |
Framework | |
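The PnP iteration alternates a data-consistency step with denoising. A generic sketch, with A/At standing in for the multi-coil undersampled forward operator and its adjoint, and denoise for the learned spatiotemporal denoiser:

```python
def pnp_reconstruct(y, A, At, denoise, step=1.0, iters=50):
    """Plug-and-play sketch: gradient descent on ||A x - y||^2
    interleaved with a denoiser acting as the image prior."""
    x = At(y)                          # zero-filled initial estimate
    for _ in range(iters):
        x = x - step * At(A(x) - y)    # enforce data consistency
        x = denoise(x)                 # apply the learned prior
    return x
```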
FocalMix: Semi-Supervised Learning for 3D Medical Image Detection
Title | FocalMix: Semi-Supervised Learning for 3D Medical Image Detection |
Authors | Dong Wang, Yuan Zhang, Kexin Zhang, Liwei Wang |
Abstract | Applying artificial intelligence techniques in medical imaging is one of the most promising areas in medicine. However, most of the recent success in this area highly relies on large amounts of carefully annotated data, whereas annotating medical images is a costly process. In this paper, we propose a novel method, called FocalMix, which, to the best of our knowledge, is the first to leverage recent advances in semi-supervised learning (SSL) for 3D medical image detection. We conducted extensive experiments on two widely used datasets for lung nodule detection, LUNA16 and NLST. Results show that our proposed SSL methods can achieve a substantial improvement of up to 17.3% over state-of-the-art supervised learning approaches with 400 unlabeled CT scans. |
Tasks | Lung Nodule Detection |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09108v1 |
PDF | https://arxiv.org/pdf/2003.09108v1.pdf |
PWC | https://paperswithcode.com/paper/focalmix-semi-supervised-learning-for-3d |
Repo | |
Framework | |
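A plausible reading of the two ingredients, sketched below: MixUp over inputs and their (pseudo-)labels, plus a focal loss generalized to the resulting soft targets. The paper's anchor-level detection formulation is not reproduced:

```python
import numpy as np

def mixup(x1, x2, t1, t2, alpha: float = 1.0):
    """Blend two CT patches and their soft targets with a Beta weight."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * t1 + (1 - lam) * t2

def soft_focal_loss(p, t, gamma: float = 2.0, eps: float = 1e-7):
    """Focal loss with soft targets t in [0, 1], so mixed pseudo-labels
    stay usable; p are predicted foreground probabilities."""
    p = np.clip(p, eps, 1 - eps)
    pos = -t * (1 - p) ** gamma * np.log(p)
    neg = -(1 - t) * p ** gamma * np.log(1 - p)
    return (pos + neg).mean()
```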
Estimation of high frequency nutrient concentrations from water quality surrogates using machine learning methods
Title | Estimation of high frequency nutrient concentrations from water quality surrogates using machine learning methods |
Authors | María Castrillo, Álvaro López García |
Abstract | Continuous high-frequency water quality monitoring is becoming a critical task to support water management. Despite advances in sensor technologies, certain variables cannot be easily and/or economically monitored in situ and in real time. In these cases, surrogate measures can be used to make estimations by means of data-driven models. In this work, variables that are commonly measured in situ are used as surrogates to estimate the concentrations of nutrients in a rural catchment and an urban one, making use of machine learning models, specifically Random Forests. The results are compared with those of linear modelling using the same number of surrogates, obtaining a reduction in the Root Mean Squared Error (RMSE) of up to 60.1%. The benefit of including up to seven surrogate sensors was computed, concluding that adding more than four and five sensors in the respective catchments was not worthwhile in terms of error improvement. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09695v1 |
PDF | https://arxiv.org/pdf/2001.09695v1.pdf |
PWC | https://paperswithcode.com/paper/estimation-of-high-frequency-nutrient |
Repo | |
Framework | |
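The modelling setup is standard regression; a sketch with synthetic stand-in data (the real surrogates are in-situ measurements such as turbidity or conductivity) comparing a Random Forest against a linear baseline by RMSE:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))   # stand-in surrogate sensor readings
y = 2.0 * X[:, 0] - 1.5 * np.tanh(X[:, 1]) + 0.3 * rng.normal(size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for name, model in [("linear", LinearRegression()),
                    ("random forest", RandomForestRegressor(random_state=0))]:
    model.fit(X_tr, y_tr)
    rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
    print(f"{name}: RMSE = {rmse:.3f}")
```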
3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
Title | 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation |
Authors | Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner |
Abstract | We present 3D-MPA, a method for instance segmentation on 3D point clouds. Given an input point cloud, we propose an object-centric approach where each point votes for its object center. We sample object proposals from the predicted object centers. Then, we learn proposal features from grouped point features that voted for the same object center. A graph convolutional network introduces inter-proposal relations, providing higher-level feature learning in addition to the lower-level point features. Each proposal comprises a semantic label, a set of associated points over which we define a foreground-background mask, an objectness score and aggregation features. Previous works usually perform non-maximum-suppression (NMS) over proposals to obtain the final object detections or semantic instances. However, NMS can discard potentially correct predictions. Instead, our approach keeps all proposals and groups them together based on the learned aggregation features. We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset. |
Tasks | 3D Object Detection, 3D Semantic Instance Segmentation, Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13867v1 |
PDF | https://arxiv.org/pdf/2003.13867v1.pdf |
PWC | https://paperswithcode.com/paper/3d-mpa-multi-proposal-aggregation-for-3d |
Repo | |
Framework | |
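The NMS-free step can be sketched as clustering proposals in the learned aggregation-feature space and merging the member masks by a point-wise vote, with DBSCAN here as a stand-in for the paper's grouping procedure:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def group_proposals(agg_features: np.ndarray, masks: np.ndarray,
                    eps: float = 0.5):
    """NMS-free merging of instance proposals.

    agg_features: (P, F) learned aggregation features per proposal
    masks:        (P, N) boolean foreground masks over N scene points
    Returns one merged boolean mask per discovered instance.
    """
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(agg_features)
    instances = []
    for c in np.unique(labels):
        member = masks[labels == c]
        instances.append(member.mean(axis=0) > 0.5)  # point-wise vote
    return instances
```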