February 1, 2020

Paper Group AWR 347

Clonability of anti-counterfeiting printable graphical codes: a machine learning approach


Title	Clonability of anti-counterfeiting printable graphical codes: a machine learning approach
Authors	Olga Taran, Slavi Bonev, Slava Voloshynovskiy
Abstract	In recent years, printable graphical codes have attracted a lot of attention by enabling a link between the physical and digital worlds, which is of great interest for IoT and brand protection applications. The security of printable codes in terms of their reproducibility by unauthorized parties, or clonability, is largely unexplored. In this paper, we investigate the clonability of printable graphical codes from a machine learning perspective. The proposed framework is based on a simple system composed of fully connected neural network layers. The results obtained on real codes printed by several printers demonstrate that digital codes can be accurately estimated from their printed counterparts in certain cases. This provides new insight into scenarios where printable graphical codes can be accurately cloned.
Tasks
Published	2019-03-18
URL	http://arxiv.org/abs/1903.07359v1
PDF	http://arxiv.org/pdf/1903.07359v1.pdf
PWC	https://paperswithcode.com/paper/clonability-of-anti-counterfeiting-printable
Repo	https://github.com/taranO/clonability-of-printable-graphical-codes
Framework	pytorch

Reducing the variance in online optimization by transporting past gradients


Title	Reducing the variance in online optimization by transporting past gradients
Authors	Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux
Abstract	Most stochastic optimization methods use gradients once before discarding them. While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting. One issue is the staleness due to using past gradients. We propose to correct this staleness using the idea of implicit gradient transport (IGT) which transforms gradients computed at previous iterates into gradients evaluated at the current iterate without using the Hessian explicitly. In addition to reducing the variance and bias of our updates over time, IGT can be used as a drop-in replacement for the gradient estimate in a number of well-understood methods such as heavy ball or Adam. We show experimentally that it achieves state-of-the-art results on a wide range of architectures and benchmarks. Additionally, the IGT gradient estimator yields the optimal asymptotic convergence rate for online stochastic optimization in the restricted setting where the Hessians of all component functions are equal.
Tasks	Stochastic Optimization
Published	2019-06-08
URL	https://arxiv.org/abs/1906.03532v2
PDF	https://arxiv.org/pdf/1906.03532v2.pdf
PWC	https://paperswithcode.com/paper/reducing-the-variance-in-online-optimization
Repo	https://github.com/seba-1511/igt.pth
Framework	pytorch
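
The staleness correction described in the abstract above can be illustrated with a small sketch. It assumes the update form v_t = γ_t v_{t-1} + (1 - γ_t) g(θ_t + γ_t/(1 - γ_t) (θ_t - θ_{t-1})) with γ_t = t/(t+1); that form, the 1-D quadratic objective, the noise model, and all names below are illustrative assumptions rather than the authors' code. On a quadratic, where all component Hessians are equal, the transported estimate should coincide with the average of all past gradients re-evaluated at the current iterate, which the function checks.

```python
import random


def igt_quadratic_demo(steps=50, h=2.0, lr=0.1, seed=0):
    """Run an IGT-style update on f_i(theta) = 0.5*h*theta^2 - b_i*theta.

    Returns the final iterate and the largest gap between the IGT estimate
    and the exact average of past gradients taken at the current iterate.
    """
    rng = random.Random(seed)
    theta_prev = theta = 5.0
    v = 0.0
    offsets = []  # the noisy linear terms b_i seen so far
    max_gap = 0.0
    for t in range(steps):
        gamma = t / (t + 1.0)
        # gamma / (1 - gamma) equals t, so the gradient is evaluated at an
        # extrapolated point instead of at theta itself.
        shift = t * (theta - theta_prev)
        b = 1.0 + rng.gauss(0.0, 1.0)
        offsets.append(b)
        g = h * (theta + shift) - b
        v = gamma * v + (1.0 - gamma) * g
        # Reference: all past component gradients evaluated at theta.
        reference = h * theta - sum(offsets) / len(offsets)
        max_gap = max(max_gap, abs(v - reference))
        theta_prev, theta = theta, theta - lr * v
    return theta, max_gap
```

Because no Hessian is ever formed, the only extra cost over plain SGD in this sketch is keeping the previous iterate.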

In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images


Title	In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images
Authors	Marin Oršić, Ivan Krešo, Petra Bevandić, Siniša Šegvić
Abstract	Recent success of semantic segmentation approaches on demanding road driving datasets has spurred interest in many related application fields. Many of these applications involve real-time prediction on mobile platforms such as cars, drones and various kinds of robots. The real-time setup is challenging due to the extraordinary computational complexity involved. Many previous works address the challenge with custom lightweight architectures which decrease computational complexity by reducing depth, width and layer capacity with respect to general purpose architectures. We propose an alternative approach which achieves significantly better performance across a wide range of computing budgets. First, we rely on a light-weight general purpose architecture as the main recognition engine. Then, we leverage light-weight upsampling with lateral connections as the most cost-effective solution to restore the prediction resolution. Finally, we propose to enlarge the receptive field by fusing shared features at multiple resolutions in a novel fashion. Experiments on several road driving datasets show a substantial advantage of the proposed approach, either with ImageNet pre-trained parameters or when we learn from scratch. Our Cityscapes test submission entitled SwiftNetRN-18 delivers 75.5% MIoU and achieves 39.9 Hz on 1024x2048 images on a GTX1080Ti.
Tasks	Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-03-20
URL	http://arxiv.org/abs/1903.08469v2
PDF	http://arxiv.org/pdf/1903.08469v2.pdf
PWC	https://paperswithcode.com/paper/in-defense-of-pre-trained-imagenet
Repo	https://github.com/orsic/swiftnet
Framework	pytorch
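
The MIoU figure quoted in the abstract above is a mean intersection-over-union across classes. A minimal sketch of the metric on flat label lists follows; the function name is hypothetical, and benchmark servers such as Cityscapes accumulate counts over the whole dataset and apply their own ignore rules, which this sketch does not reproduce.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither list
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1.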

Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses


Title	Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses
Authors	Eric Brachmann, Carsten Rother
Abstract	We present Neural-Guided RANSAC (NG-RANSAC), an extension to the classic RANSAC algorithm from robust optimization. NG-RANSAC uses prior information to improve model hypothesis search, increasing the chance of finding outlier-free minimal sets. Previous works use heuristic side-information like hand-crafted descriptor distance to guide hypothesis search. In contrast, we learn hypothesis search in a principled fashion that lets us optimize an arbitrary task loss during training, leading to large improvements on classic computer vision tasks. We present two further extensions to NG-RANSAC. Firstly, using the inlier count itself as training signal allows us to train neural guidance in a self-supervised fashion. Secondly, we combine neural guidance with differentiable RANSAC to build neural networks which focus on certain parts of the input data and make the output predictions as good as possible. We evaluate NG-RANSAC on a wide array of computer vision tasks, namely estimation of epipolar geometry, horizon line estimation and camera re-localization. We achieve superior or competitive results compared to state-of-the-art robust estimators, including very recent, learned ones.
Tasks	Horizon Line Estimation
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04132v2
PDF	https://arxiv.org/pdf/1905.04132v2.pdf
PWC	https://paperswithcode.com/paper/neural-guided-ransac-learning-where-to-sample
Repo	https://github.com/vislearn/ngransac
Framework	pytorch
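
The idea of guiding hypothesis sampling can be sketched on a toy line-fitting problem. In the sketch below the per-point sampling weights are passed in by hand as a stand-in for the output of a learned network; the function name, the threshold, and the 2-D line model are illustrative assumptions, and no training or differentiable RANSAC is shown.

```python
import random


def guided_ransac_line(points, weights, iterations=100, threshold=0.1, seed=0):
    """Fit y = slope*x + intercept, drawing minimal sets according to weights.

    Returns ((slope, intercept), inlier_count) for the best hypothesis found.
    """
    rng = random.Random(seed)
    best_model, best_count = None, -1
    for _ in range(iterations):
        # Draw a two-point minimal set; higher weight means drawn more often.
        i, j = rng.choices(range(len(points)), weights=weights, k=2)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:  # repeated point or vertical line: skip this hypothesis
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        count = sum(
            1 for x, y in points if abs(y - (slope * x + intercept)) <= threshold
        )
        if count > best_count:
            best_model, best_count = (slope, intercept), count
    return best_model, best_count
```

With uniform weights this reduces to ordinary RANSAC; concentrating weight on likely inliers raises the chance that a drawn minimal set is outlier-free.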

Approximating Continuous Functions on Persistence Diagrams Using Template Functions


Title	Approximating Continuous Functions on Persistence Diagrams Using Template Functions
Authors	Jose A. Perea, Elizabeth Munch, Firas A. Khasawneh
Abstract	The persistence diagram is an increasingly useful tool from Topological Data Analysis, but its use alongside typical machine learning techniques requires mathematical finesse. The most success to date has come from methods that map persistence diagrams into $\mathbb{R}^n$, in a way which maximizes the structure preserved. This process is commonly referred to as featurization. In this paper, we describe a mathematical framework for featurization using template functions. These functions are general as they are only required to be continuous and compactly supported. We discuss two realizations: tent functions, which emphasize the local contributions of points in a persistence diagram, and interpolating polynomials, which capture global pairwise interactions. We combine the resulting features with classification and regression algorithms on several examples including shape data and the Rossler system. Our results show that using template functions yields high accuracy rates that match and often exceed those of existing featurization methods. One counter-intuitive observation is that in most cases using interpolating polynomials, where each point contributes globally to the feature vector, yields significantly better results than using tent functions, where the contribution of each point is localized. Along the way, we provide a complete characterization of compactness in the space of persistence diagrams.
Tasks	Time Series, Topological Data Analysis
Published	2019-02-19
URL	http://arxiv.org/abs/1902.07190v2
PDF	http://arxiv.org/pdf/1902.07190v2.pdf
PWC	https://paperswithcode.com/paper/approximating-continuous-functions-on
Repo	https://github.com/lucho8908/adaptive_template_systems
Framework	none
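
A tent-function featurization of the kind mentioned in the abstract above might look as follows. The sketch assumes points are converted to (birth, lifetime) coordinates and that each tent is centered at (a, b) with half-width delta under the max-norm; these conventions and the function names are assumptions for illustration, not the authors' implementation.

```python
def tent(birth, lifetime, a, b, delta):
    """Tent centered at (a, b): 1 at the center, 0 outside a box of radius delta."""
    return max(0.0, 1.0 - max(abs(birth - a), abs(lifetime - b)) / delta)


def tent_features(diagram, centers, delta):
    """Map a persistence diagram [(birth, death), ...] to one value per center."""
    points = [(birth, death - birth) for birth, death in diagram]
    return [
        sum(tent(x, y, a, b, delta) for x, y in points)
        for a, b in centers
    ]
```

Each feature sums only the points near its center, which is the localized behaviour the abstract contrasts with interpolating polynomials.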

Boosting Standard Classification Architectures Through a Ranking Regularizer


Title	Boosting Standard Classification Architectures Through a Ranking Regularizer
Authors	Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis
Abstract	We employ triplet loss as a feature embedding regularizer to boost classification performance. Standard architectures, like ResNet and Inception, are extended to support both losses with minimal hyper-parameter tuning. This promotes generality while fine-tuning pretrained networks. Triplet loss is a powerful surrogate for recently proposed embedding regularizers. Yet, it is avoided due to its large batch-size requirement and high computational cost. Through our experiments, we re-assess these assumptions. During inference, our network supports both classification and embedding tasks without any computational overhead. Quantitative evaluation highlights a steady improvement on five fine-grained recognition datasets. Further evaluation on an imbalanced video dataset achieves significant improvement. Triplet loss brings feature embedding characteristics like nearest neighbor to classification models. Code available at http://bit.ly/2LNYEqL.
Tasks
Published	2019-01-24
URL	https://arxiv.org/abs/1901.08616v3
PDF	https://arxiv.org/pdf/1901.08616v3.pdf
PWC	https://paperswithcode.com/paper/in-defense-of-the-triplet-loss-for-visual
Repo	https://github.com/ahmdtaha/softmax_triplet_loss
Framework	tf
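
Training with both losses can be sketched as a weighted sum of a softmax cross-entropy term and a triplet hinge term. The weight `lam`, the margin value, the use of unsquared Euclidean distance, and the function names below are hypothetical choices for illustration; the abstract does not specify them.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    gap = math.dist(anchor, positive) - math.dist(anchor, negative)
    return max(0.0, gap + margin)


def combined_loss(logits, label, anchor, positive, negative, lam=1.0, margin=0.2):
    """Classification loss plus a triplet term acting as an embedding regularizer."""
    return cross_entropy(logits, label) + lam * triplet_loss(
        anchor, positive, negative, margin
    )
```

The triplet term vanishes once the negative is farther from the anchor than the positive by at least the margin, so it only penalizes poorly separated embeddings.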

Adversarial Convolutional Networks with Weak Domain-Transfer for Multi-Sequence Cardiac MR Images Segmentation


Title	Adversarial Convolutional Networks with Weak Domain-Transfer for Multi-Sequence Cardiac MR Images Segmentation
Authors	Jingkun Chen, Hongwei Li, Jianguo Zhang, Bjoern Menze
Abstract	Analysis and modeling of the ventricles and myocardium are important in the diagnosis and treatment of heart diseases. Manual delineation of those tissues in cardiac MR (CMR) scans is laborious and time-consuming. The ambiguity of the boundaries makes the segmentation task rather challenging. Furthermore, the annotations on some modalities, such as Late Gadolinium Enhancement (LGE) MRI, are often not available. We propose an end-to-end segmentation framework based on a convolutional neural network (CNN) and adversarial learning. A dilated residual U-shape network is used as a segmentor to generate the prediction mask; meanwhile, a CNN is utilized as a discriminator model to judge the segmentation quality. To leverage the available annotations across modalities per patient, a new loss function named weak domain-transfer loss is introduced to the pipeline. The proposed model is evaluated on the public dataset released by the challenge organizer in MICCAI 2019, which consists of 45 sets of multi-sequence CMR images. We demonstrate that the proposed adversarial pipeline outperforms baseline deep-learning methods.
Tasks
Published	2019-08-25
URL	https://arxiv.org/abs/1908.09298v2
PDF	https://arxiv.org/pdf/1908.09298v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-convolutional-networks-with-weak
Repo	https://github.com/jingkunchen/MS-CMR_miccai_2019
Framework	none

Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation


Title	Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation
Authors	Junhwa Hur, Stefan Roth
Abstract	Deep learning approaches to optical flow estimation have seen rapid progress over the recent years. One common trait of many networks is that they refine an initial flow estimate either through multiple stages or across the levels of a coarse-to-fine representation. While leading to more accurate results, the downside of this is an increased number of parameters. Taking inspiration from both classical energy minimization approaches as well as residual networks, we propose an iterative residual refinement (IRR) scheme based on weight sharing that can be combined with several backbone networks. It reduces the number of parameters, improves the accuracy, or even achieves both. Moreover, we show that integrating occlusion prediction and bi-directional flow estimation into our IRR scheme can further boost the accuracy. Our full network achieves state-of-the-art results for both optical flow and occlusion estimation across several standard datasets.
Tasks	Optical Flow Estimation
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05290v1
PDF	http://arxiv.org/pdf/1904.05290v1.pdf
PWC	https://paperswithcode.com/paper/iterative-residual-refinement-for-joint
Repo	https://github.com/visinf/irr
Framework	pytorch

Learning World Graphs to Accelerate Hierarchical Reinforcement Learning


Title	Learning World Graphs to Accelerate Hierarchical Reinforcement Learning
Authors	Wenling Shang, Alex Trott, Stephan Zheng, Caiming Xiong, Richard Socher
Abstract	In many real-world scenarios, an autonomous agent often encounters various tasks within a single complex environment. We propose to build a graph abstraction over the environment structure to accelerate the learning of these tasks. Here, nodes are important points of interest (pivotal states) and edges represent feasible traversals between them. Our approach has two stages. First, we jointly train a latent pivotal state model and a curiosity-driven goal-conditioned policy in a task-agnostic manner. Second, provided with the information from the world graph, a high-level Manager quickly finds solutions to new tasks and expresses subgoals in reference to pivotal states to a low-level Worker. The Worker can then also leverage the graph to easily traverse to the pivotal states of interest, even across long distances, and explore non-locally. We perform a thorough ablation study to evaluate our approach on a suite of challenging maze tasks, demonstrating significant advantages from the proposed framework over baselines that lack world graph knowledge in terms of performance and efficiency.
Tasks	Hierarchical Reinforcement Learning
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00664v1
PDF	https://arxiv.org/pdf/1907.00664v1.pdf
PWC	https://paperswithcode.com/paper/learning-world-graphs-to-accelerate
Repo	https://github.com/maximecb/gym-minigrid
Framework	pytorch

Efficient Winograd Convolution via Integer Arithmetic


Title	Efficient Winograd Convolution via Integer Arithmetic
Authors	Lingchuan Meng, John Brothers
Abstract	Convolution is the core operation for many deep neural networks. The Winograd convolution algorithms have been shown to accelerate the widely-used small convolution sizes. Quantized neural networks can effectively reduce model sizes and improve inference speed, which leads to a wide variety of kernels and hardware accelerators that work with integer data. The state-of-the-art Winograd algorithms pose challenges for efficient implementation and execution by the integer kernels and accelerators. We introduce a new class of Winograd algorithms by extending the construction to the field of complex numbers and propose optimizations that reduce the number of general multiplications. The new algorithm achieves an arithmetic complexity reduction of $3.13$x over the direct method and an efficiency gain up to $17.37\%$ over the rational algorithms. Furthermore, we design and implement an integer-based filter scaling scheme to effectively reduce the filter bit width by $30.77\%$ without any significant accuracy loss.
Tasks
Published	2019-01-07
URL	http://arxiv.org/abs/1901.01965v1
PDF	http://arxiv.org/pdf/1901.01965v1.pdf
PWC	https://paperswithcode.com/paper/efficient-winograd-convolution-via-integer
Repo	https://github.com/gkUwen/learning-material
Framework	none
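
For context, the classic rational Winograd algorithm F(2,3) computes two outputs of a 3-tap filter with four general multiplications instead of six. The sketch below shows that standard 1-D algorithm, not the complex-field construction proposed in the paper; the function names are illustrative. The divisions by 2 in the filter transform are one reason the rational form is awkward for integer kernels.

```python
def winograd_f2_3(d, g):
    """Two outputs of a 3-tap correlation over 4 inputs, using 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed once per filter).
    u1 = (g0 + g1 + g2) / 2.0
    u2 = (g0 - g1 + g2) / 2.0
    # The four general multiplications.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f2_3(d, g):
    """Reference computation using 6 multiplications."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

Both functions compute y0 = d0*g0 + d1*g1 + d2*g2 and y1 = d1*g0 + d2*g1 + d3*g2.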

Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More


Title	Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More
Authors	Jingwen Ye, Yixin Ji, Xinchao Wang, Kairi Ou, Dapeng Tao, Mingli Song
Abstract	In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pretrained teacher models working on heterogeneous problems, one on scene parsing and the other on depth estimation. To this end, we propose an innovative training strategy that learns the parameters of the student intertwined with the teachers, achieved by ‘projecting’ its amalgamated features onto each teacher’s domain and computing the loss. We also introduce two options to generalize the proposed training strategy to handle three or more tasks simultaneously. The proposed scheme yields very encouraging results. As demonstrated on several benchmarks, the trained student model achieves results even superior to those of the teachers in their own expertise domains and on par with the state-of-the-art fully supervised models relying on human-labelled annotations.
Tasks	Depth Estimation, Scene Parsing
Published	2019-04-23
URL	http://arxiv.org/abs/1904.10167v1
PDF	http://arxiv.org/pdf/1904.10167v1.pdf
PWC	https://paperswithcode.com/paper/student-becoming-the-master-knowledge
Repo	https://github.com/zju-vipa/KamalEngine
Framework	pytorch

PYRO-NN: Python Reconstruction Operators in Neural Networks


Title	PYRO-NN: Python Reconstruction Operators in Neural Networks
Authors	Christopher Syben, Markus Michen, Bernhard Stimpel, Stephan Seitz, Stefan Ploner, Andreas K. Maier
Abstract	Purpose: Recently, several attempts were conducted to transfer deep learning to medical image reconstruction. An increasing number of publications follow the concept of embedding the CT reconstruction as a known operator into a neural network. However, most of the approaches presented lack an efficient CT reconstruction framework fully integrated into deep learning environments. As a result, many approaches are forced to use workarounds for mathematically unambiguously solvable problems. Methods: PYRO-NN is a generalized framework to embed known operators into the prevalent deep learning framework Tensorflow. The current status includes state-of-the-art parallel-, fan- and cone-beam projectors and back-projectors accelerated with CUDA provided as Tensorflow layers. On top, the framework provides a high level Python API to conduct FBP and iterative reconstruction experiments with data from real CT systems. Results: The framework provides all necessary algorithms and tools to design end-to-end neural network pipelines with integrated CT reconstruction algorithms. The high level Python API allows a simple use of the layers as known from Tensorflow. To demonstrate the capabilities of the layers, the framework comes with three baseline experiments showing a cone-beam short scan FDK reconstruction, a CT reconstruction filter learning setup, and a TV regularized iterative reconstruction. All algorithms and tools are referenced to a scientific publication and are compared to existing non deep learning reconstruction frameworks. The framework is available as open-source software at https://github.com/csyben/PYRO-NN. Conclusions: PYRO-NN comes with the prevalent deep learning framework Tensorflow and allows setting up end-to-end trainable neural networks in the medical image reconstruction context. We believe that the framework will be a step towards reproducible research
Tasks	Image Reconstruction
Published	2019-04-30
URL	http://arxiv.org/abs/1904.13342v1
PDF	http://arxiv.org/pdf/1904.13342v1.pdf
PWC	https://paperswithcode.com/paper/pyro-nn-python-reconstruction-operators-in
Repo	https://github.com/csyben/PYRO-NN
Framework	tf

Streaming convolutional neural networks for end-to-end learning with multi-megapixel images


Title	Streaming convolutional neural networks for end-to-end learning with multi-megapixel images
Authors	Hans Pinckaers, Bram van Ginneken, Geert Litjens
Abstract	Due to memory constraints on current hardware, most convolutional neural networks (CNNs) are trained on sub-megapixel images. For example, most popular datasets in computer vision contain images much less than a megapixel in size (0.09MP for ImageNet and 0.001MP for CIFAR-10). In some domains such as medical imaging, multi-megapixel images are needed to identify the presence of disease accurately. We propose a novel method to directly train convolutional neural networks using any input image size end-to-end. This method exploits the locality of most operations in modern convolutional neural networks by performing the forward and backward pass on smaller tiles of the image. In this work, we show a proof of concept using images of up to 66 megapixels (8192x8192), saving approximately 50GB of memory per image. Using two public challenge datasets, we demonstrate that CNNs can learn to extract relevant information from these large images and benefit from increasing resolution. We improved the area under the receiver-operating characteristic curve from 0.580 (4MP) to 0.706 (66MP) for metastasis detection in breast cancer (CAMELYON17). We also obtained a Spearman correlation metric approaching state-of-the-art performance on the TUPAC16 dataset, from 0.485 (1MP) to 0.570 (16MP). Code to reproduce a subset of the experiments is available at https://github.com/DIAGNijmegen/StreamingCNN.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04432v1
PDF	https://arxiv.org/pdf/1911.04432v1.pdf
PWC	https://paperswithcode.com/paper/streaming-convolutional-neural-networks-for
Repo	https://github.com/DIAGNijmegen/StreamingCNN
Framework	pytorch
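
The locality argument in the abstract above can be seen in one dimension: a valid convolution computed tile by tile, with each tile extended by the kernel overlap, matches the result on the full input. The sketch covers only the forward pass of a single layer and uses hypothetical function names; the paper's method also handles the backward pass and stacked layers.

```python
def conv1d_valid(x, w):
    """Valid 1-D correlation of signal x with kernel w."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def conv1d_tiled(x, w, tile_out=4):
    """Same result as conv1d_valid, computed on overlapping input tiles."""
    k = len(w)
    n_out = len(x) - k + 1
    out = []
    for start in range(0, n_out, tile_out):
        stop = min(start + tile_out, n_out)
        # Outputs start..stop-1 need inputs start..stop+k-2.
        tile = x[start:stop + k - 1]
        out.extend(conv1d_valid(tile, w))
    return out
```

Only one tile of the input needs to be resident at a time, which is where the memory saving comes from.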

Anderson Acceleration of Proximal Gradient Methods


Title	Anderson Acceleration of Proximal Gradient Methods
Authors	Vien V. Mai, Mikael Johansson
Abstract	Anderson acceleration is a well-established and simple technique for speeding up fixed-point computations with countless applications. Previous studies of Anderson acceleration in optimization have only been able to provide convergence guarantees for unconstrained and smooth problems. This work introduces novel methods for adapting Anderson acceleration to (non-smooth and constrained) proximal gradient algorithms. Under some technical conditions, we extend the existing local convergence results of Anderson acceleration for smooth fixed-point mappings to the proposed scheme. We also prove analytically that it is not, in general, possible to guarantee global convergence of native Anderson acceleration. We therefore propose a simple scheme for stabilization that combines the global worst-case guarantees of proximal gradient methods with the local adaptation and practical speed-up of Anderson acceleration.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08590v1
PDF	https://arxiv.org/pdf/1910.08590v1.pdf
PWC	https://paperswithcode.com/paper/anderson-acceleration-of-proximal-gradient
Repo	https://github.com/vienmai/AA-Prox
Framework	none
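
To make the combination of acceleration and a global safeguard concrete, here is a rough sketch of memory-one Anderson extrapolation wrapped around a proximal gradient step for a separable lasso-type problem. It is a generic textbook-style variant with a simple objective check as fallback, not the stabilization scheme proposed in the paper; the test problem and all function names are assumptions.

```python
def soft(v, t):
    """Soft-thresholding, the proximal operator of t * |v|."""
    return max(v - t, 0.0) - max(-v - t, 0.0)


def objective(x, d, b, lam):
    """F(x) = sum_i 0.5*d_i*(x_i - b_i)^2 + lam*|x_i|."""
    return sum(0.5 * di * (xi - bi) ** 2 + lam * abs(xi)
               for xi, di, bi in zip(x, d, b))


def prox_grad_step(x, d, b, lam, step):
    """One proximal gradient step on F."""
    return [soft(xi - step * di * (xi - bi), step * lam)
            for xi, di, bi in zip(x, d, b)]


def aa1_prox_grad(d, b, lam, iters=200):
    """Proximal gradient with memory-one Anderson extrapolation and a safeguard."""
    step = 1.0 / max(d)
    x_prev = [0.0] * len(b)
    g_prev = prox_grad_step(x_prev, d, b, lam, step)
    x = g_prev
    for _ in range(iters):
        g = prox_grad_step(x, d, b, lam, step)
        r = [gi - xi for gi, xi in zip(g, x)]                 # current residual
        r_prev = [gi - xi for gi, xi in zip(g_prev, x_prev)]  # previous residual
        dr = [a - c for a, c in zip(r, r_prev)]
        denom = sum(v * v for v in dr)
        candidate = g
        if denom > 1e-30:
            theta = sum(a * c for a, c in zip(r, dr)) / denom
            extrap = [gi - theta * (gi - gpi) for gi, gpi in zip(g, g_prev)]
            # Safeguard: keep the extrapolated point only if it is no worse
            # than the plain proximal gradient step.
            if objective(extrap, d, b, lam) <= objective(g, d, b, lam):
                candidate = extrap
        x_prev, g_prev, x = x, g, candidate
    return x
```

Because the fallback is always the plain proximal gradient point, each iteration in this sketch decreases the objective at least as much as an unaccelerated step would.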

LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation


Title	LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation
Authors	Taha Emara, Hossam E. Abd El Munim, Hazem M. Abbas
Abstract	Semantic image segmentation plays a pivotal role in many vision applications including autonomous driving and medical image analysis. Most of the former approaches move towards enhancing the performance in terms of accuracy with little awareness of computational efficiency. In this paper, we introduce LiteSeg, a lightweight architecture for semantic image segmentation. In this work, we explore a new deeper version of the Atrous Spatial Pyramid Pooling module (ASPP) and apply short and long residual connections, and depthwise separable convolution, resulting in a faster and more efficient model. The LiteSeg architecture is introduced and tested with multiple backbone networks such as Darknet19, MobileNet, and ShuffleNet to provide multiple trade-offs between accuracy and computational cost. The proposed model LiteSeg, with MobileNetV2 as a backbone network, achieves an accuracy of 67.81% mean intersection over union at 161 frames per second with $640 \times 360$ resolution on the Cityscapes dataset.
Tasks	Autonomous Driving, Semantic Segmentation
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06683v1
PDF	https://arxiv.org/pdf/1912.06683v1.pdf
PWC	https://paperswithcode.com/paper/liteseg-a-novel-lightweight-convnet-for
Repo	https://github.com/tahaemara/LiteSeg
Framework	pytorch
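
The efficiency gain from depthwise separable convolution, mentioned in the abstract above, comes down to simple parameter arithmetic. The sketch below counts weights for a k x k layer, ignoring biases; the function names and the example channel sizes are illustrative and are not taken from LiteSeg.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise
```

For a 3 x 3 layer with 32 input and 64 output channels this gives 18432 weights for the standard form against 2336 for the separable form, roughly a 7.9x reduction.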