Paper Group AWR 347
Clonability of anti-counterfeiting printable graphical codes: a machine learning approach
Title | Clonability of anti-counterfeiting printable graphical codes: a machine learning approach |
Authors | Olga Taran, Slavi Bonev, Slava Voloshynovskiy |
Abstract | In recent years, printable graphical codes have attracted a lot of attention because they link the physical and digital worlds, which is of great interest for IoT and brand protection applications. The security of printable codes in terms of their reproducibility by unauthorized parties, or clonability, is largely unexplored. In this paper, we investigate the clonability of printable graphical codes from a machine learning perspective. The proposed framework is based on a simple system composed of fully connected neural network layers. The results obtained on real codes printed by several printers demonstrate that digital codes can be accurately estimated from their printed counterparts in certain cases. This provides new insight into scenarios where printable graphical codes can be accurately cloned. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07359v1 |
http://arxiv.org/pdf/1903.07359v1.pdf | |
PWC | https://paperswithcode.com/paper/clonability-of-anti-counterfeiting-printable |
Repo | https://github.com/taranO/clonability-of-printable-graphical-codes |
Framework | pytorch |
Reducing the variance in online optimization by transporting past gradients
Title | Reducing the variance in online optimization by transporting past gradients |
Authors | Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux |
Abstract | Most stochastic optimization methods use gradients once before discarding them. While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting. One issue is the staleness due to using past gradients. We propose to correct this staleness using the idea of implicit gradient transport (IGT) which transforms gradients computed at previous iterates into gradients evaluated at the current iterate without using the Hessian explicitly. In addition to reducing the variance and bias of our updates over time, IGT can be used as a drop-in replacement for the gradient estimate in a number of well-understood methods such as heavy ball or Adam. We show experimentally that it achieves state-of-the-art results on a wide range of architectures and benchmarks. Additionally, the IGT gradient estimator yields the optimal asymptotic convergence rate for online stochastic optimization in the restricted setting where the Hessians of all component functions are equal. |
Tasks | Stochastic Optimization |
Published | 2019-06-08 |
URL | https://arxiv.org/abs/1906.03532v2 |
https://arxiv.org/pdf/1906.03532v2.pdf | |
PWC | https://paperswithcode.com/paper/reducing-the-variance-in-online-optimization |
Repo | https://github.com/seba-1511/igt.pth |
Framework | pytorch |
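The equal-Hessian claim in the IGT abstract above can be checked on a toy problem. The sketch below is a minimal illustration, not the authors' implementation. It assumes the IGT recursion uses the averaging weight gamma_t = t / (t + 1) and evaluates each fresh gradient at the extrapolated point theta_t + t * (theta_t - theta_{t-1}). The names `igt_sgd`, `toy_grad`, `H` and `CENTERS` are hypothetical, and the objective is a made-up family of one-dimensional quadratics f_i(theta) = 0.5 * H * (theta - c_i)^2 that all share the curvature H.

```python
def igt_sgd(grad, theta0, lr, num_steps):
    """SGD driven by an IGT-style gradient estimate (assumed form, see lead-in)."""
    theta_prev = theta0
    theta = theta0
    v = 0.0
    history = []  # (theta_t, v_t) pairs, recorded before each parameter update
    for t in range(num_steps):
        gamma = t / (t + 1)
        # Evaluate the new gradient at an extrapolated point so that the
        # running average behaves like an average of gradients taken at theta_t.
        shifted = theta + t * (theta - theta_prev)
        v = gamma * v + (1.0 - gamma) * grad(shifted, t)
        history.append((theta, v))
        theta_prev, theta = theta, theta - lr * v
    return theta, history


# Hypothetical toy problem: sample t has gradient H * (theta - CENTERS[t]).
H = 2.0
CENTERS = [1.0, -2.0, 0.5, 3.0, -1.5]


def toy_grad(theta, t):
    return H * (theta - CENTERS[t])


final_theta, history = igt_sgd(toy_grad, theta0=4.0, lr=0.1, num_steps=len(CENTERS))
```

In this shared-curvature setting, each recorded estimate v_t equals H * (theta_t - mean(c_0, ..., c_t)), which is the average of all gradients seen so far as if they had been computed at the current iterate. With differing Hessians the transport is only approximate.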
In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images
Title | In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images |
Authors | Marin Oršić, Ivan Krešo, Petra Bevandić, Siniša Šegvić |
Abstract | Recent success of semantic segmentation approaches on demanding road driving datasets has spurred interest in many related application fields. Many of these applications involve real-time prediction on mobile platforms such as cars, drones and various kinds of robots. The real-time setup is challenging due to the extraordinary computational complexity involved. Many previous works address the challenge with custom lightweight architectures which decrease computational complexity by reducing depth, width and layer capacity with respect to general purpose architectures. We propose an alternative approach which achieves a significantly better performance across a wide range of computing budgets. First, we rely on a light-weight general purpose architecture as the main recognition engine. Then, we leverage light-weight upsampling with lateral connections as the most cost-effective solution to restore the prediction resolution. Finally, we propose to enlarge the receptive field by fusing shared features at multiple resolutions in a novel fashion. Experiments on several road driving datasets show a substantial advantage of the proposed approach, either with ImageNet pre-trained parameters or when we learn from scratch. Our Cityscapes test submission entitled SwiftNetRN-18 delivers 75.5% MIoU and achieves 39.9 Hz on 1024x2048 images on GTX1080Ti. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08469v2 |
http://arxiv.org/pdf/1903.08469v2.pdf | |
PWC | https://paperswithcode.com/paper/in-defense-of-pre-trained-imagenet |
Repo | https://github.com/orsic/swiftnet |
Framework | pytorch |
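The MIoU figure quoted in the abstract above is a mean intersection-over-union score. As a reminder of how such a score is usually computed, here is a minimal sketch from a confusion matrix. The function name `mean_iou` and the two-class matrix `toy` are hypothetical, and benchmark evaluation scripts may differ in details such as ignored labels.

```python
def mean_iou(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j."""
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # pixels of class c predicted as something else
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # other pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical two-class confusion matrix (rows: ground truth, columns: prediction).
toy = [[50, 10],
       [5, 35]]
score = mean_iou(toy)
```

For the toy matrix the per-class IoUs are 50/65 and 35/50, so the mean is roughly 0.735.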
Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses
Title | Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses |
Authors | Eric Brachmann, Carsten Rother |
Abstract | We present Neural-Guided RANSAC (NG-RANSAC), an extension to the classic RANSAC algorithm from robust optimization. NG-RANSAC uses prior information to improve model hypothesis search, increasing the chance of finding outlier-free minimal sets. Previous works use heuristic side-information like hand-crafted descriptor distance to guide hypothesis search. In contrast, we learn hypothesis search in a principled fashion that lets us optimize an arbitrary task loss during training, leading to large improvements on classic computer vision tasks. We present two further extensions to NG-RANSAC. Firstly, using the inlier count itself as training signal allows us to train neural guidance in a self-supervised fashion. Secondly, we combine neural guidance with differentiable RANSAC to build neural networks which focus on certain parts of the input data and make the output predictions as good as possible. We evaluate NG-RANSAC on a wide array of computer vision tasks, namely estimation of epipolar geometry, horizon line estimation and camera re-localization. We achieve superior or competitive results compared to state-of-the-art robust estimators, including very recent, learned ones. |
Tasks | Horizon Line Estimation |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04132v2 |
https://arxiv.org/pdf/1905.04132v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-guided-ransac-learning-where-to-sample |
Repo | https://github.com/vislearn/ngransac |
Framework | pytorch |
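The core idea in the NG-RANSAC abstract above, sampling minimal sets according to per-point weights instead of uniformly, can be sketched for 2D line fitting. This is an illustrative toy, not the authors' code: the weights are hard-coded here, whereas in NG-RANSAC they would come from a trained network. The helper names, the data and the threshold are all assumptions.

```python
import random


def fit_line(p, q):
    """Line y = slope * x + intercept through two points with distinct x."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def count_inliers(model, points, threshold):
    slope, intercept = model
    return sum(1 for x, y in points if abs(y - (slope * x + intercept)) <= threshold)


def guided_ransac(points, weights, num_hypotheses, threshold, rng):
    """RANSAC whose minimal sets are drawn with probability proportional to weights."""
    best_model, best_count = None, -1
    for _ in range(num_hypotheses):
        p, q = rng.choices(points, weights=weights, k=2)
        if p[0] == q[0]:
            continue  # degenerate sample: same point twice or a vertical pair
        model = fit_line(p, q)
        count = count_inliers(model, points, threshold)
        if count > best_count:
            best_model, best_count = model, count
    return best_model, best_count


# Hypothetical data: 8 points on y = 2x + 1 plus 4 gross outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(8)]
outliers = [(1.0, 9.0), (3.0, -4.0), (5.0, 20.0), (6.0, 2.0)]
points = inliers + outliers
weights = [0.9] * len(inliers) + [0.1] * len(outliers)  # stand-in for network output

model, support = guided_ransac(points, weights, num_hypotheses=100,
                               threshold=0.1, rng=random.Random(0))
```

Because the weights favour likely inliers, most sampled pairs are outlier-free, which is the effect the abstract describes as increasing the chance of finding outlier-free minimal sets.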
Approximating Continuous Functions on Persistence Diagrams Using Template Functions
Title | Approximating Continuous Functions on Persistence Diagrams Using Template Functions |
Authors | Jose A. Perea, Elizabeth Munch, Firas A. Khasawneh |
Abstract | The persistence diagram is an increasingly useful tool from Topological Data Analysis, but its use alongside typical machine learning techniques requires mathematical finesse. The most success to date has come from methods that map persistence diagrams into $\mathbb{R}^n$, in a way which maximizes the structure preserved. This process is commonly referred to as featurization. In this paper, we describe a mathematical framework for featurization using template functions. These functions are general as they are only required to be continuous and compactly supported. We discuss two realizations: tent functions, which emphasize the local contributions of points in a persistence diagram, and interpolating polynomials, which capture global pairwise interactions. We combine the resulting features with classification and regression algorithms on several examples including shape data and the Rossler system. Our results show that using template functions yields high accuracy rates that match and often exceed those of existing featurization methods. One counter-intuitive observation is that in most cases using interpolating polynomials, where each point contributes globally to the feature vector, yields significantly better results than using tent functions, where the contribution of each point is localized. Along the way, we provide a complete characterization of compactness in the space of persistence diagrams. |
Tasks | Time Series, Topological Data Analysis |
Published | 2019-02-19 |
URL | http://arxiv.org/abs/1902.07190v2 |
http://arxiv.org/pdf/1902.07190v2.pdf | |
PWC | https://paperswithcode.com/paper/approximating-continuous-functions-on |
Repo | https://github.com/lucho8908/adaptive_template_systems |
Framework | none |
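The tent-function featurization mentioned in the abstract above can be illustrated with a short sketch. It assumes a tent of the form max(0, 1 - max(|x - a|, |y - b|) / delta) evaluated in birth-lifetime coordinates, with the feature for each tent obtained by summing over the points of the diagram. The exact definition, grid of centers and normalisation should be checked against the paper, and every name and number below is hypothetical.

```python
def tent(a, b, delta, x, y):
    """Tent centred at (a, b) with half-width delta, evaluated at (x, y)."""
    return max(0.0, 1.0 - max(abs(x - a), abs(y - b)) / delta)


def tent_features(diagram, centers, delta):
    """Map a persistence diagram to one feature per tent centre.

    diagram: list of (birth, death) pairs.
    centers: list of (birth, lifetime) tent centres.
    """
    # Convert to birth-lifetime coordinates before evaluating the tents.
    pts = [(birth, death - birth) for birth, death in diagram]
    return [sum(tent(a, b, delta, x, y) for x, y in pts) for a, b in centers]


# Hypothetical diagram and tent centres.
diagram = [(0.0, 1.0), (0.5, 1.0), (2.0, 2.25)]
centers = [(0.0, 1.0), (0.5, 0.5), (2.0, 1.0)]
features = tent_features(diagram, centers, delta=1.0)
```

Each point only affects tents whose support contains it, which is the localized behaviour the abstract contrasts with interpolating polynomials.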
Boosting Standard Classification Architectures Through a Ranking Regularizer
Title | Boosting Standard Classification Architectures Through a Ranking Regularizer |
Authors | Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis |
Abstract | We employ triplet loss as a feature embedding regularizer to boost classification performance. Standard architectures, like ResNet and Inception, are extended to support both losses with minimal hyper-parameter tuning. This promotes generality while fine-tuning pretrained networks. Triplet loss is a powerful surrogate for recently proposed embedding regularizers. Yet, it is avoided due to large batch-size requirement and high computational cost. Through our experiments, we re-assess these assumptions. During inference, our network supports both classification and embedding tasks without any computational overhead. Quantitative evaluation highlights a steady improvement on five fine-grained recognition datasets. Further evaluation on an imbalanced video dataset achieves significant improvement. Triplet loss brings feature embedding characteristics like nearest neighbor to classification models. Code available at http://bit.ly/2LNYEqL. |
Tasks | |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08616v3 |
https://arxiv.org/pdf/1901.08616v3.pdf | |
PWC | https://paperswithcode.com/paper/in-defense-of-the-triplet-loss-for-visual |
Repo | https://github.com/ahmdtaha/softmax_triplet_loss |
Framework | tf |
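To make the idea of the abstract above concrete, the following sketch adds a triplet term to a softmax cross-entropy loss for a single example. It is a hedged illustration rather than the authors' formulation: the squared Euclidean distance, the margin of 0.2, the weight `lam` and all function names are assumptions.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


def joint_loss(logits, label, anchor, positive, negative, margin=0.2, lam=1.0):
    """Classification loss plus a weighted triplet regularizer on the embeddings."""
    return cross_entropy(logits, label) + lam * triplet_loss(anchor, positive, negative, margin)


# Hypothetical embeddings: the negative is as close to the anchor as the positive,
# so the triplet term is active and equals the margin.
loss = joint_loss([0.0, 0.0], 0, (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), margin=0.2, lam=0.5)
```

Only the training loss changes in this sketch, so a classifier trained this way keeps its usual forward pass at inference time.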
Adversarial Convolutional Networks with Weak Domain-Transfer for Multi-Sequence Cardiac MR Images Segmentation
Title | Adversarial Convolutional Networks with Weak Domain-Transfer for Multi-Sequence Cardiac MR Images Segmentation |
Authors | Jingkun Chen, Hongwei Li, Jianguo Zhang, Bjoern Menze |
Abstract | Analysis and modeling of the ventricles and myocardium are important in the diagnosis and treatment of heart diseases. Manual delineation of those tissues in cardiac MR (CMR) scans is laborious and time-consuming. The ambiguity of the boundaries makes the segmentation task rather challenging. Furthermore, the annotations on some modalities, such as Late Gadolinium Enhancement (LGE) MRI, are often not available. We propose an end-to-end segmentation framework based on a convolutional neural network (CNN) and adversarial learning. A dilated residual U-shape network is used as a segmentor to generate the prediction mask; meanwhile, a CNN is utilized as a discriminator model to judge the segmentation quality. To leverage the available annotations across modalities per patient, a new loss function named weak domain-transfer loss is introduced to the pipeline. The proposed model is evaluated on the public dataset released by the challenge organizer in MICCAI 2019, which consists of 45 sets of multi-sequence CMR images. We demonstrate that the proposed adversarial pipeline outperforms baseline deep-learning methods. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09298v2 |
https://arxiv.org/pdf/1908.09298v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-convolutional-networks-with-weak |
Repo | https://github.com/jingkunchen/MS-CMR_miccai_2019 |
Framework | none |
Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation
Title | Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation |
Authors | Junhwa Hur, Stefan Roth |
Abstract | Deep learning approaches to optical flow estimation have seen rapid progress in recent years. One common trait of many networks is that they refine an initial flow estimate either through multiple stages or across the levels of a coarse-to-fine representation. While leading to more accurate results, the downside of this is an increased number of parameters. Taking inspiration from both classical energy minimization approaches and residual networks, we propose an iterative residual refinement (IRR) scheme based on weight sharing that can be combined with several backbone networks. It reduces the number of parameters, improves the accuracy, or even achieves both. Moreover, we show that integrating occlusion prediction and bi-directional flow estimation into our IRR scheme can further boost the accuracy. Our full network achieves state-of-the-art results for both optical flow and occlusion estimation across several standard datasets. |
Tasks | Optical Flow Estimation |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05290v1 |
http://arxiv.org/pdf/1904.05290v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-residual-refinement-for-joint |
Repo | https://github.com/visinf/irr |
Framework | pytorch |
Learning World Graphs to Accelerate Hierarchical Reinforcement Learning
Title | Learning World Graphs to Accelerate Hierarchical Reinforcement Learning |
Authors | Wenling Shang, Alex Trott, Stephan Zheng, Caiming Xiong, Richard Socher |
Abstract | In many real-world scenarios, an autonomous agent often encounters various tasks within a single complex environment. We propose to build a graph abstraction over the environment structure to accelerate the learning of these tasks. Here, nodes are important points of interest (pivotal states) and edges represent feasible traversals between them. Our approach has two stages. First, we jointly train a latent pivotal state model and a curiosity-driven goal-conditioned policy in a task-agnostic manner. Second, provided with the information from the world graph, a high-level Manager quickly finds solutions to new tasks and expresses subgoals in reference to pivotal states to a low-level Worker. The Worker can then also leverage the graph to easily traverse to the pivotal states of interest, even across long distances, and explore non-locally. We perform a thorough ablation study to evaluate our approach on a suite of challenging maze tasks, demonstrating significant advantages from the proposed framework over baselines that lack world graph knowledge in terms of performance and efficiency. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00664v1 |
https://arxiv.org/pdf/1907.00664v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-world-graphs-to-accelerate |
Repo | https://github.com/maximecb/gym-minigrid |
Framework | pytorch |
Efficient Winograd Convolution via Integer Arithmetic
Title | Efficient Winograd Convolution via Integer Arithmetic |
Authors | Lingchuan Meng, John Brothers |
Abstract | Convolution is the core operation for many deep neural networks. The Winograd convolution algorithms have been shown to accelerate the widely-used small convolution sizes. Quantized neural networks can effectively reduce model sizes and improve inference speed, which leads to a wide variety of kernels and hardware accelerators that work with integer data. The state-of-the-art Winograd algorithms pose challenges for efficient implementation and execution by the integer kernels and accelerators. We introduce a new class of Winograd algorithms by extending the construction to the field of complex numbers and propose optimizations that reduce the number of general multiplications. The new algorithm achieves an arithmetic complexity reduction of $3.13$x over the direct method and an efficiency gain up to $17.37\%$ over the rational algorithms. Furthermore, we design and implement an integer-based filter scaling scheme to effectively reduce the filter bit width by $30.77\%$ without any significant accuracy loss. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.01965v1 |
http://arxiv.org/pdf/1901.01965v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-winograd-convolution-via-integer |
Repo | https://github.com/gkUwen/learning-material |
Framework | none |
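As background for the abstract above, the classic rational Winograd algorithm F(2, 3) computes two outputs of a 3-tap filter with 4 general multiplications instead of the 6 used by the direct method. The sketch below shows that standard rational construction, not the complex-field algorithm proposed in the paper. Exact `Fraction` arithmetic is used so the result can be compared with the direct method without rounding, and the function names are hypothetical.

```python
from fractions import Fraction


def direct_f23(d, g):
    """Two outputs of a 3-tap filter g slid over 4 inputs d: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Same two outputs with 4 general multiplications (rational F(2, 3))."""
    half = Fraction(1, 2)
    # Filter transform: depends only on g, so it can be precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) * half
    u2 = (g[0] - g[1] + g[2]) * half
    u3 = g[2]
    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # Element-wise products: the only general multiplications.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3
    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]


outputs = winograd_f23([1, 2, 3, 4], [5, 6, 7])
```

The factors of 1/2 in the filter transform are one example of the fractional coefficients that make such transforms awkward to run on integer-only kernels.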
Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More
Title | Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More |
Authors | Jingwen Ye, Yixin Ji, Xinchao Wang, Kairi Ou, Dapeng Tao, Mingli Song |
Abstract | In this paper, we investigate a novel deep-model reusing task. Our goal is to train a lightweight and versatile student model, without human-labelled annotations, that amalgamates the knowledge and masters the expertise of two pretrained teacher models working on heterogeneous problems, one on scene parsing and the other on depth estimation. To this end, we propose an innovative training strategy that learns the parameters of the student intertwined with the teachers, achieved by ‘projecting’ its amalgamated features onto each teacher’s domain and computing the loss. We also introduce two options to generalize the proposed training strategy to handle three or more tasks simultaneously. The proposed scheme yields very encouraging results. As demonstrated on several benchmarks, the trained student model achieves results even superior to those of the teachers in their own expertise domains and on par with the state-of-the-art fully supervised models relying on human-labelled annotations. |
Tasks | Depth Estimation, Scene Parsing |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10167v1 |
http://arxiv.org/pdf/1904.10167v1.pdf | |
PWC | https://paperswithcode.com/paper/student-becoming-the-master-knowledge |
Repo | https://github.com/zju-vipa/KamalEngine |
Framework | pytorch |
PYRO-NN: Python Reconstruction Operators in Neural Networks
Title | PYRO-NN: Python Reconstruction Operators in Neural Networks |
Authors | Christopher Syben, Markus Michen, Bernhard Stimpel, Stephan Seitz, Stefan Ploner, Andreas K. Maier |
Abstract | Purpose: Recently, several attempts were conducted to transfer deep learning to medical image reconstruction. An increasing number of publications follow the concept of embedding the CT reconstruction as a known operator into a neural network. However, most of the approaches presented lack an efficient CT reconstruction framework fully integrated into deep learning environments. As a result, many approaches are forced to use workarounds for mathematically unambiguously solvable problems. Methods: PYRO-NN is a generalized framework to embed known operators into the prevalent deep learning framework Tensorflow. The current status includes state-of-the-art parallel-, fan- and cone-beam projectors and back-projectors accelerated with CUDA provided as Tensorflow layers. On top, the framework provides a high level Python API to conduct FBP and iterative reconstruction experiments with data from real CT systems. Results: The framework provides all necessary algorithms and tools to design end-to-end neural network pipelines with integrated CT reconstruction algorithms. The high level Python API allows a simple use of the layers as known from Tensorflow. To demonstrate the capabilities of the layers, the framework comes with three baseline experiments showing a cone-beam short scan FDK reconstruction, a CT reconstruction filter learning setup, and a TV regularized iterative reconstruction. All algorithms and tools are referenced to a scientific publication and are compared to existing non deep learning reconstruction frameworks. The framework is available as open-source software at https://github.com/csyben/PYRO-NN. Conclusions: PYRO-NN comes with the prevalent deep learning framework Tensorflow and allows setting up end-to-end trainable neural networks in the medical image reconstruction context. We believe that the framework will be a step towards reproducible research. |
Tasks | Image Reconstruction |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13342v1 |
http://arxiv.org/pdf/1904.13342v1.pdf | |
PWC | https://paperswithcode.com/paper/pyro-nn-python-reconstruction-operators-in |
Repo | https://github.com/csyben/PYRO-NN |
Framework | tf |
Streaming convolutional neural networks for end-to-end learning with multi-megapixel images
Title | Streaming convolutional neural networks for end-to-end learning with multi-megapixel images |
Authors | Hans Pinckaers, Bram van Ginneken, Geert Litjens |
Abstract | Due to memory constraints on current hardware, most convolutional neural networks (CNNs) are trained on sub-megapixel images. For example, most popular datasets in computer vision contain images much less than a megapixel in size (0.09MP for ImageNet and 0.001MP for CIFAR-10). In some domains such as medical imaging, multi-megapixel images are needed to identify the presence of disease accurately. We propose a novel method to directly train convolutional neural networks using any input image size end-to-end. This method exploits the locality of most operations in modern convolutional neural networks by performing the forward and backward pass on smaller tiles of the image. In this work, we show a proof of concept using images of up to 66 megapixels (8192x8192), saving approximately 50GB of memory per image. Using two public challenge datasets, we demonstrate that CNNs can learn to extract relevant information from these large images and benefit from increasing resolution. We improved the area under the receiver-operating characteristic curve from 0.580 (4MP) to 0.706 (66MP) for metastasis detection in breast cancer (CAMELYON17). We also obtained a Spearman correlation metric approaching state-of-the-art performance on the TUPAC16 dataset, from 0.485 (1MP) to 0.570 (16MP). Code to reproduce a subset of the experiments is available at https://github.com/DIAGNijmegen/StreamingCNN. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04432v1 |
https://arxiv.org/pdf/1911.04432v1.pdf | |
PWC | https://paperswithcode.com/paper/streaming-convolutional-neural-networks-for |
Repo | https://github.com/DIAGNijmegen/StreamingCNN |
Framework | pytorch |
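The locality argument in the abstract above can be illustrated in one dimension: a convolution computed tile by tile, with each tile extended by the kernel's overlap, gives the same output as a convolution over the whole input. The sketch below covers only the forward pass of a single layer and is not the authors' streaming implementation. The function names and tile size are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Plain 'valid' 1D correlation: one output per full kernel placement."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def conv1d_tiled(signal, kernel, tile_out):
    """Same result, computed in tiles that each produce at most tile_out outputs."""
    k = len(kernel)
    n_out = len(signal) - k + 1
    out = []
    for start in range(0, n_out, tile_out):
        stop = min(start + tile_out, n_out)
        # Each tile needs k - 1 extra input samples beyond its last output position.
        tile = signal[start: stop + k - 1]
        out.extend(conv1d_valid(tile, kernel))
    return out


signal = [i * i for i in range(20)]
kernel = [1, -2, 3]
full = conv1d_valid(signal, kernel)
tiled = conv1d_tiled(signal, kernel, tile_out=4)
```

Only one tile's inputs and outputs need to be held at a time, which is the kind of memory saving the abstract describes. Stacking layers and running the backward pass require tracking how the overlap grows with depth, which this sketch leaves out.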
Anderson Acceleration of Proximal Gradient Methods
Title | Anderson Acceleration of Proximal Gradient Methods |
Authors | Vien V. Mai, Mikael Johansson |
Abstract | Anderson acceleration is a well-established and simple technique for speeding up fixed-point computations with countless applications. Previous studies of Anderson acceleration in optimization have only been able to provide convergence guarantees for unconstrained and smooth problems. This work introduces novel methods for adapting Anderson acceleration to (non-smooth and constrained) proximal gradient algorithms. Under some technical conditions, we extend the existing local convergence results of Anderson acceleration for smooth fixed-point mappings to the proposed scheme. We also prove analytically that it is not, in general, possible to guarantee global convergence of native Anderson acceleration. We therefore propose a simple scheme for stabilization that combines the global worst-case guarantees of proximal gradient methods with the local adaptation and practical speed-up of Anderson acceleration. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08590v1 |
https://arxiv.org/pdf/1910.08590v1.pdf | |
PWC | https://paperswithcode.com/paper/anderson-acceleration-of-proximal-gradient |
Repo | https://github.com/vienmai/AA-Prox |
Framework | none |
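For readers unfamiliar with the technique named in the abstract above, here is a minimal sketch of Anderson acceleration with memory 1 applied to a scalar fixed-point map, including a proximal gradient map for a toy problem. It deliberately omits the stabilization scheme the paper proposes, so it carries no global convergence guarantee. The names `anderson_m1`, `soft_threshold` and `prox_grad_map`, the step size and the toy objective 0.5 * (x - 3)^2 + |x| are all assumptions.

```python
import math


def soft_threshold(v, lam):
    """Proximal operator of lam * |x|."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def prox_grad_map(x, step=0.5):
    """One proximal gradient step for the toy objective 0.5 * (x - 3)**2 + |x|."""
    return soft_threshold(x - step * (x - 3.0), step)


def anderson_m1(g, x0, max_iter=100, tol=1e-12):
    """Anderson acceleration with memory 1 for the scalar fixed-point problem x = g(x)."""
    x_prev = x0
    g_prev = g(x_prev)
    r_prev = g_prev - x_prev  # residual of the previous iterate
    x = g_prev                # the first step is a plain fixed-point step
    for _ in range(max_iter):
        gx = g(x)
        r = gx - x
        if abs(r) < tol:
            return x
        dr = r - r_prev
        if dr == 0.0:
            x_next = gx  # fall back to a plain step when the residuals coincide
        else:
            gamma = r / dr  # least-squares mixing weight, closed form in 1D
            x_next = gx - gamma * (gx - g_prev)
        x_prev, g_prev, r_prev = x, gx, r
        x = x_next
    return x


x_lasso = anderson_m1(prox_grad_map, 0.0)
x_cos = anderson_m1(math.cos, 1.0)
```

The toy objective has its minimiser at x = 2, and the proximal gradient map is affine near it, so the accelerated iteration lands on the fixed point after a single mixing step.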
LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation
Title | LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation |
Authors | Taha Emara, Hossam E. Abd El Munim, Hazem M. Abbas |
Abstract | Semantic image segmentation plays a pivotal role in many vision applications including autonomous driving and medical image analysis. Most previous approaches focus on improving accuracy with little awareness of computational efficiency. In this paper, we introduce LiteSeg, a lightweight architecture for semantic image segmentation. In this work, we explore a new deeper version of the Atrous Spatial Pyramid Pooling module (ASPP) and apply short and long residual connections, and depthwise separable convolution, resulting in a faster and more efficient model. The LiteSeg architecture is introduced and tested with multiple backbone networks such as Darknet19, MobileNet, and ShuffleNet to provide multiple trade-offs between accuracy and computational cost. The proposed model LiteSeg, with MobileNetV2 as a backbone network, achieves an accuracy of 67.81% mean intersection over union at 161 frames per second with $640 \times 360$ resolution on the Cityscapes dataset. |
Tasks | Autonomous Driving, Semantic Segmentation |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06683v1 |
https://arxiv.org/pdf/1912.06683v1.pdf | |
PWC | https://paperswithcode.com/paper/liteseg-a-novel-lightweight-convnet-for |
Repo | https://github.com/tahaemara/LiteSeg |
Framework | pytorch |
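The efficiency gain from the depthwise separable convolution mentioned in the abstract above can be seen from a parameter count. The sketch below compares a standard k x k convolution with a depthwise k x k convolution followed by a 1 x 1 pointwise convolution, ignoring biases. The layer sizes are hypothetical and are not taken from LiteSeg.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 256 input channels, 256 output channels.
standard = standard_conv_params(3, 256, 256)
separable = depthwise_separable_params(3, 256, 256)
ratio = standard / separable
```

For this hypothetical layer the separable variant needs 67,840 weights instead of 589,824, a reduction of roughly 8.7 times. The multiply-accumulate count per output position shrinks by the same factor.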