January 31, 2020

3171 words 15 mins read

Paper Group AWR 387

Multi-class Classification without Multi-class Labels. Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach. Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval. OpenEDS: Open Eye Dataset. HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation. Provably Robust …

Multi-class Classification without Multi-class Labels

Title Multi-class Classification without Multi-class Labels
Authors Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira
Abstract This work presents a new strategy for multi-class classification that requires no class-specific labels, but instead leverages pairwise similarity between examples, which is a weaker form of annotation. The proposed method, meta classification learning, optimizes a binary classifier for pairwise similarity prediction and through this process learns a multi-class classifier as a submodule. We formulate this approach, present a probabilistic graphical model for it, and derive a surprisingly simple loss function that can be used to learn neural network-based models. We then demonstrate that this same framework generalizes to the supervised, unsupervised cross-task, and semi-supervised settings. Our method is evaluated against the state of the art in all three learning paradigms and shows superior or comparable accuracy, providing evidence that learning multi-class classification without multi-class labels is a viable option.
Tasks
Published 2019-01-02
URL http://arxiv.org/abs/1901.00544v1
PDF http://arxiv.org/pdf/1901.00544v1.pdf
PWC https://paperswithcode.com/paper/multi-class-classification-without-multi
Repo https://github.com/GT-RIPL/L2C
Framework pytorch
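
The “surprisingly simple loss function” lends itself to a compact sketch. Below is a minimal PyTorch version, assuming (as the abstract suggests) that the similarity of two examples is scored by the inner product of their softmax class posteriors; the function name and the `eps` clamp are illustrative, not taken from the repo.

```python
import torch.nn.functional as F

def meta_classification_loss(logits, pairwise_labels, eps=1e-7):
    # logits: (N, C) multi-class outputs of the classifier submodule
    # pairwise_labels: (N, N) floats, 1.0 if examples i and j are deemed similar
    p = F.softmax(logits, dim=1)                 # class posteriors
    s_hat = (p @ p.t()).clamp(eps, 1.0 - eps)    # predicted pairwise similarity
    return F.binary_cross_entropy(s_hat, pairwise_labels)
```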

Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach

Title Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach
Authors Xi Shen, Ilaria Pastrolin, Oumayma Bounou, Spyros Gidaris, Marc Smith, Olivier Poncet, Mathieu Aubry
Abstract Historical watermark recognition is a highly practical, yet unsolved challenge for archivists and historians. With a large number of well-defined classes, cluttered and noisy samples, different types of representations, and both subtle differences between classes and high intra-class variation, historical watermarks are also challenging for pattern recognition. In this paper, overcoming the difficulty of data collection, we present a large public dataset with more than 6k new photographs, making it possible for the first time to tackle at scale the scenarios of practical interest for scholars: one-shot instance recognition and cross-domain one-shot instance recognition amongst more than 16k fine-grained classes. We demonstrate that this new dataset is large enough to train modern deep learning approaches, and show that standard methods can be improved considerably by using mid-level deep features. More precisely, we design both a matching score and a feature fine-tuning strategy based on filtering local matches using spatial consistency. This consistency-based approach provides an important performance boost compared to strong baselines. Our model achieves 55% top-1 accuracy on our very challenging 16,753-class one-shot cross-domain recognition task, each class described by a single drawing from the classic Briquet catalog. In addition to watermark classification, we show our approach provides promising results on fine-grained sketch-based image retrieval.
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2019-08-27
URL https://arxiv.org/abs/1908.10254v1
PDF https://arxiv.org/pdf/1908.10254v1.pdf
PWC https://paperswithcode.com/paper/large-scale-historical-watermark-recognition
Repo https://github.com/XiSHEN0220/WatermarkReco
Framework pytorch
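
The consistency-based matching can be illustrated with a toy score that keeps only mutual nearest-neighbor feature matches agreeing with a dominant spatial offset. Everything here (cosine matching, median-offset filtering, the tolerance) is an assumption for illustration, not the authors' exact method.

```python
import numpy as np

def consistency_score(feats_a, feats_b, pos_a, pos_b, tol=2.0):
    # feats_*: (N, D) L2-normalized local (mid-level) descriptors
    # pos_*:   (N, 2) grid positions of those descriptors
    sim = feats_a @ feats_b.T
    nn_ab = sim.argmax(axis=1)                   # best match in B for each A feature
    nn_ba = sim.argmax(axis=0)                   # best match in A for each B feature
    mutual = np.where(nn_ba[nn_ab] == np.arange(len(feats_a)))[0]
    if len(mutual) == 0:
        return 0.0
    offsets = pos_b[nn_ab[mutual]] - pos_a[mutual]
    median = np.median(offsets, axis=0)          # dominant translation
    consistent = np.linalg.norm(offsets - median, axis=1) < tol
    return float(consistent.sum())               # count of spatially consistent matches
```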

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

Title Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval
Authors Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados, Yi-Zhe Song
Abstract In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketch and photo, and (ii) the necessity of moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of the often semi-photorealistic ones included in existing datasets. We then formulate a ZS-SBIR framework to jointly model sketches and photos in a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, a reduced version of our model can already significantly outperform the state of the art on existing datasets. We further demonstrate the superior performance of our full model by comparing with a number of alternatives on the newly proposed dataset. The new dataset, plus all training and testing code of our model, will be publicly released to facilitate future research.
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2019-04-06
URL http://arxiv.org/abs/1904.03451v1
PDF http://arxiv.org/pdf/1904.03451v1.pdf
PWC https://paperswithcode.com/paper/doodle-to-search-practical-zero-shot-sketch
Repo https://github.com/sounakdey/doodle2search
Framework pytorch
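
One standard way to realize the common embedding space described above is a triplet objective over sketch anchors with photo positives and negatives. The sketch below assumes generic backbones with 512-d outputs and omits the paper's mutual-information mining and semantic-knowledge components.

```python
import torch
import torch.nn as nn

class JointEmbedding(nn.Module):
    # Minimal sketch of a shared sketch/photo embedding; backbones and
    # dimensions are placeholders, not the paper's architecture.
    def __init__(self, backbone_sketch, backbone_photo, dim=256):
        super().__init__()
        self.sketch_net, self.photo_net = backbone_sketch, backbone_photo
        self.proj = nn.Linear(512, dim)   # assumes 512-d backbone features

    def forward(self, sketch, photo_pos, photo_neg):
        s = self.proj(self.sketch_net(sketch))
        p = self.proj(self.photo_net(photo_pos))
        n = self.proj(self.photo_net(photo_neg))
        # pull matching photos toward the sketch, push non-matching away
        return nn.functional.triplet_margin_loss(s, p, n, margin=0.3)
```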

OpenEDS: Open Eye Dataset

Title OpenEDS: Open Eye Dataset
Authors Stephan J. Garbin, Yiru Shen, Immo Schuetz, Robert Cavin, Gregory Hughes, Sachin S. Talathi
Abstract We present a large-scale dataset, OpenEDS: Open Eye Dataset, of eye images captured using a virtual-reality (VR) head-mounted display fitted with two synchronized eye-facing cameras at a frame rate of 200 Hz under controlled illumination. This dataset is compiled from video captures of the eye region collected from 152 individual participants and is divided into four subsets: (i) 12,759 images with pixel-level annotations for key eye regions: iris, pupil and sclera; (ii) 252,690 unlabelled eye images; (iii) 91,200 frames from randomly selected 1.5-second video sequences; and (iv) 143 pairs of left and right point-cloud data compiled from corneal topography of eye regions, collected from a subset (143 of the 152 participants in the study). A baseline experiment on OpenEDS for the task of semantic segmentation of pupil, iris, sclera and background achieves a mean intersection-over-union (mIoU) of 98.3%. We anticipate that OpenEDS will create opportunities for researchers in the eye-tracking community and the broader machine learning and computer vision communities to advance the state of eye tracking for VR applications. The dataset is available for download upon request at https://research.fb.com/programs/openeds-challenge
Tasks Eye Tracking, Semantic Segmentation
Published 2019-04-30
URL https://arxiv.org/abs/1905.03702v2
PDF https://arxiv.org/pdf/1905.03702v2.pdf
PWC https://paperswithcode.com/paper/190503702
Repo https://github.com/lib314a/Good-And-Open
Framework none
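
For reference, the reported 98.3% mIoU is the per-class intersection-over-union averaged over the four classes. A minimal implementation, assuming integer label maps of identical shape:

```python
import numpy as np

def mean_iou(pred, target, num_classes=4):
    # pred, target: integer label maps over {background, iris, pupil, sclera}
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```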

HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation

Title HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
Authors Cheng Sun, Chi-Wei Hsiao, Min Sun, Hwann-Tzong Chen
Abstract We present a new approach to the problem of estimating the 3D room layout from a single panoramic image. We represent the room layout as three 1D vectors that encode, at each image column, the boundary positions of floor-wall and ceiling-wall, and the existence of a wall-wall boundary. The proposed network, HorizonNet, trained to predict this 1D layout, outperforms previous state-of-the-art approaches. The designed post-processing procedure for recovering 3D room layouts from 1D predictions can automatically infer the room shape at low computational cost: it takes less than 20ms for a panorama image, while prior works may need dozens of seconds. We also propose Pano Stretch Data Augmentation, which can diversify panorama data and be applied to other panorama-related learning tasks. Due to the limited data available for non-cuboid layouts, we relabel 65 general layouts from the current dataset for fine-tuning. Our approach shows good performance on general layouts, as demonstrated by qualitative results and cross-validation.
Tasks 3D Room Layouts From A Single Rgb Panorama, Data Augmentation
Published 2019-01-12
URL http://arxiv.org/abs/1901.03861v2
PDF http://arxiv.org/pdf/1901.03861v2.pdf
PWC https://paperswithcode.com/paper/horizonnet-learning-room-layout-with-1d
Repo https://github.com/sunset1995/HorizonNet
Framework pytorch
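
The 1D representation is simple to write down. Below is an illustrative encoding of per-column ceiling/floor boundaries plus a corner-existence signal, as the abstract describes; the Gaussian peak width is an assumption, not the paper's exact recipe, and the Pano Stretch augmentation itself is not shown.

```python
import numpy as np

def encode_1d_layout(ceil_y, floor_y, corner_cols, width, sigma=4.0):
    # ceil_y, floor_y: (W,) boundary row positions per image column
    # corner_cols: columns where a wall-wall boundary occurs
    cols = np.arange(width)
    corner = np.zeros(width)
    for c in corner_cols:                    # soft peak at each corner column
        corner = np.maximum(corner, np.exp(-0.5 * ((cols - c) / sigma) ** 2))
    return np.stack([ceil_y, floor_y, corner])   # shape (3, W)
```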

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Title Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
Authors Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sebastien Bubeck
Abstract Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further. Our code and trained models are available at http://github.com/Hadisalman/smoothing-adversarial .
Tasks Adversarial Attack, Adversarial Defense
Published 2019-06-09
URL https://arxiv.org/abs/1906.04584v5
PDF https://arxiv.org/pdf/1906.04584v5.pdf
PWC https://paperswithcode.com/paper/provably-robust-deep-learning-via
Repo https://github.com/Hadisalman/smoothing-adversarial
Framework pytorch
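
For context, prediction with a smoothed classifier is a Monte-Carlo majority vote over Gaussian-perturbed copies of the input. The sketch below shows only that baseline step; the paper's contribution, adversarially training the base classifier under this smoothing, is not reproduced here.

```python
import torch

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n=100, num_classes=10):
    # x: a single input with batch dimension, shape (1, C, H, W)
    counts = torch.zeros(num_classes)
    for _ in range(n):
        noisy = x + sigma * torch.randn_like(x)   # Gaussian perturbation
        counts[model(noisy).argmax(dim=1)] += 1   # vote of the base classifier
    return counts.argmax().item()                 # majority-vote class
```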

Deep Learning on Image Denoising: An overview

Title Deep Learning on Image Denoising: An overview
Authors Chunwei Tian, Lunke Fei, Wenxian Zheng, Yong Xu, Wangmeng Zuo, Chia-Wen Lin
Abstract Deep learning techniques have received much attention in image denoising. However, deep learning methods of different types differ considerably in how they deal with noise. Specifically, discriminative learning based on deep learning can effectively address Gaussian noise, while optimization-model methods based on deep learning are effective at estimating real noise. So far, there has been little research summarizing the different deep learning techniques for image denoising. In this paper, we present such a comparative study of different deep techniques in image denoising. We first classify (1) deep convolutional neural networks (CNNs) for additive white noisy images, (2) deep CNNs for real noisy images, (3) deep CNNs for blind denoising and (4) deep CNNs for hybrid noisy images, i.e., combinations of noisy, blurred and low-resolution images. Then, we analyze the motivations and principles of the different types of deep learning methods. Next, we compare and verify the state-of-the-art methods on public denoising datasets in terms of quantitative and qualitative analysis. Finally, we point out some potential challenges and directions for future research.
Tasks Denoising, Image Denoising
Published 2019-12-31
URL https://arxiv.org/abs/1912.13171v2
PDF https://arxiv.org/pdf/1912.13171v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-on-image-denoising-an-overview
Repo https://github.com/hellloxiaotian/Deep-Learning-on-Image-Denoising-An-overview
Framework none
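
Many of the discriminative methods the survey covers follow the residual-learning pattern of DnCNN: predict the noise and subtract it. A minimal sketch (depth and width here are placeholders, not any surveyed model's exact configuration):

```python
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    # Minimal DnCNN-style residual denoiser: the network estimates the
    # noise map and the forward pass removes it from the input.
    def __init__(self, channels=1, features=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        return noisy - self.body(noisy)   # residual learning: subtract estimated noise
```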

Fast Graph Representation Learning with PyTorch Geometric

Title Fast Graph Representation Learning with PyTorch Geometric
Authors Matthias Fey, Jan Eric Lenssen
Abstract We introduce PyTorch Geometric, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch. In addition to general graph data structures and processing methods, it contains a variety of recently published methods from the domains of relational learning and 3D data processing. PyTorch Geometric achieves high data throughput by leveraging sparse GPU acceleration, by providing dedicated CUDA kernels and by introducing efficient mini-batch handling for input examples of different size. In this work, we present the library in detail and perform a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios.
Tasks Graph Classification, Graph Representation Learning, Node Classification, Relational Reasoning, Representation Learning
Published 2019-03-06
URL http://arxiv.org/abs/1903.02428v3
PDF http://arxiv.org/pdf/1903.02428v3.pdf
PWC https://paperswithcode.com/paper/fast-graph-representation-learning-with
Repo https://github.com/rusty1s/pytorch_geometric
Framework pytorch
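
A typical usage pattern, following the library's README: node features and a COO edge index are wrapped in a `Data` object and passed through a convolution layer. Exact module paths may vary slightly across PyG releases.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

# A 3-node toy graph; edge_index holds source/target rows in COO format.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.randn(3, 16)                     # 16-d feature vector per node
data = Data(x=x, edge_index=edge_index)

conv = GCNConv(16, 32)                     # one graph convolution layer
out = conv(data.x, data.edge_index)        # -> (3, 32) node embeddings
```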

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

Title Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval
Authors Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille
Abstract Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem with a wide range of real-world applications. Recently, research interest has arisen in solving this problem under the more realistic and challenging setting of zero-shot learning. In this paper, we investigate this problem from the viewpoint of domain adaptation, which we show is critical in improving feature embedding in the zero-shot scenario. Based on a framework which starts with a pre-trained model on ImageNet and fine-tunes it on the training set of an SBIR benchmark, we advocate the importance of preserving previously acquired knowledge, e.g., the rich discriminative features learned from ImageNet, to improve the model's transfer ability. For this purpose, we design an approach named Semantic-Aware Knowledge prEservation (SAKE), which fine-tunes the pre-trained model in an economical way and leverages semantic information, e.g., inter-class relationships, to achieve the goal of knowledge preservation. Zero-shot experiments on two extended SBIR datasets, TU-Berlin and Sketchy, verify the superior performance of our approach. Extensive diagnostic experiments validate that the preserved knowledge benefits SBIR in zero-shot settings, as a large fraction of the performance gain comes from the more properly structured feature embedding for photo images. Code is available at: https://github.com/qliu24/SAKE.
Tasks Domain Adaptation, Image Retrieval, Sketch-Based Image Retrieval, Zero-Shot Learning
Published 2019-04-05
URL https://arxiv.org/abs/1904.03208v3
PDF https://arxiv.org/pdf/1904.03208v3.pdf
PWC https://paperswithcode.com/paper/semantic-aware-knowledge-preservation-for
Repo https://github.com/qliu24/SAKE
Framework pytorch
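
The knowledge-preservation idea can be summarized as joint training on the new task plus distillation from the frozen pre-trained teacher. The weighting, the temperature, and the omission of the paper's semantic-relation term below are all simplifications for illustration.

```python
import torch.nn.functional as F

def knowledge_preserving_loss(student_logits, teacher_logits, labels,
                              T=2.0, alpha=0.5):
    # learn the new SBIR task
    ce = F.cross_entropy(student_logits, labels)
    # preserve the frozen ImageNet teacher's soft predictions (distillation)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction='batchmean') * T * T
    return alpha * ce + (1 - alpha) * kd
```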

SOGNet: Scene Overlap Graph Network for Panoptic Segmentation

Title SOGNet: Scene Overlap Graph Network for Panoptic Segmentation
Authors Yibo Yang, Hongyang Li, Xia Li, Qijie Zhao, Jianlong Wu, Zhouchen Lin
Abstract The panoptic segmentation task requires a unified result from semantic and instance segmentation outputs that may contain overlaps. However, current studies widely ignore modeling overlaps. In this study, we aim to model overlap relations among instances and resolve them for panoptic segmentation. Inspired by scene graph representation, we formulate the overlapping problem as a simplified case, named scene overlap graph. We leverage each object's category, geometry and appearance features to perform relational embedding, and output a relation matrix that encodes overlap relations. In order to overcome the lack of supervision, we introduce a differentiable module to resolve the overlap between any pair of instances. The mask logits after removing overlaps are fed into per-pixel instance id classification, which leverages the panoptic supervision to assist in the modeling of overlap relations. Besides, we generate an approximate ground truth of overlap relations as weak supervision, to quantify the accuracy of the overlap relations predicted by our method. Experiments on COCO and Cityscapes demonstrate that our method is able to accurately predict overlap relations and outperforms the state of the art for panoptic segmentation. Our method also won the Innovation Award in the COCO 2019 challenge.
Tasks Instance Segmentation, Panoptic Segmentation, Semantic Segmentation
Published 2019-11-18
URL https://arxiv.org/abs/1911.07527v1
PDF https://arxiv.org/pdf/1911.07527v1.pdf
PWC https://paperswithcode.com/paper/sognet-scene-overlap-graph-network-for
Repo https://github.com/LaoYang1994/SOGNet
Framework pytorch
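
To make the overlap-resolution step concrete, here is a toy version: given a predicted relation matrix, suppress each instance's mask logits where instances predicted to lie on top of it are active. This is a sketch of the mechanism, not the paper's differentiable module.

```python
import torch

def resolve_overlaps(mask_logits, relation):
    # mask_logits: (N, H, W) per-instance logits
    # relation: (N, N), relation[i, j] ~ P(instance i lies on top of instance j)
    probs = mask_logits.sigmoid()
    # how strongly each instance j is covered by the instances above it
    occlusion = torch.einsum('ij,ihw->jhw', relation, probs)
    return mask_logits - occlusion    # suppress logits in occluded regions
```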

Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations

Title Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations
Authors Xu Wang, Jingming He, Lin Ma
Abstract In this paper, we propose a novel model for point cloud semantic segmentation, which exploits both the local and global structures within the point cloud based on contextual point representations. Specifically, we enrich each point representation by performing a novel gated fusion on the point itself and its contextual points. Afterwards, based on the enriched representation, we propose a novel graph pointnet module, relying on the graph attention block to dynamically compose and update each point representation within the local point cloud structure. Finally, we resort to spatial-wise and channel-wise attention strategies to exploit the point cloud global structure and thereby yield the resulting semantic label for each point. Extensive results on public point cloud databases, namely the S3DIS and ScanNet datasets, demonstrate the effectiveness of our proposed model, outperforming state-of-the-art approaches. Our code for this paper is available at https://github.com/fly519/ELGS.
Tasks Semantic Segmentation
Published 2019-11-13
URL https://arxiv.org/abs/1911.05277v1
PDF https://arxiv.org/pdf/1911.05277v1.pdf
PWC https://paperswithcode.com/paper/exploiting-local-and-global-structure-for-1
Repo https://github.com/fly519/ELGS
Framework tf
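
The gated fusion from the abstract reduces to a learned convex blend of each point's own feature and its aggregated context. A minimal sketch with placeholder dimensions (written in PyTorch for brevity, though the authors' code is TensorFlow):

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    # Blend each point feature with its contextual feature via a learned gate.
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, point_feat, context_feat):
        # point_feat, context_feat: (N, dim)
        g = self.gate(torch.cat([point_feat, context_feat], dim=-1))
        return g * point_feat + (1 - g) * context_feat
```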

Shepherding Hordes of Markov Chains

Title Shepherding Hordes of Markov Chains
Authors Milan Ceska, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen
Abstract This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains. Such families occur in software product lines, planning under partial observability, and sketching of probabilistic programs. Simple questions, like 'does at least one family member satisfy a property?', are NP-hard. We tackle two problems: distinguish family members that satisfy a given quantitative property from those that do not, and determine a family member that satisfies the property optimally, i.e., with the highest probability or reward. We show that combining two well-known techniques, MDP model checking and abstraction refinement, mitigates the computational complexity. Experiments on a broad set of benchmarks show that in many situations, our approach is able to handle families of millions of MCs, providing superior scalability compared to existing solutions.
Tasks
Published 2019-02-15
URL http://arxiv.org/abs/1902.05727v2
PDF http://arxiv.org/pdf/1902.05727v2.pdf
PWC https://paperswithcode.com/paper/shepherding-hordes-of-markov-chains
Repo https://github.com/moves-rwth/shepherd
Framework none
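
As a baseline for intuition, the naive approach the paper improves upon enumerates every family member and model-checks each one. A brute-force sketch, computing reachability by solving a linear system; it assumes every non-target state reaches the target with positive probability so the system is nonsingular, and `build_chain` is a user-supplied constructor, not part of the authors' tool.

```python
import itertools
import numpy as np

def best_family_member(build_chain, domains, target, init=0):
    # domains: {parameter: iterable of values}; build_chain(assignment)
    # returns a row-stochastic transition matrix P for that family member.
    best, best_p = None, -1.0
    for values in itertools.product(*domains.values()):
        assignment = dict(zip(domains, values))
        P = build_chain(assignment)
        n = P.shape[0]
        A = np.eye(n) - P                 # reachability: (I - P) p = 0 off target
        b = np.zeros(n)
        A[target, :] = 0.0
        A[target, target] = 1.0           # boundary condition p[target] = 1
        b[target] = 1.0
        p = np.linalg.solve(A, b)
        if p[init] > best_p:              # probability of reaching target from init
            best, best_p = assignment, p[init]
    return best, best_p
```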

Order-Independent Structure Learning of Multivariate Regression Chain Graphs

Title Order-Independent Structure Learning of Multivariate Regression Chain Graphs
Authors Mohammad Ali Javidian, Marco Valtorta, Pooyan Jamshidi
Abstract This paper deals with multivariate regression chain graphs (MVR CGs), which were introduced by Cox and Wermuth [3,4] to represent linear causal models with correlated errors. We consider the PC-like algorithm for structure learning of MVR CGs, which is a constraint-based method proposed by Sonntag and Peña in [18]. We show that the PC-like algorithm is order-dependent, in the sense that the output can depend on the order in which the variables are given. This order-dependence is a minor issue in low-dimensional settings. However, it can be very pronounced in high-dimensional settings, where it can lead to highly variable results. We propose two modifications of the PC-like algorithm that remove part or all of this order-dependence. Simulations under a variety of settings demonstrate the competitive performance of our algorithms in comparison with the original PC-like algorithm in low-dimensional settings and improved performance in high-dimensional settings.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.01067v1
PDF https://arxiv.org/pdf/1910.01067v1.pdf
PWC https://paperswithcode.com/paper/order-independent-structure-learning-of
Repo https://github.com/majavid/SUM2019
Framework none
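
The fix for order-dependence mirrors PC-stable: freeze the adjacency sets at each level and apply all edge deletions of a level together, so the result no longer depends on the order in which edges are visited. An illustrative skeleton search with a user-supplied conditional-independence oracle; this sketches the general idea, not the authors' exact MVR-CG algorithm.

```python
from itertools import combinations

def skeleton_stable(nodes, indep_test, max_level=3):
    # indep_test(x, y, S) -> True if x and y are independent given set S
    adj = {v: set(nodes) - {v} for v in nodes}
    for level in range(max_level + 1):
        frozen = {v: set(adj[v]) for v in nodes}        # freeze adjacencies
        to_remove = []
        for x in nodes:
            for y in frozen[x]:
                for S in combinations(frozen[x] - {y}, level):
                    if indep_test(x, y, set(S)):
                        to_remove.append((x, y))
                        break
        for x, y in to_remove:                          # apply deletions together
            adj[x].discard(y)
            adj[y].discard(x)
    return adj
```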

2017 Robotic Instrument Segmentation Challenge

Title 2017 Robotic Instrument Segmentation Challenge
Authors Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian
Abstract In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison. However, this type of approach has had limited translation to problems in robotic assisted surgery as this field has never established the same level of common datasets and benchmarking methods. In 2015 a sub-challenge was introduced at the EndoVis workshop where a set of robotic images were provided with automatically generated annotations from robot forward kinematics. However, there were issues with this dataset due to the limited background variation, lack of complex motion and inaccuracies in the annotation. In this work we present the results of the 2017 challenge on robotic instrument segmentation which involved 10 teams participating in binary, parts and type based segmentation of articulated da Vinci robotic instruments.
Tasks Semantic Segmentation
Published 2019-02-18
URL http://arxiv.org/abs/1902.06426v2
PDF http://arxiv.org/pdf/1902.06426v2.pdf
PWC https://paperswithcode.com/paper/2017-robotic-instrument-segmentation
Repo https://github.com/ternaus/robot-surgery-segmentation
Framework pytorch
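
The challenge scores binary, parts and type segmentation with intersection-over-union. A minimal per-frame binary IoU, assuming boolean or {0,1} masks; the challenge's exact averaging across frames and datasets is not reproduced here.

```python
import numpy as np

def binary_iou(pred, target):
    # pred, target: instrument masks of identical shape
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0                      # both masks empty: count as perfect
    return float(np.logical_and(pred, target).sum() / union)
```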

Neural Tangents: Fast and Easy Infinite Neural Networks in Python

Title Neural Tangents: Fast and Easy Infinite Neural Networks in Python
Authors Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz
Abstract Neural Tangents is a library designed to enable research into infinite-width neural networks. It provides a high-level API for specifying complex and hierarchical neural network architectures. These networks can then be trained and evaluated either at finite-width as usual or in their infinite-width limit. Infinite-width networks can be trained analytically using exact Bayesian inference or using gradient descent via the Neural Tangent Kernel. Additionally, Neural Tangents provides tools to study gradient descent training dynamics of wide but finite networks in either function space or weight space. The entire library runs out-of-the-box on CPU, GPU, or TPU. All computations can be automatically distributed over multiple accelerators with near-linear scaling in the number of devices. Neural Tangents is available at www.github.com/google/neural-tangents. We also provide an accompanying interactive Colab notebook.
Tasks Bayesian Inference
Published 2019-12-05
URL https://arxiv.org/abs/1912.02803v1
PDF https://arxiv.org/pdf/1912.02803v1.pdf
PWC https://paperswithcode.com/paper/neural-tangents-fast-and-easy-infinite-neural-1
Repo https://github.com/google/neural-tangents
Framework jax
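
The README-level workflow: `stax.serial` returns an `(init_fn, apply_fn, kernel_fn)` triple, and `nt.predict` gives closed-form predictions of the infinitely wide network. Minor API details may differ across library versions.

```python
import neural_tangents as nt
from neural_tangents import stax
from jax import random

# init_fn/apply_fn work at finite width; kernel_fn is the infinite-width kernel.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1))

key = random.PRNGKey(0)
k1, k2, k3 = random.split(key, 3)
x_train = random.normal(k1, (20, 8))
y_train = random.normal(k2, (20, 1))
x_test = random.normal(k3, (5, 8))

# Closed-form posterior mean of the infinitely wide network trained to
# convergence with gradient descent on MSE (the NTK limit).
predict_fn = nt.predict.gradient_descent_mse_ensemble(kernel_fn, x_train, y_train)
y_mean = predict_fn(x_test=x_test, get='ntk')
```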