January 31, 2020

3171 words 15 mins read

Paper Group AWR 387

Multi-class Classification without Multi-class Labels. Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach. Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval. OpenEDS: Open Eye Dataset. HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation. Provably Robust …

Multi-class Classification without Multi-class Labels

Title Multi-class Classification without Multi-class Labels
Authors Yen-Chang Hsu, Zhaoyang Lv, Joel Schlosser, Phillip Odom, Zsolt Kira
Abstract This work presents a new strategy for multi-class classification that requires no class-specific labels, but instead leverages pairwise similarity between examples, which is a weaker form of annotation. The proposed method, meta classification learning, optimizes a binary classifier for pairwise similarity prediction and through this process learns a multi-class classifier as a submodule. We formulate this approach, present a probabilistic graphical model for it, and derive a surprisingly simple loss function that can be used to learn neural network-based models. We then demonstrate that this same framework generalizes to the supervised, unsupervised cross-task, and semi-supervised settings. Our method is evaluated against the state of the art in all three learning paradigms and shows superior or comparable accuracy, providing evidence that learning multi-class classification without multi-class labels is a viable option.
Tasks
Published 2019-01-02
URL http://arxiv.org/abs/1901.00544v1
PDF http://arxiv.org/pdf/1901.00544v1.pdf
PWC https://paperswithcode.com/paper/multi-class-classification-without-multi
Repo https://github.com/GT-RIPL/L2C
Framework pytorch
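
The “surprisingly simple loss function” lends itself to a compact sketch. Below is a minimal PyTorch version, assuming (as the abstract suggests) that the similarity of two examples is scored by the inner product of their softmax class posteriors; the function name and the `eps` clamp are illustrative, not taken from the repo.

```python
import torch.nn.functional as F

def meta_classification_loss(logits, pairwise_labels, eps=1e-7):
    # logits: (N, C) multi-class outputs of the classifier submodule
    # pairwise_labels: (N, N) floats, 1.0 if examples i and j are deemed similar
    p = F.softmax(logits, dim=1)                 # class posteriors
    s_hat = (p @ p.t()).clamp(eps, 1.0 - eps)    # predicted pairwise similarity
    return F.binary_cross_entropy(s_hat, pairwise_labels)
```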

Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach

Title Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach
Authors Xi Shen, Ilaria Pastrolin, Oumayma Bounou, Spyros Gidaris, Marc Smith, Olivier Poncet, Mathieu Aubry
Abstract Historical watermark recognition is a highly practical, yet unsolved challenge for archivists and historians. With a large number of well-defined classes, cluttered and noisy samples, different types of representations, and both subtle differences between classes and high intra-class variation, historical watermarks are also challenging for pattern recognition. In this paper, overcoming the difficulty of data collection, we present a large public dataset with more than 6k new photographs, making it possible for the first time to tackle at scale the scenarios of practical interest for scholars: one-shot instance recognition and cross-domain one-shot instance recognition amongst more than 16k fine-grained classes. We demonstrate that this new dataset is large enough to train modern deep learning approaches, and show that standard methods can be improved considerably by using mid-level deep features. More precisely, we design both a matching score and a feature fine-tuning strategy based on filtering local matches using spatial consistency. This consistency-based approach provides an important performance boost compared to strong baselines. Our model achieves 55% top-1 accuracy on our very challenging 16,753-class one-shot cross-domain recognition task, each class described by a single drawing from the classic Briquet catalog. In addition to watermark classification, we show our approach provides promising results on fine-grained sketch-based image retrieval.
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2019-08-27
URL https://arxiv.org/abs/1908.10254v1
PDF https://arxiv.org/pdf/1908.10254v1.pdf
PWC https://paperswithcode.com/paper/large-scale-historical-watermark-recognition
Repo https://github.com/XiSHEN0220/WatermarkReco
Framework pytorch
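
The consistency-based matching can be illustrated with a toy score that keeps only mutual nearest-neighbor feature matches agreeing with a dominant spatial offset. Everything here (cosine matching, median-offset filtering, the tolerance) is an assumption for illustration, not the authors' exact method.

```python
import numpy as np

def consistency_score(feats_a, feats_b, pos_a, pos_b, tol=2.0):
    # feats_*: (N, D) L2-normalized local (mid-level) descriptors
    # pos_*:   (N, 2) grid positions of those descriptors
    sim = feats_a @ feats_b.T
    nn_ab = sim.argmax(axis=1)                   # best match in B for each A feature
    nn_ba = sim.argmax(axis=0)                   # best match in A for each B feature
    mutual = np.where(nn_ba[nn_ab] == np.arange(len(feats_a)))[0]
    if len(mutual) == 0:
        return 0.0
    offsets = pos_b[nn_ab[mutual]] - pos_a[mutual]
    median = np.median(offsets, axis=0)          # dominant translation
    consistent = np.linalg.norm(offsets - median, axis=1) < tol
    return float(consistent.sum())               # count of spatially consistent matches
```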

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

Title Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval
Authors Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados, Yi-Zhe Song
Abstract In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to retrieve photos from unseen categories. We advance prior art by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketch and photo, and (ii) the necessity of moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of the often semi-photorealistic ones included in existing datasets. We then formulate a ZS-SBIR framework to jointly model sketches and photos in a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, a reduced version of our model can already significantly outperform the state of the art on existing datasets. We further demonstrate the superior performance of our full model by comparing with a number of alternatives on the newly proposed dataset. The new dataset, plus all training and testing code of our model, will be publicly released to facilitate future research.
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2019-04-06
URL http://arxiv.org/abs/1904.03451v1
PDF http://arxiv.org/pdf/1904.03451v1.pdf
PWC https://paperswithcode.com/paper/doodle-to-search-practical-zero-shot-sketch
Repo https://github.com/sounakdey/doodle2search
Framework pytorch
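
One standard way to realize the common embedding space described above is a triplet objective over sketch anchors with photo positives and negatives. The sketch below assumes generic backbones with 512-d outputs and omits the paper's mutual-information mining and semantic-knowledge components.

```python
import torch
import torch.nn as nn

class JointEmbedding(nn.Module):
    # Minimal sketch of a shared sketch/photo embedding; backbones and
    # dimensions are placeholders, not the paper's architecture.
    def __init__(self, backbone_sketch, backbone_photo, dim=256):
        super().__init__()
        self.sketch_net, self.photo_net = backbone_sketch, backbone_photo
        self.proj = nn.Linear(512, dim)   # assumes 512-d backbone features

    def forward(self, sketch, photo_pos, photo_neg):
        s = self.proj(self.sketch_net(sketch))
        p = self.proj(self.photo_net(photo_pos))
        n = self.proj(self.photo_net(photo_neg))
        # pull matching photos toward the sketch, push non-matching away
        return nn.functional.triplet_margin_loss(s, p, n, margin=0.3)
```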

OpenEDS: Open Eye Dataset

Title OpenEDS: Open Eye Dataset
Authors Stephan J. Garbin, Yiru Shen, Immo Schuetz, Robert Cavin, Gregory Hughes, Sachin S. Talathi
Abstract We present a large-scale dataset, OpenEDS: Open Eye Dataset, of eye images captured using a virtual-reality (VR) head-mounted display fitted with two synchronized eye-facing cameras at a frame rate of 200 Hz under controlled illumination. This dataset is compiled from video captures of the eye region collected from 152 individual participants and is divided into four subsets: (i) 12,759 images with pixel-level annotations for key eye regions: iris, pupil and sclera; (ii) 252,690 unlabelled eye images; (iii) 91,200 frames from randomly selected 1.5-second video sequences; and (iv) 143 pairs of left and right point-cloud data compiled from corneal topography of eye regions, collected from a subset (143 of the 152 participants in the study). A baseline experiment on OpenEDS for the task of semantic segmentation of pupil, iris, sclera and background achieves a mean intersection-over-union (mIoU) of 98.3%. We anticipate that OpenEDS will create opportunities for researchers in the eye-tracking community and the broader machine learning and computer vision communities to advance the state of eye tracking for VR applications. The dataset is available for download upon request at https://research.fb.com/programs/openeds-challenge
Tasks Eye Tracking, Semantic Segmentation
Published 2019-04-30
URL https://arxiv.org/abs/1905.03702v2
PDF https://arxiv.org/pdf/1905.03702v2.pdf
PWC https://paperswithcode.com/paper/190503702
Repo https://github.com/lib314a/Good-And-Open
Framework none
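
For reference, the reported 98.3% mIoU is the per-class intersection-over-union averaged over the four classes. A minimal implementation, assuming integer label maps of identical shape:

```python
import numpy as np

def mean_iou(pred, target, num_classes=4):
    # pred, target: integer label maps over {background, iris, pupil, sclera}
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```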

HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation

Title HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
Authors Cheng Sun, Chi-Wei Hsiao, Min Sun, Hwann-Tzong Chen
Abstract We present a new approach to the problem of estimating the 3D room layout from a single panoramic image. We represent the room layout as three 1D vectors that encode, at each image column, the boundary positions of floor-wall and ceiling-wall, and the existence of a wall-wall boundary. The proposed network, HorizonNet, trained to predict this 1D layout, outperforms previous state-of-the-art approaches. The designed post-processing procedure for recovering 3D room layouts from 1D predictions can automatically infer the room shape at low computational cost: it takes less than 20ms for a panorama image, while prior works may need dozens of seconds. We also propose Pano Stretch Data Augmentation, which can diversify panorama data and be applied to other panorama-related learning tasks. Due to the limited data available for non-cuboid layouts, we relabel 65 general layouts from the current dataset for fine-tuning. Our approach shows good performance on general layouts, as demonstrated by qualitative results and cross-validation.
Tasks 3D Room Layouts From A Single Rgb Panorama, Data Augmentation
Published 2019-01-12
URL http://arxiv.org/abs/1901.03861v2
PDF http://arxiv.org/pdf/1901.03861v2.pdf
PWC https://paperswithcode.com/paper/horizonnet-learning-room-layout-with-1d
Repo https://github.com/sunset1995/HorizonNet
Framework pytorch
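
The 1D representation is simple to write down. Below is an illustrative encoding of per-column ceiling/floor boundaries plus a corner-existence signal, as the abstract describes; the Gaussian peak width is an assumption, not the paper's exact recipe, and the Pano Stretch augmentation itself is not shown.

```python
import numpy as np

def encode_1d_layout(ceil_y, floor_y, corner_cols, width, sigma=4.0):
    # ceil_y, floor_y: (W,) boundary row positions per image column
    # corner_cols: columns where a wall-wall boundary occurs
    cols = np.arange(width)
    corner = np.zeros(width)
    for c in corner_cols:                    # soft peak at each corner column
        corner = np.maximum(corner, np.exp(-0.5 * ((cols - c) / sigma) ** 2))
    return np.stack([ceil_y, floor_y, corner])   # shape (3, W)
```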

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Title Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
Authors Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sebastien Bubeck
Abstract Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further. Our code and trained models are available at http://github.com/Hadisalman/smoothing-adversarial .
Tasks Adversarial Attack, Adversarial Defense
Published 2019-06-09
URL https://arxiv.org/abs/1906.04584v5
PDF https://arxiv.org/pdf/1906.04584v5.pdf
PWC https://paperswithcode.com/paper/provably-robust-deep-learning-via
Repo https://github.com/Hadisalman/smoothing-adversarial
Framework pytorch
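
For context, prediction with a smoothed classifier is a Monte-Carlo majority vote over Gaussian-perturbed copies of the input. The sketch below shows only that baseline step; the paper's contribution, adversarially training the base classifier under this smoothing, is not reproduced here.

```python
import torch

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n=100, num_classes=10):
    # x: a single input with batch dimension, shape (1, C, H, W)
    counts = torch.zeros(num_classes)
    for _ in range(n):
        noisy = x + sigma * torch.randn_like(x)   # Gaussian perturbation
        counts[model(noisy).argmax(dim=1)] += 1   # vote of the base classifier
    return counts.argmax().item()                 # majority-vote class
```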

Deep Learning on Image Denoising: An overview

Title Deep Learning on Image Denoising: An overview
Authors Chunwei Tian, Lunke Fei, Wenxian Zheng, Yong Xu, Wangmeng Zuo, Chia-Wen Lin
Abstract Deep learning techniques have received much attention in image denoising. However, deep learning methods of different types differ considerably in how they deal with noise. Specifically, discriminative learning based on deep learning can effectively address Gaussian noise, while optimization-model methods based on deep learning are effective at estimating real noise. So far, there has been little research summarizing the different deep learning techniques for image denoising. In this paper, we present such a comparative study of different deep techniques in image denoising. We first classify (1) deep convolutional neural networks (CNNs) for additive white noisy images, (2) deep CNNs for real noisy images, (3) deep CNNs for blind denoising and (4) deep CNNs for hybrid noisy images, i.e., combinations of noisy, blurred and low-resolution images. Then, we analyze the motivations and principles of the different types of deep learning methods. Next, we compare and verify the state-of-the-art methods on public denoising datasets in terms of quantitative and qualitative analysis. Finally, we point out some potential challenges and directions for future research.
Tasks Denoising, Image Denoising
Published 2019-12-31
URL https://arxiv.org/abs/1912.13171v2
PDF https://arxiv.org/pdf/1912.13171v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-on-image-denoising-an-overview
Repo https://github.com/hellloxiaotian/Deep-Learning-on-Image-Denoising-An-overview
Framework none
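
Many of the discriminative methods the survey covers follow the residual-learning pattern of DnCNN: predict the noise and subtract it. A minimal sketch (depth and width here are placeholders, not any surveyed model's exact configuration):

```python
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    # Minimal DnCNN-style residual denoiser: the network estimates the
    # noise map and the forward pass removes it from the input.
    def __init__(self, channels=1, features=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        return noisy - self.body(noisy)   # residual learning: subtract estimated noise
```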

Fast Graph Representation Learning with PyTorch Geometric

Title Fast Graph Representation Learning with PyTorch Geometric
Authors Matthias Fey, Jan Eric Lenssen
Abstract We introduce PyTorch Geometric, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch. In addition to general graph data structures and processing methods, it contains a variety of recently published methods from the domains of relational learning and 3D data processing. PyTorch Geometric achieves high data throughput by leveraging sparse GPU acceleration, by providing dedicated CUDA kernels and by introducing efficient mini-batch handling for input examples of different size. In this work, we present the library in detail and perform a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios.
Tasks Graph Classification, Graph Representation Learning, Node Classification, Relational Reasoning, Representation Learning
Published 2019-03-06
URL http://arxiv.org/abs/1903.02428v3
PDF http://arxiv.org/pdf/1903.02428v3.pdf
PWC https://paperswithcode.com/paper/fast-graph-representation-learning-with
Repo https://github.com/rusty1s/pytorch_geometric
Framework pytorch
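
A typical usage pattern, following the library's README: node features and a COO edge index are wrapped in a `Data` object and passed through a convolution layer. Exact module paths may vary slightly across PyG releases.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

# A 3-node toy graph; edge_index holds source/target rows in COO format.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.randn(3, 16)                     # 16-d feature vector per node
data = Data(x=x, edge_index=edge_index)

conv = GCNConv(16, 32)                     # one graph convolution layer
out = conv(data.x, data.edge_index)        # -> (3, 32) node embeddings
```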

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

Title Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval
Authors Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille
Abstract Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem with a wide range of real-world applications. Recently, research interest has arisen in solving this problem under the more realistic and challenging setting of zero-shot learning. In this paper, we investigate this problem from the viewpoint of domain adaptation, which we show is critical in improving feature embedding in the zero-shot scenario. Based on a framework which starts with a pre-trained model on ImageNet and fine-tunes it on the training set of an SBIR benchmark, we advocate the importance of preserving previously acquired knowledge, e.g., the rich discriminative features learned from ImageNet, to improve the model's transfer ability. For this purpose, we design an approach named Semantic-Aware Knowledge prEservation (SAKE), which fine-tunes the pre-trained model in an economical way and leverages semantic information, e.g., inter-class relationships, to achieve the goal of knowledge preservation. Zero-shot experiments on two extended SBIR datasets, TU-Berlin and Sketchy, verify the superior performance of our approach. Extensive diagnostic experiments validate that the preserved knowledge benefits SBIR in zero-shot settings, as a large fraction of the performance gain comes from the more properly structured feature embedding for photo images. Code is available at: https://github.com/qliu24/SAKE.
Tasks Domain Adaptation, Image Retrieval, Sketch-Based Image Retrieval, Zero-Shot Learning
Published 2019-04-05
URL https://arxiv.org/abs/1904.03208v3
PDF https://arxiv.org/pdf/1904.03208v3.pdf
PWC https://paperswithcode.com/paper/semantic-aware-knowledge-preservation-for
Repo https://github.com/qliu24/SAKE
Framework pytorch
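
The knowledge-preservation idea can be summarized as joint training on the new task plus distillation from the frozen pre-trained teacher. The weighting, the temperature, and the omission of the paper's semantic-relation term below are all simplifications for illustration.

```python
import torch.nn.functional as F

def knowledge_preserving_loss(student_logits, teacher_logits, labels,
                              T=2.0, alpha=0.5):
    # learn the new SBIR task
    ce = F.cross_entropy(student_logits, labels)
    # preserve the frozen ImageNet teacher's soft predictions (distillation)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction='batchmean') * T * T
    return alpha * ce + (1 - alpha) * kd
```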

SOGNet: Scene Overlap Graph Network for Panoptic Segmentation

Title SOGNet: Scene Overlap Graph Network for Panoptic Segmentation
Authors Yibo Yang, Hongyang Li, Xia Li, Qijie Zhao, Jianlong Wu, Zhouchen Lin
Abstract The panoptic segmentation task requires a unified result from semantic and instance segmentation outputs that may contain overlaps. However, current studies widely ignore modeling overlaps. In this study, we aim to model overlap relations among instances and resolve them for panoptic segmentation. Inspired by scene graph representation, we formulate the overlapping problem as a simplified case, named scene overlap graph. We leverage each object's category, geometry and appearance features to perform relational embedding, and output a relation matrix that encodes overlap relations. In order to overcome the lack of supervision, we introduce a differentiable module to resolve the overlap between any pair of instances. The mask logits after removing overlaps are fed into per-pixel instance id classification, which leverages the panoptic supervision to assist in the modeling of overlap relations. Besides, we generate an approximate ground truth of overlap relations as weak supervision, to quantify the accuracy of the overlap relations predicted by our method. Experiments on COCO and Cityscapes demonstrate that our method is able to accurately predict overlap relations and outperforms the state of the art for panoptic segmentation. Our method also won the Innovation Award in the COCO 2019 challenge.
Tasks Instance Segmentation, Panoptic Segmentation, Semantic Segmentation
Published 2019-11-18
URL https://arxiv.org/abs/1911.07527v1
PDF https://arxiv.org/pdf/1911.07527v1.pdf
PWC https://paperswithcode.com/paper/sognet-scene-overlap-graph-network-for
Repo https://github.com/LaoYang1994/SOGNet
Framework pytorch
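
To make the overlap-resolution step concrete, here is a toy version: given a predicted relation matrix, suppress each instance's mask logits where instances predicted to lie on top of it are active. This is a sketch of the mechanism, not the paper's differentiable module.

```python
import torch

def resolve_overlaps(mask_logits, relation):
    # mask_logits: (N, H, W) per-instance logits
    # relation: (N, N), relation[i, j] ~ P(instance i lies on top of instance j)
    probs = mask_logits.sigmoid()
    # how strongly each instance j is covered by the instances above it
    occlusion = torch.einsum('ij,ihw->jhw', relation, probs)
    return mask_logits - occlusion    # suppress logits in occluded regions
```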

Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations

Title Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations
Authors Xu Wang, Jingming He, Lin Ma
Abstract In this paper, we propose a novel model for point cloud semantic segmentation, which exploits both the local and global structures within the point cloud based on contextual point representations. Specifically, we enrich each point representation by performing a novel gated fusion on the point itself and its contextual points. Afterwards, based on the enriched representation, we propose a novel graph pointnet module, relying on the graph attention block to dynamically compose and update each point representation within the local point cloud structure. Finally, we resort to spatial-wise and channel-wise attention strategies to exploit the point cloud global structure and thereby yield the resulting semantic label for each point. Extensive results on public point cloud databases, namely the S3DIS and ScanNet datasets, demonstrate the effectiveness of our proposed model, outperforming state-of-the-art approaches. Our code for this paper is available at https://github.com/fly519/ELGS.
Tasks Semantic Segmentation
Published 2019-11-13
URL https://arxiv.org/abs/1911.05277v1
PDF https://arxiv.org/pdf/1911.05277v1.pdf
PWC https://paperswithcode.com/paper/exploiting-local-and-global-structure-for-1
Repo https://github.com/fly519/ELGS
Framework tf
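
The gated fusion from the abstract reduces to a learned convex blend of each point's own feature and its aggregated context. A minimal sketch with placeholder dimensions (written in PyTorch for brevity, though the authors' code is TensorFlow):

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    # Blend each point feature with its contextual feature via a learned gate.
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, point_feat, context_feat):
        # point_feat, context_feat: (N, dim)
        g = self.gate(torch.cat([point_feat, context_feat], dim=-1))
        return g * point_feat + (1 - g) * context_feat
```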

Shepherding Hordes of Markov Chains

Title Shepherding Hordes of Markov Chains
Authors Milan Ceska, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen
Abstract This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains. Such families occur in software product lines, planning under partial observability, and sketching of probabilistic programs. Simple questions, like 'does at least one family member satisfy a property?', are NP-hard. We tackle two problems: distinguish family members that satisfy a given quantitative property from those that do not, and determine a family member that satisfies the property optimally, i.e., with the highest probability or reward. We show that combining two well-known techniques, MDP model checking and abstraction refinement, mitigates the computational complexity. Experiments on a broad set of benchmarks show that in many situations, our approach is able to handle families of millions of MCs, providing superior scalability compared to existing solutions.
Tasks
Published 2019-02-15
URL http://arxiv.org/abs/1902.05727v2
PDF http://arxiv.org/pdf/1902.05727v2.pdf
PWC https://paperswithcode.com/paper/shepherding-hordes-of-markov-chains
Repo https://github.com/moves-rwth/shepherd
Framework none
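
As a baseline for intuition, the naive approach the paper improves upon enumerates every family member and model-checks each one. A brute-force sketch, computing reachability by solving a linear system; it assumes every non-target state reaches the target with positive probability so the system is nonsingular, and `build_chain` is a user-supplied constructor, not part of the authors' tool.

```python
import itertools
import numpy as np

def best_family_member(build_chain, domains, target, init=0):
    # domains: {parameter: iterable of values}; build_chain(assignment)
    # returns a row-stochastic transition matrix P for that family member.
    best, best_p = None, -1.0
    for values in itertools.product(*domains.values()):
        assignment = dict(zip(domains, values))
        P = build_chain(assignment)
        n = P.shape[0]
        A = np.eye(n) - P                 # reachability: (I - P) p = 0 off target
        b = np.zeros(n)
        A[target, :] = 0.0
        A[target, target] = 1.0           # boundary condition p[target] = 1
        b[target] = 1.0
        p = np.linalg.solve(A, b)
        if p[init] > best_p:              # probability of reaching target from init
            best, best_p = assignment, p[init]
    return best, best_p
```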

Order-Independent Structure Learning of Multivariate Regression Chain Graphs

Title Order-Independent Structure Learning of Multivariate Regression Chain Graphs
Authors Mohammad Ali Javidian, Marco Valtorta, Pooyan Jamshidi
Abstract This paper deals with multivariate regression chain graphs (MVR CGs), which were introduced by Cox and Wermuth [3,4] to represent linear causal models with correlated errors. We consider the PC-like algorithm for structure learning of MVR CGs, which is a constraint-based method proposed by Sonntag and Peña in [18]. We show that the PC-like algorithm is order-dependent, in the sense that the output can depend on the order in which the variables are given. This order-dependence is a minor issue in low-dimensional settings. However, it can be very pronounced in high-dimensional settings, where it can lead to highly variable results. We propose two modifications of the PC-like algorithm that remove part or all of this order-dependence. Simulations under a variety of settings demonstrate the competitive performance of our algorithms in comparison with the original PC-like algorithm in low-dimensional settings and improved performance in high-dimensional settings.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.01067v1
PDF https://arxiv.org/pdf/1910.01067v1.pdf
PWC https://paperswithcode.com/paper/order-independent-structure-learning-of
Repo https://github.com/majavid/SUM2019
Framework none
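
The fix for order-dependence mirrors PC-stable: freeze the adjacency sets at each level and apply all edge deletions of a level together, so the result no longer depends on the order in which edges are visited. An illustrative skeleton search with a user-supplied conditional-independence oracle; this sketches the general idea, not the authors' exact MVR-CG algorithm.

```python
from itertools import combinations

def skeleton_stable(nodes, indep_test, max_level=3):
    # indep_test(x, y, S) -> True if x and y are independent given set S
    adj = {v: set(nodes) - {v} for v in nodes}
    for level in range(max_level + 1):
        frozen = {v: set(adj[v]) for v in nodes}        # freeze adjacencies
        to_remove = []
        for x in nodes:
            for y in frozen[x]:
                for S in combinations(frozen[x] - {y}, level):
                    if indep_test(x, y, set(S)):
                        to_remove.append((x, y))
                        break
        for x, y in to_remove:                          # apply deletions together
            adj[x].discard(y)
            adj[y].discard(x)
    return adj
```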

2017 Robotic Instrument Segmentation Challenge

Title 2017 Robotic Instrument Segmentation Challenge
Authors Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian
Abstract In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison. However, this type of approach has had limited translation to problems in robotic assisted surgery as this field has never established the same level of common datasets and benchmarking methods. In 2015 a sub-challenge was introduced at the EndoVis workshop where a set of robotic images were provided with automatically generated annotations from robot forward kinematics. However, there were issues with this dataset due to the limited background variation, lack of complex motion and inaccuracies in the annotation. In this work we present the results of the 2017 challenge on robotic instrument segmentation which involved 10 teams participating in binary, parts and type based segmentation of articulated da Vinci robotic instruments.
Tasks Semantic Segmentation
Published 2019-02-18
URL http://arxiv.org/abs/1902.06426v2
PDF http://arxiv.org/pdf/1902.06426v2.pdf
PWC https://paperswithcode.com/paper/2017-robotic-instrument-segmentation
Repo https://github.com/ternaus/robot-surgery-segmentation
Framework pytorch
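
The challenge scores binary, parts and type segmentation with intersection-over-union. A minimal per-frame binary IoU, assuming boolean or {0,1} masks; the challenge's exact averaging across frames and datasets is not reproduced here.

```python
import numpy as np

def binary_iou(pred, target):
    # pred, target: instrument masks of identical shape
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0                      # both masks empty: count as perfect
    return float(np.logical_and(pred, target).sum() / union)
```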

Neural Tangents: Fast and Easy Infinite Neural Networks in Python

Title Neural Tangents: Fast and Easy Infinite Neural Networks in Python
Authors Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz
Abstract Neural Tangents is a library designed to enable research into infinite-width neural networks. It provides a high-level API for specifying complex and hierarchical neural network architectures. These networks can then be trained and evaluated either at finite-width as usual or in their infinite-width limit. Infinite-width networks can be trained analytically using exact Bayesian inference or using gradient descent via the Neural Tangent Kernel. Additionally, Neural Tangents provides tools to study gradient descent training dynamics of wide but finite networks in either function space or weight space. The entire library runs out-of-the-box on CPU, GPU, or TPU. All computations can be automatically distributed over multiple accelerators with near-linear scaling in the number of devices. Neural Tangents is available at www.github.com/google/neural-tangents. We also provide an accompanying interactive Colab notebook.
Tasks Bayesian Inference
Published 2019-12-05
URL https://arxiv.org/abs/1912.02803v1
PDF https://arxiv.org/pdf/1912.02803v1.pdf
PWC https://paperswithcode.com/paper/neural-tangents-fast-and-easy-infinite-neural-1
Repo https://github.com/google/neural-tangents
Framework jax
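
The README-level workflow: `stax.serial` returns an `(init_fn, apply_fn, kernel_fn)` triple, and `nt.predict` gives closed-form predictions of the infinitely wide network. Minor API details may differ across library versions.

```python
import neural_tangents as nt
from neural_tangents import stax
from jax import random

# init_fn/apply_fn work at finite width; kernel_fn is the infinite-width kernel.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1))

key = random.PRNGKey(0)
k1, k2, k3 = random.split(key, 3)
x_train = random.normal(k1, (20, 8))
y_train = random.normal(k2, (20, 1))
x_test = random.normal(k3, (5, 8))

# Closed-form posterior mean of the infinitely wide network trained to
# convergence with gradient descent on MSE (the NTK limit).
predict_fn = nt.predict.gradient_descent_mse_ensemble(kernel_fn, x_train, y_train)
y_mean = predict_fn(x_test=x_test, get='ntk')
```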