January 25, 2020

Paper Group ANR 1658

Learning from Web Data with Self-Organizing Memory Module


Title	Learning from Web Data with Self-Organizing Memory Module
Authors	Yi Tu, Li Niu, Junjie Chen, Dawei Cheng, Liqing Zhang
Abstract	Learning from web data has attracted lots of research interest in recent years. However, crawled web images usually have two types of noises, label noise and background noise, which induce extra difficulties in utilizing them effectively. Most existing methods either rely on human supervision or ignore the background noise. In this paper, we propose a novel method, which is capable of handling these two types of noises together, without the supervision of clean images in the training stage. Particularly, we formulate our method under the framework of multi-instance learning by grouping ROIs (i.e., images and their region proposals) from the same category into bags. ROIs in each bag are assigned with different weights based on the representative/discriminative scores of their nearest clusters, in which the clusters and their scores are obtained via our designed memory module. Our memory module could be naturally integrated with the classification module, leading to an end-to-end trainable system. Extensive experiments on four benchmark datasets demonstrate the effectiveness of our method.
Tasks
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12028v5
PDF	https://arxiv.org/pdf/1906.12028v5.pdf
PWC	https://paperswithcode.com/paper/protonet-learning-from-web-data-with-memory
Repo
Framework
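
The weighting step described in the abstract above (each ROI in a bag receives a weight taken from the score of its nearest cluster) can be illustrated with a minimal NumPy sketch. Everything named here is hypothetical: `bag_feature`, the Euclidean nearest-cluster rule, and the sum-to-one normalization are assumptions made for illustration, not the authors' memory module, and the cluster scores are assumed to be positive.

```python
import numpy as np

def bag_feature(roi_feats, centroids, cluster_scores):
    """Weight each ROI by the score of its nearest cluster (illustrative only).

    roi_feats:      (n_rois, dim) features of the ROIs in one bag
    centroids:      (n_clusters, dim) cluster centers (assumed given)
    cluster_scores: (n_clusters,) positive per-cluster scores (assumed given)
    """
    # Squared Euclidean distance from every ROI to every centroid.
    d2 = ((roi_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)            # index of the closest cluster per ROI
    weights = cluster_scores[nearest]      # inherit that cluster's score
    weights = weights / weights.sum()      # normalize within the bag (assumption)
    return weights @ roi_feats, weights    # weighted bag-level feature, ROI weights
```

With two ROIs sitting exactly on two centroids scored 1 and 3, the ROI weights come out as 0.25 and 0.75, so the bag feature leans toward the ROI whose cluster has the higher score.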

Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield


Title	Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield
Authors	Dou Goodman, Tao Wei
Abstract	Many recent works demonstrated that Deep Learning models are vulnerable to adversarial examples. Fortunately, generating adversarial examples usually requires white-box access to the victim model, and the attacker can only access the APIs opened by cloud platforms. Thus, keeping models in the cloud can usually give a (false) sense of security. Unfortunately, cloud-based image classification service is not robust to simple transformations such as Gaussian Noise, Salt-and-Pepper Noise, Rotation and Monochromatization. In this paper, (1) we propose one novel attack method called Image Fusion (IF) attack, which achieves a high bypass rate, can be implemented only with OpenCV and is difficult to defend; and (2) we make the first attempt to conduct an extensive empirical study of Simple Transformation (ST) attacks against real-world cloud-based classification services. Through evaluations on four popular cloud platforms including Amazon, Google, Microsoft, Clarifai, we demonstrate that the ST attack has a success rate of approximately 100% (except on Amazon, approximately 50%), and the IF attack has a success rate over 98% across the different classification services. (3) We discuss the possible defenses to address these security challenges. Experiments show that our defense technology can effectively defend against known ST attacks.
Tasks	Image Classification
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07997v2
PDF	https://arxiv.org/pdf/1906.07997v2.pdf
PWC	https://paperswithcode.com/paper/cloud-based-image-classification-service-is
Repo
Framework
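
The simple transformations named in the abstract above are easy to reproduce. The sketch below uses plain NumPy on `(H, W, 3)` `uint8` images; the function names, default parameters, and the restriction of rotation to multiples of 90 degrees are choices made for this illustration. `image_fusion` in particular is only a guess at the idea: it assumes a plain alpha blend with a background image, which may differ from the paper's IF attack.

```python
import numpy as np

def gaussian_noise(img, sigma=25.0, rng=None):
    """Add zero-mean Gaussian noise and clip back to the uint8 range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def salt_and_pepper(img, amount=0.05, rng=None):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    mask = rng.random(img.shape[:2])       # one draw per pixel, shared by channels
    out[mask < amount / 2] = 0             # pepper
    out[mask > 1 - amount / 2] = 255       # salt
    return out

def rotate90(img, k=1):
    """Rotate by k * 90 degrees (arbitrary angles would need interpolation)."""
    return np.rot90(img, k, axes=(0, 1)).copy()

def monochrome(img):
    """Convert to gray with the usual luma weights, keeping three channels."""
    gray = img.astype(np.float64) @ np.array([0.299, 0.587, 0.114])
    gray = np.clip(np.rint(gray), 0, 255).astype(np.uint8)
    return np.repeat(gray[..., None], 3, axis=2)

def image_fusion(img, background, alpha=0.7):
    """Alpha-blend img with a same-sized background image (assumed form of IF)."""
    blended = alpha * img.astype(np.float64) + (1 - alpha) * background.astype(np.float64)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Each function returns a new array, so the transformed copies can be sent to a classifier while the original image is kept for comparison.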

Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators


Title	Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators
Authors	Xuan Zhu, Yue Cheng, Jinye Peng, Rongzhi Wang, Mingnan Le, Xin Liu
Abstract	Generative adversarial network (GAN) for image super-resolution (SR) has attracted enormous interests in recent years. However, the GAN-based SR methods only use image discriminator to distinguish SR images and high-resolution (HR) images. Image discriminator fails to discriminate images accurately since image features cannot be fully expressed. In this paper, we design a new GAN-based SR framework GAN-IMC which includes generator, image discriminator, morphological component discriminator and color discriminator. The combination of multiple feature discriminators improves the accuracy of image discrimination. Adversarial training between the generator and multi-feature discriminators forces SR images to converge with HR images in terms of data and features distribution. Moreover, in some cases, feature enhancement of salient regions is also worth considering. GAN-IMC is further optimized by weighted content loss (GAN-IMCW), which effectively restores and enhances salient regions in SR images. The effectiveness and robustness of our method are confirmed by extensive experiments on public datasets. Compared with state-of-the-art methods, the proposed method not only achieves competitive Perceptual Index (PI) and Natural Image Quality Evaluator (NIQE) values but also obtains pleasant visual perception in image edge, texture, color and salient regions.
Tasks	Image Super-Resolution, Super-Resolution
Published	2019-04-24
URL	https://arxiv.org/abs/1904.10654v2
PDF	https://arxiv.org/pdf/1904.10654v2.pdf
PWC	https://paperswithcode.com/paper/super-resolution-based-generative-adversarial
Repo
Framework

Learning to Detect and Retrieve Objects from Unlabeled Videos


Title	Learning to Detect and Retrieve Objects from Unlabeled Videos
Authors	Elad Amrani, Rami Ben-Ari, Tal Hakim, Alex Bronstein
Abstract	Learning an object detector or retrieval requires a large data set with manual annotations. Such data sets are expensive and time consuming to create and therefore difficult to obtain on a large scale. In this work, we propose to exploit the natural correlation in narrations and the visual presence of objects in video, to learn an object detector and retrieval without any manual labeling involved. We pose the problem as weakly supervised learning with noisy labels, and propose a novel object detection paradigm under these constraints. We handle the background rejection by using contrastive samples and confront the high level of label noise with a new clustering score. Our evaluation is based on a set of 11 manually annotated objects in over 5000 frames. We show comparison to a weakly-supervised approach as baseline and provide a strongly labeled upper bound.
Tasks	Object Detection
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11137v2
PDF	https://arxiv.org/pdf/1905.11137v2.pdf
PWC	https://paperswithcode.com/paper/toward-self-supervised-object-detection-in
Repo
Framework

Sentiment Tagging with Partial Labels using Modular Architectures


Title	Sentiment Tagging with Partial Labels using Modular Architectures
Authors	Xiao Zhang, Dan Goldwasser
Abstract	Many NLP learning tasks can be decomposed into several distinct sub-tasks, each associated with a partial label. In this paper we focus on a popular class of learning problems, sequence prediction applied to several sentiment analysis tasks, and suggest a modular learning approach in which different sub-tasks are learned using separate functional modules, combined to perform the final task while sharing information. Our experiments show this approach helps constrain the learning process and can alleviate some of the supervision efforts.
Tasks	Sentiment Analysis
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00534v2
PDF	https://arxiv.org/pdf/1906.00534v2.pdf
PWC	https://paperswithcode.com/paper/190600534
Repo
Framework

Constructing Dynamic Knowledge Graph for Visual Semantic Understanding and Applications in Autonomous Robotics


Title	Constructing Dynamic Knowledge Graph for Visual Semantic Understanding and Applications in Autonomous Robotics
Authors	Chen Jiang, Steven Lu, Martin Jagersand
Abstract	Interpreting semantic knowledge that describes entities, relations and attributes, explicitly through visuals and implicitly through behind-the-scenes common sense, is gaining more attention in autonomous robotics. By incorporating vision and language modeling with common-sense knowledge, we can provide rich features indicating strong semantic meanings for human and robot action relationships, which can be utilized further in autonomous robotic controls. In this paper, we propose a systematic scheme to generate high-conceptual dynamic knowledge graphs representing Entity-Relation-Entity (E-R-E) and Entity-Attribute-Value (E-A-V) knowledge by “watching” a video clip. A combination of a Vision-Language model and a static ontology tree is used to illustrate workspace, configurations, functions and usages for both human and robot. The proposed method is flexible and well-versed. It will serve as our first positioning investigation for further research in various applications for autonomous robots.
Tasks	Common Sense Reasoning, Knowledge Graphs, Language Modelling
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07459v1
PDF	https://arxiv.org/pdf/1909.07459v1.pdf
PWC	https://paperswithcode.com/paper/constructing-dynamic-knowledge-graph-for
Repo
Framework

Causal Discovery by Kernel Intrinsic Invariance Measure


Title	Causal Discovery by Kernel Intrinsic Invariance Measure
Authors	Zhitang Chen, Shengyu Zhu, Yue Liu, Tim Tse
Abstract	Reasoning based on causality, instead of association, has been considered a key ingredient towards real machine intelligence. However, it is a challenging task to infer causal relationship/structure among variables. In recent years, an Independent Mechanism (IM) principle was proposed, stating that the mechanism generating the cause and the one mapping the cause to the effect are independent. As a conjecture, it is argued that in the causal direction, the conditional distributions instantiated at different values of the conditioning variable have less variation than in the anti-causal direction. Existing state-of-the-art methods simply compare the variance of the RKHS mean embedding norms of these conditional distributions. In this paper, we prove that this norm-based approach sacrifices important information of the original conditional distributions. We propose a Kernel Intrinsic Invariance Measure (KIIM) to capture higher order statistics corresponding to the shapes of the density functions. We show our algorithm can be reduced to an eigen-decomposition task on a kernel matrix measuring intrinsic deviance/invariance. Causal directions can then be inferred by comparing the KIIM scores of two hypothetical directions. Experiments on synthetic and real data are conducted to show the advantages of our methods over existing solutions.
Tasks	Causal Discovery
Published	2019-09-02
URL	https://arxiv.org/abs/1909.00513v1
PDF	https://arxiv.org/pdf/1909.00513v1.pdf
PWC	https://paperswithcode.com/paper/causal-discovery-by-kernel-intrinsic
Repo
Framework

A Method for Computing Class-wise Universal Adversarial Perturbations


Title	A Method for Computing Class-wise Universal Adversarial Perturbations
Authors	Tejus Gupta, Abhishek Sinha, Nupur Kumari, Mayank Singh, Balaji Krishnamurthy
Abstract	We present an algorithm for computing class-specific universal adversarial perturbations for deep neural networks. Such perturbations can induce misclassification in a large fraction of images of a specific class. Unlike previous methods that use iterative optimization for computing a universal perturbation, the proposed method employs a perturbation that is a linear function of weights of the neural network and hence can be computed much faster. The method does not require any training data and has no hyper-parameters. The attack obtains 34% to 51% fooling rate on state-of-the-art deep neural networks on ImageNet and transfers across models. We also study the characteristics of the decision boundaries learned by standard and adversarially trained models to understand the universal adversarial perturbations.
Tasks
Published	2019-12-01
URL	https://arxiv.org/abs/1912.00466v1
PDF	https://arxiv.org/pdf/1912.00466v1.pdf
PWC	https://paperswithcode.com/paper/a-method-for-computing-class-wise-universal
Repo
Framework

Why Build an Assistant in Minecraft?


Title	Why Build an Assistant in Minecraft?
Authors	Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston
Abstract	In this document we describe a rationale for a research program aimed at building an open “assistant” in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.
Tasks
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09273v2
PDF	https://arxiv.org/pdf/1907.09273v2.pdf
PWC	https://paperswithcode.com/paper/why-build-an-assistant-in-minecraft
Repo
Framework

Manifold Mixup improves text recognition with CTC loss


Title	Manifold Mixup improves text recognition with CTC loss
Authors	Bastien Moysset, Ronaldo Messina
Abstract	Modern handwritten text recognition techniques employ deep recurrent neural networks. The use of these techniques is especially efficient when a large amount of annotated data is available for parameter estimation. Data augmentation can be used to enhance the performance of the systems when data is scarce. Manifold Mixup is a modern method of data augmentation that melds two images, or the feature maps corresponding to these images, with the targets fused accordingly. We propose to apply the Manifold Mixup to text recognition while adapting it to work with a Connectionist Temporal Classification cost. We show that Manifold Mixup improves text recognition results on various languages and datasets.
Tasks	Data Augmentation
Published	2019-03-11
URL	http://arxiv.org/abs/1903.04246v1
PDF	http://arxiv.org/pdf/1903.04246v1.pdf
PWC	https://paperswithcode.com/paper/manifold-mixup-improves-text-recognition-with
Repo
Framework
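
The mixing operation described in the abstract above can be sketched in a few lines of NumPy. Two points are assumptions rather than details taken from the abstract: the mixing coefficient is drawn from a Beta distribution, as is common for mixup, and the targets are "fused" by taking the same convex combination of the two per-target losses, since CTC label sequences cannot be interpolated directly. The function names are hypothetical, and the loss values are plain numbers standing in for CTC losses computed elsewhere.

```python
import numpy as np

def mix_features(h_a, h_b, lam):
    """Convex combination of two same-shaped feature maps."""
    return lam * h_a + (1.0 - lam) * h_b

def mixed_loss(loss_a, loss_b, lam):
    """Combine the losses against each original target with the same weight.

    loss_a / loss_b stand in for the CTC loss of the mixed prediction against
    the first and second target sequence (assumed to be computed elsewhere).
    """
    return lam * loss_a + (1.0 - lam) * loss_b

rng = np.random.default_rng(0)
lam = rng.beta(2.0, 2.0)                 # mixing coefficient in (0, 1); shape values assumed
h_a = rng.normal(size=(4, 16))           # toy feature maps: (time steps, channels)
h_b = rng.normal(size=(4, 16))
h_mix = mix_features(h_a, h_b, lam)      # fed to the remaining layers in place of h_a
```

With `lam = 1.0` the first sample is recovered unchanged, which makes it easy to check that the mixing code leaves the unaugmented path intact.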

Semi-supervised learning via Feedforward-Designed Convolutional Neural Networks


Title	Semi-supervised learning via Feedforward-Designed Convolutional Neural Networks
Authors	Yueru Chen, Yijing Yang, Min Zhang, C. -C. Jay Kuo
Abstract	A semi-supervised learning framework using the feedforward-designed convolutional neural networks (FF-CNNs) is proposed for image classification in this work. One unique property of FF-CNNs is that no backpropagation is used in model parameters determination. Since unlabeled data may not always enhance semi-supervised learning, we define an effective quality score and use it to select a subset of unlabeled data in the training process. We conduct experiments on the MNIST, SVHN, and CIFAR-10 datasets, and show that the proposed semi-supervised FF-CNN solution outperforms the CNN trained by backpropagation (BP-CNN) when the amount of labeled data is reduced. Furthermore, we develop an ensemble system that combines the output decision vectors of different semi-supervised FF-CNNs to boost classification accuracy. The ensemble systems can achieve further performance gains on all three benchmarking datasets.
Tasks	Image Classification
Published	2019-02-06
URL	http://arxiv.org/abs/1902.01980v1
PDF	http://arxiv.org/pdf/1902.01980v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-learning-via-feedforward
Repo
Framework

Competing Models


Title	Competing Models
Authors	Jose Luis Montiel Olea, Pietro Ortoleva, Mallesh M Pai, Andrea Prat
Abstract	We develop a model in which different agents compete to predict a variable of interest. This variable is related to observables via an unknown data generating process. All agents are Bayesian, but may have ‘misspecified models’ of the world, i.e., they consider different subsets of observables to make their prediction. After observing a common dataset, who has the highest confidence in her predictive ability? We characterize it and show that it crucially depends on the size of the dataset. With big data, we show it is typically ‘large-dimensional,’ possibly using more variables than the true model. With small data, we show (under additional assumptions) that it is an agent using a model that is ‘small-dimensional,’ in the sense of considering fewer covariates than the true data generating process. The theory is applied to auctions of assets where bidders observe the same information but hold different priors.
Tasks	Model Selection
Published	2019-07-08
URL	https://arxiv.org/abs/1907.03809v2
PDF	https://arxiv.org/pdf/1907.03809v2.pdf
PWC	https://paperswithcode.com/paper/competing-models
Repo
Framework

2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud


Title	2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud
Authors	Mengdan Feng, Sixing Hu, Marcelo Ang, Gim Hee Lee
Abstract	Large-scale point cloud generated from 3D sensors is more accurate than its image-based counterpart. However, it is seldom used in visual pose estimation due to the difficulty in obtaining 2D-3D image to point cloud correspondences. In this paper, we propose the 2D3D-MatchNet - an end-to-end deep network architecture to jointly learn the descriptors for 2D and 3D keypoint from image and point cloud, respectively. As a result, we are able to directly match and establish 2D-3D correspondences from the query image and 3D point cloud reference map for visual pose estimation. We create our Oxford 2D-3D Patches dataset from the Oxford Robotcar dataset with the ground truth camera poses and 2D-3D image to point cloud correspondences for training and testing the deep network. Experimental results verify the feasibility of our approach.
Tasks	Pose Estimation
Published	2019-04-22
URL	http://arxiv.org/abs/1904.09742v1
PDF	http://arxiv.org/pdf/1904.09742v1.pdf
PWC	https://paperswithcode.com/paper/2d3d-matchnet-learning-to-match-keypoints
Repo
Framework

Learning Competitive and Discriminative Reconstructions for Anomaly Detection


Title	Learning Competitive and Discriminative Reconstructions for Anomaly Detection
Authors	Kai Tian, Shuigeng Zhou, Jianping Fan, Jihong Guan
Abstract	Most of the existing methods for anomaly detection use only positive data to learn the data distribution, thus they usually need a pre-defined threshold at the detection stage to determine whether a test instance is an outlier. Unfortunately, a good threshold is vital for the performance and it is really hard to find an optimal one. In this paper, we take the discriminative information implied in unlabeled data into consideration and propose a new method for anomaly detection that can learn the labels of unlabelled data directly. Our proposed method has an end-to-end architecture with one encoder and two decoders that are trained to model inliers and outliers’ data distributions in a competitive way. This architecture works in a discriminative manner without suffering from overfitting, and the training algorithm of our model is adopted from SGD, thus it is efficient and scalable even for large-scale datasets. Empirical studies on 7 datasets including KDD99, MNIST, Caltech-256, and ImageNet etc. show that our model outperforms the state-of-the-art methods.
Tasks	Anomaly Detection
Published	2019-03-17
URL	http://arxiv.org/abs/1903.07058v1
PDF	http://arxiv.org/pdf/1903.07058v1.pdf
PWC	https://paperswithcode.com/paper/learning-competitive-and-discriminative
Repo
Framework
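
The competitive idea in the abstract above (two decoders, one modeling inliers and one modeling outliers) suggests a threshold-free labeling rule: give each unlabeled sample to whichever decoder reconstructs it better. The sketch below shows only that comparison, assuming the two reconstructions are already available as arrays. The function name and the use of mean squared error are assumptions for illustration; the abstract does not spell out the exact rule.

```python
import numpy as np

def assign_labels(x, recon_inlier, recon_outlier):
    """Label each row of x by the decoder with the smaller reconstruction error.

    x, recon_inlier, recon_outlier: arrays of shape (n_samples, dim).
    Returns an int array with 0 for inlier and 1 for outlier.
    """
    err_in = np.mean((x - recon_inlier) ** 2, axis=1)    # per-sample MSE, inlier decoder
    err_out = np.mean((x - recon_outlier) ** 2, axis=1)  # per-sample MSE, outlier decoder
    return (err_out < err_in).astype(int)
```

Because the decision compares two errors against each other, no pre-defined threshold on the reconstruction error is needed, which is the difficulty the abstract raises for methods trained on positive data only.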

Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data


Title	Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
Authors	Dominik Linzner, Michael Schmidt, Heinz Koeppl
Abstract	Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incomplete, the latent states of the CTBN have to be estimated by laboriously simulating the intractable dynamics of the assumed CTBN. This is a problem, especially for structure learning tasks, where this has to be done for each element of a super-exponentially growing set of possible structures. In order to circumvent this notorious bottleneck, we develop a novel gradient-based approach to structure learning. Instead of sampling and scoring all possible structures individually, we assume the generator of the CTBN to be composed as a mixture of generators stemming from different structures. In this framework, structure learning can be performed via a gradient-based optimization of mixture weights. We combine this approach with a new variational method that allows for a closed-form calculation of this mixture marginal likelihood. We show the scalability of our method by learning structures of previously inaccessible sizes from synthetic and real-world data.
Tasks	Time Series
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04570v3
PDF	https://arxiv.org/pdf/1909.04570v3.pdf
PWC	https://paperswithcode.com/paper/scalable-structure-learning-of-continuous
Repo
Framework