Paper Group ANR 1658
Learning from Web Data with Self-Organizing Memory Module. Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield. Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators. Learning to Detect and Retrieve Objects from Unlabeled Videos. Sentiment Tagging with Partial Labe …
Learning from Web Data with Self-Organizing Memory Module
Title | Learning from Web Data with Self-Organizing Memory Module |
Authors | Yi Tu, Li Niu, Junjie Chen, Dawei Cheng, Liqing Zhang |
Abstract | Learning from web data has attracted lots of research interest in recent years. However, crawled web images usually have two types of noise, label noise and background noise, which make them harder to use effectively. Most existing methods either rely on human supervision or ignore the background noise. In this paper, we propose a novel method that handles both types of noise together, without the supervision of clean images in the training stage. In particular, we formulate our method under the framework of multi-instance learning by grouping ROIs (i.e., images and their region proposals) from the same category into bags. ROIs in each bag are assigned different weights based on the representative/discriminative scores of their nearest clusters, where the clusters and their scores are obtained via our designed memory module. Our memory module can be naturally integrated with the classification module, leading to an end-to-end trainable system. Extensive experiments on four benchmark datasets demonstrate the effectiveness of our method. |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12028v5 https://arxiv.org/pdf/1906.12028v5.pdf |
PWC | https://paperswithcode.com/paper/protonet-learning-from-web-data-with-memory |
Repo | |
Framework | |
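The ROI weighting step described in the abstract above can be illustrated with a minimal Python sketch. It assumes each ROI is a plain feature vector, that the memory module exposes cluster centers with one positive score each, and that weights are the nearest-cluster scores normalized to sum to one within a bag. The function names, the Euclidean distance, and the normalization are illustrative assumptions, not the authors' implementation.

```python
import math

def nearest_cluster(roi, centers):
    """Index of the cluster center closest to the ROI feature (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(roi, centers[k]))

def bag_weights(bag, centers, scores):
    """Weight each ROI by the score of its nearest cluster, normalized over the bag.

    Assumes all scores are positive so the normalizer is non-zero.
    """
    raw = [scores[nearest_cluster(roi, centers)] for roi in bag]
    total = sum(raw)
    return [r / total for r in raw]

def bag_feature(bag, weights):
    """Weighted sum of ROI features, giving one feature vector per bag."""
    dim = len(bag[0])
    return [sum(w * roi[d] for roi, w in zip(bag, weights)) for d in range(dim)]

# Hypothetical toy data: two clusters, the first scored as more representative.
centers = [[0.0, 0.0], [10.0, 10.0]]
scores = [0.75, 0.25]
bag = [[0.0, 1.0], [9.0, 10.0]]

weights = bag_weights(bag, centers, scores)
feature = bag_feature(bag, weights)
```

In this toy bag, the ROI near the low-scoring cluster contributes less to the bag-level feature, which is the intended effect of down-weighting noisy regions.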
Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield
Title | Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield |
Authors | Dou Goodman, Tao Wei |
Abstract | Many recent works have demonstrated that deep learning models are vulnerable to adversarial examples. Fortunately, generating adversarial examples usually requires white-box access to the victim model, and the attacker can only access the APIs opened by cloud platforms. Thus, keeping models in the cloud can usually give a (false) sense of security. Unfortunately, cloud-based image classification services are not robust to simple transformations such as Gaussian noise, salt-and-pepper noise, rotation and monochromatization. In this paper, (1) we propose a novel attack method called the Image Fusion (IF) attack, which achieves a high bypass rate, can be implemented with OpenCV alone and is difficult to defend against; and (2) we make the first attempt to conduct an extensive empirical study of Simple Transformation (ST) attacks against real-world cloud-based classification services. Through evaluations on four popular cloud platforms (Amazon, Google, Microsoft and Clarifai), we demonstrate that the ST attack has a success rate of approximately 100% on every platform except Amazon, where it is approximately 50%, and that the IF attack has a success rate of over 98% across the different classification services. (3) We discuss possible defenses to address these security challenges. Experiments show that our defense technology can effectively defend against known ST attacks. |
Tasks | Image Classification |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07997v2 https://arxiv.org/pdf/1906.07997v2.pdf |
PWC | https://paperswithcode.com/paper/cloud-based-image-classification-service-is |
Repo | |
Framework | |
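Three of the simple transformations named in the abstract above (Gaussian noise, salt-and-pepper noise and monochromatization) can be sketched in a few lines of plain Python. This is a minimal illustration on images stored as nested lists of 0 to 255 values, not the paper's attack code: the function names, the parameters `amount` and `sigma`, and the BT.601 luma weights used for grayscale conversion are assumptions made for the example, and no cloud API is called.

```python
import random

def salt_and_pepper(image, amount, rng):
    """Return a copy of a grayscale image with roughly `amount` of pixels set to 0 or 255."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = 255 if rng.random() < 0.5 else 0
    return noisy

def gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to each pixel and clip the result to [0, 255]."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]

def monochromatize(rgb_image):
    """Convert an RGB image to grayscale using ITU-R BT.601 luma weights (an assumed choice)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

# Hypothetical 2x2 grayscale image and a seeded generator for reproducibility.
rng = random.Random(0)
image = [[10, 20], [30, 40]]
noisy_sp = salt_and_pepper(image, 0.1, rng)
noisy_gauss = gaussian_noise(image, 25.0, rng)
```

Each function returns a new image and leaves its input unchanged, so the transformed copies can be compared against the original prediction one at a time.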
Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators
Title | Super-Resolved Image Perceptual Quality Improvement via Multi-Feature Discriminators |
Authors | Xuan Zhu, Yue Cheng, Jinye Peng, Rongzhi Wang, Mingnan Le, Xin Liu |
Abstract | Generative adversarial networks (GANs) for image super-resolution (SR) have attracted enormous interest in recent years. However, GAN-based SR methods use only an image discriminator to distinguish SR images from high-resolution (HR) images. An image discriminator fails to discriminate images accurately since image features cannot be fully expressed. In this paper, we design a new GAN-based SR framework, GAN-IMC, which includes a generator, an image discriminator, a morphological component discriminator and a color discriminator. The combination of multiple feature discriminators improves the accuracy of image discrimination. Adversarial training between the generator and the multi-feature discriminators forces SR images to converge with HR images in terms of data and feature distributions. Moreover, in some cases, feature enhancement of salient regions is also worth considering. GAN-IMC is further optimized by a weighted content loss (GAN-IMCW), which effectively restores and enhances salient regions in SR images. The effectiveness and robustness of our method are confirmed by extensive experiments on public datasets. Compared with state-of-the-art methods, the proposed method not only achieves competitive Perceptual Index (PI) and Natural Image Quality Evaluator (NIQE) values but also obtains pleasant visual perception in image edge, texture, color and salient regions. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-04-24 |
URL | https://arxiv.org/abs/1904.10654v2 https://arxiv.org/pdf/1904.10654v2.pdf |
PWC | https://paperswithcode.com/paper/super-resolution-based-generative-adversarial |
Repo | |
Framework | |
Learning to Detect and Retrieve Objects from Unlabeled Videos
Title | Learning to Detect and Retrieve Objects from Unlabeled Videos |
Authors | Elad Amrani, Rami Ben-Ari, Tal Hakim, Alex Bronstein |
Abstract | Learning an object detector or retrieval requires a large data set with manual annotations. Such data sets are expensive and time consuming to create and therefore difficult to obtain on a large scale. In this work, we propose to exploit the natural correlation in narrations and the visual presence of objects in video, to learn an object detector and retrieval without any manual labeling involved. We pose the problem as weakly supervised learning with noisy labels, and propose a novel object detection paradigm under these constraints. We handle the background rejection by using contrastive samples and confront the high level of label noise with a new clustering score. Our evaluation is based on a set of 11 manually annotated objects in over 5000 frames. We show comparison to a weakly-supervised approach as baseline and provide a strongly labeled upper bound. |
Tasks | Object Detection |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11137v2 https://arxiv.org/pdf/1905.11137v2.pdf |
PWC | https://paperswithcode.com/paper/toward-self-supervised-object-detection-in |
Repo | |
Framework | |
Sentiment Tagging with Partial Labels using Modular Architectures
Title | Sentiment Tagging with Partial Labels using Modular Architectures |
Authors | Xiao Zhang, Dan Goldwasser |
Abstract | Many NLP learning tasks can be decomposed into several distinct sub-tasks, each associated with a partial label. In this paper we focus on a popular class of learning problems, sequence prediction applied to several sentiment analysis tasks, and suggest a modular learning approach in which different sub-tasks are learned using separate functional modules, combined to perform the final task while sharing information. Our experiments show this approach helps constrain the learning process and can alleviate some of the supervision efforts. |
Tasks | Sentiment Analysis |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00534v2 https://arxiv.org/pdf/1906.00534v2.pdf |
PWC | https://paperswithcode.com/paper/190600534 |
Repo | |
Framework | |
Constructing Dynamic Knowledge Graph for Visual Semantic Understanding and Applications in Autonomous Robotics
Title | Constructing Dynamic Knowledge Graph for Visual Semantic Understanding and Applications in Autonomous Robotics |
Authors | Chen Jiang, Steven Lu, Martin Jagersand |
Abstract | Interpreting semantic knowledge that describes entities, relations and attributes, explicitly through visuals and implicitly through behind-the-scenes common sense, is gaining attention in autonomous robotics. By incorporating vision and language modeling with common-sense knowledge, we can provide rich features indicating strong semantic meanings for human and robot action relationships, which can be utilized further in autonomous robotic control. In this paper, we propose a systematic scheme to generate high-conceptual dynamic knowledge graphs representing Entity-Relation-Entity (E-R-E) and Entity-Attribute-Value (E-A-V) knowledge by “watching” a video clip. A combination of a Vision-Language model and a static ontology tree is used to illustrate workspace, configurations, functions and usages for both human and robot. The proposed method is flexible and well-versed. It will serve as our first positioning investigation for further research in various applications for autonomous robots. |
Tasks | Common Sense Reasoning, Knowledge Graphs, Language Modelling |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07459v1 https://arxiv.org/pdf/1909.07459v1.pdf |
PWC | https://paperswithcode.com/paper/constructing-dynamic-knowledge-graph-for |
Repo | |
Framework | |
Causal Discovery by Kernel Intrinsic Invariance Measure
Title | Causal Discovery by Kernel Intrinsic Invariance Measure |
Authors | Zhitang Chen, Shengyu Zhu, Yue Liu, Tim Tse |
Abstract | Reasoning based on causality, instead of association, has been considered a key ingredient towards real machine intelligence. However, it is a challenging task to infer causal relationships or structure among variables. In recent years, an Independent Mechanism (IM) principle was proposed, stating that the mechanism generating the cause and the one mapping the cause to the effect are independent. As a conjecture, it is argued that in the causal direction, the conditional distributions instantiated at different values of the conditioning variable have less variation than in the anti-causal direction. Existing state-of-the-art methods simply compare the variance of the RKHS mean embedding norms of these conditional distributions. In this paper, we prove that this norm-based approach sacrifices important information of the original conditional distributions. We propose a Kernel Intrinsic Invariance Measure (KIIM) to capture higher order statistics corresponding to the shapes of the density functions. We show our algorithm can be reduced to an eigen-decomposition task on a kernel matrix measuring intrinsic deviance/invariance. Causal directions can then be inferred by comparing the KIIM scores of the two hypothetical directions. Experiments on synthetic and real data are conducted to show the advantages of our methods over existing solutions. |
Tasks | Causal Discovery |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00513v1 https://arxiv.org/pdf/1909.00513v1.pdf |
PWC | https://paperswithcode.com/paper/causal-discovery-by-kernel-intrinsic |
Repo | |
Framework | |
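The norm-based baseline that the abstract above criticizes (comparing the variance of RKHS mean embedding norms across conditional distributions) can be sketched as follows. This is not KIIM itself. The sketch assumes scalar samples, an RBF kernel with a hypothetical bandwidth parameter `gamma`, and that the conditional distributions have already been approximated by grouping samples according to the value of the conditioning variable; all function names are illustrative.

```python
import math
from statistics import pvariance

def rbf(a, b, gamma=1.0):
    """RBF kernel between two scalars."""
    return math.exp(-gamma * (a - b) ** 2)

def embedding_norm(samples, gamma=1.0):
    """RKHS norm of the empirical mean embedding: sqrt(sum_ij k(x_i, x_j)) / n."""
    n = len(samples)
    total = sum(rbf(a, b, gamma) for a in samples for b in samples)
    return math.sqrt(total) / n

def norm_variance(groups, gamma=1.0):
    """Variance of the embedding norms over the groups (one group per conditioning value)."""
    return pvariance([embedding_norm(g, gamma) for g in groups])

def infer_direction(y_given_x, x_given_y, gamma=1.0):
    """Pick the direction whose conditional distributions vary less, per the conjecture."""
    if norm_variance(y_given_x, gamma) <= norm_variance(x_given_y, gamma):
        return "X->Y"
    return "Y->X"
```

Because each conditional distribution is reduced to a single scalar norm, two distributions with different shapes can receive the same value, which is the loss of information the abstract points to.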
A Method for Computing Class-wise Universal Adversarial Perturbations
Title | A Method for Computing Class-wise Universal Adversarial Perturbations |
Authors | Tejus Gupta, Abhishek Sinha, Nupur Kumari, Mayank Singh, Balaji Krishnamurthy |
Abstract | We present an algorithm for computing class-specific universal adversarial perturbations for deep neural networks. Such perturbations can induce misclassification in a large fraction of images of a specific class. Unlike previous methods that use iterative optimization for computing a universal perturbation, the proposed method employs a perturbation that is a linear function of weights of the neural network and hence can be computed much faster. The method does not require any training data and has no hyper-parameters. The attack obtains 34% to 51% fooling rate on state-of-the-art deep neural networks on ImageNet and transfers across models. We also study the characteristics of the decision boundaries learned by standard and adversarially trained models to understand the universal adversarial perturbations. |
Tasks | |
Published | 2019-12-01 |
URL | https://arxiv.org/abs/1912.00466v1 https://arxiv.org/pdf/1912.00466v1.pdf |
PWC | https://paperswithcode.com/paper/a-method-for-computing-class-wise-universal |
Repo | |
Framework | |
Why Build an Assistant in Minecraft?
Title | Why Build an Assistant in Minecraft? |
Authors | Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston |
Abstract | In this document we describe a rationale for a research program aimed at building an open “assistant” in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09273v2 https://arxiv.org/pdf/1907.09273v2.pdf |
PWC | https://paperswithcode.com/paper/why-build-an-assistant-in-minecraft |
Repo | |
Framework | |
Manifold Mixup improves text recognition with CTC loss
Title | Manifold Mixup improves text recognition with CTC loss |
Authors | Bastien Moysset, Ronaldo Messina |
Abstract | Modern handwritten text recognition techniques employ deep recurrent neural networks. The use of these techniques is especially efficient when a large amount of annotated data is available for parameter estimation. Data augmentation can be used to enhance the performance of the systems when data is scarce. Manifold Mixup is a modern method of data augmentation that melds two images, or the feature maps corresponding to these images, and fuses the targets accordingly. We propose to apply Manifold Mixup to text recognition while adapting it to work with a Connectionist Temporal Classification cost. We show that Manifold Mixup improves text recognition results on various languages and datasets. |
Tasks | Data Augmentation |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04246v1 http://arxiv.org/pdf/1903.04246v1.pdf |
PWC | https://paperswithcode.com/paper/manifold-mixup-improves-text-recognition-with |
Repo | |
Framework | |
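The mixing operation described in the abstract above can be sketched in plain Python. The feature maps are melded by a convex combination with a coefficient `lam`; for the targets, the sketch assumes that fusing is done by weighting the two per-sample losses with the same coefficient, which is one plausible way to handle sequence targets under a CTC cost and is not confirmed by the abstract. The Beta(2, 2) sampling, the function names and the placeholder loss values are likewise assumptions.

```python
import random

def mix_features(feat_a, feat_b, lam):
    """Elementwise convex combination of two equally shaped 2-D feature maps."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(feat_a, feat_b)]

def mix_losses(loss_a, loss_b, lam):
    """Fuse targets by weighting the loss against each original target (assumed scheme)."""
    return lam * loss_a + (1.0 - lam) * loss_b

# Draw the mixing coefficient from a Beta distribution, as is common for mixup.
rng = random.Random(0)
lam = rng.betavariate(2.0, 2.0)

# Hypothetical feature maps taken from the same hidden layer for two text-line images.
feat_a = [[1.0, 2.0], [3.0, 4.0]]
feat_b = [[5.0, 6.0], [7.0, 8.0]]
mixed = mix_features(feat_a, feat_b, lam)

# loss_a and loss_b stand in for CTC losses of the mixed prediction against each transcript.
total_loss = mix_losses(loss_a=2.0, loss_b=4.0, lam=lam)
```

Weighting the losses avoids having to interpolate two label sequences of different lengths, which has no obvious elementwise definition.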
Semi-supervised learning via Feedforward-Designed Convolutional Neural Networks
Title | Semi-supervised learning via Feedforward-Designed Convolutional Neural Networks |
Authors | Yueru Chen, Yijing Yang, Min Zhang, C. -C. Jay Kuo |
Abstract | A semi-supervised learning framework using the feedforward-designed convolutional neural networks (FF-CNNs) is proposed for image classification in this work. One unique property of FF-CNNs is that no backpropagation is used in model parameters determination. Since unlabeled data may not always enhance semi-supervised learning, we define an effective quality score and use it to select a subset of unlabeled data in the training process. We conduct experiments on the MNIST, SVHN, and CIFAR-10 datasets, and show that the proposed semi-supervised FF-CNN solution outperforms the CNN trained by backpropagation (BP-CNN) when the amount of labeled data is reduced. Furthermore, we develop an ensemble system that combines the output decision vectors of different semi-supervised FF-CNNs to boost classification accuracy. The ensemble systems can achieve further performance gains on all three benchmarking datasets. |
Tasks | Image Classification |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.01980v1 http://arxiv.org/pdf/1902.01980v1.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-via-feedforward |
Repo | |
Framework | |
Competing Models
Title | Competing Models |
Authors | Jose Luis Montiel Olea, Pietro Ortoleva, Mallesh M Pai, Andrea Prat |
Abstract | We develop a model in which different agents compete to predict a variable of interest. This variable is related to observables via an unknown data generating process. All agents are Bayesian, but may have ‘misspecified models’ of the world, i.e., they consider different subsets of observables to make their prediction. After observing a common dataset, who has the highest confidence in her predictive ability? We characterize it and show that it crucially depends on the size of the dataset. With big data, we show it is typically ‘large-dimensional,’ possibly using more variables than the true model. With small data, we show (under additional assumptions) that it is an agent using a model that is ‘small-dimensional,’ in the sense of considering fewer covariates than the true data generating process. The theory is applied to auctions of assets where bidders observe the same information but hold different priors. |
Tasks | Model Selection |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03809v2 https://arxiv.org/pdf/1907.03809v2.pdf |
PWC | https://paperswithcode.com/paper/competing-models |
Repo | |
Framework | |
2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud
Title | 2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud |
Authors | Mengdan Feng, Sixing Hu, Marcelo Ang, Gim Hee Lee |
Abstract | Large-scale point clouds generated from 3D sensors are more accurate than their image-based counterparts. However, they are seldom used in visual pose estimation due to the difficulty in obtaining 2D-3D image to point cloud correspondences. In this paper, we propose 2D3D-MatchNet, an end-to-end deep network architecture that jointly learns the descriptors for 2D and 3D keypoints from image and point cloud, respectively. As a result, we are able to directly match and establish 2D-3D correspondences from the query image and 3D point cloud reference map for visual pose estimation. We create our Oxford 2D-3D Patches dataset from the Oxford Robotcar dataset with the ground truth camera poses and 2D-3D image to point cloud correspondences for training and testing the deep network. Experimental results verify the feasibility of our approach. |
Tasks | Pose Estimation |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09742v1 http://arxiv.org/pdf/1904.09742v1.pdf |
PWC | https://paperswithcode.com/paper/2d3d-matchnet-learning-to-match-keypoints |
Repo | |
Framework | |
Learning Competitive and Discriminative Reconstructions for Anomaly Detection
Title | Learning Competitive and Discriminative Reconstructions for Anomaly Detection |
Authors | Kai Tian, Shuigeng Zhou, Jianping Fan, Jihong Guan |
Abstract | Most existing methods for anomaly detection use only positive data to learn the data distribution, so they usually need a pre-defined threshold at the detection stage to determine whether a test instance is an outlier. Unfortunately, a good threshold is vital for performance and an optimal one is very hard to find. In this paper, we take the discriminative information implied in unlabeled data into consideration and propose a new method for anomaly detection that can learn the labels of unlabeled data directly. Our proposed method has an end-to-end architecture with one encoder and two decoders that are trained to model the data distributions of inliers and outliers in a competitive way. This architecture works in a discriminative manner without suffering from overfitting, and the training algorithm of our model is adopted from SGD, thus it is efficient and scalable even for large-scale datasets. Empirical studies on 7 datasets, including KDD99, MNIST, Caltech-256 and ImageNet, show that our model outperforms the state-of-the-art methods. |
Tasks | Anomaly Detection |
Published | 2019-03-17 |
URL | http://arxiv.org/abs/1903.07058v1 http://arxiv.org/pdf/1903.07058v1.pdf |
PWC | https://paperswithcode.com/paper/learning-competitive-and-discriminative |
Repo | |
Framework | |
Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
Title | Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data |
Authors | Dominik Linzner, Michael Schmidt, Heinz Koeppl |
Abstract | Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incomplete, the latent states of the CTBN have to be estimated by laboriously simulating the intractable dynamics of the assumed CTBN. This is a problem, especially for structure learning tasks, where this has to be done for each element of a super-exponentially growing set of possible structures. In order to circumvent this notorious bottleneck, we develop a novel gradient-based approach to structure learning. Instead of sampling and scoring all possible structures individually, we assume the generator of the CTBN to be composed as a mixture of generators stemming from different structures. In this framework, structure learning can be performed via a gradient-based optimization of mixture weights. We combine this approach with a new variational method that allows for a closed-form calculation of this mixture marginal likelihood. We show the scalability of our method by learning structures of previously inaccessible sizes from synthetic and real-world data. |
Tasks | Time Series |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04570v3 https://arxiv.org/pdf/1909.04570v3.pdf |
PWC | https://paperswithcode.com/paper/scalable-structure-learning-of-continuous |
Repo | |
Framework | |