October 20, 2019

3059 words 15 mins read

Paper Group AWR 190

Adversarial Texts with Gradient Methods. GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. Collective Matrix Completion. Indian Regional Movie Dataset for Recommender Systems. Unsupervised Single Image Dehazing Using Dark Channel Prior Loss. Incre …

Adversarial Texts with Gradient Methods

Title Adversarial Texts with Gradient Methods
Authors Zhitao Gong, Wenlu Wang, Bo Li, Dawn Song, Wei-Shinn Ku
Abstract Adversarial samples for images have been extensively studied in the literature. Among many of the attacking methods, gradient-based methods are both effective and easy to compute. In this work, we propose a framework to adapt gradient attacking methods from images to the text domain. The main difficulties in generating adversarial texts with gradient methods are i) the input space is discrete, which makes it difficult to accumulate small noise directly in the inputs, and ii) the quality of the adversarial texts is difficult to measure. We tackle the first problem by searching for adversarial examples in the embedding space and then reconstructing the adversarial texts via nearest-neighbor search. For the latter problem, we employ the Word Mover’s Distance (WMD) to quantify the quality of adversarial texts. Through extensive experiments on three datasets, IMDB movie reviews, Reuters-2, and Reuters-5 newswires, we show that our framework can leverage gradient attacking methods to generate very high-quality adversarial texts that differ from the original texts by only a few words. In many cases, changing a single word is enough to alter the label of the whole piece of text. We successfully incorporate FGM and DeepFool into our framework. In addition, we empirically show that WMD is closely related to the quality of adversarial texts.
Tasks
Published 2018-01-22
URL http://arxiv.org/abs/1801.07175v2
PDF http://arxiv.org/pdf/1801.07175v2.pdf
PWC https://paperswithcode.com/paper/adversarial-texts-with-gradient-methods
Repo https://github.com/gongzhitaao/adversarial-text
Framework tf
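
A rough illustration of the embedding-space attack described above: take an FGM-style step on the word embeddings and map each perturbed vector back to its nearest vocabulary embedding. The toy linear "classifier" and random embedding table are placeholders, not the authors' models; this is a minimal sketch, not their implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 1000, 50
E = rng.normal(size=(vocab_size, embed_dim))   # toy embedding table (placeholder)
w = rng.normal(size=embed_dim)                 # toy linear classifier weights (placeholder)

def loss_grad_wrt_embeddings(X, y):
    """Gradient of a logistic loss w.r.t. the word embeddings X, for a mean-pooled toy model."""
    logit = X.mean(axis=0) @ w
    p = 1.0 / (1.0 + np.exp(-logit))
    return np.tile((p - y) * w / len(X), (len(X), 1))

def fgm_text_attack(token_ids, y, eps=2.0):
    X = E[token_ids]                           # look up embeddings
    g = loss_grad_wrt_embeddings(X, y)
    X_adv = X + eps * g / (np.linalg.norm(g, axis=1, keepdims=True) + 1e-12)  # FGM step
    # reconstruct tokens by nearest-neighbor search in the embedding table
    dists = np.linalg.norm(E[None, :, :] - X_adv[:, None, :], axis=2)
    return dists.argmin(axis=1)

adv_ids = fgm_text_attack(np.array([3, 17, 256, 999]), y=1)
print(adv_ids)
```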

GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

Title GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild
Authors Lianghua Huang, Xin Zhao, Kaiqi Huang
Abstract We introduce here a large tracking database that offers an unprecedentedly wide coverage of common moving objects in the wild, called GOT-10k. Specifically, GOT-10k is built upon the backbone of the WordNet structure and it populates the majority of over 560 classes of moving objects and 87 motion patterns, magnitudes wider than the most recent similar-scale counterparts. The contributions of this paper are summarized in the following: (1) GOT-10k offers over 10,000 video segments with more than 1.5 million manually labeled bounding boxes, enabling unified training and stable evaluation of deep trackers. (2) GOT-10k is by far the first video trajectory dataset that uses the semantic hierarchy of WordNet to guide class population. (3) For the first time, GOT-10k introduces the one-shot protocol for tracker evaluation, where the training and test classes have zero overlap. The protocol avoids evaluation results biased towards familiar objects and promotes generalization in tracker development. (4) We conduct extensive tracking experiments with 39 typical tracking algorithms on GOT-10k and analyze their results in this paper. (5) Finally, we develop a comprehensive platform for the tracking community that offers full-featured evaluation toolkits, an online evaluation server, and a responsive leaderboard. The annotations of GOT-10k’s test data are kept private to avoid tuning parameters on it. The database, toolkits, evaluation server and baseline results are available at http://got-10k.aitestunion.com.
Tasks Object Tracking
Published 2018-10-29
URL https://arxiv.org/abs/1810.11981v3
PDF https://arxiv.org/pdf/1810.11981v3.pdf
PWC https://paperswithcode.com/paper/got-10k-a-large-high-diversity-benchmark-for
Repo https://github.com/got-10k/toolkit
Framework none
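
A minimal sketch of the one-shot protocol idea described above: the object classes used for training and testing are disjoint, so a tracker is always evaluated on classes it never saw. The class list and split sizes below are purely illustrative, not the official GOT-10k split.

```python
import random

all_classes = [f"class_{i}" for i in range(560)]   # GOT-10k covers 560+ moving-object classes
random.seed(0)
random.shuffle(all_classes)

test_classes = set(all_classes[:84])               # hypothetical held-out classes
train_classes = set(all_classes[84:])

# the defining property of the one-shot protocol: zero class overlap
assert train_classes.isdisjoint(test_classes)
```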

Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition

Title Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition
Authors Pete Warden
Abstract Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems. Discusses why this task is an interesting challenge, and why it requires a specialized dataset that is different from conventional datasets used for automatic speech recognition of full sentences. Suggests a methodology for reproducible and comparable accuracy metrics for this task. Describes how the data was collected and verified, what it contains, previous versions and properties. Concludes by reporting baseline results of models trained on this dataset.
Tasks Accuracy Metrics, Keyword Spotting, Speech Recognition
Published 2018-04-09
URL http://arxiv.org/abs/1804.03209v1
PDF http://arxiv.org/pdf/1804.03209v1.pdf
PWC https://paperswithcode.com/paper/speech-commands-a-dataset-for-limited
Repo https://github.com/saraalemadi/DroneAudioDataset
Framework tf

Collective Matrix Completion

Title Collective Matrix Completion
Authors Mokhtar Z. Alaya, Olga Klopp
Abstract Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommendation system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then, we relax the assumption of an exponential family distribution for the noise and investigate the distribution-free case. In this setting, we do not assume any specific model for the observations. The estimation procedures are based on minimizing the sum of a goodness-of-fit term and the nuclear norm penalization of the whole collective matrix. We prove that the proposed estimators achieve fast rates of convergence under the two considered settings and we corroborate our results with numerical experiments.
Tasks Matrix Completion
Published 2018-07-24
URL https://arxiv.org/abs/1807.09010v3
PDF https://arxiv.org/pdf/1807.09010v3.pdf
PWC https://paperswithcode.com/paper/collective-matrix-completion
Repo https://github.com/mzalaya/collectivemc
Framework none
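
A hedged sketch of the kind of estimator described above: a proximal-gradient iteration that trades a goodness-of-fit term on the observed entries against a nuclear-norm penalty on the collective matrix. A squared loss stands in for the paper's general exponential-family terms; sizes, step size, and regularization weight are illustrative.

```python
import numpy as np

def svd_soft_threshold(X, tau):
    """Prox of tau * nuclear norm: shrink the singular values of X by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def collective_completion(Y, mask, lam=1.0, step=1.0, n_iter=200):
    """Y: observed collective matrix (sources concatenated column-wise); mask: 1 where observed."""
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        grad = mask * (X - Y)                  # gradient of 0.5 * ||mask * (X - Y)||_F^2
        X = svd_soft_threshold(X - step * grad, step * lam)
    return X

rng = np.random.default_rng(0)
truth = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 80))   # low-rank ground truth
mask = (rng.random(truth.shape) < 0.3).astype(float)          # 30% of entries observed
est = collective_completion(mask * truth, mask, lam=2.0)
print(np.linalg.norm(est - truth) / np.linalg.norm(truth))    # relative recovery error
```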

Indian Regional Movie Dataset for Recommender Systems

Title Indian Regional Movie Dataset for Recommender Systems
Authors Prerna Agarwal, Richa Verma, Angshul Majumdar
Abstract Indian regional movie dataset is the first database of regional Indian movies, users and their ratings. It consists of movies belonging to 18 different Indian regional languages and metadata of users with varying demographics. Through this dataset, the diversity of Indian regional cinema and its huge viewership is captured. We analyze the dataset, which contains roughly 10K ratings of 919 users and 2,851 movies, using supervised and unsupervised collaborative filtering techniques such as Probabilistic Matrix Factorization, Matrix Completion, and Blind Compressed Sensing. The dataset consists of metadata information of users like age, occupation, home state and known languages. It also consists of metadata of movies like genre, language, release year and cast. India has a wide base of viewers, which is evident from the large number of movies released every year and the huge box-office revenue. This dataset can be used for designing recommendation systems for Indian users and regional movies, which do not yet exist. The dataset can be downloaded from \href{https://goo.gl/EmTPv6}{https://goo.gl/EmTPv6}.
Tasks Matrix Completion, Recommendation Systems
Published 2018-01-07
URL http://arxiv.org/abs/1801.02203v1
PDF http://arxiv.org/pdf/1801.02203v1.pdf
PWC https://paperswithcode.com/paper/indian-regional-movie-dataset-for-recommender
Repo https://github.com/MadPr0grammer/CF_Project
Framework none
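
A toy sketch of one of the baselines mentioned above, Probabilistic Matrix Factorization fitted by SGD, run on synthetic ratings. The user/movie counts mirror the paper, but the ratings are random placeholders, not the real Indian regional movie data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 919, 2851, 10
ratings = [(rng.integers(n_users), rng.integers(n_movies), rng.integers(1, 6))
           for _ in range(10_000)]              # synthetic (user, movie, rating) triples

U = 0.1 * rng.normal(size=(n_users, k))         # user latent factors
V = 0.1 * rng.normal(size=(n_movies, k))        # movie latent factors
lr, reg = 0.01, 0.05
for epoch in range(5):
    for u, m, r in ratings:
        err = r - U[u] @ V[m]
        u_old = U[u].copy()
        U[u] += lr * (err * V[m] - reg * U[u])  # SGD step with L2 regularization
        V[m] += lr * (err * u_old - reg * V[m])
```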

Unsupervised Single Image Dehazing Using Dark Channel Prior Loss

Title Unsupervised Single Image Dehazing Using Dark Channel Prior Loss
Authors Alona Golts, Daniel Freedman, Michael Elad
Abstract Single image dehazing is a critical stage in many modern-day autonomous vision applications. Early prior-based methods often involved a time-consuming minimization of a hand-crafted energy function. Recent learning-based approaches utilize the representational power of deep neural networks (DNNs) to learn the underlying transformation between hazy and clear images. Due to inherent limitations in collecting matching clear and hazy images, these methods resort to training on synthetic data constructed from indoor images and corresponding depth information. This may result in a possible domain shift when treating outdoor scenes. We propose a completely unsupervised method of training via minimization of the well-known Dark Channel Prior (DCP) energy function. Instead of feeding the network with synthetic data, we solely use real-world outdoor images and tune the network’s parameters by directly minimizing the DCP. Although our “Deep DCP” technique can be regarded as a fast approximator of DCP, it actually improves its results significantly. This suggests an additional regularization obtained via the network and learning process. Experiments show that our method performs on par with large-scale supervised methods.
Tasks Image Dehazing, Single Image Dehazing
Published 2018-12-06
URL https://arxiv.org/abs/1812.07051v2
PDF https://arxiv.org/pdf/1812.07051v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-single-image-dehazing-using-dark
Repo https://github.com/AlonaGolts/Deep_Energy
Framework tf
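
A hedged sketch of the Dark Channel Prior term that the "Deep DCP" training above minimizes: the dark channel of a clear outdoor image (per-pixel minimum over color channels, followed by a local minimum over a patch) should be close to zero, so its mean can serve as an unsupervised loss on the network output. The patch size and the bare-bones loss form here are illustrative, not the paper's exact energy.

```python
import numpy as np

def dark_channel(img, patch=15):
    """img: H x W x 3 array in [0, 1]. Returns the H x W dark channel."""
    h, w, _ = img.shape
    per_pixel_min = img.min(axis=2)                      # minimum over color channels
    pad = patch // 2
    padded = np.pad(per_pixel_min, pad, mode="edge")
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()   # local patch minimum
    return out

def dcp_loss(dehazed):
    """Unsupervised training signal: drive the dark channel of the output towards zero."""
    return dark_channel(dehazed).mean()

img = np.random.default_rng(0).random((64, 64, 3))
print(dcp_loss(img))
```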

Increasing the adversarial robustness and explainability of capsule networks with $\gamma$-capsules

Title Increasing the adversarial robustness and explainability of capsule networks with $\gamma$-capsules
Authors David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez
Abstract In this paper we introduce a new inductive bias for capsule networks and call networks that use this prior $\gamma$-capsule networks. Our inductive bias, which is inspired by TE neurons of the inferior temporal cortex, increases the adversarial robustness and the explainability of capsule networks. A theoretical framework with formal definitions of $\gamma$-capsule networks and metrics for evaluation is also provided. Under our framework we show that common capsule networks do not necessarily make use of this inductive bias. For this reason we introduce a novel routing algorithm and use a different training algorithm to be able to implement $\gamma$-capsule networks. We then show experimentally that $\gamma$-capsule networks are indeed more transparent and more robust against adversarial attacks than regular capsule networks.
Tasks
Published 2018-12-23
URL https://arxiv.org/abs/1812.09707v4
PDF https://arxiv.org/pdf/1812.09707v4.pdf
PWC https://paperswithcode.com/paper/training-deep-capsule-networks
Repo https://github.com/peerdavid/gamma-capsule-network
Framework tf

Universal Semi-Supervised Semantic Segmentation

Title Universal Semi-Supervised Semantic Segmentation
Authors Tarun Kalluri, Girish Varma, Manmohan Chandraker, C V Jawahar
Abstract In recent years, the need for semantic segmentation has arisen across several different applications and environments. However, the expense and redundancy of annotation often limits the quantity of labels available for training in any domain, while deployment is easier if a single model works well across domains. In this paper, we pose the novel problem of universal semi-supervised semantic segmentation and propose a solution framework, to meet the dual needs of lower annotation and deployment costs. In contrast to counterpoints such as fine tuning, joint training or unsupervised domain adaptation, universal semi-supervised segmentation ensures that across all domains: (i) a single model is deployed, (ii) unlabeled data is used, (iii) performance is improved, (iv) only a few labels are needed and (v) label spaces may differ. To address this, we minimize supervised as well as within and cross-domain unsupervised losses, introducing a novel feature alignment objective based on pixel-aware entropy regularization for the latter. We demonstrate quantitative advantages over other approaches on several combinations of segmentation datasets across different geographies (Germany, England, India) and environments (outdoors, indoors), as well as qualitative insights on the aligned representations.
Tasks Domain Adaptation, Semantic Segmentation, Semi-Supervised Semantic Segmentation, Unsupervised Domain Adaptation
Published 2018-11-26
URL https://arxiv.org/abs/1811.10323v3
PDF https://arxiv.org/pdf/1811.10323v3.pdf
PWC https://paperswithcode.com/paper/universal-semi-supervised-semantic
Repo https://github.com/tarun005/USSS_ICCV19
Framework pytorch
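
A minimal sketch of a pixel-wise entropy regularizer of the kind the method above uses as an unsupervised loss on unlabeled images: low per-pixel prediction entropy is encouraged. This is a generic entropy term, not the paper's full pixel-aware feature-alignment objective; the shapes and class count are illustrative.

```python
import torch
import torch.nn.functional as F

def pixel_entropy_loss(logits):
    """logits: (batch, classes, H, W) raw network outputs on unlabeled images."""
    p = F.softmax(logits, dim=1)
    log_p = F.log_softmax(logits, dim=1)
    entropy = -(p * log_p).sum(dim=1)        # (batch, H, W) per-pixel entropy
    return entropy.mean()

logits = torch.randn(2, 19, 32, 32, requires_grad=True)   # e.g. 19 Cityscapes-style classes
loss = pixel_entropy_loss(logits)
loss.backward()
print(loss.item())
```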

Semantic Labeling in Very High Resolution Images via a Self-Cascaded Convolutional Neural Network

Title Semantic Labeling in Very High Resolution Images via a Self-Cascaded Convolutional Neural Network
Authors Yongcheng Liu, Bin Fan, Lingfeng Wang, Jun Bai, Shiming Xiang, Chunhong Pan
Abstract Semantic labeling of very high resolution (VHR) images in urban areas is of significant importance in a wide range of remote sensing applications. However, many confusing manmade objects and intricate fine-structured objects make it very difficult to obtain both coherent and accurate labeling results. For this challenging task, we propose a novel deep model with convolutional neural networks (CNNs), i.e., an end-to-end self-cascaded network (ScasNet). Specifically, for confusing manmade objects, ScasNet improves the labeling coherence with sequential global-to-local context aggregation. Technically, multi-scale contexts are captured on the output of a CNN encoder, and then they are successively aggregated in a self-cascaded manner. Meanwhile, for fine-structured objects, ScasNet boosts the labeling accuracy with a coarse-to-fine refinement strategy. It progressively refines the target objects using the low-level features learned by the CNN’s shallow layers. In addition, to correct the latent fitting residual caused by multi-feature fusion inside ScasNet, a dedicated residual correction scheme is proposed. It greatly improves the effectiveness of ScasNet. Extensive experimental results on three public datasets, including two challenging benchmarks, show that ScasNet achieves state-of-the-art performance.
Tasks
Published 2018-07-30
URL http://arxiv.org/abs/1807.11236v1
PDF http://arxiv.org/pdf/1807.11236v1.pdf
PWC https://paperswithcode.com/paper/semantic-labeling-in-very-high-resolution
Repo https://github.com/Yochengliu/ScasNet
Framework none

Learning Anonymized Representations with Adversarial Neural Networks

Title Learning Anonymized Representations with Adversarial Neural Networks
Authors Clément Feutry, Pablo Piantanida, Yoshua Bengio, Pierre Duhamel
Abstract Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensure privacy of individuals as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, motivated by novel information-theoretic bounds. We introduce a novel training objective for simultaneously training a predictor over target variables of interest (the regular labels) while preventing an intermediate representation from being predictive of the private labels. The architecture is based on three sub-networks: one going from input to representation, one from representation to predicted regular labels, and one from representation to predicted private labels. The training procedure aims at learning representations that preserve the relevant part of the information (about regular labels) while dismissing information about the private labels which correspond to the identity of a person. We demonstrate the success of this approach for two distinct classification versus anonymization tasks (handwritten digits and sentiment analysis).
Tasks Representation Learning, Sentiment Analysis
Published 2018-02-26
URL http://arxiv.org/abs/1802.09386v1
PDF http://arxiv.org/pdf/1802.09386v1.pdf
PWC https://paperswithcode.com/paper/learning-anonymized-representations-with
Repo https://github.com/maxfriedrich/deid-training-data
Framework tf
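
A hedged sketch of the three-sub-network setup described above: an encoder feeds both a regular-label predictor and a private-label predictor, the private head is fitted on the current representation, and the encoder is then updated to help the task while confusing the private predictor. The toy MLPs and the simple min-max weighting are illustrative stand-ins for the paper's information-theoretic objective.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())   # input -> representation
regular_head = nn.Linear(64, 10)                          # representation -> regular labels
private_head = nn.Linear(64, 5)                           # representation -> private labels
ce = nn.CrossEntropyLoss()

opt_enc = torch.optim.Adam(list(encoder.parameters()) + list(regular_head.parameters()), lr=1e-3)
opt_priv = torch.optim.Adam(private_head.parameters(), lr=1e-3)

x = torch.randn(32, 784)
y_task = torch.randint(0, 10, (32,))
y_priv = torch.randint(0, 5, (32,))

# 1) fit the private predictor on the current (frozen) representation
z = encoder(x).detach()
opt_priv.zero_grad()
ce(private_head(z), y_priv).backward()
opt_priv.step()

# 2) update encoder + regular head: fit the task while hurting the private predictor
z = encoder(x)
loss = ce(regular_head(z), y_task) - 0.5 * ce(private_head(z), y_priv)
opt_enc.zero_grad()
loss.backward()
opt_enc.step()
```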

Scalable agent alignment via reward modeling: a research direction

Title Scalable agent alignment via reward modeling: a research direction
Authors Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg
Abstract One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions. Designing such reward functions is difficult in part because the user only has an implicit understanding of the task objective. This gives rise to the agent alignment problem: how do we create agents that behave in accordance with the user’s intentions? We outline a high-level research direction to solve the agent alignment problem centered around reward modeling: learning a reward function from interaction with the user and optimizing the learned reward function with reinforcement learning. We discuss the key challenges we expect to face when scaling reward modeling to complex and general domains, concrete approaches to mitigate these challenges, and ways to establish trust in the resulting agents.
Tasks Atari Games
Published 2018-11-19
URL http://arxiv.org/abs/1811.07871v1
PDF http://arxiv.org/pdf/1811.07871v1.pdf
PWC https://paperswithcode.com/paper/scalable-agent-alignment-via-reward-modeling
Repo https://github.com/rddy/ReQueST
Framework tf
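
A toy sketch of the reward-modeling loop outlined above: (1) fit a reward model to the user's feedback on agent behavior, then (2) optimize the policy against the learned reward instead of a hand-designed one. The bandit-style setting and linear reward model are illustrative stand-ins for the paper's general RL setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, d = 5, 8
features = rng.normal(size=(n_actions, d))     # feature vector per action
true_pref = rng.normal(size=d)                 # the user's implicit preferences (unknown to the agent)

# (1) reward modeling: regress noisy user feedback onto action features
feedback_actions = rng.integers(0, n_actions, size=100)
feedback_scores = features[feedback_actions] @ true_pref + 0.1 * rng.normal(size=100)
w_hat, *_ = np.linalg.lstsq(features[feedback_actions], feedback_scores, rcond=None)

# (2) "policy optimization": act greedily with respect to the learned reward model
learned_reward = features @ w_hat
print("chosen action:", int(learned_reward.argmax()))
```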

Image Smoothing via Unsupervised Learning

Title Image Smoothing via Unsupervised Learning
Authors Qingnan Fan, Jiaolong Yang, David Wipf, Baoquan Chen, Xin Tong
Abstract Image smoothing represents a fundamental component of many disparate computer vision and graphics applications. In this paper, we present a unified unsupervised (label-free) learning framework that facilitates generating flexible and high-quality smoothing effects by directly learning from data using deep convolutional neural networks (CNNs). The heart of the design is the training signal, a novel energy function that includes an edge-preserving regularizer, which helps maintain important yet potentially vulnerable image structures, and a spatially-adaptive Lp flattening criterion, which imposes different forms of regularization onto different image regions for better smoothing quality. We implement a diverse set of image smoothing solutions employing the unified framework targeting various applications such as image abstraction, pencil sketching, detail enhancement, texture removal and content-aware image manipulation, and obtain results comparable with or better than previous methods. Moreover, our method is extremely fast with a modern GPU (e.g., 200 fps for 1280x720 images). Our code and models are released at https://github.com/fqnchina/ImageSmoothing.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.02804v1
PDF http://arxiv.org/pdf/1811.02804v1.pdf
PWC https://paperswithcode.com/paper/image-smoothing-via-unsupervised-learning
Repo https://github.com/fqnchina/ImageSmoothing
Framework pytorch
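
A hedged sketch of the kind of training energy described above: a fidelity term keeping the output close to the input, plus a spatially-adaptive Lp penalty on output gradients that flattens harder where the input itself has weak edges. The exponent map, threshold, and weights are illustrative, not the paper's exact criterion.

```python
import numpy as np

def smoothing_energy(output, inp, lam=0.5):
    gy = np.abs(np.diff(output, axis=0, append=output[-1:]))
    gx = np.abs(np.diff(output, axis=1, append=output[:, -1:]))
    inp_edges = (np.abs(np.diff(inp, axis=0, append=inp[-1:])) +
                 np.abs(np.diff(inp, axis=1, append=inp[:, -1:])))
    p = np.where(inp_edges < 0.05, 0.8, 2.0)       # smaller exponent = stronger flattening in flat regions
    fidelity = ((output - inp) ** 2).mean()
    flattening = ((gx + 1e-6) ** p + (gy + 1e-6) ** p).mean()
    return fidelity + lam * flattening

rng = np.random.default_rng(0)
img = rng.random((32, 32))
print(smoothing_energy(img, img))
```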

Are generative deep models for novelty detection truly better?

Title Are generative deep models for novelty detection truly better?
Authors Vít Škvára, Tomáš Pevný, Václav Šmídl
Abstract Many deep models have been recently proposed for anomaly detection. This paper presents a comparison of selected generative deep models and classical anomaly detection methods on an extensive number of non-image benchmark datasets. We provide a statistical comparison of the selected models across many configurations, architectures and hyperparameters. We arrive at the conclusion that the performance of the generative models is determined by the process of selecting their hyperparameters. Specifically, the performance of the deep generative models deteriorates with a decreasing number of anomalous samples used in hyperparameter selection. In practical scenarios of anomaly detection, none of the deep generative models systematically outperforms the kNN.
Tasks Anomaly Detection
Published 2018-07-13
URL http://arxiv.org/abs/1807.05027v1
PDF http://arxiv.org/pdf/1807.05027v1.pdf
PWC https://paperswithcode.com/paper/are-generative-deep-models-for-novelty
Repo https://github.com/smidl/AnomalyDetection.jl
Framework none
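
A minimal sketch of the kNN baseline that the study above finds hard to beat: the anomaly score of a point is its distance to its k-th nearest neighbor in the (assumed normal) training data. Data and k are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train_normal = rng.normal(size=(500, 10))              # assumed-normal training samples
test = np.vstack([rng.normal(size=(20, 10)),           # normal test points
                  rng.normal(loc=5.0, size=(5, 10))])  # anomalies far from the bulk

k = 5
nn = NearestNeighbors(n_neighbors=k).fit(train_normal)
dists, _ = nn.kneighbors(test)
scores = dists[:, -1]                                  # distance to the k-th neighbor = anomaly score
print(scores.round(2))
```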

A Projection Method for Metric-Constrained Optimization

Title A Projection Method for Metric-Constrained Optimization
Authors Nate Veldt, David Gleich, Anthony Wirth, James Saunderson
Abstract We outline a new approach for solving optimization problems which enforce triangle inequalities on output variables. We refer to this as metric-constrained optimization, and give several examples where problems of this form arise in machine learning applications and theoretical approximation algorithms for graph clustering. Although these problems are interesting from a theoretical perspective, they are challenging to solve in practice due to the high memory requirements of black-box solvers. In order to address this challenge, we first prove that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem. We then develop a general solver for metric-constrained linear and quadratic programs by generalizing and improving a simple projection algorithm originally developed for metric nearness. We give several novel approximation guarantees for using our framework to find lower bounds for optimal solutions to several challenging graph clustering problems. We also demonstrate the power of our framework by solving optimization problems involving up to 10^{8} variables and 10^{11} constraints.
Tasks Graph Clustering
Published 2018-06-05
URL http://arxiv.org/abs/1806.01678v1
PDF http://arxiv.org/pdf/1806.01678v1.pdf
PWC https://paperswithcode.com/paper/a-projection-method-for-metric-constrained
Repo https://github.com/nveldt/MetricOptimization
Framework none
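
A hedged sketch of the core operation behind the projection method described above: cyclically project a dissimilarity matrix onto each triangle-inequality half-space d[i,j] <= d[i,k] + d[k,j]. The paper's solver is a more careful, memory-efficient Dykstra-style variant with correction terms; this loop only illustrates the single-constraint Euclidean projection, and symmetry of D is not enforced for brevity.

```python
import itertools
import numpy as np

def project_triangle_inequalities(D, sweeps=20):
    D = D.copy()
    n = D.shape[0]
    for _ in range(sweeps):
        for i, j, k in itertools.permutations(range(n), 3):
            violation = D[i, j] - D[i, k] - D[k, j]
            if violation > 0:                  # Euclidean projection onto one half-space
                D[i, j] -= violation / 3.0
                D[i, k] += violation / 3.0
                D[k, j] += violation / 3.0
    return D

rng = np.random.default_rng(0)
A = rng.random((6, 6))
D = (A + A.T) / 2
np.fill_diagonal(D, 0.0)
M = project_triangle_inequalities(D)
worst = max(M[i, j] - M[i, k] - M[k, j] for i, j, k in itertools.permutations(range(6), 3))
print("max remaining violation:", round(worst, 4))
```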

Sequential Variational Autoencoders for Collaborative Filtering

Title Sequential Variational Autoencoders for Collaborative Filtering
Authors Noveen Sachdeva, Giuseppe Manco, Ettore Ritacco, Vikram Pudi
Abstract Variational autoencoders have proven successful in domains such as computer vision and speech processing. Their adoption for modeling user preferences is still largely unexplored, although it has recently started to gain attention in the literature. In this work, we propose a model which extends variational autoencoders by exploiting the rich information present in the past preference history. We introduce a recurrent version of the VAE, where instead of passing a subset of the whole history regardless of temporal dependencies, we pass the consumption sequence subset through a recurrent neural network. At each time step of the RNN, the sequence is fed through a series of fully-connected layers, the output of which models the probability distribution of the most likely future preferences. We show that handling temporal information is crucial for improving the accuracy of the VAE: in fact, our model beats the current state of the art by valuable margins because of its ability to capture temporal dependencies in the user-consumption sequence using the recurrent encoder, while still keeping the fundamentals of variational autoencoders intact.
Tasks
Published 2018-11-25
URL http://arxiv.org/abs/1811.09975v1
PDF http://arxiv.org/pdf/1811.09975v1.pdf
PWC https://paperswithcode.com/paper/sequential-variational-autoencoders-for
Repo https://github.com/noveens/svae_cf
Framework pytorch
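
A hedged sketch of the recurrent VAE idea described above: the consumption sequence is fed through a GRU, each hidden state is mapped to a latent Gaussian (with reparameterization), and a decoder produces a distribution over the next items. Layer sizes, the KL weight, and the exact losses are illustrative, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequentialVAE(nn.Module):
    def __init__(self, n_items, embed=64, hidden=128, latent=32):
        super().__init__()
        self.embed = nn.Embedding(n_items, embed)
        self.gru = nn.GRU(embed, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.decoder = nn.Linear(latent, n_items)

    def forward(self, item_seq):                       # item_seq: (batch, time)
        h, _ = self.gru(self.embed(item_seq))          # (batch, time, hidden)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return self.decoder(z), mu, logvar             # per-step next-item logits

model = SequentialVAE(n_items=1000)
seq = torch.randint(0, 1000, (4, 20))
logits, mu, logvar = model(seq)
# ELBO-style loss: next-item log-likelihood plus KL to the standard normal prior
recon = F.cross_entropy(logits[:, :-1].reshape(-1, 1000), seq[:, 1:].reshape(-1))
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + 0.2 * kl
loss.backward()
```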