October 20, 2019

Paper Group AWR 250

AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning


Title	AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning
Authors	Jinyuan Jia, Neil Zhenqiang Gong
Abstract	Users in various web and mobile applications are vulnerable to attribute inference attacks, in which an attacker leverages a machine learning classifier to infer a target user’s private attributes (e.g., location, sexual orientation, political view) from its public data (e.g., rating scores, page likes). Existing defenses leverage game theory or heuristics based on correlations between the public data and attributes. These defenses are not practical. Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users’ public data. In this paper, we present AttriGuard, a practical defense against attribute inference attacks. AttriGuard is computationally tractable and has small utility loss. Our AttriGuard works in two phases. Suppose we aim to protect a user’s private attribute. In Phase I, for each value of the attribute, we find a minimum noise such that if we add the noise to the user’s public data, then the attacker’s classifier is very likely to infer the attribute value for the user. We find the minimum noise via adapting existing evasion attacks in adversarial machine learning. In Phase II, we sample one attribute value according to a certain probability distribution and add the corresponding noise found in Phase I to the user’s public data. We formulate finding the probability distribution as solving a constrained convex optimization problem. We extensively evaluate AttriGuard and compare it with existing methods using a real-world dataset. Our results show that AttriGuard substantially outperforms existing methods. Our work is the first one that shows evasion attacks can be used as defensive techniques for privacy protection.
Tasks
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04810v1
PDF	http://arxiv.org/pdf/1805.04810v1.pdf
PWC	https://paperswithcode.com/paper/attriguard-a-practical-defense-against
Repo	https://github.com/jjy1994/AttriGuard
Framework	tf
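
Phase II of the abstract above can be illustrated with a minimal Python sketch. It assumes the per-value noise vectors from Phase I and the probability distribution from the convex optimization step are already available; the names `phase_two`, `noises`, and `probs`, and all toy values, are hypothetical and not taken from the AttriGuard repository.

```python
import random

def phase_two(public_data, noises, probs, rng):
    """Sample one attribute value and add its Phase I noise to the public data."""
    values = list(noises)
    weights = [probs[v] for v in values]
    # Draw a single attribute value according to the given distribution.
    chosen = rng.choices(values, weights=weights, k=1)[0]
    noisy = [x + r for x, r in zip(public_data, noises[chosen])]
    return chosen, noisy

# Hypothetical toy inputs: three rating scores and two candidate attribute values.
public_data = [4.0, 0.0, 5.0]
noises = {"city_a": [0.0, 1.0, 0.0], "city_b": [-1.0, 0.0, 0.0]}  # assumed Phase I output
probs = {"city_a": 0.5, "city_b": 0.5}  # assumed result of the convex program

chosen, noisy = phase_two(public_data, noises, probs, random.Random(0))
print(chosen, noisy)
```

The sketch covers only the sampling and noise-addition step; finding the minimum noise and solving for the distribution are the parts the paper itself addresses.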

Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality


Title	Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality
Authors	Miaoyan Wang, Lexin Li
Abstract	We consider the problem of decomposing a multiway tensor with binary entries. Such data problems arise frequently in numerous applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose that the observed binary entries follow a Bernoulli model, develop a rank-constrained likelihood-based estimation procedure, and obtain theoretical accuracy guarantees. Specifically, we establish the error bound of the tensor estimation, and show that the obtained rate is minimax optimal under the considered model. We demonstrate the efficacy of our approach through both simulations and analyses of multiple real-world datasets on the tasks of tensor completion and clustering.
Tasks
Published	2018-11-13
URL	https://arxiv.org/abs/1811.05076v2
PDF	https://arxiv.org/pdf/1811.05076v2.pdf
PWC	https://paperswithcode.com/paper/learning-from-binary-multiway-data
Repo	https://github.com/Miaoyanwang/Binary-Tensor
Framework	none
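
As a rough illustration of the Bernoulli model with a rank constraint, the sketch below evaluates the log-likelihood of a binary three-way tensor whose parameter tensor has a rank-R CP form. The logistic link and the helper names `cp_entry` and `bernoulli_log_likelihood` are assumptions made for this example; the abstract does not fix a link function.

```python
import math

def cp_entry(factors, index):
    """Entry of a rank-R CP tensor: sum over r of the product of factors[mode][i][r]."""
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        total += term
    return total

def bernoulli_log_likelihood(Y, factors):
    """Log-likelihood of binary 3-way tensor Y with P(Y = 1) = sigmoid(theta)."""
    ll = 0.0
    for i, slab in enumerate(Y):
        for j, row in enumerate(slab):
            for k, y in enumerate(row):
                theta = cp_entry(factors, (i, j, k))
                # log sigmoid(theta) when y == 1, log sigmoid(-theta) when y == 0
                z = theta if y == 1 else -theta
                ll += -math.log1p(math.exp(-z))
    return ll

# Toy 2 x 2 x 2 tensor of ones and two rank-1 candidate parameter tensors.
Y = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
flat = [[[0.0], [0.0]] for _ in range(3)]      # theta = 0 everywhere
aligned = [[[1.0], [1.0]] for _ in range(3)]   # theta = 1 everywhere

print(bernoulli_log_likelihood(Y, flat), bernoulli_log_likelihood(Y, aligned))
```

A likelihood-based estimator would maximize this quantity over the factor matrices; the optimization itself is omitted here.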

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models


Title	Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models
Authors	Henry B. Moss, David S. Leslie, Paul Rayson
Abstract	K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so our performance estimates are in fact stochastic, with variability that can be substantial for natural language processing tasks. We demonstrate that these unstable estimates cannot be relied upon for effective parameter tuning. The resulting tuned parameters are highly sensitive to how our data is partitioned, meaning that we often select sub-optimal parameter choices and have serious reproducibility issues. Instead, we propose to use the less variable J-K-fold CV, in which J independent K-fold cross validations are used to assess performance. Our main contributions are extending J-K-fold CV from performance estimation to parameter tuning and investigating how to choose J and K. We argue that variability is more important than bias for effective tuning and so advocate lower choices of K than are typically seen in the NLP literature, instead using the saved computation to increase J. To demonstrate the generality of our recommendations we investigate a wide range of case studies: sentiment classification (both general and target-specific), part-of-speech tagging and document classification.
Tasks	Document Classification, Model Selection, Part-Of-Speech Tagging, Sentiment Analysis
Published	2018-06-19
URL	http://arxiv.org/abs/1806.07139v1
PDF	http://arxiv.org/pdf/1806.07139v1.pdf
PWC	https://paperswithcode.com/paper/using-j-k-fold-cross-validation-to-reduce
Repo	https://github.com/henrymoss/COLING2018
Framework	none
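
The J-K-fold procedure described above, J independent K-fold cross validations whose scores are averaged, can be sketched as follows. The function names, the toy labels, and the majority-class scorer are hypothetical stand-ins for a real model and dataset.

```python
import random
import statistics

def k_fold_indices(n, k, rng):
    """Randomly partition range(n) into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def jk_fold_cv(n, j, k, score_fn, seed=0):
    """Average score_fn over J independent K-fold partitions of n examples."""
    rng = random.Random(seed)
    scores = []
    for _ in range(j):
        for held_out in k_fold_indices(n, k, rng):
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            scores.append(score_fn(train, held_out))
    return statistics.mean(scores)

# Toy data and a majority-class "model" standing in for a real NLP pipeline.
labels = [0] * 12 + [1] * 8

def majority_accuracy(train, held_out):
    ones = sum(labels[i] for i in train)
    prediction = 1 if 2 * ones > len(train) else 0
    return sum(labels[i] == prediction for i in held_out) / len(held_out)

estimate = jk_fold_cv(len(labels), j=5, k=4, score_fn=majority_accuracy)
print(estimate)
```

For tuning, the same `jk_fold_cv` call would be repeated for each candidate parameter setting and the setting with the best averaged score selected; the abstract's recommendation corresponds to choosing a smaller `k` and a larger `j`.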

Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization


Title	Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization
Authors	Anil Goyal, Emilie Morvant, Massih-Reza Amini
Abstract	We tackle the issue of classifier combinations when observations have multiple views. Our method jointly learns view-specific weighted majority vote classifiers (i.e. for each view) over a set of base voters, and a second weighted majority vote classifier over the set of these view-specific weighted majority vote classifiers. We show that the empirical risk minimization of the final majority vote given a multiview training set can be cast as the minimization of Bregman divergences. This allows us to derive a parallel-update optimization algorithm for learning our multiview model. We empirically study our algorithm with a particular focus on the impact of the training set size on the multiview learning results. The experiments show that our approach is able to overcome the lack of labeled information.
Tasks	Document Classification, Multilingual text classification, Multiview Learning, Text Classification
Published	2018-05-25
URL	http://arxiv.org/abs/1805.10212v1
PDF	http://arxiv.org/pdf/1805.10212v1.pdf
PWC	https://paperswithcode.com/paper/multiview-learning-of-weighted-majority-vote
Repo	https://github.com/goyalanil/Multiview_Dataset_MNIST
Framework	none
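
The two-level vote described in the abstract can be illustrated with a small sketch: each view combines its base voters with view-specific weights, and a second set of weights combines the views. Using the real-valued view margin (rather than its sign) at the second level is an assumption of this example, and the weights are toy values rather than learned ones.

```python
def weighted_vote(weights, votes):
    """Weighted sum of votes."""
    return sum(w * v for w, v in zip(weights, votes))

def multiview_predict(base_votes, voter_weights, view_weights):
    """Combine +1/-1 base votes per view, then combine the view-level margins."""
    view_margins = [weighted_vote(q, h) for q, h in zip(voter_weights, base_votes)]
    final_margin = weighted_vote(view_weights, view_margins)
    return 1 if final_margin >= 0 else -1

# Two views with three base voters each (hypothetical votes on one example).
base_votes = [[1, -1, 1], [-1, -1, 1]]
voter_weights = [[0.5, 0.25, 0.25], [0.5, 0.25, 0.25]]

print(multiview_predict(base_votes, voter_weights, [0.75, 0.25]))
print(multiview_predict(base_votes, voter_weights, [0.25, 0.75]))
```

The paper's contribution is learning both sets of weights by Bregman divergence minimization; this sketch only shows how a prediction is formed once the weights are given.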

Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer


Title	Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer
Authors	Ashnil Kumar, Michael Fulham, Dagan Feng, Jinman Kim
Abstract	The analysis of multi-modality positron emission tomography and computed tomography (PET-CT) images for computer aided diagnosis applications requires combining the sensitivity of PET to detect abnormal regions with anatomical localization from CT. Current methods for PET-CT image analysis either process the modalities separately or fuse information from each modality based on knowledge about the image analysis task. These methods generally do not consider the spatially varying visual characteristics that encode different information across the different modalities, which have different priorities at different locations. For example, a high abnormal PET uptake in the lungs is more meaningful for tumor detection than physiological PET uptake in the heart. Our aim is to improve fusion of the complementary information in multi-modality PET-CT with a new supervised convolutional neural network (CNN) that learns to fuse complementary information for multi-modality medical image analysis. Our CNN first encodes modality-specific features and then uses them to derive a spatially varying fusion map that quantifies the relative importance of each modality’s features across different spatial locations. These fusion maps are then multiplied with the modality-specific feature maps to obtain a representation of the complementary multi-modality information at different locations, which can then be used for image analysis. We evaluated the ability of our CNN to detect and segment multiple regions with different fusion requirements using a dataset of PET-CT images of lung cancer. We compared our method to baseline techniques for multi-modality image fusion and segmentation. Our findings show that our CNN had a significantly higher foreground detection accuracy (99.29%, p < 0.05) than the fusion baselines and a significantly higher Dice score (63.85%) than recent PET-CT tumor segmentation methods.
Tasks
Published	2018-10-05
URL	https://arxiv.org/abs/1810.02492v2
PDF	https://arxiv.org/pdf/1810.02492v2.pdf
PWC	https://paperswithcode.com/paper/co-learning-feature-fusion-maps-from-pet-ct
Repo	https://github.com/ashnilkumar/colearn
Framework	tf
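
The multiplication of a spatially varying fusion map with a modality-specific feature map can be shown on a toy grid. In the paper the fusion maps are produced by the CNN; here they are hand-written values, and `apply_fusion_map` is a hypothetical helper name.

```python
def apply_fusion_map(feature_map, fusion_map):
    """Element-wise product of a 2D feature map with a 2D fusion map."""
    return [
        [f * w for f, w in zip(f_row, w_row)]
        for f_row, w_row in zip(feature_map, fusion_map)
    ]

# Toy 2 x 2 PET feature map and an assumed fusion map of per-location weights.
pet_features = [[2.0, 4.0], [6.0, 8.0]]
pet_fusion = [[1.0, 0.5], [0.0, 0.25]]

weighted_pet = apply_fusion_map(pet_features, pet_fusion)
print(weighted_pet)
```

A location with weight 0.0 contributes nothing from that modality, while a weight of 1.0 passes its features through unchanged.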

Stein Points


Title	Stein Points
Authors	Wilson Ye Chen, Lester Mackey, Jackson Gorham, François-Xavier Briol, Chris J. Oates
Abstract	An important task in computational statistics and machine learning is to approximate a posterior distribution $p(x)$ with an empirical measure supported on a set of representative points $\{x_i\}_{i=1}^n$. This paper focuses on methods where the selection of points is essentially deterministic, with an emphasis on achieving accurate approximation when $n$ is small. To this end, we present ‘Stein Points’. The idea is to exploit either a greedy or a conditional gradient method to iteratively minimise a kernel Stein discrepancy between the empirical measure and $p(x)$. Our empirical results demonstrate that Stein Points enable accurate approximation of the posterior at modest computational cost. In addition, theoretical results are provided to establish convergence of the method.
Tasks
Published	2018-03-27
URL	http://arxiv.org/abs/1803.10161v4
PDF	http://arxiv.org/pdf/1803.10161v4.pdf
PWC	https://paperswithcode.com/paper/stein-points
Repo	https://github.com/wilson-ye-chen/stein_points
Framework	none
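
The greedy variant can be sketched in one dimension. This example assumes a standard normal target, a Gaussian base kernel with bandwidth `h`, and a plain grid search for each new point; all three are simplifications chosen for illustration and are not claimed to match the authors' implementation.

```python
import math

def score(x):
    """Score function d/dx log p(x) of the assumed target N(0, 1)."""
    return -x

def stein_kernel(x, y, h=1.0):
    """Stein kernel built from a Gaussian base kernel k(x, y) with bandwidth h."""
    d = x - y
    k = math.exp(-d * d / (2 * h * h))
    dk_dx = -d / (h * h) * k
    dk_dy = d / (h * h) * k
    d2k_dxdy = (1 / (h * h) - d * d / h ** 4) * k
    return d2k_dxdy + score(x) * dk_dy + score(y) * dk_dx + score(x) * score(y) * k

def greedy_stein_points(n, grid):
    """Pick n points one at a time, each greedily reducing the kernel Stein discrepancy."""
    points = []
    for _ in range(n):
        def objective(x):
            # Terms of the squared discrepancy that depend on the new point x.
            return 0.5 * stein_kernel(x, x) + sum(stein_kernel(xj, x) for xj in points)
        points.append(min(grid, key=objective))
    return points

grid = [i / 10 for i in range(-40, 41)]  # candidate locations in [-4, 4]
points = greedy_stein_points(5, grid)
print(points)
```

The first point lands at the mode of the target, and later points are pushed away from those already selected, which is the behaviour the discrepancy is designed to encourage.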

Rethinking the Faster R-CNN Architecture for Temporal Action Localization


Title	Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Authors	Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David A. Ross, Jia Deng, Rahul Sukthankar
Abstract	We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework. TAL-Net addresses three key shortcomings of existing approaches: (1) we improve receptive field alignment using a multi-scale architecture that can accommodate extreme variation in action durations; (2) we better exploit the temporal context of actions for both proposal generation and action classification by appropriately extending receptive fields; and (3) we explicitly consider multi-stream feature fusion and demonstrate that fusing motion late is important. We achieve state-of-the-art performance for both action proposal and localization on the THUMOS’14 detection benchmark and competitive performance on the ActivityNet challenge.
Tasks	Action Classification, Action Localization, Object Detection, Temporal Action Localization
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07667v1
PDF	http://arxiv.org/pdf/1804.07667v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-the-faster-r-cnn-architecture-for
Repo	https://github.com/devbas/ovassistant-alpha
Framework	none

A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues


Title	A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
Authors	Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler
Abstract	In this paper, we comprehensively describe the methodology of our submissions to the One-Minute Gradual-Emotion Behavior Challenge 2018.
Tasks
Published	2018-05-02
URL	https://arxiv.org/abs/1805.00638v2
PDF	https://arxiv.org/pdf/1805.00638v2.pdf
PWC	https://paperswithcode.com/paper/a-deep-network-for-arousal-valence-emotion
Repo	https://github.com/pengsongyou/OMG-ADSC
Framework	pytorch

Comparative Document Summarisation via Classification


Title	Comparative Document Summarisation via Classification
Authors	Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie
Abstract	This paper considers extractive summarisation in a comparative setting: given two or more document groups (e.g., separated by publication time), the goal is to select a small number of documents that are representative of each group, and also maximally distinguishable from other groups. We formulate a set of new objective functions for this problem that connect recent literature on document summarisation, interpretable machine learning, and data subset selection. In particular, by casting the problem as a binary classification amongst different groups, we derive objectives based on the notion of maximum mean discrepancy, as well as a simple yet effective gradient-based optimisation strategy. Our new formulation allows scalable evaluations of comparative summarisation as a classification task, both automatically and via crowd-sourcing. To this end, we evaluate comparative summarisation methods on a newly curated collection of controversial news topics over 13 months. We observe that gradient-based optimisation outperforms discrete and baseline approaches in 14 out of 24 different automatic evaluation settings. In crowd-sourced evaluations, summaries from gradient optimisation elicit 7% more accurate classification from human workers than discrete optimisation. Our result contrasts with recent literature on submodular data subset selection that favours discrete optimisation. We posit that our formulation of comparative summarisation will prove useful in a diverse range of use cases such as comparing content sources, authors, related topics, or distinct view points.
Tasks	Interpretable Machine Learning
Published	2018-12-06
URL	https://arxiv.org/abs/1812.02171v2
PDF	https://arxiv.org/pdf/1812.02171v2.pdf
PWC	https://paperswithcode.com/paper/comparative-document-summarisation-via
Repo	https://github.com/computationalmedia/compsumm
Framework	none
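
Maximum mean discrepancy, the quantity the objectives above build on, can be estimated from two samples as in the sketch below. The RBF kernel, the `gamma` value, and the toy vectors are assumptions for illustration; how the paper turns MMD into a selection objective is not reproduced here.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd_squared(X, Y, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples X and Y."""
    def mean_kernel(A, B):
        return sum(rbf(a, b, gamma) for a in A for b in B) / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2 * mean_kernel(X, Y)

# Hypothetical document embeddings for two groups.
group_a = [[0.0, 0.0], [1.0, 0.0]]
group_b = [[5.0, 5.0], [6.0, 5.0]]

print(mmd_squared(group_a, group_a), mmd_squared(group_a, group_b))
```

A small value between a candidate summary and its own group indicates representativeness, while a large value against other groups indicates distinguishability.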

Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach


Title	Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach
Authors	Tarun Kumar Gupta, Khalid Raza
Abstract	The performance of a feedforward neural network (FNN) depends heavily on the selection of architecture and training algorithm. FNN architecture can be tweaked using several parameters, such as the number of hidden layers, the number of hidden neurons at each hidden layer and the number of connections between layers. There may be exponentially many combinations of these architectural attributes, which is unmanageable manually, so an algorithm is required that can automatically design an optimal architecture with high generalization ability. Numerous optimization algorithms have been utilized for FNN architecture determination. This paper proposes a new methodology which can work on the estimation of hidden layers and their respective neurons for FNN. This work combines the advantages of Tabu search (TS) and Gradient descent with momentum backpropagation (GDM) training algorithm to demonstrate how Tabu search can automatically select the best architecture from the populated architectures based on minimum testing error criteria. The proposed approach has been tested on four classification benchmark datasets of different sizes.
Tasks
Published	2018-08-17
URL	http://arxiv.org/abs/1808.05979v1
PDF	http://arxiv.org/pdf/1808.05979v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-deep-neural-network-architecture-a
Repo	https://github.com/marzekan/Digit-classifier
Framework	none
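
A minimal tabu search over (hidden layers, neurons per layer) pairs might look like the sketch below. The neighbourhood moves, the tabu list size, and especially `toy_test_error`, which stands in for training a network with GDM and measuring its test error, are hypothetical choices made for this example.

```python
def tabu_search(initial, neighbors, cost, iterations=20, tabu_size=5):
    """Move to the best non-tabu neighbour each step and remember the best seen."""
    current = best = initial
    tabu = [initial]
    for _ in range(iterations):
        candidates = [c for c in neighbors(current) if c not in tabu]
        if not candidates:
            break
        current = min(candidates, key=cost)  # accepted even if worse than current
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)  # forget the oldest tabu entry
        if cost(current) < cost(best):
            best = current
    return best

def neighbors(arch):
    """Architectures reachable by changing depth by 1 or width by 4."""
    layers, units = arch
    moves = [(layers + 1, units), (layers - 1, units),
             (layers, units + 4), (layers, units - 4)]
    return [(l, u) for l, u in moves if 1 <= l <= 4 and 4 <= u <= 64]

def toy_test_error(arch):
    """Hypothetical stand-in for the test error of a trained network."""
    layers, units = arch
    return (layers - 2) ** 2 + ((units - 32) / 4) ** 2

best_arch = tabu_search((1, 8), neighbors, toy_test_error)
print(best_arch)
```

The tabu list keeps the search from immediately revisiting recent architectures, which lets it move through worse candidates without cycling.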

Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models


Title	Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models
Authors	Shanshan Wu, Sujay Sanghavi, Alexandros G. Dimakis
Abstract	We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an $\ell_{2,1}$ group-norm constraint needs to be used. We show that this algorithm can recover any arbitrary discrete pairwise graphical model, and also characterize its sample complexity as a function of model width, alphabet size, edge parameter accuracy, and the number of variables. We show that along every one of these axes, it matches or improves on all existing results and algorithms for this problem. Our analysis applies a sharp generalization error bound for logistic regression when the weight vector has an $\ell_1$ constraint (or $\ell_{2,1}$ constraint) and the sample vector has an $\ell_{\infty}$ constraint (or $\ell_{2, \infty}$ constraint). We also show that the proposed convex programs can be efficiently solved in $\tilde{O}(n^2)$ running time (where $n$ is the number of variables) under the same statistical guarantees. We provide experimental results to support our analysis.
Tasks
Published	2018-10-28
URL	https://arxiv.org/abs/1810.11905v3
PDF	https://arxiv.org/pdf/1810.11905v3.pdf
PWC	https://paperswithcode.com/paper/sparse-logistic-regression-learns-all
Repo	https://github.com/wushanshan/GraphLearn
Framework	none

Exploration by Random Network Distillation


Title	Exploration by Random Network Distillation
Authors	Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov
Abstract	We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular we establish state of the art performance on Montezuma’s Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
Tasks	Atari Games, Montezuma’s Revenge
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12894v1
PDF	http://arxiv.org/pdf/1810.12894v1.pdf
PWC	https://paperswithcode.com/paper/exploration-by-random-network-distillation
Repo	https://github.com/kngwyu/intrinsic-rewards
Framework	pytorch
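
The exploration bonus, the error of a predictor trying to match a fixed randomly initialized network, can be sketched with tiny networks in plain Python. The single tanh layer target, the linear predictor, the learning rate, and the two observations are all assumptions for illustration; the paper uses deep networks on Atari frames.

```python
import math
import random

def make_weights(n_out, n_in, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def target_features(W, obs):
    """Fixed, randomly initialized target network (one tanh layer here)."""
    return [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in W]

def predict(V, obs):
    """Trainable predictor network (linear here)."""
    return [sum(v * x for v, x in zip(row, obs)) for row in V]

def rnd_bonus(W, V, obs):
    """Exploration bonus: squared error between predictor and target features."""
    t, p = target_features(W, obs), predict(V, obs)
    return sum((a - b) ** 2 for a, b in zip(t, p))

def train_step(W, V, obs, lr=0.1):
    """One gradient step on the predictor only; the target W is never updated."""
    t, p = target_features(W, obs), predict(V, obs)
    for row, a, b in zip(V, t, p):
        for i, x in enumerate(obs):
            row[i] -= lr * 2 * (b - a) * x

rng = random.Random(0)
W = make_weights(4, 2, rng)           # frozen target weights
V = [[0.0, 0.0] for _ in range(4)]    # predictor starts at zero
seen, novel = [1.0, 0.5], [-0.5, 1.0]

before = rnd_bonus(W, V, seen)
for _ in range(50):
    train_step(W, V, seen)
after = rnd_bonus(W, V, seen)
print(before, after, rnd_bonus(W, V, novel))
```

After repeated visits the bonus for the familiar observation shrinks towards zero, while the unvisited observation keeps a larger bonus, which is what drives the agent towards novel states.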

Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games


Title	Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games
Authors	Zhengxing Chen, Chris Amato, Truong-Huy Nguyen, Seth Cooper, Yizhou Sun, Magy Seif El-Nasr
Abstract	Deck building is a crucial component in playing Collectible Card Games (CCGs). The goal of deck building is to choose a fixed-sized subset of cards from a large card pool, so that they work well together in-game against specific opponents. Existing methods either lack flexibility to adapt to different opponents or require large computational resources, making them unsuitable for any real-time or large-scale application. We propose a new deck recommendation system, named Q-DeckRec, which learns a deck search policy during a training phase and uses it to solve deck building problem instances. Our experimental results demonstrate that, after a training phase, Q-DeckRec requires fewer computational resources to build winning-effective decks than several baseline methods.
Tasks	Card Games
Published	2018-06-26
URL	http://arxiv.org/abs/1806.09771v1
PDF	http://arxiv.org/pdf/1806.09771v1.pdf
PWC	https://paperswithcode.com/paper/q-deckrec-a-fast-deck-recommendation-system
Repo	https://github.com/czxttkl/X-AI
Framework	tf

An Adaptive Locally Connected Neuron Model: Focusing Neuron


Title	An Adaptive Locally Connected Neuron Model: Focusing Neuron
Authors	F. Boray Tek
Abstract	We present a new artificial neuron model capable of learning its receptive field in the spatial domain of inputs. The name for the new model is focusing neuron because it can adapt both its receptive field location and size (aperture) during training. A network or a layer formed of such neurons can learn and generate unique connection structures for particular inputs/problems. The new model requires neither heuristics nor additional optimizations. Hence, all parameters, including those controlling the focus, can be trained using stochastic gradient descent optimization. We have empirically shown the capacity and viability of the new model with tests on synthetic and real datasets. We have constructed simple networks with one or two hidden layers; we also employed fully connected networks with the same configurations as controls. In noise-added synthetic Gaussian blob datasets, we observed that focusing neurons can steer their receptive fields away from the redundant inputs and focus on more informative ones. Tests on common datasets such as MNIST have shown that a network of two hidden focusing layers can perform better (99.21% test accuracy) than a fully connected dense network with the same configuration.
Tasks
Published	2018-08-31
URL	https://arxiv.org/abs/1809.09533v2
PDF	https://arxiv.org/pdf/1809.09533v2.pdf
PWC	https://paperswithcode.com/paper/an-adaptive-locally-connected-neuron-model
Repo	https://github.com/btekgit/FocusingNeuron
Framework	none

Exposing DeepFake Videos By Detecting Face Warping Artifacts


Title	Exposing DeepFake Videos By Detecting Face Warping Artifacts
Authors	Yuezun Li, Siwei Lyu
Abstract	In this work, we describe a new deep learning based method that can effectively distinguish AI-generated fake videos (referred to as DeepFake videos hereafter) from real videos. Our method is based on the observation that current DeepFake algorithms can only generate images of limited resolution, which need to be further warped to match the original faces in the source video. Such transforms leave distinctive artifacts in the resulting DeepFake videos, and we show that they can be effectively captured by convolutional neural networks (CNNs). Compared to previous methods which use a large amount of real and DeepFake generated images to train a CNN classifier, our method does not need DeepFake generated images as negative training examples since we target the artifacts in affine face warping as the distinctive feature to distinguish real and fake images. The advantages of our method are two-fold: (1) Such artifacts can be simulated directly using simple image processing operations on an image to make it a negative example. Since training a DeepFake model to generate negative examples is time-consuming and resource-demanding, our method saves plenty of time and resources in training data collection; (2) Since such artifacts generally exist in DeepFake videos from different sources, our method is more robust compared to others. Our method is evaluated on two sets of DeepFake video datasets for its effectiveness in practice.
Tasks	Face Swapping
Published	2018-11-01
URL	https://arxiv.org/abs/1811.00656v3
PDF	https://arxiv.org/pdf/1811.00656v3.pdf
PWC	https://paperswithcode.com/paper/exposing-deepfake-videos-by-detecting-face
Repo	https://github.com/danmohaha/CVPRW2019_Face_Artifacts
Framework	tf