October 20, 2019

Paper Group AWR 250

AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning. Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality. Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models. Multiview Learning of Weighted Majority Vote by Bregman Divergence Min …

AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning

Title AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning
Authors Jinyuan Jia, Neil Zhenqiang Gong
Abstract Users in various web and mobile applications are vulnerable to attribute inference attacks, in which an attacker leverages a machine learning classifier to infer a target user’s private attributes (e.g., location, sexual orientation, political view) from its public data (e.g., rating scores, page likes). Existing defenses leverage game theory or heuristics based on correlations between the public data and attributes. These defenses are not practical. Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users’ public data. In this paper, we present AttriGuard, a practical defense against attribute inference attacks. AttriGuard is computationally tractable and has small utility loss. Our AttriGuard works in two phases. Suppose we aim to protect a user’s private attribute. In Phase I, for each value of the attribute, we find a minimum noise such that if we add the noise to the user’s public data, then the attacker’s classifier is very likely to infer the attribute value for the user. We find the minimum noise via adapting existing evasion attacks in adversarial machine learning. In Phase II, we sample one attribute value according to a certain probability distribution and add the corresponding noise found in Phase I to the user’s public data. We formulate finding the probability distribution as solving a constrained convex optimization problem. We extensively evaluate AttriGuard and compare it with existing methods using a real-world dataset. Our results show that AttriGuard substantially outperforms existing methods. Our work is the first one that shows evasion attacks can be used as defensive techniques for privacy protection.
Tasks
Published 2018-05-13
URL http://arxiv.org/abs/1805.04810v1
PDF http://arxiv.org/pdf/1805.04810v1.pdf
PWC https://paperswithcode.com/paper/attriguard-a-practical-defense-against
Repo https://github.com/jjy1994/AttriGuard
Framework tf
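
The second phase described in the AttriGuard abstract (sample one attribute value, then add the noise found for it in Phase I) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the noise vectors, the attribute values `city_a` and `city_b`, and the probabilities are made-up placeholders. The Phase I evasion attack and the convex optimization that produces the probability distribution are not shown.

```python
import random


def apply_phase_two(public_data, noise_by_value, probabilities, rng):
    """Sample one attribute value and add its Phase I noise to the public data."""
    values = list(noise_by_value)
    weights = [probabilities[v] for v in values]
    # Draw a single attribute value according to the given distribution.
    chosen = rng.choices(values, weights=weights, k=1)[0]
    noise = noise_by_value[chosen]
    noisy = [x + r for x, r in zip(public_data, noise)]
    return chosen, noisy


# Hypothetical toy inputs: four rating scores and one noise vector per value.
public_data = [5.0, 0.0, 3.0, 1.0]
noise_by_value = {
    "city_a": [0.0, 1.0, 0.0, 0.0],
    "city_b": [-1.0, 0.0, 0.0, 2.0],
}
probabilities = {"city_a": 0.5, "city_b": 0.5}

chosen, noisy = apply_phase_two(
    public_data, noise_by_value, probabilities, random.Random(0)
)
```

Because the attribute value is sampled rather than fixed, the noise added to a user's data is randomized.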

Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality

Title Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality
Authors Miaoyan Wang, Lexin Li
Abstract We consider the problem of decomposing a multiway tensor with binary entries. Such data problems arise frequently in numerous applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose that the observed binary entries follow a Bernoulli model, develop a rank-constrained likelihood-based estimation procedure, and obtain theoretical accuracy guarantees. Specifically, we establish the error bound of the tensor estimation, and show that the obtained rate is minimax optimal under the considered model. We demonstrate the efficacy of our approach through both simulations and analyses of multiple real-world datasets on the tasks of tensor completion and clustering.
Tasks
Published 2018-11-13
URL https://arxiv.org/abs/1811.05076v2
PDF https://arxiv.org/pdf/1811.05076v2.pdf
PWC https://paperswithcode.com/paper/learning-from-binary-multiway-data
Repo https://github.com/Miaoyanwang/Binary-Tensor
Framework none

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

Title Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models
Authors Henry B. Moss, David S. Leslie, Paul Rayson
Abstract K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so our performance estimates are in fact stochastic, with variability that can be substantial for natural language processing tasks. We demonstrate that these unstable estimates cannot be relied upon for effective parameter tuning. The resulting tuned parameters are highly sensitive to how our data is partitioned, meaning that we often select sub-optimal parameter choices and have serious reproducibility issues. Instead, we propose to use the less variable J-K-fold CV, in which J independent K-fold cross validations are used to assess performance. Our main contributions are extending J-K-fold CV from performance estimation to parameter tuning and investigating how to choose J and K. We argue that variability is more important than bias for effective tuning and so advocate lower choices of K than are typically seen in the NLP literature, instead using the saved computation to increase J. To demonstrate the generality of our recommendations we investigate a wide range of case-studies: sentiment classification (both general and target-specific), part-of-speech tagging and document classification.
Tasks Document Classification, Model Selection, Part-Of-Speech Tagging, Sentiment Analysis
Published 2018-06-19
URL http://arxiv.org/abs/1806.07139v1
PDF http://arxiv.org/pdf/1806.07139v1.pdf
PWC https://paperswithcode.com/paper/using-j-k-fold-cross-validation-to-reduce
Repo https://github.com/henrymoss/COLING2018
Framework none
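
As a rough illustration of the J-K-fold idea in the abstract above, the following Python sketch averages a score over J independent random K-fold partitions. The function names and the `evaluate(train, test)` callback are assumptions made for this example and are not taken from the authors' repository.

```python
import random
from statistics import mean


def k_fold_indices(n_items, k, rng):
    """Randomly partition range(n_items) into k folds."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def jk_fold_score(data, evaluate, j, k, seed=0):
    """Average evaluate(train, test) over J independent K-fold partitions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(j):
        # Each repetition draws a fresh random partition of the data.
        for held_out in k_fold_indices(len(data), k, rng):
            held = set(held_out)
            train = [data[i] for i in range(len(data)) if i not in held]
            test = [data[i] for i in held_out]
            scores.append(evaluate(train, test))
    return mean(scores)


def held_out_fraction(train, test):
    """Toy stand-in for a real train-and-score routine."""
    return len(test) / (len(train) + len(test))


score = jk_fold_score(list(range(10)), held_out_fraction, j=3, k=2)
```

For tuning, one would call `jk_fold_score` once per candidate parameter setting and keep the setting with the best averaged score. The abstract's recommendation corresponds to choosing a smaller `k` and a larger `j` for the same compute budget.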

Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization

Title Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization
Authors Anil Goyal, Emilie Morvant, Massih-Reza Amini
Abstract We tackle the issue of classifier combinations when observations have multiple views. Our method jointly learns view-specific weighted majority vote classifiers (i.e. for each view) over a set of base voters, and a second weighted majority vote classifier over the set of these view-specific weighted majority vote classifiers. We show that the empirical risk minimization of the final majority vote given a multiview training set can be cast as the minimization of Bregman divergences. This allows us to derive a parallel-update optimization algorithm for learning our multiview model. We empirically study our algorithm with a particular focus on the impact of the training set size on the multiview learning results. The experiments show that our approach is able to overcome the lack of labeled information.
Tasks Document Classification, Multilingual text classification, Multiview Learning, Text Classification
Published 2018-05-25
URL http://arxiv.org/abs/1805.10212v1
PDF http://arxiv.org/pdf/1805.10212v1.pdf
PWC https://paperswithcode.com/paper/multiview-learning-of-weighted-majority-vote
Repo https://github.com/goyalanil/Multiview_Dataset_MNIST
Framework none

Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer

Title Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer
Authors Ashnil Kumar, Michael Fulham, Dagan Feng, Jinman Kim
Abstract The analysis of multi-modality positron emission tomography and computed tomography (PET-CT) images for computer aided diagnosis applications requires combining the sensitivity of PET to detect abnormal regions with anatomical localization from CT. Current methods for PET-CT image analysis either process the modalities separately or fuse information from each modality based on knowledge about the image analysis task. These methods generally do not consider the spatially varying visual characteristics that encode different information across the different modalities, which have different priorities at different locations. For example, a high abnormal PET uptake in the lungs is more meaningful for tumor detection than physiological PET uptake in the heart. Our aim is to improve fusion of the complementary information in multi-modality PET-CT with a new supervised convolutional neural network (CNN) that learns to fuse complementary information for multi-modality medical image analysis. Our CNN first encodes modality-specific features and then uses them to derive a spatially varying fusion map that quantifies the relative importance of each modality’s features across different spatial locations. These fusion maps are then multiplied with the modality-specific feature maps to obtain a representation of the complementary multi-modality information at different locations, which can then be used for image analysis. We evaluated the ability of our CNN to detect and segment multiple regions with different fusion requirements using a dataset of PET-CT images of lung cancer. We compared our method to baseline techniques for multi-modality image fusion and segmentation. Our findings show that our CNN had a significantly higher foreground detection accuracy (99.29%, p < 0.05) than the fusion baselines and a significantly higher Dice score (63.85%) than recent PET-CT tumor segmentation methods.
Tasks
Published 2018-10-05
URL https://arxiv.org/abs/1810.02492v2
PDF https://arxiv.org/pdf/1810.02492v2.pdf
PWC https://paperswithcode.com/paper/co-learning-feature-fusion-maps-from-pet-ct
Repo https://github.com/ashnilkumar/colearn
Framework tf
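
The multiplication step mentioned in the abstract above (fusion maps multiplied with modality-specific feature maps) amounts to an element-wise product per spatial location. The sketch below shows only that step on tiny 2x2 maps with invented values. In the paper the fusion maps are produced by the CNN, whereas here they are simply given as inputs.

```python
def apply_fusion_map(feature_map, fusion_map):
    """Weight each spatial location of a feature map by its fusion value."""
    return [
        [f * w for f, w in zip(f_row, w_row)]
        for f_row, w_row in zip(feature_map, fusion_map)
    ]


# Hypothetical 2x2 feature maps for each modality.
pet_features = [[2.0, 4.0], [6.0, 8.0]]
ct_features = [[1.0, 1.0], [2.0, 2.0]]

# Hypothetical fusion maps: the relative importance of each modality
# at every spatial location.
pet_fusion = [[1.0, 0.5], [0.0, 0.25]]
ct_fusion = [[0.0, 0.5], [1.0, 0.75]]

pet_weighted = apply_fusion_map(pet_features, pet_fusion)
ct_weighted = apply_fusion_map(ct_features, ct_fusion)
```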

Stein Points

Title Stein Points
Authors Wilson Ye Chen, Lester Mackey, Jackson Gorham, François-Xavier Briol, Chris J. Oates
Abstract An important task in computational statistics and machine learning is to approximate a posterior distribution $p(x)$ with an empirical measure supported on a set of representative points $\{x_i\}_{i=1}^n$. This paper focuses on methods where the selection of points is essentially deterministic, with an emphasis on achieving accurate approximation when $n$ is small. To this end, we present ‘Stein Points’. The idea is to exploit either a greedy or a conditional gradient method to iteratively minimise a kernel Stein discrepancy between the empirical measure and $p(x)$. Our empirical results demonstrate that Stein Points enable accurate approximation of the posterior at modest computational cost. In addition, theoretical results are provided to establish convergence of the method.
Tasks
Published 2018-03-27
URL http://arxiv.org/abs/1803.10161v4
PDF http://arxiv.org/pdf/1803.10161v4.pdf
PWC https://paperswithcode.com/paper/stein-points
Repo https://github.com/wilson-ye-chen/stein_points
Framework none
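
To make the greedy variant in the Stein Points abstract concrete, here is a one-dimensional Python sketch. It rests on several assumptions that do not come from the abstract: an RBF base kernel with bandwidth `h`, a standard normal target (score function `-x`), a fixed candidate grid in place of a numerical optimiser, and a greedy objective of half the Stein kernel at the candidate plus its sum against previously chosen points. Treat it as an illustration of the idea rather than the authors' algorithm.

```python
import math


def stein_kernel(x, y, score, h=1.0):
    """Stein kernel k_0 built from an RBF base kernel (1D case)."""
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))
    dk_dx = -d / (h * h) * k
    dk_dy = d / (h * h) * k
    d2k = (1.0 / (h * h) - d * d / h ** 4) * k
    return d2k + score(x) * dk_dy + score(y) * dk_dx + score(x) * score(y) * k


def greedy_stein_points(score, candidates, n_points, h=1.0):
    """Pick points one at a time to keep the kernel Stein discrepancy small."""
    points = []
    for _ in range(n_points):
        def objective(x):
            cross = sum(stein_kernel(p, x, score, h) for p in points)
            return 0.5 * stein_kernel(x, x, score, h) + cross
        points.append(min(candidates, key=objective))
    return points


def ksd(points, score, h=1.0):
    """Kernel Stein discrepancy of the empirical measure on `points`."""
    total = sum(stein_kernel(a, b, score, h) for a in points for b in points)
    return math.sqrt(max(total, 0.0)) / len(points)


def standard_normal_score(x):
    """Gradient of the log-density of N(0, 1)."""
    return -x


grid = [(i - 30) / 10.0 for i in range(61)]  # candidates in [-3, 3]
points = greedy_stein_points(standard_normal_score, grid, n_points=5)
```

Only the score function of the target is needed, so the normalising constant of $p(x)$ never has to be computed.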

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

Title Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Authors Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David A. Ross, Jia Deng, Rahul Sukthankar
Abstract We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework. TAL-Net addresses three key shortcomings of existing approaches: (1) we improve receptive field alignment using a multi-scale architecture that can accommodate extreme variation in action durations; (2) we better exploit the temporal context of actions for both proposal generation and action classification by appropriately extending receptive fields; and (3) we explicitly consider multi-stream feature fusion and demonstrate that fusing motion late is important. We achieve state-of-the-art performance for both action proposal and localization on THUMOS’14 detection benchmark and competitive performance on ActivityNet challenge.
Tasks Action Classification, Action Localization, Object Detection, Temporal Action Localization
Published 2018-04-20
URL http://arxiv.org/abs/1804.07667v1
PDF http://arxiv.org/pdf/1804.07667v1.pdf
PWC https://paperswithcode.com/paper/rethinking-the-faster-r-cnn-architecture-for
Repo https://github.com/devbas/ovassistant-alpha
Framework none

A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues

Title A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
Authors Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler
Abstract In this paper, we comprehensively describe the methodology of our submissions to the One-Minute Gradual-Emotion Behavior Challenge 2018.
Tasks
Published 2018-05-02
URL https://arxiv.org/abs/1805.00638v2
PDF https://arxiv.org/pdf/1805.00638v2.pdf
PWC https://paperswithcode.com/paper/a-deep-network-for-arousal-valence-emotion
Repo https://github.com/pengsongyou/OMG-ADSC
Framework pytorch

Comparative Document Summarisation via Classification

Title Comparative Document Summarisation via Classification
Authors Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie
Abstract This paper considers extractive summarisation in a comparative setting: given two or more document groups (e.g., separated by publication time), the goal is to select a small number of documents that are representative of each group, and also maximally distinguishable from other groups. We formulate a set of new objective functions for this problem that connect recent literature on document summarisation, interpretable machine learning, and data subset selection. In particular, by casting the problem as a binary classification amongst different groups, we derive objectives based on the notion of maximum mean discrepancy, as well as a simple yet effective gradient-based optimisation strategy. Our new formulation allows scalable evaluations of comparative summarisation as a classification task, both automatically and via crowd-sourcing. To this end, we evaluate comparative summarisation methods on a newly curated collection of controversial news topics over 13 months. We observe that gradient-based optimisation outperforms discrete and baseline approaches in 14 out of 24 different automatic evaluation settings. In crowd-sourced evaluations, summaries from gradient optimisation elicit 7% more accurate classification from human workers than discrete optimisation. Our result contrasts with recent literature on submodular data subset selection that favours discrete optimisation. We posit that our formulation of comparative summarisation will prove useful in a diverse range of use cases such as comparing content sources, authors, related topics, or distinct view points.
Tasks Interpretable Machine Learning
Published 2018-12-06
URL https://arxiv.org/abs/1812.02171v2
PDF https://arxiv.org/pdf/1812.02171v2.pdf
PWC https://paperswithcode.com/paper/comparative-document-summarisation-via
Repo https://github.com/computationalmedia/compsumm
Framework none
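
The summarisation abstract above builds its objectives on maximum mean discrepancy (MMD). As a small worked example, the Python sketch below computes the biased squared-MMD estimate between a document group and a candidate summary. The RBF kernel, the `gamma` value, and the two-dimensional toy "document vectors" are assumptions for illustration only.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


# Hypothetical document embeddings for one group and two candidate summaries.
group = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
summary_near = [[0.5, 0.5]]
summary_far = [[5.0, 5.0]]

near = mmd_squared(group, summary_near)
far = mmd_squared(group, summary_far)
```

A summary that represents its group well yields a small MMD to that group. In the comparative setting, one would also want a large MMD to the other groups.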

Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach

Title Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach
Authors Tarun Kumar Gupta, Khalid Raza
Abstract The performance of a feedforward neural network (FNN) fully depends upon the selection of architecture and training algorithm. FNN architecture can be tweaked using several parameters, such as the number of hidden layers, number of hidden neurons at each hidden layer and number of connections between layers. There may be exponential combinations for these architectural attributes which may be unmanageable manually, so it requires an algorithm which can automatically design an optimal architecture with high generalization ability. Numerous optimization algorithms have been utilized for FNN architecture determination. This paper proposes a new methodology which can work on the estimation of hidden layers and their respective neurons for FNN. This work combines the advantages of Tabu search (TS) and Gradient descent with momentum backpropagation (GDM) training algorithm to demonstrate how Tabu search can automatically select the best architecture from the populated architectures based on minimum testing error criteria. The proposed approach has been tested on four classification benchmark datasets of different sizes.
Tasks
Published 2018-08-17
URL http://arxiv.org/abs/1808.05979v1
PDF http://arxiv.org/pdf/1808.05979v1.pdf
PWC https://paperswithcode.com/paper/optimizing-deep-neural-network-architecture-a
Repo https://github.com/marzekan/Digit-classifier
Framework none
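
A generic tabu search over (hidden layers, neurons per layer) pairs gives a feel for the approach described in the abstract above. Everything specific in this Python sketch is an assumption: the neighbourhood moves, the bounds, the tabu list size, and above all `toy_test_error`, which stands in for training a network with GDM and measuring its testing error.

```python
def neighbours(arch, max_layers=4, max_neurons=64):
    """Architectures reachable by one small change to layers or neurons."""
    layers, neurons = arch
    moves = [
        (layers + 1, neurons),
        (layers - 1, neurons),
        (layers, neurons + 4),
        (layers, neurons - 4),
    ]
    return [(l, n) for l, n in moves if 1 <= l <= max_layers and 4 <= n <= max_neurons]


def tabu_search(start, test_error, iterations=50, tabu_size=10):
    """Move to the best non-tabu neighbour each step and remember the best seen."""
    current = best = start
    tabu = [start]
    for _ in range(iterations):
        candidates = [a for a in neighbours(current) if a not in tabu]
        if not candidates:
            break
        # Accept the best neighbour even if it is worse than the current one.
        current = min(candidates, key=test_error)
        tabu = (tabu + [current])[-tabu_size:]
        if test_error(current) < test_error(best):
            best = current
    return best


def toy_test_error(arch):
    """Hypothetical error surface with its minimum at 2 layers, 32 neurons."""
    layers, neurons = arch
    return (layers - 2) ** 2 + ((neurons - 32) / 4) ** 2


best = tabu_search((1, 4), toy_test_error)
```

The tabu list prevents the search from stepping straight back to recently visited architectures, which lets it leave a local minimum instead of oscillating around it.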

Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models

Title Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models
Authors Shanshan Wu, Sujay Sanghavi, Alexandros G. Dimakis
Abstract We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an $\ell_{2,1}$ group-norm constraint needs to be used. We show that this algorithm can recover any arbitrary discrete pairwise graphical model, and also characterize its sample complexity as a function of model width, alphabet size, edge parameter accuracy, and the number of variables. We show that along every one of these axes, it matches or improves on all existing results and algorithms for this problem. Our analysis applies a sharp generalization error bound for logistic regression when the weight vector has an $\ell_1$ constraint (or $\ell_{2,1}$ constraint) and the sample vector has an $\ell_{\infty}$ constraint (or $\ell_{2, \infty}$ constraint). We also show that the proposed convex programs can be efficiently solved in $\tilde{O}(n^2)$ running time (where $n$ is the number of variables) under the same statistical guarantees. We provide experimental results to support our analysis.
Tasks
Published 2018-10-28
URL https://arxiv.org/abs/1810.11905v3
PDF https://arxiv.org/pdf/1810.11905v3.pdf
PWC https://paperswithcode.com/paper/sparse-logistic-regression-learns-all
Repo https://github.com/wushanshan/GraphLearn
Framework none

Exploration by Random Network Distillation

Title Exploration by Random Network Distillation
Authors Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov
Abstract We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular we establish state of the art performance on Montezuma’s Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
Tasks Atari Games, Montezuma’s Revenge
Published 2018-10-30
URL http://arxiv.org/abs/1810.12894v1
PDF http://arxiv.org/pdf/1810.12894v1.pdf
PWC https://paperswithcode.com/paper/exploration-by-random-network-distillation
Repo https://github.com/kngwyu/intrinsic-rewards
Framework pytorch
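
The bonus described in the RND abstract above is the prediction error of a trained network against a fixed, randomly initialised one. The sketch below shows that mechanism in plain Python. Single linear maps stand in for the neural networks, and the learning rate and observation are invented for the example, so this is a simplification and not the paper's architecture.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, obs):
    """Apply a linear map to an observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def bonus(target, predictor, obs):
    """Exploration bonus: squared error of the predictor against the fixed target."""
    t = forward(target, obs)
    p = forward(predictor, obs)
    return sum((a - b) ** 2 for a, b in zip(p, t))


def train_step(target, predictor, obs, lr=0.05):
    """One gradient step on the predictor only; the target is never updated."""
    t = forward(target, obs)
    p = forward(predictor, obs)
    for row, p_i, t_i in zip(predictor, p, t):
        for j, x in enumerate(obs):
            row[j] -= lr * 2.0 * (p_i - t_i) * x


rng = random.Random(0)
target = make_linear(3, 4, rng)     # fixed random network
predictor = make_linear(3, 4, rng)  # trained to imitate the target

seen = [1.0, 0.5, -0.5]
before = bonus(target, predictor, seen)
for _ in range(20):
    train_step(target, predictor, seen)
after = bonus(target, predictor, seen)
```

Observations the agent has visited often end up with a small bonus, while unfamiliar ones keep a large prediction error and so remain attractive to explore.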

Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games

Title Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games
Authors Zhengxing Chen, Chris Amato, Truong-Huy Nguyen, Seth Cooper, Yizhou Sun, Magy Seif El-Nasr
Abstract Deck building is a crucial component in playing Collectible Card Games (CCGs). The goal of deck building is to choose a fixed-sized subset of cards from a large card pool, so that they work well together in-game against specific opponents. Existing methods either lack flexibility to adapt to different opponents or require large computational resources, still making them unsuitable for any real-time or large-scale application. We propose a new deck recommendation system, named Q-DeckRec, which learns a deck search policy during a training phase and uses it to solve deck building problem instances. Our experimental results demonstrate Q-DeckRec requires less computational resources to build winning-effective decks after a training phase compared to several baseline methods.
Tasks Card Games
Published 2018-06-26
URL http://arxiv.org/abs/1806.09771v1
PDF http://arxiv.org/pdf/1806.09771v1.pdf
PWC https://paperswithcode.com/paper/q-deckrec-a-fast-deck-recommendation-system
Repo https://github.com/czxttkl/X-AI
Framework tf

An Adaptive Locally Connected Neuron Model: Focusing Neuron

Title An Adaptive Locally Connected Neuron Model: Focusing Neuron
Authors F. Boray Tek
Abstract We present a new artificial neuron model capable of learning its receptive field in the spatial domain of inputs. The name for the new model is focusing neuron because it can adapt both its receptive field location and size (aperture) during training. A network or a layer formed of such neurons can learn and generate unique connection structures for particular inputs/problems. The new model requires neither heuristics nor additional optimizations. Hence, all parameters, including those controlling the focus, could be trained using the stochastic gradient descent optimization. We have empirically shown the capacity and viability of the new model with tests on synthetic and real datasets. We have constructed simple networks with one or two hidden layers, and also employed fully connected networks with the same configurations as controls. In noise-added synthetic Gaussian blob datasets, we observed that focusing neurons can steer their receptive fields away from the redundant inputs and focus on more informative ones. Tests on common datasets such as MNIST have shown that a network of two hidden focusing layers can perform better (99.21% test accuracy) than a fully connected dense network with the same configuration.
Tasks
Published 2018-08-31
URL https://arxiv.org/abs/1809.09533v2
PDF https://arxiv.org/pdf/1809.09533v2.pdf
PWC https://paperswithcode.com/paper/an-adaptive-locally-connected-neuron-model
Repo https://github.com/btekgit/FocusingNeuron
Framework none

Exposing DeepFake Videos By Detecting Face Warping Artifacts

Title Exposing DeepFake Videos By Detecting Face Warping Artifacts
Authors Yuezun Li, Siwei Lyu
Abstract In this work, we describe a new deep learning based method that can effectively distinguish AI-generated fake videos (referred to as DeepFake videos hereafter) from real videos. Our method is based on the observation that current DeepFake algorithms can only generate images of limited resolution, which need to be further warped to match the original faces in the source video. Such transforms leave distinctive artifacts in the resulting DeepFake videos, and we show that they can be effectively captured by convolutional neural networks (CNNs). Compared to previous methods, which use a large number of real and DeepFake-generated images to train a CNN classifier, our method does not need DeepFake-generated images as negative training examples, since we target the artifacts in affine face warping as the distinctive feature to distinguish real and fake images. The advantages of our method are two-fold: (1) Such artifacts can be simulated directly using simple image processing operations on an image to make it a negative example. Since training a DeepFake model to generate negative examples is time-consuming and resource-demanding, our method saves plenty of time and resources in training data collection; (2) Since such artifacts generally exist in DeepFake videos from different sources, our method is more robust compared to others. Our method is evaluated on two sets of DeepFake video datasets for its effectiveness in practice.
Tasks Face Swapping
Published 2018-11-01
URL https://arxiv.org/abs/1811.00656v3
PDF https://arxiv.org/pdf/1811.00656v3.pdf
PWC https://paperswithcode.com/paper/exposing-deepfake-videos-by-detecting-face
Repo https://github.com/danmohaha/CVPRW2019_Face_Artifacts
Framework tf
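
The abstract above says negative examples can be simulated with simple image processing instead of running a DeepFake model. As a loose illustration, the Python sketch below degrades the resolution of a square region of a grayscale image stored as nested lists. The specific operation (nearest-neighbour down- and up-sampling by `factor`) is an assumption chosen for brevity; it is a stand-in for the paper's processing, not a reproduction of it.

```python
def simulate_warp_artifact(image, top, left, size, factor=4):
    """Return a copy of `image` whose square region has lost resolution."""
    out = [row[:] for row in image]
    for r in range(size):
        for c in range(size):
            # Every pixel in a factor x factor cell takes the value of the
            # cell's top-left pixel, mimicking a low-resolution patch that
            # has been resized back to the original region.
            src_r = top + (r // factor) * factor
            src_c = left + (c // factor) * factor
            out[top + r][left + c] = image[src_r][src_c]
    return out


# Hypothetical 16x16 grayscale image with a distinct value at each pixel.
image = [[r * 16 + c for c in range(16)] for r in range(16)]

# Treat the central 8x8 block as the "face" region to degrade.
negative = simulate_warp_artifact(image, top=4, left=4, size=8)
```

Pairs such as `(image, negative)` could then serve as positive and negative training examples, with the original image itself left unmodified.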