Paper Group AWR 250
AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning
Title | AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning |
Authors | Jinyuan Jia, Neil Zhenqiang Gong |
Abstract | Users in various web and mobile applications are vulnerable to attribute inference attacks, in which an attacker leverages a machine learning classifier to infer a target user’s private attributes (e.g., location, sexual orientation, political view) from its public data (e.g., rating scores, page likes). Existing defenses leverage game theory or heuristics based on correlations between the public data and attributes. These defenses are not practical. Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users’ public data. In this paper, we present AttriGuard, a practical defense against attribute inference attacks. AttriGuard is computationally tractable and has small utility loss. Our AttriGuard works in two phases. Suppose we aim to protect a user’s private attribute. In Phase I, for each value of the attribute, we find a minimum noise such that if we add the noise to the user’s public data, then the attacker’s classifier is very likely to infer the attribute value for the user. We find the minimum noise via adapting existing evasion attacks in adversarial machine learning. In Phase II, we sample one attribute value according to a certain probability distribution and add the corresponding noise found in Phase I to the user’s public data. We formulate finding the probability distribution as solving a constrained convex optimization problem. We extensively evaluate AttriGuard and compare it with existing methods using a real-world dataset. Our results show that AttriGuard substantially outperforms existing methods. Our work is the first one that shows evasion attacks can be used as defensive techniques for privacy protection. |
Tasks | |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04810v1 |
http://arxiv.org/pdf/1805.04810v1.pdf | |
PWC | https://paperswithcode.com/paper/attriguard-a-practical-defense-against |
Repo | https://github.com/jjy1994/AttriGuard |
Framework | tf |
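Phase II of the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the per-value noise vectors from Phase I and the probability distribution from the convex program are already available, and the names `phase2_add_noise`, `noise_per_value`, and `probs` are hypothetical.

```python
import random

def phase2_add_noise(public_data, noise_per_value, probs, rng):
    """Sample one attribute value and add its Phase I noise to the public data.

    noise_per_value: dict mapping attribute value -> noise vector (assumed given).
    probs: dict mapping attribute value -> sampling probability (assumed given).
    """
    values = list(noise_per_value)
    # Draw a single attribute value according to the supplied distribution.
    chosen = rng.choices(values, weights=[probs[v] for v in values], k=1)[0]
    # Add the corresponding noise element-wise to the user's public data.
    noisy = [x + r for x, r in zip(public_data, noise_per_value[chosen])]
    return chosen, noisy

rng = random.Random(0)
noise = {"a": [0, 1, 0], "b": [1, 0, 0]}
chosen, noisy = phase2_add_noise([1, 0, 1], noise, {"a": 1.0, "b": 0.0}, rng)
```

Finding the minimum noise (Phase I) and the distribution itself are the hard parts and are not shown here.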
Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality
Title | Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality |
Authors | Miaoyan Wang, Lexin Li |
Abstract | We consider the problem of decomposing a multiway tensor with binary entries. Such data problems arise frequently in numerous applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose that the observed binary entries follow a Bernoulli model, develop a rank-constrained likelihood-based estimation procedure, and obtain theoretical accuracy guarantees. Specifically, we establish the error bound of the tensor estimation, and show that the obtained rate is minimax optimal under the considered model. We demonstrate the efficacy of our approach through both simulations and analyses of multiple real-world datasets on the tasks of tensor completion and clustering. |
Tasks | |
Published | 2018-11-13 |
URL | https://arxiv.org/abs/1811.05076v2 |
https://arxiv.org/pdf/1811.05076v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-binary-multiway-data |
Repo | https://github.com/Miaoyanwang/Binary-Tensor |
Framework | none |
Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models
Title | Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models |
Authors | Henry B. Moss, David S. Leslie, Paul Rayson |
Abstract | K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so our performance estimates are in fact stochastic, with variability that can be substantial for natural language processing tasks. We demonstrate that these unstable estimates cannot be relied upon for effective parameter tuning. The resulting tuned parameters are highly sensitive to how our data is partitioned, meaning that we often select sub-optimal parameter choices and have serious reproducibility issues. Instead, we propose to use the less variable J-K-fold CV, in which J independent K-fold cross validations are used to assess performance. Our main contributions are extending J-K-fold CV from performance estimation to parameter tuning and investigating how to choose J and K. We argue that variability is more important than bias for effective tuning and so advocate lower choices of K than are typically seen in the NLP literature, instead using the saved computation to increase J. To demonstrate the generality of our recommendations we investigate a wide range of case-studies: sentiment classification (both general and target-specific), part-of-speech tagging and document classification. |
Tasks | Document Classification, Model Selection, Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07139v1 |
http://arxiv.org/pdf/1806.07139v1.pdf | |
PWC | https://paperswithcode.com/paper/using-j-k-fold-cross-validation-to-reduce |
Repo | https://github.com/henrymoss/COLING2018 |
Framework | none |
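The J-K-fold procedure described in the abstract above amounts to repeating K-fold CV on J independent random partitions and averaging all J×K scores. A minimal standard-library sketch follows. The function names and the `score_fn(train_idx, test_idx)` callback are assumptions made for illustration and are not taken from the authors' repository.

```python
import random
from statistics import mean

def k_fold_indices(n, k, rng):
    """Randomly partition range(n) into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def jk_fold_cv(n, j, k, score_fn, seed=0):
    """Average score over J independent K-fold partitions.

    score_fn(train_idx, test_idx) -> float is a hypothetical user-supplied
    callback that trains on train_idx and evaluates on test_idx.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(j):
        folds = k_fold_indices(n, k, rng)  # fresh random partition each repeat
        for i, test in enumerate(folds):
            train = [t for m, fold in enumerate(folds) if m != i for t in fold]
            scores.append(score_fn(train, test))
    return mean(scores)
```

For parameter tuning, one would call `jk_fold_cv` once per candidate setting and keep the setting with the best averaged score. A lower K with a higher J keeps the total number of model fits (J×K) fixed while averaging over more partitions.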
Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization
Title | Multiview Learning of Weighted Majority Vote by Bregman Divergence Minimization |
Authors | Anil Goyal, Emilie Morvant, Massih-Reza Amini |
Abstract | We tackle the issue of classifier combinations when observations have multiple views. Our method jointly learns view-specific weighted majority vote classifiers (i.e. for each view) over a set of base voters, and a second weighted majority vote classifier over the set of these view-specific weighted majority vote classifiers. We show that the empirical risk minimization of the final majority vote given a multiview training set can be cast as the minimization of Bregman divergences. This allows us to derive a parallel-update optimization algorithm for learning our multiview model. We empirically study our algorithm with a particular focus on the impact of the training set size on the multiview learning results. The experiments show that our approach is able to overcome the lack of labeled information. |
Tasks | Document Classification, Multilingual text classification, Multiview Learning, Text Classification |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10212v1 |
http://arxiv.org/pdf/1805.10212v1.pdf | |
PWC | https://paperswithcode.com/paper/multiview-learning-of-weighted-majority-vote |
Repo | https://github.com/goyalanil/Multiview_Dataset_MNIST |
Framework | none |
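The two-level vote described in the abstract above can be illustrated at prediction time only. The sketch below assumes binary votes in {-1, +1} and already-learned weights, and it passes real-valued view scores to the second level. That last choice is an assumption, and the Bregman-divergence learning step is not shown. All names are hypothetical.

```python
def weighted_vote(votes, weights):
    """Weighted sum of votes; its sign is the majority-vote decision."""
    return sum(w * v for v, w in zip(votes, weights))

def multiview_predict(view_votes, voter_weights, view_weights):
    """Combine base-voter outputs per view, then combine the views.

    view_votes[v][i]: vote in {-1, +1} of base voter i on view v.
    voter_weights[v][i]: weight of base voter i within view v.
    view_weights[v]: weight of view v in the final vote.
    """
    # First level: one weighted majority vote per view.
    view_scores = [weighted_vote(v, q) for v, q in zip(view_votes, voter_weights)]
    # Second level: weighted majority vote over the view-specific votes.
    final = weighted_vote(view_scores, view_weights)
    return 1 if final >= 0 else -1

votes = [[1, -1, 1], [-1, -1]]
weights = [[0.5, 0.25, 0.25], [0.5, 0.5]]
label = multiview_predict(votes, weights, [0.75, 0.25])
```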
Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer
Title | Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer |
Authors | Ashnil Kumar, Michael Fulham, Dagan Feng, Jinman Kim |
Abstract | The analysis of multi-modality positron emission tomography and computed tomography (PET-CT) images for computer aided diagnosis applications requires combining the sensitivity of PET to detect abnormal regions with anatomical localization from CT. Current methods for PET-CT image analysis either process the modalities separately or fuse information from each modality based on knowledge about the image analysis task. These methods generally do not consider the spatially varying visual characteristics that encode different information across the different modalities, which have different priorities at different locations. For example, a high abnormal PET uptake in the lungs is more meaningful for tumor detection than physiological PET uptake in the heart. Our aim is to improve fusion of the complementary information in multi-modality PET-CT with a new supervised convolutional neural network (CNN) that learns to fuse complementary information for multi-modality medical image analysis. Our CNN first encodes modality-specific features and then uses them to derive a spatially varying fusion map that quantifies the relative importance of each modality’s features across different spatial locations. These fusion maps are then multiplied with the modality-specific feature maps to obtain a representation of the complementary multi-modality information at different locations, which can then be used for image analysis. We evaluated the ability of our CNN to detect and segment multiple regions with different fusion requirements using a dataset of PET-CT images of lung cancer. We compared our method to baseline techniques for multi-modality image fusion and segmentation. Our findings show that our CNN had a significantly higher foreground detection accuracy (99.29%, p < 0.05) than the fusion baselines and a significantly higher Dice score (63.85%) than recent PET-CT tumor segmentation methods. |
Tasks | |
Published | 2018-10-05 |
URL | https://arxiv.org/abs/1810.02492v2 |
https://arxiv.org/pdf/1810.02492v2.pdf | |
PWC | https://paperswithcode.com/paper/co-learning-feature-fusion-maps-from-pet-ct |
Repo | https://github.com/ashnilkumar/colearn |
Framework | tf |
Stein Points
Title | Stein Points |
Authors | Wilson Ye Chen, Lester Mackey, Jackson Gorham, François-Xavier Briol, Chris J. Oates |
Abstract | An important task in computational statistics and machine learning is to approximate a posterior distribution $p(x)$ with an empirical measure supported on a set of representative points $\{x_i\}_{i=1}^n$. This paper focuses on methods where the selection of points is essentially deterministic, with an emphasis on achieving accurate approximation when $n$ is small. To this end, we present 'Stein Points'. The idea is to exploit either a greedy or a conditional gradient method to iteratively minimise a kernel Stein discrepancy between the empirical measure and $p(x)$. Our empirical results demonstrate that Stein Points enable accurate approximation of the posterior at modest computational cost. In addition, theoretical results are provided to establish convergence of the method. |
Tasks | |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10161v4 |
http://arxiv.org/pdf/1803.10161v4.pdf | |
PWC | https://paperswithcode.com/paper/stein-points |
Repo | https://github.com/wilson-ye-chen/stein_points |
Framework | none |
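A greedy variant of the idea in the abstract above can be sketched in one dimension. The example makes several simplifying assumptions that are not taken from the paper's code: the target is a standard normal (score $-x$), the base kernel is Gaussian with bandwidth `h`, and each new point is chosen by exhaustive search over a fixed grid. Each step picks the point minimising half the Stein kernel's diagonal plus its interaction with the points already selected.

```python
import math

def stein_kernel(x, y, h=1.0):
    """Langevin Stein kernel for a N(0, 1) target with a Gaussian base kernel."""
    d = x - y
    k = math.exp(-d * d / (2 * h * h))
    dk_dx = -d / (h * h) * k                      # derivative of k in x
    dk_dy = d / (h * h) * k                       # derivative of k in y
    d2k = (1 / (h * h) - d * d / h ** 4) * k      # mixed second derivative
    sx, sy = -x, -y                               # score of N(0, 1)
    return d2k + sx * dk_dy + sy * dk_dx + sx * sy * k

def ksd2(points):
    """Squared kernel Stein discrepancy of the empirical measure on points."""
    n = len(points)
    return sum(stein_kernel(a, b) for a in points for b in points) / (n * n)

def greedy_stein_points(n, grid):
    """Select n points from grid, one at a time, by the greedy rule."""
    points = []
    for _ in range(n):
        def objective(x):
            return stein_kernel(x, x) / 2 + sum(stein_kernel(p, x) for p in points)
        points.append(min(grid, key=objective))
    return points

grid = [i / 10 for i in range(-30, 31)]
points = greedy_stein_points(5, grid)
```

In this setting the first point lands on the mode at 0, and later points spread outwards. A practical implementation would replace the grid search with a numerical optimiser.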
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Title | Rethinking the Faster R-CNN Architecture for Temporal Action Localization |
Authors | Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David A. Ross, Jia Deng, Rahul Sukthankar |
Abstract | We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework. TAL-Net addresses three key shortcomings of existing approaches: (1) we improve receptive field alignment using a multi-scale architecture that can accommodate extreme variation in action durations; (2) we better exploit the temporal context of actions for both proposal generation and action classification by appropriately extending receptive fields; and (3) we explicitly consider multi-stream feature fusion and demonstrate that fusing motion late is important. We achieve state-of-the-art performance for both action proposal and localization on the THUMOS’14 detection benchmark and competitive performance on the ActivityNet challenge. |
Tasks | Action Classification, Action Localization, Object Detection, Temporal Action Localization |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07667v1 |
http://arxiv.org/pdf/1804.07667v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-the-faster-r-cnn-architecture-for |
Repo | https://github.com/devbas/ovassistant-alpha |
Framework | none |
A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
Title | A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues |
Authors | Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler |
Abstract | In this paper, we comprehensively describe the methodology of our submissions to the One-Minute Gradual-Emotion Behavior Challenge 2018. |
Tasks | |
Published | 2018-05-02 |
URL | https://arxiv.org/abs/1805.00638v2 |
https://arxiv.org/pdf/1805.00638v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-network-for-arousal-valence-emotion |
Repo | https://github.com/pengsongyou/OMG-ADSC |
Framework | pytorch |
Comparative Document Summarisation via Classification
Title | Comparative Document Summarisation via Classification |
Authors | Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon, Lexing Xie |
Abstract | This paper considers extractive summarisation in a comparative setting: given two or more document groups (e.g., separated by publication time), the goal is to select a small number of documents that are representative of each group, and also maximally distinguishable from other groups. We formulate a set of new objective functions for this problem that connect recent literature on document summarisation, interpretable machine learning, and data subset selection. In particular, by casting the problem as a binary classification amongst different groups, we derive objectives based on the notion of maximum mean discrepancy, as well as a simple yet effective gradient-based optimisation strategy. Our new formulation allows scalable evaluations of comparative summarisation as a classification task, both automatically and via crowd-sourcing. To this end, we evaluate comparative summarisation methods on a newly curated collection of controversial news topics over 13 months. We observe that gradient-based optimisation outperforms discrete and baseline approaches in 14 out of 24 different automatic evaluation settings. In crowd-sourced evaluations, summaries from gradient optimisation elicit 7% more accurate classification from human workers than discrete optimisation. Our result contrasts with recent literature on submodular data subset selection that favours discrete optimisation. We posit that our formulation of comparative summarisation will prove useful in a diverse range of use cases such as comparing content sources, authors, related topics, or distinct view points. |
Tasks | Interpretable Machine Learning |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02171v2 |
https://arxiv.org/pdf/1812.02171v2.pdf | |
PWC | https://paperswithcode.com/paper/comparative-document-summarisation-via |
Repo | https://github.com/computationalmedia/compsumm |
Framework | none |
Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach
Title | Optimizing Deep Neural Network Architecture: A Tabu Search Based Approach |
Authors | Tarun Kumar Gupta, Khalid Raza |
Abstract | The performance of a feedforward neural network (FNN) depends fully upon the selection of architecture and training algorithm. FNN architecture can be tweaked using several parameters, such as the number of hidden layers, the number of hidden neurons at each hidden layer and the number of connections between layers. The number of combinations of these architectural attributes may be exponential and unmanageable manually, so an algorithm is required that can automatically design an optimal architecture with high generalization ability. Numerous optimization algorithms have been utilized for FNN architecture determination. This paper proposes a new methodology which can work on the estimation of hidden layers and their respective neurons for FNN. This work combines the advantages of Tabu search (TS) and the gradient descent with momentum backpropagation (GDM) training algorithm to demonstrate how Tabu search can automatically select the best architecture from the populated architectures based on a minimum testing error criterion. The proposed approach has been tested on four classification benchmark datasets of different sizes. |
Tasks | |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05979v1 |
http://arxiv.org/pdf/1808.05979v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-deep-neural-network-architecture-a |
Repo | https://github.com/marzekan/Digit-classifier |
Framework | none |
Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models
Title | Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models |
Authors | Shanshan Wu, Sujay Sanghavi, Alexandros G. Dimakis |
Abstract | We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an $\ell_{2,1}$ group-norm constraint needs to be used. We show that this algorithm can recover any arbitrary discrete pairwise graphical model, and also characterize its sample complexity as a function of model width, alphabet size, edge parameter accuracy, and the number of variables. We show that along every one of these axes, it matches or improves on all existing results and algorithms for this problem. Our analysis applies a sharp generalization error bound for logistic regression when the weight vector has an $\ell_1$ constraint (or $\ell_{2,1}$ constraint) and the sample vector has an $\ell_{\infty}$ constraint (or $\ell_{2, \infty}$ constraint). We also show that the proposed convex programs can be efficiently solved in $\tilde{O}(n^2)$ running time (where $n$ is the number of variables) under the same statistical guarantees. We provide experimental results to support our analysis. |
Tasks | |
Published | 2018-10-28 |
URL | https://arxiv.org/abs/1810.11905v3 |
https://arxiv.org/pdf/1810.11905v3.pdf | |
PWC | https://paperswithcode.com/paper/sparse-logistic-regression-learns-all |
Repo | https://github.com/wushanshan/GraphLearn |
Framework | none |
Exploration by Random Network Distillation
Title | Exploration by Random Network Distillation |
Authors | Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov |
Abstract | We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular we establish state of the art performance on Montezuma’s Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level. |
Tasks | Atari Games, Montezuma’s Revenge |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12894v1 |
http://arxiv.org/pdf/1810.12894v1.pdf | |
PWC | https://paperswithcode.com/paper/exploration-by-random-network-distillation |
Repo | https://github.com/kngwyu/intrinsic-rewards |
Framework | pytorch |
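The exploration bonus described in the abstract above is the error of a trained predictor against a fixed, randomly initialized target. The sketch below illustrates that mechanism with linear maps in place of neural networks, a deliberate simplification. The class name `RNDBonus` and its parameters are hypothetical and unrelated to the linked repository.

```python
import random

class RNDBonus:
    """Toy random-network-distillation bonus using linear maps (an assumption)."""

    def __init__(self, obs_dim, feat_dim, lr=0.1, seed=0):
        rng = random.Random(seed)

        def init():
            return [[rng.gauss(0, 1) for _ in range(obs_dim)] for _ in range(feat_dim)]

        self.target = init()     # fixed random map, never updated
        self.predictor = init()  # trained to imitate the target
        self.lr = lr

    @staticmethod
    def _apply(weights, obs):
        return [sum(w * x for w, x in zip(row, obs)) for row in weights]

    def bonus(self, obs):
        """Squared prediction error: large for observations rarely trained on."""
        t = self._apply(self.target, obs)
        p = self._apply(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        """One gradient step moving the predictor towards the target on obs."""
        t = self._apply(self.target, obs)
        p = self._apply(self.predictor, obs)
        for row, pi, ti in zip(self.predictor, p, t):
            err = pi - ti
            for j, xj in enumerate(obs):
                row[j] -= self.lr * err * xj

rnd = RNDBonus(obs_dim=3, feat_dim=4)
obs = [1.0, 0.5, -0.5]
before = rnd.bonus(obs)
for _ in range(50):
    rnd.update(obs)
after = rnd.bonus(obs)
```

After repeated visits to the same observation the bonus shrinks towards zero, which is what pushes an agent towards states it has seen less often.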
Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games
Title | Q-DeckRec: A Fast Deck Recommendation System for Collectible Card Games |
Authors | Zhengxing Chen, Chris Amato, Truong-Huy Nguyen, Seth Cooper, Yizhou Sun, Magy Seif El-Nasr |
Abstract | Deck building is a crucial component in playing Collectible Card Games (CCGs). The goal of deck building is to choose a fixed-sized subset of cards from a large card pool, so that they work well together in-game against specific opponents. Existing methods either lack flexibility to adapt to different opponents or require large computational resources, making them unsuitable for any real-time or large-scale application. We propose a new deck recommendation system, named Q-DeckRec, which learns a deck search policy during a training phase and uses it to solve deck-building problem instances. Our experimental results demonstrate that Q-DeckRec requires fewer computational resources to build winning-effective decks after a training phase compared to several baseline methods. |
Tasks | Card Games |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09771v1 |
http://arxiv.org/pdf/1806.09771v1.pdf | |
PWC | https://paperswithcode.com/paper/q-deckrec-a-fast-deck-recommendation-system |
Repo | https://github.com/czxttkl/X-AI |
Framework | tf |
An Adaptive Locally Connected Neuron Model: Focusing Neuron
Title | An Adaptive Locally Connected Neuron Model: Focusing Neuron |
Authors | F. Boray Tek |
Abstract | We present a new artificial neuron model capable of learning its receptive field in the spatial domain of inputs. The name for the new model is focusing neuron because it can adapt both its receptive field location and size (aperture) during training. A network or a layer formed of such neurons can learn and generate unique connection structures for particular inputs/problems. The new model requires neither heuristics nor additional optimizations. Hence, all parameters, including those controlling the focus, can be trained using stochastic gradient descent. We have empirically shown the capacity and viability of the new model with tests on synthetic and real datasets. We constructed simple networks with one or two hidden layers and also employed fully connected networks with the same configurations as controls. In noise-added synthetic Gaussian blob datasets, we observed that focusing neurons can steer their receptive fields away from the redundant inputs and focus on more informative ones. Tests on common datasets such as MNIST have shown that a network of two hidden focusing layers can perform better (99.21% test accuracy) than a fully connected dense network with the same configuration. |
Tasks | |
Published | 2018-08-31 |
URL | https://arxiv.org/abs/1809.09533v2 |
https://arxiv.org/pdf/1809.09533v2.pdf | |
PWC | https://paperswithcode.com/paper/an-adaptive-locally-connected-neuron-model |
Repo | https://github.com/btekgit/FocusingNeuron |
Framework | none |
Exposing DeepFake Videos By Detecting Face Warping Artifacts
Title | Exposing DeepFake Videos By Detecting Face Warping Artifacts |
Authors | Yuezun Li, Siwei Lyu |
Abstract | In this work, we describe a new deep learning based method that can effectively distinguish AI-generated fake videos (referred to as DeepFake videos hereafter) from real videos. Our method is based on the observation that current DeepFake algorithms can only generate images of limited resolution, which need to be further warped to match the original faces in the source video. Such transforms leave distinctive artifacts in the resulting DeepFake videos, and we show that they can be effectively captured by convolutional neural networks (CNNs). Compared to previous methods, which use a large number of real and DeepFake-generated images to train a CNN classifier, our method does not need DeepFake-generated images as negative training examples, since we target the artifacts in affine face warping as the distinctive feature to distinguish real and fake images. The advantages of our method are two-fold: (1) Such artifacts can be simulated directly using simple image processing operations on an image to make it a negative example. Since training a DeepFake model to generate negative examples is time-consuming and resource-demanding, our method saves plenty of time and resources in training data collection; (2) Since such artifacts generally exist in DeepFake videos from different sources, our method is more robust than others. Our method is evaluated on two sets of DeepFake video datasets for its effectiveness in practice. |
Tasks | Face Swapping |
Published | 2018-11-01 |
URL | https://arxiv.org/abs/1811.00656v3 |
https://arxiv.org/pdf/1811.00656v3.pdf | |
PWC | https://paperswithcode.com/paper/exposing-deepfake-videos-by-detecting-face |
Repo | https://github.com/danmohaha/CVPRW2019_Face_Artifacts |
Framework | tf |