Paper Group AWR 18
AdvHat: Real-world adversarial attack on ArcFace Face ID system
Title | AdvHat: Real-world adversarial attack on ArcFace Face ID system |
Authors | Stepan Komkov, Aleksandr Petiushko |
Abstract | In this paper we propose a novel, easily reproducible technique to attack the best public Face ID system, ArcFace, in different shooting conditions. To create an attack, we print a rectangular paper sticker on a common color printer and put it on a hat. The adversarial sticker is prepared with a novel algorithm for off-plane transformations of the image which imitates the sticker's location on the hat. Such an approach confuses the state-of-the-art public Face ID model LResNet100E-IR, ArcFace@ms1m-refine-v2 and is transferable to other Face ID models. |
Tasks | Adversarial Attack |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08705v1 |
https://arxiv.org/pdf/1908.08705v1.pdf | |
PWC | https://paperswithcode.com/paper/advhat-real-world-adversarial-attack-on |
Repo | https://github.com/papermsucode/advhat |
Framework | tf |
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Title | CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval |
Authors | Zihao Wang, Xihui Liu, Hongsheng Li, Lu Sheng, Junjie Yan, Xiaogang Wang, Jing Shao |
Abstract | Text-image cross-modal retrieval is a challenging task in the field of language and vision. Most previous approaches independently embed images and sentences into a joint embedding space and compare their similarities. However, previous approaches rarely explore the interactions between images and sentences before calculating similarities in the joint space. Intuitively, when matching images and sentences, human beings would alternately attend to regions in images and words in sentences, and select the most salient information considering the interaction between both modalities. In this paper, we propose Cross-modal Adaptive Message Passing (CAMP), which adaptively controls the information flow for message passing across modalities. Our approach not only takes comprehensive and fine-grained cross-modal interactions into account, but also properly handles negative pairs and irrelevant information with an adaptive gating scheme. Moreover, instead of conventional joint embedding approaches for text-image matching, we infer the matching score based on the fused features, and propose a hardest negative binary cross-entropy loss for training. Results on COCO and Flickr30k significantly surpass state-of-the-art methods, demonstrating the effectiveness of our approach. |
Tasks | Cross-Modal Retrieval, Image Retrieval |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05506v1 |
https://arxiv.org/pdf/1909.05506v1.pdf | |
PWC | https://paperswithcode.com/paper/camp-cross-modal-adaptive-message-passing-for |
Repo | https://github.com/ZihaoWang-CV/CAMP_iccv19 |
Framework | pytorch |
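The "hardest negative binary cross-entropy loss" named in the CAMP abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `hardest_negative_bce`, the score-matrix layout, and the exact way the terms are combined are assumptions made for illustration. It assumes `scores[i][j]` is a predicted matching probability for image `i` and sentence `j`, with true pairs on the diagonal, and penalizes only the highest-scoring wrong sentence and wrong image for each pair.

```python
import math


def hardest_negative_bce(scores):
    """Hypothetical hardest-negative BCE over a square score matrix.

    scores[i][j] is a matching probability in (0, 1) for image i and
    sentence j; diagonal entries are the true pairs. Needs at least 2 pairs.
    """
    n = len(scores)
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for i in range(n):
        positive = scores[i][i]
        # Highest-scoring wrong sentence for image i.
        hardest_sentence = max(scores[i][j] for j in range(n) if j != i)
        # Highest-scoring wrong image for sentence i.
        hardest_image = max(scores[j][i] for j in range(n) if j != i)
        total += -math.log(positive + eps)
        total += -math.log(1.0 - hardest_sentence + eps)
        total += -math.log(1.0 - hardest_image + eps)
    return total / n


well_separated = [[0.9, 0.1], [0.2, 0.8]]
confused = [[0.5, 0.6], [0.7, 0.4]]
print(hardest_negative_bce(well_separated) < hardest_negative_bce(confused))
```

Using only the hardest negative concentrates the gradient on the most confusable wrong pair in the batch, rather than averaging over many easy negatives.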
The Implicit Bias of Depth: How Incremental Learning Drives Generalization
Title | The Implicit Bias of Depth: How Incremental Learning Drives Generalization |
Authors | Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely |
Abstract | A leading hypothesis for the surprising generalization of neural networks is that the dynamics of gradient descent bias the model towards simple solutions, by searching through the solution space in an incremental order of complexity. We formally define the notion of incremental learning dynamics and derive the conditions on depth and initialization for which this phenomenon arises in deep linear models. Our main theoretical contribution is a dynamical depth separation result, proving that while shallow models can exhibit incremental learning dynamics, they require the initialization to be exponentially small for these dynamics to present themselves. However, once the model becomes deeper, the dependence becomes polynomial and incremental learning can arise in more natural settings. We complement our theoretical findings by experimenting with deep matrix sensing, quadratic neural networks and with binary classification using diagonal and convolutional linear networks, showing all of these models exhibit incremental learning. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12051v2 |
https://arxiv.org/pdf/1909.12051v2.pdf | |
PWC | https://paperswithcode.com/paper/the-implicit-bias-of-depth-how-incremental |
Repo | https://github.com/dsgissin/Incremental-Learning |
Framework | tf |
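The incremental learning dynamics described in the abstract above can be seen in a toy setting. The sketch below is an assumption-laden illustration, not the paper's experiments: it trains a single coordinate of a depth-2 diagonal linear model, parameterized as `w = u * u`, by gradient descent on the loss `0.5 * (u*u - sigma)**2` from a small initialization. The function name `steps_to_learn` and all hyperparameters are hypothetical.

```python
def steps_to_learn(sigma, init=1e-3, lr=0.01, max_steps=100_000):
    """Count gradient steps until w = u*u reaches half of the target sigma."""
    u = init
    for step in range(max_steps):
        if u * u >= 0.5 * sigma:
            return step
        # Gradient of 0.5 * (u*u - sigma)**2 with respect to u.
        grad = 2.0 * u * (u * u - sigma)
        u -= lr * grad
    return max_steps


# From the same small initialization, the larger target is fitted first.
print(steps_to_learn(4.0), steps_to_learn(1.0))
```

Near the small initialization, `u` grows roughly geometrically at a rate proportional to `sigma`, so coordinates with larger targets are learned earlier. This is the kind of ordering the abstract calls incremental learning.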
Geometric Back-projection Network for Point Cloud Classification
Title | Geometric Back-projection Network for Point Cloud Classification |
Authors | Shi Qiu, Saeed Anwar, Nick Barnes |
Abstract | As the basic task of point cloud learning, classification is fundamental but always challenging. To address some unsolved problems of existing methods, we propose a CNN-based network that uses an error-correcting feedback structure to comprehensively capture the local features of 3D point clouds. We also enrich the explicit and implicit geometric information of point clouds in low-level 3D space and high-level feature space, respectively. By applying an attention module based on channel affinity that focuses on distinct channels, the learned feature map of our network can effectively avoid redundancy. The performance on synthetic and real-world datasets demonstrates the superiority and applicability of our network. Compared with other state-of-the-art methods, our approach balances accuracy and efficiency. |
Tasks | |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12885v3 |
https://arxiv.org/pdf/1911.12885v3.pdf | |
PWC | https://paperswithcode.com/paper/geometric-feedback-network-for-point-cloud |
Repo | https://github.com/ShiQiu0419/GFNet |
Framework | pytorch |
Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures
Title | Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures |
Authors | Filipe Rodrigues, Carlos Lima Azevedo |
Abstract | Reinforcement learning (RL) constitutes a promising solution for alleviating the problem of traffic congestion. In particular, deep RL algorithms have been shown to produce adaptive traffic signal controllers that outperform conventional systems. However, in order to be reliable in highly dynamic urban areas, such controllers need to be robust with respect to a series of exogenous sources of uncertainty. In this paper, we develop an open-source callback-based framework for promoting the flexible evaluation of different deep RL configurations under a traffic simulation environment. With this framework, we investigate how deep RL-based adaptive traffic controllers perform under different scenarios, namely under demand surges caused by special events, capacity reductions from incidents and sensor failures. We extract several key insights for the development of robust deep RL algorithms for traffic control and propose concrete designs to mitigate the impact of the considered exogenous uncertainties. |
Tasks | |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08353v2 |
https://arxiv.org/pdf/1904.08353v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-robust-deep-reinforcement-learning |
Repo | https://github.com/fmpr/CAREL |
Framework | tf |
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
Title | Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference |
Authors | R. Thomas McCoy, Ellie Pavlick, Tal Linzen |
Abstract | A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical overlap heuristic, the subsequence heuristic, and the constituent heuristic. To determine whether models have adopted these heuristics, we introduce a controlled evaluation set called HANS (Heuristic Analysis for NLI Systems), which contains many examples where the heuristics fail. We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics. We conclude that there is substantial room for improvement in NLI systems, and that the HANS dataset can motivate and measure progress in this area. |
Tasks | Natural Language Inference |
Published | 2019-02-04 |
URL | https://arxiv.org/abs/1902.01007v4 |
https://arxiv.org/pdf/1902.01007v4.pdf | |
PWC | https://paperswithcode.com/paper/right-for-the-wrong-reasons-diagnosing |
Repo | https://github.com/tommccoy1/hans |
Framework | none |
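The lexical overlap heuristic named in the abstract above can be written as a few lines of code: predict entailment whenever every word of the hypothesis also appears in the premise. The sketch below is an illustration under assumptions, not code from the HANS repository. The function names and the simple lowercase, punctuation-stripping tokenizer are hypothetical.

```python
import string


def tokens(sentence):
    """Lowercase, strip ASCII punctuation, split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return sentence.lower().translate(table).split()


def lexical_overlap_predicts_entailment(premise, hypothesis):
    """Return True when every hypothesis word occurs in the premise."""
    premise_words = set(tokens(premise))
    return all(word in premise_words for word in tokens(hypothesis))


# The heuristic fires here, although the premise does not entail
# the hypothesis (the roles of doctor and actor are swapped).
print(lexical_overlap_predicts_entailment(
    "The doctor was paid by the actor.",
    "The doctor paid the actor.",
))
```

A model that has effectively learned this shortcut would label such pairs as entailment, which is the failure mode a controlled evaluation set is designed to expose.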
Differentially Private Bayesian Linear Regression
Title | Differentially Private Bayesian Linear Regression |
Authors | Garrett Bernstein, Daniel Sheldon |
Abstract | Linear regression is an important tool across many fields that work with sensitive human-sourced data. Significant prior work has focused on producing differentially private point estimates, which provide a privacy guarantee to individuals while still allowing modelers to draw insights from data by estimating regression coefficients. We investigate the problem of Bayesian linear regression, with the goal of computing posterior distributions that correctly quantify uncertainty given privately released statistics. We show that a naive approach that ignores the noise injected by the privacy mechanism does a poor job in realistic data settings. We then develop noise-aware methods that perform inference over the privacy mechanism and produce correct posteriors across a wide range of scenarios. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13153v1 |
https://arxiv.org/pdf/1910.13153v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-bayesian-linear |
Repo | https://github.com/gbernstein6/private_bayesian_regression |
Framework | none |
Efficient Adaptation of Pretrained Transformers for Abstractive Summarization
Title | Efficient Adaptation of Pretrained Transformers for Abstractive Summarization |
Authors | Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi |
Abstract | Large-scale learning of transformer language models has yielded improvements on a variety of natural language understanding tasks. Whether they can be effectively adapted for summarization, however, has been less explored, as the learned representations are less seamlessly integrated into existing neural text production architectures. In this work, we propose two solutions for efficiently adapting pretrained transformer language models as text summarizers: source embeddings and domain-adaptive training. We test these solutions on three abstractive summarization datasets, achieving new state of the art performance on two of them. Finally, we show that these improvements are achieved by producing more focused summaries with fewer superfluous and that performance improvements are more pronounced on more abstractive datasets. |
Tasks | Abstractive Text Summarization |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00138v1 |
https://arxiv.org/pdf/1906.00138v1.pdf | |
PWC | https://paperswithcode.com/paper/190600138 |
Repo | https://github.com/Andrew03/transformer-abstractive-summarization |
Framework | pytorch |
Deep Generalized Method of Moments for Instrumental Variable Analysis
Title | Deep Generalized Method of Moments for Instrumental Variable Analysis |
Authors | Andrew Bennett, Nathan Kallus, Tobias Schnabel |
Abstract | Instrumental variable analysis is a powerful tool for estimating causal effects when randomization or full control of confounders is not possible. The application of standard methods such as 2SLS, GMM, and more recent variants is significantly impeded when the causal effects are complex, the instruments are high-dimensional, and/or the treatment is high-dimensional. In this paper, we propose the DeepGMM algorithm to overcome this. Our algorithm is based on a new variational reformulation of GMM with optimal inverse-covariance weighting that allows us to efficiently control very many moment conditions. We further develop practical techniques for optimization and model selection that make it particularly successful in practice. Our algorithm is also computationally tractable and can handle large-scale datasets. Numerical results show our algorithm matches the performance of the best tuned methods in standard settings and continues to work in high-dimensional settings where even recent methods break. |
Tasks | Model Selection |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12495v1 |
https://arxiv.org/pdf/1905.12495v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-generalized-method-of-moments-for |
Repo | https://github.com/CausalML/DeepGMM |
Framework | pytorch |
Aggregation via Separation: Boosting Facial Landmark Detector with Semi-Supervised Style Translation
Title | Aggregation via Separation: Boosting Facial Landmark Detector with Semi-Supervised Style Translation |
Authors | Shengju Qian, Keqiang Sun, Wayne Wu, Chen Qian, Jiaya Jia |
Abstract | Facial landmark detection, or face alignment, is a fundamental task that has been extensively studied. In this paper, we investigate a new perspective on facial landmark detection and demonstrate that it leads to further notable improvement. Given that any face image can be factored into a style space that captures lighting, texture and image environment, and a style-invariant structure space, our key idea is to leverage the disentangled style and shape space of each individual to augment existing structures via style translation. With these augmented synthetic samples, our semi-supervised model surprisingly outperforms the fully-supervised one by a large margin. Extensive experiments verify the effectiveness of our idea with state-of-the-art results on WFLW, 300W, COFW, and AFLW datasets. Our proposed structure is general and could be assembled into any face alignment framework. The code is made publicly available at https://github.com/thesouthfrog/stylealign. |
Tasks | Face Alignment, Facial Landmark Detection |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06440v1 |
https://arxiv.org/pdf/1908.06440v1.pdf | |
PWC | https://paperswithcode.com/paper/aggregation-via-separation-boosting-facial |
Repo | https://github.com/thesouthfrog/stylealign |
Framework | tf |
On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems
Title | On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems |
Authors | Baekjin Kim, Ambuj Tewari |
Abstract | We investigate the optimality of perturbation based algorithms in the stochastic and adversarial multi-armed bandit problems. For the stochastic case, we provide a unified regret analysis for both sub-Weibull and bounded perturbations when rewards are sub-Gaussian. Our bounds are instance optimal for sub-Weibull perturbations with parameter 2 that also have a matching lower tail bound, and all bounded support perturbations where there is sufficient probability mass at the extremes of the support. For the adversarial setting, we prove rigorous barriers against two natural solution approaches using tools from discrete choice theory and extreme value theory. Our results suggest that the optimal perturbation, if it exists, will be of Frechet-type. |
Tasks | |
Published | 2019-02-02 |
URL | https://arxiv.org/abs/1902.00610v4 |
https://arxiv.org/pdf/1902.00610v4.pdf | |
PWC | https://paperswithcode.com/paper/on-the-optimality-of-perturbations-in |
Repo | https://github.com/Kimbaekjin/Perturbation-Methods-StochasticMAB |
Framework | none |
Learning to compress and search visual data in large-scale systems
Title | Learning to compress and search visual data in large-scale systems |
Authors | Sohrab Ferdowsi |
Abstract | The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs are carefully optimized. These models are then extended to multi-layers, namely the RRQ and the ML-STC frameworks, where the latter is further evolved as a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. For the developed algorithms, three important applications are developed. First, the problem of large-scale similarity search in retrieval systems is addressed, where a double-stage solution is proposed leading to faster query times and shorter database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than the conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems. In particular, the problems of image denoising and compressive sensing are addressed with promising results. |
Tasks | Compressive Sensing, Denoising, Image Compression, Image Denoising |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08437v1 |
http://arxiv.org/pdf/1901.08437v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-compress-and-search-visual-data |
Repo | https://github.com/sssohrab/PhDthesis |
Framework | none |
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Title | vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations |
Authors | Alexei Baevski, Steffen Schneider, Michael Auli |
Abstract | We propose vq-wav2vec to learn discrete representations of audio segments through a wav2vec-style self-supervised context prediction task. The algorithm uses either a Gumbel-Softmax or online k-means clustering to quantize the dense representations. Discretization enables the direct application of algorithms from the NLP community which require discrete inputs. Experiments show that BERT pre-training achieves a new state of the art on TIMIT phoneme classification and WSJ speech recognition. |
Tasks | Speech Recognition |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05453v3 |
https://arxiv.org/pdf/1910.05453v3.pdf | |
PWC | https://paperswithcode.com/paper/vq-wav2vec-self-supervised-learning-of-1 |
Repo | https://github.com/pytorch/fairseq |
Framework | pytorch |
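The k-means style quantization mentioned in the vq-wav2vec abstract above rests on one basic step: replacing each dense vector with the index of its nearest codebook entry. The sketch below shows only that assignment step, under assumptions. It is not the fairseq implementation, the names `nearest_codeword` and `quantize` are hypothetical, and it omits the online codebook updates and the gradient handling a trainable quantizer needs.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codebook entry closest to vector (squared L2)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(len(codebook)),
        key=lambda k: squared_distance(vector, codebook[k]),
    )


def quantize(frames, codebook):
    """Map a sequence of dense frames to a sequence of discrete indices."""
    return [nearest_codeword(frame, codebook) for frame in frames]


codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]
print(quantize(frames, codebook))  # [0, 1, 2]
```

The resulting index sequence is the kind of discrete input that token-based NLP models such as BERT expect, which is the motivation the abstract gives for discretization.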
Bayesian Volumetric Autoregressive generative models for better semisupervised learning
Title | Bayesian Volumetric Autoregressive generative models for better semisupervised learning |
Authors | Guilherme Pombo, Robert Gray, Tom Varsavsky, John Ashburner, Parashkev Nachev |
Abstract | Deep generative models are rapidly gaining traction in medical imaging. Nonetheless, most generative architectures struggle to capture the underlying probability distributions of volumetric data, exhibit convergence problems, and offer no robust indices of model uncertainty. By comparison, the autoregressive generative model PixelCNN can be extended to volumetric data with relative ease; it readily attempts to learn the true underlying probability distribution, and it still admits a Bayesian reformulation that provides a principled framework for reasoning about model uncertainty. Our contributions in this paper are twofold. First, we extend PixelCNN to work with volumetric brain magnetic resonance imaging data. Second, we show that reformulating this model to approximate a deep Gaussian process yields a measure of uncertainty that improves the performance of semi-supervised learning, in particular classification performance in settings where the proportion of labelled data is low. We quantify this improvement across classification, regression, and semantic segmentation tasks, training and testing on clinical magnetic resonance brain imaging data comprising T1-weighted and diffusion-weighted sequences. |
Tasks | Semantic Segmentation |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11559v1 |
https://arxiv.org/pdf/1907.11559v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-volumetric-autoregressive-generative |
Repo | https://github.com/guilherme-pombo/3DPixelCNN |
Framework | tf |
RTHN: A RNN-Transformer Hierarchical Network for Emotion Cause Extraction
Title | RTHN: A RNN-Transformer Hierarchical Network for Emotion Cause Extraction |
Authors | Rui Xia, Mengran Zhang, Zixiang Ding |
Abstract | The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most previous work treated ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We furthermore propose ways to encode the relative position and global prediction information into Transformer that can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01236v1 |
https://arxiv.org/pdf/1906.01236v1.pdf | |
PWC | https://paperswithcode.com/paper/rthn-a-rnn-transformer-hierarchical-network |
Repo | https://github.com/NUSTM/RTHN |
Framework | tf |