Paper Group ANR 909
Siamese Generative Adversarial Privatizer for Biometric Data. A Deeper Look at Power Normalizations. Cell-aware Stacked LSTMs for Modeling Sentences. Deep CNN based feature extractor for text-prompted speaker recognition. Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild. Contemplating Visual Emotions: Understanding and Overcomi …
Siamese Generative Adversarial Privatizer for Biometric Data
Title | Siamese Generative Adversarial Privatizer for Biometric Data |
Authors | Witold Oleszkiewicz, Peter Kairouz, Karol Piczak, Ram Rajagopal, Tomasz Trzcinski |
Abstract | State-of-the-art machine learning algorithms can be fooled by carefully crafted adversarial examples. As such, adversarial examples present a concrete problem in AI safety. In this work we turn the tables and ask the following question: can we harness the power of adversarial examples to prevent malicious adversaries from learning identifying information from data while allowing non-malicious entities to benefit from the utility of the same data? For instance, can we use adversarial examples to anonymize a biometric dataset of faces while retaining the usefulness of this data for other purposes, such as emotion recognition? To address this question, we propose a simple yet effective method, called Siamese Generative Adversarial Privatizer (SGAP), that exploits the properties of a Siamese neural network to find discriminative features that convey identifying information. When coupled with a generative model, our approach is able to correctly locate and disguise identifying information, while minimally reducing the utility of the privatized dataset. Extensive evaluation on a biometric dataset of fingerprints and cartoon faces confirms the usefulness of our simple yet effective method. |
Tasks | Emotion Recognition |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08757v3 |
http://arxiv.org/pdf/1804.08757v3.pdf | |
PWC | https://paperswithcode.com/paper/siamese-generative-adversarial-privatizer-for |
Repo | |
Framework | |
A Deeper Look at Power Normalizations
Title | A Deeper Look at Power Normalizations |
Authors | Piotr Koniusz, Hongguang Zhang, Fatih Porikli |
Abstract | Power Normalizations (PN) are very useful non-linear operators in the context of Bag-of-Words data representations as they tackle problems such as feature imbalance. In this paper, we reconsider these operators in the deep learning setup by introducing a novel layer that implements PN for non-linear pooling of feature maps. Specifically, by using a kernel formulation, our layer combines the feature vectors and their respective spatial locations in the feature maps produced by the last convolutional layer of CNN. Linearization of such a kernel results in a positive definite matrix capturing the second-order statistics of the feature vectors, to which PN operators are applied. We study two types of PN functions, namely (i) MaxExp and (ii) Gamma, addressing their role and meaning in the context of nonlinear pooling. We also provide a probabilistic interpretation of these operators and derive their surrogates with well-behaved gradients for end-to-end CNN learning. We apply our theory to practice by implementing the PN layer on a ResNet-50 model and showcase experiments on four benchmarks for fine-grained recognition, scene recognition, and material classification. Our results demonstrate state-of-the-art performance across all these tasks. |
Tasks | Material Classification, Scene Recognition |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09183v1 |
http://arxiv.org/pdf/1806.09183v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deeper-look-at-power-normalizations |
Repo | |
Framework | |
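The abstract above names two Power Normalization functions, Gamma and MaxExp, without giving formulas. The sketch below uses the forms commonly associated with these names (Gamma as `p ** gamma`, MaxExp as `1 - (1 - p) ** eta`) and applies them to scalar values in [0, 1]. Both the formulas and the parameter defaults are assumptions for illustration, not the paper's layer, which operates on second-order statistics of CNN feature maps.

```python
def gamma_pn(p, gamma=0.5):
    """Gamma power normalization (assumed form): p ** gamma, with 0 < gamma <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p ** gamma


def maxexp_pn(p, eta=10):
    """MaxExp power normalization (assumed form): 1 - (1 - p) ** eta, with eta >= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 1.0 - (1.0 - p) ** eta


# Hypothetical, imbalanced feature values: a few small entries and one large one.
features = [0.0, 0.01, 0.1, 0.5, 1.0]
gamma_out = [gamma_pn(p) for p in features]
maxexp_out = [maxexp_pn(p) for p in features]

for p, g, m in zip(features, gamma_out, maxexp_out):
    print(f"p={p:<5} gamma={g:.3f} maxexp={m:.3f}")
```

Both functions leave 0 and 1 fixed and lift small values toward larger ones, which is the sense in which they counter feature imbalance.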
Cell-aware Stacked LSTMs for Modeling Sentences
Title | Cell-aware Stacked LSTMs for Modeling Sentences |
Authors | Jihun Choi, Taeuk Kim, Sang-goo Lee |
Abstract | We propose a method of stacking multiple long short-term memory (LSTM) layers for modeling sentences. In contrast to the conventional stacked LSTMs where only hidden states are fed as input to the next layer, the suggested architecture accepts both hidden and memory cell states of the preceding layer and fuses information from the left and the lower context using the soft gating mechanism of LSTMs. Thus the architecture modulates the amount of information to be delivered not only in horizontal recurrence but also in vertical connections, from which useful features extracted from lower layers are effectively conveyed to upper layers. We dub this architecture Cell-aware Stacked LSTM (CAS-LSTM) and show from experiments that our models bring significant performance gain over the standard LSTMs on benchmark datasets for natural language inference, paraphrase detection, sentiment classification, and machine translation. We also conduct extensive qualitative analysis to understand the internal behavior of the suggested approach. |
Tasks | Machine Translation, Natural Language Inference, Paraphrase Identification, Sentiment Analysis |
Published | 2018-09-07 |
URL | https://arxiv.org/abs/1809.02279v2 |
https://arxiv.org/pdf/1809.02279v2.pdf | |
PWC | https://paperswithcode.com/paper/cell-aware-stacked-lstms-for-modeling |
Repo | |
Framework | |
Deep CNN based feature extractor for text-prompted speaker recognition
Title | Deep CNN based feature extractor for text-prompted speaker recognition |
Authors | Sergey Novoselov, Oleg Kudashev, Vadim Schemelinin, Ivan Kremnev, Galina Lavrentyeva |
Abstract | Deep learning is still not a very common tool in the speaker verification field. We study deep convolutional neural network performance in the text-prompted speaker verification task. The prompted passphrase is segmented into word states (i.e., digits) to test each digit utterance separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. The key feature of our network is the Max-Feature-Map activation function, which acts as an embedded feature selector. By using a multitask learning scheme to train the high-level feature extractor we were able to surpass the classic baseline systems in terms of quality and achieved impressive results for such a novice approach, getting 2.85% EER on the RSR2015 evaluation set. Fusion of the proposed and the baseline systems improves this result. |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.05307v1 |
http://arxiv.org/pdf/1803.05307v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-cnn-based-feature-extractor-for-text |
Repo | |
Framework | |
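The abstract above mentions two building blocks that are easy to show in isolation: the Max-Feature-Map (MFM) activation and cosine-similarity scoring. The sketch below is a minimal illustration, assuming the usual MFM definition (split the channels into two halves and keep the element-wise maximum) and plain Python lists in place of real network activations. The function names and sample values are hypothetical.

```python
import math


def max_feature_map(channels):
    """MFM (assumed definition): split into two halves, keep the element-wise max."""
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    return [max(a, b) for a, b in zip(channels[:half], channels[half:])]


def cosine_score(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical activations for an enrolment and a test utterance of one digit.
enrol = max_feature_map([0.2, -1.0, 0.7, 0.4, 0.9, -0.3, 0.1, 0.8])
test = max_feature_map([0.1, -0.8, 0.6, 0.5, 1.0, -0.2, 0.2, 0.7])
print(cosine_score(enrol, test))
```

MFM halves the channel count, so each output keeps only the stronger of two competing responses, which is why the abstract describes it as an embedded feature selector.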
Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild
Title | Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild |
Authors | Zhen-Hua Feng, Patrik Huber, Josef Kittler, Peter JB Hancock, Xiao-Jun Wu, Qijun Zhao, Paul Koppen, Matthias Rätsch |
Abstract | This paper investigates the evaluation of dense 3D face reconstruction from a single 2D image in the wild. To this end, we organise a competition that provides a new benchmark dataset that contains 2000 2D facial images of 135 subjects as well as their 3D ground truth face scans. In contrast to previous competitions or challenges, the aim of this new benchmark dataset is to evaluate the accuracy of a 3D dense face reconstruction algorithm using real, accurate and high-resolution 3D ground truth face scans. In addition to the dataset, we provide a standard protocol as well as a Python script for the evaluation. Last, we report the results obtained by three state-of-the-art 3D face reconstruction systems on the new benchmark dataset. The competition is organised along with the 2018 13th IEEE Conference on Automatic Face & Gesture Recognition. |
Tasks | 3D Face Reconstruction, 3D Reconstruction, Face Reconstruction, Gesture Recognition |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05536v2 |
http://arxiv.org/pdf/1803.05536v2.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-dense-3d-reconstruction-from-2d |
Repo | |
Framework | |
Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias
Title | Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias |
Authors | Rameswar Panda, Jianming Zhang, Haoxiang Li, Joon-Young Lee, Xin Lu, Amit K. Roy-Chowdhury |
Abstract | While machine learning approaches to visual emotion recognition offer great promise, current methods consider training and testing models on small scale datasets covering limited visual emotion concepts. Our analysis identifies an important but long overlooked issue of existing visual emotion benchmarks in the form of dataset biases. We design a series of tests to show and measure how such dataset biases obstruct learning a generalizable emotion recognition model. Based on our analysis, we propose a webly supervised approach by leveraging a large quantity of stock image data. Our approach uses a simple yet effective curriculum guided training strategy for learning discriminative emotion features. We discover that the models learned using our large scale stock image dataset exhibit significantly better generalization ability than the existing datasets without the manual collection of even a single label. Moreover, visual representation learned using our approach holds a lot of promise across a variety of tasks on different image and video datasets. |
Tasks | Emotion Recognition |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02212v1 |
http://arxiv.org/pdf/1808.02212v1.pdf | |
PWC | https://paperswithcode.com/paper/contemplating-visual-emotions-understanding |
Repo | |
Framework | |
Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering
Title | Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering |
Authors | Wei Wang, Ming Yan, Chen Wu |
Abstract | This paper describes a novel hierarchical attention network for reading comprehension style question answering, which aims to answer questions for a given narrative paragraph. In the proposed method, attention and fusion are conducted horizontally and vertically across layers at different levels of granularity between question and paragraph. Specifically, it first encodes the question and paragraph with fine-grained language embeddings, to better capture the respective representations at the semantic level. Then it proposes a multi-granularity fusion approach to fully fuse information from both global and attended representations. Finally, it introduces a hierarchical attention network to focus on the answer span progressively with multi-level soft alignment. Extensive experiments on the large-scale SQuAD and TriviaQA datasets validate the effectiveness of the proposed method. At the time of writing the paper (Jan. 12th 2018), our model achieves the first position on the SQuAD leaderboard for both single and ensemble models. We also achieve state-of-the-art results on TriviaQA, AddSent and AddOne-Sent datasets. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.11934v1 |
http://arxiv.org/pdf/1811.11934v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-granularity-hierarchical-attention |
Repo | |
Framework | |
On the Robustness of Interpretability Methods
Title | On the Robustness of Interpretability Methods |
Authors | David Alvarez-Melis, Tommi S. Jaakkola |
Abstract | We argue that robustness of explanations—i.e., that similar inputs should give rise to similar explanations—is a key desideratum for interpretability. We introduce metrics to quantify robustness and demonstrate that current methods do not perform well according to these metrics. Finally, we propose ways that robustness can be enforced on existing interpretability approaches. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08049v1 |
http://arxiv.org/pdf/1806.08049v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-robustness-of-interpretability-methods |
Repo | |
Framework | |
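The robustness criterion stated in the abstract above (similar inputs should give rise to similar explanations) can be illustrated with a ratio of explanation change to input change over a set of nearby points. The sketch below is one possible reading, in the spirit of a local Lipschitz estimate. The function names, the choice of Euclidean distance, and the toy explainer are assumptions, not the metric as defined in the paper.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def local_robustness_estimate(explain, x, neighbours):
    """Largest ratio of explanation change to input change over the neighbours.

    Smaller values mean explanations vary less under small input perturbations.
    """
    base = explain(x)
    ratios = []
    for x_near in neighbours:
        dist = l2_distance(x, x_near)
        if dist > 0.0:  # skip exact duplicates of x
            ratios.append(l2_distance(base, explain(x_near)) / dist)
    if not ratios:
        raise ValueError("need at least one neighbour distinct from x")
    return max(ratios)


# Toy explainer (hypothetical): attribution is simply twice the input.
def toy_explainer(x):
    return [2.0 * value for value in x]


estimate = local_robustness_estimate(
    toy_explainer, [0.0, 0.0], [[1.0, 0.0], [0.0, 0.5]]
)
print(estimate)
```

For a linear explainer like the toy one the ratio is constant, so the estimate equals its scale factor. For a real attribution method the neighbours would be sampled perturbations of the input.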
Learning Beyond Human Expertise with Generative Models for Dental Restorations
Title | Learning Beyond Human Expertise with Generative Models for Dental Restorations |
Authors | Jyh-Jing Hwang, Sergei Azernikov, Alexei A. Efros, Stella X. Yu |
Abstract | Computer vision has advanced so significantly that many discriminative approaches such as object recognition are now widely used in real applications. We present another exciting development that utilizes generative models for the mass customization of medical products such as dental crowns. In the dental industry, it takes a technician years of training to design synthetic crowns that restore the function and integrity of missing teeth. Each crown must be customized to individual patients, and it requires human expertise in a time-consuming and labor-intensive process, even with computer-assisted design software. We develop a fully automatic approach that learns not only from human designs of dental crowns, but also from natural spatial profiles between opposing teeth. The latter is hard to account for by technicians but important for proper biting and chewing functions. Built upon a Generative Adversarial Network (GAN) architecture, our deep learning model predicts the customized crown-filled depth scan from the crown-missing depth scan and opposing depth scan. We propose to incorporate additional space constraints and statistical compatibility into learning. Our automatic designs exceed human technicians’ standards for good morphology and functionality, and our algorithm is being tested for production use. |
Tasks | Object Recognition |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1804.00064v1 |
http://arxiv.org/pdf/1804.00064v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-beyond-human-expertise-with |
Repo | |
Framework | |
Dyna: A Method of Momentum for Stochastic Optimization
Title | Dyna: A Method of Momentum for Stochastic Optimization |
Authors | Zhidong Han |
Abstract | An algorithm is presented for momentum gradient descent optimization based on the first-order differential equation of the Newtonian dynamics. The fictitious mass is introduced to the dynamics of momentum for regularizing the adaptive stepsize of each individual parameter. The dynamic relaxation is adapted for stochastic optimization of nonlinear objective functions through an explicit time integration with varying damping ratio. The adaptive stepsize is optimized for each individual neural network layer based on the number of inputs. The adaptive stepsize for every parameter over the entire neural network is uniformly optimized with one upper bound, independent of sparsity, for better overall convergence rate. The numerical implementation of the algorithm is similar to the Adam Optimizer, possessing computational efficiency, similar memory requirements, etc. There are three hyper-parameters in the algorithm with clear physical interpretation. Preliminary trials show promise in performance and convergence. |
Tasks | Stochastic Optimization |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04933v1 |
http://arxiv.org/pdf/1805.04933v1.pdf | |
PWC | https://paperswithcode.com/paper/dyna-a-method-of-momentum-for-stochastic |
Repo | |
Framework | |
Robust Classification of Financial Risk
Title | Robust Classification of Financial Risk |
Authors | Suproteem K. Sarkar, Kojin Oshiba, Daniel Giebisch, Yaron Singer |
Abstract | Algorithms are increasingly common components of high-impact decision-making, and a growing body of literature on adversarial examples in laboratory settings indicates that standard machine learning models are not robust. This suggests that real-world systems are also susceptible to manipulation or misclassification, which especially poses a challenge to machine learning models used in financial services. We use the loan grade classification problem to explore how machine learning models are sensitive to small changes in user-reported data, using adversarial attacks documented in the literature and an original, domain-specific attack. Our work shows that a robust optimization algorithm can build models for financial services that are resistant to misclassification on perturbations. To the best of our knowledge, this is the first study of adversarial attacks and defenses for deep learning in financial services. |
Tasks | Decision Making |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11079v1 |
http://arxiv.org/pdf/1811.11079v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-classification-of-financial-risk |
Repo | |
Framework | |
Event Coreference Resolution Using Neural Network Classifiers
Title | Event Coreference Resolution Using Neural Network Classifiers |
Authors | Arun Pandian, Lamana Mulaffer, Kemal Oflazer, Amna AlZeyara |
Abstract | This paper presents a neural network classifier approach to detecting both within- and cross-document event coreference effectively using only event mention based features. Our approach does not (yet) rely on any event argument features such as semantic roles or spatiotemporal arguments. Experimental results on the ECB+ dataset show that our approach produces F1 scores that significantly outperform the state-of-the-art methods for both within-document and cross-document event coreference resolution when we use the B3 and CEAFe evaluation measures, but yields a worse F1 score with the MUC measure. However, when we use the CoNLL measure, which is the average of these three scores, our approach has slightly better F1 for within-document event coreference resolution but is significantly better for cross-document event coreference resolution. |
Tasks | Coreference Resolution |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04216v1 |
http://arxiv.org/pdf/1810.04216v1.pdf | |
PWC | https://paperswithcode.com/paper/event-coreference-resolution-using-neural |
Repo | |
Framework | |
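The abstract above describes the CoNLL measure as the average of the MUC, B3, and CEAFe F1 scores. The short sketch below shows that arithmetic and how a system can trail on MUC yet lead on the averaged score. All scores are made-up placeholders for illustration, not results from the paper.

```python
def conll_f1(muc_f1, b3_f1, ceafe_f1):
    """CoNLL measure: unweighted mean of the MUC, B3, and CEAFe F1 scores."""
    return (muc_f1 + b3_f1 + ceafe_f1) / 3.0


# Hypothetical F1 scores (percent) for a baseline and a competing system.
baseline = {"muc_f1": 70.0, "b3_f1": 60.0, "ceafe_f1": 56.0}
candidate = {"muc_f1": 66.0, "b3_f1": 66.0, "ceafe_f1": 63.0}

baseline_conll = conll_f1(**baseline)
candidate_conll = conll_f1(**candidate)

# The candidate is worse on MUC but better on B3, CEAFe, and the average.
print(baseline_conll, candidate_conll)
```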
Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model
Title | Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model |
Authors | Niko Brummer, Anna Silnova, Lukas Burget, Themos Stafylakis |
Abstract | Embeddings in machine learning are low-dimensional representations of complex input patterns, with the property that simple geometric operations like Euclidean distances and dot products can be used for classification and comparison tasks. The proposed meta-embeddings are special embeddings that live in more general inner product spaces. They are designed to propagate uncertainty to the final output in speaker recognition and similar applications. The familiar Gaussian PLDA model (GPLDA) can be re-formulated as an extractor for Gaussian meta-embeddings (GMEs), such that likelihood ratio scores are given by Hilbert space inner products between Gaussian likelihood functions. GMEs extracted by the GPLDA model have fixed precisions and do not propagate uncertainty. We show that a generalization to heavy-tailed PLDA gives GMEs with variable precisions, which do propagate uncertainty. Experiments on NIST SRE 2010 and 2016 show that the proposed method applied to i-vectors without length normalization is up to 20% more accurate than GPLDA applied to length-normalized i-vectors. |
Tasks | Speaker Recognition |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09777v1 |
http://arxiv.org/pdf/1802.09777v1.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-meta-embeddings-for-efficient |
Repo | |
Framework | |
Predicting Infant Motor Development Status using Day Long Movement Data from Wearable Sensors
Title | Predicting Infant Motor Development Status using Day Long Movement Data from Wearable Sensors |
Authors | David Goodfellow, Ruoyu Zhi, Rebecca Funke, Jose Carlos Pulido, Maja Mataric, Beth A. Smith |
Abstract | Infants with a variety of complications at or before birth are classified as being at risk for developmental delays (AR). As they grow older, they are followed by healthcare providers in an effort to discern whether they are on a typical or impaired developmental trajectory. Often, it is difficult to make an accurate determination early in infancy as infants with typical development (TD) display high variability in their developmental trajectories both in content and timing. Studies have shown that spontaneous movements have the potential to differentiate typical and atypical trajectories early in life using sensors and kinematic analysis systems. In this study, machine learning classification algorithms are used to take inertial movement from wearable sensors placed on an infant for a day and predict if the infant is AR or TD, thus further establishing the connection between early spontaneous movement and developmental trajectory. |
Tasks | |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02617v2 |
http://arxiv.org/pdf/1807.02617v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-infant-motor-development-status |
Repo | |
Framework | |
Analyzing Uncertainty in Neural Machine Translation
Title | Analyzing Uncertainty in Neural Machine Translation |
Authors | Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato |
Abstract | Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some shortcomings of current models. As part of this study, we release multiple human reference translations for two popular benchmarks. |
Tasks | Calibration, Machine Translation |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00047v4 |
http://arxiv.org/pdf/1803.00047v4.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-uncertainty-in-neural-machine |
Repo | |
Framework | |