October 17, 2019

Paper Group ANR 909

Siamese Generative Adversarial Privatizer for Biometric Data


Title	Siamese Generative Adversarial Privatizer for Biometric Data
Authors	Witold Oleszkiewicz, Peter Kairouz, Karol Piczak, Ram Rajagopal, Tomasz Trzcinski
Abstract	State-of-the-art machine learning algorithms can be fooled by carefully crafted adversarial examples. As such, adversarial examples present a concrete problem in AI safety. In this work we turn the tables and ask the following question: can we harness the power of adversarial examples to prevent malicious adversaries from learning identifying information from data while allowing non-malicious entities to benefit from the utility of the same data? For instance, can we use adversarial examples to anonymize a biometric dataset of faces while retaining the usefulness of this data for other purposes, such as emotion recognition? To address this question, we propose a simple yet effective method, called Siamese Generative Adversarial Privatizer (SGAP), that exploits the properties of a Siamese neural network to find discriminative features that convey identifying information. When coupled with a generative model, our approach is able to correctly locate and disguise identifying information, while minimally reducing the utility of the privatized dataset. Extensive evaluation on a biometric dataset of fingerprints and cartoon faces confirms the usefulness of our simple yet effective method.
Tasks	Emotion Recognition
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08757v3
PDF	http://arxiv.org/pdf/1804.08757v3.pdf
PWC	https://paperswithcode.com/paper/siamese-generative-adversarial-privatizer-for
Repo
Framework

A Deeper Look at Power Normalizations


Title	A Deeper Look at Power Normalizations
Authors	Piotr Koniusz, Hongguang Zhang, Fatih Porikli
Abstract	Power Normalizations (PN) are very useful non-linear operators in the context of Bag-of-Words data representations as they tackle problems such as feature imbalance. In this paper, we reconsider these operators in the deep learning setup by introducing a novel layer that implements PN for non-linear pooling of feature maps. Specifically, by using a kernel formulation, our layer combines the feature vectors and their respective spatial locations in the feature maps produced by the last convolutional layer of CNN. Linearization of such a kernel results in a positive definite matrix capturing the second-order statistics of the feature vectors, to which PN operators are applied. We study two types of PN functions, namely (i) MaxExp and (ii) Gamma, addressing their role and meaning in the context of nonlinear pooling. We also provide a probabilistic interpretation of these operators and derive their surrogates with well-behaved gradients for end-to-end CNN learning. We apply our theory to practice by implementing the PN layer on a ResNet-50 model and showcase experiments on four benchmarks for fine-grained recognition, scene recognition, and material classification. Our results demonstrate state-of-the-art performance across all these tasks.
Tasks	Material Classification, Scene Recognition
Published	2018-06-24
URL	http://arxiv.org/abs/1806.09183v1
PDF	http://arxiv.org/pdf/1806.09183v1.pdf
PWC	https://paperswithcode.com/paper/a-deeper-look-at-power-normalizations
Repo
Framework
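
As a rough illustration of the two PN families named in the abstract above, the sketch below applies element-wise Gamma and MaxExp functions to a second-order matrix pooled from feature vectors. The functional forms used here (a signed power for Gamma, 1 - (1 - x)^eta for MaxExp), the trace rescaling, the parameter values, and all function names are assumptions made for illustration. The paper's actual layer, its use of spatial locations, and its gradient surrogates are not reproduced.

```python
import numpy as np


def second_order_pool(features):
    """Average outer product of N feature vectors, shape (N, d) -> (d, d)."""
    features = np.asarray(features, dtype=float)
    return features.T @ features / features.shape[0]


def gamma_pn(m, gamma=0.5):
    """Element-wise Gamma (power) normalization; sign-preserving (assumed form)."""
    return np.sign(m) * np.abs(m) ** gamma


def maxexp_pn(m, eta=10.0):
    """Element-wise MaxExp normalization; assumes entries lie in [0, 1]."""
    return 1.0 - (1.0 - m) ** eta


# Hypothetical non-negative features, e.g. post-ReLU activations of a 7x7 map.
rng = np.random.default_rng(0)
feats = np.abs(rng.standard_normal((49, 8)))

m = second_order_pool(feats)
m = m / np.trace(m)  # assumed rescaling so that every entry falls in [0, 1]
pooled_gamma = gamma_pn(m)
pooled_maxexp = maxexp_pn(m)
```

With gamma below 1 and eta above 1, both functions lift small entries relative to large ones, which is one way to read the abstract's remark about tackling feature imbalance.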

Cell-aware Stacked LSTMs for Modeling Sentences


Title	Cell-aware Stacked LSTMs for Modeling Sentences
Authors	Jihun Choi, Taeuk Kim, Sang-goo Lee
Abstract	We propose a method of stacking multiple long short-term memory (LSTM) layers for modeling sentences. In contrast to the conventional stacked LSTMs where only hidden states are fed as input to the next layer, the suggested architecture accepts both hidden and memory cell states of the preceding layer and fuses information from the left and the lower context using the soft gating mechanism of LSTMs. Thus the architecture modulates the amount of information to be delivered not only in horizontal recurrence but also in vertical connections, from which useful features extracted from lower layers are effectively conveyed to upper layers. We dub this architecture Cell-aware Stacked LSTM (CAS-LSTM) and show from experiments that our models bring significant performance gain over the standard LSTMs on benchmark datasets for natural language inference, paraphrase detection, sentiment classification, and machine translation. We also conduct extensive qualitative analysis to understand the internal behavior of the suggested approach.
Tasks	Machine Translation, Natural Language Inference, Paraphrase Identification, Sentiment Analysis
Published	2018-09-07
URL	https://arxiv.org/abs/1809.02279v2
PDF	https://arxiv.org/pdf/1809.02279v2.pdf
PWC	https://paperswithcode.com/paper/cell-aware-stacked-lstms-for-modeling
Repo
Framework

Deep CNN based feature extractor for text-prompted speaker recognition


Title	Deep CNN based feature extractor for text-prompted speaker recognition
Authors	Sergey Novoselov, Oleg Kudashev, Vadim Schemelinin, Ivan Kremnev, Galina Lavrentyeva
Abstract	Deep learning is still not a very common tool in the speaker verification field. We study deep convolutional neural network performance in the text-prompted speaker verification task. The prompted passphrase is segmented into word states (i.e., digits) to test each digit utterance separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. The key feature of our network is the Max-Feature-Map activation function, which acts as an embedded feature selector. By using a multitask learning scheme to train the high-level feature extractor we were able to surpass the classic baseline systems in terms of quality and achieved impressive results for such a novice approach, getting 2.85% EER on the RSR2015 evaluation set. Fusion of the proposed and the baseline systems improves this result.
Tasks	Speaker Recognition, Speaker Verification
Published	2018-03-13
URL	http://arxiv.org/abs/1803.05307v1
PDF	http://arxiv.org/pdf/1803.05307v1.pdf
PWC	https://paperswithcode.com/paper/deep-cnn-based-feature-extractor-for-text
Repo
Framework
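
The two building blocks mentioned in the abstract above can be sketched in a few lines. The Max-Feature-Map version shown is the common formulation (split the channels into two halves and keep the element-wise maximum), which is assumed here rather than taken from the paper; the function names and the toy embeddings are hypothetical.

```python
import numpy as np


def max_feature_map(x):
    """Max-Feature-Map over the last (channel) axis: (..., 2k) -> (..., k)."""
    x = np.asarray(x, dtype=float)
    if x.shape[-1] % 2 != 0:
        raise ValueError("channel dimension must be even")
    first, second = np.split(x, 2, axis=-1)
    # Keep the stronger of each channel pair, halving the channel count.
    return np.maximum(first, second)


def cosine_score(e1, e2, eps=1e-12):
    """Cosine similarity between two embedding vectors."""
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    return float(np.dot(e1, e2) / (np.linalg.norm(e1) * np.linalg.norm(e2) + eps))


# Toy usage: hypothetical 4-channel activations reduced to 2 channels,
# then compared against a hypothetical enrollment embedding.
test_embedding = max_feature_map([1.0, -2.0, 0.0, 5.0])
enroll_embedding = np.array([1.0, 5.0])
score = cosine_score(test_embedding, enroll_embedding)
```

Because each output channel is the winner of a pair, the activation discards the weaker response, which is consistent with the abstract's description of it as an embedded feature selector.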

Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild


Title	Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild
Authors	Zhen-Hua Feng, Patrik Huber, Josef Kittler, Peter JB Hancock, Xiao-Jun Wu, Qijun Zhao, Paul Koppen, Matthias Rätsch
Abstract	This paper investigates the evaluation of dense 3D face reconstruction from a single 2D image in the wild. To this end, we organise a competition that provides a new benchmark dataset that contains 2000 2D facial images of 135 subjects as well as their 3D ground truth face scans. In contrast to previous competitions or challenges, the aim of this new benchmark dataset is to evaluate the accuracy of a 3D dense face reconstruction algorithm using real, accurate and high-resolution 3D ground truth face scans. In addition to the dataset, we provide a standard protocol as well as a Python script for the evaluation. Last, we report the results obtained by three state-of-the-art 3D face reconstruction systems on the new benchmark dataset. The competition is organised along with the 2018 13th IEEE Conference on Automatic Face & Gesture Recognition.
Tasks	3D Face Reconstruction, 3D Reconstruction, Face Reconstruction, Gesture Recognition
Published	2018-03-14
URL	http://arxiv.org/abs/1803.05536v2
PDF	http://arxiv.org/pdf/1803.05536v2.pdf
PWC	https://paperswithcode.com/paper/evaluation-of-dense-3d-reconstruction-from-2d
Repo
Framework

Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias


Title	Contemplating Visual Emotions: Understanding and Overcoming Dataset Bias
Authors	Rameswar Panda, Jianming Zhang, Haoxiang Li, Joon-Young Lee, Xin Lu, Amit K. Roy-Chowdhury
Abstract	While machine learning approaches to visual emotion recognition offer great promise, current methods consider training and testing models on small scale datasets covering limited visual emotion concepts. Our analysis identifies an important but long overlooked issue of existing visual emotion benchmarks in the form of dataset biases. We design a series of tests to show and measure how such dataset biases obstruct learning a generalizable emotion recognition model. Based on our analysis, we propose a webly supervised approach by leveraging a large quantity of stock image data. Our approach uses a simple yet effective curriculum guided training strategy for learning discriminative emotion features. We discover that the models learned using our large scale stock image dataset exhibit significantly better generalization ability than the existing datasets without the manual collection of even a single label. Moreover, visual representation learned using our approach holds a lot of promise across a variety of tasks on different image and video datasets.
Tasks	Emotion Recognition
Published	2018-08-07
URL	http://arxiv.org/abs/1808.02212v1
PDF	http://arxiv.org/pdf/1808.02212v1.pdf
PWC	https://paperswithcode.com/paper/contemplating-visual-emotions-understanding
Repo
Framework

Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering


Title	Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering
Authors	Wei Wang, Ming Yan, Chen Wu
Abstract	This paper describes a novel hierarchical attention network for reading comprehension style question answering, which aims to answer questions for a given narrative paragraph. In the proposed method, attention and fusion are conducted horizontally and vertically across layers at different levels of granularity between question and paragraph. Specifically, it first encodes the question and paragraph with fine-grained language embeddings, to better capture the respective representations at the semantic level. Then it proposes a multi-granularity fusion approach to fully fuse information from both global and attended representations. Finally, it introduces a hierarchical attention network to focus on the answer span progressively with multi-level soft alignment. Extensive experiments on the large-scale SQuAD and TriviaQA datasets validate the effectiveness of the proposed method. At the time of writing the paper (Jan. 12th 2018), our model achieves the first position on the SQuAD leaderboard for both single and ensemble models. We also achieve state-of-the-art results on TriviaQA, AddSent and AddOne-Sent datasets.
Tasks	Question Answering, Reading Comprehension
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11934v1
PDF	http://arxiv.org/pdf/1811.11934v1.pdf
PWC	https://paperswithcode.com/paper/multi-granularity-hierarchical-attention
Repo
Framework

On the Robustness of Interpretability Methods


Title	On the Robustness of Interpretability Methods
Authors	David Alvarez-Melis, Tommi S. Jaakkola
Abstract	We argue that robustness of explanations—i.e., that similar inputs should give rise to similar explanations—is a key desideratum for interpretability. We introduce metrics to quantify robustness and demonstrate that current methods do not perform well according to these metrics. Finally, we propose ways that robustness can be enforced on existing interpretability approaches.
Tasks
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08049v1
PDF	http://arxiv.org/pdf/1806.08049v1.pdf
PWC	https://paperswithcode.com/paper/on-the-robustness-of-interpretability-methods
Repo
Framework
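
One way to turn "similar inputs should give rise to similar explanations" into a number is a sampled local Lipschitz-style estimate: the largest observed ratio between the change in the explanation and the change in the input within a small neighbourhood. The sketch below is a minimal version under that assumption; the abstract does not spell out the metrics, and the function name, the sampling scheme, and the toy explainer are all hypothetical.

```python
import numpy as np


def local_lipschitz_estimate(explain, x, radius=0.1, n_samples=200, seed=0):
    """Largest sampled ratio ||explain(x') - explain(x)|| / ||x' - x||."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    base = explain(x)
    worst = 0.0
    for _ in range(n_samples):
        # Perturbation drawn uniformly from a box of half-width `radius`.
        delta = rng.uniform(-radius, radius, size=x.shape)
        dist = np.linalg.norm(delta)
        if dist == 0.0:
            continue
        ratio = np.linalg.norm(explain(x + delta) - base) / dist
        worst = max(worst, ratio)
    return worst


# Toy explainer: the gradient of f(x) = sum(x**2), i.e. 2 * x.
estimate = local_lipschitz_estimate(lambda v: 2.0 * v, np.array([0.3, -1.2, 0.7]))
```

A lower value means explanations change less under small input perturbations. Random sampling only gives a lower bound on the true worst case within the neighbourhood.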

Learning Beyond Human Expertise with Generative Models for Dental Restorations


Title	Learning Beyond Human Expertise with Generative Models for Dental Restorations
Authors	Jyh-Jing Hwang, Sergei Azernikov, Alexei A. Efros, Stella X. Yu
Abstract	Computer vision has advanced so significantly that many discriminative approaches such as object recognition are now widely used in real applications. We present another exciting development that utilizes generative models for the mass customization of medical products such as dental crowns. In the dental industry, it takes a technician years of training to design synthetic crowns that restore the function and integrity of missing teeth. Each crown must be customized to individual patients, and it requires human expertise in a time-consuming and labor-intensive process, even with computer-assisted design software. We develop a fully automatic approach that learns not only from human designs of dental crowns, but also from natural spatial profiles between opposing teeth. The latter is hard to account for by technicians but important for proper biting and chewing functions. Built upon a Generative Adversarial Network (GAN) architecture, our deep learning model predicts the customized crown-filled depth scan from the crown-missing depth scan and opposing depth scan. We propose to incorporate additional space constraints and statistical compatibility into learning. Our automatic designs exceed human technicians’ standards for good morphology and functionality, and our algorithm is being tested for production use.
Tasks	Object Recognition
Published	2018-03-30
URL	http://arxiv.org/abs/1804.00064v1
PDF	http://arxiv.org/pdf/1804.00064v1.pdf
PWC	https://paperswithcode.com/paper/learning-beyond-human-expertise-with
Repo
Framework

Dyna: A Method of Momentum for Stochastic Optimization


Title	Dyna: A Method of Momentum for Stochastic Optimization
Authors	Zhidong Han
Abstract	An algorithm is presented for momentum gradient descent optimization based on the first-order differential equation of the Newtonian dynamics. The fictitious mass is introduced to the dynamics of momentum for regularizing the adaptive stepsize of each individual parameter. The dynamic relaxation is adapted for stochastic optimization of nonlinear objective functions through an explicit time integration with varying damping ratio. The adaptive stepsize is optimized for each individual neural network layer based on the number of inputs. The adaptive stepsize for every parameter over the entire neural network is uniformly optimized with one upper bound, independent of sparsity, for better overall convergence rate. The numerical implementation of the algorithm is similar to the Adam Optimizer, possessing computational efficiency, similar memory requirements, etc. There are three hyper-parameters in the algorithm with clear physical interpretation. Preliminary trials show promise in performance and convergence.
Tasks	Stochastic Optimization
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04933v1
PDF	http://arxiv.org/pdf/1805.04933v1.pdf
PWC	https://paperswithcode.com/paper/dyna-a-method-of-momentum-for-stochastic
Repo
Framework

Robust Classification of Financial Risk


Title	Robust Classification of Financial Risk
Authors	Suproteem K. Sarkar, Kojin Oshiba, Daniel Giebisch, Yaron Singer
Abstract	Algorithms are increasingly common components of high-impact decision-making, and a growing body of literature on adversarial examples in laboratory settings indicates that standard machine learning models are not robust. This suggests that real-world systems are also susceptible to manipulation or misclassification, which especially poses a challenge to machine learning models used in financial services. We use the loan grade classification problem to explore how machine learning models are sensitive to small changes in user-reported data, using adversarial attacks documented in the literature and an original, domain-specific attack. Our work shows that a robust optimization algorithm can build models for financial services that are resistant to misclassification on perturbations. To the best of our knowledge, this is the first study of adversarial attacks and defenses for deep learning in financial services.
Tasks	Decision Making
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11079v1
PDF	http://arxiv.org/pdf/1811.11079v1.pdf
PWC	https://paperswithcode.com/paper/robust-classification-of-financial-risk
Repo
Framework

Event Coreference Resolution Using Neural Network Classifiers


Title	Event Coreference Resolution Using Neural Network Classifiers
Authors	Arun Pandian, Lamana Mulaffer, Kemal Oflazer, Amna AlZeyara
Abstract	This paper presents a neural network classifier approach to detecting both within- and cross-document event coreference effectively using only event mention based features. Our approach does not (yet) rely on any event argument features such as semantic roles or spatiotemporal arguments. Experimental results on the ECB+ dataset show that our approach produces F1 scores that significantly outperform the state-of-the-art methods for both within-document and cross-document event coreference resolution when we use B3 and CEAFe evaluation measures, but gets a worse F1 score with the MUC measure. However, when we use the CoNLL measure, which is the average of these three scores, our approach has slightly better F1 for within-document event coreference resolution but is significantly better for cross-document event coreference resolution.
Tasks	Coreference Resolution
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04216v1
PDF	http://arxiv.org/pdf/1810.04216v1.pdf
PWC	https://paperswithcode.com/paper/event-coreference-resolution-using-neural
Repo
Framework
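
The CoNLL measure described in the abstract above is simply the unweighted mean of the MUC, B3 and CEAFe F1 scores, as the short sketch below shows. The function name and the input scores are hypothetical placeholders, not results from the paper.

```python
def conll_f1(muc_f1, b3_f1, ceafe_f1):
    """CoNLL score: unweighted average of the MUC, B3 and CEAFe F1 scores."""
    return (muc_f1 + b3_f1 + ceafe_f1) / 3.0


# Hypothetical F1 values, chosen only to illustrate the arithmetic.
example = conll_f1(muc_f1=60.0, b3_f1=70.0, ceafe_f1=80.0)
```

Because the three scores are averaged, a system can trail on MUC and still come out ahead on the CoNLL measure if its B3 and CEAFe gains are large enough, which matches the pattern the abstract reports.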

Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model


Title	Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model
Authors	Niko Brummer, Anna Silnova, Lukas Burget, Themos Stafylakis
Abstract	Embeddings in machine learning are low-dimensional representations of complex input patterns, with the property that simple geometric operations like Euclidean distances and dot products can be used for classification and comparison tasks. The proposed meta-embeddings are special embeddings that live in more general inner product spaces. They are designed to propagate uncertainty to the final output in speaker recognition and similar applications. The familiar Gaussian PLDA model (GPLDA) can be re-formulated as an extractor for Gaussian meta-embeddings (GMEs), such that likelihood ratio scores are given by Hilbert space inner products between Gaussian likelihood functions. GMEs extracted by the GPLDA model have fixed precisions and do not propagate uncertainty. We show that a generalization to heavy-tailed PLDA gives GMEs with variable precisions, which do propagate uncertainty. Experiments on NIST SRE 2010 and 2016 show that the proposed method applied to i-vectors without length normalization is up to 20% more accurate than GPLDA applied to length-normalized i-vectors.
Tasks	Speaker Recognition
Published	2018-02-27
URL	http://arxiv.org/abs/1802.09777v1
PDF	http://arxiv.org/pdf/1802.09777v1.pdf
PWC	https://paperswithcode.com/paper/gaussian-meta-embeddings-for-efficient
Repo
Framework

Predicting Infant Motor Development Status using Day Long Movement Data from Wearable Sensors


Title	Predicting Infant Motor Development Status using Day Long Movement Data from Wearable Sensors
Authors	David Goodfellow, Ruoyu Zhi, Rebecca Funke, Jose Carlos Pulido, Maja Mataric, Beth A. Smith
Abstract	Infants with a variety of complications at or before birth are classified as being at risk for developmental delays (AR). As they grow older, they are followed by healthcare providers in an effort to discern whether they are on a typical or impaired developmental trajectory. Often, it is difficult to make an accurate determination early in infancy as infants with typical development (TD) display high variability in their developmental trajectories both in content and timing. Studies have shown that spontaneous movements have the potential to differentiate typical and atypical trajectories early in life using sensors and kinematic analysis systems. In this study, machine learning classification algorithms are used to take inertial movement from wearable sensors placed on an infant for a day and predict if the infant is AR or TD, thus further establishing the connection between early spontaneous movement and developmental trajectory.
Tasks
Published	2018-07-07
URL	http://arxiv.org/abs/1807.02617v2
PDF	http://arxiv.org/pdf/1807.02617v2.pdf
PWC	https://paperswithcode.com/paper/predicting-infant-motor-development-status
Repo
Framework

Analyzing Uncertainty in Neural Machine Translation


Title	Analyzing Uncertainty in Neural Machine Translation
Authors	Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato
Abstract	Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some shortcomings of current models. As part of this study, we release multiple human reference translations for two popular benchmarks.
Tasks	Calibration, Machine Translation
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00047v4
PDF	http://arxiv.org/pdf/1803.00047v4.pdf
PWC	https://paperswithcode.com/paper/analyzing-uncertainty-in-neural-machine
Repo
Framework