April 2, 2020

Paper Group ANR 160

Towards Universal Representation Learning for Deep Face Recognition


Title	Towards Universal Representation Learning for Deep Face Recognition
Authors	Yichun Shi, Xiang Yu, Kihyuk Sohn, Manmohan Chandraker, Anil K. Jain
Abstract	Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or introduce unlabeled target variation data to adapt from the training data. Instead, we propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge. We first synthesize training data alongside some semantically meaningful variations, such as low resolution, occlusion and head pose. However, directly feeding the augmented data for training will not converge well as the newly introduced samples are mostly hard examples. We propose to split the feature embedding into multiple sub-embeddings, and associate a different confidence value with each sub-embedding to smooth the training procedure. The sub-embeddings are further decorrelated by regularizing variation classification loss and variation adversarial loss on different partitions of them. Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace, while performing significantly better on extreme benchmarks such as TinyFace and IJB-S.
Tasks	Face Recognition, Representation Learning
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11841v1
PDF	https://arxiv.org/pdf/2002.11841v1.pdf
PWC	https://paperswithcode.com/paper/towards-universal-representation-learning-for
Repo
Framework

Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization


Title	Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization
Authors	Stefan Vlaski, Ali H. Sayed
Abstract	Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in general, in the sense that even simply verifying that a given point is a local minimum can be NP-hard [1]. Still, some relatively simple algorithms have been shown to lead to surprisingly good empirical results in many contexts of interest. Perhaps the most prominent example is the success of the backpropagation algorithm for training neural networks. Several recent works have pursued rigorous analytical justification for this phenomenon by studying the structure of the nonconvex optimization problems and establishing that simple algorithms, such as gradient descent and its variations, perform well in converging towards local minima and avoiding saddle-points. A key insight in these analyses is that gradient perturbations play a critical role in allowing local descent algorithms to efficiently distinguish desirable from undesirable stationary points and escape from the latter. In this article, we cover recent results on second-order guarantees for stochastic first-order optimization algorithms in centralized, federated, and decentralized architectures.
Tasks
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14366v1
PDF	https://arxiv.org/pdf/2003.14366v1.pdf
PWC	https://paperswithcode.com/paper/second-order-guarantees-in-centralized
Repo
Framework

Social Media Attributions in the Context of Water Crisis


Title	Social Media Attributions in the Context of Water Crisis
Authors	Rupak Sarkar, Hirak Sarkar, Sayantan Mahinder, Ashiqur R. KhudaBukhsh
Abstract	Attribution of natural disasters/collective misfortune is a widely-studied political science problem. However, such studies are typically survey-centric or rely on a handful of experts to weigh in on the matter. In this paper, we explore how we can use social media data and an AI-driven approach to complement traditional surveys and automatically extract attribution factors. We focus on the most recent Chennai water crisis, which started off as a regional issue but rapidly escalated into a discussion topic of global importance following alarming water-crisis statistics. Specifically, we present a novel prediction task of attribution tie detection which identifies the factors held responsible for the crisis (e.g., poor city planning, exploding population etc.). On a challenging data set constructed from YouTube comments (72,098 comments posted by 43,859 users on 623 videos relevant to the crisis), we present a neural classifier to extract attribution ties that achieved a reasonable performance (Accuracy: 81.34% on attribution detection and 71.19% on attribution resolution).
Tasks
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01697v1
PDF	https://arxiv.org/pdf/2001.01697v1.pdf
PWC	https://paperswithcode.com/paper/social-media-attributions-in-the-context-of
Repo
Framework

Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks


Title	Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks
Authors	Jun Chen, Yong Liu, Hao Zhang, Shengnan Hou, Jian Yang
Abstract	Quantized neural networks (QNNs) can be useful for neural network acceleration and compression, but during the training process they pose a challenge: how to propagate the gradient of the loss function through a graph flow whose derivative is 0 almost everywhere. In response to this non-differentiable situation, we propose a novel Asymptotic-Quantized Estimator (AQE) to estimate the gradient. In particular, during back-propagation, the graph that relates inputs to output remains smooth and differentiable. At the end of training, the weights and activations have been quantized to low precision because of the asymptotic behaviour of AQE. Meanwhile, we propose an M-bit Inputs and N-bit Weights Network (MINW-Net) trained by AQE, a quantized neural network with 1-3 bit weights and activations. In the inference phase, we can use XNOR or SHIFT operations instead of convolution operations to accelerate the MINW-Net. Our experiments on CIFAR datasets demonstrate that our AQE is well defined, and that QNNs with AQE perform better than those with the Straight-Through Estimator (STE). For example, in the case of the same ConvNet that has 1-bit weights and activations, our MINW-Net with AQE can achieve a prediction accuracy 1.5% higher than the Binarized Neural Network (BNN) with STE. The MINW-Net, which is trained from scratch by AQE, can achieve classification accuracy comparable to its 32-bit counterparts on CIFAR test sets. Extensive experimental results on the ImageNet dataset show the superiority of the proposed AQE, and our MINW-Net achieves results comparable with other state-of-the-art QNNs.
Tasks
Published	2020-03-04
URL	https://arxiv.org/abs/2003.04296v1
PDF	https://arxiv.org/pdf/2003.04296v1.pdf
PWC	https://paperswithcode.com/paper/propagating-asymptotic-estimated-gradients
Repo
Framework

Communication-Efficient Distributed SVD via Local Power Iterations


Title	Communication-Efficient Distributed SVD via Local Power Iterations
Authors	Xiang Li, Shusen Wang, Kun Chen, Zhihua Zhang
Abstract	We study the distributed computing of the truncated singular value decomposition (SVD). We develop an algorithm that we call LocalPower for improving the communication efficiency. Specifically, we uniformly partition the dataset among $m$ nodes and alternate between multiple (precisely $p$) local power iterations and one global aggregation. We theoretically show that under certain assumptions, LocalPower lowers the required number of communications by a factor of $p$ to reach a certain accuracy. We also show that the strategy of periodically decaying $p$ helps improve the performance of LocalPower. We conduct experiments to demonstrate the effectiveness of LocalPower.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08014v1
PDF	https://arxiv.org/pdf/2002.08014v1.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-distributed-svd-via
Repo
Framework
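
The alternation described in the abstract ($p$ local power iterations per node, then one global aggregation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' LocalPower implementation: it tracks only the top right singular vector (rank 1), aggregates by a size-weighted average, and keeps $p$ fixed instead of decaying it. The names local_power_top1, gram_matvec and normalize are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram_matvec(rows, v):
    """Return (A^T A v) / n for a row block A given as a list of rows."""
    out = [0.0] * len(v)
    for r in rows:
        dot = sum(r_j * v_j for r_j, v_j in zip(r, v))
        for j, r_j in enumerate(r):
            out[j] += dot * r_j
    return [x / len(rows) for x in out]


def local_power_top1(blocks, p, rounds, seed=0):
    """Estimate the top right singular vector of the stacked blocks.

    Assumption: rank-1 simplification with plain weighted averaging as
    the aggregation step; each round costs one communication.
    """
    rng = random.Random(seed)
    d = len(blocks[0][0])
    n_total = sum(len(rows) for rows in blocks)
    v = normalize([rng.gauss(0.0, 1.0) for _ in range(d)])
    for _ in range(rounds):
        aggregate = [0.0] * d
        for rows in blocks:
            v_local = v
            # p local power iterations using only this node's data
            for _ in range(p):
                v_local = normalize(gram_matvec(rows, v_local))
            weight = len(rows) / n_total
            aggregate = [a + weight * x for a, x in zip(aggregate, v_local)]
        # one global aggregation per round
        v = normalize(aggregate)
    return v
```

With p = 1 and equally sized blocks, each round reduces to one step of ordinary power iteration on the full data; larger p trades extra local computation for fewer aggregation rounds.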

Improved Spectral Imaging Microscopy for Cultural Heritage through Oblique Illumination


Title	Improved Spectral Imaging Microscopy for Cultural Heritage through Oblique Illumination
Authors	Lindsay Oakley, Stephanie Zaleski, Billie Males, Ollie Cossairt, Marc Walton
Abstract	This work presents the development of a flexible microscopic chemical imaging platform for cultural heritage that utilizes wavelength-tunable oblique illumination from a point source to obtain per-pixel reflectance spectra in the VIS-NIR range. The microscope light source can be adjusted on two axes allowing for a hemisphere of possible illumination directions. The synthesis of multiple illumination angles allows for the calculation of surface normal vectors, similar to phase gradients, and axial optical sectioning. The extraction of spectral reflectance images with high spatial resolutions from these data is demonstrated through the analysis of a replica cross-section, created from known painting reference materials, as well as a sample extracted from a painting by Pablo Picasso entitled La Miséreuse accroupie (1902). These case studies show the rich microscale molecular information that may be obtained using this microscope and how the instrument overcomes challenges for spectral analysis commonly encountered on works of art with complex matrices composed of both inorganic minerals and organic lakes.
Tasks
Published	2020-01-01
URL	https://arxiv.org/abs/2001.00817v1
PDF	https://arxiv.org/pdf/2001.00817v1.pdf
PWC	https://paperswithcode.com/paper/improved-spectral-imaging-microscopy-for
Repo
Framework

Inverting Gradients – How easy is it to break privacy in federated learning?


Title	Inverting Gradients – How easy is it to break privacy in federated learning?
Authors	Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, Michael Moeller
Abstract	The idea of federated learning is to collaboratively train a neural network on a server. Each user receives the current weights of the network and in turn sends parameter updates (gradients) based on local data. This protocol has been designed not only to train neural networks data-efficiently, but also to provide privacy benefits for users, as their input data remains on device and only parameter gradients are shared. In this paper we show that sharing parameter gradients is by no means secure: By exploiting a cosine similarity loss along with optimization methods from adversarial attacks, we are able to faithfully reconstruct images at high resolution from the knowledge of their parameter gradients, and demonstrate that such a break of privacy is possible even for trained deep networks. Moreover, we analyze the effects of architecture as well as parameters on the difficulty of reconstructing the input image, prove that any input to a fully connected layer can be reconstructed analytically independent of the remaining architecture, and show numerically that even averaging gradients over several iterations or several images does not protect the user’s privacy in federated learning applications in computer vision.
Tasks
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14053v1
PDF	https://arxiv.org/pdf/2003.14053v1.pdf
PWC	https://paperswithcode.com/paper/inverting-gradients-how-easy-is-it-to-break
Repo
Framework
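
The abstract's claim about fully connected layers can be illustrated on a single layer y = Wx + b. By the chain rule, the weight gradient is the outer product of dL/dy and x, and the bias gradient equals dL/dy, so dividing any row of the weight gradient by the matching bias gradient returns x. The sketch below assumes a layer with a bias term, at least one nonzero bias gradient, and a squared-error loss chosen purely for illustration; the function names are hypothetical and this is not the paper's code.

```python
def fc_gradients(W, b, x, target):
    """Gradients of L = 0.5 * ||W x + b - target||^2 with respect to W and b."""
    y = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    dL_dy = [y_i - t_i for y_i, t_i in zip(y, target)]
    grad_W = [[g * x_j for x_j in x] for g in dL_dy]  # outer product dL/dy * x^T
    grad_b = list(dL_dy)                              # dL/db equals dL/dy
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b, eps=1e-12):
    """Recover the layer input from shared gradients alone."""
    for row, g in zip(grad_W, grad_b):
        if abs(g) > eps:
            # row i of grad_W is grad_b[i] * x, so divide it out
            return [value / g for value in row]
    raise ValueError("all bias gradients are zero; input not recoverable this way")
```

The reconstruction never touches the loss or the target, only the two gradient tensors, which is why the remaining architecture does not matter for this step.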

Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense


Title	Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense
Authors	Guanlin Liu, Lifeng Lai
Abstract	Due to the broad range of applications of the stochastic multi-armed bandit model, understanding the effects of adversarial attacks and designing bandit algorithms robust to attacks are essential for the safe application of this model. In this paper, we introduce a new class of attack named action-manipulation attack. In this attack, an adversary can change the action signal selected by the user. We show that without knowledge of mean rewards of arms, our proposed attack can manipulate the Upper Confidence Bound (UCB) algorithm, a widely used bandit algorithm, into pulling a target arm very frequently by spending only logarithmic cost. To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks when an upper bound for the total attack cost is given. We prove that our algorithm has a pseudo-regret upper bounded by $\mathcal{O}(\max\{\log T, A\})$, where $T$ is the total number of rounds and $A$ is the upper bound of the total attack cost.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08000v2
PDF	https://arxiv.org/pdf/2002.08000v2.pdf
PWC	https://paperswithcode.com/paper/action-manipulation-attacks-against
Repo
Framework
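
To make the threat model concrete, here is a toy simulation of UCB1 under an action-manipulation attacker. It is a simplified sketch and not the attack proposed in the paper: the attacker here is assumed to know which arm is worst (the paper's attack works without knowledge of the mean rewards), rewards are Bernoulli, and the function name run_ucb and its parameters are hypothetical.

```python
import math
import random


def run_ucb(means, horizon, target=None, worst=None, seed=0):
    """Run UCB1 on Bernoulli arms, optionally under action manipulation.

    If `target` is set, whenever the learner selects an arm other than
    `target`, the arm actually pulled is `worst`, while the learner still
    credits the observed reward to the arm it selected.
    Returns (selection counts per arm, number of manipulated rounds).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    attack_cost = 0

    def ucb_index(arm, t):
        return sums[arm] / counts[arm] + math.sqrt(2.0 * math.log(t) / counts[arm])

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # select each arm once to initialise the estimates
        else:
            arm = max(range(k), key=lambda a: ucb_index(a, t))
        pulled = arm
        if target is not None and arm != target:
            pulled = worst  # attacker (assumed to know the worst arm) swaps the action
            if pulled != arm:
                attack_cost += 1
        reward = 1.0 if rng.random() < means[pulled] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, attack_cost
```

Calling run_ucb([0.9, 0.5, 0.1], 2000) and then run_ucb([0.9, 0.5, 0.1], 2000, target=1, worst=2) lets one compare how often each arm is selected with and without the attacker, and how many rounds the attacker had to manipulate.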

Inverse learning in Hilbert scales


Title	Inverse learning in Hilbert scales
Authors	Abhishake Rastogi, Peter Mathé
Abstract	We study the linear ill-posed inverse problem with noisy data in the statistical learning setting. Approximate reconstructions from random noisy data are sought with general regularization schemes in Hilbert scale. We discuss the rates of convergence for the regularized solution under the prior assumptions and a certain link condition. We express the error in terms of certain distance functions. For regression functions with smoothness given in terms of source conditions the error bound can then be explicitly established.
Tasks
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10208v1
PDF	https://arxiv.org/pdf/2002.10208v1.pdf
PWC	https://paperswithcode.com/paper/inverse-learning-in-hilbert-scales
Repo
Framework

Exploiting Neuron and Synapse Filter Dynamics in Spatial Temporal Learning of Deep Spiking Neural Network


Title	Exploiting Neuron and Synapse Filter Dynamics in Spatial Temporal Learning of Deep Spiking Neural Network
Authors	Haowen Fang, Amar Shrestha, Ziyi Zhao, Qinru Qiu
Abstract	The recently discovered spatial-temporal information processing capability of bio-inspired spiking neural networks (SNNs) has enabled some interesting models and applications. However, designing large-scale, high-performance models remains a challenge due to the lack of robust training algorithms. A bio-plausible SNN model with spatial-temporal properties is a complex dynamic system. Each synapse and neuron behaves as a filter capable of preserving temporal information. Because such neuron dynamics and filter effects are ignored in existing training algorithms, the SNN degrades into a memoryless system and loses the ability to process temporal signals. Furthermore, spike timing plays an important role in information representation, but conventional rate-based spike coding models only consider spike trains statistically, and discard the information carried by their temporal structures. To address the above issues and exploit the temporal dynamics of SNNs, we formulate the SNN as a network of infinite impulse response (IIR) filters with neuron nonlinearity. We propose a training algorithm that is capable of learning spatial-temporal patterns by searching for the optimal synapse filter kernels and weights. The proposed model and training algorithm are applied to construct associative memories and classifiers for synthetic and public datasets including MNIST, NMNIST, DVS 128 etc., and their accuracy outperforms state-of-the-art approaches.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2003.02944v1
PDF	https://arxiv.org/pdf/2003.02944v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-neuron-and-synapse-filter-dynamics
Repo
Framework

Towards Quantifying the Distance between Opinions


Title	Towards Quantifying the Distance between Opinions
Authors	Saket Gurukar, Deepak Ajwani, Sourav Dutta, Juho Lauri, Srinivasan Parthasarathy, Alessandra Sala
Abstract	Increasingly, critical decisions in public policy, governance, and business strategy rely on a deeper understanding of the needs and opinions of constituent members (e.g. citizens, shareholders). While it has become easier to collect a large number of opinions on a topic, there is a necessity for automated tools to help navigate the space of opinions. In such contexts, understanding and quantifying the similarity between opinions is key. We find that measures based solely on text similarity or on overall sentiment often fail to effectively capture the distance between opinions. Thus, we propose a new distance measure for capturing the similarity between opinions that leverages a nuanced observation: similar opinions express similar sentiment polarity on specific relevant entities-of-interest. Specifically, in an unsupervised setting, our distance measure achieves significantly better Adjusted Rand Index scores (up to 56x) and Silhouette coefficients (up to 21x) compared to existing approaches. Similarly, in a supervised setting, our opinion distance measure achieves considerably better accuracy (up to 20% increase) compared to extant approaches that rely on text similarity, stance similarity, and sentiment similarity.
Tasks
Published	2020-01-27
URL	https://arxiv.org/abs/2001.09879v1
PDF	https://arxiv.org/pdf/2001.09879v1.pdf
PWC	https://paperswithcode.com/paper/towards-quantifying-the-distance-between
Repo
Framework

Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation


Title	Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation
Authors	Vikramjit Mitra, Horacio Franco
Abstract	Unseen or out-of-domain data can seriously degrade the performance of a neural network model, indicating the model’s failure to generalize to unseen data. Neural net pruning can not only help to reduce a model’s size but can improve the model’s generalization capacity as well. Pruning approaches look for low-salient neurons that are less contributive to a model’s decision and hence can be removed from the model. This work investigates if pruning approaches are successful in detecting neurons that are either high-salient (mostly active or hyper) or low-salient (barely active or hypo), and whether removal of such neurons can help to improve the model’s generalization capacity. Traditional blind adaptation techniques update either the whole or a subset of layers, but have never explored selectively updating individual neurons across one or more layers. Focusing on the fully connected layers of a convolutional neural network (CNN), this work shows that it may be possible to selectively adapt certain neurons (consisting of the hyper and the hypo neurons) first, followed by a full-network fine tuning. Using the task of automatic speech recognition, this work demonstrates how the removal of hyper and hypo neurons from a model can improve the model’s performance on out-of-domain speech data and how selective neuron adaptation can ensure improved performance when compared to traditional blind model adaptation.
Tasks	Speech Recognition
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01755v1
PDF	https://arxiv.org/pdf/2001.01755v1.pdf
PWC	https://paperswithcode.com/paper/investigation-and-analysis-of-hyper-and-hypo
Repo
Framework

Universal Domain Adaptation through Self Supervision


Title	Universal Domain Adaptation through Self Supervision
Authors	Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Kate Saenko
Abstract	Unsupervised domain adaptation methods traditionally assume that all source categories are present in the target domain. In practice, little may be known about the category overlap between the two domains. While some methods address target settings with either partial or open-set categories, they assume that the particular setting is known a priori. We propose a more universally applicable domain adaptation approach that can handle arbitrary category shift, called Domain Adaptative Neighborhood Clustering via Entropy optimization (DANCE). DANCE combines two novel ideas: First, as we cannot fully rely on source categories to learn features discriminative for the target, we propose a novel neighborhood clustering technique to learn the structure of the target domain in a self-supervised way. Second, we use entropy-based feature alignment and rejection to align target features with the source, or reject them as unknown categories based on their entropy. We show through extensive experiments that DANCE outperforms baselines across open-set, open-partial and partial domain adaptation settings.
Tasks	Domain Adaptation, Partial Domain Adaptation, Unsupervised Domain Adaptation
Published	2020-02-19
URL	https://arxiv.org/abs/2002.07953v2
PDF	https://arxiv.org/pdf/2002.07953v2.pdf
PWC	https://paperswithcode.com/paper/universal-domain-adaptation-through-self
Repo
Framework
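
The entropy-based rejection step mentioned in the abstract can be sketched as a simple decision rule on a classifier's output: low-entropy (confident) target samples are assigned to a source class, high-entropy ones are rejected as unknown. This is an illustrative sketch only, not the DANCE training objective; the default threshold of half the maximum entropy, log(K)/2, is an assumption made here, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def align_or_reject(logits, threshold=None):
    """Return the predicted source class index, or None for 'unknown'.

    Assumption: the default threshold is log(K) / 2 for K source classes.
    """
    probs = softmax(logits)
    if threshold is None:
        threshold = math.log(len(probs)) / 2.0
    if entropy(probs) < threshold:
        return max(range(len(probs)), key=probs.__getitem__)
    return None
```

A sharply peaked prediction such as align_or_reject([8.0, 0.0, 0.0, 0.0]) is accepted as class 0, while a nearly uniform one such as align_or_reject([0.1, 0.0, 0.05, 0.0]) is rejected.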

Transition-Based Dependency Parsing using Perceptron Learner


Title	Transition-Based Dependency Parsing using Perceptron Learner
Authors	Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking
Abstract	Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learner, outperforms a baseline arc-standard parser. We beat the UAS of the MALT and LSTM parsers. We also give possible ways to address parsing of non-projective trees.
Tasks	Dependency Parsing, Transition-Based Dependency Parsing
Published	2020-01-22
URL	https://arxiv.org/abs/2001.08279v2
PDF	https://arxiv.org/pdf/2001.08279v2.pdf
PWC	https://paperswithcode.com/paper/transition-based-dependency-parsing-using
Repo
Framework
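
For readers unfamiliar with the arc-standard system that the abstract uses as its baseline, the sketch below applies a given sequence of SHIFT, LEFT-ARC and RIGHT-ARC transitions to a sentence and returns the resulting heads. In the paper a perceptron would choose each transition from features of the parser state; here the sequence is supplied by hand, so this shows only the transition mechanics. The function name arc_standard is hypothetical.

```python
def arc_standard(words, transitions):
    """Apply arc-standard transitions; return {dependent: head}, with 0 = ROOT.

    Words are numbered from 1 in sentence order.
    """
    stack = [0]                                # artificial ROOT node
    buffer = list(range(1, len(words) + 1))    # remaining word indices
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with empty buffer")
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # second-from-top becomes a dependent of the top item
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT-ARC not permitted in this state")
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":
            # top item becomes a dependent of the second-from-top item
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC not permitted in this state")
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError("unknown transition: " + action)
    return heads
```

For the sentence "She eats fish", the sequence SHIFT, SHIFT, LEFT-ARC, SHIFT, RIGHT-ARC, RIGHT-ARC attaches "She" and "fish" to "eats" and "eats" to ROOT. Because arc-standard only combines the top two stack items, it produces projective trees, which is why non-projective parsing needs separate treatment.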

Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks


Title	Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
Authors	Jiangke Lin, Yi Yuan, Tianjia Shao, Kun Zhou
Abstract	3D Morphable Model (3DMM) based methods have achieved great success in recovering 3D face shapes from single-view images. However, the facial textures recovered by such methods lack the fidelity exhibited in the input images. Recent work demonstrates high-quality facial texture recovery with generative networks trained from a large-scale database of high-resolution UV maps of face textures, which is hard to prepare and not publicly available. In this paper, we introduce a method to reconstruct 3D facial shapes with high-fidelity textures from single-view images in-the-wild, without the need to capture a large-scale face texture database. The main idea is to refine the initial texture generated by a 3DMM based method with facial details from the input image. To this end, we propose to use graph convolutional networks to reconstruct the detailed colors for the mesh vertices instead of reconstructing the UV map. Experiments show that our method can generate high-quality results and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.
Tasks	3D Face Reconstruction, Face Reconstruction
Published	2020-03-12
URL	https://arxiv.org/abs/2003.05653v1
PDF	https://arxiv.org/pdf/2003.05653v1.pdf
PWC	https://paperswithcode.com/paper/towards-high-fidelity-3d-face-reconstruction
Repo
Framework