Paper Group ANR 160
Towards Universal Representation Learning for Deep Face Recognition. Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization. Social Media Attributions in the Context of Water Crisis. Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks. Communication-Efficient Distributed SVD via …
Towards Universal Representation Learning for Deep Face Recognition
Title | Towards Universal Representation Learning for Deep Face Recognition |
Authors | Yichun Shi, Xiang Yu, Kihyuk Sohn, Manmohan Chandraker, Anil K. Jain |
Abstract | Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or introduce unlabeled target variation data to adapt from the training data. Instead, we propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge. We first synthesize training data alongside some semantically meaningful variations, such as low resolution, occlusion and head pose. However, directly feeding the augmented data for training will not converge well as the newly introduced samples are mostly hard examples. We propose to split the feature embedding into multiple sub-embeddings, and associate different confidence values with each sub-embedding to smooth the training procedure. The sub-embeddings are further decorrelated by regularizing variation classification loss and variation adversarial loss on different partitions of them. Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace, while performing significantly better on extreme benchmarks such as TinyFace and IJB-S. |
Tasks | Face Recognition, Representation Learning |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11841v1 |
https://arxiv.org/pdf/2002.11841v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-universal-representation-learning-for |
Repo | |
Framework | |
Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization
Title | Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization |
Authors | Stefan Vlaski, Ali H. Sayed |
Abstract | Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in general, in the sense that even simply verifying that a given point is a local minimum can be NP-hard [1]. Still, some relatively simple algorithms have been shown to lead to surprisingly good empirical results in many contexts of interest. Perhaps the most prominent example is the success of the backpropagation algorithm for training neural networks. Several recent works have pursued rigorous analytical justification for this phenomenon by studying the structure of the nonconvex optimization problems and establishing that simple algorithms, such as gradient descent and its variations, perform well in converging towards local minima and avoiding saddle-points. A key insight in these analyses is that gradient perturbations play a critical role in allowing local descent algorithms to efficiently distinguish desirable from undesirable stationary points and escape from the latter. In this article, we cover recent results on second-order guarantees for stochastic first-order optimization algorithms in centralized, federated, and decentralized architectures. |
Tasks | |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14366v1 |
https://arxiv.org/pdf/2003.14366v1.pdf | |
PWC | https://paperswithcode.com/paper/second-order-guarantees-in-centralized |
Repo | |
Framework | |
Social Media Attributions in the Context of Water Crisis
Title | Social Media Attributions in the Context of Water Crisis |
Authors | Rupak Sarkar, Hirak Sarkar, Sayantan Mahinder, Ashiqur R. KhudaBukhsh |
Abstract | Attribution of natural disasters/collective misfortune is a widely studied political science problem. However, such studies are typically survey-centric or rely on a handful of experts to weigh in on the matter. In this paper, we explore how we can use social media data and an AI-driven approach to complement traditional surveys and automatically extract attribution factors. We focus on the most recent Chennai water crisis, which started off as a regional issue but rapidly escalated into a discussion topic of global importance following alarming water-crisis statistics. Specifically, we present a novel prediction task of attribution tie detection which identifies the factors held responsible for the crisis (e.g., poor city planning, exploding population, etc.). On a challenging data set constructed from YouTube comments (72,098 comments posted by 43,859 users on 623 videos relevant to the crisis), we present a neural classifier to extract attribution ties that achieved a reasonable performance (Accuracy: 81.34% on attribution detection and 71.19% on attribution resolution). |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01697v1 |
https://arxiv.org/pdf/2001.01697v1.pdf | |
PWC | https://paperswithcode.com/paper/social-media-attributions-in-the-context-of |
Repo | |
Framework | |
Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks
Title | Propagating Asymptotic-Estimated Gradients for Low Bitwidth Quantized Neural Networks |
Authors | Jun Chen, Yong Liu, Hao Zhang, Shengnan Hou, Jian Yang |
Abstract | Quantized neural networks (QNNs) can be useful for neural network acceleration and compression, but during the training process they pose a challenge: how to propagate the gradient of the loss function through the graph flow with a derivative of 0 almost everywhere. In response to this non-differentiable situation, we propose a novel Asymptotic-Quantized Estimator (AQE) to estimate the gradient. In particular, during back-propagation, the graph that relates inputs to output remains smooth and differentiable. At the end of training, the weights and activations have been quantized to low precision because of the asymptotic behaviour of AQE. Meanwhile, we propose an M-bit Inputs and N-bit Weights Network (MINW-Net) trained by AQE, a quantized neural network with 1-3 bit weights and activations. In the inference phase, we can use XNOR or SHIFT operations instead of convolution operations to accelerate the MINW-Net. Our experiments on CIFAR datasets demonstrate that our AQE is well defined, and the QNNs with AQE perform better than those with the Straight-Through Estimator (STE). For example, in the case of the same ConvNet that has 1-bit weights and activations, our MINW-Net with AQE can achieve a prediction accuracy 1.5% higher than the Binarized Neural Network (BNN) with STE. The MINW-Net, which is trained from scratch by AQE, can achieve classification accuracy comparable to its 32-bit counterparts on CIFAR test sets. Extensive experimental results on the ImageNet dataset show great superiority of the proposed AQE, and our MINW-Net achieves results comparable with other state-of-the-art QNNs. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.04296v1 |
https://arxiv.org/pdf/2003.04296v1.pdf | |
PWC | https://paperswithcode.com/paper/propagating-asymptotic-estimated-gradients |
Repo | |
Framework | |
Communication-Efficient Distributed SVD via Local Power Iterations
Title | Communication-Efficient Distributed SVD via Local Power Iterations |
Authors | Xiang Li, Shusen Wang, Kun Chen, Zhihua Zhang |
Abstract | We study the distributed computing of the truncated singular value decomposition (SVD). We develop an algorithm that we call \texttt{LocalPower} for improving the communication efficiency. Specifically, we uniformly partition the dataset among $m$ nodes and alternate between multiple (precisely $p$) local power iterations and one global aggregation. We theoretically show that under certain assumptions, \texttt{LocalPower} lowers the required number of communications by a factor of $p$ to reach a certain accuracy. We also show that the strategy of periodically decaying $p$ helps improve the performance of \texttt{LocalPower}. We conduct experiments to demonstrate the effectiveness of \texttt{LocalPower}. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08014v1 |
https://arxiv.org/pdf/2002.08014v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-distributed-svd-via |
Repo | |
Framework | |
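The alternation described in the LocalPower abstract above (several local power iterations on each node, then one global aggregation) can be illustrated with a small sketch. This is not the authors' implementation: it is restricted to the top singular vector only, it aggregates by plain averaging followed by normalization, and it keeps $p$ fixed rather than decaying it. The function names and the round-robin partition are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / n for x in v]


def gram_times(rows, v):
    """Return (A^T A) v for a local data block A given as a list of rows."""
    out = [0.0] * len(v)
    for r in rows:
        proj = sum(r_j * v_j for r_j, v_j in zip(r, v))
        for j, r_j in enumerate(r):
            out[j] += proj * r_j
    return out


def local_power_top1(rows, m, p, rounds, v0):
    """Estimate the top right singular vector of the data matrix.

    rows   : full dataset, one list per sample
    m      : number of nodes (rows are split round-robin; an assumption)
    p      : local power iterations between two aggregations
    rounds : number of global aggregations (communication rounds)
    v0     : shared starting vector
    """
    blocks = [rows[i::m] for i in range(m)]
    v = normalize(v0)
    for _ in range(rounds):
        local_vectors = []
        for block in blocks:
            u = v
            for _ in range(p):  # p local steps, no communication
                u = normalize(gram_times(block, u))
            local_vectors.append(u)
        # One global aggregation: average the local iterates (assumed rule).
        v = normalize([sum(col) / m for col in zip(*local_vectors)])
    return v
```

With `m=1` and `p=1` the routine reduces to ordinary centralized power iteration, which gives a convenient reference when checking the distributed result. Each aggregation costs one communication round regardless of `p`, which is the quantity the abstract says is reduced.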
Improved Spectral Imaging Microscopy for Cultural Heritage through Oblique Illumination
Title | Improved Spectral Imaging Microscopy for Cultural Heritage through Oblique Illumination |
Authors | Lindsay Oakley, Stephanie Zaleski, Billie Males, Ollie Cossairt, Marc Walton |
Abstract | This work presents the development of a flexible microscopic chemical imaging platform for cultural heritage that utilizes wavelength-tunable oblique illumination from a point source to obtain per-pixel reflectance spectra in the VIS-NIR range. The microscope light source can be adjusted on two axes allowing for a hemisphere of possible illumination directions. The synthesis of multiple illumination angles allows for the calculation of surface normal vectors, similar to phase gradients, and axial optical sectioning. The extraction of spectral reflectance images with high spatial resolutions from these data is demonstrated through the analysis of a replica cross-section, created from known painting reference materials, as well as a sample extracted from a painting by Pablo Picasso entitled La Miséreuse accroupie (1902). These case studies show the rich microscale molecular information that may be obtained using this microscope and how the instrument overcomes challenges for spectral analysis commonly encountered on works of art with complex matrices composed of both inorganic minerals and organic lakes. |
Tasks | |
Published | 2020-01-01 |
URL | https://arxiv.org/abs/2001.00817v1 |
https://arxiv.org/pdf/2001.00817v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-spectral-imaging-microscopy-for |
Repo | |
Framework | |
Inverting Gradients – How easy is it to break privacy in federated learning?
Title | Inverting Gradients – How easy is it to break privacy in federated learning? |
Authors | Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, Michael Moeller |
Abstract | The idea of federated learning is to collaboratively train a neural network on a server. Each user receives the current weights of the network and in turn sends parameter updates (gradients) based on local data. This protocol has been designed not only to train neural networks data-efficiently, but also to provide privacy benefits for users, as their input data remains on device and only parameter gradients are shared. In this paper we show that sharing parameter gradients is by no means secure: By exploiting a cosine similarity loss along with optimization methods from adversarial attacks, we are able to faithfully reconstruct images at high resolution from the knowledge of their parameter gradients, and demonstrate that such a break of privacy is possible even for trained deep networks. Moreover, we analyze the effects of architecture as well as parameters on the difficulty of reconstructing the input image, prove that any input to a fully connected layer can be reconstructed analytically independent of the remaining architecture, and show numerically that even averaging gradients over several iterations or several images does not protect the user’s privacy in federated learning applications in computer vision. |
Tasks | |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14053v1 |
https://arxiv.org/pdf/2003.14053v1.pdf | |
PWC | https://paperswithcode.com/paper/inverting-gradients-how-easy-is-it-to-break |
Repo | |
Framework | |
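The claim in the abstract above that the input to a fully connected layer can be recovered analytically from its gradients can be illustrated with a short sketch. It uses the standard argument for a layer $y = Wx + b$: since $\partial L/\partial W_{ij} = (\partial L/\partial y_i)\,x_j$ and $\partial L/\partial b_i = \partial L/\partial y_i$, dividing a row of the weight gradient by the matching bias gradient returns $x$. The sketch assumes the layer has a bias, that at least one bias gradient is nonzero, and that the gradient comes from a single sample; the squared-error loss and all function names are illustrative choices, not the paper's setup.

```python
def linear_forward(W, b, x):
    """Compute y = W x + b for a weight matrix given as a list of rows."""
    return [sum(w * x_j for w, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def linear_gradients(W, b, x, target):
    """Gradients of 0.5 * ||W x + b - target||^2 with respect to W and b."""
    y = linear_forward(W, b, x)
    dy = [y_i - t_i for y_i, t_i in zip(y, target)]  # dL/dy
    grad_W = [[d * x_j for x_j in x] for d in dy]    # outer product dL/dy x^T
    grad_b = list(dy)                                # dL/db equals dL/dy
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b, eps=1e-12):
    """Recover x from the shared gradients of a single-sample linear layer."""
    # Use the output unit with the largest bias gradient for numerical stability.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if abs(grad_b[i]) < eps:
        raise ValueError("all bias gradients are (near) zero; x is not recoverable this way")
    return [g / grad_b[i] for g in grad_W[i]]
```

Note that nothing about the loss or the rest of the network enters `reconstruct_input`; only the two gradient tensors of the layer are needed, which matches the "independent of the remaining architecture" wording in the abstract.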
Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense
Title | Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense |
Authors | Guanlin Liu, Lifeng Lai |
Abstract | Due to the broad range of applications of the stochastic multi-armed bandit model, understanding the effects of adversarial attacks and designing bandit algorithms robust to attacks are essential for the safe application of this model. In this paper, we introduce a new class of attack named action-manipulation attack. In this attack, an adversary can change the action signal selected by the user. We show that without knowledge of mean rewards of arms, our proposed attack can manipulate the Upper Confidence Bound (UCB) algorithm, a widely used bandit algorithm, into pulling a target arm very frequently by spending only logarithmic cost. To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks when an upper bound for the total attack cost is given. We prove that our algorithm has a pseudo-regret upper bounded by $\mathcal{O}(\max\{\log T, A\})$, where $T$ is the total number of rounds and $A$ is the upper bound of the total attack cost. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08000v2 |
https://arxiv.org/pdf/2002.08000v2.pdf | |
PWC | https://paperswithcode.com/paper/action-manipulation-attacks-against |
Repo | |
Framework | |
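For readers unfamiliar with the UCB algorithm that the abstract above names as the attack target, the following is a minimal sketch of the textbook UCB1 rule (empirical mean plus a $\sqrt{2 \ln t / n}$ bonus). It shows only the unattacked learner; the paper's exact UCB variant, its attack strategy, and its robust algorithm are not given in the abstract and are not reproduced here. The function name and the callable-per-arm interface are assumptions.

```python
import math


def ucb1(reward_fns, horizon):
    """Run UCB1 for `horizon` rounds.

    reward_fns : list of zero-argument callables, one per arm, each
                 returning a reward (assumed to lie in [0, 1])
    Returns the pull counts and empirical means per arm.
    """
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize
        else:
            # Pick the arm with the largest upper confidence index.
            arm = max(
                range(k),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm]()
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means
```

In an action-manipulation setting as described in the abstract, the adversary would sit between the `arm` chosen here and the arm actually pulled, so the learner would update the statistics of `arm` with a reward drawn from a different arm.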
Inverse learning in Hilbert scales
Title | Inverse learning in Hilbert scales |
Authors | Abhishake Rastogi, Peter Mathé |
Abstract | We study the linear ill-posed inverse problem with noisy data in the statistical learning setting. Approximate reconstructions from random noisy data are sought with general regularization schemes in Hilbert scale. We discuss the rates of convergence for the regularized solution under the prior assumptions and a certain link condition. We express the error in terms of certain distance functions. For regression functions with smoothness given in terms of source conditions the error bound can then be explicitly established. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10208v1 |
https://arxiv.org/pdf/2002.10208v1.pdf | |
PWC | https://paperswithcode.com/paper/inverse-learning-in-hilbert-scales |
Repo | |
Framework | |
Exploiting Neuron and Synapse Filter Dynamics in Spatial Temporal Learning of Deep Spiking Neural Network
Title | Exploiting Neuron and Synapse Filter Dynamics in Spatial Temporal Learning of Deep Spiking Neural Network |
Authors | Haowen Fang, Amar Shrestha, Ziyi Zhao, Qinru Qiu |
Abstract | The recently discovered spatial-temporal information processing capability of bio-inspired spiking neural networks (SNNs) has enabled some interesting models and applications. However, designing large-scale, high-performance models is still a challenge due to the lack of robust training algorithms. A bio-plausible SNN model with spatial-temporal properties is a complex dynamic system. Each synapse and neuron behaves as a filter capable of preserving temporal information. Because such neuron dynamics and filter effects are ignored in existing training algorithms, the SNN degrades into a memoryless system and loses the ability of temporal signal processing. Furthermore, spike timing plays an important role in information representation, but conventional rate-based spike coding models only consider spike trains statistically, and discard information carried by their temporal structure. To address the above issues and exploit the temporal dynamics of SNNs, we formulate the SNN as a network of infinite impulse response (IIR) filters with neuron nonlinearity. We propose a training algorithm that is capable of learning spatial-temporal patterns by searching for the optimal synapse filter kernels and weights. The proposed model and training algorithm are applied to construct associative memories and classifiers for synthetic and public datasets including MNIST, NMNIST, DVS 128, etc., and their accuracy outperforms state-of-the-art approaches. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2003.02944v1 |
https://arxiv.org/pdf/2003.02944v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-neuron-and-synapse-filter-dynamics |
Repo | |
Framework | |
Towards Quantifying the Distance between Opinions
Title | Towards Quantifying the Distance between Opinions |
Authors | Saket Gurukar, Deepak Ajwani, Sourav Dutta, Juho Lauri, Srinivasan Parthasarathy, Alessandra Sala |
Abstract | Increasingly, critical decisions in public policy, governance, and business strategy rely on a deeper understanding of the needs and opinions of constituent members (e.g. citizens, shareholders). While it has become easier to collect a large number of opinions on a topic, there is a necessity for automated tools to help navigate the space of opinions. In such contexts, understanding and quantifying the similarity between opinions is key. We find that measures based solely on text similarity or on overall sentiment often fail to effectively capture the distance between opinions. Thus, we propose a new distance measure for capturing the similarity between opinions that leverages the nuanced observation that similar opinions express similar sentiment polarity on specific relevant entities-of-interest. Specifically, in an unsupervised setting, our distance measure achieves significantly better Adjusted Rand Index scores (up to 56x) and Silhouette coefficients (up to 21x) compared to existing approaches. Similarly, in a supervised setting, our opinion distance measure achieves considerably better accuracy (up to 20% increase) compared to extant approaches that rely on text similarity, stance similarity, and sentiment similarity. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09879v1 |
https://arxiv.org/pdf/2001.09879v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-quantifying-the-distance-between |
Repo | |
Framework | |
Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation
Title | Investigation and Analysis of Hyper and Hypo neuron pruning to selectively update neurons during Unsupervised Adaptation |
Authors | Vikramjit Mitra, Horacio Franco |
Abstract | Unseen or out-of-domain data can seriously degrade the performance of a neural network model, indicating the model’s failure to generalize to unseen data. Neural net pruning can not only help to reduce a model’s size but can improve the model’s generalization capacity as well. Pruning approaches look for low-salient neurons that are less contributive to a model’s decision and hence can be removed from the model. This work investigates if pruning approaches are successful in detecting neurons that are either high-salient (mostly active or hyper) or low-salient (barely active or hypo), and whether removal of such neurons can help to improve the model’s generalization capacity. Traditional blind adaptation techniques update either the whole or a subset of layers, but have never explored selectively updating individual neurons across one or more layers. Focusing on the fully connected layers of a convolutional neural network (CNN), this work shows that it may be possible to selectively adapt certain neurons (consisting of the hyper and the hypo neurons) first, followed by a full-network fine tuning. Using the task of automatic speech recognition, this work demonstrates how the removal of hyper and hypo neurons from a model can improve the model’s performance on out-of-domain speech data and how selective neuron adaptation can ensure improved performance when compared to traditional blind model adaptation. |
Tasks | Speech Recognition |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01755v1 |
https://arxiv.org/pdf/2001.01755v1.pdf | |
PWC | https://paperswithcode.com/paper/investigation-and-analysis-of-hyper-and-hypo |
Repo | |
Framework | |
Universal Domain Adaptation through Self Supervision
Title | Universal Domain Adaptation through Self Supervision |
Authors | Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Kate Saenko |
Abstract | Unsupervised domain adaptation methods traditionally assume that all source categories are present in the target domain. In practice, little may be known about the category overlap between the two domains. While some methods address target settings with either partial or open-set categories, they assume that the particular setting is known a priori. We propose a more universally applicable domain adaptation approach that can handle arbitrary category shift, called Domain Adaptative Neighborhood Clustering via Entropy optimization (DANCE). DANCE combines two novel ideas: First, as we cannot fully rely on source categories to learn features discriminative for the target, we propose a novel neighborhood clustering technique to learn the structure of the target domain in a self-supervised way. Second, we use entropy-based feature alignment and rejection to align target features with the source, or reject them as unknown categories based on their entropy. We show through extensive experiments that DANCE outperforms baselines across open-set, open-partial and partial domain adaptation settings. |
Tasks | Domain Adaptation, Partial Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.07953v2 |
https://arxiv.org/pdf/2002.07953v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-domain-adaptation-through-self |
Repo | |
Framework | |
Transition-Based Dependency Parsing using Perceptron Learner
Title | Transition-Based Dependency Parsing using Perceptron Learner |
Authors | Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking |
Abstract | Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learner, outperforms a baseline arc-standard parser. We beat the UAS of the MALT and LSTM parsers. We also give possible ways to address parsing of non-projective trees. |
Tasks | Dependency Parsing, Transition-Based Dependency Parsing |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08279v2 |
https://arxiv.org/pdf/2001.08279v2.pdf | |
PWC | https://paperswithcode.com/paper/transition-based-dependency-parsing-using |
Repo | |
Framework | |
Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
Title | Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks |
Authors | Jiangke Lin, Yi Yuan, Tianjia Shao, Kun Zhou |
Abstract | 3D Morphable Model (3DMM) based methods have achieved great success in recovering 3D face shapes from single-view images. However, the facial textures recovered by such methods lack the fidelity exhibited in the input images. Recent work demonstrates high-quality facial texture recovery with generative networks trained from a large-scale database of high-resolution UV maps of face textures, which is hard to prepare and not publicly available. In this paper, we introduce a method to reconstruct 3D facial shapes with high-fidelity textures from single-view images in-the-wild, without the need to capture a large-scale face texture database. The main idea is to refine the initial texture generated by a 3DMM based method with facial details from the input image. To this end, we propose to use graph convolutional networks to reconstruct the detailed colors for the mesh vertices instead of reconstructing the UV map. Experiments show that our method can generate high-quality results and outperforms state-of-the-art methods in both qualitative and quantitative comparisons. |
Tasks | 3D Face Reconstruction, Face Reconstruction |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05653v1 |
https://arxiv.org/pdf/2003.05653v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-high-fidelity-3d-face-reconstruction |
Repo | |
Framework | |