Paper Group ANR 696
Is Attention All What You Need? – An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Title | Is Attention All What You Need? – An Empirical Investigation on Convolution-Based Active Memory and Self-Attention |
Authors | Thomas Dowdell, Hongyu Zhang |
Abstract | The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by RNNs could be replaced by active-memory mechanisms. In this work, we evaluate whether various active-memory mechanisms could replace self-attention in a Transformer. Our experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling, but optimal results are mostly achieved by using both active-memory and self-attention mechanisms together. We also note that, for some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two. |
Tasks | Language Modelling |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.11959v2 |
https://arxiv.org/pdf/1912.11959v2.pdf | |
PWC | https://paperswithcode.com/paper/is-attention-all-what-you-need-an-empirical |
Repo | |
Framework | |
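The two mechanisms compared in the abstract above can be contrasted with a minimal sketch. It is an illustration under simplifying assumptions, not the authors' implementation: the attention function uses identity query/key/value projections, and "active memory" is reduced to a single same-padded 1-D convolution whose kernel is shared across feature channels. The function names `self_attention` and `conv_active_memory` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention over a sequence of vectors.

    Assumption: queries, keys and values are the inputs themselves
    (no learned projections), to keep the sketch short.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Each position scores every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Output is a weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


def conv_active_memory(x, kernel):
    """Same-padded 1-D convolution over positions.

    Every position is updated in parallel from a fixed local window,
    using one kernel shared by all feature channels (an assumption).
    """
    n, d = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for t in range(n):
        row = [0.0] * d
        for k, w in enumerate(kernel):
            s = t + k - half
            if 0 <= s < n:  # positions outside the sequence count as zeros
                for j in range(d):
                    row[j] += w * x[s][j]
        out.append(row)
    return out


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(sequence)
convolved = conv_active_memory(sequence, [1.0, 1.0, 1.0])
```

The attention output at each position depends on the whole sequence through data-dependent weights, whereas the convolution mixes only a fixed neighbourhood with weights that do not depend on the input.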
Stochastic Recursive Variance-Reduced Cubic Regularization Methods
Title | Stochastic Recursive Variance-Reduced Cubic Regularization Methods |
Authors | Dongruo Zhou, Quanquan Gu |
Abstract | Stochastic Variance-Reduced Cubic regularization (SVRC) algorithms have received increasing attention due to their improved gradient/Hessian complexities (i.e., number of queries to stochastic gradient/Hessian oracles) to find local minima for nonconvex finite-sum optimization. However, it is unclear whether existing SVRC algorithms can be further improved. Moreover, the semi-stochastic Hessian estimator adopted in existing SVRC algorithms prevents the use of Hessian-vector product-based fast cubic subproblem solvers, which makes SVRC algorithms computationally intractable for high-dimensional problems. In this paper, we first present a Stochastic Recursive Variance-Reduced Cubic regularization method (SRVRC) using recursively updated semi-stochastic gradient and Hessian estimators. It enjoys improved gradient and Hessian complexities to find an $(\epsilon, \sqrt{\epsilon})$-approximate local minimum, and outperforms the state-of-the-art SVRC algorithms. Built upon SRVRC, we further propose a Hessian-free SRVRC algorithm, namely SRVRC$_{\text{free}}$, which only requires stochastic gradient and Hessian-vector product computations, and achieves $\tilde O(dn\epsilon^{-2} \land d\epsilon^{-3})$ runtime complexity, where $n$ is the number of component functions in the finite-sum structure, $d$ is the problem dimension, and $\epsilon$ is the optimization precision. This outperforms the best-known runtime complexity $\tilde O(d\epsilon^{-3.5})$ achieved by the stochastic cubic regularization algorithm proposed in Tripuraneni et al. 2018. |
Tasks | |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1901.11518v2 |
https://arxiv.org/pdf/1901.11518v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-recursive-variance-reduced-cubic |
Repo | |
Framework | |
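For readers unfamiliar with the notation in the abstract above, the following block writes out the generic cubic-regularized Newton step and the optimality notion it targets. This is the textbook form, not a transcription of the paper: the symbols $g_t$, $U_t$ and $M_t$ (gradient estimate, Hessian estimate, cubic penalty parameter) are assumed names, and problem-dependent constants such as the Hessian Lipschitz constant are omitted.

```latex
% Cubic-regularized Newton step (generic form; symbols are assumptions)
h_t = \arg\min_{h \in \mathbb{R}^d} \;
      \langle g_t, h \rangle
      + \tfrac{1}{2} \langle U_t h, h \rangle
      + \tfrac{M_t}{6} \|h\|_2^3 ,
\qquad x_{t+1} = x_t + h_t .

% (\epsilon, \sqrt{\epsilon})-approximate local minimum of F (constants omitted)
\|\nabla F(x)\|_2 \le \epsilon ,
\qquad
\lambda_{\min}\!\big(\nabla^2 F(x)\big) \ge -\sqrt{\epsilon} .

% The wedge in the runtime bound denotes a minimum
a \land b = \min\{a, b\} .
```

A Hessian-free solver only needs products $U_t h$ to evaluate and minimize the subproblem, which is why Hessian-vector product access is sufficient for the second algorithm described in the abstract.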
Overlapping Community Detection with Graph Neural Networks
Title | Overlapping Community Detection with Graph Neural Networks |
Authors | Oleksandr Shchur, Stephan Günnemann |
Abstract | Community detection is a fundamental problem in machine learning. While deep learning has shown great promise in many graph-related tasks, developing neural models for community detection has received surprisingly little attention. The few existing approaches focus on detecting disjoint communities, even though communities in real graphs are well known to be overlapping. We address this shortcoming and propose a graph neural network (GNN) based model for overlapping community detection. Despite its simplicity, our model outperforms the existing baselines by a large margin in the task of community recovery. We establish through an extensive experimental evaluation that the proposed model is effective, scalable and robust to hyperparameter settings. We also perform an ablation study that confirms that the GNN is the key ingredient to the power of the proposed model. |
Tasks | Community Detection |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12201v1 |
https://arxiv.org/pdf/1909.12201v1.pdf | |
PWC | https://paperswithcode.com/paper/overlapping-community-detection-with-graph-1 |
Repo | |
Framework | |
Three dimensional blind image deconvolution for fluorescence microscopy using generative adversarial networks
Title | Three dimensional blind image deconvolution for fluorescence microscopy using generative adversarial networks |
Authors | Soonam Lee, Shuo Han, Paul Salama, Kenneth W. Dunn, Edward J. Delp |
Abstract | Because of image blurring, image deconvolution is often used for studying biological structures in fluorescence microscopy. Fluorescence microscopy image volumes inherently suffer from intensity inhomogeneity and blur, and are corrupted by various types of noise, all of which degrade image quality at deeper tissue depths. Therefore, quantitative analysis of fluorescence microscopy in deeper tissue remains a challenge. This paper presents a three dimensional blind image deconvolution method for fluorescence microscopy using 3-way spatially constrained cycle-consistent adversarial networks. The restored volumes of the proposed deconvolution method and other well-known deconvolution methods, denoising methods, and an inhomogeneity correction method are visually and numerically evaluated. Experimental results indicate that the proposed method can restore and improve the quality of blurred and noisy deep-depth microscopy images both visually and quantitatively. |
Tasks | Denoising, Image Deconvolution |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09974v1 |
http://arxiv.org/pdf/1904.09974v1.pdf | |
PWC | https://paperswithcode.com/paper/three-dimensional-blind-image-deconvolution |
Repo | |
Framework | |
Few-Shot Learning via Saliency-guided Hallucination of Samples
Title | Few-Shot Learning via Saliency-guided Hallucination of Samples |
Authors | Hongguang Zhang, Jing Zhang, Piotr Koniusz |
Abstract | Learning new concepts from a few samples is a standard challenge in computer vision. The main directions to improve the learning ability of few-shot training models include (i) robust similarity learning and (ii) generating or hallucinating additional data from the limited existing samples. In this paper, we follow the latter direction and present a novel data hallucination model. Currently, most datapoint generators contain a specialized network (e.g., a GAN) tasked with hallucinating new datapoints, thus requiring large amounts of annotated data for their training in the first place. In this paper, we propose a novel less-costly hallucination method for few-shot learning which utilizes saliency maps. To this end, we employ a saliency network to obtain the foregrounds and backgrounds of available image samples and feed the resulting maps into a two-stream network to hallucinate datapoints directly in the feature space from viable foreground-background combinations. To the best of our knowledge, we are the first to leverage saliency maps for such a task and we demonstrate their usefulness in hallucinating additional datapoints for few-shot learning. Our proposed network achieves the state of the art on publicly available datasets. |
Tasks | Few-Shot Learning |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03472v1 |
http://arxiv.org/pdf/1904.03472v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-via-saliency-guided |
Repo | |
Framework | |
Additional Baseline Metrics for the paper “Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification”
Title | Additional Baseline Metrics for the paper “Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification” |
Authors | Claudio Ferrari, Stefano Berretti, Alberto Del Bimbo |
Abstract | In this report, we provide additional and corrected results for the paper “Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification”. After further investigations, we discovered and corrected wrongly labeled images and incorrect identities. This forced us to re-generate the evaluation protocol for the new data; in doing so, we also reproduced and extended the experimental results with other standard metrics and measures used in the literature. The reader can refer to the original paper for additional details regarding the data collection procedure and recognition pipeline. |
Tasks | Face Identification |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.03804v1 |
http://arxiv.org/pdf/1902.03804v1.pdf | |
PWC | https://paperswithcode.com/paper/additional-baseline-metrics-for-the-paper |
Repo | |
Framework | |
A Deep Optimization Approach for Image Deconvolution
Title | A Deep Optimization Approach for Image Deconvolution |
Authors | Zhijian Luo, Siyu Chen, Yuntao Qian |
Abstract | In blind image deconvolution, priors are often leveraged to constrain the solution space, so as to alleviate the under-determinacy. Priors which are trained separately from the task of deconvolution tend to be unstable or ineffective. We propose the Golf Optimizer, a novel but simple form of network that learns deep priors from data with better propagation behavior. Like playing golf, our method first estimates an aggressive propagation towards the optimum using one network, and then recurrently applies a residual CNN to learn the gradient of the prior for delicate correction of the restoration. Experiments show that our network achieves competitive performance on the GoPro dataset, and our model is extremely lightweight compared with state-of-the-art works. |
Tasks | Image Deconvolution |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07516v1 |
http://arxiv.org/pdf/1904.07516v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-optimization-approach-for-image |
Repo | |
Framework | |
DECoVaC: Design of Experiments with Controlled Variability Components
Title | DECoVaC: Design of Experiments with Controlled Variability Components |
Authors | Thomas Boquet, Laure Delisle, Denis Kochetkov, Nathan Schucher, Parmida Atighehchian, Boris Oreshkin, Julien Cornebise |
Abstract | Reproducible research in Machine Learning has seen a salutary abundance of progress lately: workflows, transparency, and statistical analysis of validation and test performance. We build on these efforts and take them further. We offer a principled experimental design methodology, based on linear mixed models, to study and separate the effects of multiple factors of variation in machine learning experiments. This approach makes it possible to account for the effects of architecture, optimizer, hyper-parameters, intentional randomization, as well as unintended lack of determinism across reruns. We illustrate that methodology by analyzing Matching Networks, Prototypical Networks and TADAM on the miniImagenet dataset. |
Tasks | |
Published | 2019-09-21 |
URL | https://arxiv.org/abs/1909.09859v1 |
https://arxiv.org/pdf/1909.09859v1.pdf | |
PWC | https://paperswithcode.com/paper/190909859 |
Repo | |
Framework | |
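The linear mixed models named in the abstract above have the following general form. The block shows the standard textbook model only; which experimental factors the authors treat as fixed or random effects is not stated in the abstract, so the reading in the comments is an assumption for illustration.

```latex
% General linear mixed model
% y: vector of observed performance scores, one per training run
% X, \beta: design matrix and coefficients of the fixed effects
%           (assumed example: architecture, optimizer)
% Z, u: design matrix and coefficients of the random effects
%           (assumed example: random seed, rerun)
y = X\beta + Zu + \varepsilon ,
\qquad u \sim \mathcal{N}(0, G) ,
\qquad \varepsilon \sim \mathcal{N}(0, R) .
```

Fitting such a model separates the variance attributed to each random factor (the entries of $G$) from the residual variance $R$, which is what allows the effects of different sources of variation to be studied separately.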
On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects
Title | On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects |
Authors | Linshan Jiang, Rui Tan, Xin Lou, Guosheng Lin |
Abstract | The Internet of Things (IoT) will be a main data generation infrastructure for achieving better system intelligence. This paper considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a better machine learning model based on the data samples contributed by a number of IoT objects, while the confidentiality of the raw forms of the training data is protected against the coordinator. Existing distributed machine learning and data encryption approaches incur significant computation and communication overhead, rendering them ill-suited for resource-constrained IoT objects. We study an approach that applies independent Gaussian random projection at each IoT object to obfuscate data and trains a deep neural network at the coordinator based on the projected data from the IoT objects. This approach introduces light computation overhead to the IoT objects and moves most workload to the coordinator that can have sufficient computing resources. Although the independent projections performed by the IoT objects address the potential collusion between the curious coordinator and some compromised IoT objects, they significantly increase the complexity of the projected data. In this paper, we leverage the superior learning capability of deep learning in capturing sophisticated patterns to maintain good learning performance. Extensive comparative evaluation shows that this approach outperforms other lightweight approaches that apply additive noisification for differential privacy and/or support vector machines for learning in the applications with light data pattern complexities. |
Tasks | |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.05197v1 |
http://arxiv.org/pdf/1902.05197v1.pdf | |
PWC | https://paperswithcode.com/paper/on-lightweight-privacy-preserving |
Repo | |
Framework | |
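The obfuscation step described in the abstract above, in which each device applies its own independent Gaussian random projection before sending data to the coordinator, can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `make_projection` and `project`, the per-device seeds, and the `1/sqrt(out_dim)` scaling are all assumptions.

```python
import random


def make_projection(in_dim, out_dim, seed):
    """Draw an out_dim x in_dim matrix with i.i.d. Gaussian entries.

    Each device is assumed to keep its own seed (and therefore its
    own matrix) private from the coordinator.
    """
    rng = random.Random(seed)
    scale = 1.0 / out_dim ** 0.5  # common variance normalisation (assumed)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, sample):
    """Apply the projection: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, sample)) for row in matrix]


# Two hypothetical devices projecting the same raw sample.
raw_sample = [1.0, 2.0, 3.0, 4.0]
device_a = make_projection(in_dim=4, out_dim=3, seed=1)
device_b = make_projection(in_dim=4, out_dim=3, seed=2)
sent_by_a = project(device_a, raw_sample)
sent_by_b = project(device_b, raw_sample)
```

Because each device uses a different matrix, the same raw sample reaches the coordinator in a different form from each device. This is the added data complexity that the abstract says the deep neural network has to absorb.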
A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces
Title | A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces |
Authors | Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić |
Abstract | Distributional word vectors have recently been shown to encode many of the human biases, most notably gender and racial biases, and models for attenuating such biases have consequently been proposed. However, existing models and studies (1) operate on under-specified and mutually differing bias definitions, (2) are tailored for a particular bias (e.g., gender bias) and (3) have been evaluated inconsistently and non-rigorously. In this work, we introduce a general framework for debiasing word embeddings. We operationalize the definition of a bias by discerning two types of bias specification: explicit and implicit. We then propose three debiasing models that operate on explicit or implicit bias specifications and that can be composed towards more robust debiasing. Finally, we devise a full-fledged evaluation framework in which we couple existing bias metrics with newly proposed ones. Experimental findings across three embedding methods suggest that the proposed debiasing models are robust and widely applicable: they often completely remove the bias both implicitly and explicitly without degradation of semantic information encoded in any of the input distributional spaces. Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications. |
Tasks | Word Embeddings |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06092v2 |
https://arxiv.org/pdf/1909.06092v2.pdf | |
PWC | https://paperswithcode.com/paper/a-general-framework-for-implicit-and-explicit |
Repo | |
Framework | |
Learning with Collaborative Neural Network Group by Reflection
Title | Learning with Collaborative Neural Network Group by Reflection |
Authors | Liyao Gao, Zehua Cheng |
Abstract | In current neural network engineering, training on large-scale learning tasks usually not only requires a large neural network with a complex training process, but also makes it difficult to find an explanation for real applications. In this paper, we introduce the Collaborative Neural Network Group (CNNG). CNNG is a series of neural networks that work collaboratively to handle different tasks separately within the same learning system. It is evolved from a single neural network by reflection. Thus, based on the different situations extracted by the algorithm, the CNNG can apply different strategies when processing the input. The patterns of the chosen strategies can be observed by humans, which makes deep learning more understandable. In our implementation, the CNNG is composed of several relatively small neural networks. We provide a series of experiments to evaluate the performance of CNNG compared with other learning methods. The CNNG achieves higher accuracy at a much lower training cost. We reduce the error rate by 74.5% and reach an accuracy of 99.45% on MNIST with three feedforward networks (4 layers) in one training epoch. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02433v2 |
http://arxiv.org/pdf/1901.02433v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-with-collaborative-neural-network |
Repo | |
Framework | |
Invisible Backdoor Attacks Against Deep Neural Networks
Title | Invisible Backdoor Attacks Against Deep Neural Networks |
Authors | Shaofeng Li, Benjamin Zi Hao Zhao, Jiahao Yu, Minhui Xue, Dali Kaafar, Haojin Zhu |
Abstract | Deep neural networks (DNNs) have been proven vulnerable to backdoor attacks, where hidden features (patterns) trained into a normal model, and activated only by some specific inputs (called triggers), trick the model into producing unexpected behavior. In this paper, we design an optimization framework to create covert and scattered triggers for backdoor attacks, called invisible backdoors, where triggers can amplify the specific neuron activation while being invisible to both backdoor detection methods and human inspection. We use the Perceptual Adversarial Similarity Score (PASS) (Rozsa et al., 2016) to define invisibility for human users and apply $L_2$ and $L_0$ regularization in the optimization process to hide the trigger within the input data. We show that the proposed invisible backdoors can be fairly effective across various DNN models as well as three datasets (CIFAR-10, CIFAR-100, and GTSRB) by measuring their attack success rates and invisibility scores. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02742v1 |
https://arxiv.org/pdf/1909.02742v1.pdf | |
PWC | https://paperswithcode.com/paper/invisible-backdoor-attacks-against-deep |
Repo | |
Framework | |
Learning Generative Models of Structured Signals from Their Superposition Using GANs with Application to Denoising and Demixing
Title | Learning Generative Models of Structured Signals from Their Superposition Using GANs with Application to Denoising and Demixing |
Authors | Mohammadreza Soltani, Swayambhoo Jain, Abhinav Sambasivan |
Abstract | Recently, Generative Adversarial Networks (GANs) have emerged as a popular alternative for modeling complex high dimensional distributions. Most of the existing works implicitly assume that clean samples from the target distribution are easily available. However, in many applications, this assumption is violated. In this paper, we consider the observation setting in which the samples from the target distribution are given by the superposition of two structured components, and we leverage GANs for learning the structure of the components. We propose two novel frameworks: denoising-GAN and demixing-GAN. The denoising-GAN assumes access to clean samples from the second component and tries to learn the other distribution, whereas the demixing-GAN learns the distributions of both components at the same time. Through extensive numerical experiments, we demonstrate that the proposed frameworks can generate clean samples from unknown distributions, and provide competitive performance in tasks such as denoising, demixing, and compressive sensing. |
Tasks | Compressive Sensing, Denoising |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04664v1 |
http://arxiv.org/pdf/1902.04664v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-generative-models-of-structured |
Repo | |
Framework | |
A Study on various state of the art of the Art Face Recognition System using Deep Learning Techniques
Title | A Study on various state of the art of the Art Face Recognition System using Deep Learning Techniques |
Authors | Sukhada Chokkadi, Sannidhan M S, Sudeepa K B, Abhir Bhandary |
Abstract | Given the very large data repositories now available and access to highly advanced hardware, facial identification systems have evolved enormously over the past few decades. Sketch recognition is one of the most important areas to have emerged as an integral component adopted by law enforcement agencies in current forensic science practice. Matching derived sketches to face photographs is a difficult task, because the sketches are produced from the verbal description given by an eyewitness to the crime and, owing to natural human error, may lack sensitive details that are present in the photograph. A substantial amount of the research carried out in this area until recently used recognition systems based on traditional extraction and classification models. Very recently, however, a few research works have focused on deep learning techniques, taking advantage of learned models for feature extraction and classification to overcome potential domain challenges. The first part of this review paper focuses on deep learning techniques used in face recognition and matching, which have improved the accuracy of face recognition through training on huge datasets. The paper also includes a survey of different techniques used to match composite sketches to human images, including the component-based representation approach and the automatic composite sketch recognition technique. |
Tasks | Face Recognition, Sketch Recognition |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08426v1 |
https://arxiv.org/pdf/1911.08426v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-on-various-state-of-the-art-of-the |
Repo | |
Framework | |
Joint Emotion Label Space Modelling for Affect Lexica
Title | Joint Emotion Label Space Modelling for Affect Lexica |
Authors | Luna De Bruyne, Pepa Atanasova, Isabelle Augenstein |
Abstract | Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection. However, methodological issues emerge when employing them: lexica are often not very extensive, and the way they are constructed can vary widely – from lab conditions to crowdsourced approaches and distant supervision. Furthermore, both categorical frameworks and dimensional frameworks coexist, in which theorists provide many different sets of categorical labels or dimensional axes. The heterogeneous nature of the resulting emotion detection resources results in a need for a unified approach to utilising them. This paper contributes to the field of emotion analysis in NLP by a) presenting the first study to unify existing emotion detection resources automatically and thus learn more about the relationships between them; b) exploring the use of existing lexica for the above-mentioned task; c) presenting an approach to automatically combining emotion lexica, namely by a multi-view variational auto-encoder (VAE), which facilitates the mapping of datasets into a joint emotion label space. We test the utility of joint emotion lexica by using them as additional features in state-of-the-art emotion detection models. Our overall findings are that emotion lexica can offer complementary information to even extremely large pre-trained models such as BERT. The performance of our models is comparable to state-of-the-art models that are specifically engineered for certain datasets, and even outperforms the state of the art on four datasets. |
Tasks | Emotion Recognition |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08782v1 |
https://arxiv.org/pdf/1911.08782v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-emotion-label-space-modelling-for |
Repo | |
Framework | |