Paper Group ANR 518
Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees
Title | Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees |
Authors | Richeng Jin, Yufan Huang, Xiaofan He, Huaiyu Dai, Tianfu Wu |
Abstract | Federated learning (FL) has emerged as a prominent distributed learning paradigm. FL entails some pressing needs for developing novel parameter estimation approaches with theoretical guarantees of convergence, which are also communication efficient, differentially private and Byzantine resilient in the heterogeneous data distribution settings. Quantization-based SGD solvers have been widely adopted in FL and the recently proposed SIGNSGD with majority vote shows a promising direction. However, no existing methods enjoy all the aforementioned properties. In this paper, we propose an intuitively-simple yet theoretically-sound method based on SIGNSGD to bridge the gap. We present Stochastic-Sign SGD which utilizes novel stochastic-sign based gradient compressors enabling the aforementioned properties in a unified framework. We also present an error-feedback variant of the proposed Stochastic-Sign SGD which further improves the learning performance in FL. We test the proposed method with extensive experiments using deep neural networks on the MNIST dataset. The experimental results corroborate the effectiveness of the proposed method. |
Tasks | Quantization |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10940v1 |
https://arxiv.org/pdf/2002.10940v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-sign-sgd-for-federated-learning |
Repo | |
Framework | |
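The abstract above builds on SIGNSGD with majority vote, where each worker sends one bit per gradient coordinate and the server steps along the majority sign. A minimal sketch of that idea follows. The `stochastic_sign` rule, its `bound` parameter, and all function names are illustrative assumptions, not the compressors defined in the paper.

```python
import random

def sign(x):
    """Deterministic sign: +1 for non-negative values, -1 otherwise."""
    return 1 if x >= 0 else -1

def stochastic_sign(x, bound, rng):
    """Illustrative stochastic 1-bit compressor (an assumption, not the paper's rule).

    Returns +1 with probability (bound + x) / (2 * bound), so E[output] = x / bound.
    """
    x = max(-bound, min(bound, x))  # clip so the probability stays in [0, 1]
    return 1 if rng.random() < (bound + x) / (2 * bound) else -1

def majority_vote(worker_signs):
    """Aggregate per-coordinate signs from all workers; a tie resolves to +1."""
    return [sign(sum(column)) for column in zip(*worker_signs)]

def sign_sgd_step(params, worker_grads, lr, compress):
    """One round: every worker compresses its gradient, the server steps along the vote."""
    votes = majority_vote([[compress(g) for g in grad] for grad in worker_grads])
    return [p - lr * v for p, v in zip(params, votes)]

# Three hypothetical workers, two parameters.
grads = [[0.5, -0.2], [0.1, -0.9], [-0.3, -0.4]]
params = sign_sgd_step([1.0, 1.0], grads, lr=0.1, compress=sign)
```

To try the randomized variant, pass `compress=lambda g: stochastic_sign(g, 1.0, random.Random(0))` style closures instead of `sign`; the server-side vote is unchanged.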
FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution
Title | FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution |
Authors | Zhanpeng Zhang, Kaipeng Zhang |
Abstract | Real-time semantic segmentation is desirable in many robotic applications with limited computation resources. One challenge of semantic segmentation is to deal with the object scale variations and leverage the context. How to perform multi-scale context aggregation within a limited computation budget is important. In this paper, firstly, we introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP). It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information. On the other hand, for runtime efficiency, state-of-the-art methods will quickly decrease the spatial size of the inputs or feature maps in the early network stages. The final high-resolution result is usually obtained by a non-parametric up-sampling operation (e.g. bilinear interpolation). In contrast, we rethink this pipeline and treat it as a super-resolution process. We use an optimized super-resolution operation in the up-sampling step and improve the accuracy, especially in the sub-sampled input image scenario for real-time applications. By fusing the above two improvements, our methods provide a better latency-accuracy trade-off than the other state-of-the-art methods. In particular, we achieve 68.4% mIoU at 84 fps on the Cityscapes test set with a single NVIDIA Titan X (Maxwell) GPU card. The proposed module can be plugged into any feature extraction CNN and benefits from the CNN structure development. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation, Super-Resolution |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.03913v1 |
https://arxiv.org/pdf/2003.03913v1.pdf | |
PWC | https://paperswithcode.com/paper/farsee-net-real-time-semantic-segmentation-by |
Repo | |
Framework | |
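The CF-ASPP module named above is built from atrous (dilated) convolutions. As a rough illustration of why cascading them aggregates context cheaply, the sketch below implements a generic 1D atrous convolution and the receptive field of a stride-1 cascade. It is not the CF-ASPP module itself, and the function names are hypothetical.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation whose kernel taps are `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # distance between the first and last tap
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

def receptive_field(kernel_size, dilations):
    """Receptive field of a cascade of stride-1 dilated convolutions."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field

# A 3-tap kernel with dilation 2 compares samples that are 4 positions apart.
out = atrous_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], dilation=2)
```

With 3-tap kernels, a cascade with dilations 1, 2 and 4 covers 15 input samples while using only 9 weights, which is the kind of trade-off a lightweight context module relies on.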
The large learning rate phase of deep learning: the catapult mechanism
Title | The large learning rate phase of deep learning: the catapult mechanism |
Authors | Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Sohl-Dickstein, Guy Gur-Ari |
Abstract | The choice of initial learning rate can have a profound effect on the performance of deep networks. We present a class of neural networks with solvable training dynamics, and confirm their predictions empirically in practical deep learning settings. The networks exhibit sharply distinct behaviors at small and large learning rates. The two regimes are separated by a phase transition. In the small learning rate phase, training can be understood using the existing theory of infinitely wide neural networks. At large learning rates the model captures qualitatively distinct phenomena, including the convergence of gradient descent dynamics to flatter minima. One key prediction of our model is a narrow range of large, stable learning rates. We find good agreement between our model’s predictions and training dynamics in realistic deep learning settings. Furthermore, we find that the optimal performance in such settings is often found in the large learning rate phase. We believe our results shed light on characteristics of models trained at different learning rates. In particular, they fill a gap between existing wide neural network theory, and the nonlinear, large learning rate, training dynamics relevant to practice. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02218v1 |
https://arxiv.org/pdf/2003.02218v1.pdf | |
PWC | https://paperswithcode.com/paper/the-large-learning-rate-phase-of-deep |
Repo | |
Framework | |
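The large learning rate behavior described above can be seen in a very small toy problem. The sketch below runs gradient descent on a two-parameter model f = u * v with loss f**2 / 2; this scalar model, the starting point, and the learning rate are assumptions chosen for illustration and are not taken from the paper. The quantity u**2 + v**2 is the tangent kernel of f and equals the top Hessian eigenvalue of the loss wherever f = 0.

```python
def train_two_param_model(u, v, lr, steps):
    """Gradient descent on L = f**2 / 2 with f = u * v (target output 0).

    Returns the loss and the kernel u**2 + v**2 recorded before every update.
    """
    losses, kernels = [], []
    for _ in range(steps):
        f = u * v
        losses.append(0.5 * f * f)
        kernels.append(u * u + v * v)
        # dL/du = f * v and dL/dv = f * u; both parameters are updated simultaneously.
        u, v = u - lr * f * v, v - lr * f * u
    return losses, kernels

# lr * kernel starts above 2, so a quadratic approximation of the loss would diverge.
losses, kernels = train_two_param_model(u=1.0, v=0.1, lr=3.0, steps=200)
```

In this run the loss first grows, the kernel shrinks until `lr * kernel` drops below 2, and training then converges to a flatter point than the one it started from.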
Semi-Supervised StyleGAN for Disentanglement Learning
Title | Semi-Supervised StyleGAN for Disentanglement Learning |
Authors | Weili Nie, Tero Karras, Animesh Garg, Shoubhik Debnath, Anjul Patney, Ankit B. Patel, Anima Anandkumar |
Abstract | Disentanglement learning is crucial for obtaining disentangled representations and controllable generation. Current disentanglement methods face several inherent limitations: difficulty with high-resolution images, a primary focus on learning disentangled representations, and non-identifiability due to the unsupervised setting. To alleviate these limitations, we design new architectures and loss functions based on StyleGAN (Karras et al., 2019), for semi-supervised high-resolution disentanglement learning. We create two complex high-resolution synthetic datasets for systematic testing. We investigate the impact of limited supervision and find that using only 0.25% to 2.5% of labeled data is sufficient for good disentanglement on both synthetic and real datasets. We propose new metrics to quantify generator controllability, and observe there may exist a crucial trade-off between disentangled representation learning and controllable generation. We also consider semantic fine-grained image editing to achieve better generalization to unseen images. |
Tasks | Representation Learning |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03461v1 |
https://arxiv.org/pdf/2003.03461v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-stylegan-for-disentanglement |
Repo | |
Framework | |
Global and Local Feature Learning for Ego-Network Analysis
Title | Global and Local Feature Learning for Ego-Network Analysis |
Authors | Fatemeh Salehi Rizi, Michael Granitzer, Konstantin Ziegler |
Abstract | In an ego-network, an individual (ego) organizes its friends (alters) in different groups (social circles). This social network can be efficiently analyzed after learning representations of the ego and its alters in a low-dimensional, real vector space. These representations are then easily exploited via statistical models for tasks such as social circle detection and prediction. Recent advances in language modeling via deep learning have inspired new methods for learning network representations. These methods can capture the global structure of networks. In this paper, we evolve these techniques to also encode the local structure of neighborhoods. Therefore, our local representations capture network features that are hidden in the global representation of large networks. We show that the task of social circle prediction benefits from a combination of global and local features generated by our technique. |
Tasks | Language Modelling, Learning Network Representations |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06685v1 |
https://arxiv.org/pdf/2002.06685v1.pdf | |
PWC | https://paperswithcode.com/paper/global-and-local-feature-learning-for-ego |
Repo | |
Framework | |
Joint Embedding in Named Entity Linking on Sentence Level
Title | Joint Embedding in Named Entity Linking on Sentence Level |
Authors | Wei Shi, Siyuan Zhang, Zhiwei Zhang, Hong Cheng, Jeffrey Xu Yu |
Abstract | Named entity linking maps an ambiguous mention in a document to an entity in a knowledge base. The task is challenging because a mention in a document can have multiple candidate entities. Linking is difficult when a mention appears multiple times in a document, since the contexts around its appearances can conflict. It is also difficult because training datasets are small, as mentions must be linked to their entities manually. Among the many studies reported in the literature, recent embedding methods learn entity vectors from the training dataset at the document level. To address these issues, we focus on linking entities for mentions at the sentence level, which reduces the noise introduced by different appearances of the same mention in a document, at the expense of having less information to use. We propose a new unified embedding method that maximizes the relationships learned from knowledge graphs. We confirm the effectiveness of our method in our experimental studies. |
Tasks | Entity Linking, Knowledge Graphs |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04936v1 |
https://arxiv.org/pdf/2002.04936v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-embedding-in-named-entity-linking-on |
Repo | |
Framework | |
PEL-BERT: A Joint Model for Protocol Entity Linking
Title | PEL-BERT: A Joint Model for Protocol Entity Linking |
Authors | Shoubin Li, Wenzao Cui, Yujiang Liu, Xuran Ming, Jun Hu, Yuanzhe Hu, Qing Wang |
Abstract | Pre-trained models such as BERT are widely used in NLP tasks and are fine-tuned to improve the performance of various NLP tasks consistently. Nevertheless, the fine-tuned BERT model trained on our protocol corpus still has a weak performance on the Entity Linking (EL) task. In this paper, we propose a model that joins a fine-tuned language model with an RFC Domain Model. Firstly, we design a Protocol Knowledge Base as the guideline for protocol EL. Secondly, we propose a novel model, PEL-BERT, to link named entities in protocols to categories in the Protocol Knowledge Base. Finally, we conduct a comprehensive study on the performance of pre-trained language models on descriptive texts and abstract concepts. Experimental results demonstrate that our model achieves state-of-the-art performance in EL on our annotated dataset, outperforming all the baselines. |
Tasks | Entity Linking, Language Modelling |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2002.00744v1 |
https://arxiv.org/pdf/2002.00744v1.pdf | |
PWC | https://paperswithcode.com/paper/pel-bert-a-joint-model-for-protocol-entity |
Repo | |
Framework | |
Random-walk Based Generative Model for Classifying Document Networks
Title | Random-walk Based Generative Model for Classifying Document Networks |
Authors | Takafumi J. Suzuki |
Abstract | Document networks are found in various collections of real-world data, such as citation networks, hyperlinked web pages, and online social networks. A large number of generative models have been proposed because they offer intuitive and useful pictures for analyzing document networks. Prominent examples are relational topic models, where documents are linked according to their topic similarities. However, existing generative models do not make full use of network structures because they are largely dependent on topic modeling of documents. In particular, centrality of graph nodes is missing in generative processes of previous models. In this paper, we propose a novel generative model for document networks by introducing random walkers on networks to integrate the node centrality into link generation processes. The developed method is evaluated in semi-supervised classification tasks with real-world citation networks. We show that the proposed model outperforms existing probabilistic approaches especially in detecting communities in connected networks. |
Tasks | Topic Models |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07380v1 |
https://arxiv.org/pdf/2001.07380v1.pdf | |
PWC | https://paperswithcode.com/paper/random-walk-based-generative-model-for |
Repo | |
Framework | |
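The abstract above uses random walkers to bring node centrality into link generation. One simple way to see how a walker measures centrality is its long-run visit frequency. The sketch below power-iterates a simple random walk on a small undirected graph; on a connected, non-bipartite undirected graph this converges to a distribution proportional to node degree. The graph and function name are hypothetical, and this is not the paper's generative model.

```python
def stationary_distribution(adjacency, steps=200):
    """Power-iterate the transition matrix of a simple random walk on an undirected graph."""
    nodes = sorted(adjacency)
    prob = {n: 1.0 / len(nodes) for n in nodes}  # start from the uniform distribution
    for _ in range(steps):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            share = prob[n] / len(adjacency[n])  # the walker leaves n along a uniformly chosen edge
            for m in adjacency[n]:
                nxt[m] += share
        prob = nxt
    return prob

# A triangle a-b-c with a pendant document d attached to c (hypothetical citation graph).
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
centrality = stationary_distribution(graph)
```

Here node `c` has degree 3 out of a total degree of 8, so the walker spends 3/8 of its time there, more than on any other node.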
A Comprehensive Review for Breast Histopathology Image Analysis Using Classical and Deep Neural Networks
Title | A Comprehensive Review for Breast Histopathology Image Analysis Using Classical and Deep Neural Networks |
Authors | Xiaomin Zhou, Chen Li, Md Mamunur Rahaman, Yudong Yao, Shiliang Ai, Changhao Sun, Xiaoyan Li, Qian Wang, Tao Jiang |
Abstract | Breast cancer is one of the most common and deadliest cancers among women. Since histopathological images contain sufficient phenotypic information, they play an indispensable role in the diagnosis and treatment of breast cancers. To improve the accuracy and objectivity of Breast Histopathological Image Analysis (BHIA), Artificial Neural Network (ANN) approaches are widely used in the segmentation and classification tasks of breast histopathological images. In this review, we present a comprehensive overview of the BHIA techniques based on ANNs. First of all, we categorize the BHIA systems into classical and deep neural networks for in-depth investigation. Then, the relevant studies based on BHIA systems are presented. After that, we analyze the existing models to discover the most suitable algorithms. Finally, publicly accessible datasets, along with their download links, are provided for the convenience of future researchers. |
Tasks | |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12255v1 |
https://arxiv.org/pdf/2003.12255v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comprehensive-review-for-breast |
Repo | |
Framework | |
Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda
Title | Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda |
Authors | Jake Gutierrez, Jacob Schrum |
Abstract | Generative Adversarial Networks (GANs) have demonstrated their ability to learn patterns in data and produce new exemplars similar to, but different from, their training set in several domains, including video games. However, GANs have a fixed output size, so creating levels of arbitrary size for a dungeon crawling game is difficult. GANs also have trouble encoding semantic requirements that make levels interesting and playable. This paper combines a GAN approach to generating individual rooms with a graph grammar approach to combining rooms into a dungeon. The GAN captures design principles of individual rooms, but the graph grammar organizes rooms into a global layout with a sequence of obstacles determined by a designer. Room data from The Legend of Zelda is used to train the GAN. This approach is validated by a user study, showing that GAN dungeons are as enjoyable to play as a level from the original game and as levels generated with a graph grammar alone. However, GAN dungeons have rooms considered more complex, and plain graph grammar dungeons are considered least complex and challenging. Only the GAN approach creates an extensive supply of both layouts and rooms, where rooms span across the spectrum of those seen in the training set to new creations merging design principles from multiple rooms. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.05065v1 |
https://arxiv.org/pdf/2001.05065v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-rooms-in |
Repo | |
Framework | |
Adaptive Teaching of Temporal Logic Formulas to Learners with Preferences
Title | Adaptive Teaching of Temporal Logic Formulas to Learners with Preferences |
Authors | Zhe Xu, Yuxin Chen, Ufuk Topcu |
Abstract | Machine teaching is an algorithmic framework for teaching a target hypothesis via a sequence of examples or demonstrations. We investigate machine teaching for temporal logic formulas – a novel and expressive hypothesis class amenable to time-related task specifications. In the context of teaching temporal logic formulas, an exhaustive search even for a myopic solution takes exponential time (with respect to the time span of the task). We propose an efficient approach for teaching parametric linear temporal logic formulas. Concretely, we derive a necessary condition for the minimal time length of a demonstration to eliminate a set of hypotheses. Utilizing this condition, we propose a myopic teaching algorithm by solving a sequence of integer programming problems. We further show that, under two notions of teaching complexity, the proposed algorithm has near-optimal performance. The results strictly generalize the previous results on teaching preference-based version space learners. We evaluate our algorithm extensively under a variety of learner types (i.e., learners with different preference models) and interactive protocols (e.g., batched and adaptive). The results show that the proposed algorithms can efficiently teach a given target temporal logic formula under various settings, and that there are significant gains of teaching efficacy when the teacher adapts to the learner’s current hypotheses or uses oracles. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09956v1 |
https://arxiv.org/pdf/2001.09956v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-teaching-of-temporal-logic-formulas |
Repo | |
Framework | |
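The teaching setup above can be illustrated with a tiny version space. In the sketch below the hypothesis class is the single parametric formula "p holds at some step in [0, tau]", a demonstration is a Boolean trace labelled by the target formula, and a myopic teacher greedily picks the demonstration that eliminates the most hypotheses. The exhaustive greedy search stands in for the integer programming formulation mentioned in the abstract, learner preferences are ignored, and all names are hypothetical.

```python
def eventually_within(trace, tau):
    """Evaluate F_[0,tau] p on a Boolean trace: p holds at some step 0..tau."""
    return any(trace[: tau + 1])

def eliminate(hypotheses, trace, label):
    """Keep only the parameters tau whose formula agrees with the labelled demonstration."""
    return [tau for tau in hypotheses if eventually_within(trace, tau) == label]

def greedy_teach(hypotheses, target, demonstrations):
    """Myopic teacher: repeatedly show the demonstration that leaves the fewest hypotheses."""
    remaining, lesson = list(hypotheses), []
    while len(remaining) > 1:
        best = min(
            demonstrations,
            key=lambda tr: len(eliminate(remaining, tr, eventually_within(tr, target))),
        )
        pruned = eliminate(remaining, best, eventually_within(best, target))
        if len(pruned) == len(remaining):
            break  # no demonstration separates the remaining hypotheses
        remaining = pruned
        lesson.append(best)
    return remaining, lesson

# Candidate traces: p becomes true exactly once, at step k.
traces = [[i == k for i in range(5)] for k in range(5)]
remaining, lesson = greedy_teach(range(5), target=2, demonstrations=traces)
```

For target `tau = 2`, two demonstrations suffice in this toy setting: a positive trace with p at step 2 rules out smaller values of tau, and a negative trace with p at step 3 rules out larger ones.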
Unsupervised Hierarchical Graph Representation Learning by Mutual Information Maximization
Title | Unsupervised Hierarchical Graph Representation Learning by Mutual Information Maximization |
Authors | Fei Ding, Xiaohong Zhang, Justin Sybrandt, Ilya Safro |
Abstract | Graph representation learning based on graph neural networks (GNNs) can greatly improve the performance of downstream tasks, such as node and graph classification. However, the general GNN models cannot aggregate node information in a hierarchical manner, and thus cannot effectively capture the structural features of graphs. In addition, most of the existing hierarchical graph representation learning methods are supervised and are therefore limited by the extreme cost of acquiring labeled data. To address these issues, we present an unsupervised graph representation learning method, Unsupervised Hierarchical Graph Representation (UHGR), which can generate hierarchical representations of graphs. This contrastive learning technique focuses on maximizing mutual information between “local” and high-level “global” representations, which enables us to learn the node embeddings and graph embeddings without any labeled data. To demonstrate the effectiveness of the proposed method, we perform the node and graph classification using the learned node and graph embeddings. The results show that the proposed method achieves comparable results to state-of-the-art supervised methods on several benchmarks. In addition, our visualization of hierarchical representations indicates that our method can capture meaningful and interpretable clusters. |
Tasks | Graph Classification, Graph Representation Learning, Representation Learning |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08420v1 |
https://arxiv.org/pdf/2003.08420v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-hierarchical-graph-1 |
Repo | |
Framework | |
Universal Function Approximation on Graphs using Multivalued Functions
Title | Universal Function Approximation on Graphs using Multivalued Functions |
Authors | Rickard Brüel-Gabrielsson |
Abstract | In this work we produce a framework for constructing universal function approximators on graph isomorphism classes. Additionally, we prove that this framework comes with a collection of theoretically desirable properties and enables novel analysis. We show how this allows us to outperform the state of the art on four different well-known datasets in graph classification and how our method can separate classes of graphs that other graph-learning methods cannot. Our approach is inspired by persistent homology, dependency parsing for Natural Language Processing, and multivalued functions. The complexity of the underlying algorithm is O(mn) and code is publicly available. |
Tasks | Dependency Parsing, Graph Classification |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06706v1 |
https://arxiv.org/pdf/2003.06706v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-function-approximation-on-graphs |
Repo | |
Framework | |
An End-to-End Graph Convolutional Kernel Support Vector Machine
Title | An End-to-End Graph Convolutional Kernel Support Vector Machine |
Authors | Padraig Corcoran |
Abstract | A novel kernel-based support vector machine (SVM) for graph classification is proposed. The SVM feature space mapping consists of a sequence of graph convolutional layers, which generates a vector space representation for each vertex, followed by a pooling layer which generates a reproducing kernel Hilbert space (RKHS) representation for the graph. The use of an RKHS offers the ability to implicitly operate in this space using a kernel function without the computational complexity of explicitly mapping into it. The proposed model is trained in a supervised end-to-end manner whereby the convolutional layers, the kernel function and SVM parameters are jointly optimized with respect to a regularized classification loss. This approach is distinct from existing kernel-based graph classification models, which instead use either feature engineering or unsupervised learning to define the kernel function. Experimental results demonstrate that the proposed model outperforms existing deep learning baseline models on a number of datasets. |
Tasks | Feature Engineering, Graph Classification |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00226v1 |
https://arxiv.org/pdf/2003.00226v1.pdf | |
PWC | https://paperswithcode.com/paper/an-end-to-end-graph-convolutional-kernel |
Repo | |
Framework | |
3D dynamic hand gestures recognition using the Leap Motion sensor and convolutional neural networks
Title | 3D dynamic hand gestures recognition using the Leap Motion sensor and convolutional neural networks |
Authors | Katia Lupinetti, Andrea Ranieri, Franca Giannini, Marina Monti |
Abstract | Defining methods for the automatic understanding of gestures is of paramount importance in many application contexts and in Virtual Reality applications for creating more natural and easy-to-use human-computer interaction methods. In this paper, we present a method for the recognition of a set of non-static gestures acquired through the Leap Motion sensor. The acquired gesture information is converted into color images, where the variation of hand joint positions during the gesture is projected onto a plane and temporal information is represented by the color intensity of the projected points. The classification of the gestures is performed using a deep Convolutional Neural Network (CNN). A modified version of the popular ResNet-50 architecture is adopted, obtained by removing the last fully connected layer and adding a new layer with as many neurons as the considered gesture classes. The method has been successfully applied to the existing reference dataset and preliminary tests have already been performed for the real-time recognition of dynamic gestures performed by users. |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01450v2 |
https://arxiv.org/pdf/2003.01450v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-dynamic-hand-gestures-recognition-using |
Repo | |
Framework | |
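The gesture encoding described above, projecting joint positions onto a plane and representing time by intensity, can be sketched as follows. This is a simplified single-channel version for one joint; the choice of the x-y plane, the grid size, and the linear intensity ramp are assumptions for illustration rather than the paper's exact encoding.

```python
def trajectory_to_image(points, size=8):
    """Rasterise a 3D joint trajectory onto the x-y plane; later samples get higher intensity.

    `points` is a time-ordered list of (x, y, z) positions; z is dropped by the projection.
    Returns a size x size grid of intensities in [0, 1], where 0.0 means "never visited".
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1.0  # avoid division by zero for a static coordinate
    y_span = (max(ys) - y_min) or 1.0
    image = [[0.0] * size for _ in range(size)]
    for t, (x, y, _z) in enumerate(points):
        col = min(size - 1, int((x - x_min) / x_span * size))
        row = min(size - 1, int((y - y_min) / y_span * size))
        image[row][col] = (t + 1) / len(points)  # temporal order encoded as intensity
    return image

# A hypothetical diagonal swipe sampled at three time steps.
image = trajectory_to_image([(0, 0, 0), (1, 1, 5), (2, 2, 9)], size=4)
```

An image built this way can be fed to a standard image classifier, which is what makes the fixed-size encoding convenient for variable-length gestures.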