Paper Group ANR 1078
Federated Learning with Non-IID Data
Title | Federated Learning with Non-IID Data |
Authors | Yue Zhao, Meng Li, Liangzhen Lai, Naveen Suda, Damon Civin, Vikas Chandra |
Abstract | Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to train models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to 55% for neural networks trained for highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover’s distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data. |
Tasks | |
Published | 2018-06-02 |
URL | http://arxiv.org/abs/1806.00582v1 |
PDF | http://arxiv.org/pdf/1806.00582v1.pdf |
PWC | https://paperswithcode.com/paper/federated-learning-with-non-iid-data |
Repo | |
Framework | |
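The paper quantifies data skew with the earth mover's distance (EMD) between each client's class distribution and the population distribution. Below is a minimal sketch of that measure, assuming the distance is taken over the discrete label set (where it reduces to an L1 distance between class proportions); the single-class client and the balanced CIFAR-10-style population are illustrative.

```python
import numpy as np

def label_distribution(labels, num_classes):
    """Empirical class distribution of a client's local labels."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

def emd_to_population(client_labels, population_dist):
    """EMD between a client's class distribution and the population
    distribution; over a discrete label set this is the L1 distance."""
    num_classes = len(population_dist)
    client_dist = label_distribution(client_labels, num_classes)
    return np.abs(client_dist - population_dist).sum()

# Example: a client that only holds a single CIFAR-10 class is maximally skewed.
population = np.full(10, 0.1)                    # balanced 10-class population
single_class_client = np.zeros(500, dtype=int)   # every local label is class 0
print(emd_to_population(single_class_client, population))  # 1.8
```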
MAGAN: Aligning Biological Manifolds
Title | MAGAN: Aligning Biological Manifolds |
Authors | Matthew Amodio, Smita Krishnaswamy |
Abstract | It is increasingly common in many types of natural and physical systems (especially biological systems) to have different types of measurements performed on the same underlying system. In such settings, it is important to align the manifolds arising from each measurement in order to integrate such data and gain an improved picture of the system. We tackle this problem using generative adversarial networks (GANs). Recently, GANs have been utilized to try to find correspondences between sets of samples. However, these GANs are not explicitly designed for proper alignment of manifolds. We present a new GAN called the Manifold-Aligning GAN (MAGAN) that aligns two manifolds such that related points in each measurement space are aligned together. We demonstrate applications of MAGAN in single-cell biology in integrating two different measurement types together. In our demonstrated examples, cells from the same tissue are measured with both genomic (single-cell RNA-sequencing) and proteomic (mass cytometry) technologies. We show that the MAGAN successfully aligns them such that known correlations between measured markers are improved compared to other recently proposed models. |
Tasks | |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1803.00385v1 |
PDF | http://arxiv.org/pdf/1803.00385v1.pdf |
PWC | https://paperswithcode.com/paper/magan-aligning-biological-manifolds |
Repo | |
Framework | |
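A schematic sketch of the three ingredients described in the abstract: adversarial alignment between the two measurement spaces, reconstruction (cycle) consistency, and a correspondence term tying known shared features together. The MLP generators, loss weights, and shared-feature indices are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Schematic MAGAN-style generator objective: two generators map between the
# measurement spaces, discriminators enforce realism, and a correspondence
# term keeps features measured in both spaces aligned.

def mlp(d_in, d_out, hidden=128):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_out))

d1, d2 = 20, 30                      # dimensions of the two measurement spaces
G12, G21 = mlp(d1, d2), mlp(d2, d1)  # generators between the spaces
D1, D2 = mlp(d1, 1), mlp(d2, 1)      # discriminators (logit outputs)
bce = nn.BCEWithLogitsLoss()

def generator_loss(x1, x2, shared1, shared2, lam_recon=1.0, lam_corr=1.0):
    fake2, fake1 = G12(x1), G21(x2)
    # adversarial terms: generated samples should fool the other space's discriminator
    adv = bce(D2(fake2), torch.ones(len(x1), 1)) + bce(D1(fake1), torch.ones(len(x2), 1))
    # reconstruction (cycle) terms: mapping there and back recovers the input
    recon = ((G21(fake2) - x1) ** 2).mean() + ((G12(fake1) - x2) ** 2).mean()
    # correspondence term: markers measured in both spaces should agree after mapping
    corr = ((fake2[:, shared2] - x1[:, shared1]) ** 2).mean()
    return adv + lam_recon * recon + lam_corr * corr
```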
Meta-Embedding as Auxiliary Task Regularization
Title | Meta-Embedding as Auxiliary Task Regularization |
Authors | James O’Neill, Danushka Bollegala |
Abstract | Word embeddings have been shown to benefit from ensembling several word embedding sources, often carried out using straightforward mathematical operations over the set of word vectors. More recently, self-supervised learning has been used to find a lower-dimensional representation, similar in size to the individual word embeddings within the ensemble. However, these methods do not use the available manually labeled datasets that are often used solely for the purpose of evaluation. We propose to reconstruct an ensemble of word embeddings as an auxiliary task that regularises a main task while both tasks share the learned meta-embedding layer. We carry out intrinsic evaluation (6 word similarity datasets and 3 analogy datasets) and extrinsic evaluation (4 downstream tasks). For intrinsic task evaluation, supervision comes from various labeled word similarity datasets. Our experimental results show that the performance is improved for all word similarity datasets when compared to self-supervised learning methods with a mean increase of $11.33$ in Spearman correlation. Specifically, the proposed method shows the best performance in 4 out of 6 word similarity datasets when using a cosine reconstruction loss and Brier’s word similarity loss. Moreover, improvements are also made when performing word meta-embedding reconstruction in sequence tagging and sentence meta-embedding for sentence classification. |
Tasks | Sentence Classification, Word Embeddings |
Published | 2018-09-16 |
URL | https://arxiv.org/abs/1809.05886v2 |
PDF | https://arxiv.org/pdf/1809.05886v2.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-multi-task-word-embeddings |
Repo | |
Framework | |
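A rough sketch of the idea in the abstract: a shared meta-embedding layer feeds both a main task head and an auxiliary head that reconstructs the concatenated ensemble embeddings with a cosine loss. The averaged-word sentence representation, the dimensions, and the loss weight are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaEmbeddingModel(nn.Module):
    def __init__(self, vocab_size, ensemble_dim, meta_dim=300, num_classes=2):
        super().__init__()
        self.meta = nn.Embedding(vocab_size, meta_dim)      # shared meta-embedding layer
        self.decoder = nn.Linear(meta_dim, ensemble_dim)    # auxiliary: reconstruct the ensemble
        self.classifier = nn.Linear(meta_dim, num_classes)  # main task head

    def forward(self, token_ids):
        z = self.meta(token_ids)                 # (batch, seq, meta_dim)
        recon = self.decoder(z)                  # (batch, seq, ensemble_dim)
        logits = self.classifier(z.mean(dim=1))  # crude averaged-word sentence representation
        return logits, recon

def joint_loss(logits, labels, recon, ensemble_vectors, lam=1.0):
    main = F.cross_entropy(logits, labels)
    # cosine reconstruction loss against the concatenated ensemble embeddings
    aux = (1 - F.cosine_similarity(recon, ensemble_vectors, dim=-1)).mean()
    return main + lam * aux
```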
A Deep Neural Network Sentence Level Classification Method with Context Information
Title | A Deep Neural Network Sentence Level Classification Method with Context Information |
Authors | Xingyi Song, Johann Petrak, Angus Roberts |
Abstract | In the sentence classification task, context formed from sentences adjacent to the sentence being classified can provide important information for classification. This context is, however, often ignored. Where methods do make use of context, only small amounts are considered, making it difficult to scale. We present a new method for sentence classification, Context-LSTM-CNN, that makes use of potentially large contexts. The method also utilizes long-range dependencies within the sentence being classified, using an LSTM, and short-span features, using a stacked CNN. Our experiments demonstrate that this approach consistently improves over previous methods on two different datasets. |
Tasks | Sentence Classification |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1809.00934v1 |
PDF | http://arxiv.org/pdf/1809.00934v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-neural-network-sentence-level |
Repo | |
Framework | |
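A schematic Context-LSTM-CNN-style classifier, assuming the focal sentence is encoded both by an LSTM (long-range dependencies) and a stacked CNN (short-span features), with adjacent-sentence context folded in as a simple pooled embedding. The context encoding and all sizes here are illustrative; the paper's exact design for handling large contexts may differ.

```python
import torch
import torch.nn as nn

class ContextSentenceClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)   # long-range features
        self.cnn = nn.Sequential(                                 # stacked CNN, short-span features
            nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU())
        self.out = nn.Linear(hidden * 2 + emb_dim, num_classes)

    def forward(self, sentence_ids, context_ids):
        s = self.emb(sentence_ids)                            # (B, Ls, E)
        _, (h, _) = self.lstm(s)                              # final LSTM state of the focal sentence
        conv = self.cnn(s.transpose(1, 2)).max(dim=2).values  # max-pooled CNN features
        ctx = self.emb(context_ids).mean(dim=1)               # pooled adjacent-sentence context
        return self.out(torch.cat([h[-1], conv, ctx], dim=1))
```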
A Hybrid Instance-based Transfer Learning Method
Title | A Hybrid Instance-based Transfer Learning Method |
Authors | Azin Asgarian, Parinaz Sobhani, Ji Chao Zhang, Madalin Mihailescu, Ariel Sibilia, Ahmed Bilal Ashraf, Babak Taati |
Abstract | In recent years, supervised machine learning models have demonstrated tremendous success in a variety of application domains. Despite the promising results, these successful models are data hungry and their performance relies heavily on the size of training data. However, in many healthcare applications it is difficult to collect sufficiently large training datasets. Transfer learning can help overcome this issue by transferring the knowledge from readily available datasets (source) to a new dataset (target). In this work, we propose a hybrid instance-based transfer learning method that outperforms a set of baselines including state-of-the-art instance-based transfer learning approaches. Our method uses a probabilistic weighting strategy to fuse information from the source domain to the model learned in the target domain. Our method is generic, applicable to multiple source domains, and robust with respect to negative transfer. We demonstrate the effectiveness of our approach through extensive experiments for two different applications. |
Tasks | Transfer Learning |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.01063v1 |
PDF | http://arxiv.org/pdf/1812.01063v1.pdf |
PWC | https://paperswithcode.com/paper/a-hybrid-instance-based-transfer-learning |
Repo | |
Framework | |
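The abstract describes probabilistically weighting source instances when fusing them into the target model. Below is a generic instance-weighting sketch in that spirit, using a domain classifier's probabilities as sample weights; it illustrates the general mechanism, not the paper's specific weighting strategy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def instance_weighted_fit(X_src, y_src, X_tgt, y_tgt):
    # 1) Domain classifier: source (0) vs. target (1).
    X_dom = np.vstack([X_src, X_tgt])
    d_dom = np.concatenate([np.zeros(len(X_src)), np.ones(len(X_tgt))])
    dom_clf = LogisticRegression(max_iter=1000).fit(X_dom, d_dom)

    # 2) Weight each source instance by how target-like it looks.
    src_weights = dom_clf.predict_proba(X_src)[:, 1]
    tgt_weights = np.ones(len(X_tgt))

    # 3) Fit the task model on source + target data with those weights.
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.concatenate([src_weights, tgt_weights])
    return LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
```

Down-weighting source instances that look unlike the target data is also a simple guard against negative transfer, which the abstract highlights as a design goal.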
Nonlinear Dimensionality Reduction for Discriminative Analytics of Multiple Datasets
Title | Nonlinear Dimensionality Reduction for Discriminative Analytics of Multiple Datasets |
Authors | Jia Chen, Gang Wang, Georgios B. Giannakis |
Abstract | Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. Standard PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the component vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, corroborating dimensionality reduction tests using both synthetic and real datasets are provided to validate the effectiveness of the proposed methods. |
Tasks | Dimensionality Reduction |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05502v5 |
PDF | http://arxiv.org/pdf/1805.05502v5.pdf |
PWC | https://paperswithcode.com/paper/nonlinear-dimensionality-reduction-for |
Repo | |
Framework | |
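A minimal linear dPCA-style sketch: directions where the target data is most discriminative relative to the background come from a single generalized eigenvalue decomposition of the target covariance against the background covariance, matching the abstract's "analytical solution obtainable with a single (generalized) eigenvalue decomposition". The centering and the small regularizer are illustrative choices.

```python
import numpy as np
from scipy.linalg import eigh

def discriminative_pca(X_target, X_background, k=2, reg=1e-6):
    Xt = X_target - X_target.mean(axis=0)
    Xb = X_background - X_background.mean(axis=0)
    Ct = Xt.T @ Xt / len(Xt)                               # target covariance
    Cb = Xb.T @ Xb / len(Xb) + reg * np.eye(Xb.shape[1])   # background covariance (regularized)
    # Generalized EVD: Ct u = lambda * Cb u ; the largest lambdas give the
    # directions with the most target variance relative to background variance.
    eigvals, eigvecs = eigh(Ct, Cb)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:k]]                           # projection matrix of shape (d, k)

# Usage: project only the target data onto the discriminative subspace.
# Z = X_target @ discriminative_pca(X_target, X_background, k=2)
```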
Self-Supervised Monocular Image Depth Learning and Confidence Estimation
Title | Self-Supervised Monocular Image Depth Learning and Confidence Estimation |
Authors | Long Chen, Wen Tang, Nigel John |
Abstract | Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks. We propose a novel framework for depth estimation from monocular images with corresponding confidence in a self-supervised manner. A fully differentiable patch-based cost function is proposed, using the Zero-Mean Normalized Cross Correlation (ZNCC) over multi-scale patches as the matching strategy. This approach greatly increases the accuracy and robustness of depth learning. In addition, the proposed patch-based cost function can provide a 0 to 1 confidence, which is then used to supervise the training of a parallel network for confidence map learning and estimation. Evaluation on the KITTI dataset shows that our method outperforms the state-of-the-art results. |
Tasks | Depth Estimation |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05530v1 |
PDF | http://arxiv.org/pdf/1803.05530v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-monocular-image-depth |
Repo | |
Framework | |
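The core matching cost is the zero-mean normalized cross-correlation between patches. A minimal per-patch definition is sketched below; the paper applies it densely over multi-scale patches inside a differentiable training loss and turns the score into a 0-to-1 confidence, which this sketch does not cover.

```python
import numpy as np

def zncc(patch_a, patch_b, eps=1e-6):
    """Zero-mean normalized cross-correlation between two equally sized patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))

# ZNCC is close to 1 for matching patches and lower for mismatches, and it is
# invariant to affine intensity changes, which makes it a robust photometric cost.
p = np.random.rand(7, 7)
print(zncc(p, 2.0 * p + 0.3))   # ~1.0 despite the brightness/contrast change
```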
Algorithms and Theory for Multiple-Source Adaptation
Title | Algorithms and Theory for Multiple-Source Adaptation |
Authors | Judy Hoffman, Mehryar Mohri, Ningshan Zhang |
Abstract | This work includes a number of novel contributions for the multiple-source adaptation problem. We present new normalized solutions with strong theoretical guarantees for the cross-entropy loss and other similar losses. We also provide new guarantees that hold in the case where the conditional probabilities for the source domains are distinct. Moreover, we give new algorithms for determining the distribution-weighted combination solution for the cross-entropy loss and other losses. We report the results of a series of experiments with real-world datasets. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution. Altogether, our theory, algorithms, and empirical results provide a full solution for the multiple-source adaptation problem with very practical benefits. |
Tasks | |
Published | 2018-05-20 |
URL | http://arxiv.org/abs/1805.08727v1 |
PDF | http://arxiv.org/pdf/1805.08727v1.pdf |
PWC | https://paperswithcode.com/paper/algorithms-and-theory-for-multiple-source |
Repo | |
Framework | |
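The solution concept is a distribution-weighted combination of the source predictors: each source's prediction is weighted by how likely the input is under that source's distribution. The sketch below assumes the source densities, source predictors, and mixture weights lambda are already available; determining lambda is what the paper's algorithms address.

```python
import numpy as np

def distribution_weighted_predict(x, source_densities, source_predictors, lam):
    """source_densities[k](x) -> density of x under source k;
    source_predictors[k](x) -> class-probability vector from source k;
    lam -> mixture weights over the K sources."""
    K = len(lam)
    d = np.array([lam[k] * source_densities[k](x) for k in range(K)])
    w = d / d.sum()                                   # per-source weights for this input
    preds = np.stack([source_predictors[k](x) for k in range(K)])
    return w @ preds                                  # combined class probabilities
```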
Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning
Title | Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning |
Authors | Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter |
Abstract | In this paper, we present a new intrinsically motivated actor-critic algorithm for learning continuous motor skills directly from raw visual input. Our neural architecture is composed of a critic and an actor network. Both networks receive the hidden representation of a deep convolutional autoencoder which is trained to reconstruct the visual input, while the centre-most hidden representation is also optimized to estimate the state value. Separately, an ensemble of predictive world models generates, based on its learning progress, an intrinsic reward signal which is combined with the extrinsic reward to guide the exploration of the actor-critic learner. Our approach is more data-efficient and inherently more stable than the existing actor-critic methods for continuous control from pixel data. We evaluate our algorithm for the task of learning robotic reaching and grasping skills on a realistic physics simulator and on a humanoid robot. The results show that the control policies learned with our approach can achieve better performance than the compared state-of-the-art and baseline algorithms in both dense-reward and challenging sparse-reward settings. |
Tasks | Continuous Control |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11388v2 |
PDF | http://arxiv.org/pdf/1810.11388v2.pdf |
PWC | https://paperswithcode.com/paper/deep-intrinsically-motivated-continuous-actor |
Repo | |
Framework | |
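A small sketch of the intrinsic-reward side: an ensemble of predictive world models yields a learning-progress signal (the drop in its prediction error) that is added to the extrinsic reward before being passed to the actor-critic learner. The specific progress measure and mixing coefficient here are illustrative assumptions.

```python
import numpy as np

class IntrinsicRewardEnsemble:
    def __init__(self, models, beta=0.5):
        self.models = models        # each model: predict(state, action) -> predicted next state
        self.prev_error = None
        self.beta = beta            # weight of the intrinsic term

    def combined_reward(self, state, action, next_state, extrinsic_reward):
        preds = [m.predict(state, action) for m in self.models]
        error = float(np.mean([np.linalg.norm(p - next_state) for p in preds]))
        # learning progress: how much the ensemble's prediction error improved
        progress = 0.0 if self.prev_error is None else max(self.prev_error - error, 0.0)
        self.prev_error = error
        return extrinsic_reward + self.beta * progress
```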
Virtual Class Enhanced Discriminative Embedding Learning
Title | Virtual Class Enhanced Discriminative Embedding Learning |
Authors | Binghui Chen, Weihong Deng, Haifeng Shen |
Abstract | Recently, learning discriminative features to improve the recognition performances gradually becomes the primary goal of deep learning, and numerous remarkable works have emerged. In this paper, we propose a novel yet extremely simple method, Virtual Softmax, to enhance the discriminative property of learned features by injecting a dynamic virtual negative class into the original softmax. Injecting the virtual class aims to enlarge the inter-class margin and compress the intra-class distribution by strengthening the decision boundary constraint. Although it seems weird to optimize with this additional virtual class, we show that our method derives from an intuitive and clear motivation, and it indeed encourages the features to be more compact and separable. This paper empirically and experimentally demonstrates the superiority of Virtual Softmax, improving the performances on a variety of object classification and face verification tasks. |
Tasks | Face Verification, Object Classification |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12611v1 |
PDF | http://arxiv.org/pdf/1811.12611v1.pdf |
PWC | https://paperswithcode.com/paper/virtual-class-enhanced-discriminative |
Repo | |
Framework | |
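A sketch of a virtual-class softmax loss in the spirit of the abstract: one extra negative logit is appended per example, so the true class must win against an additional, dynamically constructed competitor. The particular virtual logit used here (‖W_y‖·‖x‖, i.e. a virtual weight pointing along the feature itself) is a commonly cited formulation assumed for illustration; at test time the virtual column is dropped and the ordinary logits are used.

```python
import torch
import torch.nn.functional as F

def virtual_softmax_loss(features, weight, labels):
    """features: (B, d) penultimate-layer features; weight: (C, d) classifier
    weights (no bias); labels: (B,) ground-truth class indices."""
    logits = features @ weight.t()                             # (B, C) usual logits
    w_y_norm = weight[labels].norm(dim=1)                      # ||W_y|| per example
    virtual = (w_y_norm * features.norm(dim=1)).unsqueeze(1)   # (B, 1) extra negative logit
    return F.cross_entropy(torch.cat([logits, virtual], dim=1), labels)
```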
HSD-CNN: Hierarchically self decomposing CNN architecture using class specific filter sensitivity analysis
Title | HSD-CNN: Hierarchically self decomposing CNN architecture using class specific filter sensitivity analysis |
Authors | K. Sai Ram, Jayanta Mukherjee, Amit Patra, Partha Pratim Das |
Abstract | Conventional convolutional neural networks (CNNs) are trained on large domain datasets and are hence typically over-represented and inefficient in limited-class applications. An efficient way to convert such large many-class pre-trained networks into small few-class networks is through a hierarchical decomposition of their feature maps. To this end, we propose an automated framework for such decomposition in Hierarchically Self Decomposing CNN (HSD-CNN), in four steps. HSD-CNN is derived automatically using a class-specific filter sensitivity analysis that quantifies the impact of specific features on a class prediction. The decomposed hierarchical network can be utilized and deployed directly to obtain sub-networks for a subset of classes, and it is shown to perform better without the requirement of retraining these sub-networks. Experimental results show that HSD-CNN generally does not degrade accuracy if the full set of classes is used. Interestingly, when operating on known subsets of classes, HSD-CNN has an improvement in accuracy with a much smaller model size, requiring far fewer operations. The HSD-CNN flow is verified on the CIFAR10, CIFAR100 and CALTECH101 data sets. We report accuracies up to 85.6% (94.75%) on scenarios with 13 (4) classes of CIFAR100, using a pre-trained VGG-16 network on the full data set. In this case, the proposed HSD-CNN requires 3.97× fewer parameters and has 71.22% savings in operations, in comparison to the baseline VGG-16 containing features for all 100 classes. |
Tasks | |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04406v2 |
PDF | http://arxiv.org/pdf/1811.04406v2.pdf |
PWC | https://paperswithcode.com/paper/hsd-cnn-hierarchically-self-decomposing-cnn |
Repo | |
Framework | |
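A hypothetical sketch of a class-specific filter sensitivity score, estimating each filter's contribution to a class's output as gradient-times-activation averaged over examples of that class. This is one plausible way to quantify "the impact of specific features on a class prediction"; the paper's exact measure and the subsequent hierarchical decomposition are not reproduced here.

```python
import torch

def filter_sensitivity(model, layer, images, class_idx):
    """Per-filter sensitivity of `layer` with respect to the score of `class_idx`,
    averaged over a batch of images from that class."""
    acts = {}
    handle = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    scores = model(images)[:, class_idx].sum()      # summed class logits for the batch
    handle.remove()
    grads = torch.autograd.grad(scores, acts["a"])[0]        # (B, F, H, W)
    # |gradient * activation| averaged over batch and spatial dims -> one score per filter
    return (grads * acts["a"]).abs().mean(dim=(0, 2, 3))
```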
Event2Mind: Commonsense Inference on Events, Intents, and Reactions
Title | Event2Mind: Commonsense Inference on Events, Intents, and Reactions |
Authors | Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, Yejin Choi |
Abstract | We investigate a new commonsense inference task: given an event described in a short free-form text (“X drinks coffee in the morning”), a system reasons about the likely intents (“X wants to stay awake”) and reactions (“X feels alert”) of the event’s participants. To support this study, we construct a new crowdsourced corpus of 25,000 event phrases covering a diverse range of everyday events and situations. We report baseline performance on this task, demonstrating that neural encoder-decoder models can successfully compose embedding representations of previously unseen events and reason about the likely intents and reactions of the event participants. In addition, we demonstrate how commonsense inference on people’s intents and reactions can help unveil the implicit gender inequality prevalent in modern movie scripts. |
Tasks | Common Sense Reasoning |
Published | 2018-05-17 |
URL | https://arxiv.org/abs/1805.06939v2 |
PDF | https://arxiv.org/pdf/1805.06939v2.pdf |
PWC | https://paperswithcode.com/paper/event2mind-commonsense-inference-on-events |
Repo | |
Framework | |
A Graph-CNN for 3D Point Cloud Classification
Title | A Graph-CNN for 3D Point Cloud Classification |
Authors | Yingxue Zhang, Michael Rabbat |
Abstract | Graph convolutional neural networks (Graph-CNNs) extend traditional CNNs to handle data that is supported on a graph. Major challenges when working with data on graphs are that the support set (the vertices of the graph) does not typically have a natural ordering, and in general, the topology of the graph is not regular (i.e., vertices do not all have the same number of neighbors). Thus, Graph-CNNs have huge potential to deal with 3D point cloud data which has been obtained from sampling a manifold. In this paper, we develop a Graph-CNN for classifying 3D point cloud data, called PointGCN. The architecture combines localized graph convolutions with two types of graph downsampling operations (also known as pooling). By the effective exploration of the point cloud local structure using the Graph-CNN, the proposed architecture achieves competitive performance on the 3D object classification benchmark ModelNet, and our architecture is more stable than competing schemes. |
Tasks | 3D Object Classification, Object Classification |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1812.01711v1 |
PDF | http://arxiv.org/pdf/1812.01711v1.pdf |
PWC | https://paperswithcode.com/paper/181201711 |
Repo | |
Framework | |
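A compact sketch of the two basic ingredients behind a Graph-CNN on point clouds: building a k-nearest-neighbour graph over the 3D points and applying one localized graph convolution as normalized neighbourhood aggregation. PointGCN itself uses Chebyshev-polynomial filters and two kinds of graph pooling, which this sketch does not include.

```python
import numpy as np

def knn_adjacency(points, k=8):
    """Symmetric k-nearest-neighbour adjacency matrix for an (N, 3) point cloud."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    nn_idx = np.argsort(d, axis=1)[:, 1:k + 1]          # skip self at index 0
    A = np.zeros((len(points), len(points)))
    rows = np.repeat(np.arange(len(points)), k)
    A[rows, nn_idx.ravel()] = 1.0
    return np.maximum(A, A.T)

def graph_conv(A, X, W):
    """One localized graph convolution: D^-1/2 (A + I) D^-1/2 X W with ReLU."""
    A_hat = A + np.eye(len(A))
    deg = A_hat.sum(axis=1)
    A_norm = A_hat / np.sqrt(np.outer(deg, deg))
    return np.maximum(A_norm @ X @ W, 0.0)

points = np.random.rand(64, 3)                           # toy point cloud
features = graph_conv(knn_adjacency(points), points, np.random.rand(3, 16))
```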
CIFAR10 to Compare Visual Recognition Performance between Deep Neural Networks and Humans
Title | CIFAR10 to Compare Visual Recognition Performance between Deep Neural Networks and Humans |
Authors | Tien Ho-Phuoc |
Abstract | Visual object recognition plays an essential role in human daily life. This ability is so efficient that we can recognize a face or an object seemingly without effort, though they may vary in position, scale, pose, and illumination. In the field of computer vision, a large number of studies have been carried out to build a human-like object recognition system. Recently, deep neural networks have shown impressive progress in object classification performance, and have been reported to surpass humans. Yet there is still a lack of thorough and fair comparison between humans and artificial recognition systems. While some studies consider artificially degraded images, human recognition performance on the datasets widely used for deep neural networks has not been fully evaluated. The present paper carries out an extensive experiment to evaluate human classification accuracy on CIFAR10, a well-known dataset of natural images. This then allows for a fair comparison with the state-of-the-art deep neural networks. Our CIFAR10-based evaluations show very efficient object recognition by recent CNNs but, at the same time, prove that they are still far from human-level capability of generalization. Moreover, a detailed investigation using multiple levels of difficulty reveals that easy images for humans may not be easy for deep neural networks. Such images form a subset of CIFAR10 that can be employed to evaluate and improve future neural networks. |
Tasks | Object Classification, Object Recognition |
Published | 2018-11-18 |
URL | https://arxiv.org/abs/1811.07270v2 |
PDF | https://arxiv.org/pdf/1811.07270v2.pdf |
PWC | https://paperswithcode.com/paper/cifar10-to-compare-visual-recognition |
Repo | |
Framework | |
Minibatch Gibbs Sampling on Large Graphical Models
Title | Minibatch Gibbs Sampling on Large Graphical Models |
Authors | Christopher De Sa, Vincent Chen, Wing Wong |
Abstract | Gibbs sampling is the de facto Markov chain Monte Carlo method used for inference and learning on large scale graphical models. For complicated factor graphs with lots of factors, the performance of Gibbs sampling can be limited by the computational cost of executing a single update step of the Markov chain. This cost is proportional to the degree of the graph, the number of factors adjacent to each variable. In this paper, we show how this cost can be reduced by using minibatching: subsampling the factors to form an estimate of their sum. We introduce several minibatched variants of Gibbs, show that they can be made unbiased, prove bounds on their convergence rates, and show that under some conditions they can result in asymptotic single-update-run-time speedups over plain Gibbs sampling. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06086v1 |
PDF | http://arxiv.org/pdf/1806.06086v1.pdf |
PWC | https://paperswithcode.com/paper/minibatch-gibbs-sampling-on-large-graphical |
Repo | |
Framework | |
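A sketch of the basic minibatched Gibbs update: when resampling a variable, only a random subset of its adjacent factors is summed and the sum is scaled up by the degree-to-batch ratio. This is the naive (biased) estimator; the paper's contribution is in constructing variants of this idea that remain unbiased and proving bounds on their convergence rates, which the sketch does not attempt.

```python
import numpy as np

def minibatch_gibbs_update(var, state, adjacent_factors, num_values, batch_size, rng):
    """Resample `state[var]` using a minibatch of its adjacent factors.
    adjacent_factors: list of callables f(state) -> log-potential, each
    depending on the variable through state[var]."""
    idx = rng.choice(len(adjacent_factors), size=batch_size, replace=False)
    scale = len(adjacent_factors) / batch_size        # scale the subsample up to the full degree
    log_probs = np.empty(num_values)
    for v in range(num_values):
        state[var] = v
        log_probs[v] = scale * sum(adjacent_factors[i](state) for i in idx)
    p = np.exp(log_probs - log_probs.max())
    p /= p.sum()
    state[var] = rng.choice(num_values, p=p)
    return state

# Usage sketch: rng = np.random.default_rng(0); then sweep over variables,
# calling minibatch_gibbs_update for each with its adjacent factor list.
```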