Paper Group ANR 243
Federated Visual Classification with Real-World Data Distribution
Title | Federated Visual Classification with Real-World Data Distribution |
Authors | Tzu-Ming Harry Hsu, Hang Qi, Matthew Brown |
Abstract | Federated Learning enables visual models to be trained on-device, bringing advantages for user privacy (data need never leave the device), but challenges in terms of data diversity and quality. Whilst typical models in the datacenter are trained using data that are independent and identically distributed (IID), data at source are typically far from IID. Furthermore, differing quantities of data are typically available at each device (imbalance). In this work, we characterize the effect these real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm. To do so, we introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits that simulate real-world edge learning scenarios. We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08082v1 |
PDF | https://arxiv.org/pdf/2003.08082v1.pdf |
PWC | https://paperswithcode.com/paper/federated-visual-classification-with-real |
Repo | |
Framework | |
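The Federated Averaging (FedAvg) baseline named in the abstract above combines client models by a weighted average. The following is a minimal sketch of that aggregation step only, assuming each client returns a flat list of parameters and its local example count; the function name `fedavg_aggregate` is hypothetical, and the paper's FedVC and FedIR variants are not reproduced here.

```python
def fedavg_aggregate(client_params, client_sizes):
    """Average client parameter vectors, weighting each by its local example count.

    client_params: list of equal-length lists of floats (one list per client).
    client_sizes:  list of example counts, one per client (assumed input).
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("at least one client must hold data")
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, value in enumerate(params):
            # A client with more data pulls the global model further toward itself.
            averaged[i] += (n / total) * value
    return averaged


if __name__ == "__main__":
    # Two clients with imbalanced data: the second holds three times as many examples.
    print(fedavg_aggregate([[0.0, 0.0], [4.0, 8.0]], [1, 3]))
```

With imbalanced clients, the larger client dominates the average, which is one reason non-IID and imbalanced splits affect training stability.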
A Scalable Framework for Sparse Clustering Without Shrinkage
Title | A Scalable Framework for Sparse Clustering Without Shrinkage |
Authors | Zhiyue Zhang, Kenneth Lange, Jason Xu |
Abstract | Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters. This has motivated the development of sparse clustering techniques that typically rely on k-means within outer algorithms of high computational complexity. Current techniques also require careful tuning of shrinkage parameters, further limiting their scalability. In this paper, we propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms. We show that our algorithm enjoys consistency and convergence guarantees. Our core method readily generalizes to several task-specific algorithms such as clustering on subsets of attributes and in partially observed data settings. We showcase these contributions via simulated experiments and benchmark datasets, as well as a case study on mouse protein expression. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08541v1 |
PDF | https://arxiv.org/pdf/2002.08541v1.pdf |
PWC | https://paperswithcode.com/paper/a-scalable-framework-for-sparse-clustering |
Repo | |
Framework | |
A Proof of Useful Work for Artificial Intelligence on the Blockchain
Title | A Proof of Useful Work for Artificial Intelligence on the Blockchain |
Authors | Andrei Lihu, Jincheng Du, Igor Barjaktarevic, Patrick Gerzanics, Mark Harvilla |
Abstract | Bitcoin mining is a wasteful and resource-intensive process. To add a block of transactions to the blockchain, miners spend a considerable amount of energy. The Bitcoin protocol, named ‘proof of work’ (PoW), resembles a lottery and the underlying computational work is not useful otherwise. In this paper, we describe a novel ‘proof of useful work’ (PoUW) protocol based on training a machine learning model on the blockchain. Miners get a chance to create new coins after performing honest ML training work. Clients submit tasks and pay all training contributors. This is an extra incentive to participate in the network because the system does not rely only on the lottery procedure. Using our consensus protocol, interested parties can order, complete, and verify useful work in a distributed environment. We outline mechanisms to reward useful work and punish malicious actors. We aim to build better AI systems using the security of the blockchain. |
Tasks | |
Published | 2020-01-25 |
URL | https://arxiv.org/abs/2001.09244v1 |
PDF | https://arxiv.org/pdf/2001.09244v1.pdf |
PWC | https://paperswithcode.com/paper/a-proof-of-useful-work-for-artificial |
Repo | |
Framework | |
MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?
Title | MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? |
Authors | Joseph Bethge, Christian Bartz, Haojin Yang, Ying Chen, Christoph Meinel |
Abstract | Binary Neural Networks (BNNs) are neural networks which use binary weights and activations instead of the typical 32-bit floating point values. They have reduced model sizes and allow for efficient inference on mobile or embedded devices with limited power and computational resources. However, the binarization of weights and activations leads to feature maps of lower quality and lower capacity and thus a drop in accuracy compared to traditional networks. Previous work has increased the number of channels or used multiple binary bases to alleviate these problems. In this paper, we instead present an architectural approach: MeliusNet. It consists of alternating a DenseBlock, which increases the feature capacity, and our proposed ImprovementBlock, which increases the feature quality. Experiments on the ImageNet dataset demonstrate the superior performance of our MeliusNet over a variety of popular binary architectures with regard to both computation savings and accuracy. Furthermore, with our method we trained BNN models which, for the first time, can match the popular compact network MobileNet-v1 in terms of model size, number of operations and accuracy. Our code is published online at https://github.com/hpi-xnor/BMXNet-v2 |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05936v2 |
PDF | https://arxiv.org/pdf/2001.05936v2.pdf |
PWC | https://paperswithcode.com/paper/meliusnet-can-binary-neural-networks-achieve |
Repo | |
Framework | |
Implicitly Defined Layers in Neural Networks
Title | Implicitly Defined Layers in Neural Networks |
Authors | Qianggong Zhang, Yanyang Gu, Michalkiewicz Mateusz, Mahsa Baktashmotlagh, Anders Eriksson |
Abstract | In conventional formulations of multilayer feedforward neural networks, the individual layers are customarily defined by explicit functions. In this paper we demonstrate that defining individual layers in a neural network implicitly provides much richer representations than the standard explicit definition, consequently enabling a vastly broader class of end-to-end trainable architectures. We present a general framework of implicitly defined layers, where much of the theoretical analysis of such layers can be addressed through the implicit function theorem. We also show how implicitly defined layers can be seamlessly incorporated into existing machine learning libraries, in particular with respect to current automatic differentiation techniques for use in backpropagation-based training. Finally, we demonstrate the versatility and relevance of our proposed approach on a number of diverse example problems with promising results. |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01822v1 |
PDF | https://arxiv.org/pdf/2003.01822v1.pdf |
PWC | https://paperswithcode.com/paper/implicitly-defined-layers-in-neural-networks |
Repo | |
Framework | |
Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer
Title | Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer |
Authors | Shubham Goswami, Suril Mehta, Dhruva Sahrawat, Anubha Gupta, Ritu Gupta |
Abstract | Developing nations lack an adequate number of hospitals with modern equipment and skilled doctors. Hence, a significant proportion of these nations’ population, particularly in rural areas, is unable to access specialized and timely healthcare. In recent years, deep learning (DL) models, a class of artificial intelligence (AI) methods, have shown impressive results in the medical domain. These AI methods can provide immense support to developing nations as affordable healthcare solutions. This work is focused on one such application, blood cancer diagnosis. However, DL models face challenges in cancer research because of the unavailability of large datasets for adequate training and the difficulty of capturing heterogeneity in data at different levels, ranging from acquisition characteristics and session to the subject level (within subjects and across subjects). These challenges render DL models prone to overfitting, and hence models generalize poorly to prospective subjects’ data. In this work, we address these problems in the application of B-cell Acute Lymphoblastic Leukemia (B-ALL) diagnosis using deep learning. We propose a heterogeneity loss that captures subject-level heterogeneity, thereby forcing the neural network to learn subject-independent features. We also propose an unorthodox ensemble strategy that provides improved classification over models trained on 7 folds, giving a weighted-$F_1$ score of 95.26% on unseen (test) subjects’ data, which is, so far, the best result on the C-NMC 2019 dataset for B-ALL classification. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03295v2 |
PDF | https://arxiv.org/pdf/2003.03295v2.pdf |
PWC | https://paperswithcode.com/paper/heterogeneity-loss-to-handle-intersubject-and |
Repo | |
Framework | |
Symbolic Querying of Vector Spaces: Probabilistic Databases Meets Relational Embeddings
Title | Symbolic Querying of Vector Spaces: Probabilistic Databases Meets Relational Embeddings |
Authors | Tal Friedman, Guy Van den Broeck |
Abstract | To deal with increasing amounts of uncertainty and incompleteness in relational data, we propose unifying techniques from probabilistic databases and relational embedding models. We use probabilistic databases as our formalism to define the probabilistic model with respect to which all queries are done. This allows us to leverage the rich literature of theory and algorithms from probabilistic databases for solving problems. While this formalization can be used with any relational embedding model, the lack of a well-defined joint probability distribution causes simple problems to become provably hard. With this in mind, we introduce TractOR, a relational embedding model designed in terms of probabilistic databases to exploit typical embedding assumptions within the probabilistic framework. Using principled, efficient inference algorithms that can be derived from its definition, we empirically demonstrate that TractOR is an effective and general model for these tasks. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10029v1 |
PDF | https://arxiv.org/pdf/2002.10029v1.pdf |
PWC | https://paperswithcode.com/paper/symbolic-querying-of-vector-spaces |
Repo | |
Framework | |
Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks
Title | Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks |
Authors | Théodore Bluche, Maël Primet, Thibault Gisselbrecht |
Abstract | We explore a keyword-based spoken language understanding system, in which the intent of the user can directly be derived from the detection of a sequence of keywords in the query. In this paper, we focus on an open-vocabulary keyword spotting method, allowing the user to define their own keywords without having to retrain the whole model. We describe the different design choices leading to a fast and small-footprint system, able to run on tiny devices, for any arbitrary set of user-defined keywords, without training data specific to those keywords. The model, based on a quantized long short-term memory (LSTM) neural network, trained with connectionist temporal classification (CTC), weighs less than 500KB. Our approach takes advantage of some properties of the predictions of CTC-trained networks to calibrate the confidence scores and implement a fast detection algorithm. The proposed system outperforms a standard keyword-filler model approach. |
Tasks | Keyword Spotting, Spoken Language Understanding |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10851v1 |
PDF | https://arxiv.org/pdf/2002.10851v1.pdf |
PWC | https://paperswithcode.com/paper/small-footprint-open-vocabulary-keyword |
Repo | |
Framework | |
Multi-wavelet residual dense convolutional neural network for image denoising
Title | Multi-wavelet residual dense convolutional neural network for image denoising |
Authors | Shuo-Fei Wang, Wen-Kai Yu, Ya-Xin Li |
Abstract | Networks with a large receptive field (RF) have shown advanced fitting ability in recent years. In this work, we utilize the short-term residual learning method to improve the performance and robustness of networks for image denoising tasks. Here, we choose a multi-wavelet convolutional neural network (MWCNN), one of the state-of-the-art networks with a large RF, as the backbone, and insert residual dense blocks (RDBs) in each of its layers. We call this scheme the multi-wavelet residual dense convolutional neural network (MWRDCNN). Compared with other RDB-based networks, it can extract more features of the object from adjacent layers, preserve the large RF, and boost the computing efficiency. Meanwhile, this approach also makes it possible to absorb the advantages of multiple architectures in a single network without conflicts. The performance of the proposed method has been demonstrated in extensive experiments with a comparison with existing techniques. |
Tasks | Denoising, Image Denoising |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08301v1 |
PDF | https://arxiv.org/pdf/2002.08301v1.pdf |
PWC | https://paperswithcode.com/paper/multi-wavelet-residual-dense-convolutional |
Repo | |
Framework | |
Application of ERA5 and MENA simulations to predict offshore wind energy potential
Title | Application of ERA5 and MENA simulations to predict offshore wind energy potential |
Authors | Shahab Shamshirband, Amir Mosavi, Narjes Nabipour, Kwok-wing Chau |
Abstract | This study explores wind energy resources at different locations in the Gulf of Oman and their future variability due to climate change. In this regard, EC-EARTH near-surface wind outputs obtained from CORDEX-MENA simulations are used for historical and future projections of the energy. The ERA5 wind data are employed to assess the suitability of the climate model. Moreover, the ERA5 wave data over the study area are applied to compute sea surface roughness, an important variable for converting near-surface wind speeds to wind speeds at turbine hub height. Considering the power distribution, bathymetry, and distance from the coast, some spots are selected as tentative energy hotspots to provide a detailed assessment of directional and temporal variability and to investigate climate change impacts. RCP8.5, a common climate scenario, is used to project and extract future variation of the energy at the selected sites. The results of this study demonstrate that the selected locations have suitable potential for wind turbine planning and construction. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10022v1 |
PDF | https://arxiv.org/pdf/2002.10022v1.pdf |
PWC | https://paperswithcode.com/paper/application-of-era5-and-mena-simulations-to |
Repo | |
Framework | |
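The abstract above mentions converting near-surface wind speeds to hub height using sea surface roughness. A common way to do this is the neutral-stability logarithmic wind profile, u(z) = u_ref · ln(z / z0) / ln(z_ref / z0). The sketch below assumes that profile; the abstract does not say which formula the authors used, and the function name and the sample roughness value are illustrative only.

```python
import math


def wind_speed_at_height(u_ref, z_ref, z_hub, z0):
    """Extrapolate wind speed from height z_ref to z_hub with the log wind profile.

    u_ref: wind speed at the reference height (m/s)
    z_ref: reference height above the surface (m), e.g. 10 m
    z_hub: target hub height (m)
    z0:    surface roughness length (m); assumed to be supplied by the caller
    """
    if z0 <= 0 or z_ref <= z0 or z_hub <= z0:
        raise ValueError("heights must exceed a positive roughness length")
    return u_ref * math.log(z_hub / z0) / math.log(z_ref / z0)


if __name__ == "__main__":
    # Illustrative values only: 8 m/s at 10 m, a 100 m hub, and a smooth sea surface.
    print(wind_speed_at_height(8.0, 10.0, 100.0, 0.0002))
```

A larger roughness length gives a steeper profile, so the same 10 m wind maps to a higher hub-height wind.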
Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss
Title | Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss |
Authors | Yi Luo, Nima Mesgarani |
Abstract | Many recent source separation systems are designed to separate a fixed number of sources out of a mixture. In cases where the source activation patterns are unknown, such systems have to either adjust the number of outputs or distinguish invalid outputs from valid ones. Iterative separation methods have gained much attention in the community as they can flexibly decide the number of outputs; however, (1) they typically rely on long-term information to determine the stopping time for the iterations, which makes them hard to operate in a causal setting; (2) they lack a “fault tolerance” mechanism when the estimated number of sources is different from the actual number. In this paper, we propose a simple training method, the auxiliary autoencoding permutation invariant training (A2PIT), to alleviate the two issues. A2PIT assumes a fixed number of outputs and uses auxiliary autoencoding losses to force the invalid outputs to be copies of the input mixture, and detects invalid outputs in a fully unsupervised way during the inference phase. Experimental results show that A2PIT is able to improve the separation performance across various numbers of speakers and effectively detect the number of speakers in a mixture. |
Tasks | |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12326v1 |
PDF | https://arxiv.org/pdf/2003.12326v1.pdf |
PWC | https://paperswithcode.com/paper/separating-varying-numbers-of-sources-with |
Repo | |
Framework | |
Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo
Title | Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo |
Authors | Yin Tat Lee, Ruoqi Shen, Kevin Tian |
Abstract | We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean. This removes a barrier in the prior state-of-the-art analysis for the well-studied Metropolized Hamiltonian Monte Carlo (HMC) algorithm for sampling from a strongly logconcave distribution. We correspondingly demonstrate that Metropolized HMC mixes in $\tilde{O}(\kappa d)$ iterations, improving upon the $\tilde{O}(\kappa^{1.5}\sqrt{d} + \kappa d)$ runtime of (Dwivedi et al. '18, Chen et al. '19) by a factor $(\kappa/d)^{1/2}$ when the condition number $\kappa$ is large. Our mixing time analysis introduces several techniques which to our knowledge have not appeared in the literature and may be of independent interest, including restrictions to a nonconvex set with good conductance behavior, and a new reduction technique for boosting a constant-accuracy total variation guarantee under weak warmness assumptions. This is the first mixing time result for logconcave distributions using only first-order function information which achieves linear dependence on $\kappa$; we also give evidence that this dependence is likely to be necessary for standard Metropolized first-order methods. |
Tasks | Art Analysis |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04121v2 |
PDF | https://arxiv.org/pdf/2002.04121v2.pdf |
PWC | https://paperswithcode.com/paper/logsmooth-gradient-concentration-and-tighter |
Repo | |
Framework | |
Mapping the Landscape of Artificial Intelligence Applications against COVID-19
Title | Mapping the Landscape of Artificial Intelligence Applications against COVID-19 |
Authors | Joseph Bullock, Alexandra Luccioni, Katherine Hoffmann Pham, Cynthia Sin Nga Lam, Miguel Luengo-Oroz |
Abstract | COVID-19, the disease caused by the SARS-CoV-2 virus, has been declared a pandemic by the World Health Organization, with over 294,000 cases as of March 22nd 2020. In this review, we present an overview of recent studies using Machine Learning and, more broadly, Artificial Intelligence, to tackle many aspects of the COVID-19 crisis at different scales including molecular, medical and epidemiological applications. We finish with a discussion of promising future directions of research and the tools and resources needed to facilitate AI research. |
Tasks | |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11336v1 |
PDF | https://arxiv.org/pdf/2003.11336v1.pdf |
PWC | https://paperswithcode.com/paper/mapping-the-landscape-of-artificial |
Repo | |
Framework | |
Building Networks for Image Segmentation using Particle Competition and Cooperation
Title | Building Networks for Image Segmentation using Particle Competition and Cooperation |
Authors | Fabricio Breve |
Abstract | Particle competition and cooperation (PCC) is a graph-based semi-supervised learning approach. When PCC is applied to interactive image segmentation tasks, pixels are converted into network nodes, and each node is connected to its k-nearest neighbors, according to the distance between a set of features extracted from the image. Building a proper network to feed PCC is crucial to achieving good segmentation results. However, some features may be more important than others for identifying the segments, depending on the characteristics of the image to be segmented. In this paper, an index to evaluate candidate networks is proposed. Thus, building the network becomes a problem of optimizing some feature weights based on the proposed index. Computer simulations are performed on some real-world images from the Microsoft GrabCut database, and the segmentation results reported in this paper show the effectiveness of the proposed method. |
Tasks | Semantic Segmentation |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06001v1 |
PDF | https://arxiv.org/pdf/2002.06001v1.pdf |
PWC | https://paperswithcode.com/paper/building-networks-for-image-segmentation |
Repo | |
Framework | |
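The network construction described in the abstract above (pixels as nodes, each linked to its k nearest neighbors in feature space, with per-feature weights) can be sketched as follows. This is an illustrative brute-force version that assumes a weighted Euclidean distance. The abstract does not specify the distance, the evaluation index, or the optimizer, so `build_knn_graph` and its arguments are hypothetical.

```python
import math


def build_knn_graph(features, weights, k):
    """Connect every node to its k nearest neighbors under a weighted distance.

    features: list of per-pixel feature vectors (lists of floats)
    weights:  one non-negative weight per feature (assumed weighting scheme)
    k:        number of neighbors per node
    Returns a dict mapping node index -> list of neighbor indices, nearest first.
    """
    def distance(a, b):
        # Weighted Euclidean distance; a zero weight removes that feature entirely.
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    graph = {}
    for i, fi in enumerate(features):
        candidates = [(distance(fi, fj), j) for j, fj in enumerate(features) if j != i]
        candidates.sort()  # ties are broken by the lower node index
        graph[i] = [j for _, j in candidates[:k]]
    return graph


if __name__ == "__main__":
    pixels = [[0.0, 0.0], [100.0, 1.0], [1.0, 5.0]]
    # The same pixels yield different networks under different feature weights.
    print(build_knn_graph(pixels, [1.0, 0.0], k=1))
    print(build_knn_graph(pixels, [0.0, 1.0], k=1))
```

Changing the weights changes which pixels become neighbors, which is why selecting the weights matters for the resulting segmentation. The brute-force search is quadratic in the number of pixels, so a real image would need a spatial index or an approximate nearest-neighbor method.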
Conditional Path Analysis in Singly-Connected Path Diagrams
Title | Conditional Path Analysis in Singly-Connected Path Diagrams |
Authors | Jose M. Peña |
Abstract | We extend the classical path analysis by showing that, for a singly-connected path diagram, the partial covariance of two random variables factorizes over the nodes and edges in the path between the variables. This result allows us to give an alternative explanation to some causal phenomena previously discussed by Pearl (2013), and to show that Simpson’s paradox cannot occur in singly-connected path diagrams. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05226v3 |
PDF | https://arxiv.org/pdf/2002.05226v3.pdf |
PWC | https://paperswithcode.com/paper/conditional-path-analysis-in-singly-connected |
Repo | |
Framework | |