April 2, 2020


Paper Group ANR 243


Federated Visual Classification with Real-World Data Distribution. A Scalable Framework for Sparse Clustering Without Shrinkage. A Proof of Useful Work for Artificial Intelligence on the Blockchain. MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?. Implicitly Defined Layers in Neural Networks. Heterogeneity Loss to Handle Int …

Federated Visual Classification with Real-World Data Distribution

Title Federated Visual Classification with Real-World Data Distribution
Authors Tzu-Ming Harry Hsu, Hang Qi, Matthew Brown
Abstract Federated Learning enables visual models to be trained on-device, bringing advantages for user privacy (data need never leave the device), but challenges in terms of data diversity and quality. Whilst typical models in the datacenter are trained using data that are independent and identically distributed (IID), data at source are typically far from IID. Furthermore, differing quantities of data are typically available at each device (imbalance). In this work, we characterize the effect these real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm. To do so, we introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits that simulate real-world edge learning scenarios. We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
Published 2020-03-18
URL https://arxiv.org/abs/2003.08082v1
PDF https://arxiv.org/pdf/2003.08082v1.pdf
PWC https://paperswithcode.com/paper/federated-visual-classification-with-real
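The Federated Averaging (FedAvg) baseline named in the abstract can be illustrated with a minimal sketch of its server-side aggregation step. This is not the authors' code, and it does not implement FedVC or FedIR; the function name `fedavg` and the flat-list parameter representation are assumptions made for illustration. Each client's parameters are weighted by its local example count, which is exactly where the data imbalance mentioned above enters.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local example counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one example count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    averaged = [0.0] * len(client_weights[0])
    for weights, n_examples in zip(client_weights, client_sizes):
        share = n_examples / total  # a client with more data gets more say
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Two clients holding 1 and 3 examples respectively.
print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

With strongly imbalanced clients, the largest client dominates the average, which is one way to see why resampling or reweighting over the client pool can matter.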

A Scalable Framework for Sparse Clustering Without Shrinkage

Title A Scalable Framework for Sparse Clustering Without Shrinkage
Authors Zhiyue Zhang, Kenneth Lange, Jason Xu
Abstract Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters. This has motivated the development of sparse clustering techniques that typically rely on k-means within outer algorithms of high computational complexity. Current techniques also require careful tuning of shrinkage parameters, further limiting their scalability. In this paper, we propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms. We show that our algorithm enjoys consistency and convergence guarantees. Our core method readily generalizes to several task-specific algorithms such as clustering on subsets of attributes and in partially observed data settings. We showcase these contributions via simulated experiments and benchmark datasets, as well as a case study on mouse protein expression.
Published 2020-02-20
URL https://arxiv.org/abs/2002.08541v1
PDF https://arxiv.org/pdf/2002.08541v1.pdf
PWC https://paperswithcode.com/paper/a-scalable-framework-for-sparse-clustering

A Proof of Useful Work for Artificial Intelligence on the Blockchain

Title A Proof of Useful Work for Artificial Intelligence on the Blockchain
Authors Andrei Lihu, Jincheng Du, Igor Barjaktarevic, Patrick Gerzanics, Mark Harvilla
Abstract Bitcoin mining is a wasteful and resource-intensive process. To add a block of transactions to the blockchain, miners spend a considerable amount of energy. The Bitcoin protocol, named ‘proof of work’ (PoW), resembles a lottery and the underlying computational work is not useful otherwise. In this paper, we describe a novel ‘proof of useful work’ (PoUW) protocol based on training a machine learning model on the blockchain. Miners get a chance to create new coins after performing honest ML training work. Clients submit tasks and pay all training contributors. This is an extra incentive to participate in the network because the system does not rely only on the lottery procedure. Using our consensus protocol, interested parties can order, complete, and verify useful work in a distributed environment. We outline mechanisms to reward useful work and punish malicious actors. We aim to build better AI systems using the security of the blockchain.
Published 2020-01-25
URL https://arxiv.org/abs/2001.09244v1
PDF https://arxiv.org/pdf/2001.09244v1.pdf
PWC https://paperswithcode.com/paper/a-proof-of-useful-work-for-artificial

MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?

Title MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy?
Authors Joseph Bethge, Christian Bartz, Haojin Yang, Ying Chen, Christoph Meinel
Abstract Binary Neural Networks (BNNs) are neural networks which use binary weights and activations instead of the typical 32-bit floating point values. They have reduced model sizes and allow for efficient inference on mobile or embedded devices with limited power and computational resources. However, the binarization of weights and activations leads to feature maps of lower quality and lower capacity and thus a drop in accuracy compared to traditional networks. Previous work has increased the number of channels or used multiple binary bases to alleviate these problems. In this paper, we instead present an architectural approach: MeliusNet. It consists of alternating a DenseBlock, which increases the feature capacity, and our proposed ImprovementBlock, which increases the feature quality. Experiments on the ImageNet dataset demonstrate the superior performance of our MeliusNet over a variety of popular binary architectures with regard to both computation savings and accuracy. Furthermore, with our method we trained BNN models that, for the first time, match the popular compact network MobileNet-v1 in terms of model size, number of operations and accuracy. Our code is published online at https://github.com/hpi-xnor/BMXNet-v2
Published 2020-01-16
URL https://arxiv.org/abs/2001.05936v2
PDF https://arxiv.org/pdf/2001.05936v2.pdf
PWC https://paperswithcode.com/paper/meliusnet-can-binary-neural-networks-achieve
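To make concrete why binary weights and activations allow efficient inference, here is a minimal sketch of the generic XNOR-and-popcount dot product commonly used for binarized vectors. It is not taken from MeliusNet or BMXNet; the helper names `binarize`, `pack_bits` and `xnor_popcount_dot` are hypothetical, and the convention that zero maps to +1 is an assumption.

```python
from typing import List, Sequence


def binarize(values: Sequence[float]) -> List[int]:
    """Map each real value to +1 or -1 by its sign (zero maps to +1 here)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs: Sequence[int]) -> int:
    """Pack a +1/-1 vector into an integer: bit i is set when signs[i] == +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")  # XNOR, then popcount
    return 2 * matches - n  # each match adds +1, each mismatch adds -1


a = binarize([0.3, -1.2, 0.7, -0.1])
b = binarize([-0.5, -0.4, 0.9, 0.2])
fast = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
slow = sum(x * y for x, y in zip(a, b))
print(fast == slow)  # True
```

The bitwise form replaces n multiply-accumulate operations with one XOR, one negation and a population count per machine word, which is the source of the computational savings such networks rely on.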

Implicitly Defined Layers in Neural Networks

Title Implicitly Defined Layers in Neural Networks
Authors Qianggong Zhang, Yanyang Gu, Michalkiewicz Mateusz, Mahsa Baktashmotlagh, Anders Eriksson
Abstract In conventional formulations of multilayer feedforward neural networks, the individual layers are customarily defined by explicit functions. In this paper we demonstrate that defining individual layers in a neural network implicitly provides much richer representations than the standard explicit definition, consequently enabling a vastly broader class of end-to-end trainable architectures. We present a general framework of implicitly defined layers, where much of the theoretical analysis of such layers can be addressed through the implicit function theorem. We also show how implicitly defined layers can be seamlessly incorporated into existing machine learning libraries, in particular with respect to current automatic differentiation techniques for use in backpropagation-based training. Finally, we demonstrate the versatility and relevance of our proposed approach on a number of diverse example problems with promising results.
Published 2020-03-03
URL https://arxiv.org/abs/2003.01822v1
PDF https://arxiv.org/pdf/2003.01822v1.pdf
PWC https://paperswithcode.com/paper/implicitly-defined-layers-in-neural-networks
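As a worked illustration of how the implicit function theorem supplies the gradient needed for backpropagation through such a layer, consider the following standard derivation. The notation is assumed here rather than taken from the abstract: the layer output $z^\star(x)$ is defined as the solution of $F(x, z) = 0$, and $\partial F / \partial z$ is assumed invertible at that solution.

```latex
\[
F\bigl(x, z^\star(x)\bigr) = 0
\quad\Longrightarrow\quad
\frac{\partial F}{\partial x}
+ \frac{\partial F}{\partial z}\,\frac{\partial z^\star}{\partial x} = 0
\quad\Longrightarrow\quad
\frac{\partial z^\star}{\partial x}
= -\left(\frac{\partial F}{\partial z}\right)^{-1}\frac{\partial F}{\partial x}.
\]
```

The Jacobian on the right depends only on derivatives of $F$ at the solution, so it can be evaluated without differentiating through whatever solver was used to find $z^\star$.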

Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer

Title Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer
Authors Shubham Goswami, Suril Mehta, Dhruva Sahrawat, Anubha Gupta, Ritu Gupta
Abstract Developing nations lack an adequate number of hospitals with modern equipment and skilled doctors. Hence, a significant proportion of these nations’ population, particularly in rural areas, is not able to access specialized and timely healthcare facilities. In recent years, deep learning (DL) models, a class of artificial intelligence (AI) methods, have shown impressive results in the medical domain. These AI methods can provide immense support to developing nations as affordable healthcare solutions. This work is focused on one such application, blood cancer diagnosis. However, DL models face challenges in cancer research because of the unavailability of large datasets for adequate training and the difficulty of capturing heterogeneity in data at different levels, ranging from acquisition characteristics and session to subject level (within subjects and across subjects). These challenges render DL models prone to overfitting, and hence the models generalize poorly to prospective subjects’ data. In this work, we address these problems in the application of B-cell Acute Lymphoblastic Leukemia (B-ALL) diagnosis using deep learning. We propose a heterogeneity loss that captures subject-level heterogeneity, thereby forcing the neural network to learn subject-independent features. We also propose an unorthodox ensemble strategy that provides improved classification over models trained on 7 folds, giving a weighted $F_1$ score of 95.26% on unseen (test) subjects’ data, which is, so far, the best result on the C-NMC 2019 dataset for B-ALL classification.
Published 2020-03-06
URL https://arxiv.org/abs/2003.03295v2
PDF https://arxiv.org/pdf/2003.03295v2.pdf
PWC https://paperswithcode.com/paper/heterogeneity-loss-to-handle-intersubject-and

Symbolic Querying of Vector Spaces: Probabilistic Databases Meets Relational Embeddings

Title Symbolic Querying of Vector Spaces: Probabilistic Databases Meets Relational Embeddings
Authors Tal Friedman, Guy Van den Broeck
Abstract To deal with increasing amounts of uncertainty and incompleteness in relational data, we propose unifying techniques from probabilistic databases and relational embedding models. We use probabilistic databases as our formalism to define the probabilistic model with respect to which all queries are done. This allows us to leverage the rich literature of theory and algorithms from probabilistic databases for solving problems. While this formalization can be used with any relational embedding model, the lack of a well-defined joint probability distribution causes simple problems to become provably hard. With this in mind, we introduce TractOR, a relational embedding model designed in terms of probabilistic databases to exploit typical embedding assumptions within the probabilistic framework. Using principled, efficient inference algorithms that can be derived from its definition, we empirically demonstrate that TractOR is an effective and general model for these tasks.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10029v1
PDF https://arxiv.org/pdf/2002.10029v1.pdf
PWC https://paperswithcode.com/paper/symbolic-querying-of-vector-spaces

Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks

Title Small-Footprint Open-Vocabulary Keyword Spotting with Quantized LSTM Networks
Authors Théodore Bluche, Maël Primet, Thibault Gisselbrecht
Abstract We explore a keyword-based spoken language understanding system, in which the intent of the user can directly be derived from the detection of a sequence of keywords in the query. In this paper, we focus on an open-vocabulary keyword spotting method, allowing the user to define their own keywords without having to retrain the whole model. We describe the different design choices leading to a fast and small-footprint system, able to run on tiny devices, for any arbitrary set of user-defined keywords, without training data specific to those keywords. The model, based on a quantized long short-term memory (LSTM) neural network, trained with connectionist temporal classification (CTC), weighs less than 500KB. Our approach takes advantage of some properties of the predictions of CTC-trained networks to calibrate the confidence scores and implement a fast detection algorithm. The proposed system outperforms a standard keyword-filler model approach.
Tasks Keyword Spotting, Spoken Language Understanding
Published 2020-02-25
URL https://arxiv.org/abs/2002.10851v1
PDF https://arxiv.org/pdf/2002.10851v1.pdf
PWC https://paperswithcode.com/paper/small-footprint-open-vocabulary-keyword

Multi-wavelet residual dense convolutional neural network for image denoising

Title Multi-wavelet residual dense convolutional neural network for image denoising
Authors Shuo-Fei Wang, Wen-Kai Yu, Ya-Xin Li
Abstract Networks with a large receptive field (RF) have shown advanced fitting ability in recent years. In this work, we utilize the short-term residual learning method to improve the performance and robustness of networks for image denoising tasks. Here, we choose a multi-wavelet convolutional neural network (MWCNN), one of the state-of-the-art networks with a large RF, as the backbone, and insert residual dense blocks (RDBs) in each of its layers. We call this scheme the multi-wavelet residual dense convolutional neural network (MWRDCNN). Compared with other RDB-based networks, it can extract more features of the object from adjacent layers, preserve the large RF, and boost the computing efficiency. Meanwhile, this approach also makes it possible to absorb the advantages of multiple architectures in a single network without conflicts. The performance of the proposed method has been demonstrated in extensive experiments and compared with existing techniques.
Tasks Denoising, Image Denoising
Published 2020-02-19
URL https://arxiv.org/abs/2002.08301v1
PDF https://arxiv.org/pdf/2002.08301v1.pdf
PWC https://paperswithcode.com/paper/multi-wavelet-residual-dense-convolutional

Application of ERA5 and MENA simulations to predict offshore wind energy potential

Title Application of ERA5 and MENA simulations to predict offshore wind energy potential
Authors Shahab Shamshirband, Amir Mosavi, Narjes Nabipour, Kwok-wing Chau
Abstract This study explores wind energy resources in different locations across the Gulf of Oman and their future variability due to climate change impacts. In this regard, EC-EARTH near-surface wind outputs obtained from CORDEX-MENA simulations are used for historical and future projection of the energy. The ERA5 wind data are employed to assess the suitability of the climate model. Moreover, the ERA5 wave data over the study area are applied to compute sea surface roughness, an important variable for converting near-surface wind speeds to wind speeds at turbine hub height. Considering the power distribution, bathymetry and distance from the coast, some spots are selected as tentative energy hotspots to provide a detailed assessment of directional and temporal variability and to investigate climate change impacts. RCP8.5, a common climatic scenario, is used to project and extract the future variation of the energy at the selected sites. The results of this study demonstrate that the selected locations have suitable potential for wind turbine planning and construction.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10022v1
PDF https://arxiv.org/pdf/2002.10022v1.pdf
PWC https://paperswithcode.com/paper/application-of-era5-and-mena-simulations-to

Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss

Title Separating Varying Numbers of Sources with Auxiliary Autoencoding Loss
Authors Yi Luo, Nima Mesgarani
Abstract Many recent source separation systems are designed to separate a fixed number of sources out of a mixture. In cases where the source activation patterns are unknown, such systems have to either adjust the number of outputs or distinguish invalid outputs from valid ones. Iterative separation methods have gained much attention in the community as they can flexibly decide the number of outputs; however, (1) they typically rely on long-term information to determine the stopping time for the iterations, which makes them hard to operate in a causal setting; (2) they lack a “fault tolerance” mechanism when the estimated number of sources differs from the actual number. In this paper, we propose a simple training method, auxiliary autoencoding permutation invariant training (A2PIT), to alleviate the two issues. A2PIT assumes a fixed number of outputs and uses auxiliary autoencoding losses to force the invalid outputs to be copies of the input mixture, and detects invalid outputs in a fully unsupervised way during the inference phase. Experimental results show that A2PIT is able to improve the separation performance across various numbers of speakers and effectively detect the number of speakers in a mixture.
Published 2020-03-27
URL https://arxiv.org/abs/2003.12326v1
PDF https://arxiv.org/pdf/2003.12326v1.pdf
PWC https://paperswithcode.com/paper/separating-varying-numbers-of-sources-with
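The idea described in the abstract, a fixed number of outputs where surplus outputs are trained to copy the mixture, can be sketched as a permutation-invariant loss over a padded target list. This is only an illustration under stated assumptions: mean squared error stands in for whatever signal-level objective the paper actually uses, signals are plain Python lists, and the names `pit_loss` and `a2pit_style_loss` are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Signal = Sequence[float]


def mse(a: Signal, b: Signal) -> float:
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(outputs: Sequence[Signal],
             targets: Sequence[Signal]) -> Tuple[float, Tuple[int, ...]]:
    """Return the lowest total loss over all output-to-target assignments.

    perm[i] is the index of the target matched to outputs[i].
    """
    best_perm = min(
        permutations(range(len(targets))),
        key=lambda p: sum(mse(outputs[i], targets[j]) for i, j in enumerate(p)),
    )
    best = sum(mse(outputs[i], targets[j]) for i, j in enumerate(best_perm))
    return best, best_perm


def a2pit_style_loss(outputs: Sequence[Signal], sources: Sequence[Signal],
                     mixture: Signal) -> Tuple[float, Tuple[int, ...]]:
    """Pad the targets with copies of the mixture, then apply PIT."""
    if len(sources) > len(outputs):
        raise ValueError("more sources than model outputs")
    padded: List[Signal] = list(sources) + [mixture] * (len(outputs) - len(sources))
    return pit_loss(outputs, padded)


# Three model outputs, but only two active sources in this mixture.
sources = [[0.0, 2.0], [1.0, 1.0]]
mixture = [1.0, 3.0]
outputs = [[1.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
print(a2pit_style_loss(outputs, sources, mixture))  # (0.0, (1, 0, 2))
```

At inference time, an output that closely matches the input mixture would then be a candidate for an invalid output; how that similarity is thresholded is not specified in the abstract and is left out here.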

Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

Title Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo
Authors Yin Tat Lee, Ruoqi Shen, Kevin Tian
Abstract We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean. This removes a barrier in the prior state-of-the-art analysis for the well-studied Metropolized Hamiltonian Monte Carlo (HMC) algorithm for sampling from a strongly logconcave distribution. We correspondingly demonstrate that Metropolized HMC mixes in $\tilde{O}(\kappa d)$ iterations, improving upon the $\tilde{O}(\kappa^{1.5}\sqrt{d} + \kappa d)$ runtime of (Dwivedi et al. ’18, Chen et al. ’19) by a factor $(\kappa/d)^{1/2}$ when the condition number $\kappa$ is large. Our mixing time analysis introduces several techniques which to our knowledge have not appeared in the literature and may be of independent interest, including restrictions to a nonconvex set with good conductance behavior, and a new reduction technique for boosting a constant-accuracy total variation guarantee under weak warmness assumptions. This is the first mixing time result for logconcave distributions using only first-order function information which achieves linear dependence on $\kappa$; we also give evidence that this dependence is likely to be necessary for standard Metropolized first-order methods.
Published 2020-02-10
URL https://arxiv.org/abs/2002.04121v2
PDF https://arxiv.org/pdf/2002.04121v2.pdf
PWC https://paperswithcode.com/paper/logsmooth-gradient-concentration-and-tighter
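For readers unfamiliar with the algorithm being analyzed, the following is a minimal sketch of one textbook Metropolized HMC iteration for a target density proportional to $\exp(-f(x))$: leapfrog integration followed by a Metropolis accept/reject filter. It uses only first-order information (`f` and `grad_f`). The step size, number of leapfrog steps and the standard Gaussian example are illustrative assumptions, not the parameter choices analyzed in the paper.

```python
import math
import random
from typing import Callable, List, Tuple

Vec = List[float]


def leapfrog(x: Vec, p: Vec, grad_f: Callable[[Vec], Vec],
             step: float, n_steps: int) -> Tuple[Vec, Vec]:
    """Integrate Hamiltonian dynamics for H(x, p) = f(x) + |p|^2 / 2."""
    x, p = list(x), list(p)
    p = [pi - 0.5 * step * gi for pi, gi in zip(p, grad_f(x))]  # half kick
    for k in range(n_steps):
        x = [xi + step * pi for xi, pi in zip(x, p)]  # full drift
        scale = step if k < n_steps - 1 else 0.5 * step  # last kick is a half
        p = [pi - scale * gi for pi, gi in zip(p, grad_f(x))]
    return x, p


def mhmc_step(x: Vec, f: Callable[[Vec], float], grad_f: Callable[[Vec], Vec],
              step: float, n_steps: int, rng: random.Random) -> Tuple[Vec, bool]:
    """One Metropolized HMC iteration; returns (next state, accepted?)."""
    p0 = [rng.gauss(0.0, 1.0) for _ in x]  # resample momentum
    x1, p1 = leapfrog(x, p0, grad_f, step, n_steps)
    h0 = f(x) + 0.5 * sum(v * v for v in p0)
    h1 = f(x1) + 0.5 * sum(v * v for v in p1)
    if rng.random() < math.exp(min(0.0, h0 - h1)):  # Metropolis filter
        return x1, True
    return list(x), False


# Example target: a standard Gaussian, f(x) = |x|^2 / 2.
def f(x: Vec) -> float:
    return 0.5 * sum(v * v for v in x)


def grad_f(x: Vec) -> Vec:
    return list(x)


rng = random.Random(0)
state = [3.0, -3.0]
for _ in range(100):
    state, _ = mhmc_step(state, f, grad_f, step=0.2, n_steps=10, rng=rng)
```

The accept/reject step is what distinguishes the Metropolized variant from unadjusted HMC: it corrects the discretization error of the leapfrog integrator so that the chain targets the intended distribution exactly.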

Mapping the Landscape of Artificial Intelligence Applications against COVID-19

Title Mapping the Landscape of Artificial Intelligence Applications against COVID-19
Authors Joseph Bullock, Alexandra Luccioni, Katherine Hoffmann Pham, Cynthia Sin Nga Lam, Miguel Luengo-Oroz
Abstract COVID-19, the disease caused by the SARS-CoV-2 virus, has been declared a pandemic by the World Health Organization, with over 294,000 cases as of March 22nd 2020. In this review, we present an overview of recent studies using Machine Learning and, more broadly, Artificial Intelligence, to tackle many aspects of the COVID-19 crisis at different scales including molecular, medical and epidemiological applications. We finish with a discussion of promising future directions of research and the tools and resources needed to facilitate AI research.
Published 2020-03-25
URL https://arxiv.org/abs/2003.11336v1
PDF https://arxiv.org/pdf/2003.11336v1.pdf
PWC https://paperswithcode.com/paper/mapping-the-landscape-of-artificial

Building Networks for Image Segmentation using Particle Competition and Cooperation

Title Building Networks for Image Segmentation using Particle Competition and Cooperation
Authors Fabricio Breve
Abstract Particle competition and cooperation (PCC) is a graph-based semi-supervised learning approach. When PCC is applied to interactive image segmentation tasks, pixels are converted into network nodes, and each node is connected to its k-nearest neighbors, according to the distance between a set of features extracted from the image. Building a proper network to feed PCC is crucial to achieve good segmentation results. However, some features may be more important than others to identify the segments, depending on the characteristics of the image to be segmented. In this paper, an index to evaluate candidate networks is proposed. Thus, building the network becomes a problem of optimizing some feature weights based on the proposed index. Computer simulations are performed on some real-world images from the Microsoft GrabCut database, and the segmentation results reported in this paper show the effectiveness of the proposed method.
Tasks Semantic Segmentation
Published 2020-02-14
URL https://arxiv.org/abs/2002.06001v1
PDF https://arxiv.org/pdf/2002.06001v1.pdf
PWC https://paperswithcode.com/paper/building-networks-for-image-segmentation
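The network-building step described in the abstract, where each pixel becomes a node linked to its k-nearest neighbors in feature space, can be sketched as follows. This is a brute-force illustration, not the author's implementation: the function name `build_knn_graph` is hypothetical, the weighted Euclidean distance is an assumed way of applying per-feature weights, and the paper's index for evaluating candidate networks is not reproduced.

```python
import math
from typing import Dict, List, Sequence


def build_knn_graph(features: Sequence[Sequence[float]],
                    weights: Sequence[float], k: int) -> Dict[int, List[int]]:
    """Link each node to its k nearest neighbors under a weighted distance.

    features[i] is the feature vector of node (pixel) i, and weights[d]
    scales the contribution of feature d to the distance.
    """
    graph: Dict[int, List[int]] = {}
    for i, fi in enumerate(features):
        dists = []
        for j, fj in enumerate(features):
            if i == j:
                continue
            d = math.sqrt(sum(w * (a - b) ** 2
                              for w, a, b in zip(weights, fi, fj)))
            dists.append((d, j))
        dists.sort()  # nearest first; ties broken by node index
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Four nodes with two features each; changing the weights changes the graph.
nodes = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
print(build_knn_graph(nodes, [1.0, 0.0], k=1))  # {0: [1], 1: [0], 2: [3], 3: [2]}
print(build_knn_graph(nodes, [0.0, 1.0], k=1))  # {0: [2], 1: [3], 2: [0], 3: [1]}
```

The two calls show why feature weighting matters: the same nodes produce entirely different neighborhoods depending on which feature dominates the distance, so the downstream segmentation depends on the chosen weights.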

Conditional Path Analysis in Singly-Connected Path Diagrams

Title Conditional Path Analysis in Singly-Connected Path Diagrams
Authors Jose M. Peña
Abstract We extend the classical path analysis by showing that, for a singly-connected path diagram, the partial covariance of two random variables factorizes over the nodes and edges in the path between the variables. This result allows us to give an alternative explanation to some causal phenomena previously discussed by Pearl (2013), and to show that Simpson’s paradox cannot occur in singly-connected path diagrams.
Published 2020-02-12
URL https://arxiv.org/abs/2002.05226v3
PDF https://arxiv.org/pdf/2002.05226v3.pdf
PWC https://paperswithcode.com/paper/conditional-path-analysis-in-singly-connected