April 1, 2020

Paper Group ANR 494

Duality of Width and Depth of Neural Networks. The GraphNet Zoo: A Plug-and-Play Framework for Deep Semi-Supervised Classification. Exploiting Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network. ERA: A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos. D2D-Enabled Data Sh …

Duality of Width and Depth of Neural Networks

Title Duality of Width and Depth of Neural Networks
Authors Fenglei-Lei Fan, Ge Wang
Abstract Here, we report that the depth and the width of a neural network are dual from two perspectives. First, we employ the partially separable representation to determine the width and depth. Second, we use the De Morgan law to guide the conversion between a deep network and a wide network. Furthermore, we suggest the generalized De Morgan law to promote duality to network equivalency.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02515v1
PDF https://arxiv.org/pdf/2002.02515v1.pdf
PWC https://paperswithcode.com/paper/duality-of-width-and-depth-of-neural-networks
Repo
Framework
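
As a small illustration of the De Morgan law mentioned in the abstract, the sketch below exhaustively checks the Boolean identity A1 ∨ … ∨ An = ¬(¬A1 ∧ … ∧ ¬An). It does not reproduce the paper's network construction; the function name and the "wide"/"deep" labels are assumptions made for this example only.

```python
from itertools import product


def de_morgan_holds(n: int) -> bool:
    """Check A1 or ... or An == not(not A1 and ... and not An) for all 2**n inputs."""
    for values in product([False, True], repeat=n):
        # "Wide" form (illustrative label): a single disjunction over all inputs.
        wide = any(values)
        # "Deep" form (illustrative label): negation of a conjunction of negations.
        deep = not all(not v for v in values)
        if wide != deep:
            return False
    return True


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, de_morgan_holds(n))
```

The identity only shows that the same Boolean function can be written in two syntactic forms; how those forms map onto network width and depth is the subject of the paper itself.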

The GraphNet Zoo: A Plug-and-Play Framework for Deep Semi-Supervised Classification

Title The GraphNet Zoo: A Plug-and-Play Framework for Deep Semi-Supervised Classification
Authors Marianne de Vriendt, Philip Sellars, Angelica I Aviles-Rivero
Abstract We consider the problem of classifying a medical image dataset when we have a limited amount of labels. This is a very common yet challenging setting, as labelled data is expensive and time-consuming to collect and may require expert knowledge. The current classification go-to of deep supervised learning is unable to cope with such a problem setup. However, using semi-supervised learning, one can produce accurate classifications using a significantly reduced amount of labelled data. Therefore, semi-supervised learning is perfectly suited for medical image classification. However, there has been almost no uptake of semi-supervised methods in the medical domain. In this work, we propose a plug-and-play framework for deep semi-supervised classification focusing on graph-based approaches; to our knowledge, this is the first time that an approach with minimal labels has been shown at such an unprecedented scale. We introduce the concept of hybrid models by defining a classifier as a combination of a model-based functional and a deep net. We demonstrate, through extensive numerical comparisons, that our approach readily competes with fully-supervised state-of-the-art techniques for the applications of Malaria Cells, Mammograms and Chest X-ray classification whilst using far fewer labels.
Tasks Image Classification
Published 2020-03-13
URL https://arxiv.org/abs/2003.06451v1
PDF https://arxiv.org/pdf/2003.06451v1.pdf
PWC https://paperswithcode.com/paper/the-graphnet-zoo-a-plug-and-play-framework
Repo
Framework

Exploiting Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network

Title Exploiting Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network
Authors Xuefeng Bai, Pengbo Liu, Yue Zhang
Abstract Targeted sentiment classification predicts the sentiment polarity on given target mentions in input texts. Dominant methods employ neural networks for encoding the input sentence and extracting relations between target mentions and their contexts. Recently, graph neural networks have been investigated for integrating dependency syntax for the task, achieving state-of-the-art results. However, existing methods do not consider dependency label information, which can be intuitively useful. To solve the problem, we investigate a novel relational graph attention network that integrates typed syntactic dependency information. Results on standard benchmarks show that our method can effectively leverage label information for improving targeted sentiment classification performance. Our final model significantly outperforms state-of-the-art syntax-based approaches.
Tasks Sentiment Analysis
Published 2020-02-22
URL https://arxiv.org/abs/2002.09685v1
PDF https://arxiv.org/pdf/2002.09685v1.pdf
PWC https://paperswithcode.com/paper/exploiting-typed-syntactic-dependencies-for
Repo
Framework

ERA: A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos

Title ERA: A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos
Authors Lichao Mou, Yuansheng Hua, Pu Jin, Xiao Xiang Zhu
Abstract Along with the increasing use of unmanned aerial vehicles (UAVs), large volumes of aerial videos have been produced. It is unrealistic for humans to screen such big data and understand their contents. Hence methodological research on the automatic understanding of UAV videos is of paramount importance. In this paper, we introduce a novel problem of event recognition in unconstrained aerial videos in the remote sensing community and present a large-scale, human-annotated dataset, named ERA (Event Recognition in Aerial videos), consisting of 2,864 videos, each with a label from 25 different classes corresponding to an event unfolding over 5 seconds. The ERA dataset is designed to have significant intra-class variation and inter-class similarity, and captures dynamic events in various circumstances and at dramatically different scales. Moreover, to offer a benchmark for this task, we extensively validate existing deep networks. We expect that the ERA dataset will facilitate further progress in automatic aerial video comprehension. The website is https://lcmou.github.io/ERA_Dataset/
Tasks
Published 2020-01-30
URL https://arxiv.org/abs/2001.11394v3
PDF https://arxiv.org/pdf/2001.11394v3.pdf
PWC https://paperswithcode.com/paper/era-a-dataset-and-deep-learning-benchmark-for
Repo
Framework

D2D-Enabled Data Sharing for Distributed Machine Learning at Wireless Network Edge

Title D2D-Enabled Data Sharing for Distributed Machine Learning at Wireless Network Edge
Authors Xiaoran Cai, Xiaopeng Mo, Junyang Chen, Jie Xu
Abstract Mobile edge learning is an emerging technique that enables distributed edge devices to collaborate in training shared machine learning models by exploiting their local data samples and communication and computation resources. To deal with the straggler dilemma faced in this technique, this paper proposes a new device-to-device (D2D) enabled data sharing approach, in which different edge devices share their data samples with each other over communication links, in order to properly adjust their computation loads and increase the training speed. Under this setup, we optimize the radio resource allocation for both data sharing and distributed training, with the objective of minimizing the total training delay under fixed numbers of local and global iterations. Numerical results show that the proposed data sharing design significantly reduces the training delay, and also enhances the training accuracy when the data samples are non-independent and identically distributed (non-IID) among edge devices.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.11342v1
PDF https://arxiv.org/pdf/2001.11342v1.pdf
PWC https://paperswithcode.com/paper/d2d-enabled-data-sharing-for-distributed
Repo
Framework

TiFL: A Tier-based Federated Learning System

Title TiFL: A Tier-based Federated Learning System
Authors Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng
Abstract Federated Learning (FL) enables learning a shared model across many clients without violating the privacy requirements. One of the key attributes in FL is the heterogeneity that exists in both resource and data due to the differences in computation and communication capacity, as well as the quantity and content of data among different clients. We conduct a case study to show that heterogeneity in resource and data has a significant impact on training time and model accuracy in conventional FL systems. To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resource and data quantity. To further tame the heterogeneity caused by non-IID (non-independent and identically distributed) data and resources, TiFL employs an adaptive tier selection approach to update the tiering on-the-fly based on the observed training performance and accuracy over time. We prototype TiFL in an FL testbed following Google’s FL architecture and evaluate it using popular benchmarks and the state-of-the-art FL benchmark LEAF. Experimental evaluation shows that TiFL outperforms conventional FL in various heterogeneous conditions. With the proposed adaptive tier selection policy, we demonstrate that TiFL achieves much faster training performance while keeping the same (and in some cases better) test accuracy across the board.
Tasks
Published 2020-01-25
URL https://arxiv.org/abs/2001.09249v1
PDF https://arxiv.org/pdf/2001.09249v1.pdf
PWC https://paperswithcode.com/paper/tifl-a-tier-based-federated-learning-system
Repo
Framework
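
The tiering idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the TiFL implementation: the function names, the use of a profiled per-client latency as the "training performance" signal, and the `tier_probs` weights are all hypothetical choices made for this example.

```python
import random
from typing import Dict, List, Tuple


def assign_tiers(latency: Dict[str, float], num_tiers: int) -> List[List[str]]:
    """Sort clients by (assumed) profiled round latency and split them into groups.

    Fastest clients come first. With uneven sizes this may yield fewer than
    num_tiers groups, which is acceptable for a sketch.
    """
    ordered = sorted(latency, key=latency.get)
    size = -(-len(ordered) // num_tiers)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def select_clients(
    tiers: List[List[str]],
    tier_probs: List[float],
    clients_per_round: int,
    rng: random.Random,
) -> Tuple[int, List[str]]:
    """Pick one tier according to tier_probs, then sample clients only from it."""
    tier_idx = rng.choices(range(len(tiers)), weights=tier_probs, k=1)[0]
    tier = tiers[tier_idx]
    k = min(clients_per_round, len(tier))
    return tier_idx, rng.sample(tier, k)


if __name__ == "__main__":
    # Hypothetical latencies in seconds per local training round.
    latencies = {"a": 1.0, "b": 5.0, "c": 1.2, "d": 4.8, "e": 9.0, "f": 10.0}
    tiers = assign_tiers(latencies, num_tiers=3)
    print(tiers)
    print(select_clients(tiers, [0.5, 0.3, 0.2], 2, random.Random(0)))
```

Because every client in a round comes from one tier, the round time is bounded by the slowest client of that tier rather than the slowest client overall. An adaptive policy, as the abstract describes, would adjust the selection weights over time; how TiFL does so is not specified here.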

Metrics and methods for robustness evaluation of neural networks with generative models

Title Metrics and methods for robustness evaluation of neural networks with generative models
Authors Igor Buzhinsky, Arseny Nerinovsky, Stavros Tripakis
Abstract Recent studies have shown that modern deep neural network classifiers are easy to fool, assuming that an adversary is able to slightly modify their inputs. Many papers have proposed adversarial attacks, defenses and methods to measure robustness to such adversarial perturbations. However, most commonly considered adversarial examples are based on $\ell_p$-bounded perturbations in the input space of the neural network, which are unlikely to arise naturally. Recently, especially in computer vision, researchers discovered “natural” or “semantic” perturbations, such as rotations, changes of brightness, or more high-level changes, but these perturbations have not yet been systematically utilized to measure the performance of classifiers. In this paper, we propose several metrics to measure robustness of classifiers to natural adversarial examples, and methods to evaluate them. These metrics, called latent space performance metrics, are based on the ability of generative models to capture probability distributions, and are defined in their latent spaces. On three image classification case studies, we evaluate the proposed metrics for several classifiers, including ones trained in conventional and robust ways. We find that the latent counterparts of adversarial robustness are associated with the accuracy of the classifier rather than its conventional adversarial robustness, but the latter is still reflected in the properties of the latent perturbations found. In addition, our novel method of finding latent adversarial perturbations demonstrates that these perturbations are often perceptually small.
Tasks Image Classification
Published 2020-03-04
URL https://arxiv.org/abs/2003.01993v2
PDF https://arxiv.org/pdf/2003.01993v2.pdf
PWC https://paperswithcode.com/paper/metrics-and-methods-for-robustness-evaluation
Repo
Framework

SMT + ILP

Title SMT + ILP
Authors Vaishak Belle
Abstract Inductive logic programming (ILP) has been a deeply influential paradigm in AI, enjoying decades of research on its theory and implementations. As a natural descendant of the fields of logic programming and machine learning, it admits the incorporation of background knowledge, which can be very useful in domains where prior knowledge from experts is available and can lead to a more data-efficient learning regime. Be that as it may, the limitation to Horn clauses composed over Boolean variables is a very serious one. Many phenomena occurring in the real world are best characterized using continuous entities, and more generally, mixtures of discrete and continuous entities. In this position paper, we motivate a reconsideration of inductive declarative programming by leveraging satisfiability modulo theory technology.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05208v1
PDF https://arxiv.org/pdf/2001.05208v1.pdf
PWC https://paperswithcode.com/paper/smt-ilp
Repo
Framework

Translating Web Search Queries into Natural Language Questions

Title Translating Web Search Queries into Natural Language Questions
Authors Adarsh Kumar, Sandipan Dandapat, Sushil Chordia
Abstract Users often query a search engine with a specific question in mind, and often these queries are keywords or sub-sentential fragments. For example, if users want to know the answer to “What’s the capital of USA?”, they will most probably query “capital of USA” or “USA capital” or some keyword-based variation of this. Conversely, for the user-entered query “capital of USA”, the most probable question intent is “What’s the capital of USA?”. In this paper, we propose a method to generate a well-formed natural language question from a given keyword-based query, which has the same question intent as the query. Conversion of a keyword-based web query into a well-formed question has many applications, including search engines, Community Question Answering (CQA) websites and bot communication. We found a synergy between the query-to-question problem and the standard machine translation (MT) task. We have used both Statistical MT (SMT) and Neural MT (NMT) models to generate the questions from the query. We have observed that MT models perform well in terms of both automatic and human evaluation.
Tasks Community Question Answering, Machine Translation, Question Answering
Published 2020-02-07
URL https://arxiv.org/abs/2002.02631v1
PDF https://arxiv.org/pdf/2002.02631v1.pdf
PWC https://paperswithcode.com/paper/translating-web-search-queries-into-natural-2
Repo
Framework

Representation Learning Through Latent Canonicalizations

Title Representation Learning Through Latent Canonicalizations
Authors Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman
Abstract We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision. Many prior approaches to this problem have focused on learning “disentangled” representations so that as individual factors vary in a new domain, only a portion of the representation need be updated. In this work, we seek the generalization power of disentangled representations, but relax the requirement of explicit latent disentanglement and instead encourage linearity of individual factors of variation by requiring them to be manipulable by learned linear transformations. We dub these transformations latent canonicalizers, as they aim to modify the value of a factor to a pre-determined (but arbitrary) canonical value (e.g., recoloring the image foreground to black). Assuming a source domain with access to meta-labels specifying the factors of variation within an image, we demonstrate experimentally that our method helps reduce the number of observations needed to generalize to a similar target domain when compared to a number of supervised baselines.
Tasks Representation Learning
Published 2020-02-26
URL https://arxiv.org/abs/2002.11829v1
PDF https://arxiv.org/pdf/2002.11829v1.pdf
PWC https://paperswithcode.com/paper/representation-learning-through-latent
Repo
Framework

Fault Handling in Large Water Networks with Online Dictionary Learning

Title Fault Handling in Large Water Networks with Online Dictionary Learning
Authors Paul Irofti, Florin Stoican, Vicenç Puig
Abstract Fault detection and isolation in water distribution networks is an active topic due to the mathematical complexity of the underlying model and the increased data availability through sensor placement. Here we simplify the model by offering a data-driven alternative that takes the network topology into account when performing sensor placement and then proceeds to build a network model through online dictionary learning based on the incoming sensor data. Online learning is fast and allows tackling large networks, as it processes small batches of signals at a time and has the benefit of continuous integration of new data into the existing network model, be it in the beginning for training or in production when new data samples are encountered. The algorithms show good performance when tested on both small and large-scale networks.
Tasks Dictionary Learning, Fault Detection
Published 2020-03-18
URL https://arxiv.org/abs/2003.08483v1
PDF https://arxiv.org/pdf/2003.08483v1.pdf
PWC https://paperswithcode.com/paper/fault-handling-in-large-water-networks-with
Repo
Framework

Case Study: Predictive Fairness to Reduce Misdemeanor Recidivism Through Social Service Interventions

Title Case Study: Predictive Fairness to Reduce Misdemeanor Recidivism Through Social Service Interventions
Authors Kit T. Rodolfa, Erika Salomon, Lauren Haynes, Ivan Higuera Mendieta, Jamie Larson, Rayid Ghani
Abstract The criminal justice system is currently ill-equipped to improve outcomes of individuals who cycle in and out of the system with a series of misdemeanor offenses. Often due to constraints of caseload and poor record linkage, prior interactions with an individual may not be considered when an individual comes back into the system, let alone in a proactive manner through the application of diversion programs. The Los Angeles City Attorney’s Office recently created a new Recidivism Reduction and Drug Diversion unit (R2D2) tasked with reducing recidivism in this population. Here we describe a collaboration with this new unit as a case study for the incorporation of predictive equity into machine learning based decision making in a resource-constrained setting. The program seeks to improve outcomes by developing individually-tailored social service interventions (i.e., diversions, conditional plea agreements, stayed sentencing, or other favorable case disposition based on appropriate social service linkage rather than traditional sentencing methods) for individuals likely to experience subsequent interactions with the criminal justice system, a time and resource-intensive undertaking that necessitates an ability to focus resources on individuals most likely to be involved in a future case. Seeking to achieve both efficiency (through predictive accuracy) and equity (improving outcomes in traditionally under-served communities and working to mitigate existing disparities in criminal justice outcomes), we discuss the equity outcomes we seek to achieve, describe the corresponding choice of a metric for measuring predictive fairness in this context, and explore a set of options for balancing equity and efficiency when building and selecting machine learning models in an operational public policy setting.
Tasks Decision Making
Published 2020-01-24
URL https://arxiv.org/abs/2001.09233v1
PDF https://arxiv.org/pdf/2001.09233v1.pdf
PWC https://paperswithcode.com/paper/case-study-predictive-fairness-to-reduce
Repo
Framework

Keyword-Attentive Deep Semantic Matching

Title Keyword-Attentive Deep Semantic Matching
Authors Changyu Miao, Zhen Cao, Yik-Cheung Tam
Abstract Deep Semantic Matching is a crucial component in various natural language processing applications such as question answering (QA), where an input query is compared to each candidate question in a QA corpus in terms of relevance. Measuring similarities between a query-question pair in an open domain scenario can be challenging due to diverse word tokens in the query-question pair. We propose a keyword-attentive approach to improve deep semantic matching. We first leverage domain tags from a large corpus to generate a domain-enhanced keyword dictionary. Built upon BERT, we stack a keyword-attentive transformer layer to highlight the importance of keywords in the query-question pair. During model training, we propose a new negative sampling approach based on keyword coverage between the input pair. We evaluate our approach on a Chinese QA corpus using various metrics, including precision of retrieval candidates and accuracy of semantic matching. Experiments show that our approach outperforms existing strong baselines. Our approach is general and can be applied to other text matching tasks with little adaptation.
Tasks Text Matching
Published 2020-03-11
URL https://arxiv.org/abs/2003.11516v1
PDF https://arxiv.org/pdf/2003.11516v1.pdf
PWC https://paperswithcode.com/paper/keyword-attentive-deep-semantic-matching
Repo
Framework

Learning Theory for Estimation of Animal Motion Submanifolds

Title Learning Theory for Estimation of Animal Motion Submanifolds
Authors Nathan Powell, Andrew Kurdila
Abstract This paper describes the formulation and experimental testing of a novel method for the estimation and approximation of submanifold models of animal motion. It is assumed that the animal motion is supported on a configuration manifold $Q$ that is a smooth, connected, regularly embedded Riemannian submanifold of Euclidean space $X\approx \mathbb{R}^d$ for some $d>0$, and that the manifold $Q$ is homeomorphic to a known smooth, Riemannian manifold $S$. Estimation of the manifold is achieved by finding an unknown mapping $\gamma:S\rightarrow Q\subset X$ that maps the manifold $S$ into $Q$. The overall problem is cast as a distribution-free learning problem over the manifold of measurements $\mathbb{Z}=S\times X$. That is, it is assumed that experiments generate a finite set $\{(s_i,x_i)\}_{i=1}^m\subset \mathbb{Z}^m$ of samples that are generated according to an unknown probability density $\mu$ on $\mathbb{Z}$. This paper derives approximations $\gamma_{n,m}$ of $\gamma$ that are based on the $m$ samples and are contained in an $N(n)$ dimensional space of approximants. The paper defines sufficient conditions that show that the rates of convergence in $L^2_\mu(S)$ correspond to those known for classical distribution-free learning theory over Euclidean space. Specifically, the paper derives sufficient conditions that guarantee rates of convergence that have the form $$\mathbb{E} \left (\|\gamma_\mu^j-\gamma_{n,m}^j\|_{L^2_\mu(S)}^2\right )\leq C_1 N(n)^{-r} + C_2 \frac{N(n)\log(N(n))}{m}$$ for constants $C_1,C_2$ with $\gamma_\mu:=\{\gamma^1_\mu,\ldots,\gamma^d_\mu\}$ the regressor function $\gamma_\mu:S\rightarrow Q\subset X$ and $\gamma_{n,m}:=\{\gamma^1_{n,m},\ldots,\gamma^d_{n,m}\}$.
Tasks
Published 2020-03-30
URL https://arxiv.org/abs/2003.13811v1
PDF https://arxiv.org/pdf/2003.13811v1.pdf
PWC https://paperswithcode.com/paper/learning-theory-for-estimation-of-animal
Repo
Framework
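
As a hedged aside that is not taken from the paper: a bound of the form $C_1 N^{-r} + C_2 N\log(N)/m$ is typically read by balancing its two terms, where $N = N(n)$. Assuming that $C_1$, $C_2$ and $r$ do not depend on $m$ and that $N$ can be chosen freely, the standard calculation is sketched below.

```latex
% Assumptions (not from the source): C_1, C_2, r are independent of m,
% and N = N(n) may be tuned as a function of the sample size m.
N \asymp \left(\frac{m}{\log m}\right)^{\frac{1}{r+1}}
\quad\Longrightarrow\quad
C_1 N^{-r} + C_2\,\frac{N\log N}{m}
\;=\; O\!\left(\left(\frac{\log m}{m}\right)^{\frac{r}{r+1}}\right).
```

With this choice the approximation term $N^{-r}$ and the sample term $N\log(N)/m$ are of the same order, so neither dominates the bound.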

ManifoldNorm: Extending normalizations on Riemannian Manifolds

Title ManifoldNorm: Extending normalizations on Riemannian Manifolds
Authors Rudrasis Chakraborty
Abstract Many measurements in computer vision and machine learning manifest as non-Euclidean data samples. Several researchers recently extended a number of deep neural network architectures to manifold-valued data samples. Researchers have proposed models for manifold-valued spatial data, which are common in medical image processing, including processing of diffusion tensor imaging (DTI), where images are fields of $3\times 3$ symmetric positive definite matrices, or representation in terms of an orientation distribution field (ODF), where the identification is in terms of a field on the hypersphere. There are other sequential models for manifold-valued data that researchers have recently shown to be effective for group difference analysis in the study of neuro-degenerative diseases. Although several of these methods are effective in dealing with manifold-valued data, the bottleneck includes the instability in optimization for deeper networks. In order to deal with these instabilities, researchers have proposed residual connections for manifold-valued data. One of the other remedies for the instabilities, including gradient explosion, is to use normalization techniques such as batch norm and group norm. But so far there are no normalization techniques applicable to manifold-valued data. In this work, we propose a general normalization technique for manifold-valued data. We show that our proposed manifold normalization technique has special cases including the popular batch norm and group norm techniques. On the experimental side, we focus on two types of manifold-valued data: the manifold of symmetric positive definite matrices and the hypersphere. We show the performance gain in one synthetic experiment on the moving MNIST dataset and on one real brain image dataset where the representation is in terms of an orientation distribution field (ODF).
Tasks
Published 2020-03-30
URL https://arxiv.org/abs/2003.13869v1
PDF https://arxiv.org/pdf/2003.13869v1.pdf
PWC https://paperswithcode.com/paper/manifoldnorm-extending-normalizations-on
Repo
Framework