April 3, 2020

Paper Group ANR 83

Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms

Title Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms
Authors T. Tony Cai, Hongji Wei
Abstract We study distributed estimation of a Gaussian mean under communication constraints in a decision theoretical framework. Minimax rates of convergence, which characterize the tradeoff between the communication costs and statistical accuracy, are established in both the univariate and multivariate settings. Communication-efficient and statistically optimal procedures are developed. In the univariate case, the optimal rate depends only on the total communication budget, so long as each local machine has at least one bit. However, in the multivariate case, the minimax rate depends on the specific allocations of the communication budgets among the local machines. Although optimal estimation of a Gaussian mean is relatively simple in the conventional setting, it is quite involved under the communication constraints, both in terms of the optimal procedure design and lower bound argument. The techniques developed in this paper can be of independent interest. An essential step is the decomposition of the minimax estimation problem into two stages, localization and refinement. This critical decomposition provides a framework for both the lower bound analysis and optimal procedure design.
Published 2020-01-24
URL https://arxiv.org/abs/2001.08877v1
PDF https://arxiv.org/pdf/2001.08877v1.pdf
PWC https://paperswithcode.com/paper/distributed-gaussian-mean-estimation-under
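
The tradeoff between communication cost and accuracy described in the abstract can be illustrated with a small sketch. This is not the authors' localization-and-refinement procedure; it is a textbook-style one-bit scheme, assuming a known standard deviation `sigma`, a fixed `threshold` chosen by us, and a mean that lies within a few standard deviations of that threshold. Each simulated machine sends one bit, and the center inverts the Gaussian CDF on the fraction of 1-bits. All function names are hypothetical.

```python
import random
from statistics import NormalDist


def one_bit_messages(samples, threshold=0.0):
    """Each machine sends one bit: does its observation exceed the threshold?"""
    return [1 if x > threshold else 0 for x in samples]


def estimate_mean_from_bits(bits, sigma=1.0, threshold=0.0):
    """Invert P(X > t) = Phi((theta - t) / sigma) using the fraction of 1-bits."""
    n = len(bits)
    p_hat = sum(bits) / n
    # Clip away from 0 and 1 so the inverse CDF stays finite.
    p_hat = min(max(p_hat, 0.5 / n), 1 - 0.5 / n)
    return threshold + sigma * NormalDist().inv_cdf(p_hat)


# Simulation with assumed values: 20,000 machines, one observation each.
rng = random.Random(0)
theta, sigma, n_machines = 0.3, 1.0, 20000
samples = [rng.gauss(theta, sigma) for _ in range(n_machines)]
bits = one_bit_messages(samples)
theta_hat = estimate_mean_from_bits(bits, sigma=sigma)
print(f"true mean {theta}, estimate from {len(bits)} bits: {theta_hat:.3f}")
```

The estimate degrades quickly when the true mean is far from the threshold, because almost all bits are then identical. That is one intuition for why a coarse first stage can help before refining, though the paper's actual construction should be taken from the paper itself.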

A Pilot Study on Multiple Choice Machine Reading Comprehension for Vietnamese Texts

Title A Pilot Study on Multiple Choice Machine Reading Comprehension for Vietnamese Texts
Authors Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen
Abstract Machine Reading Comprehension (MRC) is a natural language processing task that studies the ability to read and understand unstructured texts and then find the correct answers to questions. Until now, there has been no MRC dataset for a low-resource language such as Vietnamese. In this paper, we introduce ViMMRC, a challenging machine comprehension corpus with multiple-choice questions, intended for research on the machine comprehension of Vietnamese text. This corpus includes 2,783 multiple-choice questions and answers based on a set of 417 Vietnamese texts used for teaching reading comprehension to 1st to 5th graders. Answers may be extracted from the contents of single or multiple sentences in the corresponding reading text. A thorough analysis of the corpus and experimental results in this paper illustrate that our corpus ViMMRC demands reasoning abilities beyond simple word matching. We propose the Boosted Sliding Window (BSW) method, which improves accuracy by 5.51% over the best baseline method. We also measured human performance on the corpus and compared it to our MRC models. The performance gap between humans and our best experimental model indicates that significant progress can be made on Vietnamese machine reading comprehension in further research. The corpus is freely available at our website for research purposes.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2020-01-16
URL https://arxiv.org/abs/2001.05687v2
PDF https://arxiv.org/pdf/2001.05687v2.pdf
PWC https://paperswithcode.com/paper/a-pilot-study-on-multiple-choice-machine
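
The abstract names a Boosted Sliding Window (BSW) method but does not describe it. As a point of reference, here is a minimal sketch of a plain sliding-window baseline for multiple-choice reading comprehension, in the spirit of the word-matching baselines the abstract alludes to. It is not BSW. The whitespace tokenizer, the inverse-count weighting, and the English toy passage are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems need more)."""
    return text.lower().split()


def sliding_window_score(passage_tokens, target_words):
    """Best weighted overlap between any passage window and the target word set."""
    counts = Counter(passage_tokens)
    # Rare passage words get more weight than frequent ones.
    weight = {w: math.log(1 + 1 / c) for w, c in counts.items()}
    k = len(target_words)
    n = len(passage_tokens)
    best = 0.0
    for start in range(max(1, n - k + 1)):
        window = passage_tokens[start:start + k]
        score = sum(weight[w] for w in window if w in target_words)
        best = max(best, score)
    return best


def answer(passage, question, options):
    """Return the index of the highest-scoring option and all option scores."""
    p = tokenize(passage)
    q = set(tokenize(question))
    scores = [sliding_window_score(p, q | set(tokenize(o))) for o in options]
    return max(range(len(options)), key=scores.__getitem__), scores


passage = "the cat slept on the mat all day long and the dog ran in the park"
question = "where did the dog run"
options = ["in the park", "on the mat"]
best_index, option_scores = answer(passage, question, options)
print(options[best_index], option_scores)
```

A baseline like this only rewards local word overlap, which is why a corpus that "demands reasoning abilities beyond simple word matching" is hard for it.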

A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts

Title A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts
Authors Lingyun Zhao, Lin Li, Xinhao Zheng
Abstract The emergence and rapid progress of the Internet have had an ever-increasing impact on the financial domain. How to rapidly and accurately mine the key information from massive volumes of negative financial texts has become one of the key issues for investors and decision makers. To address this issue, we propose a sentiment analysis and key entity detection approach based on BERT, which is applied in online financial text mining and public opinion analysis in social media. Using a pre-trained model, we first study sentiment analysis, and then we consider key entity detection as a sentence matching or Machine Reading Comprehension (MRC) task at different granularities. Among these, we mainly focus on negative sentiment information. We detect the specific entity using our approach, which differs from traditional Named Entity Recognition (NER). In addition, we use ensemble learning to improve the performance of the proposed approach. Experimental results show that the performance of our approach is generally higher than SVM, LR, NBM, and BERT on two financial sentiment analysis and key entity detection datasets.
Tasks Machine Reading Comprehension, Named Entity Recognition, Reading Comprehension, Sentiment Analysis
Published 2020-01-14
URL https://arxiv.org/abs/2001.05326v1
PDF https://arxiv.org/pdf/2001.05326v1.pdf
PWC https://paperswithcode.com/paper/a-bert-based-sentiment-analysis-and-key

A Survey on Machine Reading Comprehension Systems

Title A Survey on Machine Reading Comprehension Systems
Authors Razieh Baradaran, Razieh Ghiasi, Hossein Amirkhani
Abstract Machine reading comprehension is a challenging task and hot topic in natural language processing. Its goal is to develop systems to answer the questions regarding a given context. In this paper, we present a comprehensive survey on different aspects of machine reading comprehension systems, including their approaches, structures, input/outputs, and research novelties. We illustrate the recent trends in this field based on 124 reviewed papers from 2016 to 2018. Our investigations demonstrate that the focus of research has changed in recent years from answer extraction to answer generation, from single to multi-document reading comprehension, and from learning from scratch to using pre-trained embeddings. We also discuss the popular datasets and the evaluation metrics in this field. The paper ends with investigating the most cited papers and their contributions.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2020-01-06
URL https://arxiv.org/abs/2001.01582v1
PDF https://arxiv.org/pdf/2001.01582v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-machine-reading-comprehension

Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers

Title Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers
Authors Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich
Abstract We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an apportioned margin framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the margin between them in accordance with a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer’s kernel and to neural networks, and report promising empirical results on all accounts.
Published 2020-02-04
URL https://arxiv.org/abs/2002.01408v1
PDF https://arxiv.org/pdf/2002.01408v1.pdf
PWC https://paperswithcode.com/paper/apportioned-margin-approach-for-cost

Adversarial-based neural networks for affect estimations in the wild

Title Adversarial-based neural networks for affect estimations in the wild
Authors Decky Aspandi, Adria Mallol-Ragolta, Björn Schuller, Xavier Binefa
Abstract There is a growing interest in affective computing research nowadays given its crucial role in bridging humans with computers. This progress has been recently accelerated due to the emergence of bigger data. One recent advance in this field is the use of adversarial learning to improve model learning through augmented samples. However, the use of latent features, which is feasible through adversarial learning, has not yet been widely explored. This technique may also improve the performance of affective models, as analogously demonstrated in related fields, such as computer vision. To expand this analysis, in this work, we explore the use of latent features through our proposed adversarial-based networks for valence and arousal recognition in the wild. Specifically, our models operate by aggregating several modalities to our discriminator, which is further conditioned to the extracted latent features by the generator. Our experiments on the recently released SEWA dataset suggest the progressive improvements of our results. Finally, we show our competitive results on the Affective Behavior Analysis in-the-Wild (ABAW) challenge dataset.
Published 2020-02-03
URL https://arxiv.org/abs/2002.00883v3
PDF https://arxiv.org/pdf/2002.00883v3.pdf
PWC https://paperswithcode.com/paper/adversarial-based-neural-network-for-affect

Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs

Title Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs
Authors Pim de Haan, Maurice Weiler, Taco Cohen, Max Welling
Abstract A common approach to define convolutions on meshes is to interpret them as a graph and apply graph convolutional networks (GCNs). Such GCNs utilize isotropic kernels and are therefore insensitive to the relative orientation of vertices and thus to the geometry of the mesh as a whole. We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels. Since the resulting features carry orientation information, we introduce a geometric message passing scheme defined by parallel transporting features over mesh edges. Our experiments validate the significantly improved expressivity of the proposed model over conventional GCNs and other methods.
Published 2020-03-11
URL https://arxiv.org/abs/2003.05425v1
PDF https://arxiv.org/pdf/2003.05425v1.pdf
PWC https://paperswithcode.com/paper/gauge-equivariant-mesh-cnns-anisotropic

Smoothing Graphons for Modelling Exchangeable Relational Data

Title Smoothing Graphons for Modelling Exchangeable Relational Data
Authors Xuhui Fan, Yaqiong Li, Ling Chen, Bin Li, Scott A. Sisson
Abstract Modelling exchangeable relational data can be described by graphon theory. Most Bayesian methods for modelling exchangeable relational data can be attributed to this framework by exploiting different forms of graphons. However, the graphons adopted by existing Bayesian methods are either piecewise-constant functions, which are insufficiently flexible for accurate modelling of the relational data, or are complicated continuous functions, which incur heavy computational costs for inference. In this work, we introduce a smoothing procedure to piecewise-constant graphons to form smoothing graphons, which permit continuous intensity values for describing relations, but without impractically increasing computational costs. In particular, we focus on the Bayesian Stochastic Block Model (SBM) and demonstrate how to adapt the piecewise-constant SBM graphon to the smoothed version. We initially propose the Integrated Smoothing Graphon (ISG) which introduces one smoothing parameter to the SBM graphon to generate continuous relational intensity values. We then develop the Latent Feature Smoothing Graphon (LFSG), which improves on the ISG by introducing auxiliary hidden labels to decompose the calculation of the ISG intensity and enable efficient inference. Experimental results on real-world data sets validate the advantages of applying smoothing strategies to the Stochastic Block Model, demonstrating that smoothing graphons can greatly improve AUC and precision for link prediction without increasing computational complexity.
Tasks Link Prediction
Published 2020-02-25
URL https://arxiv.org/abs/2002.11159v1
PDF https://arxiv.org/pdf/2002.11159v1.pdf
PWC https://paperswithcode.com/paper/smoothing-graphons-for-modelling-exchangeable

Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity

Title Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity
Authors Nikolaos Vlassis, Ran Ma, WaiChing Sun
Abstract This paper is the first attempt to use geometric deep learning and Sobolev training to incorporate non-Euclidean microstructural data such that anisotropic hyperelastic material machine learning models can be trained in the finite deformation range. While traditional hyperelasticity models often incorporate homogenized measures of microstructural attributes, such as porosity or averaged orientation of constituents, these measures cannot reflect the topological structures of the attributes. We fill this knowledge gap by introducing the concept of weighted graph as a new means to store topological information, such as the connectivity of anisotropic grains in assembles. Then, by leveraging a graph convolutional deep neural network architecture in the spectral domain, we introduce a mechanism to incorporate these non-Euclidean weighted graph data directly as input for training and for predicting the elastic responses of materials with complex microstructures. To ensure smoothness and prevent non-convexity of the trained stored energy functional, we introduce a Sobolev training technique for neural networks such that the stress measure is obtained implicitly from taking directional derivatives of the trained energy functional. By optimizing the neural network to approximate both the energy functional output and the stress measure, we introduce a training procedure that improves efficiency and generalizes the learned energy functional for different microstructures. The trained hybrid neural network model is then used to generate new stored energy functionals for unseen microstructures in a parametric study to predict the influence of elastic anisotropy on the nucleation and propagation of fracture in the brittle regime.
Published 2020-01-08
URL https://arxiv.org/abs/2001.04292v1
PDF https://arxiv.org/pdf/2001.04292v1.pdf
PWC https://paperswithcode.com/paper/geometric-deep-learning-for-computational

A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast and Rigorous Deep Learning

Title A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast and Rigorous Deep Learning
Authors Christoph Lauter, Anastasia Volkova
Abstract Deep Neural Networks (DNN) represent a performance-hungry application. Floating-Point (FP) and custom floating-point-like arithmetic satisfies this hunger. While there is a need for speed, inference in DNNs does not seem to have any need for precision. Many papers experimentally observe that DNNs can successfully run at almost ridiculously low precision. The aim of this paper is two-fold: first, to shed some theoretical light upon why a DNN’s FP accuracy stays high for low FP precision. We observe that the loss of relative accuracy in the convolutional steps is recovered by the activation layers, which are extremely well-conditioned. We give an interpretation for the link between precision and accuracy in DNNs. Second, the paper presents a software framework for semi-automatic FP error analysis for the inference phase of deep learning. Compatible with common TensorFlow/Keras models, it leverages the frugally-deep Python/C++ library to transform a neural network into C++ code in order to analyze the network’s need for precision. This rigorous analysis is based on Interval and Affine arithmetics to compute absolute and relative error bounds for a DNN. We demonstrate our tool with several examples.
Published 2020-02-10
URL https://arxiv.org/abs/2002.03869v1
PDF https://arxiv.org/pdf/2002.03869v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-semi-automatic-precision-and
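
To make the interval arithmetic idea from the abstract concrete, here is a minimal sketch that propagates input intervals through one dense layer followed by ReLU. It is an illustration only, not the authors' framework: the function name, the layer weights, and the input ranges are made up, and the code uses ordinary floating-point operations without outward rounding, so the bounds are not rigorous in the sense the paper intends.

```python
def interval_dense_relu(lower, upper, weights, biases):
    """Propagate elementwise input bounds [lower, upper] through ReLU(W x + b).

    weights is a list of rows, one row per output neuron.
    Returns (out_lower, out_upper) as lists of floats.
    """
    out_lower, out_upper = [], []
    for row, b in zip(weights, biases):
        lo_sum, hi_sum = b, b
        for w, lo, hi in zip(row, lower, upper):
            # w * [lo, hi] is [w*lo, w*hi] for w >= 0 and [w*hi, w*lo] otherwise.
            a, c = w * lo, w * hi
            lo_sum += min(a, c)
            hi_sum += max(a, c)
        # ReLU is monotone, so it can be applied to each bound separately.
        out_lower.append(max(0.0, lo_sum))
        out_upper.append(max(0.0, hi_sum))
    return out_lower, out_upper


# Hypothetical layer and input uncertainty of +/- 0.1 around (1.0, 0.0).
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, -1.0]
lo, hi = interval_dense_relu([0.9, -0.1], [1.1, 0.1], W, b)
print(lo, hi)
```

In this toy case the second neuron's pre-activation interval is entirely negative, so ReLU collapses it to zero width. The sketch only shows how bounds move through a layer; the paper's claim about activation layers and relative accuracy is a separate, more specific statement.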

Toward equipping Artificial Moral Agents with multiple ethical theories

Title Toward equipping Artificial Moral Agents with multiple ethical theories
Authors George Rautenbach, C. Maria Keet
Abstract Artificial Moral Agents (AMAs) is a field in computer science with the purpose of creating autonomous machines that can make moral decisions akin to how humans do. Researchers have proposed theoretical means of creating such machines, while philosophers have made arguments as to how these machines ought to behave, or whether they should even exist. Of the currently theorised AMAs, all research and design has been done with either none or at most one specified normative ethical theory as basis. This is problematic because it narrows down the AMA’s functional ability and versatility, which in turn causes moral outcomes that only a limited number of people agree with (thereby undermining an AMA’s ability to be moral in a human sense). As a solution, we design a three-layer model for general normative ethical theories that can be used to serialise the ethical views of people and businesses for an AMA to use during reasoning. Four specific ethical norms (Kantianism, divine command theory, utilitarianism, and egoism) were modelled and evaluated as proof of concept for normative modelling. Furthermore, all models were serialised to XML/XSD as proof of support for computerisation.
Published 2020-03-02
URL https://arxiv.org/abs/2003.00935v1
PDF https://arxiv.org/pdf/2003.00935v1.pdf
PWC https://paperswithcode.com/paper/toward-equipping-artificial-moral-agents-with

Autoencoder-based time series clustering with energy applications

Title Autoencoder-based time series clustering with energy applications
Authors Guillaume Richard, Benoît Grossin, Guillaume Germaine, Georges Hébrail, Anne de Moliner
Abstract Time series clustering is a challenging task due to the specific nature of the data. Classical approaches do not perform well and need to be adapted either through a new distance measure or a data transformation. In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering. The convolutional autoencoder extracts meaningful features and reduces the dimension of the data, leading to an improvement of the subsequent clustering. Using simulation and energy related data to validate the approach, experimental results show that the clustering is robust to outliers, thus leading to finer clusters than with standard methods.
Tasks Time Series, Time Series Clustering
Published 2020-02-10
URL https://arxiv.org/abs/2002.03624v1
PDF https://arxiv.org/pdf/2002.03624v1.pdf
PWC https://paperswithcode.com/paper/autoencoder-based-time-series-clustering-with
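
The clustering half of the pipeline in the abstract can be sketched in a few lines. The version below is a simple alternating k-medoids, assuming Euclidean distance and a deterministic initialization on the first `k` points. The convolutional autoencoder is omitted entirely: the 2-D `features` stand in for whatever latent vectors an encoder would produce. None of this is taken from the authors' implementation.

```python
import math


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: math.dist(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, k, max_iter=100):
    """Alternate between assigning points and re-picking each cluster's medoid."""
    medoids = list(range(k))  # assumption: start from the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, label in enumerate(labels) if label == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the rest.
            new_medoids.append(
                min(members, key=lambda i: sum(math.dist(points[i], points[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Placeholder latent features: two well-separated groups of three points.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoid_indices, cluster_labels = k_medoids(features, k=2)
print(medoid_indices, cluster_labels)
```

Because every cluster center is an actual data point, a single extreme observation cannot drag a center toward it the way it can with a mean, which fits the robustness to outliers that the abstract reports.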

Adaptive Informative Path Planning with Multimodal Sensing

Title Adaptive Informative Path Planning with Multimodal Sensing
Authors Shushman Choudhury, Nate Gruver, Mykel J. Kochenderfer
Abstract Adaptive Informative Path Planning (AIPP) problems model an agent tasked with obtaining information subject to resource constraints in unknown, partially observable environments. Existing work on AIPP has focused on representing observations about the world as a result of agent movement. We formulate the more general setting where the agent may choose between different sensors at the cost of some energy, in addition to traversing the environment to gather information. We call this problem AIPPMS (MS for Multimodal Sensing). AIPPMS requires reasoning jointly about the effects of sensing and movement in terms of both energy expended and information gained. We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning. Our approach is based on the Partially Observable Monte Carlo Planning framework with modifications to ensure constraint feasibility and a heuristic rollout policy tailored for AIPPMS. We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem. We find that our approach outperforms a classic AIPP algorithm that is modified for AIPPMS, as well as online planning using a random rollout policy.
Published 2020-03-21
URL https://arxiv.org/abs/2003.09746v1
PDF https://arxiv.org/pdf/2003.09746v1.pdf
PWC https://paperswithcode.com/paper/adaptive-informative-path-planning-with

Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case

Title Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case
Authors Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, Renato Cerqueira
Abstract Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as “where has the dataset used to train this model come from?”; the introduction of several new pieces of data protection legislation; and the need for data governance requirements have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle to build ML models to discover sweet-spots for shale oil and gas production, a major application in the Oil and Gas (O&G) industry.
Published 2020-03-10
URL https://arxiv.org/abs/2003.04915v1
PDF https://arxiv.org/pdf/2003.04915v1.pdf
PWC https://paperswithcode.com/paper/managing-data-lineage-of-og-machine-learning

Adversarial Machine Learning: Perspectives from Adversarial Risk Analysis

Title Adversarial Machine Learning: Perspectives from Adversarial Risk Analysis
Authors David Rios Insua, Roi Naveiro, Victor Gallego, Jason Poulos
Abstract Adversarial Machine Learning (AML) is emerging as a major field aimed at the protection of automated ML systems against security threats. The majority of work in this area has built upon a game-theoretic framework by modelling a conflict between an attacker and a defender. After reviewing game-theoretic approaches to AML, we discuss the benefits that a Bayesian Adversarial Risk Analysis perspective brings when defending ML based systems. A research agenda is included.
Published 2020-03-07
URL https://arxiv.org/abs/2003.03546v1
PDF https://arxiv.org/pdf/2003.03546v1.pdf
PWC https://paperswithcode.com/paper/adversarial-machine-learning-perspectives