Paper Group ANR 83
Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms
Title | Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms |
Authors | T. Tony Cai, Hongji Wei |
Abstract | We study distributed estimation of a Gaussian mean under communication constraints in a decision theoretical framework. Minimax rates of convergence, which characterize the tradeoff between the communication costs and statistical accuracy, are established in both the univariate and multivariate settings. Communication-efficient and statistically optimal procedures are developed. In the univariate case, the optimal rate depends only on the total communication budget, so long as each local machine has at least one bit. However, in the multivariate case, the minimax rate depends on the specific allocations of the communication budgets among the local machines. Although optimal estimation of a Gaussian mean is relatively simple in the conventional setting, it is quite involved under the communication constraints, both in terms of the optimal procedure design and lower bound argument. The techniques developed in this paper can be of independent interest. An essential step is the decomposition of the minimax estimation problem into two stages, localization and refinement. This critical decomposition provides a framework for both the lower bound analysis and optimal procedure design. |
Tasks | |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.08877v1, https://arxiv.org/pdf/2001.08877v1.pdf |
PWC | https://paperswithcode.com/paper/distributed-gaussian-mean-estimation-under |
Repo | |
Framework | |
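The abstract's one-bit-per-machine regime can be illustrated with a toy estimator. The sketch below is not the authors' procedure: it assumes the noise level `sigma` is known and that the mean has already been localized near a shared `threshold` (both assumptions of this example), and all function names are hypothetical. Each machine sends the single bit 1{X > threshold}, and the center inverts the Gaussian CDF on the fraction of ones.

```python
import random
from statistics import NormalDist


def one_bit_messages(samples, threshold=0.0):
    """Each local machine transmits one bit: whether its sample exceeds the threshold."""
    return [1 if x > threshold else 0 for x in samples]


def estimate_mean_from_bits(bits, sigma=1.0, threshold=0.0):
    """Central estimate: P(X > t) = Phi((theta - t) / sigma), so theta = t + sigma * Phi^{-1}(p)."""
    m = len(bits)
    p_hat = sum(bits) / m
    # Keep p_hat strictly inside (0, 1) so the inverse CDF is defined.
    eps = 1.0 / (2 * m)
    p_hat = min(max(p_hat, eps), 1.0 - eps)
    return threshold + sigma * NormalDist().inv_cdf(p_hat)


# Simulated example: 20,000 machines, one observation and one bit each.
rng = random.Random(0)
theta, sigma = 0.3, 1.0
samples = [rng.gauss(theta, sigma) for _ in range(20000)]
bits = one_bit_messages(samples)
estimate = estimate_mean_from_bits(bits, sigma)
print(f"true mean {theta}, one-bit estimate {estimate:.3f}")
```

The estimate degrades quickly when the true mean lies far from the threshold, because almost all bits are then identical. This is one way to see why a localization step before refinement is useful.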
A Pilot Study on Multiple Choice Machine Reading Comprehension for Vietnamese Texts
Title | A Pilot Study on Multiple Choice Machine Reading Comprehension for Vietnamese Texts |
Authors | Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen |
Abstract | Machine Reading Comprehension (MRC) is a natural language processing task that studies the ability to read and understand unstructured texts and then find the correct answers to questions. Until now, no MRC dataset has been available for a low-resource language such as Vietnamese. In this paper, we introduce ViMMRC, a challenging machine comprehension corpus with multiple-choice questions, intended for research on the machine comprehension of Vietnamese text. This corpus includes 2,783 multiple-choice questions and answers based on a set of 417 Vietnamese texts used for teaching reading comprehension to 1st to 5th graders. Answers may be extracted from the contents of single or multiple sentences in the corresponding reading text. A thorough analysis of the corpus and the experimental results in this paper illustrate that our corpus ViMMRC demands reasoning abilities beyond simple word matching. We propose the Boosted Sliding Window (BSW) method, which improves accuracy by 5.51% over the best baseline method. We also measured human performance on the corpus and compared it to our MRC models. The performance gap between humans and our best experimental model indicates that significant progress can be made on Vietnamese machine reading comprehension in further research. The corpus is freely available at our website for research purposes. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05687v2, https://arxiv.org/pdf/2001.05687v2.pdf |
PWC | https://paperswithcode.com/paper/a-pilot-study-on-multiple-choice-machine |
Repo | |
Framework | |
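For orientation, a plain sliding-window word-matching baseline for multiple-choice questions can be sketched as follows. This is not the paper's Boosted Sliding Window method, whose details the abstract does not give. The inverse-count weighting, the whitespace tokenization, the English toy passage, and all function names are assumptions of this example.

```python
import math
from collections import Counter


def sliding_window_score(passage, question, answer):
    """Best weighted overlap between any passage window and the question+answer words."""
    counts = Counter(passage)
    target = set(question) | set(answer)
    size = len(target)
    best = 0.0
    for start in range(max(1, len(passage) - size + 1)):
        window = passage[start:start + size]
        # Rare passage words count more than frequent ones (inverse-count weight).
        score = sum(math.log(1.0 + 1.0 / counts[tok]) for tok in window if tok in target)
        best = max(best, score)
    return best


def choose_answer(passage_text, question_text, options):
    """Return the index of the option with the highest sliding-window score."""
    passage = passage_text.lower().split()
    question = question_text.lower().split()
    scores = [sliding_window_score(passage, question, opt.lower().split()) for opt in options]
    return scores.index(max(scores))


passage_text = "the cat sat on the mat by the door while the dog slept in the garden"
question_text = "where has the dog slept"
options = ["mat", "garden"]
chosen = choose_answer(passage_text, question_text, options)
print(options[chosen])
```

Because the score depends only on surface overlap, questions that require combining several sentences or paraphrasing defeat this kind of baseline, which matches the abstract's point that the corpus demands more than simple word matching.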
A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts
Title | A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts |
Authors | Lingyun Zhao, Lin Li, Xinhao Zheng |
Abstract | The emergence and rapid progress of the Internet have had an ever-increasing impact on the financial domain. How to rapidly and accurately mine key information from the massive volume of negative financial texts has become one of the key issues for investors and decision makers. To address this issue, we propose a sentiment analysis and key entity detection approach based on BERT, which is applied to online financial text mining and public opinion analysis in social media. Using a pre-trained model, we first study sentiment analysis, and then we treat key entity detection as a sentence matching or Machine Reading Comprehension (MRC) task at different granularities. We focus mainly on negative sentiment information. We detect the specific entity using our approach, which differs from traditional Named Entity Recognition (NER). In addition, we use ensemble learning to improve the performance of the proposed approach. Experimental results show that the performance of our approach is generally higher than that of SVM, LR, NBM, and BERT on two financial sentiment analysis and key entity detection datasets. |
Tasks | Machine Reading Comprehension, Named Entity Recognition, Reading Comprehension, Sentiment Analysis |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.05326v1, https://arxiv.org/pdf/2001.05326v1.pdf |
PWC | https://paperswithcode.com/paper/a-bert-based-sentiment-analysis-and-key |
Repo | |
Framework | |
A Survey on Machine Reading Comprehension Systems
Title | A Survey on Machine Reading Comprehension Systems |
Authors | Razieh Baradaran, Razieh Ghiasi, Hossein Amirkhani |
Abstract | Machine reading comprehension is a challenging task and a hot topic in natural language processing. Its goal is to develop systems that answer questions about a given context. In this paper, we present a comprehensive survey of different aspects of machine reading comprehension systems, including their approaches, structures, inputs/outputs, and research novelties. We illustrate the recent trends in this field based on 124 reviewed papers from 2016 to 2018. Our investigations demonstrate that the focus of research has changed in recent years from answer extraction to answer generation, from single- to multi-document reading comprehension, and from learning from scratch to using pre-trained embeddings. We also discuss the popular datasets and the evaluation metrics in this field. The paper ends by examining the most cited papers and their contributions. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01582v1, https://arxiv.org/pdf/2001.01582v1.pdf |
PWC | https://paperswithcode.com/paper/a-survey-on-machine-reading-comprehension |
Repo | |
Framework | |
Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers
Title | Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers |
Authors | Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich |
Abstract | We consider the problem of cost-sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an apportioned margin framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the margin between them in accordance with a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer’s kernel and to neural networks, and report promising empirical results on all accounts. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01408v1, https://arxiv.org/pdf/2002.01408v1.pdf |
PWC | https://paperswithcode.com/paper/apportioned-margin-approach-for-cost |
Repo | |
Framework | |
Adversarial-based neural networks for affect estimations in the wild
Title | Adversarial-based neural networks for affect estimations in the wild |
Authors | Decky Aspandi, Adria Mallol-Ragolta, Björn Schuller, Xavier Binefa |
Abstract | There is growing interest in affective computing research, given its crucial role in bridging humans and computers. This progress has recently been accelerated by the emergence of larger datasets. One recent advance in this field is the use of adversarial learning to improve model learning through augmented samples. However, the use of latent features, which adversarial learning makes feasible, has not yet been widely explored. This technique may also improve the performance of affective models, as analogously demonstrated in related fields such as computer vision. To expand this analysis, in this work we explore the use of latent features through our proposed adversarial-based networks for valence and arousal recognition in the wild. Specifically, our models operate by aggregating several modalities to our discriminator, which is further conditioned on the latent features extracted by the generator. Our experiments on the recently released SEWA dataset suggest progressive improvements in our results. Finally, we show competitive results on the Affective Behavior Analysis in-the-Wild (ABAW) challenge dataset. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00883v3, https://arxiv.org/pdf/2002.00883v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-based-neural-network-for-affect |
Repo | |
Framework | |
Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs
Title | Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs |
Authors | Pim de Haan, Maurice Weiler, Taco Cohen, Max Welling |
Abstract | A common approach to defining convolutions on meshes is to interpret them as a graph and apply graph convolutional networks (GCNs). Such GCNs utilize isotropic kernels and are therefore insensitive to the relative orientation of vertices and thus to the geometry of the mesh as a whole. We propose Gauge Equivariant Mesh CNNs which generalize GCNs to apply anisotropic gauge equivariant kernels. Since the resulting features carry orientation information, we introduce a geometric message passing scheme defined by parallel transporting features over mesh edges. Our experiments validate the significantly improved expressivity of the proposed model over conventional GCNs and other methods. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05425v1, https://arxiv.org/pdf/2003.05425v1.pdf |
PWC | https://paperswithcode.com/paper/gauge-equivariant-mesh-cnns-anisotropic |
Repo | |
Framework | |
Smoothing Graphons for Modelling Exchangeable Relational Data
Title | Smoothing Graphons for Modelling Exchangeable Relational Data |
Authors | Xuhui Fan, Yaqiong Li, Ling Chen, Bin Li, Scott A. Sisson |
Abstract | Modelling exchangeable relational data can be described by graphon theory. Most Bayesian methods for modelling exchangeable relational data can be attributed to this framework by exploiting different forms of graphons. However, the graphons adopted by existing Bayesian methods are either piecewise-constant functions, which are insufficiently flexible for accurate modelling of the relational data, or are complicated continuous functions, which incur heavy computational costs for inference. In this work, we introduce a smoothing procedure to piecewise-constant graphons to form smoothing graphons, which permit continuous intensity values for describing relations, but without impractically increasing computational costs. In particular, we focus on the Bayesian Stochastic Block Model (SBM) and demonstrate how to adapt the piecewise-constant SBM graphon to the smoothed version. We initially propose the Integrated Smoothing Graphon (ISG) which introduces one smoothing parameter to the SBM graphon to generate continuous relational intensity values. We then develop the Latent Feature Smoothing Graphon (LFSG), which improves on the ISG by introducing auxiliary hidden labels to decompose the calculation of the ISG intensity and enable efficient inference. Experimental results on real-world data sets validate the advantages of applying smoothing strategies to the Stochastic Block Model, demonstrating that smoothing graphons can greatly improve AUC and precision for link prediction without increasing computational complexity. |
Tasks | Link Prediction |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.11159v1, https://arxiv.org/pdf/2002.11159v1.pdf |
PWC | https://paperswithcode.com/paper/smoothing-graphons-for-modelling-exchangeable |
Repo | |
Framework | |
Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity
Title | Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity |
Authors | Nikolaos Vlassis, Ran Ma, WaiChing Sun |
Abstract | This paper is the first attempt to use geometric deep learning and Sobolev training to incorporate non-Euclidean microstructural data such that anisotropic hyperelastic material machine learning models can be trained in the finite deformation range. While traditional hyperelasticity models often incorporate homogenized measures of microstructural attributes, such as porosity or the averaged orientation of constituents, these measures cannot reflect the topological structures of the attributes. We fill this knowledge gap by introducing the concept of a weighted graph as a new means of storing topological information, such as the connectivity of anisotropic grains in assemblies. Then, by leveraging a graph convolutional deep neural network architecture in the spectral domain, we introduce a mechanism to incorporate these non-Euclidean weighted graph data directly as input for training and for predicting the elastic responses of materials with complex microstructures. To ensure smoothness and prevent non-convexity of the trained stored energy functional, we introduce a Sobolev training technique for neural networks such that the stress measure is obtained implicitly by taking directional derivatives of the trained energy functional. By optimizing the neural network to approximate both the energy functional output and the stress measure, we introduce a training procedure that improves efficiency and generalizes the learned energy functional to different microstructures. The trained hybrid neural network model is then used to generate new stored energy functionals for unseen microstructures in a parametric study to predict the influence of elastic anisotropy on the nucleation and propagation of fracture in the brittle regime. |
Tasks | |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.04292v1, https://arxiv.org/pdf/2001.04292v1.pdf |
PWC | https://paperswithcode.com/paper/geometric-deep-learning-for-computational |
Repo | |
Framework | |
A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast and Rigorous Deep Learning
Title | A Framework for Semi-Automatic Precision and Accuracy Analysis for Fast and Rigorous Deep Learning |
Authors | Christoph Lauter, Anastasia Volkova |
Abstract | Deep Neural Networks (DNNs) represent a performance-hungry application. Floating-Point (FP) and custom floating-point-like arithmetic satisfies this hunger. While there is a need for speed, inference in DNNs does not seem to have any need for precision. Many papers experimentally observe that DNNs can successfully run at almost ridiculously low precision. The aim of this paper is two-fold. First, it sheds some theoretical light on why a DNN’s FP accuracy stays high at low FP precision. We observe that the loss of relative accuracy in the convolutional steps is recovered by the activation layers, which are extremely well-conditioned. We give an interpretation for the link between precision and accuracy in DNNs. Second, the paper presents a software framework for semi-automatic FP error analysis for the inference phase of deep learning. Compatible with common TensorFlow/Keras models, it leverages the frugally-deep Python/C++ library to transform a neural network into C++ code in order to analyze the network’s need for precision. This rigorous analysis is based on interval and affine arithmetic to compute absolute and relative error bounds for a DNN. We demonstrate our tool with several examples. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03869v1, https://arxiv.org/pdf/2002.03869v1.pdf |
PWC | https://paperswithcode.com/paper/a-framework-for-semi-automatic-precision-and |
Repo | |
Framework | |
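The interval-arithmetic idea mentioned in the abstract can be shown on a single ReLU neuron. This is a minimal sketch, not the paper's tool or its API: the `Interval` class and `neuron` function are hypothetical, affine arithmetic is not covered, and the outward rounding that a rigorous analysis needs is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] enclosing an uncertain value."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # The product range is spanned by the four endpoint products.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def relu(self):
        # ReLU is monotone, so it can be applied to each endpoint.
        return Interval(max(0.0, self.lo), max(0.0, self.hi))

    def width(self):
        return self.hi - self.lo


def neuron(inputs, weights, bias):
    """Enclose relu(sum_i w_i * x_i + b) when inputs are only known up to intervals."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc = acc + x * w
    return acc.relu()


# Inputs known up to +/- 0.25; weights and bias are exact (degenerate intervals).
inputs = [Interval(0.75, 1.25), Interval(-0.25, 0.25)]
weights = [Interval(2.0, 2.0), Interval(-4.0, -4.0)]
bias = Interval(0.5, 0.5)
out = neuron(inputs, weights, bias)
print(out, out.width())
```

The width of the output interval bounds the absolute error that the input uncertainty can cause. When the pre-activation interval is partly negative, ReLU clips it at zero, so the output width can be smaller than the pre-activation width.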
Toward equipping Artificial Moral Agents with multiple ethical theories
Title | Toward equipping Artificial Moral Agents with multiple ethical theories |
Authors | George Rautenbach, C. Maria Keet |
Abstract | Artificial Moral Agents (AMAs) is a field in computer science with the purpose of creating autonomous machines that can make moral decisions akin to how humans do. Researchers have proposed theoretical means of creating such machines, while philosophers have made arguments as to how these machines ought to behave, or whether they should even exist. Of the currently theorised AMAs, all research and design has been done with either no or at most one specified normative ethical theory as a basis. This is problematic because it narrows down the AMA’s functional ability and versatility, which in turn causes moral outcomes that only a limited number of people agree with (thereby undermining an AMA’s ability to be moral in a human sense). As a solution, we design a three-layer model for general normative ethical theories that can be used to serialise the ethical views of people and businesses for an AMA to use during reasoning. Four specific ethical norms (Kantianism, divine command theory, utilitarianism, and egoism) were modelled and evaluated as proof of concept for normative modelling. Furthermore, all models were serialised to XML/XSD as proof of support for computerisation. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00935v1, https://arxiv.org/pdf/2003.00935v1.pdf |
PWC | https://paperswithcode.com/paper/toward-equipping-artificial-moral-agents-with |
Repo | |
Framework | |
Autoencoder-based time series clustering with energy applications
Title | Autoencoder-based time series clustering with energy applications |
Authors | Guillaume Richard, Benoît Grossin, Guillaume Germaine, Georges Hébrail, Anne de Moliner |
Abstract | Time series clustering is a challenging task due to the specific nature of the data. Classical approaches do not perform well and need to be adapted either through a new distance measure or a data transformation. In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering. The convolutional autoencoder extracts meaningful features and reduces the dimension of the data, leading to an improvement of the subsequent clustering. Using simulation and energy-related data to validate the approach, experimental results show that the clustering is robust to outliers, thus leading to finer clusters than with standard methods. |
Tasks | Time Series, Time Series Clustering |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03624v1, https://arxiv.org/pdf/2002.03624v1.pdf |
PWC | https://paperswithcode.com/paper/autoencoder-based-time-series-clustering-with |
Repo | |
Framework | |
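The clustering stage of such a pipeline can be sketched with a small alternating k-medoids routine. This is an illustration under assumptions, not the authors' implementation: the `features` list stands in for latent codes that a convolutional autoencoder would produce, the autoencoder itself is not shown, and the function name and fixed initial medoids are hypothetical.

```python
import math


def k_medoids(points, initial_medoids, max_iter=100):
    """Alternating k-medoids: assign points to the nearest medoid, then re-pick each medoid."""

    def assign(medoids):
        # Label each point with the index of its nearest medoid (ties go to the first).
        return [min(range(len(medoids)), key=lambda c: math.dist(p, points[medoids[c]]))
                for p in points]

    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # The new medoid is the member with the smallest total distance to the others.
            best = min(members,
                       key=lambda i: sum(math.dist(points[i], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Stand-in latent features for six time series (two obvious groups).
features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0),
            (10.0, 0.0), (10.1, 0.0), (10.2, 0.0)]
medoids, labels = k_medoids(features, initial_medoids=[0, 1])
print(medoids, labels)
```

Because each cluster center is an actual data point chosen by total distance rather than a mean, a single extreme series shifts it less than it would shift a k-means centroid, which is consistent with the robustness to outliers reported in the abstract.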
Adaptive Informative Path Planning with Multimodal Sensing
Title | Adaptive Informative Path Planning with Multimodal Sensing |
Authors | Shushman Choudhury, Nate Gruver, Mykel J. Kochenderfer |
Abstract | Adaptive Informative Path Planning (AIPP) problems model an agent tasked with obtaining information subject to resource constraints in unknown, partially observable environments. Existing work on AIPP has focused on representing observations about the world as a result of agent movement. We formulate the more general setting where the agent may choose between different sensors at the cost of some energy, in addition to traversing the environment to gather information. We call this problem AIPPMS (MS for Multimodal Sensing). AIPPMS requires reasoning jointly about the effects of sensing and movement in terms of both energy expended and information gained. We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning. Our approach is based on the Partially Observable Monte Carlo Planning framework with modifications to ensure constraint feasibility and a heuristic rollout policy tailored for AIPPMS. We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem. We find that our approach outperforms a classic AIPP algorithm that is modified for AIPPMS, as well as online planning using a random rollout policy. |
Tasks | |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09746v1, https://arxiv.org/pdf/2003.09746v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-informative-path-planning-with |
Repo | |
Framework | |
Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case
Title | Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case |
Authors | Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, Renato Cerqueira |
Abstract | Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as “where has the dataset used to train this model come from?”; the introduction of several new data protection laws; and the need for data governance requirements have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle when building ML models to discover sweet spots for shale oil and gas production, a major application in the Oil and Gas (O&G) industry. |
Tasks | |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04915v1, https://arxiv.org/pdf/2003.04915v1.pdf |
PWC | https://paperswithcode.com/paper/managing-data-lineage-of-og-machine-learning |
Repo | |
Framework | |
Adversarial Machine Learning: Perspectives from Adversarial Risk Analysis
Title | Adversarial Machine Learning: Perspectives from Adversarial Risk Analysis |
Authors | David Rios Insua, Roi Naveiro, Victor Gallego, Jason Poulos |
Abstract | Adversarial Machine Learning (AML) is emerging as a major field aimed at the protection of automated ML systems against security threats. The majority of work in this area has built upon a game-theoretic framework by modelling a conflict between an attacker and a defender. After reviewing game-theoretic approaches to AML, we discuss the benefits that a Bayesian Adversarial Risk Analysis perspective brings when defending ML based systems. A research agenda is included. |
Tasks | |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03546v1, https://arxiv.org/pdf/2003.03546v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-machine-learning-perspectives |
Repo | |
Framework | |