January 28, 2020

3005 words 15 mins read

Paper Group ANR 807

Large-Scale Statistical Survey of Magnetopause Reconnection

Title Large-Scale Statistical Survey of Magnetopause Reconnection
Authors Samantha Piatt
Abstract The Magnetospheric Multiscale Mission (MMS) seeks to study the micro-physics of reconnection, which occurs at the magnetopause boundary layer between the magnetosphere of Earth and the interplanetary magnetic field originating from the sun. Identifying this region of space automatically will allow for statistical analysis of reconnection events. The magnetopause region is difficult to identify automatically using simple models, and time-consuming for scientists to classify by hand. We introduced a hierarchical Bayesian mixture model with linear and autoregressive components to identify the magnetopause. Using data from the MMS mission with the programming languages R and Stan, we modeled and predicted possible regions and evaluated our performance against a boosted regression tree model. Our model selects twice as many magnetopause regions as the comparison model, without significant over-selection, achieving a 31% true positive rate and 93% true negative rate. Our method will allow scientists to study the micro-physics of reconnection events in the magnetopause using the large body of MMS data without manual classification.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.11359v1
PDF https://arxiv.org/pdf/1905.11359v1.pdf
PWC https://paperswithcode.com/paper/large-scale-statistical-survey-of
Repo
Framework
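
The headline numbers above (31% true positive rate, 93% true negative rate) reduce to simple counts over labeled time windows. Below is a minimal sketch, in Python rather than the paper's R/Stan stack, of how such rates are computed; the label arrays are hypothetical stand-ins for hand-classified MMS intervals.

```python
# A minimal sketch (not the paper's code) of the evaluation step:
# comparing predicted magnetopause flags against hand labels via
# true positive and true negative rates.
import numpy as np

def tpr_tnr(y_true: np.ndarray, y_pred: np.ndarray) -> tuple[float, float]:
    """True positive rate and true negative rate for binary region flags."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 1 = magnetopause region, 0 = elsewhere.
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 0, 1, 0])
print(tpr_tnr(y_true, y_pred))  # (0.666..., 0.8)
```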

Optimally Compressed Nonparametric Online Learning

Title Optimally Compressed Nonparametric Online Learning
Authors Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler
Abstract Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of interest due to their universality and ability to stably incorporate new information via convexity or Bayes’ Rule. Unfortunately, when used online, nonparametric methods suffer a “curse of dimensionality” which precludes their use: their complexity scales at least with the time index. We survey online compression tools which bring their memory under control and attain approximate convergence. The asymptotic bias depends on a compression parameter that trades off memory and accuracy. Further, the applications to robotics, communications, economics, and power are discussed, as well as extensions to multi-agent systems.
Tasks
Published 2019-09-25
URL https://arxiv.org/abs/1909.11555v2
PDF https://arxiv.org/pdf/1909.11555v2.pdf
PWC https://paperswithcode.com/paper/optimally-compressed-nonparametric-online
Repo
Framework
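
As a rough illustration of the compression parameter trading memory against accuracy, here is a hedged Python sketch of a budgeted online kernel regressor that admits a streaming point into its dictionary only when the current prediction error exceeds a threshold eps; it mimics the general idea, not the paper's exact compression operator.

```python
# Simplified sketch of memory-controlled nonparametric online learning:
# only informative points enter the kernel dictionary, so memory stays
# bounded at the cost of an eps-dependent bias.
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

class BudgetedKernelRegressor:
    def __init__(self, eps=0.1, lr=0.5):
        self.eps, self.lr = eps, lr
        self.dictionary, self.weights = [], []

    def predict(self, x):
        return sum(w * rbf(d, x) for d, w in zip(self.dictionary, self.weights))

    def update(self, x, y):
        err = y - self.predict(x)
        if abs(err) > self.eps:          # admit only informative points
            self.dictionary.append(x)
            self.weights.append(self.lr * err)

# Stream of noisy samples from a 1-D target function.
rng = np.random.default_rng(0)
model = BudgetedKernelRegressor(eps=0.2)
for _ in range(500):
    x = rng.uniform(-3, 3, size=1)
    model.update(x, np.sin(x[0]) + 0.05 * rng.standard_normal())
print(len(model.dictionary))  # typically far below the 500 stream points
```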

Deep learning languages: a key fundamental shift from probabilities to weights?

Title Deep learning languages: a key fundamental shift from probabilities to weights?
Authors François Coste
Abstract Recent successes in language modeling, notably with deep learning methods, coincide with a shift from probabilistic to weighted representations. We raise here the question of the importance of this evolution, in the light of the practical limitations of a classical and simple probabilistic modeling approach for the classification of protein sequences and in relation to the need for principled methods to learn non-probabilistic models.
Tasks Language Modelling
Published 2019-08-02
URL https://arxiv.org/abs/1908.00785v1
PDF https://arxiv.org/pdf/1908.00785v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-languages-a-key-fundamental
Repo
Framework
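
To make the probabilistic-versus-weighted distinction concrete, here is a toy Python sketch (not from the paper) contrasting a bigram model whose transition scores must form a normalized distribution with a weighted scorer whose transition weights are unconstrained reals.

```python
# Toy contrast: normalized probabilities vs. free weights for scoring
# transitions in a sequence model. The numbers are illustrative.
import math

probs = {("A", "B"): 0.7, ("A", "C"): 0.3}      # P(next | prev) sums to one
weights = {("A", "B"): 2.1, ("A", "C"): -0.4}   # unconstrained weights

def prob_score(seq):
    return math.prod(probs[(a, b)] for a, b in zip(seq, seq[1:]))

def weight_score(seq):
    return sum(weights[(a, b)] for a, b in zip(seq, seq[1:]))

print(prob_score(["A", "B"]), weight_score(["A", "B"]))
```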

An Information Extraction and Knowledge Graph Platform for Accelerating Biochemical Discoveries

Title An Information Extraction and Knowledge Graph Platform for Accelerating Biochemical Discoveries
Authors Matteo Manica, Christoph Auer, Valery Weber, Federico Zipoli, Michele Dolfi, Peter Staar, Teodoro Laino, Costas Bekas, Akihiro Fujita, Hiroki Toda, Shuichi Hirose, Yasumitsu Orii
Abstract Information extraction and data mining in biochemical literature is a daunting task that demands resource-intensive computation and appropriate means to scale knowledge ingestion. Being able to leverage this immense source of technical information helps to drastically reduce costs and time to solution in multiple application fields from food safety to pharmaceutics. We present a scalable document ingestion system that integrates data from databases and publications (in PDF format) in a biochemistry knowledge graph (BCKG). The BCKG is a comprehensive source of knowledge that can be queried to retrieve known biochemical facts and to generate novel insights. After describing the knowledge ingestion framework, we showcase an application of our system in the field of carbohydrate enzymes. The BCKG represents a way to scale knowledge ingestion and automatically exploit prior knowledge to accelerate discovery in biochemical sciences.
Tasks
Published 2019-07-19
URL https://arxiv.org/abs/1907.08400v1
PDF https://arxiv.org/pdf/1907.08400v1.pdf
PWC https://paperswithcode.com/paper/an-information-extraction-and-knowledge-graph
Repo
Framework
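
The ingest-then-query pattern behind such a knowledge graph can be sketched in a few lines. The following Python example is purely illustrative: the entities, relations, and provenance tags are hypothetical, and the real BCKG ingests PDFs and databases at scale.

```python
# Minimal sketch of fact ingestion into a knowledge graph with provenance,
# followed by a query. networkx stands in for a production graph store.
import networkx as nx

graph = nx.MultiDiGraph()

def ingest_fact(subject, relation, obj, source):
    """Add one extracted (subject, relation, object) triple with provenance."""
    graph.add_edge(subject, obj, relation=relation, source=source)

ingest_fact("beta-amylase", "hydrolyzes", "starch", source="doc:1234")
ingest_fact("beta-amylase", "produces", "maltose", source="doc:1234")

# Query: everything a given enzyme acts on or yields.
for _, obj, data in graph.out_edges("beta-amylase", data=True):
    print(data["relation"], obj, "(from", data["source"] + ")")
```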

Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving

Title Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving
Authors Maria Huegle, Gabriel Kalweit, Moritz Werling, Joschka Boedecker
Abstract The common pipeline in autonomous driving systems is highly modular and includes a perception component which extracts lists of surrounding objects and passes these lists to a high-level decision component. In this case, leveraging the benefits of deep reinforcement learning for high-level decision making requires special architectures to deal with multiple variable-length sequences of different object types, such as vehicles, lanes or traffic signs. At the same time, the architecture has to be able to cover interactions between traffic participants in order to find the optimal action to be taken. In this work, we propose the novel Deep Scenes architecture, which can learn complex, interaction-aware scene representations based on extensions of either 1) Deep Sets or 2) Graph Convolutional Networks. We present the Graph-Q and DeepScene-Q off-policy reinforcement learning algorithms, both outperforming state-of-the-art methods in evaluations with the publicly available traffic simulator SUMO.
Tasks Autonomous Driving, Decision Making, Scene Understanding
Published 2019-09-30
URL https://arxiv.org/abs/1909.13582v1
PDF https://arxiv.org/pdf/1909.13582v1.pdf
PWC https://paperswithcode.com/paper/dynamic-interaction-aware-scene-understanding
Repo
Framework
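
Of the two extensions mentioned, Deep Sets is the easier to sketch: apply a shared network to each object, pool with a permutation-invariant sum, and map the pooled vector to a fixed-size scene representation. The PyTorch sketch below uses illustrative dimensions and is not the paper's exact architecture.

```python
# Hedged sketch of a Deep Sets encoder for variable-length object lists.
import torch
import torch.nn as nn

class DeepSetsEncoder(nn.Module):
    def __init__(self, obj_dim=6, hidden=64, out=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(obj_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, out), nn.ReLU())

    def forward(self, objects):            # objects: (num_objects, obj_dim)
        pooled = self.phi(objects).sum(0)  # permutation-invariant pooling
        return self.rho(pooled)            # fixed-size scene representation

encoder = DeepSetsEncoder()
scene_a = torch.randn(3, 6)   # three surrounding vehicles
scene_b = torch.randn(7, 6)   # seven: variable length is handled naturally
print(encoder(scene_a).shape, encoder(scene_b).shape)  # both (32,)
```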

Uplink-Downlink Tradeoff in Secure Distributed Matrix Multiplication

Title Uplink-Downlink Tradeoff in Secure Distributed Matrix Multiplication
Authors Jaber Kakar, Anton Khristoforov, Seyedhamed Ebadifar, Aydin Sezgin
Abstract In secure distributed matrix multiplication (SDMM) the multiplication $\mathbf{A}\mathbf{B}$ of two private matrices $\mathbf{A}$ and $\mathbf{B}$ is outsourced by a user to $N$ distributed servers. In $\ell$-SDMM, the goal is to design a joint communication-computation procedure that optimally balances conflicting communication and computation metrics without leaking any information about either $\mathbf{A}$ or $\mathbf{B}$ to any set of $\ell\leq N$ servers. To this end, the user applies coding, with $\tilde{\mathbf{A}}_i$ and $\tilde{\mathbf{B}}_i$ representing encoded versions of $\mathbf{A}$ and $\mathbf{B}$ destined for the $i$-th server. SDMM involves multiple tradeoffs; one such tradeoff is between uplink (UL) and downlink (DL) costs. To find a good balance between these two metrics, we propose two schemes, termed USCSA and GSCSA, that are based on secure cross subspace alignment (SCSA). We show that there are various scenarios where they outperform existing SDMM schemes from the literature with respect to UL-DL efficiency. Next, we implement schemes from the literature, including USCSA and GSCSA, and test their performance on Amazon EC2. Our numerical results show that USCSA and GSCSA establish a good balance between the time spent on communication and computation in SDMM. This is because they combine the advantages of polynomial codes, namely a low upload time for $\left(\tilde{\mathbf{A}}_i,\tilde{\mathbf{B}}_i\right)_{i=1}^{N}$ and fast computation of $\mathbf{O}_i=\tilde{\mathbf{A}}_i\tilde{\mathbf{B}}_i$, with those of SCSA, namely a low timing overhead for the download of $\left(\mathbf{O}_i\right)_{i=1}^{N}$ and the decoding of $\mathbf{A}\mathbf{B}$.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.13849v3
PDF https://arxiv.org/pdf/1910.13849v3.pdf
PWC https://paperswithcode.com/paper/uplink-downlink-tradeoff-in-secure
Repo
Framework
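
The coding idea can be demonstrated with a deliberately simplified one-sided scheme: mask the private matrix with a random matrix, ship polynomial evaluations to two servers, and interpolate the products back to the unmasked result. The NumPy sketch below only illustrates this mechanism; the paper's USCSA/GSCSA schemes protect both matrices and optimize the UL-DL balance.

```python
# Toy one-sided SDMM: servers see A + i*R (masked), never A itself.
# Decoding interpolates the degree-1 polynomial f(x) = (A + x*R) @ B at x=0.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))   # private matrix
B = rng.standard_normal((3, 2))   # public in this toy example
R = rng.standard_normal(A.shape)  # random mask

# Uplink: encoded shares; each individually masks A.
share_1 = (A + 1 * R) @ B         # computed at server 1
share_2 = (A + 2 * R) @ B         # computed at server 2

# Downlink + decoding: f(0) = 2*f(1) - f(2) recovers A @ B exactly.
AB = 2 * share_1 - share_2
print(np.allclose(AB, A @ B))     # True
```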

Cleaned Similarity for Better Memory-Based Recommenders

Title Cleaned Similarity for Better Memory-Based Recommenders
Authors Farhan Khawar, Nevin L. Zhang
Abstract Memory-based collaborative filtering methods like user or item k-nearest neighbors (kNN) are a simple yet effective solution to the recommendation problem. The backbone of these methods is the estimation of the empirical similarity between users/items. In this paper, we analyze the spectral properties of the Pearson and the cosine similarity estimators, and we use tools from random matrix theory to argue that they suffer from noise and eigenvalue spreading. We argue that, unlike the Pearson correlation, the cosine similarity naturally possesses the desirable property of eigenvalue shrinkage for large eigenvalues. However, due to its zero-mean assumption, it overestimates the largest eigenvalues. We quantify this overestimation and present a simple re-scaling and noise cleaning scheme. This results in better performance of the memory-based methods compared to their vanilla counterparts.
Tasks
Published 2019-05-17
URL https://arxiv.org/abs/1905.07370v1
PDF https://arxiv.org/pdf/1905.07370v1.pdf
PWC https://paperswithcode.com/paper/cleaned-similarity-for-better-memory-based
Repo
Framework
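
A generic version of such cleaning is easy to sketch: eigenvalues of the similarity matrix that fall inside the Marchenko-Pastur noise bulk are flattened, while larger, informative ones are kept. The sketch below follows this standard random-matrix recipe on synthetic ratings and is not the paper's exact re-scaling scheme.

```python
# RMT-style noise cleaning for an item-item cosine similarity matrix.
import numpy as np

rng = np.random.default_rng(2)
m_users, n_items = 500, 100
ratings = rng.standard_normal((m_users, n_items))  # hypothetical ratings

# Item-item cosine similarity.
norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / np.outer(norms, norms)

# Marchenko-Pastur upper edge for aspect ratio n/m.
lam_max = (1 + np.sqrt(n_items / m_users)) ** 2

vals, vecs = np.linalg.eigh(sim)
noise = vals < lam_max
vals[noise] = vals[noise].mean()       # flatten the noise bulk
cleaned = vecs @ np.diag(vals) @ vecs.T  # use in place of 'sim' in kNN
print(np.sum(~noise), "eigenvalues kept above the noise edge")
```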

Exascale Deep Learning for Scientific Inverse Problems

Title Exascale Deep Learning for Scientific Inverse Problems
Authors Nouamane Laanait, Joshua Romero, Junqi Yin, M. Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson
Abstract We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate our gradient reduction techniques in the context of training a Fully Convolutional Neural Network to approximate the solution of a longstanding scientific inverse problem in materials imaging. The efficient distributed training on a dataset of size 0.5 PB produces a model capable of an atomically-accurate reconstruction of materials, and in the process reaches a peak performance of 2.15(4) EFLOPS$_{16}$.
Tasks Materials Imaging
Published 2019-09-24
URL https://arxiv.org/abs/1909.11150v1
PDF https://arxiv.org/pdf/1909.11150v1.pdf
PWC https://paperswithcode.com/paper/exascale-deep-learning-for-scientific-inverse
Repo
Framework
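
The overlap idea (start reducing each gradient as soon as backpropagation produces it, rather than waiting for the full backward pass) can be outlined with asynchronous all-reduces and gradient hooks. The PyTorch sketch below is schematic, assumes an initialized process group, and mirrors the general technique rather than the paper's orchestration.

```python
# Schematic overlap of gradient reduction with backprop: an async
# all-reduce fires per gradient as it becomes available; all handles are
# drained before the optimizer step. all_reduce sums, so gradients should
# be divided by the world size afterwards to average.
import torch
import torch.distributed as dist

def allreduce_when_ready(params, handles):
    """Attach hooks so each parameter's gradient is reduced as it appears."""
    for p in params:
        def hook(grad):
            handles.append(dist.all_reduce(grad, async_op=True))
            return grad
        p.register_hook(hook)

# Usage outline (inside an initialized process group):
#   handles = []
#   allreduce_when_ready(model.parameters(), handles)
#   loss.backward()                # reductions overlap with backprop
#   for h in handles: h.wait()     # drain before optimizer.step()
```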

Artificial mental phenomena: Psychophysics as a framework to detect perception biases in AI models

Title Artificial mental phenomena: Psychophysics as a framework to detect perception biases in AI models
Authors Lizhen Liang, Daniel E. Acuna
Abstract Detecting biases in artificial intelligence has become difficult because of the impenetrable nature of deep learning. The central difficulty is in relating unobservable phenomena deep inside models with observable, outside quantities that we can measure from inputs and outputs. For example, can we detect gendered perceptions of occupations (e.g., female librarian, male electrician) using questions to and answers from a word embedding-based system? Current techniques for detecting biases are often customized for a task, dataset, or method, affecting their generalization. In this work, we draw from Psychophysics in Experimental Psychology—meant to relate quantities from the real world (i.e., “Physics”) into subjective measures in the mind (i.e., “Psyche”)—to propose an intellectually coherent and generalizable framework to detect biases in AI. Specifically, we adapt the two-alternative forced choice task (2AFC) to estimate potential biases and the strength of those biases in black-box models. We successfully reproduce previously-known biased perceptions in word embeddings and sentiment analysis predictions. We discuss how concepts in experimental psychology can be naturally applied to understanding artificial mental phenomena, and how psychophysics can form a useful methodological foundation to study fairness in AI.
Tasks Sentiment Analysis, Word Embeddings
Published 2019-12-15
URL https://arxiv.org/abs/1912.10818v1
PDF https://arxiv.org/pdf/1912.10818v1.pdf
PWC https://paperswithcode.com/paper/artificial-mental-phenomena-psychophysics-as
Repo
Framework
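
The 2AFC adaptation is straightforward to sketch: on each trial, force the black-box model to choose which of two stimuli ranks higher along some anchor direction, and read the bias off the choice proportions. In the hypothetical Python sketch below, the embeddings and the gender axis are synthetic; the paper applies the procedure to real word embeddings and sentiment models.

```python
# Toy 2AFC probe of a black-box embedding: choice proportions far from 0.5
# indicate a bias, and their distance from 0.5 indicates its strength.
import numpy as np

rng = np.random.default_rng(3)
dim = 50
gender_axis = rng.standard_normal(dim)
gender_axis /= np.linalg.norm(gender_axis)

def embed(word, bias):                     # hypothetical embedding lookup
    return bias * gender_axis + 0.5 * rng.standard_normal(dim)

def two_afc(word_a, bias_a, word_b, bias_b, trials=200):
    """Fraction of trials where word_a is judged closer to the anchor."""
    wins = sum(
        float(embed(word_a, bias_a) @ gender_axis >
              embed(word_b, bias_b) @ gender_axis)
        for _ in range(trials)
    )
    return wins / trials

# A stronger underlying bias yields proportions further from 0.5.
print(two_afc("librarian", 0.6, "electrician", -0.6))
print(two_afc("teacher", 0.1, "clerk", -0.1))
```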

Interpreting Neural Networks Using Flip Points

Title Interpreting Neural Networks Using Flip Points
Authors Roozbeh Yousefzadeh, Dianne P. O’Leary
Abstract Neural networks have been criticized for their lack of easy interpretation, which undermines confidence in their use for important applications. Here, we introduce a novel technique, interpreting a trained neural network by investigating its flip points. A flip point is any point that lies on the boundary between two output classes: e.g. for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for “yes” and “no”. The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. This paper gives an overview of the uses of flip points and how they are computed. Through results on standard datasets, we demonstrate how flip points can be used to provide detailed interpretation of the output produced by a neural network. Moreover, for a given input, flip points enable us to measure confidence in the correctness of outputs much more effectively than the softmax score. They also identify influential features of the inputs, identify bias, and find changes in the input that change the output of the model. We show that the distance between an input and the closest flip point identifies the most influential points in the training data. Using principal component analysis (PCA) and rank-revealing QR factorization (RR-QR), the set of directions from each training input to its closest flip point provides explanations of how a trained neural network processes an entire dataset: what features are most important for classification into a given class, which features are most responsible for particular misclassifications, how an adversary might fool the network, etc. Although we investigate flip points for neural networks, their usefulness is actually model-agnostic.
Tasks
Published 2019-03-21
URL http://arxiv.org/abs/1903.08789v1
PDF http://arxiv.org/pdf/1903.08789v1.pdf
PWC https://paperswithcode.com/paper/interpreting-neural-networks-using-flip
Repo
Framework
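
Finding the closest flip point is, as the abstract notes, a well-posed optimization problem: minimize the distance to the given input subject to the two output scores being equal. A simple penalty-method sketch in PyTorch, with an illustrative toy network, looks as follows.

```python
# Penalty formulation of the closest-flip-point problem for a binary
# classifier: minimize ||x - x0||^2 plus a penalty on the score gap.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))

x0 = torch.randn(4)
x = x0.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    logits = net(x)
    gap = logits[0] - logits[1]            # zero exactly on the boundary
    loss = (x - x0).pow(2).sum() + 10.0 * gap.pow(2)
    loss.backward()
    opt.step()

print("distance to flip point:", (x - x0).norm().item())
print("score gap at solution:", (net(x)[0] - net(x)[1]).item())
```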

Regression-clustering for Improved Accuracy and Training Cost with Molecular-Orbital-Based Machine Learning

Title Regression-clustering for Improved Accuracy and Training Cost with Molecular-Orbital-Based Machine Learning
Authors Lixue Cheng, Nikola B. Kovachki, Matthew Welborn, Thomas F. Miller III
Abstract Machine learning (ML) in the representation of molecular-orbital-based (MOB) features has been shown to be an accurate and transferable approach to the prediction of post-Hartree-Fock correlation energies. Previous applications of MOB-ML employed Gaussian Process Regression (GPR), which provides good prediction accuracy with small training sets; however, the cost of GPR training scales cubically with the amount of data and becomes a computational bottleneck for large training sets. In the current work, we address this problem by introducing a clustering/regression/classification implementation of MOB-ML. In a first step, regression clustering (RC) is used to partition the training data to best fit an ensemble of linear regression (LR) models; in a second step, each cluster is regressed independently, using either LR or GPR; and in a third step, a random forest classifier (RFC) is trained for the prediction of cluster assignments based on MOB feature values. Upon inspection, RC is found to recapitulate chemically intuitive groupings of the frontier molecular orbitals, and the combined RC/LR/RFC and RC/GPR/RFC implementations of MOB-ML are found to provide good prediction accuracy with greatly reduced wall-clock training times. For a dataset of thermalized geometries of 7211 organic molecules of up to seven heavy atoms, both implementations reach chemical accuracy (1 kcal/mol error) with only 300 training molecules, while providing 35000-fold and 4500-fold reductions in the wall-clock training time, respectively, compared to MOB-ML without clustering. The resulting models are also demonstrated to retain transferability for the prediction of large-molecule energies with only small-molecule training data. Finally, it is shown that capping the number of training datapoints per cluster leads to further improvements in prediction accuracy with negligible increases in wall-clock training time.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.02041v4
PDF https://arxiv.org/pdf/1909.02041v4.pdf
PWC https://paperswithcode.com/paper/regression-clustering-for-improved-accuracy
Repo
Framework
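
The three-step pipeline can be sketched compactly: regression clustering alternates between assigning each sample to its best-fitting linear model and refitting those models, after which a random forest classifier learns the cluster assignments from features alone. The scikit-learn sketch below uses synthetic stand-ins for MOB features and correlation energies.

```python
# Clustering/regression/classification sketch: RC with linear models,
# then an RFC that predicts cluster membership from features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
X = rng.standard_normal((400, 5))                 # stand-in for MOB features
regime = rng.integers(0, 2, 400)                  # two hidden linear regimes
y = np.where(regime == 0, X @ np.array([1.0, 0, 0, 0, 0]),
             X @ np.array([0, 2.0, 0, 0, 0]))

k, labels = 2, rng.integers(0, 2, 400)
for _ in range(20):                               # regression clustering
    models = []
    for c in range(k):
        idx = labels == c
        if idx.sum() < 10:                        # guard against empty clusters
            idx = rng.random(400) < 0.5
        models.append(LinearRegression().fit(X[idx], y[idx]))
    resid = np.stack([(y - m.predict(X)) ** 2 for m in models], axis=1)
    labels = resid.argmin(axis=1)                 # reassign to best model

rfc = RandomForestClassifier(random_state=0).fit(X, labels)  # RFC step
print("classifier agreement with final clusters:",
      (rfc.predict(X) == labels).mean())
```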

Searching for Legal Clauses by Analogy. Few-shot Semantic Retrieval Shared Task

Title Searching for Legal Clauses by Analogy. Few-shot Semantic Retrieval Shared Task
Authors Łukasz Borchmann, Dawid Wiśniewski, Andrzej Gretkowski, Izabela Kosmala, Dawid Jurkiewicz, Łukasz Szałkiewicz, Gabriela Pałka, Karol Kaczmarek, Agnieszka Kaliska, Filip Graliński
Abstract We introduce a novel shared task for semantic retrieval from legal texts, in which one is expected to perform so-called contract discovery: extracting specified legal clauses from documents, given a few examples of similar clauses from other legal acts. The task differs substantially from conventional NLI and legal information extraction shared tasks. Its specification is followed by an evaluation of multiple k-NN based solutions within a unified framework proposed for this branch of methods. It is shown that state-of-the-art pre-trained encoders fail to provide satisfactory results on the proposed task, whereas Language Model based solutions perform well, especially when unsupervised fine-tuning is applied. In addition to ablation studies, we address how the accuracy of detecting relevant text fragments depends on the number of available examples. Along with the dataset and reference results, legal-specialized LMs were made publicly available.
Tasks Language Modelling
Published 2019-11-10
URL https://arxiv.org/abs/1911.03911v1
PDF https://arxiv.org/pdf/1911.03911v1.pdf
PWC https://paperswithcode.com/paper/searching-for-legal-clauses-by-analogy-few
Repo
Framework
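
The k-NN branch of methods evaluated in the task reduces to: embed every candidate clause, then rank candidates by similarity to the few provided examples. In the sketch below, TF-IDF stands in for the pre-trained and LM-based encoders the paper actually studies, and the clauses are invented.

```python
# Few-shot clause retrieval sketch: rank candidate clauses by mean
# similarity to the example clauses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

candidates = [
    "The governing law of this agreement shall be the law of Poland.",
    "Either party may terminate this agreement with 30 days notice.",
    "This agreement shall be governed by the laws of Germany.",
]
examples = ["This contract is governed by the laws of France."]

vec = TfidfVectorizer().fit(candidates + examples)
scores = cosine_similarity(vec.transform(examples),
                           vec.transform(candidates)).mean(axis=0)
ranked = sorted(zip(scores, candidates), reverse=True)
print(ranked[0][1])   # a governing-law clause ranks highest
```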

Differentiable Architecture Search with Ensemble Gumbel-Softmax

Title Differentiable Architecture Search with Ensemble Gumbel-Softmax
Authors Jianlong Chang, Xinbang Zhang, Yiwen Guo, Gaofeng Meng, Shiming Xiang, Chunhong Pan
Abstract For network architecture search (NAS), it is crucial but challenging to simultaneously guarantee both effectiveness and efficiency. Towards achieving this goal, we develop a differentiable NAS solution, where the search space includes arbitrary feed-forward networks consisting of a predefined number of connections. Benefiting from a proposed ensemble Gumbel-Softmax estimator, our method optimizes both the architecture of a deep network and its parameters in the same round of backward propagation, yielding an end-to-end mechanism for searching network architectures. Extensive experiments on a variety of popular datasets demonstrate that our method is capable of discovering high-performance architectures while guaranteeing the requisite efficiency during searching.
Tasks Neural Architecture Search
Published 2019-05-06
URL https://arxiv.org/abs/1905.01786v1
PDF https://arxiv.org/pdf/1905.01786v1.pdf
PWC https://paperswithcode.com/paper/differentiable-architecture-search-with
Repo
Framework
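
The core mechanism is a differentiable choice among candidate operations: architecture logits are pushed through a Gumbel-Softmax so that both the logits and the operation weights receive gradients in one backward pass. In the PyTorch sketch below, averaging a few Gumbel-Softmax samples loosely stands in for the paper's ensemble estimator; the ops and dimensions are illustrative.

```python
# Differentiable op selection with Gumbel-Softmax over candidate ops.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels, n_samples=4):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.AvgPool2d(3, stride=1, padding=1),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # arch logits
        self.n_samples = n_samples

    def forward(self, x):
        outs = torch.stack([op(x) for op in self.ops])
        mixed = 0
        for _ in range(self.n_samples):   # ensemble of Gumbel-Softmax draws
            w = F.gumbel_softmax(self.alpha, tau=1.0, hard=False)
            mixed = mixed + torch.einsum("o,o...->...", w, outs)
        return mixed / self.n_samples

op = MixedOp(8)
y = op(torch.randn(2, 8, 16, 16))
print(y.shape)  # torch.Size([2, 8, 16, 16])
```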

AP19-OLR Challenge: Three Tasks and Their Baselines

Title AP19-OLR Challenge: Three Tasks and Their Baselines
Authors Zhiyuan Tang, Dong Wang, Liming Song
Abstract This paper introduces the fourth oriental language recognition (OLR) challenge AP19-OLR, including the data profile, the tasks and the evaluation principles. The OLR challenge has been held successfully for three consecutive years, along with the APSIPA Annual Summit and Conference (APSIPA ASC). The challenge this year still focuses on practical and challenging tasks, namely (1) short-utterance LID, (2) cross-channel LID and (3) zero-resource LID. The event this year includes more languages and more real-life data provided by SpeechOcean and the NSFC M2ASR project. All the data is free for participants. Recipes for an x-vector system and back-end evaluation are also provided as baselines for the three tasks. Participants can refer to these online-published recipes to deploy LID systems for convenience. We report the baseline results on the three tasks and demonstrate that all three remain challenging and worth further effort to achieve better performance.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.07626v3
PDF https://arxiv.org/pdf/1907.07626v3.pdf
PWC https://paperswithcode.com/paper/ap19-olr-challenge-three-tasks-and-their
Repo
Framework

High-Performance Support Vector Machines and Its Applications

Title High-Performance Support Vector Machines and Its Applications
Authors Taiping He, Tao Wang, Ralph Abbey, Joshua Griffin
Abstract The support vector machines (SVM) algorithm is a popular classification technique in data mining and machine learning. In this paper, we propose a distributed SVM algorithm and demonstrate its use in a number of applications. The algorithm is named high-performance support vector machines (HPSVM). The major contribution of HPSVM is two-fold. First, HPSVM provides a new way to distribute computations to the machines in the cloud without shuffling the data. Second, HPSVM minimizes the inter-machine communications in order to maximize the performance. We apply HPSVM to some real-world classification problems and compare it with the state-of-the-art SVM technique implemented in R on several public data sets. HPSVM achieves similar or better results.
Tasks
Published 2019-05-01
URL http://arxiv.org/abs/1905.00331v1
PDF http://arxiv.org/pdf/1905.00331v1.pdf
PWC https://paperswithcode.com/paper/high-performance-support-vector-machines-and
Repo
Framework
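
The two contributions translate into a simple communication pattern: each machine keeps its shard and computes a local subgradient, and only these small vectors cross the network to be averaged. The NumPy sketch below simulates that pattern for a linear SVM with hinge loss; it illustrates the distribution idea, not HPSVM's actual solver.

```python
# "Compute where the data lives": local hinge-loss subgradients are
# aggregated; the raw data shards are never shuffled between machines.
import numpy as np

rng = np.random.default_rng(5)
shards = [(rng.standard_normal((200, 10)),
           rng.choice([-1.0, 1.0], 200)) for _ in range(4)]  # 4 "machines"

def local_subgradient(w, X, y, C=1.0):
    margin = y * (X @ w)
    active = margin < 1                       # margin-violating samples only
    return w - C * (y[active, None] * X[active]).sum(axis=0) / len(y)

w = np.zeros(10)
for step in range(100):
    grads = [local_subgradient(w, X, y) for X, y in shards]  # in parallel
    w -= 0.1 * np.mean(grads, axis=0)         # tiny inter-machine traffic
print("model norm after training:", np.linalg.norm(w))
```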