Paper Group ANR 807
Large-Scale Statistical Survey of Magnetopause Reconnection
Title | Large-Scale Statistical Survey of Magnetopause Reconnection |
Authors | Samantha Piatt |
Abstract | The Magnetospheric Multiscale Mission (MMS) seeks to study the micro-physics of reconnection, which occurs at the magnetopause boundary layer between the magnetosphere of Earth and the interplanetary magnetic field originating from the sun. Identifying this region of space automatically will allow for statistical analysis of reconnection events. The magnetopause region is difficult to identify automatically using simple models, and time-consuming for scientists to classify by hand. We introduce a hierarchical Bayesian mixture model with linear and autoregressive components to identify the magnetopause. Using data from the MMS mission with the programming languages R and Stan, we modeled and predicted possible regions and evaluated our performance against a boosted regression tree model. Our model selects twice as many magnetopause regions as the comparison model, without significant over-selection, achieving a 31% true positive rate and 93% true negative rate. Our method will allow scientists to study the micro-physics of reconnection events in the magnetopause using the large body of MMS data without manual classification. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.11359v1 |
https://arxiv.org/pdf/1905.11359v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-statistical-survey-of |
Repo | |
Framework | |
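The model itself is fit in R and Stan; as a rough, hypothetical illustration of the underlying idea, the Python sketch below scores sliding windows of a signal by whether an AR(1) fit explains them better than a linear trend, flagging AR-like windows as candidate boundary-layer intervals. It is not the authors' hierarchical Bayesian model, and the window width and threshold are invented.

```python
import numpy as np

def ar1_rss(x):
    """Residual sum of squares of an AR(1) fit x[t] ~ a + b*x[t-1]."""
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    coef = np.linalg.lstsq(X, x[1:], rcond=None)[0]
    resid = x[1:] - X @ coef
    return float(resid @ resid)

def linear_rss(x):
    """Residual sum of squares of a linear-trend fit x[t] ~ a + b*t."""
    t = np.arange(len(x))
    X = np.column_stack([np.ones(len(x)), t])
    coef = np.linalg.lstsq(X, x, rcond=None)[0]
    resid = x - X @ coef
    return float(resid @ resid)

def flag_windows(signal, width=64, ratio=0.5):
    """Flag windows where an AR(1) model clearly beats a linear trend."""
    flags = []
    for start in range(0, len(signal) - width + 1, width):
        w = signal[start:start + width]
        flags.append(ar1_rss(w) < ratio * linear_rss(w))
    return np.array(flags)

rng = np.random.default_rng(0)
quiet = np.linspace(0, 1, 256) + 0.01 * rng.standard_normal(256)  # trend-like
noisy = 0.1 * np.cumsum(rng.standard_normal(256))                 # AR-like
print(flag_windows(np.concatenate([quiet, noisy])))
```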
Optimally Compressed Nonparametric Online Learning
Title | Optimally Compressed Nonparametric Online Learning |
Authors | Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler |
Abstract | Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of interest due to their universality and ability to stably incorporate new information via convexity or Bayes’ Rule. Unfortunately, when used online, nonparametric methods suffer a “curse of dimensionality” which precludes their use: their complexity scales at least with the time index. We survey online compression tools which bring their memory under control and attain approximate convergence. The asymptotic bias depends on a compression parameter that trades off memory and accuracy. Further, the applications to robotics, communications, economics, and power are discussed, as well as extensions to multi-agent systems. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11555v2 |
https://arxiv.org/pdf/1909.11555v2.pdf | |
PWC | https://paperswithcode.com/paper/optimally-compressed-nonparametric-online |
Repo | |
Framework | |
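A minimal sketch of the compression idea surveyed above, assuming squared loss and a Gaussian kernel: functional stochastic gradient descent adds one dictionary atom per sample, and a crude budget rule (drop the smallest-magnitude weight) stands in for the matching-pursuit-style compression that controls memory in the paper.

```python
import numpy as np

def gauss_kernel(X, x, gamma=1.0):
    # X: (m, d) dictionary, x: (d,) query point
    return np.exp(-gamma * np.sum((X - x) ** 2, axis=1))

class BudgetedKernelSGD:
    """Online kernel regression f(x) = sum_i w_i k(d_i, x), memory-capped."""
    def __init__(self, budget=50, step=0.1, gamma=1.0):
        self.budget, self.step, self.gamma = budget, step, gamma
        self.dict = np.empty((0, 0))
        self.w = np.empty(0)

    def predict(self, x):
        if self.w.size == 0:
            return 0.0
        return float(self.w @ gauss_kernel(self.dict, x, self.gamma))

    def update(self, x, y):
        err = self.predict(x) - y            # squared-loss gradient factor
        if self.dict.size == 0:
            self.dict = x[None, :]
            self.w = np.array([-self.step * err])
        else:
            self.dict = np.vstack([self.dict, x])
            self.w = np.append(self.w, -self.step * err)
        if len(self.w) > self.budget:        # compress: drop the weakest atom
            k = np.argmin(np.abs(self.w))
            self.dict = np.delete(self.dict, k, axis=0)
            self.w = np.delete(self.w, k)

rng = np.random.default_rng(1)
model = BudgetedKernelSGD(budget=30)
for _ in range(500):
    x = rng.uniform(-3, 3, size=1)
    model.update(x, np.sin(x[0]) + 0.1 * rng.standard_normal())
print(model.predict(np.array([1.0])), np.sin(1.0))
```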
Deep learning languages: a key fundamental shift from probabilities to weights?
Title | Deep learning languages: a key fundamental shift from probabilities to weights? |
Authors | François Coste |
Abstract | Recent successes in language modeling, notably with deep learning methods, coincide with a shift from probabilistic to weighted representations. We raise here the question of the importance of this evolution, in the light of the practical limitations of a classical and simple probabilistic modeling approach for the classification of protein sequences and in relation to the need for principled methods to learn non-probabilistic models. |
Tasks | Language Modelling |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.00785v1 |
https://arxiv.org/pdf/1908.00785v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-languages-a-key-fundamental |
Repo | |
Framework | |
An Information Extraction and Knowledge Graph Platform for Accelerating Biochemical Discoveries
Title | An Information Extraction and Knowledge Graph Platform for Accelerating Biochemical Discoveries |
Authors | Matteo Manica, Christoph Auer, Valery Weber, Federico Zipoli, Michele Dolfi, Peter Staar, Teodoro Laino, Costas Bekas, Akihiro Fujita, Hiroki Toda, Shuichi Hirose, Yasumitsu Orii |
Abstract | Information extraction and data mining in biochemical literature is a daunting task that demands resource-intensive computation and appropriate means to scale knowledge ingestion. Being able to leverage this immense source of technical information helps to drastically reduce costs and time to solution in multiple application fields from food safety to pharmaceutics. We present a scalable document ingestion system that integrates data from databases and publications (in PDF format) in a biochemistry knowledge graph (BCKG). The BCKG is a comprehensive source of knowledge that can be queried to retrieve known biochemical facts and to generate novel insights. After describing the knowledge ingestion framework, we showcase an application of our system in the field of carbohydrate enzymes. The BCKG represents a way to scale knowledge ingestion and automatically exploit prior knowledge to accelerate discovery in biochemical sciences. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08400v1 |
https://arxiv.org/pdf/1907.08400v1.pdf | |
PWC | https://paperswithcode.com/paper/an-information-extraction-and-knowledge-graph |
Repo | |
Framework | |
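Purely to illustrate the kind of fact retrieval a knowledge graph enables, here is a toy triple store with a pattern-matching query helper; the facts and schema are invented and bear no relation to the actual BCKG.

```python
# Hypothetical (subject, predicate, object) facts; not the real BCKG schema.
triples = [
    ("amylase", "catalyzes", "starch_hydrolysis"),
    ("amylase", "is_a", "glycoside_hydrolase"),
    ("cellulase", "catalyzes", "cellulose_hydrolysis"),
    ("cellulase", "is_a", "glycoside_hydrolase"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the given (possibly partial) pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# All known glycoside hydrolases, then everything known about amylase.
print(query(predicate="is_a", obj="glycoside_hydrolase"))
print(query(subject="amylase"))
```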
Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving
Title | Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving |
Authors | Maria Huegle, Gabriel Kalweit, Moritz Werling, Joschka Boedecker |
Abstract | The common pipeline in autonomous driving systems is highly modular and includes a perception component which extracts lists of surrounding objects and passes these lists to a high-level decision component. In this case, leveraging the benefits of deep reinforcement learning for high-level decision making requires special architectures to deal with multiple variable-length sequences of different object types, such as vehicles, lanes or traffic signs. At the same time, the architecture has to be able to cover interactions between traffic participants in order to find the optimal action to be taken. In this work, we propose the novel Deep Scenes architecture, which can learn complex interaction-aware scene representations based on extensions of either 1) Deep Sets or 2) Graph Convolutional Networks. We present the Graph-Q and DeepScene-Q off-policy reinforcement learning algorithms, both outperforming state-of-the-art methods in evaluations with the publicly available traffic simulator SUMO. |
Tasks | Autonomous Driving, Decision Making, Scene Understanding |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13582v1 |
https://arxiv.org/pdf/1909.13582v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-interaction-aware-scene-understanding |
Repo | |
Framework | |
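Of the two extensions mentioned, Deep Sets is the easier to sketch. The snippet below shows a permutation-invariant set encoder that maps a variable-length list of surrounding objects to a fixed-size state vector suitable for a Q-network; the dimensions and object features are hypothetical, and the paper's Graph-Q/DeepScene-Q training loop is omitted.

```python
import torch
import torch.nn as nn

class DeepSetEncoder(nn.Module):
    """Permutation-invariant encoder: rho(sum_i phi(x_i))."""
    def __init__(self, obj_dim, hidden=64, out_dim=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(obj_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hidden, out_dim), nn.ReLU())

    def forward(self, objects):             # objects: (n_objects, obj_dim)
        return self.rho(self.phi(objects).sum(dim=0))

# A variable number of surrounding vehicles, each with (dx, dy, speed, heading).
encoder = DeepSetEncoder(obj_dim=4)
scene_small = torch.randn(3, 4)             # 3 vehicles
scene_large = torch.randn(11, 4)            # 11 vehicles, same encoder
state = torch.cat([encoder(scene_small), encoder(scene_large)])
print(state.shape)                          # fixed-size input for a Q-network
```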
Uplink-Downlink Tradeoff in Secure Distributed Matrix Multiplication
Title | Uplink-Downlink Tradeoff in Secure Distributed Matrix Multiplication |
Authors | Jaber Kakar, Anton Khristoforov, Seyedhamed Ebadifar, Aydin Sezgin |
Abstract | In secure distributed matrix multiplication (SDMM) the multiplication $\mathbf{A}\mathbf{B}$ of two private matrices $\mathbf{A}$ and $\mathbf{B}$ is outsourced by a user to $N$ distributed servers. In $\ell$-SDMM, the goal is to design a joint communication-computation procedure that optimally balances conflicting communication and computation metrics without leaking any information about $\mathbf{A}$ or $\mathbf{B}$ to any set of $\ell\leq N$ servers. To this end, the user applies coding, with $\tilde{\mathbf{A}}_i$ and $\tilde{\mathbf{B}}_i$ representing encoded versions of $\mathbf{A}$ and $\mathbf{B}$ destined for the $i$-th server. SDMM involves multiple tradeoffs; one of these is the tradeoff between uplink (UL) and downlink (DL) costs. To find a good balance between these two metrics, we propose two schemes, termed USCSA and GSCSA, that are based on secure cross subspace alignment (SCSA). We show that there are various scenarios where they outperform existing SDMM schemes from the literature with respect to UL-DL efficiency. Next, we implement schemes from the literature, including USCSA and GSCSA, and test their performance on Amazon EC2. Our numerical results show that USCSA and GSCSA strike a good balance between the time spent on communication and computation in SDMM. This is because they combine the advantages of polynomial codes, namely a low upload time for $\left(\tilde{\mathbf{A}}_i,\tilde{\mathbf{B}}_i\right)_{i=1}^{N}$ and fast computation of $\mathbf{O}_i=\tilde{\mathbf{A}}_i\tilde{\mathbf{B}}_i$, with those of SCSA, namely a low timing overhead for the download of $\left(\mathbf{O}_i\right)_{i=1}^{N}$ and the decoding of $\mathbf{A}\mathbf{B}$. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13849v3 |
https://arxiv.org/pdf/1910.13849v3.pdf | |
PWC | https://paperswithcode.com/paper/uplink-downlink-tradeoff-in-secure |
Repo | |
Framework | |
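A toy instance of the polynomial-coding idea underlying such schemes, assuming $\ell=1$ and $N=3$: each matrix is masked with a random multiple of a uniform pad over a prime field, each server multiplies its two shares, and the user recovers $\mathbf{A}\mathbf{B}$ by Lagrange interpolation at zero. This is a generic textbook construction, not USCSA or GSCSA.

```python
import numpy as np

p = 1_000_003                          # prime modulus (small enough for int64)
rng = np.random.default_rng(2)

def lagrange_at_zero(xs, i):
    """Lagrange coefficient for evaluating the interpolant at x = 0."""
    num, den = 1, 1
    for j, xj in enumerate(xs):
        if j != i:
            num = num * xj % p
            den = den * (xj - xs[i]) % p
    return num * pow(den, -1, p) % p

A = rng.integers(0, 100, (2, 3))       # private matrices (entries mod p)
B = rng.integers(0, 100, (3, 2))
R = rng.integers(0, p, A.shape)        # uniform masks act as one-time pads
S = rng.integers(0, p, B.shape)

xs = [1, 2, 3]                         # evaluation points, one per server
# Server i receives A + x_i*R and B + x_i*S and returns their product;
# that product is a degree-2 polynomial in x with constant term A @ B.
outputs = [((A + x * R) % p) @ ((B + x * S) % p) % p for x in xs]

AB = sum(lagrange_at_zero(xs, i) * outputs[i] for i in range(3)) % p
assert np.array_equal(AB, A @ B % p)
print(AB)
```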
Cleaned Similarity for Better Memory-Based Recommenders
Title | Cleaned Similarity for Better Memory-Based Recommenders |
Authors | Farhan Khawar, Nevin L. Zhang |
Abstract | Memory-based collaborative filtering methods like user or item k-nearest neighbors (kNN) are a simple yet effective solution to the recommendation problem. The backbone of these methods is the estimation of the empirical similarity between users/items. In this paper, we analyze the spectral properties of the Pearson and the cosine similarity estimators, and we use tools from random matrix theory to argue that they suffer from noise and eigenvalue spreading. We argue that, unlike the Pearson correlation, the cosine similarity naturally possesses the desirable property of eigenvalue shrinkage for large eigenvalues. However, due to its zero-mean assumption, it overestimates the largest eigenvalues. We quantify this overestimation and present a simple re-scaling and noise cleaning scheme. This results in better performance of the memory-based methods compared to their vanilla counterparts. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07370v1 |
https://arxiv.org/pdf/1905.07370v1.pdf | |
PWC | https://paperswithcode.com/paper/cleaned-similarity-for-better-memory-based |
Repo | |
Framework | |
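A sketch of a generic random-matrix cleaning recipe in the spirit of the paper (not its exact re-scaling scheme): eigenvalues of the cosine similarity matrix below the Marchenko-Pastur edge are treated as noise and flattened to their mean, while larger ones are kept as signal.

```python
import numpy as np

def clean_similarity(R):
    """RMT-style cleaning of an item-item cosine similarity matrix.

    Eigenvalues under the Marchenko-Pastur upper edge are treated as
    noise and flattened to their mean (keeping the trace fixed);
    eigenvalues above it are kept as signal.
    """
    n_users, n_items = R.shape
    X = R / (np.linalg.norm(R, axis=0, keepdims=True) + 1e-12)
    S = X.T @ X                                   # cosine similarity
    lam_plus = (1 + np.sqrt(n_items / n_users)) ** 2
    vals, vecs = np.linalg.eigh(S)
    noise = vals < lam_plus
    vals = vals.copy()
    vals[noise] = vals[noise].mean()              # flatten the noise bulk
    return (vecs * vals) @ vecs.T                 # V diag(lambda) V^T

rng = np.random.default_rng(3)
R = rng.poisson(1.0, size=(500, 100)).astype(float)   # toy rating matrix
S_clean = clean_similarity(R)
print(np.round(S_clean[:3, :3], 3))
```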
Exascale Deep Learning for Scientific Inverse Problems
Title | Exascale Deep Learning for Scientific Inverse Problems |
Authors | Nouamane Laanait, Joshua Romero, Junqi Yin, M. Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson |
Abstract | We introduce novel communication strategies for synchronous distributed Deep Learning, consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate our gradient reduction techniques in the context of training a Fully Convolutional Neural Network to approximate the solution of a longstanding scientific inverse problem in materials imaging. Efficient distributed training on a 0.5 PB dataset produces a model capable of atomically accurate reconstruction of materials, in the process reaching a peak performance of 2.15(4) EFLOPS$_{16}$. |
Tasks | Materials Imaging |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11150v1 |
https://arxiv.org/pdf/1909.11150v1.pdf | |
PWC | https://paperswithcode.com/paper/exascale-deep-learning-for-scientific-inverse |
Repo | |
Framework | |
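The graph-aware grouping of gradient tensors, reduced to its simplest form: pack gradients into fixed-size buckets so that one fused collective reduction replaces many small ones. The bucket size and the stand-in reduction function below are assumptions; the paper's decentralized orchestration is far more involved.

```python
import numpy as np

def bucket_gradients(grads, bucket_bytes=4 * 1024 * 1024):
    """Greedily pack gradient arrays into buckets of ~bucket_bytes each,
    so one fused reduction replaces many small ones."""
    buckets, current, size = [], [], 0
    for g in grads:
        if current and size + g.nbytes > bucket_bytes:
            buckets.append(current)
            current, size = [], 0
        current.append(g)
        size += g.nbytes
    if current:
        buckets.append(current)
    return buckets

def fused_allreduce(bucket, reduce_fn):
    """Flatten a bucket, reduce once, and scatter results back."""
    flat = np.concatenate([g.ravel() for g in bucket])
    flat = reduce_fn(flat)                        # stand-in for a collective op
    out, offset = [], 0
    for g in bucket:
        out.append(flat[offset:offset + g.size].reshape(g.shape))
        offset += g.size
    return out

grads = [np.ones((512, 512), dtype=np.float32) for _ in range(8)]   # 1 MiB each
buckets = bucket_gradients(grads, bucket_bytes=2 * 1024 * 1024)
reduced = [fused_allreduce(b, lambda x: x * 0.5) for b in buckets]
print([len(b) for b in buckets])                  # [2, 2, 2, 2]
```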
Artificial mental phenomena: Psychophysics as a framework to detect perception biases in AI models
Title | Artificial mental phenomena: Psychophysics as a framework to detect perception biases in AI models |
Authors | Lizhen Liang, Daniel E. Acuna |
Abstract | Detecting biases in artificial intelligence has become difficult because of the impenetrable nature of deep learning. The central difficulty is in relating unobservable phenomena deep inside models with observable, outside quantities that we can measure from inputs and outputs. For example, can we detect gendered perceptions of occupations (e.g., female librarian, male electrician) using questions to and answers from a word embedding-based system? Current techniques for detecting biases are often customized for a task, dataset, or method, affecting their generalization. In this work, we draw from Psychophysics in Experimental Psychology—meant to relate quantities from the real world (i.e., “Physics”) into subjective measures in the mind (i.e., “Psyche”)—to propose an intellectually coherent and generalizable framework to detect biases in AI. Specifically, we adapt the two-alternative forced choice task (2AFC) to estimate potential biases and the strength of those biases in black-box models. We successfully reproduce previously-known biased perceptions in word embeddings and sentiment analysis predictions. We discuss how concepts in experimental psychology can be naturally applied to understanding artificial mental phenomena, and how psychophysics can form a useful methodological foundation to study fairness in AI. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.10818v1 |
https://arxiv.org/pdf/1912.10818v1.pdf | |
PWC | https://paperswithcode.com/paper/artificial-mental-phenomena-psychophysics-as |
Repo | |
Framework | |
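A minimal sketch of the 2AFC probe under invented data: a toy embedding space with a planted gender bias, a black-box similarity score, and repeated noisy forced choices whose decay toward chance reads off the strength of the bias.

```python
import numpy as np

rng = np.random.default_rng(4)

def two_afc(score_a, score_b, noise=0.1, trials=1000):
    """2AFC probe: fraction of noisy trials in which the black-box
    score for option A beats the score for option B."""
    a = score_a + noise * rng.standard_normal(trials)
    b = score_b + noise * rng.standard_normal(trials)
    return np.mean(a > b)

# Hypothetical black box: cosine similarity in a toy embedding space.
def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

dim = 50
she, he = rng.standard_normal(dim), rng.standard_normal(dim)
occupation = 0.7 * she + 0.3 * rng.standard_normal(dim)   # planted bias

# Read off bias strength as in psychophysics: how much internal noise is
# needed before the model's preference decays toward chance (0.5).
for noise in (0.1, 0.5, 2.0):
    p = two_afc(cosine(occupation, she), cosine(occupation, he), noise)
    print(f"noise={noise}: P(choose 'she') = {p:.2f}")
```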
Interpreting Neural Networks Using Flip Points
Title | Interpreting Neural Networks Using Flip Points |
Authors | Roozbeh Yousefzadeh, Dianne P. O’Leary |
Abstract | Neural networks have been criticized for their lack of easy interpretation, which undermines confidence in their use for important applications. Here, we introduce a novel technique, interpreting a trained neural network by investigating its flip points. A flip point is any point that lies on the boundary between two output classes: e.g., for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for “yes” and “no”. The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. This paper gives an overview of the uses of flip points and how they are computed. Through results on standard datasets, we demonstrate how flip points can be used to provide detailed interpretation of the output produced by a neural network. Moreover, for a given input, flip points enable us to measure confidence in the correctness of outputs much more effectively than the softmax score. They also identify influential features of the inputs, identify bias, and find changes in the input that change the output of the model. We show that distance between an input and the closest flip point identifies the most influential points in the training data. Using principal component analysis (PCA) and rank-revealing QR factorization (RR-QR), the set of directions from each training input to its closest flip point provides explanations of how a trained neural network processes an entire dataset: what features are most important for classification into a given class, which features are most responsible for particular misclassifications, how an adversary might fool the network, etc. Although we investigate flip points for neural networks, their usefulness is actually model-agnostic. |
Tasks | |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.08789v1 |
http://arxiv.org/pdf/1903.08789v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-neural-networks-using-flip |
Repo | |
Framework | |
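For a deep network the closest flip point is found by solving an optimization problem; for a linear classifier it has a closed form, which the sketch below uses to make the idea concrete (boundary distance as a confidence proxy, and the per-feature change that flips the label).

```python
import numpy as np

def closest_flip_point(w, b, x):
    """For a linear classifier sign(w @ x + b), the nearest point with
    equal scores for both classes is the projection onto w @ x + b = 0."""
    return x - (w @ x + b) / (w @ w) * w

w = np.array([2.0, -1.0])
b = -0.5
x = np.array([3.0, 1.0])

x_flip = closest_flip_point(w, b, x)
print(x_flip, w @ x_flip + b)            # score ~0: on the class boundary
print(np.linalg.norm(x - x_flip))        # confidence proxy: boundary distance
print(x_flip - x)                        # per-feature change flipping the label
```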
Regression-clustering for Improved Accuracy and Training Cost with Molecular-Orbital-Based Machine Learning
Title | Regression-clustering for Improved Accuracy and Training Cost with Molecular-Orbital-Based Machine Learning |
Authors | Lixue Cheng, Nikola B. Kovachki, Matthew Welborn, Thomas F. Miller III |
Abstract | Machine learning (ML) in the representation of molecular-orbital-based (MOB) features has been shown to be an accurate and transferable approach to the prediction of post-Hartree-Fock correlation energies. Previous applications of MOB-ML employed Gaussian Process Regression (GPR), which provides good prediction accuracy with small training sets; however, the cost of GPR training scales cubically with the amount of data and becomes a computational bottleneck for large training sets. In the current work, we address this problem by introducing a clustering/regression/classification implementation of MOB-ML. In a first step, regression clustering (RC) is used to partition the training data to best fit an ensemble of linear regression (LR) models; in a second step, each cluster is regressed independently, using either LR or GPR; and in a third step, a random forest classifier (RFC) is trained for the prediction of cluster assignments based on MOB feature values. Upon inspection, RC is found to recapitulate chemically intuitive groupings of the frontier molecular orbitals, and the combined RC/LR/RFC and RC/GPR/RFC implementations of MOB-ML are found to provide good prediction accuracy with greatly reduced wall-clock training times. For a dataset of thermalized geometries of 7211 organic molecules of up to seven heavy atoms, both implementations reach chemical accuracy (1 kcal/mol error) with only 300 training molecules, while providing 35000-fold and 4500-fold reductions in the wall-clock training time, respectively, compared to MOB-ML without clustering. The resulting models are also demonstrated to retain transferability for the prediction of large-molecule energies with only small-molecule training data. Finally, it is shown that capping the number of training datapoints per cluster leads to further improvements in prediction accuracy with negligible increases in wall-clock training time. |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.02041v4 |
https://arxiv.org/pdf/1909.02041v4.pdf | |
PWC | https://paperswithcode.com/paper/regression-clustering-for-improved-accuracy |
Repo | |
Framework | |
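The first step of the pipeline, regression clustering, reduces to a k-means-like alternation and is easy to sketch; the per-cluster GPR models and the random forest classifier for cluster assignment are omitted, and the synthetic two-regime data below is invented.

```python
import numpy as np

def regression_clustering(X, y, k=2, iters=20, seed=0):
    """Alternate between assigning each point to the linear model that
    predicts it best and refitting each model on its assigned points."""
    rng = np.random.default_rng(seed)
    Xb = np.column_stack([X, np.ones(len(X))])      # append bias column
    labels = rng.integers(0, k, len(X))
    for _ in range(iters):
        coefs = []
        for c in range(k):
            mask = labels == c
            if mask.sum() < Xb.shape[1]:            # guard degenerate clusters
                mask = rng.random(len(X)) < 0.5
            coefs.append(np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0])
        errors = np.stack([(y - Xb @ w) ** 2 for w in coefs])
        labels = errors.argmin(axis=0)
    return labels, coefs

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, (200, 1))
piece = X[:, 0] > 0                                 # two hidden linear regimes
y = np.where(piece, 3 * X[:, 0] + 1, -2 * X[:, 0]) + 0.05 * rng.standard_normal(200)
labels, coefs = regression_clustering(X, y, k=2)
print([np.round(w, 2) for w in coefs])              # ~[3, 1] and ~[-2, 0]
```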
Searching for Legal Clauses by Analogy. Few-shot Semantic Retrieval Shared Task
Title | Searching for Legal Clauses by Analogy. Few-shot Semantic Retrieval Shared Task |
Authors | Łukasz Borchmann, Dawid Wiśniewski, Andrzej Gretkowski, Izabela Kosmala, Dawid Jurkiewicz, Łukasz Szałkiewicz, Gabriela Pałka, Karol Kaczmarek, Agnieszka Kaliska, Filip Graliński |
Abstract | We introduce a novel shared task for semantic retrieval from legal texts, in which one is expected to perform so-called contract discovery: extract specified legal clauses from documents, given a few examples of similar clauses from other legal acts. The task differs substantially from conventional NLI and legal information extraction shared tasks. Its specification is followed by an evaluation of multiple k-NN based solutions within a unified framework proposed for this branch of methods. It is shown that state-of-the-art pre-trained encoders fail to provide satisfactory results on the proposed task, whereas Language Model based solutions perform well, especially when unsupervised fine-tuning is applied. In addition to ablation studies, we address how the accuracy of detecting relevant text fragments depends on the number of examples available. Along with the dataset and reference results, legal-specialized LMs were made publicly available. |
Tasks | Language Modelling |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03911v1 |
https://arxiv.org/pdf/1911.03911v1.pdf | |
PWC | https://paperswithcode.com/paper/searching-for-legal-clauses-by-analogy-few |
Repo | |
Framework | |
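A skeleton of the k-NN branch of methods evaluated above, with a throwaway hashing encoder standing in for the pre-trained (legal-specialized) language models; the clauses and scoring rule are invented.

```python
import numpy as np

def embed(text, dim=64):
    """Stand-in encoder: hash tokens into a bag-of-words vector.
    A real system would use a (legal-domain) language model here."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-12)

def retrieve(query_clauses, candidates, top_k=2):
    """k-NN contract discovery: score candidate fragments by their mean
    cosine similarity to the few example clauses, return the best ones."""
    Q = np.stack([embed(c) for c in query_clauses])
    C = np.stack([embed(c) for c in candidates])
    scores = (C @ Q.T).mean(axis=1)
    return [candidates[i] for i in np.argsort(scores)[::-1][:top_k]]

examples = ["this agreement shall be governed by the laws of poland",
            "this contract is governed by the laws of england and wales"]
document = ["the seller shall deliver the goods within thirty days",
            "this deed shall be governed by the laws of scotland",
            "either party may terminate upon sixty days notice"]
print(retrieve(examples, document, top_k=1))
```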
Differentiable Architecture Search with Ensemble Gumbel-Softmax
Title | Differentiable Architecture Search with Ensemble Gumbel-Softmax |
Authors | Jianlong Chang, Xinbang Zhang, Yiwen Guo, Gaofeng Meng, Shiming Xiang, Chunhong Pan |
Abstract | For network architecture search (NAS), it is crucial but challenging to simultaneously guarantee both effectiveness and efficiency. Towards this goal, we develop a differentiable NAS solution in which the search space includes arbitrary feed-forward networks consisting of a predefined number of connections. Benefiting from a proposed ensemble Gumbel-Softmax estimator, our method optimizes both the architecture of a deep network and its parameters in the same round of backward propagation, yielding an end-to-end mechanism for searching network architectures. Extensive experiments on a variety of popular datasets strongly demonstrate that our method is capable of discovering high-performance architectures, while guaranteeing the requisite efficiency during the search. |
Tasks | Neural Architecture Search |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.01786v1 |
https://arxiv.org/pdf/1905.01786v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-architecture-search-with |
Repo | |
Framework | |
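A plain-numpy sketch of the estimator's building block: standard Gumbel-Softmax sampling, plus an ensemble step that aggregates several samples. The element-wise max aggregation used here is an assumption for illustration, not necessarily the paper's exact rule.

```python
import numpy as np

rng = np.random.default_rng(6)

def gumbel_softmax(logits, tau=1.0):
    """Relaxed sample from a categorical distribution over candidates."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel noise
    y = (logits + g) / tau
    e = np.exp(y - y.max())                               # stable softmax
    return e / e.sum()

def ensemble_gumbel_softmax(logits, n_samples=3, tau=1.0):
    """Aggregate several Gumbel-Softmax samples with an element-wise max,
    allowing several candidate connections to stay active at once.
    (The aggregation rule here is an assumption for illustration.)"""
    samples = np.stack([gumbel_softmax(logits, tau) for _ in range(n_samples)])
    return samples.max(axis=0)

logits = np.array([2.0, 0.5, 0.1, -1.0])   # scores for 4 candidate operations
print(np.round(gumbel_softmax(logits, tau=0.5), 3))
print(np.round(ensemble_gumbel_softmax(logits, tau=0.5), 3))
```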
AP19-OLR Challenge: Three Tasks and Their Baselines
Title | AP19-OLR Challenge: Three Tasks and Their Baselines |
Authors | Zhiyuan Tang, Dong Wang, Liming Song |
Abstract | This paper introduces the fourth oriental language recognition (OLR) challenge AP19-OLR, including the data profile, the tasks and the evaluation principles. The OLR challenge has been held successfully for three consecutive years, along with the APSIPA Annual Summit and Conference (APSIPA ASC). The challenge this year again focuses on practical and challenging tasks, namely (1) short-utterance LID, (2) cross-channel LID and (3) zero-resource LID. The event this year includes more languages and more real-life data provided by SpeechOcean and the NSFC M2ASR project. All the data is free for participants. Recipes for an x-vector system and back-end evaluation are also provided as baselines for the three tasks. Participants can refer to these online-published recipes to conveniently deploy LID systems. We report the baseline results on the three tasks and demonstrate that all three are worth further effort to achieve better performance. |
Tasks | |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07626v3 |
https://arxiv.org/pdf/1907.07626v3.pdf | |
PWC | https://paperswithcode.com/paper/ap19-olr-challenge-three-tasks-and-their |
Repo | |
Framework | |
High-Performance Support Vector Machines and Its Applications
Title | High-Performance Support Vector Machines and Its Applications |
Authors | Taiping He, Tao Wang, Ralph Abbey, Joshua Griffin |
Abstract | The support vector machines (SVM) algorithm is a popular classification technique in data mining and machine learning. In this paper, we propose a distributed SVM algorithm and demonstrate its use in a number of applications. The algorithm is named high-performance support vector machines (HPSVM). The major contribution of HPSVM is two-fold. First, HPSVM provides a new way to distribute computations to the machines in the cloud without shuffling the data. Second, HPSVM minimizes the inter-machine communications in order to maximize the performance. We apply HPSVM to some real-world classification problems and compare it with the state-of-the-art SVM technique implemented in R on several public data sets. HPSVM achieves similar or better results. |
Tasks | |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00331v1 |
http://arxiv.org/pdf/1905.00331v1.pdf | |
PWC | https://paperswithcode.com/paper/high-performance-support-vector-machines-and |
Repo | |
Framework | |
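The HPSVM algorithm itself is not spelled out in the abstract; the sketch below shows only the general pattern it alludes to: each machine trains on its local shard without data shuffling, and only weight vectors are communicated. The one-shot averaging is a naive stand-in, not the authors' method.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=20, step=0.1, seed=0):
    """Plain subgradient descent on the regularized hinge loss."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (w @ X[i])
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0)
            w -= step * grad
    return w

def distributed_svm(partitions, **kw):
    """One-shot averaging: each machine trains on its local shard (no data
    shuffling); only the weight vectors are communicated and averaged."""
    ws = [train_linear_svm(X, y, **kw) for X, y in partitions]
    return np.mean(ws, axis=0)

rng = np.random.default_rng(7)
X = rng.standard_normal((600, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) + 0.1 * rng.standard_normal(600))
shards = [(X[i::3], y[i::3]) for i in range(3)]   # data stays put on 3 machines
w = distributed_svm(shards)
print(np.mean(np.sign(X @ w) == y))               # training accuracy
```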