April 2, 2020

Paper Group ANR 162

Disentanglement with Hyperspherical Latent Spaces using Diffusion Variational Autoencoders. The Design of a Space-based Observation and Tracking System for Interstellar Objects. Distilled Semantics for Comprehensive Scene Understanding from Videos. Adaptive Stochastic Optimization. UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7 …

Disentanglement with Hyperspherical Latent Spaces using Diffusion Variational Autoencoders

Title Disentanglement with Hyperspherical Latent Spaces using Diffusion Variational Autoencoders
Authors Luis A. Pérez Rey
Abstract A disentangled representation of a data set should be capable of recovering the underlying factors that generated it. One question that arises is whether using Euclidean space for latent variable models can produce a disentangled representation when the underlying generating factors have a certain geometrical structure. Take, for example, the images of a car seen from different angles. The angle has a periodic structure, but a 1-dimensional representation would fail to capture this topology. How can we address this problem? The submissions presented for the first stage of the NeurIPS 2019 Disentanglement Challenge consist of a Diffusion Variational Autoencoder ($\Delta$VAE) with a hyperspherical latent space which can, for example, recover periodic true factors. The training of the $\Delta$VAE is enhanced by incorporating a modified version of the Evidence Lower Bound (ELBO) for tailoring the encoding capacity of the approximate posterior.
Tasks Latent Variable Models
Published 2020-03-19
URL https://arxiv.org/abs/2003.08996v1
PDF https://arxiv.org/pdf/2003.08996v1.pdf
PWC https://paperswithcode.com/paper/disentanglement-with-hyperspherical-latent
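
As a rough illustration of why a periodic factor such as a viewing angle fits a circle better than a line, the sketch below compares two nearly identical angles in both representations. It is not the $\Delta$VAE itself; the function name `embed_on_circle` and the sample angles are assumptions made for this example.

```python
import math

def embed_on_circle(angle_deg):
    """Map an angle in degrees to a point on the unit circle S^1."""
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))

# Two views of the car that differ by only 2 degrees (hypothetical values).
a, b = 359.0, 1.0

# On a line (1-D Euclidean latent), the two views look far apart.
line_gap = abs(a - b)

# On the circle, the same two views are close together.
circle_gap = math.dist(embed_on_circle(a), embed_on_circle(b))

print(f"gap on the line:   {line_gap:.1f}")
print(f"gap on the circle: {circle_gap:.4f}")
```

The gap on the circle is the chord length between the two points, which stays small whenever the angles are close modulo 360 degrees.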

The Design of a Space-based Observation and Tracking System for Interstellar Objects

Title The Design of a Space-based Observation and Tracking System for Interstellar Objects
Authors Ravi teja Nallapu, Yinan Xu, Abraham Marquez, Tristan Schuler, Jekan Thangavelautham
Abstract The recent observation of the interstellar objects 1I/Oumuamua and 2I/Borisov crossing the solar system has opened new opportunities for planetary science and planetary defense. As the first confirmed objects originating outside the solar system, they raise myriad origin questions to explore and discuss, including where they came from, how they got here, and what they are composed of. In addition, there is a need to be cognizant of the potential danger of impact, especially if such interstellar objects pass by the Earth. Specifically, Oumuamua, which was detected after its perihelion, passed by the Earth at around 0.2 AU with an estimated excess speed of 60 km/s relative to the Earth. Without enough forewarning time, a collision with such a high-speed object could pose a catastrophic danger to all life on Earth. Such challenges underscore the importance of detection and exploration systems to study these interstellar visitors. The detection system can include a spacecraft constellation with zenith-pointing telescope spacecraft. After an event is detected, a spacecraft swarm can be deployed from Earth to fly by the visitor. The flyby can then be designed to perform a proximity operation of interest. This work aims to develop algorithms to design these swarm missions through the IDEAS (Integrated Design Engineering & Automation of Swarms) architecture. Specifically, we develop automated algorithms to design an Earth-based detection constellation and a spacecraft swarm that generates detailed surface maps of the visitor during the rendezvous, along with their heliocentric cruise trajectories.
Published 2020-02-03
URL https://arxiv.org/abs/2002.00984v1
PDF https://arxiv.org/pdf/2002.00984v1.pdf
PWC https://paperswithcode.com/paper/the-design-of-a-space-based-observation-and

Distilled Semantics for Comprehensive Scene Understanding from Videos

Title Distilled Semantics for Comprehensive Scene Understanding from Videos
Authors Fabio Tosi, Filippo Aleotti, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Luigi Di Stefano, Stefano Mattoccia
Abstract Holistic understanding of the surroundings is paramount to autonomous systems. Recent works have shown that deep neural networks can learn geometry (depth) and motion (optical flow) from a monocular video without any explicit supervision from ground truth annotations, which are particularly hard to source for these two tasks. In this paper, we take an additional step toward holistic scene understanding with monocular cameras by learning depth and motion alongside semantics, with supervision for the latter provided by a pre-trained network distilling proxy ground truth images. We address the three tasks jointly by a) a novel training protocol based on knowledge distillation and self-supervision and b) a compact network architecture which enables efficient scene understanding on both power-hungry GPUs and low-power embedded platforms. We thoroughly assess the performance of our framework and show that it yields state-of-the-art results for monocular depth estimation, optical flow and motion segmentation.
Tasks Depth Estimation, Monocular Depth Estimation, Motion Segmentation, Optical Flow Estimation, Scene Understanding
Published 2020-03-31
URL https://arxiv.org/abs/2003.14030v1
PDF https://arxiv.org/pdf/2003.14030v1.pdf
PWC https://paperswithcode.com/paper/distilled-semantics-for-comprehensive-scene

Adaptive Stochastic Optimization

Title Adaptive Stochastic Optimization
Authors Frank E. Curtis, Katya Scheinberg
Abstract Optimization lies at the heart of machine learning and signal processing. Contemporary approaches based on the stochastic gradient method are non-adaptive in the sense that their implementation employs prescribed parameter values that need to be tuned for each application. This article summarizes recent research and motivates future work on adaptive stochastic optimization methods, which have the potential to offer significant computational savings when training large-scale systems.
Tasks Stochastic Optimization
Published 2020-01-18
URL https://arxiv.org/abs/2001.06699v1
PDF https://arxiv.org/pdf/2001.06699v1.pdf
PWC https://paperswithcode.com/paper/adaptive-stochastic-optimization

UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7B, Phase-B

Title UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7B, Phase-B
Authors Sai Krishna Telukuntla, Aditya Kapri, Wlodek Zadrozny
Abstract In this paper, we detail our submission to the seventh BioASQ competition (2019). We present our approach for Task-7b, Phase B, the Exact Answering Task. These Question Answering (QA) tasks include Factoid, Yes/No, and List type question answering. Our system is based on a contextual word embedding model. We have used a Bidirectional Encoder Representations from Transformers (BERT) based system, fine-tuned for the biomedical question answering task using BioBERT. In the third test batch set, our system achieved the highest MRR score for the Factoid question answering task. Also, for the List type question answering task, our system achieved the highest recall score in the fourth test batch set. Along with our detailed approach, we present the results for our submissions, and also highlight identified downsides of our current approach and ways to improve them in our future experiments.
Tasks Question Answering
Published 2020-02-05
URL https://arxiv.org/abs/2002.01984v1
PDF https://arxiv.org/pdf/2002.01984v1.pdf
PWC https://paperswithcode.com/paper/uncc-biomedical-semantic-question-answering
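
For readers unfamiliar with the MRR score mentioned in the abstract, here is a minimal sketch of mean reciprocal rank over ranked candidate answers. The function name and the toy questions are hypothetical, and the official BioASQ evaluation may apply additional rules (such as answer normalization) that are not modeled here.

```python
def mean_reciprocal_rank(ranked_answers, gold_answers):
    """Average of 1/rank of the first correct candidate per question.

    ranked_answers: list of candidate lists, best candidate first.
    gold_answers:   list of sets of acceptable answers, one per question.
    A question with no correct candidate contributes 0.
    """
    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_answers)

# Hypothetical system output for three factoid questions.
ranked = [["a", "b"], ["x", "y", "z"], ["q"]]
gold = [{"a"}, {"z"}, {"w"}]

# Reciprocal ranks are 1, 1/3 and 0, so the mean is 4/9.
mrr = mean_reciprocal_rank(ranked, gold)
print(f"MRR = {mrr:.4f}")
```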

STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage

Title STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage
Authors Ali HeydariGorji, Mahdi Torabzadehkashi, Siavash Rezaei, Hossein Bobarshad, Vladimir Alves, Pai H. Chou
Abstract This paper proposes a framework for distributed, in-storage training of neural networks on clusters of computational storage devices. Such devices not only contain hardware accelerators but also eliminate data movement between the host and storage, resulting in both improved performance and power savings. More importantly, this in-storage processing style of training ensures that private data never leaves the storage while fully controlling the sharing of public data. Experimental results show up to 2.7x speedup and 69% reduction in energy consumption and no significant loss in accuracy.
Published 2020-02-17
URL https://arxiv.org/abs/2002.07215v2
PDF https://arxiv.org/pdf/2002.07215v2.pdf
PWC https://paperswithcode.com/paper/stannis-low-power-acceleration-of-deep

What’s happened in MOOC Posts Analysis, Knowledge Tracing and Peer Feedbacks? A Review

Title What’s happened in MOOC Posts Analysis, Knowledge Tracing and Peer Feedbacks? A Review
Authors Manikandan Ravikiran
Abstract Learning Management Systems (LMS) and Educational Data Mining (EDM) are two important parts of the online educational environment, with the former being a centralised web-based information system where the learning content is managed and learning activities are organised (Stone and Zheng, 2014) and the latter focusing on using data mining techniques for the analysis of the data so generated. As part of this work, we present a literature review of three major tasks of EDM (see Section 2), identifying shortcomings and existing open problems, and a Blumenfield chart (see Section 3). The consolidated set of papers and resources so used is released at https://github.com/manikandan-ravikiran/cs6460-Survey. The coverage statistics and review matrix of the survey are shown in Figure 1 and Table 1, respectively. Acronym expansions are added in Appendix Section 4.1.
Tasks Knowledge Tracing
Published 2020-01-27
URL https://arxiv.org/abs/2001.09830v1
PDF https://arxiv.org/pdf/2001.09830v1.pdf
PWC https://paperswithcode.com/paper/whats-happened-in-mooc-posts-analysis

Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction

Title Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction
Authors Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
Abstract We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions. The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning. This multiscale graph is adaptive during training and dynamic across network layers. Based on this graph, we propose a multiscale graph computational unit (MGCU) to extract features at individual scales and fuse features across scales. The entire model is action-category-agnostic and follows an encoder-decoder framework. The encoder consists of a sequence of MGCUs to learn motion features. The decoder uses a proposed graph-based gate recurrent unit to generate future poses. Extensive experiments show that the proposed DMGNN outperforms state-of-the-art methods in both short-term and long-term predictions on the Human 3.6M and CMU Mocap datasets. We further investigate the learned multiscale graphs for interpretability. The code can be downloaded from https://github.com/limaosen0/DMGNN.
Tasks 3D Human Pose Estimation, 3D Pose Estimation, motion prediction
Published 2020-03-17
URL https://arxiv.org/abs/2003.08802v1
PDF https://arxiv.org/pdf/2003.08802v1.pdf
PWC https://paperswithcode.com/paper/dynamic-multiscale-graph-neural-networks-for

Domain Adaption for Knowledge Tracing

Title Domain Adaption for Knowledge Tracing
Authors Song Cheng, Qi Liu, Enhong Chen
Abstract With the rapid development of online education systems, knowledge tracing, which aims at predicting students’ knowledge state, is becoming a critical and fundamental task in personalized education. Traditionally, existing methods are domain-specific. However, there are a large number of domains (e.g., subjects, schools) in the real world, and data is lacking in some of them, so how to utilize the knowledge and information in other domains to help train a knowledge tracing model for target domains is increasingly important. We refer to this problem as domain adaptation for knowledge tracing (DAKT), which contains two aspects: (1) how to achieve strong knowledge tracing performance in each domain, and (2) how to transfer a well-performing knowledge tracing model between domains. To this end, in this paper, we propose a novel adaptable framework, namely adaptable knowledge tracing (AKT), to address the DAKT problem. Specifically, for the first aspect, we incorporate educational characteristics (e.g., slip, guess, question texts) into deep knowledge tracing (DKT) to obtain a well-performing knowledge tracing model. For the second aspect, we propose and adopt three domain adaptation processes. First, we pre-train an auto-encoder to select useful source instances for target model training. Second, we minimize the domain-specific knowledge state distribution discrepancy under the maximum mean discrepancy (MMD) measurement to achieve domain adaptation. Third, we adopt fine-tuning to handle the differing output dimensions of the source and target domains, making the model suitable for target domains. Extensive experimental results on two private datasets and seven public datasets clearly demonstrate the effectiveness of AKT, both in knowledge tracing performance and in transferability.
Tasks Domain Adaptation, Knowledge Tracing
Published 2020-01-14
URL https://arxiv.org/abs/2001.04841v1
PDF https://arxiv.org/pdf/2001.04841v1.pdf
PWC https://paperswithcode.com/paper/domain-adaption-for-knowledge-tracing
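
The maximum mean discrepancy (MMD) used in the second adaptation step can be sketched as follows. This is a generic biased estimate of squared MMD with an RBF kernel, not the AKT implementation; the kernel choice, the `gamma` value, and the toy "knowledge state" vectors are assumptions for illustration.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, gamma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical knowledge-state vectors from a source and a target domain.
source = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)]
target = [(2.0, 2.0), (2.1, 2.2), (2.2, 2.1)]

same = mmd_squared(source, source)      # identical samples
shifted = mmd_squared(source, target)   # shifted distribution
print(f"MMD^2(source, source) = {same:.4f}")
print(f"MMD^2(source, target) = {shifted:.4f}")
```

Minimizing a quantity like this with respect to the encoder parameters pulls the two feature distributions together, which is the general idea behind MMD-based domain adaptation.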

Input Validation for Neural Networks via Runtime Local Robustness Verification

Title Input Validation for Neural Networks via Runtime Local Robustness Verification
Authors Jiangchao Liu, Liqian Chen, Antoine Mine, Ji Wang
Abstract Local robustness verification can verify that a neural network is robust with respect to any perturbation to a specific input within a certain distance. We call this distance the Robustness Radius. We observe that the robustness radii of correctly classified inputs are much larger than those of misclassified inputs, which include adversarial examples, especially those from strong adversarial attacks. Another observation is that the robustness radii of correctly classified inputs often follow a normal distribution. Based on these two observations, we propose to validate inputs for neural networks via runtime local robustness verification. Experiments show that our approach can protect neural networks from adversarial examples and improve their accuracies.
Published 2020-02-09
URL https://arxiv.org/abs/2002.03339v1
PDF https://arxiv.org/pdf/2002.03339v1.pdf
PWC https://paperswithcode.com/paper/input-validation-for-neural-networks-via
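
The two observations in the abstract suggest a simple acceptance rule, sketched below under stated assumptions. The abstract does not describe the exact validation procedure, so the threshold rule (mean minus a multiple of the standard deviation), the function names, and the radii values are all hypothetical; the robustness radius of each input is assumed to come from an external verifier.

```python
import statistics

def fit_radius_threshold(correct_radii, num_std=2.0):
    """Derive a lower bound on acceptable radii from correctly classified inputs.

    Assumes the radii are roughly normally distributed, as the abstract reports.
    """
    mean = statistics.mean(correct_radii)
    std = statistics.stdev(correct_radii)
    return mean - num_std * std

def is_input_accepted(radius, threshold):
    """Accept an input only if its verified robustness radius is large enough."""
    return radius >= threshold

# Hypothetical radii measured on correctly classified validation inputs.
validation_radii = [0.040, 0.045, 0.050, 0.055, 0.060]
threshold = fit_radius_threshold(validation_radii)

print(f"threshold = {threshold:.4f}")
print("typical input accepted:", is_input_accepted(0.048, threshold))
print("suspicious input accepted:", is_input_accepted(0.005, threshold))
```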

Born-Again Tree Ensembles

Title Born-Again Tree Ensembles
Authors Thibaut Vidal, Toni Pacheco, Maximilian Schiffer
Abstract The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix possible sources of mistakes and biases. Tree ensembles offer a good prediction quality in various domains, but the concurrent use of multiple trees reduces the interpretability of the ensemble. Against this background, we study born-again tree ensembles, i.e., the process of constructing a single decision tree of minimum size that reproduces the exact same behavior as a given tree ensemble. To find such a tree, we develop a dynamic-programming based algorithm that exploits sophisticated pruning and bounding rules to reduce the number of recursive calls. This algorithm generates optimal born-again trees for many datasets of practical interest, leading to classifiers which are typically simpler and more interpretable without any other form of compromise.
Tasks Interpretable Machine Learning
Published 2020-03-24
URL https://arxiv.org/abs/2003.11132v1
PDF https://arxiv.org/pdf/2003.11132v1.pdf
PWC https://paperswithcode.com/paper/born-again-tree-ensembles
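
To make the idea of a single model that reproduces an ensemble exactly more concrete, here is a toy one-dimensional sketch. It is not the paper's dynamic-programming algorithm: it only handles binary-label decision stumps on one feature, and every name and threshold in it is an assumption chosen for illustration.

```python
def ensemble_predict(stumps, x):
    """Majority vote; each stump is (threshold, label_if_le, label_if_gt)."""
    votes = sum(le if x <= t else gt for t, le, gt in stumps)
    return 1 if 2 * votes > len(stumps) else 0

def collapse_to_splits(stumps):
    """Return (splits, first_label) describing an equivalent 1-D classifier."""
    thresholds = sorted({t for t, _, _ in stumps})
    # One representative point per cell: (-inf, t1], (t1, t2], ..., (tk, inf).
    reps = [thresholds[0] - 1.0]
    reps += [(lo + hi) / 2.0 for lo, hi in zip(thresholds, thresholds[1:])]
    reps.append(thresholds[-1] + 1.0)
    labels = [ensemble_predict(stumps, r) for r in reps]
    # Keep a threshold only if the ensemble's label changes across it.
    splits = [t for t, left, right in zip(thresholds, labels, labels[1:])
              if left != right]
    return splits, labels[0]

def single_predict(splits, first_label, x):
    """With binary labels, each retained split flips the prediction."""
    crossed = sum(1 for t in splits if x > t)
    return first_label if crossed % 2 == 0 else 1 - first_label

# Hypothetical ensemble of three stumps on a single feature.
stumps = [(1.0, 0, 1), (2.0, 0, 1), (3.0, 1, 0)]
splits, first_label = collapse_to_splits(stumps)
print("retained splits:", splits, "label left of first split:", first_label)
```

In this example the three stumps collapse to a single split, because the majority vote changes only at the first threshold. The paper addresses the much harder multi-feature case, where finding a minimum-size equivalent tree requires search.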

Novelty Producing Synaptic Plasticity

Title Novelty Producing Synaptic Plasticity
Authors Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher, Mykola Pechenizkiy
Abstract A learning process with the plasticity property often requires reinforcement signals to guide the process. However, in some tasks (e.g. maze-navigation), it is very difficult (or impossible) to measure the performance of an agent (i.e. a fitness value) to provide reinforcements since the position of the goal is not known. This requires finding the correct behavior among a vast number of possible behaviors without having knowledge of the reinforcement signals. In these cases, an exhaustive search may be needed. However, this might not be feasible, especially when optimizing artificial neural networks in continuous domains. In this work, we introduce novelty producing synaptic plasticity (NPSP), where we evolve synaptic plasticity rules to produce as many novel behaviors as possible to find the behavior that can solve the problem. We evaluate the NPSP on maze-navigation in deceptive maze environments that require complex actions and the achievement of subgoals to complete. Our results show that the search heuristic used with the proposed NPSP is indeed capable of producing many more novel behaviors in comparison with a random search taken as baseline.
Published 2020-02-10
URL https://arxiv.org/abs/2002.03620v1
PDF https://arxiv.org/pdf/2002.03620v1.pdf
PWC https://paperswithcode.com/paper/novelty-producing-synaptic-plasticity

Interpretable machine learning models: a physics-based view

Title Interpretable machine learning models: a physics-based view
Authors Ion Matei, Johan de Kleer, Christoforos Somarakis, Rahul Rai, John S. Baras
Abstract To understand changes in physical systems and facilitate decisions, explaining how model predictions are made is crucial. We use model-based interpretability, where models of physical systems are constructed by composing basic constructs that explain locally how energy is exchanged and transformed. We use the port Hamiltonian (p-H) formalism to describe the basic constructs that contain physically interpretable processes commonly found in the behavior of physical systems. We describe how we can build models out of the p-H constructs and how we can train them. In addition, we show how we can impose physical properties such as dissipativity that ensure numerical stability of the training process. We give examples of how to build and train models for describing the behavior of two physical systems: the inverted pendulum and swarm dynamics.
Tasks Interpretable Machine Learning
Published 2020-03-22
URL https://arxiv.org/abs/2003.10025v1
PDF https://arxiv.org/pdf/2003.10025v1.pdf
PWC https://paperswithcode.com/paper/interpretable-machine-learning-models-a
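
For context, the standard input-state-output port-Hamiltonian form and its energy balance can be written as below. This is the textbook formulation, given here as background; the symbols $J$, $R$, $G$, $H$, $u$ and $y$ are the usual ones and are not taken from the abstract, and the specific constructs used in the paper may differ.

```latex
\begin{aligned}
\dot{x} &= \bigl(J(x) - R(x)\bigr)\,\nabla H(x) + G(x)\,u, \\
y &= G(x)^{\top}\,\nabla H(x), \\
&\text{with } J(x) = -J(x)^{\top}, \quad R(x) = R(x)^{\top} \succeq 0, \\[4pt]
\frac{dH}{dt} &= \nabla H(x)^{\top}\,\dot{x}
  = -\,\nabla H(x)^{\top} R(x)\,\nabla H(x) + y^{\top} u
  \;\le\; y^{\top} u .
\end{aligned}
```

The skew-symmetric term $J$ drops out of the energy balance, so the stored energy $H$ can grow no faster than the power supplied through the port $(u, y)$. This is the dissipativity property that the abstract links to stable training.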

Toward Interpretable Machine Learning: Transparent Deep Neural Networks and Beyond

Title Toward Interpretable Machine Learning: Transparent Deep Neural Networks and Beyond
Authors Wojciech Samek, Grégoire Montavon, Sebastian Lapuschkin, Christopher J. Anders, Klaus-Robert Müller
Abstract With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for explainable AI. Interpretability and explanation methods for gaining a better understanding of the problem-solving abilities and strategies of nonlinear Machine Learning, such as Deep Learning (DL), LSTMs, and kernel methods, are therefore receiving increased attention. In this work we aim to (1) provide a timely overview of this active emerging field and explain its theoretical foundations, (2) put interpretability algorithms to a test both from a theory and comparative evaluation perspective using extensive simulations, (3) outline best-practice aspects, i.e., how to best include interpretation methods into the standard usage of machine learning, and (4) demonstrate successful usage of explainable AI in a representative selection of application scenarios. Finally, we discuss challenges and possible future directions of this exciting foundational field of machine learning.
Tasks Interpretable Machine Learning
Published 2020-03-17
URL https://arxiv.org/abs/2003.07631v1
PDF https://arxiv.org/pdf/2003.07631v1.pdf
PWC https://paperswithcode.com/paper/toward-interpretable-machine-learning

Explaining Groups of Points in Low-Dimensional Representations

Title Explaining Groups of Points in Low-Dimensional Representations
Authors Gregory Plumb, Jonathan Terhorst, Sriram Sankararaman, Ameet Talwalkar
Abstract A common workflow in data exploration is to learn a low-dimensional representation of the data, identify groups of points in that representation, and examine the differences between the groups to determine what they represent. We treat this as an interpretable machine learning problem by leveraging the model that learned the low-dimensional representation to help identify the key differences between the groups. To solve this problem, we introduce a new type of explanation, a Global Counterfactual Explanation (GCE), and our algorithm, Transitive Global Translations (TGT), for computing GCEs. TGT identifies the differences between each pair of groups using compressed sensing but constrains those pairwise differences to be consistent among all of the groups. Empirically, we demonstrate that TGT is able to identify explanations that accurately explain the model while being relatively sparse, and that these explanations match real patterns in the data.
Tasks Interpretable Machine Learning
Published 2020-03-03
URL https://arxiv.org/abs/2003.01640v2
PDF https://arxiv.org/pdf/2003.01640v2.pdf
PWC https://paperswithcode.com/paper/explaining-groups-of-points-in-low