April 1, 2020

# Paper Group ANR 427


#### A Multi-Channel Neural Graphical Event Model with Negative Evidence

Title A Multi-Channel Neural Graphical Event Model with Negative Evidence
Authors Tian Gao, Dharmashankar Subramanian, Karthikeyan Shanmugam, Debarun Bhattacharjya, Nicholas Mattei
Abstract Event datasets are sequences of events of various types occurring irregularly over the timeline, and they are increasingly prevalent in numerous domains. Existing work for modeling events using conditional intensities relies either on some underlying parametric form to capture historical dependencies, or on non-parametric models that focus primarily on tasks such as prediction. We propose a non-parametric deep neural network approach to estimate the underlying intensity functions. We use a novel multi-channel RNN that optimally reinforces the negative evidence of no observable events with the introduction of fake event epochs within each consecutive inter-event interval. We evaluate our method against state-of-the-art baselines on model fitting tasks as gauged by log-likelihood. Through experiments on both synthetic and real-world datasets, we find that our proposed approach outperforms existing baselines on most of the datasets studied.
Published 2020-02-21
URL https://arxiv.org/abs/2002.09575v1
PDF https://arxiv.org/pdf/2002.09575v1.pdf
PWC https://paperswithcode.com/paper/a-multi-channel-neural-graphical-event-model
Repo
Framework
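
For context, the log-likelihood used to gauge model fit for an event sequence on [0, T] with conditional intensity λ(t) is the standard temporal point process expression, log L = Σ_i log λ(t_i) − ∫_0^T λ(s) ds. The integral term is where stretches without events (the "negative evidence") enter. The sketch below evaluates this expression for a piecewise-constant intensity; the function name and the piecewise-constant form are illustrative assumptions, not the paper's neural model.

```python
import math
from bisect import bisect_right

def log_likelihood(event_times, breakpoints, rates, horizon):
    """Log-likelihood of events on [0, horizon] under a piecewise-constant intensity.

    breakpoints: sorted start times of each piece, beginning with 0.0.
    rates: positive intensity value on each piece (same length as breakpoints).
    """
    # Sum of log-intensities at the observed event times.
    log_term = 0.0
    for t in event_times:
        piece = bisect_right(breakpoints, t) - 1
        log_term += math.log(rates[piece])

    # Integrated intensity over the whole window, including event-free stretches.
    edges = list(breakpoints) + [horizon]
    integral = sum(r * (edges[i + 1] - edges[i]) for i, r in enumerate(rates))
    return log_term - integral
```

Raising the intensity over an interval that contains no events increases only the integral term, so the likelihood drops; this is the penalty that event-free time imposes on a fitted model.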

#### Deep Fusion of Local and Non-Local Features for Precision Landslide Recognition

Title Deep Fusion of Local and Non-Local Features for Precision Landslide Recognition
Authors Qing Zhu, Lin Chen, Han Hu, Binzhi Xu, Yeting Zhang, Haifeng Li
Abstract Precision mapping of landslide inventory is crucial for hazard mitigation. Most landslides generally co-exist with other confusing geological features, and the presence of such areas can only be inferred unambiguously at a large scale. In addition, local information is also important for the preservation of object boundaries. Aiming to solve this problem, this paper proposes an effective approach to fuse both local and non-local features to surmount the contextual problem. Built upon the U-Net architecture that is widely adopted in the remote sensing community, we utilize two additional modules. The first one uses dilated convolution and the corresponding atrous spatial pyramid pooling, which enlarged the receptive field without sacrificing spatial resolution or increasing memory usage. The second uses a scale attention mechanism to guide the up-sampling of features from the coarse level by a learned weight map. In implementation, the computational overhead against the original U-Net was only a few convolutional layers. Experimental evaluations revealed that the proposed method outperformed state-of-the-art general-purpose semantic segmentation approaches. Furthermore, ablation studies have shown that the two models afforded extensive enhancements in landslide-recognition performance.
Published 2020-02-20
URL https://arxiv.org/abs/2002.08547v1
PDF https://arxiv.org/pdf/2002.08547v1.pdf
PWC https://paperswithcode.com/paper/deep-fusion-of-local-and-non-local-features
Repo
Framework
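
The statement that dilated convolution enlarges the receptive field without extra parameters follows from simple arithmetic: a kernel of size k with dilation d spans k + (k − 1)(d − 1) input positions while still holding only k weights per axis. A minimal sketch, assuming stride 1; the dilation rates are hypothetical, since the abstract does not state which rates were used.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span, in input pixels, covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 dilated convolutions applied one after another."""
    field = 1
    for d in dilations:
        field += effective_kernel_size(kernel_size, d) - 1
    return field

# Hypothetical dilation rates for illustration only.
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel_size(3, rate))
```

Atrous spatial pyramid pooling applies such branches in parallel rather than in sequence, so `effective_kernel_size` describes each branch, while `stacked_receptive_field` covers the separate case of sequential layers.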

#### ElixirNet: Relation-aware Network Architecture Adaptation for Medical Lesion Detection

Title ElixirNet: Relation-aware Network Architecture Adaptation for Medical Lesion Detection
Authors Chenhan Jiang, Shaoju Wang, Hang Xu, Xiaodan Liang, Nong Xiao
Abstract Most advances in medical lesion detection networks are limited to subtle modifications of conventional detection networks designed for natural images. However, there exists a vast domain gap between medical images and natural images, where medical image detection often suffers from several domain-specific challenges, such as high lesion/background similarity, dominant tiny lesions, and severe class imbalance. Is a hand-crafted detection network tailored for natural images undoubtedly good enough for a discrepant medical lesion domain? Are there more powerful operations, filters, and sub-networks that better fit the medical lesion detection problem to be discovered? In this paper, we introduce a novel ElixirNet that includes three components: 1) TruncatedRPN balances positive and negative data for false positive reduction; 2) Auto-lesion Block is automatically customized for medical images to incorporate relation-aware operations among region proposals, and leads to more suitable and efficient classification and localization; 3) Relation transfer module incorporates the semantic relationship and transfers the relevant contextual information with an interpretable graph, thus alleviating the problem of lack of annotations for all types of lesions. Experiments on DeepLesion and Kits19 prove the effectiveness of ElixirNet, achieving improvement of both sensitivity and precision over FPN with fewer parameters.
Published 2020-03-03
URL https://arxiv.org/abs/2003.08770v1
PDF https://arxiv.org/pdf/2003.08770v1.pdf
PWC https://paperswithcode.com/paper/elixirnet-relation-aware-network-architecture
Repo
Framework

#### From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI

Title From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI
Authors Sebastian Risi, Mike Preuss
Abstract This paper reviews the field of Game AI, which not only deals with creating agents that can play a certain game, but also with areas as diverse as creating game content automatically, game analytics, or player modelling. While Game AI was for a long time not very well recognized by the larger scientific community, it has established itself as a research area for developing and testing the most advanced forms of AI algorithms, and articles covering advances in mastering video games such as StarCraft 2 and Quake III appear in the most prestigious journals. Because of the growth of the field, a single review cannot cover it completely. Therefore, we put a focus on important recent developments, including that advances in Game AI are starting to be extended to areas outside of games, such as robotics or the synthesis of chemicals. In this article, we review the algorithms and methods that have paved the way for these breakthroughs, report on the other important areas of Game AI research, and also point out exciting directions for the future of Game AI.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10433v1
PDF https://arxiv.org/pdf/2002.10433v1.pdf
PWC https://paperswithcode.com/paper/from-chess-and-atari-to-starcraft-and-beyond
Repo
Framework

#### Minimax Theorem for Latent Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

Title Minimax Theorem for Latent Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets
Authors Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach
Abstract Adversarial training, a special case of multi-objective optimization, is an increasingly useful tool in machine learning. For example, two-player zero-sum games are important for generative modeling (GANs) and for mastering games like Go or Poker via self-play. A classic result in Game Theory states that one must mix strategies, as pure equilibria may not exist. Surprisingly, machine learning practitioners typically train a *single* pair of agents – instead of a pair of mixtures – going against Nash’s principle. Our main contribution is a notion of limited-capacity-equilibrium for which, as capacity grows, optimal agents – not mixtures – can learn increasingly expressive and realistic behaviors. We define *latent games*, a new class of game where agents are mappings that transform latent distributions. Examples include generators in GANs, which transform Gaussian noise into distributions on images, and StarCraft II agents, which transform sampled build orders into policies. We show that minimax equilibria in latent games can be approximated by a *single* pair of dense neural networks. Finally, we apply our latent game approach to solve differentiable Blotto, a game with an infinite strategy space.
Published 2020-02-14
URL https://arxiv.org/abs/2002.05820v1
PDF https://arxiv.org/pdf/2002.05820v1.pdf
PWC https://paperswithcode.com/paper/minimax-theorem-for-latent-games-or-how-i
Repo
Framework
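
The classic result mentioned in the abstract can be seen in matching pennies, a two-player zero-sum game with no pure-strategy equilibrium. The sketch below is an illustration chosen here, not an example taken from the paper; it compares what the row player can guarantee with pure strategies against what the uniform mixture guarantees.

```python
# Row player's payoff matrix for matching pennies; the column player receives the negative.
PAYOFF = [[1, -1],
          [-1, 1]]

def pure_maximin(payoff):
    """Best payoff the row player can guarantee with a pure strategy."""
    return max(min(row) for row in payoff)

def pure_minimax(payoff):
    """Lowest payoff the column player can hold the row player to with a pure strategy."""
    columns = list(zip(*payoff))
    return min(max(col) for col in columns)

def mixed_guarantee(payoff, row_mix):
    """Worst-case expected payoff for the row player when playing the mixture row_mix."""
    columns = list(zip(*payoff))
    return min(sum(p * a for p, a in zip(row_mix, col)) for col in columns)

# A gap between the two pure values means there is no pure saddle point.
print(pure_maximin(PAYOFF), pure_minimax(PAYOFF))
print(mixed_guarantee(PAYOFF, [0.5, 0.5]))
```

With pure strategies the two values differ, whereas the uniform mixture guarantees the game's value of zero, which is why mixing is needed in the classical setting.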

#### Can AI help in screening Viral and COVID-19 pneumonia?

Title Can AI help in screening Viral and COVID-19 pneumonia?
Authors Muhammad E. H. Chowdhury, Tawsifur Rahman, Amith Khandakar, Rashid Mazhar, Muhammad Abdul Kadir, Zaid Bin Mahbub, Khandakar R. Islam, Muhammad Salman Khan, Atif Iqbal, Nasser Al-Emadi, Mamun Bin Ibne Reaz
Abstract Coronavirus disease (COVID-19) is a pandemic disease, which has already infected more than half a million people and caused fatalities of above 30 thousand. The aim of this paper is to automatically detect COVID-19 pneumonia patients using digital x-ray images while maximizing the accuracy in detection using image pre-processing and deep-learning techniques. A public database was created by the authors using three public databases and also by collecting images from recently published articles. The database contains a mixture of 190 COVID-19, 1345 viral pneumonia, and 1341 normal chest x-ray images. An image augmented training set was created with 2500 images of each category for training and validating four different pre-trained deep Convolutional Neural Networks (CNNs). These networks were tested for the classification of two different schemes (normal and COVID-19 pneumonia; normal, viral and COVID-19 pneumonia). The classification accuracy, sensitivity, specificity and precision for both the schemes were 98.3%, 96.7%, 100%, 100% and 98.3%, 96.7%, 99%, 100%, respectively. The high accuracy of this computer-aided diagnostic tool can significantly improve the speed and accuracy of diagnosing cases with COVID-19. This would be highly useful in this pandemic where disease burden and need for preventive measures are at odds with available resources.
Published 2020-03-29
URL https://arxiv.org/abs/2003.13145v1
PDF https://arxiv.org/pdf/2003.13145v1.pdf
PWC https://paperswithcode.com/paper/can-ai-help-in-screening-viral-and-covid-19
Repo
Framework
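
The four reported metrics all derive from confusion-matrix counts. A minimal sketch for the two-class scheme follows; the counts are made up for illustration, since the paper's confusion matrices are not given in this abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity and precision from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on the positive (COVID-19) class
        "specificity": tn / (tn + fp),   # recall on the negative (normal) class
        "precision": tp / (tp + fp),
    }

# Hypothetical counts for illustration only, not the paper's results.
metrics = binary_metrics(tp=45, fp=5, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For the three-class scheme, the same formulas are typically applied per class in a one-vs-rest fashion and then averaged; the abstract does not say which averaging was used.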

#### Learning Adaptive Loss for Robust Learning with Noisy Labels

Title Learning Adaptive Loss for Robust Learning with Noisy Labels
Authors Jun Shu, Qian Zhao, Keyu Chen, Zongben Xu, Deyu Meng
Abstract Robust loss minimization is an important strategy for robust learning with noisy labels. Current robust loss functions, however, inevitably involve hyperparameter(s) to be tuned, manually or heuristically through cross validation, which makes them fairly hard to apply generally in practice. Besides, the non-convexity brought by the loss, as well as the complicated network architecture, makes training easily trapped in an unexpected solution with poor generalization capability. To address the above issues, we propose a meta-learning method capable of adaptively learning the hyperparameters in robust loss functions. Specifically, through mutual amelioration between the robust loss hyperparameters and the network parameters in our method, both can be simultaneously finely learned and coordinated to attain solutions with good generalization capability. Four kinds of SOTA robust loss functions are integrated into our algorithm, and comprehensive experiments substantiate the general availability and effectiveness of the proposed method in both its accuracy and generalization performance, as compared with the conventional hyperparameter tuning strategy, even with carefully tuned hyperparameters.
Published 2020-02-16
URL https://arxiv.org/abs/2002.06482v1
PDF https://arxiv.org/pdf/2002.06482v1.pdf
Repo
Framework

#### On State Variables, Bandit Problems and POMDPs

Title On State Variables, Bandit Problems and POMDPs
Authors Warren B Powell
Abstract State variables are easily the most subtle dimension of sequential decision problems. This is especially true in the context of active learning problems ("bandit problems") where decisions affect what we observe and learn. We describe our canonical framework that models *any* sequential decision problem, and present our definition of state variables that allows us to claim: Any properly modeled sequential decision problem is Markovian. We then present a novel two-agent perspective of partially observable Markov decision problems (POMDPs) that allows us to then claim: Any model of a real decision problem is (possibly) non-Markovian. We illustrate these perspectives using the context of observing and treating flu in a population, and provide examples of all four classes of policies in this setting. We close with an indication of how to extend this thinking to multiagent problems.
Published 2020-02-14
URL https://arxiv.org/abs/2002.06238v1
PDF https://arxiv.org/pdf/2002.06238v1.pdf
PWC https://paperswithcode.com/paper/on-state-variables-bandit-problems-and-pomdps
Repo
Framework

#### Image compression optimized for 3D reconstruction by utilizing deep neural networks

Title Image compression optimized for 3D reconstruction by utilizing deep neural networks
Authors Alex Golts, Yoav Y. Schechner
Abstract Computer vision tasks are often expected to be executed on compressed images. Classical image compression standards like JPEG 2000 are widely used. However, they do not account for the specific end-task at hand. Motivated by works on recurrent neural network (RNN)-based image compression and three-dimensional (3D) reconstruction, we propose unified network architectures to solve both tasks jointly. These joint models provide image compression tailored for the specific task of 3D reconstruction. Images compressed by our proposed models yield 3D reconstruction performance superior to that obtained with JPEG 2000 compression. Our models significantly extend the range of compression rates for which 3D reconstruction is possible. We also show that this can be done highly efficiently, with compression obtained at almost no additional cost on top of the computation already required for performing the 3D reconstruction task.
Published 2020-03-27
URL https://arxiv.org/abs/2003.12618v1
PDF https://arxiv.org/pdf/2003.12618v1.pdf
PWC https://paperswithcode.com/paper/image-compression-optimized-for-3d
Repo
Framework

#### A Deep Learning Method for Complex Human Activity Recognition Using Virtual Wearable Sensors

Title A Deep Learning Method for Complex Human Activity Recognition Using Virtual Wearable Sensors
Authors Fanyi Xiao, Ling Pei, Lei Chu, Danping Zou, Wenxian Yu, Yifan Zhu, Tao Li
Abstract Sensor-based human activity recognition (HAR) is now a research hotspot in multiple application areas. With the rise of smart wearable devices equipped with inertial measurement units (IMUs), researchers have begun to utilize IMU data for HAR. By employing machine learning algorithms, early IMU-based research for HAR can achieve accurate classification results on traditional HAR datasets, which contain only simple and repetitive daily activities. However, these datasets rarely display the rich diversity of information found in real scenes. In this paper, we propose a novel method based on deep learning for complex HAR in real scenes. Specifically, in the off-line training stage, the AMASS dataset, containing abundant human poses and virtual IMU data, is adopted to enhance variety and diversity. Moreover, a deep convolutional neural network with an unsupervised penalty is proposed to automatically extract the features of AMASS and improve robustness. In the on-line testing stage, by leveraging the advantages of transfer learning, we obtain the final result by fine-tuning part of the neural network (optimizing the parameters in the fully-connected layers) using the real IMU data. The experimental results show that the proposed method can converge in a few iterations and achieve an accuracy of 91.15% on a real IMU dataset, demonstrating the efficiency and effectiveness of the proposed method.
Tasks Activity Recognition, Human Activity Recognition, Transfer Learning
Published 2020-03-04
URL https://arxiv.org/abs/2003.01874v2
PDF https://arxiv.org/pdf/2003.01874v2.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-method-for-complex-human
Repo
Framework

#### Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery

Title Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery
Authors Zepeng Huo, Arash PakBin, Xiaohan Chen, Nathan Hurley, Ye Yuan, Xiaoning Qian, Zhangyang Wang, Shuai Huang, Bobak Mortazavi
Abstract Activity recognition in wearable computing faces two key challenges: i) activity characteristics may be context-dependent and change under different contexts or situations; ii) unknown contexts and activities may occur from time to time, requiring flexibility and adaptability of the algorithm. We develop a context-aware mixture of deep models termed the α-β network coupled with uncertainty quantification (UQ) based upon maximum entropy to enhance human activity recognition performance. We improve accuracy and F score by 10% by identifying high-level contexts in a data-driven way to guide model development. In order to ensure training stability, we have used a clustering-based pre-training in both public and in-house datasets, demonstrating improved accuracy through unknown context discovery.
Tasks Activity Recognition, Human Activity Recognition
Published 2020-03-03
URL https://arxiv.org/abs/2003.01753v1
PDF https://arxiv.org/pdf/2003.01753v1.pdf
PWC https://paperswithcode.com/paper/uncertainty-quantification-for-deep-context
Repo
Framework

#### Human Activity Recognition using Multi-Head CNN followed by LSTM

Title Human Activity Recognition using Multi-Head CNN followed by LSTM
Authors Waqar Ahmad, Misbah Kazmi, Hazrat Ali
Abstract This study presents a novel method to recognize human physical activities using a CNN followed by an LSTM. Achieving high accuracy with traditional machine learning algorithms (such as SVM, KNN and the random forest method) is a challenging task because the data acquired from wearable sensors like accelerometers and gyroscopes is time-series data. So, to achieve high accuracy, we propose a multi-head CNN model comprising three CNNs that extract features from the data acquired from different sensors; the three CNNs are then merged and followed by an LSTM layer and a dense layer. The configuration of all three CNNs is kept the same so that the same number of features is obtained for every input to the CNN. By using the proposed method, we achieve state-of-the-art accuracy, which is comparable to traditional machine learning algorithms and other deep neural network algorithms.
Tasks Activity Recognition, Human Activity Recognition, Time Series
Published 2020-02-21
URL https://arxiv.org/abs/2003.06327v1
PDF https://arxiv.org/pdf/2003.06327v1.pdf
Repo
Framework

#### Autonomous robotic nanofabrication with reinforcement learning

Title Autonomous robotic nanofabrication with reinforcement learning
Authors Philipp Leinen, Malte Esders, Kristof T. Schütt, Christian Wagner, Klaus-Robert Müller, F. Stefan Tautz
Abstract The ability to handle single molecules as effectively as macroscopic building-blocks would enable the construction of complex supramolecular structures that are not accessible by self-assembly. The fundamental challenges on the way towards this goal are the uncontrolled variability and poor observability of atomic-scale conformations. Here, we present a strategy to work around both obstacles, and demonstrate autonomous robotic nanofabrication by manipulating single molecules. Our approach employs reinforcement learning (RL), which is able to learn solution strategies even in the face of large uncertainty and with sparse feedback. However, to be useful for autonomous nanofabrication, standard RL algorithms need to be adapted to cope with the limited training opportunities available. We demonstrate the potential of our RL approach by applying it to an exemplary task of subtractive manufacturing, the removal of individual molecules from a molecular layer using a scanning probe microscope (SPM). Our RL agent reaches an excellent performance level, enabling us to automate a task which previously had to be performed by a human. We anticipate that our work opens the way towards autonomous agents for the robotic construction of functional supramolecular structures with speed, precision and perseverance beyond our current capabilities.
Published 2020-02-27
URL https://arxiv.org/abs/2002.11952v1
PDF https://arxiv.org/pdf/2002.11952v1.pdf
PWC https://paperswithcode.com/paper/autonomous-robotic-nanofabrication-with
Repo
Framework

#### An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos

Title An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos
Authors S. H. Shabbeer Basha, Viswanath Pulabaigari, Snehasis Mukherjee
Abstract We propose a novel scheme for human action recognition in videos, using a 3-dimensional Convolutional Neural Network (3D CNN) based classifier. Traditionally in deep learning based human activity recognition approaches, either a few random frames or every $k^{th}$ frame of the video is considered for training the 3D CNN, where $k$ is a small positive integer, like 4, 5, or 6. This kind of sampling reduces the volume of the input data, which speeds up training of the network and also avoids over-fitting to some extent, thus enhancing the performance of the 3D CNN model. In the proposed video sampling technique, consecutive $k$ frames of a video are aggregated into a single frame by computing a Gaussian-weighted summation of the $k$ frames. The resulting frame (aggregated frame) preserves the information in a better way than the conventional approaches and is experimentally shown to perform better. In this paper, a 3D CNN architecture is proposed to extract the spatio-temporal features, followed by a Long Short-Term Memory (LSTM) network to recognize human actions. The proposed 3D CNN architecture is capable of handling videos where the camera is placed at a distance from the performer. Experiments are performed with the KTH and WEIZMANN human actions datasets, whereby it is shown to produce results comparable with the state-of-the-art techniques.
Tasks Action Recognition In Videos, Activity Recognition, Human Activity Recognition, Temporal Action Localization
Published 2020-02-06
URL https://arxiv.org/abs/2002.02100v2
PDF https://arxiv.org/pdf/2002.02100v2.pdf
PWC https://paperswithcode.com/paper/an-information-rich-sampling-technique-over
Repo
Framework
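
The aggregation step described in the abstract can be sketched as follows. The abstract does not specify the Gaussian's center or width, whether the weights are normalized, or whether windows overlap, so the centered window, the `sigma` parameter, the normalization, and the non-overlapping windows are all assumptions. Frames are represented as flat lists of pixel intensities to keep the example free of dependencies.

```python
import math

def gaussian_weights(k, sigma=1.0):
    """Normalized Gaussian weights centered on the middle of a k-frame window."""
    center = (k - 1) / 2
    raw = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(k)]
    total = sum(raw)
    return [w / total for w in raw]

def aggregate_frames(frames, k, sigma=1.0):
    """Collapse each run of k consecutive frames into one weighted-sum frame.

    frames: list of frames, each a flat list of pixel values of equal length.
    Trailing frames that do not fill a full window are dropped.
    """
    weights = gaussian_weights(k, sigma)
    aggregated = []
    for start in range(0, len(frames) - k + 1, k):
        window = frames[start:start + k]
        merged = [sum(w * frame[p] for w, frame in zip(weights, window))
                  for p in range(len(window[0]))]
        aggregated.append(merged)
    return aggregated
```

Unlike keeping only every $k^{th}$ frame, each output frame here draws on all $k$ frames of its window, with the frames nearest the window's middle weighted most heavily.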

#### Analyzing Differentiable Fuzzy Logic Operators

Title Analyzing Differentiable Fuzzy Logic Operators
Authors Emile van Krieken, Erman Acar, Frank van Harmelen
Abstract In recent years there has been a push to integrate symbolic AI and deep learning, as it is argued that the strengths and weaknesses of these approaches are complementary. One such trend in the literature is weakly supervised learning techniques that use operators from fuzzy logics. They employ prior background knowledge described in logic to benefit the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks, this background knowledge can be added to regular loss functions used in deep learning to integrate reasoning and learning. In this paper, we analyze how a large collection of logical operators from the fuzzy logic literature behave in a differentiable setting. We find large differences between the formal properties of these operators that are of crucial importance in a differentiable learning setting. We show that many of these operators, including some of the best known, are highly unsuitable for use in a differentiable learning setting. A further finding concerns the treatment of implication in these fuzzy logics, with a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning. However, to achieve the most significant performance improvement over a supervised baseline, we have to resort to non-standard combinations of logical operators which perform well in learning, but which no longer satisfy the usual logical laws. We end with a discussion on extensions to large-scale problems.