April 2, 2020

3095 words 15 mins read

Paper Group ANR 133

Paper Group ANR 133

Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images. Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop’s Fable Experiment. Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization. PDE-based Group Equivariant Convolutional Neural Networks. Automating the Generation of High School Geometry Proofs using Prolog in an …

Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images

Title Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images
Authors Arie Agranonik, Maya Herman, Mark Last
Abstract In this paper we implemented the algorithm we developed in [1] called 3DPIFCM in a parallel environment by using CUDA on a GPU. In our previous work we introduced 3DPIFCM which performs segmentation of images in noisy conditions and uses particle swarm optimization for finding the optimal algorithm parameters to account for noise. This algorithm achieved state of the art segmentation accuracy when compared to FCM (Fuzzy C-Means), IFCMPSO (Improved Fuzzy C-Means with Particle Swarm Optimization), GAIFCM (Genetic Algorithm Improved Fuzzy C-Means) on noisy MRI images of an adult Brain. When using a genetic algorithm or PSO (Particle Swarm Optimization) on a single machine for optimization we witnessed long execution times for practical clinical usage. Therefore, in the current paper our goal was to speed up the execution of 3DPIFCM by taking out parts of the algorithm and executing them as kernels on a GPU. The algorithm was implemented using the CUDA [13] framework from NVIDIA and experiments where performed on a server containing 64GB RAM , 8 cores and a TITAN X GPU with 3072 SP cores and 12GB of GPU memory. Our results show that the parallel version of the algorithm performs up to 27x faster than the original sequential version and 68x faster than GAIFCM algorithm. We show that the speedup of the parallel version increases as we increase the size of the image due to better utilization of cores in the GPU. Also, we show a speedup of up to 5x in our Brainweb experiment compared to other generic variants such as IFCMPSO and GAIFCM.
Published 2020-02-05
URL https://arxiv.org/abs/2002.01981v1
PDF https://arxiv.org/pdf/2002.01981v1.pdf
PWC https://paperswithcode.com/paper/parallel-3dpifcm-algorithm-for-noisy-brain

Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop’s Fable Experiment

Title Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop’s Fable Experiment
Authors Ajaz A. Bhat, Vishwanathan Mohan
Abstract Corvids, apes, and children solve The Crow and The Pitcher task (from Aesop’s Fables) indicating a causal understanding of the task. By cumulatively interacting with different objects, how can cognitive agents abstract the underlying cause-effect relations to predict affordances of novel objects? We address this question by re-enacting the Aesop’s Fable task on a robot and present a) a brain-guided neural model of semantic-episodic memory; with b) four task-agnostic learning rules that compare expectations from recalled past episodes with the current scenario to progressively extract the hidden causal relations. The ensuing robot behaviours illustrate causal learning; and predictions for novel objects converge to Archimedes’ principle, independent of both the objects explored during learning and the order of their cumulative exploration.
Published 2020-02-29
URL https://arxiv.org/abs/2003.00274v1
PDF https://arxiv.org/pdf/2003.00274v1.pdf
PWC https://paperswithcode.com/paper/causal-learning-by-a-robot-with-semantic

Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization

Title Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization
Authors Li Chen, Shuisheng Zhou, Jiajun Ma
Abstract Kernel-based clustering algorithm can identify and capture the non-linear structure in datasets, and thereby it can achieve better performance than linear clustering. However, computing and storing the entire kernel matrix occupy so large memory that it is difficult for kernel-based clustering to deal with large-scale datasets. In this paper, we employ incomplete Cholesky factorization to accelerate kernel clustering and save memory space. The key idea of the proposed kernel $k$-means clustering using incomplete Cholesky factorization is that we approximate the entire kernel matrix by the product of a low-rank matrix and its transposition. Then linear $k$-means clustering is applied to columns of the transpose of the low-rank matrix. We show both analytically and empirically that the performance of the proposed algorithm is similar to that of the kernel $k$-means clustering algorithm, but our method can deal with large-scale datasets.
Published 2020-02-07
URL https://arxiv.org/abs/2002.02846v1
PDF https://arxiv.org/pdf/2002.02846v1.pdf
PWC https://paperswithcode.com/paper/fast-kernel-k-means-clustering-using

PDE-based Group Equivariant Convolutional Neural Networks

Title PDE-based Group Equivariant Convolutional Neural Networks
Authors Bart Smets, Jim Portegies, Erik Bekkers, Remco Duits
Abstract We present a PDE-based framework that generalizes Group equivariant Convolutional Neural Networks (G-CNNs). In this framework, a network layer is seen as a set of PDE-solvers where the equation’s geometrically meaningful coefficients become the layer’s trainable weights. Formulating our PDEs on homogeneous spaces allows these networks to be designed with built-in symmetries such as rotation equivariance instead of being restricted to just translation equivariance as in traditional CNNs. Having all the desired symmetries included in the design obviates the need to include them by means of costly techniques such as data augmentation. Roto-translation equivariance for image analysis applications is the example we will be using throughout the paper. Our default PDE is solved by a combination of linear group convolutions and non-linear morphological group convolutions. Just like for linear convolution a morphological convolution is specified by a kernel and this kernel is what is being optimized during the training process. We demonstrate how the common CNN operations of max/min-pooling and ReLUs arise naturally from solving a PDE and how they are subsumed by morphological convolutions. We present a proof-of-concept experiment to demonstrate the potential of this framework in increasing the performance of deep learning based imaging applications.
Tasks Data Augmentation
Published 2020-01-24
URL https://arxiv.org/abs/2001.09046v2
PDF https://arxiv.org/pdf/2001.09046v2.pdf
PWC https://paperswithcode.com/paper/pde-based-group-equivariant-convolutional

Automating the Generation of High School Geometry Proofs using Prolog in an Educational Context

Title Automating the Generation of High School Geometry Proofs using Prolog in an Educational Context
Authors Ludovic Font, Sébastien Cyr, Philippe R. Richard, Michel Gagnon
Abstract When working on intelligent tutor systems designed for mathematics education and its specificities, an interesting objective is to provide relevant help to the students by anticipating their next steps. This can only be done by knowing, beforehand, the possible ways to solve a problem. Hence the need for an automated theorem prover that provide proofs as they would be written by a student. To achieve this objective, logic programming is a natural tool due to the similarity of its reasoning with a mathematical proof by inference. In this paper, we present the core ideas we used to implement such a prover, from its encoding in Prolog to the generation of the complete set of proofs. However, when dealing with educational aspects, there are many challenges to overcome. We also present the main issues we encountered, as well as the chosen solutions.
Published 2020-02-28
URL https://arxiv.org/abs/2002.12551v1
PDF https://arxiv.org/pdf/2002.12551v1.pdf
PWC https://paperswithcode.com/paper/automating-the-generation-of-high-school

Attacks Which Do Not Kill Training Make Adversarial Learning Stronger

Title Attacks Which Do Not Kill Training Make Adversarial Learning Stronger
Authors Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, Mohan Kankanhalli
Abstract Adversarial training based on the minimax formulation is necessary for obtaining adversarial robustness of trained models. However, it is conservative or even pessimistic so that it sometimes hurts the natural generalization. In this paper, we raise a fundamental question—do we have to trade off natural generalization for adversarial robustness? We argue that adversarial training is to employ confident adversarial data for updating the current model. We propose a novel approach of friendly adversarial training (FAT): rather than employing most adversarial data maximizing the loss, we search for least adversarial (i.e., friendly adversarial) data minimizing the loss, among the adversarial data that are confidently misclassified. Our novel formulation is easy to implement by just stopping the most adversarial data searching algorithms such as PGD (projected gradient descent) early, which we call early-stopped PGD. Theoretically, FAT is justified by an upper bound of the adversarial risk. Empirically, early-stopped PGD allows us to answer the earlier question negatively—adversarial robustness can indeed be achieved without compromising the natural generalization.
Published 2020-02-26
URL https://arxiv.org/abs/2002.11242v1
PDF https://arxiv.org/pdf/2002.11242v1.pdf
PWC https://paperswithcode.com/paper/attacks-which-do-not-kill-training-make

Few-shot Domain Adaptation by Causal Mechanism Transfer

Title Few-shot Domain Adaptation by Causal Mechanism Transfer
Authors Takeshi Teshima, Issei Sato, Masashi Sugiyama
Abstract We study few-shot supervised domain adaptation (DA) for regression problems, where only a few labeled target domain data and many labeled source domain data are available. Many of the current DA methods base their transfer assumptions on either parametrized distribution shift or apparent distribution similarities, e.g., identical conditionals or small distributional discrepancies. However, these assumptions may preclude the possibility of adaptation from intricately shifted and apparently very different distributions. To overcome this problem, we propose mechanism transfer, a meta-distributional scenario in which a data generating mechanism is invariant among domains. This transfer assumption can accommodate nonparametric shifts resulting in apparently different distributions while providing a solid statistical basis for DA. We take the structural equations in causal modeling as an example and propose a novel DA method, which is shown to be useful both theoretically and experimentally. Our method can be seen as the first attempt to fully leverage the structural causal models for DA.
Tasks Domain Adaptation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03497v1
PDF https://arxiv.org/pdf/2002.03497v1.pdf
PWC https://paperswithcode.com/paper/few-shot-domain-adaptation-by-causal

Targeted display advertising: the case of preferential attachment

Title Targeted display advertising: the case of preferential attachment
Authors Saurav Manchanda, Pranjul Yadav, Khoa Doan, S. Sathiya Keerthi
Abstract An average adult is exposed to hundreds of digital advertisements daily (https://www.mediadynamicsinc.com/uploads/files/PR092214-Note-only-150-Ads-2mk.pdf), making the digital advertisement industry a classic example of a big-data-driven platform. As such, the ad-tech industry relies on historical engagement logs (clicks or purchases) to identify potentially interested users for the advertisement campaign of a partner (a seller who wants to target users for its products). The number of advertisements that are shown for a partner, and hence the historical campaign data available for a partner depends upon the budget constraints of the partner. Thus, enough data can be collected for the high-budget partners to make accurate predictions, while this is not the case with the low-budget partners. This skewed distribution of the data leads to “preferential attachment” of the targeted display advertising platforms towards the high-budget partners. In this paper, we develop “domain-adaptation” approaches to address the challenge of predicting interested users for the partners with insufficient data, i.e., the tail partners. Specifically, we develop simple yet effective approaches that leverage the similarity among the partners to transfer information from the partners with sufficient data to cold-start partners, i.e., partners without any campaign data. Our approaches readily adapt to the new campaign data by incremental fine-tuning, and hence work at varying points of a campaign, and not just the cold-start. We present an experimental analysis on the historical logs of a major display advertising platform (https://www.criteo.com/). Specifically, we evaluate our approaches across 149 partners, at varying points of their campaigns. Experimental results show that the proposed approaches outperform the other “domain-adaptation” approaches at different time points of the campaigns.
Tasks Domain Adaptation
Published 2020-02-07
URL https://arxiv.org/abs/2002.02879v1
PDF https://arxiv.org/pdf/2002.02879v1.pdf
PWC https://paperswithcode.com/paper/targeted-display-advertising-the-case-of

Watch your back: Backdoor Attacks in Deep Reinforcement Learning-based Autonomous Vehicle Control Systems

Title Watch your back: Backdoor Attacks in Deep Reinforcement Learning-based Autonomous Vehicle Control Systems
Authors Yue Wang, Esha Sarkar, Michail Maniatakos, Saif Eddin Jabari
Abstract Autonomous Vehicles (AVs) with Deep Reinforcement Learning (DRL)-based controllers are used for reducing traffic jams. AVs trained with such deep neural networks render them vulnerable to machine learning-based attacks. In this work, we explore the backdooring of a DRL-based AV controller in a standard traffic scenario. The AV exhibits intended operation of reducing congestion during genuine observations, but when a particular set of observations appears, the AV can be triggered to either decelerate to cause congestion (congestion attack) or to accelerate and crash into the vehicle in front (insurance attack). These backdoors in AVs may be engineered to pose serious threats to human lives.
Tasks Autonomous Vehicles
Published 2020-03-17
URL https://arxiv.org/abs/2003.07859v1
PDF https://arxiv.org/pdf/2003.07859v1.pdf
PWC https://paperswithcode.com/paper/watch-your-back-backdoor-attacks-in-deep

Machine learning transfer efficiencies for noisy quantum walks

Title Machine learning transfer efficiencies for noisy quantum walks
Authors Alexey A. Melnikov, Leonid E. Fedichkin, Ray-Kuang Lee, Alexander Alodjants
Abstract Quantum effects are known to provide an advantage in particle transfer across networks. In order to achieve this advantage, requirements on both a graph type and a quantum system coherence must be found. Here we show that the process of finding these requirements can be automated by learning from simulated examples. The automation is done by using a convolutional neural network of a particular type that learns to understand with which network and under which coherence requirements quantum advantage is possible. Our machine learning approach is applied to study noisy quantum walks on cycle graphs of different sizes. We found that it is possible to predict the existence of quantum advantage for the entire decoherence parameter range, even for graphs outside of the training set. Our results are of importance for demonstration of advantage in quantum experiments and pave the way towards automating scientific research and discoveries.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05472v2
PDF https://arxiv.org/pdf/2001.05472v2.pdf
PWC https://paperswithcode.com/paper/machine-learning-transfer-efficiencies-for

Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging

Title Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging
Authors Szu-Yeu Hu, Shuhang Wang, Wei-Hung Weng, JingChao Wang, XiaoHong Wang, Arinc Ozturk, Qian Li, Viksit Kumar, Anthony E. Samir
Abstract Modern deep learning algorithms geared towards clinical adaption rely on a significant amount of high fidelity labeled data. Low-resource settings pose challenges like acquiring high fidelity data and becomes the bottleneck for developing artificial intelligence applications. Ultrasound images, stored in Digital Imaging and Communication in Medicine (DICOM) format, have additional metadata data corresponding to ultrasound image parameters and medical exams. In this work, we leverage DICOM metadata from ultrasound images to help learn representations of the ultrasound image. We demonstrate that the proposed method outperforms the non-metadata based approaches across different downstream tasks.
Published 2020-03-20
URL https://arxiv.org/abs/2003.09070v1
PDF https://arxiv.org/pdf/2003.09070v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-context-encoder-using-dicom

Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems

Title Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems
Authors Koya Sato, Yasuyuki Satoh, Daisuke Sugimura
Abstract This paper proposes a communication strategy for decentralized learning on wireless systems. Our discussion is based on the decentralized parallel stochastic gradient descent (D-PSGD), which is one of the state-of-the-art algorithms for decentralized learning. The main contribution of this paper is to raise a novel open question for decentralized learning on wireless systems: there is a possibility that the density of a network topology significantly influences the runtime performance of D-PSGD. In general, it is difficult to guarantee delay-free communications without any communication deterioration in real wireless network systems because of path loss and multi-path fading. These factors significantly degrade the runtime performance of D-PSGD. To alleviate such problems, we first analyze the runtime performance of D-PSGD by considering real wireless systems. This analysis yields the key insights that dense network topology (1) does not significantly gain the training accuracy of D-PSGD compared to sparse one, and (2) strongly degrades the runtime performance because this setting generally requires to utilize a low-rate transmission. Based on these findings, we propose a novel communication strategy, in which each node estimates optimal transmission rates such that communication time during the D-PSGD optimization is minimized under the constraint of network density, which is characterized by radio propagation property. The proposed strategy enables to improve the runtime performance of D-PSGD in wireless systems. Numerical simulations reveal that the proposed strategy is capable of enhancing the runtime performance of D-PSGD.
Published 2020-02-25
URL https://arxiv.org/abs/2002.10758v1
PDF https://arxiv.org/pdf/2002.10758v1.pdf
PWC https://paperswithcode.com/paper/network-density-controlled-decentralized

A Survey on String Constraint Solving

Title A Survey on String Constraint Solving
Authors Roberto Amadini
Abstract String constraint solving refers to solving combinatorial problems involving constraints over string variables. String solving approaches have become popular over the last years given the massive use of strings in different application domains like formal analysis, automated testing, database query processing, and cybersecurity. This paper reports a comprehensive survey on string constraint solving by exploring the large number of approaches that have been proposed over the last decades to solve string constraints.
Published 2020-01-31
URL https://arxiv.org/abs/2002.02376v4
PDF https://arxiv.org/pdf/2002.02376v4.pdf
PWC https://paperswithcode.com/paper/a-survey-on-string-constraint-solving

A (Simplified) Supreme Being Necessarily Exists, says the Computer: Computationally Explored Variants of Gödel’s Ontological Argument

Title A (Simplified) Supreme Being Necessarily Exists, says the Computer: Computationally Explored Variants of Gödel’s Ontological Argument
Authors Christoph Benzmüller
Abstract An approach to universal (meta-)logical reasoning in classical higher-order logic is employed to explore and study simplifications of Kurt G"odel’s modal ontological argument. Some argument premises are modified, others are dropped, modal collapse is avoided and validity is shown already in weak modal logics K and T. Key to the gained simplifications of G"odel’s original theory is the exploitation of a link to the notions of filter and ultrafilter from topology. The paper illustrates how modern knowledge representation and reasoning technology for quantified non-classical logics can contribute new knowledge to other disciplines. The contributed material is also well suited to support teaching of non-trivial logic formalisms in classroom.
Published 2020-01-14
URL https://arxiv.org/abs/2001.04701v8
PDF https://arxiv.org/pdf/2001.04701v8.pdf
PWC https://paperswithcode.com/paper/a-simplified-supreme-being-necessarily-exists

Hallucinating Statistical Moment and Subspace Descriptors from Object and Saliency Detectors for Action Recognition

Title Hallucinating Statistical Moment and Subspace Descriptors from Object and Saliency Detectors for Action Recognition
Authors Lei Wang, Piotr Koniusz
Abstract In this paper, we build on a deep translational action recognition network which takes RGB frames as input to learn to predict both action concepts and auxiliary supervisory feature descriptors e.g., Optical Flow Features and/or Improved Dense Trajectory descriptors. The translation is performed by so-called hallucination streams trained to predict auxiliary cues which are simultaneously fed into classification layers, and then hallucinated for free at the testing stage to boost recognition. In this paper, we design and hallucinate two descriptors, one leveraging four popular object detectors applied to training videos, and the other leveraging image- and video-level saliency detectors. The first descriptor encodes the detector- and ImageNet-wise class prediction scores, confidence scores, and spatial locations of bounding boxes and frame indexes to capture the spatio-temporal distribution of features per video. Another descriptor encodes spatio-angular gradient distributions of saliency maps and intensity patterns. Inspired by the characteristic function of the probability distribution, we capture four statistical moments on the above intermediate descriptors. As numbers of coefficients in the mean, covariance, coskewness and cokurtotsis grow linearly, quadratically, cubically and quartically w.r.t. the dimension of feature vectors, we describe the covariance matrix by its leading n’ eigenvectors (so-called subspace) and we capture skewness/kurtosis rather than costly coskewness/cokurtosis. We obtain state of the art on three popular datasets.
Tasks Optical Flow Estimation
Published 2020-01-14
URL https://arxiv.org/abs/2001.04627v1
PDF https://arxiv.org/pdf/2001.04627v1.pdf
PWC https://paperswithcode.com/paper/hallucinating-statistical-moment-and-subspace
comments powered by Disqus