April 2, 2020

Paper Group ANR 333

D3BA: A Tool for Optimizing Business Processes Using Non-Deterministic Planning. Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering. Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning. Dual Graph Representation Learning. Know Your Surroundings: Exploiting Scene Information for …

D3BA: A Tool for Optimizing Business Processes Using Non-Deterministic Planning


Title	D3BA: A Tool for Optimizing Business Processes Using Non-Deterministic Planning
Authors	Tathagata Chakraborti, Yasaman Khazaeni
Abstract	This paper builds upon recent work in the declarative design of dialogue agents and proposes a new tool, D3BA (Declarative Design for Digital Business Automation), built to optimize business processes using AI planning. The tool provides a framework to build and maintain complex business processes and to optimize them by composing them with services that automate one or more subtasks. We illustrate salient features of this composition technique, compare it with other philosophies of composition, and highlight opportunities for research in the emerging field of business process automation.
Tasks
Published	2020-01-08
URL	https://arxiv.org/abs/2001.02619v2
PDF	https://arxiv.org/pdf/2001.02619v2.pdf
PWC	https://paperswithcode.com/paper/d3ba-a-tool-for-optimizing-business-processes
Repo
Framework

Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering


Title	Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Authors	Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su
Abstract	Multi-modality fusion technologies have greatly improved the performance of neural network-based Video Description/Caption, Visual Question Answering (VQA) and Audio Visual Scene-aware Dialog (AVSD) in recent years. Most previous approaches explore only the last layer in multi-layer feature fusion and overlook the importance of intermediate layers. To address the intermediate layers, we propose an efficient Quaternion Block Network (QBN) that learns interactions not only for the last layer but also for all intermediate layers simultaneously. In our proposed QBN, we use the holistic text features to guide the update of visual features. Meanwhile, Hamilton quaternion products efficiently carry information from higher layers to lower layers for both visual and text modalities. The evaluation results show that our QBN improves performance on VQA 2.0 and even surpasses large-scale BERT or visual BERT pre-trained models. An extensive ablation study has been carried out to test the influence of each proposed module.
Tasks	Question Answering, Video Description, Visual Question Answering
Published	2020-01-03
URL	https://arxiv.org/abs/2001.05840v2
PDF	https://arxiv.org/pdf/2001.05840v2.pdf
PWC	https://paperswithcode.com/paper/multi-layer-content-interaction-through
Repo
Framework
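
The Hamilton quaternion product named in the abstract above is a standard operation, and a minimal sketch of it may help make the abstract concrete. The function name and the tuple layout (real part first) are assumptions chosen for illustration; this is not the QBN implementation, which applies the product to learned feature blocks.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# The product is not commutative: i * j = k, but j * i = -k.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(hamilton_product(i, j))
print(hamilton_product(j, i))
```

Because each output component mixes all four input components, a single product lets four feature groups interact at once, which is presumably why it suits cross-layer fusion.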

Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning


Title	Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning
Authors	Jin-Woo Seo, Hong-Gyu Jung, Seong-Whan Lee
Abstract	Few-shot learning aims to classify unseen classes with a few training examples. While recent works have shown that standard mini-batch training with a carefully designed training strategy can improve generalization ability for unseen classes, well-known problems in deep networks such as memorizing training statistics have been less explored for few-shot learning. To tackle this issue, we propose self-augmentation that consolidates regional dropout and self-distillation. Specifically, we exploit a data augmentation technique called regional dropout, in which a patch of an image is replaced with other values. Then, we employ a backbone network that has auxiliary branches, each with its own classifier, to enforce knowledge sharing. Lastly, we present a fine-tuning method to further exploit a few training examples for unseen classes. Experimental results show that the proposed method outperforms the state-of-the-art methods for prevalent few-shot benchmarks and improves the generalization ability.
Tasks	Data Augmentation, Few-Shot Learning
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00251v1
PDF	https://arxiv.org/pdf/2004.00251v1.pdf
PWC	https://paperswithcode.com/paper/self-augmentation-generalizing-deep-networks
Repo
Framework
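
Regional dropout, as described in the abstract above, replaces a patch of an image with other values. The sketch below shows one simple variant on a nested-list image, assuming a rectangular patch at a uniformly random position and a constant fill value. The function name and parameters are hypothetical, and the paper may use a different replacement rule.

```python
import random


def regional_dropout(image, patch_h, patch_w, fill=0, rng=None):
    """Return a copy of `image` with one random patch_h x patch_w patch set to `fill`."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    # Pick the top-left corner so the patch fits entirely inside the image.
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)
    out = [row[:] for row in image]  # copy so the input is left untouched
    for r in range(top, top + patch_h):
        for c in range(left, left + patch_w):
            out[r][c] = fill
    return out


image = [[1] * 4 for _ in range(4)]
for row in regional_dropout(image, 2, 2, rng=random.Random(0)):
    print(row)
```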

Dual Graph Representation Learning


Title	Dual Graph Representation Learning
Authors	Huiling Zhu, Xin Luo, Hankz Hankui Zhuo
Abstract	Graph representation learning embeds nodes in large graphs as low-dimensional vectors and is of great benefit to many downstream applications. Most embedding frameworks, however, are inherently transductive and unable to generalize to unseen nodes or learn representations across different graphs. Although inductive approaches can generalize to unseen nodes, they neglect different contexts of nodes and cannot learn node embeddings dually. In this paper, we present a context-aware unsupervised dual encoding framework, CADE, to generate representations of nodes by combining real-time neighborhoods with neighbor-attentioned representation, and preserving extra memory of known nodes. We show that our approach is effective by comparing it with state-of-the-art methods.
Tasks	Graph Representation Learning, Representation Learning
Published	2020-02-25
URL	https://arxiv.org/abs/2002.11501v1
PDF	https://arxiv.org/pdf/2002.11501v1.pdf
PWC	https://paperswithcode.com/paper/dual-graph-representation-learning
Repo
Framework

Know Your Surroundings: Exploiting Scene Information for Object Tracking


Title	Know Your Surroundings: Exploiting Scene Information for Object Tracking
Authors	Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte
Abstract	Current state-of-the-art trackers only rely on a target appearance model in order to localize the object in each frame. Such approaches are however prone to fail in case of e.g. fast appearance changes or presence of distractor objects, where a target appearance model alone is insufficient for robust tracking. Having the knowledge about the presence and locations of other objects in the surrounding scene can be highly beneficial in such cases. This scene information can be propagated through the sequence and used to, for instance, explicitly avoid distractor objects and eliminate target candidate regions. In this work, we propose a novel tracking architecture which can utilize scene information for tracking. Our tracker represents such information as dense localized state vectors, which can encode, for example, if the local region is target, background, or distractor. These state vectors are propagated through the sequence and combined with the appearance model output to localize the target. Our network is learned to effectively utilize the scene information by directly maximizing tracking performance on video segments. The proposed approach sets a new state-of-the-art on 3 tracking benchmarks, achieving an AO score of 63.6% on the recent GOT-10k dataset.
Tasks	Object Tracking
Published	2020-03-24
URL	https://arxiv.org/abs/2003.11014v1
PDF	https://arxiv.org/pdf/2003.11014v1.pdf
PWC	https://paperswithcode.com/paper/know-your-surroundings-exploiting-scene
Repo
Framework

UniformAugment: A Search-free Probabilistic Data Augmentation Approach


Title	UniformAugment: A Search-free Probabilistic Data Augmentation Approach
Authors	Tom Ching LingChen, Ava Khonsari, Amirreza Lashkari, Mina Rafi Nazari, Jaspreet Singh Sambee, Mario A. Nascimento
Abstract	Augmenting training datasets has been shown to improve the learning effectiveness for several computer vision tasks. A good augmentation produces an augmented dataset that adds variability while retaining the statistical properties of the original dataset. Some techniques, such as AutoAugment and Fast AutoAugment, have introduced a search phase to find a set of suitable augmentation policies for a given model and dataset. This comes at the cost of great computational overhead, adding up to several thousand GPU hours. More recently, RandAugment was proposed to substantially speed up the search phase by approximating the search space with a couple of hyperparameters, while still incurring a non-negligible cost for tuning those. In this paper we show that, under the assumption that the augmentation space is approximately distribution invariant, a uniform sampling over the continuous space of augmentation transformations is sufficient to train highly effective models. Based on that result we propose UniformAugment, an automated data augmentation approach that completely avoids a search phase. In addition to discussing the theoretical underpinning supporting our approach, we also use the standard datasets, as well as established models for image classification, to show that UniformAugment’s effectiveness is comparable to the aforementioned methods, while still being highly efficient by virtue of not requiring any search.
Tasks	Data Augmentation, Image Classification
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14348v1
PDF	https://arxiv.org/pdf/2003.14348v1.pdf
PWC	https://paperswithcode.com/paper/uniformaugment-a-search-free-probabilistic
Repo
Framework
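
The idea of sampling augmentations uniformly instead of searching for a policy can be sketched as follows. The abstract above does not give the details, so these are all assumptions: each sample draws a fixed number of operations, and each operation gets an application probability and a magnitude drawn uniformly from [0, 1]. The toy operations act on plain numbers rather than images, purely to keep the example self-contained.

```python
import random

# Hypothetical toy operations: each takes a value and a magnitude in [0, 1].
TOY_OPS = {
    "shift": lambda x, m: x + m,
    "scale": lambda x, m: x * (1.0 + m),
}


def sample_policy(op_names, num_ops=2, rng=None):
    """Draw `num_ops` (operation, probability, magnitude) triples uniformly at random."""
    rng = rng or random.Random()
    return [
        (rng.choice(op_names), rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
        for _ in range(num_ops)
    ]


def apply_policy(x, policy, ops, rng=None):
    """Apply each operation in `policy` to `x` with its sampled probability."""
    rng = rng or random.Random()
    for name, prob, magnitude in policy:
        if rng.random() < prob:
            x = ops[name](x, magnitude)
    return x


policy = sample_policy(list(TOY_OPS), rng=random.Random(0))
print(policy)
print(apply_policy(1.0, policy, TOY_OPS, rng=random.Random(1)))
```

No hyperparameter is tuned per dataset here; a fresh policy is simply drawn for every training sample.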

Neural Architecture Search for Compressed Sensing Magnetic Resonance Image Reconstruction


Title	Neural Architecture Search for Compressed Sensing Magnetic Resonance Image Reconstruction
Authors	Jiangpeng Yan, Shuo Chen, Xiu Li, Yongbing Zhang
Abstract	Recent works have demonstrated that deep learning (DL) based compressed sensing (CS) implementations can provide impressive improvements in reconstructing high-quality MR images from sub-sampled k-space data. However, the network architectures adopted in current methods are all designed by hand, so the performance of these networks is limited by researchers’ expertise and labor. In this manuscript, we propose a novel and efficient MR image reconstruction framework based on a Neural Architecture Search (NAS) algorithm. The inner cells in our reconstruction network are automatically defined from a flexible search space in a differentiable manner. Compared to previous works, in which only a few common convolutional operations are tried by hand, our method can sufficiently explore different operations (e.g. dilated convolution) and their possible combinations. Our proposed method can also reach a better trade-off between computation cost and reconstruction performance for practical clinical translation. Experiments performed on a publicly available dataset show that our network produces better reconstruction results than the previous state-of-the-art methods in terms of PSNR and SSIM with 4 times fewer computation resources. The final network architecture found by the algorithm can also offer insights for network architecture design in other medical image analysis applications.
Tasks	Image Reconstruction, Neural Architecture Search
Published	2020-02-22
URL	https://arxiv.org/abs/2002.09625v1
PDF	https://arxiv.org/pdf/2002.09625v1.pdf
PWC	https://paperswithcode.com/paper/neural-architecture-search-for-compressed
Repo
Framework
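
The abstract above reports results in terms of PSNR. As a reminder of what that metric measures, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), on flat lists of pixel values. The function name and the default peak value of 1.0 are assumptions, and the paper's evaluation code may normalize images differently.

```python
import math


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


print(psnr([0.0, 0.5, 1.0], [0.0, 0.4, 1.0]))
```

Higher values mean the reconstruction is closer to the reference.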

Deep Residual Dense U-Net for Resolution Enhancement in Accelerated MRI Acquisition


Title	Deep Residual Dense U-Net for Resolution Enhancement in Accelerated MRI Acquisition
Authors	Pak Lun Kevin Ding, Zhiqiang Li, Yuxiang Zhou, Baoxin Li
Abstract	A typical Magnetic Resonance Imaging (MRI) scan may take 20 to 60 minutes. Reducing MRI scan time is beneficial for both patient experience and cost considerations. An accelerated MRI scan may be achieved by acquiring less k-space data (down-sampling in the k-space). However, this leads to lower resolution and aliasing artifacts in the reconstructed images. There are many existing approaches for reconstructing high-quality images from down-sampled k-space data, with varying complexity and performance. In recent years, deep-learning approaches have been proposed for this task, and promising results have been reported. Still, the problem remains challenging, especially because of the high fidelity requirement in most medical applications employing reconstructed MRI images. In this work, we propose a deep-learning approach aimed at reconstructing high-quality images from accelerated MRI acquisition. Specifically, we use a Convolutional Neural Network (CNN) to learn the differences between the aliased images and the original images, employing a U-Net-like architecture. Further, a micro-architecture termed Residual Dense Block (RDB) is introduced for learning a better feature representation than the plain U-Net. Considering the peculiarity of the down-sampled k-space data, we introduce a new term to the loss function in learning, which effectively employs the given k-space data during training to provide additional regularization on the update of the network weights. To evaluate the proposed approach, we compare it with other state-of-the-art methods. In both visual inspection and evaluation using standard metrics, the proposed approach is able to deliver improved performance, demonstrating its potential for providing an effective solution.
Tasks
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04488v1
PDF	https://arxiv.org/pdf/2001.04488v1.pdf
PWC	https://paperswithcode.com/paper/deep-residual-dense-u-net-for-resolution
Repo
Framework
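
The abstract above mentions an extra loss term that uses the acquired k-space data as a regularizer. Its exact form is not given, so the following is only a sketch of one common way to build such a term: penalize the mismatch between the Fourier transform of the prediction and the measured k-space values at the sampled positions. A 1-D signal and a naive DFT stand in for 2-D images and an FFT, and the function names and the `weight` parameter are hypothetical.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def kspace_consistency(pred, measured_kspace, mask):
    """Mean squared error in k-space, restricted to positions where mask is truthy."""
    pred_k = dft(pred)
    sampled = [i for i, keep in enumerate(mask) if keep]
    return sum(abs(pred_k[i] - measured_kspace[i]) ** 2 for i in sampled) / len(sampled)


def total_loss(pred, target, measured_kspace, mask, weight=0.1):
    """Image-domain MSE plus a weighted k-space consistency term."""
    image_term = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return image_term + weight * kspace_consistency(pred, measured_kspace, mask)


target = [1.0, 2.0, 3.0, 4.0]
mask = [1, 0, 1, 0]           # only half of k-space was acquired
measured = dft(target)        # stands in for the scanner's measurements
print(total_loss([0.0, 0.0, 0.0, 0.0], target, measured, mask))
```

The k-space term vanishes when the prediction agrees with the measurements at every sampled frequency, so it constrains the network only where real data exists.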

Testing Unsatisfiability of Constraint Satisfaction Problems via Tensor Products


Title	Testing Unsatisfiability of Constraint Satisfaction Problems via Tensor Products
Authors	Daya Gaur, Muhammad Khan
Abstract	We study the design of stochastic local search methods to prove unsatisfiability of a constraint satisfaction problem (CSP). For a binary CSP, such methods have been designed using the microstructure of the CSP. Here, we develop a method to decompose the microstructure into graph tensors. We show how to use the tensor decomposition to compute a proof of unsatisfiability efficiently and in parallel. We also offer substantial empirical evidence that our approach improves the praxis. For instance, one decomposition yields proofs of unsatisfiability in half the time without sacrificing quality. Another decomposition is twenty times faster and effective three-tenths of the time compared to the prior method. Our method is applicable to arbitrary CSPs using the well-known dual and hidden variable transformations from an arbitrary CSP to a binary CSP.
Tasks
Published	2020-01-31
URL	https://arxiv.org/abs/2002.03766v1
PDF	https://arxiv.org/pdf/2002.03766v1.pdf
PWC	https://paperswithcode.com/paper/testing-unsatisfiability-of-constraint
Repo
Framework
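
The microstructure mentioned in the abstract above is a graph with one vertex per (variable, value) pair and an edge between every two compatible pairs that belong to different variables. A binary CSP with n variables is satisfiable exactly when this graph contains an n-clique. The sketch below builds the microstructure and checks that condition by exhaustive search. It illustrates the definition only: it is not the paper's stochastic local search or its tensor decomposition, and the function names and input layout are assumptions.

```python
from itertools import combinations, product


def microstructure(domains, allowed):
    """Build the microstructure graph of a binary CSP.

    domains: dict mapping variable -> list of values.
    allowed: dict mapping (x, y) -> set of permitted (value_x, value_y) pairs;
             variable pairs without an entry are treated as unconstrained.
    """
    nodes = [(var, val) for var, vals in domains.items() for val in vals]
    edges = set()
    for (x, a), (y, b) in combinations(nodes, 2):
        if x == y:
            continue  # two values of the same variable are never compatible
        if (x, y) in allowed:
            ok = (a, b) in allowed[(x, y)]
        elif (y, x) in allowed:
            ok = (b, a) in allowed[(y, x)]
        else:
            ok = True
        if ok:
            edges.add(frozenset([(x, a), (y, b)]))
    return nodes, edges


def is_satisfiable(domains, allowed):
    """Exhaustively look for a clique that picks one vertex per variable."""
    _, edges = microstructure(domains, allowed)
    variables = list(domains)
    for values in product(*(domains[v] for v in variables)):
        chosen = list(zip(variables, values))
        if all(frozenset(pair) in edges for pair in combinations(chosen, 2)):
            return True
    return False


# Colouring a triangle with two colours is unsatisfiable.
colours = [0, 1]
different = {(a, b) for a in colours for b in colours if a != b}
domains = {"x": colours, "y": colours, "z": colours}
allowed = {("x", "y"): different, ("y", "z"): different, ("x", "z"): different}
print(is_satisfiable(domains, allowed))
```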

Towards Privacy Protection by Generating Adversarial Identity Masks


Title	Towards Privacy Protection by Generating Adversarial Identity Masks
Authors	Xiao Yang, Yinpeng Dong, Tianyu Pang, Jun Zhu, Hang Su
Abstract	As billions of pieces of personal data such as photos are shared through social media and networks, the privacy and security of data have drawn increasing attention. Several attempts have been made to alleviate the leakage of identity information with the aid of image obfuscation techniques. However, most of the present results are either perceptually unsatisfactory or ineffective against real-world recognition systems. In this paper, we argue that an algorithm for privacy protection must block the ability of automatic inference of the identity and, at the same time, make the resultant image natural from the users’ point of view. To achieve this, we propose a targeted identity-protection iterative method (TIP-IM), which can generate natural face images by adding adversarial identity masks to conceal one’s identity against a recognition system. Extensive experiments on various state-of-the-art face recognition models demonstrate the effectiveness of our proposed method in alleviating the identity leakage of face images, without sacrificing the visual quality of the protected images.
Tasks	Face Recognition
Published	2020-03-15
URL	https://arxiv.org/abs/2003.06814v1
PDF	https://arxiv.org/pdf/2003.06814v1.pdf
PWC	https://paperswithcode.com/paper/towards-privacy-protection-by-generating
Repo
Framework

A Semi-supervised Graph Attentive Network for Financial Fraud Detection


Title	A Semi-supervised Graph Attentive Network for Financial Fraud Detection
Authors	Daixin Wang, Jianbin Lin, Peng Cui, Quanhui Jia, Zhen Wang, Yanming Fang, Quan Yu, Jun Zhou, Shuang Yang, Yuan Qi
Abstract	With the rapid growth of financial services, fraud detection has become a very important problem in guaranteeing a healthy environment for both users and providers. Conventional solutions for fraud detection mainly use rule-based methods or extract features manually to perform prediction. However, in financial services, users have rich interactions and always show multifaceted information. These data form a large multiview network, which is not fully exploited by conventional methods. Additionally, only very few of the users in the network are labeled, which makes it very challenging to achieve satisfactory fraud detection performance using labeled data alone. To address the problem, we expand the labeled data through their social relations to get the unlabeled data and propose a semi-supervised attentive graph neural network, named SemiGNN, to utilize the multi-view labeled and unlabeled data for fraud detection. Moreover, we propose a hierarchical attention mechanism to better correlate different neighbors and different views. At the same time, the attention mechanism makes the model interpretable and reveals which factors are important for the fraud and why users are predicted as fraudulent. Experimentally, we conduct the prediction task on the users of Alipay, one of the largest third-party online and offline cashless payment platforms, serving more than 400 million users in China. By utilizing the social relations and the user attributes, our method achieves better accuracy than the state-of-the-art methods on two tasks. Moreover, the interpretable results also give interesting intuitions regarding the tasks.
Tasks	Fraud Detection
Published	2020-02-28
URL	https://arxiv.org/abs/2003.01171v1
PDF	https://arxiv.org/pdf/2003.01171v1.pdf
PWC	https://paperswithcode.com/paper/a-semi-supervised-graph-attentive-network-for
Repo
Framework

Novelty Detection via Non-Adversarial Generative Network


Title	Novelty Detection via Non-Adversarial Generative Network
Authors	Chengwei Chen, Wang Yuan, Yuan Xie, Yanyun Qu, Yiqing Tao, Haichuan Song, Lizhuang Ma
Abstract	One-class novelty detection is the process of determining if a query example differs from the training examples (the target class). Most previous strategies attempt to learn the real characteristics of target samples by using generative adversarial network (GAN) methods. However, the training process of GANs remains challenging, suffering from instability issues such as mode collapse and vanishing gradients. In this paper, by adopting non-adversarial generative networks, a novel decoder-encoder framework is proposed for the novelty detection task, instead of the classical encoder-decoder style. Under the non-adversarial framework, both the latent space and the image reconstruction space are jointly optimized, leading to a more stable training process with very fast convergence and lower training losses. During inference, inspired by CycleGAN, we design a new testing scheme to conduct image reconstruction, which reverses the training sequence. Experiments show that our model is clearly superior to cutting-edge novelty detectors and achieves state-of-the-art results on the datasets.
Tasks	Image Reconstruction
Published	2020-02-03
URL	https://arxiv.org/abs/2002.00522v1
PDF	https://arxiv.org/pdf/2002.00522v1.pdf
PWC	https://paperswithcode.com/paper/novelty-detection-via-non-adversarial
Repo
Framework

A Comparison of Data Augmentation Techniques in Training Deep Neural Networks for Satellite Image Classification


Title	A Comparison of Data Augmentation Techniques in Training Deep Neural Networks for Satellite Image Classification
Authors	Mohamed Abdelhack
Abstract	Satellite imagery allows a plethora of applications ranging from weather forecasting to land surveying. The rapid development of computer vision systems could open new horizons for the utilization of satellite data due to the abundance of large volumes of data. However, current state-of-the-art computer vision systems cater mainly to applications that involve natural images. While useful, those images exhibit a different distribution from satellite images, which in addition have more spectral channels. This allows the use of pretrained deep learning models only on the subset of spectral channels that are equivalent to natural images, thus discarding valuable information from other spectral channels. This calls for research effort to optimize deep learning models for satellite imagery to enable the assessment of their utility in the domain of remote sensing. This study focuses on the topic of image augmentation in the training of deep neural network classifiers. I tested different techniques for image augmentation to train a standard deep neural network on satellite images from EuroSAT. Results show that while some image augmentation techniques commonly used in natural image training can readily be transferred to satellite images, some others could actually lead to a decrease in performance. Additionally, some novel image augmentation techniques that take into account the nature of satellite images could be useful to incorporate in training.
Tasks	Data Augmentation, Image Augmentation, Image Classification, Weather Forecasting
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13502v1
PDF	https://arxiv.org/pdf/2003.13502v1.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-data-augmentation-techniques
Repo
Framework

Adversarial Feature Hallucination Networks for Few-Shot Learning


Title	Adversarial Feature Hallucination Networks for Few-Shot Learning
Authors	Kai Li, Yulun Zhang, Kunpeng Li, Yun Fu
Abstract	The recent flourishing of deep learning in various tasks is largely credited to rich and accessible labeled data. Nonetheless, massive supervision remains a luxury for many real applications, spurring great interest in label-scarce techniques such as few-shot learning (FSL), which aims to learn the concepts of new classes from a few labeled samples. A natural approach to FSL is data augmentation, and many recent works have proved its feasibility by proposing various data synthesis models. However, these models fail to adequately secure the discriminability and diversity of the synthesized data and thus often produce undesirable results. In this paper, we propose Adversarial Feature Hallucination Networks (AFHN), which is based on conditional Wasserstein Generative Adversarial Networks (cWGAN) and hallucinates diverse and discriminative features conditioned on the few labeled samples. Two novel regularizers, i.e., the classification regularizer and the anti-collapse regularizer, are incorporated into AFHN to encourage discriminability and diversity of the synthesized features, respectively. An ablation study verifies the effectiveness of the proposed cWGAN-based feature hallucination framework and the proposed regularizers. Comparative results on three common benchmark datasets substantiate the superiority of AFHN over existing data-augmentation-based FSL approaches and other state-of-the-art ones.
Tasks	Data Augmentation, Few-Shot Learning
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13193v1
PDF	https://arxiv.org/pdf/2003.13193v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-feature-hallucination-networks
Repo
Framework

Reconstructing Natural Scenes from fMRI Patterns using BigBiGAN


Title	Reconstructing Natural Scenes from fMRI Patterns using BigBiGAN
Authors	Milad Mozafari, Leila Reddy, Rufin VanRullen
Abstract	Decoding and reconstructing images from brain imaging data is a research area of high interest. Recent progress in deep generative neural networks has introduced new opportunities to tackle this problem. Here, we employ a recently proposed large-scale bi-directional generative adversarial network, called BigBiGAN, to decode and reconstruct natural scenes from fMRI patterns. BigBiGAN converts images into a 120-dimensional latent space which encodes class and attribute information together, and can also reconstruct images based on their latent vectors. We trained a linear mapping between fMRI data, acquired over images from 150 different categories of ImageNet, and their corresponding BigBiGAN latent vectors. Then, we applied this mapping to the fMRI activity patterns obtained from 50 new test images from 50 unseen categories in order to retrieve their latent vectors, and reconstruct the corresponding images. Pairwise image decoding from the predicted latent vectors was highly accurate (84%). Moreover, qualitative and quantitative assessments revealed that the resulting image reconstructions were visually plausible, successfully captured many attributes of the original images, and had high perceptual similarity with the original content. This method establishes a new state-of-the-art for fMRI-based natural image reconstruction, and can be flexibly updated to take into account any future improvements in generative models of natural scene images.
Tasks	Image Reconstruction
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11761v1
PDF	https://arxiv.org/pdf/2001.11761v1.pdf
PWC	https://paperswithcode.com/paper/reconstructing-natural-scenes-from-fmri
Repo
Framework
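
The abstract above reports a pairwise image decoding accuracy (84%) computed from predicted latent vectors. The exact protocol is not described, so the sketch below assumes one common variant: a predicted vector counts as correctly decoded against a distractor if it correlates more strongly with its own ground-truth vector than with the distractor's. The function names are hypothetical, and the data are toy values rather than BigBiGAN latents.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (assumes non-zero variance)."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def pairwise_accuracy(predicted, true):
    """Fraction of (item, distractor) pairs where the prediction matches its own target best."""
    wins = total = 0
    for i, pred in enumerate(predicted):
        for j in range(len(true)):
            if i == j:
                continue
            total += 1
            if pearson(pred, true[i]) > pearson(pred, true[j]):
                wins += 1
    return wins / total


true_latents = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [2.0, 3.0, 1.0]]
predicted_latents = [[1.1, 2.0, 2.9], [3.1, 0.9, 2.0], [2.0, 3.1, 0.9]]
print(pairwise_accuracy(predicted_latents, true_latents))
```

Chance level for this measure is 0.5, which is the baseline against which a figure such as 84% should be read.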