January 26, 2020

3055 words 15 mins read

Paper Group ANR 1421

Generative Models For Deep Learning with Very Scarce Data. Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs. AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization. Straggler-Agnostic and Communication-Efficient Distributed …

Generative Models For Deep Learning with Very Scarce Data

Title Generative Models For Deep Learning with Very Scarce Data
Authors Juan Maroñas, Roberto Paredes, Daniel Ramos
Abstract The goal of this paper is to deal with a data-scarcity scenario where deep learning techniques usually fail. We compare the use of two well-established techniques, Restricted Boltzmann Machines and Variational Auto-encoders, as generative models in order to increase the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms for generating new samples. We show that this methodology improves generalization compared to other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that the RBM is better than the VAE at generating new samples for training a classifier with good generalization capabilities.
Tasks
Published 2019-03-21
URL http://arxiv.org/abs/1903.09030v1
PDF http://arxiv.org/pdf/1903.09030v1.pdf
PWC https://paperswithcode.com/paper/generative-models-for-deep-learning-with-very
Repo
Framework
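
The augmentation strategy above draws new training samples from a generative model with MCMC. Below is a minimal, self-contained sketch of block Gibbs sampling from a Bernoulli-Bernoulli RBM; the weights, biases, seed images, and the choice to reuse each seed's class label are random placeholders and simplifying assumptions standing in for a trained, class-aware setup, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder "trained" RBM parameters (assumption: 784 visible, 64 hidden units).
n_vis, n_hid = 784, 64
W = rng.normal(scale=0.01, size=(n_vis, n_hid))
b_v = np.zeros(n_vis)
b_h = np.zeros(n_hid)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(v0, n_steps=200):
    """Run block Gibbs sampling v -> h -> v to draw a new visible sample."""
    v = v0.copy()
    for _ in range(n_steps):
        p_h = sigmoid(v @ W + b_h)
        h = (rng.random(n_hid) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b_v)
        v = (rng.random(n_vis) < p_v).astype(float)
    return v

# Augment a (tiny, synthetic) training set: seed each chain with a real example
# and add the resulting sample under the same class label (a simplifying choice).
X_train = (rng.random((10, n_vis)) > 0.5).astype(float)
y_train = rng.integers(0, 2, size=10)

X_aug = np.stack([gibbs_sample(x) for x in X_train])
X_new = np.concatenate([X_train, X_aug])
y_new = np.concatenate([y_train, y_train])
print(X_new.shape, y_new.shape)  # (20, 784) (20,)
```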

Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs

Title Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs
Authors Jung-Woo Chang, Saehyun Ahn, Keon-Woo Kang, Suk-Ju Kang
Abstract Generative adversarial networks (GANs) have shown excellent performance in image and speech applications. GANs create impressive data primarily through a new type of operator called deconvolution (DeConv), or transposed convolution. To implement the DeConv layer in hardware, the state-of-the-art accelerator reduces the high computational complexity via the DeConv-to-Conv conversion and achieves the same results. However, this conversion increases the number of filters. Recently, Winograd minimal filtering has been recognized as an effective solution for improving the arithmetic complexity and resource efficiency of the Conv layer. In this paper, we propose an efficient Winograd DeConv accelerator that combines these two orthogonal approaches on FPGAs. First, we introduce a new class of fast algorithms for DeConv layers using Winograd minimal filtering. Since Winograd filters contain regular sparse patterns, we further reduce the computational complexity by skipping zero weights. Second, we propose a new dataflow that prevents resource underutilization by reorganizing the filter layout in the Winograd domain. Finally, we propose an efficient architecture for implementing Winograd DeConv by designing the line buffer and exploring the design space. Experimental results on various GANs show that our accelerator achieves a 1.78x-8.38x speedup over the state-of-the-art DeConv accelerators.
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.06918v1
PDF https://arxiv.org/pdf/1911.06918v1.pdf
PWC https://paperswithcode.com/paper/towards-design-methodology-of-efficient-fast
Repo
Framework
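
Winograd minimal filtering is the arithmetic trick the accelerator above builds on. The sketch below shows the standard 1-D F(2,3) transform, which produces 2 outputs of a 3-tap filter with 4 multiplications instead of 6, and checks it against direct correlation; it illustrates the general idea only, not the paper's DeConv-specific algorithms or their zero-weight skipping.

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (Lavin & Gray style).
Bt = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
At = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Compute 2 outputs of a 3-tap correlation over a 4-element input tile."""
    m = (G @ g) * (Bt @ d)      # 4 multiplications in the Winograd domain
    return At @ m               # inverse transform back to 2 outputs

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([0.5, -1.0, 2.0])       # filter taps

direct = np.array([d[0:3] @ g, d[1:4] @ g])
print(winograd_f23(d, g), direct)    # both print [4.5 6.0]
```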

AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

Title AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization
Authors Xiao-Yu Zhang, Changsheng Li, Haichao Shi, Xiaobin Zhu, Peng Li, Jing Dong
Abstract The point process is a solid framework for modeling sequential data, such as videos, by exploring the underlying relevance. As a challenging problem in high-level video understanding, weakly supervised action recognition and localization in untrimmed videos has attracted intensive research attention. Knowledge transfer that leverages publicly available trimmed videos as external guidance is a promising way to make up for the coarse-grained video-level annotation and improve generalization performance. However, unconstrained knowledge transfer may bring about irrelevant noise and jeopardize the learning model. This paper proposes a novel adaptability-decomposing encoder-decoder network that transfers reliable knowledge between trimmed and untrimmed videos for action recognition and localization via bidirectional point-process modeling, given only video-level annotations. By decomposing the original features into domain-adaptable and domain-specific ones based on their adaptability, trimmed-untrimmed knowledge transfer can be safely confined within a more coherent subspace. An encoder-decoder-based structure is carefully designed and jointly optimized to facilitate effective action classification and temporal localization. Extensive experiments are conducted on two benchmark datasets (i.e., THUMOS14 and ActivityNet1.3), and the experimental results clearly corroborate the efficacy of our method.
Tasks Action Classification, Temporal Localization, Transfer Learning, Video Understanding
Published 2019-11-27
URL https://arxiv.org/abs/1911.11961v1
PDF https://arxiv.org/pdf/1911.11961v1.pdf
PWC https://paperswithcode.com/paper/adapnet-adaptability-decomposing-encoder
Repo
Framework

Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining

Title Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining
Authors Zhouyuan Huo, Heng Huang
Abstract Recently, reducing the communication time between machines has become the main focus of distributed data mining. Previous methods propose having workers do more computation locally before aggregating local solutions on the server, so that fewer communication rounds between server and workers are required. However, these methods do not consider reducing the communication time per round and perform poorly under certain conditions, for example, when stragglers are present or the dataset is high-dimensional. In this paper, we aim to reduce the communication time per round as well as the number of required communication rounds. We propose a communication-efficient distributed primal-dual method with a straggler-agnostic server and bandwidth-efficient workers. We analyze the convergence properties and prove that the proposed method guarantees a linear convergence rate to the optimal solution for convex problems. Finally, we conduct large-scale experiments in simulated and real distributed systems, and the experimental results demonstrate that the proposed method is much faster than the compared methods.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04235v1
PDF https://arxiv.org/pdf/1910.04235v1.pdf
PWC https://paperswithcode.com/paper/straggler-agnostic-and-communication
Repo
Framework

The Deepfake Detection Challenge (DFDC) Preview Dataset

Title The Deepfake Detection Challenge (DFDC) Preview Dataset
Authors Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, Cristian Canton Ferrer
Abstract In this paper, we introduce a preview of the Deepfake Detection Challenge (DFDC) dataset, consisting of 5K videos featuring two facial modification algorithms. A data collection campaign has been carried out in which participating actors entered into an agreement on the use and manipulation of their likenesses in our creation of the dataset. Diversity along several axes (gender, skin tone, age, etc.) has been considered, and actors recorded videos with arbitrary backgrounds, thus bringing visual variability. Finally, a set of specific metrics to evaluate the performance have been defined, and two existing models for detecting deepfakes have been tested to provide a reference performance baseline. The DFDC dataset preview can be downloaded at: deepfakedetectionchallenge.ai
Tasks DeepFake Detection, Face Swapping
Published 2019-10-19
URL https://arxiv.org/abs/1910.08854v2
PDF https://arxiv.org/pdf/1910.08854v2.pdf
PWC https://paperswithcode.com/paper/the-deepfake-detection-challenge-dfdc-preview
Repo
Framework

MTRNet++: One-stage Mask-based Scene Text Eraser

Title MTRNet++: One-stage Mask-based Scene Text Eraser
Authors Osman Tursun, Simon Denman, Rui Zeng, Sabesan Sivapalan, Sridha Sridharan, Clinton Fookes
Abstract A precise, controllable, interpretable and easily trainable text removal approach is necessary for both user-specific and large-scale text removal applications. To achieve this, we propose a one-stage mask-based text inpainting network, MTRNet++. It has a novel architecture that includes mask-refine, coarse-inpainting and fine-inpainting branches, and attention blocks. With this architecture, MTRNet++ can remove text either with or without an external mask. It achieves state-of-the-art results on both the Oxford and SCUT datasets without using external ground-truth masks. The results of ablation studies demonstrate that the proposed multi-branch architecture with attention blocks is effective and essential. It also demonstrates controllability and interpretability.
Tasks
Published 2019-12-16
URL https://arxiv.org/abs/1912.07183v1
PDF https://arxiv.org/pdf/1912.07183v1.pdf
PWC https://paperswithcode.com/paper/mtrnet-one-stage-mask-based-scene-text-eraser
Repo
Framework

Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness

Title Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness
Authors Fanny Yang, Zuowen Wang, Christina Heinze-Deml
Abstract This work provides theoretical and empirical evidence that invariance-inducing regularizers can increase predictive accuracy for worst-case spatial transformations (spatial robustness). Evaluated on these adversarially transformed examples, we demonstrate that adding regularization on top of standard or adversarial training reduces the relative error by 20% for CIFAR10 without increasing the computational cost. This outperforms handcrafted networks that were explicitly designed to be spatial-equivariant. Furthermore, we observe for SVHN, known to have inherent variance in orientation, that robust training also improves standard accuracy on the test set. We prove that this no-trade-off phenomenon holds for adversarial examples from transformation groups in the infinite data limit.
Tasks
Published 2019-06-26
URL https://arxiv.org/abs/1906.11235v1
PDF https://arxiv.org/pdf/1906.11235v1.pdf
PWC https://paperswithcode.com/paper/invariance-inducing-regularization-using
Repo
Framework
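
The regularizer studied above penalizes the loss at the worst-case spatial transformation of each input. The toy sketch below illustrates that objective with a numpy logistic-regression "model" and circular pixel shifts via `np.roll` standing in for the paper's rotation/translation grid; the model, the transformation set, and the weight `lam` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, label):
    return -np.log(softmax(logits)[label] + 1e-12)

# Toy "model": logistic regression on a flattened 8x8 image, 3 classes.
W = rng.normal(scale=0.1, size=(3, 64))
x = rng.random((8, 8))
y = 1

def loss_of(img):
    return cross_entropy(W @ img.ravel(), y)

# Worst-case transformation: search a small grid of circular shifts
# (a stand-in for the rotation/translation grid used for spatial robustness).
shifts = [(dy, dx) for dy in (-2, -1, 0, 1, 2) for dx in (-2, -1, 0, 1, 2)]
worst = max(shifts, key=lambda s: loss_of(np.roll(x, s, axis=(0, 1))))

lam = 1.0  # regularization strength (assumed)
regularized_loss = loss_of(x) + lam * loss_of(np.roll(x, worst, axis=(0, 1)))
print("worst shift:", worst, "regularized loss:", round(regularized_loss, 4))
```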

Solution of Two-Player Zero-Sum Game by Successive Relaxation

Title Solution of Two-Player Zero-Sum Game by Successive Relaxation
Authors Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar
Abstract We consider the problem of a two-player zero-sum game. In this setting, there are two agents working against each other. Both agents observe the same state, and the objective of each agent is to compute a strategy profile that maximizes its reward. However, the reward of the second agent is the negative of the reward obtained by the first agent. Therefore, the objective of the second agent is to minimize the total reward obtained by the first agent. This problem is formulated as a min-max Markov game in the literature. The solution of this game, which is the max-min reward (of the first player) starting from a given state, is called the equilibrium value of the state. In this work, we compute the solution of the two-player zero-sum game utilizing the technique of successive relaxation. Successive relaxation has been successfully applied in the literature to obtain a faster value iteration algorithm in the context of Markov Decision Processes. We extend the concept of successive relaxation to two-player zero-sum games. We prove that, under a special structure, this technique computes the optimal solution faster than the techniques in the literature. We then derive a generalized minimax Q-learning algorithm that computes the optimal policy when the model information is not known. Finally, we prove the convergence of the proposed generalized minimax Q-learning algorithm.
Tasks Q-Learning
Published 2019-06-16
URL https://arxiv.org/abs/1906.06659v1
PDF https://arxiv.org/pdf/1906.06659v1.pdf
PWC https://paperswithcode.com/paper/solution-of-two-player-zero-sum-game-by
Repo
Framework
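
To make the successive-relaxation idea concrete, here is a toy sketch of relaxed value iteration on a random two-player zero-sum Markov game. For brevity it takes the minimax over pure strategies (the equilibrium value in general requires mixed strategies and a small LP per state), and the relaxation factor `w`, like the game itself, is an illustrative assumption rather than the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)
nS, nA1, nA2, gamma = 5, 3, 3, 0.9

# Random zero-sum Markov game: player 1's rewards and the transition kernel.
R = rng.random((nS, nA1, nA2))
P = rng.random((nS, nA1, nA2, nS))
P /= P.sum(axis=-1, keepdims=True)

def bellman(V):
    Q = R + gamma * P @ V                  # shape (nS, nA1, nA2)
    return Q.min(axis=2).max(axis=1)       # pure-strategy max-min value

w = 0.8   # relaxation factor (assumed); w = 1 recovers plain value iteration
V = np.zeros(nS)
for _ in range(500):
    V = (1 - w) * V + w * bellman(V)       # successive-relaxation update

print(np.round(V, 3))
```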

Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop

Title Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop
Authors Mudit Verma, Siddhant Bhambri, Arun Balaji Buduru
Abstract Rapid advancements in the Internet of Things (IoT) have facilitated more efficient deployment of smart environment solutions for specific user requirements. With the increase in the number of IoT devices, it has become difficult for the user to control or operate every individual smart device to achieve a desired goal such as optimized power consumption, scheduled appliance running time, etc. Furthermore, existing solutions for automatically adapting IoT devices are not capable of incorporating user behavior. This paper presents a novel approach to accurately configure IoT devices while achieving the twin objectives of energy optimization and conformance to user preferences. Our work comprises unsupervised clustering of device data to find the operating states of each device, followed by probabilistic analysis of user behavior to determine the preferred states. Eventually, we deploy an online reinforcement learning (RL) agent to find the best device settings automatically. Results on datasets from three different smart homes show the effectiveness of our methodology. To the best of our knowledge, this is the first time that a practical approach has been adopted to achieve the above-mentioned objectives without any human interaction within the system.
Tasks
Published 2019-12-06
URL https://arxiv.org/abs/1912.03298v1
PDF https://arxiv.org/pdf/1912.03298v1.pdf
PWC https://paperswithcode.com/paper/making-smart-homes-smarter-optimizing-energy
Repo
Framework
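
The pipeline above has three stages: cluster each device's readings into operating states, infer preferred states from user behaviour, and let an RL agent choose settings. The sketch below wires the first and third stages together with synthetic data, scikit-learn's KMeans, and a bare-bones tabular Q-learning loop whose reward trades off power against a placeholder user-preference signal; the data, the reward shape, and the toy dynamics are assumptions about what such a pipeline could look like, not the authors' system.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Stage 1: cluster a device's power readings (W) into operating states (assumed 3).
readings = np.concatenate([rng.normal(5, 1, 200),     # standby-like
                           rng.normal(60, 5, 200),    # normal use
                           rng.normal(120, 8, 200)])  # heavy use
states = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    readings.reshape(-1, 1))
state_power = np.array([readings[states == k].mean() for k in range(3)])

# Stage 2 (placeholder): suppose behaviour analysis says the user prefers state 1.
preferred = 1

# Stage 3: tabular Q-learning; an action selects the device's next operating state.
nS = nA = 3
Q = np.zeros((nS, nA))
alpha, gamma, eps = 0.1, 0.9, 0.1
s = 0
for _ in range(5000):
    a = rng.integers(nA) if rng.random() < eps else int(Q[s].argmax())
    # Assumed reward: penalize power use, reward matching the preferred state.
    r = -0.01 * state_power[a] + (2.0 if a == preferred else 0.0)
    s_next = a                      # toy dynamics: device lands in the chosen state
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("greedy choice per state:", Q.argmax(axis=1))  # should favour the preferred state
```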

ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos

Title ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos
Authors Giorgos Karvounas, Iason Oikonomidis, Antonis Argyros
Abstract We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method raises several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
Tasks Temporal Localization
Published 2019-10-14
URL https://arxiv.org/abs/1910.06096v1
PDF https://arxiv.org/pdf/1910.06096v1.pdf
PWC https://paperswithcode.com/paper/reactnet-temporal-localization-of-repetitive
Repo
Framework
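
The representation at the heart of ReActNet is simply the matrix of pairwise distances between per-frame CNN features. The sketch below builds that matrix from placeholder embeddings (random vectors standing in for CNN outputs) so the structure the classifier consumes is explicit.

```python
import numpy as np

rng = np.random.default_rng(4)

# Placeholder per-frame features: 120 frames, 128-dim CNN embeddings (assumed sizes).
feats = rng.normal(size=(120, 128))

# Pairwise Euclidean distance matrix D[i, j] = ||f_i - f_j||.
sq = (feats ** 2).sum(axis=1)
D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T, 0.0))

print(D.shape)              # (120, 120)
print(np.allclose(D, D.T))  # True: the matrix is symmetric
# Repetitive segments show up as periodic stripe patterns in windows of D,
# which a lightweight CNN like ReActNet can then classify frame by frame.
```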

Variance-reduced $Q$-learning is minimax optimal

Title Variance-reduced $Q$-learning is minimax optimal
Authors Martin J. Wainwright
Abstract We introduce and analyze a form of variance-reduced $Q$-learning. For $\gamma$-discounted MDPs with finite state space $\mathcal{X}$ and action space $\mathcal{U}$, we prove that it yields an $\epsilon$-accurate estimate of the optimal $Q$-function in the $\ell_\infty$-norm using $\mathcal{O}\left(\frac{D}{\epsilon^2 (1-\gamma)^3} \log \frac{D}{1-\gamma}\right)$ samples, where $D = |\mathcal{X}| \times |\mathcal{U}|$. This guarantee matches known minimax lower bounds up to a logarithmic factor in the discount complexity. In contrast, our past work shows that ordinary $Q$-learning has worst-case quartic scaling in the discount complexity.
Tasks Q-Learning
Published 2019-06-11
URL https://arxiv.org/abs/1906.04697v2
PDF https://arxiv.org/pdf/1906.04697v2.pdf
PWC https://paperswithcode.com/paper/variance-reduced-q-learning-is-minimax
Repo
Framework
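
The paper above analyzes a variance-reduced variant of $Q$-learning. The sketch below shows a generic SVRG-style recentering scheme on a toy random MDP, in the spirit of variance reduction: each epoch fixes a reference $\bar{Q}$, estimates its Bellman image to high accuracy once, and then applies cheap recentered updates. The epoch lengths, step sizes, and sample sizes here are arbitrary illustrations and should not be read as the paper's prescriptions.

```python
import numpy as np

rng = np.random.default_rng(5)
nS, nA, gamma = 4, 2, 0.9

# Random MDP with a generative model we can draw next-state samples from.
R = rng.random((nS, nA))
P = rng.random((nS, nA, nS))
P /= P.sum(axis=-1, keepdims=True)

def sampled_bellman(Q_a, Q_b, n):
    """Empirical Bellman operator evaluated at Q_a and Q_b on the SAME samples."""
    Ta, Tb = np.zeros((nS, nA)), np.zeros((nS, nA))
    for s in range(nS):
        for a in range(nA):
            nxt = rng.choice(nS, size=n, p=P[s, a])
            Ta[s, a] = R[s, a] + gamma * Q_a[nxt].max(axis=1).mean()
            Tb[s, a] = R[s, a] + gamma * Q_b[nxt].max(axis=1).mean()
    return Ta, Tb

Q = np.zeros((nS, nA))
for epoch in range(10):
    Q_bar = Q.copy()
    T_bar, _ = sampled_bellman(Q_bar, Q_bar, n=5000)   # high-accuracy recentering term
    for k in range(1, 101):
        T_Q, T_Qbar = sampled_bellman(Q, Q_bar, n=1)   # one cheap sample per (s, a)
        lam = 1.0 / (k + 1)                            # illustrative step size
        Q = (1 - lam) * Q + lam * (T_Q - T_Qbar + T_bar)

# Reference: (nearly) exact optimal Q via value iteration on the true model.
Q_star = np.zeros((nS, nA))
for _ in range(2000):
    Q_star = R + gamma * P @ Q_star.max(axis=1)
print("max |Q - Q*| after variance-reduced training:", np.abs(Q - Q_star).max())
```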

Identifying Hidden Buyers in Darknet Markets via Dirichlet Hawkes Process

Title Identifying Hidden Buyers in Darknet Markets via Dirichlet Hawkes Process
Authors Panpan Zheng, Shuhan Yuan, Xintao Wu, Yubao Wu
Abstract The darknet markets are notorious black markets in cyberspace, which involve selling or brokering drugs, weapons, stolen credit cards, and other illicit goods. To combat illicit transactions in cyberspace, it is important to analyze the behaviors of participants in darknet markets. Currently, many studies focus on the behavior of vendors; however, there is not much work on analyzing buyers. The key challenge is that buyers are anonymized in darknet markets. For most darknet markets, we only observe the first and last digits of a buyer's ID, such as "a**b". To tackle this challenge, we propose a hidden buyer identification model, called UNMIX, which can group the transactions from one hidden buyer into one cluster given a transaction sequence from an anonymized ID. UNMIX is able to model the temporal dynamics as well as the product, comment, and vendor information associated with each transaction. As a result, transactions with similar patterns in terms of time and content are grouped together as the subsequence from one hidden buyer. Experiments on data collected from three real-world darknet markets demonstrate the effectiveness of our approach, measured by various clustering metrics. Case studies on real transaction sequences explicitly show that our approach can group transactions with similar patterns into the same clusters.
Tasks
Published 2019-11-12
URL https://arxiv.org/abs/1911.04620v1
PDF https://arxiv.org/pdf/1911.04620v1.pdf
PWC https://paperswithcode.com/paper/identifying-hidden-buyers-in-darknet-markets
Repo
Framework
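
UNMIX builds on Hawkes-process modelling of transaction times, in which past events temporarily raise the rate of new ones. The sketch below only evaluates the conditional intensity of a plain exponential-kernel Hawkes process on a toy transaction sequence; the Dirichlet Hawkes machinery and the product/comment/vendor features mentioned in the abstract are not reproduced, and all parameter values are assumptions.

```python
import numpy as np

# Toy transaction timestamps (days) observed for one anonymized ID.
events = np.array([0.0, 0.4, 0.5, 3.1, 3.2, 9.8])

mu, alpha, beta = 0.2, 0.8, 1.5   # base rate, excitation, decay (assumed values)

def intensity(t, history):
    """Conditional intensity lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i))."""
    past = history[history < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

for t in (0.6, 3.3, 10.0):
    print(f"lambda({t}) = {intensity(t, events):.3f}")
# Bursts of transactions close in time (0.4, 0.5 or 3.1, 3.2) push the intensity up,
# which is the kind of temporal regularity UNMIX exploits to group a hidden buyer's activity.
```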

Confounder-Aware Visualization of ConvNets

Title Confounder-Aware Visualization of ConvNets
Authors Qingyu Zhao, Ehsan Adeli, Adolf Pfefferbaum, Edith V. Sullivan, Kilian M. Pohl
Abstract With recent advances in deep learning, neuroimaging studies increasingly rely on convolutional networks (ConvNets) to predict diagnosis based on MR images. To gain a better understanding of how a disease impacts the brain, these studies visualize the saliency maps of the ConvNet, highlighting the voxels within the brain that contribute most to the prediction. However, these saliency maps are generally confounded, i.e., some salient regions are more predictive of confounding variables (such as age) than of the diagnosis. To avoid such misinterpretation, we propose in this paper an approach that aims to visualize confounder-free saliency maps that only highlight voxels predictive of the diagnosis. The approach incorporates univariate statistical tests to identify confounding effects within the intermediate features learned by the ConvNet. The influence of the subset of confounded features is then removed by a novel partial back-propagation procedure. We use this two-step approach to visualize confounder-free saliency maps extracted from synthetic and two real datasets. These experiments reveal the potential of our visualization in producing unbiased model interpretation.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.12727v1
PDF https://arxiv.org/pdf/1907.12727v1.pdf
PWC https://paperswithcode.com/paper/confounder-aware-visualization-of-convnets
Repo
Framework
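
The first step of the approach above is a univariate statistical test that flags intermediate ConvNet features associated with a confounder such as age. As a stand-in, the sketch below correlates synthetic "feature activations" with a synthetic age variable using SciPy's Pearson test and keeps the features that clear a Bonferroni-style threshold; the real ConvNet features and the partial back-propagation step are not reproduced here.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(6)

n_subjects, n_features = 200, 32
age = rng.uniform(20, 80, size=n_subjects)

# Synthetic intermediate features: the first 5 are deliberately age-confounded.
feats = rng.normal(size=(n_subjects, n_features))
feats[:, :5] += 0.05 * age[:, None]

p_values = np.array([pearsonr(feats[:, j], age)[1] for j in range(n_features)])
confounded = np.where(p_values < 0.05 / n_features)[0]   # Bonferroni-style threshold

print("features flagged as confounded:", confounded)
# Downstream, the influence of these flagged features would be removed (in the paper,
# via a partial back-propagation procedure) before computing the saliency maps.
```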

Landmark Detection in Low Resolution Faces with Semi-Supervised Learning

Title Landmark Detection in Low Resolution Faces with Semi-Supervised Learning
Authors Amit Kumar, Rama Chellappa
Abstract Landmark detection algorithms trained on high resolution images perform poorly on datasets containing low resolution images. This degrades the performance of algorithms relying on quality landmarks, for example, face recognition. To the best of our knowledge, there does not exist any dataset consisting of low resolution face images along with their annotated landmarks, making supervised training infeasible. In this paper, we present a semi-supervised approach to predict landmarks on low resolution images by learning them from labeled high resolution images. The objective of this work is to show that predicting landmarks directly on low resolution images is more effective than the current practice of aligning images after rescaling or super-resolution. In a two-step process, the proposed approach first learns to generate low resolution images by modeling the distribution of target low resolution images. In the second stage, the roles of generated images and real low resolution images are switched, and the model learns to predict landmarks for real low resolution images from generated low resolution images. With extensive experimentation, we study the impact of each of the design choices and also show that predicting landmarks directly on low resolution images improves the performance of important tasks such as face recognition in low resolution images.
Tasks Face Recognition
Published 2019-07-30
URL https://arxiv.org/abs/1907.13255v1
PDF https://arxiv.org/pdf/1907.13255v1.pdf
PWC https://paperswithcode.com/paper/landmark-detection-in-low-resolution-faces
Repo
Framework

Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing

Title Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing
Authors Chen Qi, Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Honggang Zhang
Abstract Network slicing promises to provision diversified services with distinct requirements in one infrastructure. Deep reinforcement learning (e.g., deep $\mathcal{Q}$-learning, DQL) is assumed to be an appropriate algorithm for solving the demand-aware inter-slice resource management issue in network slicing, by regarding the varying demands and the allocated bandwidth as the environment state and the action, respectively. However, allocating bandwidth at a finer resolution usually implies a larger action space, and unfortunately DQL fails to converge quickly in this case. In this paper, we introduce discrete normalized advantage functions (DNAF) into DQL by separating the $\mathcal{Q}$-value function into a state-value function term and an advantage term, and by exploiting a deterministic policy gradient descent (DPGD) algorithm to avoid the unnecessary calculation of the $\mathcal{Q}$-value for every state-action pair. Furthermore, as DPGD only works in a continuous action space, we embed a k-nearest-neighbor algorithm into DQL to quickly find a valid action in the discrete space nearest to the DPGD output. Finally, we verify the faster convergence of the DNAF-based DQL through extensive simulations.
Tasks Q-Learning
Published 2019-06-10
URL https://arxiv.org/abs/1906.04594v1
PDF https://arxiv.org/pdf/1906.04594v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-with-discrete
Repo
Framework
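
Two ingredients of the abstract above are easy to show in isolation: the decomposition $Q(s,a) = V(s) + A(s,a)$ with a normalized advantage, and the k-nearest-neighbour lookup that snaps a continuous policy output back onto the discrete bandwidth grid. The sketch below demonstrates both with placeholder numpy values; the DNAF network, the DPGD update, and the slicing environment are not reproduced, and picking the best $Q$ among the k nearest actions is a common wrapping rule that may differ from the paper's exact choice.

```python
import numpy as np

rng = np.random.default_rng(7)

# Discrete action space: candidate bandwidth allocations (MHz) for one slice (assumed grid).
actions = np.arange(0.0, 10.5, 0.5)          # 21 discrete actions

# Placeholder network outputs for one state: a scalar state value V(s)
# and an advantage A(s, a) for every discrete action, normalized so max A = 0.
V_s = 3.2
A_s = rng.normal(size=actions.size)
A_s -= A_s.max()                              # normalized advantage: A(s, a*) = 0
Q_s = V_s + A_s                               # Q(s, a) = V(s) + A(s, a)

# A deterministic policy working in continuous space proposes an allocation ...
proposed = 4.37                               # e.g. the DPGD output (assumed value)
# ... and a k-NN lookup snaps it to valid discrete actions; the best Q among them wins.
k = 3
nearest = np.argsort(np.abs(actions - proposed))[:k]
chosen = nearest[Q_s[nearest].argmax()]
print("proposed:", proposed, "-> executed action:", actions[chosen])
```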