Paper Group ANR 1421
Generative Models For Deep Learning with Very Scarce Data
Title | Generative Models For Deep Learning with Very Scarce Data |
Authors | Juan Maroñas, Roberto Paredes, Daniel Ramos |
Abstract | The goal of this paper is to deal with a data scarcity scenario where deep learning techniques tend to fail. We compare the use of two well-established techniques, Restricted Boltzmann Machines (RBMs) and Variational Auto-encoders (VAEs), as generative models in order to increase the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms for generating new samples. We show that this methodology improves generalization compared to other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that the RBM is better than the VAE at generating new samples for training a classifier with good generalization capabilities. |
Tasks | |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09030v1 |
http://arxiv.org/pdf/1903.09030v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-models-for-deep-learning-with-very |
Repo | |
Framework | |
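The MCMC sample generation mentioned in the abstract above can be illustrated with block Gibbs sampling in a binary RBM. This is only a minimal sketch: the weights `W`, the biases `b` and `c`, and the helper names are hypothetical toy values chosen for illustration, not the trained models or the exact sampling procedure used by the authors.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the RBM conditional probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Sample each hidden unit from p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W[i][j])."""
    hidden = []
    for j in range(len(c)):
        activation = c[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden.append(1 if rng.random() < sigmoid(activation) else 0)
    return hidden


def sample_visible(h, W, b, rng):
    """Sample each visible unit from p(v_i = 1 | h) = sigmoid(b_i + sum_j W[i][j] h_j)."""
    visible = []
    for i in range(len(b)):
        activation = b[i] + sum(W[i][j] * h[j] for j in range(len(h)))
        visible.append(1 if rng.random() < sigmoid(activation) else 0)
    return visible


def gibbs_chain(v0, W, b, c, steps, seed=0):
    """Run `steps` rounds of alternating (block) Gibbs sampling starting from v0."""
    rng = random.Random(seed)
    v = list(v0)
    for _ in range(steps):
        h = sample_hidden(v, W, c, rng)
        v = sample_visible(h, W, b, rng)
    return v
```

Starting the chain from a real training example and keeping the final visible vector is one plausible way to obtain an extra training sample; how many steps to run, and how the new sample is labeled, are design choices this sketch leaves open.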
Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs
Title | Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs |
Authors | Jung-Woo Chang, Saehyun Ahn, Keon-Woo Kang, Suk-Ju Kang |
Abstract | Generative adversarial networks (GANs) have shown excellent performance in image and speech applications. GANs create impressive data primarily through a new type of operator called deconvolution (DeConv) or transposed convolution (Conv). To implement the DeConv layer in hardware, the state-of-the-art accelerator reduces the high computational complexity via the DeConv-to-Conv conversion and achieves the same results. However, there is a problem that the number of filters increases due to this conversion. Recently, Winograd minimal filtering has been recognized as an effective solution to improve the arithmetic complexity and resource efficiency of the Conv layer. In this paper, we propose an efficient Winograd DeConv accelerator that combines these two orthogonal approaches on FPGAs. Firstly, we introduce a new class of fast algorithm for DeConv layers using Winograd minimal filtering. Since there are regular sparse patterns in Winograd filters, we further amortize the computational complexity by skipping zero weights. Secondly, we propose a new dataflow to prevent resource underutilization by reorganizing the filter layout in the Winograd domain. Finally, we propose an efficient architecture for implementing Winograd DeConv by designing the line buffer and exploring the design space. Experimental results on various GANs show that our accelerator achieves up to 1.78x~8.38x speedup over the state-of-the-art DeConv accelerators. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.06918v1 |
https://arxiv.org/pdf/1911.06918v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-design-methodology-of-efficient-fast |
Repo | |
Framework | |
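As background for the Winograd minimal filtering that the abstract above builds on, the standard one-dimensional F(2,3) case computes two outputs of a 3-tap filter with four multiplications instead of six. The sketch below shows only this textbook convolution form; the function names are illustrative, and the paper's DeConv variant, zero-weight skipping and dataflow are not reproduced here.

```python
def direct_conv_f23(d, g):
    """Reference: two outputs of a 3-tap filter over four inputs (6 multiplications)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return y0, y1


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using 4 multiplications."""
    # Filter transform; it can be precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Element-wise products of the transformed input and transformed filter.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform.
    return m1 + m2 + m3, m2 - m3 - m4
```

For a filter such as `[1, 0, -1]` the transformed entries `u1` and `u2` are both zero, which hints at why sparsity in the Winograd domain can be exploited by skipping the corresponding products.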
AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization
Title | AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization |
Authors | Xiao-Yu Zhang, Changsheng Li, Haichao Shi, Xiaobin Zhu, Peng Li, Jing Dong |
Abstract | The point process is a solid framework to model sequential data, such as videos, by exploring the underlying relevance. As a challenging problem for high-level video understanding, weakly supervised action recognition and localization in untrimmed videos has attracted intensive research attention. Knowledge transfer by leveraging the publicly available trimmed videos as external guidance is a promising attempt to make up for the coarse-grained video-level annotation and improve the generalization performance. However, unconstrained knowledge transfer may bring about irrelevant noise and jeopardize the learning model. This paper proposes a novel adaptability decomposing encoder-decoder network to transfer reliable knowledge between trimmed and untrimmed videos for action recognition and localization via bidirectional point process modeling, given only video-level annotations. By decomposing the original features into domain-adaptable and domain-specific ones based on their adaptability, trimmed-untrimmed knowledge transfer can be safely confined within a more coherent subspace. An encoder-decoder based structure is carefully designed and jointly optimized to facilitate effective action classification and temporal localization. Extensive experiments are conducted on two benchmark datasets (i.e., THUMOS14 and ActivityNet1.3), and experimental results clearly corroborate the efficacy of our method. |
Tasks | Action Classification, Temporal Localization, Transfer Learning, Video Understanding |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11961v1 |
https://arxiv.org/pdf/1911.11961v1.pdf | |
PWC | https://paperswithcode.com/paper/adapnet-adaptability-decomposing-encoder |
Repo | |
Framework | |
Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining
Title | Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining |
Authors | Zhouyuan Huo, Heng Huang |
Abstract | Recently, reducing communication time between machines has become the main focus of distributed data mining. Previous methods propose to make workers do more computation locally before aggregating local solutions in the server, so that fewer communication rounds between server and workers are required. However, these methods do not consider reducing the communication time per round and perform very poorly under certain conditions, for example, when there are straggler problems or the dataset is of high dimension. In this paper, we aim to reduce the communication time per round as well as the required number of communication rounds. We propose a communication-efficient distributed primal-dual method with a straggler-agnostic server and bandwidth-efficient workers. We analyze the convergence property and prove that the proposed method guarantees a linear convergence rate to the optimal solution for convex problems. Finally, we conduct large-scale experiments in simulated and real distributed systems, and the experimental results demonstrate that the proposed method is much faster than the compared methods. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04235v1 |
https://arxiv.org/pdf/1910.04235v1.pdf | |
PWC | https://paperswithcode.com/paper/straggler-agnostic-and-communication |
Repo | |
Framework | |
The Deepfake Detection Challenge (DFDC) Preview Dataset
Title | The Deepfake Detection Challenge (DFDC) Preview Dataset |
Authors | Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, Cristian Canton Ferrer |
Abstract | In this paper, we introduce a preview of the Deepfakes Detection Challenge (DFDC) dataset consisting of 5K videos featuring two facial modification algorithms. A data collection campaign has been carried out where participating actors have entered into an agreement to the use and manipulation of their likenesses in our creation of the dataset. Diversity in several axes (gender, skin-tone, age, etc.) has been considered and actors recorded videos with arbitrary backgrounds thus bringing visual variability. Finally, a set of specific metrics to evaluate the performance have been defined and two existing models for detecting deepfakes have been tested to provide a reference performance baseline. The DFDC dataset preview can be downloaded at: deepfakedetectionchallenge.ai |
Tasks | DeepFake Detection, Face Swapping |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08854v2 |
https://arxiv.org/pdf/1910.08854v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deepfake-detection-challenge-dfdc-preview |
Repo | |
Framework | |
MTRNet++: One-stage Mask-based Scene Text Eraser
Title | MTRNet++: One-stage Mask-based Scene Text Eraser |
Authors | Osman Tursun, Simon Denman, Rui Zeng, Sabesan Sivapalan, Sridha Sridharan, Clinton Fookes |
Abstract | A precise, controllable, interpretable and easily trainable text removal approach is necessary for both user-specific and large-scale text removal applications. To achieve this, we propose a one-stage mask-based text inpainting network, MTRNet++. It has a novel architecture that includes mask-refine, coarse-inpainting and fine-inpainting branches, and attention blocks. With this architecture, MTRNet++ can remove text either with or without an external mask. It achieves state-of-the-art results on both the Oxford and SCUT datasets without using external ground-truth masks. The results of ablation studies demonstrate that the proposed multi-branch architecture with attention blocks is effective and essential. It also demonstrates controllability and interpretability. |
Tasks | |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07183v1 |
https://arxiv.org/pdf/1912.07183v1.pdf | |
PWC | https://paperswithcode.com/paper/mtrnet-one-stage-mask-based-scene-text-eraser |
Repo | |
Framework | |
Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness
Title | Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness |
Authors | Fanny Yang, Zuowen Wang, Christina Heinze-Deml |
Abstract | This work provides theoretical and empirical evidence that invariance-inducing regularizers can increase predictive accuracy for worst-case spatial transformations (spatial robustness). Evaluated on these adversarially transformed examples, we demonstrate that adding regularization on top of standard or adversarial training reduces the relative error by 20% for CIFAR10 without increasing the computational cost. This outperforms handcrafted networks that were explicitly designed to be spatial-equivariant. Furthermore, we observe for SVHN, known to have inherent variance in orientation, that robust training also improves standard accuracy on the test set. We prove that this no-trade-off phenomenon holds for adversarial examples from transformation groups in the infinite data limit. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11235v1 |
https://arxiv.org/pdf/1906.11235v1.pdf | |
PWC | https://paperswithcode.com/paper/invariance-inducing-regularization-using |
Repo | |
Framework | |
Solution of Two-Player Zero-Sum Game by Successive Relaxation
Title | Solution of Two-Player Zero-Sum Game by Successive Relaxation |
Authors | Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar |
Abstract | We consider the problem of a two-player zero-sum game. In this setting, there are two agents working against each other. Both agents observe the same state and the objective of the agents is to compute a strategy profile that maximizes their rewards. However, the reward of the second agent is the negative of the reward obtained by the first agent. Therefore, the objective of the second agent is to minimize the total reward obtained by the first agent. This problem is formulated as a min-max Markov game in the literature. The solution of this game, which is the max-min reward (of the first player) starting from a given state, is called the equilibrium value of the state. In this work, we compute the solution of the two-player zero-sum game utilizing the technique of successive relaxation. Successive relaxation has been successfully applied in the literature to compute a faster value iteration algorithm in the context of Markov Decision Processes. We extend the concept of successive relaxation to two-player zero-sum games. We prove that, under a special structure, this technique computes the optimal solution faster than the techniques in the literature. We then derive a generalized minimax Q-learning algorithm that computes the optimal policy when the model information is not known. Finally, we prove the convergence of the proposed generalized minimax Q-learning algorithm. |
Tasks | Q-Learning |
Published | 2019-06-16 |
URL | https://arxiv.org/abs/1906.06659v1 |
https://arxiv.org/pdf/1906.06659v1.pdf | |
PWC | https://paperswithcode.com/paper/solution-of-two-player-zero-sum-game-by |
Repo | |
Framework | |
Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop
Title | Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop |
Authors | Mudit Verma, Siddhant Bhambri, Arun Balaji Buduru |
Abstract | Rapid advancements in the Internet of Things (IoT) have facilitated more efficient deployment of smart environment solutions for specific user requirements. With the increase in the number of IoT devices, it has become difficult for the user to control or operate every individual smart device to achieve some desired goal like optimized power consumption, scheduled appliance running time, etc. Furthermore, existing solutions to automatically adapt the IoT devices are not capable enough to incorporate the user behavior. This paper presents a novel approach to accurately configure IoT devices while achieving the twin objectives of energy optimization and conformance to user preferences. Our work comprises unsupervised clustering of devices’ data to find the states of operation for each device, followed by probabilistically analyzing user behavior to determine their preferred states. Eventually, we deploy an online reinforcement learning (RL) agent to find the best device settings automatically. Results for three different smart homes’ datasets show the effectiveness of our methodology. To the best of our knowledge, this is the first time that a practical approach has been adopted to achieve the above-mentioned objectives without any human interaction within the system. |
Tasks | |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03298v1 |
https://arxiv.org/pdf/1912.03298v1.pdf | |
PWC | https://paperswithcode.com/paper/making-smart-homes-smarter-optimizing-energy |
Repo | |
Framework | |
ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos
Title | ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos |
Authors | Giorgos Karvounas, Iason Oikonomidis, Antonis Argyros |
Abstract | We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method lifts several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art. |
Tasks | Temporal Localization |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06096v1 |
https://arxiv.org/pdf/1910.06096v1.pdf | |
PWC | https://paperswithcode.com/paper/reactnet-temporal-localization-of-repetitive |
Repo | |
Framework | |
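The video representation described in the abstract above, a matrix of pairwise frame distances, can be sketched in a few lines. The Euclidean metric and the toy two-dimensional "frame features" used here are assumptions for illustration; the abstract does not state which distance or which CNN features the authors use.

```python
import math


def pairwise_distances(features):
    """Return the symmetric n x n matrix of Euclidean distances between frame features."""
    n = len(features)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(features[i], features[j])))
            dist[i][j] = d
            dist[j][i] = d  # distances are symmetric
    return dist
```

For a sequence whose features repeat with period two, such as `[[0, 0], [1, 0], [0, 0], [1, 0]]`, the matrix has zeros wherever two frame indices differ by a multiple of two, so repetitive segments show up as regular diagonal patterns that a downstream classifier can pick up.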
Variance-reduced $Q$-learning is minimax optimal
Title | Variance-reduced $Q$-learning is minimax optimal |
Authors | Martin J. Wainwright |
Abstract | We introduce and analyze a form of variance-reduced $Q$-learning. For $\gamma$-discounted MDPs with finite state space $\mathcal{X}$ and action space $\mathcal{U}$, we prove that it yields an $\epsilon$-accurate estimate of the optimal $Q$-function in the $\ell_\infty$-norm using $\mathcal{O} \left( \frac{D}{\epsilon^2 (1-\gamma)^3} \, \log \left( \frac{D}{1-\gamma} \right) \right)$ samples, where $D = \lvert\mathcal{X}\rvert \times \lvert\mathcal{U}\rvert$. This guarantee matches known minimax lower bounds up to a logarithmic factor in the discount complexity. In contrast, our past work shows that ordinary $Q$-learning has worst-case quartic scaling in the discount complexity. |
Tasks | Q-Learning |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04697v2 |
https://arxiv.org/pdf/1906.04697v2.pdf | |
PWC | https://paperswithcode.com/paper/variance-reduced-q-learning-is-minimax |
Repo | |
Framework | |
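To get a feel for the sample bound quoted in the abstract above, the expression inside the $\mathcal{O}(\cdot)$ can be evaluated numerically. This sketch assumes that $D$ is the number of state-action pairs and ignores the constant hidden by the big-O notation, so the result indicates scaling only and is not an actual sample count; the function name is hypothetical.

```python
import math


def vr_q_sample_scaling(num_states, num_actions, epsilon, gamma):
    """Evaluate D / (eps^2 (1 - gamma)^3) * log(D / (1 - gamma)), constants ignored."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    d = num_states * num_actions      # assumed meaning of D: number of (state, action) pairs
    horizon = 1.0 / (1.0 - gamma)     # the "discount complexity" 1 / (1 - gamma)
    return d * horizon ** 3 / epsilon ** 2 * math.log(d * horizon)
```

Moving `gamma` from 0.9 to 0.99 multiplies the cubic factor by 1000, which is the main driver of the growth; a quartic dependence, as the abstract reports for the worst case of ordinary $Q$-learning, would add a further factor of 10 for the same change.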
Identifying Hidden Buyers in Darknet Markets via Dirichlet Hawkes Process
Title | Identifying Hidden Buyers in Darknet Markets via Dirichlet Hawkes Process |
Authors | Panpan Zheng, Shuhan Yuan, Xintao Wu, Yubao Wu |
Abstract | The darknet markets are notorious black markets in cyberspace, which involve selling or brokering drugs, weapons, stolen credit cards, and other illicit goods. To combat illicit transactions in cyberspace, it is important to analyze the behaviors of participants in darknet markets. Currently, many studies focus on the behavior of vendors. However, there is not much work on analyzing buyers. The key challenge is that the buyers are anonymized in darknet markets. For most of the darknet markets, we only observe the first and last digits of a buyer’s ID, such as "a**b". To tackle this challenge, we propose a hidden buyer identification model, called UNMIX, which can group the transactions from one hidden buyer into one cluster given a transaction sequence from an anonymized ID. UNMIX is able to model the temporal dynamics information as well as the product, comment, and vendor information associated with each transaction. As a result, the transactions with similar patterns in terms of time and content group together as the subsequence from one hidden buyer. Experiments on the data collected from three real-world darknet markets demonstrate the effectiveness of our approach measured by various clustering metrics. Case studies on real transaction sequences explicitly show that our approach can group transactions with similar patterns into the same clusters. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04620v1 |
https://arxiv.org/pdf/1911.04620v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-hidden-buyers-in-darknet-markets |
Repo | |
Framework | |
Confounder-Aware Visualization of ConvNets
Title | Confounder-Aware Visualization of ConvNets |
Authors | Qingyu Zhao, Ehsan Adeli, Adolf Pfefferbaum, Edith V. Sullivan, Kilian M. Pohl |
Abstract | With recent advances in deep learning, neuroimaging studies increasingly rely on convolutional networks (ConvNets) to predict diagnosis based on MR images. To gain a better understanding of how a disease impacts the brain, the studies visualize the salience maps of the ConvNet highlighting voxels within the brain majorly contributing to the prediction. However, these salience maps are generally confounded, i.e., some salient regions are more predictive of confounding variables (such as age) than the diagnosis. To avoid such misinterpretation, we propose in this paper an approach that aims to visualize confounder-free saliency maps that only highlight voxels predictive of the diagnosis. The approach incorporates univariate statistical tests to identify confounding effects within the intermediate features learned by ConvNet. The influence from the subset of confounded features is then removed by a novel partial back-propagation procedure. We use this two-step approach to visualize confounder-free saliency maps extracted from synthetic and two real datasets. These experiments reveal the potential of our visualization in producing unbiased model-interpretation. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12727v1 |
https://arxiv.org/pdf/1907.12727v1.pdf | |
PWC | https://paperswithcode.com/paper/confounder-aware-visualization-of-convnets |
Repo | |
Framework | |
Landmark Detection in Low Resolution Faces with Semi-Supervised Learning
Title | Landmark Detection in Low Resolution Faces with Semi-Supervised Learning |
Authors | Amit Kumar, Rama Chellappa |
Abstract | Landmark detection algorithms trained on high resolution images perform poorly on datasets containing low resolution images. This degrades the performance of algorithms relying on quality landmarks, for example, face recognition. To the best of our knowledge, there does not exist any dataset consisting of low resolution face images along with their annotated landmarks, making supervised training infeasible. In this paper, we present a semi-supervised approach to predict landmarks on low resolution images by learning them from labeled high resolution images. The objective of this work is to show that predicting landmarks directly on low resolution images is more effective than the current practice of aligning images after rescaling or superresolution. In a two-step process, the proposed approach first learns to generate low resolution images by modeling the distribution of target low resolution images. In the second stage, the roles of generated images and real low resolution images are switched and the model learns to predict landmarks for real low resolution images from generated low resolution images. With extensive experimentation, we study the impact of each of the design choices and also show that prediction of landmarks directly on low resolution images improves the performance of important tasks such as face recognition in low resolution images. |
Tasks | Face Recognition |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13255v1 |
https://arxiv.org/pdf/1907.13255v1.pdf | |
PWC | https://paperswithcode.com/paper/landmark-detection-in-low-resolution-faces |
Repo | |
Framework | |
Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing
Title | Deep Reinforcement Learning with Discrete Normalized Advantage Functions for Resource Management in Network Slicing |
Authors | Chen Qi, Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Honggang Zhang |
Abstract | Network slicing promises to provision diversified services with distinct requirements in one infrastructure. Deep reinforcement learning (e.g., deep $\mathcal{Q}$-learning, DQL) is assumed to be an appropriate algorithm to solve the demand-aware inter-slice resource management issue in network slicing by regarding the varying demands and the allocated bandwidth as the environment state and the action, respectively. However, allocating bandwidth at a finer resolution usually implies a larger action space, and unfortunately DQL fails to converge quickly in this case. In this paper, we introduce discrete normalized advantage functions (DNAF) into DQL, by separating the $\mathcal{Q}$-value function into a state-value function term and an advantage term and exploiting a deterministic policy gradient descent (DPGD) algorithm to avoid the unnecessary calculation of the $\mathcal{Q}$-value for every state-action pair. Furthermore, as DPGD only works in continuous action space, we embed a k-nearest neighbor algorithm into DQL to quickly find a valid action in the discrete space nearest to the DPGD output. Finally, we verify the faster convergence of the DNAF-based DQL through extensive simulations. |
Tasks | Q-Learning |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04594v1 |
https://arxiv.org/pdf/1906.04594v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-with-discrete |
Repo | |
Framework | |
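The last step described in the abstract above, mapping a continuous policy output to a nearby valid discrete action with a k-nearest neighbor search, can be sketched as follows. The squared Euclidean distance, the brute-force search and the toy bandwidth allocations are assumptions made for illustration; the abstract does not specify the authors' metric or search structure.

```python
def nearest_actions(proto_action, valid_actions, k=1):
    """Return the k valid discrete actions closest to a continuous proto-action."""
    def squared_distance(action):
        return sum((a - p) ** 2 for a, p in zip(action, proto_action))

    # Brute-force search; sorted() is stable, so ties keep their original order.
    return sorted(valid_actions, key=squared_distance)[:k]
```

For example, if the valid actions are the ways to split four bandwidth units between two slices, `[(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]`, then the continuous output `(1.2, 2.7)` is mapped to `(1, 3)` for `k=1`. With `k > 1`, the returned candidates could then be ranked by their estimated $\mathcal{Q}$-values, which is one common way to use such a neighbor set.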