April 2, 2020

Paper Group ANR 170

RoutedFusion: Learning Real-time Depth Map Fusion

Title RoutedFusion: Learning Real-time Depth Map Fusion
Authors Silvan Weder, Johannes Schönberger, Marc Pollefeys, Martin R. Oswald
Abstract The efficient fusion of depth maps is a key part of most state-of-the-art 3D reconstruction methods. Besides requiring high accuracy, these depth fusion methods need to be scalable and real-time capable. To this end, we present a novel real-time capable machine learning-based method for depth map fusion. Similar to the seminal depth map fusion approach by Curless and Levoy, we only update a local group of voxels to ensure real-time capability. Instead of a simple linear fusion of depth information, we propose a neural network that predicts non-linear updates to better account for typical fusion errors. Our network is composed of a 2D depth routing network and a 3D depth fusion network which efficiently handle sensor-specific noise and outliers. This is especially useful for surface edges and thin objects for which the original approach suffers from thickening artifacts. Our method outperforms the traditional fusion approach and related learned approaches on both synthetic and real data. We demonstrate the performance of our method in reconstructing fine geometric details from noise- and outlier-contaminated data on various scenes.
Tasks 3D Reconstruction
Published 2020-01-13
URL https://arxiv.org/abs/2001.04388v1
PDF https://arxiv.org/pdf/2001.04388v1.pdf
PWC https://paperswithcode.com/paper/routedfusion-learning-real-time-depth-map
Repo
Framework
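
For context, the "simple linear fusion" that the abstract contrasts with is commonly implemented as a per-voxel weighted running average of truncated signed distance (TSDF) values, in the style of Curless and Levoy. The following is a minimal sketch of that classic baseline only, not of RoutedFusion's learned update; the function name `fuse_tsdf` and the uniform per-observation weight are illustrative assumptions.

```python
def fuse_tsdf(tsdf, weight, new_tsdf, new_weight=1.0):
    """Weighted running average for a single voxel (classic linear fusion).

    tsdf, weight: the voxel's current fused value and accumulated weight.
    new_tsdf, new_weight: the incoming observation and its weight
    (a uniform weight of 1.0 is an assumption made for this sketch).
    """
    total = weight + new_weight
    if total == 0.0:
        # Nothing observed yet and nothing to add: leave the voxel untouched.
        return tsdf, weight
    fused = (weight * tsdf + new_weight * new_tsdf) / total
    return fused, total


# Fuse three noisy observations of the same voxel, one at a time.
value, w = 0.0, 0.0
for observation in (0.10, 0.14, 0.06):
    value, w = fuse_tsdf(value, w, observation)
```

Because the update is a plain average, an outlier observation shifts the fused value in proportion to its weight, which is the kind of error the abstract says a learned non-linear update is meant to handle better.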

Automated Anonymisation of Visual and Audio Data in Classroom Studies

Title Automated Anonymisation of Visual and Audio Data in Classroom Studies
Authors Ömer Sümer, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci
Abstract Understanding students’ and teachers’ verbal and non-verbal behaviours during instruction may help infer valuable information regarding the quality of teaching. In education research, there have been many studies that aim to measure students’ attentional focus on learning-related tasks, based on audio-visual recordings and manual or automated ratings of the behaviours of teachers and students. Student data is, however, highly sensitive. Therefore, ensuring high standards of data protection and privacy has the utmost importance in current practices. For example, in the context of teaching management studies, data collection is carried out with the consent of pupils, parents, teachers and school administrations. Nevertheless, there may often be students whose data cannot be used for research purposes. Excluding these students from the classroom is an unnatural intrusion into the organisation of the classroom. A possible solution would be to request permission to make audio-visual recordings of all students (including those who do not voluntarily participate in the study) and to anonymise their data. Yet, the manual anonymisation of audio-visual data is very demanding. In this study, we examine the use of artificial intelligence methods to automatically anonymise the visual and audio data of a particular person.
Tasks
Published 2020-01-14
URL https://arxiv.org/abs/2001.05080v1
PDF https://arxiv.org/pdf/2001.05080v1.pdf
PWC https://paperswithcode.com/paper/automated-anonymisation-of-visual-and-audio
Repo
Framework

VideoSSL: Semi-Supervised Learning for Video Classification

Title VideoSSL: Semi-Supervised Learning for Video Classification
Authors Longlong Jing, Toufiq Parag, Zhe Wu, Yingli Tian, Hongcheng Wang
Abstract We propose a semi-supervised learning approach for video classification, VideoSSL, using convolutional neural networks (CNN). Like other computer vision tasks, existing supervised video classification methods demand a large amount of labeled data to attain good performance. However, annotation of a large dataset is expensive and time-consuming. To minimize the dependence on a large annotated dataset, our proposed semi-supervised method trains from a small number of labeled examples and exploits two regulatory signals from unlabeled data. The first signal is the pseudo-labels of unlabeled examples computed from the confidences of the CNN being trained. The other is the normalized probabilities, as predicted by an image classifier CNN, that capture the information about appearances of the interesting objects in the video. We show that, under the supervision of these guiding signals from unlabeled examples, a video classification CNN can achieve impressive performances utilizing a small fraction of annotated examples on three publicly available datasets: UCF101, HMDB51 and Kinetics.
Tasks Video Classification
Published 2020-02-29
URL https://arxiv.org/abs/2003.00197v1
PDF https://arxiv.org/pdf/2003.00197v1.pdf
PWC https://paperswithcode.com/paper/videossl-semi-supervised-learning-for-video
Repo
Framework
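
The first signal in the abstract, pseudo-labels computed from the network's own confidences, can be illustrated with a generic confidence-thresholding sketch. This is a common pseudo-labeling recipe rather than the exact VideoSSL procedure; the `threshold` value and the function names are assumptions introduced for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(logits, threshold=0.9):
    """Return the predicted class index if the model is confident enough.

    Returns None when the top probability is below `threshold`, meaning the
    unlabeled example would be skipped for this training step.
    The 0.9 default is a hypothetical choice, not taken from the paper.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


confident = pseudo_label([5.0, 0.0, 0.0])   # one class clearly dominates
uncertain = pseudo_label([1.0, 0.9, 0.8])   # scores are nearly tied
```

Examples that receive a pseudo-label can then be treated like labeled data in the loss, while uncertain ones contribute nothing until the model becomes more confident.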

Peeking into occluded joints: A novel framework for crowd pose estimation

Title Peeking into occluded joints: A novel framework for crowd pose estimation
Authors Lingteng Qiu, Xuanye Zhang, Yanran Li, Guanbin Li, Xiaojun Wu, Zixiang Xiong, Xiaoguang Han, Shuguang Cui
Abstract Although occlusion widely exists in nature and remains a fundamental challenge for pose estimation, existing heatmap-based approaches suffer serious degradation on occlusions. Their intrinsic problem is that they directly localize the joints based on visual information; however, invisible joints lack such information. In contrast to localization, our framework estimates the invisible joints from an inference perspective by proposing an Image-Guided Progressive GCN module which provides a comprehensive understanding of both image context and pose structure. Moreover, existing benchmarks contain limited occlusions for evaluation. Therefore, we thoroughly pursue this problem and propose a novel OPEC-Net framework together with a new Occluded Pose (OCPose) dataset with 9k annotated images. Extensive quantitative and qualitative evaluations on benchmarks demonstrate that OPEC-Net achieves significant improvements over recent leading works. Notably, our OCPose is the most complex occlusion dataset with respect to average IoU between adjacent instances. Source code and OCPose will be publicly available.
Tasks Pose Estimation
Published 2020-03-23
URL https://arxiv.org/abs/2003.10506v3
PDF https://arxiv.org/pdf/2003.10506v3.pdf
PWC https://paperswithcode.com/paper/peeking-into-occluded-joints-a-novel
Repo
Framework

Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals

Title Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals
Authors Siyuan Li, Semih Günel, Mirela Ostrek, Pavan Ramdya, Pascal Fua, Helge Rhodin
Abstract Our goal is to capture the pose of neuroscience model organisms, without using any manual supervision, to be able to study how neural circuits orchestrate behaviour. Human pose estimation attains remarkable accuracy when trained on real or simulated datasets consisting of millions of frames. However, for many applications simulated models are unrealistic and real training datasets with comprehensive annotations do not exist. We address this problem with a new sim2real domain transfer method. Our key contribution is the explicit and independent modeling of appearance, shape and poses in an unpaired image translation framework. Our model lets us train a pose estimator on the target domain by transferring readily available body keypoint locations from the source domain to generated target images. We compare our approach with existing domain transfer methods and demonstrate improved pose estimation accuracy on Drosophila melanogaster (fruit fly), Caenorhabditis elegans (worm) and Danio rerio (zebrafish), without requiring any manual annotation on the target domain and despite using simplistic off-the-shelf animal characters for simulation, or simple geometric shapes as models. Our new datasets, code, and trained models will be published to support future neuroscientific studies.
Tasks Pose Estimation
Published 2020-01-23
URL https://arxiv.org/abs/2001.08601v1
PDF https://arxiv.org/pdf/2001.08601v1.pdf
PWC https://paperswithcode.com/paper/deformation-aware-unpaired-image-translation
Repo
Framework

Towards Cognitive Routing based on Deep Reinforcement Learning

Title Towards Cognitive Routing based on Deep Reinforcement Learning
Authors Jiawei Wu, Jianxue Li, Yang Xiao, Jun Liu
Abstract Routing is one of the key functions for stable operation of network infrastructure. Nowadays, the rapid growth of network traffic volume and changing service requirements call for more intelligent routing methods than before. Towards this end, we propose a definition of cognitive routing and an implementation approach based on Deep Reinforcement Learning (DRL). To facilitate the research of DRL-based cognitive routing, we introduce a simulator named RL4Net for DRL-based routing algorithm development and simulation. Then, we design and implement a DDPG-based routing algorithm. The simulation results on an example network topology show that the DDPG-based routing algorithm achieves better performance than OSPF and random weight algorithms. This demonstrates the preliminary feasibility and potential advantage of cognitive routing for future networks.
Tasks
Published 2020-03-19
URL https://arxiv.org/abs/2003.12439v1
PDF https://arxiv.org/pdf/2003.12439v1.pdf
PWC https://paperswithcode.com/paper/towards-cognitive-routing-based-on-deep
Repo
Framework

Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

Title Black Box Explanation by Learning Image Exemplars in the Latent Feature Space
Authors Riccardo Guidotti, Anna Monreale, Stan Matwin, Dino Pedreschi
Abstract We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by “morphing” into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
Tasks Image Classification
Published 2020-01-27
URL https://arxiv.org/abs/2002.03746v1
PDF https://arxiv.org/pdf/2002.03746v1.pdf
PWC https://paperswithcode.com/paper/black-box-explanation-by-learning-image
Repo
Framework

VQ-DRAW: A Sequential Discrete VAE

Title VQ-DRAW: A Sequential Discrete VAE
Authors Alex Nichol
Abstract In this paper, I present VQ-DRAW, an algorithm for learning compact discrete representations of data. VQ-DRAW leverages a vector quantization effect to adapt the sequential generation scheme of DRAW to discrete latent variables. I show that VQ-DRAW can effectively learn to compress images from a variety of common datasets, as well as generate realistic samples from these datasets with no help from an autoregressive prior.
Tasks Quantization
Published 2020-03-03
URL https://arxiv.org/abs/2003.01599v1
PDF https://arxiv.org/pdf/2003.01599v1.pdf
PWC https://paperswithcode.com/paper/vq-draw-a-sequential-discrete-vae
Repo
Framework

Gradient-Based Deep Quantization of Neural Networks through Sinusoidal Adaptive Regularization

Title Gradient-Based Deep Quantization of Neural Networks through Sinusoidal Adaptive Regularization
Authors Ahmed T. Elthakeb, Prannoy Pilligundla, Fatemehsadat Mireshghallah, Tarek Elgindi, Charles-Alban Deledalle, Hadi Esmaeilzadeh
Abstract As deep neural networks make their way into different domains, their compute efficiency is becoming a first-order constraint. Deep quantization, which reduces the bitwidth of the operations (below 8 bits), offers a unique opportunity as it can reduce both the storage and compute requirements of the network super-linearly. However, if not employed with diligence, this can lead to significant accuracy loss. Because layers are strongly interdependent and exhibit different characteristics across the same network, choosing an optimal bitwidth at per-layer granularity is not straightforward. As such, deep quantization opens a large hyper-parameter space, the exploration of which is a major challenge. We propose a novel sinusoidal regularization, called SINAREQ, for deep quantized training. Leveraging the sinusoidal properties, we seek to learn multiple quantization parameterizations jointly during the gradient-based training process. Specifically, we learn (i) a per-layer quantization bitwidth along with (ii) a scale factor through learning the period of the sinusoidal function. At the same time, we exploit the periodicity, differentiability, and the local convexity profile in sinusoidal functions to automatically propel (iii) network weights towards values quantized at levels that are jointly determined. We show how SINAREQ balances compute efficiency and accuracy, and provides a heterogeneous bitwidth assignment for quantization of a large variety of deep networks (AlexNet, CIFAR-10, MobileNet, ResNet-18, ResNet-20, SVHN, and VGG-11) that virtually preserves the accuracy. Furthermore, we carry out experimentation using fixed homogeneous bitwidths with 3- to 5-bit assignment and show the versatility of SINAREQ in enhancing quantized training algorithms (DoReFa and WRPN) with about 4.8% accuracy improvements on average, outperforming multiple state-of-the-art techniques.
Tasks Quantization
Published 2020-02-29
URL https://arxiv.org/abs/2003.00146v1
PDF https://arxiv.org/pdf/2003.00146v1.pdf
PWC https://paperswithcode.com/paper/gradient-based-deep-quantization-of-neural
Repo
Framework
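
To make the idea of a sinusoidal regularizer concrete, here is a minimal sketch of one plausible form: a sin² penalty whose zeros sit exactly on a uniform quantization grid, so that gradient descent nudges each weight towards its nearest level. The precise SINAREQ objective is not given in the abstract, so the formula, the `step` parameter (the grid spacing, which is also the period of the penalty), and the function names are all assumptions.

```python
import math


def sin2_penalty(weights, step):
    """Sum of sin^2(pi * w / step) over the weights.

    The penalty is zero when every weight is an integer multiple of `step`
    and reaches its maximum halfway between two adjacent grid levels.
    """
    return sum(math.sin(math.pi * w / step) ** 2 for w in weights)


def sin2_gradient(w, step):
    """Derivative of sin^2(pi * w / step) with respect to w."""
    return (math.pi / step) * math.sin(2.0 * math.pi * w / step)


on_grid = sin2_penalty([0.0, 0.25, -0.5], step=0.25)   # all weights on grid levels
off_grid = sin2_penalty([0.125], step=0.25)            # exactly between two levels
```

In this hypothetical form the penalty is differentiable in `step` as well as in the weights, which is consistent with the abstract's description of learning the period alongside the weights during gradient-based training.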

On Parameter Tuning in Meta-learning for Computer Vision

Title On Parameter Tuning in Meta-learning for Computer Vision
Authors Farid Ghareh Mohammadi, M. Hadi Amini, Hamid R. Arabnia
Abstract Learning to learn plays a pivotal role in meta-learning (MTL) to obtain an optimal learning model. In this paper, we investigate image recognition for unseen categories of a given dataset with limited training information. We deploy a zero-shot learning (ZSL) algorithm to achieve this goal. We also explore the effect of parameter tuning on the performance of a semantic auto-encoder (SAE). We further address the parameter tuning problem for meta-learning, especially focusing on zero-shot learning. By combining different embedded parameters, we improved the accuracy of tuned-SAE. Advantages and disadvantages of parameter tuning and its application in image classification are also explored.
Tasks Image Classification, Meta-Learning, Zero-Shot Learning
Published 2020-02-11
URL https://arxiv.org/abs/2003.00837v1
PDF https://arxiv.org/pdf/2003.00837v1.pdf
PWC https://paperswithcode.com/paper/on-parameter-tuning-in-meta-learning-for
Repo
Framework

Adversarial Attack on Deep Product Quantization Network for Image Retrieval

Title Adversarial Attack on Deep Product Quantization Network for Image Retrieval
Authors Yan Feng, Bin Chen, Tao Dai, Shutao Xia
Abstract Deep product quantization network (DPQN) has recently received much attention in fast image retrieval tasks due to its efficiency of encoding high-dimensional visual features especially when dealing with large-scale datasets. Recent studies show that deep neural networks (DNNs) are vulnerable to input with small and maliciously designed perturbations (a.k.a., adversarial examples). This phenomenon raises the concern of security issues for DPQN in the testing/deploying stage as well. However, little effort has been devoted to investigating how adversarial examples affect DPQN. To this end, we propose product quantization adversarial generation (PQ-AG), a simple yet effective method to generate adversarial examples for product quantization based retrieval systems. PQ-AG aims to generate imperceptible adversarial perturbations for query images to form adversarial queries, whose nearest neighbors from a targeted product quantization model are not semantically related to those from the original queries. Extensive experiments show that our PQ-AG successfully creates adversarial examples to mislead targeted product quantization retrieval models. Besides, we found that our PQ-AG significantly degrades retrieval performance in both white-box and black-box settings.
Tasks Adversarial Attack, Image Retrieval, Quantization
Published 2020-02-26
URL https://arxiv.org/abs/2002.11374v1
PDF https://arxiv.org/pdf/2002.11374v1.pdf
PWC https://paperswithcode.com/paper/adversarial-attack-on-deep-product
Repo
Framework
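
As background for the retrieval systems the attack targets, the sketch below shows plain product quantization encoding: a feature vector is split into equal sub-vectors, and each sub-vector is replaced by the index of its nearest codeword in a per-subspace codebook. It covers neither the deep network that produces the features nor the PQ-AG attack itself; the toy codebooks and the name `pq_encode` are illustrative assumptions.

```python
def pq_encode(vector, codebooks):
    """Encode `vector` as one codeword index per subspace.

    codebooks: a list of M sub-codebooks; each sub-codebook is a list of
    codewords, and every codeword has length len(vector) // M.
    """
    sub_len = len(vector) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = vector[i * sub_len:(i + 1) * sub_len]
        # Squared Euclidean distance from this sub-vector to each codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(sub, word)) for word in book]
        codes.append(dists.index(min(dists)))
    return codes


# Two subspaces with two hypothetical 2-D codewords each.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
]
code_a = pq_encode([0.9, 1.2, 4.0, 4.5], toy_codebooks)
code_b = pq_encode([0.1, 0.0, 4.9, 5.0], toy_codebooks)
```

Since retrieval compares these short codes rather than the full features, a small input perturbation that flips even one codeword index can change which database items are returned as nearest neighbors.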

A Simulation Model Demonstrating the Impact of Social Aspects on Social Internet of Things

Title A Simulation Model Demonstrating the Impact of Social Aspects on Social Internet of Things
Authors Kashif Zia
Abstract In addition to seamless connectivity and smartness, the objects in the Internet of Things (IoT) are expected to have social capabilities – these objects are termed as “social objects”. In this paper, an intuitive paradigm of social interactions between these objects is argued and modeled. The impact of social behavior on the interaction pattern of social objects is studied taking Peer-to-Peer (P2P) resource sharing as an example application. The model proposed in this paper studies the implications of competitive vs. cooperative social paradigm, while peers attempt to attain the shared resources / services. The simulation results divulge that the social capabilities of the peers impart a significant increase in the quality of interactions between social objects. Through an agent-based simulation study, it is proved that cooperative strategy is more efficient than competitive strategy. Moreover, cooperation with an underpinning on real-life networking structure and mobility does not negatively impact the efficiency of the system at all; rather it helps.
Tasks
Published 2020-02-23
URL https://arxiv.org/abs/2002.11507v1
PDF https://arxiv.org/pdf/2002.11507v1.pdf
PWC https://paperswithcode.com/paper/a-simulation-model-demonstrating-the-impact
Repo
Framework

Task Augmentation by Rotating for Meta-Learning

Title Task Augmentation by Rotating for Meta-Learning
Authors Jialin Liu, Fei Chao, Chih-Min Lin
Abstract Data augmentation is one of the most effective approaches for improving the accuracy of modern machine learning models, and it is also indispensable to train a deep model for meta-learning. In this paper, we introduce a task augmentation method by rotating, which increases the number of classes by rotating the original images 90, 180 and 270 degrees, different from traditional augmentation methods which increase the number of images. With a larger amount of classes, we can sample more diverse task instances during training. Therefore, task augmentation by rotating allows us to train a deep network by meta-learning methods with little over-fitting. Experimental results show that our approach is better than the rotation for increasing the number of images and achieves state-of-the-art performance on miniImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. The code is available on www.github.com/AceChuse/TaskLevelAug.
Tasks Data Augmentation, Few-Shot Learning, Meta-Learning
Published 2020-02-08
URL https://arxiv.org/abs/2003.00804v1
PDF https://arxiv.org/pdf/2003.00804v1.pdf
PWC https://paperswithcode.com/paper/task-augmentation-by-rotating-for-meta
Repo
Framework
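
The distinction the abstract draws, rotating to create new classes rather than new images of existing classes, can be sketched as follows. Images are plain nested lists here, and the label scheme `label + k * num_classes` is a hypothetical convention chosen for this example; the authors' repository may organize the data differently.

```python
def rotate90(image):
    """Rotate a 2-D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_classes(dataset, num_classes):
    """Return a dataset with four times as many classes.

    dataset: list of (image, label) pairs with labels in range(num_classes).
    Each image is kept as-is (k = 0) and also rotated by 90, 180 and 270
    degrees (k = 1, 2, 3); rotation k is assigned the new class
    label + k * num_classes, so rotated copies never share a class with
    the original.
    """
    augmented = []
    for image, label in dataset:
        rotated = image
        for k in range(4):
            augmented.append((rotated, label + k * num_classes))
            rotated = rotate90(rotated)
    return augmented


toy = [([[1, 2], [3, 4]], 0)]
aug = augment_classes(toy, num_classes=5)
labels = [label for _, label in aug]
```

With four times as many classes available, an episodic sampler has more distinct class combinations to draw few-shot tasks from, which is the source of the added task diversity described in the abstract.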

Neural Sign Language Translation by Learning Tokenization

Title Neural Sign Language Translation by Learning Tokenization
Authors Alptekin Orbay, Lale Akarun
Abstract Sign Language Translation has attained considerable success recently, raising hopes for improved communication with the Deaf. A pre-processing step called tokenization improves the success of translations. Tokens can be learned from sign videos if supervised data is available. However, data annotation at the gloss level is costly, and annotated data is scarce. The paper utilizes Adversarial, Multitask, Transfer Learning to search for semi-supervised tokenization approaches without the burden of additional labeling. It provides extensive experiments to compare all the methods in different settings to conduct a deeper analysis. In the case of no additional target annotation besides sentences, the proposed methodology attains 13.25 BLEU-4 and 36.28 ROUGE scores, which improve the current state-of-the-art by 4 points in BLEU-4 and 5 points in ROUGE.
Tasks Sign Language Translation, Tokenization, Transfer Learning
Published 2020-02-02
URL https://arxiv.org/abs/2002.00479v2
PDF https://arxiv.org/pdf/2002.00479v2.pdf
PWC https://paperswithcode.com/paper/neural-sign-language-translation-by-learning
Repo
Framework

Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future

Title Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future
Authors Grace W. Lindsay
Abstract Convolutional neural networks (CNNs) were inspired by early findings in the study of biological vision. They have since become successful tools in computer vision and state-of-the-art models of both neural activity and behavior on visual tasks. This review highlights what, in the context of CNNs, it means to be a good model in computational neuroscience and the various ways models can provide insight. Specifically, it covers the origins of CNNs and the methods by which we validate them as models of biological vision. It then goes on to elaborate on what we can learn about biological vision by understanding and experimenting on CNNs and discusses emerging opportunities for the use of CNNs in vision research beyond basic object recognition.
Tasks Object Recognition
Published 2020-01-20
URL https://arxiv.org/abs/2001.07092v2
PDF https://arxiv.org/pdf/2001.07092v2.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-networks-as-a-model-of
Repo
Framework