October 17, 2019

3306 words 16 mins read

Paper Group ANR 752

Dropout with Tabu Strategy for Regularizing Deep Neural Networks. Towards Robust Human Activity Recognition from RGB Video Stream with Limited Labeled Data. Play Duration based User-Entity Affinity Modeling in Spoken Dialog System. Towards Deep and Representation Learning for Talent Search at LinkedIn. Reinforcement Learning for Autonomous Defence …

Dropout with Tabu Strategy for Regularizing Deep Neural Networks

Title Dropout with Tabu Strategy for Regularizing Deep Neural Networks
Authors Zongjie Ma, Abdul Sattar, Jun Zhou, Qingliang Chen, Kaile Su
Abstract Dropout has proven to be an effective technique for regularization and for preventing the co-adaptation of neurons in deep neural networks (DNNs). It randomly drops units with probability $p$ during the training stage of a DNN, and thereby provides a way of approximately and efficiently combining exponentially many different neural network architectures. In this work, we add a diversification strategy to dropout that aims to generate more distinct network architectures across iterations. Units dropped in the last forward propagation are marked; units selected for dropping in the current forward propagation are kept active if they were marked in the previous one, and only the units from the most recent forward propagation are marked. We call this new technique Tabu Dropout. Tabu Dropout introduces no extra parameters compared with standard dropout and is computationally cheap. Experiments conducted on the MNIST and Fashion-MNIST datasets show that Tabu Dropout improves the performance of standard dropout.
Tasks
Published 2018-08-29
URL http://arxiv.org/abs/1808.09907v1
PDF http://arxiv.org/pdf/1808.09907v1.pdf
PWC https://paperswithcode.com/paper/dropout-with-tabu-strategy-for-regularizing
Repo
Framework
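A minimal NumPy sketch of the tabu mechanism described in the Tabu Dropout abstract above: units dropped in the previous forward propagation are marked and protected from being dropped again in the current one. The function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def tabu_dropout_mask(shape, p, prev_dropped=None, rng=None):
    """Sample a dropout mask that avoids re-dropping the units dropped in
    the previous forward propagation (the tabu list).

    shape        : shape of the layer's activations
    p            : dropout probability, as in standard dropout
    prev_dropped : boolean array marking units dropped last time, or None
    Returns (mask, dropped): a 0/1 keep-mask and the units dropped this time.
    """
    rng = rng or np.random.default_rng()
    dropped = rng.random(shape) < p          # candidates selected for dropping
    if prev_dropped is not None:
        dropped &= ~prev_dropped             # tabu: keep units dropped last time
    return (~dropped).astype(np.float32), dropped

# toy usage: two consecutive forward passes over an eight-unit layer
mask1, d1 = tabu_dropout_mask((1, 8), p=0.5)
mask2, d2 = tabu_dropout_mask((1, 8), p=0.5, prev_dropped=d1)
assert not (d1 & d2).any()   # no unit is dropped in two consecutive passes
```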

Towards Robust Human Activity Recognition from RGB Video Stream with Limited Labeled Data

Title Towards Robust Human Activity Recognition from RGB Video Stream with Limited Labeled Data
Authors Krishanu Sarker, Mohamed Masoud, Saeid Belkasim, Shihao Ji
Abstract Human activity recognition based on video streams has received considerable attention in recent years. Due to the lack of depth information, RGB video based activity recognition performs poorly compared to RGB-D video based solutions. On the other hand, acquiring depth or inertial information is costly and requires special equipment, whereas RGB video streams are available from ordinary cameras. Hence, our goal is to investigate whether similar or even higher accuracy can be achieved with the RGB-only modality. To this end, we propose a novel framework that couples skeleton data extracted from RGB video with a deep Bidirectional Long Short Term Memory (BLSTM) model for activity recognition. A major challenge in training such a deep network is the limited training data, and relying on the RGB-only stream significantly exacerbates the difficulty. We therefore propose a set of algorithmic techniques to train this model effectively, e.g., data augmentation, dynamic frame dropout and gradient injection. The experiments demonstrate that our RGB-only solution surpasses state-of-the-art approaches that all exploit RGB-D video streams by a notable margin. This makes our solution widely deployable with ordinary cameras.
Tasks Activity Recognition, Data Augmentation, Human Activity Recognition
Published 2018-12-16
URL http://arxiv.org/abs/1812.06544v1
PDF http://arxiv.org/pdf/1812.06544v1.pdf
PWC https://paperswithcode.com/paper/towards-robust-human-activity-recognition
Repo
Framework
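The abstract above names dynamic frame dropout as one of the training techniques; the sketch below shows one plausible NumPy implementation of frame-level dropout on an extracted skeleton sequence. The joint layout, drop probability, and function name are assumptions, not taken from the paper.

```python
import numpy as np

def dynamic_frame_dropout(skeleton_seq, drop_prob=0.2, rng=None):
    """Randomly drop frames from a skeleton sequence as data augmentation.

    skeleton_seq : array of shape (T, J, 2), T frames of J 2-D joints
                   extracted from an RGB video
    drop_prob    : probability of discarding each frame
    Returns a shorter sequence; at least one frame is always kept.
    """
    rng = rng or np.random.default_rng()
    keep = rng.random(len(skeleton_seq)) >= drop_prob
    if not keep.any():
        keep[rng.integers(len(skeleton_seq))] = True
    return skeleton_seq[keep]

# toy usage: a 30-frame sequence of 18 joints
seq = np.random.rand(30, 18, 2)
aug = dynamic_frame_dropout(seq, drop_prob=0.25)
print(seq.shape, "->", aug.shape)
```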

Play Duration based User-Entity Affinity Modeling in Spoken Dialog System

Title Play Duration based User-Entity Affinity Modeling in Spoken Dialog System
Authors Bo Xiao, Nicholas Monath, Shankar Ananthakrishnan, Abishek Ravi
Abstract Multimedia streaming services over spoken dialog systems have become ubiquitous. User-entity affinity modeling is critical for the system to understand and disambiguate user intents and to personalize user experiences. However, fully voice-based interaction demands quantification of novel behavioral cues to determine user affinities. In this work, we propose using play duration cues to learn a matrix factorization based collaborative filtering model. We first binarize play durations to obtain implicit positive and negative affinity labels. The Bayesian Personalized Ranking objective and learning algorithm are employed in our low-rank matrix factorization approach. To cope with uncertainties in the implicit affinity labels, we propose applying a weighting function that emphasizes the importance of high-confidence samples. Based on a large-scale database of Alexa music service records, we evaluate the affinity models by computing the Spearman correlation between play durations and predicted affinities. Comparing different data-utilization schemes and weighting functions, we find that employing both positive and negative affinity samples with a convex weighting function yields the best performance. Further analysis demonstrates the model’s effectiveness at the individual entity level and provides insights into the temporal dynamics of observed affinities.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11479v1
PDF http://arxiv.org/pdf/1806.11479v1.pdf
PWC https://paperswithcode.com/paper/play-duration-based-user-entity-affinity
Repo
Framework
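A small NumPy sketch of one weighted Bayesian Personalized Ranking update, corresponding to the binarized play-duration signal and confidence weighting described in the abstract above. The exact weighting function and hyperparameters are not specified by the paper; the values below are placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_step(P, Q, u, i, j, w=1.0, lr=0.05, reg=0.01):
    """One weighted SGD step of Bayesian Personalized Ranking.

    P, Q : user and entity factor matrices (num_users x k, num_entities x k)
    u    : user index; i / j : entity indices with higher / lower implied
           affinity (e.g. long vs. short play duration)
    w    : confidence weight for this sample, e.g. a convex function of how
           far the play duration lies from the binarization threshold
    """
    pu, qi, qj = P[u].copy(), Q[i].copy(), Q[j].copy()
    x_uij = pu @ (qi - qj)
    g = w * sigmoid(-x_uij)              # gradient scale of ln sigmoid(x_uij)
    P[u] += lr * (g * (qi - qj) - reg * pu)
    Q[i] += lr * (g * pu - reg * qi)
    Q[j] += lr * (-g * pu - reg * qj)

# toy usage: 100 users, 500 entities, 16 latent factors
rng = np.random.default_rng(0)
P, Q = rng.normal(size=(100, 16)), rng.normal(size=(500, 16))
bpr_step(P, Q, u=3, i=42, j=7, w=0.8)
```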

Towards Deep and Representation Learning for Talent Search at LinkedIn

Title Towards Deep and Representation Learning for Talent Search at LinkedIn
Authors Rohan Ramanath, Hakan Inan, Gungor Polatkan, Bo Hu, Qi Guo, Cagri Ozcaglar, Xianren Wu, Krishnaram Kenthapadi, Sahin Cem Geyik
Abstract Talent search and recommendation systems at LinkedIn strive to match the potential candidates to the hiring needs of a recruiter or a hiring manager expressed in terms of a search query or a job posting. Recent work in this domain has mainly focused on linear models, which do not take complex relationships between features into account, as well as ensemble tree models, which introduce non-linearity but are still insufficient for exploring all the potential feature interactions, and strictly separate feature generation from modeling. In this paper, we present the results of our application of deep and representation learning models on LinkedIn Recruiter. Our key contributions include: (i) Learning semantic representations of sparse entities within the talent search domain, such as recruiter ids, candidate ids, and skill entity ids, for which we utilize neural network models that take advantage of LinkedIn Economic Graph, and (ii) Deep models for learning recruiter engagement and candidate response in talent search applications. We also explore learning to rank approaches applied to deep models, and show the benefits for the talent search use case. Finally, we present offline and online evaluation results for LinkedIn talent search and recommendation systems, and discuss potential challenges along the path to a fully deep model architecture. The challenges and approaches discussed generalize to any multi-faceted search engine.
Tasks Learning Semantic Representations, Learning-To-Rank, Recommendation Systems, Representation Learning
Published 2018-09-17
URL http://arxiv.org/abs/1809.06473v1
PDF http://arxiv.org/pdf/1809.06473v1.pdf
PWC https://paperswithcode.com/paper/towards-deep-and-representation-learning-for
Repo
Framework

Reinforcement Learning for Autonomous Defence in Software-Defined Networking

Title Reinforcement Learning for Autonomous Defence in Software-Defined Networking
Authors Yi Han, Benjamin I. P. Rubinstein, Tamas Abraham, Tansu Alpcan, Olivier De Vel, Sarah Erfani, David Hubczenko, Christopher Leckie, Paul Montague
Abstract Despite the successful application of machine learning (ML) in a wide range of domains, adaptability—the very property that makes machine learning desirable—can be exploited by adversaries to contaminate training and evade classification. In this paper, we investigate the feasibility of applying a specific class of machine learning algorithms, namely, reinforcement learning (RL) algorithms, for autonomous cyber defence in software-defined networking (SDN). In particular, we focus on how an RL agent reacts towards different forms of causative attacks that poison its training process, including indiscriminate and targeted, white-box and black-box attacks. In addition, we also study the impact of the attack timing, and explore potential countermeasures such as adversarial training.
Tasks
Published 2018-08-17
URL http://arxiv.org/abs/1808.05770v1
PDF http://arxiv.org/pdf/1808.05770v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-for-autonomous-defence
Repo
Framework

M2M-GAN: Many-to-Many Generative Adversarial Transfer Learning for Person Re-Identification

Title M2M-GAN: Many-to-Many Generative Adversarial Transfer Learning for Person Re-Identification
Authors Wenqi Liang, Guangcong Wang, Jianhuang Lai, Junyong Zhu
Abstract Cross-domain transfer learning (CDTL) is an extremely challenging task for person re-identification (ReID). Given a source domain with annotations and a target domain without annotations, CDTL seeks an effective method to transfer knowledge from the source domain to the target domain. However, such a simple two-domain transfer learning formulation is insufficient for person ReID, because the source and target domains each consist of several sub-domains, e.g., camera-based sub-domains. To address this intractable problem, we propose a novel Many-to-Many Generative Adversarial Transfer Learning method (M2M-GAN) that takes multiple source sub-domains and multiple target sub-domains into consideration and learns each sub-domain mapping from the source domain to the target domain in a unified optimization process. The proposed method first translates the image styles of the source sub-domains into those of the target sub-domains, and then performs supervised learning using the transferred images and the corresponding source-domain annotations. As the domain gap is reduced, M2M-GAN achieves promising results for cross-domain person ReID. Experimental results on three benchmark datasets, Market-1501, DukeMTMC-reID and MSMT17, show the effectiveness of our M2M-GAN.
Tasks Person Re-Identification, Transfer Learning
Published 2018-11-09
URL http://arxiv.org/abs/1811.03768v1
PDF http://arxiv.org/pdf/1811.03768v1.pdf
PWC https://paperswithcode.com/paper/m2m-gan-many-to-many-generative-adversarial
Repo
Framework

An Interpretable Machine Vision Approach to Human Activity Recognition using Photoplethysmograph Sensor Data

Title An Interpretable Machine Vision Approach to Human Activity Recognition using Photoplethysmograph Sensor Data
Authors Eoin Brophy, José Juan Dominguez Veiga, Zhengwei Wang, Alan F. Smeaton, Tomas E. Ward
Abstract The current gold standard for human activity recognition (HAR) is based on the use of cameras. However, the poor scalability of camera systems renders them impractical in pursuit of the goal of wider adoption of HAR in mobile computing contexts. Consequently, researchers instead rely on wearable sensors, and in particular inertial sensors. A particularly prevalent wearable is the smartwatch, which, due to its integrated inertial and optical sensing capabilities, holds great potential for realising better HAR in a non-obtrusive way. This paper seeks to simplify the wearable approach to HAR by determining whether the wrist-mounted optical sensor alone, typically found in a smartwatch or similar device, can be used as a useful source of data for activity recognition. The approach has the potential to eliminate the need for the inertial sensing element, which would in turn reduce the cost and complexity of smartwatches and fitness trackers. This could potentially commoditise the hardware requirements for HAR while retaining the functionality of both heart rate monitoring and activity capture, all from a single optical sensor. Our approach relies on the adoption of machine vision for activity recognition based on suitably scaled plots of the optical signals. We take this approach so as to produce classifications that are easily explainable and interpretable by non-technical users. More specifically, images of photoplethysmography signal time series are used to retrain the penultimate layer of a convolutional neural network which has initially been trained on the ImageNet database. We then use the 2048-dimensional features from the penultimate layer as input to a support vector machine. Results from the experiment yielded an average classification accuracy of 92.3%. This result outperforms that of an optical and inertial sensor combined (78%) and illustrates the capability of HAR systems using…
Tasks Activity Recognition, Human Activity Recognition, Time Series
Published 2018-12-03
URL http://arxiv.org/abs/1812.00668v1
PDF http://arxiv.org/pdf/1812.00668v1.pdf
PWC https://paperswithcode.com/paper/an-interpretable-machine-vision-approach-to
Repo
Framework
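A hedged sketch of the transfer-learning pipeline described in the abstract above: 2048-dimensional penultimate-layer features from an ImageNet-pretrained CNN feed a support vector machine. The paper does not necessarily use ResNet-50 or these preprocessing settings; they stand in for whichever ImageNet backbone was used, and the image and label variables are placeholders.

```python
import torch
import torchvision.models as models
from torchvision import transforms
from sklearn.svm import SVC

# ImageNet-pretrained backbone; strip the classifier to expose the
# 2048-dimensional penultimate (pooled) features.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(pil_images):
    """Map a list of PIL images (plots of PPG windows) to (N, 2048) features."""
    batch = torch.stack([preprocess(img) for img in pil_images])
    return backbone(batch).numpy()

# train_images / train_labels would be rendered PPG plots and their activity
# labels (placeholders, not provided here):
# features = extract_features(train_images)
# clf = SVC(kernel="rbf").fit(features, train_labels)
# predictions = clf.predict(extract_features(test_images))
```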

A Game-Theoretic Approach to Design Secure and Resilient Distributed Support Vector Machines

Title A Game-Theoretic Approach to Design Secure and Resilient Distributed Support Vector Machines
Authors Rui Zhang, Quanyan Zhu
Abstract Distributed Support Vector Machines (DSVM) have been developed to solve large-scale classification problems in networked systems with a large number of sensors and control units. However, the systems become more vulnerable as detection and defense are increasingly difficult and expensive. This work aims to develop secure and resilient DSVM algorithms under adversarial environments in which an attacker can manipulate the training data to achieve his objective. We establish a game-theoretic framework to capture the conflicting interests between an adversary and a set of distributed data processing units. The Nash equilibrium of the game allows predicting the outcome of learning algorithms in adversarial environments, and enhancing the resilience of the machine learning through dynamic distributed learning algorithms. We prove that the convergence of the distributed algorithm is guaranteed without assumptions on the training data or network topologies. Numerical experiments are conducted to corroborate the results. We show that network topology plays an important role in the security of DSVM. Networks with fewer nodes and higher average degrees are more secure. Moreover, a balanced network is found to be less vulnerable to attacks.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.02907v1
PDF http://arxiv.org/pdf/1802.02907v1.pdf
PWC https://paperswithcode.com/paper/a-game-theoretic-approach-to-design-secure
Repo
Framework

Skeleton-based Activity Recognition with Local Order Preserving Match of Linear Patches

Title Skeleton-based Activity Recognition with Local Order Preserving Match of Linear Patches
Authors Yaqiang Yao, Yan Liu, Huanhuan Chen
Abstract Human activity recognition has drawn considerable attention recently in the field of computer vision due to the development of commodity depth cameras, by which the human activity is represented as a sequence of 3D skeleton postures. Assuming human body 3D joint locations of an activity lie on a manifold, the problem of recognizing human activity is formulated as the computation of activity manifold-manifold distance (AMMD). In this paper, we first design an efficient division method to decompose a manifold into ordered continuous maximal linear patches (CMLPs) that denote meaningful action snippets of the action sequence. Then the CMLP is represented by its position (average value of points) and the first principal component, which specify the major posture and main evolving direction of an action snippet, respectively. Finally, we compute the distance between CMLPs by taking both the posture and direction into consideration. Based on these preparations, an intuitive distance measure that preserves the local order of action snippets is proposed to compute AMMD. The performance on two benchmark datasets demonstrates the effectiveness of the proposed approach.
Tasks Activity Recognition, Human Activity Recognition
Published 2018-11-01
URL http://arxiv.org/abs/1811.00256v1
PDF http://arxiv.org/pdf/1811.00256v1.pdf
PWC https://paperswithcode.com/paper/skeleton-based-activity-recognition-with
Repo
Framework
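A NumPy sketch of how a continuous maximal linear patch (CMLP) from the abstract above might be summarized by its mean position and first principal component, and how two patches could then be compared. The way the posture and direction terms are combined here (a weighted sum) is an assumption; the paper's exact distance may differ.

```python
import numpy as np

def patch_descriptor(patch):
    """Represent a linear patch of skeleton frames by its position and
    main evolving direction.

    patch : array (n_frames, d) of flattened 3D joint coordinates
    Returns (mean posture, first principal component).
    """
    mean = patch.mean(axis=0)
    # first right singular vector = first principal component
    _, _, vt = np.linalg.svd(patch - mean, full_matrices=False)
    return mean, vt[0]

def patch_distance(p1, p2, alpha=1.0):
    """Distance combining posture (mean) and direction (first PC).

    A weighted sum of the Euclidean distance between means and the angular
    dissimilarity between directions is assumed here.
    """
    (m1, d1), (m2, d2) = patch_descriptor(p1), patch_descriptor(p2)
    posture = np.linalg.norm(m1 - m2)
    direction = 1.0 - abs(d1 @ d2)          # PCs are sign-ambiguous
    return posture + alpha * direction

# toy usage: two snippets of 10 frames with 75 coordinates (25 joints x 3)
a, b = np.random.rand(10, 75), np.random.rand(10, 75)
print(patch_distance(a, b))
```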

Unsupervised Adversarial Invariance

Title Unsupervised Adversarial Invariance
Authors Ayush Jaiswal, Yue Wu, Wael AbdAlmageed, Premkumar Natarajan
Abstract Data representations that contain all the information about target variables but are invariant to nuisance factors benefit supervised learning algorithms by preventing them from learning associations between these factors and the targets, thus reducing overfitting. We present a novel unsupervised invariance induction framework for neural networks that learns a split representation of data through competitive training between the prediction task and a reconstruction task coupled with disentanglement, without needing any labeled information about nuisance factors or domain knowledge. We describe an adversarial instantiation of this framework and provide an analysis of how it works. Our unsupervised model outperforms state-of-the-art methods, which are supervised, at inducing invariance to inherent nuisance factors, effectively using synthetic data augmentation to learn invariance, and domain adaptation. Our method can be applied to any prediction task, e.g., binary/multi-class classification or regression, without loss of generality.
Tasks Data Augmentation, Domain Adaptation
Published 2018-09-26
URL http://arxiv.org/abs/1809.10083v1
PDF http://arxiv.org/pdf/1809.10083v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-adversarial-invariance
Repo
Framework
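A compact PyTorch sketch of the split-representation idea from the abstract above: one encoder branch feeds the predictor, both branches feed the reconstruction, and small disentanglers adversarially tie the two splits together. Layer sizes, the dropout on the predictive split before reconstruction, and the loss combination in the comments are illustrative assumptions, not the paper's exact instantiation.

```python
import torch
import torch.nn as nn

class SplitEncoder(nn.Module):
    """e1 should keep only task-relevant factors; e2 absorbs nuisance factors."""
    def __init__(self, d_in=784, d1=32, d2=32, n_classes=10):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d1))
        self.enc2 = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d2))
        self.classifier = nn.Linear(d1, n_classes)           # prediction from e1 only
        self.decoder = nn.Sequential(nn.Linear(d1 + d2, 128), nn.ReLU(),
                                     nn.Linear(128, d_in))   # reconstruction from both
        self.drop = nn.Dropout(p=0.5)                         # pushes nuisance info into e2
        # adversarial disentanglers try to predict one split from the other
        self.dis1to2 = nn.Linear(d1, d2)
        self.dis2to1 = nn.Linear(d2, d1)

    def forward(self, x):
        e1, e2 = self.enc1(x), self.enc2(x)
        y_hat = self.classifier(e1)
        x_hat = self.decoder(torch.cat([self.drop(e1), e2], dim=1))
        return e1, e2, y_hat, x_hat

model = SplitEncoder()
x = torch.randn(8, 784)
e1, e2, y_hat, x_hat = model(x)
# Competing objectives, trained in an alternating adversarial fashion:
#   main model:    CE(y_hat, y) + MSE(x_hat, x) - MSE(dis1to2(e1), e2) - MSE(dis2to1(e2), e1)
#   disentanglers: MSE(dis1to2(e1), e2) + MSE(dis2to1(e2), e1)
```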

ICFVR 2017: 3rd International Competition on Finger Vein Recognition

Title ICFVR 2017: 3rd International Competition on Finger Vein Recognition
Authors Yi Zhang, Houjun Huang, Haifeng Zhang, Liao Ni, Wei Xu, Nasir Uddin Ahmed, Md. Shakil Ahmed, Yilun Jin, Yingjie Chen, Jingxuan Wen, Wenxin Li
Abstract In recent years, finger vein recognition has become an important sub-field of biometrics and has been applied in real-world applications. The development of finger vein recognition algorithms depends heavily on large-scale real-world data sets. In order to motivate research on finger vein recognition, we released the largest finger vein data set to date and hold finger vein recognition competitions based on our data set every year. In 2017, the International Competition on Finger Vein Recognition (ICFVR) was held jointly with IJCB 2017; 11 teams registered and 10 of them joined the final evaluation. This year’s winner dramatically improved the EER, from 2.64% for last year’s winner to 0.483%. In this paper, we introduce the process and results of ICFVR 2017 and give insights into the development of state-of-the-art finger vein recognition algorithms.
Tasks
Published 2018-01-04
URL http://arxiv.org/abs/1801.01262v1
PDF http://arxiv.org/pdf/1801.01262v1.pdf
PWC https://paperswithcode.com/paper/icfvr-2017-3rd-international-competition-on
Repo
Framework
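The competition ranks systems by equal error rate (EER); below is a small sketch of how an EER could be computed from genuine and impostor match scores using scikit-learn. The synthetic scores are placeholders, not competition data.

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    """Approximate the EER used to rank finger-vein verification systems.

    labels : 1 for genuine comparisons, 0 for impostor comparisons
    scores : higher means the algorithm believes the pair matches
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # threshold where FAR ~= FRR
    return (fpr[idx] + fnr[idx]) / 2.0

# toy usage with synthetic match scores
rng = np.random.default_rng(0)
genuine = rng.normal(0.8, 0.1, 1000)
impostor = rng.normal(0.4, 0.15, 5000)
scores = np.concatenate([genuine, impostor])
labels = np.concatenate([np.ones(1000), np.zeros(5000)])
print(f"EER = {equal_error_rate(labels, scores):.3%}")
```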

PerceptionNet: A Deep Convolutional Neural Network for Late Sensor Fusion

Title PerceptionNet: A Deep Convolutional Neural Network for Late Sensor Fusion
Authors Panagiotis Kasnesis, Charalampos Z. Patrikakis, Iakovos S. Venieris
Abstract Human Activity Recognition (HAR) based on motion sensors has drawn a lot of attention over the last few years, since perceiving the human status enables context-aware applications to adapt their services to users’ needs. However, motion sensor fusion and feature extraction have not reached their full potential and remain open issues. In this paper, we introduce PerceptionNet, a deep Convolutional Neural Network (CNN) that applies a late 2D convolution to multimodal time-series sensor data in order to automatically extract efficient features for HAR. We evaluate our approach on two publicly available HAR datasets and demonstrate that the proposed model effectively fuses multimodal sensor data and improves HAR performance. In particular, PerceptionNet surpasses state-of-the-art HAR methods based on: (i) features engineered by humans, (ii) deep CNNs using early fusion approaches, and (iii) Long Short-Term Memory (LSTM) networks, by more than 3% in average accuracy.
Tasks Activity Recognition, Human Activity Recognition, Sensor Fusion, Time Series
Published 2018-11-01
URL http://arxiv.org/abs/1811.00170v1
PDF http://arxiv.org/pdf/1811.00170v1.pdf
PWC https://paperswithcode.com/paper/perceptionnet-a-deep-convolutional-neural
Repo
Framework
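An illustrative PyTorch sketch of the late-fusion idea in the abstract above: early convolutions slide along time within each sensor channel, and only a late 2D convolution spans all channels to fuse the modalities. Kernel sizes, channel counts, and the number of activity classes are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LateFusionCNN(nn.Module):
    def __init__(self, n_channels=6, n_classes=6):
        super().__init__()
        # temporal convolutions applied to each sensor channel independently
        self.per_channel = nn.Sequential(
            nn.Conv2d(1, 48, kernel_size=(1, 15), padding=(0, 7)), nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
            nn.Conv2d(48, 96, kernel_size=(1, 15), padding=(0, 7)), nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
        )
        # late 2D convolution spanning all sensor channels at once
        self.fusion = nn.Conv2d(96, 96, kernel_size=(n_channels, 15), padding=(0, 7))
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(96, n_classes))

    def forward(self, x):            # x: (batch, 1, n_channels, time)
        h = self.per_channel(x)      # (batch, 96, n_channels, time / 4)
        h = self.fusion(h)           # (batch, 96, 1, time / 4)
        return self.head(h)

model = LateFusionCNN()
window = torch.randn(4, 1, 6, 128)   # 4 windows of 6 sensor axes x 128 samples
print(model(window).shape)           # torch.Size([4, 6])
```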

Reconciling meta-learning and continual learning with online mixtures of tasks

Title Reconciling meta-learning and continual learning with online mixtures of tasks
Authors Ghassen Jerfel, Erin Grant, Thomas L. Griffiths, Katherine Heller
Abstract Learning-to-learn or meta-learning leverages data-driven inductive bias to increase the efficiency of learning on a novel task. This approach encounters difficulty when transfer is not advantageous, for instance, when tasks are considerably dissimilar or change over time. We use the connection between gradient-based meta-learning and hierarchical Bayes to propose a Dirichlet process mixture of hierarchical Bayesian models over the parameters of an arbitrary parametric model such as a neural network. In contrast to consolidating inductive biases into a single set of hyperparameters, our approach of task-dependent hyperparameter selection better handles latent distribution shift, as demonstrated on a set of evolving, image-based, few-shot learning benchmarks.
Tasks Continual Learning, Few-Shot Learning, Meta-Learning
Published 2018-12-14
URL https://arxiv.org/abs/1812.06080v3
PDF https://arxiv.org/pdf/1812.06080v3.pdf
PWC https://paperswithcode.com/paper/online-gradient-based-mixtures-for-transfer
Repo
Framework

Neural Machine Translation with Adequacy-Oriented Learning

Title Neural Machine Translation with Adequacy-Oriented Learning
Authors Xiang Kong, Zhaopeng Tu, Shuming Shi, Eduard Hovy, Tong Zhang
Abstract Although Neural Machine Translation (NMT) models have advanced the state of the art in machine translation, they still suffer from problems such as inadequate translation. We attribute this to the fact that standard Maximum Likelihood Estimation (MLE) cannot judge real translation quality due to several of its limitations. In this work, we propose an adequacy-oriented learning mechanism for NMT by casting translation as a stochastic policy in Reinforcement Learning (RL), where the reward is estimated by explicitly measuring translation adequacy. Benefiting from the sequence-level training of the RL strategy and a more accurate reward designed specifically for translation, our model outperforms multiple strong baselines, including (1) standard and coverage-augmented attention models with MLE-based training, and (2) advanced reinforcement and adversarial training strategies with rewards based on both word-level BLEU and character-level chrF3. Quantitative and qualitative analyses on different language pairs and NMT architectures demonstrate the effectiveness and universality of the proposed approach.
Tasks Machine Translation
Published 2018-11-21
URL http://arxiv.org/abs/1811.08541v1
PDF http://arxiv.org/pdf/1811.08541v1.pdf
PWC https://paperswithcode.com/paper/neural-machine-translation-with-adequacy
Repo
Framework
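A minimal PyTorch sketch of the sequence-level, reward-driven training signal the abstract above describes: a REINFORCE-style loss scaled by an adequacy reward. The reward here is a placeholder scalar; the paper's contribution lies precisely in how that adequacy reward is estimated, which is not reproduced here.

```python
import torch

def sequence_level_rl_loss(log_probs, adequacy_reward, baseline=0.0):
    """REINFORCE-style loss for one sampled translation.

    log_probs       : tensor (T,) of log p(y_t | y_<t, x) for the sampled tokens
    adequacy_reward : scalar reward for the whole sampled translation
                      (placeholder for the paper's adequacy-oriented reward)
    baseline        : variance-reduction baseline, e.g. a moving-average reward
    """
    return -(adequacy_reward - baseline) * log_probs.sum()

# toy usage: 12 sampled target tokens and a reward of 0.73
raw = torch.rand(12, requires_grad=True)      # stand-in for decoder outputs
log_probs = torch.log(raw)
loss = sequence_level_rl_loss(log_probs, adequacy_reward=0.73, baseline=0.5)
loss.backward()                               # gradients flow back to `raw`;
                                              # with a real decoder they reach
                                              # the model parameters
```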

Reversed Active Learning based Atrous DenseNet for Pathological Image Classification

Title Reversed Active Learning based Atrous DenseNet for Pathological Image Classification
Authors Yuexiang Li, Xinpeng Xie, Linlin Shen, Shaoxiong Liu
Abstract With the rapid development of deep learning in recent years, an increasing number of studies have tried to adopt deep learning models for medical image analysis. However, applying deep learning networks to pathological image analysis encounters several challenges, e.g., the high resolution (gigapixel) of pathological images and the lack of annotations of cancer areas. To address these challenges, we propose a complete framework for pathological image classification, which consists of a novel training strategy, namely reversed active learning (RAL), and an advanced network, namely atrous DenseNet (ADN). The proposed RAL removes mislabeled patches from the training set; the refined training set can then be used to train widely used deep learning networks, e.g., VGG-16, ResNets, etc. A novel deep learning network, the atrous DenseNet (ADN), is also proposed for the classification of pathological images. The proposed ADN achieves multi-scale feature extraction by integrating atrous convolutions into the Dense Block. The proposed RAL and ADN have been evaluated on two pathological datasets, i.e., BACH and CCG. The experimental results demonstrate the excellent performance of the proposed ADN + RAL framework, with average patch-level ACAs of 94.10% and 92.05% on the BACH and CCG validation sets, respectively.
Tasks Active Learning, Image Classification
Published 2018-07-06
URL http://arxiv.org/abs/1807.02420v1
PDF http://arxiv.org/pdf/1807.02420v1.pdf
PWC https://paperswithcode.com/paper/reversed-active-learning-based-atrous
Repo
Framework
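A scikit-learn sketch of a reversed active-learning loop in the spirit of the RAL strategy above: instead of querying new labels, the samples whose given labels the current model trusts least are removed from the training set. The selection rule, the classifier, and the drop fraction are assumptions; the paper applies its RAL strategy with the ADN network on image patches.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def reversed_active_learning(X, y, drop_frac=0.05, rounds=3):
    """Iteratively remove the training samples most likely to be mislabeled.

    At each round a classifier is fit on the current set, and the samples
    whose given label receives the lowest predicted probability are dropped.
    """
    keep = np.arange(len(y))
    for _ in range(rounds):
        clf = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
        proba = clf.predict_proba(X[keep])
        label_conf = proba[np.arange(len(keep)), y[keep]]
        n_drop = int(drop_frac * len(keep))
        keep = keep[np.argsort(label_conf)[n_drop:]]   # drop least consistent
    return keep

# toy usage: 500 feature vectors with ~10% randomly flipped labels
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X[:, 0] > 0).astype(int)
flip = rng.choice(500, 50, replace=False)
y[flip] ^= 1
kept = reversed_active_learning(X, y)
print(len(kept), "samples kept after cleaning")
```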