April 1, 2020

Paper Group ANR 445

Friend Recommendation based on Hashtags Analysis. RF Sensing for Continuous Monitoring of Human Activities for Home Consumer Applications. Evaluating Temporal Queries Over Video Feeds. Unbiased Mean Teacher for Cross Domain Object Detection. Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes. FEA-Net: A Physics-guided Data-driven Mode …

Friend Recommendation based on Hashtags Analysis

Title Friend Recommendation based on Hashtags Analysis
Authors Ali Choumane, Zein Al Abidin Ibrahim
Abstract Social networks include millions of users constantly looking for new relationships for personal or professional purposes. Social network sites recommend friends based on relationship features and content information. A significant part of the information shared every day is spread through hashtags. None of the existing content-based recommender systems uses the semantics of hashtags when suggesting new friends. Currently, hashtags are treated as strings without regard to their meaning. Social network sites group together people sharing exactly the same hashtags and never semantically close ones. We think that hashtags encapsulate some of people’s interests. In this paper, we propose a framework showing how a recommender system can benefit from hashtags to enrich users’ profiles. This framework consists of three main components: (1) constructing a user’s profile based on shared hashtags, (2) a matching method that computes semantic similarity between profiles, and (3) grouping semantically close users using clustering techniques. The proposed framework has been tested on a Twitter dataset from the Stanford Large Network Dataset Collection consisting of 81306 profiles.
Tasks Recommendation Systems, Semantic Similarity, Semantic Textual Similarity
Published 2020-03-07
URL https://arxiv.org/abs/2003.03531v1
PDF https://arxiv.org/pdf/2003.03531v1.pdf
PWC https://paperswithcode.com/paper/friend-recommendation-based-on-hashtags
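
The three components named in the abstract (profile construction, profile matching, grouping) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: word overlap between CamelCase hashtags stands in for their semantic similarity, a greedy threshold grouping stands in for their clustering, and every name here (tokenize, build_profile, profile_similarity, group_users, the 0.3 threshold) is hypothetical.

```python
import re


def tokenize(hashtag):
    """Split a CamelCase hashtag such as '#DeepLearning' into lowercase words."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", hashtag)
    return {w.lower() for w in words}


def build_profile(hashtags):
    """A user's profile is the set of words found in all hashtags they shared."""
    profile = set()
    for tag in hashtags:
        profile |= tokenize(tag)
    return profile


def profile_similarity(a, b):
    """Jaccard overlap of two profiles (a crude stand-in for semantic similarity)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_users(profiles, threshold=0.3):
    """Greedy grouping: a user joins the first group containing a close-enough member."""
    groups = []
    for user, profile in profiles.items():
        for group in groups:
            if any(profile_similarity(profile, profiles[m]) >= threshold for m in group):
                group.append(user)
                break
        else:
            groups.append([user])
    return groups


# Toy data, invented for illustration only.
profiles = {
    "ann": build_profile(["#DeepLearning", "#NeuralNets"]),
    "bob": build_profile(["#MachineLearning", "#NeuralNetworks"]),
    "cat": build_profile(["#TrailRunning"]),
}
groups = group_users(profiles)
```

Here ann and bob share no identical hashtag, yet they end up in the same group because their hashtags share the words "learning" and "neural". A real system would replace the word overlap with a proper semantic measure.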

RF Sensing for Continuous Monitoring of Human Activities for Home Consumer Applications

Title RF Sensing for Continuous Monitoring of Human Activities for Home Consumer Applications
Authors Moeness G. Amin, Arun Ravisankar, Ronny G. Guendel
Abstract Radar for indoor monitoring is an emerging area of research and development, covering and supporting different health and wellbeing applications of smart homes, assisted living, and medical diagnosis. We report on a successful RF sensing system for home monitoring applications. The system recognizes Activities of Daily Living (ADL) and detects unique motion characteristics, using data processing and training algorithms. We also examine the challenges of continuously monitoring various human activities, which can be categorized into translation motions (active mode) and in-place motions (resting mode). We use the range-map, offered by a range-Doppler radar, to obtain the transition time between these two categories, characterized by changing and constant range values, respectively. This is achieved using the Radon transform, which identifies straight lines of different slopes in the range-map image. Over the in-place motion time intervals, where activities have insignificant or negligible range swath, a power threshold on the radar return micro-Doppler signatures is employed to define the time-spans of individual activities. Finding both the transition times and the time-spans of the different motions leads to improved classifications, as it avoids decisions rendered over time windows covering mixed activities.
Tasks Medical Diagnosis
Published 2020-03-21
URL https://arxiv.org/abs/2003.09699v1
PDF https://arxiv.org/pdf/2003.09699v1.pdf
PWC https://paperswithcode.com/paper/rf-sensing-for-continuous-monitoring-of-human

Evaluating Temporal Queries Over Video Feeds

Title Evaluating Temporal Queries Over Video Feeds
Authors Yueting Chen, Xiaohui Yu, Nick Koudas
Abstract Recent advances in Computer Vision and Deep Learning have made possible the efficient extraction of a schema from frames of streaming video. As such, a stream of objects and their associated classes, along with unique object identifiers derived via object tracking, can be generated, providing unique objects as they are captured across frames. In this paper we initiate a study of temporal queries involving objects and their co-occurrences in video feeds. For example, queries that identify video segments during which the same two red cars and the same two humans appear jointly for five minutes are of interest to many applications ranging from law enforcement to security and safety. We take the first step and define such queries in a way that incorporates certain physical aspects of video capture such as object occlusion. We present an architecture consisting of three layers, namely object detection/tracking, intermediate data generation and query evaluation. We propose two techniques, MFS and SSG, to organize all detected objects in the intermediate data generation layer, which, given the queries, effectively minimizes the number of objects and frames that have to be considered during query evaluation. We also introduce an algorithm called State Traversal (ST) that processes incoming frames against the SSG and efficiently prunes objects and frames unrelated to query evaluation, while maintaining all states required for succinct query evaluation. We present the results of a thorough experimental evaluation utilizing both real and synthetic data, establishing the trade-offs between MFS and SSG. We stress various parameters of interest in our evaluation and demonstrate that the proposed query evaluation methodology coupled with the proposed algorithms is capable of evaluating temporal queries over video feeds efficiently, achieving orders of magnitude performance benefits.
Tasks Object Detection, Object Tracking
Published 2020-03-02
URL https://arxiv.org/abs/2003.00953v3
PDF https://arxiv.org/pdf/2003.00953v3.pdf
PWC https://paperswithcode.com/paper/evaluating-temporal-queries-over-video-feeds
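
To make the kind of query described in the abstract concrete, the sketch below finds video segments in which a given set of tracked object identifiers appears jointly for a minimum number of consecutive frames. It is a naive linear scan written for illustration only: it is not MFS, SSG, or State Traversal, it ignores occlusion, and the function name, the frame representation (one set of object IDs per frame), and the sample IDs are all assumptions.

```python
def cooccurrence_segments(frames, required_ids, min_frames):
    """Return (start, end) frame indices, inclusive, of every maximal run of
    consecutive frames in which all required object IDs are visible and whose
    length is at least min_frames."""
    required = set(required_ids)
    segments = []
    start = None  # index where the current run began, or None if no run is open
    for i, visible in enumerate(frames):
        if required <= set(visible):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_frames:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(frames) - start >= min_frames:
        segments.append((start, len(frames) - 1))
    return segments


# Toy tracker output, invented for illustration: one set of object IDs per frame.
frames = [
    {"car1", "p1"},
    {"car1", "p1", "p2"},
    {"car1", "p1"},
    {"car1"},
    {"car1", "p1"},
]
long_runs = cooccurrence_segments(frames, ["car1", "p1"], min_frames=3)
all_runs = cooccurrence_segments(frames, ["car1", "p1"], min_frames=1)
```

A scan like this touches every frame and every object, which is the cost that the pruning structures proposed in the paper are meant to avoid.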

Unbiased Mean Teacher for Cross Domain Object Detection

Title Unbiased Mean Teacher for Cross Domain Object Detection
Authors Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan
Abstract Cross domain object detection is challenging, because object detection models are often vulnerable to data variance, especially to the considerable domain shift in cross domain scenarios. In this paper, we propose a new approach called Unbiased Mean Teacher (UMT) for cross domain object detection. While the simple mean teacher (MT) model exhibits good robustness to small data variance, it can also become easily biased in cross domain scenarios. We thus improve it with several simple yet highly effective strategies. In particular, we first propose a novel cross domain distillation for MT to maximally exploit the expertise of the teacher model. Then, we further alleviate the bias in the student model by augmenting training samples with pixel-level adaptation. Feature-level adversarial training is also incorporated to learn domain-invariant representations. These strategies can be implemented easily in MT and lead to our unbiased MT model. Our model surpasses the existing state-of-the-art models by large margins on benchmark datasets, which demonstrates the effectiveness of our approach.
Tasks Object Detection
Published 2020-03-02
URL https://arxiv.org/abs/2003.00707v1
PDF https://arxiv.org/pdf/2003.00707v1.pdf
PWC https://paperswithcode.com/paper/unbiased-mean-teacher-for-cross-domain-object
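
For readers unfamiliar with the mean teacher (MT) model that the abstract builds on: in the standard MT formulation, the teacher's weights are an exponential moving average (EMA) of the student's weights. The sketch below shows only that update, with weights held in a plain dictionary. The function name, the dictionary representation, and the value of alpha are assumptions for illustration, and none of the UMT-specific strategies (cross domain distillation, pixel-level adaptation, adversarial training) are shown.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student.

    teacher and student map parameter names to floats. A larger alpha makes
    the teacher change more slowly.
    """
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


# Toy weights, invented for illustration.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 2.0}
teacher = ema_update(teacher, student, alpha=0.9)
```

After one update with alpha set to 0.9, the teacher has moved one tenth of the way toward the student on each parameter.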

Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

Title Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes
Authors Andre Mendes, Julian Togelius, Leandro dos Santos Coelho
Abstract In multi-stage processes, decisions occur in an ordered sequence of stages. Early stages usually have more observations with general information (easier/cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual funnel structure, in which the sample size decreases from one stage to the other while the information increases. Training classifiers in this scenario is challenging since information in the early stages may not contain distinct patterns to learn (underfitting). In contrast, the small sample size in later stages can cause overfitting. We address both cases by introducing a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We improve the decoder of the AAE with an MTL component so it can jointly reconstruct the original input and use feature nets to predict the features for the next stages. We also introduce a sequence constraint in the output of an MLSSL classifier to guarantee the sequential pattern in the predictions. Using real-world data from different domains (selection process, medical diagnosis), we show that our approach outperforms other state-of-the-art methods.
Tasks Medical Diagnosis, Multi-Task Learning
Published 2020-03-15
URL https://arxiv.org/abs/2003.06899v1
PDF https://arxiv.org/pdf/2003.06899v1.pdf
PWC https://paperswithcode.com/paper/adversarial-encoder-multi-task-decoder-for

FEA-Net: A Physics-guided Data-driven Model for Efficient Mechanical Response Prediction

Title FEA-Net: A Physics-guided Data-driven Model for Efficient Mechanical Response Prediction
Authors Houpu Yao, Yi Gao, Yongming Liu
Abstract An innovative physics-guided learning algorithm for predicting the mechanical response of materials and structures is proposed in this paper. The key concept of the proposed study is based on the fact that physics models are governed by Partial Differential Equations (PDEs), and their loading/response mapping can be solved using Finite Element Analysis (FEA). Based on this, a special type of deep convolutional neural network (DCNN) is proposed that takes advantage of our prior knowledge in physics to build data-driven models whose architectures have physical meaning. This type of network is named FEA-Net and is used to solve the mechanical response under external loading. Thus, the identification of a mechanical system’s parameters and the computation of its responses are treated as the learning and inference of FEA-Net, respectively. Case studies on multi-physics (e.g., coupled mechanical-thermal analysis) and multi-phase problems (e.g., composite materials with random micro-structures) are used to demonstrate and verify the theoretical and computational advantages of the proposed method.
Published 2020-01-31
URL https://arxiv.org/abs/2002.01893v1
PDF https://arxiv.org/pdf/2002.01893v1.pdf
PWC https://paperswithcode.com/paper/fea-net-a-physics-guided-data-driven-model

VGAI: A Vision-Based Decentralized Controller Learning Framework for Robot Swarms

Title VGAI: A Vision-Based Decentralized Controller Learning Framework for Robot Swarms
Authors Ting-Kuei Hu, Fernando Gama, Zhangyang Wang, Alejandro Ribeiro, Brian M. Sadler
Abstract Despite the popularity of decentralized controller learning, very few successes have been demonstrated on learning to control large robot swarms using raw visual observations. To fill this gap, we present Vision-based Graph Aggregation and Inference (VGAI), a decentralized learning-to-control framework that directly maps raw visual observations to agent actions, aided by sparse local communication among only neighboring agents. Our framework is implemented by an innovative cascade of convolutional neural networks (CNNs) and one graph neural network (GNN), addressing agent-level visual perception and feature learning, as well as swarm-level local information aggregation and agent action inference, respectively. Using the application example of drone flocking, we show that VGAI yields performance comparable to or better than other decentralized controllers, and even the centralized controller that learns from global information. In particular, it scales well to learning over large swarms (e.g., 50 agents), thanks to the integration between visual perception and local communication.
Published 2020-02-06
URL https://arxiv.org/abs/2002.02308v1
PDF https://arxiv.org/pdf/2002.02308v1.pdf
PWC https://paperswithcode.com/paper/vgai-a-vision-based-decentralized-controller

Emo-CNN for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Title Emo-CNN for Perceiving Stress from Audio Signals: A Brain Chemistry Approach
Authors Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier
Abstract Emotion plays a key role in many applications such as healthcare, where patients’ emotional behavior is gathered. Certain emotions are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is defining the very meaning of stress and being able to categorize it in a precise manner. Supervised Machine Learning models, including state-of-the-art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation and emotion detection is the limited amount of annotated stress data. The existing labelled stress emotion datasets are highly subject to the perception of the annotator. We address the first issue of feature selection by exploiting traditional MFCC features in a Convolutional Neural Network. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. To tackle the second and more significant problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. The cube aims at explaining the relationship between neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), which is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.
Tasks Feature Selection
Published 2020-01-08
URL https://arxiv.org/abs/2001.02329v1
PDF https://arxiv.org/pdf/2001.02329v1.pdf
PWC https://paperswithcode.com/paper/emo-cnn-for-perceiving-stress-from-audio

Variational Learning of Individual Survival Distributions

Title Variational Learning of Individual Survival Distributions
Authors Zidi Xiu, Chenyang Tao, Ricardo Henao
Abstract The abundance of modern health data provides many opportunities for the use of machine learning techniques to build better statistical models to improve clinical decision making. Predicting time-to-event distributions, also known as survival analysis, plays a key role in many clinical applications. We introduce a variational time-to-event prediction model, named Variational Survival Inference (VSI), which builds upon recent advances in distribution learning techniques and deep neural networks. VSI addresses the challenges of non-parametric distribution estimation by (i) relaxing the restrictive modeling assumptions made in classical models, and (ii) efficiently handling the censored observations, i.e., events that occur outside the observation window, all within the variational framework. To validate the effectiveness of our approach, an extensive set of experiments on both synthetic and real-world datasets is carried out, showing improved performance relative to competing solutions.
Tasks Decision Making, Survival Analysis, Time-to-Event Prediction
Published 2020-03-09
URL https://arxiv.org/abs/2003.04430v1
PDF https://arxiv.org/pdf/2003.04430v1.pdf
PWC https://paperswithcode.com/paper/variational-learning-of-individual-survival
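
To illustrate what handling censored observations means in non-parametric survival estimation, the sketch below implements the classical Kaplan-Meier estimator. This is a well-known baseline swapped in for illustration, not the VSI model from the paper. The function name and the toy data are assumptions.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier survival curve.

    times: follow-up time of each subject.
    observed: True if the event was seen at that time, False if the subject
              was censored (left the study without the event being seen).
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)  # subjects still under observation just before time t
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Collect every subject whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= leaving
    return curve


# Toy data, invented for illustration: the subject at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

The censored subject contributes to the risk set at time 1 but not at time 3, so the estimate at time 3 is 0.75 * (1 - 1/2) rather than 0.75 * (1 - 1/3).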

Adaptive Object Detection with Dual Multi-Label Prediction

Title Adaptive Object Detection with Dual Multi-Label Prediction
Authors Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye
Abstract In this paper, we propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection by exploiting multi-label object recognition as a dual auxiliary task. The model exploits multi-label prediction to reveal the object category information in each image and then uses the prediction results to perform conditional adversarial global feature alignment, such that the multi-modal structure of image features can be tackled to bridge the domain divergence at the global feature level while preserving the discriminability of the features. Moreover, we introduce a prediction consistency regularization mechanism to assist object detection, which uses the multi-label prediction results as auxiliary regularization information to ensure consistent object category discoveries between the object recognition task and the object detection task. Experiments are conducted on a few benchmark datasets and the results show that the proposed model outperforms the state-of-the-art comparison methods.
Tasks Domain Adaptation, Object Detection, Object Recognition
Published 2020-03-29
URL https://arxiv.org/abs/2003.12943v1
PDF https://arxiv.org/pdf/2003.12943v1.pdf
PWC https://paperswithcode.com/paper/adaptive-object-detection-with-dual-multi

M^2 Deep-ID: A Novel Model for Multi-View Face Identification Using Convolutional Deep Neural Networks

Title M^2 Deep-ID: A Novel Model for Multi-View Face Identification Using Convolutional Deep Neural Networks
Authors Sara Shahsavarani, Morteza Analoui, Reza Shoja Ghiass
Abstract Despite significant advances in Deep Face Recognition (DFR) systems, introducing new DFRs under specific constraints such as varying pose remains a big challenge. In particular, due to the 3D nature of a human head, the facial appearance of the same subject introduces high intra-class variability when projected to the camera image plane. In this paper, we propose a new multi-view Deep Face Recognition (MVDFR) system to address this challenge. In this context, multiple 2D images of each subject under different views are fed into the proposed deep neural network with a unique design to re-express the facial features in a single and more compact face descriptor, which, in turn, provides a more informative and abstract way for face identification using convolutional neural networks. To extend the functionality of our proposed system to multi-view facial images, the gold standard Deep-ID model is modified in our proposed model. The experimental results indicate that our proposed method yields a 99.8% accuracy, while the state-of-the-art method achieves a 97% accuracy. We also gathered the Iran University of Science and Technology (IUST) face database with 6552 images of 504 subjects to accomplish our experiments.
Tasks Face Identification, Face Recognition
Published 2020-01-22
URL https://arxiv.org/abs/2001.07871v1
PDF https://arxiv.org/pdf/2001.07871v1.pdf
PWC https://paperswithcode.com/paper/m2-deep-id-a-novel-model-for-multi-view-face

Sampling for Deep Learning Model Diagnosis (Technical Report)

Title Sampling for Deep Learning Model Diagnosis (Technical Report)
Authors Parmita Mehta, Stephen Portillo, Magdalena Balazinska, Andrew Connolly
Abstract Deep learning (DL) models have achieved paradigm-changing performance in many fields with high dimensional data, such as images, audio, and text. However, the black-box nature of deep neural networks is not just a barrier to adoption in applications such as medical diagnosis, where interpretability is essential, but also impedes diagnosis of underperforming models. The task of diagnosing or explaining DL models requires the computation of additional artifacts, such as activation values and gradients. These artifacts are large in volume, and their computation, storage, and querying raise significant data management challenges. In this paper, we articulate DL diagnosis as a data management problem, and we propose a general, yet representative, set of queries to evaluate systems that strive to support this new workload. We further develop a novel data sampling technique that produces approximate but accurate results for these model debugging queries. Our sampling technique utilizes the lower dimension representation learned by the DL model and focuses on model decision boundaries for the data in this lower dimensional space. We evaluate our techniques on one standard computer vision and one scientific data set and demonstrate that our sampling technique outperforms a variety of state-of-the-art alternatives in terms of query accuracy.
Tasks Medical Diagnosis
Published 2020-02-22
URL https://arxiv.org/abs/2002.09754v1
PDF https://arxiv.org/pdf/2002.09754v1.pdf
PWC https://paperswithcode.com/paper/sampling-for-deep-learning-model-diagnosis

Hazard Detection in Supermarkets using Deep Learning on the Edge

Title Hazard Detection in Supermarkets using Deep Learning on the Edge
Authors M. G. Sarwar Murshed, Edward Verenich, James J. Carroll, Nazar Khan, Faraz Hussain
Abstract Supermarkets need to ensure clean and safe environments for both shoppers and employees. Slips, trips, and falls can result in injuries that have a physical as well as financial cost. Timely detection of hazardous conditions such as spilled liquids or fallen items on supermarket floors can reduce the chances of serious injuries. This paper presents EdgeLite, a novel, lightweight deep learning model for easy deployment and inference on resource-constrained devices. We describe the use of EdgeLite on two edge devices for detecting supermarket floor hazards. On a hazard detection dataset that we developed, EdgeLite, when deployed on edge devices, outperformed six state-of-the-art object detection models in terms of accuracy while having comparable memory usage and inference time.
Tasks Object Detection
Published 2020-02-29
URL https://arxiv.org/abs/2003.04116v1
PDF https://arxiv.org/pdf/2003.04116v1.pdf
PWC https://paperswithcode.com/paper/hazard-detection-in-supermarkets-using-deep

Learning Fair Scoring Functions: Fairness Definitions, Algorithms and Generalization Bounds for Bipartite Ranking

Title Learning Fair Scoring Functions: Fairness Definitions, Algorithms and Generalization Bounds for Bipartite Ranking
Authors Robin Vogel, Aurélien Bellet, Stéphan Clémençon
Abstract Many applications of artificial intelligence, ranging from credit lending to the design of medical diagnosis support tools through recidivism prediction, involve scoring individuals using a learned function of their attributes. These predictive risk scores are used to rank a set of people, and/or take individual decisions about them based on whether the score exceeds a certain threshold that may depend on the context in which the decision is taken. The level of delegation granted to such systems will heavily depend on how questions of fairness can be answered. While this concern has received a lot of attention in the classification setup, the design of relevant fairness constraints for the problem of learning scoring functions has not been much investigated. In this paper, we propose a flexible approach to group fairness for the scoring problem with binary labeled data, a standard learning task referred to as bipartite ranking. We argue that the functional nature of the ROC curve, the gold standard measuring ranking performance in this context, leads to several possible ways of formulating fairness constraints. We introduce general classes of fairness conditions in bipartite ranking and establish generalization bounds for scoring rules learned under such constraints. Beyond the theoretical formulation and results, we design practical learning algorithms and illustrate our approach with numerical experiments.
Tasks Medical Diagnosis
Published 2020-02-19
URL https://arxiv.org/abs/2002.08159v1
PDF https://arxiv.org/pdf/2002.08159v1.pdf
PWC https://paperswithcode.com/paper/learning-fair-scoring-functions-fairness
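
Since the abstract centers on the ROC curve as the measure of ranking performance, a small example may help. The sketch below computes the area under the ROC curve (AUC) as the fraction of positive/negative pairs that a scoring function orders correctly, and then computes it separately for each group. Comparing per-group AUCs is only one simple diagnostic in the spirit of the abstract. It is not claimed to match any of the paper's fairness definitions, and all names and data here are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(scores, labels, groups):
    """Compute the AUC separately within each group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Toy scores, invented for illustration.
scores = [0.9, 0.8, 0.3, 0.1, 0.6, 0.4, 0.7, 0.2]
labels = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
per_group = auc_by_group(scores, labels, groups)
```

In this toy data the scoring function ranks group "a" perfectly but ranks group "b" worse than chance. A single overall AUC would hide that gap.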

Sparse Recovery With Non-Linear Fourier Features

Title Sparse Recovery With Non-Linear Fourier Features
Authors Ayca Ozcelikkale
Abstract Random non-linear Fourier features have recently shown remarkable performance in a wide range of regression and classification applications. Motivated by this success, this article focuses on a sparse non-linear Fourier feature (NFF) model. We provide a characterization of the sufficient number of data points that guarantee perfect recovery of the unknown parameters with high probability. In particular, we show how the sufficient number of data points depends on the kernel matrix associated with the probability distribution function of the input data. We compare our results with the recoverability bounds for bounded orthonormal systems and provide examples that illustrate sparse recovery under the NFF model.
Published 2020-02-12
URL https://arxiv.org/abs/2002.04985v1
PDF https://arxiv.org/pdf/2002.04985v1.pdf
PWC https://paperswithcode.com/paper/sparse-recovery-with-non-linear-fourier