January 28, 2020

3281 words 16 mins read

Paper Group ANR 907

Improving land cover segmentation across satellites using domain adaptation. Controlling Style and Semantics in Weakly-Supervised Image Generation. DENS: A Dataset for Multi-class Emotion Analysis. Domain adversarial learning for emotion recognition. Training CNNs with Selective Allocation of Channels. Attention-based Convolutional Neural Network f …

Improving land cover segmentation across satellites using domain adaptation

Title Improving land cover segmentation across satellites using domain adaptation
Authors Nadir Bengana, Janne Heikkilä
Abstract Land use and land cover mapping are essential to various fields of study, including forestry, agriculture, and urban management. Using Earth observation satellites both facilitates and accelerates the task. Lately, deep learning methods have proven to be excellent at automating the mapping via semantic image segmentation. However, because deep neural networks require large amounts of labeled data, it is not easy to exploit the full potential of satellite imagery. Additionally, land cover tends to differ in appearance from one region to another; therefore, having labeled data from one location does not necessarily help in mapping others. Furthermore, satellite images come in various multispectral bands (ranging from RGB to over twelve bands). In this paper, we aim to use domain adaptation to solve the aforementioned problems. We applied a well-performing domain adaptation approach to datasets we built using RGB images from the Sentinel-2, WorldView-2, and Pleiades-1 satellites, with Corine Land Cover as ground-truth labels. We also used the DeepGlobe land cover dataset. Experiments show a significant improvement over results obtained without domain adaptation, in some cases exceeding 20% MIoU. At times the method even manages to correct errors in the ground-truth labels.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-11-25
URL https://arxiv.org/abs/1912.05000v2
PDF https://arxiv.org/pdf/1912.05000v2.pdf
PWC https://paperswithcode.com/paper/improving-land-cover-segmentation-across
Repo
Framework
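
For readers unfamiliar with the MIoU numbers quoted in the entry above, here is a minimal sketch of how mean intersection-over-union is computed from predicted and ground-truth label maps; the class count and the random label maps are purely illustrative.

```python
# Minimal MIoU sketch. The 5-class toy label maps below are assumptions for
# illustration, not data from the paper.
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union from integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        inter = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:                     # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pred = rng.integers(0, 5, size=(256, 256))    # hypothetical prediction
    target = rng.integers(0, 5, size=(256, 256))  # hypothetical ground truth
    print(f"MIoU: {mean_iou(pred, target, num_classes=5):.3f}")
```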

Controlling Style and Semantics in Weakly-Supervised Image Generation

Title Controlling Style and Semantics in Weakly-Supervised Image Generation
Authors Dario Pavllo, Aurelien Lucchi, Thomas Hofmann
Abstract We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene. We exploit sparse semantic maps to control object shapes and classes, as well as textual descriptions or attributes to control both local and global style. To further augment the controllability of the scene, we propose a two-step generation scheme that decomposes background and foreground. The label maps used to train our model are produced by a large-vocabulary object detector, which enables access to unlabeled sets of data and provides structured instance information. In such a setting, we report better FID scores compared to a fully-supervised setting where the model is trained on ground-truth semantic maps. We also showcase the ability of our model to manipulate a scene on complex datasets such as COCO and Visual Genome.
Tasks Conditional Image Generation, Image Generation
Published 2019-12-06
URL https://arxiv.org/abs/1912.03161v1
PDF https://arxiv.org/pdf/1912.03161v1.pdf
PWC https://paperswithcode.com/paper/controlling-style-and-semantics-in-weakly
Repo
Framework

DENS: A Dataset for Multi-class Emotion Analysis

Title DENS: A Dataset for Multi-class Emotion Analysis
Authors Chen Liu, Muhammad Osama, Anderson de Andrade
Abstract We introduce a new dataset for multi-class emotion analysis from long-form narratives in English. The Dataset for Emotions of Narrative Sequences (DENS) was collected from both classic literature available on Project Gutenberg and modern online narratives available on Wattpad, annotated using Amazon Mechanical Turk. A number of statistics and baseline benchmarks are provided for the dataset. Of the tested techniques, we find that the fine-tuning of a pre-trained BERT model achieves the best results, with an average micro-F1 score of 60.4%. Our results show that the dataset provides a novel opportunity in emotion analysis that requires moving beyond existing sentence-level techniques.
Tasks Emotion Recognition
Published 2019-10-25
URL https://arxiv.org/abs/1910.11769v1
PDF https://arxiv.org/pdf/1910.11769v1.pdf
PWC https://paperswithcode.com/paper/dens-a-dataset-for-multi-class-emotion
Repo
Framework
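
A quick sketch of the micro-averaged F1 metric reported in the entry above (60.4% for the fine-tuned BERT baseline). For single-label multi-class classification, micro-F1 pools true/false positives over all classes and reduces to accuracy; the toy emotion labels below are illustrative only.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over single-label multi-class predictions.
    TP/FP/FN are pooled across classes, so this reduces to accuracy here."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp += 1
        else:
            fp += 1   # the prediction is a false positive for its class
            fn += 1   # the true label is a false negative for its class
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# toy usage with hypothetical emotion labels
labels_true = ["joy", "sadness", "anger", "joy", "fear"]
labels_pred = ["joy", "joy", "anger", "joy", "sadness"]
print(f"micro-F1: {micro_f1(labels_true, labels_pred):.2f}")
```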

Domain adversarial learning for emotion recognition

Title Domain adversarial learning for emotion recognition
Authors Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang
Abstract In practical applications of emotion recognition, users do not always exist in the training corpus. The mismatch between training speakers and testing speakers affects the performance of the trained model. To deal with this problem, we need the model to focus on emotion-related information while ignoring differences between speaker identities. In this paper, we look into the use of the domain adversarial neural network (DANN) to extract a common representation across different speakers. The primary task is to predict emotion labels. The secondary task is to learn a common representation in which speaker identities cannot be distinguished. By using a gradient reversal layer, the gradients coming from the secondary task are used to bring the representations for different speakers closer. To verify the effectiveness of the proposed method, we conduct experiments on the IEMOCAP database. Experimental results demonstrate that the proposed framework shows an absolute improvement of 3.48% over state-of-the-art strategies.
Tasks Emotion Recognition
Published 2019-10-24
URL https://arxiv.org/abs/1910.13807v1
PDF https://arxiv.org/pdf/1910.13807v1.pdf
PWC https://paperswithcode.com/paper/domain-adversarial-learning-for-emotion
Repo
Framework
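
The gradient reversal layer mentioned in the abstract above is the core trick of DANN: identity in the forward pass, negated gradient in the backward pass. The sketch below is a generic PyTorch version; the encoder size, the 40-dimensional acoustic input, and the head layouts are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal DANN sketch with a gradient reversal layer (GRL).
# Layer sizes, input dimension, and class counts are illustrative assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)                 # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Negate (and scale) the gradient flowing back from the speaker head.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class DANN(nn.Module):
    def __init__(self, in_dim=40, feat_dim=128, n_emotions=4, n_speakers=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.emotion_head = nn.Linear(feat_dim, n_emotions)    # primary task
        self.speaker_head = nn.Linear(feat_dim, n_speakers)    # adversarial task

    def forward(self, x, lambd=1.0):
        h = self.encoder(x)
        return self.emotion_head(h), self.speaker_head(grad_reverse(h, lambd))
```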

Training CNNs with Selective Allocation of Channels

Title Training CNNs with Selective Allocation of Channels
Authors Jongheon Jeong, Jinwoo Shin
Abstract Recent progress in deep convolutional neural networks (CNNs) has enabled a simple paradigm of architecture design: larger models typically achieve better accuracy. As a result, it has become more important in modern CNN architectures to design models that generalize well under certain resource constraints, e.g., the number of parameters. In this paper, we propose a simple way to improve the capacity of any CNN model having large-scale features, without adding more parameters. In particular, we modify a standard convolutional layer to provide channel-selectivity, so that the layer is trained to select important channels and re-distribute its parameters toward them. Our experimental results on various CNN architectures and datasets demonstrate that the proposed convolutional layer allows new optima that generalize better via efficient resource utilization, compared to the baseline.
Tasks
Published 2019-05-11
URL https://arxiv.org/abs/1905.04509v1
PDF https://arxiv.org/pdf/1905.04509v1.pdf
PWC https://paperswithcode.com/paper/training-cnns-with-selective-allocation-of
Repo
Framework
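
The abstract above does not spell out how channel-selectivity is implemented, so the sketch below is only a generic channel-importance gate (squeeze-and-excitation style) to make the idea of weighting channels by learned importance concrete; it is not the paper's layer.

```python
# Generic channel-importance gating (squeeze-and-excitation style), shown only
# to illustrate the notion of learned channel selection. Not the paper's method.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))        # per-channel importance in [0, 1]
        return x * w.unsqueeze(-1).unsqueeze(-1)
```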

Attention-based Convolutional Neural Network for Weakly Labeled Human Activities Recognition with Wearable Sensors

Title Attention-based Convolutional Neural Network for Weakly Labeled Human Activities Recognition with Wearable Sensors
Authors Kun Wang, Jun He, Lei Zhang
Abstract Unlike image or video data, which can be easily labeled by humans, sensor data annotation is a time-consuming process. However, traditional methods of human activity recognition require a large amount of such strictly labeled data for training classifiers. In this paper, we present an attention-based convolutional neural network for human activity recognition from weakly labeled data. The proposed attention model can focus on the labeled activity within a long sequence of sensor data while filtering out a large amount of background noise signals. In experiments on the weakly labeled dataset, we show that our attention model outperforms classical deep learning methods in accuracy. In addition, we determine the specific locations of the labeled activity in a long sequence of weakly labeled data by converting the compatibility score generated by the attention model into a compatibility density. Our method greatly facilitates the process of sensor data annotation and makes data collection easier.
Tasks Activity Recognition, Human Activity Recognition
Published 2019-03-24
URL http://arxiv.org/abs/1903.10909v2
PDF http://arxiv.org/pdf/1903.10909v2.pdf
PWC https://paperswithcode.com/paper/attention-based-convolutional-neural-network-4
Repo
Framework
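
A minimal sketch of an attention-based 1-D CNN over a weakly labeled sensor window: the per-time-step scores play the role of the compatibility scores described above, and their softmax can be read as a density over where the activity occurs. Channel counts, kernel sizes, and class count are illustrative assumptions.

```python
# Attention over time steps of a sensor window; hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttentionHAR(nn.Module):
    def __init__(self, in_channels=3, hidden=64, n_classes=6):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.score = nn.Linear(hidden, 1)      # per-time-step compatibility score
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):                      # x: (N, C, T)
        h = self.conv(x).transpose(1, 2)       # (N, T, hidden)
        scores = self.score(h).squeeze(-1)     # (N, T): where the activity is
        attn = F.softmax(scores, dim=1)        # roughly the "compatibility density"
        pooled = (attn.unsqueeze(-1) * h).sum(dim=1)
        return self.classifier(pooled), attn
```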

Providentia – A Large Scale Sensing System for the Assistance of Autonomous Vehicles

Title Providentia – A Large Scale Sensing System for the Assistance of Autonomous Vehicles
Authors Annkathrin Krämmer, Christoph Schöller, Dhiraj Gulati, Alois Knoll
Abstract The environmental perception of autonomous vehicles is not only limited by physical sensor ranges and algorithmic performance; occlusions also degrade their understanding of the current traffic situation. This poses a great threat to safety, limits their driving speed, and can lead to inconvenient maneuvers that decrease their acceptance. Intelligent Transportation Systems can help to alleviate these problems. By providing autonomous vehicles with additional detailed information about the current traffic in the form of a digital model of their world, i.e. a digital twin, an Intelligent Transportation System can fill in the gaps in the vehicle’s perception and enhance its field of view. However, detailed descriptions of implementations of such a system and working prototypes demonstrating its feasibility are scarce. In this work, we propose a hardware and software architecture to build such a reliable Intelligent Transportation System. We have implemented this system in the real world and show that it is able to create an accurate digital twin of an extended highway stretch. Furthermore, we provide this digital twin to an autonomous vehicle and demonstrate how it extends the vehicle’s perception beyond the limits of its on-board sensors.
Tasks Autonomous Vehicles
Published 2019-06-16
URL https://arxiv.org/abs/1906.06789v3
PDF https://arxiv.org/pdf/1906.06789v3.pdf
PWC https://paperswithcode.com/paper/providentia-a-large-scale-sensing-system-for
Repo
Framework

Learning Stylized Character Expressions from Humans

Title Learning Stylized Character Expressions from Humans
Authors Deepali Aneja, Alex Colburn, Gary Faigin, Linda Shapiro, Barbara Mones
Abstract We present DeepExpr, a novel expression transfer system from humans to multiple stylized characters via deep learning. We developed: 1) a data-driven perceptual model of facial expressions, 2) a novel stylized character data set with cardinal expression annotations, FERG (Facial Expression Research Group)-DB (with two new characters added), and 3). We evaluated our method on a set of retrieval tasks on our collected stylized character dataset of expressions. We have also shown that the ranking order predicted by the proposed features is highly correlated with the ranking order provided by a facial expression expert and Mechanical Turk (MT) experiments.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08591v1
PDF https://arxiv.org/pdf/1911.08591v1.pdf
PWC https://paperswithcode.com/paper/learning-stylized-character-expressions-from
Repo
Framework

Asymmetric Residual Neural Network for Accurate Human Activity Recognition

Title Asymmetric Residual Neural Network for Accurate Human Activity Recognition
Authors Jun Long, WuQing Sun, Zhan Yang, Osolo Ian Raymond
Abstract Human Activity Recognition (HAR) using deep neural networks has become a hot topic in human-computer interaction. Machines can effectively identify naturalistic human activities by learning from a large collection of sensor data. Activity recognition is not only an interesting research problem but also has many real-world practical applications. Building on the success of residual networks in automatic representation learning, we propose a novel Asymmetric Residual Network, named ARN. ARN is implemented using two identical path frameworks consisting of (1) a short time window, which is used to capture spatial features, and (2) a long time window, which is used to capture fine temporal features. The long-time-window path can be made very lightweight by reducing its channel capacity, while still learning useful temporal representations for activity recognition. In this paper, we mainly focus on proposing a new model to improve the accuracy of HAR. In order to demonstrate the effectiveness of the ARN model, we carried out extensive experiments on benchmark datasets (i.e., OPPORTUNITY, UniMiB-SHAR) and compared it with conventional and state-of-the-art learning-based methods. We also discuss the influence of network parameters on performance to provide insights about its optimization. Results from our experiments show that ARN is effective in recognizing human activities from wearable-sensor datasets.
Tasks Activity Recognition, Human Activity Recognition
Published 2019-03-13
URL https://arxiv.org/abs/1903.05359v3
PDF https://arxiv.org/pdf/1903.05359v3.pdf
PWC https://paperswithcode.com/paper/dual-residual-network-for-accurate-human
Repo
Framework
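
A rough sketch of the two-path idea described above: a heavier branch on a short window and a lightweight, reduced-channel branch on a long window, fused before classification. The plain convolutional blocks below stand in for the authors' residual blocks and are assumptions, not their architecture.

```python
# Two-path fusion sketch; channel widths, kernel sizes, and class count are
# illustrative assumptions, and the blocks are not the paper's residual blocks.
import torch
import torch.nn as nn

class TwoPathHAR(nn.Module):
    def __init__(self, in_channels=3, n_classes=17):
        super().__init__()
        self.short = nn.Sequential(            # heavier path, short window
            nn.Conv1d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.long = nn.Sequential(             # fewer channels -> lightweight path
            nn.Conv1d(in_channels, 16, kernel_size=15, padding=7), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64 + 16, n_classes)

    def forward(self, x_short, x_long):        # (N, C, T_short), (N, C, T_long)
        f = torch.cat([self.short(x_short).flatten(1),
                       self.long(x_long).flatten(1)], dim=1)
        return self.head(f)
```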

Automatic Text Line Segmentation Directly in JPEG Compressed Document Images

Title Automatic Text Line Segmentation Directly in JPEG Compressed Document Images
Authors Bulla Rajesh, Mohammed Javed, P Nagabhushan
Abstract JPEG is one of the most popular image compression algorithms, providing efficient storage and transmission capabilities in consumer electronics, and hence it is the most preferred image format on the internet. In the present digital and Big-data era, a huge volume of JPEG compressed document images is archived and communicated through consumer electronics on a daily basis. While it is advantageous to have data in compressed form, processing it with off-the-shelf methods becomes computationally expensive because it requires decompression and recompression operations. Therefore, it would be novel and efficient if the compressed data were processed directly in their respective compressed domains. In this paper, we demonstrate this idea with the case study of printed text line segmentation. Since JPEG achieves compression by dividing the image into non-overlapping 8x8 blocks in the pixel domain and applying the Discrete Cosine Transform (DCT), it is very likely that the partitioned 8x8 DCT blocks overlap the contents of two adjacent text lines without leaving any clue for the line separator, making text-line segmentation a challenging problem. Two segmentation approaches are proposed here, using the DC projection profile and the AC coefficients of each 8x8 DCT block. The first approach is based on partial decompression of selected DCT blocks, and the second is based on intelligent analysis of the F10 and F11 AC coefficients without any decompression. The proposed methods have been tested with variable font sizes, font styles, and spacing between lines, and good performance is reported.
Tasks Image Compression
Published 2019-07-29
URL https://arxiv.org/abs/1907.12219v1
PDF https://arxiv.org/pdf/1907.12219v1.pdf
PWC https://paperswithcode.com/paper/automatic-text-line-segmentation-directly-in
Repo
Framework
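
A toy illustration of the DC projection-profile idea from the abstract above: treating each 8x8 block's DC coefficient as a proxy for block brightness, a horizontal profile over block rows separates dark text rows from bright gaps. Thresholding at the profile mean is an assumption for the sketch, not the paper's exact rule.

```python
# DC projection-profile sketch; the DC values and threshold are illustrative.
import numpy as np

def text_line_rows(dc_blocks, threshold=None):
    """Given the DC coefficient of each 8x8 DCT block on the block grid
    (rows x cols), return block-rows darker than the threshold (likely text)."""
    profile = dc_blocks.mean(axis=1)        # horizontal projection profile
    if threshold is None:
        threshold = profile.mean()
    return np.where(profile < threshold)[0]

# toy usage: three dark "text" block-rows between bright background rows
dc = np.full((7, 40), 200.0)
dc[[1, 3, 5]] = 80.0
print(text_line_rows(dc))                   # -> [1 3 5]
```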

Sample-efficient Deep Reinforcement Learning with Imaginary Rollouts for Human-Robot Interaction

Title Sample-efficient Deep Reinforcement Learning with Imaginary Rollouts for Human-Robot Interaction
Authors Mohammad Thabet, Massimiliano Patacchiola, Angelo Cangelosi
Abstract Deep reinforcement learning has proven to be a great success in allowing agents to learn complex tasks. However, its application to actual robots can be prohibitively expensive. Furthermore, the unpredictability of human behavior in human-robot interaction tasks can hinder convergence to a good policy. In this paper, we present an architecture that allows agents to learn models of stochastic environments and use them to accelerate learning. We describe how an environment model can be learned online and used to generate synthetic transitions, as well as how an agent can leverage these synthetic data to accelerate learning. We validate our approach using an experiment in which a robotic arm has to complete a task composed of a series of actions based on human gestures. Results show that our approach leads to significantly faster learning, requiring much less interaction with the environment. Furthermore, we demonstrate how learned models can be used by a robot to produce optimal plans in real world applications.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05546v1
PDF https://arxiv.org/pdf/1908.05546v1.pdf
PWC https://paperswithcode.com/paper/sample-efficient-deep-reinforcement-learning-4
Repo
Framework
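
The idea of generating synthetic transitions from a learned model is Dyna-style planning. The sketch below is a tabular, deterministic-model version to keep it short; the paper itself learns a deep stochastic environment model, so treat this only as an illustration of the principle.

```python
# Dyna-style sketch: after each real transition, replay "imaginary" transitions
# drawn from a learned model. Tabular and deterministic for brevity only.
import random
from collections import defaultdict

Q = defaultdict(float)
model = {}                      # (s, a) -> (r, s') learned from real experience
alpha, gamma, n_planning = 0.1, 0.95, 10

def q_update(s, a, r, s_next, actions):
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def step_real(s, a, r, s_next, actions):
    q_update(s, a, r, s_next, actions)
    model[(s, a)] = (r, s_next)             # update the (deterministic) model
    for _ in range(n_planning):             # imaginary rollouts from the model
        (ms, ma), (mr, ms_next) = random.choice(list(model.items()))
        q_update(ms, ma, mr, ms_next, actions)

# toy usage with hypothetical states and two actions
actions = [0, 1]
step_real("s0", 0, 1.0, "s1", actions)      # one real step plus 10 imaginary updates
print(Q[("s0", 0)])
```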

Does Preference Always Help? A Holistic Study on Preference-Based Evolutionary Multi-Objective Optimisation Using Reference Points

Title Does Preference Always Help? A Holistic Study on Preference-Based Evolutionary Multi-Objective Optimisation Using Reference Points
Authors Ke Li, Minhui Liao, Kalyanmoy Deb, Geyong Min, Xin Yao
Abstract The ultimate goal of multi-objective optimisation is to help a decision maker (DM) identify solution(s) of interest (SOI) that achieve satisfactory trade-offs among multiple conflicting criteria. This can be realised by leveraging the DM’s preference information in evolutionary multi-objective optimisation (EMO). No consensus has been reached on the effectiveness of incorporating preference in EMO (either a priori or interactively) versus a posteriori decision making after a complete run of an EMO algorithm. Bearing this consideration in mind, this paper i) provides a pragmatic overview of the existing developments of preference-based EMO; and ii) conducts a series of experiments to investigate the effectiveness brought by preference incorporation in EMO for approximating various SOI. In particular, the DM’s preference information is elicited as a reference point, which represents her/his aspirations for the different objectives. Experimental results demonstrate that preference incorporation in EMO does not always lead to a desirable approximation of SOI if the DM’s preference information is not well utilised or if the DM elicits invalid preference information, which is not uncommon when encountering a black-box system. To a certain extent, this issue can be remedied through interactive preference elicitation. Last but not least, we find that a preference-based EMO algorithm can be generalised to approximate the whole Pareto front (PF) given an appropriate setup of preference information.
Tasks Decision Making
Published 2019-09-30
URL https://arxiv.org/abs/1909.13567v1
PDF https://arxiv.org/pdf/1909.13567v1.pdf
PWC https://paperswithcode.com/paper/does-preference-always-help-a-holistic-study
Repo
Framework
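
One standard way to turn a decision maker's reference point into a search driver is an achievement scalarizing function, sketched below; the weights, augmentation term, and toy objective vectors are assumptions for illustration, not the specific mechanism studied in the paper.

```python
# Achievement scalarizing function sketch; all numbers below are illustrative.
import numpy as np

def achievement_scalarizing(f, ref, weights, rho=1e-4):
    """Chebyshev-style achievement scalarizing function: smaller is better for
    solutions close to (or dominating) the reference point (minimisation)."""
    diff = weights * (f - ref)
    return diff.max() + rho * diff.sum()

# toy usage: two objectives, DM aspires to (0.2, 0.3)
ref = np.array([0.2, 0.3])
w = np.array([1.0, 1.0])
candidates = np.array([[0.25, 0.35], [0.10, 0.80], [0.50, 0.20]])
scores = [achievement_scalarizing(f, ref, w) for f in candidates]
print(candidates[int(np.argmin(scores))])   # solution best matching the aspiration
```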

On the Evaluation Metric for Hashing

Title On the Evaluation Metric for Hashing
Authors Qing-Yuan Jiang, Ming-Wei Li, Wu-Jun Li
Abstract Due to its low storage cost and fast query speed, hashing has been widely used for large-scale approximate nearest neighbor (ANN) search. Bucket search, also called hash lookup, can achieve fast query speed with a sub-linear time cost based on the inverted index table constructed from hash codes. Many metrics have been adopted to evaluate hashing algorithms. However, all existing metrics are improper to evaluate the hash codes for bucket search. On one hand, all existing metrics ignore the retrieval time cost which is an important factor reflecting the performance of search. On the other hand, some of them, such as mean average precision (MAP), suffer from the uncertainty problem as the ranked list is based on integer-valued Hamming distance, and are insensitive to Hamming radius as these metrics only depend on relative Hamming distance. Other metrics, such as precision at Hamming radius R, fail to evaluate global performance as these metrics only depend on one specific Hamming radius. In this paper, we first point out the problems of existing metrics which have been ignored by the hashing community, and then propose a novel evaluation metric called radius aware mean average precision (RAMAP) to evaluate hash codes for bucket search. Furthermore, two coding strategies are also proposed to qualitatively show the problems of existing metrics. Experiments demonstrate that our proposed RAMAP can provide more proper evaluation than existing metrics.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.10951v1
PDF https://arxiv.org/pdf/1905.10951v1.pdf
PWC https://paperswithcode.com/paper/on-the-evaluation-metric-for-hashing
Repo
Framework
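
For context on what bucket search (hash lookup) does in the entry above, here is a toy sketch: index binary codes in a table and probe every bucket within Hamming radius R of the query. The 4-bit codes are illustrative; the paper's RAMAP metric itself is not reproduced here.

```python
# Bucket search (hash lookup) sketch over toy 4-bit binary codes.
from itertools import combinations
from collections import defaultdict

def build_table(codes):
    table = defaultdict(list)
    for idx, c in enumerate(codes):
        table[c].append(idx)
    return table

def bucket_search(table, query, n_bits, radius):
    hits = list(table.get(query, []))
    for r in range(1, radius + 1):
        for bits in combinations(range(n_bits), r):   # flip r bits of the query
            probe = query
            for b in bits:
                probe ^= 1 << b
            hits.extend(table.get(probe, []))
    return hits

codes = [0b1010, 0b1011, 0b0110, 0b1110]              # toy database of hash codes
table = build_table(codes)
print(bucket_search(table, 0b1010, n_bits=4, radius=1))   # -> [0, 1, 3]
```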

Online Human Activity Recognition Employing Hierarchical Hidden Markov Models

Title Online Human Activity Recognition Employing Hierarchical Hidden Markov Models
Authors Parviz Asghari, Elnaz Soelimani, Ehsan Nazerfard
Abstract In the last few years there has been growing interest in the topic of Human Activity Recognition (HAR). Sensor-based HAR approaches, in particular, have been gaining popularity owing to their privacy-preserving nature. Furthermore, due to the widespread accessibility of the internet, a broad range of streaming-based applications, such as online HAR, has emerged over the past decades. However, proposing a sufficiently robust online activity recognition approach in a smart-environment setting is still a considerable challenge. This paper presents a novel online application of the Hierarchical Hidden Markov Model to detect the current activity on a live stream of sensor events. Our method consists of two phases. In the first phase, the data stream is segmented based on the beginning and ending of activity patterns, and the ongoing activity is reported with every received observation. This phase is implemented using Hierarchical Hidden Markov Models. The second phase is devoted to correcting the label provided for the segmented data stream based on statistical features. The proposed model can also discover activities that happen during another activity, so-called interrupted activities. After detecting the activity pane, the predicted label is corrected utilizing statistical features such as the time of day at which the activity happened and the duration of the activity. We validated our proposed method by testing it against two different smart home datasets and demonstrated its effectiveness, which is competitive with state-of-the-art methods.
Tasks Activity Recognition, Human Activity Recognition
Published 2019-03-12
URL http://arxiv.org/abs/1903.04820v1
PDF http://arxiv.org/pdf/1903.04820v1.pdf
PWC https://paperswithcode.com/paper/online-human-activity-recognition-employing
Repo
Framework
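
The paper above uses a hierarchical HMM; the sketch below only illustrates the online-filtering part with a flat two-state HMM, reporting the most likely activity after every sensor event. The transition and emission matrices are made-up numbers.

```python
# Online filtering sketch with a flat HMM (the paper's model is hierarchical).
import numpy as np

A = np.array([[0.9, 0.1],       # transition matrix over two toy activities
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],       # emission matrix: P(sensor event | activity)
              [0.1, 0.9]])
belief = np.array([0.5, 0.5])   # prior over activities

def filter_step(belief, obs):
    belief = B[:, obs] * (A.T @ belief)   # predict, then weight by the observation
    return belief / belief.sum()

for obs in [0, 0, 1, 1, 1]:               # stream of discretized sensor events
    belief = filter_step(belief, obs)
    print("current activity:", int(np.argmax(belief)), belief.round(3))
```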

Big Math and the One-Brain Barrier: A Position Paper and Architecture Proposal

Title Big Math and the One-Brain Barrier: A Position Paper and Architecture Proposal
Authors Jacques Carette, William M. Farmer, Michael Kohlhase, Florian Rabe
Abstract Over the last decades, a class of important mathematical results has required an ever-increasing amount of human effort to carry out. For some, the help of computers is now indispensable. We analyze the implications of this trend towards “big mathematics”, its relation to human cognition, and how machine support for big math can be organized. The central contribution of this position paper is an information model for “doing mathematics”, which posits that humans very efficiently integrate four aspects: inference, computation, tabulation, and narration, around a well-organized core of mathematical knowledge. The challenge for mathematical software systems is that these four aspects need to be integrated as well. We briefly survey the state of the art.
Tasks
Published 2019-04-23
URL https://arxiv.org/abs/1904.10405v2
PDF https://arxiv.org/pdf/1904.10405v2.pdf
PWC https://paperswithcode.com/paper/big-math-and-the-one-brain-barrier-a-position
Repo
Framework