January 28, 2020

Paper Group ANR 782

Boundless: Generative Adversarial Networks for Image Extension


Title	Boundless: Generative Adversarial Networks for Image Extension
Authors	Piotr Teterwak, Aaron Sarna, Dilip Krishnan, Aaron Maschinot, David Belanger, Ce Liu, William T. Freeman
Abstract	Image extension models have broad applications in image editing, computational photography and computer graphics. While image inpainting has been extensively studied in the literature, it is challenging to directly apply the state-of-the-art inpainting methods to image extension as they tend to generate blurry or repetitive pixels with inconsistent semantics. We introduce semantic conditioning to the discriminator of a generative adversarial network (GAN), and achieve strong results on image extension with coherent semantics and visually pleasing colors and textures. We also show promising results in extreme extensions, such as panorama generation.
Tasks	Image Inpainting
Published	2019-08-19
URL	https://arxiv.org/abs/1908.07007v1
PDF	https://arxiv.org/pdf/1908.07007v1.pdf
PWC	https://paperswithcode.com/paper/boundless-generative-adversarial-networks-for
Repo
Framework

Respect Your Emotion: Human-Multi-Robot Teaming based on Regret Decision Model


Title	Respect Your Emotion: Human-Multi-Robot Teaming based on Regret Decision Model
Authors	Longsheng Jiang, Yue Wang
Abstract	Models of human decision-making in human-robot teaming often ignore the emotional aspect of the human. Nevertheless, the influence of emotion, in some cases, is not only undeniable but beneficial. This work studies the human-like characteristics brought by the emotion of regret in one-human-multi-robot teaming for the application of domain search. In this application, the task management load is outsourced to the robots to reduce the human’s workload, freeing the human to do more important work. The regret decision model is first used by each robot to decide whether to request human service, and is then extended to optimally queue the requests from multiple robots. For the movement of the robots in the domain search, we designed a path planning algorithm based on dynamic programming for each robot. The simulation shows that the human-like characteristics, namely risk-seeking and risk-aversion, indeed bring some appealing effects for balancing workload and performance in the human-multi-robot team.
Tasks	Decision Making
Published	2019-09-18
URL	https://arxiv.org/abs/1910.00087v1
PDF	https://arxiv.org/pdf/1910.00087v1.pdf
PWC	https://paperswithcode.com/paper/respect-your-emotion-human-multi-robot
Repo
Framework

Sparsity Constrained Distributed Unmixing of Hyperspectral Data


Title	Sparsity Constrained Distributed Unmixing of Hyperspectral Data
Authors	Sara Khoshsokhan, Roozbeh Rajabi, Hadi Zayyani
Abstract	Spectral unmixing (SU) is a technique to characterize mixed pixels in hyperspectral images measured by remote sensors. Most spectral unmixing algorithms are developed using linear mixing models. To estimate endmembers and fractional abundance matrices in a blind problem, nonnegative matrix factorization (NMF) and its extensions are widely used in the SU problem. One of the constraints added to NMF is sparsity, regularized by the Lq norm. In this paper, a new algorithm based on distributed optimization is suggested for spectral unmixing. In the proposed algorithm, a network of single-node clusters is employed. Each pixel in the hyperspectral image is considered a node in this network. The sparsity constrained distributed unmixing is optimized with a diffusion least mean p-power (LMP) strategy, and then the update equations for the fractional abundance and signature matrices are obtained. Afterwards, the proposed algorithm is analyzed for different values of the LMP power and Lq norm. Simulation results based on defined performance metrics illustrate the advantage of the proposed algorithm in spectral unmixing of hyperspectral data compared with other methods.
Tasks	Distributed Optimization
Published	2019-02-20
URL	http://arxiv.org/abs/1902.07593v1
PDF	http://arxiv.org/pdf/1902.07593v1.pdf
PWC	https://paperswithcode.com/paper/sparsity-constrained-distributed-unmixing-of
Repo
Framework
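
As a rough illustration of two ideas named in the abstract above, the sketch below shows the linear mixing model (a pixel spectrum as an abundance-weighted sum of endmember spectra) and an Lq-style sparsity penalty. It is not the authors' distributed algorithm; the function names `mix_pixel` and `lq_penalty`, the choice q = 0.5, and all numbers are assumptions made for this example.

```python
def mix_pixel(endmembers, abundances):
    """Linear mixing model: each band is the abundance-weighted sum of endmember values."""
    n_bands = len(endmembers[0])
    return [
        sum(a * e[band] for a, e in zip(abundances, endmembers))
        for band in range(n_bands)
    ]


def lq_penalty(abundances, q=0.5):
    """Sum of |a|^q over the abundance vector; for 0 < q < 1 it favors sparse vectors."""
    return sum(abs(a) ** q for a in abundances)


# Two hypothetical endmember spectra over three bands (made-up numbers).
endmembers = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
pixel = mix_pixel(endmembers, [0.25, 0.75])

# Both abundance vectors sum to one, but the sparse one is penalized less.
sparse_cost = lq_penalty([1.0, 0.0])
dense_cost = lq_penalty([0.5, 0.5])
```

With q below 1, concentrating the abundance in a single endmember costs less than spreading it evenly, which is the sense in which such a penalty encourages sparse abundances.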

Faster Unsupervised Semantic Inpainting: A GAN Based Approach


Title	Faster Unsupervised Semantic Inpainting: A GAN Based Approach
Authors	Avisek Lahiri, Arnav Kumar Jain, Divyasri Nadendla, Prabir Kumar Biswas
Abstract	In this paper, we propose to improve the inference speed and visual quality of the contemporary baseline for Generative Adversarial Network (GAN) based unsupervised semantic inpainting. This is made possible by better initialization of the core iterative optimization involved in the framework. To the best of our knowledge, this is also the first attempt at GAN based video inpainting that considers temporal cues. On single image inpainting, we achieve about a 4.5-5× speedup, and 80× on videos, compared to the baseline. At the same time, our method has better spatial and temporal reconstruction quality, as found on three image datasets and one video dataset.
Tasks	Image Inpainting, Video Inpainting
Published	2019-08-14
URL	https://arxiv.org/abs/1908.04968v1
PDF	https://arxiv.org/pdf/1908.04968v1.pdf
PWC	https://paperswithcode.com/paper/faster-unsupervised-semantic-inpainting-a-gan
Repo
Framework

Density-Aware Convolutional Networks with Context Encoding for Airborne LiDAR Point Cloud Classification


Title	Density-Aware Convolutional Networks with Context Encoding for Airborne LiDAR Point Cloud Classification
Authors	Xiang Li, Mingyang Wang, Congcong Wen, Lingjing Wang, Nan Zhou, Yi Fang
Abstract	To better address the challenging issues of irregularity and inhomogeneity inherently present in 3D point clouds, researchers have been shifting their focus from the design of hand-crafted point features towards the learning of 3D point signatures using deep neural networks for 3D point cloud classification. Recently proposed deep learning based point cloud classification methods either apply 2D CNNs on projected feature images or apply 1D convolutional layers directly on raw point sets. These methods cannot adequately recognize fine-grained local structures caused by the uneven density distribution of the point cloud data. In this paper, to address this challenging issue, we introduced a density-aware convolution module which uses the point-wise density to re-weight the learnable weights of convolution kernels. The proposed convolution module is able to fully approximate the 3D continuous convolution on unevenly distributed 3D point sets. Based on this convolution module, we further developed a multi-scale fully convolutional neural network with downsampling and upsampling blocks to enable hierarchical point feature learning. In addition, to regularize the global semantic context, we implemented a context encoding module to predict a global context encoding and formulated a context encoding regularizer to enforce that the predicted context encoding aligns with the ground truth one. The overall network can be trained in an end-to-end fashion with the raw 3D coordinates as well as the height above ground as inputs. Experiments on the International Society for Photogrammetry and Remote Sensing (ISPRS) 3D labeling benchmark demonstrated the superiority of the proposed method for point cloud classification. Our model achieved a new state-of-the-art performance with an average F1 score of 71.2% and improved the performance by a large margin on several categories.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.05909v1
PDF	https://arxiv.org/pdf/1910.05909v1.pdf
PWC	https://paperswithcode.com/paper/density-aware-convolutional-networks-with
Repo
Framework

Predictive modeling of brain tumor: A Deep learning approach


Title	Predictive modeling of brain tumor: A Deep learning approach
Authors	Priyansh Saxena, Akshat Maheshwari, Shivani Tayal, Saumil Maheshwari
Abstract	Image processing techniques can visualize the different anatomical structures of the human body. Recent advancements in the field of deep learning have made it possible to detect the growth of cancerous tissue from a patient’s brain Magnetic Resonance Imaging (MRI) scans alone. These methods require very high accuracy and very low false negative rates to be of any practical use. This paper presents a Convolutional Neural Network (CNN) based transfer learning approach to classify brain MRI scans into two classes using three pre-trained models. The performances of these models are compared with each other. Experimental results show that the ResNet-50 model achieves the highest accuracy and the lowest false negative rate, at 95% and zero respectively. It is followed by the VGG-16 and Inception-V3 models, with accuracies of 90% and 55% respectively.
Tasks	Transfer Learning
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02265v4
PDF	https://arxiv.org/pdf/1911.02265v4.pdf
PWC	https://paperswithcode.com/paper/predictive-modeling-of-brain-tumor-a-deep
Repo
Framework
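
The abstract above reports accuracy and false negative rate for a two-class problem. The following minimal sketch shows how these two metrics are commonly computed from true and predicted labels. The function name `accuracy_and_fnr` and the toy labels are assumptions for illustration and are not data from the paper.

```python
def accuracy_and_fnr(y_true, y_pred, positive=1):
    """Return (accuracy, false negative rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = correct / len(pairs)
    # FNR is the share of actual positives that the model missed.
    actual_pos = true_pos + false_neg
    fnr = false_neg / actual_pos if actual_pos else 0.0
    return accuracy, fnr


# Toy labels: 1 = tumor, 0 = no tumor (hypothetical values).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 1, 1]
accuracy, fnr = accuracy_and_fnr(y_true, y_pred)
```

In this toy case one healthy scan is flagged as a tumor, which lowers accuracy but leaves the false negative rate at zero, since no actual tumor is missed.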

From News to Medical: Cross-domain Discourse Segmentation


Title	From News to Medical: Cross-domain Discourse Segmentation
Authors	Elisa Ferracane, Titan Page, Junyi Jessy Li, Katrin Erk
Abstract	The first step in discourse analysis involves dividing a text into segments. We annotate the first high-quality small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. While we expectedly find a drop in performance, the nature of the segmentation errors suggests some problems can be addressed earlier in the pipeline, while others would require expanding the corpus to a trainable size to learn the nuances of the medical domain.
Tasks
Published	2019-04-14
URL	http://arxiv.org/abs/1904.06682v1
PDF	http://arxiv.org/pdf/1904.06682v1.pdf
PWC	https://paperswithcode.com/paper/from-news-to-medical-cross-domain-discourse
Repo
Framework

Neural Architectures for Fine-Grained Propaganda Detection in News


Title	Neural Architectures for Fine-Grained Propaganda Detection in News
Authors	Pankaj Gupta, Khushbu Saxena, Usama Yaseen, Thomas Runkler, Hinrich Schütze
Abstract	This paper describes the details of our system (MIC-CIS) and the results of our participation in the fine-grained propaganda detection shared task 2019. To address the tasks of sentence (SLC) and fragment level (FLC) propaganda detection, we explore different neural architectures (e.g., CNN, LSTM-CRF and BERT) and extract linguistic (e.g., part-of-speech, named entity, readability, sentiment, emotion, etc.), layout and topical features. Specifically, we have designed multi-granularity and multi-tasking neural architectures to jointly perform both the sentence and fragment level propaganda detection. Additionally, we investigate different ensemble schemes such as majority-voting, relax-voting, etc. to boost overall system performance. Compared to the other participating systems, our submissions are ranked 3rd and 4th in FLC and SLC tasks, respectively.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06162v1
PDF	https://arxiv.org/pdf/1909.06162v1.pdf
PWC	https://paperswithcode.com/paper/neural-architectures-for-fine-grained
Repo
Framework
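
Of the ensemble schemes mentioned in the abstract above, majority voting is the simplest to illustrate. The sketch below combines sentence-level labels from several classifiers. It is a generic version, not the MIC-CIS implementation; the label strings, the `tie_label` fallback, and the example outputs are assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions, tie_label="non-propaganda"):
    """Return the most frequent label; fall back to tie_label when the top two tie."""
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


# Hypothetical outputs of three classifiers on the same three sentences.
model_outputs = [
    ["propaganda", "non-propaganda", "propaganda"],
    ["propaganda", "non-propaganda", "non-propaganda"],
    ["non-propaganda", "non-propaganda", "propaganda"],
]

# zip(*...) regroups the labels per sentence before voting.
ensemble = [majority_vote(votes) for votes in zip(*model_outputs)]
```

The tie rule is a design choice: with an even number of classifiers, some default label has to be picked, and which one is safer depends on whether precision or recall matters more for the task.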

The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots


Title	The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots
Authors	Fabio Cermelli, Massimiliano Mancini, Elisa Ricci, Barbara Caputo
Abstract	Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, ranging from object detection and recognition to pose estimation, semantic scene segmentation and many others. Still, most approaches typically address visual tasks in isolation, resulting in overspecialized models which achieve strong performance in specific applications but work poorly in other (often related) tasks. This is clearly sub-optimal for a robot, which is often required to perform multiple visual recognition tasks simultaneously in order to properly act and interact with the environment. This problem is exacerbated by the limited computational and memory resources typically available onboard a robotic platform. The problem of learning flexible models which can handle multiple tasks in a lightweight manner has recently gained attention in the computer vision community, and benchmarks supporting this research have been proposed. In this work we study this problem in the robot vision context, proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art algorithms in this novel, challenging scenario. We also define a new evaluation protocol, better suited to the robot vision setting. Results shed light on the strengths and weaknesses of existing approaches and on open issues, suggesting directions for future research.
Tasks	Object Detection, Pose Estimation, Scene Segmentation
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00912v2
PDF	http://arxiv.org/pdf/1904.00912v2.pdf
PWC	https://paperswithcode.com/paper/the-rgb-d-triathlon-towards-agile-visual
Repo
Framework

Segmentation Mask Guided End-to-End Person Search


Title	Segmentation Mask Guided End-to-End Person Search
Authors	Dingyuan Zheng, Jimin Xiao, Kaizhu Huang, Yao Zhao
Abstract	Person search aims to find a target person among multiple images recorded by multiple surveillance cameras, and faces various challenges from both pedestrian detection and person re-identification. Besides the large intra-class variations owing to various illumination conditions, occlusions and varying poses, background clutter in the detected pedestrian bounding boxes further deteriorates the extracted features for each person, making them less discriminative. To tackle these problems, we develop a novel approach which guides the network with segmentation masks so that discriminative features can be learned that are invariant to the background clutter. We demonstrate that joint optimization of pedestrian detection, person re-identification and pedestrian segmentation produces more discriminative features for pedestrians, and consequently leads to better person search performance. Extensive experiments on the benchmark dataset CUHK-SYSU show that our proposed model achieves state-of-the-art performance with 86.3% mAP and 86.5 top-1 accuracy.
Tasks	Pedestrian Detection, Person Re-Identification, Person Search
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10179v1
PDF	https://arxiv.org/pdf/1908.10179v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-mask-guided-end-to-end-person
Repo
Framework
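
The abstract above reports mAP and top-1 accuracy. As a hedged illustration, the sketch below computes both from ranked retrieval results in the standard information-retrieval way. It assumes each query comes with a ranked list of 0/1 relevance flags that contains every relevant gallery item; the actual CUHK-SYSU protocol also involves matching detected boxes, which is not modeled here. The function names and toy data are made up for this example.

```python
def average_precision(ranked_relevance):
    """Mean of the precision values taken at each rank where a relevant item appears."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(queries):
    """Return (mAP, top-1 accuracy) over a list of ranked relevance lists."""
    mean_ap = sum(average_precision(q) for q in queries) / len(queries)
    top1 = sum(1 for q in queries if q and q[0]) / len(queries)
    return mean_ap, top1


# Two hypothetical queries: 1 marks a correct match at that rank.
queries = [[1, 0, 1], [0, 1, 0]]
mean_ap, top1 = evaluate(queries)
```

The two metrics answer different questions: top-1 only checks whether the best-ranked result is correct, while mAP rewards placing all correct matches near the top of the list.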

Towards Standardization of Data Licenses: The Montreal Data License


Title	Towards Standardization of Data Licenses: The Montreal Data License
Authors	Misha Benjamin, Paul Gagnon, Negar Rostamzadeh, Chris Pal, Yoshua Bengio, Alex Shee
Abstract	This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper’s goal is to build towards a common framework for data licensing akin to the licensing of open source software. Increased transparency and resolving conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data through bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper’s approach is summarized in a new family of data license language - the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper.
Tasks
Published	2019-03-21
URL	http://arxiv.org/abs/1903.12262v1
PDF	http://arxiv.org/pdf/1903.12262v1.pdf
PWC	https://paperswithcode.com/paper/towards-standardization-of-data-licenses-the
Repo
Framework

Encoder-Decoder based CNN and Fully Connected CRFs for Remote Sensed Image Segmentation


Title	Encoder-Decoder based CNN and Fully Connected CRFs for Remote Sensed Image Segmentation
Authors	Vikas Agaradahalli Gurumurthy
Abstract	With the advancement of remote-sensed imaging, large volumes of very high resolution land cover images can now be obtained. Automation of object recognition in these 2D images, however, is still a key issue. High intra-class variance and low inter-class variance in Very High Resolution (VHR) images hamper the accuracy of prediction in object recognition tasks. The most successful recent techniques in various computer vision tasks are based on deep supervised learning. In this work, a deep Convolutional Neural Network (CNN) based on a symmetric encoder-decoder architecture with skip connections is employed for the 2D semantic segmentation of the most common land cover object classes - impervious surfaces, buildings, low vegetation, trees and cars. Atrous convolutions are employed to obtain a large receptive field in the proposed CNN model. Further, the CNN outputs are post-processed using a Fully Connected Conditional Random Field (FCRF) model to refine the CNN pixel label predictions. The proposed CNN-FCRF model achieves an overall accuracy of 90.5% on the ISPRS Vaihingen Dataset.
Tasks	Object Recognition, Semantic Segmentation
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06041v1
PDF	https://arxiv.org/pdf/1910.06041v1.pdf
PWC	https://paperswithcode.com/paper/encoder-decoder-based-cnn-and-fully-connected
Repo
Framework
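
The abstract above relies on atrous (dilated) convolutions to enlarge the receptive field. The one-dimensional sketch below illustrates the idea: spacing the kernel taps `dilation` samples apart widens the span a kernel covers without adding weights. It is a toy illustration and not the paper's network; the function names and inputs are assumptions.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are spaced `dilation` samples apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation, as in deep learning)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[start + j * dilation] for j, w in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)   # each output sees 3 neighboring samples
atrous = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 samples
```

With the same three weights, a dilation of 2 makes each output depend on a window of five input samples, which is how stacked atrous layers grow the receptive field without extra parameters or pooling.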

CNN-based Analog CSI Feedback in FDD MIMO-OFDM Systems


Title	CNN-based Analog CSI Feedback in FDD MIMO-OFDM Systems
Authors	Mahdi Boloursaz Mashhadi, Qianqian Yang, Deniz Gunduz
Abstract	Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, CSI feedback overhead degrades the overall spectral efficiency. Convolutional neural network (CNN)-based CSI feedback compression schemes have received a lot of attention recently due to significant improvements in compression efficiency; however, they still require reliable feedback links to convey the compressed CSI to the BS. Instead, we propose here a CNN-based analog feedback scheme, called AnalogDeepCMC, which directly maps the downlink CSI to the uplink channel input. The corresponding noisy channel outputs are used by another CNN to reconstruct the downlink (DL) channel estimate. Not only does the proposed scheme outperform existing digital CSI feedback schemes in terms of the achievable downlink rate, but it also simplifies operation, as it does not require explicit quantization, coding and modulation, and provides a low-latency alternative, particularly in rapidly changing MIMO channels, where the CSI needs to be estimated and fed back periodically.
Tasks	Quantization
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10428v1
PDF	https://arxiv.org/pdf/1910.10428v1.pdf
PWC	https://paperswithcode.com/paper/cnn-based-analog-csi-feedback-in-fdd-mimo
Repo
Framework

Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite


Title	Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite
Authors	Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, Preslav Nakov
Abstract	The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments.
Tasks	Machine Translation
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00131v1
PDF	https://arxiv.org/pdf/1909.00131v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-pronominal-anaphora-in-machine
Repo
Framework

Slope Difference Distribution and Its Computer Vision Applications


Title	Slope Difference Distribution and Its Computer Vision Applications
Authors	Zhenzhou Wang
Abstract	Slope difference distribution (SDD) is computed from a one-dimensional curve and makes it possible to find derivatives that do not exist in the original curve. It is robust both for calculating the threshold point that separates the curve logically and for calculating the center of each part of the separated curve. SDD has been used in image segmentation and it outperforms all classical and state-of-the-art image segmentation methods. SDD is also very useful in calculating features for pattern recognition and object detection. For gesture recognition, SDD achieved 100% accuracy on two public datasets: the NUS dataset and the near-infrared dataset. For object recognition, SDD achieved 100% accuracy on the Kimia 99 dataset. In this memorandum, I will demonstrate the effectiveness of SDD with some typical examples.
Tasks	Gesture Recognition, Object Detection, Object Recognition, Semantic Segmentation
Published	2019-10-13
URL	https://arxiv.org/abs/1910.05704v1
PDF	https://arxiv.org/pdf/1910.05704v1.pdf
PWC	https://paperswithcode.com/paper/slope-difference-distribution-and-its
Repo
Framework