Paper Group ANR 782
Boundless: Generative Adversarial Networks for Image Extension
Title | Boundless: Generative Adversarial Networks for Image Extension |
Authors | Piotr Teterwak, Aaron Sarna, Dilip Krishnan, Aaron Maschinot, David Belanger, Ce Liu, William T. Freeman |
Abstract | Image extension models have broad applications in image editing, computational photography and computer graphics. While image inpainting has been extensively studied in the literature, it is challenging to directly apply the state-of-the-art inpainting methods to image extension as they tend to generate blurry or repetitive pixels with inconsistent semantics. We introduce semantic conditioning to the discriminator of a generative adversarial network (GAN), and achieve strong results on image extension with coherent semantics and visually pleasing colors and textures. We also show promising results in extreme extensions, such as panorama generation. |
Tasks | Image Inpainting |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07007v1 |
PDF | https://arxiv.org/pdf/1908.07007v1.pdf |
PWC | https://paperswithcode.com/paper/boundless-generative-adversarial-networks-for |
Repo | |
Framework | |
Respect Your Emotion: Human-Multi-Robot Teaming based on Regret Decision Model
Title | Respect Your Emotion: Human-Multi-Robot Teaming based on Regret Decision Model |
Authors | Longsheng Jiang, Yue Wang |
Abstract | Often, when modeling human decision-making behaviors in the context of human-robot teaming, the emotional aspect of humans is ignored. Nevertheless, the influence of emotion, in some cases, is not only undeniable but beneficial. This work studies the human-like characteristics brought by the emotion of regret in one-human-multi-robot teaming for the application of domain search. In such an application, the task management load is outsourced to the robots to reduce the human’s workload, freeing the human to do more important work. The regret decision model is first used by each robot to decide whether to request human service, and is then extended to optimally queue the requests from multiple robots. For the movement of the robots in the domain search, we designed a path planning algorithm based on dynamic programming for each robot. The simulation shows that the human-like characteristics, namely risk-seeking and risk-aversion, indeed bring some appealing effects for balancing the workload and performance in the human-multi-robot team. |
Tasks | Decision Making |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1910.00087v1 |
PDF | https://arxiv.org/pdf/1910.00087v1.pdf |
PWC | https://paperswithcode.com/paper/respect-your-emotion-human-multi-robot |
Repo | |
Framework | |
Sparsity Constrained Distributed Unmixing of Hyperspectral Data
Title | Sparsity Constrained Distributed Unmixing of Hyperspectral Data |
Authors | Sara Khoshsokhan, Roozbeh Rajabi, Hadi Zayyani |
Abstract | Spectral unmixing (SU) is a technique to characterize mixed pixels in hyperspectral images measured by remote sensors. Most spectral unmixing algorithms are developed using linear mixing models. To estimate endmembers and fractional abundance matrices in a blind problem, nonnegative matrix factorization (NMF) and its extensions are widely used in the SU problem. One of the constraints added to NMF is sparsity, regularized by the Lq norm. In this paper, a new algorithm based on distributed optimization is suggested for spectral unmixing. In the proposed algorithm, a network including single-node clusters is employed. Each pixel in the hyperspectral images is considered as a node in this network. The sparsity constrained distributed unmixing is optimized with the diffusion least mean p-power (LMP) strategy, and then the update equations for the fractional abundance and signature matrices are obtained. Afterwards, the proposed algorithm is analyzed for different values of LMP power and Lq norms. Simulation results based on defined performance metrics illustrate the advantage of the proposed algorithm in spectral unmixing of hyperspectral data compared with other methods. |
Tasks | Distributed Optimization |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07593v1 |
PDF | http://arxiv.org/pdf/1902.07593v1.pdf |
PWC | https://paperswithcode.com/paper/sparsity-constrained-distributed-unmixing-of |
Repo | |
Framework | |
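The abstract above refers to a linear mixing model and an Lq sparsity term without giving formulas. As a rough illustration only, the sketch below shows the generic form of both for a single pixel. The function names, the toy spectra, and the choice of the penalty as the sum of |s_k|^q are assumptions made for this example; the paper's actual cost function and its diffusion LMP updates are not reproduced here.

```python
# Illustrative sketch only: generic linear mixing and an Lq-style sparsity term.
# All names and numbers are hypothetical and are not taken from the paper.

def mix_pixel(endmembers, abundances):
    """Linear mixing: each band is the abundance-weighted sum of endmember spectra."""
    n_bands = len(endmembers[0])
    return [
        sum(a * spectrum[b] for a, spectrum in zip(abundances, endmembers))
        for b in range(n_bands)
    ]


def lq_penalty(abundances, q):
    """Sum of |s_k|^q; for 0 < q < 1 this is smaller for sparser vectors."""
    return sum(abs(s) ** q for s in abundances)


# Two made-up endmember spectra with three bands each.
endmembers = [[1.0, 0.0, 2.0], [0.0, 4.0, 2.0]]
pixel = mix_pixel(endmembers, [0.25, 0.75])

# Both abundance vectors sum to one, but the sparse one is penalized less.
sparse_cost = lq_penalty([1.0, 0.0, 0.0, 0.0], 0.5)
dense_cost = lq_penalty([0.25, 0.25, 0.25, 0.25], 0.5)
```

With q = 0.5 the one-hot abundance vector costs less than the uniform one, which is the sense in which such a term encourages sparse abundances.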
Faster Unsupervised Semantic Inpainting: A GAN Based Approach
Title | Faster Unsupervised Semantic Inpainting: A GAN Based Approach |
Authors | Avisek Lahiri, Arnav Kumar Jain, Divyasri Nadendla, Prabir Kumar Biswas |
Abstract | In this paper, we propose to improve the inference speed and visual quality of the contemporary baseline for Generative Adversarial Network (GAN) based unsupervised semantic inpainting. This is made possible by better initialization of the core iterative optimization involved in the framework. To the best of our knowledge, this is also the first attempt at GAN based video inpainting that takes temporal cues into account. Compared to the baseline, we achieve about a 4.5-5× speedup on single image inpainting and 80× on videos. At the same time, our method has better spatial and temporal reconstruction quality, as shown on three image datasets and one video dataset. |
Tasks | Image Inpainting, Video Inpainting |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.04968v1 |
PDF | https://arxiv.org/pdf/1908.04968v1.pdf |
PWC | https://paperswithcode.com/paper/faster-unsupervised-semantic-inpainting-a-gan |
Repo | |
Framework | |
Density-Aware Convolutional Networks with Context Encoding for Airborne LiDAR Point Cloud Classification
Title | Density-Aware Convolutional Networks with Context Encoding for Airborne LiDAR Point Cloud Classification |
Authors | Xiang Li, Mingyang Wang, Congcong Wen, Lingjing Wang, Nan Zhou, Yi Fang |
Abstract | To better address the challenging issues of irregularity and inhomogeneity inherently present in 3D point clouds, researchers have been shifting their focus from the design of hand-crafted point features towards the learning of 3D point signatures using deep neural networks for 3D point cloud classification. Recently proposed deep learning based point cloud classification methods either apply 2D CNNs on projected feature images or apply 1D convolutional layers directly on raw point sets. These methods cannot adequately recognize fine-grained local structures caused by the uneven density distribution of the point cloud data. In this paper, to address this challenging issue, we introduce a density-aware convolution module which uses the point-wise density to re-weight the learnable weights of convolution kernels. The proposed convolution module is able to fully approximate the 3D continuous convolution on unevenly distributed 3D point sets. Based on this convolution module, we further develop a multi-scale fully convolutional neural network with downsampling and upsampling blocks to enable hierarchical point feature learning. In addition, to regularize the global semantic context, we implement a context encoding module to predict a global context encoding and formulate a context encoding regularizer to enforce the predicted context encoding to be aligned with the ground truth one. The overall network can be trained in an end-to-end fashion with the raw 3D coordinates as well as the height above ground as inputs. Experiments on the International Society for Photogrammetry and Remote Sensing (ISPRS) 3D labeling benchmark demonstrate the superiority of the proposed method for point cloud classification. Our model achieves a new state-of-the-art performance with an average F1 score of 71.2% and improves the performance by a large margin on several categories. |
Tasks | |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.05909v1 |
PDF | https://arxiv.org/pdf/1910.05909v1.pdf |
PWC | https://paperswithcode.com/paper/density-aware-convolutional-networks-with |
Repo | |
Framework | |
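The headline number in the abstract above is an average F1 score. For readers unfamiliar with the metric, here is a minimal sketch of per-class F1 and its unweighted mean. It assumes the average is a plain mean over classes, which the abstract does not state, and the counts are made up for illustration.

```python
# Illustrative sketch: per-class F1 and its unweighted mean over classes.
# The (TP, FP, FN) counts are invented and unrelated to the ISPRS benchmark.

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_f1(per_class_counts):
    """Unweighted mean of the per-class F1 scores."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


counts = [(8, 2, 2), (5, 5, 0), (0, 1, 3)]  # one (TP, FP, FN) triple per class
avg = average_f1(counts)
```

Because every class contributes equally to the mean, a class that is never predicted correctly (the third one here) pulls the average down sharply.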
Predictive modeling of brain tumor: A Deep learning approach
Title | Predictive modeling of brain tumor: A Deep learning approach |
Authors | Priyansh Saxena, Akshat Maheshwari, Shivani Tayal, Saumil Maheshwari |
Abstract | Image processing concepts can visualize the different anatomical structures of the human body. Recent advancements in the field of deep learning have made it possible to detect the growth of cancerous tissue from a patient’s brain Magnetic Resonance Imaging (MRI) scans alone. These methods require very high accuracy and very low false negative rates to be of any practical use. This paper presents a Convolutional Neural Network (CNN) based transfer learning approach to classify brain MRI scans into two classes using three pre-trained models. The performances of these models are compared with each other. Experimental results show that the ResNet-50 model achieves the highest accuracy (95%) and the lowest false negative rate (zero). It is followed by the VGG-16 and Inception-V3 models, with accuracies of 90% and 55%, respectively. |
Tasks | Transfer Learning |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02265v4 |
PDF | https://arxiv.org/pdf/1911.02265v4.pdf |
PWC | https://paperswithcode.com/paper/predictive-modeling-of-brain-tumor-a-deep |
Repo | |
Framework | |
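The abstract above compares models by accuracy and false negative rate. The sketch below shows how these two quantities are usually computed from binary confusion-matrix counts. The counts are invented for illustration and are not the paper's results.

```python
# Illustrative sketch: accuracy and false negative rate for a binary classifier.
# The confusion-matrix counts are made up and are not taken from the paper.

def accuracy(tp, tn, fp, fn):
    """Fraction of all scans that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_negative_rate(tp, fn):
    """Fraction of truly positive scans that the classifier missed."""
    positives = tp + fn
    return fn / positives if positives else 0.0


acc = accuracy(tp=45, tn=40, fp=10, fn=5)
fnr = false_negative_rate(tp=45, fn=5)
```

The two metrics answer different questions: a model can have high accuracy and still miss positives, which is why the abstract reports both.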
From News to Medical: Cross-domain Discourse Segmentation
Title | From News to Medical: Cross-domain Discourse Segmentation |
Authors | Elisa Ferracane, Titan Page, Junyi Jessy Li, Katrin Erk |
Abstract | The first step in discourse analysis involves dividing a text into segments. We annotate the first high-quality small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. While we expectedly find a drop in performance, the nature of the segmentation errors suggests some problems can be addressed earlier in the pipeline, while others would require expanding the corpus to a trainable size to learn the nuances of the medical domain. |
Tasks | |
Published | 2019-04-14 |
URL | http://arxiv.org/abs/1904.06682v1 |
PDF | http://arxiv.org/pdf/1904.06682v1.pdf |
PWC | https://paperswithcode.com/paper/from-news-to-medical-cross-domain-discourse |
Repo | |
Framework | |
Neural Architectures for Fine-Grained Propaganda Detection in News
Title | Neural Architectures for Fine-Grained Propaganda Detection in News |
Authors | Pankaj Gupta, Khushbu Saxena, Usama Yaseen, Thomas Runkler, Hinrich Schütze |
Abstract | This paper describes the details of our system (MIC-CIS) and the results of our participation in the fine-grained propaganda detection shared task 2019. To address the tasks of sentence (SLC) and fragment level (FLC) propaganda detection, we explore different neural architectures (e.g., CNN, LSTM-CRF and BERT) and extract linguistic (e.g., part-of-speech, named entity, readability, sentiment, emotion, etc.), layout and topical features. Specifically, we have designed multi-granularity and multi-tasking neural architectures to jointly perform both the sentence and fragment level propaganda detection. Additionally, we investigate different ensemble schemes such as majority-voting, relax-voting, etc. to boost overall system performance. Compared to the other participating systems, our submissions are ranked 3rd and 4th in FLC and SLC tasks, respectively. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06162v1 |
PDF | https://arxiv.org/pdf/1909.06162v1.pdf |
PWC | https://paperswithcode.com/paper/neural-architectures-for-fine-grained |
Repo | |
Framework | |
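Among the ensemble schemes named in the abstract above, majority voting is the simplest to illustrate. The sketch below combines binary sentence-level labels from several models. The model outputs are invented, the tie-breaking rule (ties count as non-propaganda) is an assumption of this example, and the paper's relax-voting scheme is not shown because the abstract does not define it.

```python
# Illustrative sketch: majority voting over binary sentence-level predictions.
# Labels: 1 = propaganda, 0 = non-propaganda. All predictions are made up.

def majority_vote(votes):
    """Return 1 if strictly more than half of the votes are 1 (assumed tie rule: 0)."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def ensemble(per_model_labels):
    """per_model_labels[m][i] is model m's label for sentence i."""
    return [majority_vote(list(column)) for column in zip(*per_model_labels)]


predictions = [
    [1, 0, 1, 0],  # hypothetical model A
    [1, 1, 0, 0],  # hypothetical model B
    [0, 1, 1, 0],  # hypothetical model C
]
combined = ensemble(predictions)
```

Each sentence receives the label that at least two of the three hypothetical models agree on.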
The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots
Title | The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots |
Authors | Fabio Cermelli, Massimiliano Mancini, Elisa Ricci, Barbara Caputo |
Abstract | Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, ranging from object detection and recognition to pose estimation, semantic scene segmentation and many others. Still, most approaches typically address visual tasks in isolation, resulting in overspecialized models which achieve strong performance in specific applications but work poorly in other (often related) tasks. This is clearly sub-optimal for a robot, which is often required to perform multiple visual recognition tasks simultaneously in order to properly act and interact with the environment. This problem is exacerbated by the limited computational and memory resources typically available onboard a robotic platform. The problem of learning flexible models which can handle multiple tasks in a lightweight manner has recently gained attention in the computer vision community, and benchmarks supporting this research have been proposed. In this work we study this problem in the robot vision context, proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art algorithms in this novel, challenging scenario. We also define a new evaluation protocol, better suited to the robot vision setting. Results shed light on the strengths and weaknesses of existing approaches and on open issues, suggesting directions for future research. |
Tasks | Object Detection, Pose Estimation, Scene Segmentation |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00912v2 |
PDF | http://arxiv.org/pdf/1904.00912v2.pdf |
PWC | https://paperswithcode.com/paper/the-rgb-d-triathlon-towards-agile-visual |
Repo | |
Framework | |
Segmentation Mask Guided End-to-End Person Search
Title | Segmentation Mask Guided End-to-End Person Search |
Authors | Dingyuan Zheng, Jimin Xiao, Kaizhu Huang, Yao Zhao |
Abstract | Person search aims to find a target person among multiple images recorded by multiple surveillance cameras, and faces various challenges from both pedestrian detection and person re-identification. Besides the large intra-class variations owing to various illumination conditions, occlusions and varying poses, background clutter in the detected pedestrian bounding boxes further deteriorates the extracted features for each person, making them less discriminative. To tackle these problems, we develop a novel approach which guides the network with segmentation masks so that discriminative features can be learned that are invariant to the background clutter. We demonstrate that joint optimization of pedestrian detection, person re-identification and pedestrian segmentation produces more discriminative features for pedestrians, and consequently leads to better person search performance. Extensive experiments on the benchmark dataset CUHK-SYSU show that our proposed model achieves state-of-the-art performance with 86.3% mAP and 86.5% top-1 accuracy, respectively. |
Tasks | Pedestrian Detection, Person Re-Identification, Person Search |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10179v1 |
PDF | https://arxiv.org/pdf/1908.10179v1.pdf |
PWC | https://paperswithcode.com/paper/segmentation-mask-guided-end-to-end-person |
Repo | |
Framework | |
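The abstract above reports mAP and top-1 accuracy. As a rough guide to what these retrieval metrics measure, the sketch below computes both from ranked relevance lists. It assumes every relevant gallery item appears somewhere in the ranked list and does not reproduce the CUHK-SYSU evaluation protocol. The rankings are made up.

```python
# Illustrative sketch: mean average precision (mAP) and top-1 accuracy.
# Each query is a list of booleans, best-ranked result first. Data is invented.

def average_precision(ranked_relevance):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(queries):
    """Return (mAP, top-1 accuracy) over a list of ranked relevance lists."""
    m_ap = sum(average_precision(q) for q in queries) / len(queries)
    top1 = sum(1 for q in queries if q and q[0]) / len(queries)
    return m_ap, top1


queries = [[True, False, True], [False, True]]
m_ap, top1 = evaluate(queries)
```

Top-1 only checks the first result of each query, while mAP rewards placing all matches near the top of the ranking.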
Towards Standardization of Data Licenses: The Montreal Data License
Title | Towards Standardization of Data Licenses: The Montreal Data License |
Authors | Misha Benjamin, Paul Gagnon, Negar Rostamzadeh, Chris Pal, Yoshua Bengio, Alex Shee |
Abstract | This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper’s goal is to build towards a common framework for data licensing akin to the licensing of open source software. Increased transparency and resolving conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data through bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper’s approach is summarized in a new family of data license language - the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper. |
Tasks | |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.12262v1 |
PDF | http://arxiv.org/pdf/1903.12262v1.pdf |
PWC | https://paperswithcode.com/paper/towards-standardization-of-data-licenses-the |
Repo | |
Framework | |
Encoder-Decoder based CNN and Fully Connected CRFs for Remote Sensed Image Segmentation
Title | Encoder-Decoder based CNN and Fully Connected CRFs for Remote Sensed Image Segmentation |
Authors | Vikas Agaradahalli Gurumurthy |
Abstract | With the advancement of remote-sensed imaging, large volumes of very high resolution land cover images can now be obtained. Automation of object recognition in these 2D images, however, is still a key issue. High intra-class variance and low inter-class variance in Very High Resolution (VHR) images hamper the accuracy of prediction in object recognition tasks. The most successful recent techniques in various computer vision tasks are based on deep supervised learning. In this work, a deep Convolutional Neural Network (CNN) based on a symmetric encoder-decoder architecture with skip connections is employed for the 2D semantic segmentation of the most common land cover object classes - impervious surface, buildings, low vegetation, trees and cars. Atrous convolutions are employed to obtain a large receptive field in the proposed CNN model. Further, the CNN outputs are post-processed using a Fully Connected Conditional Random Field (FCRF) model to refine the CNN pixel label predictions. The proposed CNN-FCRF model achieves an overall accuracy of 90.5% on the ISPRS Vaihingen Dataset. |
Tasks | Object Recognition, Semantic Segmentation |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06041v1 |
PDF | https://arxiv.org/pdf/1910.06041v1.pdf |
PWC | https://paperswithcode.com/paper/encoder-decoder-based-cnn-and-fully-connected |
Repo | |
Framework | |
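The abstract above says atrous convolutions are used to enlarge the receptive field. A minimal 1D sketch of the idea follows: a kernel of size k with dilation d spans k + (k - 1)(d - 1) input samples without adding weights. The signal and kernel are made up, and the paper's 2D network is not reproduced.

```python
# Illustrative sketch: 1D atrous (dilated) convolution in pure Python.
# Uses the cross-correlation form common in CNN libraries, with no padding.

def effective_kernel_size(kernel_size, dilation):
    """Number of input samples spanned by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation):
    """Apply the kernel with gaps of `dilation` samples between its taps."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
dense = atrous_conv1d(signal, kernel, dilation=1)   # spans 3 samples
atrous = atrous_conv1d(signal, kernel, dilation=2)  # spans 5 samples
```

The same three weights cover five input samples when the dilation is 2, which is how atrous convolutions widen the receptive field at no extra parameter cost.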
CNN-based Analog CSI Feedback in FDD MIMO-OFDM Systems
Title | CNN-based Analog CSI Feedback in FDD MIMO-OFDM Systems |
Authors | Mahdi Boloursaz Mashhadi, Qianqian Yang, Deniz Gunduz |
Abstract | Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to better utilize the available spatial diversity and multiplexing gains. However, in a frequency division duplex (FDD) massive MIMO system, CSI feedback overhead degrades the overall spectral efficiency. Convolutional neural network (CNN)-based CSI feedback compression schemes have received a lot of attention recently due to significant improvements in compression efficiency; however, they still require reliable feedback links to convey the compressed CSI information to the BS. Instead, we propose here a CNN-based analog feedback scheme, called AnalogDeepCMC, which directly maps the downlink CSI to uplink channel input. Corresponding noisy channel outputs are used by another CNN to reconstruct the downlink (DL) channel estimate. The proposed scheme not only outperforms existing digital CSI feedback schemes in terms of the achievable downlink rate, but also simplifies the operation as it does not require explicit quantization, coding and modulation, and provides a low-latency alternative, particularly in rapidly changing MIMO channels, where the CSI needs to be estimated and fed back periodically. |
Tasks | Quantization |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10428v1 |
PDF | https://arxiv.org/pdf/1910.10428v1.pdf |
PWC | https://paperswithcode.com/paper/cnn-based-analog-csi-feedback-in-fdd-mimo |
Repo | |
Framework | |
Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite
Title | Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite |
Authors | Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, Preslav Nakov |
Abstract | The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments. |
Tasks | Machine Translation |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00131v1 |
PDF | https://arxiv.org/pdf/1909.00131v1.pdf |
PWC | https://paperswithcode.com/paper/evaluating-pronominal-anaphora-in-machine |
Repo | |
Framework | |
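The abstract above argues that n-gram overlap measures barely register a wrong pronoun because only a few words change. The sketch below illustrates this with clipped unigram precision, which is one ingredient of BLEU and not BLEU itself. The sentences are invented, and the paper's proposed evaluation measure is not reproduced.

```python
# Illustrative sketch: clipped unigram precision for a hypothesis translation.
# This is only the 1-gram component used in BLEU-style metrics, not full BLEU.
from collections import Counter


def clipped_unigram_precision(hypothesis, reference):
    """Share of hypothesis tokens that also occur in the reference (count-clipped)."""
    hyp_tokens = hypothesis.lower().split()
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hyp_tokens)
    matched = sum(min(count, ref_counts[word]) for word, count in hyp_counts.items())
    return matched / len(hyp_tokens) if hyp_tokens else 0.0


reference = "the committee said it would publish the report tomorrow"
good = "the committee said it would publish the report tomorrow"
bad = "the committee said he would publish the report tomorrow"  # wrong pronoun

good_score = clipped_unigram_precision(good, reference)
bad_score = clipped_unigram_precision(bad, reference)
```

A single wrong pronoun changes one of nine tokens here, so the overlap score stays high even though the meaning of the sentence has changed.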
Slope Difference Distribution and Its Computer Vision Applications
Title | Slope Difference Distribution and Its Computer Vision Applications |
Authors | Zhenzhou Wang |
Abstract | Slope difference distribution (SDD) is computed from a one-dimensional curve and makes it possible to find derivatives that do not exist in the original curve. It is robust both for calculating the threshold point that separates the curve logically and for calculating the center of each part of the separated curve. SDD has been used in image segmentation and it outperforms all classical and state-of-the-art image segmentation methods. SDD is also very useful in calculating the features for pattern recognition and object detection. For gesture recognition, SDD achieved 100% accuracy on two public datasets: the NUS dataset and the near-infrared dataset. For object recognition, SDD achieved 100% accuracy on the Kimia 99 dataset. In this memorandum, I will demonstrate the effectiveness of SDD with some typical examples. |
Tasks | Gesture Recognition, Object Detection, Object Recognition, Semantic Segmentation |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05704v1 |
PDF | https://arxiv.org/pdf/1910.05704v1.pdf |
PWC | https://paperswithcode.com/paper/slope-difference-distribution-and-its |
Repo | |
Framework | |