Paper Group AWR 138
nuScenes: A multimodal dataset for autonomous driving
Title | nuScenes: A multimodal dataset for autonomous driving |
Authors | Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom |
Abstract | Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first published dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar and image based detection and tracking. Data, development kit and more information are available online at http://www.nuscenes.org. |
Tasks | 3D Object Detection, Autonomous Driving, Autonomous Vehicles, Object Detection |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1903.11027v4 https://arxiv.org/pdf/1903.11027v4.pdf |
PWC | https://paperswithcode.com/paper/nuscenes-a-multimodal-dataset-for-autonomous |
Repo | https://github.com/xinshuoweng/mynuscene |
Framework | none |
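A minimal sketch of how the dataset is typically accessed through the official nuscenes-devkit (`pip install nuscenes-devkit`); the `dataroot` path is a placeholder for a local copy of the data.

```python
# Iterate over the annotated keyframes ("samples") of one scene using the
# official devkit; dataroot is a placeholder for a local dataset copy.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

scene = nusc.scene[0]                       # each of the 1000 scenes is ~20s long
sample_token = scene['first_sample_token']
while sample_token:
    sample = nusc.get('sample', sample_token)
    lidar_token = sample['data']['LIDAR_TOP']  # channels cover 6 cameras, 5 radars, 1 lidar
    annotation_tokens = sample['anns']         # 3D bounding-box annotations
    sample_token = sample['next']              # empty string at the end of the scene
```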
The Myths of Our Time: Fake News
Title | The Myths of Our Time: Fake News |
Authors | Vít Růžička, Eunsu Kang, David Gordon, Ankita Patel, Jacqui Fashimpaur, Manzil Zaheer |
Abstract | While the purpose of most fake news is misinformation and political propaganda, our team sees it as a new type of myth that is created by people in the age of internet identities and artificial intelligence. Seeking insights on the fear and desire hidden underneath these modified or generated stories, we use machine learning methods to generate fake articles and present them in the form of an online news blog. This paper aims to share the details of our pipeline and the techniques used for full generation of fake news, from dataset collection to presentation as a media art project on the internet. |
Tasks | News Generation, Text Generation |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01760v1 https://arxiv.org/pdf/1908.01760v1.pdf |
PWC | https://paperswithcode.com/paper/the-myths-of-our-time-fake-news |
Repo | https://github.com/previtus/fake_news_generation_mark_I |
Framework | none |
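As an illustration of the kind of language-model-driven article generation the paper describes, here is a sketch using a generic pretrained GPT-2 via Hugging Face transformers; this is not the authors' actual pipeline, which runs from dataset collection through presentation as an online blog.

```python
# Illustrative generation with a generic pretrained LM; NOT the authors'
# pipeline, only the general sampling technique.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

input_ids = tokenizer.encode("Breaking news:", return_tensors='pt')
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=120,
        do_sample=True,                       # sample instead of greedy decoding
        top_k=50,                             # truncate to the 50 likeliest tokens
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```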
Automatic Generation of Personalized Comment Based on User Profile
Title | Automatic Generation of Personalized Comment Based on User Profile |
Authors | Wenhuan Zeng, Abulikemu Abuduweili, Lei Li, Pengcheng Yang |
Abstract | Comments on social media are very diverse in content, style, and vocabulary, which makes generating comments much more challenging than other existing natural language generation (NLG) tasks. Besides, since different users have different expression habits, it is necessary to take a user's profile into consideration when generating comments. In this paper, we introduce the task of automatic generation of personalized comments (AGPC) for social media. Based on tens of thousands of users' real comments and corresponding user profiles on Weibo, we propose the Personalized Comment Generation Network (PCGN) for AGPC. The model utilizes user feature embedding with a gated memory and attends to the user description to model the personality of users. In addition, an external user representation is taken into consideration during decoding to enhance comment generation. Experimental results show that our model can generate natural, human-like, and personalized comments. |
Tasks | Text Generation |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10371v1 https://arxiv.org/pdf/1907.10371v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-generation-of-personalized-comment |
Repo | https://github.com/Walleclipse/AGPC |
Framework | tf |
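A hypothetical sketch of the gated-memory idea: a learned gate decides, per dimension, how much of the user-profile embedding to mix into the decoder state. Module name, shapes, and the fusion rule are illustrative, not the exact PCGN architecture.

```python
# Hypothetical gated-memory fusion of a user embedding into a decoder state,
# in the spirit of PCGN; names and dimensions are illustrative.
import torch
import torch.nn as nn

class GatedUserMemory(nn.Module):
    def __init__(self, hidden_dim, user_dim):
        super().__init__()
        self.gate = nn.Linear(hidden_dim + user_dim, hidden_dim)
        self.proj = nn.Linear(user_dim, hidden_dim)

    def forward(self, decoder_state, user_embedding):
        # The gate chooses, per dimension, how much user information to inject.
        g = torch.sigmoid(self.gate(torch.cat([decoder_state, user_embedding], dim=-1)))
        return g * self.proj(user_embedding) + (1 - g) * decoder_state

fuse = GatedUserMemory(hidden_dim=512, user_dim=128)
fused_state = fuse(torch.randn(4, 512), torch.randn(4, 128))  # (4, 512)
```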
Distance approximation using Isolation Forests
Title | Distance approximation using Isolation Forests |
Authors | David Cortes |
Abstract | This work briefly explores the possibility of approximating spatial distance (alternatively, similarity) between data points using the Isolation Forest method envisioned for outlier detection. The logic is similar to that of isolation: the more similar or closer two points are, the more random splits it takes to separate them. The separation depth between two points can be standardized in the same way as the isolation depth, transforming it into a distance metric that is limited in range, centered, and in compliance with the axioms of distance. This metric presents some desirable properties, such as being invariant to the scales of variables and being able to account for non-linear relationships between variables, which other metrics such as Euclidean or Mahalanobis distance do not. Extensions to the Isolation Forest method are also proposed for handling categorical variables and missing values, resulting in a more generalizable and robust metric. |
Tasks | Outlier Detection |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12362v2 https://arxiv.org/pdf/1910.12362v2.pdf |
PWC | https://paperswithcode.com/paper/distance-approximation-using-isolation |
Repo | https://github.com/david-cortes/isotree |
Framework | none |
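The separation-depth logic can be sketched from scratch on top of scikit-learn's IsolationForest; the linked isotree repo implements this natively and more carefully, so treat this only as an illustration of the idea.

```python
# Sketch of separation depth on top of scikit-learn's IsolationForest.
# Caveats: sklearn fits each tree on a subsample (so n should really be the
# subsample size), and the paper adds an expected remainder when two points
# reach the same leaf; both are simplified away here.
import numpy as np
from sklearn.ensemble import IsolationForest

def separation_depth(tree, a, b):
    """Depth at which points a and b are sent to different children of one tree."""
    node, depth = 0, 0
    while tree.children_left[node] != -1:          # -1 marks a leaf
        f, t = tree.feature[node], tree.threshold[node]
        go_left_a, go_left_b = a[f] <= t, b[f] <= t
        if go_left_a != go_left_b:                 # this split separates the pair
            return depth
        node = tree.children_left[node] if go_left_a else tree.children_right[node]
        depth += 1
    return depth                                   # pair never separated in this tree

def isolation_distance(forest, a, b, n):
    # Standardize like the isolation depth: c(n) is the expected path length of
    # an unsuccessful BST search, as in the original Isolation Forest paper.
    c = 2.0 * (np.log(n - 1) + np.euler_gamma) - 2.0 * (n - 1) / n
    avg = np.mean([separation_depth(est.tree_, a, b) for est in forest.estimators_])
    return 2.0 ** (-avg / c)  # in (0, 1]: similar points separate late -> small distance

X = np.random.randn(500, 4)
forest = IsolationForest(n_estimators=100).fit(X)
print(isolation_distance(forest, X[0], X[1], n=len(X)))
```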
Deep Learning for Deepfakes Creation and Detection
Title | Deep Learning for Deepfakes Creation and Detection |
Authors | Thanh Thi Nguyen, Cuong M. Nguyen, Dung Tien Nguyen, Duc Thanh Nguyen, Saeid Nahavandi |
Abstract | Deep learning has been successfully applied to solve various complex problems ranging from big data analytics to computer vision and human-level control. Deep learning advances, however, have also been employed to create software that can cause threats to privacy, democracy and national security. One such deep learning-powered application to emerge recently is the “deepfake”. Deepfake algorithms can create fake images and videos that humans cannot distinguish from authentic ones. The proposal of technologies that can automatically detect and assess the integrity of digital visual media is therefore indispensable. This paper presents a survey of algorithms used to create deepfakes and, more importantly, methods proposed to detect deepfakes in the literature to date. We present extensive discussions on challenges, research trends and directions related to deepfake technologies. By reviewing the background of deepfakes and state-of-the-art deepfake detection methods, this study provides a comprehensive overview of deepfake techniques and facilitates the development of new and more robust methods to deal with increasingly challenging deepfakes. |
Tasks | DeepFake Detection, Face Swapping |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11573v1 https://arxiv.org/pdf/1909.11573v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-deepfakes-creation-and |
Repo | https://github.com/ApGa/adversarial_deepfakes |
Framework | pytorch |
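Many of the surveyed detectors reduce to a binary real-vs-fake classifier over face crops; here is a generic sketch of that common design, not any single surveyed method.

```python
# Generic frame-level detector: a CNN backbone with a single real-vs-fake
# logit. In practice one would start from pretrained weights and train on
# face crops from labeled real/fake videos.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18()                          # pretrained weights in practice
backbone.fc = nn.Linear(backbone.fc.in_features, 1)   # single real-vs-fake logit
backbone.eval()

def fake_probability(frames):
    """frames: (B, 3, 224, 224) normalized face crops."""
    with torch.no_grad():
        return torch.sigmoid(backbone(frames)).squeeze(1)

print(fake_probability(torch.randn(2, 3, 224, 224)))
```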
Collaborative Graph Walk for Semi-supervised Multi-Label Node Classification
Title | Collaborative Graph Walk for Semi-supervised Multi-Label Node Classification |
Authors | Uchenna Akujuobi, Han Yufei, Qiannan Zhang, Xiangliang Zhang |
Abstract | In this work, we study the semi-supervised multi-label node classification problem in attributed graphs. Classic solutions to multi-label node classification follow two steps: first learn a node embedding, then build a node classifier on the learned embedding. To improve the discriminating power of the node embedding, we propose a novel collaborative graph walk, named Multi-Label-Graph-Walk, to fine-tune node representations with the available label assignments in attributed graphs via reinforcement learning. The proposed method formulates the multi-label node classification task as simultaneous graph walks conducted by multiple label-specific agents. Furthermore, policies of the label-wise graph walks are learned in a cooperative way to capture, first, the predictive relation between node labels and structural attributes of graphs and, second, the correlation among the multiple label-specific classification tasks. A comprehensive experimental study demonstrates that the proposed method achieves significantly better multi-label classification performance than state-of-the-art approaches and conducts more efficient graph exploration. |
Tasks | Multi-Label Classification, Node Classification |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09706v2 https://arxiv.org/pdf/1910.09706v2.pdf |
PWC | https://paperswithcode.com/paper/collaborative-graph-walk-for-semi-supervised |
Repo | https://github.com/Uchman21/MLGW |
Framework | tf |
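A toy sketch of the walk mechanics for one label-specific agent: the policy scores a node's neighbors from their attributes and samples the next step. The actual method trains such policies cooperatively with reinforcement learning; everything here (graph, shapes, scoring) is illustrative.

```python
# Toy walk for one label-specific agent over a random attributed graph.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_feats = 50, 16
features = rng.normal(size=(n_nodes, n_feats))
neighbors = {i: rng.choice(n_nodes, size=5, replace=False) for i in range(n_nodes)}
policy_w = rng.normal(size=n_feats)   # one weight vector per label-specific agent

def walk(start, steps):
    path, node = [start], start
    for _ in range(steps):
        nbrs = neighbors[node]
        scores = features[nbrs] @ policy_w      # score neighbors from attributes
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()                    # softmax over neighbors
        node = int(rng.choice(nbrs, p=probs))   # sample the next step
        path.append(node)
    return path

print(walk(start=0, steps=10))
```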
Copy-and-Paste Networks for Deep Video Inpainting
Title | Copy-and-Paste Networks for Deep Video Inpainting |
Authors | Sungho Lee, Seoung Wug Oh, DaeYeun Won, Seon Joo Kim |
Abstract | We present a novel deep learning based algorithm for video inpainting. Video inpainting is the process of completing corrupted or missing regions in videos. Video inpainting has additional challenges compared to image inpainting due to the extra temporal information, as well as the need to maintain temporal coherency. We propose a novel DNN-based framework called Copy-and-Paste Networks for video inpainting that takes advantage of additional information in other frames of the video. The network is trained to copy corresponding contents in reference frames and paste them to fill the holes in the target frame. Our network also includes an alignment network that computes affine matrices between frames for the alignment, enabling the network to take information from more distant frames for robustness. Our method produces visually pleasing and temporally coherent results while running faster than the state-of-the-art optimization-based method. In addition, we extend our framework to enhancing over- and under-exposed frames in videos. Using this enhancement technique, we were able to significantly improve lane detection accuracy on road videos. |
Tasks | Image Inpainting, Lane Detection, Video Inpainting |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11587v1 https://arxiv.org/pdf/1908.11587v1.pdf |
PWC | https://paperswithcode.com/paper/copy-and-paste-networks-for-deep-video |
Repo | https://github.com/shleecs/Copy-and-Paste-Networks-for-Deep-Video-Inpainting |
Framework | pytorch |
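The geometric intuition, sketched with classical OpenCV tools: align a reference frame to the target with an affine transform, then paste aligned pixels into the hole. The paper learns both the alignment and the copy decisions end-to-end, so this is only an approximation of the idea.

```python
# Classical-tools approximation of copy-and-paste inpainting: estimate an
# affine transform from matched keypoints, warp the reference frame, and
# paste the warped pixels into the hole region of the target frame.
import cv2
import numpy as np

def copy_paste_fill(target, reference, hole_mask, src_pts, dst_pts):
    """target, reference: (H, W, 3) frames; hole_mask: (H, W) bool, True = missing;
    src_pts/dst_pts: (N, 2) float32 matched keypoints (N >= 3)."""
    affine, _ = cv2.estimateAffine2D(src_pts, dst_pts)  # 2x3 affine matrix
    h, w = target.shape[:2]
    warped = cv2.warpAffine(reference, affine, (w, h))  # align reference to target
    out = target.copy()
    out[hole_mask] = warped[hole_mask]                  # paste into the hole
    return out
```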
GAN You Do the GAN GAN?
Title | GAN You Do the GAN GAN? |
Authors | Joseph Suarez |
Abstract | Generative Adversarial Networks (GANs) have become a dominant class of generative models. In recent years, GAN variants have yielded especially impressive results in the synthesis of a variety of forms of data. Examples include compelling natural and artistic images, textures, musical sequences, and 3D object files. However, one obvious synthesis candidate is missing. In this work, we answer one of deep learning’s most pressing questions: GAN you do the GAN GAN? That is, is it possible to train a GAN to model a distribution of GANs? We release the full source code for this project under the MIT license. |
Tasks | Image Generation |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00724v1 http://arxiv.org/pdf/1904.00724v1.pdf |
PWC | https://paperswithcode.com/paper/gan-you-do-the-gan-gan |
Repo | https://github.com/jsuarez5341/gan-you-do-the-gan-gan |
Framework | pytorch |
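The premise can be sketched in a few lines: flatten the weights of many trained (inner) generators into vectors, then train an outer GAN on that dataset of GANs. Dimensions and training details below are placeholders, not the repo's setup.

```python
# Outer GAN over inner-generator weight vectors; all sizes are placeholders.
import torch
import torch.nn as nn

weight_dim, latent_dim = 4096, 64   # flattened inner-generator size (placeholder)

G = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, weight_dim))
D = nn.Sequential(nn.Linear(weight_dim, 512), nn.LeakyReLU(0.2), nn.Linear(512, 1))
bce = nn.BCEWithLogitsLoss()

# One discriminator step; real_weights would come from many trained inner GANs.
real_weights = torch.randn(32, weight_dim)              # stand-in data
fake_weights = G(torch.randn(32, latent_dim))
d_loss = bce(D(real_weights), torch.ones(32, 1)) + \
         bce(D(fake_weights.detach()), torch.zeros(32, 1))
```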
MisGAN: Learning from Incomplete Data with Generative Adversarial Networks
Title | MisGAN: Learning from Incomplete Data with Generative Adversarial Networks |
Authors | Steven Cheng-Xian Li, Bo Jiang, Benjamin Marlin |
Abstract | Generative adversarial networks (GANs) have been shown to provide an effective way to model complex distributions and have obtained impressive results on various challenging tasks. However, typical GANs require fully-observed data during training. In this paper, we present a GAN-based framework for learning from complex, high-dimensional incomplete data. The proposed framework learns a complete data generator along with a mask generator that models the missing data distribution. We further demonstrate how to impute missing data by equipping our framework with an adversarially trained imputer. We evaluate the proposed framework using a series of experiments with several types of missing data processes under the missing completely at random assumption. |
Tasks | |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09599v1 http://arxiv.org/pdf/1902.09599v1.pdf |
PWC | https://paperswithcode.com/paper/misgan-learning-from-incomplete-data-with |
Repo | https://github.com/steveli/misgan |
Framework | pytorch |
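The core construction from the abstract, in a minimal sketch: a data generator and a mask generator, with the discriminator only ever seeing masked observations, so the data generator is never penalized on unobserved entries. Network sizes are illustrative.

```python
# MisGAN's masking scheme in miniature: the discriminator compares masked
# real data against masked generator output; sizes are illustrative.
import torch
import torch.nn as nn

d, z_dim = 784, 64
data_gen = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, d))
mask_gen = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, d), nn.Sigmoid())

def masked(x, m, filler=0.0):
    # Unobserved entries are replaced by a constant filler, so neither real
    # nor generated samples expose values the mask hides.
    return x * m + filler * (1 - m)

z = torch.randn(32, z_dim)
fake_masked = masked(data_gen(z), mask_gen(torch.randn(32, z_dim)))
```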
Single Image 3D Hand Reconstruction with Mesh Convolutions
Title | Single Image 3D Hand Reconstruction with Mesh Convolutions |
Authors | Dominik Kulon, Haoyang Wang, Riza Alp Güler, Michael Bronstein, Stefanos Zafeiriou |
Abstract | Monocular 3D reconstruction of deformable objects, such as human body parts, has been typically approached by predicting parameters of heavyweight linear models. In this paper, we demonstrate an alternative solution that is based on the idea of encoding images into a latent non-linear representation of meshes. The prior on 3D hand shapes is learned by training an autoencoder with intrinsic graph convolutions performed in the spectral domain. The pre-trained decoder acts as a non-linear statistical deformable model. The latent parameters that reconstruct the shape and articulated pose of hands in the image are predicted using an image encoder. We show that our system reconstructs plausible meshes and operates in real-time. We evaluate the quality of the mesh reconstructions produced by the decoder on a new dataset and show latent space interpolation results. Our code, data, and models will be made publicly available. |
Tasks | 3D Reconstruction |
Published | 2019-05-04 |
URL | https://arxiv.org/abs/1905.01326v3 https://arxiv.org/pdf/1905.01326v3.pdf |
PWC | https://paperswithcode.com/paper/single-image-3d-hand-reconstruction-with-mesh |
Repo | https://github.com/dkulon/hand-reconstruction |
Framework | tf |
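A structural sketch of the pipeline: an image encoder regresses latent parameters, and a pre-trained mesh decoder maps them to 3D vertices. The decoder below is a placeholder MLP standing in for the paper's spectral graph-convolutional decoder, and the 778-vertex count (MANO-style hand meshes) is an assumption.

```python
# Encoder -> latent -> mesh-decoder pipeline; the decoder is a placeholder
# MLP for the paper's pretrained spectral graph-conv decoder. The 778-vertex
# mesh size (MANO-style) is an assumption.
import torch
import torch.nn as nn
from torchvision import models

latent_dim, n_vertices = 64, 778

encoder = models.resnet18()                               # pretrained in practice
encoder.fc = nn.Linear(encoder.fc.in_features, latent_dim)

decoder = nn.Sequential(                                  # stand-in mesh decoder
    nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, n_vertices * 3))

image = torch.randn(1, 3, 224, 224)
vertices = decoder(encoder(image)).view(1, n_vertices, 3)  # (B, V, 3) hand mesh
```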
Minimum Information guidelines for fluorescence microscopy: increasing the value, quality, and fidelity of image data
Title | Minimum Information guidelines for fluorescence microscopy: increasing the value, quality, and fidelity of image data |
Authors | Maximiliaan Huisman, Mathias Hammer, Alex Rigano, Renu Gopinathan, Carlas Smith, David Grunwald, Caterina Strambio-De-Castillia |
Abstract | High-resolution digital microscopy provides ever more powerful tools for probing the real-time dynamics of subcellular structures, and adequate record-keeping is necessary to evaluate results, share data, and allow experiments to be repeated. In addition to advances in microscopic techniques, post-acquisition procedures such as image-data processing and analysis (i.e., feature counting, distance measurements, intensity comparison, and colocalization studies) are often required for the reproducible and quantitative interpretation of images. While these techniques increase the usefulness of microscopy data, the limits within which quantitative results may be interpreted are often poorly quantified and documented. Keeping notes on microscopy experiments and calibration procedures should be relatively unchallenging, as the microscope is a machine whose performance should be easy to assess. Nevertheless, to date, no widely adopted guidelines exist for the data provenance and quality-control metadata to be recorded or published with imaging data. Metadata automatically recorded by microscopes from different companies vary widely and pose a substantial challenge for microscope users trying to create a good-faith record of their work. Similarly, the complexity and aims of experiments using microscopes vary, leading to different reporting and quality-control requirements, from the simple description of a sample to the need to document the complexities of sub-diffraction-resolution imaging in living cells and beyond. To solve this problem, the 4DN Imaging Standards Working Group has put forth a tiered system of microscopy calibration and metadata standards for images obtained through fluorescence microscopy. The proposal is an extension of the OME data model and aims to increase data fidelity, ease future analysis, and facilitate objective comparison of different datasets, experimental setups, and assays. |
Tasks | Calibration |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11370v1 https://arxiv.org/pdf/1910.11370v1.pdf |
PWC | https://paperswithcode.com/paper/minimum-information-guidelines-for |
Repo | https://github.com/WU-BIMAC/MicroscopyMetadata4DNGuidelines |
Framework | none |
Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound
Title | Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound |
Authors | Yi Wang, Haoran Dou, Xiaowei Hu, Lei Zhu, Xin Yang, Ming Xu, Jing Qin, Pheng-Ann Heng, Tianfu Wang, Dong Ni |
Abstract | Automatic prostate segmentation in transrectal ultrasound (TRUS) images is of essential importance for image-guided prostate interventions and treatment planning. However, developing such automatic solutions remains very challenging due to the missing/ambiguous boundary and inhomogeneous intensity distribution of the prostate in TRUS, as well as the large variability in prostate shapes. This paper develops a novel 3D deep neural network equipped with attention modules for better prostate segmentation in TRUS by fully exploiting the complementary information encoded in different layers of the convolutional neural network (CNN). Our attention module utilizes the attention mechanism to selectively leverage the multilevel features integrated from different layers to refine the features at each individual layer, suppressing the non-prostate noise at shallow layers of the CNN and incorporating more prostate detail into features at deep layers. Experimental results on challenging 3D TRUS volumes show that our method attains satisfactory segmentation performance. The proposed attention mechanism is a general strategy to aggregate multi-level deep features and has the potential to be used for other medical image segmentation tasks. The code is publicly available at https://github.com/wulalago/DAF3D. |
Tasks | Medical Image Segmentation, Semantic Segmentation |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01743v1 https://arxiv.org/pdf/1907.01743v1.pdf |
PWC | https://paperswithcode.com/paper/deep-attentive-features-for-prostate |
Repo | https://github.com/wulalago/DAF3D |
Framework | pytorch |
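A sketch of attention-weighted fusion of multi-level features in the same spirit: features from several CNN layers, already resized to a common shape, are combined under a learned per-level attention map. Shapes and module names are illustrative; the actual DAF3D modules are in the linked repo.

```python
# Attention-weighted fusion of multi-level 3D features; illustrative shapes.
import torch
import torch.nn as nn

class MultiLevelAttention(nn.Module):
    def __init__(self, channels, n_levels):
        super().__init__()
        # 1x1x1 conv predicts one attention map per feature level.
        self.att = nn.Conv3d(channels * n_levels, n_levels, kernel_size=1)

    def forward(self, feats):  # feats: list of (B, C, D, H, W), same size
        weights = torch.softmax(self.att(torch.cat(feats, dim=1)), dim=1)
        return sum(weights[:, i:i + 1] * f for i, f in enumerate(feats))

att = MultiLevelAttention(channels=32, n_levels=3)
feats = [torch.randn(1, 32, 8, 32, 32) for _ in range(3)]
fused = att(feats)  # (1, 32, 8, 32, 32)
```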
Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration
Title | Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration |
Authors | Meelis Kull, Miquel Perello-Nieto, Markus Kängsepp, Telmo Silva Filho, Hao Song, Peter Flach |
Abstract | Class probabilities predicted by most multiclass classifiers are uncalibrated, often tending towards over-confidence. With neural networks, calibration can be improved by temperature scaling, a method to learn a single corrective multiplicative factor for inputs to the last softmax layer. On non-neural models the existing methods apply binary calibration in a pairwise or one-vs-rest fashion. We propose a natively multiclass calibration method applicable to classifiers from any model class, derived from Dirichlet distributions and generalising the beta calibration method from binary classification. It is easily implemented with neural nets since it is equivalent to log-transforming the uncalibrated probabilities, followed by one linear layer and softmax. Experiments demonstrate improved probabilistic predictions according to multiple measures (confidence-ECE, classwise-ECE, log-loss, Brier score) across a wide range of datasets and classifiers. Parameters of the learned Dirichlet calibration map provide insights to the biases in the uncalibrated model. |
Tasks | Calibration |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12656v1 https://arxiv.org/pdf/1910.12656v1.pdf |
PWC | https://paperswithcode.com/paper/beyond-temperature-scaling-obtaining-well |
Repo | https://github.com/dirichletcal/dirichletcal.github.io |
Framework | none |
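The abstract gives the recipe directly: log-transform the uncalibrated probabilities, apply one linear layer, then softmax, fit on held-out data. A minimal sketch follows; the paper additionally regularizes the map (ODIR), which is omitted here, and the validation tensors are dummies.

```python
# Dirichlet calibration map: log(probs) -> linear layer -> (log-)softmax,
# fit with log-loss on a held-out validation set (dummy tensors below).
import torch
import torch.nn as nn

class DirichletCalibrator(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.linear = nn.Linear(n_classes, n_classes)

    def forward(self, probs):
        return torch.log_softmax(self.linear(torch.log(probs + 1e-12)), dim=1)

val_probs = torch.softmax(torch.randn(256, 10), dim=1)   # uncalibrated outputs
val_labels = torch.randint(0, 10, (256,))

cal = DirichletCalibrator(n_classes=10)
opt = torch.optim.LBFGS(cal.parameters(), lr=0.1, max_iter=100)
nll = nn.NLLLoss()

def closure():
    opt.zero_grad()
    loss = nll(cal(val_probs), val_labels)   # log-loss on held-out data
    loss.backward()
    return loss

opt.step(closure)
calibrated = cal(val_probs).exp()            # calibrated class probabilities
```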
An Effective Multi-Resolution Hierarchical Granular Representation based Classifier using General Fuzzy Min-Max Neural Network
Title | An Effective Multi-Resolution Hierarchical Granular Representation based Classifier using General Fuzzy Min-Max Neural Network |
Authors | Thanh Tung Khuat, Fang Chen, Bogdan Gabrys |
Abstract | Motivated by the practical demands for simplification of data towards being consistent with human thinking and problem solving, as well as tolerance of uncertainty, information granules are becoming important entities in data processing at different levels of data abstraction. This paper proposes a method to construct classifiers from multi-resolution hierarchical granular representations (MRHGRC) using hyperbox fuzzy sets. The proposed approach forms a series of granular inferences hierarchically through many levels of abstraction. An attractive characteristic of our classifier is that it can maintain relatively high accuracy at a low degree of granularity by reusing the knowledge learned at lower levels of abstraction. In addition, our approach can reduce the data size significantly as well as handle the uncertainty and incompleteness associated with data in real-world applications. The construction process of the classifier consists of two phases. The first phase formulates the model at the greatest level of granularity, while the second phase aims to reduce the complexity of the constructed model and deduce it from data at higher abstraction levels. Experiments conducted comprehensively on both synthetic and real datasets indicated the efficiency of our method in terms of training time and predictive performance in comparison to other types of fuzzy min-max neural networks and common machine learning algorithms. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12170v3 https://arxiv.org/pdf/1905.12170v3.pdf |
PWC | https://paperswithcode.com/paper/an-effective-multi-resolution-hierarchical |
Repo | https://github.com/UTS-AAi/MRHGRC |
Framework | none |
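The building block behind fuzzy min-max classifiers is the hyperbox: a pair of min/max corners with a membership function that decays as a point falls outside the box. A minimal sketch follows; the exact membership form and the steepness parameter vary across GFMM variants.

```python
# Hyperbox membership: full membership inside the box, linear decay outside.
import numpy as np

def membership(x, V, W, gamma=1.0):
    """x: (d,) point; V, W: (d,) min/max corners; gamma: decay steepness."""
    violation = np.maximum(0.0, V - x) + np.maximum(0.0, x - W)  # per-dim overshoot
    return float(np.min(np.clip(1.0 - gamma * violation, 0.0, 1.0)))

V, W = np.array([0.2, 0.2]), np.array([0.5, 0.6])
print(membership(np.array([0.3, 0.4]), V, W))   # inside the box -> 1.0
print(membership(np.array([0.9, 0.4]), V, W))   # outside in dim 0 -> 0.6
```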
ANA at SemEval-2019 Task 3: Contextual Emotion detection in Conversations through hierarchical LSTMs and BERT
Title | ANA at SemEval-2019 Task 3: Contextual Emotion detection in Conversations through hierarchical LSTMs and BERT |
Authors | Chenyang Huang, Amine Trabelsi, Osmar R. Zaïane |
Abstract | This paper describes the system submitted by the ANA team for SemEval-2019 Task 3: EmoContext. We propose a novel Hierarchical LSTMs for Contextual Emotion Detection (HRLCE) model. It classifies the emotion of an utterance given its conversational context. The results show that, on this task, our HRLCE outperforms the most recent state-of-the-art text classification framework, BERT. We combine the results generated by BERT and HRLCE to achieve an overall score of 0.7709, which ranked 5th on the final leaderboard of the competition among 165 teams. |
Tasks | Text Classification |
Published | 2019-03-30 |
URL | https://arxiv.org/abs/1904.00132v2 https://arxiv.org/pdf/1904.00132v2.pdf |
PWC | https://paperswithcode.com/paper/ana-at-semeval-2019-task-3-contextual-emotion |
Repo | https://github.com/chenyangh/SemEval2019Task3 |
Framework | pytorch |
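A minimal hierarchical-LSTM sketch in the spirit of HRLCE: a word-level LSTM encodes each of the three conversation turns, and a turn-level LSTM runs over the resulting representations before a 4-way emotion classifier. Sizes and embeddings are placeholders; the submitted system also ensembles with BERT.

```python
# Two-level LSTM over a 3-turn conversation; placeholder sizes and embeddings.
import torch
import torch.nn as nn

class HierarchicalLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.word_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.turn_lstm = nn.LSTM(hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, n_classes)   # happy/sad/angry/others

    def forward(self, turns):                      # turns: (B, 3, T) token ids
        reps = []
        for i in range(turns.size(1)):
            _, (h, _) = self.word_lstm(self.emb(turns[:, i]))
            reps.append(h[-1])                     # one vector per turn
        _, (h, _) = self.turn_lstm(torch.stack(reps, dim=1))
        return self.out(h[-1])                     # emotion logits

model = HierarchicalLSTM(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 3, 12)))  # (2, 4)
```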