October 20, 2019

3031 words 15 mins read

Paper Group AWR 307

An Infinite Parade of Giraffes: Expressive Augmentation and Complexity Layers for Cartoon Drawing

Title An Infinite Parade of Giraffes: Expressive Augmentation and Complexity Layers for Cartoon Drawing
Authors K. G. Greene
Abstract In this paper, we explore creative image generation constrained by small data. To partially automate the creation of cartoon sketches consistent with a specific designer’s style, where acquiring a very large original image set is impossible or cost-prohibitive, we exploit domain-specific knowledge for a huge reduction in original image requirements, creating an effectively infinite number of cartoon giraffes from just nine original drawings. We introduce “expressive augmentations” for cartoon sketches: mathematical transformations that create broad, domain-appropriate variation far beyond the usual affine transformations. We also show that chained GAN models trained on the temporal stages of drawing, or “complexity layers”, can effectively add character-appropriate details and finish new drawings in the designer’s style. We discuss the application of these tools in design processes for textiles, graphics, architectural elements and interior design.
Tasks Image Generation
Published 2018-11-08
URL http://arxiv.org/abs/1811.07023v1
PDF http://arxiv.org/pdf/1811.07023v1.pdf
PWC https://paperswithcode.com/paper/an-infinite-parade-of-giraffes-expressive
Repo https://github.com/kggreene/kggreene.github.io
Framework none
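
A hedged illustration of the “expressive augmentation” idea: the abstract only tells us these are mathematical transformations beyond affine ones, so the sketch below applies a sinusoidal coordinate warp to a grayscale line drawing as one plausible member of that family. The function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def wave_warp(img, amp=4.0, freq=0.05):
    """Non-affine 'expressive' warp: displace pixel coordinates sinusoidally.

    img: 2D grayscale array of a line drawing. amp/freq are illustrative;
    the paper's actual transformation family is not detailed in the abstract.
    """
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    xx_w = xx + amp * np.sin(2 * np.pi * freq * yy)  # bend columns
    yy_w = yy + amp * np.sin(2 * np.pi * freq * xx)  # bend rows
    return map_coordinates(img, [yy_w, xx_w], order=1, mode='nearest')

# Hypothetical usage: many distinct variants from one original drawing.
# sketch = plt.imread('giraffe_01.png')[..., 0]
# variants = [wave_warp(sketch, amp=a) for a in (2, 4, 6, 8)]
```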

Evolving Space-Time Neural Architectures for Videos

Title Evolving Space-Time Neural Architectures for Videos
Authors AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo
Abstract We present a new method for finding video CNN architectures that capture rich spatio-temporal information in videos. Previous work, taking advantage of 3D convolutions, obtained promising results by manually designing video CNN architectures. We here develop a novel evolutionary search algorithm that automatically explores models with different types and combinations of layers to jointly learn interactions between spatial and temporal aspects of video representations. We demonstrate the generality of this algorithm by applying it to two meta-architectures, obtaining new architectures superior to manually designed ones. Further, we propose a new component, the iTGM layer, which more efficiently utilizes its parameters to allow learning of space-time interactions over longer time horizons. The iTGM layer is often preferred by the evolutionary algorithm and allows building cost-efficient networks. The proposed approach discovers new and diverse video architectures that were previously unknown. More importantly, they are both more accurate and faster than prior models, and outperform the state of the art on multiple datasets we test, including HMDB, Kinetics, and Moments in Time. We will open source the code and models, to encourage future model development.
Tasks Action Classification, Action Recognition In Videos
Published 2018-11-26
URL https://arxiv.org/abs/1811.10636v2
PDF https://arxiv.org/pdf/1811.10636v2.pdf
PWC https://paperswithcode.com/paper/evolving-space-time-neural-architectures-for
Repo https://github.com/piergiaj/evanet-iccv19
Framework tf
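
For intuition, here is a minimal sketch of the kind of tournament-style evolutionary search the abstract describes. The `random_arch`, `mutate`, and `fitness` callables are placeholders for the paper’s actual architecture encoding, mutation operators (e.g. swapping a 3D convolution for an iTGM layer), and training-based evaluation.

```python
import random

def evolve(random_arch, mutate, fitness, population=20, rounds=100):
    """Tournament-style evolutionary search over architecture encodings."""
    pool = [random_arch() for _ in range(population)]
    scores = [fitness(a) for a in pool]
    for _ in range(rounds):
        idx = random.sample(range(population), k=3)   # pick a tournament
        parent = max(idx, key=lambda i: scores[i])    # best of the three
        child = mutate(pool[parent])                  # e.g. swap in an iTGM layer
        worst = min(idx, key=lambda i: scores[i])     # replace the weakest
        pool[worst], scores[worst] = child, fitness(child)
    return pool[max(range(population), key=lambda i: scores[i])]
```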

Learning Gender-Neutral Word Embeddings

Title Learning Gender-Neutral Word Embeddings
Authors Jieyu Zhao, Yichao Zhou, Zeyu Li, Wei Wang, Kai-Wei Chang
Abstract Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that reflect social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.
Tasks Word Embeddings
Published 2018-08-29
URL http://arxiv.org/abs/1809.01496v1
PDF http://arxiv.org/pdf/1809.01496v1.pdf
PWC https://paperswithcode.com/paper/learning-gender-neutral-word-embeddings
Repo https://github.com/uclanlp/gn_glove
Framework none
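
A hedged sketch of the core constraint: reserve a few coordinates for gender and penalize (a) any gender signal that gender-neutral words carry in the reserved coordinates and (b) any difference that definitional pairs like (he, she) show outside them. The exact loss terms in GN-GloVe may differ; this is one plausible rendering.

```python
import numpy as np

def gender_penalties(E, vocab, neutral_words, pairs, k=1):
    """Two regularizers over an embedding matrix E of shape (V, d).

    The last k coordinates are reserved for gender. Penalty 1: gender-
    neutral words should carry nothing there. Penalty 2: definitional
    pairs like ('he', 'she') should agree everywhere else.
    """
    idx = {w: i for i, w in enumerate(vocab)}
    reserved = E[[idx[w] for w in neutral_words], -k:]
    j_neutral = np.sum(reserved ** 2)            # push reserved dims to zero
    j_pair = 0.0
    for m, f in pairs:
        diff = E[idx[m]] - E[idx[f]]
        j_pair += np.sum(diff[:-k] ** 2)         # differ only in reserved dims
    return j_neutral, j_pair                     # added to the GloVe objective
```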

Deep Learning at Scale for the Construction of Galaxy Catalogs in the Dark Energy Survey

Title Deep Learning at Scale for the Construction of Galaxy Catalogs in the Dark Energy Survey
Authors Asad Khan, E. A. Huerta, Sibo Wang, Robert Gruendl, Elise Jennings, Huihuo Zheng
Abstract The scale of ongoing and future electromagnetic surveys poses formidable challenges for the classification of astronomical objects. Pioneering efforts on this front include citizen science campaigns adopted by the Sloan Digital Sky Survey (SDSS). SDSS datasets have recently been used to train neural network models to classify galaxies in the Dark Energy Survey (DES) that overlap the footprint of both surveys. Herein, we demonstrate that knowledge from deep learning algorithms, pre-trained with real-object images, can be transferred to classify galaxies that overlap both SDSS and DES surveys, achieving state-of-the-art accuracy $\gtrsim 99.6%$. We demonstrate that this process can be completed within just eight minutes using distributed training. While this represents a significant step towards the classification of DES galaxies that overlap previous surveys, we need to initiate the characterization of unlabelled DES galaxies in new regions of parameter space. To accelerate this program, we use our neural network classifier to label over ten thousand unlabelled DES galaxies which do not overlap previous surveys. Furthermore, we use our neural network model as a feature extractor for unsupervised clustering and find that unlabelled DES images can be grouped into two distinct galaxy classes based on their morphology, which provides a heuristic check that the learning is successfully transferred to the classification of unlabelled DES images. We conclude by showing that these newly labelled datasets can be combined with unsupervised recursive training to create large-scale DES galaxy catalogs in preparation for the Large Synoptic Survey Telescope era.
Tasks
Published 2018-12-05
URL https://arxiv.org/abs/1812.02183v2
PDF https://arxiv.org/pdf/1812.02183v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-and-data-clustering-for
Repo https://github.com/awe2/GZ2_project
Framework none
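
A minimal sketch of the transfer-learning step, assuming a generic ImageNet-pretrained backbone (a ResNet stand-in here; the paper’s actual network and training setup may differ): freeze the transferred features and retrain a small head for the two morphology classes.

```python
import torch
import torch.nn as nn
from torchvision import models

# Freeze an ImageNet-pretrained backbone and retrain only a small head
# for the two galaxy morphology classes (spiral vs. elliptical).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                        # keep transferred features
model.fc = nn.Linear(model.fc.in_features, 2)      # new classification head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# for images, labels in des_loader:                # hypothetical DataLoader
#     loss = criterion(model(images), labels)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```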

Predicting Citation Counts with a Neural Network

Title Predicting Citation Counts with a Neural Network
Authors Tobias Mistele, Tom Price, Sabine Hossenfelder
Abstract We here describe and present results of a simple neural network that predicts individual researchers’ future citation counts based on a variety of data from the researchers’ past. For publications available on the open-access server arXiv.org we find a higher predictability than previous studies.
Tasks
Published 2018-06-12
URL http://arxiv.org/abs/1806.04641v2
PDF http://arxiv.org/pdf/1806.04641v2.pdf
PWC https://paperswithcode.com/paper/predicting-citation-counts-with-a-neural
Repo https://github.com/tmistele/predicting-citation-counts-net
Framework none
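
As a hedged sketch of the setup, a small feed-forward regressor over per-author features. The feature list hinted at in the comments (paper counts, citations so far, career length) is illustrative; the paper derives its inputs from arXiv.org records.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: one row per author, e.g. [n_papers, citations_so_far, career_years, ...]
# y: citations accrued over the prediction window.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
# model.fit(X_train, y_train)
# predicted = model.predict(X_test)
```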

Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction

Title Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction
Authors Bo Yang, Sen Wang, Andrew Markham, Niki Trigoni
Abstract We study the problem of recovering an underlying 3D shape from a set of images. Existing learning-based approaches usually resort to recurrent neural nets, e.g., GRU, or intuitive pooling operations, e.g., max/mean poolings, to fuse multiple deep features encoded from input images. However, GRU-based approaches are unable to consistently estimate 3D shapes given different permutations of the same set of input images, as the recurrent unit is permutation variant. They are also unlikely to refine the 3D shape given more images, due to the long-term memory loss of GRUs. Commonly used pooling approaches are limited to capturing partial information, e.g., max/mean values, ignoring other valuable features. In this paper, we present a new feed-forward neural module, named AttSets, together with a dedicated training algorithm, named FASet, to attentively aggregate an arbitrarily sized deep feature set for multi-view 3D reconstruction. The AttSets module is permutation invariant, computationally efficient and flexible to implement, while the FASet algorithm enables the AttSets based network to be remarkably robust and generalize to an arbitrary number of input images. We thoroughly evaluate FASet and the properties of AttSets on multiple large public datasets. Extensive experiments show that AttSets together with the FASet algorithm significantly outperforms existing aggregation approaches.
Tasks 3D Object Reconstruction, 3D Reconstruction
Published 2018-08-02
URL https://arxiv.org/abs/1808.00758v2
PDF https://arxiv.org/pdf/1808.00758v2.pdf
PWC https://paperswithcode.com/paper/attentional-aggregation-of-deep-feature-sets
Repo https://github.com/Yang7879/AttSets
Framework tf
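
The aggregation itself is compact enough to sketch in NumPy: each element of the feature set gets element-wise attention scores, a softmax is taken across the set dimension, and the features are summed with those weights. This follows the mechanism the abstract describes, though the paper’s attention function and training procedure (FASet) are richer.

```python
import numpy as np

def attsets(X, W, b):
    """Aggregate a set of n deep features X (n, d) into one (d,) vector.

    W: (d, d), b: (d,) — a learned linear attention function. The softmax
    runs over the set axis, so the result is invariant to the order of
    the inputs and defined for any n.
    """
    scores = X @ W + b                             # (n, d) element-wise scores
    weights = np.exp(scores - scores.max(axis=0))  # stabilized softmax
    weights /= weights.sum(axis=0, keepdims=True)  # normalize across the set
    return (weights * X).sum(axis=0)               # weighted sum -> (d,)

# Permutation invariance, checked directly:
# assert np.allclose(attsets(X, W, b), attsets(X[::-1], W, b))
```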

ISO-Standard Domain-Independent Dialogue Act Tagging for Conversational Agents

Title ISO-Standard Domain-Independent Dialogue Act Tagging for Conversational Agents
Authors Stefano Mezza, Alessandra Cervone, Giuliano Tortoreto, Evgeny A. Stepanov, Giuseppe Riccardi
Abstract Dialogue Act (DA) tagging is crucial for spoken language understanding systems, as it provides a general representation of speakers’ intents, not bound to a particular dialogue system. Unfortunately, publicly available data sets with DA annotation are all based on different annotation schemes and thus incompatible with each other. Moreover, their schemes often do not cover all aspects necessary for open-domain human-machine interaction. In this paper, we propose a methodology to map several publicly available corpora to a subset of the ISO standard, in order to create a large task-independent training corpus for DA classification. We show the feasibility of using this corpus to train a domain-independent DA tagger by testing it on out-of-domain conversational data, and argue for the importance of training on multiple corpora to achieve robustness across different DA categories.
Tasks Spoken Language Understanding
Published 2018-06-12
URL http://arxiv.org/abs/1806.04327v1
PDF http://arxiv.org/pdf/1806.04327v1.pdf
PWC https://paperswithcode.com/paper/iso-standard-domain-independent-dialogue-act
Repo https://github.com/ColingPaper2018/DialogueAct-Tagger
Framework none
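
A hedged sketch of the mapping idea: a lookup from scheme-specific tags (here, a few SWBD-DAMSL tags) to a shared subset of ISO 24617-2 communicative functions. The entries below are illustrative guesses, not the paper’s actual mapping tables.

```python
# Illustrative (not the paper's) lookup from SWBD-DAMSL tags to a subset
# of ISO 24617-2 communicative functions.
SWDA_TO_ISO = {
    "sd": "Inform",                 # statement, non-opinion
    "sv": "Inform",                 # statement, opinion
    "qy": "PropositionalQuestion",  # yes/no question
    "qw": "SetQuestion",            # wh-question
    "ad": "Request",                # action-directive
}

def to_iso(tag, table=SWDA_TO_ISO):
    """Return the shared ISO label, or None if outside the chosen subset."""
    return table.get(tag)

# to_iso("qy")  ->  "PropositionalQuestion"
```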

Sokoto Coventry Fingerprint Dataset

Title Sokoto Coventry Fingerprint Dataset
Authors Yahaya Isah Shehu, Ariel Ruiz-Garcia, Vasile Palade, Anne James
Abstract This paper presents the Sokoto Coventry Fingerprint Dataset (SOCOFing), a biometric fingerprint database designed for academic research purposes. SOCOFing is made up of 6,000 fingerprint images from 600 African subjects. SOCOFing contains unique attributes such as labels for gender, hand and finger name as well as synthetically altered versions with three different levels of alteration for obliteration, central rotation, and z-cut. The dataset is freely available for noncommercial research purposes at: https://www.kaggle.com/ruizgara/socofing
Tasks
Published 2018-07-24
URL http://arxiv.org/abs/1807.10609v1
PDF http://arxiv.org/pdf/1807.10609v1.pdf
PWC https://paperswithcode.com/paper/sokoto-coventry-fingerprint-dataset
Repo https://github.com/dhammo2/CrimeScienceDataset
Framework none
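
A hedged loader sketch, assuming the file names encode the labels in a pattern like 1__M_Left_index_finger.BMP with an optional alteration suffix; this scheme is inferred rather than stated in the abstract, so verify it against the actual download.

```python
import os

def parse_socofing_name(path):
    """Parse labels from an assumed pattern like '1__M_Left_index_finger.BMP'
    (altered images appending e.g. '_Obl', '_CR', or '_Zcut')."""
    stem = os.path.splitext(os.path.basename(path))[0]
    subject, rest = stem.split("__")
    parts = rest.split("_")
    return {
        "subject": int(subject),
        "gender": parts[0],                                 # 'M' / 'F'
        "hand": parts[1],                                   # 'Left' / 'Right'
        "finger": parts[2],                                 # 'thumb', 'index', ...
        "alteration": parts[4] if len(parts) > 4 else None  # None for real scans
    }

# parse_socofing_name('1__M_Left_index_finger_Obl.BMP')['alteration'] -> 'Obl'
```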

Glow: Generative Flow with Invertible 1x1 Convolutions

Title Glow: Generative Flow with Invertible 1x1 Convolutions
Authors Diederik P. Kingma, Prafulla Dhariwal
Abstract Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using an invertible 1x1 convolution. Using our method we demonstrate a significant improvement in log-likelihood on standard benchmarks. Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient realistic-looking synthesis and manipulation of large images. The code for our model is available at https://github.com/openai/glow
Tasks Image Generation
Published 2018-07-09
URL http://arxiv.org/abs/1807.03039v2
PDF http://arxiv.org/pdf/1807.03039v2.pdf
PWC https://paperswithcode.com/paper/glow-generative-flow-with-invertible-1x1
Repo https://github.com/openai/glow
Framework tf
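
The titular operation is small enough to write out: a 1x1 convolution is a per-pixel linear map with a shared c x c matrix, so its Jacobian log-determinant is simply h·w·log|det W| and its inverse is the per-pixel application of W⁻¹. A NumPy sketch follows (the paper also discusses an LU-decomposed parameterization for efficiency):

```python
import numpy as np

def invertible_1x1(z, W):
    """Forward pass and log-det contribution of an invertible 1x1 conv.

    z: (h, w, c) activations; W: (c, c) shared across spatial positions.
    """
    h, w, _ = z.shape
    out = z @ W.T                              # apply W at every pixel
    logdet = h * w * np.linalg.slogdet(W)[1]   # h*w*log|det W| -> log-likelihood
    return out, logdet

def invert(y, W):
    return y @ np.linalg.inv(W).T              # exact inverse, pixel-wise

# rng = np.random.default_rng(0)
# z, W = rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4))
# y, _ = invertible_1x1(z, W)
# assert np.allclose(invert(y, W), z)          # bijectivity check
```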

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Title Monocular Total Capture: Posing Face, Body, and Hands in the Wild
Authors Donglai Xiang, Hanbyul Joo, Yaser Sheikh
Abstract We present the first method to capture the 3D total motion of a target person from a monocular view input. Given an image or a monocular video, our method reconstructs the motion from body, face, and fingers represented by a 3D deformable mesh model. We use an efficient representation called 3D Part Orientation Fields (POFs), to encode the 3D orientations of all body parts in the common 2D image space. POFs are predicted by a Fully Convolutional Network (FCN), along with the joint confidence maps. To train our network, we collect a new 3D human motion dataset capturing diverse total body motion of 40 subjects in a multiview system. We leverage a 3D deformable human model to reconstruct total body pose from the CNN outputs by exploiting the pose and shape prior in the model. We also present a texture-based tracking method to obtain temporally coherent motion capture output. We perform thorough quantitative evaluations including comparison with the existing body-specific and hand-specific methods, and performance analysis on camera viewpoint and human pose changes. Finally, we demonstrate the results of our total body motion capture on various challenging in-the-wild videos. Our code and newly collected human motion dataset will be publicly shared.
Tasks 3D Human Pose Estimation, Hand Pose Estimation, Motion Capture
Published 2018-12-04
URL http://arxiv.org/abs/1812.01598v1
PDF http://arxiv.org/pdf/1812.01598v1.pdf
PWC https://paperswithcode.com/paper/monocular-total-capture-posing-face-body-and
Repo https://github.com/CMU-Perceptual-Computing-Lab/MonocularTotalCapture
Framework tf
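
To make the POF representation concrete, here is a hedged NumPy sketch for a single limb, given 2D and 3D joint positions as arrays: store the limb’s 3D unit direction vector at pixels near its 2D projection, zeros elsewhere. The rasterization details (the radius, for instance) are illustrative, not the paper’s exact recipe.

```python
import numpy as np

def part_orientation_field(p2d_a, p2d_b, p3d_a, p3d_b, h=64, w=64, radius=2.0):
    """POF for one limb: the limb's 3D unit direction, stored at every
    pixel within `radius` of the 2D segment from joint a to joint b."""
    pof = np.zeros((h, w, 3))
    d3 = (p3d_b - p3d_a) / np.linalg.norm(p3d_b - p3d_a)  # unit 3D direction
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    seg = p2d_b - p2d_a
    t = ((xs - p2d_a[0]) * seg[0] + (ys - p2d_a[1]) * seg[1]) / (seg @ seg)
    t = np.clip(t, 0.0, 1.0)                   # nearest point on the segment
    dist = np.hypot(xs - (p2d_a[0] + t * seg[0]), ys - (p2d_a[1] + t * seg[1]))
    pof[dist <= radius] = d3                   # write the orientation vector
    return pof                                 # (h, w, 3), zeros off the limb
```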

Decoupling Dynamics and Reward for Transfer Learning

Title Decoupling Dynamics and Reward for Transfer Learning
Authors Amy Zhang, Harsh Satija, Joelle Pineau
Abstract Current reinforcement learning (RL) methods can successfully learn single tasks but often generalize poorly to modest perturbations in task domain or training procedure. In this work, we present a decoupled learning strategy for RL that creates a shared representation space where knowledge can be robustly transferred. We separate learning the task representation, the forward dynamics, the inverse dynamics and the reward function of the domain, and show that this decoupling improves performance within the task, transfers well to changes in dynamics and reward, and can be effectively used for online planning. Empirical results show good performance in both continuous and discrete RL domains.
Tasks Transfer Learning
Published 2018-04-27
URL http://arxiv.org/abs/1804.10689v2
PDF http://arxiv.org/pdf/1804.10689v2.pdf
PWC https://paperswithcode.com/paper/decoupling-dynamics-and-reward-for-transfer
Repo https://github.com/facebookresearch/ddr
Framework pytorch
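
A minimal sketch of the decoupling, with illustrative sizes: four separately trainable pieces that share only the state encoder, so that transfer to a new reward (or new dynamics) retrains only the corresponding part while the rest stays frozen.

```python
import torch.nn as nn

Z, A = 64, 4   # illustrative latent and action sizes

encoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                        nn.Linear(128, Z))    # observation -> latent state
forward_dyn = nn.Linear(Z + A, Z)             # (z_t, a_t)     -> z_{t+1}
inverse_dyn = nn.Linear(Z + Z, A)             # (z_t, z_{t+1}) -> a_t
reward_head = nn.Linear(Z, 1)                 # z_t            -> r_t

# Transfer: for a new reward under the same dynamics, freeze the encoder
# and both dynamics models and retrain only reward_head (and vice versa
# for changed dynamics under the same reward).
```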

Style Aggregated Network for Facial Landmark Detection

Title Style Aggregated Network for Facial Landmark Detection
Authors Xuanyi Dong, Yan Yan, Wanli Ouyang, Yi Yang
Abstract Recent advances in facial landmark detection achieve success by learning discriminative features from rich deformation of face shapes and poses. Besides the variance of faces themselves, the intrinsic variance of image styles, e.g., grayscale vs. color images, light vs. dark, intense vs. dull, and so on, has constantly been overlooked. This issue becomes inevitable as increasing numbers of web images are collected from various sources for training neural networks. In this work, we propose a style-aggregated approach to deal with the large intrinsic variance of image styles for facial landmark detection. Our method transforms original face images into style-aggregated images via a generative adversarial module. The style-aggregated images provide face renditions that are more robust to environmental changes. The original face images and their style-aggregated counterparts then play a duet to train a landmark detector, with the two streams complementing each other. In this way, for each face, our method takes two images as input, i.e., one in its original style and the other in the aggregated style. In experiments, we observe that the large variance of image styles degrades the performance of facial landmark detectors. Moreover, we show the robustness of our method to the large variance of image styles by comparing it to a variant of our approach in which the generative adversarial module is removed and no style-aggregated images are used. Our approach is demonstrated to perform well when compared with state-of-the-art algorithms on the benchmark datasets AFLW and 300-W. Code is publicly available on GitHub: https://github.com/D-X-Y/SAN
Tasks Facial Landmark Detection
Published 2018-03-12
URL http://arxiv.org/abs/1803.04108v4
PDF http://arxiv.org/pdf/1803.04108v4.pdf
PWC https://paperswithcode.com/paper/style-aggregated-network-for-facial-landmark
Repo https://github.com/D-X-Y/SAN
Framework pytorch
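
A hedged sketch of the two-stream inference described above: the detector sees each face twice, once in its original style and once mapped to the aggregated style, and the two heatmap sets are fused before landmarks are read off. `style_gan` and the two CNNs stand in for the paper’s actual modules, and simple averaging is an assumption.

```python
import torch

def detect_landmarks(img, style_gan, cnn_orig, cnn_style):
    """Two-stream landmark detection over (original, style-aggregated) inputs."""
    img_agg = style_gan(img)                          # map to the aggregated style
    heat = (cnn_orig(img) + cnn_style(img_agg)) / 2   # fuse complementary streams
    b, k, h, w = heat.shape                           # one heatmap per landmark
    flat = heat.view(b, k, -1).argmax(dim=-1)
    return torch.stack((flat % w, flat // w), dim=-1) # (b, k, 2) x,y positions
```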

Pointwise Rotation-Invariant Network with Adaptive Sampling and 3D Spherical Voxel Convolution

Title Pointwise Rotation-Invariant Network with Adaptive Sampling and 3D Spherical Voxel Convolution
Authors Yang You, Yujing Lou, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Cewu Lu, Weiming Wang
Abstract Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown. In this paper, we propose a brand-new point-set learning framework, PRIN (Pointwise Rotation-Invariant Network), focusing on rotation-invariant feature extraction in point cloud analysis. We construct spherical signals by Density Aware Adaptive Sampling to deal with distorted point distributions in spherical space. In addition, we propose Spherical Voxel Convolution and Point Re-sampling to extract rotation-invariant features for each point. Our network can be applied to tasks ranging from object classification and part segmentation to 3D feature matching and label alignment. We show that, on the dataset with randomly rotated point clouds, PRIN demonstrates better performance than state-of-the-art methods without any data augmentation. We also provide theoretical analysis of the rotation invariance achieved by our methods.
Tasks 3D Feature Matching, Data Augmentation, Object Classification
Published 2018-11-23
URL https://arxiv.org/abs/1811.09361v5
PDF https://arxiv.org/pdf/1811.09361v5.pdf
PWC https://paperswithcode.com/paper/prin-pointwise-rotation-invariant-network
Repo https://github.com/qq456cvb/PRIN
Framework pytorch
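
As a rough sketch of the first stage, here is a NumPy construction of a spherical signal from a point cloud: bin points by azimuth, inclination, and radius into a spherical voxel grid. PRIN’s Density Aware Adaptive Sampling is more involved than the plain occupancy counts used here.

```python
import numpy as np

def spherical_voxelize(points, n_alpha=32, n_beta=16, n_r=4):
    """Bin an (n, 3) point cloud into an (n_alpha, n_beta, n_r) spherical grid."""
    x, y, z = points.T
    r = np.linalg.norm(points, axis=1)
    alpha = np.arctan2(y, x) + np.pi                           # azimuth, [0, 2pi]
    beta = np.arccos(np.clip(z / np.maximum(r, 1e-9), -1, 1))  # inclination
    ia = np.minimum((alpha / (2 * np.pi) * n_alpha).astype(int), n_alpha - 1)
    ib = np.minimum((beta / np.pi * n_beta).astype(int), n_beta - 1)
    ir = np.minimum((r / (r.max() + 1e-9) * n_r).astype(int), n_r - 1)
    grid = np.zeros((n_alpha, n_beta, n_r))
    np.add.at(grid, (ia, ib, ir), 1.0)        # occupancy counts per voxel
    return grid
```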

SAFE: Self-Attentive Function Embeddings for Binary Similarity

Title SAFE: Self-Attentive Function Embeddings for Binary Similarity
Authors Luca Massarelli, Giuseppe Antonio Di Luna, Fabio Petroni, Leonardo Querzoni, Roberto Baldoni
Abstract The binary similarity problem consists of determining whether two functions are similar by considering only their compiled form. Advanced techniques for binary similarity have recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code into multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, which may fail to consider important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results of a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantics of the implemented algorithms, paving the way for further interesting applications (e.g., semantic-based binary function search).
Tasks Vulnerability Detection
Published 2018-11-13
URL https://arxiv.org/abs/1811.05296v4
PDF https://arxiv.org/pdf/1811.05296v4.pdf
PWC https://paperswithcode.com/paper/safe-self-attentive-function-embeddings-for
Repo https://github.com/gadiluna/SAFE
Framework tf
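
The self-attentive pooling at the heart of SAFE is easy to sketch: given per-instruction states from a recurrent encoder, several attention heads weight the instructions, and the weighted sums are concatenated into one fixed-size function embedding (structured self-attention in the style of Lin et al., which the paper builds on; weight shapes here are illustrative).

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attentive_embedding(H, W1, W2):
    """H: (n, d) encoder states for a function's n instructions.
    W1: (d, k), W2: (k, heads). Returns a fixed-size (heads*d,) embedding."""
    A = softmax(np.tanh(H @ W1) @ W2, axis=0)  # (n, heads) attention weights
    M = A.T @ H                                # (heads, d) weighted sums
    return M.ravel()                           # length-independent embedding

# Two functions are then compared by e.g. the cosine of their embeddings.
```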

SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities

Title SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities
Authors Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, Zhaoxuan Chen
Abstract The detection of software vulnerabilities (or vulnerabilities for short) is an important problem that has yet to be tackled, as manifested by many vulnerabilities reported on a daily basis. This calls for machine learning methods to automate vulnerability detection. Deep learning is attractive for this purpose because it does not require human experts to manually define features. Despite the tremendous success of deep learning in other domains, its applicability to vulnerability detection is not systematically understood. In order to fill this void, we propose the first systematic framework for using deep learning to detect vulnerabilities. The framework, dubbed Syntax-based, Semantics-based, and Vector Representations (SySeVR), focuses on obtaining program representations that can accommodate syntax and semantic information pertinent to vulnerabilities. Our experiments with 4 software products demonstrate the usefulness of the framework: we detect 15 vulnerabilities that are not reported in the National Vulnerability Database. Among these 15 vulnerabilities, 7 are unknown and have been reported to the vendors, and the other 8 have been “silently” patched by the vendors when releasing newer versions of the products.
Tasks Vulnerability Detection
Published 2018-07-18
URL http://arxiv.org/abs/1807.06756v2
PDF http://arxiv.org/pdf/1807.06756v2.pdf
PWC https://paperswithcode.com/paper/sysevr-a-framework-for-using-deep-learning-to
Repo https://github.com/NIPC-DL/covec
Framework pytorch
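
A hedged sketch of the representation step: rename identifiers to placeholders so the model learns syntax and semantics rather than naming conventions, then tokenize for a downstream sequence classifier. Real SySeVR first extracts syntax- and semantics-based candidate slices and treats library calls specially; this stripped-down version renames everything alike.

```python
import re

def normalize(code):
    """Rename every identifier to a placeholder (real SySeVR keeps
    library calls such as strcpy intact; this sketch renames all alike)."""
    names = sorted(set(re.findall(r'\b[A-Za-z_]\w*\b', code)))
    for i, name in enumerate(names):
        code = re.sub(rf'\b{re.escape(name)}\b', f'ID{i}', code)
    return code

def tokenize(code):
    return re.findall(r'\w+|[^\s\w]', code)

print(tokenize(normalize('strcpy(buf, input);')))
# ['ID2', '(', 'ID0', ',', 'ID1', ')', ';'] -- these tokens would be
# embedded and fed to a sequence model (e.g. a bidirectional GRU) for a
# vulnerable / not-vulnerable decision.
```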