October 20, 2019

Paper Group AWR 349

HiDDeN: Hiding Data With Deep Networks. Non-Stationary Texture Synthesis by Adversarial Expansion. Rapid Autonomous Car Control based on Spatial and Temporal Visual Cues. Photometric Depth Super-Resolution. Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network. Modeling of nonlinear audio effects with end-to-end deep …

HiDDeN: Hiding Data With Deep Networks

Title HiDDeN: Hiding Data With Deep Networks
Authors Jiren Zhu, Russell Kaplan, Justin Johnson, Li Fei-Fei
Abstract Recent work has shown that deep neural networks are highly sensitive to tiny perturbations of input images, giving rise to adversarial examples. Though this property is usually considered a weakness of learned models, we explore whether it can be beneficial. We find that neural networks can learn to use invisible perturbations to encode a rich amount of useful information. In fact, one can exploit this capability for the task of data hiding. We jointly train encoder and decoder networks, where given an input message and cover image, the encoder produces a visually indistinguishable encoded image, from which the decoder can recover the original message. We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression. Even though JPEG is non-differentiable, we show that a robust model can be trained using differentiable approximations. Finally, we demonstrate that adversarial training improves the visual quality of encoded images.
Tasks
Published 2018-07-26
URL http://arxiv.org/abs/1807.09937v1
PDF http://arxiv.org/pdf/1807.09937v1.pdf
PWC https://paperswithcode.com/paper/hidden-hiding-data-with-deep-networks
Repo https://github.com/jirenz/HiDDeN
Framework pytorch
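
The HiDDeN abstract describes joint training under noise such as pixel-wise dropout. The sketch below illustrates one plausible form of that setup in plain Python. It is not the authors' implementation: the dropout rule (reverting a pixel to its cover value), the mean-squared-error terms, and the `image_weight` parameter are all assumptions made for illustration.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pixel_dropout(cover, encoded, keep_prob, rng):
    # Assumed noise model: each pixel keeps its encoded value with
    # probability keep_prob and otherwise reverts to the cover pixel.
    return [e if rng.random() < keep_prob else c for c, e in zip(cover, encoded)]


def joint_loss(cover, encoded, message, decoded, image_weight=0.7):
    # Hypothetical objective: recover the message while keeping the
    # encoded image close to the cover image. image_weight is illustrative.
    return mse(message, decoded) + image_weight * mse(cover, encoded)


rng = random.Random(0)
cover = [0.2, 0.4, 0.6, 0.8]
encoded = [0.21, 0.39, 0.61, 0.79]  # cover plus a small perturbation
noised = pixel_dropout(cover, encoded, keep_prob=0.5, rng=rng)
print(noised, joint_loss(cover, encoded, [1.0, 0.0], [0.9, 0.1]))
```

In a real system the decoder would read the message from `noised`, so the encoder is pushed to spread information across pixels that may be dropped.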

Non-Stationary Texture Synthesis by Adversarial Expansion

Title Non-Stationary Texture Synthesis by Adversarial Expansion
Authors Yang Zhou, Zhen Zhu, Xiang Bai, Dani Lischinski, Daniel Cohen-Or, Hui Huang
Abstract The real world exhibits an abundance of non-stationary textures. Examples include textures with large-scale structures, as well as spatially variant and inhomogeneous textures. While existing example-based texture synthesis methods can cope well with stationary textures, non-stationary textures still pose a considerable challenge, which remains unresolved. In this paper, we propose a new approach for example-based non-stationary texture synthesis. Our approach uses a generative adversarial network (GAN), trained to double the spatial extent of texture blocks extracted from a specific texture exemplar. Once trained, the fully convolutional generator is able to expand the size of the entire exemplar, as well as of any of its sub-blocks. We demonstrate that this conceptually simple approach is highly effective for capturing large-scale structures, as well as other non-stationary attributes of the input exemplar. As a result, it can cope with challenging textures, which, to our knowledge, no other existing method can handle.
Tasks Texture Synthesis
Published 2018-05-11
URL http://arxiv.org/abs/1805.04487v1
PDF http://arxiv.org/pdf/1805.04487v1.pdf
PWC https://paperswithcode.com/paper/non-stationary-texture-synthesis-by
Repo https://github.com/jessemelpolio/non-stationary_texture_syn
Framework pytorch
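
The texture-expansion abstract says the GAN is trained to double the spatial extent of blocks taken from one exemplar. A minimal sketch of how such training pairs could be sampled follows. Placing the k×k source block inside the 2k×2k target block is an assumption of this sketch, and the function names are hypothetical.

```python
import random


def crop(image, top, left, size):
    """Return a size x size block of a 2D list image."""
    return [row[left:left + size] for row in image[top:top + size]]


def sample_training_pair(exemplar, k, rng):
    height, width = len(exemplar), len(exemplar[0])
    # Target: a random 2k x 2k block of the exemplar.
    top = rng.randrange(height - 2 * k + 1)
    left = rng.randrange(width - 2 * k + 1)
    target = crop(exemplar, top, left, 2 * k)
    # Source: a random k x k block inside the target (assumed placement).
    source = crop(target, rng.randrange(k + 1), rng.randrange(k + 1), k)
    return source, target


exemplar = [[r * 8 + c for c in range(8)] for r in range(8)]
source, target = sample_training_pair(exemplar, 2, random.Random(0))
print(len(source), len(target))
```

A generator trained on such pairs maps a k×k input to a 2k×2k output, so applying it to the whole exemplar doubles its size.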

Rapid Autonomous Car Control based on Spatial and Temporal Visual Cues

Title Rapid Autonomous Car Control based on Spatial and Temporal Visual Cues
Authors Surya Dantuluri
Abstract We present a novel approach to modern car control utilizing a combination of Deep Convolutional Neural Networks and Long Short-Term Memory Systems, both of which belong to Hierarchical Representation Learning, more commonly known as Deep Learning. Using Deep Convolutional Neural Networks and Long Short-Term Memory Systems (DCNN/LSTM), we propose an end-to-end approach to accurately predict steering angles and throttle values. We use this algorithm on our latest robot, El Toro Grande 1 (ETG), which is equipped with a variety of sensors to localize itself in its environment. Using previous training data and the data that it collects during circuit and drag races, it predicts throttle and steering angles in order to stay on path and avoid colliding with other robots. In theory, this allows ETG to race on any track given sufficient training data.
Tasks
Published 2018-07-22
URL http://arxiv.org/abs/1807.08233v1
PDF http://arxiv.org/pdf/1807.08233v1.pdf
PWC https://paperswithcode.com/paper/rapid-autonomous-car-control-based-on-spatial
Repo https://github.com/dantuluri/raccoon
Framework none

Photometric Depth Super-Resolution

Title Photometric Depth Super-Resolution
Authors Bjoern Haefner, Songyou Peng, Alok Verma, Yvain Quéau, Daniel Cremers
Abstract This study explores the use of photometric techniques (shape-from-shading and uncalibrated photometric stereo) for upsampling the low-resolution depth map from an RGB-D sensor to the higher resolution of the companion RGB image. A single-shot variational approach is first put forward, which is effective as long as the target’s reflectance is piecewise-constant. It is then shown that this dependency upon a specific reflectance model can be relaxed by focusing on a specific class of objects (e.g., faces) and delegating reflectance estimation to a deep neural network. A multi-shot strategy based on randomly varying lighting conditions is finally discussed. It requires no training or prior on the reflectance, yet this comes at the price of a dedicated acquisition setup. Both quantitative and qualitative evaluations illustrate the effectiveness of the proposed methods on synthetic and real-world scenarios.
Tasks Super-Resolution
Published 2018-09-26
URL https://arxiv.org/abs/1809.10097v2
PDF https://arxiv.org/pdf/1809.10097v2.pdf
PWC https://paperswithcode.com/paper/photometric-depth-super-resolution
Repo https://github.com/pengsongyou/SRmeetsPS
Framework none

Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network

Title Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network
Authors Zizhao Zhang, Yuanpu Xie, Lin Yang
Abstract This paper presents a novel method to deal with the challenging task of generating photographic images conditioned on semantic image descriptions. Our method introduces accompanying hierarchical-nested adversarial objectives inside the network hierarchies, which regularize mid-level representations and assist generator training to capture the complex image statistics. We present an extensile single-stream generator architecture to better adapt the jointed discriminators and push generated images up to high resolutions. We adopt a multi-purpose adversarial loss to encourage more effective image and text information usage in order to improve the semantic consistency and image fidelity simultaneously. Furthermore, we introduce a new visual-semantic similarity measure to evaluate the semantic consistency of generated images. With extensive experimental validation on three public datasets, our method significantly improves previous state of the arts on all datasets over different evaluation metrics.
Tasks Image Generation, Semantic Similarity, Semantic Textual Similarity
Published 2018-02-26
URL http://arxiv.org/abs/1802.09178v2
PDF http://arxiv.org/pdf/1802.09178v2.pdf
PWC https://paperswithcode.com/paper/photographic-text-to-image-synthesis-with-a
Repo https://github.com/ypxie/HDGan
Framework pytorch

Modeling of nonlinear audio effects with end-to-end deep neural networks

Title Modeling of nonlinear audio effects with end-to-end deep neural networks
Authors Marco Martínez, Joshua D. Reiss
Abstract In the context of music production, distortion effects are mainly used for aesthetic reasons and are usually applied to electric musical instruments. Most existing methods for nonlinear modeling are often either simplified or optimized to a very specific circuit. In this work, we investigate deep learning architectures for audio processing and we aim to find a general purpose end-to-end deep neural network to perform modeling of nonlinear audio effects. We show the network modeling various nonlinearities and we discuss the generalization capabilities among different instruments.
Tasks
Published 2018-10-15
URL http://arxiv.org/abs/1810.06603v2
PDF http://arxiv.org/pdf/1810.06603v2.pdf
PWC https://paperswithcode.com/paper/modeling-of-nonlinear-audio-effects-with-end
Repo https://github.com/mchijmma/modeling-nonlinear
Framework none

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning

Title Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning
Authors Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn
Abstract Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time. Given that it is impractical to train separate policies to accommodate all situations the agent may see in the real world, this work proposes to learn how to quickly and effectively adapt online to new tasks. To enable sample-efficient learning, we consider learning online adaptation in the context of model-based reinforcement learning. Our approach uses meta-learning to train a dynamics model prior such that, when combined with recent data, this prior can be rapidly adapted to the local context. Our experiments demonstrate online adaptation for continuous control tasks on both simulated and real-world agents. We first show simulated agents adapting their behavior online to novel terrains, crippled body parts, and highly-dynamic environments. We also illustrate the importance of incorporating online adaptation into autonomous agents that operate in the real world by applying our method to a real dynamic legged millirobot. We demonstrate the agent’s learned ability to quickly adapt online to a missing leg, adjust to novel terrains and slopes, account for miscalibration or errors in pose estimation, and compensate for pulling payloads.
Tasks Continuous Control, Meta-Learning, Pose Estimation
Published 2018-03-30
URL http://arxiv.org/abs/1803.11347v6
PDF http://arxiv.org/pdf/1803.11347v6.pdf
PWC https://paperswithcode.com/paper/learning-to-adapt-in-dynamic-real-world
Repo https://github.com/iclavera/learning_to_adapt
Framework tf
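
The meta-reinforcement-learning abstract describes a dynamics-model prior that is rapidly adapted to the local context using recent data. The sketch below shows only that adaptation step, on a toy scalar linear model. The model form, the plain gradient-descent update, and the values of `lr` and `steps` are assumptions of this sketch. The meta-training that would produce the prior is not shown.

```python
def predict(params, state, action):
    """Toy dynamics model (hypothetical): next_state = a * state + b * action."""
    a, b = params
    return a * state + b * action


def model_error(params, transitions):
    """Mean squared one-step prediction error."""
    return sum((predict(params, s, u) - s_next) ** 2
               for s, u, s_next in transitions) / len(transitions)


def adapt(prior, recent, lr=0.1, steps=1):
    """Take a few gradient steps from the prior on the most recent transitions."""
    a, b = prior
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, u, s_next in recent:
            err = predict((a, b), s, u) - s_next
            grad_a += 2.0 * err * s / len(recent)
            grad_b += 2.0 * err * u / len(recent)
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


# Recent data from a system whose actuator is half as effective as the prior assumes.
recent = [(0.0, 1.0, 0.5), (1.0, -1.0, 0.5), (0.5, 0.5, 0.75), (1.0, 1.0, 1.5)]
prior = (1.0, 1.0)
adapted = adapt(prior, recent, steps=5)
print(model_error(prior, recent), model_error(adapted, recent))
```

The adapted model would then be used for planning over the next few steps, and adaptation would be repeated as new transitions arrive.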

Query-Efficient Hard-label Black-box Attack:An Optimization-based Approach

Title Query-Efficient Hard-label Black-box Attack:An Optimization-based Approach
Authors Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, Cho-Jui Hsieh
Abstract We study the problem of attacking a machine learning model in the hard-label black-box setting, where no model information is revealed except that the attacker can make queries to probe the corresponding hard-label decisions. This is a very challenging problem since the direct extension of state-of-the-art white-box attacks (e.g., CW or PGD) to the hard-label black-box setting will require minimizing a non-continuous step function, which is combinatorial and cannot be solved by a gradient-based optimizer. The only current approach is based on random walk on the boundary, which requires lots of queries and lacks convergence guarantees. We propose a novel way to formulate the hard-label black-box attack as a real-valued optimization problem which is usually continuous and can be solved by any zeroth order optimization algorithm. For example, using the Randomized Gradient-Free method, we are able to bound the number of iterations needed for our algorithm to achieve stationary points. We demonstrate that our proposed method outperforms the previous random walk approach to attacking convolutional neural networks on MNIST, CIFAR, and ImageNet datasets. More interestingly, we show that the proposed algorithm can also be used to attack other discrete and non-continuous machine learning models, such as Gradient Boosting Decision Trees (GBDT).
Tasks
Published 2018-07-12
URL http://arxiv.org/abs/1807.04457v1
PDF http://arxiv.org/pdf/1807.04457v1.pdf
PWC https://paperswithcode.com/paper/query-efficient-hard-label-black-box-attackan
Repo https://github.com/SachJbp/Black-Box-Adversarial-Attack-in-Discrete-Domain
Framework none
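
The hard-label attack abstract reformulates the attack as a real-valued optimization problem solvable by zeroth-order methods. One way to read this is to minimize g(theta), the distance from the input to the decision boundary along direction theta, where g is evaluated by binary search over hard-label queries. The sketch below follows that reading on a toy model. The classifier, the step sizes, and the greedy acceptance rule are assumptions of this sketch, not details taken from the abstract.

```python
import math
import random


def classify(x):
    # Toy hard-label model (hypothetical): the attacker sees only this label.
    return 1 if x[0] + 0.5 * x[1] > 1.0 else 0


def boundary_distance(x0, y0, theta, tol=1e-6, max_radius=1e3):
    """g(theta): distance from x0 to the decision boundary along theta."""
    norm = math.sqrt(sum(t * t for t in theta))
    direction = [t / norm for t in theta]

    def flipped(lam):
        point = [x + lam * d for x, d in zip(x0, direction)]
        return classify(point) != y0

    hi = 1.0
    while not flipped(hi):  # coarse search for a radius that changes the label
        hi *= 2.0
        if hi > max_radius:
            return math.inf
    lo = 0.0
    while hi - lo > tol:  # binary search using hard-label queries only
        mid = (lo + hi) / 2.0
        if flipped(mid):
            hi = mid
        else:
            lo = mid
    return hi


def attack(x0, y0, theta, rng, iterations=100, step=0.05, beta=0.01):
    """Randomized finite-difference descent on g, starting from a valid theta."""
    best = boundary_distance(x0, y0, theta)
    for _ in range(iterations):
        u = [rng.gauss(0.0, 1.0) for _ in theta]
        probe = boundary_distance(x0, y0, [t + beta * ui for t, ui in zip(theta, u)])
        if math.isinf(probe):
            continue
        slope = (probe - best) / beta  # finite-difference directional derivative
        candidate = [t - step * slope * ui for t, ui in zip(theta, u)]
        value = boundary_distance(x0, y0, candidate)
        if value < best:  # greedy acceptance, a simplification of this sketch
            theta, best = candidate, value
    return theta, best


theta, best = attack([0.0, 0.0], 0, [1.0, 0.0], random.Random(0))
print(best)
```

The initial direction must already cross the decision boundary. In practice it could be found by sampling random directions until `boundary_distance` returns a finite value.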

Geometry meets semantics for semi-supervised monocular depth estimation

Title Geometry meets semantics for semi-supervised monocular depth estimation
Authors Pierluigi Zama Ramirez, Matteo Poggi, Fabio Tosi, Stefano Mattoccia, Luigi Di Stefano
Abstract Depth estimation from a single image represents a very exciting challenge in computer vision. While other image-based depth sensing techniques leverage the geometry between different viewpoints (e.g., stereo or structure from motion), the lack of these cues within a single image renders the monocular depth estimation task ill-posed. For inference, state-of-the-art encoder-decoder architectures for monocular depth estimation rely on effective feature representations learned at training time. For unsupervised training of these models, geometry has been effectively exploited by suitable image warping losses computed from views acquired by a stereo rig or a moving camera. In this paper, we take a further step forward, showing that learning semantic information from images also makes it possible to effectively improve monocular depth estimation. In particular, by leveraging semantically labeled images together with unsupervised signals gained from geometry through an image warping loss, we propose a deep learning approach aimed at joint semantic segmentation and depth estimation. Our overall learning framework is semi-supervised, as we deploy ground-truth data only in the semantic domain. At training time, our network learns a common feature representation for both tasks and a novel cross-task loss function is proposed. The experimental findings show that jointly tackling depth prediction and semantic segmentation improves depth estimation accuracy. In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation.
Tasks Depth Estimation, Monocular Depth Estimation, Semantic Segmentation
Published 2018-10-09
URL http://arxiv.org/abs/1810.04093v2
PDF http://arxiv.org/pdf/1810.04093v2.pdf
PWC https://paperswithcode.com/paper/geometry-meets-semantics-for-semi-supervised
Repo https://github.com/CVLAB-Unibo/Semantic-Mono-Depth
Framework tf
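
The depth-estimation abstract relies on an image warping loss computed from stereo views as its unsupervised signal. The sketch below illustrates the idea on a single image row: the right view is resampled with a predicted disparity and compared with the left view. The sign convention (a left pixel at x appears at x - d in the right view), the border clamping, and the L1 penalty are assumptions of this sketch. The paper's cross-task loss is not shown.

```python
def sample_linear(row, pos):
    """Linearly interpolate row at a fractional position, clamping at the borders."""
    pos = min(max(pos, 0.0), len(row) - 1.0)
    i0 = int(pos)
    i1 = min(i0 + 1, len(row) - 1)
    frac = pos - i0
    return (1.0 - frac) * row[i0] + frac * row[i1]


def warp_row(right_row, disparity_row):
    """Reconstruct the left row by sampling the right row at x - disparity."""
    return [sample_linear(right_row, x - d) for x, d in enumerate(disparity_row)]


def photometric_l1(left_row, right_row, disparity_row):
    """Mean absolute difference between the left row and the warped right row."""
    warped = warp_row(right_row, disparity_row)
    return sum(abs(l - w) for l, w in zip(left_row, warped)) / len(left_row)


right = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
left = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]  # the same scene shifted by two pixels
print(photometric_l1(left, right, [2.0] * 6))  # correct disparity
print(photometric_l1(left, right, [0.0] * 6))  # wrong disparity
```

Because the loss falls only when the disparity is right, it can supervise depth without ground-truth depth labels. A trainable version would need a differentiable sampler in the chosen framework.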

Structured Neural Summarization

Title Structured Neural Summarization
Authors Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt
Abstract Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input. Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data such as text. In an extensive evaluation, we show that the resulting hybrid sequence-graph models outperform both pure sequence models as well as pure graph models on a range of summarization tasks.
Tasks
Published 2018-11-05
URL http://arxiv.org/abs/1811.01824v2
PDF http://arxiv.org/pdf/1811.01824v2.pdf
PWC https://paperswithcode.com/paper/structured-neural-summarization
Repo https://github.com/CoderPat/structured-neural-summarization
Framework tf

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Title Modeling Others using Oneself in Multi-Agent Reinforcement Learning
Authors Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus
Abstract We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players’ hidden goals from their observed behavior in order to solve the tasks. We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent’s actions and update its belief of their hidden state in an online manner. We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players’ hidden states, in both cooperative and adversarial settings.
Tasks Multi-agent Reinforcement Learning
Published 2018-02-26
URL http://arxiv.org/abs/1802.09640v3
PDF http://arxiv.org/pdf/1802.09640v3.pdf
PWC https://paperswithcode.com/paper/modeling-others-using-oneself-in-multi-agent
Repo https://github.com/cts198859/deeprl_dist
Framework tf
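
The Self Other-Modeling abstract has an agent use its own policy to predict the other agent's actions and update its belief about their hidden goal online. The abstract does not give the update rule, so the sketch below substitutes a Bayes-style update over a small discrete set of goals. The toy policy and all names are hypothetical.

```python
def own_policy(position, goal):
    """Toy policy on a line (hypothetical): step toward the goal with probability 0.9."""
    toward = 1 if goal > position else -1
    return {toward: 0.9, -toward: 0.1}


def update_belief(belief, position, observed_action):
    """Reweight each candidate goal by how likely the agent's own policy
    would have produced the observed action under that goal."""
    posterior = {goal: prob * own_policy(position, goal)[observed_action]
                 for goal, prob in belief.items()}
    total = sum(posterior.values())
    return {goal: prob / total for goal, prob in posterior.items()}


belief = {0: 0.5, 10: 0.5}  # the other agent's goal is either 0 or 10
belief = update_belief(belief, position=5, observed_action=1)
print(belief)
```

Repeating the update after every observed action sharpens the belief, and the agent can then condition its own policy on the most likely goal.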

Data-driven Analysis of Complex Networks and their Model-generated Counterparts

Title Data-driven Analysis of Complex Networks and their Model-generated Counterparts
Authors Marcell Nagy, Roland Molontay
Abstract Data-driven analysis of complex networks has been in the focus of research for decades. An important question is to discover the relation between various network characteristics in real-world networks and how these relationships vary across network domains. A related research question is to study how well the network models can capture the observed relations between the graph metrics. In this paper, we apply statistical and machine learning techniques to answer the aforementioned questions. We study 400 real-world networks along with 2400 networks generated by five frequently used network models with previously fitted parameters to make the generated graphs as similar to the real network as possible. We find that the correlation profiles of the structural measures significantly differ across network domains and the domain can be efficiently determined using a small selection of graph metrics. The goodness-of-fit of the network models and the best performing models themselves highly depend on the domains. Using machine learning techniques, it turned out to be relatively easy to decide if a network is real or model-generated. We also investigate what structural properties make it possible to achieve a good accuracy, i.e. what features the network models cannot capture.
Tasks
Published 2018-10-19
URL http://arxiv.org/abs/1810.08498v3
PDF http://arxiv.org/pdf/1810.08498v3.pdf
PWC https://paperswithcode.com/paper/data-driven-analysis-of-complex-networks-and
Repo https://github.com/marcessz/Social-Networks
Framework none

Nugget Proposal Networks for Chinese Event Detection

Title Nugget Proposal Networks for Chinese Event Detection
Authors Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
Abstract Neural network based models commonly regard event detection as a word-wise classification task, which suffers from the mismatch problem between words and event triggers, especially in languages without natural word delimiters such as Chinese. In this paper, we propose Nugget Proposal Networks (NPNs), which can solve the word-trigger mismatch problem by directly proposing entire trigger nuggets centered at each character regardless of word boundaries. Specifically, NPNs perform event detection in a character-wise paradigm, where a hybrid representation for each character is first learned to capture both structural and semantic information from both characters and words. Then based on learned representations, trigger nuggets are proposed and categorized by exploiting character compositional structures of Chinese event triggers. Experiments on both ACE2005 and TAC KBP 2017 datasets show that NPNs significantly outperform the state-of-the-art methods.
Tasks
Published 2018-05-01
URL http://arxiv.org/abs/1805.00249v1
PDF http://arxiv.org/pdf/1805.00249v1.pdf
PWC https://paperswithcode.com/paper/nugget-proposal-networks-for-chinese-event
Repo https://github.com/sanmusunrise/NPNs
Framework tf

Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Title Combining Fact Extraction and Verification with Neural Semantic Matching Networks
Authors Yixin Nie, Haonan Chen, Mohit Bansal
Abstract The increasing concern with misinformation has stimulated research efforts on automatic fact checking. The recently-released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. In this paper, we present a connected system consisting of three homogeneous neural semantic matching models that conduct document retrieval, sentence selection, and claim verification jointly for fact extraction and verification. For evidence retrieval (document retrieval and sentence selection), unlike traditional vector space IR models in which queries and sources are matched in some pre-designed term vector space, we develop neural models to perform deep semantic matching from raw textual input, assuming no intermediate term representation and no access to structured external knowledge bases. We also show that Pageview frequency can help improve the performance of evidence retrieval results, which can later be matched using our neural semantic matching network. For claim verification, unlike previous approaches that simply feed upstream retrieved evidence and the claim to a natural language inference (NLI) model, we further enhance the NLI model by providing it with internal semantic relatedness scores (hence integrating it with the evidence retrieval modules) and ontological WordNet features. Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve the state-of-the-art results on the FEVER test set.
Tasks Natural Language Inference
Published 2018-11-16
URL http://arxiv.org/abs/1811.07039v1
PDF http://arxiv.org/pdf/1811.07039v1.pdf
PWC https://paperswithcode.com/paper/combining-fact-extraction-and-verification
Repo https://github.com/easonnie/combine-FEVER-NSMN
Framework pytorch

Demystifying MMD GANs

Title Demystifying MMD GANs
Authors Mikołaj Bińkowski, Dougal J. Sutherland, Michael Arbel, Arthur Gretton
Abstract We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning a discriminator based on samples leads to biased gradients for the generator parameters. We also discuss the issue of kernel choice for the MMD critic, and characterize the kernel corresponding to the energy distance used for the Cramer GAN critic. Being an integral probability metric, the MMD benefits from training strategies recently developed for Wasserstein GANs. In experiments, the MMD GAN is able to employ a smaller critic network than the Wasserstein GAN, resulting in a simpler and faster-training algorithm with matching performance. We also propose an improved measure of GAN convergence, the Kernel Inception Distance, and show how to use it to dynamically adapt learning rates during GAN training.
Tasks
Published 2018-01-04
URL http://arxiv.org/abs/1801.01401v4
PDF http://arxiv.org/pdf/1801.01401v4.pdf
PWC https://paperswithcode.com/paper/demystifying-mmd-gans
Repo https://github.com/mbinkowski/MMD-GAN
Framework tf
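
The MMD GAN abstract uses the Maximum Mean Discrepancy as a critic and proposes the Kernel Inception Distance as a convergence measure. The sketch below computes the standard unbiased estimate of squared MMD between two samples in plain Python. The cubic polynomial kernel is an assumption here, since the abstract does not name a kernel. In practice the inputs would be network features rather than raw vectors.

```python
def poly_kernel(x, y):
    """Cubic polynomial kernel (x . y / d + 1)^3, where d is the dimension."""
    d = len(x)
    return (sum(a * b for a, b in zip(x, y)) / d + 1.0) ** 3


def mmd2_unbiased(xs, ys, kernel=poly_kernel):
    """Unbiased estimate of squared MMD; within-sample terms skip i == j."""
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(n) for j in range(n) if i != j)
    k_xy = sum(kernel(x, y) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


real = [[0.0], [0.0]]
fake = [[1.0], [1.0]]
print(mmd2_unbiased(real, fake))  # 1 + 8 - 2 = 7.0
```

Because the estimator is unbiased, it can come out negative on small samples, for example when both samples are `[[0.0], [2.0]]`. Averaging over several subsets reduces that variance.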