January 28, 2020

Paper Group ANR 1003

Adversarial Language Games for Advanced Natural Language Intelligence

Title Adversarial Language Games for Advanced Natural Language Intelligence
Authors Yuan Yao, Haoxi Zhong, Zhengyan Zhang, Xu Han, Xiaozhi Wang, Chaojun Xiao, Guoyang Zeng, Zhiyuan Liu, Maosong Sun
Abstract While adversarial games have been well studied in board games, electronic sports, and similar settings, they remain a nearly blank field in natural language processing. Since natural language is inherently interactive, we propose a challenging pragmatics game called Adversarial Taboo, in which an attacker and a defender compete with each other through sequential natural language interactions. The attacker is tasked with inducing the defender to utter a target word that is invisible to the defender, while the defender is tasked with detecting the target word before being induced to say it. In Adversarial Taboo, a successful attacker must hide its intention and subtly induce the defender, while a competitive defender must be cautious in its utterances and infer the intention of the attacker. To instantiate the game, we create a game environment and a competition platform. Extensive pilot experiments and empirical studies on several baseline attack and defense strategies show promising and interesting results. Based on analyses of the game and the experiments, we discuss multiple promising directions for future research.
Tasks Board Games
Published 2019-11-05
URL https://arxiv.org/abs/1911.01622v2
PDF https://arxiv.org/pdf/1911.01622v2.pdf
PWC https://paperswithcode.com/paper/adversarial-language-games-for-advanced
Repo
Framework
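
The abstract specifies the game's roles but not its referee logic. Below is a minimal sketch of a referee loop for an Adversarial Taboo-style game, written from the abstract alone: the turn structure, the draw outcome, and the rule that a wrong defender guess loses are all assumptions, and `attacker` and `defender` stand in for arbitrary dialogue agents.

```python
# Minimal referee loop for an Adversarial Taboo-style game (a sketch inferred
# from the abstract, not the official game environment).

def play_adversarial_taboo(attacker, defender, target_word, max_turns=10):
    """Returns 'attacker' or 'defender' on a win, or 'draw' after max_turns."""
    history = []
    for _ in range(max_turns):
        # The attacker sees the target word; the defender does not.
        attack_utterance = attacker(history, target_word)
        history.append(("attacker", attack_utterance))

        defense_utterance, guess = defender(history)   # guess may be None
        history.append(("defender", defense_utterance))

        if target_word in defense_utterance.lower().split():
            return "attacker"      # defender was induced to say the target word
        if guess is not None:      # assumption: a wrong guess loses the game
            return "defender" if guess == target_word else "attacker"
    return "draw"
```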

Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression

Title Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression
Authors Kai Wang, Jianfei Yang, Da Guo, Kaipeng Zhang, Xiaojiang Peng, Yu Qiao
Abstract This paper presents our approach to the engagement intensity regression task of EmotiW 2019. The task is to predict the engagement intensity of a student watching an online MOOC video under various conditions. Building on our winning solution from last year, we mainly explore head features and body features with a bootstrap strategy and two novel loss functions. We retain the framework of multi-instance learning with a long short-term memory (LSTM) network and make three contributions. First, besides gaze and head-pose features, we incorporate facial landmark features into our framework. Second, inspired by the fact that engagement intensities can be ranked, we design a rank loss as a regularizer that enforces a distance margin between the features of distant category pairs and those of adjacent category pairs. Third, we use classical bootstrap aggregation for model ensembling: we randomly sample subsets of the training data several times and average the resulting models' predictions. We evaluate the performance of our method and discuss the influence of each component on the validation set. Our method won 3rd place with an MSE of 0.0626 on the test set.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03422v1
PDF https://arxiv.org/pdf/1907.03422v1.pdf
PWC https://paperswithcode.com/paper/bootstrap-model-ensemble-and-rank-loss-for
Repo
Framework
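
To make the rank-loss idea concrete, here is a minimal sketch of a margin-based rank loss along the lines the abstract describes: feature pairs from distant engagement categories must be separated by a larger margin than pairs from adjacent categories. The margin values and the Euclidean distance are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def rank_loss(features, labels, margin_adj=0.5, margin_dist=2.0):
    """features: (N, D) array; labels: (N,) integer engagement levels."""
    loss, num_pairs = 0.0, 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(int(labels[i]) - int(labels[j]))
            if gap == 0:
                continue
            d = np.linalg.norm(features[i] - features[j])
            # Adjacent categories get a small margin, distant ones a large one.
            margin = margin_adj if gap == 1 else margin_dist
            loss += max(0.0, margin - d)   # hinge: penalize pairs closer than the margin
            num_pairs += 1
    return loss / max(num_pairs, 1)
```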

ZstGAN: An Adversarial Approach for Unsupervised Zero-Shot Image-to-Image Translation

Title ZstGAN: An Adversarial Approach for Unsupervised Zero-Shot Image-to-Image Translation
Authors Jianxin Lin, Yingce Xia, Sen Liu, Tao Qin, Zhibo Chen
Abstract Image-to-image translation models have shown a remarkable ability to transfer images among different domains. Most existing work assumes that the source and target domains are the same at training and inference, which does not generalize to translating an image from one unseen domain to another unseen domain. In this work, we propose the Unsupervised Zero-Shot Image-to-image Translation (UZSIT) problem, which aims to learn a model that can transfer translation knowledge from seen domains to unseen domains. Accordingly, we propose a framework called ZstGAN: using an adversarial training scheme, ZstGAN learns to model each domain with a domain-specific feature distribution that is semantically consistent across the vision and attribute modalities. The domain-invariant features are then disentangled with a shared encoder for image generation. We carry out extensive experiments on the CUB and FLO datasets, and the results demonstrate the effectiveness of the proposed method on the UZSIT task. Moreover, ZstGAN shows significant accuracy improvements over state-of-the-art zero-shot learning methods on CUB and FLO.
Tasks Image Generation, Image-to-Image Translation, Zero-Shot Learning
Published 2019-06-01
URL https://arxiv.org/abs/1906.00184v1
PDF https://arxiv.org/pdf/1906.00184v1.pdf
PWC https://paperswithcode.com/paper/190600184
Repo
Framework
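
As one way to picture the adversarial scheme described above, the sketch below aligns vision and attribute features with a modality discriminator, so that a domain's feature distribution becomes consistent across the two modalities. This is a reading of the abstract, not the official ZstGAN code; the layer sizes, the pre-extracted 2048-dimensional image features, and the 312-dimensional CUB attribute vectors are assumptions.

```python
import torch
import torch.nn as nn

feat_dim, attr_dim = 128, 312          # 312 = CUB attribute count (assumption)
vision_enc = nn.Linear(2048, feat_dim)  # stand-in encoder over image features
attr_enc = nn.Linear(attr_dim, feat_dim)
modality_disc = nn.Linear(feat_dim, 1)  # vision vs. attribute discriminator
bce = nn.BCEWithLogitsLoss()

def alignment_step(img_feats, attrs, opt_enc, opt_disc):
    v, a = vision_enc(img_feats), attr_enc(attrs)
    # Discriminator: tell vision features (label 1) from attribute features (0).
    d_loss = bce(modality_disc(v.detach()), torch.ones(len(v), 1)) + \
             bce(modality_disc(a.detach()), torch.zeros(len(a), 1))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
    # Encoders: fool the discriminator so both modalities share one distribution.
    g_loss = bce(modality_disc(attr_enc(attrs)), torch.ones(len(attrs), 1))
    opt_enc.zero_grad(); g_loss.backward(); opt_enc.step()
```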

Network Transplanting (extended abstract)

Title Network Transplanting (extended abstract)
Authors Quanshi Zhang, Yu Yang, Qian Yu, Ying Nian Wu
Abstract This paper focuses on a new task: transplanting a category- and task-specific neural network into a generic, modular network without strong supervision. We design a functionally interpretable structure for the generic network. Like assembling LEGO blocks, we teach the generic network a new category by directly transplanting the corresponding module from a pre-trained network, using few or even no sample annotations. Our method incrementally adds new categories to the generic network without affecting the representations of existing categories. In this way, it breaks the typical bottleneck of learning a single network for massive tasks and categories, namely the requirement of collecting samples for all tasks and categories before learning begins. To overcome the specific challenges of network transplanting, we use a new distillation algorithm, namely back-distillation. Even without training samples, our method outperformed the baseline trained with 100 samples.
Tasks
Published 2019-01-21
URL http://arxiv.org/abs/1901.06978v1
PDF http://arxiv.org/pdf/1901.06978v1.pdf
PWC https://paperswithcode.com/paper/network-transplanting-extended-abstract
Repo
Framework

Automated Non-Destructive Inspection of Fused Filament Fabrication Components Using Thermographic Signal Reconstruction

Title Automated Non-Destructive Inspection of Fused Filament Fabrication Components Using Thermographic Signal Reconstruction
Authors Joshua E. Siegel, Maria F. Beemer, Steven M. Shepard
Abstract Manufacturers struggle to produce low-cost, robust, complex components at a manufacturing lot size of one. Additive processes such as Fused Filament Fabrication (FFF) inexpensively produce complex geometries, but defects limit their viability in critical applications. We present an approach to high-accuracy, high-throughput, low-cost automated non-destructive testing (NDT) of FFF interlayer delamination using Flash Thermography (FT) data processed with Thermographic Signal Reconstruction (TSR) and Artificial Intelligence (AI). A Deep Neural Network (DNN) attains 95.4% per-pixel accuracy when differentiating four delamination thicknesses 5 mm below the surface in PolyLactic Acid (PLA) widgets, and 98.6% accuracy when differentiating acceptable from unacceptable condition for the same components. Automated inspection enables time- and cost-efficient 100% inspection for delamination defects, supporting FFF's use in critical and small-batch applications.
Tasks
Published 2019-07-05
URL https://arxiv.org/abs/1907.02634v1
PDF https://arxiv.org/pdf/1907.02634v1.pdf
PWC https://paperswithcode.com/paper/automated-non-destructive-inspection-of-fused
Repo
Framework
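
Thermographic Signal Reconstruction fits a low-order polynomial to each pixel's post-flash cooling curve in log-log space, and the fitted coefficients serve as compact per-pixel features for a downstream classifier. The sketch below shows that per-pixel fit; the polynomial order is an illustrative choice, not necessarily the paper's.

```python
import numpy as np

def tsr_features(frames, times, order=5):
    """frames: (T, H, W) post-flash temperature rise (positive values);
    times: (T,) seconds after the flash.
    Returns (H, W, order+1) polynomial coefficients per pixel."""
    T, H, W = frames.shape
    log_t = np.log(times)                          # (T,)
    log_T = np.log(frames.reshape(T, -1))          # (T, H*W)
    # np.polyfit supports a 2-D y: one least-squares fit per pixel column.
    coeffs = np.polyfit(log_t, log_T, deg=order)   # (order+1, H*W)
    return coeffs.T.reshape(H, W, order + 1)
```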

Solving high-dimensional optimal stopping problems using deep learning

Title Solving high-dimensional optimal stopping problems using deep learning
Authors Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, Timo Welti
Abstract Many financial derivatives traded on stock and futures exchanges, such as American and Bermudan options, are of early-exercise type. The pricing of early-exercise options often gives rise to high-dimensional optimal stopping problems, since the dimension corresponds to the number of underlyings in the associated hedging portfolio. High-dimensional optimal stopping problems are, however, notoriously difficult to solve due to the well-known curse of dimensionality. In this work we propose an algorithm for solving such problems, which is based on deep learning and computes, in the context of early-exercise option pricing, approximations of both an optimal exercise strategy and the price of the considered option. The proposed algorithm can also be applied to optimal stopping problems arising in other areas where the underlying stochastic process can be efficiently simulated. We present numerical results for a large number of example problems, including the pricing of many high-dimensional American and Bermudan options, such as Bermudan max-call options in up to 5000 dimensions. Most of the obtained results are compared to reference values computed by exploiting the specific problem design or, where available, to reference values from the literature. These numerical results suggest that the proposed algorithm is highly effective in the case of many underlyings, in terms of both accuracy and speed.
Tasks
Published 2019-08-05
URL https://arxiv.org/abs/1908.01602v2
PDF https://arxiv.org/pdf/1908.01602v2.pdf
PWC https://paperswithcode.com/paper/solving-high-dimensional-optimal-stopping
Repo
Framework
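
For intuition about the optimal stopping problem being solved, here is a classical backward-induction sketch for a Bermudan max-call, with least-squares polynomial regression (Longstaff-Schwartz style) standing in for the paper's neural-network approximation of the continuation value. All market parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_paths, n_steps = 5, 100_000, 9            # d underlyings, 9 exercise dates
s0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 3.0
dt = T / n_steps

# Simulate geometric Brownian motion paths: shape (n_steps+1, n_paths, d).
S = np.empty((n_steps + 1, n_paths, d))
S[0] = s0
for t in range(n_steps):
    z = rng.standard_normal((n_paths, d))
    S[t + 1] = S[t] * np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)

def payoff(s):
    return np.maximum(s.max(axis=1) - K, 0.0)  # max-call payoff

value = payoff(S[-1])                          # value if held to maturity
for t in range(n_steps - 1, 0, -1):
    value *= np.exp(-r * dt)                   # discount one step back
    exercise = payoff(S[t])
    itm = exercise > 0
    if itm.sum() < 10:
        continue
    # Regress the discounted continuation value on simple features of S_t.
    X = np.column_stack([np.ones(itm.sum()), S[t][itm], S[t][itm] ** 2])
    beta, *_ = np.linalg.lstsq(X, value[itm], rcond=None)
    stop = exercise[itm] > X @ beta            # exercise where immediate payoff wins
    value[np.flatnonzero(itm)[stop]] = exercise[itm][stop]

price = np.exp(-r * dt) * value.mean()
print(f"estimated Bermudan max-call price: {price:.3f}")
```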

Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection

Title Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection
Authors Changhee Han, Leonardo Rundo, Ryosuke Araki, Yudai Nagano, Yujiro Furukawa, Giancarlo Mauri, Hideki Nakayama, Hideaki Hayashi
Abstract Convolutional Neural Networks (CNNs) achieve excellent computer-assisted diagnosis given sufficient annotated training data. However, most medical imaging datasets are small and fragmented. In this context, Generative Adversarial Networks (GANs) can synthesize realistic and diverse additional training images to fill gaps in the real image distribution; researchers have improved classification by augmenting data with noise-to-image GANs (e.g., mapping random noise samples to diverse pathological images) or image-to-image GANs (e.g., mapping a benign image to a malignant one). Yet no research has reported results from combining noise-to-image and image-to-image GANs for a further performance boost. Therefore, to maximize the data augmentation (DA) effect of the GAN combination, we propose a two-step GAN-based DA that generates and refines brain Magnetic Resonance (MR) images with and without tumors separately: (i) Progressive Growing of GANs (PGGANs), a multi-stage noise-to-image GAN for high-resolution MR image generation, first generates realistic and diverse 256 x 256 images; (ii) Multimodal UNsupervised Image-to-image Translation (MUNIT), which combines GANs and Variational AutoEncoders, or SimGAN, which uses a DA-focused GAN loss, further refines the texture and shape of the PGGAN-generated images to resemble the real ones. We thoroughly investigate CNN-based tumor classification results, also considering the influence of pre-training on ImageNet and of discarding weird-looking GAN-generated images. The results show that, when combined with classic DA, our two-step GAN-based DA can significantly outperform classic DA alone in tumor detection (boosting sensitivity from 93.67% to 97.48%) and in other medical imaging tasks.
Tasks Data Augmentation, Image Augmentation, Image Generation, Image-to-Image Translation, Multimodal Unsupervised Image-To-Image Translation, Unsupervised Image-To-Image Translation
Published 2019-05-31
URL https://arxiv.org/abs/1905.13456v3
PDF https://arxiv.org/pdf/1905.13456v3.pdf
PWC https://paperswithcode.com/paper/combining-noise-to-image-and-image-to-image
Repo
Framework
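
At the orchestration level, the two-step DA pipeline reads roughly as follows. `pggan_generate`, `refine` (MUNIT or SimGAN), and `looks_plausible` are hypothetical placeholders marking where the paper's trained models and the manual discarding of weird-looking images would plug in; they are not real APIs.

```python
# Sketch of the two-step GAN-based data augmentation pipeline described above.
# All three helper functions are hypothetical stand-ins.

def two_step_gan_da(real_images, real_labels, n_synthetic):
    synthetic = []
    for label in ("tumor", "no_tumor"):      # generate/refine each class separately
        raw = pggan_generate(label, n_synthetic // 2)   # step 1: noise-to-image
        refined = refine(raw)                           # step 2: image-to-image
        # Discard weird-looking samples, as the paper does before training.
        synthetic += [(img, label) for img in refined if looks_plausible(img)]
    # Classic DA (flips, crops, ...) on real data is applied separately;
    # here we simply pool real and GAN-generated samples.
    return list(zip(real_images, real_labels)) + synthetic
```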

Deep learning and sub-word-unit approach in written art generation

Title Deep learning and sub-word-unit approach in written art generation
Authors Krzysztof Wołk, Emilia Zawadzka-Gosk, Wojciech Czarnowski
Abstract Automatic poetry generation is a novel and interesting application of natural language processing research. It has become more popular in the last few years due to rapid advances in technology and neural computing power. This line of research can be applied to the study of linguistics and literature, to social science experiments, or simply for entertainment. The most effective known method of artificial poem generation uses recurrent neural networks (RNNs). We likewise used RNNs to generate poems in the style of Adam Mickiewicz, training our network on the poem Sir Thaddeus. For data pre-processing, we used a specialized stemming tool, which is one of the major innovations and contributions of this work. Our experiment was conducted on the source text divided into sub-word units (at a level of resolution close to syllables). This approach is novel and is not often employed in the published literature. Sub-word units seem a natural choice for analysis of Polish, as the language is morphologically rich due to cases, gender forms, and a large vocabulary. Moreover, Sir Thaddeus contains rhymes, so the analysis of syllables can be meaningful. We verified our model with different settings of the temperature parameter, which controls the randomness of the generated text. We also compared our results with similar models trained on the same text divided into characters (the most common approach alongside full word units). The differences were tremendous: our solution generated much better poems, able to follow the metre and vocabulary of the source text.
Tasks
Published 2019-01-22
URL http://arxiv.org/abs/1901.07426v1
PDF http://arxiv.org/pdf/1901.07426v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-and-sub-word-unit-approach-in
Repo
Framework
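
The temperature parameter the authors vary has a standard definition: logits are divided by the temperature before the softmax, so low temperatures sharpen the sub-word distribution (conservative text) and high temperatures flatten it (more random text). A minimal sketch:

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """logits: (V,) unnormalized scores over the sub-word vocabulary."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = logits / temperature
    scaled -= scaled.max()                       # for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)       # index of the sampled sub-word
```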

Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks

Title Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks
Authors Xiang Li, Chan Lu, Danni Cheng, Wei-Hong Li, Mei Cao, Bo Liu, Jiechao Ma, Wei-Shi Zheng
Abstract Visible watermarks play an important role in image copyright protection, and the robustness of a visible watermark to attack is essential. To evaluate and improve the effectiveness of watermarks, watermark removal has attracted increasing attention and become a hot research topic. Current methods cast watermark removal as an image-to-image translation problem, adopting encoder-decoder architectures with pixel-wise losses to transform transparent watermarked pixels into unmarked pixels. However, in realistic settings the watermarks are more likely to be unknown and diverse (i.e., the watermarks might be opaque or semi-transparent, and their category and pattern are unknown). When applied to such real-world scenarios, existing methods mostly cannot satisfactorily reconstruct the information obscured by complex and varied watermarks: residual watermark traces remain and the reconstructed images lack realism. To address this difficulty, we present a new watermark processing framework that uses conditional generative adversarial networks (cGANs) for visible watermark removal in real-world applications. The proposed method brings watermark removal closer to photo-realistic reconstruction by using a patch-based discriminator conditioned on the watermarked images, adversarially trained to differentiate between recovered images and the original watermark-free images. Extensive experimental results on a large-scale visible watermark dataset demonstrate the effectiveness of the proposed method and clearly indicate that our approach produces more photo-realistic and convincing results than state-of-the-art methods.
Tasks Image-to-Image Translation
Published 2019-05-30
URL https://arxiv.org/abs/1905.12845v3
PDF https://arxiv.org/pdf/1905.12845v3.pdf
PWC https://paperswithcode.com/paper/towards-photo-realistic-visible-watermark
Repo
Framework
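
A patch-based discriminator conditioned on the watermarked image is typically realized by concatenating the watermarked and candidate-restored images along the channel axis and producing one logit per overlapping patch. The sketch below follows that common pattern; the layer sizes are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ConditionalPatchDiscriminator(nn.Module):
    def __init__(self, in_ch=6):               # 3 watermarked + 3 restored channels
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # one logit per patch
        )

    def forward(self, watermarked, restored):
        return self.net(torch.cat([watermarked, restored], dim=1))

# Usage: real pairs (watermarked, clean) are pushed toward 1 and fake pairs
# (watermarked, generator output) toward 0, with BCEWithLogitsLoss per patch.
```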

Attention on Abstract Visual Reasoning

Title Attention on Abstract Visual Reasoning
Authors Lukas Hahne, Timo Lüddecke, Florentin Wörgötter, David Kappel
Abstract Attention mechanisms have been boosting the performance of deep learning models on a wide range of applications, from speech understanding to program induction. However, despite experiments from psychology suggesting that attention plays an essential role in visual reasoning, the full potential of attention mechanisms has so far not been explored for abstract cognitive tasks on image data. In this work, we propose a hybrid network architecture grounded in self-attention and relational reasoning. We call this new model the Attention Relation Network (ARNe). ARNe combines features from the recently introduced Transformer and the Wild Relation Network (WReN). We test ARNe on the Procedurally Generated Matrices (PGM) dataset for abstract visual reasoning. ARNe outperforms the WReN model on this task by 11.28 percentage points, and it learns relational concepts between objects efficiently, needing only 35% of the training samples to surpass the reported accuracy of the baseline model. Our proposed hybrid model represents an alternative approach to learning abstract relations using self-attention and demonstrates that the Transformer network is also well suited for abstract visual reasoning.
Tasks Relational Reasoning, Visual Reasoning
Published 2019-11-14
URL https://arxiv.org/abs/1911.05990v1
PDF https://arxiv.org/pdf/1911.05990v1.pdf
PWC https://paperswithcode.com/paper/attention-on-abstract-visual-reasoning
Repo
Framework

Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs

Title Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs
Authors Soheil Kolouri, Aniruddha Saha, Hamed Pirsiavash, Heiko Hoffmann
Abstract The unprecedented success of deep neural networks in various applications has made these networks a prime target for adversarial exploitation. In this paper, we introduce a benchmark technique for detecting backdoor attacks (aka Trojan attacks) on deep convolutional neural networks (CNNs). We introduce the concept of Universal Litmus Patterns (ULPs), which reveal backdoor attacks by feeding these universal patterns to the network and analyzing the output (i.e., classifying the network as 'clean' or 'corrupted'). Detection is fast because it requires only a few forward passes through a CNN. We demonstrate the effectiveness of ULPs for detecting backdoor attacks on thousands of networks trained on three benchmark datasets: the German Traffic Sign Recognition Benchmark (GTSRB), MNIST, and CIFAR10.
Tasks Traffic Sign Recognition
Published 2019-06-26
URL https://arxiv.org/abs/1906.10842v1
PDF https://arxiv.org/pdf/1906.10842v1.pdf
PWC https://paperswithcode.com/paper/universal-litmus-patterns-revealing-backdoor
Repo
Framework
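
One reading of the ULP mechanism, as a sketch: a small set of learned input images is pushed through a suspect CNN, and a lightweight classifier over the resulting logits flags the network as clean or corrupted. The pattern count, input shape, and linear head below are assumptions.

```python
import torch
import torch.nn as nn

num_ulps, num_classes = 10, 10
ulps = nn.Parameter(torch.randn(num_ulps, 3, 32, 32))  # learned input patterns
head = nn.Linear(num_ulps * num_classes, 1)            # clean-vs-corrupted logit

def trojan_logit(cnn):
    """A few forward passes through the suspect CNN -> scalar trojan score."""
    logits = cnn(ulps)                 # (num_ulps, num_classes)
    return head(logits.flatten())      # > 0 would mean flagged as corrupted

# Training (not shown): optimize `ulps` and `head` jointly, e.g. with binary
# cross-entropy over a pool of networks whose clean/corrupted labels are known.
```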

De-identification without losing faces

Title De-identification without losing faces
Authors Yuezun Li, Siwei Lyu
Abstract Training deep learning models for computer vision requires large image or video datasets from the real world. In collecting such datasets, we often need to protect the privacy of the people captured in the images or videos while still preserving useful attributes such as facial expressions. In this work, we describe a new face de-identification method that preserves essential facial attributes while concealing identities. Our method takes advantage of recent advances in face attribute transfer models while maintaining high visual quality. Instead of altering factors of the original faces or synthesizing faces completely, our method uses a trained facial attribute transfer model (FATM) to map non-identity-related facial attributes onto the faces of donors, a small number (usually 2 to 3) of consenting subjects. Using the donors' faces ensures the natural appearance of the synthesized faces while guaranteeing that their identities are changed. At the same time, the FATM blends the donors' facial attributes with those of the original faces to diversify the appearance of the synthesized faces. Experimental results on several sets of images and videos demonstrate the effectiveness of our face de-ID algorithm.
Tasks
Published 2019-02-12
URL http://arxiv.org/abs/1902.04202v1
PDF http://arxiv.org/pdf/1902.04202v1.pdf
PWC https://paperswithcode.com/paper/de-identification-without-losing-faces
Repo
Framework

Fully Automatic Segmentation of 3D Brain Ultrasound: Learning from Coarse Annotations

Title Fully Automatic Segmentation of 3D Brain Ultrasound: Learning from Coarse Annotations
Authors Julia Rackerseder, Rüdiger Göbl, Nassir Navab, Christoph Hennersperger
Abstract Intra-operative ultrasound is an increasingly important imaging modality in neurosurgery. However, manual interaction with imaging data during procedures, for example to select landmarks or perform segmentation, is difficult and can be time consuming. Yet, as registration to other imaging modalities is required in most cases, some annotation is necessary. We propose a segmentation method based on DeepVNet and specifically evaluate the integration of pre-training with simulated ultrasound sweeps to improve automatic segmentation and enable a fully automatic initialization of registration. We show that, despite training on coarse and incomplete semi-automatic annotations, our approach is able to capture the desired superficial structures such as the \textit{sulci}, the \textit{cerebellar tentorium}, and the \textit{falx cerebri}. We perform five-fold cross-validation on the publicly available RESECT dataset. Trained on the dataset alone, we report Dice and Jaccard coefficients of $0.45 \pm 0.09$ and $0.30 \pm 0.07$ respectively, as well as an average distance of $0.78 \pm 0.36~mm$. With the suggested pre-training, we obtain Dice and Jaccard coefficients of $0.47 \pm 0.10$ and $0.31 \pm 0.08$, and an average distance of $0.71 \pm 0.38~mm$. The qualitative evaluation suggests that, with pre-training, the network learns to generalize better and provides refined, more complete segmentations compared with the incomplete annotations provided as input.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.08655v1
PDF http://arxiv.org/pdf/1904.08655v1.pdf
PWC https://paperswithcode.com/paper/fully-automatic-segmentation-of-3d-brain
Repo
Framework
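
For reference, the two overlap scores reported above are computed on binary masks as follows (the paper's numbers would be averaged over the five folds):

```python
import numpy as np

def dice_jaccard(pred, target):
    """pred, target: boolean arrays of the same shape."""
    inter = np.logical_and(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum())
    jaccard = inter / np.logical_or(pred, target).sum()
    return dice, jaccard   # note: Dice = 2J / (1 + J)
```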

Learning Digital Camera Pipeline for Extreme Low-Light Imaging

Title Learning Digital Camera Pipeline for Extreme Low-Light Imaging
Authors Syed Waqas Zamir, Aditya Arora, Salman Khan, Fahad Shahbaz Khan, Ling Shao
Abstract In low-light conditions, a conventional camera imaging pipeline produces sub-optimal images that are usually dark and noisy due to a low photon count and low signal-to-noise ratio (SNR). We present a data-driven approach that learns the desired properties of well-exposed images and reflects them in images that are captured in extremely low ambient light environments, thereby significantly improving the visual quality of these low-light images. We propose a new loss function that exploits the characteristics of both pixel-wise and perceptual metrics, enabling our deep neural network to learn the camera processing pipeline to transform the short-exposure, low-light RAW sensor data to well-exposed sRGB images. The results show that our method outperforms the state-of-the-art according to psychophysical tests as well as pixel-wise standard metrics and recent learning-based perceptual image quality measures.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.05939v1
PDF http://arxiv.org/pdf/1904.05939v1.pdf
PWC https://paperswithcode.com/paper/learning-digital-camera-pipeline-for-extreme
Repo
Framework
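
One common way to combine a pixel-wise term with a perceptual (feature-space) term, in the spirit of the loss the abstract describes, is sketched below. The frozen feature extractor and the weighting factor are assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn

class PixelPerceptualLoss(nn.Module):
    def __init__(self, feature_extractor, perceptual_weight=0.1):
        super().__init__()
        self.features = feature_extractor.eval()   # e.g. a frozen VGG slice
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.w = perceptual_weight

    def forward(self, pred, target):
        pixel = nn.functional.l1_loss(pred, target)          # pixel-wise term
        perceptual = nn.functional.mse_loss(self.features(pred),
                                            self.features(target))
        return pixel + self.w * perceptual
```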

The Randomized Midpoint Method for Log-Concave Sampling

Title The Randomized Midpoint Method for Log-Concave Sampling
Authors Ruoqi Shen, Yin Tat Lee
Abstract Sampling from log-concave distributions is a well-researched problem with many applications in statistics and machine learning. We study distributions of the form $p^{*}\propto\exp(-f(x))$, where $f:\mathbb{R}^{d}\rightarrow\mathbb{R}$ has an $L$-Lipschitz gradient and is $m$-strongly convex. We propose a Markov chain Monte Carlo (MCMC) algorithm based on the underdamped Langevin diffusion (ULD). It achieves $\epsilon\cdot D$ error (in 2-Wasserstein distance) in $\tilde{O}\left(\kappa^{7/6}/\epsilon^{1/3}+\kappa/\epsilon^{2/3}\right)$ steps, where $D\overset{\mathrm{def}}{=}\sqrt{\frac{d}{m}}$ is the effective diameter of the problem and $\kappa\overset{\mathrm{def}}{=}\frac{L}{m}$ is the condition number. Our algorithm is significantly faster than the previously best known algorithm for this problem, which requires $\tilde{O}\left(\kappa^{1.5}/\epsilon\right)$ steps. Moreover, our algorithm can easily be parallelized to require only $O(\kappa\log\frac{1}{\epsilon})$ parallel steps. To solve the sampling problem, we propose a new framework for discretizing stochastic differential equations. We apply this framework to discretize and simulate ULD, which converges to the target distribution $p^{*}$. The framework can be used to solve not only the log-concave sampling problem but any problem that involves simulating (stochastic) differential equations.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.05503v1
PDF https://arxiv.org/pdf/1909.05503v1.pdf
PWC https://paperswithcode.com/paper/the-randomized-midpoint-method-for-log
Repo
Framework
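
To illustrate the randomized-midpoint idea in its simplest form, the sketch below applies it to *overdamped* Langevin dynamics $dx = -\nabla f(x)\,dt + \sqrt{2}\,dW$ rather than the paper's underdamped scheme: the drift is evaluated at a uniformly random point inside each step instead of at its start, which removes the leading-order discretization bias in expectation.

```python
import numpy as np

def randomized_midpoint_langevin(grad_f, x0, step, n_steps, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    x = np.array(x0, dtype=float)
    d = x.size
    for _ in range(n_steps):
        alpha = rng.uniform()                    # random point within the step
        w_mid = rng.standard_normal(d) * np.sqrt(2 * alpha * step)
        x_mid = x - alpha * step * grad_f(x) + w_mid   # Euler to time t + alpha*h
        w_rest = rng.standard_normal(d) * np.sqrt(2 * (1 - alpha) * step)
        # Full step with the drift at the random midpoint; w_mid + w_rest is the
        # Brownian increment over the whole step, coupled with the midpoint.
        x = x - step * grad_f(x_mid) + w_mid + w_rest
    return x

# Example: f(x) = ||x||^2 / 2 targets N(0, I), so grad_f = lambda x: x.
```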