February 1, 2020

Paper Group AWR 78

Illumination-Based Data Augmentation for Robust Background Subtraction

Title Illumination-Based Data Augmentation for Robust Background Subtraction
Authors Dimitrios Sakkos, Hubert P. H. Shum, Edmond S. L. Ho
Abstract A core challenge in background subtraction (BGS) is handling videos with sudden illumination changes in consecutive frames. In this paper, we tackle the problem from a data point-of-view using data augmentation. Our method performs data augmentation that not only creates endless data on the fly, but also features semantic transformations of illumination which enhance the generalisation of the model. It successfully simulates flashes and shadows by applying the Euclidean distance transform over a binary mask that is randomly generated. Such data allows us to effectively train an illumination-invariant deep learning model for BGS. Experimental results demonstrate the contribution of the synthetic data to the models' ability to perform BGS even when significant illumination changes take place. The source code of the project is made publicly available at https://github.com/dksakkos/illumination_augmentation.
Tasks Data Augmentation, Video Background Subtraction, Video Object Segmentation
Published 2019-10-18
URL https://arxiv.org/abs/1910.08470v1
PDF https://arxiv.org/pdf/1910.08470v1.pdf
PWC https://paperswithcode.com/paper/illumination-based-data-augmentation-for
Repo https://github.com/dksakkos/illumination_augmentation
Framework none
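
The abstract describes simulating flashes and shadows by applying the Euclidean distance transform to a randomly generated binary mask. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: the circular mask shape, the linear falloff, the `strength` parameter, and all function names are assumptions made for illustration, and the frame is assumed to be a grayscale image stored as nested lists with values in [0, 1].

```python
import math
import random


def random_binary_mask(height, width, rng):
    """Random filled circle as a binary mask (1 inside, 0 outside). Assumed shape."""
    cy, cx = rng.randrange(height), rng.randrange(width)
    radius = rng.randint(1, max(1, min(height, width) // 2))
    return [[1 if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2 else 0
             for x in range(width)] for y in range(height)]


def distance_transform(mask):
    """Brute-force Euclidean distance transform: for each foreground pixel,
    the distance to the nearest background pixel (0.0 for background pixels).
    If the mask has no background at all, every distance stays 0.0."""
    height, width = len(mask), len(mask[0])
    background = [(y, x) for y in range(height) for x in range(width)
                  if mask[y][x] == 0]
    dist = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 1 and background:
                dist[y][x] = min(math.hypot(y - by, x - bx)
                                 for by, bx in background)
    return dist


def apply_illumination(frame, strength, rng):
    """Brighten (strength > 0, flash) or darken (strength < 0, shadow) a frame,
    with a smooth falloff taken from the normalised distance transform."""
    mask = random_binary_mask(len(frame), len(frame[0]), rng)
    dist = distance_transform(mask)
    peak = max(max(row) for row in dist) or 1.0  # avoid division by zero
    return [[min(1.0, max(0.0, pixel + strength * d / peak))
             for pixel, d in zip(frame_row, dist_row)]
            for frame_row, dist_row in zip(frame, dist)]


frame = [[0.5] * 8 for _ in range(8)]
flash = apply_illumination(frame, 0.4, random.Random(0))
shadow = apply_illumination(frame, -0.4, random.Random(0))
```

Because a fresh mask is drawn on every call, each training frame can receive a different synthetic lighting change, which is the "endless data on the fly" property the abstract refers to.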

Structural Similarity based Anatomical and Functional Brain Imaging Fusion

Title Structural Similarity based Anatomical and Functional Brain Imaging Fusion
Authors Nishant Kumar, Nico Hoffmann, Martin Oelschlägel, Edmund Koch, Matthias Kirsch, Stefan Gumhold
Abstract Multimodal medical image fusion helps in combining contrasting features from two or more input imaging modalities to represent fused information in a single image. One of the pivotal clinical applications of medical image fusion is the merging of anatomical and functional modalities for fast diagnosis of malignant tissues. In this paper, we present a novel end-to-end unsupervised learning-based Convolutional Neural Network (CNN) for fusing the high and low frequency components of MRI-PET grayscale image pairs, publicly available at ADNI, by exploiting Structural Similarity Index (SSIM) as the loss function during training. We then apply color coding for the visualization of the fused image by quantifying the contribution of each input image in terms of the partial derivatives of the fused image. We find that our fusion and visualization approach results in better visual perception of the fused image, while also comparing favorably to previous methods when applying various quantitative assessment metrics.
Tasks
Published 2019-08-11
URL https://arxiv.org/abs/1908.03958v4
PDF https://arxiv.org/pdf/1908.03958v4.pdf
PWC https://paperswithcode.com/paper/structural-similarity-based-anatomical-and
Repo https://github.com/nish03/FunFuseAn
Framework tf
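
The abstract uses the Structural Similarity Index (SSIM) as the training loss. As a rough illustration of what such a loss measures, here is a minimal sketch that computes SSIM over a single global window on two flattened grayscale images. This simplifies the usual sliding-window formulation and is not the authors' TensorFlow code; the function names and the choice of `1 - SSIM` as the loss are assumptions, while the constants follow the commonly used values K1 = 0.01 and K2 = 0.03.

```python
def ssim_global(x, y, dynamic_range=1.0):
    """SSIM between two equal-length pixel lists, using one global window."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(fused, reference):
    """Assumed loss form: lower is better, 0 when the images are identical."""
    return 1.0 - ssim_global(fused, reference)


image = [0.1, 0.4, 0.7, 0.9]
inverted = [0.9, 0.6, 0.3, 0.1]
loss_same = ssim_loss(image, image)
loss_inverted = ssim_loss(image, inverted)
```

In a fusion setting one would presumably combine such a term for each input modality, but how the two terms are weighted is not stated in the abstract.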

Boosting: Why You Can Use the HP Filter

Title Boosting: Why You Can Use the HP Filter
Authors Peter C. B. Phillips, Zhentao Shi
Abstract The Hodrick-Prescott (HP) filter is one of the most widely used econometric methods in applied macroeconomic research. Like all nonparametric methods, the HP filter depends critically on a tuning parameter that controls the degree of smoothing. Yet in contrast to modern nonparametric methods and applied work with these procedures, empirical practice with the HP filter almost universally relies on standard settings for the tuning parameter that have been suggested largely by experimentation with macroeconomic data and heuristic reasoning. As recent research (Phillips and Jin, 2015) has shown, standard settings may not be adequate in removing trends, particularly stochastic trends, in economic data. This paper proposes an easy-to-implement practical procedure of iterating the HP smoother that is intended to make the filter a smarter smoothing device for trend estimation and trend elimination. We call this iterated HP technique the boosted HP filter in view of its connection to $L_{2}$-boosting in machine learning. The paper develops limit theory to show that the boosted HP (bHP) filter asymptotically recovers trend mechanisms that involve unit root processes, deterministic polynomial drifts, and polynomial drifts with structural breaks. A stopping criterion is used to automate the iterative HP algorithm, making it a data-determined method that is ready for modern data-rich environments in economic research. The methodology is illustrated using three real data examples that highlight the differences between simple HP filtering, the data-determined boosted filter, and an alternative autoregressive approach. These examples show that the bHP filter is helpful in analyzing a large collection of heterogeneous macroeconomic time series that manifest various degrees of persistence, trend behavior, and volatility.
Tasks Time Series
Published 2019-05-01
URL https://arxiv.org/abs/1905.00175v2
PDF https://arxiv.org/pdf/1905.00175v2.pdf
PWC https://paperswithcode.com/paper/boosting-the-hodrick-prescott-filter
Repo https://github.com/chenyang45/BoostedHP
Framework none
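
The iteration described in the abstract can be written compactly. The notation below is assumed for illustration rather than quoted from the paper: $y$ is the observed series of length $n$, $D$ is the $(n-2) \times n$ second-difference matrix, $\lambda$ is the smoothing parameter, and $m$ is the number of iterations selected by the stopping criterion.

```latex
\hat{f}^{\mathrm{HP}}
  = \arg\min_{f} \sum_{t=1}^{n} (y_t - f_t)^2
    + \lambda \sum_{t=3}^{n} (\Delta^2 f_t)^2
  = S y,
\qquad
S = (I_n + \lambda D^{\top} D)^{-1}

\hat{c}^{(m)} = (I_n - S)^{m} y,
\qquad
\hat{f}^{(m)} = y - \hat{c}^{(m)} = \bigl[ I_n - (I_n - S)^{m} \bigr] y
```

Setting $m = 1$ recovers the ordinary HP filter. Each further iteration reapplies the same smoother to the remaining residual $\hat{c}^{(m-1)}$ and adds the fitted part to the trend, which is the refit-the-residual structure that $L_{2}$-boosting also uses.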

One-element Batch Training by Moving Window

Title One-element Batch Training by Moving Window
Authors Przemysław Spurek, Szymon Knop, Jacek Tabor, Igor Podolak, Bartosz Wójcik
Abstract Several deep models, especially generative ones, compare the samples from two distributions (e.g. WAE-like autoencoder models, set-processing deep networks, etc.) in their cost functions. With these methods one cannot train the model directly on small batches (in the extreme case, one-element batches), because samples have to be compared. We propose a generic approach to training such models using one-element mini-batches. The idea is based on splitting the batch in latent space into parts: previous, i.e. historical, elements used for latent space distribution matching and the current ones, used both for latent distribution computation and the minimization process. Due to the smaller memory requirements, this allows training networks on higher resolution images than in the classical approach.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.12947v2
PDF https://arxiv.org/pdf/1905.12947v2.pdf
PWC https://paperswithcode.com/paper/one-element-batch-training-by-moving-window
Repo https://github.com/gmum/MoW
Framework tf

Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation

Title Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation
Authors Utku Ozbulak, Arnout Van Messem, Wesley De Neve
Abstract Deep learning models, which are increasingly being used in the field of medical image analysis, come with a major security risk, namely, their vulnerability to adversarial examples. Adversarial examples are carefully crafted samples that force machine learning models to make mistakes during testing time. These malicious samples have been shown to be highly effective in misguiding classification tasks. However, research on the influence of adversarial examples on segmentation is significantly lacking. Given that a large portion of medical imaging problems are effectively segmentation problems, we analyze the impact of adversarial examples on deep learning-based image segmentation models. Specifically, we expose the vulnerability of these models to adversarial examples by proposing the Adaptive Segmentation Mask Attack (ASMA). This novel algorithm makes it possible to craft targeted adversarial examples that come with (1) high intersection-over-union rates between the target adversarial mask and the prediction and (2) perturbation that is, for the most part, invisible to the naked eye. We lay out experimental and visual evidence by showing results obtained for the ISIC skin lesion segmentation challenge and the problem of glaucoma optic disc segmentation. An implementation of this algorithm and additional examples can be found at https://github.com/utkuozbulak/adaptive-segmentation-mask-attack.
Tasks Lesion Segmentation, Semantic Segmentation
Published 2019-07-30
URL https://arxiv.org/abs/1907.13124v1
PDF https://arxiv.org/pdf/1907.13124v1.pdf
PWC https://paperswithcode.com/paper/impact-of-adversarial-examples-on-deep
Repo https://github.com/utkuozbulak/adaptive-segmentation-mask-attack
Framework pytorch
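
The abstract measures attack success by the intersection-over-union (IoU) between the target adversarial mask and the model's prediction. For readers unfamiliar with the metric, a minimal sketch for binary masks stored as nested lists follows. The function name and the convention of returning 1.0 for two empty masks are assumptions, and this is not taken from the linked repository.

```python
def intersection_over_union(prediction, target):
    """IoU of two binary masks of the same shape (nested lists of 0/1)."""
    pairs = [(p, t)
             for p_row, t_row in zip(prediction, target)
             for p, t in zip(p_row, t_row)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


prediction = [[1, 1, 0],
              [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = intersection_over_union(prediction, target)  # 2 shared pixels / 4 in union
```

An IoU close to 1 between the prediction and the attacker's chosen mask would mean the model has been steered almost exactly to the target segmentation.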

HARE: a Flexible Highlighting Annotator for Ranking and Exploration

Title HARE: a Flexible Highlighting Annotator for Ranking and Exploration
Authors Denis Newman-Griffis, Eric Fosler-Lussier
Abstract Exploration and analysis of potential data sources is a significant challenge in the application of NLP techniques to novel information domains. We describe HARE, a system for highlighting relevant information in document collections to support ranking and triage, which provides tools for post-processing and qualitative analysis for model development and tuning. We apply HARE to the use case of narrative descriptions of mobility information in clinical data, and demonstrate its utility in comparing candidate embedding features. We provide a web-based interface for annotation visualization and document ranking, with a modular backend to support interoperability with existing annotation tools. Our system is available online at https://github.com/OSU-slatelab/HARE.
Tasks Document Ranking
Published 2019-08-29
URL https://arxiv.org/abs/1908.11302v1
PDF https://arxiv.org/pdf/1908.11302v1.pdf
PWC https://paperswithcode.com/paper/hare-a-flexible-highlighting-annotator-for
Repo https://github.com/OSU-slatelab/HARE
Framework none

PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

Title PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
Authors Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, Hao Li
Abstract We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu can produce high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to an arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real-world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.
Tasks 3D Object Reconstruction From A Single Image
Published 2019-05-13
URL https://arxiv.org/abs/1905.05172v3
PDF https://arxiv.org/pdf/1905.05172v3.pdf
PWC https://paperswithcode.com/paper/pifu-pixel-aligned-implicit-function-for-high
Repo https://github.com/shunsukesaito/PIFu
Framework pytorch

Automatic Generation of Headlines for Online Math Questions

Title Automatic Generation of Headlines for Online Math Questions
Authors Ke Yuan, Dafang He, Zhuoren Jiang, Liangcai Gao, Zhi Tang, C. Lee Giles
Abstract Mathematical equations are an important part of dissemination and communication of scientific information. Students, however, often feel challenged in reading and understanding math content and equations. With the development of the Web, students are posting their math questions online. Nevertheless, constructing a concise math headline that gives a good description of the posted detailed math question is nontrivial. In this study, we explore a novel summarization task denoted as geNerating A concise Math hEadline from a detailed math question (NAME). Compared to conventional summarization tasks, this task has two extra and essential constraints: 1) Detailed math questions consist of text and math equations which require a unified framework to jointly model textual and mathematical information; 2) Unlike text, math equations contain semantic and structural features, and both of them should be captured together. To address these issues, we propose MathSum, a novel summarization model which utilizes a pointer mechanism combined with a multi-head attention mechanism for mathematical representation augmentation. The pointer mechanism can either copy textual tokens or math tokens from source questions in order to generate math headlines. The multi-head attention mechanism is designed to enrich the representation of math equations by modeling and integrating both its semantic and structural features. For evaluation, we collect and make available two sets of real-world detailed math questions along with human-written math headlines, namely EXEQ-300k and OFEQ-10k. Experimental results demonstrate that our model (MathSum) significantly outperforms state-of-the-art models for both the EXEQ-300k and OFEQ-10k datasets.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1912.00839v1
PDF https://arxiv.org/pdf/1912.00839v1.pdf
PWC https://paperswithcode.com/paper/automatic-generation-of-headlines-for-online
Repo https://github.com/yuankepku/MathSum
Framework none

SteganoGAN: High Capacity Image Steganography with GANs

Title SteganoGAN: High Capacity Image Steganography with GANs
Authors Kevin Alex Zhang, Alfredo Cuesta-Infante, Lei Xu, Kalyan Veeramachaneni
Abstract Image steganography is a procedure for hiding messages inside pictures. While other techniques such as cryptography aim to prevent adversaries from reading the secret message, steganography aims to hide the presence of the message itself. In this paper, we propose a novel technique for hiding arbitrary binary data in images using generative adversarial networks which allow us to optimize the perceptual quality of the images produced by our model. We show that our approach achieves state-of-the-art payloads of 4.4 bits per pixel, evades detection by steganalysis tools, and is effective on images from multiple datasets. To enable fair comparisons, we have released an open source library that is available online at https://github.com/DAI-Lab/SteganoGAN.
Tasks Image Steganography
Published 2019-01-12
URL http://arxiv.org/abs/1901.03892v2
PDF http://arxiv.org/pdf/1901.03892v2.pdf
PWC https://paperswithcode.com/paper/steganogan-high-capacity-image-steganography
Repo https://github.com/DAI-Lab/SteganoGAN
Framework pytorch

Emergent Tool Use From Multi-Agent Autocurricula

Title Emergent Tool Use From Multi-Agent Autocurricula
Authors Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, Igor Mordatch
Abstract Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of which creates a new pressure for the opposing team to adapt; for instance, agents learn to build multi-object shelters using moveable boxes which in turn leads to agents discovering that they can overcome obstacles using ramps. We further provide evidence that multi-agent competition may scale better with increasing environment complexity and leads to behavior that centers around far more human-relevant skills than other self-supervised reinforcement learning methods such as intrinsic motivation. Finally, we propose transfer and fine-tuning as a way to quantitatively evaluate targeted capabilities, and we compare hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07528v2
PDF https://arxiv.org/pdf/1909.07528v2.pdf
PWC https://paperswithcode.com/paper/emergent-tool-use-from-multi-agent
Repo https://github.com/Stippler/cow-simulator
Framework pytorch

Sequential Skip Prediction with Few-shot in Streamed Music Contents

Title Sequential Skip Prediction with Few-shot in Streamed Music Contents
Authors Sungkyun Chang, Seungjin Lee, Kyogu Lee
Abstract This paper provides an outline of the algorithms submitted for the WSDM Cup 2019 Spotify Sequential Skip Prediction Challenge (team name: mimbres). In the challenge, complete information including acoustic features and user interaction logs for the first half of a listening session is provided. Our goal is to predict whether the individual tracks in the second half of the session will be skipped or not, only given acoustic features. We proposed two different kinds of algorithms that were based on metric learning and sequence learning. The experimental results showed that the sequence learning approach performed significantly better than the metric learning approach. Moreover, we conducted additional experiments to find that significant performance gain can be achieved using complete user log information.
Tasks Metric Learning
Published 2019-01-24
URL http://arxiv.org/abs/1901.08203v1
PDF http://arxiv.org/pdf/1901.08203v1.pdf
PWC https://paperswithcode.com/paper/sequential-skip-prediction-with-few-shot-in
Repo https://github.com/mimbres/SeqSkip
Framework pytorch

Multinomial Distribution Learning for Effective Neural Architecture Search

Title Multinomial Distribution Learning for Effective Neural Architecture Search
Authors Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian
Abstract Architectures obtained by Neural Architecture Search (NAS) have achieved highly competitive performance in various computer vision tasks. However, the prohibitive computation demand of forward-backward propagation in deep neural networks and searching algorithms makes it difficult to apply NAS in practice. In this paper, we propose a Multinomial Distribution Learning for extremely effective NAS, which considers the search space as a joint multinomial distribution, i.e., the operation between two nodes is sampled from this distribution, and the optimal network structure is obtained by the operations with the most likely probability in this distribution. Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance. Besides, a hypothesis that the performance ranking is consistent in every training epoch is proposed and demonstrated to further accelerate the learning process. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of our method. On CIFAR-10, the structure searched by our method achieves 2.55% test error, while being 6.0x (only 4 GPU hours on GTX1080Ti) faster compared with state-of-the-art NAS algorithms. On ImageNet, our model achieves 75.2% top-1 accuracy under MobileNet settings (MobileNet V1/V2), while being 1.2x faster with measured GPU latency. Test code with pre-trained models is available at https://github.com/tanglang96/MDENAS.
Tasks Neural Architecture Search
Published 2019-05-18
URL https://arxiv.org/abs/1905.07529v3
PDF https://arxiv.org/pdf/1905.07529v3.pdf
PWC https://paperswithcode.com/paper/multinomial-distribution-learning-for
Repo https://github.com/tanglang96/MDENAS
Framework pytorch
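
The abstract states two mechanics: the operation on each edge is sampled from a multinomial distribution, and the final architecture takes the most probable operation on each edge. The sketch below illustrates only those two steps. The operation names, the two-edge search space, and the function names are hypothetical, and the rule the authors use to update the probabilities during search is deliberately left out because the abstract does not specify it.

```python
import random

# Hypothetical candidate operations; the real search space is not given here.
OPS = ["conv3x3", "conv5x5", "max_pool", "skip"]


def sample_architecture(edge_probs, rng):
    """Draw one operation per edge from that edge's multinomial distribution."""
    return [rng.choices(OPS, weights=probs, k=1)[0] for probs in edge_probs]


def most_likely_architecture(edge_probs):
    """Final architecture: the highest-probability operation on each edge."""
    return [OPS[max(range(len(OPS)), key=lambda k: probs[k])]
            for probs in edge_probs]


# One probability vector per edge (each sums to 1).
edge_probs = [
    [0.10, 0.60, 0.20, 0.10],
    [0.25, 0.25, 0.40, 0.10],
]
candidate = sample_architecture(edge_probs, random.Random(0))
final = most_likely_architecture(edge_probs)
```

During search, sampled candidates would be trained and evaluated and the probabilities shifted toward operations that perform well; only the sampling and the final argmax are shown here.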

MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible

Title MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible
Authors Marcely Zanon Boito, William N. Havard, Mahault Garnerin, Éric Le Ferrand, Laurent Besacier
Abstract The CMU Wilderness Multilingual Speech Dataset (Black, 2019) is a newly published multilingual speech dataset based on recorded readings of the New Testament. It provides data to build Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models for potentially 700 languages. However, the fact that the source content (the Bible) is the same for all the languages is not exploited to date. Therefore, this article proposes to add multilingual links between speech segments in different languages, and shares a large and clean dataset of 8,130 parallel spoken utterances across 8 languages (56 language pairs). We name this corpus MaSS (Multilingual corpus of Sentence-aligned Spoken utterances). The covered languages (Basque, English, Finnish, French, Hungarian, Romanian, Russian and Spanish) allow research on speech-to-speech alignment as well as on translation for typologically different language pairs. The quality of the final corpus is attested by human evaluation performed on a corpus subset (100 utterances, 8 language pairs). Lastly, we showcase the usefulness of the final product on a bilingual speech retrieval task.
Tasks Speech Recognition
Published 2019-07-30
URL https://arxiv.org/abs/1907.12895v3
PDF https://arxiv.org/pdf/1907.12895v3.pdf
PWC https://paperswithcode.com/paper/mass-a-large-and-clean-multilingual-corpus-of
Repo https://github.com/getalp/mass-dataset
Framework none
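
The figure of 56 language pairs for 8 languages matches the count of ordered (source, target) pairs, 8 × 7 = 56, rather than the 28 unordered pairs. Reading the pairs as directed is an inference from the arithmetic, not something the abstract states. A short check:

```python
from itertools import combinations, permutations

LANGUAGES = ["Basque", "English", "Finnish", "French",
             "Hungarian", "Romanian", "Russian", "Spanish"]

# Ordered pairs: (source, target) differs from (target, source).
directed_pairs = list(permutations(LANGUAGES, 2))
# Unordered pairs: each language combination counted once.
undirected_pairs = list(combinations(LANGUAGES, 2))

print(len(directed_pairs), len(undirected_pairs))
```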

Cellular State Transformations using Generative Adversarial Networks

Title Cellular State Transformations using Generative Adversarial Networks
Authors Colin Targonski, Benjamin T. Shealy, Melissa C. Smith, F. Alex Feltus
Abstract We introduce a novel method to unite deep learning with biology by which generative adversarial networks (GANs) generate transcriptome perturbations and reveal condition-defining gene expression patterns. We find that a generator conditioned to perturb any input gene expression profile simulates a realistic transition between source and target RNA expression states. The perturbed samples follow a similar distribution to original samples from the dataset, also suggesting these are biologically meaningful perturbations. Finally, we show that it is possible to identify the genes most positively and negatively perturbed by the generator and that the enriched biological function of the perturbed genes is realistic. We call the framework the Transcriptome State Perturbation Generator (TSPG), which is open source software available at https://github.com/ctargon/TSPG.
Tasks
Published 2019-06-28
URL https://arxiv.org/abs/1907.00118v1
PDF https://arxiv.org/pdf/1907.00118v1.pdf
PWC https://paperswithcode.com/paper/cellular-state-transformations-using
Repo https://github.com/ctargon/TSPG
Framework tf

Inductive Matrix Completion Based on Graph Neural Networks

Title Inductive Matrix Completion Based on Graph Neural Networks
Authors Muhan Zhang, Yixin Chen
Abstract We propose an inductive matrix completion model without using side information. By factorizing the (rating) matrix into the product of low-dimensional latent embeddings of rows (users) and columns (items), a majority of existing matrix completion methods are transductive, since the learned embeddings cannot generalize to unseen rows/columns or to new matrices. To make matrix completion inductive, most previous works use content (side information), such as user’s age or movie’s genre, to make predictions. However, high-quality content is not always available, and can be hard to extract. Under the extreme setting where no side information is available other than the matrix to complete, can we still learn an inductive matrix completion model? In this paper, we propose an Inductive Graph-based Matrix Completion (IGMC) model to address this problem. IGMC trains a graph neural network (GNN) based purely on 1-hop subgraphs around (user, item) pairs generated from the rating matrix and maps these subgraphs to their corresponding ratings. It achieves highly competitive performance with state-of-the-art transductive baselines. In addition, IGMC is inductive: it can generalize to users/items unseen during the training (given that their interactions exist), and can even transfer to new tasks. Our transfer learning experiments show that a model trained on the MovieLens dataset can be directly used to predict Douban movie ratings with surprisingly good performance. Our work demonstrates that: 1) it is possible to train inductive matrix completion models without using side information while achieving similar or better performance than state-of-the-art transductive methods; 2) local graph patterns around a (user, item) pair are effective predictors of the rating this user gives to the item; and 3) long-range dependencies might not be necessary for modeling recommender systems.
Tasks Matrix Completion, Recommendation Systems, Transfer Learning
Published 2019-04-26
URL https://arxiv.org/abs/1904.12058v3
PDF https://arxiv.org/pdf/1904.12058v3.pdf
PWC https://paperswithcode.com/paper/inductive-graph-pattern-learning-for
Repo https://github.com/muhanzhang/IGPL
Framework pytorch
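
The abstract says IGMC works on 1-hop subgraphs around (user, item) pairs taken from the rating matrix. A minimal sketch of extracting such a subgraph from a list of ratings follows. The data layout, the function name, and the decision to drop the target edge itself (so the rating to be predicted is not visible in the input) are assumptions for illustration, not the code from the linked repository.

```python
def one_hop_subgraph(ratings, user, item):
    """Return the users, items, and rating edges within one hop of (user, item).

    ratings: iterable of (user, item, rating) triples, i.e. a sparse rating matrix.
    """
    # Users who rated the target item, plus the target user.
    users = {user} | {u for u, i, _ in ratings if i == item}
    # Items rated by the target user, plus the target item.
    items = {item} | {i for u, i, _ in ratings if u == user}
    # Keep edges between selected nodes, excluding the edge being predicted.
    edges = [(u, i, r) for u, i, r in ratings
             if u in users and i in items and (u, i) != (user, item)]
    return users, items, edges


ratings = [
    ("u1", "i1", 5), ("u1", "i2", 3),
    ("u2", "i1", 4), ("u2", "i3", 2),
    ("u3", "i3", 1),
]
users, items, edges = one_hop_subgraph(ratings, "u1", "i1")
```

Because the subgraph is defined only by local rating structure and not by learned per-user or per-item embeddings, the same extraction applies to users and items never seen in training, which is what makes the approach inductive.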