January 29, 2020

Paper Group ANR 580

Robust Natural Language Inference Models with Example Forgetting

Title Robust Natural Language Inference Models with Example Forgetting
Authors Yadollah Yaghoobzadeh, Remi Tachet, T. J. Hazen, Alessandro Sordoni
Abstract We investigate whether example forgetting, a recently introduced measure of hardness of examples, can be used to select training examples in order to increase robustness of natural language understanding models in a natural language inference task (MNLI). We analyze forgetting events for MNLI and provide evidence that forgettable examples under simpler models can be used to increase robustness of the recently proposed BERT model, measured by testing an MNLI-trained model on HANS, a curated test set that exhibits a shift in distribution compared to the MNLI test set. Moreover, we show that the “large” version of BERT is more robust than its “base” version but its robustness can still be improved with our approach.
Tasks Natural Language Inference
Published 2019-11-10
URL https://arxiv.org/abs/1911.03861v1
PDF https://arxiv.org/pdf/1911.03861v1.pdf
PWC https://paperswithcode.com/paper/robust-natural-language-inference-models-with
Repo
Framework
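
The abstract above does not define a forgetting event. The sketch below assumes the usual definition from the example-forgetting literature: an example is "forgotten" when it is classified correctly at one evaluation and incorrectly at the next. The function names, the toy history, and the choice to treat never-learned examples as forgettable are illustrative assumptions, not the authors' code.

```python
from typing import Dict, List


def count_forgetting_events(correct_by_epoch: List[bool]) -> int:
    """Count correct -> incorrect transitions between consecutive evaluations."""
    return sum(
        1
        for prev, curr in zip(correct_by_epoch, correct_by_epoch[1:])
        if prev and not curr
    )


def forgettable_examples(history: Dict[str, List[bool]]) -> List[str]:
    """Return ids of examples forgotten at least once or never learned.

    Counting never-learned examples as forgettable is an assumption here.
    """
    selected = []
    for example_id, record in history.items():
        never_learned = not any(record)
        if never_learned or count_forgetting_events(record) > 0:
            selected.append(example_id)
    return selected


# Hypothetical per-epoch correctness records for three training examples.
history = {
    "ex-a": [False, True, True, True],      # learned once, never forgotten
    "ex-b": [False, True, False, True],     # forgotten once
    "ex-c": [False, False, False, False],   # never learned
}
hard_subset = forgettable_examples(history)
```

In this toy history, `ex-b` and `ex-c` would be kept as the harder subset, while `ex-a` would be treated as unforgettable.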

SpoC: Spoofing Camera Fingerprints

Title SpoC: Spoofing Camera Fingerprints
Authors Davide Cozzolino, Justus Thies, Andreas Rössler, Matthias Nießner, Luisa Verdoliva
Abstract Thanks to the fast progress in synthetic media generation, creating realistic false images has become very easy. Such images can be used to wrap rich fake news with enhanced credibility, spawning a new wave of high-impact, high-risk misinformation campaigns. Therefore, there is a fast-growing interest in reliable detectors of manipulated media. The most powerful detectors, to date, rely on the subtle traces left by any device on all images acquired by it. In particular, due to proprietary in-camera processes, like demosaicing or compression, each camera model leaves trademark traces that can be exploited for forensic analyses. The absence or distortion of such traces in the target image is a strong hint of manipulation. In this paper, we challenge such detectors to gain better insight into their vulnerabilities. This is an important study in order to build better forgery detectors able to face malicious attacks. Our proposal consists of a GAN-based approach that injects camera traces into synthetic images. Given a GAN-generated image, we insert the traces of a specific camera model into it and deceive state-of-the-art detectors into believing the image was acquired by that model. Likewise, we deceive independent detectors of synthetic GAN images into believing the image is real. Experiments prove the effectiveness of the proposed method in a wide array of conditions. Moreover, no prior information on the attacked detectors is needed, but only sample images from the target camera.
Tasks Demosaicking
Published 2019-11-27
URL https://arxiv.org/abs/1911.12069v1
PDF https://arxiv.org/pdf/1911.12069v1.pdf
PWC https://paperswithcode.com/paper/spoc-spoofing-camera-fingerprints
Repo
Framework

Stiffness: A New Perspective on Generalization in Neural Networks

Title Stiffness: A New Perspective on Generalization in Neural Networks
Authors Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, Srini Narayanan
Abstract In this paper we develop a new perspective on generalization of neural networks by proposing and investigating the concept of neural network stiffness. We measure how stiff a network is by looking at how a small gradient step in the network’s parameters on one example affects the loss on another example. Higher stiffness suggests that a network is learning features that generalize. In particular, we study how stiffness depends on 1) class membership, 2) distance between data points in the input space, 3) training iteration, and 4) learning rate. We present experiments on MNIST, FASHION MNIST, and CIFAR-10/100 using fully-connected and convolutional neural networks, as well as on a transformer-based NLP model. We demonstrate the connection between stiffness and generalization, and observe its dependence on learning rate. When training on CIFAR-100, the stiffness matrix exhibits a coarse-grained behavior indicative of the model’s awareness of super-class membership. In addition, we measure how stiffness between two data points depends on their mutual input-space distance, and establish the concept of a dynamical critical length – a distance below which a parameter update based on a data point influences its neighbors.
Tasks
Published 2019-01-28
URL https://arxiv.org/abs/1901.09491v3
PDF https://arxiv.org/pdf/1901.09491v3.pdf
PWC https://paperswithcode.com/paper/stiffness-a-new-perspective-on-generalization
Repo
Framework
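
The idea of measuring how a gradient step on one example affects the loss on another can be illustrated with a toy model. The sketch below assumes a linear model with squared loss and uses the cosine between per-example gradients as the stiffness measure; the model, the data points, and the function names are illustrative assumptions and not the paper's experimental setup.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]


def squared_loss(w: Sequence[float], x: Sequence[float], y: float) -> float:
    """Loss 0.5 * (w.x - y)^2 of a linear model on one example."""
    return 0.5 * (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2


def grad_squared_loss(w: Sequence[float], x: Sequence[float], y: float) -> List[float]:
    """Gradient of the squared loss with respect to w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]


def cosine_stiffness(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Cosine of the angle between two per-example gradients."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm = math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2))
    return dot / norm if norm > 0 else 0.0


def loss_change(w: Sequence[float], step_on: Example, probe: Example, lr: float = 0.01) -> float:
    """Change in loss on `probe` after one gradient step taken on `step_on`."""
    grad = grad_squared_loss(w, *step_on)
    w_new = [wi - lr * gi for wi, gi in zip(w, grad)]
    return squared_loss(w_new, *probe) - squared_loss(w, *probe)


w0 = [0.0, 0.0]
ex_a: Example = ([1.0, 0.0], 1.0)
ex_b: Example = ([2.0, 0.0], 1.0)   # shares a feature with ex_a
ex_c: Example = ([0.0, 1.0], 1.0)   # orthogonal to ex_a

stiff_ab = cosine_stiffness(grad_squared_loss(w0, *ex_a), grad_squared_loss(w0, *ex_b))
stiff_ac = cosine_stiffness(grad_squared_loss(w0, *ex_a), grad_squared_loss(w0, *ex_c))
```

Aligned gradients (positive stiffness) mean a step on one example also lowers the loss on the other; orthogonal gradients leave it unchanged to first order.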

Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering

Title Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Authors Ramakrishna Vedantam, Karan Desai, Stefan Lee, Marcus Rohrbach, Dhruv Batra, Devi Parikh
Abstract We propose a new class of probabilistic neural-symbolic models that have symbolic functional programs as a latent, stochastic variable. Instantiated in the context of visual question answering, our probabilistic formulation offers two key conceptual advantages over prior neural-symbolic models for VQA. Firstly, the programs generated by our model are more understandable while requiring fewer teaching examples. Secondly, we show that one can pose counterfactual scenarios to the model, to probe its beliefs on the programs that could lead to a specified answer given an image. Our results on the CLEVR and SHAPES datasets verify our hypotheses, showing that the model gets better program (and answer) prediction accuracy even in the low data regime, and allows one to probe the coherence and consistency of reasoning performed.
Tasks Question Answering, Visual Question Answering
Published 2019-02-21
URL https://arxiv.org/abs/1902.07864v2
PDF https://arxiv.org/pdf/1902.07864v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-neural-symbolic-models-for
Repo
Framework

Merging-ISP: Multi-Exposure High Dynamic Range Image Signal Processing

Title Merging-ISP: Multi-Exposure High Dynamic Range Image Signal Processing
Authors Prashant Chaudhari, Franziska Schirrmacher, Andreas Maier, Christian Riess, Thomas Köhler
Abstract The image signal processing pipeline (ISP) is a core element of digital cameras to capture high-quality displayable images from raw data. In high dynamic range (HDR) imaging, ISPs include steps like demosaicing of raw color filter array (CFA) data at different exposure times, alignment of the exposures, conversion to HDR domain, and exposure merging into an HDR image. Traditionally, such pipelines are built by cascading algorithms addressing the individual subtasks. However, cascaded designs suffer from error propagation since simply combining multiple processing steps is not necessarily optimal for the entire imaging task. This paper proposes a multi-exposure high dynamic range image signal processing pipeline (Merging-ISP) to jointly solve all subtasks for HDR imaging. Our pipeline is modeled by a deep neural network architecture. As such, it is end-to-end trainable, circumvents the use of complex, hand-crafted algorithms in its core, and mitigates error propagation. Merging-ISP enables direct reconstructions of HDR images from multiple differently exposed raw CFA images captured from dynamic scenes. We compared Merging-ISP against different alternative cascaded pipelines. End-to-end learning leads to HDR reconstructions of high perceptual quality and quantitatively outperforms competing ISPs by more than 1 dB in terms of PSNR.
Tasks Demosaicking
Published 2019-11-12
URL https://arxiv.org/abs/1911.04762v1
PDF https://arxiv.org/pdf/1911.04762v1.pdf
PWC https://paperswithcode.com/paper/merging-isp-multi-exposure-high-dynamic-range
Repo
Framework
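
To put the "more than 1 dB in terms of PSNR" figure in context, the sketch below computes the standard peak signal-to-noise ratio, PSNR = 10 log10(peak^2 / MSE). It assumes flat lists of pixel intensities in [0, 1]; the paper may evaluate PSNR in a different domain (for example on tone-mapped HDR values), so the function and the sample values are only illustrative.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical pixel values: a reference patch and two reconstructions.
reference = [0.0, 0.5, 1.0]
coarse = [0.1, 0.5, 0.9]
fine = [0.05, 0.5, 0.95]    # half the error amplitude of `coarse`

gain_db = psnr(reference, fine) - psnr(reference, coarse)
```

Because PSNR is logarithmic in the mean squared error, a 1 dB gain corresponds to reducing the MSE by a factor of about 1.26, and halving the error amplitude (as in `fine` versus `coarse`) yields roughly 6 dB.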

Sex Trafficking Detection with Ordinal Regression Neural Networks

Title Sex Trafficking Detection with Ordinal Regression Neural Networks
Authors Longshaokan Wang, Eric Laber, Yeng Saanchi, Sherrie Caltagirone
Abstract Sex trafficking is a global epidemic. Escort websites are a primary vehicle for selling the services of such trafficking victims and thus a major driver of trafficker revenue. Many law enforcement agencies do not have the resources to manually identify leads from the millions of escort ads posted across dozens of public websites. We propose an ordinal regression neural network to identify escort ads that are likely linked to sex trafficking. Our model uses a modified cost function to mitigate inconsistencies in predictions often associated with nonparametric ordinal regression and leverages recent advancements in deep learning to improve prediction accuracy. The proposed method significantly improves on the previous state-of-the-art on Trafficking-10K, an expert-annotated dataset of escort ads. Additionally, because traffickers use acronyms, deliberate typographical errors, and emojis to replace explicit keywords, we demonstrate how to expand the lexicon of trafficking flags through word embeddings and t-SNE.
Tasks Word Embeddings
Published 2019-08-15
URL https://arxiv.org/abs/1908.05434v2
PDF https://arxiv.org/pdf/1908.05434v2.pdf
PWC https://paperswithcode.com/paper/sex-trafficking-detection-with-ordinal
Repo
Framework

Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization

Title Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization
Authors Xuan Xu, Yanfang Ye, Xin Li
Abstract Image demosaicing and super-resolution are two important tasks in the color imaging pipeline. So far they have been mostly independently studied in the open literature of deep learning; little is known about the potential benefit of formulating a joint demosaicing and super-resolution (JDSR) problem. In this paper, we propose an end-to-end optimization solution to the JDSR problem and demonstrate its practical significance in computational imaging. Our technical contributions are mainly two-fold. On network design, we have developed a Densely-connected Squeeze-and-Excitation Residual Network (DSERN) for JDSR. For the first time, we address the issue of spatio-spectral attention for color images and discuss how to achieve better information flow by smooth activation for JDSR. Experimental results have shown that a moderate PSNR/SSIM gain can be achieved by DSERN over previous naive network architectures. On perceptual optimization, we propose to leverage the latest ideas including relativistic discriminator and pre-excitation perceptual loss function to further improve the visual quality of reconstructed images. Our extensive experimental results have shown that Texture-enhanced Relativistic average Generative Adversarial Network (TRaGAN) can produce both subjectively more pleasant images and objectively lower perceptual distortion scores than standard GAN for JDSR. We have verified the benefit of JDSR to high-quality image reconstruction from real-world Bayer patterns collected by NASA Mars Curiosity.
Tasks Demosaicking, Image Reconstruction, Super-Resolution
Published 2019-11-08
URL https://arxiv.org/abs/1911.03558v1
PDF https://arxiv.org/pdf/1911.03558v1.pdf
PWC https://paperswithcode.com/paper/joint-demosaicing-and-super-resolution-jdsr
Repo
Framework

Deep Camera: A Fully Convolutional Neural Network for Image Signal Processing

Title Deep Camera: A Fully Convolutional Neural Network for Image Signal Processing
Authors Sivalogeswaran Ratnasingam
Abstract A conventional camera performs various signal processing steps sequentially to reconstruct an image from a raw Bayer image. When this processing is performed in multiple stages, the residual error from each stage accumulates in the image and degrades the quality of the final reconstructed image. In this paper, we present a fully convolutional neural network (CNN) to perform defect pixel correction, denoising, white balancing, exposure correction, demosaicing, color transform, and gamma encoding. To our knowledge, this is the first CNN trained end-to-end to perform the entire image signal processing pipeline in a camera. The neural network was trained using a large image database of raw Bayer images. Through extensive experiments, we show that the proposed CNN-based image signal processing system performs better than the conventional signal processing pipelines that perform the processing sequentially.
Tasks Demosaicking, Denoising
Published 2019-08-24
URL https://arxiv.org/abs/1908.09191v1
PDF https://arxiv.org/pdf/1908.09191v1.pdf
PWC https://paperswithcode.com/paper/deep-camera-a-fully-convolutional-neural
Repo
Framework
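
For readers unfamiliar with the raw Bayer input mentioned above, the sketch below simulates how a sensor with a color filter array records one color sample per pixel, which is what demosaicing must later invert. It assumes the common RGGB layout and a nested-list image of (r, g, b) tuples; actual sensors may use other layouts, and the function name is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def bayer_mosaic_rggb(rgb: List[List[Pixel]]) -> List[List[float]]:
    """Keep one color sample per pixel following an RGGB pattern.

    Even row, even column -> red; odd row, odd column -> blue;
    the remaining positions -> green.
    """
    raw = []
    for row_idx, row in enumerate(rgb):
        out_row = []
        for col_idx, (r, g, b) in enumerate(row):
            if row_idx % 2 == 0 and col_idx % 2 == 0:
                out_row.append(r)
            elif row_idx % 2 == 1 and col_idx % 2 == 1:
                out_row.append(b)
            else:
                out_row.append(g)
        raw.append(out_row)
    return raw


# A 2x2 image where every pixel has r=1, g=2, b=3.
flat_image = [[(1.0, 2.0, 3.0)] * 2 for _ in range(2)]
raw_tile = bayer_mosaic_rggb(flat_image)
```

Two thirds of the color information is discarded at capture time, which is why errors made in demosaicing can propagate through every later stage of a sequential pipeline.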

Efficient Multi-Domain Network Learning by Covariance Normalization

Title Efficient Multi-Domain Network Learning by Covariance Normalization
Authors Yunsheng Li, Nuno Vasconcelos
Abstract The problem of multi-domain learning of deep networks is considered. An adaptive layer is induced per target domain and a novel procedure, denoted covariance normalization (CovNorm), is proposed to reduce its parameters. CovNorm is a data-driven method of fairly simple implementation, requiring two principal component analyses (PCA) and fine-tuning of a mini-adaptation layer. Nevertheless, it is shown, both theoretically and experimentally, to have several advantages over previous approaches, such as batch normalization or geometric matrix approximations. Furthermore, CovNorm can be deployed both when target datasets are available sequentially or simultaneously. Experiments show that, in both cases, it has performance comparable to a fully fine-tuned network, using as few as 0.13% of the corresponding parameters per target domain.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10267v1
PDF https://arxiv.org/pdf/1906.10267v1.pdf
PWC https://paperswithcode.com/paper/efficient-multi-domain-network-learning-by
Repo
Framework

Building a mixed-lingual neural TTS system with only monolingual data

Title Building a mixed-lingual neural TTS system with only monolingual data
Authors Liumeng Xue, Wei Song, Guanghui Xu, Lei Xie, Zhizheng Wu
Abstract When deploying a Chinese neural text-to-speech (TTS) synthesis system, one of the challenges is to synthesize Chinese utterances with English phrases or words embedded. This paper looks into the problem in the encoder-decoder framework when only monolingual data from a target speaker is available. Specifically, we view the problem from two aspects: speaker consistency within an utterance and naturalness. We start the investigation with an Average Voice Model which is built from multi-speaker monolingual data, i.e. Mandarin and English data. On the basis of that, we look into speaker embedding for speaker consistency within an utterance and phoneme embedding for naturalness and intelligibility, and study the choice of data for model training. We report the findings and discuss the challenges of building a mixed-lingual TTS system with only monolingual data.
Tasks
Published 2019-04-12
URL https://arxiv.org/abs/1904.06063v2
PDF https://arxiv.org/pdf/1904.06063v2.pdf
PWC https://paperswithcode.com/paper/building-a-mixed-lingual-neural-tts-system
Repo
Framework

Visualized Insights into the Optimization Landscape of Fully Convolutional Networks

Title Visualized Insights into the Optimization Landscape of Fully Convolutional Networks
Authors Jianjie Lu, Kai-yu Tong
Abstract Many image processing tasks involve image-to-image mapping, which can be addressed well by fully convolutional networks (FCN) without any heavy preprocessing. Although empirically designing and training FCNs can achieve satisfactory results, the reasons for the improvement in performance are slightly ambiguous. Our study aims to make progress in understanding their generalization abilities through visualizing the optimization landscapes. The visualization of objective functions is obtained by choosing a solution and projecting its vicinity onto a 3D space. We compare three FCN-based networks (two existing models and a new one proposed in this paper for comparison) on multiple datasets. It has been observed in practice that the connections from the pre-pooled feature maps to the post-upsampled ones can achieve better results. We investigate the cause and provide experiments to show that the skip-layer connections in FCN can promote a flat optimization landscape, which is well known to generalize better. Additionally, we explore the relationship between the model’s generalization ability and loss surface under different batch sizes. Results show that large-batch training makes the model converge to sharp minimizers with chaotic vicinities while the small-batch method leads the model to flat minimizers with smooth and nearly convex regions. Our work may contribute to insights and analysis for designing and training FCNs.
Tasks
Published 2019-01-20
URL http://arxiv.org/abs/1901.08556v1
PDF http://arxiv.org/pdf/1901.08556v1.pdf
PWC https://paperswithcode.com/paper/visualized-insights-into-the-optimization
Repo
Framework
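
The abstract describes choosing a solution and projecting its vicinity onto a 3D space. One common way to do this, assumed here and not confirmed by the source, is to evaluate the loss on a 2D grid spanned by two random directions around the chosen parameters, giving a height map over two coordinates. The function names and the toy quadratic loss are illustrative.

```python
import random
from typing import Callable, List, Sequence, Tuple


def loss_surface(
    loss_fn: Callable[[Sequence[float]], float],
    center: Sequence[float],
    span: float = 1.0,
    steps: int = 5,
    seed: int = 0,
) -> Tuple[List[float], List[List[float]]]:
    """Evaluate loss_fn at center + a * d1 + b * d2 for grid values a, b.

    d1 and d2 are random Gaussian directions (an assumption of this sketch).
    """
    rng = random.Random(seed)
    d1 = [rng.gauss(0.0, 1.0) for _ in center]
    d2 = [rng.gauss(0.0, 1.0) for _ in center]
    coords = [-span + 2.0 * span * k / (steps - 1) for k in range(steps)]
    surface = []
    for a in coords:
        row = []
        for b in coords:
            point = [c + a * u + b * v for c, u, v in zip(center, d1, d2)]
            row.append(loss_fn(point))
        surface.append(row)
    return coords, surface


def toy_quadratic_loss(w: Sequence[float]) -> float:
    """Stand-in for a network loss, with its minimum at the origin."""
    return sum(x * x for x in w)


coords, surface = loss_surface(toy_quadratic_loss, center=[0.0, 0.0, 0.0])
```

Plotting `surface` against `coords` would show how sharply the loss rises around the chosen solution, which is the kind of flat-versus-sharp comparison the abstract refers to.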

“Machine LLRning”: Learning to Softly Demodulate

Title “Machine LLRning”: Learning to Softly Demodulate
Authors Ori Shental, Jakob Hoydis
Abstract Soft demodulation, or demapping, of received symbols back into their conveyed soft bits, or bit log-likelihood ratios (LLRs), is at the very heart of any modern receiver. In this paper, a trainable universal neural network-based demodulator architecture, dubbed “LLRnet”, is introduced. LLRnet facilitates an improved performance with significantly reduced overall computational complexity. For instance, for the commonly used quadrature amplitude modulation (QAM), LLRnet demonstrates LLR estimates approaching the optimal log maximum a-posteriori inference with an order of magnitude fewer operations than that of the straightforward exact implementation. Link-level simulation examples for the application of LLRnet to 5G-NR and DVB-S.2 are provided. LLRnet is yet another powerful example of the usefulness of applying machine learning to physical layer design.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01512v3
PDF https://arxiv.org/pdf/1907.01512v3.pdf
PWC https://paperswithcode.com/paper/machine-llrning-learning-to-softly-demodulate
Repo
Framework
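
The "straightforward exact implementation" that LLRnet is compared against can be sketched for a small constellation. The code below assumes an AWGN channel with complex noise variance `noise_var`, equiprobable bits, a hypothetical Gray-mapped QPSK constellation, and the sign convention LLR = log P(bit = 0 | y) / P(bit = 1 | y); none of these choices are stated in the abstract. It also shows the well-known max-log approximation for comparison. This is not LLRnet itself.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical Gray-mapped, unit-energy QPSK constellation: bits -> symbol.
QPSK: Dict[Tuple[int, int], complex] = {
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(1, -1) / math.sqrt(2),
    (1, 0): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
}


def exact_llrs(y: complex, noise_var: float,
               constellation: Dict[Tuple[int, ...], complex] = QPSK) -> List[float]:
    """Exact per-bit LLRs: sums of likelihoods over all constellation points."""
    n_bits = len(next(iter(constellation)))
    llrs = []
    for i in range(n_bits):
        num = sum(math.exp(-abs(y - s) ** 2 / noise_var)
                  for bits, s in constellation.items() if bits[i] == 0)
        den = sum(math.exp(-abs(y - s) ** 2 / noise_var)
                  for bits, s in constellation.items() if bits[i] == 1)
        llrs.append(math.log(num / den))
    return llrs


def max_log_llrs(y: complex, noise_var: float,
                 constellation: Dict[Tuple[int, ...], complex] = QPSK) -> List[float]:
    """Max-log approximation: keep only the nearest point for each bit value."""
    n_bits = len(next(iter(constellation)))
    llrs = []
    for i in range(n_bits):
        d0 = min(abs(y - s) ** 2 for bits, s in constellation.items() if bits[i] == 0)
        d1 = min(abs(y - s) ** 2 for bits, s in constellation.items() if bits[i] == 1)
        llrs.append((d1 - d0) / noise_var)
    return llrs


received = complex(0.3, -0.1)
llrs_exact = exact_llrs(received, noise_var=0.5)
llrs_approx = max_log_llrs(received, noise_var=0.5)
```

The exact computation touches every constellation point for every bit, so its cost grows with the constellation size; that growth is the complexity the abstract says a learned demodulator can reduce. For this particular QPSK mapping the two functions agree, while for larger QAM constellations the max-log values generally differ from the exact ones.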

An Integrated Image Filter for Enhancing Change Detection Results

Title An Integrated Image Filter for Enhancing Change Detection Results
Authors Dawei Li, Siyuan Yan, Xin Cai, Yan Cao, Sifan Wang
Abstract Change detection is a fundamental task in computer vision. Although significant advances have been made, most change detection methods fail to work well in challenging scenes due to ubiquitous noise and interference. Nowadays, post-processing methods (e.g. MRF and CRF) aiming to enhance the binary change detection results still fall short of the requirements on universality for distinctive scenes, applicability for different types of detection methods, accuracy, and real-time performance. Inspired by the nature of image filtering, which separates noise from pixel observations and recovers the real structure of patches, we consider utilizing image filters to enhance the detection masks. In this paper, we present an integrated filter which comprises a weighted local guided image filter and a weighted spatiotemporal tree filter. The spatiotemporal tree filter leverages the global spatiotemporal information of adjacent video frames, while the guided filter carries out local window filtering of pixels, for enhancing the coarse change detection masks. The main contributions are three: (i) the proposed filter can make full use of the information of the same object in consecutive frames to improve its current detection mask by computations on a spatiotemporal minimum spanning tree; (ii) the integrated filter possesses the advantages of both local filtering and global filtering; it not only has a good edge-preserving property but can also handle heavily textured and colorful foreground regions; and (iii) unlike some popular enhancement methods (MRF and CRF) that require either a priori background probabilities or a posteriori foreground probabilities for every pixel to improve the coarse detection masks, our method is a versatile enhancement filter that can be applied after many different types of change detection methods, and is particularly suitable for video sequences.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01301v1
PDF https://arxiv.org/pdf/1907.01301v1.pdf
PWC https://paperswithcode.com/paper/an-integrated-image-filter-for-enhancing
Repo
Framework

Neural Learning of Online Consumer Credit Risk

Title Neural Learning of Online Consumer Credit Risk
Authors Di Wang, Qi Wu, Wen Zhang
Abstract This paper takes a deep learning approach to understand consumer credit risk when e-commerce platforms issue unsecured credit to finance customers’ purchases. The “NeuCredit” model can capture serial dependences in multi-dimensional time series data when event frequencies in each dimension differ. It also captures nonlinear cross-sectional interactions among different time-evolving features. Also, the predicted default probability is designed to be interpretable such that risks can be decomposed into three components: the subjective risk indicating the consumers’ willingness to repay, the objective risk indicating their ability to repay, and the behavioral risk indicating consumers’ behavioral differences. Using a unique dataset from one of the largest global e-commerce platforms, we show that the inclusion of shopping behavioral data, besides conventional payment records, requires a deep learning approach to extract the information content of these data, which significantly enhances forecasting performance over traditional machine learning methods.
Tasks Time Series
Published 2019-06-05
URL https://arxiv.org/abs/1906.01923v1
PDF https://arxiv.org/pdf/1906.01923v1.pdf
PWC https://paperswithcode.com/paper/neural-learning-of-online-consumer-credit
Repo
Framework

An Experimental Comparison of Old and New Decision Tree Algorithms

Title An Experimental Comparison of Old and New Decision Tree Algorithms
Authors Arman Zharmagambetov, Suryabhan Singh Hada, Miguel Á. Carreira-Perpiñán, Magzhan Gabidolla
Abstract This paper presents a detailed comparison of a recently proposed algorithm for optimizing decision trees, tree alternating optimization (TAO), with other popular, established algorithms. We compare their performance on a number of classification and regression datasets of various complexity, different size and dimensionality, across different performance factors: accuracy and tree size (in terms of the number of leaves or the depth of the tree). We find that TAO achieves higher accuracy in nearly all datasets, often by a large margin.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03054v2
PDF https://arxiv.org/pdf/1911.03054v2.pdf
PWC https://paperswithcode.com/paper/an-experimental-comparison-of-old-and-new
Repo
Framework