January 27, 2020

Paper Group ANR 1257

EdgeFool: An Adversarial Image Enhancement Filter. Spoof detection using time-delay shallow neural network and feature switching. Pairwise Neural Machine Translation Evaluation. Recurrent Neural Networks (RNNs): A gentle Introduction and Overview. Multi-Frame GAN: Image Enhancement for Stereo Visual Odometry in Low Light. A sub-Riemannian model of …

EdgeFool: An Adversarial Image Enhancement Filter


Title	EdgeFool: An Adversarial Image Enhancement Filter
Authors	Ali Shahin Shamsabadi, Changjae Oh, Andrea Cavallaro
Abstract	Adversarial examples are intentionally perturbed images that mislead classifiers. These images can, however, be easily detected using denoising algorithms, when high-frequency spatial perturbations are used, or can be noticed by humans, when perturbations are large. In this paper, we propose EdgeFool, an adversarial image enhancement filter that learns structure-aware adversarial perturbations. EdgeFool generates adversarial images with perturbations that enhance image details via training a fully convolutional neural network end-to-end with a multi-task loss function. This loss function accounts for both image detail enhancement and class misleading objectives. We evaluate EdgeFool on three classifiers (ResNet-50, ResNet-18 and AlexNet) using two datasets (ImageNet and Private-Places365) and compare it with six adversarial methods (DeepFool, SparseFool, Carlini-Wagner, SemanticAdv, Non-targeted and Private Fast Gradient Sign Methods). Code is available at https://github.com/smartcameras/EdgeFool.git.
Tasks	Denoising, Image Enhancement
Published	2019-10-27
URL	https://arxiv.org/abs/1910.12227v2
PDF	https://arxiv.org/pdf/1910.12227v2.pdf
PWC	https://paperswithcode.com/paper/edgefool-an-adversarial-image-enhancement
Repo
Framework

Spoof detection using time-delay shallow neural network and feature switching


Title	Spoof detection using time-delay shallow neural network and feature switching
Authors	Mari Ganesh Kumar, Suvidha Rupesh Kumar, Saranya M, B. Bharathi, Hema A. Murthy
Abstract	Detecting spoofed utterances is a fundamental problem in voice-based biometrics. Spoofing can be performed either by logical access, such as speech synthesis and voice conversion, or by physical access, such as replaying a pre-recorded utterance. Inspired by the state-of-the-art x-vector based speaker verification approach, this paper proposes a time-delay shallow neural network (TD-SNN) for spoof detection for both logical and physical access. The novelty of the proposed TD-SNN system vis-a-vis conventional DNN systems is that it can handle variable length utterances during testing. Performance of the proposed TD-SNN systems and the baseline Gaussian mixture models (GMMs) is analyzed on the ASV-spoof-2019 dataset. The performance of the systems is measured in terms of the minimum normalized tandem detection cost function (min-t-DCF). When studied with individual features, the TD-SNN system consistently outperforms the GMM system for physical access. For logical access, GMM surpasses TD-SNN systems for certain individual features. When combined with the decision-level feature switching (DLFS) paradigm, the best TD-SNN system outperforms the best baseline GMM system on evaluation data with a relative improvement of 48.03% and 49.47% for logical and physical access, respectively.
Tasks	Speaker Verification, Speech Synthesis, Voice Conversion
Published	2019-04-16
URL	https://arxiv.org/abs/1904.07453v2
PDF	https://arxiv.org/pdf/1904.07453v2.pdf
PWC	https://paperswithcode.com/paper/spoof-detection-using-x-vector-and-feature
Repo
Framework

Pairwise Neural Machine Translation Evaluation


Title	Pairwise Neural Machine Translation Evaluation
Authors	Francisco Guzman, Shafiq Joty, Lluis Marquez, Preslav Nakov
Abstract	We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation. In this framework, lexical, syntactic and semantic information from the reference and the two hypotheses is compacted into relatively small distributed vector representations, and fed into a multi-layer neural network that models the interaction between each of the hypotheses and the reference, as well as between the two hypotheses. These compact representations are in turn based on word and sentence embeddings, which are learned using neural networks. The framework is flexible, allows for efficient learning and classification, and yields correlation with humans that rivals the state of the art.
Tasks	Machine Translation, Sentence Embeddings
Published	2019-12-05
URL	https://arxiv.org/abs/1912.03135v1
PDF	https://arxiv.org/pdf/1912.03135v1.pdf
PWC	https://paperswithcode.com/paper/pairwise-neural-machine-translation-2
Repo
Framework

Recurrent Neural Networks (RNNs): A gentle Introduction and Overview


Title	Recurrent Neural Networks (RNNs): A gentle Introduction and Overview
Authors	Robin M. Schmidt
Abstract	State-of-the-art solutions in the areas of “Language Modelling & Generating Text”, “Speech Recognition”, “Generating Image Descriptions” or “Video Tagging” have been using Recurrent Neural Networks as the foundation for their approaches. Understanding the underlying concepts is therefore of tremendous importance if we want to keep up with recent or upcoming publications in those areas. In this work we give a short overview of some of the most important concepts in the realm of Recurrent Neural Networks, which enables readers to easily understand the fundamentals, such as but not limited to “Backpropagation through Time” or “Long Short-Term Memory Units”, as well as some of the more recent advances like the “Attention Mechanism” or “Pointer Networks”. We also give recommendations for further reading regarding more complex topics where necessary.
Tasks	Language Modelling, Speech Recognition
Published	2019-11-23
URL	https://arxiv.org/abs/1912.05911v1
PDF	https://arxiv.org/pdf/1912.05911v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-networks-rnns-a-gentle
Repo
Framework

Multi-Frame GAN: Image Enhancement for Stereo Visual Odometry in Low Light


Title	Multi-Frame GAN: Image Enhancement for Stereo Visual Odometry in Low Light
Authors	Eunah Jung, Nan Yang, Daniel Cremers
Abstract	We propose the concept of a multi-frame GAN (MFGAN) and demonstrate its potential as an image sequence enhancement for stereo visual odometry in low light conditions. We base our method on an invertible adversarial network to transfer the beneficial features of brightly illuminated scenes to the sequence in poor illumination without costly paired datasets. In order to preserve the coherent geometric cues for the translated sequence, we present a novel network architecture as well as a novel loss term combining temporal and stereo consistencies based on optical flow estimation. We demonstrate that the enhanced sequences improve the performance of state-of-the-art feature-based and direct stereo visual odometry methods on both synthetic and real datasets in challenging illumination. We also show that MFGAN outperforms other state-of-the-art image enhancement and style transfer methods by a large margin in terms of visual odometry.
Tasks	Image Enhancement, Optical Flow Estimation, Style Transfer, Visual Odometry
Published	2019-10-15
URL	https://arxiv.org/abs/1910.06632v1
PDF	https://arxiv.org/pdf/1910.06632v1.pdf
PWC	https://paperswithcode.com/paper/multi-frame-gan-image-enhancement-for-stereo
Repo
Framework

A sub-Riemannian model of the visual cortex with frequency and phase


Title	A sub-Riemannian model of the visual cortex with frequency and phase
Authors	E. Baspinar, A. Sarti, G. Citti
Abstract	In this paper we present a novel model of the primary visual cortex (V1) based on orientation, frequency and phase selective behavior of the V1 simple cells. We start from the first level mechanisms of visual perception: receptive profiles. The model interprets V1 as a fiber bundle over the 2-dimensional retinal plane by introducing orientation, frequency and phase as intrinsic variables. Each receptive profile on the fiber is mathematically interpreted as a rotated, frequency modulated and phase shifted Gabor function. We start from the Gabor function and show that it induces in a natural way the model geometry and the associated horizontal connectivity modeling the neural connectivity patterns in V1. We provide an image enhancement algorithm employing the model framework. The algorithm is capable of exploiting not only orientation but also frequency and phase information existing intrinsically in a 2-dimensional input image. We provide the experimental results corresponding to the enhancement algorithm.
Tasks	Image Enhancement
Published	2019-10-11
URL	https://arxiv.org/abs/1910.04992v1
PDF	https://arxiv.org/pdf/1910.04992v1.pdf
PWC	https://paperswithcode.com/paper/a-sub-riemannian-model-of-the-visual-cortex
Repo
Framework
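
The abstract above describes each receptive profile as a rotated, frequency modulated and phase shifted Gabor function. As a rough illustration only, the sketch below evaluates a real-valued Gabor function with an isotropic Gaussian envelope. The function name `gabor`, the `sigma` parameter, and the choice of a cosine carrier are assumptions made for this example and are not taken from the paper, whose exact parameterization may differ.

```python
import math


def gabor(x, y, theta, freq, phase, sigma=1.0):
    """Real-valued Gabor function at point (x, y).

    theta: orientation in radians, freq: spatial frequency (cycles per unit),
    phase: phase shift in radians, sigma: width of the Gaussian envelope.
    All names are illustrative assumptions, not the paper's notation.
    """
    # Rotate the coordinates so the carrier wave oscillates along direction theta.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    # Isotropic Gaussian envelope centred at the origin.
    envelope = math.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    # Frequency-modulated, phase-shifted cosine carrier.
    carrier = math.cos(2.0 * math.pi * freq * xr + phase)
    return envelope * carrier


# Sample a small 5x5 receptive profile at orientation 45 degrees.
profile = [
    [gabor(i - 2, j - 2, math.pi / 4, 0.25, 0.0, sigma=1.5) for j in range(5)]
    for i in range(5)
]
```

Varying `theta`, `freq` and `phase` while keeping the retinal position fixed gives a family of profiles over a single point, which loosely mirrors the idea of treating these three quantities as intrinsic variables on a fiber.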

Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks


Title	Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks
Authors	David Pfau, James S. Spencer, Alexander G. de G. Matthews, W. M. C. Foulkes
Abstract	Given access to accurate solutions of the many-electron Schrödinger equation, nearly all chemistry could be derived from first principles. Exact wavefunctions of interesting chemical systems are out of reach because they are NP-hard to compute in general, but approximations can be found using polynomially-scaling algorithms. The key challenge for many of these algorithms is the choice of wavefunction approximation, or Ansatz, which must trade off between efficiency and accuracy. Neural networks have shown impressive power as accurate practical function approximators and promise as a compact wavefunction Ansatz for spin systems, but problems in electronic structure require wavefunctions that obey Fermi-Dirac statistics. Here we introduce a novel deep learning architecture, the Fermionic Neural Network, as a powerful wavefunction Ansatz for many-electron systems. The Fermionic Neural Network is able to achieve accuracy beyond other variational quantum Monte Carlo Ansätze on a variety of atoms and small molecules. Using no data other than atomic positions and charges, we predict the dissociation curves of the nitrogen molecule and hydrogen chain, two challenging strongly-correlated systems, to significantly higher accuracy than the coupled cluster method, widely considered the most accurate scalable method for quantum chemistry at equilibrium geometry. This demonstrates that deep neural networks can improve the accuracy of variational quantum Monte Carlo to the point where it outperforms other ab-initio quantum chemistry methods, opening the possibility of accurate direct optimisation of wavefunctions for previously intractable molecules and solids.
Tasks
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02487v2
PDF	https://arxiv.org/pdf/1909.02487v2.pdf
PWC	https://paperswithcode.com/paper/ab-initio-solution-of-the-many-electron
Repo
Framework

A Network-based Multimodal Data Fusion Approach for Characterizing Dynamic Multimodal Physiological Patterns


Title	A Network-based Multimodal Data Fusion Approach for Characterizing Dynamic Multimodal Physiological Patterns
Authors	Miaolin Fan, Chun-An Chou, Sheng-Che Yen, Yingzi Lin
Abstract	Characterizing the dynamic interactive patterns of complex systems helps gain in-depth understanding of how components interrelate with each other while performing certain functions as a whole. In this study, we present a novel multimodal data fusion approach to construct a complex network, which models the interactions of biological subsystems in the human body under emotional states through physiological responses. Joint recurrence plot and temporal network metrics are employed to integrate the multimodal information at the signal level. A public benchmark dataset is used for evaluating our model.
Tasks
Published	2019-01-03
URL	http://arxiv.org/abs/1901.00877v1
PDF	http://arxiv.org/pdf/1901.00877v1.pdf
PWC	https://paperswithcode.com/paper/a-network-based-multimodal-data-fusion
Repo
Framework

Deep Learning for Low-Field to High-Field MR: Image Quality Transfer with Probabilistic Decimation Simulator


Title	Deep Learning for Low-Field to High-Field MR: Image Quality Transfer with Probabilistic Decimation Simulator
Authors	Hongxiang Lin, Matteo Figini, Ryutaro Tanno, Stefano B. Blumberg, Enrico Kaden, Godwin Ogbole, Biobele J. Brown, Felice D’Arco, David W. Carmichael, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander
Abstract	MR images scanned at low magnetic field (<1T) have lower resolution in the slice direction and lower contrast than those from high field (typically 1.5T and 3T), due to a relatively small signal-to-noise ratio (SNR). We adapt the recent idea of Image Quality Transfer (IQT) to enhance very low-field structural images, aiming to estimate the resolution, spatial coverage, and contrast of high-field images. Analogous to many learning-based image enhancement techniques, IQT generates training data from high-field scans alone by simulating low-field images through a pre-defined decimation model. However, the ground truth decimation model is not well-known in practice, and lack of its specification can bias the trained model, degrading performance on the real low-field scans. In this paper we propose a probabilistic decimation simulator to improve robustness of model training. It is used to generate and augment various low-field images whose parameters are random variables and sampled from an empirical distribution related to tissue-specific SNR on a 0.36T scanner. The probabilistic decimation simulator is model-agnostic, that is, it can be used with any super-resolution network. Furthermore, we propose a variant of the U-Net architecture to improve its learning performance. We show promising qualitative results from clinical low-field images confirming the strong efficacy of IQT in an important new application area: epilepsy diagnosis in sub-Saharan Africa where only low-field scanners are normally available.
Tasks	Image Enhancement, Super-Resolution
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06763v1
PDF	https://arxiv.org/pdf/1909.06763v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-low-field-to-high-field-mr
Repo
Framework

Hierarchical Bayesian Personalized Recommendation: A Case Study and Beyond


Title	Hierarchical Bayesian Personalized Recommendation: A Case Study and Beyond
Authors	Zitao Liu, Zhexuan Xu, Yan Yan
Abstract	Items in modern recommender systems are often organized in hierarchical structures. These hierarchical structures and the data within them provide valuable information for building personalized recommendation systems. In this paper, we propose a general hierarchical Bayesian learning framework, i.e., HBayes, to learn both the structures and associated latent factors. Furthermore, we develop a variational inference algorithm that is able to learn model parameters with a fast empirical convergence rate. The proposed HBayes is evaluated on two real-world datasets from different domains. The results demonstrate the benefits of our approach on item recommendation tasks, and show that it can outperform the state-of-the-art models in terms of precision, recall, and normalized discounted cumulative gain. To encourage reproducible results, we make our code public on a git repo: https://tinyurl.com/ycruhk4t.
Tasks	Recommendation Systems
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07371v1
PDF	https://arxiv.org/pdf/1908.07371v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-bayesian-personalized
Repo
Framework

A method for Cloud Mapping in the Field of View of the Infra-Red Camera during the EUSO-SPB1 flight


Title	A method for Cloud Mapping in the Field of View of the Infra-Red Camera during the EUSO-SPB1 flight
Authors	Alessandro Bruno, Anna Anzalone, Carlo Vigorito
Abstract	EUSO-SPB1 was released on April 24th, 2017, from the NASA balloon launch site in Wanaka (New Zealand) and landed in the South Pacific Ocean on May 7th. The data collected by the instruments onboard the balloon were analyzed to search for UV pulse signatures of UHECR (Ultra High Energy Cosmic Rays) air showers. Indirect measurements of UHECRs can be affected by cloud presence during nighttime, therefore it is crucial to know the meteorological conditions during the observation period of the detector. During the flight, the onboard EUSO-SPB1 UCIRC camera (University of Chicago Infra-Red Camera) acquired images in the field of view of the UV telescope. The available nighttime and daytime images include information on meteorological conditions of the atmosphere observed in two infra-red bands. The presence of clouds has been investigated employing a method developed to provide a dense cloudiness map for each available infra-red image. The final masks are intended to give pixel cloudiness information at the IR-camera pixel resolution, which is nearly four times higher than that of the UV camera. In this work, cloudiness maps are obtained by using an expert system based on the analysis of different low-level image features. Furthermore, an image enhancement step was needed as preprocessing to deal with uncalibrated data.
Tasks	Image Enhancement
Published	2019-09-12
URL	https://arxiv.org/abs/1909.05917v1
PDF	https://arxiv.org/pdf/1909.05917v1.pdf
PWC	https://paperswithcode.com/paper/a-method-for-cloud-mapping-in-the-field-of
Repo
Framework

Cantonese Automatic Speech Recognition Using Transfer Learning from Mandarin


Title	Cantonese Automatic Speech Recognition Using Transfer Learning from Mandarin
Authors	Bryan Li, Xinyue Wang, Homayoon Beigi
Abstract	We propose a system to develop a basic automatic speech recognizer (ASR) for Cantonese, a low-resource language, through transfer learning from Mandarin, a high-resource language. We take a time-delayed neural network trained on Mandarin, and perform weight transfer of several layers to a newly initialized model for Cantonese. We experiment with the number of layers transferred, their learning rates, and pretraining i-vectors. Key findings are that this approach allows for quicker training time with less data. We find that for every epoch, log-probability is smaller for transfer learning models compared to a Cantonese-only model. The transfer learning models show slight improvement in CER.
Tasks	Speech Recognition, Transfer Learning
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09271v1
PDF	https://arxiv.org/pdf/1911.09271v1.pdf
PWC	https://paperswithcode.com/paper/cantonese-automatic-speech-recognition-using
Repo
Framework

Adaptive Sequential Experiments with Unknown Information Flows


Title	Adaptive Sequential Experiments with Unknown Information Flows
Authors	Yonatan Gur, Ahmadreza Momeni
Abstract	Systems that make sequential decisions in the presence of partial feedback on actions often need to strike a balance between maximizing immediate payoffs based on available information, and acquiring new information that may be essential for maximizing future payoffs. This trade-off is captured by the multi-armed bandit (MAB) framework that has been studied and applied for designing sequential experiments when at each time epoch a single observation is collected on the action that was selected at that epoch. However, in many practical settings additional information may become available between decision epochs. We introduce a generalized MAB formulation in which auxiliary information on each arm may appear arbitrarily over time. By obtaining matching lower and upper bounds, we characterize the minimax complexity of this family of MAB problems as a function of the information arrival process, and study how salient characteristics of this process impact policy design and achievable performance. We establish the robustness of a Thompson sampling policy in the presence of additional information, but observe that other policies that are of practical importance do not exhibit such robustness. We therefore introduce a broad adaptive exploration approach for designing policies that, without any prior knowledge on the information arrival process, attain the best performance (in terms of regret rate) that is achievable when the information arrival process is a priori known. Our approach is based on adjusting MAB policies designed to perform well in the absence of auxiliary information by using dynamically customized virtual time indexes to endogenously control the exploration rate of the policy. We demonstrate our approach through appropriately adjusting known MAB policies and establishing improved performance bounds for these policies in the presence of auxiliary information.
Tasks
Published	2019-06-28
URL	https://arxiv.org/abs/1907.00107v1
PDF	https://arxiv.org/pdf/1907.00107v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-sequential-experiments-with-unknown
Repo
Framework
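
The abstract above mentions a Thompson sampling policy that remains robust when auxiliary information arrives between decision epochs. The sketch below is a minimal Beta-Bernoulli Thompson sampling loop in which auxiliary observations, when they appear, are simply folded into the same posterior as the rewards from pulled arms. This is an illustrative assumption, not the authors' formulation: the function name, the `aux_prob` arrival model (each arm independently yields a free observation with a fixed probability per round), and the Bernoulli rewards are all hypothetical choices made for this example.

```python
import random


def thompson_sampling(true_means, horizon, aux_prob=0.0, seed=0):
    """Beta-Bernoulli Thompson sampling with optional auxiliary observations.

    true_means: success probability of each arm (unknown to the policy).
    aux_prob: hypothetical per-round chance that an arm emits a free observation.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1] * k  # Beta(1, 1) uniform prior for every arm
    failures = [1] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one sample from each arm's posterior and play the best sample.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        # Assumed arrival process: auxiliary observations update the posterior
        # without counting as a pull.
        for a in range(k):
            if rng.random() < aux_prob:
                obs = 1 if rng.random() < true_means[a] else 0
                successes[a] += obs
                failures[a] += 1 - obs
    return pulls


pulls_without_aux = thompson_sampling([0.2, 0.8], horizon=1000)
pulls_with_aux = thompson_sampling([0.2, 0.8], horizon=1000, aux_prob=0.5)
```

Comparing the two pull counts gives a feel for how free observations can reduce the number of exploratory pulls spent on the weaker arm. It does not reproduce the regret bounds or the virtual time index adjustment discussed in the abstract.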

Efficient hinging hyperplanes neural network and its application in nonlinear system identification


Title	Efficient hinging hyperplanes neural network and its application in nonlinear system identification
Authors	Jun Xu, Qinghua Tao, Zhen Li, Xiangming Xi, Johan A. K. Suykens, Shuning Wang
Abstract	In this paper, the efficient hinging hyperplanes (EHH) neural network is proposed based on the model of hinging hyperplanes (HH). The EHH neural network is a distributed representation, the training of which involves solving several convex optimization problems and is fast. It is proved that for every EHH neural network, there is an equivalent adaptive hinging hyperplanes (AHH) tree, which was also proposed based on the model of HH and has found good applications in system identification. The construction of the EHH neural network includes two stages. First, the initial structure of the EHH neural network is randomly determined and Lasso regression is used to choose the appropriate network. Second, to alleviate the impact of randomness, a stacking strategy is employed to formulate a more general network structure. Unlike other neural networks, the EHH neural network is interpretable, and its interpretation can be easily obtained through its ANOVA decomposition (or interaction matrix). The interpretability can then be used as a suggestion for input variable selection. The EHH neural network is applied to nonlinear system identification; the simulation results show that the regression vector selected is reasonable and the identification is fast, while the simulation accuracy is satisfactory.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06518v2
PDF	https://arxiv.org/pdf/1905.06518v2.pdf
PWC	https://paperswithcode.com/paper/efficient-hinging-hyperplanes-neural-network
Repo
Framework

Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models


Title	Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models
Authors	Matthew LeMay, Shijian Li, Tian Guo
Abstract	Deep learning models are increasingly used for end-user applications, supporting both novel features such as facial recognition, and traditional features, e.g. web search. To accommodate high inference throughput, it is common to host a single pre-trained Convolutional Neural Network (CNN) in dedicated cloud-based servers with hardware accelerators such as Graphics Processing Units (GPUs). However, GPUs can be orders of magnitude more expensive than traditional Central Processing Unit (CPU) servers. These resources could also be under-utilized facing dynamic workloads, which may result in inflated serving costs. One potential way to alleviate this problem is by allowing hosted models to share the underlying resources, which we refer to as multi-tenant inference serving. One of the key challenges is maximizing the resource efficiency for multi-tenant serving given hardware with diverse characteristics, models with unique response time Service Level Agreement (SLA), and dynamic inference workloads. In this paper, we present Perseus, a measurement framework that provides the basis for understanding the performance and cost trade-offs of multi-tenant model serving. We implemented Perseus in Python atop a popular cloud inference server called Nvidia TensorRT Inference Server. Leveraging Perseus, we evaluated the inference throughput and cost for serving various models and demonstrated that multi-tenant model serving led to up to 12% cost reduction.
Tasks
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02322v2
PDF	https://arxiv.org/pdf/1912.02322v2.pdf
PWC	https://paperswithcode.com/paper/perseus-characterizing-performance-and-cost
Repo
Framework