January 26, 2020


Paper Group ANR 1523


What and Where to Translate: Local Mask-based Image-to-Image Translation


Title	What and Where to Translate: Local Mask-based Image-to-Image Translation
Authors	Wonwoong Cho, Seunghwan Choi, Junwoo Park, David Keetae Park, Tao Qin, Jaegul Choo
Abstract	Recently, image-to-image translation has obtained significant attention. Among many, those approaches based on an exemplar image that contains the target style information have been actively studied, due to their capability to handle multimodality as well as their applicability in practical use. However, two intrinsic problems exist in the existing methods: what and where to transfer. First, those methods extract style from an entire exemplar which includes noisy information, which impedes a translation model from properly extracting the intended style of the exemplar. That is, we need to carefully determine what to transfer from the exemplar. Second, the extracted style is applied to the entire input image, which causes unnecessary distortion in irrelevant image regions. In response, we need to decide where to transfer the extracted style. In this paper, we propose a novel approach that extracts a local mask from the exemplar that determines what style to transfer, and another local mask from the input image that determines where to transfer the extracted style. The main novelty of this paper lies in (1) the highway adaptive instance normalization technique and (2) an end-to-end translation framework which achieves an outstanding performance in reflecting a style of an exemplar. We demonstrate the quantitative and qualitative evaluation results to confirm the advantages of our proposed approach.
Tasks	Image-to-Image Translation
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03598v2
PDF	https://arxiv.org/pdf/1906.03598v2.pdf
PWC	https://paperswithcode.com/paper/what-and-where-to-translate-local-mask-based
Repo
Framework
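
The abstract names a highway adaptive instance normalization technique but does not define it. As a rough illustration only, the sketch below applies standard adaptive instance normalization (AdaIN) to a single feature channel and adds two masks: one selecting which exemplar positions contribute style statistics ("what"), one controlling where the stylized values replace the input ("where"). The function names, the flat-list feature layout, and the blending rule are all assumptions made for this example, not the authors' formulation.

```python
import math


def mean_std(values, eps=1e-5):
    """Return the mean and (eps-stabilised) standard deviation of a list."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var + eps)


def masked_adain(content, style, style_mask, content_mask):
    """Hypothetical masked AdaIN on one channel stored as flat lists.

    style_mask picks the exemplar positions used for style statistics;
    it is assumed to select at least one position.
    content_mask (values in [0, 1]) blends stylized and original features.
    """
    selected = [s for s, keep in zip(style, style_mask) if keep > 0.5]
    s_mean, s_std = mean_std(selected)
    c_mean, c_std = mean_std(content)
    out = []
    for c, m in zip(content, content_mask):
        # Standard AdaIN: normalise the content, then re-scale with style stats.
        stylized = (c - c_mean) / c_std * s_std + s_mean
        # Assumed blending rule: pass the input through where the mask is 0.
        out.append(m * stylized + (1.0 - m) * c)
    return out
```

With a content mask of all zeros the input passes through unchanged, and with a mask of all ones the output mean matches the mean of the selected exemplar positions.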

Bayesian Nonparametric Federated Learning of Neural Networks


Title	Bayesian Nonparametric Federated Learning of Neural Networks
Authors	Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang, Yasaman Khazaeni
Abstract	In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to synthesize a more expressive global network without additional supervision, data pooling and with as few as a single communication round. We then demonstrate the efficacy of our approach on federated learning problems simulated from two popular image classification datasets.
Tasks	Image Classification
Published	2019-05-28
URL	https://arxiv.org/abs/1905.12022v1
PDF	https://arxiv.org/pdf/1905.12022v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-nonparametric-federated-learning-of
Repo
Framework

Adaptive Quantile Low-Rank Matrix Factorization


Title	Adaptive Quantile Low-Rank Matrix Factorization
Authors	Shuang Xu, Chun-Xia Zhang, Jiangshe Zhang
Abstract	Low-rank matrix factorization (LRMF) has gained much popularity owing to its successful applications in both computer vision and data mining. By assuming noise to come from a Gaussian, Laplace or mixture of Gaussian distributions, significant efforts have been made on optimizing the (weighted) $L_1$ or $L_2$-norm loss between an observed matrix and its bilinear factorization. However, the type of noise distribution is generally unknown in real applications and inappropriate assumptions will inevitably deteriorate the behavior of LRMF. On the other hand, real data are often corrupted by skewed rather than symmetric noise. To tackle this problem, this paper presents a novel LRMF model called AQ-LRMF by modeling noise with a mixture of asymmetric Laplace distributions. An efficient algorithm based on the expectation-maximization (EM) algorithm is also offered to estimate the parameters involved in AQ-LRMF. The AQ-LRMF model possesses the advantage that it can approximate noise well no matter whether the real noise is symmetric or skewed. The core idea of AQ-LRMF lies in solving a weighted $L_1$ problem with weights being learned from data. The experiments conducted on synthetic and real datasets show that AQ-LRMF outperforms several state-of-the-art techniques. Furthermore, AQ-LRMF is also superior to the other algorithms in terms of capturing local structural information contained in real images.
Tasks
Published	2019-01-01
URL	https://arxiv.org/abs/1901.00140v3
PDF	https://arxiv.org/pdf/1901.00140v3.pdf
PWC	https://paperswithcode.com/paper/adaptive-quantile-low-rank-matrix
Repo
Framework
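
To make the "weighted $L_1$" remark in the abstract concrete: up to constants, the negative log-density of an asymmetric Laplace distribution is the check (quantile) loss, which is an absolute error whose weight depends on the sign of the residual. The sketch below shows only that loss for a single fixed skewness parameter `tau`. The mixture model and the EM updates of AQ-LRMF are not reproduced, and the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (check) loss: tau * |r| for r >= 0, (1 - tau) * |r| otherwise."""
    weight = tau if residual >= 0 else 1.0 - tau
    return weight * abs(residual)


def weighted_l1(observed, reconstructed, tau):
    """Sum of check losses between observed entries and a low-rank reconstruction."""
    return sum(check_loss(x - y, tau) for x, y in zip(observed, reconstructed))
```

With `tau = 0.5` this reduces to half the ordinary $L_1$ loss. Any other value penalizes positive and negative residuals differently, which is how a skewed noise model enters the objective.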

Eye Contact Correction using Deep Neural Networks


Title	Eye Contact Correction using Deep Neural Networks
Authors	Leo F. Isikdogan, Timo Gerasimow, Gilad Michael
Abstract	In a typical video conferencing setup, it is hard to maintain eye contact during a call since it requires looking into the camera rather than the display. We propose an eye contact correction model that restores the eye contact regardless of the relative position of the camera and display. Unlike previous solutions, our model redirects the gaze from an arbitrary direction to the center without requiring a redirection angle or camera/display/user geometry as inputs. We use a deep convolutional neural network that inputs a monocular image and produces a vector field and a brightness map to correct the gaze. We train this model in a bi-directional way on a large set of synthetically generated photorealistic images with perfect labels. The learned model is a robust eye contact corrector which also predicts the input gaze implicitly at no additional cost. Our system is primarily designed to improve the quality of video conferencing experience. Therefore, we use a set of control mechanisms to prevent creepy results and to ensure a smooth and natural video conferencing experience. The entire eye contact correction system runs end-to-end in real-time on a commodity CPU and does not require any dedicated hardware, making our solution feasible for a variety of devices.
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05378v2
PDF	https://arxiv.org/pdf/1906.05378v2.pdf
PWC	https://paperswithcode.com/paper/eye-contact-correction-using-deep-neural
Repo
Framework

Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation


Title	Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation
Authors	Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin, Tamás Keviczky
Abstract	Electronic power inverters are capable of quickly delivering reactive power to maintain customer voltages within operating tolerances and to reduce system losses in distribution grids. This paper proposes a systematic and data-driven approach to determine reactive power inverter output as a function of local measurements in a manner that obtains near optimal results. First, we use a network model and historic load and generation data and solve an optimal power flow problem to compute globally optimal reactive power injections for all controllable inverters in the network. Subsequently, we use regression to find a function for each inverter that maps its local historical data to an approximation of its optimal reactive power injection. The resulting functions then serve as decentralized controllers in the participating inverters to predict the optimal injection based on new local measurements. The method achieves near-optimal results when performing voltage- and capacity-constrained loss minimization and voltage flattening, and allows for an efficient volt-VAR optimization (VVO) scheme in which legacy control equipment collaborates with existing inverters to facilitate safe operation of distribution networks with higher levels of distributed generation.
Tasks
Published	2019-02-20
URL	http://arxiv.org/abs/1902.08594v1
PDF	http://arxiv.org/pdf/1902.08594v1.pdf
PWC	https://paperswithcode.com/paper/regression-based-inverter-control-for
Repo
Framework
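
The second and third steps of the abstract (fit a per-inverter regression on historical data, then use it as a local controller) can be sketched as follows. This is a minimal illustration under strong assumptions: a single local feature (for example a voltage magnitude in per unit), an ordinary least-squares line, and a symmetric capacity limit `q_max`. The optimal injections that serve as regression targets are assumed to come from an offline optimal power flow solve that is not shown, and all names are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def make_controller(slope, intercept, q_max):
    """Return a local control law that clips its output to the inverter capacity."""
    def controller(measurement):
        q = slope * measurement + intercept
        return max(-q_max, min(q_max, q))
    return controller


# Hypothetical history: local voltage (p.u.) and OPF-optimal reactive power.
voltages = [0.98, 1.00, 1.02]
optimal_q = [0.2, 0.0, -0.2]
controller = make_controller(*fit_linear(voltages, optimal_q), q_max=0.3)
```

Once fitted, `controller` needs only the local measurement at run time, which is what makes the scheme decentralized.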

Learning from Multiple Corrupted Sources, with Application to Learning from Label Proportions


Title	Learning from Multiple Corrupted Sources, with Application to Learning from Label Proportions
Authors	Clayton Scott, Jianxin Zhang
Abstract	We study binary classification in the setting where the learner is presented with multiple corrupted training samples, with possibly different sample sizes and degrees of corruption, and introduce an approach based on minimizing a weighted combination of corruption-corrected empirical risks. We establish a generalization error bound, and further show that the bound is optimized when the weights are certain interpretable and intuitive functions of the sample sizes and degrees of corruptions. We then apply this setting to the problem of learning with label proportions (LLP), and propose an algorithm that enjoys the most general statistical performance guarantees known for LLP. Experiments demonstrate the utility of our theory.
Tasks
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04665v1
PDF	https://arxiv.org/pdf/1910.04665v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-multiple-corrupted-sources-with
Repo
Framework

Semantics for Global and Local Interpretation of Deep Neural Networks


Title	Semantics for Global and Local Interpretation of Deep Neural Networks
Authors	Jindong Gu, Volker Tresp
Abstract	Deep neural networks (DNNs) with high expressiveness have achieved state-of-the-art performance in many tasks. However, their distributed feature representations are difficult to interpret semantically. In this work, human-interpretable semantic concepts are associated with vectors in feature space. The association process is mathematically formulated as an optimization problem. The semantic vectors obtained from the optimal solution are applied to interpret deep neural networks globally and locally. The global interpretations are useful to understand the knowledge learned by DNNs. The interpretation of local behaviors can help to understand individual decisions made by DNNs better. The empirical experiments demonstrate how to use identified semantics to interpret the existing DNNs.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09085v1
PDF	https://arxiv.org/pdf/1910.09085v1.pdf
PWC	https://paperswithcode.com/paper/semantics-for-global-and-local-interpretation
Repo
Framework

ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks


Title	ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks
Authors	Monika Sharma, Shikha Gupta, Arindam Chowdhury, Lovekesh Vig
Abstract	Despite the improvements in perception accuracies brought about via deep learning, developing systems combining accurate visual perception with the ability to reason over the visual percepts remains extremely challenging. A particular application area of interest from an accessibility perspective is that of reasoning over statistical charts such as bar and pie charts. To this end, we formulate the problem of reasoning over statistical charts as a classification task using MAC-Networks to give answers from a predefined vocabulary of generic answers. Additionally, we enhance the capabilities of MAC-Networks to give chart-specific answers to open-ended questions by replacing the classification layer with a regression layer to localize the textual answers present over the images. We call our network ChartNet, and demonstrate its efficacy on predicting both in-vocabulary and out-of-vocabulary answers. To test our methods, we generated our own dataset of statistical chart images and corresponding question-answer pairs. Results show that ChartNet consistently outperforms other state-of-the-art methods on reasoning over these questions and may be a viable candidate for applications containing images of statistical charts.
Tasks	Visual Reasoning
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09375v1
PDF	https://arxiv.org/pdf/1911.09375v1.pdf
PWC	https://paperswithcode.com/paper/chartnet-visual-reasoning-over-statistical
Repo
Framework

Revisiting Precision and Recall Definition for Generative Model Evaluation


Title	Revisiting Precision and Recall Definition for Generative Model Evaluation
Authors	Loïc Simon, Ryan Webster, Julien Rabin
Abstract	In this article we revisit the definition of Precision-Recall (PR) curves for generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than providing a scalar for generative quality, PR curves distinguish mode-collapse (poor recall) and bad quality (poor precision). We first generalize their formulation to arbitrary measures, hence removing any restriction to finite support. We also expose a bridge between PR curves and type I and type II error rates of likelihood ratio classifiers on the task of discriminating between samples of the two distributions. Building upon this new perspective, we propose a novel algorithm to approximate precision-recall curves, that shares some interesting methodological properties with the hypothesis testing technique from Lopez-Paz et al (arXiv:1610.06545). We demonstrate the interest of the proposed formulation over the original approach on controlled multi-modal datasets.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05441v1
PDF	https://arxiv.org/pdf/1905.05441v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-precision-and-recall-definition
Repo
Framework
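
For orientation, the finite-support definition that this article takes as its starting point can be written in a few lines. For a reference distribution P, a model distribution Q, and a slope lambda > 0, precision is the sum over outcomes of min(lambda * P, Q), and recall is that precision divided by lambda. The sketch below evaluates one such point. It does not implement the article's generalization to arbitrary measures or its proposed approximation algorithm, and the function name is hypothetical.

```python
def prd_point(p_ref, q_model, lam):
    """One (precision, recall) point for discrete distributions on a shared support."""
    precision = sum(min(lam * p, q) for p, q in zip(p_ref, q_model))
    recall = precision / lam
    return precision, recall


# Mode collapse: the model puts all its mass on one of two equally likely modes.
reference = [0.5, 0.5]
collapsed = [1.0, 0.0]
```

Sweeping `lam` over this collapsed example reaches a precision of 1.0 but never a recall above 0.5, which matches the abstract's description of mode collapse as poor recall.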

Semi-Supervised Few-Shot Learning with Prototypical Random Walks


Title	Semi-Supervised Few-Shot Learning with Prototypical Random Walks
Authors	Ahmed Ayyad, Nassir Navab, Mohamed Elhoseiny, Shadi Albarqouni
Abstract	Recent progress has shown that few-shot learning can be improved with access to unlabelled data, known as semi-supervised few-shot learning (SS-FSL). We introduce an SS-FSL approach, dubbed Prototypical Random Walk Networks (PRWN), built on top of Prototypical Networks (PN). We develop a random walk semi-supervised loss that enables the network to learn representations that are compact and well-separated. Our work is related to the very recent development on graph-based approaches for few-shot learning. However, we show that compact and well-separated class representations can be achieved by modeling our prototypical random walk notion without needing additional graph-NN parameters or requiring a transductive setting where a collective test set is provided. Our model outperforms prior art in most benchmarks with significant improvements in some cases. For example, in a mini-Imagenet 5-shot classification task, we obtain 69.65% accuracy compared to the 64.59% state of the art. Our model, trained with 40% of the data as labelled, compares competitively against fully supervised prototypical networks, trained on 100% of the labels, even outperforming it in the 1-shot mini-Imagenet case with 50.89% to 49.4% accuracy. We also show that our model is resistant to distractors, unlabeled data that does not belong to any of the training classes, hence reflecting robustness to labelled/unlabelled class distribution mismatch. We also performed a challenging discriminative power test, showing a relative improvement on top of the baseline of $\approx$14% on 20 classes on mini-Imagenet and $\approx$60% on 800 classes on Omniglot. Code will be made available.
Tasks	Few-Shot Learning, Omniglot
Published	2019-03-06
URL	https://arxiv.org/abs/1903.02164v2
PDF	https://arxiv.org/pdf/1903.02164v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-few-shot-learning-with-local
Repo
Framework
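
The approach builds on Prototypical Networks, whose core step is simple enough to sketch: each class prototype is the mean of that class's support embeddings, and a query is assigned to the nearest prototype. The example below shows only this baseline step on already-computed embeddings. The random walk semi-supervised loss is not reproduced, and the helper names and the toy two-dimensional embeddings are assumptions for illustration.

```python
def prototype(embeddings):
    """Class prototype: the element-wise mean of the support embeddings."""
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Hypothetical 2-way, 2-shot support set of 2-D embeddings.
support = {"cat": [[0.0, 0.0], [0.0, 2.0]], "dog": [[4.0, 4.0], [6.0, 4.0]]}
prototypes = {label: prototype(vectors) for label, vectors in support.items()}
```

Compact, well-separated class clusters make this nearest-prototype rule more reliable, which is the property the abstract says the random walk loss encourages.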

Learning to Customize Language Model for Generation-based dialog systems


Title	Learning to Customize Language Model for Generation-based dialog systems
Authors	Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang
Abstract	Personalized conversation systems have received increasing attention recently. Existing personalized conversation models tend to employ the meta-learning framework that first finds the initial parameters, then fine-tunes on a few personal utterances. However, fine-tuning can only make slight changes to the initial parameters, resulting in similar language models for different users. In this paper, we propose to customize a conversation model with unique network structures for each user. Concretely, we introduce a private network to the language model, whose structure will evolve during training to better capture the unique characteristics of the user. The private network is only trained on the corpora of the corresponding user, and similar users can share partial private structure for data reuse. Experiment results show that our algorithm outperforms all the baselines in terms of personality, quality, and diversity measurement.
Tasks	Language Modelling, Meta-Learning
Published	2019-10-31
URL	https://arxiv.org/abs/1910.14326v1
PDF	https://arxiv.org/pdf/1910.14326v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-customize-language-model-for
Repo
Framework

Data-driven PDE discovery with evolutionary approach


Title	Data-driven PDE discovery with evolutionary approach
Authors	Michail Maslyaev, Alexander Hvatov, Anna Kalyuzhnaya
Abstract	Data-driven models allow one to define the model structure in cases when a priori information is not sufficient to build other types of models. One possible way to obtain a physical interpretation is to use data-driven differential equation discovery techniques. The existing methods of PDE (partial differential equation) discovery are tied to sparse regression. However, sparse regression restricts the resulting model form, since the terms for the PDE are defined before regression. The evolutionary approach described in the article is instead based on symbolic regression and thus has fewer restrictions on the PDE form. The evolutionary method of PDE discovery (EPDE) is described and tested on several canonical PDEs. The question of robustness is examined on a noisy data example.
Tasks
Published	2019-03-19
URL	http://arxiv.org/abs/1903.08011v2
PDF	http://arxiv.org/pdf/1903.08011v2.pdf
PWC	https://paperswithcode.com/paper/data-driven-pde-discovery-with-evolutionary
Repo
Framework

A Genetic Algorithm Enabled Similarity-Based Attack on Cancellable Biometrics


Title	A Genetic Algorithm Enabled Similarity-Based Attack on Cancellable Biometrics
Authors	Xingbo Dong, Zhe Jin, Andrew Teoh Beng Jin
Abstract	Cancellable biometrics (CB), as a biometric template protection approach, refers to an irreversible yet similarity-preserving transformation on the original template. With the similarity-preserving property, the matching between template and query instance can be performed in the transform domain without jeopardizing accuracy performance. Unfortunately, this trait invites a class of attack, namely similarity-based attack (SA). SA produces a preimage, an inverse of the transformed template, which can be exploited for impersonation and cross-matching. In this paper, we propose a Genetic Algorithm enabled similarity-based attack framework (GASAF) to demonstrate that CB schemes that possess the similarity-preserving property are highly vulnerable to similarity-based attack. Besides that, a set of new metrics is designed to measure the effectiveness of the similarity-based attack. We conduct the experiment on two representative CB schemes, i.e. BioHashing and Bloom-filter. The experimental results attest to the vulnerability under this type of attack.
Tasks
Published	2019-05-08
URL	https://arxiv.org/abs/1905.03021v2
PDF	https://arxiv.org/pdf/1905.03021v2.pdf
PWC	https://paperswithcode.com/paper/a-genetic-algorithm-enabled-similarity-based
Repo
Framework

Variational Registration of Multiple Images with the SVD based SqN Distance Measure


Title	Variational Registration of Multiple Images with the SVD based SqN Distance Measure
Authors	Kai Brehmer, Hari Om Aggrawal, Stefan Heldmann, Jan Modersitzki
Abstract	Image registration, especially the quantification of image similarity, is an important task in image processing. Various approaches for the comparison of two images are discussed in the literature. However, although most of these approaches perform very well in a two-image scenario, an extension to a multiple-image scenario deserves attention. In this article, we discuss and compare registration methods for multiple images. Our key assumption is that information about the singular values of a feature matrix of images can be used for alignment. We introduce, discuss and relate three recent approaches from the literature: the Schatten q-norm based SqN distance measure, a rank based approach, and a feature volume based approach. We also present results for typical applications such as dynamic image sequences or stacks of histological sections. Our results indicate that the SqN approach is in fact a suitable distance measure for image registration. Moreover, our examples also indicate that the results obtained by SqN are superior to those obtained by its competitors.
Tasks	Image Registration
Published	2019-07-23
URL	https://arxiv.org/abs/1907.09732v1
PDF	https://arxiv.org/pdf/1907.09732v1.pdf
PWC	https://paperswithcode.com/paper/variational-registration-of-multiple-images
Repo
Framework
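
The key assumption in the abstract, that singular values of a feature matrix carry alignment information, can be illustrated on the smallest possible case. The sketch below stacks two feature vectors as the columns of a matrix, obtains its two singular values from the eigenvalues of the 2x2 Gram matrix, and evaluates a Schatten q-norm. The restriction to two images, the use of raw vectors as features, and the function names are assumptions for this example. The SqN measure as defined in the paper may normalize or choose features differently.

```python
import math


def singular_values_two_columns(u, v):
    """Singular values of the matrix [u v], via its 2x2 Gram matrix."""
    a = sum(x * x for x in u)
    b = sum(x * y for x, y in zip(u, v))
    c = sum(y * y for y in v)
    centre = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, centre + radius)), math.sqrt(max(0.0, centre - radius))


def schatten_norm(singular_values, q):
    """Schatten q-norm: the l_q norm of the singular values."""
    return sum(s ** q for s in singular_values) ** (1.0 / q)
```

For two identical unit vectors the matrix has rank one and the Schatten 1-norm is about 1.414, whereas for two orthogonal unit vectors it is 2.0, so in this toy setting a smaller value indicates linearly dependent, that is better aligned, features.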

Recovering Dropped Pronouns in Chinese Conversations via Modeling Their Referents


Title	Recovering Dropped Pronouns in Chinese Conversations via Modeling Their Referents
Authors	Jingxuan Yang, Jianzhuo Tong, Si Li, Sheng Gao, Jun Guo, Nianwen Xue
Abstract	Pronouns are often dropped in Chinese sentences, and this happens more frequently in conversational genres as their referents can be easily understood from context. Recovering dropped pronouns is essential to applications such as Information Extraction where the referents of these dropped pronouns need to be resolved, or Machine Translation when Chinese is the source language. In this work, we present a novel end-to-end neural network model to recover dropped pronouns in conversational data. Our model is based on a structured attention mechanism that models the referents of dropped pronouns utilizing both sentence-level and word-level information. Results on three different conversational genres show that our approach achieves a significant improvement over the current state of the art.
Tasks	Machine Translation
Published	2019-05-17
URL	https://arxiv.org/abs/1906.02128v1
PDF	https://arxiv.org/pdf/1906.02128v1.pdf
PWC	https://paperswithcode.com/paper/190602128
Repo
Framework