Paper Group ANR 1523
What and Where to Translate: Local Mask-based Image-to-Image Translation
Title | What and Where to Translate: Local Mask-based Image-to-Image Translation |
Authors | Wonwoong Cho, Seunghwan Choi, Junwoo Park, David Keetae Park, Tao Qin, Jaegul Choo |
Abstract | Recently, image-to-image translation has attracted significant attention. Among many approaches, those based on an exemplar image that contains the target style information have been actively studied, owing to their ability to handle multimodality and their applicability in practical use. However, two intrinsic problems exist in the existing methods: what and where to transfer. First, those methods extract style from an entire exemplar, which includes noisy information and impedes a translation model from properly extracting the intended style of the exemplar. That is, we need to carefully determine what to transfer from the exemplar. Second, the extracted style is applied to the entire input image, which causes unnecessary distortion in irrelevant image regions. In response, we need to decide where to transfer the extracted style. In this paper, we propose a novel approach that extracts a local mask from the exemplar that determines what style to transfer, and another local mask from the input image that determines where to transfer the extracted style. The main novelty of this paper lies in (1) the highway adaptive instance normalization technique and (2) an end-to-end translation framework that achieves outstanding performance in reflecting the style of an exemplar. We present quantitative and qualitative evaluation results to confirm the advantages of our proposed approach. |
Tasks | Image-to-Image Translation |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03598v2 |
https://arxiv.org/pdf/1906.03598v2.pdf | |
PWC | https://paperswithcode.com/paper/what-and-where-to-translate-local-mask-based |
Repo | |
Framework | |
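The abstract above builds on adaptive instance normalization. As a rough illustration only, the following sketch shows the standard (non-highway, mask-free) form of that operation on a single feature channel: the content channel is normalized and then rescaled to the style channel's mean and standard deviation. The function names, the `eps` value, and the toy data are assumptions for illustration; this is not the paper's highway variant.

```python
import math


def mean_std(values, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of a channel."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var + eps)


def adain(content, style, eps=1e-5):
    """Standard adaptive instance normalization for one channel.

    Normalizes `content` to zero mean / unit variance, then rescales it
    with the statistics of `style`.
    """
    c_mu, c_std = mean_std(content, eps)
    s_mu, s_std = mean_std(style, eps)
    return [s_std * (c - c_mu) / c_std + s_mu for c in content]


# Hypothetical toy channels (flattened spatial positions).
content_channel = [0.0, 1.0, 2.0, 3.0]
style_channel = [10.0, 20.0, 30.0, 40.0]
stylized = adain(content_channel, style_channel)
```

The output keeps the spatial ordering of the content values while taking on the mean of the style channel. A local mask, as described in the abstract, would restrict which positions contribute to or receive these statistics.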
Bayesian Nonparametric Federated Learning of Neural Networks
Title | Bayesian Nonparametric Federated Learning of Neural Networks |
Authors | Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang, Yasaman Khazaeni |
Abstract | In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to synthesize a more expressive global network without additional supervision, data pooling and with as few as a single communication round. We then demonstrate the efficacy of our approach on federated learning problems simulated from two popular image classification datasets. |
Tasks | Image Classification |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12022v1 |
https://arxiv.org/pdf/1905.12022v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-nonparametric-federated-learning-of |
Repo | |
Framework | |
Adaptive Quantile Low-Rank Matrix Factorization
Title | Adaptive Quantile Low-Rank Matrix Factorization |
Authors | Shuang Xu, Chun-Xia Zhang, Jiangshe Zhang |
Abstract | Low-rank matrix factorization (LRMF) has become popular owing to its successful applications in both computer vision and data mining. By assuming noise to come from a Gaussian, Laplace or mixture of Gaussian distributions, significant efforts have been made on optimizing the (weighted) $L_1$ or $L_2$-norm loss between an observed matrix and its bilinear factorization. However, the type of noise distribution is generally unknown in real applications, and inappropriate assumptions will inevitably degrade the behavior of LRMF. Moreover, real data are often corrupted by skewed rather than symmetric noise. To tackle this problem, this paper presents a novel LRMF model called AQ-LRMF that models noise with a mixture of asymmetric Laplace distributions. An efficient algorithm based on expectation-maximization (EM) is also offered to estimate the parameters involved in AQ-LRMF. The AQ-LRMF model has the advantage that it can approximate noise well regardless of whether the real noise is symmetric or skewed. The core idea of AQ-LRMF lies in solving a weighted $L_1$ problem with weights being learned from data. The experiments conducted on synthetic and real datasets show that AQ-LRMF outperforms several state-of-the-art techniques. Furthermore, AQ-LRMF is also superior to the other algorithms in capturing local structural information contained in real images. |
Tasks | |
Published | 2019-01-01 |
URL | https://arxiv.org/abs/1901.00140v3 |
https://arxiv.org/pdf/1901.00140v3.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-quantile-low-rank-matrix |
Repo | |
Framework | |
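To make the link between asymmetric Laplace noise and a weighted $L_1$ loss more concrete, here is a minimal sketch of the quantile (pinball) loss, which equals the negative log-density of an asymmetric Laplace distribution up to scale and additive constants. The function name and the sample residuals are assumptions for illustration; the mixture model and EM updates of AQ-LRMF are not reproduced here.

```python
def pinball_loss(residual, tau):
    """Quantile (pinball) loss for a single residual.

    Positive residuals are weighted by tau, negative ones by (1 - tau),
    so tau != 0.5 penalizes the two signs asymmetrically.
    """
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


# With tau = 0.5 the loss is symmetric: half the absolute error.
symmetric_pos = pinball_loss(2.0, 0.5)
symmetric_neg = pinball_loss(-2.0, 0.5)

# With tau = 0.9, under-prediction (positive residual) costs far more.
skewed_pos = pinball_loss(2.0, 0.9)
skewed_neg = pinball_loss(-2.0, 0.9)
```

Choosing `tau` per mixture component is one way to read "weights being learned from data": each residual ends up with a sign-dependent $L_1$ weight.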
Eye Contact Correction using Deep Neural Networks
Title | Eye Contact Correction using Deep Neural Networks |
Authors | Leo F. Isikdogan, Timo Gerasimow, Gilad Michael |
Abstract | In a typical video conferencing setup, it is hard to maintain eye contact during a call since it requires looking into the camera rather than the display. We propose an eye contact correction model that restores the eye contact regardless of the relative position of the camera and display. Unlike previous solutions, our model redirects the gaze from an arbitrary direction to the center without requiring a redirection angle or camera/display/user geometry as inputs. We use a deep convolutional neural network that inputs a monocular image and produces a vector field and a brightness map to correct the gaze. We train this model in a bi-directional way on a large set of synthetically generated photorealistic images with perfect labels. The learned model is a robust eye contact corrector which also predicts the input gaze implicitly at no additional cost. Our system is primarily designed to improve the quality of video conferencing experience. Therefore, we use a set of control mechanisms to prevent creepy results and to ensure a smooth and natural video conferencing experience. The entire eye contact correction system runs end-to-end in real-time on a commodity CPU and does not require any dedicated hardware, making our solution feasible for a variety of devices. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05378v2 |
https://arxiv.org/pdf/1906.05378v2.pdf | |
PWC | https://paperswithcode.com/paper/eye-contact-correction-using-deep-neural |
Repo | |
Framework | |
Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation
Title | Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation |
Authors | Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin, Tamás Keviczky |
Abstract | Electronic power inverters are capable of quickly delivering reactive power to maintain customer voltages within operating tolerances and to reduce system losses in distribution grids. This paper proposes a systematic and data-driven approach to determine reactive power inverter output as a function of local measurements in a manner that obtains near-optimal results. First, we use a network model together with historic load and generation data, and solve an optimal power flow problem to compute globally optimal reactive power injections for all controllable inverters in the network. Subsequently, we use regression to find a function for each inverter that maps its local historical data to an approximation of its optimal reactive power injection. The resulting functions then serve as decentralized controllers in the participating inverters to predict the optimal injection based on new local measurements. The method achieves near-optimal results when performing voltage- and capacity-constrained loss minimization and voltage flattening, and allows for an efficient volt-VAR optimization (VVO) scheme in which legacy control equipment collaborates with existing inverters to facilitate safe operation of distribution networks with higher levels of distributed generation. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.08594v1 |
http://arxiv.org/pdf/1902.08594v1.pdf | |
PWC | https://paperswithcode.com/paper/regression-based-inverter-control-for |
Repo | |
Framework | |
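The regression step described in the abstract can be sketched as follows, under strong simplifying assumptions: a single local feature (here a hypothetical per-unit voltage reading), a linear model, and made-up "optimal" reactive power injections standing in for the output of the optimal power flow solve. None of these values or names come from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical historical data for one inverter:
# local voltage (per unit) and the offline-computed optimal injection.
local_voltage = [0.96, 0.98, 1.00, 1.02, 1.04]
optimal_q = [0.08, 0.04, 0.00, -0.04, -0.08]

slope, intercept = fit_line(local_voltage, optimal_q)


def controller(voltage):
    """Decentralized controller: predict the injection from a local reading."""
    return slope * voltage + intercept


predicted_q = controller(1.01)
```

Once fitted offline, each inverter only needs its own measurement at run time, which is what makes the scheme decentralized.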
Learning from Multiple Corrupted Sources, with Application to Learning from Label Proportions
Title | Learning from Multiple Corrupted Sources, with Application to Learning from Label Proportions |
Authors | Clayton Scott, Jianxin Zhang |
Abstract | We study binary classification in the setting where the learner is presented with multiple corrupted training samples, with possibly different sample sizes and degrees of corruption, and introduce an approach based on minimizing a weighted combination of corruption-corrected empirical risks. We establish a generalization error bound, and further show that the bound is optimized when the weights are certain interpretable and intuitive functions of the sample sizes and degrees of corruptions. We then apply this setting to the problem of learning with label proportions (LLP), and propose an algorithm that enjoys the most general statistical performance guarantees known for LLP. Experiments demonstrate the utility of our theory. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04665v1 |
https://arxiv.org/pdf/1910.04665v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-multiple-corrupted-sources-with |
Repo | |
Framework | |
Semantics for Global and Local Interpretation of Deep Neural Networks
Title | Semantics for Global and Local Interpretation of Deep Neural Networks |
Authors | Jindong Gu, Volker Tresp |
Abstract | Deep neural networks (DNNs) with high expressiveness have achieved state-of-the-art performance in many tasks. However, their distributed feature representations are difficult to interpret semantically. In this work, human-interpretable semantic concepts are associated with vectors in feature space. The association process is mathematically formulated as an optimization problem. The semantic vectors obtained from the optimal solution are applied to interpret deep neural networks globally and locally. The global interpretations are useful to understand the knowledge learned by DNNs. The interpretation of local behaviors can help to understand individual decisions made by DNNs better. The empirical experiments demonstrate how to use identified semantics to interpret the existing DNNs. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09085v1 |
https://arxiv.org/pdf/1910.09085v1.pdf | |
PWC | https://paperswithcode.com/paper/semantics-for-global-and-local-interpretation |
Repo | |
Framework | |
ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks
Title | ChartNet: Visual Reasoning over Statistical Charts using MAC-Networks |
Authors | Monika Sharma, Shikha Gupta, Arindam Chowdhury, Lovekesh Vig |
Abstract | Despite the improvements in perception accuracies brought about via deep learning, developing systems combining accurate visual perception with the ability to reason over the visual percepts remains extremely challenging. A particular application area of interest from an accessibility perspective is that of reasoning over statistical charts such as bar and pie charts. To this end, we formulate the problem of reasoning over statistical charts as a classification task using MAC-Networks to give answers from a predefined vocabulary of generic answers. Additionally, we enhance the capabilities of MAC-Networks to give chart-specific answers to open-ended questions by replacing the classification layer with a regression layer to localize the textual answers present over the images. We call our network ChartNet, and demonstrate its efficacy on predicting both in-vocabulary and out-of-vocabulary answers. To test our methods, we generated our own dataset of statistical chart images and corresponding question-answer pairs. Results show that ChartNet consistently outperforms other state-of-the-art methods on reasoning over these questions and may be a viable candidate for applications containing images of statistical charts. |
Tasks | Visual Reasoning |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09375v1 |
https://arxiv.org/pdf/1911.09375v1.pdf | |
PWC | https://paperswithcode.com/paper/chartnet-visual-reasoning-over-statistical |
Repo | |
Framework | |
Revisiting Precision and Recall Definition for Generative Model Evaluation
Title | Revisiting Precision and Recall Definition for Generative Model Evaluation |
Authors | Loïc Simon, Ryan Webster, Julien Rabin |
Abstract | In this article we revisit the definition of Precision-Recall (PR) curves for generative models proposed by Sajjadi et al. (arXiv:1806.00035). Rather than providing a scalar for generative quality, PR curves distinguish mode-collapse (poor recall) and bad quality (poor precision). We first generalize their formulation to arbitrary measures, hence removing any restriction to finite support. We also expose a bridge between PR curves and type I and type II error rates of likelihood ratio classifiers on the task of discriminating between samples of the two distributions. Building upon this new perspective, we propose a novel algorithm to approximate precision-recall curves, that shares some interesting methodological properties with the hypothesis testing technique from Lopez-Paz et al (arXiv:1610.06545). We demonstrate the interest of the proposed formulation over the original approach on controlled multi-modal datasets. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05441v1 |
https://arxiv.org/pdf/1905.05441v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-precision-and-recall-definition |
Repo | |
Framework | |
Semi-Supervised Few-Shot Learning with Prototypical Random Walks
Title | Semi-Supervised Few-Shot Learning with Prototypical Random Walks |
Authors | Ahmed Ayyad, Nassir Navab, Mohamed Elhoseiny, Shadi Albarqouni |
Abstract | Recent progress has shown that few-shot learning can be improved with access to unlabelled data, known as semi-supervised few-shot learning (SS-FSL). We introduce an SS-FSL approach, dubbed Prototypical Random Walk Networks (PRWN), built on top of Prototypical Networks (PN). We develop a random walk semi-supervised loss that enables the network to learn representations that are compact and well-separated. Our work is related to the very recent development of graph-based approaches for few-shot learning. However, we show that compact and well-separated class representations can be achieved by modeling our prototypical random walk notion without needing additional graph-NN parameters or requiring a transductive setting where a collective test set is provided. Our model outperforms prior art in most benchmarks, with significant improvements in some cases. For example, in a mini-Imagenet 5-shot classification task, we obtain 69.65% accuracy compared with the 64.59% state of the art. Our model, trained with 40% of the data as labelled, compares competitively against fully supervised prototypical networks trained on 100% of the labels, even outperforming them in the 1-shot mini-Imagenet case with 50.89% versus 49.4% accuracy. We also show that our model is resistant to distractors, unlabelled data that does not belong to any of the training classes, which reflects robustness to labelled/unlabelled class distribution mismatch. We also performed a challenging discriminative power test, showing a relative improvement on top of the baseline of $\approx$14% on 20 classes on mini-Imagenet and $\approx$60% on 800 classes on Omniglot. Code will be made available. |
Tasks | Few-Shot Learning, Omniglot |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02164v2 |
https://arxiv.org/pdf/1903.02164v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-few-shot-learning-with-local |
Repo | |
Framework | |
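For readers unfamiliar with the Prototypical Networks base that PRWN builds on, the following is a minimal sketch of its classification rule: each class prototype is the mean of that class's support embeddings, and a query is assigned to the nearest prototype in Euclidean distance. The embeddings, labels, and function names are hypothetical, and the random walk loss from the paper is not implemented here.

```python
import math


def compute_prototypes(support):
    """Average the support embeddings of each class into one prototype.

    `support` maps a class label to a list of equal-length embedding vectors.
    """
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [
            sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)
        ]
    return protos


def classify(query, protos):
    """Return the label of the prototype closest to `query`."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# Hypothetical 2-D embeddings for a 2-way, 2-shot episode.
support_set = {
    "cat": [[0.0, 0.0], [0.2, 0.0]],
    "dog": [[1.0, 1.0], [1.0, 0.8]],
}
protos = compute_prototypes(support_set)
label_a = classify([0.3, 0.1], protos)
label_b = classify([0.9, 1.0], protos)
```

Compact, well-separated class representations, as the abstract puts it, make this nearest-prototype rule more reliable because queries fall clearly closer to one prototype than to the others.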
Learning to Customize Language Model for Generation-based dialog systems
Title | Learning to Customize Language Model for Generation-based dialog systems |
Authors | Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang |
Abstract | Personalized conversation systems have received increasing attention recently. Existing personalized conversation models tend to employ the meta-learning framework that first finds the initial parameters, then fine-tunes on a few personal utterances. However, fine-tuning can only make slight changes to the initial parameters, resulting in similar language models for different users. In this paper, we propose to customize a conversation model with unique network structures for each user. Concretely, we introduce a private network to the language model, whose structure will evolve during training to better capture the unique characteristics of the user. The private network is only trained on the corpora of the corresponding user, and similar users can share part of the private structure for data reuse. Experimental results show that our algorithm outperforms all the baselines in terms of personality, quality, and diversity measures. |
Tasks | Language Modelling, Meta-Learning |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14326v1 |
https://arxiv.org/pdf/1910.14326v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-customize-language-model-for |
Repo | |
Framework | |
Data-driven PDE discovery with evolutionary approach
Title | Data-driven PDE discovery with evolutionary approach |
Authors | Michail Maslyaev, Alexander Hvatov, Anna Kalyuzhnaya |
Abstract | Data-driven models allow one to define the model structure in cases when a priori information is not sufficient to build other types of models. One possible way to obtain a physical interpretation is through data-driven differential equation discovery techniques. The existing methods of PDE (partial differential equation) discovery are tied to sparse regression. However, sparse regression restricts the resulting model form, since the terms of the PDE are defined before regression. The evolutionary approach described in the article is instead based on symbolic regression and thus has fewer restrictions on the PDE form. The evolutionary method of PDE discovery (EPDE) is described and tested on several canonical PDEs. The question of robustness is examined on a noisy data example. |
Tasks | |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.08011v2 |
http://arxiv.org/pdf/1903.08011v2.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-pde-discovery-with-evolutionary |
Repo | |
Framework | |
A Genetic Algorithm Enabled Similarity-Based Attack on Cancellable Biometrics
Title | A Genetic Algorithm Enabled Similarity-Based Attack on Cancellable Biometrics |
Authors | Xingbo Dong, Zhe Jin, Andrew Teoh Beng Jin |
Abstract | Cancellable biometrics (CB), as a biometric template protection approach, refers to an irreversible yet similarity-preserving transformation of the original template. With the similarity-preserving property, the matching between template and query instance can be performed in the transform domain without jeopardizing accuracy. Unfortunately, this trait invites a class of attack, namely the similarity-based attack (SA). SA produces a preimage, an inverse of the transformed template, which can be exploited for impersonation and cross-matching. In this paper, we propose a Genetic Algorithm enabled similarity-based attack framework (GASAF) to demonstrate that CB schemes that possess the similarity-preserving property are highly vulnerable to similarity-based attack. In addition, a set of new metrics is designed to measure the effectiveness of the similarity-based attack. We conduct experiments on two representative CB schemes, i.e., BioHashing and Bloom-filter. The experimental results attest to the vulnerability under this type of attack. |
Tasks | |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03021v2 |
https://arxiv.org/pdf/1905.03021v2.pdf | |
PWC | https://paperswithcode.com/paper/a-genetic-algorithm-enabled-similarity-based |
Repo | |
Framework | |
Variational Registration of Multiple Images with the SVD based SqN Distance Measure
Title | Variational Registration of Multiple Images with the SVD based SqN Distance Measure |
Authors | Kai Brehmer, Hari Om Aggrawal, Stefan Heldmann, Jan Modersitzki |
Abstract | Image registration, especially the quantification of image similarity, is an important task in image processing. Various approaches for the comparison of two images are discussed in the literature. However, although most of these approaches perform very well in a two-image scenario, an extension to a multiple-image scenario deserves attention. In this article, we discuss and compare registration methods for multiple images. Our key assumption is that information about the singular values of a feature matrix of images can be used for alignment. We introduce, discuss and relate three recent approaches from the literature: the Schatten q-norm based SqN distance measure, a rank-based approach, and a feature-volume-based approach. We also present results for typical applications such as dynamic image sequences or stacks of histological sections. Our results indicate that the SqN approach is in fact a suitable distance measure for image registration. Moreover, our examples also indicate that the results obtained by SqN are superior to those obtained by its competitors. |
Tasks | Image Registration |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09732v1 |
https://arxiv.org/pdf/1907.09732v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-registration-of-multiple-images |
Repo | |
Framework | |
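As a small aid to the abstract above, this sketch computes a Schatten q-norm from a list of singular values, i.e. the ordinary q-norm applied to the singular values of a matrix. The singular values are passed in directly as an assumption (in practice they would come from an SVD of the feature matrix), and the exact way the paper turns this norm into the SqN distance measure is not reproduced here.

```python
def schatten_norm(singular_values, q):
    """Schatten q-norm: (sum_i sigma_i^q)^(1/q) over the singular values."""
    if q < 1:
        raise ValueError("q must be at least 1 for a norm")
    return sum(s ** q for s in singular_values) ** (1.0 / q)


# Hypothetical singular values of a feature matrix.
sigmas = [3.0, 4.0]

nuclear = schatten_norm(sigmas, 1)    # q = 1: nuclear norm (sum of singular values)
frobenius = schatten_norm(sigmas, 2)  # q = 2: Frobenius norm
```

Small values of q emphasize how the energy is spread across singular values, which relates this norm to the rank-based approach that the abstract also mentions.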
Recovering Dropped Pronouns in Chinese Conversations via Modeling Their Referents
Title | Recovering Dropped Pronouns in Chinese Conversations via Modeling Their Referents |
Authors | Jingxuan Yang, Jianzhuo Tong, Si Li, Sheng Gao, Jun Guo, Nianwen Xue |
Abstract | Pronouns are often dropped in Chinese sentences, and this happens more frequently in conversational genres as their referents can be easily understood from context. Recovering dropped pronouns is essential to applications such as Information Extraction where the referents of these dropped pronouns need to be resolved, or Machine Translation when Chinese is the source language. In this work, we present a novel end-to-end neural network model to recover dropped pronouns in conversational data. Our model is based on a structured attention mechanism that models the referents of dropped pronouns utilizing both sentence-level and word-level information. Results on three different conversational genres show that our approach achieves a significant improvement over the current state of the art. |
Tasks | Machine Translation |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1906.02128v1 |
https://arxiv.org/pdf/1906.02128v1.pdf | |
PWC | https://paperswithcode.com/paper/190602128 |
Repo | |
Framework | |