Paper Group ANR 1439
Patch redundancy in images: a statistical testing framework and some applications
Title | Patch redundancy in images: a statistical testing framework and some applications |
Authors | De Bortoli Valentin, Desolneux Agnès, Galerne Bruno, Leclaire Arthur |
Abstract | In this work we introduce a statistical framework to analyze spatial redundancy in natural images. This notion of spatial redundancy must be defined locally, and we therefore give some examples of functions (auto-similarity and template similarity) which, given one or two images, compute a similarity measurement between patches. Two patches are said to be similar if the similarity measurement is small enough. To derive a criterion for deciding whether two patches are similar, we present an a contrario model: two patches are declared similar if the associated similarity measurement is unlikely to occur under a background model. Choosing Gaussian random fields as background models, we derive non-asymptotic expressions for the probability distribution function of similarity measurements. We introduce a fast algorithm to assess redundancy in natural images and present applications in denoising, periodicity analysis and texture ranking. |
Tasks | Denoising |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06428v1 |
http://arxiv.org/pdf/1904.06428v1.pdf | |
PWC | https://paperswithcode.com/paper/patch-redundancy-in-images-a-statistical |
Repo | |
Framework | |
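The a contrario decision rule above can be made concrete with a toy sketch. The following is a minimal illustration, not the paper's algorithm: similarity is the squared L2 distance between patches, the Gaussian background model is white noise, and the tail probability is estimated by Monte Carlo rather than with the paper's non-asymptotic closed-form expressions.

```python
import random

def sq_dist(p, q):
    """Squared L2 distance between two equally sized patches (flat lists)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def a_contrario_similar(p, q, n_tests, eps=1.0, sigma=1.0, n_mc=5000, seed=0):
    """Declare p and q similar if the probability of a distance this small
    under a white-Gaussian background model, multiplied by the number of
    tests performed, stays below eps."""
    rng = random.Random(seed)
    d = sq_dist(p, q)
    k = len(p)
    hits = 0
    for _ in range(n_mc):
        # draw two independent Gaussian patches from the background model
        u = [rng.gauss(0.0, sigma) for _ in range(k)]
        v = [rng.gauss(0.0, sigma) for _ in range(k)]
        if sq_dist(u, v) <= d:
            hits += 1
    p_value = hits / n_mc
    return n_tests * p_value < eps
```

Multiplying the p-value by the number of tests bounds the expected number of false alarms, the standard a contrario control.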
Facial Landmark Correlation Analysis
Title | Facial Landmark Correlation Analysis |
Authors | Yongzhe Yan, Stefan Duffner, Priyanka Phutane, Anthony Berthelier, Christophe Blanc, Christophe Garcia, Thierry Chateau |
Abstract | We present a facial landmark position correlation analysis and several of its applications. Although numerous facial landmark detection methods have been presented in the literature, few of them concern the intrinsic relationship among the landmarks. To reveal and interpret this relationship, we propose to analyze facial landmark correlation using Canonical Correlation Analysis (CCA). We experimentally show that dense facial landmark annotations in current benchmarks are strongly correlated, and we propose several applications based on this analysis. First, we give insights into the predictions of different facial landmark detection models (including cascaded random forests, cascaded Convolutional Neural Networks (CNNs), and heatmap regression models) and interpret how CNNs progressively learn to predict facial landmarks. Second, we propose a few-shot learning method that considerably reduces the manual effort needed for dense landmark annotation. To this end, we select from the dense annotation format a subset of landmarks that is most strongly correlated with the rest, forming a sparse format. Thanks to the strong correlation among the landmarks, the entire set of dense facial landmarks can then be inferred from the sparse annotations by transfer learning. Unlike previous methods, we mainly focus on finding the most efficient sparse format to annotate. Overall, our correlation analysis provides new perspectives for research on facial landmark detection. |
Tasks | Facial Landmark Detection, Few-Shot Learning, Transfer Learning |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10576v1 |
https://arxiv.org/pdf/1911.10576v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-landmark-correlation-analysis |
Repo | |
Framework | |
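The sparse-format selection described above can be illustrated with a toy greedy procedure. This is a hypothetical sketch using plain Pearson correlation, not the paper's CCA-based analysis: it keeps the k landmarks whose coordinates are, on average, most correlated with the not-yet-selected ones.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def select_sparse_landmarks(coords, k):
    """coords[i] is one landmark's positions over all images. Greedily keep
    the k landmarks whose mean absolute correlation with the remaining
    landmarks is highest, so the sparse set best predicts the rest."""
    remaining = list(range(len(coords)))
    chosen = []
    for _ in range(k):
        best = max(remaining, key=lambda i: mean(
            abs(pearson(coords[i], coords[j])) for j in remaining if j != i))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```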
A Unified Framework for Lifelong Learning in Deep Neural Networks
Title | A Unified Framework for Lifelong Learning in Deep Neural Networks |
Authors | Charles X. Ling, Tanner Bohn |
Abstract | Humans can learn a variety of concepts and skills incrementally over the course of their lives while exhibiting an array of desirable properties, such as non-forgetting, concept rehearsal, forward transfer and backward transfer of knowledge, few-shot learning, and selective forgetting. Previous approaches to lifelong machine learning can only demonstrate subsets of these properties, often by combining multiple complex mechanisms. In this Perspective, we propose a powerful unified framework that can demonstrate all of the properties by utilizing a small number of weight consolidation parameters in deep neural networks. In addition, we are able to draw many parallels between the behaviours and mechanisms of our proposed framework and those surrounding human learning, such as memory loss or sleep deprivation. This Perspective serves as a conduit for two-way inspiration to further understand lifelong learning in machines and humans. |
Tasks | Few-Shot Learning |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09704v2 |
https://arxiv.org/pdf/1911.09704v2.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-lifelong-learning-in |
Repo | |
Framework | |
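The weight consolidation mentioned in the abstract can be sketched as a penalized gradient step. This follows the generic EWC-style form; the paper's exact parameterization of its consolidation parameters may differ.

```python
def consolidated_update(w, grad_loss, w_old, omega, lr=0.1, lam=1.0):
    """One gradient step on loss(w) + (lam/2) * sum_i omega_i * (w_i - w_old_i)^2.
    omega_i measures how important weight i was for previously learned tasks,
    so important weights are pulled back toward their old values
    (non-forgetting) while unimportant ones stay free to learn the new task."""
    return [wi - lr * (g + lam * om * (wi - wo))
            for wi, g, wo, om in zip(w, grad_loss, w_old, omega)]
```

Setting omega to zero for a weight recovers plain gradient descent on that weight, which is what allows selective forgetting.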
Optimistic Proximal Policy Optimization
Title | Optimistic Proximal Policy Optimization |
Authors | Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka |
Abstract | Reinforcement learning, a machine learning framework for training an autonomous agent based on rewards, has shown outstanding results in various domains. However, learning a good policy is known to be difficult in domains where rewards are sparse. We propose a method, optimistic proximal policy optimization (OPPO), to alleviate this difficulty. OPPO considers the uncertainty of the estimated total return and evaluates the policy optimistically with respect to that uncertainty. We show that OPPO outperforms existing methods on a tabular task. |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.11075v1 |
https://arxiv.org/pdf/1906.11075v1.pdf | |
PWC | https://paperswithcode.com/paper/optimistic-proximal-policy-optimization |
Repo | |
Framework | |
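The idea of evaluating a policy optimistically under return uncertainty can be illustrated in the tabular, bandit-like case. This is a UCB-style sketch of "optimism in the face of uncertainty", not the exact OPPO objective.

```python
from math import log, sqrt

def optimistic_value(returns_per_action, t, c=1.0):
    """For each action, estimated value = empirical mean return plus an
    exploration bonus that grows when the action has few samples, i.e.
    when its return estimate is uncertain."""
    values = {}
    for a, rs in returns_per_action.items():
        emp_mean = sum(rs) / len(rs)
        bonus = c * sqrt(log(t) / len(rs))
        values[a] = emp_mean + bonus
    return values

def pick_action(returns_per_action, t):
    """Greedy choice with respect to the optimistic values."""
    vals = optimistic_value(returns_per_action, t)
    return max(vals, key=vals.get)
```

A rarely tried action gets a large bonus and is explored even if its empirical mean is lower, which is what helps when rewards are sparse.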
Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation
Title | Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation |
Authors | Sébastien Jean, Ankur Bapna, Orhan Firat |
Abstract | Most neural machine translation systems still translate sentences in isolation. To make further progress, a promising line of research additionally considers the surrounding context in order to provide the model with potentially missing source-side information, as well as to maintain a coherent output. One difficulty in training such larger-context (i.e. document-level) machine translation systems is that context may be missing from many parallel examples. To circumvent this issue, two-stage approaches, in which sentence-level translations are post-edited in context, have recently been proposed. In this paper, we instead consider the viability of filling in the missing context. In particular, we consider three distinct approaches to generate the missing context: using random contexts, applying a copy heuristic or generating it with a language model. We find that the copy heuristic significantly helps with lexical coherence, while using completely random contexts hurts performance on many long-distance linguistic phenomena. We also validate the usefulness of tagged back-translation. In addition to improving BLEU scores as expected, using back-translated data helps larger-context machine translation systems to better capture long-range phenomena. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14075v1 |
https://arxiv.org/pdf/1910.14075v1.pdf | |
PWC | https://paperswithcode.com/paper/fill-in-the-blanks-imputing-missing-sentences |
Repo | |
Framework | |
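Of the three context-imputation options, the random and copy heuristics are trivial to state in code; the following hypothetical helper illustrates them (the language-model option is omitted).

```python
import random

def impute_context(sentence, corpus, mode, rng=random.Random(0)):
    """Fill in a missing preceding context for a sentence-level example.
    'copy' repeats the current sentence (found to help lexical coherence);
    'random' samples an unrelated sentence from the corpus (found to hurt
    long-distance phenomena)."""
    if mode == "copy":
        return sentence
    if mode == "random":
        return rng.choice(corpus)
    raise ValueError("mode must be 'copy' or 'random'")
```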
BMVC 2019: Workshop on Interpretable and Explainable Machine Vision
Title | BMVC 2019: Workshop on Interpretable and Explainable Machine Vision |
Authors | Alun Preece |
Abstract | Proceedings of the BMVC 2019 Workshop on Interpretable and Explainable Machine Vision, Cardiff, UK, September 12, 2019. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07245v1 |
https://arxiv.org/pdf/1909.07245v1.pdf | |
PWC | https://paperswithcode.com/paper/bmvc-2019-workshop-on-interpretable-and |
Repo | |
Framework | |
Transformation of XML Documents with Prolog
Title | Transformation of XML Documents with Prolog |
Authors | René Haberland, Igor L. Bratchikov |
Abstract | Transforming XML documents with conventional XML languages such as XSL-T is disadvantageous because the abstraction over the target language is too lax and rule-oriented transformations are difficult to recognize. Prolog, as a declarative programming language, is especially well suited to implementing analyses of formal languages. Prolog also appears well suited to term manipulation, complex schema transformation and text retrieval. In this report an appropriate model for XML documents is proposed, the basic transformation language LTL for Prolog is defined, and its expressive power is demonstrated in comparison with XSL-T; the implementations used throughout are multi-paradigmatic. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08361v1 |
https://arxiv.org/pdf/1906.08361v1.pdf | |
PWC | https://paperswithcode.com/paper/transformation-of-xml-documents-with-prolog |
Repo | |
Framework | |
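The report's transformations are written in Prolog; as a language-neutral illustration of what a rule-oriented XML transformation looks like, here is a small Python sketch in which each rule maps a source tag to a new tag and rules are applied recursively.

```python
import xml.etree.ElementTree as ET

def transform(elem, rules):
    """Apply a rule-oriented transformation: rules maps a source tag to a
    function producing the target tag; elements whose tag has no rule are
    kept as-is. Rules are applied recursively over the document tree."""
    new = ET.Element(rules.get(elem.tag, lambda t: t)(elem.tag))
    new.text = elem.text
    for child in elem:
        new.append(transform(child, rules))
    return new
```

For example, the rules `{"list": lambda t: "ul", "item": lambda t: "li"}` rewrite `<list><item>a</item></list>` into `<ul><li>a</li></ul>`.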
Learn Electronic Health Records by Fully Decentralized Federated Learning
Title | Learn Electronic Health Records by Fully Decentralized Federated Learning |
Authors | Songtao Lu, Yawen Zhang, Yunlong Wang, Christina Mack |
Abstract | Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network. In this paper, we focus on improving the communication efficiency of fully decentralized federated learning over a graph, where the algorithm performs local updates for several iterations and then enables communication among the nodes. In this way, the number of communication rounds spent exchanging the shared model parameters can be reduced significantly without loss of optimality of the solutions. Multiple numerical simulations based on large, real-world electronic health record databases showcase the superiority of decentralized federated learning over classic methods. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01792v2 |
https://arxiv.org/pdf/1912.01792v2.pdf | |
PWC | https://paperswithcode.com/paper/learn-electronic-health-records-by-fully |
Repo | |
Framework | |
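The local-updates-then-communicate pattern in the abstract can be sketched for scalar parameters over an arbitrary graph. This is a minimal illustration assuming each node holds its own gradient oracle; it is not the paper's algorithm or data.

```python
def decentralized_round(params, neighbors, grads, lr=0.1, local_steps=5):
    """One communication round: every node runs several local gradient
    steps on its own data, then averages its parameter with its graph
    neighbors. Running local_steps > 1 between communications is what
    saves communication rounds."""
    # local updates: grads[i](w) returns node i's gradient at w
    for i in range(len(params)):
        for _ in range(local_steps):
            params[i] = params[i] - lr * grads[i](params[i])
    # gossip step: each node averages itself with its neighbors
    mixed = []
    for i, nbrs in enumerate(neighbors):
        group = [i] + list(nbrs)
        mixed.append(sum(params[j] for j in group) / len(group))
    return mixed
```

With quadratic local losses the nodes reach consensus on the average minimizer after a few rounds.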
Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation
Title | Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation |
Authors | Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster |
Abstract | We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows. This approach naturally exhibits very low latency and high final quality, but at the cost of incremental instability as the output is continuously refined. We experiment with a pipeline of industry-grade speech recognition and translation tools, augmented with simple inference heuristics to improve stability. We use TED Talks as a source of multilingual test data, developing our techniques on English-to-German spoken language translation. Our minimalist approach to simultaneous translation allows us to easily scale our final evaluation to six more target languages, dramatically improving incremental stability for all of them. |
Tasks | Machine Translation, Speech Recognition |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03393v1 |
https://arxiv.org/pdf/1912.03393v1.pdf | |
PWC | https://paperswithcode.com/paper/re-translation-strategies-for-long-form |
Repo | |
Framework | |
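One family of stability heuristics for re-translation is easy to sketch: only commit the prefix that agrees between successive re-translations, and hold back the volatile tail. This is a hypothetical illustration in the spirit of the simple inference heuristics the abstract mentions, not the authors' exact method.

```python
def stable_prefix(prev_tokens, new_tokens):
    """Longest common prefix of the previous and current re-translation."""
    out = []
    for a, b in zip(prev_tokens, new_tokens):
        if a != b:
            break
        out.append(a)
    return out

def display_update(prev_tokens, new_tokens, mask=1):
    """Show the new translation but hold back the last `mask` tokens beyond
    the stable prefix: the tail is the part most likely to be revised by
    the next re-translation, so withholding it reduces caption flicker."""
    k = len(stable_prefix(prev_tokens, new_tokens))
    cut = max(k, len(new_tokens) - mask)
    return new_tokens[:cut]
```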
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
Title | Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee |
Authors | Wei Hu, Zhiyuan Li, Dingli Yu |
Abstract | Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other hand, simple regularization methods like early-stopping can often achieve highly nontrivial performance on clean test data in these scenarios, a phenomenon not theoretically understood. This paper proposes and analyzes two simple and intuitive regularization methods: (i) regularization by the distance from the network parameters to their initialization, and (ii) adding a trainable auxiliary variable to the network output for each training example. Theoretically, we prove that gradient descent training with either of these two methods leads to a generalization guarantee on the clean data distribution despite being trained using noisy labels. Our generalization analysis relies on the connection between wide neural networks and the neural tangent kernel (NTK). The generalization bound is independent of the network size, and is comparable to the bound one can get when there is no label noise. Experimental results verify the effectiveness of these methods on noisily labeled datasets. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11368v3 |
https://arxiv.org/pdf/1905.11368v3.pdf | |
PWC | https://paperswithcode.com/paper/understanding-generalization-of-deep-neural |
Repo | |
Framework | |
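Both regularizers above can be sketched for squared loss. This is a minimal illustration of the two mechanisms, not the paper's full training procedure: method (i) pulls the weights back toward initialization, and in method (ii) a per-example auxiliary variable is trained alongside the network and can absorb label noise.

```python
def step_distance_to_init(w, w_init, grad_loss, lr=0.1, mu=0.5):
    """Method (i): one gradient step on loss(w) + (mu/2) * ||w - w_init||^2,
    which keeps the network close to its initialization."""
    return [wi - lr * (g + mu * (wi - w0))
            for wi, g, w0 in zip(w, grad_loss, w_init)]

def step_auxiliary(pred, aux, label, lr_aux=0.5):
    """Method (ii): each example i gets a trainable auxiliary variable
    aux[i] added to the network output; minimizing the squared loss
    (pred[i] + aux[i] - label[i])^2 / 2 over aux lets the auxiliary
    variables soak up label noise instead of the network weights."""
    return [a - lr_aux * ((p + a) - y) for p, a, y in zip(pred, aux, label)]
```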
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning
Title | Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning |
Authors | Shi Pu, Alex Olshevsky, Ioannis Ch. Paschalidis |
Abstract | We provide a discussion of several recent results which, in certain scenarios, are able to overcome a barrier in distributed stochastic optimization for machine learning. Our focus is the so-called asymptotic network independence property, which is achieved whenever a distributed method executed over a network of n nodes asymptotically converges to the optimal solution at a comparable rate to a centralized method with the same computational power as the entire network. We explain this property through an example involving the training of ML models and sketch a short mathematical analysis for comparing the performance of distributed stochastic gradient descent (DSGD) with centralized stochastic gradient descent (SGD). |
Tasks | Distributed Optimization, Stochastic Optimization |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12345v5 |
https://arxiv.org/pdf/1906.12345v5.pdf | |
PWC | https://paperswithcode.com/paper/asymptotic-network-independence-in |
Repo | |
Framework | |
Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps
Title | Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps |
Authors | Igor Gilitschenski, Guy Rosman, Arjun Gupta, Sertac Karaman, Daniela Rus |
Abstract | In this paper, we propose a novel approach for agent motion prediction in cluttered environments. One of the main challenges in predicting agent motion is accounting for location- and context-specific information. Our main contribution is the concept of learning context maps to improve the prediction task. Context maps are a set of location-specific latent maps that are trained alongside the predictor. Thus, the proposed maps are capable of capturing location context beyond visual context cues (e.g. usual average speeds and typical trajectories) or predefined map primitives (lanes and stop lines). We pose context map learning as a multi-task training problem and describe our map model and its incorporation into a state-of-the-art trajectory predictor. Extensive experiments show that the use of context maps significantly improves predictor accuracy, which can be further boosted by providing even partial knowledge of map semantics. |
Tasks | motion prediction, Trajectory Prediction |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06785v1 |
https://arxiv.org/pdf/1912.06785v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-context-map-agent-trajectory-prediction |
Repo | |
Framework | |
Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems
Title | Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems |
Authors | Changhua Pei, Xinru Yang, Qing Cui, Xiao Lin, Fei Sun, Peng Jiang, Wenwu Ou, Yongfeng Zhang |
Abstract | Existing recommendation algorithms mostly focus on optimizing traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE or the quality of top-$k$ recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommendation systems is to improve the final revenue/profit of the system. Traditional recommendation targets such as rating prediction and top-$k$ recommendation are not directly related to this goal. In this work, we blend the fundamental concepts in online advertising and micro-economics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items to generate the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising into the conversion rate of an arbitrary user action (XVR) in E-commerce, where the user actions can be clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. Economic values of different user actions are further integrated as the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for the maximum total value. Experimental results in both offline benchmarks and online commercial systems verify the improved performance of our framework, in terms of both traditional top-$k$ ranking tasks and the economic profits of the system. |
Tasks | Recommendation Systems |
Published | 2019-02-03 |
URL | http://arxiv.org/abs/1902.00851v1 |
http://arxiv.org/pdf/1902.00851v1.pdf | |
PWC | https://paperswithcode.com/paper/value-aware-recommendation-based-on |
Repo | |
Framework | |
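At ranking time, the XVR monetization described above reduces to an expected-value computation per item. A minimal sketch with made-up action names and values (in the paper the rates are learned and reinforcement learning optimizes whole lists):

```python
def item_value(action_rates, action_values):
    """Expected economic value of recommending an item: sum over user
    actions (click, purchase, ...) of the action's predicted rate (XVR)
    times its monetized value."""
    return sum(action_rates[a] * action_values[a] for a in action_rates)

def rank_by_value(items, action_values):
    """items maps item id -> predicted action rates; returns ids sorted by
    expected value, the reward a value-aware recommender maximizes."""
    return sorted(items, key=lambda i: item_value(items[i], action_values),
                  reverse=True)
```

Note how a low-click, high-purchase item can outrank a high-click item once actions are monetized.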
Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories
Title | Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories |
Authors | Chaoyun Zhang, Marco Fiore, Paul Patras |
Abstract | Network slicing is increasingly used to partition network infrastructure between different mobile services. Precise service-wise mobile traffic forecasting becomes essential in this context, as mobile operators seek to pre-allocate resources to each slice in advance, to meet the distinct requirements of individual services. This paper attacks the problem of multi-service mobile traffic forecasting using a sequence-to-sequence (S2S) learning paradigm and convolutional long short-term memories (ConvLSTMs). The proposed architecture is designed to effectively extract complex spatiotemporal features of mobile network traffic and predict with high accuracy the future demands for individual services at city scale. We conduct experiments on a mobile traffic dataset collected in a large European metropolis, demonstrating that the proposed S2S-ConvLSTM can forecast the mobile traffic volume produced by tens of different services up to one hour in advance, using only measurements taken during the past hour. In particular, our solution achieves mean absolute errors (MAE) at the antenna level below 13 KBps, outperforming other deep learning approaches by up to 31.2%. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09771v1 |
https://arxiv.org/pdf/1905.09771v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-service-mobile-traffic-forecasting-via |
Repo | |
Framework | |
Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation
Title | Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation |
Authors | Or Yair, Felix Dietrich, Ronen Talmon, Ioannis G. Kevrekidis |
Abstract | The problem of domain adaptation has become central in many applications from a broad range of fields. Recently, it was proposed to use Optimal Transport (OT) to solve it. In this paper, we model the difference between the two domains by a diffeomorphism and use the polar factorization theorem to claim that OT is indeed optimal for domain adaptation in a well-defined sense, up to a volume-preserving map. We then focus on the manifold of Symmetric and Positive-Definite (SPD) matrices, whose structure has provided a useful context in recent applications. We demonstrate the polar factorization theorem on this manifold. Due to the uniqueness of the weighted Riemannian mean, and by exploiting existing regularized OT algorithms, we formulate a simple algorithm that maps the source domain to the target domain. We test our algorithm on two Brain-Computer Interface (BCI) data sets and observe state-of-the-art performance. |
Tasks | Domain Adaptation |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00616v3 |
https://arxiv.org/pdf/1906.00616v3.pdf | |
PWC | https://paperswithcode.com/paper/190600616 |
Repo | |
Framework | |
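The "existing regularized OT algorithms" the abstract exploits are typically Sinkhorn iterations. Here is a minimal, self-contained Sinkhorn sketch between two uniform discrete measures; in the paper the cost would come from the SPD-manifold geometry, while here it is an arbitrary nonnegative matrix.

```python
import math

def sinkhorn(cost, reg=0.1, iters=200):
    """Entropy-regularized OT between two uniform discrete measures with
    the given cost matrix; returns the transport plan. Alternately rescales
    rows and columns of K = exp(-cost/reg) to match the marginals."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

As the regularization shrinks, the plan approaches the unregularized OT coupling; for a diagonal-dominant cost it concentrates mass on the diagonal.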