January 26, 2020

2782 words 14 mins read

Paper Group ANR 1439



Patch redundancy in images: a statistical testing framework and some applications

Title Patch redundancy in images: a statistical testing framework and some applications
Authors De Bortoli Valentin, Desolneux Agnès, Galerne Bruno, Leclaire Arthur
Abstract In this work we introduce a statistical framework to analyze the spatial redundancy in natural images. This notion of spatial redundancy must be defined locally, and thus we give some examples of functions (auto-similarity and template similarity) which, given one or two images, compute a similarity measurement between patches. Two patches are said to be similar if the similarity measurement is small enough. To derive a criterion for deciding whether two patches are similar, we present an a contrario model: two patches are declared similar if the associated similarity measurement is unlikely under a background model. Choosing Gaussian random fields as background models, we derive non-asymptotic expressions for the probability distribution function of similarity measurements. We introduce a fast algorithm to assess redundancy in natural images and present applications in denoising, periodicity analysis and texture ranking.
Tasks Denoising
Published 2019-04-12
URL http://arxiv.org/abs/1904.06428v1
PDF http://arxiv.org/pdf/1904.06428v1.pdf
PWC https://paperswithcode.com/paper/patch-redundancy-in-images-a-statistical
Repo
Framework
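As a rough illustration of the a contrario idea (not the paper's actual algorithm or its non-asymptotic derivations), the sketch below scores patch similarity against a Gaussian white-noise background: under that background, the squared distance between two disjoint s×s patches is 2σ² times a chi-square variable with s² degrees of freedom, so a tiny left-tail probability flags the patches as significantly similar. Function names and the test image are illustrative.

```python
import math
import numpy as np

def patch_sq_dist(img, p, q, s):
    """Auto-similarity: squared L2 distance between the s x s patches
    whose top-left corners are p and q."""
    P = img[p[0]:p[0]+s, p[1]:p[1]+s]
    Q = img[q[0]:q[0]+s, q[1]:q[1]+s]
    return float(((P - Q) ** 2).sum())

def chi2_cdf_even(x, k):
    """Chi-square CDF, closed form valid for even degrees of freedom k."""
    term, total = 1.0, 1.0
    for i in range(1, k // 2):
        term *= (x / 2) / i
        total += term
    return 1.0 - math.exp(-x / 2) * total

def similarity_pvalue(d, s, sigma=1.0):
    """P(background distance <= d): under i.i.d. N(0, sigma^2) pixels the
    squared distance of two disjoint patches is 2*sigma^2 * chi2(s^2).
    A tiny value means the background rarely produces patches this close."""
    return chi2_cdf_even(d / (2 * sigma**2), s * s)

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))
img[16:24, 16:24] = img[0:8, 0:8]              # plant an exact repeat
p_planted = similarity_pvalue(patch_sq_dist(img, (0, 0), (16, 16), 8), 8)
p_random = similarity_pvalue(patch_sq_dist(img, (0, 0), (8, 8), 8), 8)
print(p_planted, p_random)                     # planted copy is far more significant
```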

Facial Landmark Correlation Analysis

Title Facial Landmark Correlation Analysis
Authors Yongzhe Yan, Stefan Duffner, Priyanka Phutane, Anthony Berthelier, Christophe Blanc, Christophe Garcia, Thierry Chateau
Abstract We present a facial landmark position correlation analysis as well as its applications. Although numerous facial landmark detection methods have been presented in the literature, few of them concern the intrinsic relationship among the landmarks. In order to reveal and interpret this relationship, we propose to analyze the facial landmark correlation by using Canonical Correlation Analysis (CCA). We experimentally show that dense facial landmark annotations in current benchmarks are strongly correlated, and we propose several applications based on this analysis. First, we give insights into the predictions from different facial landmark detection models (including cascaded random forests, cascaded Convolutional Neural Networks (CNNs), and heatmap regression models) and interpret how CNNs progressively learn to predict facial landmarks. Second, we propose a few-shot learning method that considerably reduces the manual effort for dense landmark annotation. To this end, we select a portion of landmarks from the dense annotation format to form a sparse format, which is most strongly correlated with the remaining landmarks. Thanks to the strong correlation among the landmarks, the entire set of dense facial landmarks can then be inferred from the annotation in the sparse format by transfer learning. Unlike the previous methods, we mainly focus on how to find the most efficient sparse format to annotate. Overall, our correlation analysis provides new perspectives for the research on facial landmark detection.
Tasks Facial Landmark Detection, Few-Shot Learning, Transfer Learning
Published 2019-11-24
URL https://arxiv.org/abs/1911.10576v1
PDF https://arxiv.org/pdf/1911.10576v1.pdf
PWC https://paperswithcode.com/paper/facial-landmark-correlation-analysis
Repo
Framework
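To make the CCA step concrete, here is a minimal NumPy-only sketch (not the paper's experimental setup): canonical correlations are the singular values of the whitened cross-covariance between two landmark groups. The synthetic "shared face shape" factors driving both groups are an assumption for illustration.

```python
import numpy as np

def canonical_correlations(X, Y, k):
    """Canonical correlations via SVD of the whitened cross-covariance;
    the singular values are the correlations of the canonical pairs."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = len(X)
    Cxx = Xc.T @ Xc / n + 1e-8 * np.eye(X.shape[1])   # regularized covariances
    Cyy = Yc.T @ Yc / n + 1e-8 * np.eye(Y.shape[1])
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))        # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))
    return np.linalg.svd(Wx @ (Xc.T @ Yc / n) @ Wy.T, compute_uv=False)[:k]

rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 4))               # shared "face shape" factors
X = latent @ rng.normal(size=(4, 10)) + 0.1 * rng.normal(size=(n, 10))  # sparse group
Y = latent @ rng.normal(size=(4, 20)) + 0.1 * rng.normal(size=(n, 20))  # dense group
corrs = canonical_correlations(X, Y, k=4)
print(corrs)                                    # leading correlations near 1
```

Strong leading correlations are exactly what would justify annotating only the sparse group and inferring the rest.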

A Unified Framework for Lifelong Learning in Deep Neural Networks

Title A Unified Framework for Lifelong Learning in Deep Neural Networks
Authors Charles X. Ling, Tanner Bohn
Abstract Humans can learn a variety of concepts and skills incrementally over the course of their lives while exhibiting an array of desirable properties, such as non-forgetting, concept rehearsal, forward transfer and backward transfer of knowledge, few-shot learning, and selective forgetting. Previous approaches to lifelong machine learning can only demonstrate subsets of these properties, often by combining multiple complex mechanisms. In this Perspective, we propose a powerful unified framework that can demonstrate all of the properties by utilizing a small number of weight consolidation parameters in deep neural networks. In addition, we are able to draw many parallels between the behaviours and mechanisms of our proposed framework and those surrounding human learning, such as memory loss or sleep deprivation. This Perspective serves as a conduit for two-way inspiration to further understand lifelong learning in machines and humans.
Tasks Few-Shot Learning
Published 2019-11-21
URL https://arxiv.org/abs/1911.09704v2
PDF https://arxiv.org/pdf/1911.09704v2.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-lifelong-learning-in
Repo
Framework
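The weight-consolidation idea can be sketched in a few lines (a generic quadratic consolidation penalty, not the paper's specific parameterization; all numbers are illustrative): important weights from an earlier task are anchored, while unimportant ones remain free to fit the new task.

```python
import numpy as np

def consolidated_grad(w, grad_task, w_anchor, strength):
    """Gradient of the task loss plus a per-parameter consolidation penalty
    strength/2 * (w - w_anchor)^2, pulling weights toward values important
    for earlier tasks (the non-forgetting property)."""
    return grad_task + strength * (w - w_anchor)

w_anchor = np.array([1.0, -2.0])      # weights after task A
w = w_anchor.copy()
strength = np.array([10.0, 0.0])      # first weight is "important", second is free
lr = 0.1
for _ in range(50):
    grad_task_b = 2 * (w - np.array([3.0, 3.0]))   # task B wants w = (3, 3)
    w -= lr * consolidated_grad(w, grad_task_b, w_anchor, strength)
print(w)   # first weight compromises near its anchor, second moves freely
```

Setting a strength to zero recovers selective forgetting; raising it trades plasticity for retention.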

Optimistic Proximal Policy Optimization

Title Optimistic Proximal Policy Optimization
Authors Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka
Abstract Reinforcement Learning, a machine learning framework for training an autonomous agent based on rewards, has shown outstanding results in various domains. However, it is known that learning a good policy is difficult in a domain where rewards are rare. We propose a method, optimistic proximal policy optimization (OPPO), to alleviate this difficulty. OPPO considers the uncertainty of the estimated total return and optimistically evaluates the policy based on that uncertainty. We show that OPPO outperforms the existing methods in a tabular task.
Tasks
Published 2019-06-25
URL https://arxiv.org/abs/1906.11075v1
PDF https://arxiv.org/pdf/1906.11075v1.pdf
PWC https://paperswithcode.com/paper/optimistic-proximal-policy-optimization
Repo
Framework
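The core "optimism in the face of uncertainty" mechanism can be illustrated with a count-based stand-in for OPPO's return-uncertainty bonus (the paper derives its bonus from the estimated return distribution; the function name and numbers here are illustrative):

```python
import numpy as np

def optimistic_value(mean_return, visit_count, c=1.0):
    """Optimistic estimate: mean return plus an uncertainty bonus that
    shrinks as a state-action pair is visited more often. Rarely tried
    actions look better, encouraging exploration in sparse-reward tasks."""
    return mean_return + c / np.sqrt(np.maximum(visit_count, 1))

means = np.array([0.50, 0.45])
counts = np.array([100, 1])
vals = optimistic_value(means, counts)
print(vals)   # the rarely tried second action gets the larger optimistic value
```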

Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation

Title Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation
Authors Sébastien Jean, Ankur Bapna, Orhan Firat
Abstract Most neural machine translation systems still translate sentences in isolation. To make further progress, a promising line of research additionally considers the surrounding context in order to provide the model with potentially missing source-side information, as well as to maintain a coherent output. One difficulty in training such larger-context (i.e. document-level) machine translation systems is that context may be missing from many parallel examples. To circumvent this issue, two-stage approaches, in which sentence-level translations are post-edited in context, have recently been proposed. In this paper, we instead consider the viability of filling in the missing context. In particular, we consider three distinct approaches to generate the missing context: using random contexts, applying a copy heuristic or generating it with a language model. We find that the copy heuristic significantly helps with lexical coherence, while using completely random contexts hurts performance on many long-distance linguistic phenomena. We also validate the usefulness of tagged back-translation. In addition to improving BLEU scores as expected, using back-translated data helps larger-context machine translation systems to better capture long-range phenomena.
Tasks Language Modelling, Machine Translation
Published 2019-10-30
URL https://arxiv.org/abs/1910.14075v1
PDF https://arxiv.org/pdf/1910.14075v1.pdf
PWC https://paperswithcode.com/paper/fill-in-the-blanks-imputing-missing-sentences
Repo
Framework
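The three imputation strategies from the abstract can be sketched as a single dispatch function (field names, the sentence pool, and the `lm_generate` hook are illustrative, not the paper's data format):

```python
import random

POOL = ["An unrelated sentence .", "Another filler sentence ."]

def impute_context(example, lm_generate=None, mode="copy"):
    """Fill in a missing source-side context sentence using one of the
    three strategies described in the abstract."""
    if example.get("context"):
        return example["context"]           # context present: nothing to impute
    if mode == "copy":                       # copy heuristic: reuse the source itself
        return example["source"]
    if mode == "random":                     # sample an unrelated sentence
        return random.choice(POOL)
    if mode == "lm" and lm_generate:         # generate context with a language model
        return lm_generate(example["source"])
    raise ValueError(mode)

print(impute_context({"source": "The cat sat .", "context": None}))
```

The copy heuristic's benefit to lexical coherence is intuitive here: the imputed context repeats exactly the vocabulary the translation must stay consistent with.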

BMVC 2019: Workshop on Interpretable and Explainable Machine Vision

Title BMVC 2019: Workshop on Interpretable and Explainable Machine Vision
Authors Alun Preece
Abstract Proceedings of the BMVC 2019 Workshop on Interpretable and Explainable Machine Vision, Cardiff, UK, September 12, 2019.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07245v1
PDF https://arxiv.org/pdf/1909.07245v1.pdf
PWC https://paperswithcode.com/paper/bmvc-2019-workshop-on-interpretable-and
Repo
Framework

Transformation of XML Documents with Prolog

Title Transformation of XML Documents with Prolog
Authors René Haberland, Igor L. Bratchikov
Abstract Transforming XML documents with conventional XML languages, like XSL-T, is disadvantageous because the abstraction over the target language is too lax and rule-oriented transformations are rather difficult to recognize. Prolog, as a programming language of the declarative paradigm, is especially good for implementing analyses of formal languages. Prolog also seems well suited to term manipulation, complex schema transformation and text retrieval. In this report an appropriate model for XML documents is proposed, the basic transformation language LTL for Prolog is defined, and its expressive power is demonstrated in comparison with XSL-T; the implementations used throughout are multi-paradigmatic.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.08361v1
PDF https://arxiv.org/pdf/1906.08361v1.pdf
PWC https://paperswithcode.com/paper/transformation-of-xml-documents-with-prolog
Repo
Framework
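The rule-oriented, term-rewriting flavor of such transformations can be illustrated in Python's standard library (Python stands in here for Prolog/LTL; the rule table and tag names are invented for the example): each rule maps a source tag to a target tag, and the transformation recurses over the document tree much like rewriting a term.

```python
import xml.etree.ElementTree as ET

# Illustrative rule table: source tag -> target tag
RULES = {"title": "h1", "para": "p"}

def transform(elem):
    """Recursively rewrite an element tree, renaming tags per RULES while
    preserving attributes, text, and child structure."""
    out = ET.Element(RULES.get(elem.tag, elem.tag), elem.attrib)
    out.text = elem.text
    for child in elem:
        out.append(transform(child))
    return out

doc = ET.fromstring("<doc><title>Report</title><para>Body text</para></doc>")
html = ET.tostring(transform(doc), encoding="unicode")
print(html)
```

In Prolog the same idea would be expressed as unification against rule heads rather than a dictionary lookup, which is precisely the abstraction the report argues XSL-T obscures.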

Learn Electronic Health Records by Fully Decentralized Federated Learning

Title Learn Electronic Health Records by Fully Decentralized Federated Learning
Authors Songtao Lu, Yawen Zhang, Yunlong Wang, Christina Mack
Abstract Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network. In this paper, we focus on improving the communication efficiency for fully decentralized federated learning over a graph, where the algorithm performs local updates for several iterations and then enables communications among the nodes. In this way, the number of communication rounds spent exchanging the commonly shared parameters can be reduced significantly without loss of optimality of the solutions. Multiple numerical simulations based on large, real-world electronic health record databases showcase the superiority of decentralized federated learning compared with classic methods.
Tasks
Published 2019-12-04
URL https://arxiv.org/abs/1912.01792v2
PDF https://arxiv.org/pdf/1912.01792v2.pdf
PWC https://paperswithcode.com/paper/learn-electronic-health-records-by-fully
Repo
Framework
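The "local updates, then communicate" pattern can be sketched on a toy problem (a ring of nodes each holding a quadratic objective; the mixing matrix and targets are invented for illustration, and health-record models are replaced by scalars):

```python
import numpy as np

def decentralized_round(params, targets, W, local_steps=5, lr=0.1):
    """One round of decentralized federated learning: each node takes
    several local gradient steps on its own objective (x - target)^2,
    then averages with its graph neighbors via the mixing matrix W.
    Multiple local steps per round is what saves communication."""
    for i, target in enumerate(targets):
        for _ in range(local_steps):
            params[i] -= lr * 2 * (params[i] - target)
    return W @ params                       # gossip averaging over the graph

# ring of 4 nodes; doubly stochastic mixing matrix
W = np.array([[.5, .25, 0, .25], [.25, .5, .25, 0],
              [0, .25, .5, .25], [.25, 0, .25, .5]])
x = np.zeros(4)
targets = np.array([1.0, 2.0, 3.0, 4.0])    # network-wide optimum = mean = 2.5
for _ in range(200):
    x = decentralized_round(x, targets, W)
print(x)   # nodes cluster around 2.5; local steps leave a small per-node drift
```

The network average tracks the global optimum exactly, while individual nodes keep a small drift toward their local data, the usual price of doing many local steps between communications.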

Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

Title Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation
Authors Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster
Abstract We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows. This approach naturally exhibits very low latency and high final quality, but at the cost of incremental instability as the output is continuously refined. We experiment with a pipeline of industry-grade speech recognition and translation tools, augmented with simple inference heuristics to improve stability. We use TED Talks as a source of multilingual test data, developing our techniques on English-to-German spoken language translation. Our minimalist approach to simultaneous translation allows us to easily scale our final evaluation to six more target languages, dramatically improving incremental stability for all of them.
Tasks Machine Translation, Speech Recognition
Published 2019-12-06
URL https://arxiv.org/abs/1912.03393v1
PDF https://arxiv.org/pdf/1912.03393v1.pdf
PWC https://paperswithcode.com/paper/re-translation-strategies-for-long-form
Repo
Framework
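One of the simplest stability heuristics for re-translation is to hold back the tail of each incremental hypothesis, since the tail is most likely to be revised as more source arrives. A sketch (the masking heuristic is generic; the paper's inference heuristics and the toy token streams here are not its actual outputs):

```python
def displayed(hypothesis, mask_k=2):
    """Emit the new re-translation minus its last mask_k tokens,
    trading a little latency for much better incremental stability."""
    return hypothesis[:len(hypothesis) - mask_k]

def erasure(outputs):
    """Count tokens retracted between consecutive displayed outputs,
    a simple instability measure."""
    e = 0
    for prev, cur in zip(outputs, outputs[1:]):
        common = 0
        for a, b in zip(prev, cur):
            if a != b:
                break
            common += 1
        e += len(prev) - common
    return e

# the re-translator keeps revising its tail as the source grows
hyps = [["I", "saw"], ["I", "see", "a"], ["I", "see", "a", "cat"]]
print(erasure(hyps), erasure([displayed(h, 1) for h in hyps]))
```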

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

Title Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
Authors Wei Hu, Zhiyuan Li, Dingli Yu
Abstract Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other hand, simple regularization methods like early-stopping can often achieve highly nontrivial performance on clean test data in these scenarios, a phenomenon not theoretically understood. This paper proposes and analyzes two simple and intuitive regularization methods: (i) regularization by the distance between the network parameters and their initialization, and (ii) adding a trainable auxiliary variable to the network output for each training example. Theoretically, we prove that gradient descent training with either of these two methods leads to a generalization guarantee on the clean data distribution despite being trained using noisy labels. Our generalization analysis relies on the connection between wide neural networks and the neural tangent kernel (NTK). The generalization bound is independent of the network size, and is comparable to the bound one can get when there is no label noise. Experimental results verify the effectiveness of these methods on noisily labeled datasets.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.11368v3
PDF https://arxiv.org/pdf/1905.11368v3.pdf
PWC https://paperswithcode.com/paper/understanding-generalization-of-deep-neural
Repo
Framework
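Method (i), distance-to-initialization regularization, can be sketched on an over-parameterized linear model (a stand-in for the wide networks in the paper's NTK analysis; the data, noise rate, and hyperparameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                        # over-parameterized: d > n, can fit any labels
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d) / np.sqrt(d)
y_clean = X @ w_star
flip = rng.random(n) < 0.3           # 30% mislabeled examples
y = np.where(flip, -y_clean, y_clean)
w0 = rng.normal(size=d) * 0.01       # initialization

def train(lam, steps=2000, lr=0.01):
    """Gradient descent on ||Xw - y||^2 / n + lam * ||w - w0||^2,
    i.e. method (i): penalize the distance from initialization."""
    w = w0.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / n + 2 * lam * (w - w0)
        w -= lr * grad
    return w

for lam in (0.0, 0.5):
    print(lam, np.linalg.norm(train(lam) - w0))
```

The penalty keeps the parameters in a neighborhood of the initialization, which is what prevents the over-parameterized model from fully interpolating the noisy labels.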

Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning

Title Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning
Authors Shi Pu, Alex Olshevsky, Ioannis Ch. Paschalidis
Abstract We provide a discussion of several recent results which, in certain scenarios, are able to overcome a barrier in distributed stochastic optimization for machine learning. Our focus is the so-called asymptotic network independence property, which is achieved whenever a distributed method executed over a network of n nodes asymptotically converges to the optimal solution at a comparable rate to a centralized method with the same computational power as the entire network. We explain this property through an example involving the training of ML models and sketch a short mathematical analysis for comparing the performance of distributed stochastic gradient descent (DSGD) with centralized stochastic gradient descent (SGD).
Tasks Distributed Optimization, Stochastic Optimization
Published 2019-06-28
URL https://arxiv.org/abs/1906.12345v5
PDF https://arxiv.org/pdf/1906.12345v5.pdf
PWC https://paperswithcode.com/paper/asymptotic-network-independence-in
Repo
Framework
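The DSGD update discussed in the abstract (mix with neighbors, then step along the local gradient) can be sketched on a toy consensus problem; the mixing matrix, objectives, and step size below are invented for illustration:

```python
import numpy as np

def dsgd_step(x, grads, W, lr):
    """One DSGD iteration: average parameters with neighbors via the
    doubly stochastic mixing matrix W, then take a local gradient step."""
    return W @ x - lr * grads

# three nodes minimizing f_i(x) = (x - b_i)^2; global optimum = mean(b) = 3
W = np.array([[.5, .25, .25], [.25, .5, .25], [.25, .25, .5]])
b = np.array([0.0, 3.0, 6.0])
x = np.zeros(3)
for _ in range(500):
    x = dsgd_step(x, 2 * (x - b), W, lr=0.05)
print(x)   # nodes hover near 3.0; a constant step size leaves a small bias
```

With a decaying step size the per-node bias vanishes, and the asymptotic network independence results concern how quickly this matches centralized SGD.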

Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps

Title Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps
Authors Igor Gilitschenski, Guy Rosman, Arjun Gupta, Sertac Karaman, Daniela Rus
Abstract In this paper, we propose a novel approach for agent motion prediction in cluttered environments. One of the main challenges in predicting agent motion is accounting for location and context-specific information. Our main contribution is the concept of learning context maps to improve the prediction task. Context maps are a set of location-specific latent maps that are trained alongside the predictor. Thus, the proposed maps are capable of capturing location context beyond visual context cues (e.g. usual average speeds and typical trajectories) or predefined map primitives (lanes and stop lines). We pose context map learning as a multi-task training problem and describe our map model and its incorporation into a state-of-the-art trajectory predictor. In extensive experiments, we show that the use of context maps can significantly improve predictor accuracy, and that accuracy can be boosted further by providing even partial knowledge of map semantics.
Tasks motion prediction, Trajectory Prediction
Published 2019-12-14
URL https://arxiv.org/abs/1912.06785v1
PDF https://arxiv.org/pdf/1912.06785v1.pdf
PWC https://paperswithcode.com/paper/deep-context-map-agent-trajectory-prediction
Repo
Framework
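The basic data structure behind a context map might look like the following sketch (purely illustrative; the paper's maps are trained jointly with the predictor, whereas this only shows the location-indexed latent lookup that would feed the predictor):

```python
import numpy as np

class ContextMap:
    """A coarse grid of latent vectors indexed by location: a query
    position returns the latent cell it falls in, to be concatenated
    with the trajectory predictor's input features."""
    def __init__(self, h, w, dim, cell=10.0, seed=0):
        # latent grid would be a trainable parameter in the real model
        self.grid = np.random.default_rng(seed).normal(size=(h, w, dim)) * 0.01
        self.cell = cell                     # cell size in world units (meters)

    def lookup(self, x, y):
        i, j = int(y // self.cell), int(x // self.cell)
        return self.grid[i, j]

cmap = ContextMap(8, 8, dim=4)
feat = cmap.lookup(x=23.0, y=51.0)           # falls in grid cell (5, 2)
print(feat.shape)
```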

Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems

Title Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems
Authors Changhua Pei, Xinru Yang, Qing Cui, Xiao Lin, Fei Sun, Peng Jiang, Wenwu Ou, Yongfeng Zhang
Abstract Existing recommendation algorithms mostly focus on optimizing traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE or the quality of top-$k$ recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommendation systems is to improve the final revenue/profit of the system. Traditional recommendation targets such as rating prediction and top-$k$ recommendation are not directly related to this goal. In this work, we blend the fundamental concepts in online advertising and micro-economics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items to generate the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising into the conversion rate of an arbitrary user action (XVR) in E-commerce, where the user actions can be clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. Economic values of different user actions are further integrated as the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for the maximum total value. Experimental results in both offline benchmarks and online commercial systems verify the improved performance of our framework, in terms of both traditional top-$k$ ranking tasks and the economic profits of the system.
Tasks Recommendation Systems
Published 2019-02-03
URL http://arxiv.org/abs/1902.00851v1
PDF http://arxiv.org/pdf/1902.00851v1.pdf
PWC https://paperswithcode.com/paper/value-aware-recommendation-based-on
Repo
Framework
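The reward construction, mapping each user-action rate (XVR) to a monetized value and summing, reduces to a small expected-value computation. A sketch with invented action values and rates (not figures from the paper):

```python
def expected_value_reward(item, xvr, values):
    """Reward of recommending an item: sum over user-action types of the
    predicted action rate (XVR) times the monetized value of that action."""
    return sum(xvr[item][a] * values[a] for a in values)

values = {"click": 0.05, "add_to_cart": 0.5, "purchase": 12.0}   # illustrative
xvr = {
    "cheap_popular": {"click": 0.30, "add_to_cart": 0.10, "purchase": 0.02},
    "pricey_niche": {"click": 0.05, "add_to_cart": 0.04, "purchase": 0.015},
}
for item in xvr:
    print(item, round(expected_value_reward(item, xvr, values), 4))
```

These per-item rewards would then be aggregated over a ranking list and optimized with reinforcement learning, as the abstract describes.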

Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories

Title Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories
Authors Chaoyun Zhang, Marco Fiore, Paul Patras
Abstract Network slicing is increasingly used to partition network infrastructure between different mobile services. Precise service-wise mobile traffic forecasting becomes essential in this context, as mobile operators seek to pre-allocate resources to each slice in advance, to meet the distinct requirements of individual services. This paper attacks the problem of multi-service mobile traffic forecasting using a sequence-to-sequence (S2S) learning paradigm and convolutional long short-term memories (ConvLSTMs). The proposed architecture is designed so as to effectively extract complex spatiotemporal features of mobile network traffic and predict with high accuracy the future demands for individual services at city scale. We conduct experiments on a mobile traffic dataset collected in a large European metropolis, demonstrating that the proposed S2S-ConvLSTM can forecast the mobile traffic volume produced by tens of different services up to one hour in advance, using only measurements taken during the past hour. In particular, our solution achieves mean absolute errors (MAE) at the antenna level that are below 13 KBps, outperforming other deep learning approaches by up to 31.2%.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09771v1
PDF https://arxiv.org/pdf/1905.09771v1.pdf
PWC https://paperswithcode.com/paper/multi-service-mobile-traffic-forecasting-via
Repo
Framework
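Independent of the ConvLSTM architecture itself, the sequence-to-sequence setup ("predict the next hour from the past hour") comes down to windowing a spatiotemporal traffic tensor into input/target pairs. A sketch with invented tensor dimensions (the actual dataset's grid and service count differ):

```python
import numpy as np

def make_s2s_windows(traffic, in_len=6, out_len=6):
    """Slice a (time, height, width, services) traffic tensor into
    sequence-to-sequence training pairs: the past in_len snapshots
    predict the next out_len (e.g. 10-minute bins covering one hour)."""
    Xs, Ys = [], []
    for t in range(len(traffic) - in_len - out_len + 1):
        Xs.append(traffic[t:t + in_len])
        Ys.append(traffic[t + in_len:t + in_len + out_len])
    return np.stack(Xs), np.stack(Ys)

traffic = np.random.default_rng(0).random((24, 8, 8, 3))  # 4h, 8x8 grid, 3 services
X, Y = make_s2s_windows(traffic)
print(X.shape, Y.shape)
```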

Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation

Title Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation
Authors Or Yair, Felix Dietrich, Ronen Talmon, Ioannis G. Kevrekidis
Abstract The problem of domain adaptation has become central in many applications from a broad range of fields. Recently, it was proposed to use Optimal Transport (OT) to solve it. In this paper, we model the difference between the two domains by a diffeomorphism and use the polar factorization theorem to claim that OT is indeed optimal for domain adaptation in a well-defined sense, up to a volume preserving map. We then focus on the manifold of Symmetric and Positive-Definite (SPD) matrices, whose structure provided a useful context in recent applications. We demonstrate the polar factorization theorem on this manifold. Due to the uniqueness of the weighted Riemannian mean, and by exploiting existing regularized OT algorithms, we formulate a simple algorithm that maps the source domain to the target domain. We test our algorithm on two Brain-Computer Interface (BCI) data sets and observe state of the art performance.
Tasks Domain Adaptation
Published 2019-06-03
URL https://arxiv.org/abs/1906.00616v3
PDF https://arxiv.org/pdf/1906.00616v3.pdf
PWC https://paperswithcode.com/paper/190600616
Repo
Framework
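The regularized OT machinery the paper builds on can be sketched with plain Sinkhorn iterations and a barycentric map from source to target (Euclidean cost on point clouds stands in for the Riemannian distance on SPD matrices; the shifted-Gaussian domains are illustrative):

```python
import numpy as np

def sinkhorn(M, reg=1.0, iters=200):
    """Entropy-regularized OT plan between uniform marginals via
    Sinkhorn iterations; M is the pairwise cost matrix."""
    K = np.exp(-M / reg)
    a = np.ones(M.shape[0]) / M.shape[0]
    b = np.ones(M.shape[1]) / M.shape[1]
    v = np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)          # scale rows toward marginal a
        v = b / (K.T @ u)        # scale columns toward marginal b
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
src = rng.normal(size=(30, 2))
tgt = rng.normal(size=(30, 2)) + 2.0            # shifted target domain
M = ((src[:, None] - tgt[None, :]) ** 2).sum(-1)
P = sinkhorn(M)
mapped = (P / P.sum(1, keepdims=True)) @ tgt    # barycentric map source -> target
print(np.linalg.norm(mapped.mean(0) - tgt.mean(0)))
```

On SPD matrices the cost would instead use the Riemannian distance, and the uniqueness of the weighted Riemannian mean is what makes the analogous barycentric map well defined.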