Paper Group ANR 1439
Patch redundancy in images: a statistical testing framework and some applications
Title | Patch redundancy in images: a statistical testing framework and some applications |
Authors | De Bortoli Valentin, Desolneux Agnès, Galerne Bruno, Leclaire Arthur |
Abstract | In this work we introduce a statistical framework to analyze spatial redundancy in natural images. This notion of spatial redundancy must be defined locally, and we therefore give some examples of functions (auto-similarity and template similarity) which, given one or two images, compute a similarity measurement between patches. Two patches are said to be similar if the similarity measurement is small enough. To derive a criterion for deciding whether two patches are similar, we present an a contrario model: two patches are declared similar if the associated similarity measurement is unlikely to occur under a background model. Choosing Gaussian random fields as background models, we derive non-asymptotic expressions for the probability distribution function of similarity measurements. We introduce a fast algorithm to assess redundancy in natural images and present applications in denoising, periodicity analysis and texture ranking. |
Tasks | Denoising |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06428v1 |
http://arxiv.org/pdf/1904.06428v1.pdf | |
PWC | https://paperswithcode.com/paper/patch-redundancy-in-images-a-statistical |
Repo | |
Framework | |
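The a contrario decision rule above can be made concrete with a toy sketch. The following is a minimal illustration, not the paper's algorithm: similarity is the squared L2 distance between patches, the Gaussian background model is white noise, and the tail probability is estimated by Monte Carlo rather than with the paper's non-asymptotic closed-form expressions.

```python
import random

def sq_dist(p, q):
    """Squared L2 distance between two equally sized patches (flat lists)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def a_contrario_similar(p, q, n_tests, eps=1.0, sigma=1.0, n_mc=5000, seed=0):
    """Declare p and q similar if the probability of a distance this small
    under a white-Gaussian background model, multiplied by the number of
    tests performed, stays below eps."""
    rng = random.Random(seed)
    d = sq_dist(p, q)
    k = len(p)
    hits = 0
    for _ in range(n_mc):
        # draw two independent Gaussian patches from the background model
        u = [rng.gauss(0.0, sigma) for _ in range(k)]
        v = [rng.gauss(0.0, sigma) for _ in range(k)]
        if sq_dist(u, v) <= d:
            hits += 1
    p_value = hits / n_mc
    return n_tests * p_value < eps
```

Multiplying the p-value by the number of tests bounds the expected number of false alarms, the standard a contrario control.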
Facial Landmark Correlation Analysis
Title | Facial Landmark Correlation Analysis |
Authors | Yongzhe Yan, Stefan Duffner, Priyanka Phutane, Anthony Berthelier, Christophe Blanc, Christophe Garcia, Thierry Chateau |
Abstract | We present a facial landmark position correlation analysis and several of its applications. Although numerous facial landmark detection methods have been presented in the literature, few of them concern the intrinsic relationship among the landmarks. To reveal and interpret this relationship, we propose to analyze facial landmark correlation using Canonical Correlation Analysis (CCA). We experimentally show that dense facial landmark annotations in current benchmarks are strongly correlated, and we propose several applications based on this analysis. First, we give insights into the predictions of different facial landmark detection models (including cascaded random forests, cascaded Convolutional Neural Networks (CNNs), and heatmap regression models) and interpret how CNNs progressively learn to predict facial landmarks. Second, we propose a few-shot learning method that considerably reduces the manual effort needed for dense landmark annotation. To this end, we select from the dense annotation format a subset of landmarks that is most strongly correlated with the rest, forming a sparse format. Thanks to the strong correlation among the landmarks, the entire set of dense facial landmarks can then be inferred from the sparse annotations by transfer learning. Unlike previous methods, we mainly focus on finding the most efficient sparse format to annotate. Overall, our correlation analysis provides new perspectives for research on facial landmark detection. |
Tasks | Facial Landmark Detection, Few-Shot Learning, Transfer Learning |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10576v1 |
https://arxiv.org/pdf/1911.10576v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-landmark-correlation-analysis |
Repo | |
Framework | |
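The sparse-format selection described above can be illustrated with a toy greedy procedure. This is a hypothetical sketch using plain Pearson correlation, not the paper's CCA-based analysis: it keeps the k landmarks whose coordinates are, on average, most correlated with the not-yet-selected ones.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def select_sparse_landmarks(coords, k):
    """coords[i] is one landmark's positions over all images. Greedily keep
    the k landmarks whose mean absolute correlation with the remaining
    landmarks is highest, so the sparse set best predicts the rest."""
    remaining = list(range(len(coords)))
    chosen = []
    for _ in range(k):
        best = max(remaining, key=lambda i: mean(
            abs(pearson(coords[i], coords[j])) for j in remaining if j != i))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```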
A Unified Framework for Lifelong Learning in Deep Neural Networks
Title | A Unified Framework for Lifelong Learning in Deep Neural Networks |
Authors | Charles X. Ling, Tanner Bohn |
Abstract | Humans can learn a variety of concepts and skills incrementally over the course of their lives while exhibiting an array of desirable properties, such as non-forgetting, concept rehearsal, forward transfer and backward transfer of knowledge, few-shot learning, and selective forgetting. Previous approaches to lifelong machine learning can only demonstrate subsets of these properties, often by combining multiple complex mechanisms. In this Perspective, we propose a powerful unified framework that can demonstrate all of the properties by utilizing a small number of weight consolidation parameters in deep neural networks. In addition, we are able to draw many parallels between the behaviours and mechanisms of our proposed framework and those surrounding human learning, such as memory loss or sleep deprivation. This Perspective serves as a conduit for two-way inspiration to further understand lifelong learning in machines and humans. |
Tasks | Few-Shot Learning |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09704v2 |
https://arxiv.org/pdf/1911.09704v2.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-lifelong-learning-in |
Repo | |
Framework | |
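The weight consolidation mentioned in the abstract can be sketched as a penalized gradient step. This follows the generic EWC-style form; the paper's exact parameterization of its consolidation parameters may differ.

```python
def consolidated_update(w, grad_loss, w_old, omega, lr=0.1, lam=1.0):
    """One gradient step on loss(w) + (lam/2) * sum_i omega_i * (w_i - w_old_i)^2.
    omega_i measures how important weight i was for previously learned tasks,
    so important weights are pulled back toward their old values
    (non-forgetting) while unimportant ones stay free to learn the new task."""
    return [wi - lr * (g + lam * om * (wi - wo))
            for wi, g, wo, om in zip(w, grad_loss, w_old, omega)]
```

Setting omega to zero for a weight recovers plain gradient descent on that weight, which is what allows selective forgetting.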
Optimistic Proximal Policy Optimization
Title | Optimistic Proximal Policy Optimization |
Authors | Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka |
Abstract | Reinforcement learning, a machine learning framework for training an autonomous agent based on rewards, has shown outstanding results in various domains. However, learning a good policy is known to be difficult in domains where rewards are sparse. We propose a method, optimistic proximal policy optimization (OPPO), to alleviate this difficulty. OPPO considers the uncertainty of the estimated total return and evaluates the policy optimistically with respect to that uncertainty. We show that OPPO outperforms existing methods on a tabular task. |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.11075v1 |
https://arxiv.org/pdf/1906.11075v1.pdf | |
PWC | https://paperswithcode.com/paper/optimistic-proximal-policy-optimization |
Repo | |
Framework | |
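The idea of evaluating a policy optimistically under return uncertainty can be illustrated in the tabular, bandit-like case. This is a UCB-style sketch of "optimism in the face of uncertainty", not the exact OPPO objective.

```python
from math import log, sqrt

def optimistic_value(returns_per_action, t, c=1.0):
    """For each action, estimated value = empirical mean return plus an
    exploration bonus that grows when the action has few samples, i.e.
    when its return estimate is uncertain."""
    values = {}
    for a, rs in returns_per_action.items():
        emp_mean = sum(rs) / len(rs)
        bonus = c * sqrt(log(t) / len(rs))
        values[a] = emp_mean + bonus
    return values

def pick_action(returns_per_action, t):
    """Greedy choice with respect to the optimistic values."""
    vals = optimistic_value(returns_per_action, t)
    return max(vals, key=vals.get)
```

A rarely tried action gets a large bonus and is explored even if its empirical mean is lower, which is what helps when rewards are sparse.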
Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation
Title | Fill in the Blanks: Imputing Missing Sentences for Larger-Context Neural Machine Translation |
Authors | Sébastien Jean, Ankur Bapna, Orhan Firat |
Abstract | Most neural machine translation systems still translate sentences in isolation. To make further progress, a promising line of research additionally considers the surrounding context in order to provide the model with potentially missing source-side information, as well as to maintain a coherent output. One difficulty in training such larger-context (i.e. document-level) machine translation systems is that context may be missing from many parallel examples. To circumvent this issue, two-stage approaches, in which sentence-level translations are post-edited in context, have recently been proposed. In this paper, we instead consider the viability of filling in the missing context. In particular, we consider three distinct approaches to generate the missing context: using random contexts, applying a copy heuristic or generating it with a language model. We find that the copy heuristic significantly helps with lexical coherence, while using completely random contexts hurts performance on many long-distance linguistic phenomena. We also validate the usefulness of tagged back-translation. In addition to improving BLEU scores as expected, using back-translated data helps larger-context machine translation systems to better capture long-range phenomena. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14075v1 |
https://arxiv.org/pdf/1910.14075v1.pdf | |
PWC | https://paperswithcode.com/paper/fill-in-the-blanks-imputing-missing-sentences |
Repo | |
Framework | |
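Of the three context-imputation options, the random and copy heuristics are trivial to state in code; the following hypothetical helper illustrates them (the language-model option is omitted).

```python
import random

def impute_context(sentence, corpus, mode, rng=random.Random(0)):
    """Fill in a missing preceding context for a sentence-level example.
    'copy' repeats the current sentence (found to help lexical coherence);
    'random' samples an unrelated sentence from the corpus (found to hurt
    long-distance phenomena)."""
    if mode == "copy":
        return sentence
    if mode == "random":
        return rng.choice(corpus)
    raise ValueError("mode must be 'copy' or 'random'")
```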
BMVC 2019: Workshop on Interpretable and Explainable Machine Vision
Title | BMVC 2019: Workshop on Interpretable and Explainable Machine Vision |
Authors | Alun Preece |
Abstract | Proceedings of the BMVC 2019 Workshop on Interpretable and Explainable Machine Vision, Cardiff, UK, September 12, 2019. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07245v1 |
https://arxiv.org/pdf/1909.07245v1.pdf | |
PWC | https://paperswithcode.com/paper/bmvc-2019-workshop-on-interpretable-and |
Repo | |
Framework | |
Transformation of XML Documents with Prolog
Title | Transformation of XML Documents with Prolog |
Authors | René Haberland, Igor L. Bratchikov |
Abstract | Transforming XML documents with conventional XML languages such as XSL-T is disadvantageous because the abstraction over the target language is too lax and rule-oriented transformations are difficult to recognize. Prolog, as a declarative programming language, is especially well suited to implementing analyses of formal languages. Prolog also appears well suited to term manipulation, complex schema transformation and text retrieval. In this report an appropriate model for XML documents is proposed, the basic transformation language LTL for Prolog is defined, and its expressive power is demonstrated in comparison with XSL-T; the implementations used throughout are multi-paradigmatic. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08361v1 |
https://arxiv.org/pdf/1906.08361v1.pdf | |
PWC | https://paperswithcode.com/paper/transformation-of-xml-documents-with-prolog |
Repo | |
Framework | |
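The report's transformations are written in Prolog; as a language-neutral illustration of what a rule-oriented XML transformation looks like, here is a small Python sketch in which each rule maps a source tag to a new tag and rules are applied recursively.

```python
import xml.etree.ElementTree as ET

def transform(elem, rules):
    """Apply a rule-oriented transformation: rules maps a source tag to a
    function producing the target tag; elements whose tag has no rule are
    kept as-is. Rules are applied recursively over the document tree."""
    new = ET.Element(rules.get(elem.tag, lambda t: t)(elem.tag))
    new.text = elem.text
    for child in elem:
        new.append(transform(child, rules))
    return new
```

For example, the rules `{"list": lambda t: "ul", "item": lambda t: "li"}` rewrite `<list><item>a</item></list>` into `<ul><li>a</li></ul>`.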
Learn Electronic Health Records by Fully Decentralized Federated Learning
Title | Learn Electronic Health Records by Fully Decentralized Federated Learning |
Authors | Songtao Lu, Yawen Zhang, Yunlong Wang, Christina Mack |
Abstract | Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network. In this paper, we focus on improving the communication efficiency of fully decentralized federated learning over a graph, where the algorithm performs local updates for several iterations and then enables communication among the nodes. In this way, the number of communication rounds spent exchanging the shared model parameters can be reduced significantly without loss of optimality of the solutions. Multiple numerical simulations based on large, real-world electronic health record databases showcase the superiority of decentralized federated learning over classic methods. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01792v2 |
https://arxiv.org/pdf/1912.01792v2.pdf | |
PWC | https://paperswithcode.com/paper/learn-electronic-health-records-by-fully |
Repo | |
Framework | |
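The local-updates-then-communicate pattern in the abstract can be sketched for scalar parameters over an arbitrary graph. This is a minimal illustration assuming each node holds its own gradient oracle; it is not the paper's algorithm or data.

```python
def decentralized_round(params, neighbors, grads, lr=0.1, local_steps=5):
    """One communication round: every node runs several local gradient
    steps on its own data, then averages its parameter with its graph
    neighbors. Running local_steps > 1 between communications is what
    saves communication rounds."""
    # local updates: grads[i](w) returns node i's gradient at w
    for i in range(len(params)):
        for _ in range(local_steps):
            params[i] = params[i] - lr * grads[i](params[i])
    # gossip step: each node averages itself with its neighbors
    mixed = []
    for i, nbrs in enumerate(neighbors):
        group = [i] + list(nbrs)
        mixed.append(sum(params[j] for j in group) / len(group))
    return mixed
```

With quadratic local losses the nodes reach consensus on the average minimizer after a few rounds.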
Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation
Title | Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation |
Authors | Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster |
Abstract | We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows. This approach naturally exhibits very low latency and high final quality, but at the cost of incremental instability as the output is continuously refined. We experiment with a pipeline of industry-grade speech recognition and translation tools, augmented with simple inference heuristics to improve stability. We use TED Talks as a source of multilingual test data, developing our techniques on English-to-German spoken language translation. Our minimalist approach to simultaneous translation allows us to easily scale our final evaluation to six more target languages, dramatically improving incremental stability for all of them. |
Tasks | Machine Translation, Speech Recognition |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03393v1 |
https://arxiv.org/pdf/1912.03393v1.pdf | |
PWC | https://paperswithcode.com/paper/re-translation-strategies-for-long-form |
Repo | |
Framework | |
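One family of stability heuristics for re-translation is easy to sketch: only commit the prefix that agrees between successive re-translations, and hold back the volatile tail. This is a hypothetical illustration in the spirit of the simple inference heuristics the abstract mentions, not the authors' exact method.

```python
def stable_prefix(prev_tokens, new_tokens):
    """Longest common prefix of the previous and current re-translation."""
    out = []
    for a, b in zip(prev_tokens, new_tokens):
        if a != b:
            break
        out.append(a)
    return out

def display_update(prev_tokens, new_tokens, mask=1):
    """Show the new translation but hold back the last `mask` tokens beyond
    the stable prefix: the tail is the part most likely to be revised by
    the next re-translation, so withholding it reduces caption flicker."""
    k = len(stable_prefix(prev_tokens, new_tokens))
    cut = max(k, len(new_tokens) - mask)
    return new_tokens[:cut]
```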
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
Title | Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee |
Authors | Wei Hu, Zhiyuan Li, Dingli Yu |
Abstract | Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other hand, simple regularization methods like early-stopping can often achieve highly nontrivial performance on clean test data in these scenarios, a phenomenon not theoretically understood. This paper proposes and analyzes two simple and intuitive regularization methods: (i) regularization by the distance from the network parameters to their initialization, and (ii) adding a trainable auxiliary variable to the network output for each training example. Theoretically, we prove that gradient descent training with either of these two methods leads to a generalization guarantee on the clean data distribution despite being trained using noisy labels. Our generalization analysis relies on the connection between wide neural networks and the neural tangent kernel (NTK). The generalization bound is independent of the network size, and is comparable to the bound one can get when there is no label noise. Experimental results verify the effectiveness of these methods on noisily labeled datasets. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11368v3 |
https://arxiv.org/pdf/1905.11368v3.pdf | |
PWC | https://paperswithcode.com/paper/understanding-generalization-of-deep-neural |
Repo | |
Framework | |
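Both regularizers above can be sketched for squared loss. This is a minimal illustration of the two mechanisms, not the paper's full training procedure: method (i) pulls the weights back toward initialization, and in method (ii) a per-example auxiliary variable is trained alongside the network and can absorb label noise.

```python
def step_distance_to_init(w, w_init, grad_loss, lr=0.1, mu=0.5):
    """Method (i): one gradient step on loss(w) + (mu/2) * ||w - w_init||^2,
    which keeps the network close to its initialization."""
    return [wi - lr * (g + mu * (wi - w0))
            for wi, g, w0 in zip(w, grad_loss, w_init)]

def step_auxiliary(pred, aux, label, lr_aux=0.5):
    """Method (ii): each example i gets a trainable auxiliary variable
    aux[i] added to the network output; minimizing the squared loss
    (pred[i] + aux[i] - label[i])^2 / 2 over aux lets the auxiliary
    variables soak up label noise instead of the network weights."""
    return [a - lr_aux * ((p + a) - y) for p, a, y in zip(pred, aux, label)]
```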
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning
Title | Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning |
Authors | Shi Pu, Alex Olshevsky, Ioannis Ch. Paschalidis |
Abstract | We provide a discussion of several recent results which, in certain scenarios, are able to overcome a barrier in distributed stochastic optimization for machine learning. Our focus is the so-called asymptotic network independence property, which is achieved whenever a distributed method executed over a network of n nodes asymptotically converges to the optimal solution at a comparable rate to a centralized method with the same computational power as the entire network. We explain this property through an example involving the training of ML models and sketch a short mathematical analysis for comparing the performance of distributed stochastic gradient descent (DSGD) with centralized stochastic gradient descent (SGD). |
Tasks | Distributed Optimization, Stochastic Optimization |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12345v5 |
https://arxiv.org/pdf/1906.12345v5.pdf | |
PWC | https://paperswithcode.com/paper/asymptotic-network-independence-in |
Repo | |
Framework | |
Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps
Title | Deep Context Map: Agent Trajectory Prediction using Location-specific Latent Maps |
Authors | Igor Gilitschenski, Guy Rosman, Arjun Gupta, Sertac Karaman, Daniela Rus |
Abstract | In this paper, we propose a novel approach for agent motion prediction in cluttered environments. One of the main challenges in predicting agent motion is accounting for location- and context-specific information. Our main contribution is the concept of learning context maps to improve the prediction task. Context maps are a set of location-specific latent maps that are trained alongside the predictor. Thus, the proposed maps are capable of capturing location context beyond visual context cues (e.g. usual average speeds and typical trajectories) or predefined map primitives (lanes and stop lines). We pose context map learning as a multi-task training problem and describe our map model and its incorporation into a state-of-the-art trajectory predictor. Extensive experiments show that the use of context maps significantly improves predictor accuracy, which can be further boosted by providing even partial knowledge of map semantics. |
Tasks | motion prediction, Trajectory Prediction |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06785v1 |
https://arxiv.org/pdf/1912.06785v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-context-map-agent-trajectory-prediction |
Repo | |
Framework | |
Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems
Title | Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems |
Authors | Changhua Pei, Xinru Yang, Qing Cui, Xiao Lin, Fei Sun, Peng Jiang, Wenwu Ou, Yongfeng Zhang |
Abstract | Existing recommendation algorithms mostly focus on optimizing traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE or the quality of top-$k$ recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommendation systems is to improve the final revenue/profit of the system. Traditional recommendation targets such as rating prediction and top-$k$ recommendation are not directly related to this goal. In this work, we blend the fundamental concepts in online advertising and micro-economics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items to generate the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising into the conversion rate of an arbitrary user action (XVR) in E-commerce, where the user actions can be clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. Economic values of different user actions are further integrated as the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for the maximum total value. Experimental results in both offline benchmarks and online commercial systems verify the improved performance of our framework, in terms of both traditional top-$k$ ranking tasks and the economic profits of the system. |
Tasks | Recommendation Systems |
Published | 2019-02-03 |
URL | http://arxiv.org/abs/1902.00851v1 |
http://arxiv.org/pdf/1902.00851v1.pdf | |
PWC | https://paperswithcode.com/paper/value-aware-recommendation-based-on |
Repo | |
Framework | |
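At ranking time, the XVR monetization described above reduces to an expected-value computation per item. A minimal sketch with made-up action names and values (in the paper the rates are learned and reinforcement learning optimizes whole lists):

```python
def item_value(action_rates, action_values):
    """Expected economic value of recommending an item: sum over user
    actions (click, purchase, ...) of the action's predicted rate (XVR)
    times its monetized value."""
    return sum(action_rates[a] * action_values[a] for a in action_rates)

def rank_by_value(items, action_values):
    """items maps item id -> predicted action rates; returns ids sorted by
    expected value, the reward a value-aware recommender maximizes."""
    return sorted(items, key=lambda i: item_value(items[i], action_values),
                  reverse=True)
```

Note how a low-click, high-purchase item can outrank a high-click item once actions are monetized.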
Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories
Title | Multi-Service Mobile Traffic Forecasting via Convolutional Long Short-Term Memories |
Authors | Chaoyun Zhang, Marco Fiore, Paul Patras |
Abstract | Network slicing is increasingly used to partition network infrastructure between different mobile services. Precise service-wise mobile traffic forecasting becomes essential in this context, as mobile operators seek to pre-allocate resources to each slice in advance, to meet the distinct requirements of individual services. This paper attacks the problem of multi-service mobile traffic forecasting using a sequence-to-sequence (S2S) learning paradigm and convolutional long short-term memories (ConvLSTMs). The proposed architecture is designed to effectively extract complex spatiotemporal features of mobile network traffic and predict with high accuracy the future demands for individual services at city scale. We conduct experiments on a mobile traffic dataset collected in a large European metropolis, demonstrating that the proposed S2S-ConvLSTM can forecast the mobile traffic volume produced by tens of different services up to one hour in advance, using only measurements taken during the past hour. In particular, our solution achieves mean absolute errors (MAE) at the antenna level below 13 KBps, outperforming other deep learning approaches by up to 31.2%. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09771v1 |
https://arxiv.org/pdf/1905.09771v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-service-mobile-traffic-forecasting-via |
Repo | |
Framework | |
Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation
Title | Optimal Transport on the Manifold of SPD Matrices for Domain Adaptation |
Authors | Or Yair, Felix Dietrich, Ronen Talmon, Ioannis G. Kevrekidis |
Abstract | The problem of domain adaptation has become central in many applications from a broad range of fields. Recently, it was proposed to use Optimal Transport (OT) to solve it. In this paper, we model the difference between the two domains by a diffeomorphism and use the polar factorization theorem to claim that OT is indeed optimal for domain adaptation in a well-defined sense, up to a volume-preserving map. We then focus on the manifold of Symmetric and Positive-Definite (SPD) matrices, whose structure has provided a useful context in recent applications. We demonstrate the polar factorization theorem on this manifold. Due to the uniqueness of the weighted Riemannian mean, and by exploiting existing regularized OT algorithms, we formulate a simple algorithm that maps the source domain to the target domain. We test our algorithm on two Brain-Computer Interface (BCI) data sets and observe state-of-the-art performance. |
Tasks | Domain Adaptation |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00616v3 |
https://arxiv.org/pdf/1906.00616v3.pdf | |
PWC | https://paperswithcode.com/paper/190600616 |
Repo | |
Framework | |
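The "existing regularized OT algorithms" the abstract exploits are typically Sinkhorn iterations. Here is a minimal, self-contained Sinkhorn sketch between two uniform discrete measures; in the paper the cost would come from the SPD-manifold geometry, while here it is an arbitrary nonnegative matrix.

```python
import math

def sinkhorn(cost, reg=0.1, iters=200):
    """Entropy-regularized OT between two uniform discrete measures with
    the given cost matrix; returns the transport plan. Alternately rescales
    rows and columns of K = exp(-cost/reg) to match the marginals."""
    n, m = len(cost), len(cost[0])
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

As the regularization shrinks, the plan approaches the unregularized OT coupling; for a diagonal-dominant cost it concentrates mass on the diagonal.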