January 30, 2020

3163 words 15 mins read

Paper Group ANR 278

Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features

Title Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features
Authors Rafal Pilarczyk, Xin Chang, Wladyslaw Skarbek
Abstract Several computer algorithms for recognizing visible human emotions are compared in a web-camera scenario using the CNN/MMOD face detector. The recognition covers four face expressions: smile, surprise, anger, and neutral. At the feature extraction stage, three concepts of face description are confronted: (a) static 2D face geometry represented by its 68 characteristic landmarks (FP68); (b) dynamic 3D geometry defined by motion parameters for eight distinguished face parts (denoted as AU8) of the personalized Candide-3 model; (c) static 2D visual description as a 2D array of gray-scale pixels (the raw facial image). At the classification stage, the performance of two major models is analyzed: (a) support vector machines (SVM) with kernel options; (b) convolutional neural networks (CNN) with a variety of relevant tensor-processing layers and blocks. The models are trained on frontal views of human faces and tested on arbitrary head poses. For geometric features, the success rate (accuracy) indicates a nearly threefold performance gain of CNN over SVM classifiers. For raw images, the CNN outperforms its best geometric counterpart (AU/CNN) in accuracy by about 30 percent, while the best SVM solutions are inferior by nearly a factor of four. A similarly large advantage of raw/CNN over geometric/CNN and geometric/SVM is observed for the F-score. We conclude that, in contrast to CNN-based emotion classifiers, SVM-based emotion classifiers generalize poorly with respect to head pose.
Tasks
Published 2019-01-31
URL http://arxiv.org/abs/1901.11179v1
PDF http://arxiv.org/pdf/1901.11179v1.pdf
PWC https://paperswithcode.com/paper/human-face-expressions-from-images-2d-face
Repo
Framework
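
A minimal sketch of the landmark-based SVM baseline described above, assuming the 68 (x, y) landmarks per face have already been extracted (e.g., with a dlib-style detector) and flattened into 136-dimensional vectors. The synthetic data, labels, and RBF kernel settings are illustrative, not the authors' exact configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: N faces, 68 (x, y) landmarks flattened to 136 features,
# labels in {0: neutral, 1: smile, 2: surprise, 3: anger}.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 136))
y = rng.integers(0, 4, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# RBF-kernel SVM over standardized landmark coordinates.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_tr, y_tr)
print("landmark-SVM accuracy:", clf.score(X_te, y_te))
```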

Conditionally-additive-noise Models for Structure Learning

Title Conditionally-additive-noise Models for Structure Learning
Authors Daniel Chicharro, Stefano Panzeri, Ilya Shpitser
Abstract Constraint-based structure learning algorithms infer the causal structure of multivariate systems from observational data by determining an equivalent class of causal structures compatible with the conditional independencies in the data. Methods based on additive-noise (AN) models have been proposed to further discriminate between causal structures that are equivalent in terms of conditional independencies. These methods rely on a particular form of the generative functional equations, with an additive noise structure, which allows inferring the directionality of causation by testing the independence between the residuals of a nonlinear regression and the predictors (nrr-independencies). Full causal structure identifiability has been proven for systems that contain only additive-noise equations and have no hidden variables. We extend the AN framework in several ways. We introduce alternative regression-free tests of independence based on conditional variances (cv-independencies). We consider conditionally-additive-noise (CAN) models, in which the equations may have the AN form only after conditioning. We exploit asymmetries in nrr-independencies or cv-independencies resulting from the CAN form to derive a criterion that infers the causal relation between a pair of variables in a multivariate system without any assumption about the form of the equations or the presence of hidden variables.
Tasks
Published 2019-05-20
URL https://arxiv.org/abs/1905.08360v1
PDF https://arxiv.org/pdf/1905.08360v1.pdf
PWC https://paperswithcode.com/paper/conditionally-additive-noise-models-for
Repo
Framework
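
A rough sketch of the nrr-independence idea underlying AN/CAN models: regress each variable on the other with a nonlinear model and check how dependent the residuals are on the predictor, preferring the direction with the more independent residuals. The random-forest regression and the biased HSIC score below are illustrative stand-ins, not the paper's exact tests.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def hsic(a, b, sigma=1.0):
    """Biased HSIC estimate with Gaussian kernels: a crude dependence score (0 = independent)."""
    a = a.reshape(-1, 1)
    b = b.reshape(-1, 1)
    K = np.exp(-(a - a.T) ** 2 / (2 * sigma ** 2))
    L = np.exp(-(b - b.T) ** 2 / (2 * sigma ** 2))
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def residual_dependence(cause, effect):
    """Fit effect = f(cause) + residual nonparametrically, score dependence of residual on cause."""
    f = RandomForestRegressor(n_estimators=200, random_state=0)
    f.fit(cause.reshape(-1, 1), effect)
    resid = effect - f.predict(cause.reshape(-1, 1))
    return hsic(cause, resid)

# Toy additive-noise pair: x -> y.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 300)
y = x ** 3 + rng.normal(scale=0.5, size=300)

# The direction with the smaller residual dependence is preferred as causal.
print("x -> y score:", residual_dependence(x, y))
print("y -> x score:", residual_dependence(y, x))
```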

Regularized Hierarchical Policies for Compositional Transfer in Robotics

Title Regularized Hierarchical Policies for Compositional Transfer in Robotics
Authors Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller
Abstract The successful application of flexible, general learning algorithms – such as deep reinforcement learning – to real-world robotics applications is often limited by their poor data-efficiency. Domains with more than a single dominant task of interest encourage algorithms that share partial solutions across tasks to limit the required experiment time. We develop and investigate simple hierarchical inductive biases – in the form of structured policies – as a mechanism for knowledge transfer across tasks in reinforcement learning (RL). To leverage the power of these structured policies we design an RL algorithm that enables stable and fast learning. We demonstrate the success of our method both in simulated robot environments (using locomotion and manipulation domains) as well as real robot experiments, demonstrating substantially better data-efficiency than competitive baselines.
Tasks Transfer Learning
Published 2019-06-26
URL https://arxiv.org/abs/1906.11228v2
PDF https://arxiv.org/pdf/1906.11228v2.pdf
PWC https://paperswithcode.com/paper/regularized-hierarchical-policies-for
Repo
Framework
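
The hierarchical inductive bias can be pictured as a mixture policy: a task-conditioned high-level controller picks among a small set of shared low-level Gaussian components. The NumPy sketch below only illustrates that structure with random toy parameters; it is not the authors' RHPO algorithm or its regularization scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n_components, n_tasks, action_dim, state_dim = 3, 2, 2, 4

# Shared low-level components: per-component linear-Gaussian action heads (toy parameters).
W = rng.normal(size=(n_components, action_dim, state_dim))
log_std = np.full((n_components, action_dim), -0.5)

# Task-specific high-level controllers choosing among the shared components.
U = rng.normal(size=(n_tasks, n_components, state_dim))

def sample_action(state, task_id):
    logits = U[task_id] @ state                      # high-level preferences over components
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    k = rng.choice(n_components, p=probs)            # high level: pick a component
    mean = W[k] @ state                              # low level: Gaussian head of component k
    return mean + np.exp(log_std[k]) * rng.normal(size=action_dim)

print(sample_action(rng.normal(size=state_dim), task_id=1))
```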

Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms

Title Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms
Authors Liqun Shao, Yiwen Zhu, Abhiram Eswaran, Kristin Lieber, Janhavi Mahajan, Minsoo Thigpen, Sudhir Darbha, Siqi Liu, Subru Krishnan, Soundar Srinivasan, Carlo Curino, Konstantinos Karanasos
Abstract Microsoft’s internal big data analytics platform comprises hundreds of thousands of machines, serving over half a million jobs daily for thousands of users. The majority of these jobs are recurring and are crucial for the company’s operation. Although administrators spend significant effort tuning system performance, some jobs inevitably experience slowdowns, i.e., their execution time degrades relative to previous runs. Currently, the investigation of such slowdowns is a labor-intensive and error-prone process, which costs Microsoft significant human and machine resources and negatively impacts several lines of business. In this work, we present Griffon, a system we built and deployed in production last year to automatically discover the root cause of job slowdowns. Existing solutions either rely on labeled data (i.e., resolved incidents with labeled reasons for job slowdowns), which is in most cases non-existent or non-trivial to acquire, or on time-series analysis of individual metrics that does not target specific jobs holistically. In contrast, in Griffon we cast the problem as a corresponding regression problem that predicts the runtime of a job, and show how the relative contributions of the features used to train our interpretable model can be exploited to rank the potential causes of job slowdowns. Evaluated on historical incidents, we show that Griffon discovers slowdown causes consistent with those validated by domain-expert engineers, in a fraction of the time they require.
Tasks Time Series, Time Series Analysis
Published 2019-08-23
URL https://arxiv.org/abs/1908.09048v1
PDF https://arxiv.org/pdf/1908.09048v1.pdf
PWC https://paperswithcode.com/paper/griffon-reasoning-about-job-anomalies-with
Repo
Framework
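
A toy version of the cast-to-regression idea: fit an interpretable model that predicts job runtime from job features, then rank features by their contribution to an anomalous run's prediction relative to a typical run. The linear model, feature names, and contribution decomposition below are stand-ins, not Griffon's actual model or feature set.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
feature_names = ["input_gb", "container_count", "queue_wait_s", "skew_ratio"]  # hypothetical
X = rng.normal(size=(500, len(feature_names)))
runtime = 3.0 * X[:, 0] + 1.5 * X[:, 3] + rng.normal(scale=0.3, size=500)

model = Ridge(alpha=1.0).fit(X, runtime)

def rank_slowdown_causes(job, baseline=X.mean(axis=0)):
    """Per-feature contribution to the predicted runtime, relative to a typical job."""
    contrib = model.coef_ * (job - baseline)
    order = np.argsort(-contrib)
    return [(feature_names[i], float(contrib[i])) for i in order]

slow_job = X.mean(axis=0) + np.array([2.5, 0.1, 0.0, 1.8])  # unusually large input and skew
for name, c in rank_slowdown_causes(slow_job):
    print(f"{name:16s} {c:+.2f}")
```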

Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism

Title Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism
Authors Kevin K. Lin, Fei Lu
Abstract First-principles models of complex dynamic phenomena often have many degrees of freedom, only a small fraction of which may be scientifically relevant or observable. Reduced models distill such phenomena to their essence by modeling only relevant variables, thus decreasing computational cost and clarifying dynamical mechanisms. Here, we consider data-driven model reduction for nonlinear dynamical systems without sharp scale separation. Motivated by a discrete-time version of the Mori-Zwanzig projection operator formalism and the Wiener filter, we propose a simple and flexible mathematical formulation based on Wiener projection, which decomposes a nonlinear dynamical system into a component predictable by past values of relevant variables and its orthogonal complement. Wiener projection is equally applicable to deterministic chaotic dynamics and randomly-forced systems, and provides a natural starting point for systematic approximations. In particular, we use it to derive NARMAX models from an underlying dynamical system, thereby clarifying the scope of these widely-used tools in time series analysis. We illustrate its versatility on the Kuramoto-Sivashinsky model of spatiotemporal chaos and a stochastic Burgers equation.
Tasks Time Series, Time Series Analysis
Published 2019-08-21
URL https://arxiv.org/abs/1908.07725v3
PDF https://arxiv.org/pdf/1908.07725v3.pdf
PWC https://paperswithcode.com/paper/190807725
Repo
Framework
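
One concrete way to read the "predictable from past values of relevant variables" component: regress the next observation on a window of its own past (a stripped-down, linear NARX-style model) and treat the residual as the orthogonal complement that memory and noise terms must account for. This is only a schematic illustration of the decomposition, not the paper's Wiener-projection construction.

```python
import numpy as np

# Toy scalar time series standing in for an observed "relevant variable".
rng = np.random.default_rng(0)
T, p = 2000, 5                      # series length and memory (number of lags)
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.9 * np.sin(x[t - 1]) + 0.1 * rng.normal()

# Delay-embed: predict x[t] from (x[t-1], ..., x[t-p]).
rows = np.array([x[t - p:t][::-1] for t in range(p, T)])
target = x[p:]

coef, *_ = np.linalg.lstsq(rows, target, rcond=None)
predictable = rows @ coef
residual = target - predictable      # the part not explained by the observed past

print("fraction of variance captured by the past:", 1 - residual.var() / target.var())
```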

Advancing subgroup fairness via sleeping experts

Title Advancing subgroup fairness via sleeping experts
Authors Avrim Blum, Thodoris Lykouris
Abstract We study methods for improving fairness to subgroups in settings with overlapping populations and sequential predictions. Classical notions of fairness focus on the balance of some property across different populations. However, in many applications the goal of the different groups is not to be predicted equally but rather to be predicted well. We demonstrate that the task of satisfying this guarantee for multiple overlapping groups is not straightforward and show that for the simple objective of unweighted average of false negative and false positive rate, satisfying this for overlapping populations can be statistically impossible even when we are provided predictors that perform well separately on each subgroup. On the positive side, we show that when individuals are equally important to the different groups they belong to, this goal is achievable; to do so, we draw a connection to the sleeping experts literature in online learning. Motivated by the one-sided feedback in natural settings of interest, we extend our results to such a feedback model. We also provide a game-theoretic interpretation of our results, examining the incentives of participants to join the system and to provide the system full information about predictors they may possess. We end with several interesting open problems concerning the strength of guarantees that can be achieved in a computationally efficient manner.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08375v2
PDF https://arxiv.org/pdf/1909.08375v2.pdf
PWC https://paperswithcode.com/paper/advancing-subgroup-fairness-via-sleeping
Repo
Framework
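
The sleeping-experts connection can be made concrete with the classic multiplicative-weights variant in which only "awake" experts (here one could think of the groups an individual belongs to, each paired with its predictor) vote and are updated. The sketch below is the generic sleeping-experts update with synthetic losses, not the paper's specific fairness algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, T, eta = 4, 1000, 0.1
weights = np.ones(n_experts)

# Hypothetical per-expert error probabilities; each expert is only "awake" on some rounds.
err_prob = np.array([0.1, 0.3, 0.45, 0.25])

for t in range(T):
    awake = rng.random(n_experts) < 0.7            # which experts are available this round
    if not awake.any():
        continue
    p = weights * awake
    p = p / p.sum()                                 # mix only over awake experts
    losses = (rng.random(n_experts) < err_prob).astype(float)
    mixture_loss = float(p @ losses)                # loss of the combined prediction this round
    weights[awake] *= np.exp(-eta * losses[awake])  # only awake experts are updated

print("final (unnormalized) weights:", np.round(weights, 3))
```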

Quizbowl: The Case for Incremental Question Answering

Title Quizbowl: The Case for Incremental Question Answering
Authors Pedro Rodriguez, Shi Feng, Mohit Iyyer, He He, Jordan Boyd-Graber
Abstract Quizbowl is a scholastic trivia competition that tests human knowledge and intelligence; additionally, it supports diverse research in question answering (QA). A Quizbowl question consists of multiple sentences whose clues are arranged by difficulty (from obscure to obvious) and uniquely identify a well-known entity such as those found on Wikipedia. Since players can answer the question at any time, an elite player (human or machine) demonstrates its superiority by answering correctly given as few clues as possible. We make two key contributions to machine learning research through Quizbowl: (1) collecting and curating a large factoid QA dataset and an accompanying gameplay dataset, and (2) developing a computational approach to playing Quizbowl that involves determining both what to answer and when to answer. Our Quizbowl system has defeated some of the best trivia players in the world over a multi-year series of exhibition matches. Throughout this paper, we show that collaborations with the vibrant Quizbowl community have contributed to the high quality of our dataset, led to new research directions, and doubled as an exciting way to engage the public with research in machine learning and natural language processing.
Tasks Question Answering
Published 2019-04-09
URL http://arxiv.org/abs/1904.04792v1
PDF http://arxiv.org/pdf/1904.04792v1.pdf
PWC https://paperswithcode.com/paper/quizbowl-the-case-for-incremental-question
Repo
Framework
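
The "what to answer / when to answer" split can be sketched as a guesser that re-scores after every new clue plus a buzzer that fires once its confidence clears a threshold. The scoring function and threshold below are placeholders for illustration only, not the authors' system.

```python
from typing import Callable, Iterable, Optional, Tuple

def play_incrementally(
    clues: Iterable[str],
    guesser: Callable[[str], Tuple[str, float]],
    threshold: float = 0.8,
) -> Optional[Tuple[int, str]]:
    """Feed the question one clue at a time; buzz as soon as the guesser is confident enough."""
    seen = ""
    for i, clue in enumerate(clues, start=1):
        seen += " " + clue
        guess, confidence = guesser(seen)   # "what to answer"
        if confidence >= threshold:         # "when to answer"
            return i, guess                 # buzz after i clues with this answer
    return None                             # never confident enough: withhold

# Toy guesser: confidence grows with how much of the question has been revealed.
toy = lambda text: ("Alexander Hamilton", min(1.0, len(text) / 200))
print(play_incrementally(["An obscure clue about the Federalist Papers."] * 6, toy))
```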

Assessing incrementality in sequence-to-sequence models

Title Assessing incrementality in sequence-to-sequence models
Authors Dennis Ulmer, Dieuwke Hupkes, Elia Bruni
Abstract Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. In particular, because past representations can be revisited at any point in time, attention-centric methods seem to lack an incentive to build up incrementally more informative representations of incoming sentences. This way of processing stands in stark contrast with the way in which humans are believed to process language: continuously and rapidly integrating new information as it is encountered. In this work, we propose three novel metrics to assess the behavior of RNNs with and without an attention mechanism and identify key differences in the way the different model types process sentences.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03293v1
PDF https://arxiv.org/pdf/1906.03293v1.pdf
PWC https://paperswithcode.com/paper/assessing-incrementality-in-sequence-to
Repo
Framework

Adversarial Example Detection and Classification With Asymmetrical Adversarial Training

Title Adversarial Example Detection and Classification With Asymmetrical Adversarial Training
Authors Xuwang Yin, Soheil Kolouri, Gustavo K. Rohde
Abstract The vulnerabilities of deep neural networks to adversarial examples have become a significant concern for deploying these models in sensitive domains. Devising a definitive defense against such attacks has proven challenging, and methods that rely on detecting adversarial samples are only valid when the attacker is oblivious to the detection mechanism. In this paper we first present an adversarial example detection method that provides a performance guarantee against norm-constrained adversaries. The method is based on the idea of training adversarially robust subspace detectors using asymmetrical adversarial training (AAT). The novel AAT objective presents a minimax problem similar to that of GANs; it has the same convergence property and consequently supports the learning of class-conditional distributions. We first demonstrate that the minimax problem can be reasonably solved by the PGD attack, and then use the learned class-conditional generative models to define generative detection/classification models that are both robust and more interpretable. We provide comprehensive evaluations of the above methods and demonstrate their competitive performance and compelling properties on adversarial detection and robust classification problems.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.11475v2
PDF https://arxiv.org/pdf/1905.11475v2.pdf
PWC https://paperswithcode.com/paper/divide-and-conquer-adversarial-detection
Repo
Framework
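
For reference, the PGD step used to approximately solve the inner maximization looks like the standard L-infinity projected gradient ascent below. The model, epsilon, and step sizes are placeholders, and this is the generic attack rather than the paper's full AAT objective.

```python
import torch

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD: ascend the loss within an eps-ball around the clean input x."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    x_adv = x_adv.clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                 # gradient ascent step
            x_adv = x.clone() + (x_adv - x).clamp(-eps, eps)    # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Usage sketch: x, y would be a batch of images/labels and model a classifier or detector.
# x_adv = pgd_attack(model, x, y)
```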

Generating Personalized Recipes from Historical User Preferences

Title Generating Personalized Recipes from Historical User Preferences
Authors Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley
Abstract Existing approaches to recipe generation are unable to create recipes for users with culinary preferences but incomplete knowledge of ingredients in specific dishes. We propose a new task of personalized recipe generation to help these users: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user’s historical preferences. We attend on technique- and recipe-level representations of a user’s previously consumed recipes, fusing these ‘user-aware’ representations in an attention fusion layer to control recipe text generation. Experiments on a new dataset of 180K recipes and 700K interactions show our model’s ability to generate plausible and personalized recipes compared to non-personalized baselines.
Tasks Recipe Generation, Text Generation
Published 2019-08-31
URL https://arxiv.org/abs/1909.00105v1
PDF https://arxiv.org/pdf/1909.00105v1.pdf
PWC https://paperswithcode.com/paper/generating-personalized-recipes-from
Repo
Framework
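
An attention-fusion layer of the kind described can be sketched as follows: attend over embeddings of a user's past recipes with the decoder state as the query, and concatenate the resulting user-aware context onto the state that conditions generation. Dimensions and naming are illustrative, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse representations of previously consumed recipes into the decoder state."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)

    def forward(self, decoder_state, history):               # (B, D), (B, T, D)
        q = self.query(decoder_state).unsqueeze(1)            # (B, 1, D)
        scores = torch.bmm(q, history.transpose(1, 2))        # (B, 1, T) attention logits
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, history).squeeze(1)      # (B, D) user-aware context
        return torch.cat([decoder_state, context], dim=-1)    # (B, 2D) conditions the decoder

fusion = AttentionFusion(dim=64)
state = torch.randn(8, 64)             # current decoder hidden state
past_recipes = torch.randn(8, 12, 64)  # embeddings of 12 previously consumed recipes
print(fusion(state, past_recipes).shape)  # torch.Size([8, 128])
```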

J-Net: Randomly weighted U-Net for audio source separation

Title J-Net: Randomly weighted U-Net for audio source separation
Authors Bo-Wen Chen, Yen-Min Hsu, Hung-Yi Lee
Abstract Several results in the computer vision literature have shown the potential of randomly weighted neural networks. While they perform fairly well as feature extractors for discriminative tasks, a positive correlation exists between their performance and that of their fully trained counterparts. Motivated by these findings, we pose two questions: what is the value of randomly weighted networks in difficult generative audio tasks such as audio source separation, and does such a positive correlation still exist between large random networks and their trained counterparts? In this paper, we demonstrate that the positive correlation does still exist. Based on this finding, we can try out different architecture designs or tricks without training the whole model. Meanwhile, we report the surprising result that, compared to fixing the encoder (down-sample path) of Wave-U-Net to random weights, fixing the decoder (up-sample path) instead yields better performance, almost comparable to the fully trained model.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.12926v1
PDF https://arxiv.org/pdf/1911.12926v1.pdf
PWC https://paperswithcode.com/paper/j-net-randomly-weighted-u-net-for-audio
Repo
Framework
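
The "random decoder" experiment boils down to freezing the up-sampling path at initialization and optimizing only the encoder. The toy PyTorch encoder-decoder below shows the mechanics (requires_grad=False plus filtering the optimizer's parameters); it is a stand-in, not the Wave-U-Net architecture itself.

```python
import torch
import torch.nn as nn

# Toy 1-D encoder-decoder standing in for a Wave-U-Net-like model (architecture is illustrative).
encoder = nn.Sequential(nn.Conv1d(1, 16, 15, stride=4, padding=7), nn.ReLU(),
                        nn.Conv1d(16, 32, 15, stride=4, padding=7), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose1d(32, 16, 16, stride=4, padding=6), nn.ReLU(),
                        nn.ConvTranspose1d(16, 1, 16, stride=4, padding=6))

# Keep the decoder (up-sample path) at its random initialization.
for p in decoder.parameters():
    p.requires_grad = False

# Only encoder parameters are passed to the optimizer.
opt = torch.optim.Adam((p for p in encoder.parameters() if p.requires_grad), lr=1e-3)

mix = torch.randn(2, 1, 1024)        # dummy mixture waveform
target = torch.randn(2, 1, 1024)     # dummy source to separate
out = decoder(encoder(mix))
loss = nn.functional.mse_loss(out, target)
loss.backward()                       # gradients reach the encoder only
opt.step()
print(out.shape, float(loss))
```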

Structuring an unordered text document

Title Structuring an unordered text document
Authors Shashank Yadav, Tejas Shimpi, C. Ravindranath Chowdary, Prashant Sharma, Deepansh Agrawal, Shivang Agarwal
Abstract Segmenting an unordered text document into different sections is a very useful task in many text processing applications, such as multi-document summarization and question answering. This paper proposes a method for structuring an unordered text document based on the keywords it contains. We test our approach on Wikipedia documents using both statistical and predictive methods, such as the TextRank algorithm and Google’s USE (Universal Sentence Encoder). Our experimental results show that the proposed model can effectively structure an unordered document into sections.
Tasks Document Summarization, Question Answering
Published 2019-01-29
URL http://arxiv.org/abs/1901.10133v1
PDF http://arxiv.org/pdf/1901.10133v1.pdf
PWC https://paperswithcode.com/paper/structuring-an-unordered-text-document
Repo
Framework
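
A crude stand-in for the sectioning pipeline described: embed sentences (here with TF-IDF rather than Google's USE), cluster them, and treat each cluster as a candidate section. This only illustrates the shape of the pipeline, not the paper's TextRank/USE-based method.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The hippocampus is involved in memory consolidation.",
    "Gradient descent updates parameters against the loss gradient.",
    "Sleep deprivation impairs memory formation.",
    "Learning rates control the size of gradient descent steps.",
]

# Embed sentences and group them into candidate sections.
X = TfidfVectorizer(stop_words="english").fit_transform(sentences)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

sections = {}
for sent, lab in zip(sentences, labels):
    sections.setdefault(lab, []).append(sent)
for lab, sents in sections.items():
    print(f"Section {lab}:", sents)
```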

Learning Numeracy: Binary Arithmetic with Neural Turing Machines

Title Learning Numeracy: Binary Arithmetic with Neural Turing Machines
Authors Jacopo Castellini
Abstract One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-range information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external portion of memory, in which information can be stored and manipulated later on. The whole mechanism is differentiable end-to-end, allowing the network to learn how to utilise this long-term memory via stochastic gradient descent. This allows NTMs to infer simple algorithms directly from data sequences. Nonetheless, the model can be hard to train due to its large number of parameters and interacting components, and little related work exists. In this work we use NTMs to learn and generalise two arithmetical tasks: binary addition and multiplication. These tasks are two fundamental algorithmic examples in computer science and are considerably more challenging than those explored previously; with them, we aim to shed some light on the real capabilities of this neural model.
Tasks
Published 2019-04-04
URL https://arxiv.org/abs/1904.02478v2
PDF https://arxiv.org/pdf/1904.02478v2.pdf
PWC https://paperswithcode.com/paper/learning-numeracy-binary-arithmetic-with
Repo
Framework
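
The binary-addition task itself is easy to materialize as sequence data: encode two operands bit by bit and ask the model to emit the bits of their sum. The encoding below (least-significant bit first, one extra target bit for the carry) is one plausible format, not necessarily the one used in the paper.

```python
import numpy as np

def binary_addition_example(n_bits=8, rng=np.random.default_rng(0)):
    """One training pair: inputs are the two operands' bits (LSB first), target is the sum's bits."""
    a = int(rng.integers(0, 2 ** n_bits))
    b = int(rng.integers(0, 2 ** n_bits))
    to_bits = lambda v, width: np.array([(v >> i) & 1 for i in range(width)], dtype=np.float32)
    x = np.stack([to_bits(a, n_bits), to_bits(b, n_bits)], axis=1)   # (n_bits, 2) input sequence
    y = to_bits(a + b, n_bits + 1)                                   # (n_bits + 1,) target sequence
    return x, y, (a, b)

x, y, (a, b) = binary_addition_example()
print(f"{a} + {b} = {a + b}")
print("input sequence shape:", x.shape, "target bits:", y.astype(int))
```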

A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience

Title A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience
Authors Aritra Mitra, John A. Richards, Shreyas Sundaram
Abstract We study a setting where a group of agents, each receiving partially informative private signals, seek to collaboratively learn the true underlying state of the world (from a finite set of hypotheses) that generates their joint observation profiles. To solve this problem, we propose a distributed learning rule that differs fundamentally from existing approaches, in that it does not employ any form of “belief-averaging”. Instead, agents update their beliefs based on a min-rule. Under standard assumptions on the observation model and the network structure, we establish that each agent learns the truth asymptotically almost surely. As our main contribution, we prove that with probability 1, each false hypothesis is ruled out by every agent exponentially fast at a network-independent rate that is strictly larger than existing rates. We then develop a computationally-efficient variant of our learning rule that is provably resilient to agents who do not behave as expected (as represented by a Byzantine adversary model) and deliberately try to spread misinformation.
Tasks
Published 2019-07-05
URL https://arxiv.org/abs/1907.03588v1
PDF https://arxiv.org/pdf/1907.03588v1.pdf
PWC https://paperswithcode.com/paper/a-new-approach-to-distributed-hypothesis
Repo
Framework
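
The flavor of the min-rule can be seen in a small simulation: each agent reweights the minimum of its neighborhood's current beliefs by its local observation likelihoods, instead of averaging beliefs. This is a simplified sketch of the idea, not the exact update, adversary model, or guarantees analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_hyp, T = 4, 3, 200
true_h = 0

# Ring network: each agent pools over itself and its two neighbors.
neighbors = [[i, (i - 1) % n_agents, (i + 1) % n_agents] for i in range(n_agents)]

# Each agent's observation likelihoods per hypothesis (partially informative private signals).
like = rng.uniform(0.2, 0.8, size=(n_agents, n_hyp, 2))
like /= like.sum(axis=2, keepdims=True)

beliefs = np.full((n_agents, n_hyp), 1.0 / n_hyp)
for t in range(T):
    obs = (rng.random(n_agents) < like[np.arange(n_agents), true_h, 1]).astype(int)
    new = np.empty_like(beliefs)
    for i in range(n_agents):
        pooled = beliefs[neighbors[i]].min(axis=0)     # min over the neighborhood, not an average
        new[i] = like[i, :, obs[i]] * pooled           # local Bayesian-style reweighting
        new[i] /= new[i].sum()
    beliefs = new

print("beliefs on the true hypothesis:", np.round(beliefs[:, true_h], 3))
```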

A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data

Title A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data
Authors Sebastian Pölsterl, Ignacio Sarasua, Benjamín Gutiérrez-Becker, Christian Wachinger
Abstract We introduce a wide and deep neural network for prediction of progression from patients with mild cognitive impairment to Alzheimer’s disease. Information from anatomical shape and tabular clinical data (demographics, biomarkers) are fused in a single neural network. The network is invariant to shape transformations and avoids the need to identify point correspondences between shapes. To account for right censored time-to-event data, i.e., when it is only known that a patient did not develop Alzheimer’s disease up to a particular time point, we employ a loss commonly used in survival analysis. Our network is trained end-to-end to combine information from a patient’s hippocampus shape and clinical biomarkers. Our experiments on data from the Alzheimer’s Disease Neuroimaging Initiative demonstrate that our proposed model is able to learn a shape descriptor that augments clinical biomarkers and outperforms a deep neural network on shape alone and a linear model on common clinical biomarkers.
Tasks Survival Analysis
Published 2019-09-09
URL https://arxiv.org/abs/1909.03890v1
PDF https://arxiv.org/pdf/1909.03890v1.pdf
PWC https://paperswithcode.com/paper/a-wide-and-deep-neural-network-for-survival
Repo
Framework
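
The survival loss mentioned is typically the negative Cox partial log-likelihood over right-censored times; a minimal NumPy version is below, with random risk scores standing in for the network's output. This is the standard loss shown for orientation, not the authors' exact implementation.

```python
import numpy as np

def cox_neg_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood for right-censored data.

    risk  : model output (higher = higher hazard), shape (n,)
    time  : observed time-to-event or censoring time, shape (n,)
    event : 1 if conversion to Alzheimer's was observed, 0 if right-censored, shape (n,)
    """
    order = np.argsort(-time)                   # sort by descending time
    risk, event = risk[order], event[order]
    log_cumsum = np.logaddexp.accumulate(risk)  # log sum_{j: t_j >= t_i} exp(risk_j)
    return -np.sum((risk - log_cumsum) * event) / max(event.sum(), 1)

rng = np.random.default_rng(0)
n = 16
risk = rng.normal(size=n)                        # stand-in for the network's risk scores
time = rng.exponential(scale=5.0, size=n)
event = rng.integers(0, 2, size=n).astype(float)
print("loss:", cox_neg_log_likelihood(risk, time, event))
```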