Paper Group ANR 278
Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features
Title | Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features |
Authors | Rafal Pilarczyk, Xin Chang, Wladyslaw Skarbek |
Abstract | Several computer algorithms for recognizing visible human emotions are compared in a web-camera scenario using the CNN/MMOD face detector. The recognition covers four face expressions: smile, surprise, anger, and neutral. At the feature extraction stage, three concepts of face description are confronted: (a) static 2D face geometry represented by its 68 characteristic landmarks (FP68); (b) dynamic 3D geometry defined by motion parameters for eight distinguished face parts (denoted AU8) of the personalized Candide-3 model; (c) static 2D visual description as a 2D array of gray-scale pixels (the facial raw image). At the classification stage, the performance of two major models is analyzed: (a) a support vector machine (SVM) with kernel options; (b) a convolutional neural network (CNN) with a variety of relevant tensor-processing layers and blocks. The models are trained on frontal views of human faces and tested on arbitrary head poses. For geometric features, the success rate (accuracy) indicates a nearly threefold performance advantage of CNN over SVM classifiers. For raw images, the CNN outperforms its best geometric counterpart (AU/CNN) in accuracy by about 30 percent, while the best SVM solutions are inferior by nearly a factor of four. A similarly large advantage of raw/CNN over geometric/CNN and geometric/SVM is observed for F-score. We conclude that, in contrast to CNN-based emotion classifiers, SVM-based classifiers generalize poorly with respect to head pose. |
Tasks | |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11179v1 |
http://arxiv.org/pdf/1901.11179v1.pdf | |
PWC | https://paperswithcode.com/paper/human-face-expressions-from-images-2d-face |
Repo | |
Framework | |
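The abstract above contrasts geometric FP68 features with raw images. Raw landmark coordinates depend on where and how large the face appears in the frame, so a pose-normalization step typically precedes classification; a minimal sketch (the centering-and-scaling scheme is an illustrative assumption, not taken from the paper):

```python
def normalize_landmarks(points):
    """Center (x, y) landmarks and rescale them to unit RMS radius,
    removing translation and scale before feature extraction."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    rms = (sum(x * x + y * y for x, y in centered) / n) ** 0.5 or 1.0
    return [(x / rms, y / rms) for x, y in centered]
```

The flattened 136-dimensional result would then feed an SVM or CNN classifier.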
Conditionally-additive-noise Models for Structure Learning
Title | Conditionally-additive-noise Models for Structure Learning |
Authors | Daniel Chicharro, Stefano Panzeri, Ilya Shpitser |
Abstract | Constraint-based structure learning algorithms infer the causal structure of multivariate systems from observational data by determining an equivalence class of causal structures compatible with the conditional independencies in the data. Methods based on additive-noise (AN) models have been proposed to further discriminate between causal structures that are equivalent in terms of conditional independencies. These methods rely on a particular form of the generative functional equations, with an additive noise structure, which allows inferring the directionality of causation by testing the independence between the residuals of a nonlinear regression and the predictors (nrr-independencies). Full causal structure identifiability has been proven for systems that contain only additive-noise equations and have no hidden variables. We extend the AN framework in several ways. We introduce alternative regression-free tests of independence based on conditional variances (cv-independencies). We consider conditionally-additive-noise (CAN) models, in which the equations may have the AN form only after conditioning. We exploit asymmetries in nrr-independencies or cv-independencies resulting from the CAN form to derive a criterion that infers the causal relation between a pair of variables in a multivariate system without any assumption about the form of the equations or the presence of hidden variables. |
Tasks | |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08360v1 |
https://arxiv.org/pdf/1905.08360v1.pdf | |
PWC | https://paperswithcode.com/paper/conditionally-additive-noise-models-for |
Repo | |
Framework | |
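A cv-independence test as described in the abstract asks, roughly, whether the variance of a residual is constant across values of a predictor. A toy, binned version of that check (equal-frequency binning is an assumption for illustration):

```python
import statistics

def conditional_variances(x, resid, n_bins=4):
    """Sort residuals by their predictor value, split into equal-size
    bins, and return Var(resid | bin); roughly constant values are
    consistent with the residual noise being independent of x."""
    pairs = sorted(zip(x, resid))
    size = len(pairs) // n_bins
    return [statistics.pvariance([r for _, r in pairs[i * size:(i + 1) * size]])
            for i in range(n_bins)]
```

In a real test the spread of these variances would be compared against a calibrated null, not eyeballed.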
Regularized Hierarchical Policies for Compositional Transfer in Robotics
Title | Regularized Hierarchical Policies for Compositional Transfer in Robotics |
Authors | Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller |
Abstract | The successful application of flexible, general learning algorithms – such as deep reinforcement learning – to real-world robotics applications is often limited by their poor data-efficiency. Domains with more than a single dominant task of interest encourage algorithms that share partial solutions across tasks to limit the required experiment time. We develop and investigate simple hierarchical inductive biases – in the form of structured policies – as a mechanism for knowledge transfer across tasks in reinforcement learning (RL). To leverage the power of these structured policies we design an RL algorithm that enables stable and fast learning. We demonstrate the success of our method both in simulated robot environments (locomotion and manipulation domains) and in real robot experiments, demonstrating substantially better data-efficiency than competitive baselines. |
Tasks | Transfer Learning |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11228v2 |
https://arxiv.org/pdf/1906.11228v2.pdf | |
PWC | https://paperswithcode.com/paper/regularized-hierarchical-policies-for |
Repo | |
Framework | |
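A "structured policy" in the sense of the abstract above can be as simple as shared low-level components mixed by a high-level gate, with the shared components carrying knowledge across tasks. A toy sketch (the additive mixture form is an assumption; the paper's actual policy class is richer):

```python
def hierarchical_policy(state, gate, components):
    """Mixture-of-skills sketch: a high-level gate produces weights
    over shared low-level components; reusing the components across
    tasks is the transfer mechanism."""
    weights = gate(state)
    return sum(w * c(state) for w, c in zip(weights, components))
```

Per-task learning would then fit a new gate while keeping (or fine-tuning) the shared components.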
Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms
Title | Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms |
Authors | Liqun Shao, Yiwen Zhu, Abhiram Eswaran, Kristin Lieber, Janhavi Mahajan, Minsoo Thigpen, Sudhir Darbha, Siqi Liu, Subru Krishnan, Soundar Srinivasan, Carlo Curino, Konstantinos Karanasos |
Abstract | Microsoft’s internal big data analytics platform comprises hundreds of thousands of machines, serving over half a million jobs daily from thousands of users. The majority of these jobs are recurring and are crucial for the company’s operation. Although administrators spend significant effort tuning system performance, some jobs inevitably experience slowdowns, i.e., their execution time degrades over previous runs. Currently, the investigation of such slowdowns is a labor-intensive and error-prone process, which costs Microsoft significant human and machine resources, and negatively impacts several lines of business. In this work, we present Griffon, a system we built and deployed in production last year to automatically discover the root cause of job slowdowns. Existing solutions either rely on labeled data (i.e., resolved incidents with labeled reasons for job slowdowns), which is in most cases non-existent or non-trivial to acquire, or on time-series analysis of individual metrics that do not target specific jobs holistically. In contrast, in Griffon we cast the problem to a corresponding regression one that predicts the runtime of a job, and show how the relative contributions of the features used to train our interpretable model can be exploited to rank the potential causes of job slowdowns. Evaluated over historical incidents, we show that Griffon discovers slowdown causes that are consistent with the ones validated by domain-expert engineers, in a fraction of the time required by them. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.09048v1 |
https://arxiv.org/pdf/1908.09048v1.pdf | |
PWC | https://paperswithcode.com/paper/griffon-reasoning-about-job-anomalies-with |
Repo | |
Framework | |
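The ranking step the abstract describes — exploiting relative feature contributions of an interpretable runtime model — can be sketched for a plain linear model (the contribution formula and feature names below are illustrative assumptions, not Griffon's actual model):

```python
def rank_slowdown_causes(weights, current, baseline):
    """For a linear runtime model, score each feature by how much its
    change from a healthy baseline run moves the predicted runtime,
    then rank descending as candidate slowdown causes."""
    contribs = {name: weights[name] * (current[name] - baseline[name])
                for name in weights}
    return sorted(contribs.items(), key=lambda kv: kv[1], reverse=True)
```

The top-ranked features are the ones an engineer would investigate first.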
Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism
Title | Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism |
Authors | Kevin K. Lin, Fei Lu |
Abstract | First-principles models of complex dynamic phenomena often have many degrees of freedom, only a small fraction of which may be scientifically relevant or observable. Reduced models distill such phenomena to their essence by modeling only relevant variables, thus decreasing computational cost and clarifying dynamical mechanisms. Here, we consider data-driven model reduction for nonlinear dynamical systems without sharp scale separation. Motivated by a discrete-time version of the Mori-Zwanzig projection operator formalism and the Wiener filter, we propose a simple and flexible mathematical formulation based on Wiener projection, which decomposes a nonlinear dynamical system into a component predictable by past values of relevant variables and its orthogonal complement. Wiener projection is equally applicable to deterministic chaotic dynamics and randomly-forced systems, and provides a natural starting point for systematic approximations. In particular, we use it to derive NARMAX models from an underlying dynamical system, thereby clarifying the scope of these widely-used tools in time series analysis. We illustrate its versatility on the Kuramoto-Sivashinsky model of spatiotemporal chaos and a stochastic Burgers equation. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07725v3 |
https://arxiv.org/pdf/1908.07725v3.pdf | |
PWC | https://paperswithcode.com/paper/190807725 |
Repo | |
Framework | |
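At its simplest, the Wiener projection described above splits a scalar series into a part predictable from its past and an orthogonal residual. A one-lag least-squares sketch (real applications use many lags and vector-valued observables):

```python
def wiener_ar1(series):
    """Fit x[t+1] ~ a * x[t] by least squares (through the origin) and
    return the coefficient plus the orthogonal residual series."""
    xs, ys = series[:-1], series[1:]
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    resid = [y - a * x for x, y in zip(xs, ys)]
    return a, resid
```

A reduced model then evolves the predictable part and treats the residual as noise to be modeled, e.g. in NARMAX form.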
Advancing subgroup fairness via sleeping experts
Title | Advancing subgroup fairness via sleeping experts |
Authors | Avrim Blum, Thodoris Lykouris |
Abstract | We study methods for improving fairness to subgroups in settings with overlapping populations and sequential predictions. Classical notions of fairness focus on the balance of some property across different populations. However, in many applications the goal of the different groups is not to be predicted equally but rather to be predicted well. We demonstrate that the task of satisfying this guarantee for multiple overlapping groups is not straightforward, and show that even for the simple objective of the unweighted average of false-negative and false-positive rates, satisfying it for overlapping populations can be statistically impossible even when we are provided predictors that perform well separately on each subgroup. On the positive side, we show that when individuals are equally important to the different groups they belong to, this goal is achievable; to do so, we draw a connection to the sleeping-experts literature in online learning. Motivated by the one-sided feedback in natural settings of interest, we extend our results to such a feedback model. We also provide a game-theoretic interpretation of our results, examining the incentives of participants to join the system and to provide the system full information about predictors they may possess. We end with several interesting open problems concerning the strength of guarantees that can be achieved in a computationally efficient manner. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08375v2 |
https://arxiv.org/pdf/1909.08375v2.pdf | |
PWC | https://paperswithcode.com/paper/advancing-subgroup-fairness-via-sleeping |
Repo | |
Framework | |
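The sleeping-experts connection above has a simple algorithmic core: in each round only "awake" experts predict and are charged loss, while asleep experts keep their weight. A minimal multiplicative-weights step (the learning rate and update form are standard textbook choices, not taken from the paper):

```python
import math

def sleeping_experts_step(weights, awake, losses, eta=0.5):
    """One update: awake experts are reweighted by exp(-eta * loss);
    asleep experts are untouched, so sitting out costs them nothing."""
    return {e: w * math.exp(-eta * losses[e]) if e in awake else w
            for e, w in weights.items()}
```

In the subgroup-fairness setting, a group-specific predictor is "awake" exactly on individuals belonging to that group.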
Quizbowl: The Case for Incremental Question Answering
Title | Quizbowl: The Case for Incremental Question Answering |
Authors | Pedro Rodriguez, Shi Feng, Mohit Iyyer, He He, Jordan Boyd-Graber |
Abstract | Quizbowl is a scholastic trivia competition that tests human knowledge and intelligence; additionally, it supports diverse research in question answering (QA). A Quizbowl question consists of multiple sentences whose clues are arranged by difficulty (from obscure to obvious) and uniquely identify a well-known entity such as those found on Wikipedia. Since players can answer the question at any time, an elite player (human or machine) demonstrates its superiority by answering correctly given as few clues as possible. We make two key contributions to machine learning research through Quizbowl: (1) collecting and curating a large factoid QA dataset and an accompanying gameplay dataset, and (2) developing a computational approach to playing Quizbowl that involves determining both what to answer and when to answer. Our Quizbowl system has defeated some of the best trivia players in the world over a multi-year series of exhibition matches. Throughout this paper, we show that collaborations with the vibrant Quizbowl community have contributed to the high quality of our dataset, led to new research directions, and doubled as an exciting way to engage the public with research in machine learning and natural language processing. |
Tasks | Question Answering |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04792v1 |
http://arxiv.org/pdf/1904.04792v1.pdf | |
PWC | https://paperswithcode.com/paper/quizbowl-the-case-for-incremental-question |
Repo | |
Framework | |
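"When to answer" in the abstract above reduces, in its simplest form, to a buzzing policy over incrementally revealed clues. A threshold sketch (the confidence scores and threshold are hypothetical; the actual system's buzzer is learned):

```python
def buzz_position(clue_confidences, threshold=0.8):
    """Scan per-clue answer confidences in reveal order and buzz at the
    first clue that clears the threshold; otherwise guess at the end."""
    for i, conf in enumerate(clue_confidences):
        if conf >= threshold:
            return i
    return len(clue_confidences) - 1
```

Earlier buzzes score more highly against strong opponents but risk answering on obscure clues.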
Assessing incrementality in sequence-to-sequence models
Title | Assessing incrementality in sequence-to-sequence models |
Authors | Dennis Ulmer, Dieuwke Hupkes, Elia Bruni |
Abstract | Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. In particular, because past representations can be revisited at any point in time, attention-centric methods seem to lack an incentive to build up incrementally more informative representations of incoming sentences. This way of processing stands in stark contrast with the way in which humans are believed to process language: continuously and rapidly integrating new information as it is encountered. In this work, we propose three novel metrics to assess the behavior of RNNs with and without an attention mechanism and identify key differences in the way the different model types process sentences. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03293v1 |
https://arxiv.org/pdf/1906.03293v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-incrementality-in-sequence-to |
Repo | |
Framework | |
Adversarial Example Detection and Classification With Asymmetrical Adversarial Training
Title | Adversarial Example Detection and Classification With Asymmetrical Adversarial Training |
Authors | Xuwang Yin, Soheil Kolouri, Gustavo K. Rohde |
Abstract | The vulnerabilities of deep neural networks to adversarial examples have become a significant concern for deploying these models in sensitive domains. Devising a definitive defense against such attacks has proven challenging, and methods that rely on detecting adversarial samples are only valid when the attacker is oblivious to the detection mechanism. In this paper we first present an adversarial example detection method that provides a performance guarantee against norm-constrained adversaries. The method is based on the idea of training adversarially robust subspace detectors using asymmetrical adversarial training (AAT). The novel AAT objective presents a minimax problem similar to that of GANs; it has the same convergence property and consequently supports the learning of class-conditional distributions. We first demonstrate that the minimax problem can be reasonably solved by a PGD attack, and then use the learned class-conditional generative models to define generative detection/classification models that are both robust and more interpretable. We provide comprehensive evaluations of the above methods and demonstrate their competitive performance and compelling properties on adversarial detection and robust classification problems. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11475v2 |
https://arxiv.org/pdf/1905.11475v2.pdf | |
PWC | https://paperswithcode.com/paper/divide-and-conquer-adversarial-detection |
Repo | |
Framework | |
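The abstract above solves the AAT inner problem with a PGD attack. In one dimension the projection logic is easy to see; a toy sketch of signed-gradient ascent inside an L-infinity ball (the quadratic loss used in the test is purely illustrative — the paper applies PGD to image inputs):

```python
def pgd_attack(x0, grad_loss, eps=0.3, step=0.1, iters=10):
    """Maximize a scalar loss by repeated signed-gradient steps,
    projecting back into the L-infinity ball of radius eps around x0."""
    x = x0
    for _ in range(iters):
        x = x + step * (1.0 if grad_loss(x) > 0 else -1.0)
        x = max(x0 - eps, min(x0 + eps, x))  # project onto the ball
    return x
```

For images, the scalar clamp becomes an element-wise clip of the perturbed tensor.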
Generating Personalized Recipes from Historical User Preferences
Title | Generating Personalized Recipes from Historical User Preferences |
Authors | Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley |
Abstract | Existing approaches to recipe generation are unable to create recipes for users with culinary preferences but incomplete knowledge of ingredients in specific dishes. We propose a new task of personalized recipe generation to help these users: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user’s historical preferences. We attend on technique- and recipe-level representations of a user’s previously consumed recipes, fusing these ‘user-aware’ representations in an attention fusion layer to control recipe text generation. Experiments on a new dataset of 180K recipes and 700K interactions show our model’s ability to generate plausible and personalized recipes compared to non-personalized baselines. |
Tasks | Recipe Generation, Text Generation |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00105v1 |
https://arxiv.org/pdf/1909.00105v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-personalized-recipes-from |
Repo | |
Framework | |
J-Net: Randomly weighted U-Net for audio source separation
Title | J-Net: Randomly weighted U-Net for audio source separation |
Authors | Bo-Wen Chen, Yen-Min Hsu, Hung-Yi Lee |
Abstract | Several results in the computer vision literature have shown the potential of randomly weighted neural networks. While they perform fairly well as feature extractors for discriminative tasks, a positive correlation exists between their performance and that of their fully trained counterparts. Following these discoveries, we pose two questions: what is the value of randomly weighted networks in difficult generative audio tasks such as audio source separation, and does such a positive correlation still exist for large random networks and their trained counterparts? In this paper, we demonstrate that the positive correlation still exists. Based on this discovery, we can try out different architecture designs or tricks without training the whole model. Meanwhile, we find the surprising result that, in comparison to the non-trained encoder (down-sample path) in Wave-U-Net, fixing the decoder (up-sample path) to random weights yields better performance, almost comparable to the fully trained model. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.12926v1 |
https://arxiv.org/pdf/1911.12926v1.pdf | |
PWC | https://paperswithcode.com/paper/j-net-randomly-weighted-u-net-for-audio |
Repo | |
Framework | |
Structuring an unordered text document
Title | Structuring an unordered text document |
Authors | Shashank Yadav, Tejas Shimpi, C. Ravindranath Chowdary, Prashant Sharma, Deepansh Agrawal, Shivang Agarwal |
Abstract | Segmenting an unordered text document into different sections is a very useful task in many text processing applications like multiple document summarization, question answering, etc. This paper proposes structuring of an unordered text document based on the keywords in the document. We test our approach on Wikipedia documents using both statistical and predictive methods such as the TextRank algorithm and Google’s USE (Universal Sentence Encoder). From our experimental results, we show that the proposed model can effectively structure an unordered document into sections. |
Tasks | Document Summarization, Question Answering |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10133v1 |
http://arxiv.org/pdf/1901.10133v1.pdf | |
PWC | https://paperswithcode.com/paper/structuring-an-unordered-text-document |
Repo | |
Framework | |
Learning Numeracy: Binary Arithmetic with Neural Turing Machines
Title | Learning Numeracy: Binary Arithmetic with Neural Turing Machines |
Authors | Jacopo Castellini |
Abstract | One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external portion of memory, in which information can be stored and manipulated later on. The whole mechanism is differentiable end-to-end, allowing the network to learn how to utilise this long-term memory via stochastic gradient descent. This allows NTMs to infer simple algorithms directly from data sequences. Nonetheless, the model can be hard to train due to its large number of parameters and interacting components, and little related work exists. In this work we use NTMs to learn and generalise two arithmetical tasks: binary addition and multiplication. These tasks are two fundamental algorithmic examples in computer science, and are considerably more challenging than those previously explored, through which we aim to shed light on the real capabilities of this neural model. |
Tasks | |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02478v2 |
https://arxiv.org/pdf/1904.02478v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-numeracy-binary-arithmetic-with |
Repo | |
Framework | |
A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience
Title | A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience |
Authors | Aritra Mitra, John A. Richards, Shreyas Sundaram |
Abstract | We study a setting where a group of agents, each receiving partially informative private signals, seek to collaboratively learn the true underlying state of the world (from a finite set of hypotheses) that generates their joint observation profiles. To solve this problem, we propose a distributed learning rule that differs fundamentally from existing approaches, in that it does not employ any form of “belief-averaging”. Instead, agents update their beliefs based on a min-rule. Under standard assumptions on the observation model and the network structure, we establish that each agent learns the truth asymptotically almost surely. As our main contribution, we prove that with probability 1, each false hypothesis is ruled out by every agent exponentially fast at a network-independent rate that is strictly larger than existing rates. We then develop a computationally-efficient variant of our learning rule that is provably resilient to agents who do not behave as expected (as represented by a Byzantine adversary model) and deliberately try to spread misinformation. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.03588v1 |
https://arxiv.org/pdf/1907.03588v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-approach-to-distributed-hypothesis |
Repo | |
Framework | |
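The min-rule above replaces the usual belief averaging across neighbors. A single-round toy update for one agent (the exact combination with the private likelihood here is a simplification of the paper's rule, shown only to illustrate the min aggregation):

```python
def min_rule_update(likelihoods, neighbor_beliefs):
    """Take the minimum neighbor belief per hypothesis, reweight by the
    agent's private likelihood for its latest signal, and normalize."""
    mins = {h: min(b[h] for b in neighbor_beliefs) for h in likelihoods}
    unnorm = {h: likelihoods[h] * mins[h] for h in mins}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}
```

Intuitively, a false hypothesis is suppressed as soon as any neighbor has strong evidence against it, which is what drives the faster rejection rate.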
A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data
Title | A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data |
Authors | Sebastian Pölsterl, Ignacio Sarasua, Benjamín Gutiérrez-Becker, Christian Wachinger |
Abstract | We introduce a wide and deep neural network for prediction of progression from patients with mild cognitive impairment to Alzheimer’s disease. Information from anatomical shape and tabular clinical data (demographics, biomarkers) are fused in a single neural network. The network is invariant to shape transformations and avoids the need to identify point correspondences between shapes. To account for right censored time-to-event data, i.e., when it is only known that a patient did not develop Alzheimer’s disease up to a particular time point, we employ a loss commonly used in survival analysis. Our network is trained end-to-end to combine information from a patient’s hippocampus shape and clinical biomarkers. Our experiments on data from the Alzheimer’s Disease Neuroimaging Initiative demonstrate that our proposed model is able to learn a shape descriptor that augments clinical biomarkers and outperforms a deep neural network on shape alone and a linear model on common clinical biomarkers. |
Tasks | Survival Analysis |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03890v1 |
https://arxiv.org/pdf/1909.03890v1.pdf | |
PWC | https://paperswithcode.com/paper/a-wide-and-deep-neural-network-for-survival |
Repo | |
Framework | |
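The "loss commonly used in survival analysis" mentioned above is typically the negative log partial likelihood of the Cox model, which handles right-censored patients through risk sets. A plain-Python sketch (averaging over events is a convention choice, and ties are handled naively here):

```python
import math

def cox_partial_loss(risks, times, events):
    """Average negative log partial likelihood: each observed event
    compares its risk score against all patients still at risk at that
    time; censored patients contribute only through risk sets."""
    loss, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored observation: no direct term
        at_risk = sum(math.exp(r) for r, t in zip(risks, times) if t >= t_i)
        loss += math.log(at_risk) - risks[i]
        n_events += 1
    return loss / max(n_events, 1)
```

In the network described above, `risks` would be the scalar outputs of the fused shape-plus-clinical model for a mini-batch of patients.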