Paper Group ANR 278
Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features
Title | Human Face Expressions from Images - 2D Face Geometry and 3D Face Local Motion versus Deep Neural Features |
Authors | Rafal Pilarczyk, Xin Chang, Wladyslaw Skarbek |
Abstract | Several computer algorithms for recognizing visible human emotions are compared in a web-camera scenario using the CNN/MMOD face detector. The recognition covers four face expressions: smile, surprise, anger, and neutral. At the feature extraction stage, three concepts of face description are confronted: (a) static 2D face geometry represented by its 68 characteristic landmarks (FP68); (b) dynamic 3D geometry defined by motion parameters for eight distinguished face parts (denoted AU8) of the personalized Candide-3 model; (c) static 2D visual description as a 2D array of gray-scale pixels (the facial raw image). At the classification stage, the performance of two major models is analyzed: (a) a support vector machine (SVM) with kernel options; (b) a convolutional neural network (CNN) with a variety of relevant tensor-processing layers and blocks. The models are trained on frontal views of human faces and tested on arbitrary head poses. For geometric features, the success rate (accuracy) indicates a nearly threefold performance advantage of CNN over SVM classifiers. For raw images, the CNN outperforms its best geometric counterpart (AU/CNN) in accuracy by about 30 percent, while the best SVM solutions are inferior by nearly a factor of four. A similarly large advantage of raw/CNN over geometric/CNN and geometric/SVM is observed for F-score. We conclude that, in contrast to CNN-based emotion classifiers, SVM-based classifiers generalize poorly with respect to head pose. |
Tasks | |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11179v1 |
http://arxiv.org/pdf/1901.11179v1.pdf | |
PWC | https://paperswithcode.com/paper/human-face-expressions-from-images-2d-face |
Repo | |
Framework | |
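The abstract above contrasts geometric FP68 features with raw images. Raw landmark coordinates depend on where and how large the face appears in the frame, so a pose-normalization step typically precedes classification; a minimal sketch (the centering-and-scaling scheme is an illustrative assumption, not taken from the paper):

```python
def normalize_landmarks(points):
    """Center (x, y) landmarks and rescale them to unit RMS radius,
    removing translation and scale before feature extraction."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    rms = (sum(x * x + y * y for x, y in centered) / n) ** 0.5 or 1.0
    return [(x / rms, y / rms) for x, y in centered]
```

The flattened 136-dimensional result would then feed an SVM or CNN classifier.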
Conditionally-additive-noise Models for Structure Learning
Title | Conditionally-additive-noise Models for Structure Learning |
Authors | Daniel Chicharro, Stefano Panzeri, Ilya Shpitser |
Abstract | Constraint-based structure learning algorithms infer the causal structure of multivariate systems from observational data by determining an equivalence class of causal structures compatible with the conditional independencies in the data. Methods based on additive-noise (AN) models have been proposed to further discriminate between causal structures that are equivalent in terms of conditional independencies. These methods rely on a particular form of the generative functional equations, with an additive noise structure, which allows inferring the directionality of causation by testing the independence between the residuals of a nonlinear regression and the predictors (nrr-independencies). Full causal structure identifiability has been proven for systems that contain only additive-noise equations and have no hidden variables. We extend the AN framework in several ways. We introduce alternative regression-free tests of independence based on conditional variances (cv-independencies). We consider conditionally-additive-noise (CAN) models, in which the equations may have the AN form only after conditioning. We exploit asymmetries in nrr-independencies or cv-independencies resulting from the CAN form to derive a criterion that infers the causal relation between a pair of variables in a multivariate system without any assumption about the form of the equations or the presence of hidden variables. |
Tasks | |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08360v1 |
https://arxiv.org/pdf/1905.08360v1.pdf | |
PWC | https://paperswithcode.com/paper/conditionally-additive-noise-models-for |
Repo | |
Framework | |
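A cv-independence test as described in the abstract asks, roughly, whether the variance of a residual is constant across values of a predictor. A toy, binned version of that check (equal-frequency binning is an assumption for illustration):

```python
import statistics

def conditional_variances(x, resid, n_bins=4):
    """Sort residuals by their predictor value, split into equal-size
    bins, and return Var(resid | bin); roughly constant values are
    consistent with the residual noise being independent of x."""
    pairs = sorted(zip(x, resid))
    size = len(pairs) // n_bins
    return [statistics.pvariance([r for _, r in pairs[i * size:(i + 1) * size]])
            for i in range(n_bins)]
```

In a real test the spread of these variances would be compared against a calibrated null, not eyeballed.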
Regularized Hierarchical Policies for Compositional Transfer in Robotics
Title | Regularized Hierarchical Policies for Compositional Transfer in Robotics |
Authors | Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller |
Abstract | The successful application of flexible, general learning algorithms – such as deep reinforcement learning – to real-world robotics applications is often limited by their poor data-efficiency. Domains with more than a single dominant task of interest encourage algorithms that share partial solutions across tasks to limit the required experiment time. We develop and investigate simple hierarchical inductive biases – in the form of structured policies – as a mechanism for knowledge transfer across tasks in reinforcement learning (RL). To leverage the power of these structured policies we design an RL algorithm that enables stable and fast learning. We demonstrate the success of our method both in simulated robot environments (locomotion and manipulation domains) and in real robot experiments, demonstrating substantially better data-efficiency than competitive baselines. |
Tasks | Transfer Learning |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11228v2 |
https://arxiv.org/pdf/1906.11228v2.pdf | |
PWC | https://paperswithcode.com/paper/regularized-hierarchical-policies-for |
Repo | |
Framework | |
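A "structured policy" in the sense of the abstract above can be as simple as shared low-level components mixed by a high-level gate, with the shared components carrying knowledge across tasks. A toy sketch (the additive mixture form is an assumption; the paper's actual policy class is richer):

```python
def hierarchical_policy(state, gate, components):
    """Mixture-of-skills sketch: a high-level gate produces weights
    over shared low-level components; reusing the components across
    tasks is the transfer mechanism."""
    weights = gate(state)
    return sum(w * c(state) for w, c in zip(weights, components))
```

Per-task learning would then fit a new gate while keeping (or fine-tuning) the shared components.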
Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms
Title | Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms |
Authors | Liqun Shao, Yiwen Zhu, Abhiram Eswaran, Kristin Lieber, Janhavi Mahajan, Minsoo Thigpen, Sudhir Darbha, Siqi Liu, Subru Krishnan, Soundar Srinivasan, Carlo Curino, Konstantinos Karanasos |
Abstract | Microsoft’s internal big data analytics platform comprises hundreds of thousands of machines, serving over half a million jobs daily from thousands of users. The majority of these jobs are recurring and are crucial for the company’s operation. Although administrators spend significant effort tuning system performance, some jobs inevitably experience slowdowns, i.e., their execution time degrades over previous runs. Currently, the investigation of such slowdowns is a labor-intensive and error-prone process, which costs Microsoft significant human and machine resources, and negatively impacts several lines of business. In this work, we present Griffon, a system we built and deployed in production last year to automatically discover the root cause of job slowdowns. Existing solutions either rely on labeled data (i.e., resolved incidents with labeled reasons for job slowdowns), which is in most cases non-existent or non-trivial to acquire, or on time-series analysis of individual metrics that do not target specific jobs holistically. In contrast, in Griffon we cast the problem to a corresponding regression one that predicts the runtime of a job, and show how the relative contributions of the features used to train our interpretable model can be exploited to rank the potential causes of job slowdowns. Evaluated over historical incidents, we show that Griffon discovers slowdown causes that are consistent with the ones validated by domain-expert engineers, in a fraction of the time required by them. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.09048v1 |
https://arxiv.org/pdf/1908.09048v1.pdf | |
PWC | https://paperswithcode.com/paper/griffon-reasoning-about-job-anomalies-with |
Repo | |
Framework | |
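The ranking step the abstract describes — exploiting relative feature contributions of an interpretable runtime model — can be sketched for a plain linear model (the contribution formula and feature names below are illustrative assumptions, not Griffon's actual model):

```python
def rank_slowdown_causes(weights, current, baseline):
    """For a linear runtime model, score each feature by how much its
    change from a healthy baseline run moves the predicted runtime,
    then rank descending as candidate slowdown causes."""
    contribs = {name: weights[name] * (current[name] - baseline[name])
                for name in weights}
    return sorted(contribs.items(), key=lambda kv: kv[1], reverse=True)
```

The top-ranked features are the ones an engineer would investigate first.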
Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism
Title | Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism |
Authors | Kevin K. Lin, Fei Lu |
Abstract | First-principles models of complex dynamic phenomena often have many degrees of freedom, only a small fraction of which may be scientifically relevant or observable. Reduced models distill such phenomena to their essence by modeling only relevant variables, thus decreasing computational cost and clarifying dynamical mechanisms. Here, we consider data-driven model reduction for nonlinear dynamical systems without sharp scale separation. Motivated by a discrete-time version of the Mori-Zwanzig projection operator formalism and the Wiener filter, we propose a simple and flexible mathematical formulation based on Wiener projection, which decomposes a nonlinear dynamical system into a component predictable by past values of relevant variables and its orthogonal complement. Wiener projection is equally applicable to deterministic chaotic dynamics and randomly-forced systems, and provides a natural starting point for systematic approximations. In particular, we use it to derive NARMAX models from an underlying dynamical system, thereby clarifying the scope of these widely-used tools in time series analysis. We illustrate its versatility on the Kuramoto-Sivashinsky model of spatiotemporal chaos and a stochastic Burgers equation. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07725v3 |
https://arxiv.org/pdf/1908.07725v3.pdf | |
PWC | https://paperswithcode.com/paper/190807725 |
Repo | |
Framework | |
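At its simplest, the Wiener projection described above splits a scalar series into a part predictable from its past and an orthogonal residual. A one-lag least-squares sketch (real applications use many lags and vector-valued observables):

```python
def wiener_ar1(series):
    """Fit x[t+1] ~ a * x[t] by least squares (through the origin) and
    return the coefficient plus the orthogonal residual series."""
    xs, ys = series[:-1], series[1:]
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    resid = [y - a * x for x, y in zip(xs, ys)]
    return a, resid
```

A reduced model then evolves the predictable part and treats the residual as noise to be modeled, e.g. in NARMAX form.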
Advancing subgroup fairness via sleeping experts
Title | Advancing subgroup fairness via sleeping experts |
Authors | Avrim Blum, Thodoris Lykouris |
Abstract | We study methods for improving fairness to subgroups in settings with overlapping populations and sequential predictions. Classical notions of fairness focus on the balance of some property across different populations. However, in many applications the goal of the different groups is not to be predicted equally but rather to be predicted well. We demonstrate that the task of satisfying this guarantee for multiple overlapping groups is not straightforward, and show that even for the simple objective of the unweighted average of false-negative and false-positive rates, satisfying it for overlapping populations can be statistically impossible even when we are provided predictors that perform well separately on each subgroup. On the positive side, we show that when individuals are equally important to the different groups they belong to, this goal is achievable; to do so, we draw a connection to the sleeping-experts literature in online learning. Motivated by the one-sided feedback in natural settings of interest, we extend our results to such a feedback model. We also provide a game-theoretic interpretation of our results, examining the incentives of participants to join the system and to provide the system full information about predictors they may possess. We end with several interesting open problems concerning the strength of guarantees that can be achieved in a computationally efficient manner. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08375v2 |
https://arxiv.org/pdf/1909.08375v2.pdf | |
PWC | https://paperswithcode.com/paper/advancing-subgroup-fairness-via-sleeping |
Repo | |
Framework | |
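The sleeping-experts connection above has a simple algorithmic core: in each round only "awake" experts predict and are charged loss, while asleep experts keep their weight. A minimal multiplicative-weights step (the learning rate and update form are standard textbook choices, not taken from the paper):

```python
import math

def sleeping_experts_step(weights, awake, losses, eta=0.5):
    """One update: awake experts are reweighted by exp(-eta * loss);
    asleep experts are untouched, so sitting out costs them nothing."""
    return {e: w * math.exp(-eta * losses[e]) if e in awake else w
            for e, w in weights.items()}
```

In the subgroup-fairness setting, a group-specific predictor is "awake" exactly on individuals belonging to that group.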
Quizbowl: The Case for Incremental Question Answering
Title | Quizbowl: The Case for Incremental Question Answering |
Authors | Pedro Rodriguez, Shi Feng, Mohit Iyyer, He He, Jordan Boyd-Graber |
Abstract | Quizbowl is a scholastic trivia competition that tests human knowledge and intelligence; additionally, it supports diverse research in question answering (QA). A Quizbowl question consists of multiple sentences whose clues are arranged by difficulty (from obscure to obvious) and uniquely identify a well-known entity such as those found on Wikipedia. Since players can answer the question at any time, an elite player (human or machine) demonstrates its superiority by answering correctly given as few clues as possible. We make two key contributions to machine learning research through Quizbowl: (1) collecting and curating a large factoid QA dataset and an accompanying gameplay dataset, and (2) developing a computational approach to playing Quizbowl that involves determining both what to answer and when to answer. Our Quizbowl system has defeated some of the best trivia players in the world over a multi-year series of exhibition matches. Throughout this paper, we show that collaborations with the vibrant Quizbowl community have contributed to the high quality of our dataset, led to new research directions, and doubled as an exciting way to engage the public with research in machine learning and natural language processing. |
Tasks | Question Answering |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04792v1 |
http://arxiv.org/pdf/1904.04792v1.pdf | |
PWC | https://paperswithcode.com/paper/quizbowl-the-case-for-incremental-question |
Repo | |
Framework | |
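"When to answer" in the abstract above reduces, in its simplest form, to a buzzing policy over incrementally revealed clues. A threshold sketch (the confidence scores and threshold are hypothetical; the actual system's buzzer is learned):

```python
def buzz_position(clue_confidences, threshold=0.8):
    """Scan per-clue answer confidences in reveal order and buzz at the
    first clue that clears the threshold; otherwise guess at the end."""
    for i, conf in enumerate(clue_confidences):
        if conf >= threshold:
            return i
    return len(clue_confidences) - 1
```

Earlier buzzes score more highly against strong opponents but risk answering on obscure clues.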
Assessing incrementality in sequence-to-sequence models
Title | Assessing incrementality in sequence-to-sequence models |
Authors | Dennis Ulmer, Dieuwke Hupkes, Elia Bruni |
Abstract | Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. In particular, because past representations can be revisited at any point in time, attention-centric methods seem to lack an incentive to build up incrementally more informative representations of incoming sentences. This way of processing stands in stark contrast with the way in which humans are believed to process language: continuously and rapidly integrating new information as it is encountered. In this work, we propose three novel metrics to assess the behavior of RNNs with and without an attention mechanism and identify key differences in the way the different model types process sentences. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03293v1 |
https://arxiv.org/pdf/1906.03293v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-incrementality-in-sequence-to |
Repo | |
Framework | |
Adversarial Example Detection and Classification With Asymmetrical Adversarial Training
Title | Adversarial Example Detection and Classification With Asymmetrical Adversarial Training |
Authors | Xuwang Yin, Soheil Kolouri, Gustavo K. Rohde |
Abstract | The vulnerabilities of deep neural networks to adversarial examples have become a significant concern for deploying these models in sensitive domains. Devising a definitive defense against such attacks has proven challenging, and methods that rely on detecting adversarial samples are only valid when the attacker is oblivious to the detection mechanism. In this paper we first present an adversarial example detection method that provides a performance guarantee against norm-constrained adversaries. The method is based on the idea of training adversarially robust subspace detectors using asymmetrical adversarial training (AAT). The novel AAT objective presents a minimax problem similar to that of GANs; it has the same convergence property and consequently supports the learning of class-conditional distributions. We first demonstrate that the minimax problem can be reasonably solved by a PGD attack, and then use the learned class-conditional generative models to define generative detection/classification models that are both robust and more interpretable. We provide comprehensive evaluations of the above methods and demonstrate their competitive performance and compelling properties on adversarial detection and robust classification problems. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11475v2 |
https://arxiv.org/pdf/1905.11475v2.pdf | |
PWC | https://paperswithcode.com/paper/divide-and-conquer-adversarial-detection |
Repo | |
Framework | |
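The abstract above solves the AAT inner problem with a PGD attack. In one dimension the projection logic is easy to see; a toy sketch of signed-gradient ascent inside an L-infinity ball (the quadratic loss used in the test is purely illustrative — the paper applies PGD to image inputs):

```python
def pgd_attack(x0, grad_loss, eps=0.3, step=0.1, iters=10):
    """Maximize a scalar loss by repeated signed-gradient steps,
    projecting back into the L-infinity ball of radius eps around x0."""
    x = x0
    for _ in range(iters):
        x = x + step * (1.0 if grad_loss(x) > 0 else -1.0)
        x = max(x0 - eps, min(x0 + eps, x))  # project onto the ball
    return x
```

For images, the scalar clamp becomes an element-wise clip of the perturbed tensor.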
Generating Personalized Recipes from Historical User Preferences
Title | Generating Personalized Recipes from Historical User Preferences |
Authors | Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley |
Abstract | Existing approaches to recipe generation are unable to create recipes for users with culinary preferences but incomplete knowledge of ingredients in specific dishes. We propose a new task of personalized recipe generation to help these users: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user’s historical preferences. We attend on technique- and recipe-level representations of a user’s previously consumed recipes, fusing these ‘user-aware’ representations in an attention fusion layer to control recipe text generation. Experiments on a new dataset of 180K recipes and 700K interactions show our model’s ability to generate plausible and personalized recipes compared to non-personalized baselines. |
Tasks | Recipe Generation, Text Generation |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00105v1 |
https://arxiv.org/pdf/1909.00105v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-personalized-recipes-from |
Repo | |
Framework | |
J-Net: Randomly weighted U-Net for audio source separation
Title | J-Net: Randomly weighted U-Net for audio source separation |
Authors | Bo-Wen Chen, Yen-Min Hsu, Hung-Yi Lee |
Abstract | Several results in the computer vision literature have shown the potential of randomly weighted neural networks. While they perform fairly well as feature extractors for discriminative tasks, a positive correlation exists between their performance and that of their fully trained counterparts. Following these discoveries, we pose two questions: what is the value of randomly weighted networks in difficult generative audio tasks such as audio source separation, and does such a positive correlation still exist for large random networks and their trained counterparts? In this paper, we demonstrate that the positive correlation still exists. Based on this discovery, we can try out different architecture designs or tricks without training the whole model. Meanwhile, we find the surprising result that, in comparison to the non-trained encoder (down-sample path) in Wave-U-Net, fixing the decoder (up-sample path) to random weights yields better performance, almost comparable to the fully trained model. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.12926v1 |
https://arxiv.org/pdf/1911.12926v1.pdf | |
PWC | https://paperswithcode.com/paper/j-net-randomly-weighted-u-net-for-audio |
Repo | |
Framework | |
Structuring an unordered text document
Title | Structuring an unordered text document |
Authors | Shashank Yadav, Tejas Shimpi, C. Ravindranath Chowdary, Prashant Sharma, Deepansh Agrawal, Shivang Agarwal |
Abstract | Segmenting an unordered text document into different sections is a very useful task in many text processing applications like multiple document summarization, question answering, etc. This paper proposes structuring of an unordered text document based on the keywords in the document. We test our approach on Wikipedia documents using both statistical and predictive methods such as the TextRank algorithm and Google’s USE (Universal Sentence Encoder). From our experimental results, we show that the proposed model can effectively structure an unordered document into sections. |
Tasks | Document Summarization, Question Answering |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10133v1 |
http://arxiv.org/pdf/1901.10133v1.pdf | |
PWC | https://paperswithcode.com/paper/structuring-an-unordered-text-document |
Repo | |
Framework | |
Learning Numeracy: Binary Arithmetic with Neural Turing Machines
Title | Learning Numeracy: Binary Arithmetic with Neural Turing Machines |
Authors | Jacopo Castellini |
Abstract | One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-time information dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external portion of memory, in which information can be stored and manipulated later on. The whole mechanism is differentiable end-to-end, allowing the network to learn how to utilise this long-term memory via stochastic gradient descent. This allows NTMs to infer simple algorithms directly from data sequences. Nonetheless, the model can be hard to train due to its large number of parameters and interacting components, and little related work exists. In this work we use NTMs to learn and generalise two arithmetical tasks: binary addition and multiplication. These tasks are two fundamental algorithmic examples in computer science, and are considerably more challenging than those previously explored, through which we aim to shed light on the real capabilities of this neural model. |
Tasks | |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02478v2 |
https://arxiv.org/pdf/1904.02478v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-numeracy-binary-arithmetic-with |
Repo | |
Framework | |
A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience
Title | A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine-Resilience |
Authors | Aritra Mitra, John A. Richards, Shreyas Sundaram |
Abstract | We study a setting where a group of agents, each receiving partially informative private signals, seek to collaboratively learn the true underlying state of the world (from a finite set of hypotheses) that generates their joint observation profiles. To solve this problem, we propose a distributed learning rule that differs fundamentally from existing approaches, in that it does not employ any form of “belief-averaging”. Instead, agents update their beliefs based on a min-rule. Under standard assumptions on the observation model and the network structure, we establish that each agent learns the truth asymptotically almost surely. As our main contribution, we prove that with probability 1, each false hypothesis is ruled out by every agent exponentially fast at a network-independent rate that is strictly larger than existing rates. We then develop a computationally-efficient variant of our learning rule that is provably resilient to agents who do not behave as expected (as represented by a Byzantine adversary model) and deliberately try to spread misinformation. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.03588v1 |
https://arxiv.org/pdf/1907.03588v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-approach-to-distributed-hypothesis |
Repo | |
Framework | |
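The min-rule above replaces the usual belief averaging across neighbors. A single-round toy update for one agent (the exact combination with the private likelihood here is a simplification of the paper's rule, shown only to illustrate the min aggregation):

```python
def min_rule_update(likelihoods, neighbor_beliefs):
    """Take the minimum neighbor belief per hypothesis, reweight by the
    agent's private likelihood for its latest signal, and normalize."""
    mins = {h: min(b[h] for b in neighbor_beliefs) for h in likelihoods}
    unnorm = {h: likelihoods[h] * mins[h] for h in mins}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}
```

Intuitively, a false hypothesis is suppressed as soon as any neighbor has strong evidence against it, which is what drives the faster rejection rate.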
A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data
Title | A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data |
Authors | Sebastian Pölsterl, Ignacio Sarasua, Benjamín Gutiérrez-Becker, Christian Wachinger |
Abstract | We introduce a wide and deep neural network for prediction of progression from patients with mild cognitive impairment to Alzheimer’s disease. Information from anatomical shape and tabular clinical data (demographics, biomarkers) are fused in a single neural network. The network is invariant to shape transformations and avoids the need to identify point correspondences between shapes. To account for right censored time-to-event data, i.e., when it is only known that a patient did not develop Alzheimer’s disease up to a particular time point, we employ a loss commonly used in survival analysis. Our network is trained end-to-end to combine information from a patient’s hippocampus shape and clinical biomarkers. Our experiments on data from the Alzheimer’s Disease Neuroimaging Initiative demonstrate that our proposed model is able to learn a shape descriptor that augments clinical biomarkers and outperforms a deep neural network on shape alone and a linear model on common clinical biomarkers. |
Tasks | Survival Analysis |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03890v1 |
https://arxiv.org/pdf/1909.03890v1.pdf | |
PWC | https://paperswithcode.com/paper/a-wide-and-deep-neural-network-for-survival |
Repo | |
Framework | |
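The "loss commonly used in survival analysis" mentioned above is typically the negative log partial likelihood of the Cox model, which handles right-censored patients through risk sets. A plain-Python sketch (averaging over events is a convention choice, and ties are handled naively here):

```python
import math

def cox_partial_loss(risks, times, events):
    """Average negative log partial likelihood: each observed event
    compares its risk score against all patients still at risk at that
    time; censored patients contribute only through risk sets."""
    loss, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored observation: no direct term
        at_risk = sum(math.exp(r) for r, t in zip(risks, times) if t >= t_i)
        loss += math.log(at_risk) - risks[i]
        n_events += 1
    return loss / max(n_events, 1)
```

In the network described above, `risks` would be the scalar outputs of the fused shape-plus-clinical model for a mini-batch of patients.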