February 1, 2020

3219 words 16 mins read

Paper Group AWR 179

Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing

Title Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing
Authors Wenhan Xiong, Jiawei Wu, Deren Lei, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang
Abstract Existing entity typing systems usually exploit the type hierarchy provided by knowledge base (KB) schema to model label correlations and thus improve the overall performance. Such techniques, however, are not directly applicable to more open and practical scenarios where the type set is not restricted by KB schema and includes a vast number of free-form types. To model the underlying label correlations without access to manually annotated label structures, we introduce a novel label-relational inductive bias, represented by a graph propagation layer that effectively encodes both global label co-occurrence statistics and word-level similarities. On a large dataset with over 10,000 free-form types, the graph-enhanced model equipped with an attention-based matching module is able to achieve a much higher recall score while maintaining high precision. Specifically, it achieves a 15.3% relative F1 improvement and also produces fewer inconsistencies in its outputs. We further show that a simple modification of our proposed graph layer can also improve the performance on a conventional and widely-tested dataset that only includes KB-schema types.
Tasks Entity Typing
Published 2019-03-06
URL http://arxiv.org/abs/1903.02591v1
PDF http://arxiv.org/pdf/1903.02591v1.pdf
PWC https://paperswithcode.com/paper/imposing-label-relational-inductive-bias-for
Repo https://github.com/xwhan/Extremely-Fine-Grained-Entity-Typing
Framework pytorch
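
The graph propagation layer described above lends itself to a compact sketch. The version below propagates label embeddings through a symmetrically normalized co-occurrence adjacency matrix; the normalization scheme, the single propagation step, and the layer shapes are illustrative assumptions, not the authors' exact design (see the linked repo for that).

```python
import torch
import torch.nn as nn

class LabelGraphLayer(nn.Module):
    """Illustrative label-relational propagation layer.

    A (num_labels x num_labels) adjacency built from label co-occurrence
    counts smooths the label embeddings before they are matched against
    the mention representation.
    """
    def __init__(self, cooccurrence: torch.Tensor, dim: int):
        super().__init__()
        # Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops.
        a = cooccurrence + torch.eye(cooccurrence.size(0))
        d = a.sum(dim=1).rsqrt()
        self.register_buffer("adj", d[:, None] * a * d[None, :])
        self.linear = nn.Linear(dim, dim)

    def forward(self, label_emb: torch.Tensor) -> torch.Tensor:
        # One propagation step: each label mixes in its neighbours.
        return torch.relu(self.linear(self.adj @ label_emb))
```

Because the adjacency is a fixed buffer computed from training-set statistics, the layer adds essentially no parameters beyond the single linear map.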

Dynamically Fused Graph Network for Multi-hop Reasoning

Title Dynamically Fused Graph Network for Multi-hop Reasoning
Authors Yunxuan Xiao, Yanru Qu, Lin Qiu, Hao Zhou, Lei Li, Weinan Zhang, Yong Yu
Abstract Text-based question answering (TBQA) has been studied extensively in recent years. Most existing approaches focus on finding the answer to a question within a single paragraph. However, many difficult questions require multiple pieces of supporting evidence scattered across two or more documents. In this paper, we propose the Dynamically Fused Graph Network (DFGN), a novel method for answering questions that require gathering and reasoning over multiple scattered pieces of evidence. Inspired by humans' step-by-step reasoning behavior, DFGN includes a dynamic fusion layer that starts from the entities mentioned in the given query, explores along the entity graph dynamically built from the text, and gradually finds relevant supporting entities in the given documents. We evaluate DFGN on HotpotQA, a public TBQA dataset requiring multi-hop reasoning. DFGN achieves competitive results on the public leaderboard. Furthermore, our analysis shows DFGN produces interpretable reasoning chains.
Tasks Question Answering
Published 2019-05-16
URL https://arxiv.org/abs/1905.06933v3
PDF https://arxiv.org/pdf/1905.06933v3.pdf
PWC https://paperswithcode.com/paper/dynamically-fused-graph-network-for-multi-hop
Repo https://github.com/woshiyyya/DFGN-pytorch
Framework pytorch
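
A minimal sketch of the dynamic fusion idea, assuming a soft query-conditioned entity mask followed by one propagation step over the entity graph and a GRU-based query update; the real DFGN block differs in detail (see the linked repo).

```python
import torch
import torch.nn as nn

class DynamicFusionStep(nn.Module):
    """One illustrative step of query-conditioned entity propagation.

    A soft mask scores each entity against the query, masked entity
    states are mixed along the entity graph, and the query state is
    updated from the visited entities.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.mask_scorer = nn.Linear(2 * dim, 1)
        self.propagate = nn.Linear(dim, dim)
        self.update_query = nn.GRUCell(dim, dim)

    def forward(self, entities, adj, query):
        # entities: (N, d); adj: (N, N) 0/1 entity graph; query: (d,)
        q = query.expand(entities.size(0), -1)
        mask = torch.sigmoid(self.mask_scorer(torch.cat([entities, q], -1)))
        h = mask * entities                      # keep query-relevant entities
        h = torch.relu(self.propagate(adj @ h))  # spread along graph edges
        # Summarize visited entities back into the query state.
        new_query = self.update_query(h.mean(0, keepdim=True), query.unsqueeze(0))
        return h, new_query.squeeze(0)
```

Stacking several such steps is what yields the step-by-step "reasoning chain" the abstract refers to.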

Machine Learning Classification Informed by a Functional Biophysical System

Title Machine Learning Classification Informed by a Functional Biophysical System
Authors Jason A. Platt, Anna Miller, Henry D. I. Abarbanel
Abstract We present a novel machine learning architecture for classification suggested by experiments on the insect olfactory system. The network separates odors via a winnerless competition network, then classifies objects by projection into a high dimensional space where a support vector machine provides more precision in classification. We build this network using biophysical models of neurons with our results showing high discrimination among inputs and exceptional robustness to noise. The same circuitry accurately identifies the amplitudes of mixtures of the odors on which it has been trained.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08589v1
PDF https://arxiv.org/pdf/1911.08589v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-classification-informed-by-a
Repo https://github.com/japlatt/WLC_SVM_Time
Framework none
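
As a rough illustration of the classify-in-high-dimensions pipeline above, the sketch below replaces the winnerless competition dynamics with a fixed random nonlinear lift and then fits a linear SVM on the lifted features; the biophysical neuron models themselves are not modeled here, and all data is synthetic.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-in for the pipeline: lift inputs into a high-dimensional space
# with a fixed random nonlinear map (a crude substitute for the winnerless
# competition network), then classify with a linear SVM.
def lift(x, proj):
    return np.tanh(x @ proj)  # fixed random features, not biophysical neurons

n, d, hidden = 200, 10, 500
proj = rng.normal(size=(d, hidden)) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)  # synthetic labels

clf = SVC(kernel="linear").fit(lift(X, proj), y)
print("train accuracy:", clf.score(lift(X, proj), y))
```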

Learning Feature-to-Feature Translator by Alternating Back-Propagation for Generative Zero-Shot Learning

Title Learning Feature-to-Feature Translator by Alternating Back-Propagation for Generative Zero-Shot Learning
Authors Yizhe Zhu, Jianwen Xie, Bingchen Liu, Ahmed Elgammal
Abstract We investigate learning feature-to-feature translator networks by alternating back-propagation as a general-purpose solution to zero-shot learning (ZSL) problems. It is a generative model-based ZSL framework. In contrast to models based on generative adversarial networks (GAN) or variational autoencoders (VAE) that require auxiliary networks to assist the training, our model consists of a single conditional generator that maps class-level semantic features and a Gaussian white noise vector, accounting for instance-level latent factors, to visual features, and is trained by maximum likelihood estimation. The training process is a simple yet effective alternating back-propagation process that iterates the following two steps: (i) the inferential back-propagation to infer the latent factors of each observed example, and (ii) the learning back-propagation to update the model parameters. We show that, with slight modifications, our model is capable of learning from incomplete visual features for ZSL. We conduct extensive comparisons with existing generative ZSL methods on five benchmarks, demonstrating the superiority of our method in not only ZSL performance but also convergence speed and computational cost. Specifically, our model outperforms the existing state-of-the-art methods by a remarkable margin of up to 3.1% and 4.0% in the ZSL and generalized ZSL settings, respectively.
Tasks Zero-Shot Learning
Published 2019-04-22
URL https://arxiv.org/abs/1904.10056v3
PDF https://arxiv.org/pdf/1904.10056v3.pdf
PWC https://paperswithcode.com/paper/learning-feature-to-feature-translator-by
Repo https://github.com/EthanZhu90/ZSL_ABP
Framework pytorch
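
The two alternating steps can be sketched directly in PyTorch: gradient-based inference of the per-example latent vector, followed by a maximum-likelihood update of the generator. The dimensions (85-d attributes, 10-d latents, 2048-d visual features), step sizes, and noise scale below are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

# Illustrative alternating back-propagation loop for a conditional
# feature generator.
gen = nn.Sequential(nn.Linear(85 + 10, 512), nn.ReLU(), nn.Linear(512, 2048))
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
sigma = 0.3  # observation noise scale (assumed)

def step(attrs, feats, n_infer=20, z_lr=0.1):
    z = torch.zeros(feats.size(0), 10, requires_grad=True)
    for _ in range(n_infer):                      # (i) inferential back-prop
        recon = gen(torch.cat([attrs, z], dim=1))
        loss = ((recon - feats) ** 2).sum() / (2 * sigma ** 2) + 0.5 * (z ** 2).sum()
        g, = torch.autograd.grad(loss, z)
        z = (z - z_lr * g).detach().requires_grad_(True)
    opt.zero_grad()                               # (ii) learning back-prop
    recon = gen(torch.cat([attrs, z.detach()], dim=1))
    (((recon - feats) ** 2).sum() / (2 * sigma ** 2)).backward()
    opt.step()
```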

Addressing Failure Prediction by Learning Model Confidence

Title Addressing Failure Prediction by Learning Model Confidence
Authors Charles Corbière, Nicolas Thome, Avner Bar-Hen, Matthieu Cord, Patrick Pérez
Abstract Assessing reliably the confidence of a deep neural network and predicting its failures is of primary importance for the practical deployment of these models. In this paper, we propose a new target criterion for model confidence, corresponding to the True Class Probability (TCP). We show that TCP is better suited to this task than the classic Maximum Class Probability (MCP). In addition, we provide theoretical guarantees for TCP in the context of failure prediction. Since the true class is by essence unknown at test time, we propose to learn the TCP criterion on the training set, introducing a specific learning scheme adapted to this context. Extensive experiments are conducted to validate the relevance of the proposed approach. We study various network architectures and small- and large-scale datasets for image classification and semantic segmentation. We show that our approach consistently outperforms several strong methods, from MCP to Bayesian uncertainty, as well as recent approaches specifically designed for failure prediction.
Tasks Image Classification, Semantic Segmentation
Published 2019-10-01
URL https://arxiv.org/abs/1910.04851v2
PDF https://arxiv.org/pdf/1910.04851v2.pdf
PWC https://paperswithcode.com/paper/addressing-failure-prediction-by-learning
Repo https://github.com/valeoai/ConfidNet
Framework pytorch
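
The TCP target is easy to sketch: an auxiliary head is regressed toward the softmax probability assigned to the true class, which is observable only during training. The head architecture and loss below are illustrative, not the ConfidNet configuration from the repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidHead(nn.Module):
    """Auxiliary confidence head regressing toward TCP (sketch)."""
    def __init__(self, feat_dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1), nn.Sigmoid())

    def forward(self, feats):
        return self.mlp(feats).squeeze(-1)

def tcp_loss(logits, feats, targets, head):
    probs = F.softmax(logits, dim=1)
    tcp = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # P(true class | x)
    return F.mse_loss(head(feats), tcp.detach())
```

At test time the head's output serves as the confidence score; the true class (and thus TCP itself) is never needed outside training.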

Multi-Task Gaussian Processes and Dilated Convolutional Networks for Reconstruction of Reproductive Hormonal Dynamics

Title Multi-Task Gaussian Processes and Dilated Convolutional Networks for Reconstruction of Reproductive Hormonal Dynamics
Authors Iñigo Urteaga, Tristan Bertin, Theresa M. Hardy, David J. Albers, Noémie Elhadad
Abstract We present an end-to-end statistical framework for personalized, accurate, and minimally invasive modeling of female reproductive hormonal patterns. Reconstructing and forecasting the evolution of hormonal dynamics is a challenging task, but a critical one to improve general understanding of the menstrual cycle and personalized detection of potential health issues. Our goal is to infer and forecast individual hormone daily levels over time, while accommodating pragmatic and minimally invasive measurement settings. To that end, our approach combines the power of probabilistic generative models (i.e., multi-task Gaussian processes) with the flexibility of neural networks (i.e., a dilated convolutional architecture) to learn complex temporal mappings. To attain accurate hormone level reconstruction with as little data as possible, we propose a sampling mechanism for optimal reconstruction accuracy with a limited sampling budget. Our results show the validity of our proposed hormonal dynamic modeling framework, as it provides accurate predictive performance across different realistic sampling budgets and outperforms baseline methods.
Tasks Gaussian Processes
Published 2019-08-27
URL https://arxiv.org/abs/1908.10226v1
PDF https://arxiv.org/pdf/1908.10226v1.pdf
PWC https://paperswithcode.com/paper/multi-task-gaussian-processes-and-dilated
Repo https://github.com/iurteaga/hmc
Framework pytorch
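
A minimal sketch of the dilated convolutional component, assuming an exponentially dilated Conv1d stack over daily multi-hormone series; channel widths, depth, and the output trimming are assumptions, and the multi-task GP imputation stage is omitted entirely.

```python
import torch
import torch.nn as nn

class DilatedForecaster(nn.Module):
    """Dilated 1-D convolutional stack over daily hormone series (sketch)."""
    def __init__(self, n_hormones=3, channels=32, layers=4):
        super().__init__()
        blocks = []
        in_ch = n_hormones
        for i in range(layers):
            d = 2 ** i  # exponentially growing receptive field
            blocks += [nn.Conv1d(in_ch, channels, kernel_size=2,
                                 dilation=d, padding=d), nn.ReLU()]
            in_ch = channels
        self.net = nn.Sequential(*blocks)
        self.head = nn.Conv1d(channels, n_hormones, kernel_size=1)

    def forward(self, x):          # x: (batch, n_hormones, days)
        h = self.net(x)
        # Trim the extra timesteps introduced by padding back to input length.
        return self.head(h[..., :x.size(-1)])
```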

Structured Pruning of Large Language Models

Title Structured Pruning of Large Language Models
Authors Ziheng Wang, Jeremy Wohlwend, Tao Lei
Abstract Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly, and raises an interesting question: do language models need to be large? We study this question through the lens of model compression. We present a novel, structured pruning approach based on low rank factorization and augmented Lagrangian L0 norm regularization. Our structured approach achieves significant inference speedups while matching or outperforming our unstructured pruning baseline at various sparsity levels. We apply our method to state of the art models on the enwiki8 dataset and obtain a 1.19 perplexity score with just 5M parameters, vastly outperforming a model of the same size trained from scratch. We also demonstrate that our method can be applied to language model fine-tuning by pruning the BERT model on several downstream classification benchmarks.
Tasks Language Modelling, Model Compression
Published 2019-10-10
URL https://arxiv.org/abs/1910.04732v1
PDF https://arxiv.org/pdf/1910.04732v1.pdf
PWC https://paperswithcode.com/paper/structured-pruning-of-large-language-models
Repo https://github.com/asappresearch/flop
Framework pytorch
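
The core idea (low-rank factorization with L0 gates on the inner dimension) can be sketched as follows: a linear layer becomes P diag(g) Q, where g are stochastic hard-concrete gates, so pruning a gate removes an entire rank-1 component, and the expected-L0 term feeds the augmented Lagrangian. The gate hyperparameters below follow common hard-concrete defaults, which may differ from the paper's.

```python
import math
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """Low-rank linear layer with hard-concrete L0 gates (sketch)."""
    def __init__(self, d_in, d_out, rank):
        super().__init__()
        self.p = nn.Linear(d_in, rank, bias=False)
        self.q = nn.Linear(rank, d_out)
        self.log_alpha = nn.Parameter(torch.zeros(rank))
        self.beta, self.l, self.r = 2 / 3, -0.1, 1.1  # stretch parameters

    def gates(self):
        if self.training:  # stretched hard-concrete sample
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
        else:
            s = torch.sigmoid(self.log_alpha)
        return (s * (self.r - self.l) + self.l).clamp(0, 1)

    def forward(self, x):
        return self.q(self.p(x) * self.gates())

    def expected_l0(self):
        # Expected number of active gates; the sparsity penalty term.
        return torch.sigmoid(self.log_alpha - self.beta * math.log(-self.l / self.r)).sum()
```

After training, rows of P and columns of Q whose gates settle at zero can be deleted outright, which is what makes the speedups structured rather than element-wise.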

Adversarial Margin Maximization Networks

Title Adversarial Margin Maximization Networks
Authors Ziang Yan, Yiwen Guo, Changshui Zhang
Abstract The tremendous recent success of deep neural networks (DNNs) has sparked a surge of interest in understanding their predictive ability. Unlike the human visual system, which is able to generalize robustly and learn with little supervision, DNNs normally require a massive amount of data to learn new concepts. In addition, research also shows that DNNs are vulnerable to adversarial examples: maliciously generated images that seem perceptually similar to natural ones but are crafted to fool learning models, meaning the models have trouble generalizing to unseen data with certain types of distortions. In this paper, we analyze the generalization ability of DNNs comprehensively and attempt to improve it from a geometric point of view. We propose adversarial margin maximization (AMM), a learning-based regularization which exploits an adversarial perturbation as a proxy. It encourages a large margin in the input space, just like support vector machines. With a differentiable formulation of the perturbation, we train the regularized DNNs simply through back-propagation in an end-to-end manner. Experimental results on various datasets (including MNIST, CIFAR-10/100, SVHN and ImageNet) and different DNN architectures demonstrate the superiority of our method over previous state-of-the-art methods. Code and models for reproducing our results will be made publicly available.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.05916v1
PDF https://arxiv.org/pdf/1911.05916v1.pdf
PWC https://paperswithcode.com/paper/adversarial-margin-maximization-networks
Repo https://github.com/ZiangYan/amm.pytorch
Framework pytorch
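
A rough first-order proxy for the input-space margin in the spirit of AMM: the logit gap between the true class and the runner-up, divided by the gradient norm of that gap, approximates the distance to the decision boundary, and the training loss rewards making it large. This is a simplified stand-in, not the paper's exact perturbation formulation.

```python
import torch
import torch.nn.functional as F

def margin_proxy(model, x, y):
    # First-order distance to the boundary between the true class and
    # the strongest competitor (DeepFool-style linearization).
    x = x.clone().requires_grad_(True)
    logits = model(x)
    top2 = logits.topk(2, dim=1).indices
    runner_up = torch.where(top2[:, 0] == y, top2[:, 1], top2[:, 0])
    gap = logits.gather(1, y[:, None]) - logits.gather(1, runner_up[:, None])
    grad, = torch.autograd.grad(gap.sum(), x, create_graph=True)
    return gap.squeeze(1) / (grad.flatten(1).norm(dim=1) + 1e-8)

def amm_loss(model, x, y, lam=0.1):
    ce = F.cross_entropy(model(x), y)
    # Clamp so easy examples far from the boundary do not dominate.
    return ce - lam * margin_proxy(model, x, y).clamp(max=5.0).mean()
```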

PointCleanNet: Learning to Denoise and Remove Outliers from Dense Point Clouds

Title PointCleanNet: Learning to Denoise and Remove Outliers from Dense Point Clouds
Authors Marie-Julie Rakotosaona, Vittorio La Barbera, Paul Guerrero, Niloy J. Mitra, Maks Ovsjanikov
Abstract Point clouds obtained with 3D scanners or by image-based reconstruction techniques are often corrupted with a significant amount of noise and outliers. Traditional methods for point cloud denoising largely rely on local surface fitting (e.g., jets or MLS surfaces), local or non-local averaging, or on statistical assumptions about the underlying noise model. In contrast, we develop a simple data-driven method for removing outliers and reducing noise in unordered point clouds. We base our approach on a deep learning architecture adapted from PCPNet, which was recently proposed for estimating local 3D shape properties in point clouds. Our method first classifies and discards outlier samples, and then estimates correction vectors that project noisy points onto the original clean surfaces. The approach is efficient and robust to varying amounts of noise and outliers, while being able to handle large densely-sampled point clouds. In our extensive evaluation, on both synthetic and real data, we show an increased robustness to strong noise levels compared to various state-of-the-art methods, enabling accurate surface reconstruction from extremely noisy real data obtained by range scans. Finally, the simplicity and universality of our approach make it very easy to integrate into any existing geometry processing pipeline.
Tasks Denoising
Published 2019-01-04
URL https://arxiv.org/abs/1901.01060v3
PDF https://arxiv.org/pdf/1901.01060v3.pdf
PWC https://paperswithcode.com/paper/pointcleannet-learning-to-denoise-and-remove
Repo https://github.com/mrakotosaon/pointcleannet
Framework pytorch
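
The two-stage pipeline described above reads naturally as a point-wise network over local patches: classify the center point as an outlier, then regress a correction vector for inliers. The sketch below uses a generic order-invariant encoder; the actual method builds on PCPNet and differs in architecture.

```python
import torch
import torch.nn as nn

class PatchNet(nn.Module):
    """Per-patch outlier score and correction vector (sketch)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                     nn.Linear(64, 128), nn.ReLU())
        self.outlier_head = nn.Linear(128, 1)
        self.displacement_head = nn.Linear(128, 3)

    def forward(self, patch):          # patch: (k, 3), centered on the point
        feat = self.encoder(patch).max(dim=0).values  # order-invariant pooling
        outlier_logit = self.outlier_head(feat)       # stage 1: discard outliers
        correction = self.displacement_head(feat)     # stage 2: push to surface
        return outlier_logit, correction
```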

CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Title CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison
Authors Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, Andrew Y. Ng
Abstract Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models. The dataset is freely available at https://stanfordmlgroup.github.io/competitions/chexpert .
Tasks Lung Disease Classification
Published 2019-01-21
URL http://arxiv.org/abs/1901.07031v1
PDF http://arxiv.org/pdf/1901.07031v1.pdf
PWC https://paperswithcode.com/paper/chexpert-a-large-chest-radiograph-dataset
Repo https://github.com/simongrest/chexpert-entries
Framework pytorch
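
Handling the uncertainty labels is the key training decision. Below is a sketch of two of the studied policies, assuming the encoding used in the released CSVs (1 positive, 0 negative, -1 uncertain): "U-Ignore" masks uncertain entries out of a per-observation binary cross-entropy, while "U-Ones" maps them to positives.

```python
import torch
import torch.nn.functional as F

def uignore_bce(logits, labels):
    # Drop uncertain observations (-1) from the loss entirely.
    mask = labels != -1
    return F.binary_cross_entropy_with_logits(
        logits[mask], labels[mask].float())

def uones_bce(logits, labels):
    # Treat uncertain observations as positive.
    targets = torch.where(labels == -1, torch.ones_like(labels), labels)
    return F.binary_cross_entropy_with_logits(logits, targets.float())
```

The paper's finding is that no single policy wins everywhere: different pathologies favor different mappings of the uncertain label.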

Machine Learning Based Analysis of Finnish World War II Photographers

Title Machine Learning Based Analysis of Finnish World War II Photographers
Authors Kateryna Chumachenko, Anssi Männistö, Alexandros Iosifidis, Jenni Raitoharju
Abstract In this paper, we demonstrate the benefits of using state-of-the-art machine learning methods in the analysis of historical photo archives. Specifically, we analyze prominent Finnish World War II photographers, who captured large numbers of photographs in the publicly available SA photo archive, which contains 160,000 photographs from the Finnish Winter, Continuation, and Lapland Wars, captured in 1939-1945. We were able to find some special characteristics of different photographers in terms of their typical photo content and photo types (e.g., close-ups vs. overview images, number of people). Furthermore, we managed to train a neural network that can successfully recognize the photographer from some of the photos, which shows that such photos are indeed characteristic of certain photographers. We further analyze the similarities and differences between the photographers using the features extracted from the photographer classifier network. All the extracted information will support historical and societal studies of the photo archive.
Tasks
Published 2019-04-22
URL https://arxiv.org/abs/1904.09811v3
PDF https://arxiv.org/pdf/1904.09811v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-based-analysis-of-finnish
Repo https://github.com/katerynaCh/Finnish-WW2-photographers-analysis
Framework tf
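
The photographer-similarity analysis described above can be sketched with plain NumPy, assuming the features are penultimate-layer activations of the trained photographer classifier: average them per photographer and compare with cosine similarity.

```python
import numpy as np

def photographer_similarity(features, labels):
    # features: (n_photos, d) penultimate activations; labels: (n_photos,)
    ids = np.unique(labels)
    means = np.stack([features[labels == i].mean(axis=0) for i in ids])
    means /= np.linalg.norm(means, axis=1, keepdims=True)  # unit vectors
    return ids, means @ means.T  # (n_photographers, n_photographers) cosine
```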

Multi-view consensus CNN for 3D facial landmark placement

Title Multi-view consensus CNN for 3D facial landmark placement
Authors Rasmus R. Paulsen, Kristine Aavild Juhl, Thilde Marie Haspang, Thomas Hansen, Melanie Ganz, Gudmundur Einarsson
Abstract The rapid increase in the availability of accurate 3D scanning devices has moved facial recognition and analysis into the 3D domain. 3D facial landmarks are often used as a simple measure of anatomy and it is crucial to have accurate algorithms for automatic landmark placement. The current state-of-the-art approaches have yet to gain from the dramatic increase in performance reported in human pose tracking and 2D facial landmark placement due to the use of deep convolutional neural networks (CNN). Development of deep learning approaches for 3D meshes has given rise to the new subfield called geometric deep learning, where one topic is the adaptation of meshes for the use of deep CNNs. In this work, we demonstrate how methods derived from geometric deep learning, namely multi-view CNNs, can be combined with recent advances in human pose tracking. The method finds 2D landmark estimates and propagates this information to 3D space, where a consensus method determines the accurate 3D face landmark position. We utilise the method on a standard 3D face dataset and show that it outperforms current methods by a large margin. Further, we demonstrate how models trained on 3D range scans can be used to accurately place anatomical landmarks in magnetic resonance images.
Tasks Pose Tracking
Published 2019-10-14
URL https://arxiv.org/abs/1910.06007v1
PDF https://arxiv.org/pdf/1910.06007v1.pdf
PWC https://paperswithcode.com/paper/multi-view-consensus-cnn-for-3d-facial
Repo https://github.com/RasmusRPaulsen/Deep-MVLM
Framework pytorch
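
The consensus step admits a clean closed form: each view's 2D landmark estimate back-projects to a 3D ray (camera center plus direction), and the consensus landmark is the least-squares point closest to all rays. The sketch below solves exactly that; the per-view weighting and outlier rejection used in practice are omitted.

```python
import numpy as np

def ray_consensus(origins, directions):
    # origins, directions: (n_views, 3); directions assumed unit length.
    # Minimizes sum over rays of the squared point-to-ray distance.
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        p = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += p
        b += p @ o
    return np.linalg.solve(A, b)        # closest point to all rays
```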

IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks

Title IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks
Authors Youngwoon Lee, Edward S. Hu, Zhengyu Yang, Alex Yin, Joseph J. Lim
Abstract The IKEA Furniture Assembly Environment is one of the first benchmarks for testing and accelerating the automation of complex manipulation tasks. The environment is designed to advance reinforcement learning from simple toy tasks to complex tasks requiring both long-term planning and sophisticated low-level control. Our environment supports over 80 different furniture models, Sawyer and Baxter robot simulation, and domain randomization. The IKEA Furniture Assembly Environment is a testbed for methods aiming to solve complex manipulation tasks. The environment is publicly available at https://clvrai.com/furniture
Tasks Robotic Grasping
Published 2019-11-17
URL https://arxiv.org/abs/1911.07246v1
PDF https://arxiv.org/pdf/1911.07246v1.pdf
PWC https://paperswithcode.com/paper/ikea-furniture-assembly-environment-for-long
Repo https://github.com/clvrai/furniture
Framework tf

The RobotriX: An eXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions

Title The RobotriX: An eXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions
Authors Alberto Garcia-Garcia, Pablo Martinez-Gonzalez, Sergiu Oprea, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Alvaro Jover-Alvarez
Abstract Enter the RobotriX, an extremely photorealistic indoor dataset designed to enable the application of deep learning techniques to a wide variety of robotic vision problems. The RobotriX consists of hyperrealistic indoor scenes that are explored by robot agents, which also interact with objects in a visually realistic manner in that simulated world. Photorealistic scenes and robots are rendered by Unreal Engine into a virtual reality headset which captures gaze so that a human operator can move the robot and use controllers for the robotic hands; scene information is dumped on a per-frame basis so that it can be reproduced offline to generate raw data and ground truth labels. By taking this approach, we were able to generate a dataset of 38 semantic classes totaling 8M stills recorded at 60+ frames per second with full HD resolution. For each frame, RGB-D and 3D information is provided with full annotations in both spaces. Thanks to the high quality and quantity of both raw information and annotations, the RobotriX will serve as a new milestone for investigating 2D and 3D robotic vision tasks with large-scale data-driven techniques.
Tasks Robotic Grasping
Published 2019-01-19
URL http://arxiv.org/abs/1901.06514v1
PDF http://arxiv.org/pdf/1901.06514v1.pdf
PWC https://paperswithcode.com/paper/the-robotrix-an-extremely-photorealistic-and
Repo https://github.com/3dperceptionlab/therobotrix
Framework none

QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

Title QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Authors Kyunghwan Son, Daewoo Kim, Wan Ju Kang, David Earl Hostallero, Yung Yi
Abstract We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the centralized training with decentralized execution (CTDE) regime popularized recently. VDN and QMIX are representative examples that use the idea of factorizing the joint action-value function into individual ones for decentralized execution. However, VDN and QMIX address only a fraction of factorizable MARL tasks due to their structural constraints in factorization, such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions. QTRAN guarantees more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than previous methods do. Our experiments on the tasks of multi-domain Gaussian-squeeze and modified predator-prey demonstrate QTRAN's superior performance, with especially large margins in games whose payoffs penalize non-cooperative behavior more aggressively.
Tasks Multi-agent Reinforcement Learning
Published 2019-05-14
URL https://arxiv.org/abs/1905.05408v1
PDF https://arxiv.org/pdf/1905.05408v1.pdf
PWC https://paperswithcode.com/paper/qtran-learning-to-factorize-with
Repo https://github.com/Sonkyunghwan/QTRAN
Framework none
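
A sketch of QTRAN-base's three loss terms, assuming precomputed tensors for the summed individual utilities, the joint critic, and the state value: the transformed sum of utilities must match the joint critic at the greedy joint action and upper-bound it at all other actions. The tensor plumbing from the actual networks is omitted.

```python
import torch

def qtran_losses(q_sum, q_sum_greedy, q_jt, q_jt_greedy, v, td_target):
    # q_sum:        sum_i q_i(tau_i, u_i) for the taken joint action
    # q_sum_greedy: sum_i q_i at each agent's greedy action
    # q_jt, q_jt_greedy: joint critic at the taken / greedy joint actions
    # v: state value; td_target: bootstrapped target for the joint critic
    l_td = (q_jt - td_target).pow(2).mean()                         # fit Q_jt
    l_opt = (q_sum_greedy - q_jt_greedy.detach() + v).pow(2).mean() # equality at greedy u
    l_nopt = torch.clamp(q_sum - q_jt.detach() + v, max=0.0).pow(2).mean()  # inequality elsewhere
    return l_td, l_opt, l_nopt
```

The total objective is a weighted sum of the three terms; the inequality term is what frees QTRAN from VDN's additivity and QMIX's monotonicity constraints.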