Paper Group AWR 426
A Differentiable Programming System to Bridge Machine Learning and Scientific Computing
Title | A Differentiable Programming System to Bridge Machine Learning and Scientific Computing |
Authors | Mike Innes, Alan Edelman, Keno Fischer, Chris Rackauckas, Elliot Saba, Viral B Shah, Will Tebbutt |
Abstract | Scientific computing is increasingly incorporating the advancements in machine learning and the ability to work with large amounts of data. At the same time, machine learning models are becoming increasingly sophisticated and exhibit many features often seen in scientific computing, stressing the capabilities of machine learning frameworks. Just as the disciplines of scientific computing and machine learning have shared common underlying infrastructure in the form of numerical linear algebra, we now have the opportunity to further share new computational infrastructure, and thus ideas, in the form of Differentiable Programming. We describe Zygote, a Differentiable Programming system that is able to take gradients of general program structures. We implement this system in the Julia programming language. Our system supports almost all language constructs (control flow, recursion, mutation, etc.) and compiles high-performance code without requiring any user intervention or refactoring to stage computations. This enables an expressive programming model for deep learning, but more importantly, it enables us to incorporate a large ecosystem of libraries in our models in a straightforward way. We discuss our approach to automatic differentiation, including its support for advanced techniques such as mixed-mode, complex and checkpointed differentiation, and present several examples of differentiating programs. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07587v2 |
https://arxiv.org/pdf/1907.07587v2.pdf | |
PWC | https://paperswithcode.com/paper/zygote-a-differentiable-programming-system-to |
Repo | https://github.com/ali-ramadhan/6S898-climate-parameterization |
Framework | none |
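The abstract's claim that gradients can be taken through control flow and recursion can be illustrated with a small sketch. Note the swap: this is forward-mode differentiation with dual numbers in Python, not Zygote's approach and not Julia. The names `Dual`, `derivative`, `power`, and `piecewise` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(float(x), 1.0)).der


def power(x, n):
    """x**n written recursively, to show that recursion is differentiable."""
    return 1.0 if n == 0 else x * power(x, n - 1)


def piecewise(x):
    """A function with a data-dependent branch and a loop."""
    if x.val > 0:            # the branch taken depends on the runtime value
        acc = x
        for _ in range(3):   # acc becomes x**4 after three multiplications
            acc = acc * x
        return acc
    return 3 * x


print(derivative(lambda x: power(x, 3), 2.0))  # d/dx x^3 at x = 2
print(derivative(piecewise, 2.0))              # d/dx x^4 at x = 2
print(derivative(piecewise, -1.0))             # d/dx 3x at x = -1
```

Because the derivative is carried along with each value, ordinary `if`, `for`, and recursive calls need no special handling. A reverse-mode system such as the one the abstract describes has to record or transform the program instead, which is where the compiler support comes in.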
Improving Open Information Extraction via Iterative Rank-Aware Learning
Title | Improving Open Information Extraction via Iterative Rank-Aware Learning |
Authors | Zhengbao Jiang, Pengcheng Yin, Graham Neubig |
Abstract | Open information extraction (IE) is the task of extracting open-domain assertions from natural language sentences. A key step in open IE is confidence modeling, ranking the extractions based on their estimated quality to adjust precision and recall of extracted assertions. We found that the extraction likelihood, a confidence measure used by current supervised open IE systems, is not well calibrated when comparing the quality of assertions extracted from different sentences. We propose an additional binary classification loss to calibrate the likelihood to make it more globally comparable, and an iterative learning process, where extractions generated by the open IE model are incrementally included as training samples to help the model learn from trial and error. Experiments on OIE2016 demonstrate the effectiveness of our method. Code and data are available at https://github.com/jzbjyb/oie_rank. |
Tasks | Open Information Extraction |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13413v1 |
https://arxiv.org/pdf/1905.13413v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-open-information-extraction-via |
Repo | https://github.com/jzbjyb/oie_rank |
Framework | none |
Improving RetinaNet for CT Lesion Detection with Dense Masks from Weak RECIST Labels
Title | Improving RetinaNet for CT Lesion Detection with Dense Masks from Weak RECIST Labels |
Authors | Martin Zlocha, Qi Dou, Ben Glocker |
Abstract | Accurate, automated lesion detection in Computed Tomography (CT) is an important yet challenging task due to the large variation of lesion types, sizes, locations and appearances. Recent work on CT lesion detection employs two-stage region proposal based methods trained with centroid or bounding-box annotations. We propose a highly accurate and efficient one-stage lesion detector, by re-designing a RetinaNet to meet the particular challenges in medical imaging. Specifically, we optimize the anchor configurations using a differential evolution search algorithm. For training, we leverage the response evaluation criteria in solid tumors (RECIST) annotations, which are measured in clinical routine. We incorporate dense masks from weak RECIST labels, obtained automatically using GrabCut, into the training objective, which in combination with other advancements yields new state-of-the-art performance. We evaluate our method on the public DeepLesion benchmark, consisting of 32,735 lesions across the body. Our one-stage detector achieves a sensitivity of 90.77% at 4 false positives per image, significantly outperforming the best reported methods by over 5%. |
Tasks | Computed Tomography (CT), Skin Lesion Identification |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02283v1 |
https://arxiv.org/pdf/1906.02283v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-retinanet-for-ct-lesion-detection |
Repo | https://github.com/fizyr/keras-retinanet |
Framework | tf |
Dynamics-Aware Unsupervised Discovery of Skills
Title | Dynamics-Aware Unsupervised Discovery of Skills |
Authors | Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman |
Abstract | Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment. A good model can potentially enable planning algorithms to generate a large variety of behaviors and solve diverse tasks. However, learning an accurate model for complex dynamical systems is difficult, and even then, the model might not generalize well outside the distribution of states on which it was trained. In this work, we combine model-based learning with model-free learning of primitives that make model-based planning easy. To that end, we aim to answer the question: how can we discover skills whose outcomes are easy to predict? We propose an unsupervised learning algorithm, Dynamics-Aware Discovery of Skills (DADS), which simultaneously discovers predictable behaviors and learns their dynamics. Our method can leverage continuous skill spaces, theoretically, allowing us to learn infinitely many behaviors even for high-dimensional state-spaces. We demonstrate that zero-shot planning in the learned latent space significantly outperforms standard MBRL and model-free goal-conditioned RL, can handle sparse-reward tasks, and substantially improves over prior hierarchical RL methods for unsupervised skill discovery. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01657v2 |
https://arxiv.org/pdf/1907.01657v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamics-aware-unsupervised-discovery-of |
Repo | https://github.com/google-research/dads |
Framework | tf |
Dilated Convolutional Neural Networks for Sequential Manifold-valued Data
Title | Dilated Convolutional Neural Networks for Sequential Manifold-valued Data |
Authors | Xingjian Zhen, Rudrasis Chakraborty, Nicholas Vogt, Barbara B. Bendlin, Vikas Singh |
Abstract | Efforts are underway to study ways in which the power of deep neural networks can be extended to non-standard data types such as structured data (e.g., graphs) or manifold-valued data (e.g., unit vectors or special matrices). Often, sizable empirical improvements are possible when the geometry of such data spaces is incorporated into the design of the model, architecture, and the algorithms. Motivated by neuroimaging applications, we study formulations where the data are sequential manifold-valued measurements. This case is common in brain imaging, where the samples correspond to symmetric positive definite matrices or orientation distribution functions. Instead of a recurrent model which poses computational/technical issues, and inspired by recent results showing the viability of dilated convolutional models for sequence prediction, we develop a dilated convolutional neural network architecture for this task. On the technical side, we show how the modules needed in our network can be derived while explicitly taking the Riemannian manifold structure into account. We show how the operations needed can leverage known results for calculating the weighted Fréchet Mean (wFM). Finally, we present scientific results for group difference analysis in Alzheimer’s disease (AD) where the groups are derived using AD pathology load: here the model finds several brain fiber bundles that are related to AD even when the subjects are all still cognitively healthy. |
Tasks | |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02206v1 |
https://arxiv.org/pdf/1910.02206v1.pdf | |
PWC | https://paperswithcode.com/paper/dilated-convolutional-neural-networks-for-1 |
Repo | https://github.com/zhenxingjian/DCNN |
Framework | tf |
PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks
Title | PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks |
Authors | Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem |
Abstract | The effectiveness of learning-based point cloud upsampling pipelines heavily relies on the upsampling modules and feature extractors used therein. We propose three novel point upsampling modules: Multi-branch GCN, Clone GCN, and NodeShuffle. Our modules use Graph Convolutional Networks (GCNs) to better encode local point information from the point neighborhood. These upsampling modules are versatile and can be incorporated into any point cloud upsampling pipeline. Extensive experiments show how these modules consistently improve state-of-the-art upsampling methods. We also propose a new multi-scale point feature extractor, called Inception DenseGCN. By aggregating features at multiple scales, this feature extractor enables further performance gain in the final upsampled point clouds. We combine Inception DenseGCN with one of our upsampling modules (NodeShuffle) into a new point upsampling pipeline: PU-GCN. We show qualitatively and quantitatively the significant advantages of PU-GCN over the state-of-the-art. The website and source code of this work are available at https://sites.google.com/kaust.edu.sa/pugcn and https://github.com/guochengqian/PU-GCN respectively. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.03264v2 |
https://arxiv.org/pdf/1912.03264v2.pdf | |
PWC | https://paperswithcode.com/paper/pu-gcn-point-cloud-upsampling-using-graph |
Repo | https://github.com/guochengqian/PU-GCN |
Framework | none |
Corners for Layout: End-to-End Layout Recovery from 360 Images
Title | Corners for Layout: End-to-End Layout Recovery from 360 Images |
Authors | Clara Fernandez-Labrador, Jose M. Facil, Alejandro Perez-Yus, Cédric Demonceaux, Javier Civera, Jose J. Guerrero |
Abstract | The problem of 3D layout recovery in indoor scenes has been a core research topic for over a decade. However, there are still several major challenges that remain unsolved. Among the most relevant ones, a major part of the state-of-the-art methods make implicit or explicit assumptions on the scenes – e.g. box-shaped or Manhattan layouts. Also, current methods are computationally expensive and not suitable for real-time applications like robot navigation and AR/VR. In this work we present CFL (Corners for Layout), the first end-to-end model for 3D layout recovery on 360 images. Our experimental results show that we outperform the state of the art while relaxing assumptions about the scene and at a lower cost. We also show that our model generalizes better to camera position variations than conventional approaches by using EquiConvs, a type of convolution applied directly on the sphere projection and hence invariant to the equirectangular distortions. CFL Webpage: https://cfernandezlab.github.io/CFL/ |
Tasks | 3D Room Layouts From A Single Rgb Panorama, Robot Navigation |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.08094v2 |
http://arxiv.org/pdf/1903.08094v2.pdf | |
PWC | https://paperswithcode.com/paper/corners-for-layout-end-to-end-layout-recovery |
Repo | https://github.com/cfernandezlab/CFL |
Framework | tf |
Multi-relational Poincaré Graph Embeddings
Title | Multi-relational Poincaré Graph Embeddings |
Authors | Ivana Balažević, Carl Allen, Timothy Hospedales |
Abstract | Hyperbolic embeddings have recently gained attention in machine learning due to their ability to represent hierarchical data more accurately and succinctly than their Euclidean analogues. However, multi-relational knowledge graphs often exhibit multiple simultaneous hierarchies, which current hyperbolic models do not capture. To address this, we propose a model that embeds multi-relational graph data in the Poincaré ball model of hyperbolic space. Our Multi-Relational Poincaré model (MuRP) learns relation-specific parameters to transform entity embeddings by Möbius matrix-vector multiplication and Möbius addition. Experiments on the hierarchical WN18RR knowledge graph show that our Poincaré embeddings outperform their Euclidean counterpart and existing embedding methods on the link prediction task, particularly at lower dimensionality. |
Tasks | Entity Embeddings, Knowledge Graphs, Link Prediction |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09791v3 |
https://arxiv.org/pdf/1905.09791v3.pdf | |
PWC | https://paperswithcode.com/paper/multi-relational-poincare-graph-embeddings |
Repo | https://github.com/ibalazevic/multirelational-poincare |
Framework | pytorch |
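Möbius addition, which the abstract names as one of the two operations MuRP uses, can be sketched in a few lines. The version below assumes the unit Poincaré ball (curvature -1) and plain Python lists. The function names are hypothetical, and the sketch covers neither Möbius matrix-vector multiplication nor the MuRP scoring function itself.

```python
import math


def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y):
    """Möbius addition x (+) y on the unit Poincaré ball (assumed curvature -1)."""
    xy = dot(x, y)
    xx = dot(x, x)
    yy = dot(y, y)
    denom = 1.0 + 2.0 * xy + xx * yy
    cx = (1.0 + 2.0 * xy + yy) / denom   # coefficient applied to x
    cy = (1.0 - xx) / denom              # coefficient applied to y
    return [cx * a + cy * b for a, b in zip(x, y)]


def poincare_distance(x, y):
    """Geodesic distance on the ball: 2 * artanh(||(-x) (+) y||)."""
    neg_x = [-a for a in x]
    diff = mobius_add(neg_x, y)
    return 2.0 * math.atanh(math.sqrt(dot(diff, diff)))


x = [0.3, 0.4]
y = [-0.2, 0.5]
z = mobius_add(x, y)
print(z, math.sqrt(dot(z, z)))      # the result stays inside the unit ball
print(poincare_distance(x, y))
```

Unlike vector addition, this operation is neither commutative nor associative in general, but it keeps points inside the ball, which is why it replaces ordinary translation in hyperbolic embedding models.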
Universal Adversarial Triggers for Attacking and Analyzing NLP
Title | Universal Adversarial Triggers for Attacking and Analyzing NLP |
Authors | Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh |
Abstract | Adversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset. We propose a gradient-guided search over tokens which finds short trigger sequences (e.g., one word for classification and four words for language modeling) that successfully trigger the target prediction. For example, triggers cause SNLI entailment accuracy to drop from 89.94% to 0.55%, 72% of “why” questions in SQuAD to be answered “to kill american people”, and the GPT-2 language model to spew racist output even when conditioned on non-racial contexts. Furthermore, although the triggers are optimized using white-box access to a specific model, they transfer to other models for all tasks we consider. Finally, since triggers are input-agnostic, they provide an analysis of global model behavior. For instance, they confirm that SNLI models exploit dataset biases and help to diagnose heuristics learned by reading comprehension models. |
Tasks | Language Modelling, Reading Comprehension |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07125v2 |
https://arxiv.org/pdf/1908.07125v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-adversarial-triggers-for-nlp |
Repo | https://github.com/Eric-Wallace/universal-triggers |
Framework | pytorch |
PifPaf: Composite Fields for Human Pose Estimation
Title | PifPaf: Composite Fields for Human Pose Estimation |
Authors | Sven Kreiss, Lorenzo Bertoni, Alexandre Alahi |
Abstract | We propose a new bottom-up method for multi-person 2D human pose estimation that is particularly well suited for urban mobility such as self-driving cars and delivery robots. The new method, PifPaf, uses a Part Intensity Field (PIF) to localize body parts and a Part Association Field (PAF) to associate body parts with each other to form full human poses. Our method outperforms previous methods at low resolution and in crowded, cluttered and occluded scenes thanks to (i) our new composite field PAF encoding fine-grained information and (ii) the choice of Laplace loss for regressions which incorporates a notion of uncertainty. Our architecture is based on a fully convolutional, single-shot, box-free design. We perform on par with the existing state-of-the-art bottom-up method on the standard COCO keypoint task and produce state-of-the-art results on a modified COCO keypoint task for the transportation domain. |
Tasks | Keypoint Detection, Multi-Person Pose Estimation, Pose Estimation, Self-Driving Cars |
Published | 2019-03-15 |
URL | http://arxiv.org/abs/1903.06593v2 |
http://arxiv.org/pdf/1903.06593v2.pdf | |
PWC | https://paperswithcode.com/paper/pifpaf-composite-fields-for-human-pose |
Repo | https://github.com/vita-epfl/openpifpafwebdemo |
Framework | pytorch |
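The "notion of uncertainty" in a Laplace regression loss can be made concrete with the standard Laplace negative log-likelihood, in which the model predicts a scale `b` alongside the location `mu`. This is a generic sketch; the abstract does not say how PifPaf parameterizes or weights the term, and the function name is hypothetical.

```python
import math


def laplace_nll(target, mu, b):
    """Negative log-likelihood of `target` under Laplace(mu, b).

    -log p = |target - mu| / b + log(2 * b)
    """
    if b <= 0:
        raise ValueError("scale b must be positive")
    return abs(target - mu) / b + math.log(2.0 * b)


# For a fixed residual of 2.0, compare three predicted scales.
for b in (1.0, 2.0, 4.0):
    print(b, laplace_nll(2.0, 0.0, b))
```

For a fixed residual the loss is smallest when `b` equals the absolute residual: an overconfident (too small) `b` inflates the first term, and an underconfident (too large) `b` is penalized by the logarithm.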
On the Effectiveness of Low-rank Approximations for Collaborative Filtering compared to Neural Networks
Title | On the Effectiveness of Low-rank Approximations for Collaborative Filtering compared to Neural Networks |
Authors | Marcel Kurovski, Florian Wilhelm |
Abstract | Even in times of deep learning, low-rank approximations by factorizing a matrix into user and item latent factors continue to be a method of choice for collaborative filtering tasks due to their great performance. While deep learning based approaches excel in hybrid recommender tasks where additional features for items, users or even context are available, their flexibility seems to rather impair the performance compared to low-rank approximations for pure collaborative filtering tasks where no additional features are used. Recent works propose hybrid models combining low-rank approximations and traditional deep neural architectures with promising results but fail to explain why neural networks alone are unsuitable for this task. In this work, we revisit the model and intuition behind low-rank approximation to point out its suitability for collaborative filtering tasks. In several experiments we compare the performance and behavior of models based on a deep neural network and low-rank approximation to examine the reasons for the low effectiveness of traditional deep neural networks. We conclude that the universal approximation capabilities of traditional deep neural networks severely impair the determination of suitable latent vectors, leading to a worse performance compared to low-rank approximations. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12967v1 |
https://arxiv.org/pdf/1905.12967v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-low-rank |
Repo | https://github.com/FlorianWilhelm/lrann |
Framework | pytorch |
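The low-rank approach the abstract refers to predicts a rating as the inner product of a user vector and an item vector. A minimal sketch with plain SGD on a toy rating list follows. It is not the authors' `lrann` code, and the hyperparameters, toy data, and function names are all assumptions made for illustration.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: inner product of user and item latent vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def mse(P, Q, ratings):
    """Mean squared error over the observed (user, item, rating) triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


def train_mf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01,
             epochs=1000, seed=0):
    """Fit user factors P and item factors Q with SGD on squared error."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step with L2 regularization on both factors.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


# Toy data: (user, item, rating); unobserved pairs are simply absent.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = train_mf(ratings, n_users=3, n_items=3)
print(mse(P, Q, ratings))
print(predict(P, Q, 0, 2))   # estimate for a pair not seen in training
```

The only learned quantities are the latent vectors, so the model's inductive bias is the bilinear interaction. A feed-forward network given the same user and item embeddings would have to learn that interaction from data.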
RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space
Title | RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space |
Authors | Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, Jian Tang |
Abstract | We study the problem of learning representations of entities and relations in knowledge graphs for predicting missing links. The success of such a task heavily relies on the ability of modeling and inferring the patterns of (or between) the relations. In this paper, we present a new approach for knowledge graph embedding called RotatE, which is able to model and infer various relation patterns including: symmetry/antisymmetry, inversion, and composition. Specifically, the RotatE model defines each relation as a rotation from the source entity to the target entity in the complex vector space. In addition, we propose a novel self-adversarial negative sampling technique for efficiently and effectively training the RotatE model. Experimental results on multiple benchmark knowledge graphs show that the proposed RotatE model is not only scalable, but also able to infer and model various relation patterns and significantly outperform existing state-of-the-art models for link prediction. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.10197v1 |
http://arxiv.org/pdf/1902.10197v1.pdf | |
PWC | https://paperswithcode.com/paper/rotate-knowledge-graph-embedding-by |
Repo | https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding |
Framework | pytorch |
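The idea of a relation acting as a rotation in complex space can be sketched with unit-modulus complex numbers: the tail is modeled as the element-wise product of the head and the relation, and a triple is scored by how far that product lands from the tail. The sketch below uses a sum of per-dimension moduli as the distance, which is one possible choice of norm and an assumption here. The function names are hypothetical, and this is not the code in the linked repository.

```python
import cmath
import math


def apply_relation(head, phases):
    """Rotate each complex coordinate of `head` by the matching phase."""
    return [h * cmath.exp(1j * theta) for h, theta in zip(head, phases)]


def rotate_distance(head, phases, tail):
    """Sum over dimensions of |h_i * r_i - t_i|, with r_i = exp(i * phase_i)."""
    rotated = apply_relation(head, phases)
    return sum(abs(hr - t) for hr, t in zip(rotated, tail))


head = [1 + 1j, 2 - 1j]

# A relation whose phases are all pi is its own inverse, so it is symmetric:
# rotating the head gives the tail, and rotating the tail gives the head back.
symmetric = [math.pi, math.pi]
tail = apply_relation(head, symmetric)
print(rotate_distance(head, symmetric, tail))
print(rotate_distance(tail, symmetric, head))

# Composition: applying phases a and then b equals applying a + b.
a, b = [0.3, 1.1], [0.5, -0.4]
two_steps = apply_relation(apply_relation(head, a), b)
one_step = apply_relation(head, [x + y for x, y in zip(a, b)])
print(sum(abs(u - v) for u, v in zip(two_steps, one_step)))
```

Because every relation coordinate has modulus 1, it only changes the phase of an entity coordinate. Symmetry, inversion, and composition then correspond to phases of pi, negated phases, and added phases respectively.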
Supervised Encoding for Discrete Representation Learning
Title | Supervised Encoding for Discrete Representation Learning |
Authors | Cat P. Le, Yi Zhou, Jie Ding, Vahid Tarokh |
Abstract | Classical supervised classification tasks search for a nonlinear mapping that maps each encoded feature directly to a probability mass over the labels. Such a learning framework typically lacks the intuition that encoded features from the same class tend to be similar and thus has little interpretability for the learned features. In this paper, we propose a novel supervised learning model named Supervised-Encoding Quantizer (SEQ). The SEQ applies a quantizer to cluster and classify the encoded features. We found that the quantizer provides an interpretable graph where each cluster in the graph represents a class of data samples that have a particular style. We also trained a decoder that can decode convex combinations of the encoded features from similar and different clusters and provide guidance on style transfer between sub-classes. |
Tasks | Representation Learning, Style Transfer |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.11067v1 |
https://arxiv.org/pdf/1910.11067v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-encoding-for-discrete |
Repo | https://github.com/lephuoccat/Supervised-Encoding-Quantizer |
Framework | pytorch |
Dynamic Multi-Task Learning for Face Recognition with Facial Expression
Title | Dynamic Multi-Task Learning for Face Recognition with Facial Expression |
Authors | Zuheng Ming, Junshi Xia, Muhammad Muzzamil Luqman, Jean-Christophe Burie, Kaixing Zhao |
Abstract | Benefiting from the joint learning of multiple tasks in deep multi-task networks, many applications have shown promising performance compared to single-task learning. However, the performance of a multi-task learning framework is highly dependent on the relative weights of the tasks. How to assign the weight of each task is a critical issue in multi-task learning. Instead of tuning the weights manually, which is exhausting and time-consuming, in this paper we propose an approach which can dynamically adapt the weights of the tasks according to the difficulty of training each task. Specifically, the proposed method does not introduce hyperparameters, and its simple structure allows other multi-task deep learning networks to easily realize or reproduce this method. We demonstrate our approach for face recognition with facial expression and facial expression recognition from a single input image based on deep multi-task learning Convolutional Neural Networks (CNNs). Both the theoretical analysis and the experimental results demonstrate the effectiveness of the proposed dynamic multi-task learning method. This multi-task learning with dynamic weights also boosts the performance on the different tasks compared to the state-of-the-art methods with single-task learning. |
Tasks | Face Recognition, Facial Expression Recognition, Multi-Task Learning |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03281v1 |
https://arxiv.org/pdf/1911.03281v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-multi-task-learning-for-face |
Repo | https://github.com/hengxyz/Dynamic_multi-task-learning |
Framework | tf |
An Evaluation of Action Recognition Models on EPIC-Kitchens
Title | An Evaluation of Action Recognition Models on EPIC-Kitchens |
Authors | Will Price, Dima Damen |
Abstract | We benchmark contemporary action recognition models (TSN, TRN, and TSM) on the recently introduced EPIC-Kitchens dataset and release pretrained models on GitHub (https://github.com/epic-kitchens/action-models) for others to build upon. In contrast to popular action recognition datasets like Kinetics, Something-Something, UCF101, and HMDB51, EPIC-Kitchens is shot from an egocentric perspective and captures daily actions in-situ. In this report, we aim to understand how well these models can tackle the challenges present in this dataset, such as its long-tailed class distribution, unseen environment test set, and multiple tasks (verb, noun, and action classification). We discuss the models’ shortcomings and avenues for future research. |
Tasks | Action Classification |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.00867v1 |
https://arxiv.org/pdf/1908.00867v1.pdf | |
PWC | https://paperswithcode.com/paper/an-evaluation-of-action-recognition-models-on |
Repo | https://github.com/epic-kitchens/action-models |
Framework | pytorch |