January 31, 2020

2942 words 14 mins read

Paper Group AWR 372

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection. TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning. Deep Coordination Graphs. Controllable Artistic Text Style Transfer via Shape-Matching GAN. Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-A …

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection

Title DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection
Authors Yuemeng Li, Yong Fan
Abstract Pulmonary nodule detection plays an important role in lung cancer screening with low-dose computed tomography (CT) scans. It remains challenging to build nodule detection deep learning models with good generalization performance due to unbalanced positive and negative samples. In order to overcome this problem and further improve state-of-the-art nodule detection methods, we develop a novel deep 3D convolutional neural network with an Encoder-Decoder structure in conjunction with a region proposal network. Particularly, we utilize a dynamically scaled cross entropy loss to reduce the false positive rate and combat the sample imbalance problem associated with nodule detection. We adopt the squeeze-and-excitation structure to learn effective image features and utilize inter-dependency information of different feature maps. We have validated our method based on publicly available CT scans with manually labelled ground-truth obtained from LIDC/IDRI dataset and its subset LUNA16 with thinner slices. Ablation studies and experimental results have demonstrated that our method could outperform state-of-the-art nodule detection methods by a large margin.
Tasks Computed Tomography (CT)
Published 2019-04-06
URL https://arxiv.org/abs/1904.03501v2
PDF https://arxiv.org/pdf/1904.03501v2.pdf
PWC https://paperswithcode.com/paper/deepseed-3d-squeeze-and-excitation-encoder
Repo https://github.com/ymli39/DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection
Framework pytorch
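
The squeeze-and-excitation structure named in the title recalibrates feature maps channel-wise using their inter-dependencies. Below is a minimal, generic 3D SE block in PyTorch as a sketch of that mechanism; the module name, reduction ratio, and layer sizes are illustrative, and the paper's full encoder-decoder with region proposal network is in the linked repo.

```python
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Generic 3D squeeze-and-excitation block (illustrative, not the paper's exact module)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)            # global average pool per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                  # per-channel gates in (0, 1)
        )

    def forward(self, x):                                  # x: (batch, channels, D, H, W)
        b, c = x.shape[:2]
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                                       # reweight feature maps by learned gates
```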

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

Title TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
Authors Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez
Abstract Learning good feature embeddings for images often requires substantial training data. As a consequence, in settings where training data is limited (e.g., few-shot and zero-shot learning), we are typically forced to use a generic feature embedding across various tasks. Ideally, we want to construct feature embeddings that are tuned for the given task. In this work, we propose Task-Aware Feature Embedding Networks (TAFE-Nets) to learn how to adapt the image representation to a new task in a meta learning fashion. Our network is composed of a meta learner and a prediction network. Based on a task input, the meta learner generates parameters for the feature layers in the prediction network so that the feature embedding can be accurately adjusted for that task. We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning. Our model matches or exceeds the state-of-the-art on all tasks. In particular, our approach improves the prediction accuracy of unseen attribute-object pairs by 4 to 15 points on the challenging visual attribute-object composition task.
Tasks Few-Shot Learning, Meta-Learning, Zero-Shot Learning
Published 2019-04-11
URL http://arxiv.org/abs/1904.05967v1
PDF http://arxiv.org/pdf/1904.05967v1.pdf
PWC https://paperswithcode.com/paper/tafe-net-task-aware-feature-embeddings-for-1
Repo https://github.com/ucbdrive/tafe-net
Framework pytorch
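
In TAFE-Net, the meta learner generates parameters for the feature layers of the prediction network from a task input. The sketch below shows a simplified version of that idea, task-conditioned scaling and shifting of a generic image feature; the actual model generates layer weights with factorization tricks, and every name and dimension here is assumed.

```python
import torch
import torch.nn as nn

class TaskConditionedFeature(nn.Module):
    """Sketch: a meta learner maps a task embedding to per-dimension scale/shift
    parameters that adapt a generic image feature to the task (simplified vs. TAFE-Net)."""
    def __init__(self, task_dim, feat_dim):
        super().__init__()
        self.meta_learner = nn.Linear(task_dim, 2 * feat_dim)   # generates modulation parameters

    def forward(self, image_feat, task_emb):
        scale, shift = self.meta_learner(task_emb).chunk(2, dim=-1)
        return image_feat * torch.sigmoid(scale) + shift        # task-aware feature embedding
```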

Deep Coordination Graphs

Title Deep Coordination Graphs
Authors Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson
Abstract This paper introduces the deep coordination graph (DCG) for collaborative multi-agent reinforcement learning. DCG strikes a flexible trade-off between representational capacity and generalization by factoring the joint value function of all agents according to a coordination graph into payoffs between pairs of agents. The value can be maximized by local message passing along the graph, which allows training of the value function end-to-end with Q-learning. Payoff functions are approximated with deep neural networks that employ parameter sharing and low-rank approximations to significantly improve sample efficiency. We show that DCG can solve predator-prey tasks that highlight the relative overgeneralization pathology, as well as challenging StarCraft II micromanagement tasks.
Tasks Multi-agent Reinforcement Learning, Q-Learning, Starcraft, Starcraft II
Published 2019-09-27
URL https://arxiv.org/abs/1910.00091v3
PDF https://arxiv.org/pdf/1910.00091v3.pdf
PWC https://paperswithcode.com/paper/deep-coordination-graphs
Repo https://github.com/Denys88/rl_games
Framework tf
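
The abstract's key construction is a joint value factored over a coordination graph into per-agent utilities and pairwise payoffs. The toy sketch below evaluates such a factored value and maximizes it by brute force on a tiny graph; DCG itself approximates the payoff functions with deep networks and maximizes by message passing along the graph, so everything here is illustrative only.

```python
from itertools import product
import numpy as np

n_agents, n_actions = 3, 4
edges = [(0, 1), (1, 2)]                                           # coordination graph
f_i = np.random.randn(n_agents, n_actions)                         # per-agent utilities
f_ij = {e: np.random.randn(n_actions, n_actions) for e in edges}   # pairwise payoffs

def joint_value(actions):
    """Q(a) = sum_i f_i(a_i) + sum_{(i,j) in edges} f_ij(a_i, a_j)."""
    value = sum(f_i[i, a] for i, a in enumerate(actions))
    value += sum(f_ij[(i, j)][actions[i], actions[j]] for i, j in edges)
    return value

# brute-force maximization over joint actions (DCG uses message passing instead)
best = max(product(range(n_actions), repeat=n_agents), key=joint_value)
print(best, joint_value(best))
```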

Controllable Artistic Text Style Transfer via Shape-Matching GAN

Title Controllable Artistic Text Style Transfer via Shape-Matching GAN
Authors Shuai Yang, Zhangyang Wang, Zhaowen Wang, Ning Xu, Jiaying Liu, Zongming Guo
Abstract Artistic text style transfer is the task of migrating the style from a source image to the target text to create artistic typography. Recent style transfer methods have considered texture control to enhance usability. However, controlling the stylistic degree in terms of shape deformation remains an important open challenge. In this paper, we present the first text style transfer network that allows for real-time control of the crucial stylistic degree of the glyph through an adjustable parameter. Our key contribution is a novel bidirectional shape matching framework to establish an effective glyph-style mapping at various deformation levels without paired ground truth. Based on this idea, we propose a scale-controllable module to empower a single network to continuously characterize the multi-scale shape features of the style image and transfer these features to the target text. The proposed method demonstrates its superiority over previous state-of-the-art methods in generating diverse, controllable and high-quality stylized text.
Tasks Style Transfer, Text Style Transfer
Published 2019-05-03
URL https://arxiv.org/abs/1905.01354v2
PDF https://arxiv.org/pdf/1905.01354v2.pdf
PWC https://paperswithcode.com/paper/controllable-artistic-text-style-transfer-via
Repo https://github.com/TAMU-VITA/ShapeMatchingGAN
Framework pytorch
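
The scale-controllable module lets one network cover a continuous range of deformation levels via an adjustable parameter. A minimal sketch of that kind of conditioning is below, appending the level as an extra input channel to a toy generator; the real ShapeMatchingGAN architecture differs, and all names and layer sizes here are assumptions.

```python
import torch
import torch.nn as nn

class ScaleConditionedGenerator(nn.Module):
    """Toy generator conditioned on a continuous deformation level (illustrative only)."""
    def __init__(self, in_ch=1, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, glyph, level):                            # level: (batch,) in [0, 1]
        b, _, h, w = glyph.shape
        level_map = level.view(b, 1, 1, 1).expand(b, 1, h, w)   # broadcast level as a channel
        return self.net(torch.cat([glyph, level_map], dim=1))
```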

Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control

Title Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control
Authors Jeremy Morton, Freddie D Witherden, Mykel J Kochenderfer
Abstract Koopman theory asserts that a nonlinear dynamical system can be mapped to a linear system, where the Koopman operator advances observations of the state forward in time. However, the observable functions that map states to observations are generally unknown. We introduce the Deep Variational Koopman (DVK) model, a method for inferring distributions over observations that can be propagated linearly in time. By sampling from the inferred distributions, we obtain a distribution over dynamical models, which in turn provides a distribution over possible outcomes as a modeled system advances in time. Experiments show that the DVK model is effective at long-term prediction for a variety of dynamical systems. Furthermore, we describe how to incorporate the learned models into a control framework, and demonstrate that accounting for the uncertainty present in the distribution over dynamical models enables more effective control.
Tasks
Published 2019-02-26
URL https://arxiv.org/abs/1902.09742v3
PDF https://arxiv.org/pdf/1902.09742v3.pdf
PWC https://paperswithcode.com/paper/deep-variational-koopman-models-inferring
Repo https://github.com/sisl/variational_koopman
Framework tf
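
The Koopman view in the abstract is that observations of the state evolve linearly, g(x_{t+1}) = K g(x_t), even when the state dynamics are nonlinear. The sketch below encodes states to observations, rolls them forward with a learned linear operator, and decodes back; the authors' DVK model additionally infers distributions over the observations, and all names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class KoopmanRollout(nn.Module):
    """Sketch: learned observables with linear dynamics (deterministic, unlike the DVK model)."""
    def __init__(self, state_dim, obs_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, obs_dim))
        self.K = nn.Linear(obs_dim, obs_dim, bias=False)        # linear Koopman operator
        self.decoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, state_dim))

    def forward(self, x0, steps):
        g = self.encoder(x0)
        preds = []
        for _ in range(steps):
            g = self.K(g)                                       # advance observations linearly in time
            preds.append(self.decoder(g))
        return torch.stack(preds, dim=1)                        # predicted states over the horizon
```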

Hierarchically-Refined Label Attention Network for Sequence Labeling

Title Hierarchically-Refined Label Attention Network for Sequence Labeling
Authors Leyang Cui, Yue Zhang
Abstract CRF has been used as a powerful model for statistical sequence labeling. For neural sequence labeling, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. This can be because the simple Markov label transition model of CRF does not give much information gain over strong neural encoding. To better represent label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependency by giving each word incrementally refined label distributions with hierarchical attention. Results on POS tagging, NER and CCG supertagging show that the proposed model not only improves the overall tagging accuracy with a similar number of parameters, but also significantly speeds up the training and testing compared to BiLSTM-CRF.
Tasks CCG Supertagging, Named Entity Recognition, Part-Of-Speech Tagging
Published 2019-08-23
URL https://arxiv.org/abs/1908.08676v3
PDF https://arxiv.org/pdf/1908.08676v3.pdf
PWC https://paperswithcode.com/paper/hierarchically-refined-label-attention
Repo https://github.com/Nealcly/LAN
Framework pytorch
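
The hierarchically-refined label attention idea gives each word an explicit label distribution that is refined layer by layer via attention over label embeddings. The sketch below shows one simplified refinement step; the published LAN interleaves BiLSTM layers and multi-head attention, so treat the names and shapes here as assumptions.

```python
import torch
import torch.nn as nn

class LabelAttentionStep(nn.Module):
    """One simplified label-attention refinement step (not the full LAN architecture)."""
    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, word_repr):                    # word_repr: (batch, seq_len, hidden_dim)
        scores = word_repr @ self.label_emb.t()      # per-word label scores
        attn = torch.softmax(scores, dim=-1)         # incrementally refined label distribution
        label_context = attn @ self.label_emb        # expected label embedding per word
        return torch.cat([word_repr, label_context], dim=-1), attn
```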

Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset

Title Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset
Authors Desmond C. Ong, Zhengxuan Wu, Tan Zhi-Xuan, Marianne Reddan, Isabella Kahhale, Alison Mattek, Jamil Zaki
Abstract Human emotions unfold over time, and more affective computing research has to prioritize capturing this crucial component of real-world affect. Modeling dynamic emotional stimuli requires solving the twin challenges of time-series modeling and of collecting high-quality time-series datasets. We begin by assessing the state-of-the-art in time-series emotion recognition, and we review contemporary time-series approaches in affective computing, including discriminative and generative models. We then introduce the first version of the Stanford Emotional Narratives Dataset (SENDv1): a set of rich, multimodal videos of self-paced, unscripted emotional narratives, annotated for emotional valence over time. The complex narratives and naturalistic expressions in this dataset provide a challenging test for contemporary time-series emotion recognition models. We demonstrate several baseline and state-of-the-art modeling approaches on the SEND, including a Long Short-Term Memory model and a multimodal Variational Recurrent Neural Network, which perform comparably to the human-benchmark. We end by discussing the implications for future research in time-series affective computing.
Tasks Emotion Recognition, Time Series
Published 2019-11-22
URL https://arxiv.org/abs/1912.05008v1
PDF https://arxiv.org/pdf/1912.05008v1.pdf
PWC https://paperswithcode.com/paper/modeling-emotion-in-complex-stories-the
Repo https://github.com/desmond-ong/TAC-EA-model
Framework pytorch

The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition

Title The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
Authors Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noburu Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina, Daniel Ionita
Abstract Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.
Tasks Multi-agent Reinforcement Learning
Published 2019-01-23
URL http://arxiv.org/abs/1901.08129v1
PDF http://arxiv.org/pdf/1901.08129v1.pdf
PWC https://paperswithcode.com/paper/the-multi-agent-reinforcement-learning-in
Repo https://github.com/maximecohen2/conception_solution_appli_ia_i4_tp_research
Framework none

Unsupervised Neural Machine Translation with SMT as Posterior Regularization

Title Unsupervised Neural Machine Translation with SMT as Posterior Regularization
Authors Shuo Ren, Zhirui Zhang, Shujie Liu, Ming Zhou, Shuai Ma
Abstract Without a real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for model training. However, due to weak supervision, the pseudo data inevitably contain noise and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularization to guide the training of unsupervised NMT models in the iterative back-translation process. Our method starts from SMT models built with pre-trained language models and word-level translation tables inferred from cross-lingual embeddings. Then SMT and NMT models are optimized jointly and boost each other incrementally in a unified EM framework. In this way, (1) the negative effect caused by errors in the iterative back-translation process can be alleviated in a timely manner by SMT filtering noise from its phrase tables; meanwhile, (2) NMT can compensate for the deficiency of fluency inherent in SMT. Experiments conducted on en-fr and en-de translation tasks show that our method outperforms the strong baseline and achieves new state-of-the-art unsupervised machine translation performance.
Tasks Machine Translation, Unsupervised Machine Translation
Published 2019-01-14
URL http://arxiv.org/abs/1901.04112v1
PDF http://arxiv.org/pdf/1901.04112v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-neural-machine-translation-with
Repo https://github.com/Imagist-Shuo/UNMT-SPR
Framework tf

Assessing BERT’s Syntactic Abilities

Title Assessing BERT’s Syntactic Abilities
Authors Yoav Goldberg
Abstract I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally-occurring subject-verb agreement stimuli; (2) “colorless green ideas” subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part-of-speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena. The BERT model performs remarkably well on all cases.
Tasks
Published 2019-01-16
URL http://arxiv.org/abs/1901.05287v1
PDF http://arxiv.org/pdf/1901.05287v1.pdf
PWC https://paperswithcode.com/paper/assessing-berts-syntactic-abilities
Repo https://github.com/woailaosang/repo_treasure
Framework none
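
The evaluation boils down to asking BERT which verb form it prefers in a masked position. The snippet below is an assumed setup using the Hugging Face transformers API (the original experiments predate that package), with a single illustrative stimulus; it is a sketch of the protocol, not the author's script.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

# subject-verb agreement stimulus with an attractor ("cabinet") between subject and verb
sentence = "the keys to the cabinet [MASK] on the table"
inputs = tok(sentence, return_tensors="pt")
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

plural, singular = tok.convert_tokens_to_ids(["are", "is"])
print("prefers the correct (plural) form:", bool(logits[plural] > logits[singular]))
```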

StarGAN v2: Diverse Image Synthesis for Multiple Domains

Title StarGAN v2: Diverse Image Synthesis for Multiple Domains
Authors Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha
Abstract A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2.
Tasks Image Generation, Image-to-Image Translation
Published 2019-12-04
URL https://arxiv.org/abs/1912.01865v1
PDF https://arxiv.org/pdf/1912.01865v1.pdf
PWC https://paperswithcode.com/paper/stargan-v2-diverse-image-synthesis-for
Repo https://github.com/clovaai/stargan-v2
Framework pytorch

Real Time Trajectory Prediction Using Deep Conditional Generative Models

Title Real Time Trajectory Prediction Using Deep Conditional Generative Models
Authors Sebastian Gomez-Gonzalez, Sergey Prokudin, Bernhard Scholkopf, Jan Peters
Abstract Data driven methods for time series forecasting that quantify uncertainty open new important possibilities for robot tasks with hard real time constraints, allowing the robot system to make decisions that trade off between reaction time and accuracy in the predictions. Despite the recent advances in deep learning, it is still challenging to make long term accurate predictions with the low latency required by real time robotic systems. In this paper, we propose a deep conditional generative model for trajectory prediction that is learned from a data set of collected trajectories. Our method uses encoder and decoder deep networks that map complete or partial trajectories to a Gaussian distributed latent space and back, allowing for fast inference of the future values of a trajectory given previous observations. The encoder and decoder networks are trained using stochastic gradient variational Bayes. In the experiments, we show that our model provides more accurate long term predictions with a lower latency than popular models for trajectory forecasting like recurrent neural networks or physical models based on differential equations. Finally, we test our proposed approach in a robot table tennis scenario to evaluate the performance of the proposed method in a robotic task with hard real time constraints.
Tasks Time Series, Time Series Forecasting, Trajectory Prediction
Published 2019-09-09
URL https://arxiv.org/abs/1909.03895v2
PDF https://arxiv.org/pdf/1909.03895v2.pdf
PWC https://paperswithcode.com/paper/real-time-trajectory-prediction-using-deep
Repo https://github.com/sebasutp/trajectory_forcasting
Framework none
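
The core of the method is a conditional generative model: an observed partial trajectory is encoded into a Gaussian latent space, and decoding latent samples yields a distribution over possible futures. Below is a bare-bones sketch of that structure; the authors' networks, training details, and dimensions differ, and every name here is assumed.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Sketch of a conditional generative trajectory model (illustrative, not the paper's exact net)."""
    def __init__(self, obs_len, pred_len, dim=2, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(obs_len * dim, 64), nn.ReLU(), nn.Linear(64, 2 * latent))
        self.dec = nn.Sequential(nn.Linear(latent + obs_len * dim, 64), nn.ReLU(),
                                 nn.Linear(64, pred_len * dim))

    def forward(self, obs):                                   # obs: (batch, obs_len * dim), flattened
        mu, logvar = self.enc(obs).chunk(2, dim=-1)           # Gaussian latent from the observed part
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterised sample
        future = self.dec(torch.cat([z, obs], dim=-1))        # one sampled future trajectory
        return future, mu, logvar
```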

GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning

Title GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning
Authors Vikas Verma, Meng Qu, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang
Abstract We present GraphMix, a regularization technique for Graph Neural Network based semi-supervised object classification, leveraging the recent advances in the regularization of classical deep neural networks. Specifically, we propose a unified approach in which we train a fully-connected network jointly with the graph neural network via parameter sharing, interpolation-based regularization, and self-predicted targets. Our proposed method is architecture agnostic in the sense that it can be applied to any variant of graph neural networks which applies a parametric transformation to the features of the graph nodes. Despite its simplicity, with GraphMix we can consistently improve results and achieve or closely match state-of-the-art performance using even simpler architectures such as Graph Convolutional Networks, across three established graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as three newly proposed datasets: Cora-Full, Co-author-CS and Co-author-Physics.
Tasks Node Classification, Object Classification
Published 2019-09-25
URL https://arxiv.org/abs/1909.11715v1
PDF https://arxiv.org/pdf/1909.11715v1.pdf
PWC https://paperswithcode.com/paper/graphmix-regularized-training-of-graph-neural
Repo https://github.com/vikasverma1077/GraphMix
Framework pytorch
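
The interpolation-based regularization mentioned in the abstract mixes pairs of training examples and their (soft) targets for the fully-connected branch. A minimal mixup-style sketch is below; GraphMix actually interpolates hidden representations (Manifold Mixup) and combines this with parameter sharing and self-predicted targets, so this is only one ingredient, with assumed names.

```python
import numpy as np
import torch

def mixup_nodes(features, soft_labels, alpha=1.0):
    """Mix pairs of node features and soft labels (one ingredient of GraphMix, simplified)."""
    lam = float(np.random.beta(alpha, alpha))          # interpolation coefficient
    perm = torch.randperm(features.size(0))            # random pairing of nodes
    mixed_x = lam * features + (1 - lam) * features[perm]
    mixed_y = lam * soft_labels + (1 - lam) * soft_labels[perm]
    return mixed_x, mixed_y
```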

Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction

Title Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction
Authors Hongqian Qin
Abstract Deep learning has achieved impressive prediction performance in the field of sequence learning recently. Dissolved oxygen prediction, as a kind of time-series forecasting, is suitable for this technique. Although many researchers have developed hybrid models or variant models based on deep learning techniques, there is currently no comprehensive and sound comparison among the deep learning models in this field. Moreover, most previous studies focused on one-step forecasting using small data sets. Given the convenient access to high-frequency data, this paper compares multi-step deep learning forecasting using walk-forward validation. Specifically, we test Convolutional Neural Network (CNN), Temporal Convolutional Network (TCN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Bidirectional Recurrent Neural Network (BiRNN) models on real-time data recorded automatically at a fixed observation point in the Yangtze River from 2012 to 2016. By comparing the average accumulated statistical metrics of root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination at each time step, we find that for multi-step time series forecasting the average performance at each time step does not decrease linearly. GRU outperforms the other models by a clear margin.
Tasks Time Series, Time Series Forecasting
Published 2019-11-17
URL https://arxiv.org/abs/1911.08414v2
PDF https://arxiv.org/pdf/1911.08414v2.pdf
PWC https://paperswithcode.com/paper/comparison-of-deep-learning-models-on-time
Repo https://github.com/qin67/Multistep-Time-Series-DO-Case
Framework none
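
Walk-forward validation, used for the multi-step comparison above, repeatedly trains on all data up to a cut-off and evaluates on the next block of steps. The helper below is a generic sketch of that splitting scheme (not the paper's exact protocol); window sizes are illustrative.

```python
import numpy as np

def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    cut = initial_train
    while cut + horizon <= n_samples:
        yield np.arange(cut), np.arange(cut, cut + horizon)  # train on [0, cut), test the next `horizon` steps
        cut += horizon

# toy usage: 20 observations, start with 10 for training, forecast 3 steps ahead each fold
for train_idx, test_idx in walk_forward_splits(20, 10, 3):
    print(len(train_idx), "->", test_idx.tolist())
```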

IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters

Title IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters
Authors Xinshao Wang, Yang Hua, Elyor Kodirov, Neil M. Robertson
Abstract In this work, we study robust deep learning against abnormal training data from the perspective of example weighting built in empirical loss functions, i.e., gradient magnitude with respect to logits, an angle that has not been thoroughly studied so far. Consequently, we have two key findings: (1) Mean Absolute Error (MAE) Does Not Treat Examples Equally. We present new observations and insightful analysis about MAE, which is theoretically proved to be noise-robust. First, we reveal its underfitting problem in practice. Second, we find that MAE’s noise-robustness comes from emphasising uncertain examples instead of treating training samples equally, as claimed in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective and simple solution to enhance MAE’s fitting ability while preserving its noise-robustness. Without changing MAE’s overall weighting scheme, i.e., what examples get higher weights, we simply change its weighting variance non-linearly so that the impact ratio between two examples is adjusted. Our solution is termed Improved MAE (IMAE). We prove IMAE’s effectiveness using extensive experiments: image classification under clean labels, synthetic label noise, and real-world unknown noise. We conclude IMAE is superior to CCE, the most popular loss for training DNNs.
Tasks Image Classification, Video Retrieval
Published 2019-03-28
URL https://arxiv.org/abs/1903.12141v8
PDF https://arxiv.org/pdf/1903.12141v8.pdf
PWC https://paperswithcode.com/paper/improving-mae-against-cce-under-label-noise
Repo https://github.com/XinshaoAmosWang/Improving-Mean-Absolute-Error-against-CCE
Framework none
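
The "example weighting" view in the abstract can be made concrete with the gradient magnitude with respect to the true-class logit of a softmax classifier: up to constant factors it is (1 - p) for cross entropy but p(1 - p) for MAE, so MAE down-weights low-confidence examples. The tiny numeric check below illustrates only this observation; IMAE's actual non-linear rescaling of the weighting variance is given in the paper and repo.

```python
import numpy as np

p = np.linspace(0.01, 0.99, 5)                 # predicted probability of the true class
print("CCE weight (1 - p):      ", np.round(1 - p, 3))
print("MAE weight (p * (1 - p)):", np.round(p * (1 - p), 3))
```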