January 31, 2020

2942 words 14 mins read

Paper Group AWR 372

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection. TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning. Deep Coordination Graphs. Controllable Artistic Text Style Transfer via Shape-Matching GAN. Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-A …

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection

Title DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection
Authors Yuemeng Li, Yong Fan
Abstract Pulmonary nodule detection plays an important role in lung cancer screening with low-dose computed tomography (CT) scans. It remains challenging to build nodule detection deep learning models with good generalization performance due to unbalanced positive and negative samples. In order to overcome this problem and further improve state-of-the-art nodule detection methods, we develop a novel deep 3D convolutional neural network with an Encoder-Decoder structure in conjunction with a region proposal network. Particularly, we utilize a dynamically scaled cross entropy loss to reduce the false positive rate and combat the sample imbalance problem associated with nodule detection. We adopt the squeeze-and-excitation structure to learn effective image features and utilize inter-dependency information of different feature maps. We have validated our method based on publicly available CT scans with manually labelled ground-truth obtained from LIDC/IDRI dataset and its subset LUNA16 with thinner slices. Ablation studies and experimental results have demonstrated that our method could outperform state-of-the-art nodule detection methods by a large margin.
Tasks Computed Tomography (CT)
Published 2019-04-06
URL https://arxiv.org/abs/1904.03501v2
PDF https://arxiv.org/pdf/1904.03501v2.pdf
PWC https://paperswithcode.com/paper/deepseed-3d-squeeze-and-excitation-encoder
Repo https://github.com/ymli39/DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection
Framework pytorch
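
The squeeze-and-excitation structure named in the title recalibrates feature maps channel-wise using their inter-dependencies. Below is a minimal, generic 3D SE block in PyTorch as a sketch of that mechanism; the module name, reduction ratio, and layer sizes are illustrative, and the paper's full encoder-decoder with region proposal network is in the linked repo.

```python
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Generic 3D squeeze-and-excitation block (illustrative, not the paper's exact module)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)            # global average pool per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                  # per-channel gates in (0, 1)
        )

    def forward(self, x):                                  # x: (batch, channels, D, H, W)
        b, c = x.shape[:2]
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                                       # reweight feature maps by learned gates
```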

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

Title TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
Authors Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez
Abstract Learning good feature embeddings for images often requires substantial training data. As a consequence, in settings where training data is limited (e.g., few-shot and zero-shot learning), we are typically forced to use a generic feature embedding across various tasks. Ideally, we want to construct feature embeddings that are tuned for the given task. In this work, we propose Task-Aware Feature Embedding Networks (TAFE-Nets) to learn how to adapt the image representation to a new task in a meta learning fashion. Our network is composed of a meta learner and a prediction network. Based on a task input, the meta learner generates parameters for the feature layers in the prediction network so that the feature embedding can be accurately adjusted for that task. We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning. Our model matches or exceeds the state-of-the-art on all tasks. In particular, our approach improves the prediction accuracy of unseen attribute-object pairs by 4 to 15 points on the challenging visual attribute-object composition task.
Tasks Few-Shot Learning, Meta-Learning, Zero-Shot Learning
Published 2019-04-11
URL http://arxiv.org/abs/1904.05967v1
PDF http://arxiv.org/pdf/1904.05967v1.pdf
PWC https://paperswithcode.com/paper/tafe-net-task-aware-feature-embeddings-for-1
Repo https://github.com/ucbdrive/tafe-net
Framework pytorch
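
In TAFE-Net, the meta learner generates parameters for the feature layers of the prediction network from a task input. The sketch below shows a simplified version of that idea, task-conditioned scaling and shifting of a generic image feature; the actual model generates layer weights with factorization tricks, and every name and dimension here is assumed.

```python
import torch
import torch.nn as nn

class TaskConditionedFeature(nn.Module):
    """Sketch: a meta learner maps a task embedding to per-dimension scale/shift
    parameters that adapt a generic image feature to the task (simplified vs. TAFE-Net)."""
    def __init__(self, task_dim, feat_dim):
        super().__init__()
        self.meta_learner = nn.Linear(task_dim, 2 * feat_dim)   # generates modulation parameters

    def forward(self, image_feat, task_emb):
        scale, shift = self.meta_learner(task_emb).chunk(2, dim=-1)
        return image_feat * torch.sigmoid(scale) + shift        # task-aware feature embedding
```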

Deep Coordination Graphs

Title Deep Coordination Graphs
Authors Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson
Abstract This paper introduces the deep coordination graph (DCG) for collaborative multi-agent reinforcement learning. DCG strikes a flexible trade-off between representational capacity and generalization by factoring the joint value function of all agents according to a coordination graph into payoffs between pairs of agents. The value can be maximized by local message passing along the graph, which allows training of the value function end-to-end with Q-learning. Payoff functions are approximated with deep neural networks that employ parameter sharing and low-rank approximations to significantly improve sample efficiency. We show that DCG can solve predator-prey tasks that highlight the relative overgeneralization pathology, as well as challenging StarCraft II micromanagement tasks.
Tasks Multi-agent Reinforcement Learning, Q-Learning, Starcraft, Starcraft II
Published 2019-09-27
URL https://arxiv.org/abs/1910.00091v3
PDF https://arxiv.org/pdf/1910.00091v3.pdf
PWC https://paperswithcode.com/paper/deep-coordination-graphs
Repo https://github.com/Denys88/rl_games
Framework tf
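
The abstract's key construction is a joint value factored over a coordination graph into per-agent utilities and pairwise payoffs. The toy sketch below evaluates such a factored value and maximizes it by brute force on a tiny graph; DCG itself approximates the payoff functions with deep networks and maximizes by message passing along the graph, so everything here is illustrative only.

```python
from itertools import product
import numpy as np

n_agents, n_actions = 3, 4
edges = [(0, 1), (1, 2)]                                           # coordination graph
f_i = np.random.randn(n_agents, n_actions)                         # per-agent utilities
f_ij = {e: np.random.randn(n_actions, n_actions) for e in edges}   # pairwise payoffs

def joint_value(actions):
    """Q(a) = sum_i f_i(a_i) + sum_{(i,j) in edges} f_ij(a_i, a_j)."""
    value = sum(f_i[i, a] for i, a in enumerate(actions))
    value += sum(f_ij[(i, j)][actions[i], actions[j]] for i, j in edges)
    return value

# brute-force maximization over joint actions (DCG uses message passing instead)
best = max(product(range(n_actions), repeat=n_agents), key=joint_value)
print(best, joint_value(best))
```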

Controllable Artistic Text Style Transfer via Shape-Matching GAN

Title Controllable Artistic Text Style Transfer via Shape-Matching GAN
Authors Shuai Yang, Zhangyang Wang, Zhaowen Wang, Ning Xu, Jiaying Liu, Zongming Guo
Abstract Artistic text style transfer is the task of migrating the style from a source image to the target text to create artistic typography. Recent style transfer methods have considered texture control to enhance usability. However, controlling the stylistic degree in terms of shape deformation remains an important open challenge. In this paper, we present the first text style transfer network that allows for real-time control of the crucial stylistic degree of the glyph through an adjustable parameter. Our key contribution is a novel bidirectional shape matching framework to establish an effective glyph-style mapping at various deformation levels without paired ground truth. Based on this idea, we propose a scale-controllable module to empower a single network to continuously characterize the multi-scale shape features of the style image and transfer these features to the target text. The proposed method demonstrates its superiority over previous state-of-the-art methods in generating diverse, controllable and high-quality stylized text.
Tasks Style Transfer, Text Style Transfer
Published 2019-05-03
URL https://arxiv.org/abs/1905.01354v2
PDF https://arxiv.org/pdf/1905.01354v2.pdf
PWC https://paperswithcode.com/paper/controllable-artistic-text-style-transfer-via
Repo https://github.com/TAMU-VITA/ShapeMatchingGAN
Framework pytorch
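
The scale-controllable module lets one network cover a continuous range of deformation levels via an adjustable parameter. A minimal sketch of that kind of conditioning is below, appending the level as an extra input channel to a toy generator; the real ShapeMatchingGAN architecture differs, and all names and layer sizes here are assumptions.

```python
import torch
import torch.nn as nn

class ScaleConditionedGenerator(nn.Module):
    """Toy generator conditioned on a continuous deformation level (illustrative only)."""
    def __init__(self, in_ch=1, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, glyph, level):                            # level: (batch,) in [0, 1]
        b, _, h, w = glyph.shape
        level_map = level.view(b, 1, 1, 1).expand(b, 1, h, w)   # broadcast level as a channel
        return self.net(torch.cat([glyph, level_map], dim=1))
```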

Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control

Title Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control
Authors Jeremy Morton, Freddie D Witherden, Mykel J Kochenderfer
Abstract Koopman theory asserts that a nonlinear dynamical system can be mapped to a linear system, where the Koopman operator advances observations of the state forward in time. However, the observable functions that map states to observations are generally unknown. We introduce the Deep Variational Koopman (DVK) model, a method for inferring distributions over observations that can be propagated linearly in time. By sampling from the inferred distributions, we obtain a distribution over dynamical models, which in turn provides a distribution over possible outcomes as a modeled system advances in time. Experiments show that the DVK model is effective at long-term prediction for a variety of dynamical systems. Furthermore, we describe how to incorporate the learned models into a control framework, and demonstrate that accounting for the uncertainty present in the distribution over dynamical models enables more effective control.
Tasks
Published 2019-02-26
URL https://arxiv.org/abs/1902.09742v3
PDF https://arxiv.org/pdf/1902.09742v3.pdf
PWC https://paperswithcode.com/paper/deep-variational-koopman-models-inferring
Repo https://github.com/sisl/variational_koopman
Framework tf
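
The Koopman view in the abstract is that observations of the state evolve linearly, g(x_{t+1}) = K g(x_t), even when the state dynamics are nonlinear. The sketch below encodes states to observations, rolls them forward with a learned linear operator, and decodes back; the authors' DVK model additionally infers distributions over the observations, and all names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class KoopmanRollout(nn.Module):
    """Sketch: learned observables with linear dynamics (deterministic, unlike the DVK model)."""
    def __init__(self, state_dim, obs_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, obs_dim))
        self.K = nn.Linear(obs_dim, obs_dim, bias=False)        # linear Koopman operator
        self.decoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, state_dim))

    def forward(self, x0, steps):
        g = self.encoder(x0)
        preds = []
        for _ in range(steps):
            g = self.K(g)                                       # advance observations linearly in time
            preds.append(self.decoder(g))
        return torch.stack(preds, dim=1)                        # predicted states over the horizon
```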

Hierarchically-Refined Label Attention Network for Sequence Labeling

Title Hierarchically-Refined Label Attention Network for Sequence Labeling
Authors Leyang Cui, Yue Zhang
Abstract CRF has been used as a powerful model for statistical sequence labeling. For neural sequence labeling, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. This can be because the simple Markov label transition model of CRF does not give much information gain over strong neural encoding. To better represent label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependency by giving each word incrementally refined label distributions with hierarchical attention. Results on POS tagging, NER and CCG supertagging show that the proposed model not only improves the overall tagging accuracy with a similar number of parameters, but also significantly speeds up the training and testing compared to BiLSTM-CRF.
Tasks CCG Supertagging, Named Entity Recognition, Part-Of-Speech Tagging
Published 2019-08-23
URL https://arxiv.org/abs/1908.08676v3
PDF https://arxiv.org/pdf/1908.08676v3.pdf
PWC https://paperswithcode.com/paper/hierarchically-refined-label-attention
Repo https://github.com/Nealcly/LAN
Framework pytorch
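
The hierarchically-refined label attention idea gives each word an explicit label distribution that is refined layer by layer via attention over label embeddings. The sketch below shows one simplified refinement step; the published LAN interleaves BiLSTM layers and multi-head attention, so treat the names and shapes here as assumptions.

```python
import torch
import torch.nn as nn

class LabelAttentionStep(nn.Module):
    """One simplified label-attention refinement step (not the full LAN architecture)."""
    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, word_repr):                    # word_repr: (batch, seq_len, hidden_dim)
        scores = word_repr @ self.label_emb.t()      # per-word label scores
        attn = torch.softmax(scores, dim=-1)         # incrementally refined label distribution
        label_context = attn @ self.label_emb        # expected label embedding per word
        return torch.cat([word_repr, label_context], dim=-1), attn
```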

Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset

Title Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset
Authors Desmond C. Ong, Zhengxuan Wu, Tan Zhi-Xuan, Marianne Reddan, Isabella Kahhale, Alison Mattek, Jamil Zaki
Abstract Human emotions unfold over time, and more affective computing research has to prioritize capturing this crucial component of real-world affect. Modeling dynamic emotional stimuli requires solving the twin challenges of time-series modeling and of collecting high-quality time-series datasets. We begin by assessing the state-of-the-art in time-series emotion recognition, and we review contemporary time-series approaches in affective computing, including discriminative and generative models. We then introduce the first version of the Stanford Emotional Narratives Dataset (SENDv1): a set of rich, multimodal videos of self-paced, unscripted emotional narratives, annotated for emotional valence over time. The complex narratives and naturalistic expressions in this dataset provide a challenging test for contemporary time-series emotion recognition models. We demonstrate several baseline and state-of-the-art modeling approaches on the SEND, including a Long Short-Term Memory model and a multimodal Variational Recurrent Neural Network, which perform comparably to the human-benchmark. We end by discussing the implications for future research in time-series affective computing.
Tasks Emotion Recognition, Time Series
Published 2019-11-22
URL https://arxiv.org/abs/1912.05008v1
PDF https://arxiv.org/pdf/1912.05008v1.pdf
PWC https://paperswithcode.com/paper/modeling-emotion-in-complex-stories-the
Repo https://github.com/desmond-ong/TAC-EA-model
Framework pytorch

The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition

Title The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
Authors Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noburu Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina, Daniel Ionita
Abstract Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.
Tasks Multi-agent Reinforcement Learning
Published 2019-01-23
URL http://arxiv.org/abs/1901.08129v1
PDF http://arxiv.org/pdf/1901.08129v1.pdf
PWC https://paperswithcode.com/paper/the-multi-agent-reinforcement-learning-in
Repo https://github.com/maximecohen2/conception_solution_appli_ia_i4_tp_research
Framework none

Unsupervised Neural Machine Translation with SMT as Posterior Regularization

Title Unsupervised Neural Machine Translation with SMT as Posterior Regularization
Authors Shuo Ren, Zhirui Zhang, Shujie Liu, Ming Zhou, Shuai Ma
Abstract Without a real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for model training. However, due to weak supervision, the pseudo data inevitably contain noise and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularization to guide the training of unsupervised NMT models in the iterative back-translation process. Our method starts from SMT models built with pre-trained language models and word-level translation tables inferred from cross-lingual embeddings. Then SMT and NMT models are optimized jointly and boost each other incrementally in a unified EM framework. In this way, (1) the negative effect caused by errors in the iterative back-translation process can be alleviated in a timely manner by SMT filtering noise from its phrase tables; meanwhile, (2) NMT can compensate for the deficiency of fluency inherent in SMT. Experiments conducted on en-fr and en-de translation tasks show that our method outperforms the strong baseline and achieves new state-of-the-art unsupervised machine translation performance.
Tasks Machine Translation, Unsupervised Machine Translation
Published 2019-01-14
URL http://arxiv.org/abs/1901.04112v1
PDF http://arxiv.org/pdf/1901.04112v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-neural-machine-translation-with
Repo https://github.com/Imagist-Shuo/UNMT-SPR
Framework tf

Assessing BERT’s Syntactic Abilities

Title Assessing BERT’s Syntactic Abilities
Authors Yoav Goldberg
Abstract I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally-occurring subject-verb agreement stimuli; (2) “colorless green ideas” subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part-of-speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena. The BERT model performs remarkably well on all cases.
Tasks
Published 2019-01-16
URL http://arxiv.org/abs/1901.05287v1
PDF http://arxiv.org/pdf/1901.05287v1.pdf
PWC https://paperswithcode.com/paper/assessing-berts-syntactic-abilities
Repo https://github.com/woailaosang/repo_treasure
Framework none
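
The evaluation boils down to asking BERT which verb form it prefers in a masked position. The snippet below is an assumed setup using the Hugging Face transformers API (the original experiments predate that package), with a single illustrative stimulus; it is a sketch of the protocol, not the author's script.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

# subject-verb agreement stimulus with an attractor ("cabinet") between subject and verb
sentence = "the keys to the cabinet [MASK] on the table"
inputs = tok(sentence, return_tensors="pt")
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

plural, singular = tok.convert_tokens_to_ids(["are", "is"])
print("prefers the correct (plural) form:", bool(logits[plural] > logits[singular]))
```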

StarGAN v2: Diverse Image Synthesis for Multiple Domains

Title StarGAN v2: Diverse Image Synthesis for Multiple Domains
Authors Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha
Abstract A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2.
Tasks Image Generation, Image-to-Image Translation
Published 2019-12-04
URL https://arxiv.org/abs/1912.01865v1
PDF https://arxiv.org/pdf/1912.01865v1.pdf
PWC https://paperswithcode.com/paper/stargan-v2-diverse-image-synthesis-for
Repo https://github.com/clovaai/stargan-v2
Framework pytorch

Real Time Trajectory Prediction Using Deep Conditional Generative Models

Title Real Time Trajectory Prediction Using Deep Conditional Generative Models
Authors Sebastian Gomez-Gonzalez, Sergey Prokudin, Bernhard Scholkopf, Jan Peters
Abstract Data driven methods for time series forecasting that quantify uncertainty open new important possibilities for robot tasks with hard real time constraints, allowing the robot system to make decisions that trade off between reaction time and accuracy in the predictions. Despite the recent advances in deep learning, it is still challenging to make long term accurate predictions with the low latency required by real time robotic systems. In this paper, we propose a deep conditional generative model for trajectory prediction that is learned from a data set of collected trajectories. Our method uses encoder and decoder deep networks that map complete or partial trajectories to a Gaussian distributed latent space and back, allowing for fast inference of the future values of a trajectory given previous observations. The encoder and decoder networks are trained using stochastic gradient variational Bayes. In the experiments, we show that our model provides more accurate long term predictions with a lower latency than popular models for trajectory forecasting like recurrent neural networks or physical models based on differential equations. Finally, we test our proposed approach in a robot table tennis scenario to evaluate the performance of the proposed method in a robotic task with hard real time constraints.
Tasks Time Series, Time Series Forecasting, Trajectory Prediction
Published 2019-09-09
URL https://arxiv.org/abs/1909.03895v2
PDF https://arxiv.org/pdf/1909.03895v2.pdf
PWC https://paperswithcode.com/paper/real-time-trajectory-prediction-using-deep
Repo https://github.com/sebasutp/trajectory_forcasting
Framework none
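
The core of the method is a conditional generative model: an observed partial trajectory is encoded into a Gaussian latent space, and decoding latent samples yields a distribution over possible futures. Below is a bare-bones sketch of that structure; the authors' networks, training details, and dimensions differ, and every name here is assumed.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Sketch of a conditional generative trajectory model (illustrative, not the paper's exact net)."""
    def __init__(self, obs_len, pred_len, dim=2, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(obs_len * dim, 64), nn.ReLU(), nn.Linear(64, 2 * latent))
        self.dec = nn.Sequential(nn.Linear(latent + obs_len * dim, 64), nn.ReLU(),
                                 nn.Linear(64, pred_len * dim))

    def forward(self, obs):                                   # obs: (batch, obs_len * dim), flattened
        mu, logvar = self.enc(obs).chunk(2, dim=-1)           # Gaussian latent from the observed part
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterised sample
        future = self.dec(torch.cat([z, obs], dim=-1))        # one sampled future trajectory
        return future, mu, logvar
```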

GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning

Title GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning
Authors Vikas Verma, Meng Qu, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang
Abstract We present GraphMix, a regularization technique for Graph Neural Network based semi-supervised object classification, leveraging the recent advances in the regularization of classical deep neural networks. Specifically, we propose a unified approach in which we train a fully-connected network jointly with the graph neural network via parameter sharing, interpolation-based regularization, and self-predicted targets. Our proposed method is architecture agnostic in the sense that it can be applied to any variant of graph neural networks which applies a parametric transformation to the features of the graph nodes. Despite its simplicity, with GraphMix we can consistently improve results and achieve or closely match state-of-the-art performance using even simpler architectures such as Graph Convolutional Networks, across three established graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as three newly proposed datasets: Cora-Full, Co-author-CS and Co-author-Physics.
Tasks Node Classification, Object Classification
Published 2019-09-25
URL https://arxiv.org/abs/1909.11715v1
PDF https://arxiv.org/pdf/1909.11715v1.pdf
PWC https://paperswithcode.com/paper/graphmix-regularized-training-of-graph-neural
Repo https://github.com/vikasverma1077/GraphMix
Framework pytorch
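
The interpolation-based regularization mentioned in the abstract mixes pairs of training examples and their (soft) targets for the fully-connected branch. A minimal mixup-style sketch is below; GraphMix actually interpolates hidden representations (Manifold Mixup) and combines this with parameter sharing and self-predicted targets, so this is only one ingredient, with assumed names.

```python
import numpy as np
import torch

def mixup_nodes(features, soft_labels, alpha=1.0):
    """Mix pairs of node features and soft labels (one ingredient of GraphMix, simplified)."""
    lam = float(np.random.beta(alpha, alpha))          # interpolation coefficient
    perm = torch.randperm(features.size(0))            # random pairing of nodes
    mixed_x = lam * features + (1 - lam) * features[perm]
    mixed_y = lam * soft_labels + (1 - lam) * soft_labels[perm]
    return mixed_x, mixed_y
```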

Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction

Title Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction
Authors Hongqian Qin
Abstract Deep learning has achieved impressive prediction performance in the field of sequence learning recently. Dissolved oxygen prediction, as a kind of time-series forecasting, is suitable for this technique. Although many researchers have developed hybrid models or variant models based on deep learning techniques, there is currently no comprehensive and sound comparison among the deep learning models in this field. Moreover, most previous studies focused on one-step forecasting using small data sets. Given the convenient access to high-frequency data, this paper compares multi-step deep learning forecasting using walk-forward validation. Specifically, we test Convolutional Neural Network (CNN), Temporal Convolutional Network (TCN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Bidirectional Recurrent Neural Network (BiRNN) models on real-time data recorded automatically at a fixed observation point in the Yangtze River from 2012 to 2016. By comparing the average accumulated statistical metrics of root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination at each time step, we find that for multi-step time series forecasting the average performance at each time step does not decrease linearly. GRU outperforms the other models by a clear margin.
Tasks Time Series, Time Series Forecasting
Published 2019-11-17
URL https://arxiv.org/abs/1911.08414v2
PDF https://arxiv.org/pdf/1911.08414v2.pdf
PWC https://paperswithcode.com/paper/comparison-of-deep-learning-models-on-time
Repo https://github.com/qin67/Multistep-Time-Series-DO-Case
Framework none
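
Walk-forward validation, used for the multi-step comparison above, repeatedly trains on all data up to a cut-off and evaluates on the next block of steps. The helper below is a generic sketch of that splitting scheme (not the paper's exact protocol); window sizes are illustrative.

```python
import numpy as np

def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    cut = initial_train
    while cut + horizon <= n_samples:
        yield np.arange(cut), np.arange(cut, cut + horizon)  # train on [0, cut), test the next `horizon` steps
        cut += horizon

# toy usage: 20 observations, start with 10 for training, forecast 3 steps ahead each fold
for train_idx, test_idx in walk_forward_splits(20, 10, 3):
    print(len(train_idx), "->", test_idx.tolist())
```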

IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters

Title IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters
Authors Xinshao Wang, Yang Hua, Elyor Kodirov, Neil M. Robertson
Abstract In this work, we study robust deep learning against abnormal training data from the perspective of example weighting built in empirical loss functions, i.e., gradient magnitude with respect to logits, an angle that has not been thoroughly studied so far. Consequently, we have two key findings: (1) Mean Absolute Error (MAE) Does Not Treat Examples Equally. We present new observations and insightful analysis about MAE, which is theoretically proved to be noise-robust. First, we reveal its underfitting problem in practice. Second, we find that MAE’s noise-robustness comes from emphasising uncertain examples instead of treating training samples equally, as claimed in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective and simple solution to enhance MAE’s fitting ability while preserving its noise-robustness. Without changing MAE’s overall weighting scheme, i.e., what examples get higher weights, we simply change its weighting variance non-linearly so that the impact ratio between two examples is adjusted. Our solution is termed Improved MAE (IMAE). We prove IMAE’s effectiveness using extensive experiments: image classification under clean labels, synthetic label noise, and real-world unknown noise. We conclude IMAE is superior to CCE, the most popular loss for training DNNs.
Tasks Image Classification, Video Retrieval
Published 2019-03-28
URL https://arxiv.org/abs/1903.12141v8
PDF https://arxiv.org/pdf/1903.12141v8.pdf
PWC https://paperswithcode.com/paper/improving-mae-against-cce-under-label-noise
Repo https://github.com/XinshaoAmosWang/Improving-Mean-Absolute-Error-against-CCE
Framework none
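
The "example weighting" view in the abstract can be made concrete with the gradient magnitude with respect to the true-class logit of a softmax classifier: up to constant factors it is (1 - p) for cross entropy but p(1 - p) for MAE, so MAE down-weights low-confidence examples. The tiny numeric check below illustrates only this observation; IMAE's actual non-linear rescaling of the weighting variance is given in the paper and repo.

```python
import numpy as np

p = np.linspace(0.01, 0.99, 5)                 # predicted probability of the true class
print("CCE weight (1 - p):      ", np.round(1 - p, 3))
print("MAE weight (p * (1 - p)):", np.round(p * (1 - p), 3))
```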