Paper Group AWR 372
DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection. TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning. Deep Coordination Graphs. Controllable Artistic Text Style Transfer via Shape-Matching GAN. Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-A …
DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection
Title | DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder Convolutional Neural Networks for Pulmonary Nodule Detection |
Authors | Yuemeng Li, Yong Fan |
Abstract | Pulmonary nodule detection plays an important role in lung cancer screening with low-dose computed tomography (CT) scans. It remains challenging to build nodule detection deep learning models with good generalization performance due to unbalanced positive and negative samples. In order to overcome this problem and further improve state-of-the-art nodule detection methods, we develop a novel deep 3D convolutional neural network with an Encoder-Decoder structure in conjunction with a region proposal network. Particularly, we utilize a dynamically scaled cross entropy loss to reduce the false positive rate and combat the sample imbalance problem associated with nodule detection. We adopt the squeeze-and-excitation structure to learn effective image features and utilize inter-dependency information of different feature maps. We have validated our method based on publicly available CT scans with manually labelled ground-truth obtained from LIDC/IDRI dataset and its subset LUNA16 with thinner slices. Ablation studies and experimental results have demonstrated that our method could outperform state-of-the-art nodule detection methods by a large margin. |
Tasks | Computed Tomography (CT) |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03501v2 |
https://arxiv.org/pdf/1904.03501v2.pdf | |
PWC | https://paperswithcode.com/paper/deepseed-3d-squeeze-and-excitation-encoder |
Repo | https://github.com/ymli39/DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection |
Framework | pytorch |
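Two of the ingredients named in the abstract above, a dynamically scaled cross entropy that down-weights easy negatives and a squeeze-and-excitation block that re-weights feature maps using their inter-dependencies, are standard components. Below is a minimal PyTorch sketch of both; the alpha/gamma values and the reduction ratio are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """Dynamically scaled binary cross entropy: easy examples are down-weighted."""
    def __init__(self, alpha=0.25, gamma=2.0):
        super().__init__()
        self.alpha, self.gamma = alpha, gamma

    def forward(self, logits, targets):            # targets: float tensor of 0/1 labels
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p_t = torch.exp(-bce)                      # probability of the true class
        alpha_t = self.alpha * targets + (1 - self.alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** self.gamma * bce).mean()

class SEBlock3D(nn.Module):
    """Squeeze-and-excitation over 3D feature maps: channel-wise re-weighting."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (B, C, D, H, W)
        w = x.mean(dim=(2, 3, 4))                  # squeeze: global average pool
        w = self.fc(w).view(x.size(0), x.size(1), 1, 1, 1)
        return x * w                               # excite: rescale channels
```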
TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
Title | TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning |
Authors | Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez |
Abstract | Learning good feature embeddings for images often requires substantial training data. As a consequence, in settings where training data is limited (e.g., few-shot and zero-shot learning), we are typically forced to use a generic feature embedding across various tasks. Ideally, we want to construct feature embeddings that are tuned for the given task. In this work, we propose Task-Aware Feature Embedding Networks (TAFE-Nets) to learn how to adapt the image representation to a new task in a meta learning fashion. Our network is composed of a meta learner and a prediction network. Based on a task input, the meta learner generates parameters for the feature layers in the prediction network so that the feature embedding can be accurately adjusted for that task. We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning. Our model matches or exceeds the state-of-the-art on all tasks. In particular, our approach improves the prediction accuracy of unseen attribute-object pairs by 4 to 15 points on the challenging visual attribute-object composition task. |
Tasks | Few-Shot Learning, Meta-Learning, Zero-Shot Learning |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05967v1 |
http://arxiv.org/pdf/1904.05967v1.pdf | |
PWC | https://paperswithcode.com/paper/tafe-net-task-aware-feature-embeddings-for-1 |
Repo | https://github.com/ucbdrive/tafe-net |
Framework | pytorch |
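The key mechanism in the abstract above, a meta learner that generates the parameters of the prediction network's feature layers from a task input, is essentially a small hypernetwork. A minimal PyTorch sketch follows; the dimensions and the single generated layer are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAwareEmbedding(nn.Module):
    """Meta learner generates the weights of a feature layer from a task embedding."""
    def __init__(self, task_dim=64, feat_in=512, feat_out=256):
        super().__init__()
        self.feat_in, self.feat_out = feat_in, feat_out
        # meta learner: task embedding -> flattened weight matrix (+ bias)
        self.weight_gen = nn.Linear(task_dim, feat_in * feat_out)
        self.bias_gen = nn.Linear(task_dim, feat_out)

    def forward(self, image_feat, task_emb):
        # image_feat: (B, feat_in), task_emb: (task_dim,) for the current task
        W = self.weight_gen(task_emb).view(self.feat_out, self.feat_in)
        b = self.bias_gen(task_emb)
        return F.relu(F.linear(image_feat, W, b))   # task-conditioned embedding

# usage sketch: one task embedding adapts the features of a whole batch
tafe = TaskAwareEmbedding()
feats = tafe(torch.randn(8, 512), torch.randn(64))
print(feats.shape)  # torch.Size([8, 256])
```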
Deep Coordination Graphs
Title | Deep Coordination Graphs |
Authors | Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson |
Abstract | This paper introduces the deep coordination graph (DCG) for collaborative multi-agent reinforcement learning. DCG strikes a flexible trade-off between representational capacity and generalization by factoring the joint value function of all agents according to a coordination graph into payoffs between pairs of agents. The value can be maximized by local message passing along the graph, which allows training of the value function end-to-end with Q-learning. Payoff functions are approximated with deep neural networks that employ parameter sharing and low-rank approximations to significantly improve sample efficiency. We show that DCG can solve predator-prey tasks that highlight the relative overgeneralization pathology, as well as challenging StarCraft II micromanagement tasks. |
Tasks | Multi-agent Reinforcement Learning, Q-Learning, Starcraft, Starcraft II |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1910.00091v3 |
https://arxiv.org/pdf/1910.00091v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-coordination-graphs |
Repo | https://github.com/Denys88/rl_games |
Framework | tf |
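The DCG factorization above writes the joint value as Q(a) = sum_i f_i(a_i) + sum_(i,j) f_ij(a_i, a_j) over the edges of a coordination graph and maximizes it by local message passing. Below is a NumPy max-plus sketch in which fixed payoff tables stand in for the paper's parameter-shared, low-rank payoff networks; the message normalization and iteration count are illustrative choices.

```python
import numpy as np

def dcg_greedy_actions(utilities, payoffs, edges, iters=10):
    """Max-plus message passing to (approximately) maximize a factored joint value
    Q(a) = sum_i f_i(a_i) + sum_{(i,j)} f_ij(a_i, a_j).

    utilities: (n_agents, n_actions) individual payoffs f_i
    payoffs:   dict {(i, j): (n_actions, n_actions) array} pairwise payoffs f_ij
    edges:     list of (i, j) coordination-graph edges
    """
    n, A = utilities.shape
    # messages msgs[(i, j)][a_j]: what agent i tells j about each of j's actions
    msgs = {(i, j): np.zeros(A) for (a, b) in edges for (i, j) in [(a, b), (b, a)]}
    for _ in range(iters):
        for (i, j) in list(msgs):
            f_ij = payoffs[(i, j)] if (i, j) in payoffs else payoffs[(j, i)].T
            incoming = sum(msgs[(k, l)] for (k, l) in msgs if l == i and k != j)
            m = np.max(f_ij + (utilities[i] + incoming)[:, None], axis=0)
            msgs[(i, j)] = m - m.mean()            # normalize for numerical stability
    # each agent picks the action maximizing its utility plus incoming messages
    actions = []
    for i in range(n):
        incoming = sum(msgs[(k, l)] for (k, l) in msgs if l == i)
        actions.append(int(np.argmax(utilities[i] + incoming)))
    return actions

# two agents, three actions each, one coordination edge rewarding joint action (0, 0)
utils = np.zeros((2, 3))
pay = {(0, 1): np.array([[5., 0, 0], [0, 1, 0], [0, 0, 1]])}
print(dcg_greedy_actions(utils, pay, edges=[(0, 1)]))  # -> [0, 0]
```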
Controllable Artistic Text Style Transfer via Shape-Matching GAN
Title | Controllable Artistic Text Style Transfer via Shape-Matching GAN |
Authors | Shuai Yang, Zhangyang Wang, Zhaowen Wang, Ning Xu, Jiaying Liu, Zongming Guo |
Abstract | Artistic text style transfer is the task of migrating the style from a source image to the target text to create artistic typography. Recent style transfer methods have considered texture control to enhance usability. However, controlling the stylistic degree in terms of shape deformation remains an important open challenge. In this paper, we present the first text style transfer network that allows for real-time control of the crucial stylistic degree of the glyph through an adjustable parameter. Our key contribution is a novel bidirectional shape matching framework to establish an effective glyph-style mapping at various deformation levels without paired ground truth. Based on this idea, we propose a scale-controllable module to empower a single network to continuously characterize the multi-scale shape features of the style image and transfer these features to the target text. The proposed method demonstrates its superiority over previous state-of-the-arts in generating diverse, controllable and high-quality stylized text. |
Tasks | Style Transfer, Text Style Transfer |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01354v2 |
https://arxiv.org/pdf/1905.01354v2.pdf | |
PWC | https://paperswithcode.com/paper/controllable-artistic-text-style-transfer-via |
Repo | https://github.com/TAMU-VITA/ShapeMatchingGAN |
Framework | pytorch |
Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control
Title | Deep Variational Koopman Models: Inferring Koopman Observations for Uncertainty-Aware Dynamics Modeling and Control |
Authors | Jeremy Morton, Freddie D Witherden, Mykel J Kochenderfer |
Abstract | Koopman theory asserts that a nonlinear dynamical system can be mapped to a linear system, where the Koopman operator advances observations of the state forward in time. However, the observable functions that map states to observations are generally unknown. We introduce the Deep Variational Koopman (DVK) model, a method for inferring distributions over observations that can be propagated linearly in time. By sampling from the inferred distributions, we obtain a distribution over dynamical models, which in turn provides a distribution over possible outcomes as a modeled system advances in time. Experiments show that the DVK model is effective at long-term prediction for a variety of dynamical systems. Furthermore, we describe how to incorporate the learned models into a control framework, and demonstrate that accounting for the uncertainty present in the distribution over dynamical models enables more effective control. |
Tasks | |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1902.09742v3 |
https://arxiv.org/pdf/1902.09742v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-variational-koopman-models-inferring |
Repo | https://github.com/sisl/variational_koopman |
Framework | tf |
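The mechanics in the abstract above, sampling from an inferred distribution over observations and propagating each sample linearly with a Koopman matrix to obtain a distribution over outcomes, can be illustrated in a few lines of NumPy. The Koopman matrix, the Gaussian over the initial observation, and the identity decoder below are illustrative stand-ins for the learned DVK components.

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative stand-ins for learned quantities
K = np.array([[0.95, 0.20], [-0.20, 0.95]])      # Koopman matrix: linear dynamics
def decode(g):                                    # observation -> state (identity here)
    return g

# inferred distribution over the initial observation (mean, std per dimension)
g_mean, g_std = np.array([1.0, 0.0]), np.array([0.1, 0.1])

# sample observations, propagate each linearly, and look at the spread of outcomes
samples = g_mean + g_std * rng.standard_normal((100, 2))
trajectories = []
for g0 in samples:
    g_t, traj = g0, []
    for _ in range(20):
        g_t = K @ g_t                             # linear propagation in observation space
        traj.append(decode(g_t))
    trajectories.append(traj)
trajectories = np.array(trajectories)             # (100 samples, 20 steps, 2 dims)
print(trajectories[:, -1].mean(0), trajectories[:, -1].std(0))  # predictive mean / spread
```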
Hierarchically-Refined Label Attention Network for Sequence Labeling
Title | Hierarchically-Refined Label Attention Network for Sequence Labeling |
Authors | Leyang Cui, Yue Zhang |
Abstract | CRF has been used as a powerful model for statistical sequence labeling. For neural sequence labeling, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. This can be because the simple Markov label transition model of CRF does not give much information gain over strong neural encoding. To better represent label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependency by giving each word incrementally refined label distributions with hierarchical attention. Results on POS tagging, NER and CCG supertagging show that the proposed model not only improves the overall tagging accuracy with a similar number of parameters, but also significantly speeds up the training and testing compared to BiLSTM-CRF. |
Tasks | CCG Supertagging, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08676v3 |
https://arxiv.org/pdf/1908.08676v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchically-refined-label-attention |
Repo | https://github.com/Nealcly/LAN |
Framework | pytorch |
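The refinement step described above, in which each word attends over label embeddings and receives an incrementally refined label distribution, can be sketched compactly. The single-head dot-product attention and the residual combination below are simplifying assumptions rather than the paper's exact layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAttentionLayer(nn.Module):
    """One refinement step: words attend over label embeddings,
    producing a per-word label distribution and a label-aware summary."""
    def __init__(self, hidden_dim=256, n_labels=45):
        super().__init__()
        self.label_emb = nn.Embedding(n_labels, hidden_dim)

    def forward(self, word_repr):                 # (B, T, H) from a BiLSTM
        labels = self.label_emb.weight            # (L, H)
        scores = word_repr @ labels.t()           # (B, T, L) word-label affinities
        attn = F.softmax(scores, dim=-1)          # incrementally refined label distribution
        label_aware = attn @ labels               # (B, T, H) expected label embedding
        return word_repr + label_aware, attn      # refined representation + label scores

# usage sketch: stack several layers; the last attn gives the predicted labels
layer = LabelAttentionLayer()
h, attn = layer(torch.randn(2, 10, 256))
print(attn.argmax(-1).shape)                      # per-word label predictions: (2, 10)
```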
Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset
Title | Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset |
Authors | Desmond C. Ong, Zhengxuan Wu, Tan Zhi-Xuan, Marianne Reddan, Isabella Kahhale, Alison Mattek, Jamil Zaki |
Abstract | Human emotions unfold over time, and more affective computing research has to prioritize capturing this crucial component of real-world affect. Modeling dynamic emotional stimuli requires solving the twin challenges of time-series modeling and of collecting high-quality time-series datasets. We begin by assessing the state-of-the-art in time-series emotion recognition, and we review contemporary time-series approaches in affective computing, including discriminative and generative models. We then introduce the first version of the Stanford Emotional Narratives Dataset (SENDv1): a set of rich, multimodal videos of self-paced, unscripted emotional narratives, annotated for emotional valence over time. The complex narratives and naturalistic expressions in this dataset provide a challenging test for contemporary time-series emotion recognition models. We demonstrate several baseline and state-of-the-art modeling approaches on the SEND, including a Long Short-Term Memory model and a multimodal Variational Recurrent Neural Network, which perform comparably to the human-benchmark. We end by discussing the implications for future research in time-series affective computing. |
Tasks | Emotion Recognition, Time Series |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1912.05008v1 |
https://arxiv.org/pdf/1912.05008v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-emotion-in-complex-stories-the |
Repo | https://github.com/desmond-ong/TAC-EA-model |
Framework | pytorch |
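Of the baselines mentioned in the abstract, the simplest is a recurrent model that maps per-timestep multimodal features to a continuous valence rating. A minimal PyTorch sketch follows; the feature dimension and hidden size are assumptions, and this is not the released TAC-EA-model code.

```python
import torch
import torch.nn as nn

class ValenceLSTM(nn.Module):
    """Minimal time-series baseline: per-timestep features -> continuous valence."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, time, feat_dim) multimodal features
        h, _ = self.lstm(x)
        return self.head(h).squeeze(-1)    # (batch, time) valence ratings over time

model = ValenceLSTM()
print(model(torch.randn(4, 120, 128)).shape)   # torch.Size([4, 120])
```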
The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
Title | The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition |
Authors | Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noburu Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina, Daniel Ionita |
Abstract | Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08129v1 |
http://arxiv.org/pdf/1901.08129v1.pdf | |
PWC | https://paperswithcode.com/paper/the-multi-agent-reinforcement-learning-in |
Repo | https://github.com/maximecohen2/conception_solution_appli_ia_i4_tp_research |
Framework | none |
Unsupervised Neural Machine Translation with SMT as Posterior Regularization
Title | Unsupervised Neural Machine Translation with SMT as Posterior Regularization |
Authors | Shuo Ren, Zhirui Zhang, Shujie Liu, Ming Zhou, Shuai Ma |
Abstract | Without a real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically requires pseudo parallel data generated with the back-translation method for the model training. However, due to weak supervision, the pseudo data inevitably contain noise and errors that will be accumulated and reinforced in the subsequent training process, leading to bad translation performance. To address this issue, we introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularizations to guide the training of unsupervised NMT models in the iterative back-translation process. Our method starts from SMT models built with pre-trained language models and word-level translation tables inferred from cross-lingual embeddings. Then SMT and NMT models are optimized jointly and boost each other incrementally in a unified EM framework. In this way, (1) the negative effect caused by errors in the iterative back-translation process can be alleviated in a timely manner by SMT filtering noise from its phrase tables; meanwhile, (2) NMT can compensate for the deficiency of fluency inherent in SMT. Experiments conducted on en-fr and en-de translation tasks show that our method outperforms the strong baseline and achieves new state-of-the-art unsupervised machine translation performance. |
Tasks | Machine Translation, Unsupervised Machine Translation |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.04112v1 |
http://arxiv.org/pdf/1901.04112v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-neural-machine-translation-with |
Repo | https://github.com/Imagist-Shuo/UNMT-SPR |
Framework | tf |
Assessing BERT’s Syntactic Abilities
Title | Assessing BERT’s Syntactic Abilities |
Authors | Yoav Goldberg |
Abstract | I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally-occurring subject-verb agreement stimuli; (2) “colorless green ideas” subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part-of-speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena. The BERT model performs remarkably well on all cases. |
Tasks | |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05287v1 |
http://arxiv.org/pdf/1901.05287v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-berts-syntactic-abilities |
Repo | https://github.com/woailaosang/repo_treasure |
Framework | none |
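The stimuli described above reduce to a masked-word preference test: mask the verb and check whether BERT assigns a higher score to the form that agrees with the subject. Below is a minimal sketch using the Hugging Face transformers API rather than the author's original evaluation script; the sentence and the candidate verb pair are illustrative.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

# subject-verb agreement stimulus: the correct verb agrees with "keys", not "cabinet"
sentence = f"the keys to the cabinet {tokenizer.mask_token} on the table ."
inputs = tokenizer(sentence, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

for verb in ("are", "is"):                       # plural (correct) vs singular (distractor)
    verb_id = tokenizer.convert_tokens_to_ids(verb)
    print(verb, logits[verb_id].item())
# BERT "passes" the stimulus if the correct form receives the higher score
```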
StarGAN v2: Diverse Image Synthesis for Multiple Domains
Title | StarGAN v2: Diverse Image Synthesis for Multiple Domains |
Authors | Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha |
Abstract | A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2. |
Tasks | Image Generation, Image-to-Image Translation |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01865v1 |
https://arxiv.org/pdf/1912.01865v1.pdf | |
PWC | https://paperswithcode.com/paper/stargan-v2-diverse-image-synthesis-for |
Repo | https://github.com/clovaai/stargan-v2 |
Framework | pytorch |
Real Time Trajectory Prediction Using Deep Conditional Generative Models
Title | Real Time Trajectory Prediction Using Deep Conditional Generative Models |
Authors | Sebastian Gomez-Gonzalez, Sergey Prokudin, Bernhard Scholkopf, Jan Peters |
Abstract | Data-driven methods for time series forecasting that quantify uncertainty open new important possibilities for robot tasks with hard real time constraints, allowing the robot system to make decisions that trade off between reaction time and accuracy in the predictions. Despite the recent advances in deep learning, it is still challenging to make long term accurate predictions with the low latency required by real time robotic systems. In this paper, we propose a deep conditional generative model for trajectory prediction that is learned from a data set of collected trajectories. Our method uses encoder and decoder deep networks that map complete or partial trajectories to a Gaussian distributed latent space and back, allowing for fast inference of the future values of a trajectory given previous observations. The encoder and decoder networks are trained using stochastic gradient variational Bayes. In the experiments, we show that our model provides more accurate long term predictions with lower latency than popular models for trajectory forecasting like recurrent neural networks or physical models based on differential equations. Finally, we test our proposed approach in a robot table tennis scenario to evaluate the performance of the proposed method in a robotic task with hard real time constraints. |
Tasks | Time Series, Time Series Forecasting, Trajectory Prediction |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03895v2 |
https://arxiv.org/pdf/1909.03895v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-trajectory-prediction-using-deep |
Repo | https://github.com/sebasutp/trajectory_forcasting |
Framework | none |
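The abstract above describes encoder and decoder networks that map trajectories to a Gaussian latent space and back, trained with stochastic gradient variational Bayes, so that futures can be sampled quickly at inference time. Below is a compact conditional-VAE sketch in PyTorch; the MLP encoder/decoder, window lengths, and the unit-Gaussian prior are illustrative assumptions rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    """Map an observed trajectory prefix plus its future to a Gaussian latent space
    and back; at test time, sample latents for fast multi-hypothesis prediction."""
    def __init__(self, obs_len=20, pred_len=30, dim=2, latent=16, hidden=128):
        super().__init__()
        self.pred_len, self.dim, self.latent = pred_len, dim, latent
        self.encoder = nn.Sequential(
            nn.Linear((obs_len + pred_len) * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent))                  # -> mean and log-variance
        self.decoder = nn.Sequential(
            nn.Linear(obs_len * dim + latent, hidden), nn.ReLU(),
            nn.Linear(hidden, pred_len * dim))

    def forward(self, obs, future):                         # (B, obs_len, dim), (B, pred_len, dim)
        stats = self.encoder(torch.cat([obs, future], dim=1).flatten(1))
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()        # reparameterization trick
        recon = self.decoder(torch.cat([obs.flatten(1), z], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        mse = (recon - future.flatten(1)).pow(2).sum(-1).mean()
        return mse + kl                                     # negative-ELBO-style training loss

    @torch.no_grad()
    def predict(self, obs, n_samples=10):                   # distribution over possible futures
        z = torch.randn(obs.size(0) * n_samples, self.latent)
        obs_rep = obs.flatten(1).repeat_interleave(n_samples, dim=0)
        out = self.decoder(torch.cat([obs_rep, z], dim=-1))
        return out.view(obs.size(0), n_samples, self.pred_len, self.dim)
```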
GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning
Title | GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning |
Authors | Vikas Verma, Meng Qu, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang |
Abstract | We present GraphMix, a regularization technique for Graph Neural Network based semi-supervised object classification, leveraging the recent advances in the regularization of classical deep neural networks. Specifically, we propose a unified approach in which we train a fully-connected network jointly with the graph neural network via parameter sharing, interpolation-based regularization, and self-predicted-targets. Our proposed method is architecture agnostic in the sense that it can be applied to any variant of graph neural networks which applies a parametric transformation to the features of the graph nodes. Despite its simplicity, with GraphMix we can consistently improve results and achieve or closely match state-of-the-art performance using even simpler architectures such as Graph Convolutional Networks, across three established graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as three newly proposed datasets : Cora-Full, Co-author-CS and Co-author-Physics. |
Tasks | Node Classification, Object Classification |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11715v1 |
https://arxiv.org/pdf/1909.11715v1.pdf | |
PWC | https://paperswithcode.com/paper/graphmix-regularized-training-of-graph-neural |
Repo | https://github.com/vikasverma1077/GraphMix |
Framework | pytorch |
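The abstract above combines three ingredients: a fully-connected network trained jointly with the GNN through parameter sharing, interpolation-based (mixup) regularization, and self-predicted targets. The sketch below illustrates the first two on a toy graph and omits the self-training component; it is a hedged approximation, not the released GraphMix code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mixup(x, y, alpha=1.0):
    """Interpolation-based regularization: mix random pairs of features and labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

class SharedGCN(nn.Module):
    """A graph branch and a fully-connected branch sharing the same weight matrices."""
    def __init__(self, in_dim, hid, n_cls):
        super().__init__()
        self.lin1, self.lin2 = nn.Linear(in_dim, hid), nn.Linear(hid, n_cls)

    def gnn(self, x, adj):               # graph branch: aggregate neighbours, then transform
        return self.lin2(adj @ F.relu(self.lin1(adj @ x)))

    def fcn(self, x):                    # FCN branch: identical parameters, no graph
        return self.lin2(F.relu(self.lin1(x)))

# one illustrative training step: supervised GNN loss + mixup-regularized FCN loss
model = SharedGCN(in_dim=16, hid=32, n_cls=4)
x, adj = torch.randn(10, 16), torch.eye(10)        # toy graph (self-loops only)
y = F.one_hot(torch.randint(0, 4, (10,)), 4).float()
x_mix, y_mix = mixup(x, y)
soft_ce = lambda logits, t: -(t * F.log_softmax(logits, dim=-1)).sum(-1).mean()
loss = F.cross_entropy(model.gnn(x, adj), y.argmax(1)) + soft_ce(model.fcn(x_mix), y_mix)
loss.backward()
```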
Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction
Title | Comparison of Deep learning models on time series forecasting : a case study of Dissolved Oxygen Prediction |
Authors | Hongqian Qin |
Abstract | Deep learning has achieved impressive prediction performance in the field of sequence learning recently. Dissolved oxygen prediction, as a kind of time-series forecasting, is suitable for this technique. Although many researchers have developed hybrid models or variant models based on deep learning techniques, there is currently no comprehensive and sound comparison among the deep learning models in this field. Moreover, most previous studies focused on one-step forecasting using a small data set. Taking advantage of convenient access to high-frequency data, this paper compares multi-step deep learning forecasting using walk-forward validation. Specifically, we test Convolutional Neural Network (CNN), Temporal Convolutional Network (TCN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Bidirectional Recurrent Neural Network (BiRNN) based on the real-time data recorded automatically at a fixed observation point in the Yangtze River from 2012 to 2016. By comparing the average accumulated statistical metrics of root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination at each time step, we find that for multi-step time series forecasting, the average performance at each time step does not decrease linearly. GRU outperforms the other models by a significant margin. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.08414v2 |
https://arxiv.org/pdf/1911.08414v2.pdf | |
PWC | https://paperswithcode.com/paper/comparison-of-deep-learning-models-on-time |
Repo | https://github.com/qin67/Multistep-Time-Series-DO-Case |
Framework | none |
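The evaluation protocol named in the abstract, walk-forward validation for multi-step forecasting, is easy to make concrete. The sketch below uses a naive persistence baseline where the paper plugs in CNN/TCN/LSTM/GRU/BiRNN models, and the window sizes are illustrative assumptions.

```python
import numpy as np

def walk_forward_splits(series, n_in=24, n_out=6, step=6):
    """Walk-forward validation: train on everything up to a cut-off, predict the next
    n_out steps, then slide the cut-off forward and repeat (no shuffling, no leakage)."""
    for cut in range(n_in, len(series) - n_out + 1, step):
        yield series[:cut], series[cut:cut + n_out]

# toy example with hourly dissolved-oxygen-like readings
do = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.randn(500)
errors = []
for history, target in walk_forward_splits(do):
    forecast = np.repeat(history[-1], len(target))   # naive baseline; swap in GRU/LSTM/TCN
    errors.append(np.sqrt(np.mean((forecast - target) ** 2)))   # per-fold multi-step RMSE
print("mean walk-forward RMSE:", np.mean(errors))
```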
IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters
Title | IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude’s Variance Matters |
Authors | Xinshao Wang, Yang Hua, Elyor Kodirov, Neil M. Robertson |
Abstract | In this work, we study robust deep learning against abnormal training data from the perspective of example weighting built into empirical loss functions, i.e., gradient magnitude with respect to logits, an angle that has not been thoroughly studied so far. Consequently, we have two key findings: (1) Mean Absolute Error (MAE) Does Not Treat Examples Equally. We present new observations and insightful analysis about MAE, which is theoretically proved to be noise-robust. First, we reveal its underfitting problem in practice. Second, we analyse that MAE’s noise-robustness stems from emphasising uncertain examples instead of treating training samples equally, as claimed in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective and simple solution to enhance MAE’s fitting ability while preserving its noise-robustness. Without changing MAE’s overall weighting scheme, i.e., what examples get higher weights, we simply change its weighting variance non-linearly so that the impact ratio between two examples is adjusted. Our solution is termed Improved MAE (IMAE). We prove IMAE’s effectiveness using extensive experiments: image classification under clean labels, synthetic label noise, and real-world unknown noise. We conclude IMAE is superior to CCE, the most popular loss for training DNNs. |
Tasks | Image Classification, Video Retrieval |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.12141v8 |
https://arxiv.org/pdf/1903.12141v8.pdf | |
PWC | https://paperswithcode.com/paper/improving-mae-against-cce-under-label-noise |
Repo | https://github.com/XinshaoAmosWang/Improving-Mean-Absolute-Error-against-CCE |
Framework | none |
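The first finding above can be checked numerically: with a softmax output, the CCE gradient with respect to an example's logits has magnitude governed by 1 - p_y, whereas the MAE gradient is the same vector scaled by p_y, so MAE largely ignores examples the model is already confident about as well as examples it is confidently wrong about (e.g. noisy labels). The snippet below (plain PyTorch, not the authors' code) only illustrates that weighting analysis; it does not implement the IMAE loss itself.

```python
import torch

logits = torch.tensor([[4.0, 0.0, 0.0],    # confident and correct
                       [0.5, 0.3, 0.2],    # uncertain
                       [-3.0, 2.0, 1.0]],  # confident but wrong (e.g. a noisy label)
                      requires_grad=True)
targets = torch.tensor([0, 0, 0])
idx = torch.arange(3)
p = torch.softmax(logits, dim=1)

losses = {"CCE": -torch.log(p[idx, targets]).sum(),   # categorical cross entropy
          "MAE": (1 - p[idx, targets]).sum()}         # mean absolute error (up to a factor of 2)
for name, loss in losses.items():
    grad, = torch.autograd.grad(loss, logits, retain_graph=True)
    # per-example weighting = gradient magnitude w.r.t. that example's logits
    print(name, [round(g, 3) for g in grad.norm(dim=1).tolist()])
# MAE's per-example gradient equals the CCE gradient scaled by p_y, so the
# confident-but-wrong (possibly noisy) example is nearly ignored, which explains
# both MAE's noise-robustness and its underfitting; IMAE re-shapes this weighting.
```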