February 1, 2020

3034 words 15 mins read

Paper Group AWR 339



RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes

Title RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes
Authors Po-Wei Wu, Yu-Jing Lin, Che-Han Chang, Edward Y. Chang, Shih-Wei Liao
Abstract Multi-domain image-to-image translation has gained increasing attention recently. Previous methods take an image and some target attributes as inputs and generate an output image with the desired attributes. However, such methods have two limitations. First, these methods assume binary-valued attributes and thus cannot yield satisfactory results for fine-grained control. Second, these methods require specifying the entire set of target attributes, even if most of the attributes would not be changed. To address these limitations, we propose RelGAN, a new method for multi-domain image-to-image translation. The key idea is to use relative attributes, which describe the desired changes to selected attributes. Our method is capable of modifying images by changing particular attributes of interest in a continuous manner while preserving the other attributes. Experimental results demonstrate both the quantitative and qualitative effectiveness of our method on the tasks of facial attribute transfer and interpolation.
Tasks Image-to-Image Translation
Published 2019-08-20
URL https://arxiv.org/abs/1908.07269v1
PDF https://arxiv.org/pdf/1908.07269v1.pdf
PWC https://paperswithcode.com/paper/relgan-multi-domain-image-to-image
Repo https://github.com/elvisyjlin/RelGAN-PyTorch
Framework pytorch
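
A minimal sketch of the relative-attribute idea described in the abstract: the generator is conditioned on the difference between the target and original attribute vectors, and scaling that difference gives continuous control. The attribute names and values below are invented for illustration, and the generator itself is not implemented here.

```python
import numpy as np

# Hypothetical attribute vector for a face image: [smiling, blond_hair, eyeglasses]
original = np.array([0.0, 1.0, 0.0])
target   = np.array([1.0, 1.0, 0.0])    # only "smiling" should change

# Relative attribute vector: the *change* we want, not the full target state.
rel = target - original                  # -> [1., 0., 0.]

# Continuous control: scale the relative vector before feeding it to a
# generator G(x, alpha * rel); alpha = 0 keeps the input unchanged, alpha = 1
# applies the full change, intermediate values interpolate between the two.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, alpha * rel)
```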

Y-Autoencoders: disentangling latent representations via sequential-encoding

Title Y-Autoencoders: disentangling latent representations via sequential-encoding
Authors Massimiliano Patacchiola, Patrick Fox-Roberts, Edward Rosten
Abstract In the last few years there have been important advancements in generative models, with the two dominant approaches being Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, standard Autoencoders (AEs) and closely related structures have remained popular because they are easy to train and adapt to different tasks. An interesting question is whether we can achieve state-of-the-art performance with AEs while retaining their good properties. We propose an answer to this question by introducing a new model called Y-Autoencoder (Y-AE). The structure and training procedure of a Y-AE enclose a representation into an implicit and an explicit part. The implicit part is similar to the output of an autoencoder and the explicit part is strongly correlated with labels in the training set. The two parts are separated in the latent space by splitting the output of the encoder into two paths (forming a Y shape) before decoding and re-encoding. We then impose a number of losses, such as a reconstruction loss and a loss on the dependence between the implicit and explicit parts. Additionally, the projection in the explicit manifold is monitored by a predictor that is embedded in the encoder and trained end-to-end with no adversarial losses. We provide significant experimental results on various domains, such as separation of style and content, image-to-image translation, and inverse graphics.
Tasks Image-to-Image Translation
Published 2019-07-25
URL https://arxiv.org/abs/1907.10949v1
PDF https://arxiv.org/pdf/1907.10949v1.pdf
PWC https://paperswithcode.com/paper/y-autoencoders-disentangling-latent
Repo https://github.com/mpatacchiola/Y-AE
Framework tf
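
The abstract's central mechanism is splitting the encoder output into an implicit part and an explicit, label-correlated part before decoding. A minimal PyTorch sketch of that split with toy dimensions (an assumption for illustration, not the released TensorFlow model):

```python
import torch
import torch.nn as nn

class TinyYEncoder(nn.Module):
    """Illustrative only: an encoder whose output splits into an implicit
    (style/content) code and an explicit (label-correlated) code."""
    def __init__(self, in_dim=784, implicit_dim=16, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.implicit_head = nn.Linear(128, implicit_dim)   # implicit path
        self.explicit_head = nn.Linear(128, n_classes)      # explicit path

    def forward(self, x):
        h = self.body(x)
        return self.implicit_head(h), self.explicit_head(h)

enc = TinyYEncoder()
x = torch.randn(4, 784)
implicit, explicit = enc(x)            # the two branches of the "Y"
print(implicit.shape, explicit.shape)  # torch.Size([4, 16]) torch.Size([4, 10])
```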

Sentiment analysis is not solved! Assessing and probing sentiment classification

Title Sentiment analysis is not solved! Assessing and probing sentiment classification
Authors Jeremy Barnes, Lilja Øvrelid, Erik Velldal
Abstract Neural methods for sentiment analysis (SA) have led to quantitative improvements over previous approaches, but these advances are not always accompanied by a thorough analysis of the qualitative differences. Therefore, it is not clear what outstanding conceptual challenges for sentiment analysis remain. In this work, we attempt to discover which challenges still pose a problem for sentiment classifiers for English and to provide a challenging dataset. We collect the subset of sentences that an (oracle) ensemble of state-of-the-art sentiment classifiers misclassify and then annotate them for 18 linguistic and paralinguistic phenomena, such as negation, sarcasm, modality, etc. The dataset is available at https://github.com/ltgoslo/assessing_and_probing_sentiment. Finally, we provide a case study that demonstrates the usefulness of the dataset to probe the performance of a given sentiment classifier with respect to linguistic phenomena.
Tasks Sentiment Analysis
Published 2019-06-13
URL https://arxiv.org/abs/1906.05887v1
PDF https://arxiv.org/pdf/1906.05887v1.pdf
PWC https://paperswithcode.com/paper/sentiment-analysis-is-not-solved-assessing
Repo https://github.com/ltgoslo/assessing_and_probing_sentiment
Framework none
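
A small sketch of the selection step described in the abstract, with toy labels and predictions: a sentence enters the challenge dataset only if every classifier in the (oracle) ensemble misclassifies it, since the oracle ensemble is counted correct as soon as any member is correct.

```python
# Invented gold labels and per-model predictions for four sentences.
gold = ["pos", "neg", "neg", "pos"]
preds_by_model = {
    "bow_svm": ["pos", "neg", "pos", "neg"],
    "bilstm":  ["pos", "pos", "pos", "neg"],
    "bert":    ["pos", "neg", "pos", "neg"],
}

# Keep only the sentences that no model classifies correctly.
hard_cases = [
    i for i, y in enumerate(gold)
    if all(preds[i] != y for preds in preds_by_model.values())
]
print(hard_cases)  # -> [2, 3]
```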

LIDA: Lightweight Interactive Dialogue Annotator

Title LIDA: Lightweight Interactive Dialogue Annotator
Authors Edward Collins, Nikolai Rozanov, Bingbing Zhang
Abstract Dialogue systems have the potential to change how people interact with machines but are highly dependent on the quality of the data used to train them. It is therefore important to develop good dialogue annotation tools which can improve the speed and quality of dialogue data annotation. With this in mind, we introduce LIDA, an annotation tool designed specifically for conversation data. As far as we know, LIDA is the first dialogue annotation system that handles the entire dialogue annotation pipeline from raw text, as may be the output of transcription services, to structured conversation data. Furthermore, it supports the integration of arbitrary machine learning models as annotation recommenders and also has a dedicated interface to resolve inter-annotator disagreements, such as those arising after crowdsourcing annotations for a dataset. LIDA is fully open source, documented and publicly available [ https://github.com/Wluper/lida ].
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01599v1
PDF https://arxiv.org/pdf/1911.01599v1.pdf
PWC https://paperswithcode.com/paper/lida-lightweight-interactive-dialogue-1
Repo https://github.com/Wluper/lida
Framework none
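
LIDA's disagreement-resolution interface is not reproduced here, but the underlying bookkeeping is simple to illustrate. A hypothetical sketch (turn IDs and dialogue-act labels are invented, and this is not LIDA's actual API) that flags turns where crowdsourced annotators disagree and proposes the majority label for an adjudicator to confirm:

```python
from collections import Counter

# Three crowd annotations per dialogue turn (toy data).
annotations = {
    "turn_1": ["inform", "inform", "inform"],
    "turn_2": ["request", "inform", "request"],
}

for turn, labels in annotations.items():
    label, votes = Counter(labels).most_common(1)[0]
    if votes < len(labels):   # not unanimous -> send to the adjudication queue
        print(f"{turn}: disagreement, majority suggestion = {label!r}")
```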

Generation of 3D Brain MRI Using Auto-Encoding Generative Adversarial Networks

Title Generation of 3D Brain MRI Using Auto-Encoding Generative Adversarial Networks
Authors Gihyun Kwon, Chihye Han, Dae-shik Kim
Abstract As deep learning is showing unprecedented success in medical image analysis tasks, the lack of sufficient medical data is emerging as a critical problem. While recent attempts to solve the limited data problem using Generative Adversarial Networks (GAN) have been successful in generating realistic images with diversity, most of them are based on image-to-image translation and thus require extensive datasets from different domains. Here, we propose a novel model that can successfully generate 3D brain MRI data from random vectors by learning the data distribution. Our 3D GAN model solves both image blurriness and mode collapse problems by leveraging alpha-GAN that combines the advantages of Variational Auto-Encoder (VAE) and GAN with an additional code discriminator network. We also use the Wasserstein GAN with Gradient Penalty (WGAN-GP) loss to lower the training instability. To demonstrate the effectiveness of our model, we generate new images of normal brain MRI and show that our model outperforms baseline models in both quantitative and qualitative measurements. We also train the model to synthesize brain disorder MRI data to demonstrate the wide applicability of our model. Our results suggest that the proposed model can successfully generate various types and modalities of 3D whole brain volumes from a small set of training data.
Tasks Image-to-Image Translation
Published 2019-08-07
URL https://arxiv.org/abs/1908.02498v1
PDF https://arxiv.org/pdf/1908.02498v1.pdf
PWC https://paperswithcode.com/paper/generation-of-3d-brain-mri-using-auto
Repo https://github.com/cyclomon/3dbraingen
Framework pytorch
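
The WGAN-GP loss mentioned in the abstract has a standard form that is easy to sketch. Below is a generic PyTorch gradient-penalty term with a toy critic on small 8x8x8 volumes; the critic and sizes are assumptions for illustration, not the authors' 3D architecture.

```python
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    """Generic WGAN-GP term: penalize the critic's gradient norm on random
    interpolations between real and generated samples."""
    b = real.size(0)
    eps = torch.rand(b, *([1] * (real.dim() - 1)), device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    grads, = torch.autograd.grad(scores.sum(), mixed, create_graph=True)
    grad_norm = grads.view(b, -1).norm(2, dim=1)
    return lam * ((grad_norm - 1) ** 2).mean()

# Toy usage with a linear "critic" on flattened 8x8x8 single-channel volumes.
critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(512, 1))
real, fake = torch.randn(4, 1, 8, 8, 8), torch.randn(4, 1, 8, 8, 8)
print(gradient_penalty(critic, real, fake).item())
```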

ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation

Title ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation
Authors Xinxin Hu, Kailun Yang, Lei Fei, Kaiwei Wang
Abstract Compared to RGB semantic segmentation, RGBD semantic segmentation can achieve better performance by taking depth information into consideration. However, it is still problematic for contemporary segmenters to effectively exploit RGBD information since the feature distributions of RGB and depth (D) images vary significantly in different scenes. In this paper, we propose an Attention Complementary Network (ACNet) that selectively gathers features from RGB and depth branches. The main contributions lie in the Attention Complementary Module (ACM) and the architecture with three parallel branches. More precisely, ACM is a channel attention-based module that extracts weighted features from RGB and depth branches. The architecture preserves the inference of the original RGB and depth branches, and enables the fusion branch at the same time. Based on the above structures, ACNet is capable of exploiting more high-quality features from different channels. We evaluate our model on SUN-RGBD and NYUDv2 datasets, and prove that our model outperforms state-of-the-art methods. In particular, a mIoU score of 48.3% on NYUDv2 test set is achieved with ResNet50. We will release our source code based on PyTorch and the trained segmentation model at https://github.com/anheidelonghu/ACNet.
Tasks Semantic Segmentation
Published 2019-05-24
URL https://arxiv.org/abs/1905.10089v1
PDF https://arxiv.org/pdf/1905.10089v1.pdf
PWC https://paperswithcode.com/paper/acnet-attention-based-network-to-exploit
Repo https://github.com/anheidelonghu/ACNet
Framework pytorch
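
The Attention Complementary Module is described as a channel attention-based module that extracts weighted features from the RGB and depth branches. A rough PyTorch sketch of such a channel gate follows; the layer sizes and the simple additive fusion are assumptions for illustration, not the released code.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze with global average pooling, then re-weight each channel
    with a learned sigmoid gate (a sketch in the spirit of ACM)."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(self.pool(x))

rgb_feat = torch.randn(2, 64, 32, 32)
depth_feat = torch.randn(2, 64, 32, 32)
attn_rgb, attn_d = ChannelAttention(64), ChannelAttention(64)
fused = attn_rgb(rgb_feat) + attn_d(depth_feat)   # weighted features for the fusion branch
print(fused.shape)
```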

Capture, Learning, and Synthesis of 3D Speaking Styles

Title Capture, Learning, and Synthesis of 3D Speaking Styles
Authors Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, Michael J. Black
Abstract Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation) takes any speech signal as input - even speech in languages other than English - and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes at http://voca.is.tue.mpg.de.
Tasks 3D Face Animation, Talking Face Generation
Published 2019-05-08
URL https://arxiv.org/abs/1905.03079v1
PDF https://arxiv.org/pdf/1905.03079v1.pdf
PWC https://paperswithcode.com/paper/capture-learning-and-synthesis-of-3d-speaking
Repo https://github.com/TimoBolkart/voca
Framework tf
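
The abstract notes that conditioning on subject labels lets the model learn multiple speaking styles. A hypothetical sketch of that conditioning pattern, with invented dimensions and a toy decoder (not the released TensorFlow model): a one-hot subject code is concatenated with the audio feature window before predicting per-vertex offsets of a face template.

```python
import torch
import torch.nn as nn

n_subjects, audio_dim, n_vertices = 8, 29, 5023   # hypothetical sizes

decoder = nn.Sequential(
    nn.Linear(audio_dim + n_subjects, 256), nn.ReLU(),
    nn.Linear(256, n_vertices * 3),           # per-vertex (x, y, z) offsets
)

audio_feat = torch.randn(1, audio_dim)         # one window of audio features
style = torch.zeros(1, n_subjects)
style[0, 2] = 1.0                              # animate with subject #2's style
offsets = decoder(torch.cat([audio_feat, style], dim=1)).view(1, n_vertices, 3)
print(offsets.shape)
```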

The Animal-AI Environment: Training and Testing Animal-Like Artificial Cognition

Title The Animal-AI Environment: Training and Testing Animal-Like Artificial Cognition
Authors Benjamin Beyret, José Hernández-Orallo, Lucy Cheke, Marta Halina, Murray Shanahan, Matthew Crosby
Abstract Recent advances in artificial intelligence have been strongly driven by the use of game environments for training and evaluating agents. Games are often accessible and versatile, with well-defined state-transitions and goals allowing for intensive training and experimentation. However, agents trained in a particular environment are usually tested on the same or slightly varied distributions, and solutions do not necessarily imply any understanding. If we want AI systems that can model and understand their environment, we need environments that explicitly test for this. Inspired by the extensive literature on animal cognition, we present an environment that keeps all the positive elements of standard gaming environments, but is explicitly designed for the testing of animal-like artificial cognition.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.07483v2
PDF https://arxiv.org/pdf/1909.07483v2.pdf
PWC https://paperswithcode.com/paper/the-animal-ai-environment-training-and
Repo https://github.com/beyretb/AnimalAI-Olympics
Framework tf

Out-of-Domain Detection for Low-Resource Text Classification Tasks

Title Out-of-Domain Detection for Low-Resource Text Classification Tasks
Authors Ming Tan, Yang Yu, Haoyu Wang, Dakuo Wang, Saloni Potdar, Shiyu Chang, Mo Yu
Abstract Out-of-domain (OOD) detection for low-resource text classification is a realistic but understudied task. The goal is to detect OOD cases with limited in-domain (ID) training data, since we observe that training data is often insufficient in machine learning applications. In this work, we propose an OOD-resistant Prototypical Network to tackle this zero-shot OOD detection and few-shot ID classification task. Evaluations on real-world datasets show that the proposed solution outperforms state-of-the-art methods on the zero-shot OOD detection task, while maintaining competitive performance on the ID classification task.
Tasks Text Classification
Published 2019-08-31
URL https://arxiv.org/abs/1909.05357v1
PDF https://arxiv.org/pdf/1909.05357v1.pdf
PWC https://paperswithcode.com/paper/out-of-domain-detection-for-low-resource-text
Repo https://github.com/SLAD-ml/few-shot-ood
Framework tf
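
A generic prototypical-network sketch of the setup described above (not the paper's OOD-resistant variant): class prototypes are mean embeddings of the few in-domain examples, a query is classified by its nearest prototype, and a query far from every prototype is flagged as out-of-domain. The embeddings and the distance threshold below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "embeddings": 5 support examples per class, clustered around the class id.
support = {c: rng.normal(loc=c, size=(5, 16)) for c in range(3)}
prototypes = {c: emb.mean(axis=0) for c, emb in support.items()}

def classify(query, threshold=6.0):
    dists = {c: np.linalg.norm(query - p) for c, p in prototypes.items()}
    nearest = min(dists, key=dists.get)
    return "OOD" if dists[nearest] > threshold else nearest

print(classify(rng.normal(loc=1, size=16)))    # near a prototype -> an ID class
print(classify(rng.normal(loc=30, size=16)))   # far from all prototypes -> "OOD"
```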

PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes

Title PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
Authors Rundi Wu, Yixin Zhuang, Kai Xu, Hao Zhang, Baoquan Chen
Abstract We introduce PQ-NET, a deep neural network which represents and generates 3D shapes via sequential part assembly. The input to our network is a 3D shape segmented into parts, where each part is first encoded into a feature representation using a part autoencoder. The core component of PQ-NET is a sequence-to-sequence or Seq2Seq autoencoder which encodes a sequence of part features into a latent vector of fixed size, and the decoder reconstructs the 3D shape, one part at a time, resulting in a sequential assembly. The latent space formed by the Seq2Seq encoder encodes both part structure and fine part geometry. The decoder can be adapted to perform several generative tasks including shape autoencoding, interpolation, novel shape generation, and single-view 3D reconstruction, where the generated shapes are all composed of meaningful parts.
Tasks 3D Reconstruction, Single-View 3D Reconstruction
Published 2019-11-25
URL https://arxiv.org/abs/1911.10949v1
PDF https://arxiv.org/pdf/1911.10949v1.pdf
PWC https://paperswithcode.com/paper/pq-net-a-generative-part-seq2seq-network-for
Repo https://github.com/ChrisWu1997/PQ-NET
Framework pytorch
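
The core Seq2Seq encoding step is straightforward to sketch: a variable-length sequence of per-part feature vectors is rolled up into one fixed-size latent code for the whole shape. A minimal PyTorch illustration with assumed dimensions (the released model adds the part autoencoder and the sequential decoder):

```python
import torch
import torch.nn as nn

part_dim, latent_dim = 128, 256
encoder = nn.GRU(input_size=part_dim, hidden_size=latent_dim, batch_first=True)

parts = torch.randn(1, 4, part_dim)   # e.g. a chair encoded as 4 part features
_, h_n = encoder(parts)               # final hidden state summarizes the sequence
shape_code = h_n.squeeze(0)           # fixed-size latent for the whole shape
print(shape_code.shape)               # torch.Size([1, 256])
```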

Multi-Agent Adversarial Inverse Reinforcement Learning

Title Multi-Agent Adversarial Inverse Reinforcement Learning
Authors Lantao Yu, Jiaming Song, Stefano Ermon
Abstract Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions to properly guide agent behaviors is particularly challenging in multi-agent scenarios. Inverse reinforcement learning provides a framework to automatically acquire suitable reward functions from expert demonstrations. Its extension to multi-agent settings, however, is difficult due to the more complex notions of rational behaviors. In this paper, we propose MA-AIRL, a new framework for multi-agent inverse reinforcement learning, which is effective and scalable for Markov games with high-dimensional state-action space and unknown dynamics. We derive our algorithm based on a new solution concept and maximum pseudolikelihood estimation within an adversarial reward learning framework. In the experiments, we demonstrate that MA-AIRL can recover reward functions that are highly correlated with ground truth ones, and significantly outperforms prior methods in terms of policy imitation.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.13220v1
PDF https://arxiv.org/pdf/1907.13220v1.pdf
PWC https://paperswithcode.com/paper/multi-agent-adversarial-inverse-reinforcement
Repo https://github.com/ermongroup/MA-AIRL
Framework none
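
MA-AIRL builds on the adversarial reward learning framework of AIRL, whose discriminator has a simple closed form. A numeric sketch of that single-agent form follows (MA-AIRL uses one such discriminator per agent; the values below are toy numbers, not the paper's implementation):

```python
import numpy as np

def airl_discriminator(f_sa, pi_a_given_s):
    """D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s))."""
    return np.exp(f_sa) / (np.exp(f_sa) + pi_a_given_s)

# Toy values: a learned reward score f(s, a) and the policy probability pi(a|s).
d = airl_discriminator(f_sa=0.7, pi_a_given_s=0.3)
print(d)   # > 0.5 means the state-action pair looks expert-like
# Note: log D - log(1 - D) recovers f(s, a) - log pi(a|s), the learned reward signal.
```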

Learning Rate Dropout

Title Learning Rate Dropout
Authors Huangxing Lin, Weihong Zeng, Xinghao Ding, Yue Huang, Chenxi Huang, John Paisley
Abstract The performance of a deep neural network is highly dependent on its training, and finding better local optimal solutions is the goal of many optimization algorithms. However, existing optimization algorithms show a preference for descent paths that converge slowly and do not seek to avoid bad local optima. In this work, we propose Learning Rate Dropout (LRD), a simple gradient-descent technique for training, related to coordinate descent. LRD empirically helps the optimizer actively explore the parameter space by randomly setting some learning rates to zero; at each iteration, only parameters whose learning rate is not 0 are updated. As the learning rates of different parameters are dropped, the optimizer samples a new loss descent path for the current update. The uncertainty of the descent path helps the model avoid saddle points and bad local minima. Experiments show that LRD is surprisingly effective in accelerating training while preventing overfitting.
Tasks
Published 2019-11-30
URL https://arxiv.org/abs/1912.00144v2
PDF https://arxiv.org/pdf/1912.00144v2.pdf
PWC https://paperswithcode.com/paper/learning-rate-dropout
Repo https://github.com/ifeherva/optimizer-benchmark
Framework pytorch
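
The mechanism is concrete enough to sketch in a few lines: at each step a random binary mask zeroes the learning rate of some coordinates, so only the surviving parameters move. A standalone NumPy illustration of that idea (not the authors' optimizer implementation, and the keep probability is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def lrd_sgd_step(params, grads, lr=0.1, keep_prob=0.5):
    # 1 = this coordinate keeps its learning rate this step, 0 = dropped.
    mask = rng.random(params.shape) < keep_prob
    return params - lr * mask * grads

w = np.ones(8)
g = np.full(8, 0.5)
print(lrd_sgd_step(w, g))   # roughly half the entries stay exactly at 1.0
```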

OncoNetExplainer: Explainable Predictions of Cancer Types Based on Gene Expression Data

Title OncoNetExplainer: Explainable Predictions of Cancer Types Based on Gene Expression Data
Authors Md. Rezaul Karim, Michael Cochez, Oya Beyan, Stefan Decker, Christoph Lange
Abstract The discovery of important biomarkers is a significant step towards understanding the molecular mechanisms of carcinogenesis, enabling accurate diagnosis for, and prognosis of, a certain cancer type. Before recommending any diagnosis, genomics data such as gene expressions (GE) and clinical outcomes need to be analyzed. However, the complex nature, high dimensionality, and heterogeneity of genomics data make the overall analysis challenging. Convolutional neural networks (CNN) have shown tremendous success in solving such problems, but neural network models are mostly perceived as 'black box' methods because their internal functioning is not well understood. Interpretability, however, is important for providing insights into why a given cancer case is of a certain type. Besides, finding the most important biomarkers can help in recommending more accurate treatments and drug repositioning. In this paper, we propose a new approach called OncoNetExplainer to make explainable predictions of cancer types based on GE data. We used genomics data about 9,074 cancer patients covering 33 different cancer types from the Pan-Cancer Atlas, on which we trained CNN and VGG16 networks using guided-gradient class activation maps++ (GradCAM++). Further, we generated class-specific heat maps to identify significant biomarkers and computed feature importance in terms of mean absolute impact to rank the top genes across all cancer types. Quantitative and qualitative analyses show that both models exhibit high confidence in predicting the cancer types correctly, with an average precision of 96.25%. To provide comparisons with the baselines, we identified top genes and cancer-specific driver genes using gradient boosted trees and SHapley Additive exPlanations (SHAP). Finally, our findings were validated with the annotations provided by the TumorPortal.
Tasks Feature Importance
Published 2019-09-09
URL https://arxiv.org/abs/1909.04169v1
PDF https://arxiv.org/pdf/1909.04169v1.pdf
PWC https://paperswithcode.com/paper/onconetexplainer-explainable-predictions-of
Repo https://github.com/rezacsedu/XAI_Cancer_Prediction
Framework tf
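
The "mean absolute impact" ranking mentioned in the abstract reduces to averaging absolute attribution scores per gene and sorting. A sketch with invented gene names and attribution values (the real pipeline obtains these scores from GradCAM++ or SHAP):

```python
import numpy as np

genes = np.array(["TP53", "BRCA1", "EGFR", "KRAS"])   # hypothetical gene panel
attributions = np.array([                              # rows = patients, cols = genes
    [ 0.9, -0.1,  0.3,  0.0],
    [ 0.7,  0.2, -0.4,  0.1],
    [ 0.8, -0.3,  0.2,  0.0],
])

mean_abs_impact = np.abs(attributions).mean(axis=0)
ranking = np.argsort(mean_abs_impact)[::-1]            # most impactful gene first
for gene, score in zip(genes[ranking], mean_abs_impact[ranking]):
    print(f"{gene}\t{score:.3f}")
```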

Long-term Joint Scheduling for Urban Traffic

Title Long-term Joint Scheduling for Urban Traffic
Authors Xianfeng Liang, Likang Wu, Joya Chen, Yang Liu, Runlong Yu, Min Hou, Han Wu, Yuyang Ye, Qi Liu, Enhong Chen
Abstract Recently, traffic congestion in modern cities has become a growing concern for residents. As reported by the Baidu traffic report, the commuting stress index has reached a surprising 1.973 in Beijing during rush hours, which results in longer trip times and increased vehicular queueing. Previous works have demonstrated that with reasonable scheduling, e.g., rebalancing bike-sharing systems and optimizing bus transportation, traffic efficiency can be significantly improved with little resource consumption. However, two disadvantages still restrict their performance: (1) they only consider a single scheduling action over a short horizon, ignoring the layout after the first repositioning, and (2) they focus on a single mode of transport, so the multi-modal characteristics of urban public transportation are largely under-exploited. In this paper, we propose an efficient and economical multi-modal traffic scheduling scheme named JLRLS based on spatio-temporal prediction, which adopts reinforcement learning to obtain an optimal long-term and joint schedule. In JLRLS, we combine multiple modes of transportation and schedule each according to its own characteristics, which potentially helps the system reach optimal performance. Our implementation of an example with PaddlePaddle is available at https://github.com/bigdata-ustc/Long-term-Joint-Scheduling, with an explanatory video at https://youtu.be/t5M2wVPhTyk.
Tasks
Published 2019-10-27
URL https://arxiv.org/abs/1910.12283v1
PDF https://arxiv.org/pdf/1910.12283v1.pdf
PWC https://paperswithcode.com/paper/long-term-joint-scheduling-for-urban-traffic
Repo https://github.com/bigdata-ustc/Long-term-Joint-Scheduling
Framework none

Unbiased Measurement of Feature Importance in Tree-Based Methods

Title Unbiased Measurement of Feature Importance in Tree-Based Methods
Authors Zhengze Zhou, Giles Hooker
Abstract We propose a modification that corrects the bias of split-improvement variable importance measures in Random Forests and other tree-based methods. These measures have been shown to be biased towards inflating the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out-of-sample data, this bias can be corrected, yielding better summaries and screening tools.
Tasks Feature Importance
Published 2019-03-12
URL https://arxiv.org/abs/1903.05179v2
PDF https://arxiv.org/pdf/1903.05179v2.pdf
PWC https://paperswithcode.com/paper/unbiased-measurement-of-feature-importance-in
Repo https://github.com/ZhengzeZhou/unbiased-feature-importance
Framework none
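
The correction described above credits each split only with the improvement it achieves on held-out data. A rough re-implementation sketch for a regression tree using scikit-learn internals, with variance as the impurity (an illustrative approximation, not the authors' package):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=400, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_tr, y_tr)
t = tree.tree_
at_node = tree.decision_path(X_te).toarray().astype(bool)   # (n_test, n_nodes)

importance = np.zeros(X.shape[1])
for node in range(t.node_count):
    if t.children_left[node] == -1:          # leaf: no split to score
        continue
    here = y_te[at_node[:, node]]
    left = y_te[at_node[:, t.children_left[node]]]
    right = y_te[at_node[:, t.children_right[node]]]
    if len(left) == 0 or len(right) == 0:
        continue
    # Out-of-sample split-improvement: variance reduction on held-out targets.
    gain = len(here) * here.var() - len(left) * left.var() - len(right) * right.var()
    importance[t.feature[node]] += gain

print(importance / importance.sum())         # held-out importance per feature
```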