Paper Group AWR 339
RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes
Title | RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes |
Authors | Po-Wei Wu, Yu-Jing Lin, Che-Han Chang, Edward Y. Chang, Shih-Wei Liao |
Abstract | Multi-domain image-to-image translation has gained increasing attention recently. Previous methods take an image and some target attributes as inputs and generate an output image with the desired attributes. However, such methods have two limitations. First, these methods assume binary-valued attributes and thus cannot yield satisfactory results for fine-grained control. Second, these methods require specifying the entire set of target attributes, even if most of the attributes would not be changed. To address these limitations, we propose RelGAN, a new method for multi-domain image-to-image translation. The key idea is to use relative attributes, which describe the desired change to the selected attributes. Our method is capable of modifying images by changing particular attributes of interest in a continuous manner while preserving the other attributes. Experimental results demonstrate both the quantitative and qualitative effectiveness of our method on the tasks of facial attribute transfer and interpolation. |
Tasks | Image-to-Image Translation |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07269v1 |
PDF | https://arxiv.org/pdf/1908.07269v1.pdf |
PWC | https://paperswithcode.com/paper/relgan-multi-domain-image-to-image |
Repo | https://github.com/elvisyjlin/RelGAN-PyTorch |
Framework | pytorch |
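A minimal PyTorch sketch of the input convention the abstract describes: the generator receives a relative attribute vector (target minus original), so untouched attributes contribute zeros and a single attribute can be edited continuously. The tiny generator and all sizes below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TinyRelGenerator(nn.Module):
    """Toy generator conditioned on a relative attribute vector."""
    def __init__(self, n_attrs: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + n_attrs, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, img, rel_attr):
        # Broadcast the relative attribute vector to a spatial map and concatenate.
        b, _, h, w = img.shape
        attr_map = rel_attr.view(b, -1, 1, 1).expand(b, rel_attr.size(1), h, w)
        return self.net(torch.cat([img, attr_map], dim=1))

n_attrs = 5
g = TinyRelGenerator(n_attrs)
img = torch.randn(2, 3, 64, 64)
orig = torch.tensor([[0., 1., 0., 0., 1.], [1., 0., 0., 1., 0.]])
target = orig.clone()
target[:, 2] = 0.6          # fine-grained, continuous change of one attribute
rel = target - orig         # zeros everywhere except the edited attribute
print(g(img, rel).shape)    # torch.Size([2, 3, 64, 64])
```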
Y-Autoencoders: disentangling latent representations via sequential-encoding
Title | Y-Autoencoders: disentangling latent representations via sequential-encoding |
Authors | Massimiliano Patacchiola, Patrick Fox-Roberts, Edward Rosten |
Abstract | In the last few years there have been important advancements in generative models, with the two dominant approaches being Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, standard Autoencoders (AEs) and closely related structures have remained popular because they are easy to train and adapt to different tasks. An interesting question is whether we can achieve state-of-the-art performance with AEs while retaining their good properties. We propose an answer to this question by introducing a new model called Y-Autoencoder (Y-AE). The structure and training procedure of a Y-AE enclose a representation into an implicit and an explicit part. The implicit part is similar to the output of an autoencoder and the explicit part is strongly correlated with labels in the training set. The two parts are separated in the latent space by splitting the output of the encoder into two paths (forming a Y shape) before decoding and re-encoding. We then impose a number of losses, such as reconstruction loss, and a loss on dependence between the implicit and explicit parts. Additionally, the projection in the explicit manifold is monitored by a predictor that is embedded in the encoder and trained end-to-end with no adversarial losses. We provide significant experimental results on various domains, such as separation of style and content, image-to-image translation, and inverse graphics. |
Tasks | Image-to-Image Translation |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10949v1 |
PDF | https://arxiv.org/pdf/1907.10949v1.pdf |
PWC | https://paperswithcode.com/paper/y-autoencoders-disentangling-latent |
Repo | https://github.com/mpatacchiola/Y-AE |
Framework | tf |
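The Y-shaped split can be illustrated with a small PyTorch sketch: the encoder output is divided into an implicit code and an explicit, label-supervised part, and both feed the decoder. Dimensions and the single reconstruction-plus-classification loss shown here are simplifying assumptions; the paper adds further losses (e.g., on the dependence between the two parts).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyYAE(nn.Module):
    def __init__(self, dim_in=784, dim_implicit=16, n_classes=10):
        super().__init__()
        self.encoder = nn.Linear(dim_in, dim_implicit + n_classes)
        self.decoder = nn.Linear(dim_implicit + n_classes, dim_in)
        self.dim_implicit = dim_implicit

    def encode(self, x):
        h = self.encoder(x)
        implicit = h[:, :self.dim_implicit]   # content code, not tied to labels
        explicit = h[:, self.dim_implicit:]   # logits correlated with labels
        return implicit, explicit

    def forward(self, x):
        implicit, explicit = self.encode(x)
        recon = self.decoder(torch.cat([implicit, explicit], dim=1))
        return recon, implicit, explicit

model = TinyYAE()
x = torch.rand(8, 784)
y = torch.randint(0, 10, (8,))
recon, implicit, explicit = model(x)
# Reconstruction loss plus the embedded-predictor loss; no adversarial terms.
loss = F.mse_loss(recon, x) + F.cross_entropy(explicit, y)
loss.backward()
```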
Sentiment analysis is not solved! Assessing and probing sentiment classification
Title | Sentiment analysis is not solved! Assessing and probing sentiment classification |
Authors | Jeremy Barnes, Lilja Øvrelid, Erik Velldal |
Abstract | Neural methods for sentiment analysis (SA) have led to quantitative improvements over previous approaches, but these advances are not always accompanied by a thorough analysis of the qualitative differences. Therefore, it is not clear what outstanding conceptual challenges for sentiment analysis remain. In this work, we attempt to discover what challenges still pose a problem for sentiment classifiers for English and to provide a challenging dataset. We collect the subset of sentences that an (oracle) ensemble of state-of-the-art sentiment classifiers misclassifies and then annotate them for 18 linguistic and paralinguistic phenomena, such as negation, sarcasm, modality, etc. The dataset is available at https://github.com/ltgoslo/assessing_and_probing_sentiment. Finally, we provide a case study that demonstrates the usefulness of the dataset to probe the performance of a given sentiment classifier with respect to linguistic phenomena. |
Tasks | Sentiment Analysis |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05887v1 |
PDF | https://arxiv.org/pdf/1906.05887v1.pdf |
PWC | https://paperswithcode.com/paper/sentiment-analysis-is-not-solved-assessing |
Repo | https://github.com/ltgoslo/assessing_and_probing_sentiment |
Framework | none |
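A small sketch of the kind of probing the case study describes: given sentences annotated with phenomena and a classifier's predictions, report accuracy per phenomenon. The field names and toy classifier below are assumptions for illustration, not the released dataset's schema.

```python
from collections import defaultdict

examples = [
    {"text": "Not bad at all.", "gold": "positive", "phenomena": ["negation"]},
    {"text": "Oh great, another delay.", "gold": "negative", "phenomena": ["sarcasm"]},
    {"text": "It might be worth a try.", "gold": "positive", "phenomena": ["modality"]},
]

def my_classifier(text):            # stand-in for any sentiment classifier under study
    return "negative" if "not" in text.lower() else "positive"

correct, total = defaultdict(int), defaultdict(int)
for ex in examples:
    pred = my_classifier(ex["text"])
    for ph in ex["phenomena"]:
        total[ph] += 1
        correct[ph] += int(pred == ex["gold"])

for ph in sorted(total):
    print(f"{ph:10s} accuracy = {correct[ph] / total[ph]:.2f}")
```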
LIDA: Lightweight Interactive Dialogue Annotator
Title | LIDA: Lightweight Interactive Dialogue Annotator |
Authors | Edward Collins, Nikolai Rozanov, Bingbing Zhang |
Abstract | Dialogue systems have the potential to change how people interact with machines but are highly dependent on the quality of the data used to train them. It is therefore important to develop good dialogue annotation tools that can improve the speed and quality of dialogue data annotation. With this in mind, we introduce LIDA, an annotation tool designed specifically for conversation data. As far as we know, LIDA is the first dialogue annotation system that handles the entire dialogue annotation pipeline from raw text, such as the output of transcription services, to structured conversation data. Furthermore, it supports the integration of arbitrary machine learning models as annotation recommenders and also has a dedicated interface to resolve inter-annotator disagreements, such as those arising after crowdsourcing annotations for a dataset. LIDA is fully open source, documented, and publicly available at https://github.com/Wluper/lida. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01599v1 |
PDF | https://arxiv.org/pdf/1911.01599v1.pdf |
PWC | https://paperswithcode.com/paper/lida-lightweight-interactive-dialogue-1 |
Repo | https://github.com/Wluper/lida |
Framework | none |
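To illustrate the recommender idea (a model proposes a label that the annotator then accepts or corrects), here is a toy dialogue-act recommender. This is a hypothetical sketch, not LIDA's actual plugin API; see the repository for the real integration.

```python
from typing import List, Dict

def recommend_dialogue_act(turn: str) -> Dict[str, float]:
    """Return label scores for one dialogue turn (keyword heuristic stand-in for an ML model)."""
    text = turn.lower()
    if text.endswith("?") or text.startswith(("what", "how", "when")):
        return {"question": 0.9, "statement": 0.1}
    if any(w in text for w in ("thanks", "thank you", "bye")):
        return {"closing": 0.8, "statement": 0.2}
    return {"statement": 0.7, "question": 0.3}

def annotate(dialogue: List[str]) -> List[Dict]:
    annotations = []
    for turn in dialogue:
        scores = recommend_dialogue_act(turn)
        suggestion = max(scores, key=scores.get)   # shown to the human as a pre-filled label
        annotations.append({"turn": turn, "suggested_act": suggestion, "scores": scores})
    return annotations

for a in annotate(["When does the train leave?", "At 6 pm.", "Thanks, bye!"]):
    print(a["suggested_act"], "<-", a["turn"])
```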
Generation of 3D Brain MRI Using Auto-Encoding Generative Adversarial Networks
Title | Generation of 3D Brain MRI Using Auto-Encoding Generative Adversarial Networks |
Authors | Gihyun Kwon, Chihye Han, Dae-shik Kim |
Abstract | As deep learning is showing unprecedented success in medical image analysis tasks, the lack of sufficient medical data is emerging as a critical problem. While recent attempts to solve the limited data problem using Generative Adversarial Networks (GAN) have been successful in generating realistic images with diversity, most of them are based on image-to-image translation and thus require extensive datasets from different domains. Here, we propose a novel model that can successfully generate 3D brain MRI data from random vectors by learning the data distribution. Our 3D GAN model solves both image blurriness and mode collapse problems by leveraging alpha-GAN that combines the advantages of Variational Auto-Encoder (VAE) and GAN with an additional code discriminator network. We also use the Wasserstein GAN with Gradient Penalty (WGAN-GP) loss to reduce training instability. To demonstrate the effectiveness of our model, we generate new images of normal brain MRI and show that our model outperforms baseline models in both quantitative and qualitative measurements. We also train the model to synthesize brain disorder MRI data to demonstrate the wide applicability of our model. Our results suggest that the proposed model can successfully generate various types and modalities of 3D whole brain volumes from a small set of training data. |
Tasks | Image-to-Image Translation |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02498v1 |
PDF | https://arxiv.org/pdf/1908.02498v1.pdf |
PWC | https://paperswithcode.com/paper/generation-of-3d-brain-mri-using-auto |
Repo | https://github.com/cyclomon/3dbraingen |
Framework | pytorch |
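The WGAN-GP term mentioned in the abstract penalizes the critic's gradient norm on interpolations between real and generated volumes. Below is a minimal PyTorch sketch for 3-D volumes; the toy critic and penalty weight are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

critic = nn.Sequential(                      # toy 3-D critic
    nn.Conv3d(1, 8, kernel_size=4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Flatten(), nn.Linear(8 * 8 * 8 * 8, 1),
)

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # Random interpolation between real and fake volumes.
    eps = torch.rand(real.size(0), 1, 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = critic(interp)
    grads = torch.autograd.grad(score.sum(), interp, create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()

real = torch.randn(2, 1, 16, 16, 16)         # real 3-D volumes (toy size)
fake = torch.randn(2, 1, 16, 16, 16)         # generator output
print(gradient_penalty(critic, real, fake).item())
```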
ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation
Title | ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation |
Authors | Xinxin Hu, Kailun Yang, Lei Fei, Kaiwei Wang |
Abstract | Compared to RGB semantic segmentation, RGBD semantic segmentation can achieve better performance by taking depth information into consideration. However, it is still problematic for contemporary segmenters to effectively exploit RGBD information since the feature distributions of RGB and depth (D) images vary significantly in different scenes. In this paper, we propose an Attention Complementary Network (ACNet) that selectively gathers features from RGB and depth branches. The main contributions lie in the Attention Complementary Module (ACM) and the architecture with three parallel branches. More precisely, ACM is a channel attention-based module that extracts weighted features from RGB and depth branches. The architecture preserves the inference of the original RGB and depth branches, and enables the fusion branch at the same time. Based on the above structures, ACNet is capable of exploiting more high-quality features from different channels. We evaluate our model on SUN-RGBD and NYUDv2 datasets, and prove that our model outperforms state-of-the-art methods. In particular, a mIoU score of 48.3% on NYUDv2 test set is achieved with ResNet50. We will release our source code based on PyTorch and the trained segmentation model at https://github.com/anheidelonghu/ACNet. |
Tasks | Semantic Segmentation |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10089v1 |
PDF | https://arxiv.org/pdf/1905.10089v1.pdf |
PWC | https://paperswithcode.com/paper/acnet-attention-based-network-to-exploit |
Repo | https://github.com/anheidelonghu/ACNet |
Framework | pytorch |
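The channel-attention fusion idea can be sketched with a squeeze-and-excitation style module: each branch's features are re-weighted per channel before being summed into the fusion branch. This is a PyTorch illustration in the spirit of the ACM, not the paper's exact module.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x).view(x.size(0), x.size(1), 1, 1)
        return x * w                          # per-channel re-weighted features

class AttentionFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.att_rgb = ChannelAttention(channels)
        self.att_depth = ChannelAttention(channels)

    def forward(self, feat_rgb, feat_depth):
        # Sum the attention-weighted branch features into the fusion branch.
        return self.att_rgb(feat_rgb) + self.att_depth(feat_depth)

fusion = AttentionFusion(64)
rgb = torch.randn(1, 64, 30, 40)
depth = torch.randn(1, 64, 30, 40)
print(fusion(rgb, depth).shape)               # torch.Size([1, 64, 30, 40])
```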
Capture, Learning, and Synthesis of 3D Speaking Styles
Title | Capture, Learning, and Synthesis of 3D Speaking Styles |
Authors | Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, Michael J. Black |
Abstract | Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation), takes any speech signal as input - even speech in languages other than English - and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes at http://voca.is.tue.mpg.de. |
Tasks | 3D Face Animation, Talking Face Generation |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03079v1 |
PDF | https://arxiv.org/pdf/1905.03079v1.pdf |
PWC | https://paperswithcode.com/paper/capture-learning-and-synthesis-of-3d-speaking |
Repo | https://github.com/TimoBolkart/voca |
Framework | tf |
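The conditioning mechanism described above (a subject label controls speaking style) can be sketched by concatenating a one-hot identity to the audio feature before regressing per-vertex offsets. All sizes and the tiny network in this PyTorch sketch are placeholder assumptions, not the VOCA architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySpeechToOffsets(nn.Module):
    def __init__(self, audio_dim=29, n_subjects=8, n_vertices=5023):
        super().__init__()
        self.n_subjects = n_subjects
        self.n_vertices = n_vertices
        self.net = nn.Sequential(
            nn.Linear(audio_dim + n_subjects, 256), nn.ReLU(),
            nn.Linear(256, n_vertices * 3),
        )

    def forward(self, audio_feat, subject_id):
        one_hot = F.one_hot(subject_id, self.n_subjects).float()
        offsets = self.net(torch.cat([audio_feat, one_hot], dim=1))
        return offsets.view(-1, self.n_vertices, 3)   # offsets added to a template mesh

model = TinySpeechToOffsets()
audio = torch.randn(4, 29)                 # one audio feature window per frame
subject = torch.tensor([0, 1, 1, 3])       # swapping the label changes speaking style
print(model(audio, subject).shape)         # torch.Size([4, 5023, 3])
```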
The Animal-AI Environment: Training and Testing Animal-Like Artificial Cognition
Title | The Animal-AI Environment: Training and Testing Animal-Like Artificial Cognition |
Authors | Benjamin Beyret, José Hernández-Orallo, Lucy Cheke, Marta Halina, Murray Shanahan, Matthew Crosby |
Abstract | Recent advances in artificial intelligence have been strongly driven by the use of game environments for training and evaluating agents. Games are often accessible and versatile, with well-defined state-transitions and goals allowing for intensive training and experimentation. However, agents trained in a particular environment are usually tested on the same or slightly varied distributions, and solutions do not necessarily imply any understanding. If we want AI systems that can model and understand their environment, we need environments that explicitly test for this. Inspired by the extensive literature on animal cognition, we present an environment that keeps all the positive elements of standard gaming environments, but is explicitly designed for the testing of animal-like artificial cognition. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.07483v2 |
PDF | https://arxiv.org/pdf/1909.07483v2.pdf |
PWC | https://paperswithcode.com/paper/the-animal-ai-environment-training-and |
Repo | https://github.com/beyretb/AnimalAI-Olympics |
Framework | tf |
Out-of-Domain Detection for Low-Resource Text Classification Tasks
Title | Out-of-Domain Detection for Low-Resource Text Classification Tasks |
Authors | Ming Tan, Yang Yu, Haoyu Wang, Dakuo Wang, Saloni Potdar, Shiyu Chang, Mo Yu |
Abstract | Out-of-domain (OOD) detection for low-resource text classification is a realistic but understudied task. The goal is to detect the OOD cases with limited in-domain (ID) training data, since we observe that training data is often insufficient in machine learning applications. In this work, we propose an OOD-resistant Prototypical Network to tackle this zero-shot OOD detection and few-shot ID classification task. Evaluation on real-world datasets shows that the proposed solution outperforms state-of-the-art methods on the zero-shot OOD detection task, while maintaining competitive performance on the ID classification task. |
Tasks | Text Classification |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.05357v1 |
PDF | https://arxiv.org/pdf/1909.05357v1.pdf |
PWC | https://paperswithcode.com/paper/out-of-domain-detection-for-low-resource-text |
Repo | https://github.com/SLAD-ml/few-shot-ood |
Framework | tf |
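A minimal sketch of the prototypical-network idea the method builds on: class prototypes are mean embeddings of the few ID support examples, a query is assigned to the nearest prototype, and it is flagged as OOD when even the nearest prototype is too far away. The encoder and threshold below are illustrative assumptions; the paper's OOD-resistant training is not reproduced here.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(300, 64), nn.ReLU(), nn.Linear(64, 32))  # toy sentence encoder

def classify_with_ood(query, support_x, support_y, n_classes, threshold=5.0):
    emb_s, emb_q = encoder(support_x), encoder(query)
    # One prototype per ID class: the mean embedding of its support examples.
    prototypes = torch.stack([emb_s[support_y == c].mean(0) for c in range(n_classes)])
    dists = torch.cdist(emb_q, prototypes)         # (n_query, n_classes)
    min_dist, pred = dists.min(dim=1)
    pred[min_dist > threshold] = -1                # -1 marks out-of-domain
    return pred

support_x = torch.randn(20, 300)                   # few ID examples per class (as vectors)
support_y = torch.arange(4).repeat(5)              # 4 ID classes, 5 shots each
query = torch.randn(5, 300)
print(classify_with_ood(query, support_x, support_y, n_classes=4))
```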
PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
Title | PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes |
Authors | Rundi Wu, Yixin Zhuang, Kai Xu, Hao Zhang, Baoquan Chen |
Abstract | We introduce PQ-NET, a deep neural network which represents and generates 3D shapes via sequential part assembly. The input to our network is a 3D shape segmented into parts, where each part is first encoded into a feature representation using a part autoencoder. The core component of PQ-NET is a sequence-to-sequence or Seq2Seq autoencoder which encodes a sequence of part features into a latent vector of fixed size, and the decoder reconstructs the 3D shape, one part at a time, resulting in a sequential assembly. The latent space formed by the Seq2Seq encoder encodes both part structure and fine part geometry. The decoder can be adapted to perform several generative tasks including shape autoencoding, interpolation, novel shape generation, and single-view 3D reconstruction, where the generated shapes are all composed of meaningful parts. |
Tasks | 3D Reconstruction, Single-View 3D Reconstruction |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10949v1 |
PDF | https://arxiv.org/pdf/1911.10949v1.pdf |
PWC | https://paperswithcode.com/paper/pq-net-a-generative-part-seq2seq-network-for |
Repo | https://github.com/ChrisWu1997/PQ-NET |
Framework | pytorch |
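The Seq2Seq part autoencoder can be sketched as a recurrent encoder over per-part feature vectors followed by a recurrent decoder that emits one part feature at a time. Dimensions and the GRU choice in this PyTorch sketch are assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class PartSeq2Seq(nn.Module):
    def __init__(self, part_dim=128, latent_dim=256):
        super().__init__()
        self.encoder = nn.GRU(part_dim, latent_dim, batch_first=True)
        self.decoder_cell = nn.GRUCell(part_dim, latent_dim)
        self.out = nn.Linear(latent_dim, part_dim)

    def forward(self, part_seq):
        # part_seq: (batch, n_parts, part_dim), each row from a part autoencoder
        _, h = self.encoder(part_seq)             # h: (1, batch, latent_dim)
        h = h.squeeze(0)                          # latent code for the whole shape
        prev = torch.zeros(part_seq.size(0), part_seq.size(2))
        outputs = []
        for _ in range(part_seq.size(1)):         # reconstruct one part at a time
            h = self.decoder_cell(prev, h)
            prev = self.out(h)
            outputs.append(prev)
        return torch.stack(outputs, dim=1)        # (batch, n_parts, part_dim)

model = PartSeq2Seq()
parts = torch.randn(2, 4, 128)                    # e.g. 4 encoded chair parts
print(model(parts).shape)                         # torch.Size([2, 4, 128])
```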
Multi-Agent Adversarial Inverse Reinforcement Learning
Title | Multi-Agent Adversarial Inverse Reinforcement Learning |
Authors | Lantao Yu, Jiaming Song, Stefano Ermon |
Abstract | Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions to properly guide agent behaviors is particularly challenging in multi-agent scenarios. Inverse reinforcement learning provides a framework to automatically acquire suitable reward functions from expert demonstrations. Its extension to multi-agent settings, however, is difficult due to the more complex notions of rational behaviors. In this paper, we propose MA-AIRL, a new framework for multi-agent inverse reinforcement learning, which is effective and scalable for Markov games with high-dimensional state-action space and unknown dynamics. We derive our algorithm based on a new solution concept and maximum pseudolikelihood estimation within an adversarial reward learning framework. In the experiments, we demonstrate that MA-AIRL can recover reward functions that are highly correlated with ground truth ones, and significantly outperforms prior methods in terms of policy imitation. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13220v1 |
PDF | https://arxiv.org/pdf/1907.13220v1.pdf |
PWC | https://paperswithcode.com/paper/multi-agent-adversarial-inverse-reinforcement |
Repo | https://github.com/ermongroup/MA-AIRL |
Framework | none |
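A sketch of the AIRL-style discriminator that the multi-agent framework builds on, shown here for a single agent: D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), where f is the learned reward estimator and pi is the current policy. Networks and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

f_net = nn.Sequential(nn.Linear(4 + 2, 32), nn.ReLU(), nn.Linear(32, 1))   # f(s, a)

def discriminator(state, action, log_pi_a):
    """Probability that (state, action) came from the expert rather than the policy."""
    f = f_net(torch.cat([state, action], dim=1)).squeeze(1)
    # exp(f) / (exp(f) + pi) rewritten in the numerically stable form sigmoid(f - log pi).
    return torch.sigmoid(f - log_pi_a)

state = torch.randn(8, 4)
action = torch.randn(8, 2)
log_pi_a = torch.randn(8)                  # log-probabilities from the current policy
d = discriminator(state, action, log_pi_a)
print(d.shape, float(d.min()), float(d.max()))
```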
Learning Rate Dropout
Title | Learning Rate Dropout |
Authors | Huangxing Lin, Weihong Zeng, Xinghao Ding, Yue Huang, Chenxi Huang, John Paisley |
Abstract | The performance of a deep neural network is highly dependent on its training, and finding better local optimal solutions is the goal of many optimization algorithms. However, existing optimization algorithms show a preference for descent paths that converge slowly and do not seek to avoid bad local optima. In this work, we propose Learning Rate Dropout (LRD), a simple gradient descent technique for training related to coordinate descent. LRD empirically helps the optimizer actively explore the parameter space by randomly setting some learning rates to zero; at each iteration, only parameters whose learning rate is not zero are updated. As the learning rates of different parameters are dropped, the optimizer samples a new loss descent path for the current update. The uncertainty of the descent path helps the model avoid saddle points and bad local minima. Experiments show that LRD is surprisingly effective in accelerating training while preventing overfitting. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00144v2 |
PDF | https://arxiv.org/pdf/1912.00144v2.pdf |
PWC | https://paperswithcode.com/paper/learning-rate-dropout |
Repo | https://github.com/ifeherva/optimizer-benchmark |
Framework | pytorch |
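The update rule is easy to sketch: at every step a random binary mask zeroes the learning rate of some parameters, so only the surviving coordinates move. The keep probability and the plain SGD update in this sketch are illustrative choices.

```python
import torch

def sgd_step_with_lr_dropout(params, lr=0.1, keep_prob=0.5):
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            mask = (torch.rand_like(p) < keep_prob).float()   # 1 = keep this learning rate
            p -= lr * mask * p.grad                           # masked coordinates stay put

w = torch.randn(3, 3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()
sgd_step_with_lr_dropout([w])
```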
OncoNetExplainer: Explainable Predictions of Cancer Types Based on Gene Expression Data
Title | OncoNetExplainer: Explainable Predictions of Cancer Types Based on Gene Expression Data |
Authors | Md. Rezaul Karim, Michael Cochez, Oya Beyan, Stefan Decker, Christoph Lange |
Abstract | The discovery of important biomarkers is a significant step towards understanding the molecular mechanisms of carcinogenesis, enabling accurate diagnosis of, and prognosis for, a certain cancer type. Before recommending any diagnosis, genomics data such as gene expressions (GE) and clinical outcomes need to be analyzed. However, the complex nature, high dimensionality, and heterogeneity of genomics data make the overall analysis challenging. Convolutional neural networks (CNN) have shown tremendous success in solving such problems. However, neural network models are perceived mostly as 'black box' methods because their internal functioning is not well understood. Yet interpretability is important for providing insights into why a given cancer case has a certain type. Moreover, finding the most important biomarkers can help in recommending more accurate treatments and drug repositioning. In this paper, we propose a new approach called OncoNetExplainer to make explainable predictions of cancer types based on GE data. We used genomics data about 9,074 cancer patients covering 33 different cancer types from the Pan-Cancer Atlas, on which we trained CNN and VGG16 networks using guided-gradient class activation maps (GradCAM++). Further, we generated class-specific heat maps to identify significant biomarkers and computed feature importance in terms of mean absolute impact to rank top genes across all the cancer types. Quantitative and qualitative analyses show that both models exhibit high confidence in predicting the cancer types correctly, giving an average precision of 96.25%. To provide comparisons with the baselines, we identified top genes and cancer-specific driver genes using gradient boosted trees and SHapley Additive exPlanations (SHAP). Finally, our findings were validated with the annotations provided by the TumorPortal. |
Tasks | Feature Importance |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04169v1 |
PDF | https://arxiv.org/pdf/1909.04169v1.pdf |
PWC | https://paperswithcode.com/paper/onconetexplainer-explainable-predictions-of |
Repo | https://github.com/rezacsedu/XAI_Cancer_Prediction |
Framework | tf |
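A compact PyTorch sketch of gradient-weighted class activation mapping on a toy CNN over gene-expression values arranged on a 2-D grid (plain Grad-CAM; the GradCAM++ used in the paper refines the channel weighting). The network, grid size, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(1, 8, 3, padding=1)          # toy feature extractor
head = nn.Linear(8, 33)                       # 33 cancer types, as in the abstract

x = torch.randn(1, 1, 16, 16)                 # one sample's expression values on a 2-D grid
feat = conv(x)                                # (1, 8, 16, 16)
feat.retain_grad()                            # keep gradients of the feature maps
logits = head(F.adaptive_avg_pool2d(feat, 1).flatten(1))
cls = int(logits.argmax(dim=1))
logits[0, cls].backward()                     # gradient of the predicted class score

weights = feat.grad.mean(dim=(2, 3), keepdim=True)    # per-channel importance
cam = F.relu((weights * feat).sum(dim=1)).detach()    # (1, 16, 16) heat map
cam = cam / (cam.max() + 1e-8)                        # hot cells point at candidate biomarkers
print(cam.shape)
```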
Long-term Joint Scheduling for Urban Traffic
Title | Long-term Joint Scheduling for Urban Traffic |
Authors | Xianfeng Liang, Likang Wu, Joya Chen, Yang Liu, Runlong Yu, Min Hou, Han Wu, Yuyang Ye, Qi Liu, Enhong Chen |
Abstract | Recently, traffic congestion in modern cities has become a growing worry for residents. As reported in the Baidu traffic report, the commuting stress index has reached a surprising 1.973 in Beijing during rush hours, resulting in longer trip times and increased vehicular queueing. Previous works have demonstrated that with reasonable scheduling, e.g., rebalancing bike-sharing systems and optimizing bus transportation, traffic efficiency can be significantly improved with little resource consumption. However, two disadvantages restrict their performance: (1) they only consider a single scheduling step over a short horizon, ignoring the layout after the first repositioning, and (2) they focus on a single mode of transport, so the multi-modal characteristics of urban public transportation are largely under-exploited. In this paper, we propose an efficient and economical multi-modal traffic scheduling scheme named JLRLS based on spatio-temporal prediction, which adopts reinforcement learning to obtain an optimal long-term and joint schedule. In JLRLS, we combine multiple modes of transportation and schedule each according to its own characteristics, which potentially helps the system reach optimal performance. Our implementation of an example in PaddlePaddle is available at https://github.com/bigdata-ustc/Long-term-Joint-Scheduling, with an explanatory video at https://youtu.be/t5M2wVPhTyk. |
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12283v1 |
PDF | https://arxiv.org/pdf/1910.12283v1.pdf |
PWC | https://paperswithcode.com/paper/long-term-joint-scheduling-for-urban-traffic |
Repo | https://github.com/bigdata-ustc/Long-term-Joint-Scheduling |
Framework | none |
Unbiased Measurement of Feature Importance in Tree-Based Methods
Title | Unbiased Measurement of Feature Importance in Tree-Based Methods |
Authors | Zhengze Zhou, Giles Hooker |
Abstract | We propose a modification that corrects for split-improvement variable importance measures in Random Forests and other tree-based methods. These methods have been shown to be biased towards increasing the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out of sample data, this bias can be corrected yielding better summaries and screening tools. |
Tasks | Feature Importance |
Published | 2019-03-12 |
URL | https://arxiv.org/abs/1903.05179v2 |
PDF | https://arxiv.org/pdf/1903.05179v2.pdf |
PWC | https://paperswithcode.com/paper/unbiased-measurement-of-feature-importance-in |
Repo | https://github.com/ZhengzeZhou/unbiased-feature-importance |
Framework | none |
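The core correction is to measure a split's impurity decrease on data that was not used to choose it. The sketch below does this for one candidate split with Gini impurity on a pure-noise feature: the in-sample improvement is optimistically biased, while the held-out estimate is honest. Data and thresholds are toy assumptions; the paper applies this idea per split throughout a fitted forest.

```python
import numpy as np

def gini(y):
    if len(y) == 0:
        return 0.0
    p = np.bincount(y, minlength=2) / len(y)
    return 1.0 - np.sum(p ** 2)

def split_improvement(x, y, threshold):
    """Weighted Gini decrease of splitting feature x at threshold, measured on (x, y)."""
    left, right = y[x <= threshold], y[x > threshold]
    n = len(y)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(y) - children

rng = np.random.default_rng(0)
x_train = rng.normal(size=200)
y_train = rng.integers(0, 2, size=200)       # pure-noise feature: true importance is ~0
x_test = rng.normal(size=200)
y_test = rng.integers(0, 2, size=200)

# Greedily pick the threshold that looks best on the training data ...
thresholds = np.quantile(x_train, np.linspace(0.05, 0.95, 19))
best = max(thresholds, key=lambda t: split_improvement(x_train, y_train, t))

# ... its in-sample improvement is inflated; the held-out estimate is unbiased.
print("train improvement:   ", split_improvement(x_train, y_train, best))
print("held-out improvement:", split_improvement(x_test, y_test, best))
```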