Paper Group AWR 83
Deep Feature Consistent Variational Autoencoder. Stratification of patient trajectories using covariate latent variable models. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification. Faster R-CNN Features for Instance Search. Generative Adversarial Imitation Learning. Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network. Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations. Deep Structured Energy Based Models for Anomaly Detection. Efficient Training for Positive Unlabeled Learning. A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects. Conditional Image Synthesis With Auxiliary Classifier GANs. Deep Learning Human Mind for Automated Visual Classification. Joint Learning of Sentence Embeddings for Relevance and Entailment. Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation. Adapting Deep Network Features to Capture Psychological Representations.
Deep Feature Consistent Variational Autoencoder
Title | Deep Feature Consistent Variational Autoencoder |
Authors | Xianxu Hou, Linlin Shen, Ke Sun, Guoping Qiu |
Abstract | We present a novel method for constructing a Variational Autoencoder (VAE). Instead of using a pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures that the VAE’s output preserves the spatial correlation characteristics of the input, giving the output a more natural visual appearance and better perceptual quality. Building on recent deep learning work such as style transfer, we employ a pre-trained deep convolutional neural network (CNN) and use its hidden features to define a feature perceptual loss for VAE training. Evaluated on the CelebA face dataset, we show that our model produces better results than other methods in the literature. We also show that our method can produce latent vectors that capture the semantic information of facial expressions and can be used to achieve state-of-the-art performance in facial attribute prediction. |
Tasks | Style Transfer |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00291v1 |
http://arxiv.org/pdf/1610.00291v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-feature-consistent-variational |
Repo | https://github.com/ku2482/vae.pytorch |
Framework | pytorch |
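To make the feature perceptual loss concrete, here is a minimal PyTorch sketch. The choice of VGG-19 taps (relu1_1, relu2_1, relu3_1) and the loss weights are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# Frozen pre-trained VGG-19 used purely as a feature extractor.
vgg = vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad = False

FEATURE_LAYERS = {1, 6, 11}  # relu1_1, relu2_1, relu3_1 in torchvision's VGG-19

def feature_perceptual_loss(x, x_recon):
    """Sum of MSEs between hidden VGG features of input and reconstruction."""
    loss, h_x, h_r = 0.0, x, x_recon
    for i, layer in enumerate(vgg):
        h_x, h_r = layer(h_x), layer(h_r)
        if i in FEATURE_LAYERS:
            loss = loss + F.mse_loss(h_r, h_x)
    return loss

def dfc_vae_loss(x, x_recon, mu, logvar, alpha=1.0, beta=0.5):
    # Standard KL term of the VAE, plus the deep feature consistency
    # term in place of a pixel-by-pixel reconstruction loss.
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return alpha * kld + beta * feature_perceptual_loss(x, x_recon)
```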
Stratification of patient trajectories using covariate latent variable models
Title | Stratification of patient trajectories using covariate latent variable models |
Authors | Kieran R. Campbell, Christopher Yau |
Abstract | Standard models assign disease progression to discrete categories or stages based on well-characterized clinical markers. However, such a system is potentially at odds with our understanding of the underlying biology, which in highly complex systems may support a (near-)continuous evolution of disease from inception to terminal state. To learn such a continuous disease score one could infer a latent variable from dynamic “omics” data such as RNA-seq that correlates with an outcome of interest such as survival time. However, such analyses may be confounded by additional data such as clinical covariates measured in electronic health records (EHRs). As a solution to this we introduce covariate latent variable models, a novel type of latent variable model that learns a low-dimensional data representation in the presence of two (asymmetric) views of the same data source. We apply our model to TCGA colorectal cancer RNA-seq data and demonstrate how incorporating microsatellite-instability (MSI) status as an external covariate allows us to identify genes that stratify patients on an immune-response trajectory. Finally, we propose an extension termed Covariate Gaussian Process Latent Variable Models for learning nonparametric, nonlinear representations. An R package implementing variational inference for covariate latent variable models is available at http://github.com/kieranrcampbell/clvm. |
Tasks | Latent Variable Models |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08735v2 |
http://arxiv.org/pdf/1610.08735v2.pdf | |
PWC | https://paperswithcode.com/paper/stratification-of-patient-trajectories-using |
Repo | https://github.com/kieranrcampbell/clvm |
Framework | none |
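A minimal generative sketch of the idea, with an assumed parameterization: the external covariate tilts each gene's loading on the latent trajectory, so genes with a large interaction weight are the ones that stratify patients.

```python
import numpy as np

rng = np.random.default_rng(0)
N, G = 200, 50          # patients, genes (illustrative sizes)

# Latent disease score z and external covariate x (e.g. binary MSI status).
z = rng.normal(size=N)
x = rng.integers(0, 2, size=N)

# Gene-wise intercepts, baseline loadings, and covariate interactions.
eta = rng.normal(size=G)
lam = rng.normal(size=G)
beta = rng.normal(size=G)

# Covariate-modulated factor model: the covariate tilts each gene's
# loading on the latent trajectory (assumed parameterization).
Y = eta + (lam + np.outer(x, beta)) * z[:, None] + 0.1 * rng.normal(size=(N, G))
```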
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Title | Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification |
Authors | Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, Kilian Weinberger |
Abstract | In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems. |
Tasks | Cross-Lingual Document Classification, Cross-Lingual Transfer, Sentiment Analysis |
Published | 2016-06-06 |
URL | http://arxiv.org/abs/1606.01614v5 |
http://arxiv.org/pdf/1606.01614v5.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-deep-averaging-networks-for-cross |
Repo | https://github.com/ccsasuke/adan |
Framework | pytorch |
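A compact PyTorch sketch of the two-branch setup. The gradient reversal layer is a common stand-in for the adversarial min-max training (the released code trains the language discriminator with its own optimizer); layer sizes here are illustrative.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign going back."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class ADAN(nn.Module):
    def __init__(self, emb_dim=300, hidden=900, n_classes=2, n_langs=2):
        super().__init__()
        # Shared feature extractor over averaged word embeddings.
        self.extractor = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.sentiment = nn.Linear(hidden, n_classes)  # task branch
        self.language = nn.Linear(hidden, n_langs)     # adversarial branch

    def forward(self, avg_emb, lambd=1.0):
        feats = self.extractor(avg_emb)
        return (self.sentiment(feats),
                self.language(GradReverse.apply(feats, lambd)))
```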
Faster R-CNN Features for Instance Search
Title | Faster R-CNN Features for Instance Search |
Authors | Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin’ichi Satoh |
Abstract | Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN. We take advantage of the object proposals learned by a Region Proposal Network (RPN) and their associated CNN features to build an instance search pipeline composed of a first filtering stage followed by a spatial reranking. We further investigate the suitability of Faster R-CNN features when the network is fine-tuned for the same objects one wants to retrieve. We assess the performance of our proposed system with the Oxford Buildings 5k, Paris Buildings 6k and a subset of TRECVid Instance Search 2013, achieving competitive results. |
Tasks | Instance Search, Object Detection |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08893v1 |
http://arxiv.org/pdf/1604.08893v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-r-cnn-features-for-instance-search |
Repo | https://github.com/vohoaiviet/retrieval-2016-deepvision |
Framework | caffe2 |
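A minimal NumPy sketch of the two-stage pipeline, assuming cosine similarity and max-pooling over region proposals; the paper's exact similarity and reranking details may differ.

```python
import numpy as np

def l2norm(v):
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-8)

def instance_search(query_feat, db_image_feats, db_region_feats, k=100):
    """Two-stage retrieval: global filtering, then region-wise reranking.

    query_feat      : (D,) pooled CNN feature of the query instance
    db_image_feats  : (N, D) image-level pooled features
    db_region_feats : list of (R_i, D) RPN-proposal features per image
    """
    q = l2norm(query_feat)
    # Stage 1: rank the whole database by image-level cosine similarity.
    scores = l2norm(db_image_feats) @ q
    shortlist = np.argsort(-scores)[:k]
    # Stage 2: rerank the shortlist by the best-matching region proposal.
    best = np.array([(l2norm(db_region_feats[i]) @ q).max() for i in shortlist])
    return shortlist[np.argsort(-best)]
```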
Generative Adversarial Imitation Learning
Title | Generative Adversarial Imitation Learning |
Authors | Jonathan Ho, Stefano Ermon |
Abstract | Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments. |
Tasks | Imitation Learning |
Published | 2016-06-10 |
URL | http://arxiv.org/abs/1606.03476v1 |
http://arxiv.org/pdf/1606.03476v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-imitation-learning |
Repo | https://github.com/nav74neet/gail-tf-gym |
Framework | tf |
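A minimal PyTorch sketch of the adversarial core: the discriminator is trained to tell policy state-action pairs from expert ones, and its output defines a surrogate reward for the policy optimizer (TRPO in the paper); network sizes here are hypothetical.

```python
import torch
import torch.nn.functional as F
from torch import nn

obs_dim, act_dim = 8, 2     # illustrative dimensions

# D(s, a): probability a state-action pair came from the policy, not the expert.
disc = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                     nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)

def discriminator_step(expert_sa, policy_sa):
    """Logistic loss: expert pairs pushed toward 0, policy pairs toward 1."""
    loss = (F.binary_cross_entropy_with_logits(
                disc(expert_sa), torch.zeros(len(expert_sa), 1))
            + F.binary_cross_entropy_with_logits(
                disc(policy_sa), torch.ones(len(policy_sa), 1)))
    opt.zero_grad()
    loss.backward()
    opt.step()

def imitation_reward(sa):
    """Surrogate reward -log D(s, a) handed to the policy-gradient step."""
    with torch.no_grad():
        return -F.logsigmoid(disc(sa))
```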
Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network
Title | Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network |
Authors | Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme |
Abstract | The language processing mechanism of humans is generally more robust than that of computers. The Cmabrigde Uinervtisy (Cambridge University) effect from the psycholinguistics literature has demonstrated such a robust word processing mechanism, where jumbled words (e.g. Cmabrigde / Cambridge) are recognized with little cost. On the other hand, computational models for word recognition (e.g. spelling checkers) perform poorly on data with such noise. Inspired by the findings from the Cmabrigde Uinervtisy effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN). In our experiments, we demonstrate that scRNN performs significantly more robustly in word spelling correction (i.e. word recognition) than existing spelling checkers and a character-based convolutional neural network. Furthermore, we demonstrate that the model is cognitively plausible by replicating a psycholinguistics experiment on human reading difficulty using our model. |
Tasks | Spelling Correction |
Published | 2016-08-07 |
URL | http://arxiv.org/abs/1608.02214v2 |
http://arxiv.org/pdf/1608.02214v2.pdf | |
PWC | https://paperswithcode.com/paper/robsut-wrod-reocginiton-via-semi-character |
Repo | https://github.com/simonroquette/CORAP |
Framework | none |
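The semi-character representation is easy to state in code: each word becomes the concatenation of a one-hot first character, a bag of internal characters, and a one-hot last character, so internal jumbling leaves the vector unchanged. A sketch restricted to lowercase alphabetic words:

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase
IDX = {c: i for i, c in enumerate(ALPHABET)}

def semi_character_vector(word):
    """scRNN input: one-hot first char + bag of internal chars + one-hot last.

    'Cmabrigde' and 'Cambridge' map to the same vector, which is what lets
    the model shrug off internal jumbling.
    """
    word = word.lower()
    first, internal, last = (np.zeros(len(ALPHABET)) for _ in range(3))
    if word:
        first[IDX[word[0]]] = 1
        last[IDX[word[-1]]] = 1
        for c in word[1:-1]:
            internal[IDX[c]] += 1
    return np.concatenate([first, internal, last])

assert (semi_character_vector("Cmabrigde") ==
        semi_character_vector("Cambridge")).all()
```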
Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations
Title | Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations |
Authors | Vered Shwartz, Ido Dagan |
Abstract | Recognizing various semantic relations between terms is beneficial for many NLP tasks. While path-based and distributional information sources are considered complementary for this task, the superior results the latter showed recently suggested that the former’s contribution might have become obsolete. We follow the recent success of an integrated neural method for hypernymy detection (Shwartz et al., 2016) and extend it to recognize multiple relations. The empirical results show that this method is effective in the multiclass setting as well. We further show that the path-based information source always contributes to the classification, and analyze the cases in which it mostly complements the distributional information. |
Tasks | |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.05014v4 |
http://arxiv.org/pdf/1608.05014v4.pdf | |
PWC | https://paperswithcode.com/paper/path-based-vs-distributional-information-in |
Repo | https://github.com/vered1986/LexNET |
Framework | none |
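A minimal PyTorch sketch of the integrated architecture: the dependency paths connecting a term pair are each encoded by an LSTM and averaged, then concatenated with the pair's word embeddings (the distributional signal) for relation classification. Dimensions and the single-layer classifier are assumptions; see the LexNET repo for the reference implementation.

```python
import torch
from torch import nn

class LexNET(nn.Module):
    def __init__(self, emb_dim=50, path_dim=60, n_relations=5):
        super().__init__()
        self.path_lstm = nn.LSTM(emb_dim, path_dim, batch_first=True)
        self.classify = nn.Linear(2 * emb_dim + path_dim, n_relations)

    def forward(self, x_emb, y_emb, path_seqs):
        # Encode each dependency path with the LSTM, then average them.
        encoded = [self.path_lstm(p.unsqueeze(0))[1][0].squeeze()
                   for p in path_seqs]
        path_vec = torch.stack(encoded).mean(dim=0)
        # Concatenate distributional (word) and path-based information.
        return self.classify(torch.cat([x_emb, path_vec, y_emb]))
```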
Deep Structured Energy Based Models for Anomaly Detection
Title | Deep Structured Energy Based Models for Anomaly Detection |
Authors | Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang |
Abstract | In this paper, we attack the anomaly detection problem by directly modeling the data distribution with deep architectures. We propose deep structured energy based models (DSEBMs), where the energy function is the output of a deterministic deep neural network with structure. We develop novel model architectures to integrate EBMs with different types of data, such as static, sequential, and spatial data, applying the model architecture appropriate to the data structure. Our training algorithm is built on the recent development of score matching \cite{sm}, which connects an EBM with a regularized autoencoder, eliminating the need for complicated sampling methods. A statistically sound decision criterion for anomaly detection can be derived from the perspective of the energy landscape of the data distribution. We investigate two decision criteria for performing anomaly detection: the energy score and the reconstruction error. Extensive empirical studies on benchmark tasks demonstrate that our proposed model consistently matches or outperforms all the competing methods. |
Tasks | Anomaly Detection |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07717v2 |
http://arxiv.org/pdf/1605.07717v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-structured-energy-based-models-for |
Repo | https://github.com/zehuichen123/DSEBM |
Framework | none |
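A minimal PyTorch sketch of the two decision criteria. The energy network here is a plain MLP stand-in for the paper's structured architectures; via the score-matching link to regularized autoencoders, x - dE/dx acts as a reconstruction of x, so the reconstruction error reduces to the gradient norm.

```python
import torch
from torch import nn

# Plain MLP stand-in for the structured energy networks in the paper.
energy_net = nn.Sequential(nn.Linear(16, 64), nn.Softplus(),
                           nn.Linear(64, 64), nn.Softplus(),
                           nn.Linear(64, 1))

def anomaly_scores(x):
    """Return both decision criteria: energy score and reconstruction error."""
    x = x.clone().requires_grad_(True)
    e = energy_net(x)
    grad, = torch.autograd.grad(e.sum(), x)
    # The score-matching view reconstructs x as x - dE/dx, so the
    # reconstruction error is just the squared gradient norm.
    recon_error = (grad ** 2).sum(dim=1)
    return e.detach().squeeze(1), recon_error
```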
Efficient Training for Positive Unlabeled Learning
Title | Efficient Training for Positive Unlabeled Learning |
Authors | Emanuele Sansone, Francesco G. B. De Natale, Zhi-Hua Zhou |
Abstract | Positive unlabeled (PU) learning is useful in various practical situations, where there is a need to learn a classifier for a class of interest from an unlabeled data set, which may contain anomalies as well as samples from unknown classes. The learning task can be formulated as an optimization problem under the framework of statistical learning theory. Recent studies have theoretically analyzed its properties and generalization performance; nevertheless, little effort has been made to consider the problem of scalability, especially when large sets of unlabeled data are available. In this work we propose a novel scalable PU learning algorithm that is theoretically proven to provide the optimal solution, while showing superior computational and memory performance. Experimental evaluation confirms the theoretical evidence and shows that the proposed method can be successfully applied to a large variety of real-world problems involving PU learning. |
Tasks | |
Published | 2016-08-24 |
URL | http://arxiv.org/abs/1608.06807v4 |
http://arxiv.org/pdf/1608.06807v4.pdf | |
PWC | https://paperswithcode.com/paper/efficient-training-for-positive-unlabeled |
Repo | https://github.com/emsansone/USMO |
Framework | none |
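For background, this is the standard unbiased PU risk estimator (du Plessis et al.) that such statistical-learning formulations build on, sketched with a scaled squared loss. It is generic PU machinery, not the paper's USMO solver.

```python
import numpy as np

def pu_risk(g, X_pos, X_unl, pi):
    """Unbiased PU risk estimator with a scaled squared loss.

    g: decision function mapping an array of samples to scores; pi is the
    (assumed known) class prior of positives in the unlabeled set.
    """
    loss = lambda z: (1 - z) ** 2 / 4
    return (pi * loss(g(X_pos)).mean()          # positives as class +1
            + loss(-g(X_unl)).mean()            # unlabeled treated as -1 ...
            - pi * loss(-g(X_pos)).mean())      # ... minus the positive part
```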
A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects
Title | A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects |
Authors | Yonatan Belinkov, James Glass |
Abstract | Discriminating between closely related language varieties is considered a challenging and important task. This paper describes our submission to the DSL 2016 shared task, which included two sub-tasks: one on discriminating similar languages and one on identifying Arabic dialects. We developed a character-level neural network for this task. Given a sequence of characters, our model embeds each character in vector space, runs the sequence through multiple convolutions with different filter widths, and pools the convolutional representations to obtain a hidden vector representation of the text that is used for predicting the language or dialect. We primarily focused on the Arabic dialect identification task and obtained an F1 score of 0.4834, ranking 6th out of 18 participants. We also analyze the errors made by our system on the Arabic data in some detail, and point to challenges that such an approach faces. |
Tasks | |
Published | 2016-09-24 |
URL | http://arxiv.org/abs/1609.07568v1 |
http://arxiv.org/pdf/1609.07568v1.pdf | |
PWC | https://paperswithcode.com/paper/a-character-level-convolutional-neural |
Repo | https://github.com/boknilev/dsl-char-cnn |
Framework | tf |
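A minimal PyTorch sketch of the described architecture: character embeddings, parallel convolutions with several filter widths, max-over-time pooling, and a softmax classifier. Vocabulary size, widths, and channel counts are illustrative guesses.

```python
import torch
from torch import nn

class CharCNN(nn.Module):
    def __init__(self, n_chars=200, emb=64, widths=(3, 4, 5), ch=100, n_labels=12):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb, ch, w, padding=w // 2) for w in widths)
        self.out = nn.Linear(ch * len(widths), n_labels)

    def forward(self, char_ids):                 # (batch, seq_len)
        h = self.emb(char_ids).transpose(1, 2)   # (batch, emb, seq_len)
        # Max-over-time pooling of each convolution's activations.
        pooled = [c(h).relu().max(dim=2).values for c in self.convs]
        return self.out(torch.cat(pooled, dim=1))
```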
Conditional Image Synthesis With Auxiliary Classifier GANs
Title | Conditional Image Synthesis With Auxiliary Classifier GANs |
Authors | Augustus Odena, Christopher Olah, Jonathon Shlens |
Abstract | Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128x128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128x128 samples are more than twice as discriminable as artificially resized 32x32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data. |
Tasks | Conditional Image Generation, Image Generation, Image Quality Assessment |
Published | 2016-10-30 |
URL | http://arxiv.org/abs/1610.09585v4 |
http://arxiv.org/pdf/1610.09585v4.pdf | |
PWC | https://paperswithcode.com/paper/conditional-image-synthesis-with-auxiliary |
Repo | https://github.com/VitoRazor/Gan_Architecture |
Framework | tf |
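The two-part objective is compact enough to sketch directly. L_S scores real vs. fake and L_C scores class prediction; the discriminator maximizes L_S + L_C while the generator maximizes L_C - L_S, which in loss form looks like this (logit shapes assumed):

```python
import torch
import torch.nn.functional as F

def acgan_d_loss(src_real, cls_real, y_real, src_fake, cls_fake, y_fake):
    """Discriminator loss: source term L_S plus auxiliary class term L_C."""
    l_s = (F.binary_cross_entropy_with_logits(src_real, torch.ones_like(src_real))
           + F.binary_cross_entropy_with_logits(src_fake, torch.zeros_like(src_fake)))
    l_c = F.cross_entropy(cls_real, y_real) + F.cross_entropy(cls_fake, y_fake)
    return l_s + l_c

def acgan_g_loss(src_fake, cls_fake, y_fake):
    """Generator loss: flip the source labels, keep the class term."""
    return (F.binary_cross_entropy_with_logits(src_fake, torch.ones_like(src_fake))
            + F.cross_entropy(cls_fake, y_fake))
```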
Deep Learning Human Mind for Automated Visual Classification
Title | Deep Learning Human Mind for Automated Visual Classification |
Authors | Concetto Spampinato, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Mubarak Shah, Nasim Souly |
Abstract | What if we could effectively read the mind and transfer human visual capabilities to computer vision methods? In this paper, we aim to address this question by developing the first visual object classifier driven by human brain signals. In particular, we employ EEG data evoked by visual object stimuli, combined with Recurrent Neural Networks (RNNs), to learn a discriminative brain activity manifold of visual categories. Afterwards, we train a Convolutional Neural Network (CNN)-based regressor to project images onto the learned manifold, thus effectively allowing machines to employ human brain-based features for automated visual classification. We use a 32-channel EEG to record the brain activity of seven subjects while they look at images of 40 ImageNet object classes. The proposed RNN-based approach for discriminating object classes using brain signals reaches an average accuracy of about 40%, which outperforms existing methods attempting to learn EEG visual object representations. As for automated object categorization, our human brain-driven approach obtains competitive performance, comparable to that achieved by powerful CNN models, both on ImageNet and Caltech 101, thus demonstrating its classification and generalization capabilities. This gives us real hope that, indeed, the human mind can be read and transferred to machines. |
Tasks | EEG |
Published | 2016-09-01 |
URL | https://arxiv.org/abs/1609.00344v2 |
https://arxiv.org/pdf/1609.00344v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-human-mind-for-automated-visual |
Repo | https://github.com/AliAbyaneh/Extracting-Image-from-EEG-signals |
Framework | none |
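A minimal PyTorch sketch of the two stages under stated assumptions: an LSTM encoder maps a 32-channel EEG sequence to a point on the learned manifold, and a regressor (here a linear head on precomputed image features, an assumption; the paper trains a CNN) is fit with MSE to land images on the same manifold.

```python
import torch
from torch import nn

class EEGEncoder(nn.Module):
    """LSTM over the 32-channel EEG sequence -> point on the learned manifold."""
    def __init__(self, n_channels=32, hidden=128, manifold_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, manifold_dim)

    def forward(self, eeg):                      # (batch, time, channels)
        _, (h, _) = self.lstm(eeg)
        return self.proj(h[-1])                  # final hidden state

# Stage 2: regress images onto the manifold. A linear head over 2048-d
# precomputed image features is an assumption made for brevity.
regressor = nn.Linear(2048, 128)

def regression_loss(img_feats, eeg_embedding):
    return nn.functional.mse_loss(regressor(img_feats), eeg_embedding)
```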
Joint Learning of Sentence Embeddings for Relevance and Entailment
Title | Joint Learning of Sentence Embeddings for Relevance and Entailment |
Authors | Petr Baudis, Silvestr Stanko, Jan Sedivy |
Abstract | We consider the problem of Recognizing Textual Entailment within an Information Retrieval context, where we must simultaneously determine the relevancy as well as the degree of entailment of individual pieces of evidence to determine a yes/no answer to a binary natural language question. We compare several variants of neural networks for sentence embeddings in a setting of decision-making based on evidence of varying relevance. We propose a basic model to integrate evidence for entailment, show that joint training of the sentence embeddings to model relevance and entailment is feasible even with no explicit per-evidence supervision, and show the importance of evaluating strong baselines. We also demonstrate the benefit of carrying over a text comprehension model trained on an unrelated task for our small datasets. Our research is motivated primarily by a new open dataset we introduce, consisting of binary questions and news-based evidence snippets. We also apply the proposed relevance-entailment model to the similar task of ranking multiple-choice test answers, evaluating it on a preliminary dataset of school test questions as well as the standard MCTest dataset, where we improve on the neural-model state of the art. |
Tasks | Decision Making, Information Retrieval, Natural Language Inference, Reading Comprehension, Sentence Embeddings |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04655v2 |
http://arxiv.org/pdf/1605.04655v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-learning-of-sentence-embeddings-for |
Repo | https://github.com/brmson/dataset-sts |
Framework | none |
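One plausible reading of the evidence-integration step, sketched in PyTorch: per-evidence entailment probabilities are mixed with softmax-normalized relevance weights to produce the yes/no answer probability. The exact aggregation in the paper may differ.

```python
import torch

def aggregate_answer(relevance_logits, entailment_probs):
    """Mix per-evidence entailment with softmax-normalized relevance weights.

    relevance_logits, entailment_probs: (n_evidence,) tensors produced by
    the jointly trained sentence-embedding model for one question.
    """
    weights = torch.softmax(relevance_logits, dim=0)
    return (weights * entailment_probs).sum()    # P(answer = yes)
```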
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Title | Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation |
Authors | Rakshith Shetty, Jorma Laaksonen |
Abstract | We present our submission to the Microsoft Video to Language Challenge for generating short captions describing videos in the challenge dataset. Our model is based on the encoder–decoder pipeline popular in image and video captioning systems. We propose to utilize two different kinds of video features, one capturing the video content in terms of objects and attributes, and the other capturing motion and action information. Using these diverse features, we train models specializing in two separate input sub-domains. We then train an evaluator model that picks the best caption from the pool of candidates generated by these domain-expert models. We argue that, due to the diversity in the dataset, this approach is better suited to the current video captioning task than using a single model. The efficacy of our method is shown by the fact that it was rated best in the MSR Video to Language Challenge, as per human evaluation. Additionally, we ranked second on the automatic evaluation metrics. |
Tasks | Video Captioning |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.04959v1 |
http://arxiv.org/pdf/1608.04959v1.pdf | |
PWC | https://paperswithcode.com/paper/frame-and-segment-level-features-and |
Repo | https://github.com/rakshithShetty/captionGAN |
Framework | none |
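The candidate-pool step reduces to scoring and an argmax. A sketch with a hypothetical `evaluator` callable that maps (video features, caption) to a scalar relevance score:

```python
def pick_best_caption(evaluator, video_feats, candidates):
    """Score every candidate from the domain-expert generators, keep the best.

    `evaluator` stands in for the trained evaluator model; `candidates` is
    the pool of captions produced by the two specialized generators.
    """
    scores = [evaluator(video_feats, c) for c in candidates]
    return candidates[max(range(len(candidates)), key=scores.__getitem__)]
```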
Adapting Deep Network Features to Capture Psychological Representations
Title | Adapting Deep Network Features to Capture Psychological Representations |
Authors | Joshua C. Peterson, Joshua T. Abbott, Thomas L. Griffiths |
Abstract | Deep neural networks have become increasingly successful at solving classic perception problems such as object recognition, semantic segmentation, and scene understanding, often reaching or surpassing human-level accuracy. This success is due in part to the ability of DNNs to learn useful representations of high-dimensional inputs, a problem that humans must also solve. We examine the relationship between the representations learned by these networks and human psychological representations recovered from similarity judgments. We find that deep features learned in service of object classification account for a significant amount of the variance in human similarity judgments for a set of animal images. However, these features do not capture some qualitative distinctions that are a key part of human representations. To remedy this, we develop a method for adapting deep features to align with human similarity judgments, resulting in image representations that can potentially be used to extend the scope of psychological experiments. |
Tasks | Object Classification, Object Recognition, Scene Understanding, Semantic Segmentation |
Published | 2016-08-06 |
URL | http://arxiv.org/abs/1608.02164v1 |
http://arxiv.org/pdf/1608.02164v1.pdf | |
PWC | https://paperswithcode.com/paper/adapting-deep-network-features-to-capture |
Repo | https://github.com/kbraunlich/contort_DNN |
Framework | none |
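A minimal NumPy sketch of one simple reading of the adaptation: learn feature-wise weights w so that a weighted inner product of deep features reproduces human similarity judgments, fit by ridge regression over item pairs. The regularization strength is an assumption.

```python
import numpy as np

def adapt_features(F, S, lam=1.0):
    """Ridge-regress weights w so that human similarity S[i, j] is
    approximated by the weighted product sum_k w_k * F[i, k] * F[j, k].

    F: (n_items, d) deep features; S: (n_items, n_items) similarity judgments.
    """
    n, d = F.shape
    i, j = np.triu_indices(n, k=1)
    X = F[i] * F[j]                    # element-wise products per item pair
    y = S[i, j]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```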