May 7, 2019

Paper Group AWR 83

Deep Feature Consistent Variational Autoencoder

Title Deep Feature Consistent Variational Autoencoder
Authors Xianxu Hou, Linlin Shen, Ke Sun, Guoping Qiu
Abstract We present a novel method for constructing Variational Autoencoder (VAE). Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures the VAE’s output to preserve the spatial correlation characteristics of the input, thus leading the output to have a more natural visual appearance and better perceptual quality. Based on recent deep learning works such as style transfer, we employ a pre-trained deep convolutional neural network (CNN) and use its hidden features to define a feature perceptual loss for VAE training. Evaluated on the CelebA face dataset, we show that our model produces better results than other methods in the literature. We also show that our method can produce latent vectors that can capture the semantic information of face expressions and can be used to achieve state-of-the-art performance in facial attribute prediction.
Tasks Style Transfer
Published 2016-10-02
URL http://arxiv.org/abs/1610.00291v1
PDF http://arxiv.org/pdf/1610.00291v1.pdf
PWC https://paperswithcode.com/paper/deep-feature-consistent-variational
Repo https://github.com/ku2482/vae.pytorch
Framework pytorch
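
The contrast between a pixel-by-pixel loss and a feature perceptual loss can be illustrated with a minimal sketch. This is not the authors' code: `toy_features` is a hypothetical stand-in for the hidden layers of a pre-trained CNN, and the "images" are short 1-D lists chosen only to make the arithmetic easy to follow.

```python
def pixel_loss(x, y):
    """Mean squared error computed pixel by pixel."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def toy_features(x):
    """Hypothetical stand-in for pre-trained CNN features (an assumption).

    Layer 1: differences between neighbouring pixels (local structure).
    Layer 2: differences of those differences (coarser structure).
    """
    layer1 = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    layer2 = [layer1[i + 1] - layer1[i] for i in range(len(layer1) - 1)]
    return [layer1, layer2]


def feature_perceptual_loss(x, y, features=toy_features):
    """Sum over layers of the MSE between features of x and features of y."""
    total = 0.0
    for fx, fy in zip(features(x), features(y)):
        total += sum((a - b) ** 2 for a, b in zip(fx, fy)) / len(fx)
    return total


original = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
brighter = [v + 0.5 for v in original]      # same structure, uniform offset
scrambled = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]  # same pixel values, different layout

print(pixel_loss(original, brighter), feature_perceptual_loss(original, brighter))
print(pixel_loss(original, scrambled), feature_perceptual_loss(original, scrambled))
```

Under this toy extractor a uniform brightness shift costs nothing in feature space although its pixel loss is non-zero, while rearranging the same pixel values is penalised heavily. A real setup would replace `toy_features` with activations from the pre-trained network.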

Stratification of patient trajectories using covariate latent variable models

Title Stratification of patient trajectories using covariate latent variable models
Authors Kieran R. Campbell, Christopher Yau
Abstract Standard models assign disease progression to discrete categories or stages based on well-characterized clinical markers. However, such a system is potentially at odds with our understanding of the underlying biology, which in highly complex systems may support a (near-)continuous evolution of disease from inception to terminal state. To learn such a continuous disease score one could infer a latent variable from dynamic “omics” data such as RNA-seq that correlates with an outcome of interest such as survival time. However, such analyses may be confounded by additional data such as clinical covariates measured in electronic health records (EHRs). As a solution to this we introduce covariate latent variable models, a novel type of latent variable model that learns a low-dimensional data representation in the presence of two (asymmetric) views of the same data source. We apply our model to TCGA colorectal cancer RNA-seq data and demonstrate how incorporating microsatellite-instability (MSI) status as an external covariate allows us to identify genes that stratify patients on an immune-response trajectory. Finally, we propose an extension termed Covariate Gaussian Process Latent Variable Models for learning nonparametric, nonlinear representations. An R package implementing variational inference for covariate latent variable models is available at http://github.com/kieranrcampbell/clvm.
Tasks Latent Variable Models
Published 2016-10-27
URL http://arxiv.org/abs/1610.08735v2
PDF http://arxiv.org/pdf/1610.08735v2.pdf
PWC https://paperswithcode.com/paper/stratification-of-patient-trajectories-using
Repo https://github.com/kieranrcampbell/clvm
Framework none

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

Title Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Authors Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, Kilian Weinberger
Abstract In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.
Tasks Cross-Lingual Document Classification, Cross-Lingual Transfer, Sentiment Analysis
Published 2016-06-06
URL http://arxiv.org/abs/1606.01614v5
PDF http://arxiv.org/pdf/1606.01614v5.pdf
PWC https://paperswithcode.com/paper/adversarial-deep-averaging-networks-for-cross
Repo https://github.com/ccsasuke/adan
Framework pytorch
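
The two-branch layout described in the abstract (a shared feature extractor feeding a sentiment classifier and a language discriminator) can be sketched as a forward pass. Everything concrete here is an assumption made for illustration: the 2-d embeddings, the token names, and the hand-picked weights are hypothetical, each branch is reduced to a single linear unit, and the adversarial training itself is not shown.

```python
def deep_average(tokens, embeddings):
    """Shared feature extractor: average the embeddings of known tokens."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no known tokens in input")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def linear_score(features, weights, bias=0.0):
    """A single linear unit standing in for a classifier branch."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# Hypothetical bilingual embeddings: dim 0 ~ sentiment, dim 1 ~ language.
EMBEDDINGS = {
    "src_good": [1.0, 0.5], "src_bad": [-1.0, 0.5],    # source language
    "tgt_good": [1.0, -0.5], "tgt_bad": [-1.0, -0.5],  # target language
}
SENTIMENT_WEIGHTS = [1.0, 0.0]  # hypothetical sentiment classifier branch
LANGUAGE_WEIGHTS = [0.0, 1.0]   # hypothetical language discriminator branch

features_src = deep_average(["src_good", "src_good", "src_bad"], EMBEDDINGS)
features_tgt = deep_average(["tgt_good", "tgt_good", "tgt_bad"], EMBEDDINGS)

for name, feats in (("source", features_src), ("target", features_tgt)):
    print(name,
          "sentiment:", linear_score(feats, SENTIMENT_WEIGHTS),
          "language:", linear_score(feats, LANGUAGE_WEIGHTS))
```

In this toy setup the discriminator can still separate the two languages through the second feature dimension. The adversarial objective described in the abstract would push the shared extractor to remove that signal while keeping the sentiment dimension intact.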

Faster R-CNN Features for Instance Search

Title Faster R-CNN Features for Instance Search
Authors Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin’ichi Satoh
Abstract Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN. We take advantage of the object proposals learned by a Region Proposal Network (RPN) and their associated CNN features to build an instance search pipeline composed of a first filtering stage followed by a spatial reranking. We further investigate the suitability of Faster R-CNN features when the network is fine-tuned for the same objects one wants to retrieve. We assess the performance of our proposed system with the Oxford Buildings 5k, Paris Buildings 6k and a subset of TRECVid Instance Search 2013, achieving competitive results.
Tasks Instance Search, Object Detection
Published 2016-04-29
URL http://arxiv.org/abs/1604.08893v1
PDF http://arxiv.org/pdf/1604.08893v1.pdf
PWC https://paperswithcode.com/paper/faster-r-cnn-features-for-instance-search
Repo https://github.com/vohoaiviet/retrieval-2016-deepvision
Framework caffe2
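
The two-stage pipeline named in the abstract (a filtering stage followed by spatial reranking) can be sketched roughly as below. The abstract does not give the scoring details, so this sketch assumes cosine similarity throughout and scores a shortlisted image by its best-matching region; the feature vectors and image ids are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, database, top_n):
    """Rank image ids for a query feature vector.

    database maps image id -> {"image": vector, "regions": [vector, ...]}.
    """
    # Stage 1 (filtering): rank all images by their image-wise feature.
    ranked = sorted(database,
                    key=lambda k: cosine(query, database[k]["image"]),
                    reverse=True)
    shortlist, rest = ranked[:top_n], ranked[top_n:]
    # Stage 2 (reranking): reorder the shortlist by best region-wise match.
    reranked = sorted(
        shortlist,
        key=lambda k: max((cosine(query, r) for r in database[k]["regions"]),
                          default=0.0),
        reverse=True)
    return reranked + rest


# Toy data: image "a" contains the queried object in only one region.
DATABASE = {
    "a": {"image": [1.0, 1.0], "regions": [[1.0, 0.0], [0.0, 1.0]]},
    "b": {"image": [1.0, 0.5], "regions": [[1.0, 0.5]]},
    "c": {"image": [0.0, 1.0], "regions": [[0.0, 1.0]]},
}
print(search([1.0, 0.0], DATABASE, top_n=2))
```

Image "a" ranks below "b" on image-wise features because the object occupies only part of the image, and the region-level pass moves it to the top of the shortlist.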

Generative Adversarial Imitation Learning

Title Generative Adversarial Imitation Learning
Authors Jonathan Ho, Stefano Ermon
Abstract Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.
Tasks Imitation Learning
Published 2016-06-10
URL http://arxiv.org/abs/1606.03476v1
PDF http://arxiv.org/pdf/1606.03476v1.pdf
PWC https://paperswithcode.com/paper/generative-adversarial-imitation-learning
Repo https://github.com/nav74neet/gail-tf-gym
Framework tf

Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network

Title Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network
Authors Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme
Abstract Language processing mechanism by humans is generally more robust than computers. The Cmabrigde Uinervtisy (Cambridge University) effect from the psycholinguistics literature has demonstrated such a robust word processing mechanism, where jumbled words (e.g. Cmabrigde / Cambridge) are recognized with little cost. On the other hand, computational models for word recognition (e.g. spelling checkers) perform poorly on data with such noise. Inspired by the findings from the Cmabrigde Uinervtisy effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN). In our experiments, we demonstrate that scRNN has significantly more robust performance in word spelling correction (i.e. word recognition) compared to existing spelling checkers and character-based convolutional neural network. Furthermore, we demonstrate that the model is cognitively plausible by replicating a psycholinguistics experiment about human reading difficulty using our model.
Tasks Spelling Correction
Published 2016-08-07
URL http://arxiv.org/abs/1608.02214v2
PDF http://arxiv.org/pdf/1608.02214v2.pdf
PWC https://paperswithcode.com/paper/robsut-wrod-reocginiton-via-semi-character
Repo https://github.com/simonroquette/CORAP
Framework none
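
The abstract does not spell out the semi-character input, so the sketch below assumes one common reading of it: each word is encoded as a one-hot vector for its first character, an unordered count of its internal characters, and a one-hot vector for its last character. The function names and the lowercase-only alphabet are illustrative choices, and the recurrent network that would consume these vectors is omitted.

```python
import string

ALPHABET = string.ascii_lowercase


def one_hot(ch):
    """One-hot vector over the alphabet (all zeros for unknown characters)."""
    vec = [0] * len(ALPHABET)
    if ch in ALPHABET:
        vec[ALPHABET.index(ch)] = 1
    return vec


def bag_of_chars(chars):
    """Character counts that ignore order."""
    vec = [0] * len(ALPHABET)
    for ch in chars:
        if ch in ALPHABET:
            vec[ALPHABET.index(ch)] += 1
    return vec


def semi_character_vector(word):
    """Concatenate first-char one-hot, internal-char counts, last-char one-hot."""
    w = word.lower()
    if not w:
        raise ValueError("empty word")
    first = one_hot(w[0])
    internal = bag_of_chars(w[1:-1])
    last = one_hot(w[-1]) if len(w) > 1 else [0] * len(ALPHABET)
    return first + internal + last


print(semi_character_vector("Cmabrigde") == semi_character_vector("Cambridge"))
print(semi_character_vector("Uinervtisy") == semi_character_vector("University"))
```

Because internal character order is discarded, a jumbled word and its correct spelling map to the same vector under this encoding, which is what makes this kind of noise harmless to the model's input. The trade-off is that genuine anagram pairs such as "from" and "form" also collide and have to be resolved from context.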

Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations

Title Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations
Authors Vered Shwartz, Ido Dagan
Abstract Recognizing various semantic relations between terms is beneficial for many NLP tasks. While path-based and distributional information sources are considered complementary for this task, the superior results the latter showed recently suggested that the former’s contribution might have become obsolete. We follow the recent success of an integrated neural method for hypernymy detection (Shwartz et al., 2016) and extend it to recognize multiple relations. The empirical results show that this method is effective in the multiclass setting as well. We further show that the path-based information source always contributes to the classification, and analyze the cases in which it mostly complements the distributional information.
Tasks
Published 2016-08-17
URL http://arxiv.org/abs/1608.05014v4
PDF http://arxiv.org/pdf/1608.05014v4.pdf
PWC https://paperswithcode.com/paper/path-based-vs-distributional-information-in
Repo https://github.com/vered1986/LexNET
Framework none

Deep Structured Energy Based Models for Anomaly Detection

Title Deep Structured Energy Based Models for Anomaly Detection
Authors Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang
Abstract In this paper, we attack the anomaly detection problem by directly modeling the data distribution with deep architectures. We propose deep structured energy based models (DSEBMs), where the energy function is the output of a deterministic deep neural network with structure. We develop novel model architectures to integrate EBMs with different types of data such as static data, sequential data, and spatial data, and apply appropriate model architectures to adapt to the data structure. Our training algorithm is built upon the recent development of score matching, which connects an EBM with a regularized autoencoder, eliminating the need for complicated sampling method. Statistically sound decision criterion can be derived for anomaly detection purpose from the perspective of the energy landscape of the data distribution. We investigate two decision criteria for performing anomaly detection: the energy score and the reconstruction error. Extensive empirical studies on benchmark tasks demonstrate that our proposed model consistently matches or outperforms all the competing methods.
Tasks Anomaly Detection
Published 2016-05-25
URL http://arxiv.org/abs/1605.07717v2
PDF http://arxiv.org/pdf/1605.07717v2.pdf
PWC https://paperswithcode.com/paper/deep-structured-energy-based-models-for
Repo https://github.com/zehuichen123/DSEBM
Framework none

Efficient Training for Positive Unlabeled Learning

Title Efficient Training for Positive Unlabeled Learning
Authors Emanuele Sansone, Francesco G. B. De Natale, Zhi-Hua Zhou
Abstract Positive unlabeled (PU) learning is useful in various practical situations, where there is a need to learn a classifier for a class of interest from an unlabeled data set, which may contain anomalies as well as samples from unknown classes. The learning task can be formulated as an optimization problem under the framework of statistical learning theory. Recent studies have theoretically analyzed its properties and generalization performance, nevertheless, little effort has been made to consider the problem of scalability, especially when large sets of unlabeled data are available. In this work we propose a novel scalable PU learning algorithm that is theoretically proven to provide the optimal solution, while showing superior computational and memory performance. Experimental evaluation confirms the theoretical evidence and shows that the proposed method can be successfully applied to a large variety of real-world problems involving PU learning.
Tasks
Published 2016-08-24
URL http://arxiv.org/abs/1608.06807v4
PDF http://arxiv.org/pdf/1608.06807v4.pdf
PWC https://paperswithcode.com/paper/efficient-training-for-positive-unlabeled
Repo https://github.com/emsansone/USMO
Framework none

A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects

Title A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects
Authors Yonatan Belinkov, James Glass
Abstract Discriminating between closely-related language varieties is considered a challenging and important task. This paper describes our submission to the DSL 2016 shared-task, which included two sub-tasks: one on discriminating similar languages and one on identifying Arabic dialects. We developed a character-level neural network for this task. Given a sequence of characters, our model embeds each character in vector space, runs the sequence through multiple convolutions with different filter widths, and pools the convolutional representations to obtain a hidden vector representation of the text that is used for predicting the language or dialect. We primarily focused on the Arabic dialect identification task and obtained an F1 score of 0.4834, ranking 6th out of 18 participants. We also analyze errors made by our system on the Arabic data in some detail, and point to challenges such an approach is faced with.
Tasks
Published 2016-09-24
URL http://arxiv.org/abs/1609.07568v1
PDF http://arxiv.org/pdf/1609.07568v1.pdf
PWC https://paperswithcode.com/paper/a-character-level-convolutional-neural
Repo https://github.com/boknilev/dsl-char-cnn
Framework tf
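
The encoder described in the abstract (embed each character, convolve with several filter widths, pool the results into one hidden vector) can be sketched as a plain forward pass. This is an untrained illustration, not the submitted system: the weights are random, the ReLU activation, max-over-time pooling and zero padding are assumptions, and the final language classifier is left out.

```python
import random


def build_model(alphabet, embed_dim, filter_widths, filters_per_width, seed=0):
    """Create random character embeddings and convolution kernels."""
    rng = random.Random(seed)
    embeddings = {ch: [rng.uniform(-1, 1) for _ in range(embed_dim)]
                  for ch in alphabet}
    filters = {
        width: [[rng.uniform(-1, 1) for _ in range(width * embed_dim)]
                for _ in range(filters_per_width)]
        for width in filter_widths
    }
    return embeddings, filters


def encode(text, embeddings, filters):
    """Map a character sequence to a fixed-size hidden vector."""
    embed_dim = len(next(iter(embeddings.values())))
    unknown = [0.0] * embed_dim
    sequence = [embeddings.get(ch, unknown) for ch in text]
    hidden = []
    for width, bank in filters.items():
        # Pad short inputs so that at least one window of this width exists.
        padded = sequence + [unknown] * max(0, width - len(sequence))
        # Each window is the concatenation of `width` consecutive embeddings.
        windows = [sum(padded[i:i + width], [])
                   for i in range(len(padded) - width + 1)]
        for kernel in bank:
            activations = [max(0.0, sum(w * x for w, x in zip(kernel, window)))
                           for window in windows]
            hidden.append(max(activations))  # max-over-time pooling
    return hidden


embeddings, filters = build_model("abc", embed_dim=4,
                                  filter_widths=(2, 3), filters_per_width=5)
vector = encode("abcabcba", embeddings, filters)
print(len(vector))
```

The output length depends only on the number of filters, not on the input length, which is what allows texts of different lengths to feed the same classifier.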

Conditional Image Synthesis With Auxiliary Classifier GANs

Title Conditional Image Synthesis With Auxiliary Classifier GANs
Authors Augustus Odena, Christopher Olah, Jonathon Shlens
Abstract Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128x128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128x128 samples are more than twice as discriminable as artificially resized 32x32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.
Tasks Conditional Image Generation, Image Generation, Image Quality Assessment
Published 2016-10-30
URL http://arxiv.org/abs/1610.09585v4
PDF http://arxiv.org/pdf/1610.09585v4.pdf
PWC https://paperswithcode.com/paper/conditional-image-synthesis-with-auxiliary
Repo https://github.com/VitoRazor/Gan_Architecture
Framework tf

Deep Learning Human Mind for Automated Visual Classification

Title Deep Learning Human Mind for Automated Visual Classification
Authors Concetto Spampinato, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Mubarak Shah, Nasim Souly
Abstract What if we could effectively read the mind and transfer human visual capabilities to computer vision methods? In this paper, we aim at addressing this question by developing the first visual object classifier driven by human brain signals. In particular, we employ EEG data evoked by visual object stimuli combined with Recurrent Neural Networks (RNN) to learn a discriminative brain activity manifold of visual categories. Afterwards, we train a Convolutional Neural Network (CNN)-based regressor to project images onto the learned manifold, thus effectively allowing machines to employ human brain-based features for automated visual classification. We use a 32-channel EEG to record brain activity of seven subjects while looking at images of 40 ImageNet object classes. The proposed RNN based approach for discriminating object classes using brain signals reaches an average accuracy of about 40%, which outperforms existing methods attempting to learn EEG visual object representations. As for automated object categorization, our human brain-driven approach obtains competitive performance, comparable to those achieved by powerful CNN models, both on ImageNet and CalTech 101, thus demonstrating its classification and generalization capabilities. This gives us a real hope that, indeed, human mind can be read and transferred to machines.
Tasks EEG
Published 2016-09-01
URL https://arxiv.org/abs/1609.00344v2
PDF https://arxiv.org/pdf/1609.00344v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-human-mind-for-automated-visual
Repo https://github.com/AliAbyaneh/Extracting-Image-from-EEG-signals
Framework none

Joint Learning of Sentence Embeddings for Relevance and Entailment

Title Joint Learning of Sentence Embeddings for Relevance and Entailment
Authors Petr Baudis, Silvestr Stanko, Jan Sedivy
Abstract We consider the problem of Recognizing Textual Entailment within an Information Retrieval context, where we must simultaneously determine the relevancy as well as degree of entailment for individual pieces of evidence to determine a yes/no answer to a binary natural language question. We compare several variants of neural networks for sentence embeddings in a setting of decision-making based on evidence of varying relevance. We propose a basic model to integrate evidence for entailment, show that joint training of the sentence embeddings to model relevance and entailment is feasible even with no explicit per-evidence supervision, and show the importance of evaluating strong baselines. We also demonstrate the benefit of carrying over text comprehension model trained on an unrelated task for our small datasets. Our research is motivated primarily by a new open dataset we introduce, consisting of binary questions and news-based evidence snippets. We also apply the proposed relevance-entailment model on a similar task of ranking multiple-choice test answers, evaluating it on a preliminary dataset of school test questions as well as the standard MCTest dataset, where we improve the neural model state-of-art.
Tasks Decision Making, Information Retrieval, Natural Language Inference, Reading Comprehension, Sentence Embeddings
Published 2016-05-16
URL http://arxiv.org/abs/1605.04655v2
PDF http://arxiv.org/pdf/1605.04655v2.pdf
PWC https://paperswithcode.com/paper/joint-learning-of-sentence-embeddings-for
Repo https://github.com/brmson/dataset-sts
Framework none

Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

Title Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Authors Rakshith Shetty, Jorma Laaksonen
Abstract We present our submission to the Microsoft Video to Language Challenge of generating short captions describing videos in the challenge dataset. Our model is based on the encoder–decoder pipeline, popular in image and video captioning systems. We propose to utilize two different kinds of video features, one to capture the video content in terms of objects and attributes, and the other to capture the motion and action information. Using these diverse features we train models specializing in two separate input sub-domains. We then train an evaluator model which is used to pick the best caption from the pool of candidates generated by these domain expert models. We argue that this approach is better suited for the current video captioning task, compared to using a single model, due to the diversity in the dataset. Efficacy of our method is proven by the fact that it was rated best in MSR Video to Language Challenge, as per human evaluation. Additionally, we were ranked second in the automatic evaluation metrics based table.
Tasks Video Captioning
Published 2016-08-17
URL http://arxiv.org/abs/1608.04959v1
PDF http://arxiv.org/pdf/1608.04959v1.pdf
PWC https://paperswithcode.com/paper/frame-and-segment-level-features-and
Repo https://github.com/rakshithShetty/captionGAN
Framework none

Adapting Deep Network Features to Capture Psychological Representations

Title Adapting Deep Network Features to Capture Psychological Representations
Authors Joshua C. Peterson, Joshua T. Abbott, Thomas L. Griffiths
Abstract Deep neural networks have become increasingly successful at solving classic perception problems such as object recognition, semantic segmentation, and scene understanding, often reaching or surpassing human-level accuracy. This success is due in part to the ability of DNNs to learn useful representations of high-dimensional inputs, a problem that humans must also solve. We examine the relationship between the representations learned by these networks and human psychological representations recovered from similarity judgments. We find that deep features learned in service of object classification account for a significant amount of the variance in human similarity judgments for a set of animal images. However, these features do not capture some qualitative distinctions that are a key part of human representations. To remedy this, we develop a method for adapting deep features to align with human similarity judgments, resulting in image representations that can potentially be used to extend the scope of psychological experiments.
Tasks Object Classification, Object Recognition, Scene Understanding, Semantic Segmentation
Published 2016-08-06
URL http://arxiv.org/abs/1608.02164v1
PDF http://arxiv.org/pdf/1608.02164v1.pdf
PWC https://paperswithcode.com/paper/adapting-deep-network-features-to-capture
Repo https://github.com/kbraunlich/contort_DNN
Framework none