Paper Group AWR 83
Deep Feature Consistent Variational Autoencoder. Stratification of patient trajectories using covariate latent variable models. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification. Faster R-CNN Features for Instance Search. Generative Adversarial Imitation Learning. Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network. Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations. Deep Structured Energy Based Models for Anomaly Detection. Efficient Training for Positive Unlabeled Learning. A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects. Conditional Image Synthesis With Auxiliary Classifier GANs. Deep Learning Human Mind for Automated Visual Classification. Joint Learning of Sentence Embeddings for Relevance and Entailment. Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation. Adapting Deep Network Features to Capture Psychological Representations.
Deep Feature Consistent Variational Autoencoder
Title | Deep Feature Consistent Variational Autoencoder |
Authors | Xianxu Hou, Linlin Shen, Ke Sun, Guoping Qiu |
Abstract | We present a novel method for constructing a Variational Autoencoder (VAE). Instead of using a pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE, which ensures that the VAE’s output preserves the spatial correlation characteristics of the input, giving the output a more natural visual appearance and better perceptual quality. Building on recent deep learning work such as style transfer, we employ a pre-trained deep convolutional neural network (CNN) and use its hidden features to define a feature perceptual loss for VAE training. Evaluated on the CelebA face dataset, we show that our model produces better results than other methods in the literature. We also show that our method can produce latent vectors that capture the semantic information of facial expressions and can be used to achieve state-of-the-art performance in facial attribute prediction. |
Tasks | Style Transfer |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00291v1 |
http://arxiv.org/pdf/1610.00291v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-feature-consistent-variational |
Repo | https://github.com/ku2482/vae.pytorch |
Framework | pytorch |
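To make the feature perceptual loss concrete, here is a minimal PyTorch sketch. The choice of VGG-19 taps (relu1_1, relu2_1, relu3_1) and the loss weights are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# Frozen pre-trained VGG-19 used purely as a feature extractor.
vgg = vgg19(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad = False

FEATURE_LAYERS = {1, 6, 11}  # relu1_1, relu2_1, relu3_1 in torchvision's VGG-19

def feature_perceptual_loss(x, x_recon):
    """Sum of MSEs between hidden VGG features of input and reconstruction."""
    loss, h_x, h_r = 0.0, x, x_recon
    for i, layer in enumerate(vgg):
        h_x, h_r = layer(h_x), layer(h_r)
        if i in FEATURE_LAYERS:
            loss = loss + F.mse_loss(h_r, h_x)
    return loss

def dfc_vae_loss(x, x_recon, mu, logvar, alpha=1.0, beta=0.5):
    # Standard KL term of the VAE, plus the deep feature consistency
    # term in place of a pixel-by-pixel reconstruction loss.
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return alpha * kld + beta * feature_perceptual_loss(x, x_recon)
```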
Stratification of patient trajectories using covariate latent variable models
Title | Stratification of patient trajectories using covariate latent variable models |
Authors | Kieran R. Campbell, Christopher Yau |
Abstract | Standard models assign disease progression to discrete categories or stages based on well-characterized clinical markers. However, such a system is potentially at odds with our understanding of the underlying biology, which in highly complex systems may support a (near-)continuous evolution of disease from inception to terminal state. To learn such a continuous disease score one could infer a latent variable from dynamic “omics” data such as RNA-seq that correlates with an outcome of interest such as survival time. However, such analyses may be confounded by additional data such as clinical covariates measured in electronic health records (EHRs). As a solution to this we introduce covariate latent variable models, a novel type of latent variable model that learns a low-dimensional data representation in the presence of two (asymmetric) views of the same data source. We apply our model to TCGA colorectal cancer RNA-seq data and demonstrate how incorporating microsatellite-instability (MSI) status as an external covariate allows us to identify genes that stratify patients on an immune-response trajectory. Finally, we propose an extension termed Covariate Gaussian Process Latent Variable Models for learning nonparametric, nonlinear representations. An R package implementing variational inference for covariate latent variable models is available at http://github.com/kieranrcampbell/clvm. |
Tasks | Latent Variable Models |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08735v2 |
http://arxiv.org/pdf/1610.08735v2.pdf | |
PWC | https://paperswithcode.com/paper/stratification-of-patient-trajectories-using |
Repo | https://github.com/kieranrcampbell/clvm |
Framework | none |
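A minimal generative sketch of the idea, with an assumed parameterization: the external covariate tilts each gene's loading on the latent trajectory, so genes with a large interaction weight are the ones that stratify patients.

```python
import numpy as np

rng = np.random.default_rng(0)
N, G = 200, 50          # patients, genes (illustrative sizes)

# Latent disease score z and external covariate x (e.g. binary MSI status).
z = rng.normal(size=N)
x = rng.integers(0, 2, size=N)

# Gene-wise intercepts, baseline loadings, and covariate interactions.
eta = rng.normal(size=G)
lam = rng.normal(size=G)
beta = rng.normal(size=G)

# Covariate-modulated factor model: the covariate tilts each gene's
# loading on the latent trajectory (assumed parameterization).
Y = eta + (lam + np.outer(x, beta)) * z[:, None] + 0.1 * rng.normal(size=(N, G))
```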
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Title | Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification |
Authors | Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, Kilian Weinberger |
Abstract | In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems. |
Tasks | Cross-Lingual Document Classification, Cross-Lingual Transfer, Sentiment Analysis |
Published | 2016-06-06 |
URL | http://arxiv.org/abs/1606.01614v5 |
http://arxiv.org/pdf/1606.01614v5.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-deep-averaging-networks-for-cross |
Repo | https://github.com/ccsasuke/adan |
Framework | pytorch |
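A compact PyTorch sketch of the two-branch setup. The gradient reversal layer is a common stand-in for the adversarial min-max training (the released code trains the language discriminator with its own optimizer); layer sizes here are illustrative.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign going back."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class ADAN(nn.Module):
    def __init__(self, emb_dim=300, hidden=900, n_classes=2, n_langs=2):
        super().__init__()
        # Shared feature extractor over averaged word embeddings.
        self.extractor = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.sentiment = nn.Linear(hidden, n_classes)  # task branch
        self.language = nn.Linear(hidden, n_langs)     # adversarial branch

    def forward(self, avg_emb, lambd=1.0):
        feats = self.extractor(avg_emb)
        return (self.sentiment(feats),
                self.language(GradReverse.apply(feats, lambd)))
```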
Faster R-CNN Features for Instance Search
Title | Faster R-CNN Features for Instance Search |
Authors | Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin’ichi Satoh |
Abstract | Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN. We take advantage of the object proposals learned by a Region Proposal Network (RPN) and their associated CNN features to build an instance search pipeline composed of a first filtering stage followed by a spatial reranking. We further investigate the suitability of Faster R-CNN features when the network is fine-tuned for the same objects one wants to retrieve. We assess the performance of our proposed system with the Oxford Buildings 5k, Paris Buildings 6k and a subset of TRECVid Instance Search 2013, achieving competitive results. |
Tasks | Instance Search, Object Detection |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08893v1 |
http://arxiv.org/pdf/1604.08893v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-r-cnn-features-for-instance-search |
Repo | https://github.com/vohoaiviet/retrieval-2016-deepvision |
Framework | caffe2 |
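A minimal NumPy sketch of the two-stage pipeline, assuming cosine similarity and max-pooling over region proposals; the paper's exact similarity and reranking details may differ.

```python
import numpy as np

def l2norm(v):
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-8)

def instance_search(query_feat, db_image_feats, db_region_feats, k=100):
    """Two-stage retrieval: global filtering, then region-wise reranking.

    query_feat      : (D,) pooled CNN feature of the query instance
    db_image_feats  : (N, D) image-level pooled features
    db_region_feats : list of (R_i, D) RPN-proposal features per image
    """
    q = l2norm(query_feat)
    # Stage 1: rank the whole database by image-level cosine similarity.
    scores = l2norm(db_image_feats) @ q
    shortlist = np.argsort(-scores)[:k]
    # Stage 2: rerank the shortlist by the best-matching region proposal.
    best = np.array([(l2norm(db_region_feats[i]) @ q).max() for i in shortlist])
    return shortlist[np.argsort(-best)]
```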
Generative Adversarial Imitation Learning
Title | Generative Adversarial Imitation Learning |
Authors | Jonathan Ho, Stefano Ermon |
Abstract | Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments. |
Tasks | Imitation Learning |
Published | 2016-06-10 |
URL | http://arxiv.org/abs/1606.03476v1 |
http://arxiv.org/pdf/1606.03476v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-imitation-learning |
Repo | https://github.com/nav74neet/gail-tf-gym |
Framework | tf |
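A minimal PyTorch sketch of the adversarial core: the discriminator is trained to tell policy state-action pairs from expert ones, and its output defines a surrogate reward for the policy optimizer (TRPO in the paper); network sizes here are hypothetical.

```python
import torch
import torch.nn.functional as F
from torch import nn

obs_dim, act_dim = 8, 2     # illustrative dimensions

# D(s, a): probability a state-action pair came from the policy, not the expert.
disc = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                     nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)

def discriminator_step(expert_sa, policy_sa):
    """Logistic loss: expert pairs pushed toward 0, policy pairs toward 1."""
    loss = (F.binary_cross_entropy_with_logits(
                disc(expert_sa), torch.zeros(len(expert_sa), 1))
            + F.binary_cross_entropy_with_logits(
                disc(policy_sa), torch.ones(len(policy_sa), 1)))
    opt.zero_grad()
    loss.backward()
    opt.step()

def imitation_reward(sa):
    """Surrogate reward -log D(s, a) handed to the policy-gradient step."""
    with torch.no_grad():
        return -F.logsigmoid(disc(sa))
```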
Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network
Title | Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network |
Authors | Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme |
Abstract | The language processing mechanism of humans is generally more robust than that of computers. The Cmabrigde Uinervtisy (Cambridge University) effect from the psycholinguistics literature has demonstrated such a robust word processing mechanism, where jumbled words (e.g. Cmabrigde / Cambridge) are recognized with little cost. On the other hand, computational models for word recognition (e.g. spelling checkers) perform poorly on data with such noise. Inspired by the findings from the Cmabrigde Uinervtisy effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN). In our experiments, we demonstrate that scRNN performs significantly more robustly in word spelling correction (i.e. word recognition) than existing spelling checkers and a character-based convolutional neural network. Furthermore, we demonstrate that the model is cognitively plausible by replicating a psycholinguistics experiment on human reading difficulty using our model. |
Tasks | Spelling Correction |
Published | 2016-08-07 |
URL | http://arxiv.org/abs/1608.02214v2 |
http://arxiv.org/pdf/1608.02214v2.pdf | |
PWC | https://paperswithcode.com/paper/robsut-wrod-reocginiton-via-semi-character |
Repo | https://github.com/simonroquette/CORAP |
Framework | none |
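The semi-character representation is easy to state in code: each word becomes the concatenation of a one-hot first character, a bag of internal characters, and a one-hot last character, so internal jumbling leaves the vector unchanged. A sketch restricted to lowercase alphabetic words:

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase
IDX = {c: i for i, c in enumerate(ALPHABET)}

def semi_character_vector(word):
    """scRNN input: one-hot first char + bag of internal chars + one-hot last.

    'Cmabrigde' and 'Cambridge' map to the same vector, which is what lets
    the model shrug off internal jumbling.
    """
    word = word.lower()
    first, internal, last = (np.zeros(len(ALPHABET)) for _ in range(3))
    if word:
        first[IDX[word[0]]] = 1
        last[IDX[word[-1]]] = 1
        for c in word[1:-1]:
            internal[IDX[c]] += 1
    return np.concatenate([first, internal, last])

assert (semi_character_vector("Cmabrigde") ==
        semi_character_vector("Cambridge")).all()
```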
Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations
Title | Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations |
Authors | Vered Shwartz, Ido Dagan |
Abstract | Recognizing various semantic relations between terms is beneficial for many NLP tasks. While path-based and distributional information sources are considered complementary for this task, the superior results the latter showed recently suggested that the former’s contribution might have become obsolete. We follow the recent success of an integrated neural method for hypernymy detection (Shwartz et al., 2016) and extend it to recognize multiple relations. The empirical results show that this method is effective in the multiclass setting as well. We further show that the path-based information source always contributes to the classification, and analyze the cases in which it mostly complements the distributional information. |
Tasks | |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.05014v4 |
http://arxiv.org/pdf/1608.05014v4.pdf | |
PWC | https://paperswithcode.com/paper/path-based-vs-distributional-information-in |
Repo | https://github.com/vered1986/LexNET |
Framework | none |
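A minimal PyTorch sketch of the integrated architecture: the dependency paths connecting a term pair are each encoded by an LSTM and averaged, then concatenated with the pair's word embeddings (the distributional signal) for relation classification. Dimensions and the single-layer classifier are assumptions; see the LexNET repo for the reference implementation.

```python
import torch
from torch import nn

class LexNET(nn.Module):
    def __init__(self, emb_dim=50, path_dim=60, n_relations=5):
        super().__init__()
        self.path_lstm = nn.LSTM(emb_dim, path_dim, batch_first=True)
        self.classify = nn.Linear(2 * emb_dim + path_dim, n_relations)

    def forward(self, x_emb, y_emb, path_seqs):
        # Encode each dependency path with the LSTM, then average them.
        encoded = [self.path_lstm(p.unsqueeze(0))[1][0].squeeze()
                   for p in path_seqs]
        path_vec = torch.stack(encoded).mean(dim=0)
        # Concatenate distributional (word) and path-based information.
        return self.classify(torch.cat([x_emb, path_vec, y_emb]))
```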
Deep Structured Energy Based Models for Anomaly Detection
Title | Deep Structured Energy Based Models for Anomaly Detection |
Authors | Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang |
Abstract | In this paper, we attack the anomaly detection problem by directly modeling the data distribution with deep architectures. We propose deep structured energy based models (DSEBMs), where the energy function is the output of a deterministic deep neural network with structure. We develop novel model architectures to integrate EBMs with different types of data, such as static, sequential, and spatial data, applying the model architecture appropriate to the data structure. Our training algorithm is built on the recent development of score matching \cite{sm}, which connects an EBM with a regularized autoencoder, eliminating the need for complicated sampling methods. A statistically sound decision criterion for anomaly detection can be derived from the perspective of the energy landscape of the data distribution. We investigate two decision criteria for performing anomaly detection: the energy score and the reconstruction error. Extensive empirical studies on benchmark tasks demonstrate that our proposed model consistently matches or outperforms all the competing methods. |
Tasks | Anomaly Detection |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07717v2 |
http://arxiv.org/pdf/1605.07717v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-structured-energy-based-models-for |
Repo | https://github.com/zehuichen123/DSEBM |
Framework | none |
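A minimal PyTorch sketch of the two decision criteria. The energy network here is a plain MLP stand-in for the paper's structured architectures; via the score-matching link to regularized autoencoders, x - dE/dx acts as a reconstruction of x, so the reconstruction error reduces to the gradient norm.

```python
import torch
from torch import nn

# Plain MLP stand-in for the structured energy networks in the paper.
energy_net = nn.Sequential(nn.Linear(16, 64), nn.Softplus(),
                           nn.Linear(64, 64), nn.Softplus(),
                           nn.Linear(64, 1))

def anomaly_scores(x):
    """Return both decision criteria: energy score and reconstruction error."""
    x = x.clone().requires_grad_(True)
    e = energy_net(x)
    grad, = torch.autograd.grad(e.sum(), x)
    # The score-matching view reconstructs x as x - dE/dx, so the
    # reconstruction error is just the squared gradient norm.
    recon_error = (grad ** 2).sum(dim=1)
    return e.detach().squeeze(1), recon_error
```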
Efficient Training for Positive Unlabeled Learning
Title | Efficient Training for Positive Unlabeled Learning |
Authors | Emanuele Sansone, Francesco G. B. De Natale, Zhi-Hua Zhou |
Abstract | Positive unlabeled (PU) learning is useful in various practical situations, where there is a need to learn a classifier for a class of interest from an unlabeled data set, which may contain anomalies as well as samples from unknown classes. The learning task can be formulated as an optimization problem under the framework of statistical learning theory. Recent studies have theoretically analyzed its properties and generalization performance; nevertheless, little effort has been made to consider the problem of scalability, especially when large sets of unlabeled data are available. In this work we propose a novel scalable PU learning algorithm that is theoretically proven to provide the optimal solution, while showing superior computational and memory performance. Experimental evaluation confirms the theoretical evidence and shows that the proposed method can be successfully applied to a large variety of real-world problems involving PU learning. |
Tasks | |
Published | 2016-08-24 |
URL | http://arxiv.org/abs/1608.06807v4 |
http://arxiv.org/pdf/1608.06807v4.pdf | |
PWC | https://paperswithcode.com/paper/efficient-training-for-positive-unlabeled |
Repo | https://github.com/emsansone/USMO |
Framework | none |
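For background, this is the standard unbiased PU risk estimator (du Plessis et al.) that such statistical-learning formulations build on, sketched with a scaled squared loss. It is generic PU machinery, not the paper's USMO solver.

```python
import numpy as np

def pu_risk(g, X_pos, X_unl, pi):
    """Unbiased PU risk estimator with a scaled squared loss.

    g: decision function mapping an array of samples to scores; pi is the
    (assumed known) class prior of positives in the unlabeled set.
    """
    loss = lambda z: (1 - z) ** 2 / 4
    return (pi * loss(g(X_pos)).mean()          # positives as class +1
            + loss(-g(X_unl)).mean()            # unlabeled treated as -1 ...
            - pi * loss(-g(X_pos)).mean())      # ... minus the positive part
```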
A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects
Title | A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects |
Authors | Yonatan Belinkov, James Glass |
Abstract | Discriminating between closely related language varieties is considered a challenging and important task. This paper describes our submission to the DSL 2016 shared task, which included two sub-tasks: one on discriminating similar languages and one on identifying Arabic dialects. We developed a character-level neural network for this task. Given a sequence of characters, our model embeds each character in vector space, runs the sequence through multiple convolutions with different filter widths, and pools the convolutional representations to obtain a hidden vector representation of the text that is used for predicting the language or dialect. We primarily focused on the Arabic dialect identification task and obtained an F1 score of 0.4834, ranking 6th out of 18 participants. We also analyze the errors made by our system on the Arabic data in some detail, and point to challenges that such an approach faces. |
Tasks | |
Published | 2016-09-24 |
URL | http://arxiv.org/abs/1609.07568v1 |
http://arxiv.org/pdf/1609.07568v1.pdf | |
PWC | https://paperswithcode.com/paper/a-character-level-convolutional-neural |
Repo | https://github.com/boknilev/dsl-char-cnn |
Framework | tf |
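A minimal PyTorch sketch of the described architecture: character embeddings, parallel convolutions with several filter widths, max-over-time pooling, and a softmax classifier. Vocabulary size, widths, and channel counts are illustrative guesses.

```python
import torch
from torch import nn

class CharCNN(nn.Module):
    def __init__(self, n_chars=200, emb=64, widths=(3, 4, 5), ch=100, n_labels=12):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb, ch, w, padding=w // 2) for w in widths)
        self.out = nn.Linear(ch * len(widths), n_labels)

    def forward(self, char_ids):                 # (batch, seq_len)
        h = self.emb(char_ids).transpose(1, 2)   # (batch, emb, seq_len)
        # Max-over-time pooling of each convolution's activations.
        pooled = [c(h).relu().max(dim=2).values for c in self.convs]
        return self.out(torch.cat(pooled, dim=1))
```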
Conditional Image Synthesis With Auxiliary Classifier GANs
Title | Conditional Image Synthesis With Auxiliary Classifier GANs |
Authors | Augustus Odena, Christopher Olah, Jonathon Shlens |
Abstract | Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128x128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128x128 samples are more than twice as discriminable as artificially resized 32x32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data. |
Tasks | Conditional Image Generation, Image Generation, Image Quality Assessment |
Published | 2016-10-30 |
URL | http://arxiv.org/abs/1610.09585v4 |
http://arxiv.org/pdf/1610.09585v4.pdf | |
PWC | https://paperswithcode.com/paper/conditional-image-synthesis-with-auxiliary |
Repo | https://github.com/VitoRazor/Gan_Architecture |
Framework | tf |
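The two-part objective is compact enough to sketch directly. L_S scores real vs. fake and L_C scores class prediction; the discriminator maximizes L_S + L_C while the generator maximizes L_C - L_S, which in loss form looks like this (logit shapes assumed):

```python
import torch
import torch.nn.functional as F

def acgan_d_loss(src_real, cls_real, y_real, src_fake, cls_fake, y_fake):
    """Discriminator loss: source term L_S plus auxiliary class term L_C."""
    l_s = (F.binary_cross_entropy_with_logits(src_real, torch.ones_like(src_real))
           + F.binary_cross_entropy_with_logits(src_fake, torch.zeros_like(src_fake)))
    l_c = F.cross_entropy(cls_real, y_real) + F.cross_entropy(cls_fake, y_fake)
    return l_s + l_c

def acgan_g_loss(src_fake, cls_fake, y_fake):
    """Generator loss: flip the source labels, keep the class term."""
    return (F.binary_cross_entropy_with_logits(src_fake, torch.ones_like(src_fake))
            + F.cross_entropy(cls_fake, y_fake))
```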
Deep Learning Human Mind for Automated Visual Classification
Title | Deep Learning Human Mind for Automated Visual Classification |
Authors | Concetto Spampinato, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Mubarak Shah, Nasim Souly |
Abstract | What if we could effectively read the mind and transfer human visual capabilities to computer vision methods? In this paper, we aim to address this question by developing the first visual object classifier driven by human brain signals. In particular, we employ EEG data evoked by visual object stimuli, combined with Recurrent Neural Networks (RNNs), to learn a discriminative brain activity manifold of visual categories. Afterwards, we train a Convolutional Neural Network (CNN)-based regressor to project images onto the learned manifold, thus effectively allowing machines to employ human brain-based features for automated visual classification. We use a 32-channel EEG to record the brain activity of seven subjects while they look at images of 40 ImageNet object classes. The proposed RNN-based approach for discriminating object classes using brain signals reaches an average accuracy of about 40%, which outperforms existing methods attempting to learn EEG visual object representations. As for automated object categorization, our human brain-driven approach obtains competitive performance, comparable to that achieved by powerful CNN models, both on ImageNet and Caltech 101, thus demonstrating its classification and generalization capabilities. This gives us real hope that, indeed, the human mind can be read and transferred to machines. |
Tasks | EEG |
Published | 2016-09-01 |
URL | https://arxiv.org/abs/1609.00344v2 |
https://arxiv.org/pdf/1609.00344v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-human-mind-for-automated-visual |
Repo | https://github.com/AliAbyaneh/Extracting-Image-from-EEG-signals |
Framework | none |
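A minimal PyTorch sketch of the two stages under stated assumptions: an LSTM encoder maps a 32-channel EEG sequence to a point on the learned manifold, and a regressor (here a linear head on precomputed image features, an assumption; the paper trains a CNN) is fit with MSE to land images on the same manifold.

```python
import torch
from torch import nn

class EEGEncoder(nn.Module):
    """LSTM over the 32-channel EEG sequence -> point on the learned manifold."""
    def __init__(self, n_channels=32, hidden=128, manifold_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, manifold_dim)

    def forward(self, eeg):                      # (batch, time, channels)
        _, (h, _) = self.lstm(eeg)
        return self.proj(h[-1])                  # final hidden state

# Stage 2: regress images onto the manifold. A linear head over 2048-d
# precomputed image features is an assumption made for brevity.
regressor = nn.Linear(2048, 128)

def regression_loss(img_feats, eeg_embedding):
    return nn.functional.mse_loss(regressor(img_feats), eeg_embedding)
```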
Joint Learning of Sentence Embeddings for Relevance and Entailment
Title | Joint Learning of Sentence Embeddings for Relevance and Entailment |
Authors | Petr Baudis, Silvestr Stanko, Jan Sedivy |
Abstract | We consider the problem of Recognizing Textual Entailment within an Information Retrieval context, where we must simultaneously determine the relevancy as well as the degree of entailment of individual pieces of evidence to determine a yes/no answer to a binary natural language question. We compare several variants of neural networks for sentence embeddings in a setting of decision-making based on evidence of varying relevance. We propose a basic model to integrate evidence for entailment, show that joint training of the sentence embeddings to model relevance and entailment is feasible even with no explicit per-evidence supervision, and show the importance of evaluating strong baselines. We also demonstrate the benefit of carrying over a text comprehension model trained on an unrelated task for our small datasets. Our research is motivated primarily by a new open dataset we introduce, consisting of binary questions and news-based evidence snippets. We also apply the proposed relevance-entailment model to the similar task of ranking multiple-choice test answers, evaluating it on a preliminary dataset of school test questions as well as the standard MCTest dataset, where we improve on the neural-model state of the art. |
Tasks | Decision Making, Information Retrieval, Natural Language Inference, Reading Comprehension, Sentence Embeddings |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04655v2 |
http://arxiv.org/pdf/1605.04655v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-learning-of-sentence-embeddings-for |
Repo | https://github.com/brmson/dataset-sts |
Framework | none |
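One plausible reading of the evidence-integration step, sketched in PyTorch: per-evidence entailment probabilities are mixed with softmax-normalized relevance weights to produce the yes/no answer probability. The exact aggregation in the paper may differ.

```python
import torch

def aggregate_answer(relevance_logits, entailment_probs):
    """Mix per-evidence entailment with softmax-normalized relevance weights.

    relevance_logits, entailment_probs: (n_evidence,) tensors produced by
    the jointly trained sentence-embedding model for one question.
    """
    weights = torch.softmax(relevance_logits, dim=0)
    return (weights * entailment_probs).sum()    # P(answer = yes)
```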
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Title | Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation |
Authors | Rakshith Shetty, Jorma Laaksonen |
Abstract | We present our submission to the Microsoft Video to Language Challenge for generating short captions describing videos in the challenge dataset. Our model is based on the encoder–decoder pipeline popular in image and video captioning systems. We propose to utilize two different kinds of video features, one capturing the video content in terms of objects and attributes, and the other capturing motion and action information. Using these diverse features, we train models specializing in two separate input sub-domains. We then train an evaluator model that picks the best caption from the pool of candidates generated by these domain-expert models. We argue that, due to the diversity in the dataset, this approach is better suited to the current video captioning task than using a single model. The efficacy of our method is shown by the fact that it was rated best in the MSR Video to Language Challenge, as per human evaluation. Additionally, we ranked second on the automatic evaluation metrics. |
Tasks | Video Captioning |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.04959v1 |
http://arxiv.org/pdf/1608.04959v1.pdf | |
PWC | https://paperswithcode.com/paper/frame-and-segment-level-features-and |
Repo | https://github.com/rakshithShetty/captionGAN |
Framework | none |
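The candidate-pool step reduces to scoring and an argmax. A sketch with a hypothetical `evaluator` callable that maps (video features, caption) to a scalar relevance score:

```python
def pick_best_caption(evaluator, video_feats, candidates):
    """Score every candidate from the domain-expert generators, keep the best.

    `evaluator` stands in for the trained evaluator model; `candidates` is
    the pool of captions produced by the two specialized generators.
    """
    scores = [evaluator(video_feats, c) for c in candidates]
    return candidates[max(range(len(candidates)), key=scores.__getitem__)]
```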
Adapting Deep Network Features to Capture Psychological Representations
Title | Adapting Deep Network Features to Capture Psychological Representations |
Authors | Joshua C. Peterson, Joshua T. Abbott, Thomas L. Griffiths |
Abstract | Deep neural networks have become increasingly successful at solving classic perception problems such as object recognition, semantic segmentation, and scene understanding, often reaching or surpassing human-level accuracy. This success is due in part to the ability of DNNs to learn useful representations of high-dimensional inputs, a problem that humans must also solve. We examine the relationship between the representations learned by these networks and human psychological representations recovered from similarity judgments. We find that deep features learned in service of object classification account for a significant amount of the variance in human similarity judgments for a set of animal images. However, these features do not capture some qualitative distinctions that are a key part of human representations. To remedy this, we develop a method for adapting deep features to align with human similarity judgments, resulting in image representations that can potentially be used to extend the scope of psychological experiments. |
Tasks | Object Classification, Object Recognition, Scene Understanding, Semantic Segmentation |
Published | 2016-08-06 |
URL | http://arxiv.org/abs/1608.02164v1 |
http://arxiv.org/pdf/1608.02164v1.pdf | |
PWC | https://paperswithcode.com/paper/adapting-deep-network-features-to-capture |
Repo | https://github.com/kbraunlich/contort_DNN |
Framework | none |
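A minimal NumPy sketch of one simple reading of the adaptation: learn feature-wise weights w so that a weighted inner product of deep features reproduces human similarity judgments, fit by ridge regression over item pairs. The regularization strength is an assumption.

```python
import numpy as np

def adapt_features(F, S, lam=1.0):
    """Ridge-regress weights w so that human similarity S[i, j] is
    approximated by the weighted product sum_k w_k * F[i, k] * F[j, k].

    F: (n_items, d) deep features; S: (n_items, n_items) similarity judgments.
    """
    n, d = F.shape
    i, j = np.triu_indices(n, k=1)
    X = F[i] * F[j]                    # element-wise products per item pair
    y = S[i, j]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```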