Paper Group AWR 27
Demystifying Neural Style Transfer. Deep Keyphrase Generation. Learning a CNN-based End-to-End Controller for a Formula SAE Racecar. Drug-Drug Interaction Extraction from Biomedical Text Using Long Short Term Memory Network. Question Answering through Transfer Learning from Large Fine-grained Supervision Data. Model-Powered Conditional Independence Test. Semantic Document Distance Measures and Unsupervised Document Revision Detection. Feature-Fused SSD: Fast Detection for Small Objects. Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture. Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser. Preserving Differential Privacy in Convolutional Deep Belief Networks. StarSpace: Embed All The Things! Can you tell where in India I am from? Comparing humans and computers on fine-grained race face classification. Deep Hashing Network for Unsupervised Domain Adaptation. Adversarial-Playground: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning.
Demystifying Neural Style Transfer
Title | Demystifying Neural Style Transfer |
Authors | Yanghao Li, Naiyan Wang, Jiaying Liu, Xiaodi Hou |
Abstract | Neural style transfer has recently demonstrated very exciting results that have caught attention in both academia and industry. Despite the amazing outcomes, the principle of neural style transfer, especially why the Gram matrices can represent style, remains unclear. In this paper, we propose a novel interpretation of neural style transfer by treating it as a domain adaptation problem. Specifically, we theoretically show that matching the Gram matrices of feature maps is equivalent to minimizing the Maximum Mean Discrepancy (MMD) with the second-order polynomial kernel. Thus, we argue that the essence of neural style transfer is to match the feature distributions between the style images and the generated images. To further support our standpoint, we experiment with several other distribution alignment methods and achieve appealing results. We believe this novel interpretation connects these two important research fields and could inform future research. |
Tasks | Domain Adaptation, Style Transfer |
Published | 2017-01-04 |
URL | http://arxiv.org/abs/1701.01036v2 |
PDF | http://arxiv.org/pdf/1701.01036v2.pdf |
PWC | https://paperswithcode.com/paper/demystifying-neural-style-transfer |
Repo | https://github.com/aryan-mann/style-transfer |
Framework | none |
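The paper's central identity is easy to verify numerically: for feature maps with N spatial positions, the Gram-matrix style loss equals N² times the (biased) squared-MMD estimate under the second-order polynomial kernel k(x, y) = (xᵀy)². A small NumPy check (ours, not from the paper):

```python
# Numerical check: Gram-matrix style loss == N^2 * MMD^2 with the
# second-order polynomial kernel k(x, y) = (x^T y)^2.
import numpy as np

rng = np.random.default_rng(0)
C, N = 16, 64                      # channels, spatial positions
Fs = rng.normal(size=(C, N))       # style-image feature map (C x N)
Fg = rng.normal(size=(C, N))       # generated-image feature map

# Gram-based style loss: || Fs Fs^T - Fg Fg^T ||_F^2
gram_loss = np.sum((Fs @ Fs.T - Fg @ Fg.T) ** 2)

# Biased MMD^2 estimate over the N column vectors (one per position).
k = lambda A, B: (A.T @ B) ** 2
mmd2 = k(Fs, Fs).mean() + k(Fg, Fg).mean() - 2 * k(Fs, Fg).mean()

assert np.isclose(gram_loss, N ** 2 * mmd2)   # identical up to N^2
```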
Deep Keyphrase Generation
Title | Deep Keyphrase Generation |
Authors | Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi |
Abstract | Keyphrases provide highly summative information that can be effectively used for understanding, organizing and retrieving text content. Though previous studies have provided many workable solutions for automated keyphrase extraction, they commonly divided the to-be-summarized content into multiple text chunks, then ranked and selected the most meaningful ones. These approaches can neither identify keyphrases that do not appear in the text, nor capture the real semantic meaning behind it. We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome the above drawbacks. We name it deep keyphrase generation since it attempts to capture the deep semantic meaning of the content with a deep learning method. Empirical analysis on six datasets demonstrates that our proposed model not only achieves a significant performance boost on extracting keyphrases that appear in the source text, but also can generate absent keyphrases based on the semantic meaning of the text. Code and dataset are available at https://github.com/memray/seq2seq-keyphrase. |
Tasks | |
Published | 2017-04-23 |
URL | http://arxiv.org/abs/1704.06879v2 |
PDF | http://arxiv.org/pdf/1704.06879v2.pdf |
PWC | https://paperswithcode.com/paper/deep-keyphrase-generation |
Repo | https://github.com/supercoderhawk/deep-keyphrase |
Framework | pytorch |
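The model is an encoder-decoder (seq2seq) network; the paper's CopyRNN adds attention and a copy mechanism so that rare source words can still be emitted. The sketch below shows only the bare encoder-decoder skeleton, with hypothetical sizes:

```python
# Minimal encoder-decoder skeleton for keyphrase generation (a sketch only;
# the paper's model adds attention and a copy mechanism on top of this).
import torch
import torch.nn as nn

class Seq2SeqKeyphrase(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))           # encode source text
        dec_out, _ = self.decoder(self.embed(tgt), h)  # teacher forcing
        return self.out(dec_out)                       # logits over vocab

model = Seq2SeqKeyphrase(vocab_size=10000)
src = torch.randint(0, 10000, (2, 50))   # hypothetical token ids
tgt = torch.randint(0, 10000, (2, 6))
logits = model(src, tgt)                 # (2, 6, 10000)
```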
Learning a CNN-based End-to-End Controller for a Formula SAE Racecar
Title | Learning a CNN-based End-to-End Controller for a Formula SAE Racecar |
Authors | Skanda Koppula |
Abstract | We present a set of CNN-based end-to-end models for controls of a Formula SAE racecar, along with various benchmarking and visualization tools for understanding model performance. We tackled three main problems in the context of cone-delineated racetrack driving: (1) discretized steering, which translates a first-person frame along the track into a predicted steering direction; (2) real-value steering, which translates a frame view into a real-valued steering angle; and (3) a network design for predicting brake and throttle. We demonstrate high accuracy on our discretization task, low theoretical testing errors with our model for real-value steering, and a starting point for future work on a controller for our vehicle's brake and throttle. Timing benchmarks suggest that the networks we propose have the latency and throughput required for real-time controllers when run on GPU-enabled hardware. |
Tasks | |
Published | 2017-07-12 |
URL | http://arxiv.org/abs/1708.02215v1 |
PDF | http://arxiv.org/pdf/1708.02215v1.pdf |
PWC | https://paperswithcode.com/paper/learning-a-cnn-based-end-to-end-controller |
Repo | https://github.com/vasubansal1033/FS-Electric |
Framework | none |
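For the discretized-steering task, the controller is a CNN classifier from a camera frame to a steering bin. A minimal PyTorch sketch under assumptions (the paper's exact architecture, bin count and input size may differ):

```python
# Sketch of a CNN mapping a first-person frame to one of K discretized
# steering directions (not the paper's exact architecture).
import torch
import torch.nn as nn

class SteeringCNN(nn.Module):
    def __init__(self, num_directions=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(48, num_directions)

    def forward(self, frames):                    # frames: (B, 3, H, W)
        return self.head(self.features(frames).flatten(1))

model = SteeringCNN()
logits = model(torch.randn(1, 3, 120, 160))      # hypothetical frame size
steer = logits.argmax(dim=1)                     # predicted steering bin
```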
Drug-Drug Interaction Extraction from Biomedical Text Using Long Short Term Memory Network
Title | Drug-Drug Interaction Extraction from Biomedical Text Using Long Short Term Memory Network |
Authors | Sunil Kumar Sahu, Ashish Anand |
Abstract | Simultaneous administration of multiple drugs can have synergistic or antagonistic effects, as one drug can affect the activity of others. Synergistic effects lead to improved therapeutic outcomes, whereas antagonistic effects can be life-threatening, may lead to increased healthcare costs, or may even cause death. Thus identification of unknown drug-drug interactions (DDIs) is an important concern for efficient and effective healthcare. Although multiple resources for DDIs exist, they are often unable to keep pace with the wealth of information in fast-growing biomedical texts. Most existing methods model DDI extraction from text as a classification problem and rely mainly on handcrafted features, some of which further depend on domain-specific tools. Recently, neural network models using latent features have been shown to give similar or better performance than existing models dependent on handcrafted features. In this paper, we present three models based on the long short-term memory (LSTM) network: *B-LSTM*, *AB-LSTM* and *Joint AB-LSTM*. All three models utilize word and position embeddings as latent features and thus do not rely on explicit feature engineering. The use of bidirectional long short-term memory (Bi-LSTM) networks further allows implicit feature extraction from the whole sentence. Two of the models, *AB-LSTM* and *Joint AB-LSTM*, also apply attentive pooling to the output of the Bi-LSTM layer to assign weights to features. Our experimental results on the SemEval-2013 DDI extraction dataset show that the *Joint AB-LSTM* model outperforms all existing methods, including those relying on handcrafted features. The other two proposed LSTM models also perform competitively with state-of-the-art methods. |
Tasks | Feature Engineering, Medical Relation Extraction |
Published | 2017-01-28 |
URL | http://arxiv.org/abs/1701.08303v2 |
PDF | http://arxiv.org/pdf/1701.08303v2.pdf |
PWC | https://paperswithcode.com/paper/drug-drug-interaction-extraction-from |
Repo | https://github.com/sunilitggu/DDI-extraction-through-LSTM |
Framework | tf |
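A minimal sketch of the attentive-pooling idea in AB-LSTM (our simplification: word embeddings only, position embeddings omitted; not the authors' code):

```python
# Bi-LSTM with attentive pooling: learn a weight per time step, then take
# a weighted sum of the hidden states before classification.
import torch
import torch.nn as nn

class AttentiveBiLSTM(nn.Module):
    def __init__(self, vocab_size, n_classes, emb_dim=100, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid, batch_first=True,
                            bidirectional=True)
        self.att = nn.Linear(2 * hid, 1)        # scores each time step
        self.cls = nn.Linear(2 * hid, n_classes)

    def forward(self, tokens):                  # tokens: (B, T)
        h, _ = self.lstm(self.embed(tokens))    # (B, T, 2*hid)
        w = torch.softmax(self.att(h), dim=1)   # attention weights (B, T, 1)
        pooled = (w * h).sum(dim=1)             # weighted sum over time
        return self.cls(pooled)

model = AttentiveBiLSTM(vocab_size=20000, n_classes=5)
logits = model(torch.randint(0, 20000, (4, 30)))
```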
Question Answering through Transfer Learning from Large Fine-grained Supervision Data
Title | Question Answering through Transfer Learning from Large Fine-grained Supervision Data |
Authors | Sewon Min, Minjoon Seo, Hannaneh Hajishirzi |
Abstract | We show that the task of question answering (QA) can significantly benefit from transfer learning of models trained on a different large, fine-grained QA dataset. We achieve the state of the art on two well-studied QA datasets, WikiQA and SemEval-2016 (Task 3A), through a basic transfer learning technique from SQuAD. For WikiQA, our model outperforms the previous best model by more than 8%. We demonstrate, through quantitative results and visual analysis, that finer supervision provides better guidance for learning lexical and syntactic information than coarser supervision. We also show that a similar transfer learning procedure achieves the state of the art on an entailment task. |
Tasks | Question Answering, Transfer Learning |
Published | 2017-02-07 |
URL | http://arxiv.org/abs/1702.02171v6 |
PDF | http://arxiv.org/pdf/1702.02171v6.pdf |
PWC | https://paperswithcode.com/paper/question-answering-through-transfer-learning |
Repo | https://github.com/shmsw25/qa-transfer |
Framework | tf |
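The transfer recipe itself is simple: pretrain on the large fine-grained dataset, then initialize the target-task model from those weights and fine-tune. A generic PyTorch sketch with a stand-in encoder (not the authors' BiDAF-based setup):

```python
# Generic pretrain-then-finetune recipe (assumptions, not the paper's code).
import torch
import torch.nn as nn

encoder = nn.LSTM(100, 128, batch_first=True)   # stand-in QA encoder

# 1. After pretraining on span-level SQuAD supervision, save the weights.
torch.save(encoder.state_dict(), "squad_pretrained.pt")

# 2. Reload them to initialize the target-task model, then fine-tune on
#    the smaller WikiQA data with a conservative learning rate.
encoder.load_state_dict(torch.load("squad_pretrained.pt"))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-5)
```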
Model-Powered Conditional Independence Test
Title | Model-Powered Conditional Independence Test |
Authors | Rajat Sen, Ananda Theertha Suresh, Karthikeyan Shanmugam, Alexandros G. Dimakis, Sanjay Shakkottai |
Abstract | We consider the problem of non-parametric Conditional Independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution $f(x,y,z)$ of continuous random vectors $X, Y$ and $Z$, we determine whether $X \perp Y \mid Z$. We approach this by converting the conditional independence test into a classification problem. This allows us to harness very powerful classifiers, such as gradient-boosted trees and deep neural networks, which can handle complex probability distributions and let us perform significantly better than the prior state of the art for high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution $f^{CI}(x,y,z) = f(x|z)f(y|z)f(z)$ (equal to the joint distribution if and only if $X \perp Y \mid Z$) when given access only to i.i.d. samples from the true joint distribution $f(x,y,z)$. To tackle this problem we propose a novel nearest-neighbor bootstrap procedure and theoretically show that our generated samples are indeed close to $f^{CI}$ in terms of total variation distance. We then develop theoretical results regarding generalization bounds for classification in our setting, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d., near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods. |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06138v1 |
PDF | http://arxiv.org/pdf/1709.06138v1.pdf |
PWC | https://paperswithcode.com/paper/model-powered-conditional-independence-test |
Repo | https://github.com/rajatsen91/CCIT |
Framework | none |
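The key device is the nearest-neighbor bootstrap: for each sample (x, y, z), replace y with the y of the sample whose z is nearest, which approximately yields draws from f(x|z)f(y|z)f(z). A classifier is then trained to distinguish original from bootstrapped samples; near-chance accuracy supports conditional independence. A simplified 1-NN NumPy sketch (ours, not the released CCIT code):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def nn_bootstrap(X, Y, Z):
    d = np.abs(Z[:, None] - Z[None, :]).sum(axis=2)  # pairwise |z_i - z_j|
    np.fill_diagonal(d, np.inf)                      # exclude self-matches
    return X, Y[d.argmin(axis=1)], Z                 # y from nearest-z sample

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 1))
X = Z + 0.1 * rng.normal(size=(500, 1))   # X depends on Z only
Y = Z + 0.1 * rng.normal(size=(500, 1))   # Y depends on Z only: CI holds
Xb, Yb, Zb = nn_bootstrap(X, Y, Z)

# Classifier two-sample test: label originals 0, bootstrapped samples 1.
data = np.vstack([np.hstack([X, Y, Z]), np.hstack([Xb, Yb, Zb])])
labels = np.r_[np.zeros(500), np.ones(500)]
tr_x, te_x, tr_y, te_y = train_test_split(data, labels, random_state=0)
acc = GradientBoostingClassifier().fit(tr_x, tr_y).score(te_x, te_y)
print(acc)   # near 0.5 expected here, consistent with X independent of Y given Z
```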
Semantic Document Distance Measures and Unsupervised Document Revision Detection
Title | Semantic Document Distance Measures and Unsupervised Document Revision Detection |
Authors | Xiaofeng Zhu, Diego Klabjan, Patrick Bless |
Abstract | In this paper, we model the document revision detection problem as a minimum cost branching problem that relies on computing document distances. Furthermore, we propose two new document distance measures: word vector-based Dynamic Time Warping (wDTW) and word vector-based Tree Edit Distance (wTED). Our revision detection system is designed for large-scale corpora and implemented in Apache Spark. We demonstrate that our system can detect revisions more precisely than state-of-the-art methods, using the Wikipedia revision dumps https://snap.stanford.edu/data/wiki-meta.html and simulated data sets. |
Tasks | |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01256v2 |
PDF | http://arxiv.org/pdf/1709.01256v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-document-distance-measures-and |
Repo | https://github.com/XiaofengZhu/wDTW-wTED |
Framework | none |
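The wDTW measure applies the classic DTW recurrence with word-vector distances as the local cost. A minimal NumPy sketch of that idea (the paper works at paragraph/document granularity and runs in Spark; this is ours):

```python
# Dynamic Time Warping over sequences of word vectors.
import numpy as np

def w_dtw(A, B):
    """A, B: (len, dim) arrays of word vectors; returns alignment cost."""
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(A[i - 1] - B[j - 1])   # vector distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(0)
doc_a, doc_b = rng.normal(size=(12, 50)), rng.normal(size=(15, 50))
print(w_dtw(doc_a, doc_b))
```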
Feature-Fused SSD: Fast Detection for Small Objects
Title | Feature-Fused SSD: Fast Detection for Small Objects |
Authors | Guimei Cao, Xuemei Xie, Wenzhe Yang, Quan Liao, Guangming Shi, Jinjian Wu |
Abstract | Small object detection is a challenging task in computer vision due to limited resolution and information. To address this problem, the majority of existing methods sacrifice speed for gains in accuracy. In this paper, we aim to detect small objects at high speed, using the Single Shot MultiBox Detector (SSD), the best object detector with respect to the accuracy-vs-speed trade-off, as our base architecture. We propose a multi-level feature fusion method that introduces contextual information into SSD in order to improve accuracy on small objects. For the fusion operation, we design two feature fusion modules, a concatenation module and an element-sum module, which differ in the way they add contextual information. Experimental results show that these two fusion modules improve mAP on PASCAL VOC2007 over the baseline SSD by 1.6 and 1.7 points respectively, with 2-3 point gains on some small-object categories. Their testing speeds are 43 and 40 FPS respectively, exceeding the state-of-the-art Deconvolutional Single Shot Detector (DSSD) by 29.4 and 26.4 FPS. Code is available at https://github.com/wnzhyee/Feature-Fused-SSD. Keywords: small object detection, feature fusion, real-time, single shot multi-box detector |
Tasks | Object Detection, Small Object Detection |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05054v3 |
PDF | http://arxiv.org/pdf/1709.05054v3.pdf |
PWC | https://paperswithcode.com/paper/feature-fused-ssd-fast-detection-for-small |
Repo | https://github.com/wnzhyee/Feature-Fused-SSD |
Framework | none |
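One reading of the concatenation module: upsample the deeper, coarser feature map to the shallower map's resolution, concatenate along channels, and fuse with a convolution. A PyTorch sketch under assumptions (the released code above is authoritative):

```python
# Concatenation-style feature fusion for SSD feature maps.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatFusion(nn.Module):
    def __init__(self, c_shallow, c_deep, c_out):
        super().__init__()
        self.fuse = nn.Conv2d(c_shallow + c_deep, c_out, kernel_size=1)

    def forward(self, shallow, deep):
        # bring the deeper (coarser) map up to the shallow map's resolution
        deep_up = F.interpolate(deep, size=shallow.shape[2:],
                                mode="bilinear", align_corners=False)
        return torch.relu(self.fuse(torch.cat([shallow, deep_up], dim=1)))

fusion = ConcatFusion(c_shallow=512, c_deep=1024, c_out=512)
out = fusion(torch.randn(1, 512, 38, 38), torch.randn(1, 1024, 19, 19))
```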
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
Title | Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture |
Authors | Katsunori Ohnishi, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada |
Abstract | Learning to represent and generate videos from unlabeled data is a very challenging problem. To generate realistic videos, it is important not only to ensure that the appearance of each frame is real, but also that the motion is plausible and the appearance is consistent over time. The process of video generation should be divided according to these intrinsic difficulties. In this study, we focus on motion and appearance as two important orthogonal components of a video, and propose Flow-and-Texture Generative Adversarial Networks (FTGAN), consisting of FlowGAN and TextureGAN. To avoid a huge annotation cost, we must explore ways to learn from unlabeled data, so we employ optical flow as motion information for generating videos. FlowGAN generates optical flow, which contains only the edges and motion of the videos to be generated. TextureGAN, in turn, specializes in adding texture to the optical flow generated by FlowGAN. This hierarchical approach yields more realistic videos with plausible motion and appearance consistency. Our experiments show that our model generates more plausible motion videos and achieves significantly improved performance for unsupervised action classification compared to previous GAN works. In addition, because our model generates videos from two independent sources of information, it can produce combinations of motion and attributes unseen in the training data, such as a video of a person doing sit-ups on a baseball field. |
Tasks | Action Classification, Optical Flow Estimation, Video Generation |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09618v2 |
PDF | http://arxiv.org/pdf/1711.09618v2.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-video-generation-from-orthogonal |
Repo | https://github.com/mil-tokyo/FTGAN |
Framework | none |
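The decomposition is two stacked generators: one maps noise to an optical-flow field, the other conditions on that flow to render textured frames. A deliberately tiny sketch of the wiring only (assumed shapes; not the released model, which uses convolutional generators and discriminators):

```python
# Two-stage FlowGAN -> TextureGAN wiring, reduced to linear stand-ins.
import torch
import torch.nn as nn

flow_gen = nn.Sequential(nn.Linear(100, 2 * 8 * 8))           # noise -> flow
tex_gen = nn.Sequential(nn.Linear(100 + 2 * 8 * 8, 3 * 8 * 8))  # flow -> frame

z = torch.randn(4, 100)
flow = flow_gen(z)                                   # (4, 2*8*8) flow field
frames = tex_gen(torch.cat([z, flow], dim=1)).view(4, 3, 8, 8)
```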
Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser
Title | Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser |
Authors | Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, Jun Zhu |
Abstract | Neural networks are vulnerable to adversarial examples, which poses a threat to their application in security-sensitive systems. We propose the high-level representation guided denoiser (HGD) as a defense for image classification. Standard denoisers suffer from the error amplification effect, in which small residual adversarial noise is progressively amplified and leads to wrong classifications. HGD overcomes this problem by using a loss function defined as the difference between the target model's outputs activated by the clean image and the denoised image. Compared with ensemble adversarial training, the state-of-the-art defense method on large images, HGD has three advantages. First, with HGD as a defense, the target model is more robust to both white-box and black-box adversarial attacks. Second, HGD can be trained on a small subset of the images and generalizes well to other images and unseen classes. Third, HGD can be transferred to defend models other than the one guiding it. In the NIPS competition on defense against adversarial attacks, our HGD solution won first place and outperformed other models by a large margin. |
Tasks | Adversarial Attack, Adversarial Defense, Image Classification |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.02976v2 |
PDF | http://arxiv.org/pdf/1712.02976v2.pdf |
PWC | https://paperswithcode.com/paper/defense-against-adversarial-attacks-using |
Repo | https://github.com/anishathalye/Guided-Denoise |
Framework | tf |
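The training signal can be summarized in a few lines: the denoiser is optimized so that a fixed guide model's high-level features for the denoised adversarial image match those of the clean image. A PyTorch sketch under assumptions (L1 feature gap; the paper explores several guide variants):

```python
# HGD-style training loss: match guide features of denoise(adv) to clean.
import torch

def hgd_loss(denoiser, guide, x_clean, x_adv):
    x_denoised = denoiser(x_adv)
    with torch.no_grad():                       # target features are fixed
        target = guide(x_clean)
    return (guide(x_denoised) - target).abs().mean()   # L1 feature gap

denoiser = torch.nn.Conv2d(3, 3, 3, padding=1)           # stand-in denoiser
guide = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
for p in guide.parameters():                    # guide model is not trained
    p.requires_grad_(False)

loss = hgd_loss(denoiser, guide, torch.randn(1, 3, 32, 32),
                torch.randn(1, 3, 32, 32))
loss.backward()                                 # gradients reach the denoiser
```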
Preserving Differential Privacy in Convolutional Deep Belief Networks
Title | Preserving Differential Privacy in Convolutional Deep Belief Networks |
Authors | NhatHai Phan, Xintao Wu, Dejing Dou |
Abstract | The remarkable development of deep learning in medicine and healthcare presents obvious privacy issues when deep neural networks are built on users' personal and highly sensitive data, e.g., clinical records, user profiles, biomedical images, etc. However, only a few scientific studies on preserving privacy in deep learning have been conducted. In this paper, we focus on developing a private convolutional deep belief network (pCDBN), which is essentially a convolutional deep belief network (CDBN) under differential privacy. Our main idea for enforcing epsilon-differential privacy is to leverage the functional mechanism to perturb the energy-based objective functions of traditional CDBNs, rather than their results. One key contribution of this work is that we propose the use of Chebyshev expansion to derive approximate polynomial representations of objective functions. Our theoretical analysis shows that we can further derive the sensitivity and error bounds of the approximate polynomial representation. As a result, preserving differential privacy in CDBNs is feasible. We applied our model to a health social network (the YesiWell data) and a handwritten digit dataset (MNIST) for human behavior prediction, human behavior classification, and handwritten digit recognition. Theoretical analysis and rigorous experimental evaluations show that the pCDBN is highly effective and significantly outperforms existing solutions. |
Tasks | |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08839v2 |
PDF | http://arxiv.org/pdf/1706.08839v2.pdf |
PWC | https://paperswithcode.com/paper/preserving-differential-privacy-in |
Repo | https://github.com/haiphanNJIT/PrivateDeepLearning |
Framework | tf |
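The enabling step is replacing the energy-based objective with a low-degree Chebyshev polynomial approximation, whose bounded coefficients make the sensitivity analysis for the functional mechanism tractable. A NumPy illustration of just the approximation step, on a stand-in sigmoid-style nonlinearity:

```python
# Chebyshev expansion of a nonlinearity on [-1, 1].
import numpy as np
from numpy.polynomial import chebyshev as C

x = np.linspace(-1, 1, 200)
f = 1.0 / (1.0 + np.exp(-4 * x))       # stand-in energy nonlinearity
poly = C.Chebyshev.fit(x, f, deg=5)    # degree-5 Chebyshev approximation
print(np.max(np.abs(poly(x) - f)))     # small worst-case approximation error
```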
StarSpace: Embed All The Things!
Title | StarSpace: Embed All The Things! |
Authors | Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston |
Abstract | We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding entities comprised of discrete features and comparing them against each other, learning similarities dependent on the task. Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not. |
Tasks | Text Classification, Word Embeddings |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03856v5 |
PDF | http://arxiv.org/pdf/1709.03856v5.pdf |
PWC | https://paperswithcode.com/paper/starspace-embed-all-the-things |
Repo | https://github.com/facebookresearch/StarSpace |
Framework | none |
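StarSpace embeds every entity as a bag of discrete features (summed feature vectors) and trains with a ranking loss that scores correct entity pairs above sampled negatives. A PyTorch sketch of one such training step (our simplification: cosine similarity and a single negative per example; the real implementation is C++ with several loss and similarity options):

```python
# Bag-of-features embeddings with a margin ranking loss.
import torch
import torch.nn as nn

emb = nn.EmbeddingBag(10000, 100, mode="sum")   # feature-bag embedder

def similarity(a, b):
    return torch.cosine_similarity(a, b, dim=1)

lhs = emb(torch.randint(0, 10000, (8, 5)))      # e.g. document features
pos = emb(torch.randint(0, 10000, (8, 3)))      # correct label's features
neg = emb(torch.randint(0, 10000, (8, 3)))      # sampled negative label

margin = 0.2
loss = torch.clamp(margin - similarity(lhs, pos) + similarity(lhs, neg),
                   min=0).mean()
loss.backward()
```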
Can you tell where in India I am from? Comparing humans and computers on fine-grained race face classification
Title | Can you tell where in India I am from? Comparing humans and computers on fine-grained race face classification |
Authors | Harish Katti, S. P. Arun |
Abstract | Faces form the basis for a rich variety of judgments in humans, yet the underlying features remain poorly understood. Although fine-grained distinctions within a race might constrain the facial features humans use more strongly than coarse categories such as race or gender, such fine-grained distinctions are relatively less studied. Fine-grained race classification is also interesting because even humans may not be perfectly accurate on these tasks, which allows us to compare errors made by humans and machines, in contrast to standard object detection tasks where human performance is nearly perfect. We have developed a novel face database of close to 1650 diverse Indian faces labeled for fine-grained race (South vs North India) as well as for age, weight, height and gender. We then asked close to 130 human subjects to categorize each face as belonging to a Northern or Southern state in India, and compared human performance on this task with that of computational models trained on the ground-truth labels. Our main results are as follows: (1) Humans are highly consistent (average accuracy: 63.6%), with some faces consistently classified with > 90% accuracy and others consistently misclassified with < 30% accuracy; (2) Models trained on ground-truth labels showed slightly worse performance overall (average accuracy: 62%) but higher accuracy (72.2%) on faces classified with > 80% accuracy by humans. This was true both for models trained on simple spatial and intensity measurements extracted from faces and for deep neural networks trained on race or gender classification; (3) Using overcomplete banks of features derived from each face part, we found that mouth shape was the single largest contributor to fine-grained race classification, whereas distances between face parts were the strongest predictor of gender. |
Tasks | Object Detection |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07595v2 |
PDF | http://arxiv.org/pdf/1703.07595v2.pdf |
PWC | https://paperswithcode.com/paper/can-you-tell-where-in-india-i-am-from |
Repo | https://github.com/harish2006/IISCIFD |
Framework | none |
Deep Hashing Network for Unsupervised Domain Adaptation
Title | Deep Hashing Network for Unsupervised Domain Adaptation |
Authors | Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, Sethuraman Panchanathan |
Abstract | In recent years, deep neural networks have emerged as a dominant machine learning tool for a wide variety of application domains. However, training a deep neural network requires a large amount of labeled data, which is expensive to obtain in terms of time, labor and human expertise. Domain adaptation or transfer learning algorithms address this challenge by leveraging labeled data in a different but related source domain to develop a model for the target domain. Further, the explosive growth of digital data has posed a fundamental challenge concerning its storage and retrieval. Owing to its storage and retrieval efficiency, hashing has recently seen wide application in a variety of computer vision problems. In this paper, we first introduce a new dataset, Office-Home, to evaluate domain adaptation algorithms. The dataset contains images of a variety of everyday objects from multiple domains. We then propose a novel deep learning framework that can exploit labeled source data and unlabeled target data to learn informative hash codes and accurately classify unseen target data. To the best of our knowledge, this is the first research effort to exploit the feature learning capabilities of deep neural networks to learn representative hash codes for the domain adaptation problem. Our extensive empirical studies on multiple transfer tasks corroborate the usefulness of the framework in learning efficient hash codes which outperform existing competitive baselines for unsupervised domain adaptation. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07522v1 |
PDF | http://arxiv.org/pdf/1706.07522v1.pdf |
PWC | https://paperswithcode.com/paper/deep-hashing-network-for-unsupervised-domain |
Repo | https://github.com/hemanthdv/da-hash |
Framework | none |
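A common pattern for deep hashing, and a plausible reading of this framework's retrieval side: a network emits tanh-relaxed codes during training, which are binarized by sign and compared by Hamming distance at retrieval time. A minimal sketch (assumptions only; not the paper's network or its adaptation losses):

```python
# Hash codes via tanh relaxation, binarized for Hamming retrieval.
import torch
import torch.nn as nn

hasher = nn.Sequential(nn.Linear(512, 64), nn.Tanh())   # 64-bit codes
feats = torch.randn(10, 512)                 # image features (stand-in)
codes = torch.sign(hasher(feats))            # binary codes in {-1, +1}

query = codes[0]
hamming = (codes != query).sum(dim=1)        # Hamming distance to query
nearest = hamming.argsort()[:5]              # retrieve 5 nearest items
```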
Adversarial-Playground: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning
Title | Adversarial-Playground: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning |
Authors | Andrew P. Norton, Yanjun Qi |
Abstract | Recent studies have shown that attackers can force deep learning models to misclassify so-called "adversarial examples": maliciously generated images formed by making imperceptible modifications to pixel values. With growing interest in deep learning for security applications, it is important for security experts and users of machine learning to recognize how learning systems may be attacked. Due to the complex nature of deep learning, it is challenging to understand how deep models can be fooled by adversarial examples. Thus, we present a web-based visualization tool, Adversarial-Playground, to demonstrate the efficacy of common adversarial methods against a convolutional neural network (CNN) system. Adversarial-Playground is educational, modular and interactive. (1) It enables non-experts to compare examples visually and to understand why an adversarial example can fool a CNN-based image classifier. (2) It can help security experts explore vulnerabilities of deep learning as a software module. (3) Building an interactive visualization is challenging in this domain due to the large feature space of image classification: generating adversarial examples is slow in general and visualizing images is costly. Through multiple novel design choices, our tool provides fast and accurate responses to user requests. Empirically, we find that our client-server division strategy reduced the response time by an average of 1.5 seconds per sample, and that our other innovation, a faster variant of the JSMA evasion algorithm, ran twice as fast as JSMA while maintaining a comparable evasion rate. Project source code and data from our experiments are available at: https://github.com/QData/AdversarialDNN-Playground |
Tasks | Adversarial Attack, Adversarial Defense, Image Classification |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00807v1 |
PDF | http://arxiv.org/pdf/1708.00807v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-playground-a-visualization-suite |
Repo | https://github.com/QData/AdversarialDNN-Playground |
Framework | tf |
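For intuition, here is a heavily simplified saliency-driven attack in the spirit of JSMA: repeatedly bump the input feature whose gradient most increases the target-class logit. This sketch (ours) does not reproduce the paper's faster JSMA variant:

```python
# Simplified greedy saliency attack (illustration only, not the paper's
# algorithm): perturb the single most target-salient pixel per step.
import torch

def simple_saliency_attack(model, x, target, steps=20, eps=0.1):
    x = x.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        score = model(x)[0, target]           # target-class logit
        grad, = torch.autograd.grad(score, x)
        x = x.detach()
        idx = grad.flatten().argmax()         # most salient input feature
        x.view(-1)[idx] = (x.view(-1)[idx] + eps).clamp(0, 1)
    return x

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
adv = simple_saliency_attack(model, torch.rand(1, 1, 28, 28), target=3)
```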