Paper Group AWR 69
A Context-aware Natural Language Generator for Dialogue Systems. Predicting Domain Generation Algorithms with Long Short-Term Memory Networks. Harnessing Deep Neural Networks with Logic Rules. Globally Normalized Transition-Based Neural Networks. Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data. Theoretical Robopsychology: Samu Has Learned Turing Machines. On-Demand Learning for Deep Image Restoration. Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks. Least Squares Generative Adversarial Networks. Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields. SimpleDS: A Simple Deep Reinforcement Learning Dialogue System. Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text. ArtTrack: Articulated Multi-person Tracking in the Wild. Query-Reduction Networks for Question Answering. Using Deep Learning for Image-Based Plant Disease Detection.
A Context-aware Natural Language Generator for Dialogue Systems
Title | A Context-aware Natural Language Generator for Dialogue Systems |
Authors | Ondřej Dušek, Filip Jurčíček |
Abstract | We present a novel natural language generation system for spoken dialogue systems capable of entraining (adapting) to users’ way of speaking, providing contextually appropriate responses. The generator is based on recurrent neural networks and the sequence-to-sequence approach. It is fully trainable from data which include preceding context along with responses to be generated. We show that the context-aware generator yields significant improvements over the baseline in both automatic metrics and a human pairwise preference test. |
Tasks | Spoken Dialogue Systems, Text Generation |
Published | 2016-08-25 |
URL | http://arxiv.org/abs/1608.07076v1 |
PDF | http://arxiv.org/pdf/1608.07076v1.pdf |
PWC | https://paperswithcode.com/paper/a-context-aware-natural-language-generator |
Repo | https://github.com/UFAL-DSG/tgen |
Framework | tf |
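As a concrete illustration of the idea in the abstract, the sketch below shows a sequence-to-sequence generator whose encoder reads the preceding user utterance together with the dialogue-act tokens, so the decoder can entrain to the user's wording. This is not the authors' TGen code; all layer sizes and variable names are illustrative.

```python
# Hedged sketch of a context-aware seq2seq generator (not the authors' TGen code):
# the preceding user utterance is prepended to the dialogue-act tokens so the
# decoder can reuse ("entrain to") the user's wording.
import torch
import torch.nn as nn

class ContextAwareGenerator(nn.Module):
    def __init__(self, vocab_size, emb=64, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, context_ids, da_ids, response_ids):
        # Encoder sees [preceding user utterance ; dialogue-act tokens].
        enc_in = self.embed(torch.cat([context_ids, da_ids], dim=1))
        _, state = self.encoder(enc_in)
        # Decoder is initialized with the context-aware encoder state.
        dec_out, _ = self.decoder(self.embed(response_ids), state)
        return self.out(dec_out)          # logits over the output vocabulary

vocab = 1000
model = ContextAwareGenerator(vocab)
ctx = torch.randint(0, vocab, (2, 7))     # preceding user utterance tokens
da = torch.randint(0, vocab, (2, 5))      # dialogue-act tokens
resp = torch.randint(0, vocab, (2, 9))    # gold response (teacher forcing)
print(model(ctx, da, resp).shape)         # torch.Size([2, 9, 1000])
```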
Predicting Domain Generation Algorithms with Long Short-Term Memory Networks
Title | Predicting Domain Generation Algorithms with Long Short-Term Memory Networks |
Authors | Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja, Daniel Grant |
Abstract | Various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to a command and control (C&C) server. In order to block DGA C&C traffic, security organizations must first discover the algorithm by reverse engineering malware samples, and then generate a list of domains for a given seed. The domains are then either preregistered or published in a DNS blacklist. This process is not only tedious, but can be readily circumvented by malware authors using a large number of seeds in algorithms with multivariate recurrence properties (e.g., banjori) or by using a dynamic list of seeds (e.g., bedep). Another technique to stop malware from using DGAs is to intercept DNS queries on a network and predict whether domains are DGA-generated. Such a technique will alert network administrators to the presence of malware on their networks. In addition, if the predictor can also accurately predict the family of DGAs, then network administrators can also be alerted to the type of malware that is on their networks. This paper presents a DGA classifier that leverages long short-term memory (LSTM) networks to predict DGAs and their respective families without the need for a priori feature extraction. Results are significantly better than state-of-the-art techniques, providing 0.9993 area under the receiver operating characteristic curve for binary classification and a micro-averaged F1 score of 0.9906. In other words, the LSTM technique can provide a 90% detection rate with a 1:10000 false positive (FP) rate—a twenty times FP improvement over comparable methods. Experiments in this paper are run on open datasets and code snippets are provided to reproduce the results. |
Tasks | |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00791v1 |
PDF | http://arxiv.org/pdf/1611.00791v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-domain-generation-algorithms-with |
Repo | https://github.com/endgameinc/dga_predict |
Framework | none |
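The abstract's key point is that the classifier works on raw domain strings with no hand-crafted features: characters go through an embedding, an LSTM, and a sigmoid output. The sketch below illustrates that pipeline; the hyperparameters are illustrative and not taken from the authors' released Keras code.

```python
# Minimal sketch of a character-level LSTM DGA detector
# (embedding -> LSTM -> sigmoid), with made-up hyperparameters.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-_."
CHAR2ID = {c: i + 1 for i, c in enumerate(CHARS)}   # 0 is padding

def encode(domain, max_len=64):
    ids = [CHAR2ID.get(c, 0) for c in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

class DGAClassifier(nn.Module):
    def __init__(self, vocab=len(CHARS) + 1, emb=32, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return torch.sigmoid(self.head(h[:, -1]))   # P(domain is DGA-generated)

model = DGAClassifier()
batch = torch.tensor([encode("google.com"), encode("xjkqpwzrtyvb.net")])
print(model(batch).squeeze(-1))   # untrained scores, one per domain
```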
Harnessing Deep Neural Networks with Logic Rules
Title | Harnessing Deep Neural Networks with Logic Rules |
Authors | Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric Xing |
Abstract | Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems. |
Tasks | Named Entity Recognition, Sentiment Analysis |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06318v5 |
PDF | http://arxiv.org/pdf/1603.06318v5.pdf |
PWC | https://paperswithcode.com/paper/harnessing-deep-neural-networks-with-logic |
Repo | https://github.com/ZhitingHu/logicnn |
Framework | none |
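The core of the iterative distillation described in the abstract is a mixed objective: the student network imitates both the ground-truth labels and a rule-regularized "teacher" distribution. The sketch below shows that objective; constructing the teacher from first-order logic rules is paper-specific and only stubbed here, and the mixing weight `pi` is illustrative.

```python
# Hedged sketch of a rule-distillation objective:
# (1 - pi) * CE(labels, student) + pi * CE(teacher, student)
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, labels, teacher_probs, pi=0.6):
    hard = F.cross_entropy(student_logits, labels)          # fit the true labels
    log_p = F.log_softmax(student_logits, dim=-1)
    soft = -(teacher_probs * log_p).sum(dim=-1).mean()      # imitate the rule-projected teacher
    return (1.0 - pi) * hard + pi * soft

logits = torch.randn(4, 3, requires_grad=True)              # student outputs
labels = torch.tensor([0, 2, 1, 0])
teacher = F.softmax(torch.randn(4, 3), dim=-1)              # stand-in for the rule-projected teacher
loss = distillation_loss(logits, labels, teacher)
loss.backward()
print(loss.item())
```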
Globally Normalized Transition-Based Neural Networks
Title | Globally Normalized Transition-Based Neural Networks |
Authors | Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins |
Abstract | We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models. We discuss the importance of global as opposed to local normalization: a key insight is that the label bias problem implies that globally normalized models can be strictly more expressive than locally normalized models. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging, Sentence Compression |
Published | 2016-03-19 |
URL | http://arxiv.org/abs/1603.06042v2 |
PDF | http://arxiv.org/pdf/1603.06042v2.pdf |
PWC | https://paperswithcode.com/paper/globally-normalized-transition-based-neural |
Repo | https://github.com/tensorflow/models/tree/master/research/syntaxnet |
Framework | tf |
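The contrast the abstract draws is between normalizing each transition decision locally (a per-step softmax) and normalizing over whole decision sequences (a CRF-style softmax over beam hypotheses). The toy sketch below makes that distinction concrete; the beam contents and scores are made up and this is not the SyntaxNet implementation.

```python
# Local vs. global normalization for a transition system (toy values).
import torch

def local_loss(step_scores, gold):
    """Per-step softmax: each decision is normalized independently."""
    return -torch.log_softmax(step_scores, dim=-1)[torch.arange(len(gold)), gold].sum()

def global_loss(sequence_scores, gold_idx):
    """CRF-style: one softmax over whole candidate decision sequences."""
    return -torch.log_softmax(sequence_scores, dim=-1)[gold_idx]

step_scores = torch.randn(5, 8)                 # 5 transitions, 8 possible actions each
gold_actions = torch.tensor([1, 0, 3, 2, 7])
print(local_loss(step_scores, gold_actions))

beam_scores = torch.randn(16)                   # summed scores of 16 beam hypotheses
beam_scores[0] = step_scores[torch.arange(5), gold_actions].sum()  # gold hypothesis
print(global_loss(beam_scores, gold_idx=0))
```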
Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
Title | Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data |
Authors | Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt |
Abstract | We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions via variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies such as image sequences without domain knowledge. Our experiments show that enabling backpropagation through transitions enforces state space assumptions and significantly improves information content of the latent embedding. This also enables realistic long-term prediction. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06432v3 |
PDF | http://arxiv.org/pdf/1605.06432v3.pdf |
PWC | https://paperswithcode.com/paper/deep-variational-bayes-filters-unsupervised |
Repo | https://github.com/baggepinnen/DVBF.jl |
Framework | none |
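The abstract's point about "backpropagation through transitions" is that the sampled stochastic variable enters the model only through the latent dynamics, so gradients must flow through the state-space transition. The sketch below shows one such reparameterized latent transition step; the dimensions, the linear form of the transition, and the absence of a recognition network are all simplifying assumptions.

```python
# Hedged sketch of one DVBF-style latent transition with the reparameterization trick.
import torch
import torch.nn as nn

class LatentTransition(nn.Module):
    def __init__(self, z_dim=4, u_dim=2, w_dim=4):
        super().__init__()
        self.A = nn.Linear(z_dim, z_dim, bias=False)
        self.B = nn.Linear(u_dim, z_dim, bias=False)
        self.C = nn.Linear(w_dim, z_dim, bias=False)

    def forward(self, z, u, w_mean, w_logvar):
        # w ~ N(w_mean, exp(w_logvar)); gradients flow through the dynamics only.
        w = w_mean + torch.exp(0.5 * w_logvar) * torch.randn_like(w_mean)
        return self.A(z) + self.B(u) + self.C(w)     # z_{t+1}

trans = LatentTransition()
z = torch.zeros(1, 4)
u = torch.randn(1, 2)                                 # control input
w_mean, w_logvar = torch.zeros(1, 4), torch.zeros(1, 4)
for _ in range(3):                                    # roll the latent state forward
    z = trans(z, u, w_mean, w_logvar)
print(z)
```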
Theoretical Robopsychology: Samu Has Learned Turing Machines
Title | Theoretical Robopsychology: Samu Has Learned Turing Machines |
Authors | Norbert Bátfai |
Abstract | From the point of view of a programmer, robopsychology is a synonym for the activity performed by developers to implement their machine learning applications. This robopsychological approach raises some fundamental theoretical questions of machine learning. Our discussion of these questions is constrained to Turing machines. Alan Turing gave an algorithm (aka the Turing Machine) to describe algorithms. If it is applied to describe itself, this brings us to Turing’s notion of the universal machine. In the present paper, we investigate algorithms to write algorithms. From a pedagogical point of view, this way of writing programs can be considered a combination of learning by listening and learning by doing, because it is based on applying agent technology and machine learning. As the main result, we introduce the problem of learning and show that it cannot easily be handled in reality; it is therefore reasonable to use machine learning algorithms for learning Turing machines. |
Tasks | |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02767v2 |
PDF | http://arxiv.org/pdf/1606.02767v2.pdf |
PWC | https://paperswithcode.com/paper/theoretical-robopsychology-samu-has-learned |
Repo | https://github.com/nbatfai/SamuCTuring |
Framework | none |
On-Demand Learning for Deep Image Restoration
Title | On-Demand Learning for Deep Image Restoration |
Authors | Ruohan Gao, Kristen Grauman |
Abstract | While machine learning approaches to image restoration offer great promise, current methods risk training models fixated on performing well only for image corruption of a particular level of difficulty—such as a certain level of noise or blur. First, we examine the weakness of conventional “fixated” models and demonstrate that training general models to handle arbitrary levels of corruption is indeed non-trivial. Then, we propose an on-demand learning algorithm for training image restoration models with deep convolutional neural networks. The main idea is to exploit a feedback mechanism to self-generate training instances where they are needed most, thereby learning models that can generalize across difficulty levels. On four restoration tasks—image inpainting, pixel interpolation, image deblurring, and image denoising—and three diverse datasets, our approach consistently outperforms both the status quo training procedure and curriculum learning alternatives. |
Tasks | Deblurring, Denoising, Image Denoising, Image Inpainting, Image Restoration |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01380v3 |
PDF | http://arxiv.org/pdf/1612.01380v3.pdf |
PWC | https://paperswithcode.com/paper/on-demand-learning-for-deep-image-restoration |
Repo | https://github.com/rhgao/on-demand-learning |
Framework | torch |
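The "feedback mechanism" in the abstract boils down to an allocation rule: training examples for the next round are drawn from each corruption-difficulty bin in proportion to how poorly the model currently restores that bin. The sketch below illustrates one such inverse-performance allocation; the PSNR numbers are made up, and the exact rule in the paper may differ.

```python
# Sketch of an on-demand allocation rule: more training batches for
# difficulty bins the model currently handles badly (low PSNR).
import numpy as np

def allocate_batches(psnr_per_bin, total_batches=100):
    psnr = np.asarray(psnr_per_bin, dtype=float)
    need = 1.0 / psnr                      # inverse performance as "demand"
    share = need / need.sum()
    return np.round(share * total_batches).astype(int)

# Five difficulty levels, from light to heavy corruption (validation PSNR, dB).
current_psnr = [34.1, 31.5, 28.0, 24.2, 20.6]
print(allocate_batches(current_psnr))      # heavier corruption gets more batches
```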
Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks
Title | Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks |
Authors | Michael Bukatin, Steve Matthews, Andrey Radul |
Abstract | Dataflow matrix machines are a powerful generalization of recurrent neural networks. They work with multiple types of arbitrary linear streams and multiple types of powerful neurons, and allow one to incorporate higher-order constructions. We expect them to be useful in machine learning and probabilistic programming, and in the synthesis of dynamic systems and of deterministic and probabilistic programs. |
Tasks | Probabilistic Programming |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.09002v2 |
PDF | http://arxiv.org/pdf/1603.09002v2.pdf |
PWC | https://paperswithcode.com/paper/dataflow-matrix-machines-as-a-generalization |
Repo | https://github.com/anhinga/fluid |
Framework | none |
Least Squares Generative Adversarial Networks
Title | Least Squares Generative Adversarial Networks |
Authors | Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, Stephen Paul Smolley |
Abstract | Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN amounts to minimizing the Pearson $\chi^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher-quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs. |
Tasks | |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04076v3 |
PDF | http://arxiv.org/pdf/1611.04076v3.pdf |
PWC | https://paperswithcode.com/paper/least-squares-generative-adversarial-networks |
Repo | https://github.com/eriklindernoren/PyTorch-GAN |
Framework | pytorch |
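The least-squares objective from the abstract replaces the sigmoid cross-entropy loss with a squared error between the discriminator's output and target codes. The sketch below writes out one training step under the common 0/1 coding; the toy networks and sizes are placeholders, not the paper's architecture.

```python
# LSGAN losses for one training step (toy networks, 0/1 target coding).
import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # toy discriminator
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # toy generator
mse = nn.MSELoss()

real = torch.randn(16, 2)
z = torch.randn(16, 8)
fake = G(z)

# Discriminator: push D(real) -> 1 and D(fake) -> 0 with squared error.
d_loss = 0.5 * mse(D(real), torch.ones(16, 1)) + 0.5 * mse(D(fake.detach()), torch.zeros(16, 1))

# Generator: push D(fake) -> 1, again with squared error (no sigmoid, no log).
g_loss = 0.5 * mse(D(fake), torch.ones(16, 1))

print(d_loss.item(), g_loss.item())
```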
Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields
Title | Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields |
Authors | Patrick Ferdinand Christ, Mohamed Ezzeldin A. Elshaer, Florian Ettlinger, Sunil Tatavarty, Marc Bickel, Patrick Bilic, Markus Rempfler, Marco Armbruster, Felix Hofmann, Melvin D’Anastasi, Wieland H. Sommer, Seyed-Ahmad Ahmadi, Bjoern H. Menze |
Abstract | Automatic segmentation of the liver and its lesions is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment liver and lesions in CT abdomen images using cascaded fully convolutional neural networks (CFCNs) and dense 3D conditional random fields (CRFs). We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver as ROI input for a second FCN. The second FCN solely segments lesions from the predicted liver ROIs of step 1. We refine the segmentations of the CFCN using a dense 3D CRF that accounts for both spatial coherence and appearance. CFCN models were trained in a 2-fold cross-validation on the abdominal CT dataset 3DIRCAD comprising 15 hepatic tumor volumes. Our results show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for the liver with computation times below 100s per volume. We experimentally demonstrate the robustness of the proposed method as a decision support system with high accuracy and speed, suitable for use in daily clinical routine. |
Tasks | Lesion Segmentation |
Published | 2016-10-07 |
URL | http://arxiv.org/abs/1610.02177v1 |
PDF | http://arxiv.org/pdf/1610.02177v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-liver-and-lesion-segmentation-in-ct |
Repo | https://github.com/IBBM/Cascaded-FCN |
Framework | tf |
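The cascade described in the abstract is easy to see in pseudocode form: the first network produces a liver mask, that mask restricts the input of the second network, and the second network predicts lesions only inside the predicted liver. In the sketch below both "networks" are trivial threshold placeholders and the dense 3D CRF refinement is omitted.

```python
# Hedged sketch of a two-stage liver/lesion cascade (placeholder networks).
import numpy as np

def fcn_liver(ct_slice):
    return (ct_slice > 0.3).astype(np.uint8)          # placeholder liver mask

def fcn_lesion(masked_slice):
    return (masked_slice > 0.7).astype(np.uint8)      # placeholder lesion mask

def cascaded_segmentation(ct_slice):
    liver_mask = fcn_liver(ct_slice)                  # stage 1: liver ROI
    roi = ct_slice * liver_mask                       # zero out everything outside the liver
    lesion_mask = fcn_lesion(roi) * liver_mask        # stage 2: lesions within the liver only
    return liver_mask, lesion_mask

ct = np.random.rand(256, 256).astype(np.float32)      # stand-in for a CT slice
liver, lesions = cascaded_segmentation(ct)
print(liver.sum(), lesions.sum())
```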
SimpleDS: A Simple Deep Reinforcement Learning Dialogue System
Title | SimpleDS: A Simple Deep Reinforcement Learning Dialogue System |
Authors | Heriberto Cuayáhuitl |
Abstract | This paper presents ‘SimpleDS’, a simple and publicly available dialogue system trained with deep reinforcement learning. In contrast to previous reinforcement learning dialogue systems, this system avoids manual feature engineering by performing action selection directly from raw text of the last system and (noisy) user responses. Our initial results, in the restaurant domain, show that it is indeed possible to induce reasonable dialogue behaviour with an approach that aims for high levels of automation in dialogue control for intelligent interactive agents. |
Tasks | Feature Engineering |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04574v1 |
PDF | http://arxiv.org/pdf/1601.04574v1.pdf |
PWC | https://paperswithcode.com/paper/simpleds-a-simple-deep-reinforcement-learning |
Repo | https://github.com/cuayahuitl/SimpleDS |
Framework | none |
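The abstract's central claim is that action selection can be driven directly by the raw text of the last system and user turns, without hand-crafted features. The sketch below illustrates that idea with a bag-of-words state and a small Q-network under epsilon-greedy selection; the vocabulary, action set and epsilon are made up, and this is not the SimpleDS implementation.

```python
# Illustrative Q-network action selection from raw dialogue text.
import random
import torch
import torch.nn as nn

VOCAB = ["hello", "restaurant", "cheap", "food", "where", "thanks", "bye"]
ACTIONS = ["greet", "ask_food_type", "ask_price", "offer_restaurant", "goodbye"]

def bow(text):
    words = text.lower().split()
    return torch.tensor([[float(w in words) for w in VOCAB]])

q_net = nn.Sequential(nn.Linear(len(VOCAB), 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))

def select_action(system_text, user_text, epsilon=0.1):
    if random.random() < epsilon:                    # explore
        return random.randrange(len(ACTIONS))
    state = bow(system_text + " " + user_text)       # raw-text state, no feature engineering
    return q_net(state).argmax(dim=-1).item()        # exploit

a = select_action("hello how can I help", "cheap restaurant please")
print(ACTIONS[a])
```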
Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text
Title | Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text |
Authors | Raj Nath Patel, Prakash B. Pimpale, M Sasikumar |
Abstract | This paper describes Centre for Development of Advanced Computing’s (CDACM) submission to the shared task-‘Tool Contest on POS tagging for Code-Mixed Indian Social Media (Facebook, Twitter, and Whatsapp) Text’, collocated with ICON-2016. The shared task was to predict Part of Speech (POS) tag at word level for a given text. The code-mixed text is generated mostly on social media by multilingual users. The presence of the multilingual words, transliterations, and spelling variations make such content linguistically complex. In this paper, we propose an approach to POS tag code-mixed social media text using Recurrent Neural Network Language Model (RNN-LM) architecture. We submitted the results for Hindi-English (hi-en), Bengali-English (bn-en), and Telugu-English (te-en) code-mixed data. |
Tasks | Language Modelling |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04989v2 |
PDF | http://arxiv.org/pdf/1611.04989v2.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-based-part-of-speech |
Repo | https://github.com/patelrajnath/rnn4nlp |
Framework | pytorch |
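For readers new to neural POS tagging, the sketch below shows the general shape of an RNN tagger: embed each token, run a recurrent layer over the sentence, and predict one tag per token. The paper adapts an RNN language-model architecture; this bidirectional-LSTM variant and its sizes are only an illustration.

```python
# Minimal RNN-based POS tagger sketch (illustrative, not the paper's model).
import torch
import torch.nn as nn

class RNNTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb=64, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hid, n_tags)

    def forward(self, token_ids):
        h, _ = self.rnn(self.embed(token_ids))
        return self.out(h)                      # one tag distribution per token

tagger = RNNTagger(vocab_size=5000, n_tags=17)
sentence = torch.randint(0, 5000, (1, 6))        # e.g. a hi-en code-mixed tweet, as word ids
print(tagger(sentence).argmax(dim=-1))           # predicted tag ids (untrained)
```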
ArtTrack: Articulated Multi-person Tracking in the Wild
Title | ArtTrack: Articulated Multi-person Tracking in the Wild |
Authors | Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Evgeny Levinkov, Bjoern Andres, Bernt Schiele |
Abstract | In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos. Our starting point is a model that resembles existing architectures for single-frame pose estimation but is substantially faster. We achieve this in two ways: (1) by simplifying and sparsifying the body-part relationship graph and leveraging recent methods for faster inference, and (2) by offloading a substantial share of computation onto a feed-forward convolutional architecture that is able to detect and associate body joints of the same person even in clutter. We use this model to generate proposals for body joint locations and formulate articulated tracking as spatio-temporal grouping of such proposals. This allows us to jointly solve the association problem for all people in the scene by propagating evidence from strong detections through time and enforcing constraints that each proposal can be assigned to one person only. We report results on a public MPII Human Pose benchmark and on a new MPII Video Pose dataset of image sequences with multiple people. We demonstrate that our model achieves state-of-the-art results while using only a fraction of the time and is able to leverage temporal information to improve the state of the art for crowded scenes. |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01465v3 |
PDF | http://arxiv.org/pdf/1612.01465v3.pdf |
PWC | https://paperswithcode.com/paper/arttrack-articulated-multi-person-tracking-in |
Repo | https://github.com/PJunhyuk/exercise-pose-analyzer |
Framework | tf |
Query-Reduction Networks for Question Answering
Title | Query-Reduction Networks for Question Answering |
Authors | Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi |
Abstract | In this paper, we study the problem of question answering when reasoning over multiple facts is required. We propose Query-Reduction Network (QRN), a variant of Recurrent Neural Network (RNN) that effectively handles both short-term (local) and long-term (global) sequential dependencies to reason over multiple facts. QRN considers the context sentences as a sequence of state-changing triggers, and reduces the original query to a more informed query as it observes each trigger (context sentence) through time. Our experiments show that QRN produces state-of-the-art results on the bAbI QA and dialog tasks, and on a real goal-oriented dialog dataset. In addition, the QRN formulation allows parallelization along the RNN’s time axis, saving an order of magnitude in time complexity for training and inference. |
Tasks | Goal-Oriented Dialog, Question Answering |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04582v6 |
PDF | http://arxiv.org/pdf/1606.04582v6.pdf |
PWC | https://paperswithcode.com/paper/query-reduction-networks-for-question |
Repo | https://github.com/uwnlp/qrn |
Framework | tf |
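The abstract's description of "reducing" the query can be read as a gated recurrence over context sentences: each sentence either updates the query toward a more informed one or leaves it mostly unchanged. The simplified cell below follows that reading; the exact gating used in the paper differs in detail, and all sizes are illustrative.

```python
# Simplified QRN-style cell: the query is gradually "reduced" as each
# context sentence is observed (approximation of the paper's gating).
import torch
import torch.nn as nn

class QueryReductionCell(nn.Module):
    def __init__(self, d=32):
        super().__init__()
        self.update = nn.Linear(2 * d, 1)      # scalar update gate per sentence
        self.reduce = nn.Linear(2 * d, d)      # candidate reduced query

    def forward(self, sentences, query):
        h = query
        for x in sentences:                    # one context sentence at a time
            xq = torch.cat([x, h], dim=-1)
            z = torch.sigmoid(self.update(xq)) # how much this sentence matters
            cand = torch.tanh(self.reduce(xq))
            h = z * cand + (1 - z) * h         # reduced (updated) query
        return h

cell = QueryReductionCell()
context = [torch.randn(1, 32) for _ in range(5)]   # encoded story sentences
q0 = torch.randn(1, 32)                            # encoded question
print(cell(context, q0).shape)                     # the final reduced query
```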
Using Deep Learning for Image-Based Plant Disease Detection
Title | Using Deep Learning for Image-Based Plant Disease Detection |
Authors | Sharada Prasanna Mohanty, David Hughes, Marcel Salathe |
Abstract | Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 54,306 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 14 crop species and 26 diseases (or absence thereof). The trained model achieves an accuracy of 99.35% on a held-out test set, demonstrating the feasibility of this approach. When testing the model on a set of images collected from trusted online sources - i.e. taken under conditions different from the images used for training - the model still achieves an accuracy of 31.4%. While this accuracy is much higher than the one based on random selection (2.6%), a more diverse set of training data is needed to improve the general accuracy. Overall, the approach of training deep learning models on increasingly large and publicly available image datasets presents a clear path towards smartphone-assisted crop disease diagnosis on a massive global scale. |
Tasks | |
Published | 2016-04-11 |
URL | http://arxiv.org/abs/1604.03169v2 |
PDF | http://arxiv.org/pdf/1604.03169v2.pdf |
PWC | https://paperswithcode.com/paper/using-deep-learning-for-image-based-plant |
Repo | https://github.com/deepcpatel/GreenDoc |
Framework | tf |
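The training setup the abstract describes is standard supervised image classification: take an image-classification CNN and train (or fine-tune) it on leaf photos labeled with crop-disease classes. The sketch below shows one such training step; the paper itself used AlexNet/GoogLeNet in Caffe, so the ResNet-18 and the class count here are stand-ins.

```python
# Hedged sketch of one supervised training step for leaf-disease classification.
import torch
import torch.nn as nn
from torchvision import models

n_classes = 38                                      # crop-disease labels (assumed count)
net = models.resnet18(num_classes=n_classes)        # swap in pretrained weights in practice

optimizer = torch.optim.SGD(net.parameters(), lr=0.005, momentum=0.9)
criterion = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 224, 224)                # stand-in for leaf photos
labels = torch.randint(0, n_classes, (4,))

optimizer.zero_grad()
loss = criterion(net(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```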