May 7, 2019


Paper Group AWR 69


A Context-aware Natural Language Generator for Dialogue Systems

Title A Context-aware Natural Language Generator for Dialogue Systems
Authors Ondřej Dušek, Filip Jurčíček
Abstract We present a novel natural language generation system for spoken dialogue systems capable of entraining (adapting) to users’ way of speaking, providing contextually appropriate responses. The generator is based on recurrent neural networks and the sequence-to-sequence approach. It is fully trainable from data which include preceding context along with responses to be generated. We show that the context-aware generator yields significant improvements over the baseline in both automatic metrics and a human pairwise preference test.
Tasks Spoken Dialogue Systems, Text Generation
Published 2016-08-25
URL http://arxiv.org/abs/1608.07076v1
PDF http://arxiv.org/pdf/1608.07076v1.pdf
PWC https://paperswithcode.com/paper/a-context-aware-natural-language-generator
Repo https://github.com/UFAL-DSG/tgen
Framework tf
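
The entry above describes a seq2seq generator that conditions on the preceding user utterance. As a rough illustration only (the linked TGen repo is a TensorFlow system with its own architecture), here is a minimal PyTorch sketch in which the context utterance is simply prepended to the dialogue-act tokens before encoding; all layer sizes are arbitrary.

```python
# Minimal sketch of a context-aware seq2seq generator; NOT the authors' TGen code.
# Assumption: the previous user utterance is concatenated with the dialogue-act
# tokens and encoded as one sequence, roughly one of the paper's context variants.
import torch
import torch.nn as nn

class ContextSeq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, context_ids, da_ids, target_ids):
        # Encode [previous user utterance ; dialogue act] as a single sequence.
        enc_in = self.embed(torch.cat([context_ids, da_ids], dim=1))
        _, state = self.encoder(enc_in)
        # Decode the response conditioned on the encoder state (teacher forcing).
        dec_out, _ = self.decoder(self.embed(target_ids), state)
        return self.out(dec_out)  # logits over the output vocabulary
```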

Predicting Domain Generation Algorithms with Long Short-Term Memory Networks

Title Predicting Domain Generation Algorithms with Long Short-Term Memory Networks
Authors Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja, Daniel Grant
Abstract Various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to a command and control (C&C) server. In order to block DGA C&C traffic, security organizations must first discover the algorithm by reverse engineering malware samples, then generate a list of domains for a given seed. The domains are then either preregistered or published in a DNS blacklist. This process is not only tedious, but can be readily circumvented by malware authors using a large number of seeds in algorithms with multivariate recurrence properties (e.g., banjori) or by using a dynamic list of seeds (e.g., bedep). Another technique to stop malware from using DGAs is to intercept DNS queries on a network and predict whether domains are DGA-generated. Such a technique will alert network administrators to the presence of malware on their networks. In addition, if the predictor can also accurately predict the family of DGAs, then network administrators can also be alerted to the type of malware that is on their networks. This paper presents a DGA classifier that leverages long short-term memory (LSTM) networks to predict DGAs and their respective families without the need for a priori feature extraction. Results are significantly better than state-of-the-art techniques, providing 0.9993 area under the receiver operating characteristic curve for binary classification and a micro-averaged F1 score of 0.9906. In other words, the LSTM technique can provide a 90% detection rate at a 1:10000 false positive (FP) rate, a twentyfold FP improvement over comparable methods. Experiments in this paper are run on open datasets, and code snippets are provided to reproduce the results.
Tasks
Published 2016-11-02
URL http://arxiv.org/abs/1611.00791v1
PDF http://arxiv.org/pdf/1611.00791v1.pdf
PWC https://paperswithcode.com/paper/predicting-domain-generation-algorithms-with
Repo https://github.com/endgameinc/dga_predict
Framework none
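
To make the featureless setup above concrete, here is a minimal character-level LSTM classifier sketched in PyTorch. It is not the released dga_predict code; it only illustrates the idea: embed the raw characters of a domain name, run an LSTM, and classify the final hidden state as benign vs. DGA (or as a DGA family).

```python
# Character-level LSTM for DGA detection; an illustrative sketch, not the paper's code.
import torch
import torch.nn as nn

class DGAClassifier(nn.Module):
    def __init__(self, n_chars=128, emb_dim=32, hid_dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, n_classes)

    def forward(self, char_ids):           # char_ids: (batch, max_domain_len)
        _, (h_n, _) = self.lstm(self.embed(char_ids))
        return self.fc(h_n[-1])            # logits: benign vs. DGA (or DGA family)
```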

Harnessing Deep Neural Networks with Logic Rules

Title Harnessing Deep Neural Networks with Logic Rules
Authors Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric Xing
Abstract Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems.
Tasks Named Entity Recognition, Sentiment Analysis
Published 2016-03-21
URL http://arxiv.org/abs/1603.06318v5
PDF http://arxiv.org/pdf/1603.06318v5.pdf
PWC https://paperswithcode.com/paper/harnessing-deep-neural-networks-with-logic
Repo https://github.com/ZhitingHu/logicnn
Framework none
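
The distillation objective described above can be written down compactly. The sketch below only shows how the ground-truth loss is balanced against imitation of the rule-regularized teacher; the construction of the teacher distribution itself (projecting the student's predictions onto the rule-satisfying subspace) is omitted, and the balancing parameter pi, which the paper anneals over training, is left fixed here.

```python
# Rule-distillation objective sketch: the student imitates both the true labels and
# a rule-constrained "teacher" distribution (teacher construction not shown here).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, labels, pi=0.5):
    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # Cross-entropy against the soft, rule-regularized teacher predictions.
    log_p = F.log_softmax(student_logits, dim=-1)
    kd = -(teacher_probs * log_p).sum(dim=-1).mean()
    # pi balances imitating the teacher vs. fitting the data (annealed in the paper).
    return (1.0 - pi) * ce + pi * kd
```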

Globally Normalized Transition-Based Neural Networks

Title Globally Normalized Transition-Based Neural Networks
Authors Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins
Abstract We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models. We discuss the importance of global as opposed to local normalization: a key insight is that the label bias problem implies that globally normalized models can be strictly more expressive than locally normalized models.
Tasks Dependency Parsing, Part-Of-Speech Tagging, Sentence Compression
Published 2016-03-19
URL http://arxiv.org/abs/1603.06042v2
PDF http://arxiv.org/pdf/1603.06042v2.pdf
PWC https://paperswithcode.com/paper/globally-normalized-transition-based-neural
Repo https://github.com/tensorflow/models/tree/master/research/syntaxnet
Framework tf
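
To see why global normalization matters, compare the two toy objectives below: a locally normalized model applies a softmax at every transition, while a globally normalized model sums unnormalized scores along whole decision sequences and normalizes once over a candidate set (a small explicit list here, standing in for the beam used in the paper). This is only a schematic, not the SyntaxNet training code.

```python
# Local vs. global normalization over a sequence of transition decisions (toy sketch).
import torch

def local_log_prob(step_scores, decisions):
    # step_scores: (T, n_actions) unnormalized scores; decisions: (T,) gold action ids.
    log_probs = torch.log_softmax(step_scores, dim=-1)
    return log_probs[torch.arange(len(decisions)), decisions].sum()

def global_log_prob(step_scores, decisions, candidate_sequences):
    # candidate_sequences: list of (T,) action-id tensors, assumed to include the gold
    # sequence (as the beam does in the paper's CRF-style objective).
    gold = step_scores[torch.arange(len(decisions)), decisions].sum()
    cand = torch.stack([
        step_scores[torch.arange(step_scores.size(0)), seq].sum()
        for seq in candidate_sequences
    ])
    return gold - torch.logsumexp(cand, dim=0)  # sequence-level normalization
```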

Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data

Title Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
Authors Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt
Abstract We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions via variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies such as image sequences without domain knowledge. Our experiments show that enabling backpropagation through transitions enforces state space assumptions and significantly improves information content of the latent embedding. This also enables realistic long-term prediction.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06432v3
PDF http://arxiv.org/pdf/1605.06432v3.pdf
PWC https://paperswithcode.com/paper/deep-variational-bayes-filters-unsupervised
Repo https://github.com/baggepinnen/DVBF.jl
Framework none
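
The phrase "backpropagation through transitions" can be made concrete with a small sketch: the latent state is advanced by a deterministic transition through which gradients flow, and stochasticity enters only as reparameterized process noise. A single fixed linear transition is used below as a stand-in for the paper's locally linear, state-dependent parameterization; the recognition and decoder networks are not shown.

```python
# Schematic DVBF-style transition; dimensions and parameterization are placeholders.
import torch
import torch.nn as nn

class LinearTransition(nn.Module):
    def __init__(self, z_dim, u_dim, w_dim):
        super().__init__()
        self.A = nn.Linear(z_dim, z_dim, bias=False)
        self.B = nn.Linear(u_dim, z_dim, bias=False)
        self.C = nn.Linear(w_dim, z_dim, bias=False)

    def forward(self, z_t, u_t, mu_w, logvar_w):
        # Reparameterization trick: sample the process noise while keeping gradients.
        w_t = mu_w + torch.exp(0.5 * logvar_w) * torch.randn_like(mu_w)
        # Deterministic transition given the sampled noise: gradients flow through time.
        return self.A(z_t) + self.B(u_t) + self.C(w_t)
```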

Theoretical Robopsychology: Samu Has Learned Turing Machines

Title Theoretical Robopsychology: Samu Has Learned Turing Machines
Authors Norbert Bátfai
Abstract From a programmer’s point of view, robopsychology is a synonym for the activity developers carry out when implementing their machine learning applications. This robopsychological approach raises some fundamental theoretical questions about machine learning. Our discussion of these questions is restricted to Turing machines. Alan Turing gave an algorithm (the Turing machine) for describing algorithms; applying it to describe itself leads to Turing’s notion of the universal machine. In the present paper, we investigate algorithms that write algorithms. From a pedagogical point of view, this way of writing programs can be seen as a combination of learning by listening and learning by doing, because it is based on applying agent technology and machine learning. As the main result, we introduce the problem of learning and show that it cannot easily be handled in practice, so it is reasonable to use machine learning algorithms to learn Turing machines.
Tasks
Published 2016-06-08
URL http://arxiv.org/abs/1606.02767v2
PDF http://arxiv.org/pdf/1606.02767v2.pdf
PWC https://paperswithcode.com/paper/theoretical-robopsychology-samu-has-learned
Repo https://github.com/nbatfai/SamuCTuring
Framework none

On-Demand Learning for Deep Image Restoration

Title On-Demand Learning for Deep Image Restoration
Authors Ruohan Gao, Kristen Grauman
Abstract While machine learning approaches to image restoration offer great promise, current methods risk training models fixated on performing well only for image corruption of a particular level of difficulty—such as a certain level of noise or blur. First, we examine the weakness of conventional “fixated” models and demonstrate that training general models to handle arbitrary levels of corruption is indeed non-trivial. Then, we propose an on-demand learning algorithm for training image restoration models with deep convolutional neural networks. The main idea is to exploit a feedback mechanism to self-generate training instances where they are needed most, thereby learning models that can generalize across difficulty levels. On four restoration tasks—image inpainting, pixel interpolation, image deblurring, and image denoising—and three diverse datasets, our approach consistently outperforms both the status quo training procedure and curriculum learning alternatives.
Tasks Deblurring, Denoising, Image Denoising, Image Inpainting, Image Restoration
Published 2016-12-05
URL http://arxiv.org/abs/1612.01380v3
PDF http://arxiv.org/pdf/1612.01380v3.pdf
PWC https://paperswithcode.com/paper/on-demand-learning-for-deep-image-restoration
Repo https://github.com/rhgao/on-demand-learning
Framework torch
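
The feedback mechanism can be sketched in a few lines: corruption difficulty is split into bins, and the share of self-generated training examples drawn from each bin is re-allocated in proportion to the model's current error on that bin. The paper measures per-bin quality with PSNR; a generic per-bin loss and an arbitrary bin count are used here as stand-ins.

```python
# On-demand sampling sketch: harder (higher-loss) difficulty bins get more examples.
import numpy as np

def allocate_batch(per_bin_loss, batch_size):
    weights = np.asarray(per_bin_loss, dtype=float)
    weights = weights / weights.sum()                 # normalize losses into proportions
    bins = np.random.choice(len(weights), size=batch_size, p=weights)
    return np.bincount(bins, minlength=len(weights))  # samples to generate per bin

# Example: four difficulty bins; the hardest bin currently has the largest loss.
print(allocate_batch([0.1, 0.2, 0.3, 0.6], batch_size=64))
```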

Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks

Title Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks
Authors Michael Bukatin, Steve Matthews, Andrey Radul
Abstract Dataflow matrix machines are a powerful generalization of recurrent neural networks. They work with multiple types of arbitrary linear streams and multiple types of powerful neurons, and they make it possible to incorporate higher-order constructions. We expect them to be useful in machine learning and probabilistic programming, and in the synthesis of dynamic systems and of deterministic and probabilistic programs.
Tasks Probabilistic Programming
Published 2016-03-29
URL http://arxiv.org/abs/1603.09002v2
PDF http://arxiv.org/pdf/1603.09002v2.pdf
PWC https://paperswithcode.com/paper/dataflow-matrix-machines-as-a-generalization
Repo https://github.com/anhinga/fluid
Framework none

Least Squares Generative Adversarial Networks

Title Least Squares Generative Adversarial Networks
Authors Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, Stephen Paul Smolley
Abstract Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome this problem, we propose the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN amounts to minimizing the Pearson $\chi^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs are more stable during the learning process. We evaluate LSGANs on five scene datasets, and the experimental results show that the images generated by LSGANs are of better quality than those generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
Tasks
Published 2016-11-13
URL http://arxiv.org/abs/1611.04076v3
PDF http://arxiv.org/pdf/1611.04076v3.pdf
PWC https://paperswithcode.com/paper/least-squares-generative-adversarial-networks
Repo https://github.com/eriklindernoren/PyTorch-GAN
Framework pytorch
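
The least squares losses stated in the abstract are easy to write down. The sketch below uses the common 0/1 target coding for fake/real discriminator outputs and assumes the discriminator emits raw scores rather than sigmoid probabilities; it is not tied to the linked PyTorch-GAN implementation.

```python
# LSGAN objectives with targets a=0 (fake), b=1 (real), c=1 (desired) for raw D scores.
import torch

def d_loss_lsgan(d_real, d_fake):
    # Pull real scores toward 1 and fake scores toward 0 with a least-squares penalty.
    return 0.5 * ((d_real - 1.0) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

def g_loss_lsgan(d_fake):
    # The generator pushes the discriminator's score on fakes toward 1.
    return 0.5 * ((d_fake - 1.0) ** 2).mean()
```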

Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields

Title Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields
Authors Patrick Ferdinand Christ, Mohamed Ezzeldin A. Elshaer, Florian Ettlinger, Sunil Tatavarty, Marc Bickel, Patrick Bilic, Markus Rempfler, Marco Armbruster, Felix Hofmann, Melvin D’Anastasi, Wieland H. Sommer, Seyed-Ahmad Ahmadi, Bjoern H. Menze
Abstract Automatic segmentation of the liver and its lesion is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment liver and lesions in CT abdomen images using cascaded fully convolutional neural networks (CFCNs) and dense 3D conditional random fields (CRFs). We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver as ROI input for a second FCN. The second FCN solely segments lesions from the predicted liver ROIs of step 1. We refine the segmentations of the CFCN using a dense 3D CRF that accounts for both spatial coherence and appearance. CFCN models were trained in a 2-fold cross-validation on the abdominal CT dataset 3DIRCAD comprising 15 hepatic tumor volumes. Our results show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for liver with computation times below 100s per volume. We experimentally demonstrate the robustness of the proposed method as a decision support system with a high accuracy and speed for usage in daily clinical routine.
Tasks Lesion Segmentation
Published 2016-10-07
URL http://arxiv.org/abs/1610.02177v1
PDF http://arxiv.org/pdf/1610.02177v1.pdf
PWC https://paperswithcode.com/paper/automatic-liver-and-lesion-segmentation-in-ct
Repo https://github.com/IBBM/Cascaded-FCN
Framework tf
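
The cascade itself is simple to sketch: the first network produces a liver mask, and the second network only sees (and is only scored inside) that ROI. Both FCNs below are placeholders, and the dense 3D CRF refinement step is omitted.

```python
# Two-stage cascade sketch: liver FCN -> ROI mask -> lesion FCN restricted to the ROI.
import torch
import torch.nn as nn  # the liver_fcn / lesion_fcn modules are assumed, not defined here

def cascaded_segmentation(ct_slice, liver_fcn, lesion_fcn):
    # Stage 1: liver probability map -> binary liver ROI.
    liver_prob = torch.sigmoid(liver_fcn(ct_slice))
    liver_mask = (liver_prob > 0.5).float()
    # Stage 2: lesions are only predicted inside the liver ROI.
    lesion_prob = torch.sigmoid(lesion_fcn(ct_slice * liver_mask))
    return liver_mask, lesion_prob * liver_mask
```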

SimpleDS: A Simple Deep Reinforcement Learning Dialogue System

Title SimpleDS: A Simple Deep Reinforcement Learning Dialogue System
Authors Heriberto Cuayáhuitl
Abstract This paper presents ‘SimpleDS’, a simple and publicly available dialogue system trained with deep reinforcement learning. In contrast to previous reinforcement learning dialogue systems, this system avoids manual feature engineering by performing action selection directly from raw text of the last system and (noisy) user responses. Our initial results, in the restaurant domain, show that it is indeed possible to induce reasonable dialogue behaviour with an approach that aims for high levels of automation in dialogue control for intelligent interactive agents.
Tasks Feature Engineering
Published 2016-01-18
URL http://arxiv.org/abs/1601.04574v1
PDF http://arxiv.org/pdf/1601.04574v1.pdf
PWC https://paperswithcode.com/paper/simpleds-a-simple-deep-reinforcement-learning
Repo https://github.com/cuayahuitl/SimpleDS
Framework none
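
Since the released SimpleDS code is not a Python system, the sketch below is only a Python illustration of the underlying idea: a Q-network maps a bag-of-words representation of the last system and user turns directly to action values, and actions are chosen epsilon-greedily as in standard deep Q-learning. The state representation and layer sizes are assumptions, not the authors' setup.

```python
# Value-based dialogue action selection from raw-text features (illustrative only).
import torch
import torch.nn as nn

class DialogueQNet(nn.Module):
    def __init__(self, vocab_size, n_actions, hid_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, n_actions),
        )

    def act(self, bow_state, epsilon=0.1):
        # Epsilon-greedy selection over Q-values, as in standard deep Q-learning.
        if torch.rand(1).item() < epsilon:
            return torch.randint(0, self.net[-1].out_features, (1,)).item()
        with torch.no_grad():
            return self.net(bow_state).argmax().item()
```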

Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text

Title Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text
Authors Raj Nath Patel, Prakash B. Pimpale, M Sasikumar
Abstract This paper describes the Centre for Development of Advanced Computing’s (CDACM) submission to the shared task ‘Tool Contest on POS tagging for Code-Mixed Indian Social Media (Facebook, Twitter, and WhatsApp) Text’, collocated with ICON-2016. The shared task was to predict the Part-of-Speech (POS) tag at the word level for a given text. Code-mixed text is generated mostly on social media by multilingual users. The presence of multilingual words, transliterations, and spelling variations makes such content linguistically complex. In this paper, we propose an approach to POS tagging of code-mixed social media text using a Recurrent Neural Network Language Model (RNN-LM) architecture. We submitted results for Hindi-English (hi-en), Bengali-English (bn-en), and Telugu-English (te-en) code-mixed data.
Tasks Language Modelling
Published 2016-11-15
URL http://arxiv.org/abs/1611.04989v2
PDF http://arxiv.org/pdf/1611.04989v2.pdf
PWC https://paperswithcode.com/paper/recurrent-neural-network-based-part-of-speech
Repo https://github.com/patelrajnath/rnn4nlp
Framework pytorch
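
As a generic illustration of an RNN sequence tagger for this kind of data (the paper adapts an RNN-LM architecture; the bidirectional GRU and layer sizes below are arbitrary choices, not the authors' configuration):

```python
# Generic RNN tagger sketch: embed tokens, run a BiGRU, predict a tag per token.
import torch
import torch.nn as nn

class RNNTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hid_dim, n_tags)

    def forward(self, word_ids):            # word_ids: (batch, seq_len)
        h, _ = self.rnn(self.embed(word_ids))
        return self.out(h)                   # per-token tag logits
```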

ArtTrack: Articulated Multi-person Tracking in the Wild

Title ArtTrack: Articulated Multi-person Tracking in the Wild
Authors Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Evgeny Levinkov, Bjoern Andres, Bernt Schiele
Abstract In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos. Our starting point is a model that resembles existing architectures for single-frame pose estimation but is substantially faster. We achieve this in two ways: (1) by simplifying and sparsifying the body-part relationship graph and leveraging recent methods for faster inference, and (2) by offloading a substantial share of computation onto a feed-forward convolutional architecture that is able to detect and associate body joints of the same person even in clutter. We use this model to generate proposals for body joint locations and formulate articulated tracking as spatio-temporal grouping of such proposals. This allows us to jointly solve the association problem for all people in the scene by propagating evidence from strong detections through time and enforcing constraints that each proposal can be assigned to one person only. We report results on the public MPII Human Pose benchmark and on a new MPII Video Pose dataset of image sequences with multiple people. We demonstrate that our model achieves state-of-the-art results while using only a fraction of the time, and is able to leverage temporal information to improve on the state of the art for crowded scenes.
Tasks Multi-Person Pose Estimation, Pose Estimation
Published 2016-12-05
URL http://arxiv.org/abs/1612.01465v3
PDF http://arxiv.org/pdf/1612.01465v3.pdf
PWC https://paperswithcode.com/paper/arttrack-articulated-multi-person-tracking-in
Repo https://github.com/PJunhyuk/exercise-pose-analyzer
Framework tf

Query-Reduction Networks for Question Answering

Title Query-Reduction Networks for Question Answering
Authors Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
Abstract In this paper, we study the problem of question answering when reasoning over multiple facts is required. We propose the Query-Reduction Network (QRN), a variant of the Recurrent Neural Network (RNN) that effectively handles both short-term (local) and long-term (global) sequential dependencies to reason over multiple facts. QRN treats the context sentences as a sequence of state-changing triggers, and reduces the original query to a more informed query as it observes each trigger (context sentence) through time. Our experiments show that QRN produces state-of-the-art results on the bAbI QA and dialog tasks, and on a real goal-oriented dialog dataset. In addition, the QRN formulation allows parallelization over the RNN’s time axis, saving an order of magnitude in time complexity for training and inference.
Tasks Goal-Oriented Dialog, Question Answering
Published 2016-06-14
URL http://arxiv.org/abs/1606.04582v6
PDF http://arxiv.org/pdf/1606.04582v6.pdf
PWC https://paperswithcode.com/paper/query-reduction-networks-for-question
Repo https://github.com/uwnlp/qrn
Framework tf
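
The query-reduction step can be sketched as a small gated cell: each context sentence acts as a trigger that either updates or preserves the current (reduced) query vector. The gate and reduction parameterizations below are schematic and do not reproduce the paper's exact equations or its multi-layer, parallelized formulation.

```python
# Simplified single-layer QRN-style cell (schematic, not the paper's exact model).
import torch
import torch.nn as nn

class QueryReductionCell(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 1)
        self.reduce = nn.Linear(2 * dim, dim)

    def forward(self, sentences, query):     # sentences: (T, dim), query: (dim,)
        h = query
        for x_t in sentences:
            pair = torch.cat([x_t, h])
            z = torch.sigmoid(self.gate(pair))        # how relevant is this trigger?
            h_tilde = torch.tanh(self.reduce(pair))   # candidate reduced query
            h = z * h_tilde + (1.0 - z) * h           # update or keep the query
        return h                                      # final reduced query
```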

Using Deep Learning for Image-Based Plant Disease Detection

Title Using Deep Learning for Image-Based Plant Disease Detection
Authors Sharada Prasanna Mohanty, David Hughes, Marcel Salathe
Abstract Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 54,306 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 14 crop species and 26 diseases (or absence thereof). The trained model achieves an accuracy of 99.35% on a held-out test set, demonstrating the feasibility of this approach. When testing the model on a set of images collected from trusted online sources - i.e. taken under conditions different from the images used for training - the model still achieves an accuracy of 31.4%. While this accuracy is much higher than the one based on random selection (2.6%), a more diverse set of training data is needed to improve the general accuracy. Overall, the approach of training deep learning models on increasingly large and publicly available image datasets presents a clear path towards smartphone-assisted crop disease diagnosis on a massive global scale.
Tasks
Published 2016-04-11
URL http://arxiv.org/abs/1604.03169v2
PDF http://arxiv.org/pdf/1604.03169v2.pdf
PWC https://paperswithcode.com/paper/using-deep-learning-for-image-based-plant
Repo https://github.com/deepcpatel/GreenDoc
Framework tf
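
A transfer-learning baseline in this spirit takes only a few lines. The paper fine-tuned AlexNet and GoogLeNet; the sketch below substitutes an ImageNet-pretrained torchvision ResNet-18 backbone (an assumption, not the authors' setup) and replaces its final layer with the 38 crop-disease class labels of the PlantVillage data.

```python
# Fine-tuning sketch for crop-disease classification; torchvision is assumed.
import torch.nn as nn
from torchvision import models

def build_classifier(n_classes=38):
    model = models.resnet18(weights="IMAGENET1K_V1")   # any pretrained backbone works
    model.fc = nn.Linear(model.fc.in_features, n_classes)
    return model

model = build_classifier()  # then fine-tune on the leaf images
```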