Paper Group AWR 69
A Context-aware Natural Language Generator for Dialogue Systems. Predicting Domain Generation Algorithms with Long Short-Term Memory Networks. Harnessing Deep Neural Networks with Logic Rules. Globally Normalized Transition-Based Neural Networks. Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data. Theoretical Robopsychology: Samu Has Learned Turing Machines. On-Demand Learning for Deep Image Restoration. Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks. Least Squares Generative Adversarial Networks. Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields. SimpleDS: A Simple Deep Reinforcement Learning Dialogue System. Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text. ArtTrack: Articulated Multi-person Tracking in the Wild. Query-Reduction Networks for Question Answering. Using Deep Learning for Image-Based Plant Disease Detection.
A Context-aware Natural Language Generator for Dialogue Systems
Title | A Context-aware Natural Language Generator for Dialogue Systems |
Authors | Ondřej Dušek, Filip Jurčíček |
Abstract | We present a novel natural language generation system for spoken dialogue systems capable of entraining (adapting) to users’ way of speaking, providing contextually appropriate responses. The generator is based on recurrent neural networks and the sequence-to-sequence approach. It is fully trainable from data which include preceding context along with responses to be generated. We show that the context-aware generator yields significant improvements over the baseline in both automatic metrics and a human pairwise preference test. |
Tasks | Spoken Dialogue Systems, Text Generation |
Published | 2016-08-25 |
URL | http://arxiv.org/abs/1608.07076v1 |
PDF | http://arxiv.org/pdf/1608.07076v1.pdf |
PWC | https://paperswithcode.com/paper/a-context-aware-natural-language-generator |
Repo | https://github.com/UFAL-DSG/tgen |
Framework | tf |
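As a concrete illustration of the idea in the abstract, the sketch below shows a sequence-to-sequence generator whose encoder reads the preceding user utterance together with the dialogue-act tokens, so the decoder can entrain to the user's wording. This is not the authors' TGen code; all layer sizes and variable names are illustrative.

```python
# Hedged sketch of a context-aware seq2seq generator (not the authors' TGen code):
# the preceding user utterance is prepended to the dialogue-act tokens so the
# decoder can reuse ("entrain to") the user's wording.
import torch
import torch.nn as nn

class ContextAwareGenerator(nn.Module):
    def __init__(self, vocab_size, emb=64, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, context_ids, da_ids, response_ids):
        # Encoder sees [preceding user utterance ; dialogue-act tokens].
        enc_in = self.embed(torch.cat([context_ids, da_ids], dim=1))
        _, state = self.encoder(enc_in)
        # Decoder is initialized with the context-aware encoder state.
        dec_out, _ = self.decoder(self.embed(response_ids), state)
        return self.out(dec_out)          # logits over the output vocabulary

vocab = 1000
model = ContextAwareGenerator(vocab)
ctx = torch.randint(0, vocab, (2, 7))     # preceding user utterance tokens
da = torch.randint(0, vocab, (2, 5))      # dialogue-act tokens
resp = torch.randint(0, vocab, (2, 9))    # gold response (teacher forcing)
print(model(ctx, da, resp).shape)         # torch.Size([2, 9, 1000])
```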
Predicting Domain Generation Algorithms with Long Short-Term Memory Networks
Title | Predicting Domain Generation Algorithms with Long Short-Term Memory Networks |
Authors | Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja, Daniel Grant |
Abstract | Various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to a command and control (C&C) server. In order to block DGA C&C traffic, security organizations must first discover the algorithm by reverse engineering malware samples, and then generate a list of domains for a given seed. The domains are then either preregistered or published in a DNS blacklist. This process is not only tedious, but can be readily circumvented by malware authors using a large number of seeds in algorithms with multivariate recurrence properties (e.g., banjori) or by using a dynamic list of seeds (e.g., bedep). Another technique to stop malware from using DGAs is to intercept DNS queries on a network and predict whether domains are DGA-generated. Such a technique will alert network administrators to the presence of malware on their networks. In addition, if the predictor can also accurately predict the family of DGAs, then network administrators can also be alerted to the type of malware that is on their networks. This paper presents a DGA classifier that leverages long short-term memory (LSTM) networks to predict DGAs and their respective families without the need for a priori feature extraction. Results are significantly better than state-of-the-art techniques, providing 0.9993 area under the receiver operating characteristic curve for binary classification and a micro-averaged F1 score of 0.9906. In other words, the LSTM technique can provide a 90% detection rate with a 1:10000 false positive (FP) rate—a twenty times FP improvement over comparable methods. Experiments in this paper are run on open datasets and code snippets are provided to reproduce the results. |
Tasks | |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00791v1 |
PDF | http://arxiv.org/pdf/1611.00791v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-domain-generation-algorithms-with |
Repo | https://github.com/endgameinc/dga_predict |
Framework | none |
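The abstract's key point is that the classifier works on raw domain strings with no hand-crafted features: characters go through an embedding, an LSTM, and a sigmoid output. The sketch below illustrates that pipeline; the hyperparameters are illustrative and not taken from the authors' released Keras code.

```python
# Minimal sketch of a character-level LSTM DGA detector
# (embedding -> LSTM -> sigmoid), with made-up hyperparameters.
import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789-_."
CHAR2ID = {c: i + 1 for i, c in enumerate(CHARS)}   # 0 is padding

def encode(domain, max_len=64):
    ids = [CHAR2ID.get(c, 0) for c in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

class DGAClassifier(nn.Module):
    def __init__(self, vocab=len(CHARS) + 1, emb=32, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return torch.sigmoid(self.head(h[:, -1]))   # P(domain is DGA-generated)

model = DGAClassifier()
batch = torch.tensor([encode("google.com"), encode("xjkqpwzrtyvb.net")])
print(model(batch).squeeze(-1))   # untrained scores, one per domain
```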
Harnessing Deep Neural Networks with Logic Rules
Title | Harnessing Deep Neural Networks with Logic Rules |
Authors | Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric Xing |
Abstract | Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems. |
Tasks | Named Entity Recognition, Sentiment Analysis |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06318v5 |
PDF | http://arxiv.org/pdf/1603.06318v5.pdf |
PWC | https://paperswithcode.com/paper/harnessing-deep-neural-networks-with-logic |
Repo | https://github.com/ZhitingHu/logicnn |
Framework | none |
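The core of the iterative distillation described in the abstract is a mixed objective: the student network imitates both the ground-truth labels and a rule-regularized "teacher" distribution. The sketch below shows that objective; constructing the teacher from first-order logic rules is paper-specific and only stubbed here, and the mixing weight `pi` is illustrative.

```python
# Hedged sketch of a rule-distillation objective:
# (1 - pi) * CE(labels, student) + pi * CE(teacher, student)
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, labels, teacher_probs, pi=0.6):
    hard = F.cross_entropy(student_logits, labels)          # fit the true labels
    log_p = F.log_softmax(student_logits, dim=-1)
    soft = -(teacher_probs * log_p).sum(dim=-1).mean()      # imitate the rule-projected teacher
    return (1.0 - pi) * hard + pi * soft

logits = torch.randn(4, 3, requires_grad=True)              # student outputs
labels = torch.tensor([0, 2, 1, 0])
teacher = F.softmax(torch.randn(4, 3), dim=-1)              # stand-in for the rule-projected teacher
loss = distillation_loss(logits, labels, teacher)
loss.backward()
print(loss.item())
```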
Globally Normalized Transition-Based Neural Networks
Title | Globally Normalized Transition-Based Neural Networks |
Authors | Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins |
Abstract | We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models. We discuss the importance of global as opposed to local normalization: a key insight is that the label bias problem implies that globally normalized models can be strictly more expressive than locally normalized models. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging, Sentence Compression |
Published | 2016-03-19 |
URL | http://arxiv.org/abs/1603.06042v2 |
PDF | http://arxiv.org/pdf/1603.06042v2.pdf |
PWC | https://paperswithcode.com/paper/globally-normalized-transition-based-neural |
Repo | https://github.com/tensorflow/models/tree/master/research/syntaxnet |
Framework | tf |
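The contrast the abstract draws is between normalizing each transition decision locally (a per-step softmax) and normalizing over whole decision sequences (a CRF-style softmax over beam hypotheses). The toy sketch below makes that distinction concrete; the beam contents and scores are made up and this is not the SyntaxNet implementation.

```python
# Local vs. global normalization for a transition system (toy values).
import torch

def local_loss(step_scores, gold):
    """Per-step softmax: each decision is normalized independently."""
    return -torch.log_softmax(step_scores, dim=-1)[torch.arange(len(gold)), gold].sum()

def global_loss(sequence_scores, gold_idx):
    """CRF-style: one softmax over whole candidate decision sequences."""
    return -torch.log_softmax(sequence_scores, dim=-1)[gold_idx]

step_scores = torch.randn(5, 8)                 # 5 transitions, 8 possible actions each
gold_actions = torch.tensor([1, 0, 3, 2, 7])
print(local_loss(step_scores, gold_actions))

beam_scores = torch.randn(16)                   # summed scores of 16 beam hypotheses
beam_scores[0] = step_scores[torch.arange(5), gold_actions].sum()  # gold hypothesis
print(global_loss(beam_scores, gold_idx=0))
```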
Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
Title | Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data |
Authors | Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt |
Abstract | We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions via variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies such as image sequences without domain knowledge. Our experiments show that enabling backpropagation through transitions enforces state space assumptions and significantly improves information content of the latent embedding. This also enables realistic long-term prediction. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06432v3 |
PDF | http://arxiv.org/pdf/1605.06432v3.pdf |
PWC | https://paperswithcode.com/paper/deep-variational-bayes-filters-unsupervised |
Repo | https://github.com/baggepinnen/DVBF.jl |
Framework | none |
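The abstract's point about "backpropagation through transitions" is that the sampled stochastic variable enters the model only through the latent dynamics, so gradients must flow through the state-space transition. The sketch below shows one such reparameterized latent transition step; the dimensions, the linear form of the transition, and the absence of a recognition network are all simplifying assumptions.

```python
# Hedged sketch of one DVBF-style latent transition with the reparameterization trick.
import torch
import torch.nn as nn

class LatentTransition(nn.Module):
    def __init__(self, z_dim=4, u_dim=2, w_dim=4):
        super().__init__()
        self.A = nn.Linear(z_dim, z_dim, bias=False)
        self.B = nn.Linear(u_dim, z_dim, bias=False)
        self.C = nn.Linear(w_dim, z_dim, bias=False)

    def forward(self, z, u, w_mean, w_logvar):
        # w ~ N(w_mean, exp(w_logvar)); gradients flow through the dynamics only.
        w = w_mean + torch.exp(0.5 * w_logvar) * torch.randn_like(w_mean)
        return self.A(z) + self.B(u) + self.C(w)     # z_{t+1}

trans = LatentTransition()
z = torch.zeros(1, 4)
u = torch.randn(1, 2)                                 # control input
w_mean, w_logvar = torch.zeros(1, 4), torch.zeros(1, 4)
for _ in range(3):                                    # roll the latent state forward
    z = trans(z, u, w_mean, w_logvar)
print(z)
```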
Theoretical Robopsychology: Samu Has Learned Turing Machines
Title | Theoretical Robopsychology: Samu Has Learned Turing Machines |
Authors | Norbert Bátfai |
Abstract | From the point of view of a programmer, robopsychology is a synonym for the activity performed by developers to implement their machine learning applications. This robopsychological approach raises some fundamental theoretical questions of machine learning. Our discussion of these questions is constrained to Turing machines. Alan Turing gave an algorithm (aka the Turing Machine) to describe algorithms. If it is applied to describe itself, this brings us to Turing’s notion of the universal machine. In the present paper, we investigate algorithms to write algorithms. From a pedagogical point of view, this way of writing programs can be considered a combination of learning by listening and learning by doing, because it is based on applying agent technology and machine learning. As the main result, we introduce the problem of learning and show that it cannot easily be handled in reality; it is therefore reasonable to use machine learning algorithms for learning Turing machines. |
Tasks | |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02767v2 |
PDF | http://arxiv.org/pdf/1606.02767v2.pdf |
PWC | https://paperswithcode.com/paper/theoretical-robopsychology-samu-has-learned |
Repo | https://github.com/nbatfai/SamuCTuring |
Framework | none |
On-Demand Learning for Deep Image Restoration
Title | On-Demand Learning for Deep Image Restoration |
Authors | Ruohan Gao, Kristen Grauman |
Abstract | While machine learning approaches to image restoration offer great promise, current methods risk training models fixated on performing well only for image corruption of a particular level of difficulty—such as a certain level of noise or blur. First, we examine the weakness of conventional “fixated” models and demonstrate that training general models to handle arbitrary levels of corruption is indeed non-trivial. Then, we propose an on-demand learning algorithm for training image restoration models with deep convolutional neural networks. The main idea is to exploit a feedback mechanism to self-generate training instances where they are needed most, thereby learning models that can generalize across difficulty levels. On four restoration tasks—image inpainting, pixel interpolation, image deblurring, and image denoising—and three diverse datasets, our approach consistently outperforms both the status quo training procedure and curriculum learning alternatives. |
Tasks | Deblurring, Denoising, Image Denoising, Image Inpainting, Image Restoration |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01380v3 |
PDF | http://arxiv.org/pdf/1612.01380v3.pdf |
PWC | https://paperswithcode.com/paper/on-demand-learning-for-deep-image-restoration |
Repo | https://github.com/rhgao/on-demand-learning |
Framework | torch |
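The "feedback mechanism" in the abstract boils down to an allocation rule: training examples for the next round are drawn from each corruption-difficulty bin in proportion to how poorly the model currently restores that bin. The sketch below illustrates one such inverse-performance allocation; the PSNR numbers are made up, and the exact rule in the paper may differ.

```python
# Sketch of an on-demand allocation rule: more training batches for
# difficulty bins the model currently handles badly (low PSNR).
import numpy as np

def allocate_batches(psnr_per_bin, total_batches=100):
    psnr = np.asarray(psnr_per_bin, dtype=float)
    need = 1.0 / psnr                      # inverse performance as "demand"
    share = need / need.sum()
    return np.round(share * total_batches).astype(int)

# Five difficulty levels, from light to heavy corruption (validation PSNR, dB).
current_psnr = [34.1, 31.5, 28.0, 24.2, 20.6]
print(allocate_batches(current_psnr))      # heavier corruption gets more batches
```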
Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks
Title | Dataflow Matrix Machines as a Generalization of Recurrent Neural Networks |
Authors | Michael Bukatin, Steve Matthews, Andrey Radul |
Abstract | Dataflow matrix machines are a powerful generalization of recurrent neural networks. They work with multiple types of arbitrary linear streams and multiple types of powerful neurons, and allow one to incorporate higher-order constructions. We expect them to be useful in machine learning and probabilistic programming, and in the synthesis of dynamic systems and of deterministic and probabilistic programs. |
Tasks | Probabilistic Programming |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.09002v2 |
PDF | http://arxiv.org/pdf/1603.09002v2.pdf |
PWC | https://paperswithcode.com/paper/dataflow-matrix-machines-as-a-generalization |
Repo | https://github.com/anhinga/fluid |
Framework | none |
Least Squares Generative Adversarial Networks
Title | Least Squares Generative Adversarial Networks |
Authors | Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, Stephen Paul Smolley |
Abstract | Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN amounts to minimizing the Pearson $\chi^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher-quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs. |
Tasks | |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04076v3 |
PDF | http://arxiv.org/pdf/1611.04076v3.pdf |
PWC | https://paperswithcode.com/paper/least-squares-generative-adversarial-networks |
Repo | https://github.com/eriklindernoren/PyTorch-GAN |
Framework | pytorch |
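The least-squares objective from the abstract replaces the sigmoid cross-entropy loss with a squared error between the discriminator's output and target codes. The sketch below writes out one training step under the common 0/1 coding; the toy networks and sizes are placeholders, not the paper's architecture.

```python
# LSGAN losses for one training step (toy networks, 0/1 target coding).
import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # toy discriminator
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # toy generator
mse = nn.MSELoss()

real = torch.randn(16, 2)
z = torch.randn(16, 8)
fake = G(z)

# Discriminator: push D(real) -> 1 and D(fake) -> 0 with squared error.
d_loss = 0.5 * mse(D(real), torch.ones(16, 1)) + 0.5 * mse(D(fake.detach()), torch.zeros(16, 1))

# Generator: push D(fake) -> 1, again with squared error (no sigmoid, no log).
g_loss = 0.5 * mse(D(fake), torch.ones(16, 1))

print(d_loss.item(), g_loss.item())
```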
Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields
Title | Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields |
Authors | Patrick Ferdinand Christ, Mohamed Ezzeldin A. Elshaer, Florian Ettlinger, Sunil Tatavarty, Marc Bickel, Patrick Bilic, Markus Rempfler, Marco Armbruster, Felix Hofmann, Melvin D’Anastasi, Wieland H. Sommer, Seyed-Ahmad Ahmadi, Bjoern H. Menze |
Abstract | Automatic segmentation of the liver and its lesions is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment liver and lesions in CT abdomen images using cascaded fully convolutional neural networks (CFCNs) and dense 3D conditional random fields (CRFs). We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver as ROI input for a second FCN. The second FCN solely segments lesions from the predicted liver ROIs of step 1. We refine the segmentations of the CFCN using a dense 3D CRF that accounts for both spatial coherence and appearance. CFCN models were trained in a 2-fold cross-validation on the abdominal CT dataset 3DIRCAD comprising 15 hepatic tumor volumes. Our results show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for the liver with computation times below 100s per volume. We experimentally demonstrate the robustness of the proposed method as a decision support system with high accuracy and speed, suitable for use in daily clinical routine. |
Tasks | Lesion Segmentation |
Published | 2016-10-07 |
URL | http://arxiv.org/abs/1610.02177v1 |
PDF | http://arxiv.org/pdf/1610.02177v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-liver-and-lesion-segmentation-in-ct |
Repo | https://github.com/IBBM/Cascaded-FCN |
Framework | tf |
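The cascade described in the abstract is easy to see in pseudocode form: the first network produces a liver mask, that mask restricts the input of the second network, and the second network predicts lesions only inside the predicted liver. In the sketch below both "networks" are trivial threshold placeholders and the dense 3D CRF refinement is omitted.

```python
# Hedged sketch of a two-stage liver/lesion cascade (placeholder networks).
import numpy as np

def fcn_liver(ct_slice):
    return (ct_slice > 0.3).astype(np.uint8)          # placeholder liver mask

def fcn_lesion(masked_slice):
    return (masked_slice > 0.7).astype(np.uint8)      # placeholder lesion mask

def cascaded_segmentation(ct_slice):
    liver_mask = fcn_liver(ct_slice)                  # stage 1: liver ROI
    roi = ct_slice * liver_mask                       # zero out everything outside the liver
    lesion_mask = fcn_lesion(roi) * liver_mask        # stage 2: lesions within the liver only
    return liver_mask, lesion_mask

ct = np.random.rand(256, 256).astype(np.float32)      # stand-in for a CT slice
liver, lesions = cascaded_segmentation(ct)
print(liver.sum(), lesions.sum())
```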
SimpleDS: A Simple Deep Reinforcement Learning Dialogue System
Title | SimpleDS: A Simple Deep Reinforcement Learning Dialogue System |
Authors | Heriberto Cuayáhuitl |
Abstract | This paper presents ‘SimpleDS’, a simple and publicly available dialogue system trained with deep reinforcement learning. In contrast to previous reinforcement learning dialogue systems, this system avoids manual feature engineering by performing action selection directly from raw text of the last system and (noisy) user responses. Our initial results, in the restaurant domain, show that it is indeed possible to induce reasonable dialogue behaviour with an approach that aims for high levels of automation in dialogue control for intelligent interactive agents. |
Tasks | Feature Engineering |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04574v1 |
PDF | http://arxiv.org/pdf/1601.04574v1.pdf |
PWC | https://paperswithcode.com/paper/simpleds-a-simple-deep-reinforcement-learning |
Repo | https://github.com/cuayahuitl/SimpleDS |
Framework | none |
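The abstract's central claim is that action selection can be driven directly by the raw text of the last system and user turns, without hand-crafted features. The sketch below illustrates that idea with a bag-of-words state and a small Q-network under epsilon-greedy selection; the vocabulary, action set and epsilon are made up, and this is not the SimpleDS implementation.

```python
# Illustrative Q-network action selection from raw dialogue text.
import random
import torch
import torch.nn as nn

VOCAB = ["hello", "restaurant", "cheap", "food", "where", "thanks", "bye"]
ACTIONS = ["greet", "ask_food_type", "ask_price", "offer_restaurant", "goodbye"]

def bow(text):
    words = text.lower().split()
    return torch.tensor([[float(w in words) for w in VOCAB]])

q_net = nn.Sequential(nn.Linear(len(VOCAB), 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))

def select_action(system_text, user_text, epsilon=0.1):
    if random.random() < epsilon:                    # explore
        return random.randrange(len(ACTIONS))
    state = bow(system_text + " " + user_text)       # raw-text state, no feature engineering
    return q_net(state).argmax(dim=-1).item()        # exploit

a = select_action("hello how can I help", "cheap restaurant please")
print(ACTIONS[a])
```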
Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text
Title | Recurrent Neural Network based Part-of-Speech Tagger for Code-Mixed Social Media Text |
Authors | Raj Nath Patel, Prakash B. Pimpale, M Sasikumar |
Abstract | This paper describes Centre for Development of Advanced Computing’s (CDACM) submission to the shared task-‘Tool Contest on POS tagging for Code-Mixed Indian Social Media (Facebook, Twitter, and Whatsapp) Text’, collocated with ICON-2016. The shared task was to predict Part of Speech (POS) tag at word level for a given text. The code-mixed text is generated mostly on social media by multilingual users. The presence of the multilingual words, transliterations, and spelling variations make such content linguistically complex. In this paper, we propose an approach to POS tag code-mixed social media text using Recurrent Neural Network Language Model (RNN-LM) architecture. We submitted the results for Hindi-English (hi-en), Bengali-English (bn-en), and Telugu-English (te-en) code-mixed data. |
Tasks | Language Modelling |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04989v2 |
PDF | http://arxiv.org/pdf/1611.04989v2.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-based-part-of-speech |
Repo | https://github.com/patelrajnath/rnn4nlp |
Framework | pytorch |
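For readers new to neural POS tagging, the sketch below shows the general shape of an RNN tagger: embed each token, run a recurrent layer over the sentence, and predict one tag per token. The paper adapts an RNN language-model architecture; this bidirectional-LSTM variant and its sizes are only an illustration.

```python
# Minimal RNN-based POS tagger sketch (illustrative, not the paper's model).
import torch
import torch.nn as nn

class RNNTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb=64, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hid, n_tags)

    def forward(self, token_ids):
        h, _ = self.rnn(self.embed(token_ids))
        return self.out(h)                      # one tag distribution per token

tagger = RNNTagger(vocab_size=5000, n_tags=17)
sentence = torch.randint(0, 5000, (1, 6))        # e.g. a hi-en code-mixed tweet, as word ids
print(tagger(sentence).argmax(dim=-1))           # predicted tag ids (untrained)
```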
ArtTrack: Articulated Multi-person Tracking in the Wild
Title | ArtTrack: Articulated Multi-person Tracking in the Wild |
Authors | Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Evgeny Levinkov, Bjoern Andres, Bernt Schiele |
Abstract | In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos. Our starting point is a model that resembles existing architectures for single-frame pose estimation but is substantially faster. We achieve this in two ways: (1) by simplifying and sparsifying the body-part relationship graph and leveraging recent methods for faster inference, and (2) by offloading a substantial share of computation onto a feed-forward convolutional architecture that is able to detect and associate body joints of the same person even in clutter. We use this model to generate proposals for body joint locations and formulate articulated tracking as spatio-temporal grouping of such proposals. This allows us to jointly solve the association problem for all people in the scene by propagating evidence from strong detections through time and enforcing constraints that each proposal can be assigned to one person only. We report results on a public MPII Human Pose benchmark and on a new MPII Video Pose dataset of image sequences with multiple people. We demonstrate that our model achieves state-of-the-art results while using only a fraction of the time and is able to leverage temporal information to improve the state of the art for crowded scenes. |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01465v3 |
PDF | http://arxiv.org/pdf/1612.01465v3.pdf |
PWC | https://paperswithcode.com/paper/arttrack-articulated-multi-person-tracking-in |
Repo | https://github.com/PJunhyuk/exercise-pose-analyzer |
Framework | tf |
Query-Reduction Networks for Question Answering
Title | Query-Reduction Networks for Question Answering |
Authors | Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi |
Abstract | In this paper, we study the problem of question answering when reasoning over multiple facts is required. We propose Query-Reduction Network (QRN), a variant of Recurrent Neural Network (RNN) that effectively handles both short-term (local) and long-term (global) sequential dependencies to reason over multiple facts. QRN considers the context sentences as a sequence of state-changing triggers, and reduces the original query to a more informed query as it observes each trigger (context sentence) through time. Our experiments show that QRN produces state-of-the-art results on the bAbI QA and dialog tasks, and on a real goal-oriented dialog dataset. In addition, the QRN formulation allows parallelization along the RNN’s time axis, saving an order of magnitude in time complexity for training and inference. |
Tasks | Goal-Oriented Dialog, Question Answering |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04582v6 |
PDF | http://arxiv.org/pdf/1606.04582v6.pdf |
PWC | https://paperswithcode.com/paper/query-reduction-networks-for-question |
Repo | https://github.com/uwnlp/qrn |
Framework | tf |
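The abstract's description of "reducing" the query can be read as a gated recurrence over context sentences: each sentence either updates the query toward a more informed one or leaves it mostly unchanged. The simplified cell below follows that reading; the exact gating used in the paper differs in detail, and all sizes are illustrative.

```python
# Simplified QRN-style cell: the query is gradually "reduced" as each
# context sentence is observed (approximation of the paper's gating).
import torch
import torch.nn as nn

class QueryReductionCell(nn.Module):
    def __init__(self, d=32):
        super().__init__()
        self.update = nn.Linear(2 * d, 1)      # scalar update gate per sentence
        self.reduce = nn.Linear(2 * d, d)      # candidate reduced query

    def forward(self, sentences, query):
        h = query
        for x in sentences:                    # one context sentence at a time
            xq = torch.cat([x, h], dim=-1)
            z = torch.sigmoid(self.update(xq)) # how much this sentence matters
            cand = torch.tanh(self.reduce(xq))
            h = z * cand + (1 - z) * h         # reduced (updated) query
        return h

cell = QueryReductionCell()
context = [torch.randn(1, 32) for _ in range(5)]   # encoded story sentences
q0 = torch.randn(1, 32)                            # encoded question
print(cell(context, q0).shape)                     # the final reduced query
```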
Using Deep Learning for Image-Based Plant Disease Detection
Title | Using Deep Learning for Image-Based Plant Disease Detection |
Authors | Sharada Prasanna Mohanty, David Hughes, Marcel Salathe |
Abstract | Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 54,306 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 14 crop species and 26 diseases (or absence thereof). The trained model achieves an accuracy of 99.35% on a held-out test set, demonstrating the feasibility of this approach. When testing the model on a set of images collected from trusted online sources - i.e. taken under conditions different from the images used for training - the model still achieves an accuracy of 31.4%. While this accuracy is much higher than the one based on random selection (2.6%), a more diverse set of training data is needed to improve the general accuracy. Overall, the approach of training deep learning models on increasingly large and publicly available image datasets presents a clear path towards smartphone-assisted crop disease diagnosis on a massive global scale. |
Tasks | |
Published | 2016-04-11 |
URL | http://arxiv.org/abs/1604.03169v2 |
PDF | http://arxiv.org/pdf/1604.03169v2.pdf |
PWC | https://paperswithcode.com/paper/using-deep-learning-for-image-based-plant |
Repo | https://github.com/deepcpatel/GreenDoc |
Framework | tf |
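The training setup the abstract describes is standard supervised image classification: take an image-classification CNN and train (or fine-tune) it on leaf photos labeled with crop-disease classes. The sketch below shows one such training step; the paper itself used AlexNet/GoogLeNet in Caffe, so the ResNet-18 and the class count here are stand-ins.

```python
# Hedged sketch of one supervised training step for leaf-disease classification.
import torch
import torch.nn as nn
from torchvision import models

n_classes = 38                                      # crop-disease labels (assumed count)
net = models.resnet18(num_classes=n_classes)        # swap in pretrained weights in practice

optimizer = torch.optim.SGD(net.parameters(), lr=0.005, momentum=0.9)
criterion = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 224, 224)                # stand-in for leaf photos
labels = torch.randint(0, n_classes, (4,))

optimizer.zero_grad()
loss = criterion(net(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```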