July 30, 2019

2954 words 14 mins read

Paper Group AWR 22

Evolving Deep Neural Networks

Title Evolving Deep Neural Networks
Authors Risto Miikkulainen, Jason Liang, Elliot Meyerson, Aditya Rawal, Dan Fink, Olivier Francon, Bala Raju, Hormoz Shahrzad, Arshak Navruzyan, Nigel Duffy, Babak Hodjat
Abstract The success of deep learning depends on finding an architecture to fit the task. As deep learning has scaled up to more challenging tasks, the architectures have become difficult to design by hand. This paper proposes an automated method, CoDeepNEAT, for optimizing deep learning architectures through evolution. By extending existing neuroevolution methods to topology, components, and hyperparameters, this method achieves results comparable to the best human designs in standard benchmarks in object recognition and language modeling. It also supports building a real-world application of automated image captioning on a magazine website. Given the anticipated increases in available computing power, evolution of deep networks is a promising approach to constructing deep learning applications in the future.
Tasks Image Captioning, Language Modelling, Object Recognition
Published 2017-03-01
URL http://arxiv.org/abs/1703.00548v2
PDF http://arxiv.org/pdf/1703.00548v2.pdf
PWC https://paperswithcode.com/paper/evolving-deep-neural-networks
Repo https://github.com/sbcblab/Keras-CoDeepNEAT
Framework tf
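
The heart of the approach is an evolutionary outer loop over network genomes. The sketch below is a deliberately minimal illustration, assuming a toy genome of layer specifications and a placeholder fitness function; the actual CoDeepNEAT co-evolves blueprints and reusable modules and scores fitness by partially training each candidate network.

```python
import random

# Hypothetical genome: a list of layer specs. The real CoDeepNEAT genome
# (blueprints composed of evolved modules) is considerably richer.
def random_genome():
    return [{"units": random.choice([32, 64, 128]),
             "dropout": random.uniform(0.0, 0.5)}
            for _ in range(random.randint(2, 5))]

def mutate(genome):
    g = [dict(layer) for layer in genome]
    random.choice(g)["units"] = random.choice([32, 64, 128, 256])
    return g

def fitness(genome):
    # Placeholder: in practice, build the network (e.g. in Keras),
    # train it briefly, and return validation accuracy.
    return -abs(len(genome) - 3) + random.random()

population = [random_genome() for _ in range(20)]
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                      # truncation selection
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(10)]
print(max(population, key=fitness))
```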

See, Hear, and Read: Deep Aligned Representations

Title See, Hear, and Read: Deep Aligned Representations
Authors Yusuf Aytar, Carl Vondrick, Antonio Torralba
Abstract We capitalize on large amounts of readily-available, synchronous data to learn deep discriminative representations shared across three major natural modalities: vision, sound and language. By leveraging over a year of sound from video and millions of sentences paired with images, we jointly train a deep convolutional network for aligned representation learning. Our experiments suggest that this representation is useful for several tasks, such as cross-modal retrieval or transferring classifiers between modalities. Moreover, although our network is only trained with image+text and image+sound pairs, it can transfer between text and sound as well, a transfer the network never observed during training. Visualizations of our representation reveal many hidden units which automatically emerge to detect concepts, independent of the modality.
Tasks Cross-Modal Retrieval, Representation Learning
Published 2017-06-03
URL http://arxiv.org/abs/1706.00932v1
PDF http://arxiv.org/pdf/1706.00932v1.pdf
PWC https://paperswithcode.com/paper/see-hear-and-read-deep-aligned
Repo https://github.com/jingliao132/CrossModalRetrieval
Framework pytorch
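
The alignment objective can be illustrated with a batch-level ranking loss that pulls paired embeddings together and pushes mismatched pairs apart. The encoders and feature sizes below are stand-ins, not the paper's per-modality convolutional networks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in encoders with hypothetical feature sizes.
image_enc = nn.Linear(512, 128)
sound_enc = nn.Linear(256, 128)

def alignment_loss(a, b, margin=0.5):
    """Batch ranking loss: paired embeddings (the diagonal) should be
    more similar than any mismatched pair, by at least the margin."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    sim = a @ b.t()                      # all-pairs cosine similarity
    pos = sim.diag().unsqueeze(1)        # similarity of matched pairs
    viol = F.relu(margin + sim - pos)    # margin violations
    mask = 1.0 - torch.eye(sim.size(0))  # ignore the diagonal itself
    return (viol * mask).mean()

img = image_enc(torch.randn(8, 512))
snd = sound_enc(torch.randn(8, 256))
alignment_loss(img, snd).backward()
```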

A Joint Model for Question Answering and Question Generation

Title A Joint Model for Question Answering and Question Generation
Authors Tong Wang, Xingdi Yuan, Adam Trischler
Abstract We propose a generative machine comprehension model that learns jointly to ask and answer questions based on documents. The proposed model uses a sequence-to-sequence framework that encodes the document and generates a question (answer) given an answer (question). Significant improvement in model performance is observed empirically on the SQuAD corpus, confirming our hypothesis that the model benefits from jointly learning to perform both tasks. We believe the joint model’s novelty offers a new perspective on machine comprehension beyond architectural engineering, and serves as a first step towards autonomous information seeking.
Tasks Question Answering, Question Generation, Reading Comprehension
Published 2017-06-05
URL http://arxiv.org/abs/1706.01450v1
PDF http://arxiv.org/pdf/1706.01450v1.pdf
PWC https://paperswithcode.com/paper/a-joint-model-for-question-answering-and
Repo https://github.com/partoftheorigin/100DaysOfMLCode
Framework tf
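
One way to realize a single model that both asks and answers is a shared encoder-decoder conditioned on a task token. The sketch below is a hypothetical minimal version; the paper's model is richer, conditioning the decoder on an answer (for question generation) or a question (for answering) over the document.

```python
import torch
import torch.nn as nn

class JointSeq2Seq(nn.Module):
    """Minimal shared encoder-decoder with a learned task embedding."""
    def __init__(self, vocab=10000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.mode = nn.Embedding(2, dim)      # 0 = answer, 1 = ask
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, document, target, mode):
        _, h = self.encoder(self.embed(document))
        h = h + self.mode(mode).unsqueeze(0)  # tell the decoder its task
        dec, _ = self.decoder(self.embed(target), h)
        return self.out(dec)

model = JointSeq2Seq()
doc = torch.randint(0, 10000, (4, 50))
tgt = torch.randint(0, 10000, (4, 12))
logits = model(doc, tgt, torch.ones(4, dtype=torch.long))  # QG mode
```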

Deep Learning as a Mixed Convex-Combinatorial Optimization Problem

Title Deep Learning as a Mixed Convex-Combinatorial Optimization Problem
Authors Abram L. Friesen, Pedro Domingos
Abstract As neural networks grow deeper and wider, learning networks with hard-threshold activations is becoming increasingly important, both for network quantization, which can drastically reduce time and energy requirements, and for creating large integrated systems of deep networks, which may have non-differentiable components and must avoid vanishing and exploding gradients for effective learning. However, since gradient descent is not applicable to hard-threshold functions, it is not clear how to learn networks of them in a principled way. We address this problem by observing that setting targets for hard-threshold hidden units in order to minimize loss is a discrete optimization problem, and can be solved as such. The discrete optimization goal is to find a set of targets such that each unit, including the output, has a linearly separable problem to solve. Given these targets, the network decomposes into individual perceptrons, which can then be learned with standard convex approaches. Based on this, we develop a recursive mini-batch algorithm for learning deep hard-threshold networks that includes the popular but poorly justified straight-through estimator as a special case. Empirically, we show that our algorithm improves classification accuracy in a number of settings, including for AlexNet and ResNet-18 on ImageNet, when compared to the straight-through estimator.
Tasks Combinatorial Optimization, Quantization
Published 2017-10-31
URL http://arxiv.org/abs/1710.11573v3
PDF http://arxiv.org/pdf/1710.11573v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-as-a-mixed-convex-combinatorial
Repo https://github.com/afriesen/ftprop
Framework pytorch
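
The straight-through estimator that the paper recovers as a special case is easy to state in code: the forward pass applies the hard threshold, and the backward pass lets the gradient through, here with the common saturation at |x| <= 1. A minimal PyTorch version:

```python
import torch

class StraightThroughSign(torch.autograd.Function):
    """Hard-threshold activation with the straight-through estimator:
    forward passes sign(x); backward passes the incoming gradient,
    zeroed where |x| > 1 (the 'saturated' STE)."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(5, requires_grad=True)
StraightThroughSign.apply(x).sum().backward()
print(x.grad)
```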

End-to-end Adversarial Learning for Generative Conversational Agents

Title End-to-end Adversarial Learning for Generative Conversational Agents
Authors Oswaldo Ludwig
Abstract This paper presents a new adversarial learning method for generative conversational agents (GCA), as well as a new GCA model. Similar to previous works on adversarial learning for dialogue generation, our method treats the GCA as a generator that aims at fooling a discriminator that labels dialogues as human-generated or machine-generated; however, in our approach, the discriminator performs token-level classification, i.e. it indicates whether the current token was generated by humans or machines. To do so, the discriminator also receives the context utterances (the dialogue history) and the incomplete answer up to the current token as input. This new approach makes end-to-end training by backpropagation possible. A self-conversation process produces a more diverse set of generated data for the adversarial training. This approach improves the performance on questions not related to the training data. Experimental results with human and adversarial evaluations show that the adversarial method yields significant performance gains over the usual teacher forcing training.
Tasks Dialogue Generation
Published 2017-11-28
URL http://arxiv.org/abs/1711.10122v3
PDF http://arxiv.org/pdf/1711.10122v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-adversarial-learning-for
Repo https://github.com/oswaldoludwig/Seq2seq-Chatbot-for-Keras
Framework none
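
The token-level discriminator can be sketched as a recurrent classifier that consumes the dialogue history and the answer prefix and emits a human-vs-machine probability per token. The architecture below is a simplified stand-in for the paper's model, with illustrative sizes.

```python
import torch
import torch.nn as nn

class TokenDiscriminator(nn.Module):
    """Reads the context, then scores each answer token as it arrives."""
    def __init__(self, vocab=8000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.context_rnn = nn.GRU(dim, dim, batch_first=True)
        self.answer_rnn = nn.GRU(dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, context, answer):
        _, h = self.context_rnn(self.embed(context))    # encode history
        states, _ = self.answer_rnn(self.embed(answer), h)
        return torch.sigmoid(self.score(states)).squeeze(-1)

disc = TokenDiscriminator()
ctx = torch.randint(0, 8000, (2, 30))
ans = torch.randint(0, 8000, (2, 10))
p_human = disc(ctx, ans)   # shape (2, 10): one probability per token
```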

Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks

Title Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Authors Nikolaos Passalis, Anastasios Tefas
Abstract Convolutional Neural Networks (CNNs) are well established models capable of achieving state-of-the-art classification accuracy for various computer vision tasks. However, they are becoming increasingly larger, using millions of parameters, while they are restricted to handling images of fixed size. In this paper, a quantization-based approach, inspired by the well-known Bag-of-Features model, is proposed to overcome these limitations. The proposed approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the information extracted from the convolutional layers and it is able to natively classify images of various sizes as well as to significantly reduce the number of parameters in the network. In contrast to other global pooling operators and CNN compression techniques, the proposed method utilizes a trainable pooling layer that is end-to-end differentiable, allowing the network to be trained using regular back-propagation and to achieve greater distribution shift invariance than competitive methods. The ability of the proposed method to reduce the parameters of the network and increase the classification accuracy over other state-of-the-art techniques is demonstrated using three image datasets.
Tasks Quantization
Published 2017-07-25
URL http://arxiv.org/abs/1707.08105v2
PDF http://arxiv.org/pdf/1707.08105v2.pdf
PWC https://paperswithcode.com/paper/learning-bag-of-features-pooling-for-deep
Repo https://github.com/passalis/cbof
Framework none
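
A minimal version of the trainable BoF pooling layer: each spatial feature vector is soft-assigned to learned RBF codewords and the assignments are averaged into a fixed-length histogram, so input resolution no longer matters. This is a sketch of the idea, not the reference implementation in the linked repo.

```python
import torch
import torch.nn as nn

class BoFPooling(nn.Module):
    """Soft-assign spatial features to RBF codewords, then average the
    assignment histograms; output size is fixed for any input size."""
    def __init__(self, in_dim, n_codewords):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_codewords, in_dim))
        self.sigma = nn.Parameter(torch.ones(n_codewords))

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        feats = x.permute(0, 2, 3, 1).reshape(b, h * w, c)
        dist = torch.cdist(feats, self.centers.unsqueeze(0).expand(b, -1, -1))
        phi = torch.exp(-dist * self.sigma.abs())       # RBF similarity
        phi = phi / (phi.sum(dim=2, keepdim=True) + 1e-8)
        return phi.mean(dim=1)                 # (B, n_codewords) histogram

pool = BoFPooling(in_dim=64, n_codewords=32)
print(pool(torch.randn(2, 64, 13, 13)).shape)  # works for any H, W
```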

Social Attention: Modeling Attention in Human Crowds

Title Social Attention: Modeling Attention in Human Crowds
Authors Anirudh Vemula, Katharina Muelling, Jean Oh
Abstract Robots that navigate through human crowds need to be able to plan safe, efficient, and human-predictable trajectories. This is a particularly challenging problem as it requires the robot to predict future human trajectories within a crowd where everyone implicitly cooperates with each other to avoid collisions. Previous approaches to human trajectory prediction have modeled the interactions between humans as a function of proximity. However, proximity is not the whole story: people in our immediate vicinity moving in the same direction might matter less than people further away who might collide with us in the future. In this work, we propose Social Attention, a novel trajectory prediction model that captures the relative importance of each person when navigating in the crowd, irrespective of their proximity. We demonstrate the performance of our method against a state-of-the-art approach on two publicly available crowd datasets and analyze the trained attention model to gain a better understanding of which surrounding agents humans attend to when navigating in a crowd.
Tasks Trajectory Prediction
Published 2017-10-12
URL http://arxiv.org/abs/1710.04689v2
PDF http://arxiv.org/pdf/1710.04689v2.pdf
PWC https://paperswithcode.com/paper/social-attention-modeling-attention-in-human
Repo https://github.com/huang-xx/STGAT
Framework pytorch
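
The key mechanism is attention weights over surrounding agents that are learned from hidden-state interactions rather than computed from raw distance, so a far agent on a collision course can receive high weight. The snippet below is a generic dot-product attention stand-in; the paper itself builds on a spatio-temporal graph formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 32
query = nn.Linear(dim, dim)   # projects the target pedestrian's state
key = nn.Linear(dim, dim)     # projects each surrounding agent's state

h_self = torch.randn(1, dim)     # hidden state of the target pedestrian
h_others = torch.randn(6, dim)   # hidden states of 6 surrounding agents

scores = query(h_self) @ key(h_others).t() / dim ** 0.5
weights = F.softmax(scores, dim=1)     # relative importance per agent
social_context = weights @ h_others    # weighted influence vector
print(weights)
```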

Learning to Ask: Neural Question Generation for Reading Comprehension

Title Learning to Ask: Neural Question Generation for Reading Comprehension
Authors Xinya Du, Junru Shao, Claire Cardie
Abstract We study automatic question generation for sentences from text passages in reading comprehension. We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. In contrast to all previous work, our model does not rely on hand-crafted rules or a sophisticated NLP pipeline; it is instead trainable end-to-end via sequence-to-sequence learning. Automatic evaluation results show that our system significantly outperforms the state-of-the-art rule-based system. In human evaluations, questions generated by our system are also rated as being more natural (i.e., grammaticality, fluency) and as more difficult to answer (in terms of syntactic and lexical divergence from the original text and reasoning needed to answer).
Tasks Question Generation, Reading Comprehension
Published 2017-04-29
URL http://arxiv.org/abs/1705.00106v1
PDF http://arxiv.org/pdf/1705.00106v1.pdf
PWC https://paperswithcode.com/paper/learning-to-ask-neural-question-generation
Repo https://github.com/MinyeLee/Question-Generation-Pytorch
Framework pytorch
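
The sentence- vs. paragraph-level question is about what the encoder sees and what the decoder attends to. A rough sketch with illustrative sizes, using a generic global attention in place of the paper's exact parameterization: the decoder initializes from a paragraph summary and attends over the sentence states when generating the question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 128
sent_enc = nn.GRU(dim, dim, batch_first=True)
para_enc = nn.GRU(dim, dim, batch_first=True)

sent = torch.randn(1, 20, dim)       # embedded source sentence
para = torch.randn(1, 120, dim)      # embedded surrounding paragraph

sent_states, _ = sent_enc(sent)      # (1, 20, dim) attention memory
_, para_h = para_enc(para)           # (1, 1, dim) paragraph summary
dec_state = para_h.squeeze(0)        # paragraph-level variant: init here

scores = torch.bmm(sent_states, dec_state.unsqueeze(2)).squeeze(2)
attn = F.softmax(scores, dim=1)      # where to look while asking
context = torch.bmm(attn.unsqueeze(1), sent_states).squeeze(1)
```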

The NarrativeQA Reading Comprehension Challenge

Title The NarrativeQA Reading Comprehension Challenge
Authors Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette
Abstract Reading comprehension (RC)—in contrast to information retrieval—requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess RC ability, in both artificial agents and children learning to read. However, existing RC datasets and tasks are dominated by questions that can be solved by selecting answers using superficial information (e.g., local context similarity or global term frequency); they thus fail to test for the essential integrative aspect of RC. To encourage progress on deeper comprehension of language, we present a new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts. These tasks are designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the tasks easily, standard RC models struggle on the tasks presented here. We provide an analysis of the dataset and the challenges it presents.
Tasks Information Retrieval, Question Answering, Reading Comprehension
Published 2017-12-19
URL http://arxiv.org/abs/1712.07040v1
PDF http://arxiv.org/pdf/1712.07040v1.pdf
PWC https://paperswithcode.com/paper/the-narrativeqa-reading-comprehension
Repo https://github.com/deepmind/narrativeqa
Framework none

KBLRN : End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features

Title KBLRN : End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features
Authors Alberto Garcia-Duran, Mathias Niepert
Abstract We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets enriching commonly used knowledge base completion benchmarks with numerical features. The data sets are available under a permissive BSD-3 license. We also investigate the impact numerical features have on the KB completion performance of KBLRN.
Tasks Knowledge Base Completion, Representation Learning
Published 2017-09-14
URL http://arxiv.org/abs/1709.04676v3
PDF http://arxiv.org/pdf/1709.04676v3.pdf
PWC https://paperswithcode.com/paper/kblrn-end-to-end-learning-of-knowledge-base
Repo https://github.com/nle-ml/mmkb
Framework none
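
The product-of-experts combination can be sketched by giving each feature type its own scoring expert and summing their scores in log space. Below, the latent expert is a DistMult-style embedding score and a linear expert handles numerical features; the relational expert is omitted for brevity, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PoEScorer(nn.Module):
    """Scores a (head, relation, tail) triple as a sum of per-feature-type
    expert scores, i.e. the log of a product of experts."""
    def __init__(self, n_entities=1000, n_relations=50, dim=64, n_num=5):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.num_expert = nn.Linear(n_num, 1)   # numerical-feature expert

    def forward(self, head, relation, tail, num_feats):
        latent = (self.ent(head) * self.rel(relation) * self.ent(tail)).sum(-1)
        numerical = self.num_expert(num_feats).squeeze(-1)
        return latent + numerical

model = PoEScorer()
score = model(torch.tensor([3]), torch.tensor([7]), torch.tensor([42]),
              torch.randn(1, 5))
```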

Interpreting Blackbox Models via Model Extraction

Title Interpreting Blackbox Models via Model Extraction
Authors Osbert Bastani, Carolyn Kim, Hamsa Bastani
Abstract Interpretability has become incredibly important as machine learning is increasingly used to inform consequential decisions. We propose to construct global explanations of complex, blackbox models in the form of a decision tree approximating the original model; as long as the decision tree is a good approximation, it mirrors the computation performed by the blackbox model. We devise a novel algorithm for extracting decision tree explanations that actively samples new training points to avoid overfitting. We evaluate our algorithm on a random forest to predict diabetes risk and a learned controller for cart-pole. Compared to several baselines, our decision trees are both substantially more accurate and equally or more interpretable based on a user study. Finally, we describe several insights provided by our interpretations, including a causal issue validated by a physician.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08504v6
PDF http://arxiv.org/pdf/1705.08504v6.pdf
PWC https://paperswithcode.com/paper/interpreting-blackbox-models-via-model
Repo https://github.com/aclarkData/MachineLearningInterpretability
Framework none
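
The extraction recipe is: query the blackbox for labels on newly sampled inputs, then fit a small decision tree to those labels and measure fidelity. The paper's algorithm actively samples from an estimated input distribution; the Gaussian jitter around training points below is a crude stand-in for that step.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in blackbox: a random forest on synthetic data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
blackbox = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sample fresh inputs and label them with the blackbox (the oracle).
idx = np.random.choice(len(X), 5000)
X_new = X[idx] + 0.1 * np.random.randn(5000, 10)
y_new = blackbox.predict(X_new)

# Fit a small, readable tree and check how faithfully it mimics the model.
tree = DecisionTreeClassifier(max_leaf_nodes=16).fit(X_new, y_new)
fidelity = (tree.predict(X) == blackbox.predict(X)).mean()
print(f"fidelity to blackbox: {fidelity:.3f}")
```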

Differentially Private Federated Learning: A Client Level Perspective

Title Differentially Private Federated Learning: A Client Level Perspective
Authors Robin C. Geyer, Tassilo Klein, Moin Nabi
Abstract Federated learning is a recent advance in privacy protection. In this context, a trusted curator aggregates parameters optimized in a decentralized fashion by multiple clients. The resulting model is then distributed back to all clients, ultimately converging to a joint representative model without explicitly having to share the data. However, the protocol is vulnerable to differential attacks, which could originate from any party contributing during federated optimization. In such an attack, a client’s contribution during training and information about their data set are revealed through analyzing the distributed model. We tackle this problem and propose an algorithm for client-side differentially private federated optimization. The aim is to hide clients’ contributions during training, balancing the trade-off between privacy loss and model performance. Empirical studies suggest that given a sufficiently large number of participating clients, our proposed procedure can maintain client-level differential privacy at only a minor cost in model performance.
Tasks
Published 2017-12-20
URL http://arxiv.org/abs/1712.07557v2
PDF http://arxiv.org/pdf/1712.07557v2.pdf
PWC https://paperswithcode.com/paper/differentially-private-federated-learning-a
Repo https://github.com/cyrusgeyer/DiffPrivate_FedLearning
Framework tf
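
The client-level mechanism can be summarized in a few lines: clip each client's update to a fixed L2 norm, average, and add Gaussian noise calibrated to that clipping bound. The constants below are illustrative and not calibrated to any specific (epsilon, delta) budget.

```python
import numpy as np

def private_aggregate(client_updates, clip=1.0, noise_mult=1.1):
    """Clip per-client updates, average them, and add Gaussian noise
    scaled to the clipping bound (client-level DP sketch)."""
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise = np.random.normal(0, noise_mult * clip / len(client_updates),
                             size=mean.shape)
    return mean + noise

updates = [np.random.randn(100) for _ in range(50)]   # 50 clients
new_delta = private_aggregate(updates)
```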

Updating the VESICLE-CNN Synapse Detector

Title Updating the VESICLE-CNN Synapse Detector
Authors Andrew Warrington, Frank Wood
Abstract We present an updated version of the VESICLE-CNN algorithm presented by Roncal et al. (2014). The original implementation makes use of a patch-based approach. This methodology is known to be slow due to repeated computations. We update this implementation to be fully convolutional through the use of dilated convolutions, recovering the expanded field of view achieved through the use of strided max-pooling, but without a degradation of spatial resolution. This updated implementation performs as well as the original implementation, but with a $600\times$ speedup at test time. We release source code and data into the public domain.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1710.11397v1
PDF http://arxiv.org/pdf/1710.11397v1.pdf
PWC https://paperswithcode.com/paper/updating-the-vesicle-cnn-synapse-detector
Repo https://github.com/andrewwarrington/vesicle-cnn-2
Framework none
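
The conversion at the core of the update: replace the strided max-pool plus convolution of the patch-based network with a dilated convolution, which widens the receptive field without shrinking the output. A toy comparison:

```python
import torch
import torch.nn as nn

# Downsampling variant: pooling halves the spatial resolution.
patch_style = nn.Sequential(nn.MaxPool2d(2),
                            nn.Conv2d(1, 8, kernel_size=3))
# Dilated variant: same widened field of view, full resolution kept.
dilated = nn.Conv2d(1, 8, kernel_size=3, dilation=2, padding=2)

x = torch.randn(1, 1, 64, 64)
print(patch_style(x).shape)   # spatial size roughly halved
print(dilated(x).shape)       # full 64x64 output
```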

A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

Title A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data
Authors Abien Fred Agarap
Abstract Gated Recurrent Unit (GRU) is a recently-developed variation of the long short-term memory (LSTM) unit, both of which are types of recurrent neural network (RNN). Through empirical evidence, both models have been proven to be effective in a wide variety of machine learning tasks such as natural language processing (Wen et al., 2015), speech recognition (Chorowski et al., 2015), and text classification (Yang et al., 2016). Conventionally, like most neural networks, both of the aforementioned RNN variants employ the Softmax function as their final output layer for prediction, and the cross-entropy function for computing their loss. In this paper, we present an amendment to this norm by introducing linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function shall be replaced with a margin-based function. While there have been similar studies (Alalshekmubarak & Smith, 2013; Tang, 2013), this proposal is primarily intended for binary classification on intrusion detection using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model performs better than the conventional GRU-Softmax model. The proposed model reached a training accuracy of ~81.54% and a testing accuracy of ~84.15%, while the latter was able to reach a training accuracy of ~63.07% and a testing accuracy of ~70.75%. In addition, the juxtaposition of these two final output layers indicates that the SVM would outperform Softmax in prediction time - a theoretical implication which was supported by the actual training and testing time in the study.
Tasks Intrusion Detection, Speech Recognition, Text Classification
Published 2017-09-10
URL http://arxiv.org/abs/1709.03082v8
PDF http://arxiv.org/pdf/1709.03082v8.pdf
PWC https://paperswithcode.com/paper/a-neural-network-architecture-combining-gated
Repo https://github.com/AFAgarap/cnn-svm
Framework tf
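
The GRU-SVM idea in miniature: a GRU encoder feeding a linear output layer trained with a squared hinge (L2-SVM) loss rather than softmax and cross-entropy. Feature and hidden sizes below are illustrative, not the study's exact configuration.

```python
import torch
import torch.nn as nn

class GRUSVM(nn.Module):
    """GRU encoder with a raw linear decision value as output."""
    def __init__(self, n_features=41, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.svm = nn.Linear(hidden, 1)

    def forward(self, x):
        _, h = self.gru(x)
        return self.svm(h[-1]).squeeze(-1)

def hinge_loss(scores, labels, model, c=0.5):
    """L2-SVM loss with weight regularization; labels are in {-1, +1}."""
    margin = torch.clamp(1 - labels * scores, min=0) ** 2
    return margin.mean() + c * model.svm.weight.pow(2).sum()

model = GRUSVM()
x = torch.randn(16, 10, 41)                     # 16 sequences, 10 steps
y = torch.randint(0, 2, (16,)).float() * 2 - 1  # labels in {-1, +1}
hinge_loss(model(x), y, model).backward()
```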

Semantic Autoencoder for Zero-Shot Learning

Title Semantic Autoencoder for Zero-Shot Learning
Authors Elyor Kodirov, Tao Xiang, Shaogang Gong
Abstract Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g.~attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g.~attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen) classes without training data, a ZSL model typically suffers from the projection domain shift problem. In this work, we present a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the encoder-decoder paradigm, an encoder aims to project a visual feature vector into the semantic space as in the existing ZSL models. However, the decoder exerts an additional constraint, that is, the projection/code must be able to reconstruct the original visual feature. We show that with this additional reconstruction constraint, the learned projection function from the seen classes is able to generalise better to the new unseen classes. Importantly, the encoder and decoder are linear and symmetric, which enables us to develop an extremely efficient learning algorithm. Extensive experiments on six benchmark datasets demonstrate that the proposed SAE significantly outperforms the existing ZSL models with the additional benefit of lower computational cost. Furthermore, when the SAE is applied to the supervised clustering problem, it also beats the state-of-the-art.
Tasks Zero-Shot Learning
Published 2017-04-26
URL http://arxiv.org/abs/1704.08345v1
PDF http://arxiv.org/pdf/1704.08345v1.pdf
PWC https://paperswithcode.com/paper/semantic-autoencoder-for-zero-shot-learning
Repo https://github.com/hoseong-kim/sae-pytorch
Framework pytorch
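
Because the encoder and decoder are linear and tied (the decoder is the encoder transposed), the SAE objective admits a closed-form solution via a Sylvester equation. The sketch below follows my reading of the paper's linear formulation; treat the exact objective and shapes as assumptions.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def sae(X, S, lam=0.2):
    """Solve min_W ||X - W.T @ S||^2 + lam * ||W @ X - S||^2, whose
    stationarity condition is the Sylvester equation
        (S S^T) W + W (lam X X^T) = (1 + lam) S X^T.
    X: (d, n) visual features; S: (k, n) semantic attributes."""
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1 + lam) * (S @ X.T)
    return solve_sylvester(A, B, C)    # W: (k, d) projection

X = np.random.randn(100, 500)   # 500 samples, 100-d visual features
S = np.random.randn(20, 500)    # 20-d attribute vectors
W = sae(X, S)
codes = W @ X                    # project visual features to semantics
```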