Paper Group AWR 22
Evolving Deep Neural Networks. See, Hear, and Read: Deep Aligned Representations. A Joint Model for Question Answering and Question Generation. Deep Learning as a Mixed Convex-Combinatorial Optimization Problem. End-to-end Adversarial Learning for Generative Conversational Agents. Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks. Social Attention: Modeling Attention in Human Crowds. Learning to Ask: Neural Question Generation for Reading Comprehension. The NarrativeQA Reading Comprehension Challenge. KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features. Interpreting Blackbox Models via Model Extraction. Differentially Private Federated Learning: A Client Level Perspective. Updating the VESICLE-CNN Synapse Detector. A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data. Semantic Autoencoder for Zero-Shot Learning.
Evolving Deep Neural Networks
Title | Evolving Deep Neural Networks |
Authors | Risto Miikkulainen, Jason Liang, Elliot Meyerson, Aditya Rawal, Dan Fink, Olivier Francon, Bala Raju, Hormoz Shahrzad, Arshak Navruzyan, Nigel Duffy, Babak Hodjat |
Abstract | The success of deep learning depends on finding an architecture to fit the task. As deep learning has scaled up to more challenging tasks, the architectures have become difficult to design by hand. This paper proposes an automated method, CoDeepNEAT, for optimizing deep learning architectures through evolution. By extending existing neuroevolution methods to topology, components, and hyperparameters, this method achieves results comparable to the best human designs on standard benchmarks in object recognition and language modeling. It also supports building a real-world application of automated image captioning on a magazine website. Given the anticipated increases in available computing power, evolution of deep networks is a promising approach to constructing deep learning applications in the future. |
Tasks | Image Captioning, Language Modelling, Object Recognition |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00548v2 |
PDF | http://arxiv.org/pdf/1703.00548v2.pdf |
PWC | https://paperswithcode.com/paper/evolving-deep-neural-networks |
Repo | https://github.com/sbcblab/Keras-CoDeepNEAT |
Framework | tf |
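CoDeepNEAT co-evolves network blueprints and modules; as a loose illustration of the underlying neuroevolution loop (not the paper's algorithm), here is a minimal sketch that evolves a toy genome of architecture hyperparameters by mutation and truncation selection. The `fitness` function is a hypothetical stand-in; in practice it would assemble, train, and validate a network built from the genome.

```python
import random

# Toy genome: (num_layers, units_per_layer, learning_rate).
# Stand-in fitness; in CoDeepNEAT this would be the validation
# accuracy of a network assembled and trained from the genome.
def fitness(genome):
    layers, units, lr = genome
    return -abs(layers - 4) - abs(units - 128) / 64 - abs(lr - 1e-3) * 1e3

def mutate(genome):
    layers, units, lr = genome
    return (max(1, layers + random.choice([-1, 0, 1])),
            max(8, units + random.choice([-32, 0, 32])),
            max(1e-5, lr * random.choice([0.5, 1.0, 2.0])))

population = [(random.randint(1, 8), random.choice([32, 64, 128, 256]),
               10 ** random.uniform(-4, -2)) for _ in range(20)]

for generation in range(30):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]                      # truncation selection
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(15)]   # offspring by mutation

print("best genome:", max(population, key=fitness))
```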
See, Hear, and Read: Deep Aligned Representations
Title | See, Hear, and Read: Deep Aligned Representations |
Authors | Yusuf Aytar, Carl Vondrick, Antonio Torralba |
Abstract | We capitalize on large amounts of readily-available, synchronous data to learn deep discriminative representations shared across three major natural modalities: vision, sound and language. By leveraging over a year of sound from video and millions of sentences paired with images, we jointly train a deep convolutional network for aligned representation learning. Our experiments suggest that this representation is useful for several tasks, such as cross-modal retrieval or transferring classifiers between modalities. Moreover, although our network is only trained with image+text and image+sound pairs, it can transfer between text and sound as well, a transfer the network never observed during training. Visualizations of our representation reveal many hidden units which automatically emerge to detect concepts, independent of the modality. |
Tasks | Cross-Modal Retrieval, Representation Learning |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00932v1 |
PDF | http://arxiv.org/pdf/1706.00932v1.pdf |
PWC | https://paperswithcode.com/paper/see-hear-and-read-deep-aligned |
Repo | https://github.com/jingliao132/CrossModalRetrieval |
Framework | pytorch |
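As a hedged sketch of the general alignment idea (the exact objective and encoders here are assumptions, not the paper's), the snippet below projects two modalities into a shared space and applies a batch-wise ranking loss so matched pairs score higher than mismatched ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical encoders projecting each modality into a shared 512-d space.
image_enc = nn.Linear(2048, 512)   # e.g. pooled CNN features
text_enc = nn.Linear(300, 512)     # e.g. averaged word embeddings

def alignment_loss(img_feats, txt_feats, margin=0.2):
    """Matched (image, text) pairs should score higher than any
    mismatched pair in the batch, by at least `margin`."""
    img = F.normalize(image_enc(img_feats), dim=1)
    txt = F.normalize(text_enc(txt_feats), dim=1)
    sim = img @ txt.t()                       # cosine similarities, B x B
    pos = sim.diag().unsqueeze(1)             # matched pairs on the diagonal
    cost = (margin + sim - pos).clamp(min=0)  # hinge against mismatches
    off_diag = ~torch.eye(sim.size(0), dtype=torch.bool)
    return cost[off_diag].mean()

loss = alignment_loss(torch.randn(8, 2048), torch.randn(8, 300))
loss.backward()
```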
A Joint Model for Question Answering and Question Generation
Title | A Joint Model for Question Answering and Question Generation |
Authors | Tong Wang, Xingdi Yuan, Adam Trischler |
Abstract | We propose a generative machine comprehension model that learns jointly to ask and answer questions based on documents. The proposed model uses a sequence-to-sequence framework that encodes the document and generates a question (answer) given an answer (question). Significant improvement in model performance is observed empirically on the SQuAD corpus, confirming our hypothesis that the model benefits from jointly learning to perform both tasks. We believe the joint model’s novelty offers a new perspective on machine comprehension beyond architectural engineering, and serves as a first step towards autonomous information seeking. |
Tasks | Question Answering, Question Generation, Reading Comprehension |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01450v1 |
PDF | http://arxiv.org/pdf/1706.01450v1.pdf |
PWC | https://paperswithcode.com/paper/a-joint-model-for-question-answering-and |
Repo | https://github.com/partoftheorigin/100DaysOfMLCode |
Framework | tf |
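A minimal sketch of one way to share a sequence-to-sequence model across the two tasks (an assumed simplification, not the paper's exact architecture): a single encoder-decoder pair with a mode embedding that tells the decoder whether to generate a question or an answer.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128

embed = nn.Embedding(VOCAB, EMB)
mode_embed = nn.Embedding(2, HID)   # 0 = generate question, 1 = generate answer
encoder = nn.GRU(EMB, HID, batch_first=True)
decoder = nn.GRU(EMB, HID, batch_first=True)
project = nn.Linear(HID, VOCAB)

def forward(src_tokens, tgt_tokens, mode):
    _, h = encoder(embed(src_tokens))       # encode document + conditioning text
    h = h + mode_embed(mode).unsqueeze(0)   # inject the task mode into the state
    out, _ = decoder(embed(tgt_tokens), h)  # teacher-forced decoding
    return project(out)                     # logits over the vocabulary

src = torch.randint(0, VOCAB, (4, 20))   # document + answer (or question)
tgt = torch.randint(0, VOCAB, (4, 12))   # question (or answer) to generate
logits = forward(src, tgt, torch.tensor([0, 0, 1, 1]))
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), tgt.reshape(-1))
loss.backward()
```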
Deep Learning as a Mixed Convex-Combinatorial Optimization Problem
Title | Deep Learning as a Mixed Convex-Combinatorial Optimization Problem |
Authors | Abram L. Friesen, Pedro Domingos |
Abstract | As neural networks grow deeper and wider, learning networks with hard-threshold activations is becoming increasingly important, both for network quantization, which can drastically reduce time and energy requirements, and for creating large integrated systems of deep networks, which may have non-differentiable components and must avoid vanishing and exploding gradients for effective learning. However, since gradient descent is not applicable to hard-threshold functions, it is not clear how to learn networks of them in a principled way. We address this problem by observing that setting targets for hard-threshold hidden units in order to minimize loss is a discrete optimization problem, and can be solved as such. The discrete optimization goal is to find a set of targets such that each unit, including the output, has a linearly separable problem to solve. Given these targets, the network decomposes into individual perceptrons, which can then be learned with standard convex approaches. Based on this, we develop a recursive mini-batch algorithm for learning deep hard-threshold networks that includes the popular but poorly justified straight-through estimator as a special case. Empirically, we show that our algorithm improves classification accuracy in a number of settings, including for AlexNet and ResNet-18 on ImageNet, when compared to the straight-through estimator. |
Tasks | Combinatorial Optimization, Quantization |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11573v3 |
PDF | http://arxiv.org/pdf/1710.11573v3.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-as-a-mixed-convex-combinatorial |
Repo | https://github.com/afriesen/ftprop |
Framework | pytorch |
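The paper shows that the straight-through estimator falls out of its target-setting framework as a special case. As a concrete reference point, here is the straight-through estimator for a sign activation written as a PyTorch autograd function: hard threshold forward, identity gradient backward.

```python
import torch

class StraightThroughSign(torch.autograd.Function):
    """Hard-threshold activation with the straight-through estimator:
    sign() in the forward pass, identity gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output   # pass the gradient straight through the threshold

x = torch.randn(5, requires_grad=True)
y = StraightThroughSign.apply(x)
y.sum().backward()
print(x.grad)   # all ones: the hard threshold was treated as identity
```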
End-to-end Adversarial Learning for Generative Conversational Agents
Title | End-to-end Adversarial Learning for Generative Conversational Agents |
Authors | Oswaldo Ludwig |
Abstract | This paper presents a new adversarial learning method for generative conversational agents (GCA), as well as a new GCA model. Similar to previous works on adversarial learning for dialogue generation, our method treats the GCA as a generator that aims to fool a discriminator that labels dialogues as human-generated or machine-generated; however, in our approach, the discriminator performs token-level classification, i.e. it indicates whether the current token was generated by humans or machines. To do so, the discriminator also receives the context utterances (the dialogue history) and the incomplete answer up to the current token as input. This new approach makes end-to-end training by backpropagation possible. A self-conversation process produces a more diverse set of generated data for the adversarial training. This approach improves performance on questions not related to the training data. Experimental results with human and adversarial evaluations show that the adversarial method yields significant performance gains over the usual teacher forcing training. |
Tasks | Dialogue Generation |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10122v3 |
PDF | http://arxiv.org/pdf/1711.10122v3.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-adversarial-learning-for |
Repo | https://github.com/oswaldoludwig/Seq2seq-Chatbot-for-Keras |
Framework | none |
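A schematic of the token-level discriminator idea (a simplified sketch; the layer sizes and encoders are assumptions): the discriminator consumes the dialogue history and the answer prefix, and emits one human-vs-machine score per generated token, which is what makes end-to-end backpropagation possible.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128

embed = nn.Embedding(VOCAB, EMB)
context_enc = nn.GRU(EMB, HID, batch_first=True)
answer_enc = nn.GRU(EMB, HID, batch_first=True)
score = nn.Linear(HID, 1)

def token_scores(context, answer):
    _, h = context_enc(embed(context))      # summarize the dialogue history
    out, _ = answer_enc(embed(answer), h)   # state after each answer prefix
    return torch.sigmoid(score(out)).squeeze(-1)   # P(human) per token

context = torch.randint(0, VOCAB, (2, 30))
answer = torch.randint(0, VOCAB, (2, 10))
print(token_scores(context, answer).shape)  # (2, 10): one score per token
```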
Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Title | Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks |
Authors | Nikolaos Passalis, Anastasios Tefas |
Abstract | Convolutional Neural Networks (CNNs) are well established models capable of achieving state-of-the-art classification accuracy for various computer vision tasks. However, they are becoming increasingly larger, using millions of parameters, while they are restricted to handling images of fixed size. In this paper, a quantization-based approach, inspired by the well-known Bag-of-Features model, is proposed to overcome these limitations. The proposed approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the information extracted from the convolutional layers, and it is able to natively classify images of various sizes as well as to significantly reduce the number of parameters in the network. In contrast to other global pooling operators and CNN compression techniques, the proposed method utilizes a trainable pooling layer that is end-to-end differentiable, allowing the network to be trained using regular back-propagation and to achieve greater distribution shift invariance than competitive methods. The ability of the proposed method to reduce the parameters of the network and increase the classification accuracy over other state-of-the-art techniques is demonstrated using three image datasets. |
Tasks | Quantization |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08105v2 |
PDF | http://arxiv.org/pdf/1707.08105v2.pdf |
PWC | https://paperswithcode.com/paper/learning-bag-of-features-pooling-for-deep |
Repo | https://github.com/passalis/cbof |
Framework | none |
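A minimal sketch of the trainable Bag-of-Features pooling idea (the initialization and exact RBF kernel here are assumptions): learned codewords are compared to every spatial feature vector with RBF similarities, and the averaged soft-assignment histogram becomes a fixed-size representation regardless of input image size.

```python
import torch
import torch.nn as nn

class BoFPooling(nn.Module):
    """Trainable BoF pooling sketch: RBF neurons compare each spatial
    feature vector to learned codewords; the soft-assignment histogram
    (averaged over positions) is the fixed-size output."""
    def __init__(self, in_channels, n_codewords):
        super().__init__()
        self.codewords = nn.Parameter(torch.randn(n_codewords, in_channels))
        self.sigma = nn.Parameter(torch.ones(n_codewords))

    def forward(self, x):                       # x: (B, C, H, W)
        feats = x.flatten(2).transpose(1, 2)    # (B, H*W, C)
        cw = self.codewords.unsqueeze(0).expand(x.size(0), -1, -1)
        d = torch.cdist(feats, cw)              # distances to each codeword
        phi = torch.exp(-d * self.sigma.abs())  # RBF similarities
        phi = phi / (phi.sum(dim=2, keepdim=True) + 1e-8)  # soft assignment
        return phi.mean(dim=1)                  # (B, n_codewords) histogram

pool = BoFPooling(in_channels=64, n_codewords=32)
print(pool(torch.randn(2, 64, 13, 13)).shape)  # torch.Size([2, 32])
print(pool(torch.randn(2, 64, 7, 7)).shape)    # same size for smaller images
```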
Social Attention: Modeling Attention in Human Crowds
Title | Social Attention: Modeling Attention in Human Crowds |
Authors | Anirudh Vemula, Katharina Muelling, Jean Oh |
Abstract | Robots that navigate through human crowds need to be able to plan safe, efficient, and human-predictable trajectories. This is a particularly challenging problem, as it requires the robot to predict future human trajectories within a crowd where everyone implicitly cooperates with each other to avoid collisions. Previous approaches to human trajectory prediction have modeled the interactions between humans as a function of proximity. However, proximity is not the whole story: people in our immediate vicinity moving in the same direction may matter less than people further away who might collide with us in the future. In this work, we propose Social Attention, a novel trajectory prediction model that captures the relative importance of each person when navigating in the crowd, irrespective of their proximity. We demonstrate the performance of our method against a state-of-the-art approach on two publicly available crowd datasets and analyze the trained attention model to gain a better understanding of which surrounding agents humans attend to when navigating in a crowd. |
Tasks | Trajectory Prediction |
Published | 2017-10-12 |
URL | http://arxiv.org/abs/1710.04689v2 |
PDF | http://arxiv.org/pdf/1710.04689v2.pdf |
PWC | https://paperswithcode.com/paper/social-attention-modeling-attention-in-human |
Repo | https://github.com/huang-xx/STGAT |
Framework | pytorch |
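A hedged sketch of attending over surrounding agents with learned importance weights rather than a proximity rule. The paper builds on structural RNNs over a spatio-temporal graph; the dot-product attention below is an assumed simplification that conveys the core idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

HID = 64
query, key, value = nn.Linear(HID, HID), nn.Linear(HID, HID), nn.Linear(HID, HID)

def crowd_attention(agent_states):               # (N, HID): one state per agent
    q, k, v = query(agent_states), key(agent_states), value(agent_states)
    scores = q @ k.t() / HID ** 0.5              # (N, N) pairwise importance
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    weights = F.softmax(scores.masked_fill(mask, float('-inf')), dim=1)
    return weights @ v                           # social context per agent

states = torch.randn(6, HID)                     # hidden states of 6 pedestrians
print(crowd_attention(states).shape)             # torch.Size([6, 64])
```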
Learning to Ask: Neural Question Generation for Reading Comprehension
Title | Learning to Ask: Neural Question Generation for Reading Comprehension |
Authors | Xinya Du, Junru Shao, Claire Cardie |
Abstract | We study automatic question generation for sentences from text passages in reading comprehension. We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. In contrast to all previous work, our model does not rely on hand-crafted rules or a sophisticated NLP pipeline; it is instead trainable end-to-end via sequence-to-sequence learning. Automatic evaluation results show that our system significantly outperforms the state-of-the-art rule-based system. In human evaluations, questions generated by our system are also rated as being more natural (i.e., grammaticality, fluency) and as more difficult to answer (in terms of syntactic and lexical divergence from the original text and reasoning needed to answer). |
Tasks | Question Generation, Reading Comprehension |
Published | 2017-04-29 |
URL | http://arxiv.org/abs/1705.00106v1 |
PDF | http://arxiv.org/pdf/1705.00106v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-ask-neural-question-generation |
Repo | https://github.com/MinyeLee/Question-Generation-Pytorch |
Framework | pytorch |
The NarrativeQA Reading Comprehension Challenge
Title | The NarrativeQA Reading Comprehension Challenge |
Authors | Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette |
Abstract | Reading comprehension (RC)—in contrast to information retrieval—requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess RC ability, in both artificial agents and children learning to read. However, existing RC datasets and tasks are dominated by questions that can be solved by selecting answers using superficial information (e.g., local context similarity or global term frequency); they thus fail to test for the essential integrative aspect of RC. To encourage progress on deeper comprehension of language, we present a new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts. These tasks are designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the tasks easily, standard RC models struggle on the tasks presented here. We provide an analysis of the dataset and the challenges it presents. |
Tasks | Information Retrieval, Question Answering, Reading Comprehension |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07040v1 |
PDF | http://arxiv.org/pdf/1712.07040v1.pdf |
PWC | https://paperswithcode.com/paper/the-narrativeqa-reading-comprehension |
Repo | https://github.com/deepmind/narrativeqa |
Framework | none |
KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features
Title | KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features |
Authors | Alberto Garcia-Duran, Mathias Niepert |
Abstract | We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets enriching commonly used knowledge base completion benchmarks with numerical features. The data sets are available under a permissive BSD-3 license. We also investigate the impact numerical features have on the KB completion performance of KBLRN. |
Tasks | Knowledge Base Completion, Representation Learning |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04676v3 |
PDF | http://arxiv.org/pdf/1709.04676v3.pdf |
PWC | https://paperswithcode.com/paper/kblrn-end-to-end-learning-of-knowledge-base |
Repo | https://github.com/nle-ml/mmkb |
Framework | none |
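A schematic product-of-experts scorer in the spirit of KBLRN (the expert forms below are simplified assumptions): a latent-embedding expert, a relational-feature expert, and a numerical-feature expert are combined additively in log space, i.e. multiplied as unnormalized probabilities.

```python
import torch
import torch.nn as nn

N_ENT, N_REL, DIM, N_RELFEAT, N_NUM = 100, 10, 32, 50, 4

ent = nn.Embedding(N_ENT, DIM)          # latent entity embeddings
rel = nn.Embedding(N_REL, DIM)          # latent relation embeddings
rel_expert = nn.Linear(N_RELFEAT, 1)    # expert over relational features
num_expert = nn.Linear(N_NUM, 1)        # expert over numerical features

def score(h, r, t, rel_feats, num_h, num_t):
    latent = (ent(h) * rel(r) * ent(t)).sum(-1)        # DistMult-style expert
    relational = rel_expert(rel_feats).squeeze(-1)
    numerical = num_expert(num_h - num_t).squeeze(-1)  # difference features
    return latent + relational + numerical             # sum of log-experts

s = score(torch.tensor([0]), torch.tensor([1]), torch.tensor([2]),
          torch.randn(1, N_RELFEAT), torch.randn(1, N_NUM), torch.randn(1, N_NUM))
print(s)
```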
Interpreting Blackbox Models via Model Extraction
Title | Interpreting Blackbox Models via Model Extraction |
Authors | Osbert Bastani, Carolyn Kim, Hamsa Bastani |
Abstract | Interpretability has become incredibly important as machine learning is increasingly used to inform consequential decisions. We propose to construct global explanations of complex, blackbox models in the form of a decision tree approximating the original model: as long as the decision tree is a good approximation, it mirrors the computation performed by the blackbox model. We devise a novel algorithm for extracting decision tree explanations that actively samples new training points to avoid overfitting. We evaluate our algorithm on a random forest trained to predict diabetes risk and on a learned controller for cart-pole. Compared to several baselines, our decision trees are both substantially more accurate and, based on a user study, equally or more interpretable. Finally, we describe several insights provided by our interpretations, including a causal issue validated by a physician. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08504v6 |
PDF | http://arxiv.org/pdf/1705.08504v6.pdf |
PWC | https://paperswithcode.com/paper/interpreting-blackbox-models-via-model |
Repo | https://github.com/aclarkData/MachineLearningInterpretability |
Framework | none |
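A minimal sketch of the extraction recipe (the paper actively samples from a distribution fitted to the inputs; the perturbation-based sampling below is a simplification): label extra inputs with the blackbox model, fit a small decision tree to its behavior, and measure fidelity to the blackbox rather than accuracy on ground truth.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
blackbox = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Generate extra inputs and label them with the blackbox, so the tree is
# fit to the model's behavior rather than to the limited training labels.
X_new = X[np.random.randint(len(X), size=5000)] + 0.3 * np.random.randn(5000, 10)
X_aug = np.vstack([X, X_new])
y_aug = blackbox.predict(X_aug)

tree = DecisionTreeClassifier(max_leaf_nodes=16).fit(X_aug, y_aug)
print("fidelity to blackbox:", tree.score(X, blackbox.predict(X)))
```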
Differentially Private Federated Learning: A Client Level Perspective
Title | Differentially Private Federated Learning: A Client Level Perspective |
Authors | Robin C. Geyer, Tassilo Klein, Moin Nabi |
Abstract | Federated learning is a recent advance in privacy protection. In this context, a trusted curator aggregates parameters optimized in decentralized fashion by multiple clients. The resulting model is then distributed back to all clients, ultimately converging to a joint representative model without explicitly having to share the data. However, the protocol is vulnerable to differential attacks, which could originate from any party contributing during federated optimization. In such an attack, a client's contribution during training, and information about their data set, is revealed through analyzing the distributed model. We tackle this problem and propose an algorithm for client-side, differential-privacy-preserving federated optimization. The aim is to hide clients' contributions during training, balancing the trade-off between privacy loss and model performance. Empirical studies suggest that given a sufficiently large number of participating clients, our proposed procedure can maintain client-level differential privacy at only a minor cost in model performance. |
Tasks | |
Published | 2017-12-20 |
URL | http://arxiv.org/abs/1712.07557v2 |
PDF | http://arxiv.org/pdf/1712.07557v2.pdf |
PWC | https://paperswithcode.com/paper/differentially-private-federated-learning-a |
Repo | https://github.com/cyrusgeyer/DiffPrivate_FedLearning |
Framework | tf |
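A sketch of client-level differential privacy in federated averaging, under assumed constants (accounting of the cumulative privacy loss over rounds is omitted here): clip each client's update to bound its influence, then add Gaussian noise to the average before the curator applies it.

```python
import numpy as np

CLIP, NOISE_MULT, N_CLIENTS, DIM = 1.0, 1.1, 100, 10

def private_aggregate(client_updates):
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, CLIP / (norm + 1e-12)))  # bound influence
    mean = np.mean(clipped, axis=0)
    noise = np.random.normal(0, NOISE_MULT * CLIP / len(clipped), size=mean.shape)
    return mean + noise   # hides any single client's contribution

updates = [np.random.randn(DIM) * 0.1 for _ in range(N_CLIENTS)]
print(private_aggregate(updates))
```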
Updating the VESICLE-CNN Synapse Detector
Title | Updating the VESICLE-CNN Synapse Detector |
Authors | Andrew Warrington, Frank Wood |
Abstract | We present an updated version of the VESICLE-CNN algorithm presented by Roncal et al. (2014). The original implementation makes use of a patch-based approach. This methodology is known to be slow due to repeated computations. We update this implementation to be fully convolutional through the use of dilated convolutions, recovering the expanded field of view achieved through the use of strided maxpools, but without a degradation of spatial resolution. This updated implementation performs as well as the original implementation, but with a $600\times$ speedup at test time. We release source code and data into the public domain. |
Tasks | |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11397v1 |
PDF | http://arxiv.org/pdf/1710.11397v1.pdf |
PWC | https://paperswithcode.com/paper/updating-the-vesicle-cnn-synapse-detector |
Repo | https://github.com/andrewwarrington/vesicle-cnn-2 |
Framework | none |
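The key change is replacing strided maxpools with dilated convolutions so the network stays fully convolutional at full resolution. This toy comparison (not the paper's architecture) shows the shape behavior: the dilated convolution grows the receptive field the way a pooled layer would, without downsampling.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 64, 64)

pooled = nn.MaxPool2d(2)(x)   # resolution halved by the strided pool
dilated = nn.Conv2d(16, 16, kernel_size=3, dilation=2, padding=2)(x)

print(pooled.shape)    # torch.Size([1, 16, 32, 32])
print(dilated.shape)   # torch.Size([1, 16, 64, 64]): wider view, no downsampling
```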
A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data
Title | A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data |
Authors | Abien Fred Agarap |
Abstract | Gated Recurrent Unit (GRU) is a recently-developed variation of the long short-term memory (LSTM) unit, both of which are types of recurrent neural network (RNN). Through empirical evidence, both models have been proven to be effective in a wide variety of machine learning tasks such as natural language processing (Wen et al., 2015), speech recognition (Chorowski et al., 2015), and text classification (Yang et al., 2016). Conventionally, like most neural networks, both of the aforementioned RNN variants employ the Softmax function as their final output layer for prediction, and the cross-entropy function for computing their loss. In this paper, we present an amendment to this norm by introducing a linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function is replaced with a margin-based function. While there have been similar studies (Alalshekmubarak & Smith, 2013; Tang, 2013), this proposal is primarily intended for binary classification on intrusion detection using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model performs better than the conventional GRU-Softmax model: the proposed model reached a training accuracy of ~81.54% and a testing accuracy of ~84.15%, while the latter reached a training accuracy of ~63.07% and a testing accuracy of ~70.75%. In addition, the comparison of these two final output layers indicates that the SVM would outperform Softmax in prediction time, a theoretical implication that was supported by the actual training and testing times in the study. |
Tasks | Intrusion Detection, Speech Recognition, Text Classification |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.03082v8 |
PDF | http://arxiv.org/pdf/1709.03082v8.pdf |
PWC | https://paperswithcode.com/paper/a-neural-network-architecture-combining-gated |
Repo | https://github.com/AFAgarap/cnn-svm |
Framework | tf |
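A hedged sketch of the GRU-SVM construction (the feature and layer sizes are placeholders, not the Kyoto dataset's): a GRU encoder with a linear scoring layer trained with a squared-hinge (L2-SVM) loss in place of Softmax cross-entropy.

```python
import torch
import torch.nn as nn

HID, FEATS = 128, 40   # placeholder sizes

gru = nn.GRU(FEATS, HID, batch_first=True)
svm = nn.Linear(HID, 1)   # a single score; its sign gives the class

def l2_svm_loss(scores, labels, C=1.0):
    # labels in {-1, +1}; squared hinge as in an L2-SVM objective
    return C * torch.clamp(1 - labels * scores, min=0).pow(2).mean()

x = torch.randn(32, 10, FEATS)                   # 32 traffic sequences
y = torch.randint(0, 2, (32,)).float() * 2 - 1   # {-1, +1} labels
_, h = gru(x)
scores = svm(h.squeeze(0)).squeeze(-1)
loss = l2_svm_loss(scores, y)
loss.backward()
print(loss.item())
```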
Semantic Autoencoder for Zero-Shot Learning
Title | Semantic Autoencoder for Zero-Shot Learning |
Authors | Elyor Kodirov, Tao Xiang, Shaogang Gong |
Abstract | Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g.~attribute space). However, such a projection function is only concerned with predicting the training seen class semantic representation (e.g.~attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen) classes without training data, a ZSL model typically suffers from the projection domain shift problem. In this work, we present a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the encoder-decoder paradigm, an encoder aims to project a visual feature vector into the semantic space, as in existing ZSL models. However, the decoder exerts an additional constraint: the projection/code must be able to reconstruct the original visual feature. We show that with this additional reconstruction constraint, the projection function learned from the seen classes generalises better to the new unseen classes. Importantly, the encoder and decoder are linear and symmetric, which enables us to develop an extremely efficient learning algorithm. Extensive experiments on six benchmark datasets demonstrate that the proposed SAE significantly outperforms existing ZSL models, with the additional benefit of lower computational cost. Furthermore, when the SAE is applied to the supervised clustering problem, it also beats the state-of-the-art. |
Tasks | Zero-Shot Learning |
Published | 2017-04-26 |
URL | http://arxiv.org/abs/1704.08345v1 |
PDF | http://arxiv.org/pdf/1704.08345v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-autoencoder-for-zero-shot-learning |
Repo | https://github.com/hoseong-kim/sae-pytorch |
Framework | pytorch |
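The efficiency claim comes from the fact that the SAE objective has a closed-form solution. Following the paper's formulation, minimizing $\lVert X - W^\top S\rVert^2 + \lambda\lVert WX - S\rVert^2$ over $W$ leads to the Sylvester equation $SS^\top W + \lambda W XX^\top = (1+\lambda) SX^\top$, which `scipy.linalg.solve_sylvester` solves directly (a sketch on random data):

```python
import numpy as np
from scipy.linalg import solve_sylvester

d, k, n = 100, 20, 500          # visual dim, attribute dim, sample count
X = np.random.randn(d, n)       # visual features, one column per sample
S = np.random.randn(k, n)       # semantic (attribute) representations
lam = 0.5                       # weight on the encoder term

# Zeroing the gradient of ||X - W^T S||^2 + lam * ||W X - S||^2 gives
# the Sylvester equation  S S^T W + lam * W X X^T = (1 + lam) * S X^T.
A = S @ S.T                     # (k, k)
B = lam * (X @ X.T)             # (d, d)
C = (1 + lam) * (S @ X.T)       # (k, d)
W = solve_sylvester(A, B, C)    # closed-form projection, shape (k, d)

print(np.linalg.norm(W @ X - S) / np.linalg.norm(S))  # encoder fit check
```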