October 21, 2019

Paper Group AWR 57

Concolic Testing for Deep Neural Networks

Title Concolic Testing for Deep Neural Networks
Authors Youcheng Sun, Min Wu, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska, Daniel Kroening
Abstract Concolic testing combines program execution and symbolic analysis to explore the execution paths of a software program. This paper presents the first concolic testing approach for Deep Neural Networks (DNNs). More specifically, we formalise coverage criteria for DNNs that have been studied in the literature, and then develop a coherent method for performing concolic testing to increase test coverage. Our experimental results show the effectiveness of the concolic testing approach in both achieving high coverage and finding adversarial examples.
Tasks
Published 2018-04-30
URL http://arxiv.org/abs/1805.00089v2
PDF http://arxiv.org/pdf/1805.00089v2.pdf
PWC https://paperswithcode.com/paper/concolic-testing-for-deep-neural-networks
Repo https://github.com/TrustAI/DeepConcolic
Framework tf
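
The abstract speaks of coverage criteria for DNNs without defining one. As an illustration only, the sketch below computes neuron coverage, a criterion commonly discussed in the DNN testing literature: the fraction of neurons that at least one test input activates above a threshold. The tiny network, the threshold, and all function names are assumptions made for this example; this is not the DeepConcolic implementation.

```python
def relu_layer(inputs, weights, biases):
    """Forward pass of one dense ReLU layer using plain Python lists."""
    outputs = []
    for w_row, b in zip(weights, biases):
        pre_activation = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(max(0.0, pre_activation))
    return outputs


def neuron_coverage(test_inputs, weights, biases, threshold=0.0):
    """Fraction of neurons activated above `threshold` by at least one input."""
    covered = set()
    for x in test_inputs:
        for index, activation in enumerate(relu_layer(x, weights, biases)):
            if activation > threshold:
                covered.add(index)
    return len(covered) / len(biases)


# Hypothetical 2-input, 3-neuron layer used only for illustration.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIASES = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    suite = [[1.0, 0.0], [0.0, 1.0]]
    print(neuron_coverage(suite, WEIGHTS, BIASES))
    # Adding an input that activates the third neuron raises coverage.
    print(neuron_coverage(suite + [[-1.0, -1.0]], WEIGHTS, BIASES))
```

A test generator in this setting would look for new inputs, such as the third one above, that activate neurons the current suite leaves uncovered.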

Active Neural Localization

Title Active Neural Localization
Authors Devendra Singh Chaplot, Emilio Parisotto, Ruslan Salakhutdinov
Abstract Localization is the problem of estimating the location of an autonomous agent from an observation and a map of the environment. Traditional methods of localization, which filter the belief based on the observations, are sub-optimal in the number of steps required, as they do not decide the actions taken by the agent. We propose “Active Neural Localizer”, a fully differentiable neural network that learns to localize accurately and efficiently. The proposed model incorporates ideas of traditional filtering-based localization methods, by using a structured belief of the state with multiplicative interactions to propagate belief, and combines it with a policy model to localize accurately while minimizing the number of steps required for localization. Active Neural Localizer is trained end-to-end with reinforcement learning. We use a variety of simulation environments for our experiments which include random 2D mazes, random mazes in the Doom game engine and a photo-realistic environment in the Unreal game engine. The results on the 2D environments show the effectiveness of the learned policy in an idealistic setting while results on the 3D environments demonstrate the model’s capability of learning the policy and perceptual model jointly from raw-pixel based RGB observations. We also show that a model trained on random textures in the Doom environment generalizes well to a photo-realistic office space environment in the Unreal engine.
Tasks FPS Games, Game of Doom
Published 2018-01-24
URL http://arxiv.org/abs/1801.08214v1
PDF http://arxiv.org/pdf/1801.08214v1.pdf
PWC https://paperswithcode.com/paper/active-neural-localization
Repo https://github.com/devendrachaplot/Neural-Localization
Framework pytorch

Fast Convex Pruning of Deep Neural Networks

Title Fast Convex Pruning of Deep Neural Networks
Authors Alireza Aghasi, Afshin Abdi, Justin Romberg
Abstract We develop a fast, tractable technique called Net-Trim for simplifying a trained neural network. The method is a convex post-processing module, which prunes (sparsifies) a trained network layer by layer, while preserving the internal responses. We present a comprehensive analysis of Net-Trim from both the algorithmic and sample complexity standpoints, centered on a fast, scalable convex optimization program. Our analysis includes consistency results between the initial and retrained models before and after Net-Trim application and guarantees on the number of training samples needed to discover a network that can be expressed using a certain number of nonzero terms. Specifically, if there is a set of weights that uses at most $s$ terms that can re-create the layer outputs from the layer inputs, we can find these weights from $\mathcal{O}(s\log N/s)$ samples, where $N$ is the input size. These theoretical results are similar to those for sparse regression using the Lasso, and our analysis uses some of the same recently-developed tools (namely recent results on the concentration of measure and convex analysis). Finally, we propose an algorithmic framework based on the alternating direction method of multipliers (ADMM), which allows a fast and simple implementation of Net-Trim for network pruning and compression.
Tasks Network Pruning
Published 2018-06-17
URL http://arxiv.org/abs/1806.06457v2
PDF http://arxiv.org/pdf/1806.06457v2.pdf
PWC https://paperswithcode.com/paper/fast-convex-pruning-of-deep-neural-networks
Repo https://github.com/DNNToolBox/Net-Trim-v1
Framework tf

A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

Title A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
Authors Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu Sun
Abstract Narrative story generation is a challenging problem because it demands the generated sentences with tight semantic connections, which has not been well studied by most existing generative models. To address this problem, we propose a skeleton-based model to promote the coherence of generated stories. Different from traditional models that generate a complete sentence at a stroke, the proposed model first generates the most critical phrases, called skeleton, and then expands the skeleton to a complete and fluent sentence. The skeleton is not manually defined, but learned by a reinforcement learning method. Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation. The G-score is improved by 20.1% in the human evaluation. The code is available at https://github.com/lancopku/Skeleton-Based-Generation-Model
Tasks
Published 2018-08-21
URL http://arxiv.org/abs/1808.06945v2
PDF http://arxiv.org/pdf/1808.06945v2.pdf
PWC https://paperswithcode.com/paper/a-skeleton-based-model-for-promoting
Repo https://github.com/lancopku/Skeleton-Based-Generation-Model
Framework tf

Improving Review Representations with User Attention and Product Attention for Sentiment Classification

Title Improving Review Representations with User Attention and Product Attention for Sentiment Classification
Authors Zhen Wu, Xin-Yu Dai, Cunyan Yin, Shujian Huang, Jiajun Chen
Abstract Neural network methods have achieved great success in reviews sentiment classification. Recently, some works achieved improvement by incorporating user and product information to generate a review representation. However, in reviews, we observe that some words or sentences show strong user’s preference, and some others tend to indicate product’s characteristic. The two kinds of information play different roles in determining the sentiment label of a review. Therefore, it is not reasonable to encode user and product information together into one representation. In this paper, we propose a novel framework to encode user and product information. Firstly, we apply two individual hierarchical neural networks to generate two representations, with user attention or with product attention. Then, we design a combined strategy to make full use of the two representations for training and final prediction. The experimental results show that our model obviously outperforms other state-of-the-art methods on IMDB and Yelp datasets. Through the visualization of attention over words related to user or product, we validate our observation mentioned above.
Tasks Sentiment Analysis
Published 2018-01-24
URL http://arxiv.org/abs/1801.07861v1
PDF http://arxiv.org/pdf/1801.07861v1.pdf
PWC https://paperswithcode.com/paper/improving-review-representations-with-user
Repo https://github.com/wuzhen247/HUAPA
Framework tf

dhSegment: A generic deep-learning approach for document segmentation

Title dhSegment: A generic deep-learning approach for document segmentation
Authors Sofia Ares Oliveira, Benoit Seguin, Frederic Kaplan
Abstract In recent years there have been multiple successful attempts tackling document processing problems separately by designing task specific hand-tuned strategies. We argue that the diversity of historical document processing tasks prohibits to solve them one at a time and shows a need for designing generic approaches in order to handle the variability of historical series. In this paper, we address multiple tasks simultaneously such as page extraction, baseline extraction, layout analysis or multiple typologies of illustrations and photograph extraction. We propose an open-source implementation of a CNN-based pixel-wise predictor coupled with task dependent post-processing blocks. We show that a single CNN-architecture can be used across tasks with competitive results. Moreover, most of the task-specific post-processing steps can be decomposed in a small number of simple and standard reusable operations, adding to the flexibility of our approach.
Tasks
Published 2018-04-27
URL https://arxiv.org/abs/1804.10371v2
PDF https://arxiv.org/pdf/1804.10371v2.pdf
PWC https://paperswithcode.com/paper/dhsegment-a-generic-deep-learning-approach
Repo https://github.com/raphaelBarman/dhSegment
Framework tf

The Sound of Pixels

Title The Sound of Pixels
Authors Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba
Abstract We introduce PixelPlayer, a system that, by leveraging large amounts of unlabeled videos, learns to locate image regions which produce sounds and separate the input sounds into a set of components that represents the sound from each pixel. Our approach capitalizes on the natural synchronization of the visual and audio modalities to learn models that jointly parse sounds and images, without requiring additional manual supervision. Experimental results on a newly collected MUSIC dataset show that our proposed Mix-and-Separate framework outperforms several baselines on source separation. Qualitative results suggest our model learns to ground sounds in vision, enabling applications such as independently adjusting the volume of sound sources.
Tasks
Published 2018-04-09
URL http://arxiv.org/abs/1804.03160v4
PDF http://arxiv.org/pdf/1804.03160v4.pdf
PWC https://paperswithcode.com/paper/the-sound-of-pixels
Repo https://github.com/hangzhaomit/Sound-of-Pixels
Framework pytorch

AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation

Title AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation
Authors Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, Mathieu Aubry
Abstract We introduce a method for learning to generate the surface of 3D shapes. Our approach represents a 3D shape as a collection of parametric surface elements and, in contrast to methods generating voxel grids or point clouds, naturally infers a surface representation of the shape. Beyond its novelty, our new shape generation framework, AtlasNet, comes with significant advantages, such as improved precision and generalization capabilities, and the possibility to generate a shape of arbitrary resolution without memory issues. We demonstrate these benefits and compare to strong baselines on the ShapeNet benchmark for two applications: (i) auto-encoding shapes, and (ii) single-view reconstruction from a still image. We also provide results showing its potential for other applications, such as morphing, parametrization, super-resolution, matching, and co-segmentation.
Tasks 3D Surface Generation, Super-Resolution
Published 2018-02-15
URL http://arxiv.org/abs/1802.05384v3
PDF http://arxiv.org/pdf/1802.05384v3.pdf
PWC https://paperswithcode.com/paper/atlasnet-a-papier-mache-approach-to-learning
Repo https://github.com/ThibaultGROUEIX/AtlasNet
Framework pytorch

Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data

Title Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data
Authors Yuanbo Hou, Qiuqiang Kong, Shengchen Li
Abstract Audio tagging aims to predict one or several labels in an audio clip. Many previous works use weakly labelled data (WLD) for audio tagging, where only presence or absence of sound events is known, but the order of sound events is unknown. To use the order information of sound events, we propose sequential labelled data (SLD), where both the presence or absence and the order information of sound events are known. To utilize SLD in audio tagging, we propose a Convolutional Recurrent Neural Network followed by a Connectionist Temporal Classification (CRNN-CTC) objective function to map from an audio clip spectrogram to SLD. Experiments show that CRNN-CTC obtains an Area Under Curve (AUC) score of 0.986 in audio tagging, outperforming the baseline CRNN of 0.908 and 0.815 with Max Pooling and Average Pooling, respectively. In addition, we show CRNN-CTC has the ability to predict the order of sound events in an audio clip.
Tasks Audio Tagging
Published 2018-08-06
URL http://arxiv.org/abs/1808.01935v1
PDF http://arxiv.org/pdf/1808.01935v1.pdf
PWC https://paperswithcode.com/paper/audio-tagging-with-connectionist-temporal
Repo https://github.com/iooops/CS221-Audio-Tagging
Framework none
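
The abstract describes mapping an audio clip to an ordered sequence of sound events with a CTC objective. A minimal sketch of the standard CTC collapsing rule may help: frame-level labels are reduced to an event sequence by merging consecutive repeats and then dropping the blank symbol. The blank token `"-"` and the event names are assumptions for this example, and the snippet covers only the decoding rule, not the CRNN or the training loss.

```python
BLANK = "-"  # assumed blank symbol for this illustration


def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    collapsed = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


if __name__ == "__main__":
    frames = ["-", "dog", "dog", "-", "car", "car", "car", "-", "dog"]
    print(ctc_collapse(frames))  # ['dog', 'car', 'dog']
```

A blank between two identical labels keeps them as separate events, which is how the same sound event can appear twice in the predicted order.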

Kernel machines that adapt to GPUs for effective large batch training

Title Kernel machines that adapt to GPUs for effective large batch training
Authors Siyuan Ma, Mikhail Belkin
Abstract Modern machine learning models are typically trained using Stochastic Gradient Descent (SGD) on massively parallel computing resources such as GPUs. Increasing mini-batch size is a simple and direct way to utilize the parallel computing capacity. For small batch an increase in batch size results in the proportional reduction in the training time, a phenomenon known as linear scaling. However, increasing batch size beyond a certain value leads to no further improvement in training time. In this paper we develop the first analytical framework that extends linear scaling to match the parallel computing capacity of a resource. The framework is designed for a class of classical kernel machines. It automatically modifies a standard kernel machine to output a mathematically equivalent prediction function, yet allowing for extended linear scaling, i.e., higher effective parallelization and faster training time on given hardware. The resulting algorithms are accurate, principled and very fast. For example, using a single Titan Xp GPU, training on ImageNet with $1.3\times 10^6$ data points and $1000$ labels takes under an hour, while smaller datasets, such as MNIST, take seconds. As the parameters are chosen analytically, based on the theoretical bounds, little tuning beyond selecting the kernel and the kernel parameter is needed, further facilitating the practical use of these methods.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.06144v3
PDF http://arxiv.org/pdf/1806.06144v3.pdf
PWC https://paperswithcode.com/paper/kernel-machines-that-adapt-to-gpus-for
Repo https://github.com/EigenPro/EigenPro2
Framework tf

Deepcode: Feedback Codes via Deep Learning

Title Deepcode: Feedback Codes via Deep Learning
Authors Hyeji Kim, Yihan Jiang, Sreeram Kannan, Sewoong Oh, Pramod Viswanath
Abstract The design of codes for communicating reliably over a statistically well defined channel is an important endeavor involving deep mathematical research and wide-ranging practical applications. In this work, we present the first family of codes obtained via deep learning, which significantly beats state-of-the-art codes designed over several decades of research. The communication channel under consideration is the Gaussian noise channel with feedback, whose study was initiated by Shannon; feedback is known theoretically to improve reliability of communication, but no practical codes that do so have ever been successfully constructed. We break this logjam by integrating information theoretic insights harmoniously with recurrent-neural-network based encoders and decoders to create novel codes that outperform known codes by 3 orders of magnitude in reliability. We also demonstrate several desirable properties of the codes: (a) generalization to larger block lengths, (b) composability with known codes, (c) adaptation to practical constraints. This result also has broader ramifications for coding theory: even when the channel has a clear mathematical model, deep learning methodologies, when combined with channel-specific information-theoretic insights, can potentially beat state-of-the-art codes constructed over decades of mathematical research.
Tasks
Published 2018-07-02
URL http://arxiv.org/abs/1807.00801v1
PDF http://arxiv.org/pdf/1807.00801v1.pdf
PWC https://paperswithcode.com/paper/deepcode-feedback-codes-via-deep-learning
Repo https://github.com/hyejikim1/Deepcode
Framework tf

Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics using CNNs

Title Bonnet: An Open-Source Training and Deployment Framework for Semantic Segmentation in Robotics using CNNs
Authors Andres Milioto, Cyrill Stachniss
Abstract The ability to interpret a scene is an important capability for a robot that is supposed to interact with its environment. The knowledge of what is in front of the robot is, for example, relevant for navigation, manipulation, or planning. Semantic segmentation labels each pixel of an image with a class label and thus provides a detailed semantic annotation of the surroundings to the robot. Convolutional neural networks (CNNs) are popular methods for addressing this type of problem. The available software for training and the integration of CNNs for real robots, however, is quite fragmented and often difficult to use for non-experts, despite the availability of several high-quality open-source frameworks for neural network implementation and training. In this paper, we propose a tool called Bonnet, which addresses this fragmentation problem by building a higher abstraction that is specific for the semantic segmentation task. It provides a modular approach to simplify the training of a semantic segmentation CNN independently of the used dataset and the intended task. Furthermore, we also address the deployment on a real robotic platform. Thus, we do not propose a new CNN approach in this paper. Instead, we provide a stable and easy-to-use tool to make this technology more approachable in the context of autonomous systems. In this sense, we aim at closing a gap between computer vision research and its use in robotics research. We provide an open-source codebase for training and deployment. The training interface is implemented in Python using TensorFlow and the deployment interface provides a C++ library that can be easily integrated in an existing robotics codebase, a ROS node, and two standalone applications for label prediction in images and videos.
Tasks Semantic Segmentation
Published 2018-02-25
URL http://arxiv.org/abs/1802.08960v2
PDF http://arxiv.org/pdf/1802.08960v2.pdf
PWC https://paperswithcode.com/paper/bonnet-an-open-source-training-and-deployment
Repo https://github.com/PRBonn/bonnet
Framework tf

A near Pareto optimal approach to student-supervisor allocation with two sided preferences and workload balance

Title A near Pareto optimal approach to student-supervisor allocation with two sided preferences and workload balance
Authors Victor Sanchez-Anguix, Rithin Chalumuri, Reyhan Aydogan, Vicente Julian
Abstract The problem of allocating students to supervisors for the development of a personal project or a dissertation is a crucial activity in the higher education environment, as it enables students to get feedback on their work from an expert and improve their personal, academic, and professional abilities. In this article, we propose a multi-objective and near Pareto optimal genetic algorithm for the allocation of students to supervisors. The allocation takes into consideration the students and supervisors’ preferences on research/project topics, the lower and upper supervision quotas of supervisors, as well as the workload balance amongst supervisors. We introduce novel mutation and crossover operators for the student-supervisor allocation problem. The experiments carried out show that the components of the genetic algorithm are more apt for the problem than classic components, and that the genetic algorithm is capable of producing allocations that are near Pareto optimal in a reasonable time.
Tasks
Published 2018-12-16
URL http://arxiv.org/abs/1812.06474v1
PDF http://arxiv.org/pdf/1812.06474v1.pdf
PWC https://paperswithcode.com/paper/a-near-pareto-optimal-approach-to-student
Repo https://github.com/rithinch/Student-Supervisor-Allocation
Framework none

Towards a Better Metric for Evaluating Question Generation Systems

Title Towards a Better Metric for Evaluating Question Generation Systems
Authors Preksha Nema, Mitesh M. Khapra
Abstract There has always been criticism for using $n$-gram based similarity metrics, such as BLEU, NIST, etc, for evaluating the performance of NLG systems. However, these metrics continue to remain popular and are recently being used for evaluating the performance of systems which automatically generate questions from documents, knowledge graphs, images, etc. Given the rising interest in such automatic question generation (AQG) systems, it is important to objectively examine whether these metrics are suitable for this task. In particular, it is important to verify whether such metrics used for evaluating AQG systems focus on answerability of the generated question by preferring questions which contain all relevant information such as question type (Wh-types), entities, relations, etc. In this work, we show that current automatic evaluation metrics based on $n$-gram similarity do not always correlate well with human judgments about answerability of a question. To alleviate this problem and as a first step towards better evaluation metrics for AQG, we introduce a scoring function to capture answerability and show that when this scoring function is integrated with existing metrics, they correlate significantly better with human judgments. The scripts and data developed as a part of this work are made publicly available at https://github.com/PrekshaNema25/Answerability-Metric
Tasks Knowledge Graphs, Question Generation
Published 2018-08-30
URL http://arxiv.org/abs/1808.10192v2
PDF http://arxiv.org/pdf/1808.10192v2.pdf
PWC https://paperswithcode.com/paper/towards-a-better-metric-for-evaluating
Repo https://github.com/PrekshaNema25/Answerability-Metric
Framework none
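
To make the abstract's concern concrete, the sketch below computes a clipped n-gram precision, the building block of BLEU-style metrics, and shows a candidate that scores perfectly while dropping the question word. It is an illustration under assumptions: the example questions are invented, the function omits BLEU's brevity penalty and multi-order averaging, and it is not the answerability scoring function proposed in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of `candidate` against one `reference`."""
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


if __name__ == "__main__":
    reference = "who wrote the novel dracula"
    # The question word is missing, yet every unigram matches the reference.
    print(ngram_precision("wrote the novel dracula", reference, 1))  # 1.0
    # This candidate keeps the question word but scores lower.
    print(ngram_precision("who wrote it", reference, 1))
```

The first candidate is no longer clearly a question, yet it receives the maximum precision, which is the kind of mismatch with answerability that the abstract points to.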

Boosting Black Box Variational Inference

Title Boosting Black Box Variational Inference
Authors Francesco Locatello, Gideon Dresdner, Rajiv Khanna, Isabel Valera, Gunnar Rätsch
Abstract Approximating a probability density in a tractable manner is a central task in Bayesian statistics. Variational Inference (VI) is a popular technique that achieves tractability by choosing a relatively simple variational family. Borrowing ideas from the classic boosting framework, recent approaches attempt to boost VI by replacing the selection of a single density with a greedily constructed mixture of densities. In order to guarantee convergence, previous works impose stringent assumptions that require significant effort for practitioners. Specifically, they require a custom implementation of the greedy step (called the LMO) for every probabilistic model with respect to an unnatural variational family of truncated distributions. Our work fixes these issues with novel theoretical and algorithmic insights. On the theoretical side, we show that boosting VI satisfies a relaxed smoothness assumption which is sufficient for the convergence of the functional Frank-Wolfe (FW) algorithm. Furthermore, we rephrase the LMO problem and propose to maximize the Residual ELBO (RELBO) which replaces the standard ELBO optimization in VI. These theoretical enhancements allow for black box implementation of the boosting subroutine. Finally, we present a stopping criterion drawn from the duality gap in the classic FW analyses and exhaustive experiments to illustrate the usefulness of our theoretical and algorithmic contributions.
Tasks
Published 2018-06-06
URL http://arxiv.org/abs/1806.02185v5
PDF http://arxiv.org/pdf/1806.02185v5.pdf
PWC https://paperswithcode.com/paper/boosting-black-box-variational-inference
Repo https://github.com/ratschlab/boosting-bbvi
Framework tf
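
The abstract describes a greedily constructed mixture of densities built with a Frank-Wolfe scheme. The sketch below shows only how mixture weights evolve under the classic Frank-Wolfe convex-combination update, with the textbook step size 2/(t+2). That step size is an assumption for this example, since the abstract does not state which one the paper uses, and the greedy (LMO/RELBO) step that picks each new component is deliberately left out.

```python
def frank_wolfe_mixture_weights(num_iterations):
    """Weights of a mixture after `num_iterations` Frank-Wolfe updates.

    Each update rescales the existing components by (1 - gamma) and gives
    the newly selected component weight gamma, so the weights always sum to 1.
    """
    weights = [1.0]  # start from a single initial component
    for t in range(1, num_iterations + 1):
        gamma = 2.0 / (t + 2.0)  # assumed textbook step size
        weights = [(1.0 - gamma) * w for w in weights]
        weights.append(gamma)
    return weights


if __name__ == "__main__":
    for t in range(4):
        print(t, frank_wolfe_mixture_weights(t))
```

Because every update is a convex combination, the result stays a valid mixture however each new component is chosen.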