October 21, 2019

Paper Group AWR 79

Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping

Title Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping
Authors Akira Taniguchi, Yoshinobu Hagiwara, Tadahiro Taniguchi, Tetsunari Inamura
Abstract We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on an unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, our original algorithm had limited estimation accuracy owing to the influence of the early stages of learning, and its computational complexity increased as training data were added. Therefore, we introduce techniques such as fixed-lag rejuvenation to reduce the calculation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm exceeds the original algorithm and is comparable to batch learning. In addition, the calculation time of the proposed algorithm does not depend on the amount of training data and becomes constant for each step of the scalable algorithm. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots.
Tasks
Published 2018-03-09
URL https://arxiv.org/abs/1803.03481v3
PDF https://arxiv.org/pdf/1803.03481v3.pdf
PWC https://paperswithcode.com/paper/improved-and-scalable-online-learning-of
Repo https://github.com/a-taniguchi/SpCoSLAM2
Framework none

Images & Recipes: Retrieval in the cooking context

Title Images & Recipes: Retrieval in the cooking context
Authors Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Matthieu Cord
Abstract Recent advances in machine learning have enabled new use cases in domains such as cooking, giving rise to computational cuisine. In this paper, we tackle the picture-recipe alignment problem, with large-scale retrieval as the target application (finding a recipe given a picture, and vice versa). Our approach is validated on the Recipe1M dataset, composed of one million image-recipe pairs and additional class information, for which we achieve state-of-the-art results.
Tasks
Published 2018-05-02
URL http://arxiv.org/abs/1805.00900v1
PDF http://arxiv.org/pdf/1805.00900v1.pdf
PWC https://paperswithcode.com/paper/images-recipes-retrieval-in-the-cooking
Repo https://github.com/Cadene/recipe1m.bootstrap.pytorch
Framework pytorch

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling

Title No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Authors Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang
Abstract Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem. Different from captions, stories have more expressive language styles and contain many imaginary concepts that do not appear in the images. Thus it poses challenges to behavioral cloning algorithms. Furthermore, due to the limitations of automatic metrics on evaluating story quality, reinforcement learning methods with hand-crafted rewards also face difficulties in gaining an overall performance boost. Therefore, we propose an Adversarial REward Learning (AREL) framework to learn an implicit reward function from human demonstrations, and then optimize policy search with the learned reward function. Though automatic evaluation indicates slight performance boost over state-of-the-art (SOTA) methods in cloning expert behaviors, human evaluation shows that our approach achieves significant improvement in generating more human-like stories than SOTA systems.
Tasks Image Captioning, Visual Storytelling
Published 2018-04-24
URL http://arxiv.org/abs/1804.09160v2
PDF http://arxiv.org/pdf/1804.09160v2.pdf
PWC https://paperswithcode.com/paper/no-metrics-are-perfect-adversarial-reward
Repo https://github.com/littlekobe/AREL
Framework pytorch

Extending Pretrained Segmentation Networks with Additional Anatomical Structures

Title Extending Pretrained Segmentation Networks with Additional Anatomical Structures
Authors Firat Ozdemir, Orcun Goksel
Abstract Comprehensive surgical planning requires complex patient-specific anatomical models. For instance, functional musculoskeletal simulations necessitate all relevant structures to be segmented, which could be performed in real-time using deep neural networks given sufficient annotated samples. Such large datasets of multiple structure annotations are costly to procure and are often unavailable in practice. Nevertheless, annotations from different studies and centers can be readily available, or become available in the future in an incremental fashion. We propose a class-incremental segmentation framework for extending a deep network trained for some anatomical structure to yet another structure using a small incremental annotation set. Through distilling knowledge from the current state of the framework, we bypass the need for a full retraining. This is a meta-method to extend any choice of desired deep segmentation network with only a minor addition per structure, which makes it suitable for lifelong class-incremental learning and applicable also for future deep neural network architectures. We evaluated our methods on a public knee dataset of 100 MR volumes. By varying the incremental annotation ratio, we show how our proposed method can retain the previous anatomical structure segmentation performance better than the conventional finetuning approach. In addition, our framework inherently exploits transferable knowledge from previously trained structures to incremental tasks, demonstrated by superior results compared to non-incremental training. With the presented method, new anatomical structures can be learned without catastrophic forgetting of older structures and without extensive increase of memory and complexity.
Tasks
Published 2018-11-12
URL https://arxiv.org/abs/1811.04634v2
PDF https://arxiv.org/pdf/1811.04634v2.pdf
PWC https://paperswithcode.com/paper/extending-pretrained-segmentation-networks
Repo https://github.com/firatozdemir/LwfSeg-AeiSeg
Framework tf
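
The abstract above mentions distilling knowledge from the current network so that earlier structures are not forgotten. As a rough illustration only, the sketch below shows a generic soft-target distillation loss; the function names, the temperature value, and the plain-Python lists of logits are assumptions made for this example and are not taken from the authors' repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits_old_classes, temperature=2.0):
    """Cross-entropy between the frozen old model's softened outputs
    (soft targets) and the extended model's outputs on the old classes.

    Hypothetical helper: a generic distillation term, not the paper's
    exact loss.
    """
    targets = softmax(old_logits, temperature)
    preds = softmax(new_logits_old_classes, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, preds))


# The loss is smallest when the extended model reproduces the old outputs.
same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
drifted = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
print(same < drifted)
```

In a class-incremental setup, a term like this would typically be added to the usual segmentation loss on the newly annotated structure, so that one objective covers both old and new classes.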

Temporal Regularization in Markov Decision Process

Title Temporal Regularization in Markov Decision Process
Authors Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Abstract Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
Tasks Atari Games
Published 2018-11-01
URL http://arxiv.org/abs/1811.00429v2
PDF http://arxiv.org/pdf/1811.00429v2.pdf
PWC https://paperswithcode.com/paper/temporal-regularization-in-markov-decision
Repo https://github.com/pierthodo/temporal_regularization
Framework tf
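
The abstract above describes smoothing value estimates along a trajectory. One possible reading, sketched below under stated assumptions, is a tabular TD(0) update whose bootstrap term mixes the value of the next state with the value of the previously visited state. The function name, the mixing weight `beta`, and the episode format are hypothetical choices for this example, not the authors' implementation.

```python
def td0_temporal_reg(episodes, n_states, alpha=0.1, gamma=0.9, beta=0.2, sweeps=200):
    """Tabular TD(0) with a temporally smoothed bootstrap target.

    episodes: list of episodes, each a list of (state, reward, next_state)
              tuples; next_state is None on the terminal transition.
    beta:     assumed weight on the previous state's value (0 gives plain TD(0)).
    """
    v = [0.0] * n_states
    for _ in range(sweeps):
        for episode in episodes:
            prev_state = None
            for state, reward, next_state in episode:
                v_next = 0.0 if next_state is None else v[next_state]
                if prev_state is None:
                    # No history yet at the start of an episode.
                    bootstrap = v_next
                else:
                    # Mix the next-state value with the previous-state value.
                    bootstrap = (1.0 - beta) * v_next + beta * v[prev_state]
                target = reward + gamma * bootstrap
                v[state] += alpha * (target - v[state])
                prev_state = state
    return v


# Two-state chain: 0 -> 1 -> terminal, with reward 1 on the last step.
chain = [[(0, 0.0, 1), (1, 1.0, None)]]
print(td0_temporal_reg(chain, 2, beta=0.0))  # plain TD(0)
print(td0_temporal_reg(chain, 2, beta=0.2))  # smoothed along the trajectory
```

Setting `beta` above zero pulls each estimate toward the value of the state visited just before it, which is where the bias mentioned in the abstract comes from.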

Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples

Title Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples
Authors Arindam Mitra, Chitta Baral
Abstract Over the years the Artificial Intelligence (AI) community has produced several datasets which have given the machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms that aim at learning logic programs, namely the Inductive Logic Programming algorithms, have often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for the algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV. This paper is under consideration for acceptance in TPLP.
Tasks Handwritten Digit Recognition, Question Answering
Published 2018-02-22
URL http://arxiv.org/abs/1802.07966v2
PDF http://arxiv.org/pdf/1802.07966v2.pdf
PWC https://paperswithcode.com/paper/incremental-and-iterative-learning-of-answer
Repo https://github.com/ari9dam/ILPME
Framework none

DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification

Title DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification
Authors Wentao Zhu, Chaochun Liu, Wei Fan, Xiaohui Xie
Abstract In this work, we present a fully automated lung computed tomography (CT) cancer diagnosis system, DeepLung. DeepLung consists of two components, nodule detection (identifying the locations of candidate nodules) and classification (classifying candidate nodules into benign or malignant). Considering the 3D nature of lung CT data and the compactness of dual path networks (DPN), two deep 3D DPN are designed for nodule detection and classification respectively. Specifically, a 3D Faster Regions with Convolutional Neural Net (R-CNN) is designed for nodule detection with 3D dual path blocks and a U-net-like encoder-decoder structure to effectively learn nodule features. For nodule classification, gradient boosting machine (GBM) with 3D dual path network features is proposed. The nodule classification subnetwork was validated on a public dataset from LIDC-IDRI, on which it achieved better performance than state-of-the-art approaches and surpassed the performance of experienced doctors based on image modality. Within the DeepLung system, candidate nodules are detected first by the nodule detection subnetwork, and nodule diagnosis is conducted by the classification subnetwork. Extensive experimental results demonstrate that DeepLung has performance comparable to experienced doctors both for the nodule-level and patient-level diagnosis on the LIDC-IDRI dataset. Code: https://github.com/uci-cbcl/DeepLung.git
Tasks Computed Tomography (CT), Lung Nodule Classification
Published 2018-01-25
URL http://arxiv.org/abs/1801.09555v1
PDF http://arxiv.org/pdf/1801.09555v1.pdf
PWC https://paperswithcode.com/paper/deeplung-deep-3d-dual-path-nets-for-automated
Repo https://github.com/uci-cbcl/DeepLung
Framework pytorch

A Hierarchical Framework for Relation Extraction with Reinforcement Learning

Title A Hierarchical Framework for Relation Extraction with Reinforcement Learning
Authors Ryuichi Takanobu, Tianyang Zhang, Jiexi Liu, Minlie Huang
Abstract Most existing methods determine relation types only after all the entities have been recognized, thus the interaction between relation types and entity mentions is not fully modeled. This paper presents a novel paradigm to deal with relation extraction by regarding the related entities as the arguments of a relation. We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types. The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, so that it is more feasible and natural to deal with overlapping relations. Our model was evaluated on public datasets collected via distant supervision, and results show that it gains better performance than existing methods and is more powerful for extracting overlapping relations.
Tasks Entity Extraction, Hierarchical Reinforcement Learning, Relation Extraction
Published 2018-11-09
URL http://arxiv.org/abs/1811.03925v1
PDF http://arxiv.org/pdf/1811.03925v1.pdf
PWC https://paperswithcode.com/paper/a-hierarchical-framework-for-relation
Repo https://github.com/truthless11/HRL-RE
Framework pytorch

Deep learning improved by biological activation functions

Title Deep learning improved by biological activation functions
Authors Gardave S Bhumbra
Abstract 'Biologically inspired' activation functions, such as the logistic sigmoid, have been instrumental in the historical advancement of machine learning. However in the field of deep learning, they have been largely displaced by rectified linear units (ReLU) or similar functions, such as its exponential linear unit (ELU) variant, to mitigate the effects of vanishing gradients associated with error back-propagation. The logistic sigmoid however does not represent the true input-output relation in neuronal cells under physiological conditions. Here, bionodal root unit (BRU) activation functions are introduced, exhibiting input-output non-linearities that are substantially more biologically plausible since their functional form is based on known biophysical properties of neuronal cells. In order to evaluate the learning performance of BRU activations, deep networks are constructed with identical architectures except differing in their transfer functions (ReLU, ELU, and BRU). Multilayer perceptrons, stacked auto-encoders, and convolutional networks are used to test supervised and unsupervised learning based on the MNIST and CIFAR-10/100 datasets. Comparisons of learning performance, quantified using loss and error measurements, demonstrate that bionodal networks both train faster than their ReLU and ELU counterparts and result in the best generalised models even in the absence of formal regularisation. These results therefore suggest that revisiting the detailed properties of biological neurones and their circuitry might prove invaluable in the field of deep learning for the future.
Tasks
Published 2018-03-19
URL http://arxiv.org/abs/1804.11237v2
PDF http://arxiv.org/pdf/1804.11237v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-improved-by-biological
Repo https://github.com/takyamamoto/BRU_chainer
Framework none

A Span Selection Model for Semantic Role Labeling

Title A Span Selection Model for Semantic Role Labeling
Authors Hiroki Ouchi, Hiroyuki Shindo, Yuji Matsumoto
Abstract We present a simple and accurate span-based model for semantic role labeling (SRL). Our model directly takes into account all possible argument spans and scores them for each label. At decoding time, we greedily select higher scoring labeled spans. One advantage of our model is to allow us to design and use span-level features, that are difficult to use in token-based BIO tagging approaches. Experimental results demonstrate that our ensemble model achieves the state-of-the-art results, 87.4 F1 and 87.0 F1 on the CoNLL-2005 and 2012 datasets, respectively.
Tasks Semantic Role Labeling
Published 2018-10-04
URL http://arxiv.org/abs/1810.02245v1
PDF http://arxiv.org/pdf/1810.02245v1.pdf
PWC https://paperswithcode.com/paper/a-span-selection-model-for-semantic-role
Repo https://github.com/asadovsky/nn
Framework tf
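
The abstract above describes scoring every candidate argument span for each label and then greedily keeping the higher-scoring labeled spans. The sketch below illustrates that decoding idea under stated assumptions: the scores are supplied by hand rather than by a model, and the rule that selected spans must not overlap is an assumption of this example. The helper names are hypothetical.

```python
def enumerate_spans(n_tokens):
    """All (start, end) spans over n_tokens, with inclusive indices."""
    return [(i, j) for i in range(n_tokens) for j in range(i, n_tokens)]


def greedy_select_spans(scored_spans):
    """Greedy decoding over labeled spans.

    scored_spans: iterable of (start, end, label, score) tuples.
    Spans are visited from highest to lowest score; a span is kept only
    if it does not overlap a span that was already kept (assumed constraint).
    """
    selected = []
    for start, end, label, score in sorted(
        scored_spans, key=lambda span: span[3], reverse=True
    ):
        overlaps = any(start <= e and s <= end for s, e, _, _ in selected)
        if not overlaps:
            selected.append((start, end, label, score))
    return sorted(selected)


candidates = [(0, 1, "A0", 0.9), (1, 2, "A1", 0.8), (3, 4, "A1", 0.7)]
print(greedy_select_spans(candidates))
```

Because spans are scored as whole units, span-level features such as width or boundary tokens can feed directly into each score, which is harder to arrange in token-by-token BIO tagging.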

Constituency Parsing with a Self-Attentive Encoder

Title Constituency Parsing with a Self-Attentive Encoder
Authors Nikita Kitaev, Dan Klein
Abstract We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. The use of attention makes explicit the manner in which information is propagated between different locations in the sentence, which we use to both analyze our model and propose potential improvements. For example, we find that separating positional and content information in the encoder can lead to improved parsing accuracy. Additionally, we evaluate different approaches for lexical representation. Our parser achieves new state-of-the-art results for single models trained on the Penn Treebank: 93.55 F1 without the use of any external data, and 95.13 F1 when using pre-trained word representations. Our parser also outperforms the previous best-published accuracy figures on 8 of the 9 languages in the SPMRL dataset.
Tasks Constituency Parsing
Published 2018-05-02
URL http://arxiv.org/abs/1805.01052v1
PDF http://arxiv.org/pdf/1805.01052v1.pdf
PWC https://paperswithcode.com/paper/constituency-parsing-with-a-self-attentive
Repo https://github.com/asadovsky/nn
Framework tf

Online Second Order Methods for Non-Convex Stochastic Optimizations

Title Online Second Order Methods for Non-Convex Stochastic Optimizations
Authors Xi-Lin Li
Abstract This paper proposes a family of online second order methods for possibly non-convex stochastic optimizations based on the theory of preconditioned stochastic gradient descent (PSGD), which can be regarded as an enhanced stochastic Newton method with the ability to handle gradient noise and non-convexity simultaneously. We have improved the implementations of the original PSGD in several ways, e.g., new forms of preconditioners, more accurate Hessian vector product calculations, and better numerical stability with vanishing or ill-conditioned Hessian. We have also revealed the relationship between feature normalization and PSGD with Kronecker product preconditioners, which explains the excellent performance of Kronecker product preconditioners in deep neural network learning. A software package (https://github.com/lixilinx/psgd_tf) implemented in Tensorflow is provided to compare variations of stochastic gradient descent (SGD) and PSGD with five different preconditioners on a wide range of benchmark problems with commonly used neural network architectures, e.g., convolutional and recurrent neural networks. Experimental results clearly demonstrate the advantages of PSGD in terms of generalization performance and convergence speed.
Tasks
Published 2018-03-26
URL http://arxiv.org/abs/1803.09383v3
PDF http://arxiv.org/pdf/1803.09383v3.pdf
PWC https://paperswithcode.com/paper/online-second-order-methods-for-non-convex
Repo https://github.com/lixilinx/psgd_tf
Framework tf

Predicting Semantic Relations using Global Graph Properties

Title Predicting Semantic Relations using Global Graph Properties
Authors Yuval Pinter, Jacob Eisenstein
Abstract Semantic graphs, such as WordNet, are resources which curate natural language on two distinguishable layers. On the local level, individual relations between synsets (semantic building blocks) such as hypernymy and meronymy enhance our understanding of the words used to express their meanings. Globally, analysis of graph-theoretic properties of the entire net sheds light on the structure of human language as a whole. In this paper, we combine global and local properties of semantic graphs through the framework of Max-Margin Markov Graph Models (M3GM), a novel extension of Exponential Random Graph Model (ERGM) that scales to large multi-relational graphs. We demonstrate how such global modeling improves performance on the local task of predicting semantic relations between synsets, yielding new state-of-the-art results on the WN18RR dataset, a challenging version of WordNet link prediction in which “easy” reciprocal cases are removed. In addition, the M3GM model identifies multirelational motifs that are characteristic of well-formed lexical semantic ontologies.
Tasks Link Prediction
Published 2018-08-27
URL http://arxiv.org/abs/1808.08644v1
PDF http://arxiv.org/pdf/1808.08644v1.pdf
PWC https://paperswithcode.com/paper/predicting-semantic-relations-using-global
Repo https://github.com/thukg/KG-Reading-List
Framework none

Loosely-Coupled Semi-Direct Monocular SLAM

Title Loosely-Coupled Semi-Direct Monocular SLAM
Authors Seong Hun Lee, Javier Civera
Abstract We propose a novel semi-direct approach for monocular simultaneous localization and mapping (SLAM) that combines the complementary strengths of direct and feature-based methods. The proposed pipeline loosely couples direct odometry and feature-based SLAM to perform three levels of parallel optimizations: (1) photometric bundle adjustment (BA) that jointly optimizes the local structure and motion, (2) geometric BA that refines keyframe poses and associated feature map points, and (3) pose graph optimization to achieve global map consistency in the presence of loop closures. This is achieved in real-time by limiting the feature-based operations to marginalized keyframes from the direct odometry module. Exhaustive evaluation on two benchmark datasets demonstrates that our system outperforms the state-of-the-art monocular odometry and SLAM systems in terms of overall accuracy and robustness.
Tasks Simultaneous Localization and Mapping
Published 2018-07-26
URL http://arxiv.org/abs/1807.10073v3
PDF http://arxiv.org/pdf/1807.10073v3.pdf
PWC https://paperswithcode.com/paper/loosely-coupled-semi-direct-monocular-slam
Repo https://github.com/sunghoon031/LCSD_SLAM
Framework none

Heron Inference for Bayesian Graphical Models

Title Heron Inference for Bayesian Graphical Models
Authors Daniel Rugeles, Zhen Hai, Gao Cong, Manoranjan Dash
Abstract Bayesian graphical models have been shown to be a powerful tool for discovering uncertainty and causal structure from real-world data in many application fields. Current inference methods primarily follow different kinds of trade-offs between computational complexity and predictive accuracy. At one end of the spectrum, variational inference approaches perform well in computational efficiency, while at the other end, Gibbs sampling approaches are known to be relatively accurate for prediction in practice. In this paper, we extend an existing Gibbs sampling method, and propose a new deterministic Heron inference (Heron) for a family of Bayesian graphical models. In addition to the support for nontrivial distributability, one more benefit of Heron is that it is able to not only allow us to easily assess the convergence status but also largely improve the running efficiency. We evaluate Heron against the standard collapsed Gibbs sampler and state-of-the-art state augmentation method in inference for well-known graphical models. Experimental results using publicly available real-life data have demonstrated that Heron significantly outperforms the baseline methods for inferring Bayesian graphical models.
Tasks
Published 2018-02-19
URL http://arxiv.org/abs/1802.06526v1
PDF http://arxiv.org/pdf/1802.06526v1.pdf
PWC https://paperswithcode.com/paper/heron-inference-for-bayesian-graphical-models
Repo https://github.com/danrugeles/Heron
Framework none