October 21, 2019

Paper Group AWR 79

Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping. Images & Recipes: Retrieval in the cooking context. No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling. Extending Pretrained Segmentation Networks with Additional Anatomical Structures. Temporal Regularization in Markov Decision Proc …

Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping


Title	Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping
Authors	Akira Taniguchi, Yoshinobu Hagiwara, Tadahiro Taniguchi, Tetsunari Inamura
Abstract	We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on an unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, our original algorithm had limited estimation accuracy owing to the influence of the early stages of learning, and increased computational complexity with added training data. Therefore, we introduce techniques such as fixed-lag rejuvenation to reduce the calculation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm exceeds the original algorithm and is comparable to batch learning. In addition, the calculation time of the proposed algorithm does not depend on the amount of training data and becomes constant for each step of the scalable algorithm. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots.
Tasks
Published	2018-03-09
URL	https://arxiv.org/abs/1803.03481v3
PDF	https://arxiv.org/pdf/1803.03481v3.pdf
PWC	https://paperswithcode.com/paper/improved-and-scalable-online-learning-of
Repo	https://github.com/a-taniguchi/SpCoSLAM2
Framework	none

Images & Recipes: Retrieval in the cooking context


Title	Images & Recipes: Retrieval in the cooking context
Authors	Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Matthieu Cord
Abstract	Recent advances in the machine learning community have allowed different use cases to emerge, such as its association with domains like cooking, which created computational cuisine. In this paper, we tackle the picture-recipe alignment problem, with large-scale retrieval as the target application (finding a recipe given a picture, and vice versa). Our approach is validated on the Recipe1M dataset, composed of one million image-recipe pairs and additional class information, for which we achieve state-of-the-art results.
Tasks
Published	2018-05-02
URL	http://arxiv.org/abs/1805.00900v1
PDF	http://arxiv.org/pdf/1805.00900v1.pdf
PWC	https://paperswithcode.com/paper/images-recipes-retrieval-in-the-cooking
Repo	https://github.com/Cadene/recipe1m.bootstrap.pytorch
Framework	pytorch
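
The retrieval task described in this abstract (finding a recipe given a picture, and vice versa) can be illustrated with a minimal sketch. It assumes that images and recipes have already been mapped into a shared embedding space and that candidates are ranked by cosine similarity; neither detail is stated in the abstract, and the embeddings and the `retrieve` helper below are hypothetical toy values, not the authors' model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, candidates, k=1):
    """Return the ids of the k candidates most similar to the query.

    `candidates` maps an id to its embedding vector.
    """
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(query, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings (assumption: produced by some trained encoders).
image_embedding = [0.9, 0.1, 0.0]
recipe_embeddings = {
    "pancakes": [1.0, 0.2, 0.0],
    "salad": [0.2, 1.0, 0.0],
    "ramen": [0.0, 0.0, 1.0],
}

top = retrieve(image_embedding, recipe_embeddings, k=2)
print(top)
```

Swapping the roles of the query and the candidate set gives the reverse direction (recipe to image) with the same function.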

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling


Title	No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Authors	Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang
Abstract	Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem. Different from captions, stories have more expressive language styles and contain many imaginary concepts that do not appear in the images. Thus it poses challenges to behavioral cloning algorithms. Furthermore, due to the limitations of automatic metrics on evaluating story quality, reinforcement learning methods with hand-crafted rewards also face difficulties in gaining an overall performance boost. Therefore, we propose an Adversarial REward Learning (AREL) framework to learn an implicit reward function from human demonstrations, and then optimize policy search with the learned reward function. Though automatic evaluation indicates a slight performance boost over state-of-the-art (SOTA) methods in cloning expert behaviors, human evaluation shows that our approach achieves significant improvement in generating more human-like stories than SOTA systems.
Tasks	Image Captioning, Visual Storytelling
Published	2018-04-24
URL	http://arxiv.org/abs/1804.09160v2
PDF	http://arxiv.org/pdf/1804.09160v2.pdf
PWC	https://paperswithcode.com/paper/no-metrics-are-perfect-adversarial-reward
Repo	https://github.com/littlekobe/AREL
Framework	pytorch

Extending Pretrained Segmentation Networks with Additional Anatomical Structures


Title	Extending Pretrained Segmentation Networks with Additional Anatomical Structures
Authors	Firat Ozdemir, Orcun Goksel
Abstract	Comprehensive surgical planning requires complex patient-specific anatomical models. For instance, functional musculoskeletal simulations necessitate all relevant structures to be segmented, which could be performed in real-time using deep neural networks given sufficient annotated samples. Such large datasets of multiple structure annotations are costly to procure and are often unavailable in practice. Nevertheless, annotations from different studies and centers can be readily available, or become available in the future in an incremental fashion. We propose a class-incremental segmentation framework for extending a deep network trained for some anatomical structure to yet another structure using a small incremental annotation set. Through distilling knowledge from the current state of the framework, we bypass the need for a full retraining. This is a meta-method to extend any choice of desired deep segmentation network with only a minor addition per structure, which makes it suitable for lifelong class-incremental learning and applicable also for future deep neural network architectures. We evaluated our methods on a public knee dataset of 100 MR volumes. By varying the incremental annotation ratio, we show how our proposed method can retain the previous anatomical structure segmentation performance superior to the conventional finetuning approach. In addition, our framework inherently exploits transferable knowledge from previously trained structures to incremental tasks, demonstrated by superior results compared to non-incremental training. With the presented method, new anatomical structures can be learned without catastrophic forgetting of older structures and without extensive increase of memory and complexity.
Tasks
Published	2018-11-12
URL	https://arxiv.org/abs/1811.04634v2
PDF	https://arxiv.org/pdf/1811.04634v2.pdf
PWC	https://paperswithcode.com/paper/extending-pretrained-segmentation-networks
Repo	https://github.com/firatozdemir/LwfSeg-AeiSeg
Framework	tf

Temporal Regularization in Markov Decision Process


Title	Temporal Regularization in Markov Decision Process
Authors	Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Abstract	Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
Tasks	Atari Games
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00429v2
PDF	http://arxiv.org/pdf/1811.00429v2.pdf
PWC	https://paperswithcode.com/paper/temporal-regularization-in-markov-decision
Repo	https://github.com/pierthodo/temporal_regularization
Framework	tf
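
To make the idea of "smoothness in value estimates over trajectories" concrete, here is a minimal sketch of one possible form: a tabular TD(0) update whose bootstrap target mixes the next state's value with the value of the previously visited state. The mixing weight `beta`, the function name `td_update`, and the toy state names are assumptions made for illustration; the abstract does not give the exact update rule.

```python
def td_update(values, prev_state, state, reward, next_state,
              alpha=0.1, gamma=0.9, beta=0.2):
    """One tabular TD(0) step with a temporally smoothed bootstrap target.

    Assumption: the bootstrap term is a convex mix of V(next_state) and
    V(prev_state), weighted by beta. With beta = 0 (or no previous state)
    this reduces to the ordinary TD(0) update.
    """
    bootstrap = values[next_state]
    if prev_state is not None:
        bootstrap = (1.0 - beta) * values[next_state] + beta * values[prev_state]
    target = reward + gamma * bootstrap
    values[state] += alpha * (target - values[state])
    return values[state]


# Hypothetical three-state trajectory a -> b -> c.
values = {"a": 1.0, "b": 0.0, "c": 2.0}
updated = td_update(values, "a", "b", reward=1.0, next_state="c",
                    alpha=0.5, gamma=0.9, beta=0.5)
print(updated)
```

A larger `beta` pulls the target toward the value of the state just left, which lowers variance along a trajectory at the cost of extra bias, the trade-off the abstract describes.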

Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples


Title	Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples
Authors	Arindam Mitra, Chitta Baral
Abstract	Over the years the Artificial Intelligence (AI) community has produced several datasets which have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms that aims at learning logic programs, namely the Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for the algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV. This paper is under consideration for acceptance in TPLP.
Tasks	Handwritten Digit Recognition, Question Answering
Published	2018-02-22
URL	http://arxiv.org/abs/1802.07966v2
PDF	http://arxiv.org/pdf/1802.07966v2.pdf
PWC	https://paperswithcode.com/paper/incremental-and-iterative-learning-of-answer
Repo	https://github.com/ari9dam/ILPME
Framework	none

DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification


Title	DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification
Authors	Wentao Zhu, Chaochun Liu, Wei Fan, Xiaohui Xie
Abstract	In this work, we present a fully automated lung computed tomography (CT) cancer diagnosis system, DeepLung. DeepLung consists of two components, nodule detection (identifying the locations of candidate nodules) and classification (classifying candidate nodules into benign or malignant). Considering the 3D nature of lung CT data and the compactness of dual path networks (DPN), two deep 3D DPN are designed for nodule detection and classification respectively. Specifically, a 3D Faster Regions with Convolutional Neural Net (R-CNN) is designed for nodule detection with 3D dual path blocks and a U-net-like encoder-decoder structure to effectively learn nodule features. For nodule classification, gradient boosting machine (GBM) with 3D dual path network features is proposed. The nodule classification subnetwork was validated on a public dataset from LIDC-IDRI, on which it achieved better performance than state-of-the-art approaches and surpassed the performance of experienced doctors based on image modality. Within the DeepLung system, candidate nodules are detected first by the nodule detection subnetwork, and nodule diagnosis is conducted by the classification subnetwork. Extensive experimental results demonstrate that DeepLung has performance comparable to experienced doctors both for the nodule-level and patient-level diagnosis on the LIDC-IDRI dataset (https://github.com/uci-cbcl/DeepLung.git).
Tasks	Computed Tomography (CT), Lung Nodule Classification
Published	2018-01-25
URL	http://arxiv.org/abs/1801.09555v1
PDF	http://arxiv.org/pdf/1801.09555v1.pdf
PWC	https://paperswithcode.com/paper/deeplung-deep-3d-dual-path-nets-for-automated
Repo	https://github.com/uci-cbcl/DeepLung
Framework	pytorch

A Hierarchical Framework for Relation Extraction with Reinforcement Learning


Title	A Hierarchical Framework for Relation Extraction with Reinforcement Learning
Authors	Ryuichi Takanobu, Tianyang Zhang, Jiexi Liu, Minlie Huang
Abstract	Most existing methods determine relation types only after all the entities have been recognized, so the interaction between relation types and entity mentions is not fully modeled. This paper presents a novel paradigm to deal with relation extraction by regarding the related entities as the arguments of a relation. We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types. The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, so that it is more feasible and natural to deal with overlapping relations. Our model was evaluated on public datasets collected via distant supervision, and results show that it gains better performance than existing methods and is more powerful for extracting overlapping relations.
Tasks	Entity Extraction, Hierarchical Reinforcement Learning, Relation Extraction
Published	2018-11-09
URL	http://arxiv.org/abs/1811.03925v1
PDF	http://arxiv.org/pdf/1811.03925v1.pdf
PWC	https://paperswithcode.com/paper/a-hierarchical-framework-for-relation
Repo	https://github.com/truthless11/HRL-RE
Framework	pytorch

Deep learning improved by biological activation functions


Title	Deep learning improved by biological activation functions
Authors	Gardave S Bhumbra
Abstract	'Biologically inspired' activation functions, such as the logistic sigmoid, have been instrumental in the historical advancement of machine learning. However, in the field of deep learning, they have been largely displaced by rectified linear units (ReLU) or similar functions, such as its exponential linear unit (ELU) variant, to mitigate the effects of vanishing gradients associated with error back-propagation. The logistic sigmoid, however, does not represent the true input-output relation in neuronal cells under physiological conditions. Here, bionodal root unit (BRU) activation functions are introduced, exhibiting input-output non-linearities that are substantially more biologically plausible since their functional form is based on known biophysical properties of neuronal cells. In order to evaluate the learning performance of BRU activations, deep networks are constructed with identical architectures except differing in their transfer functions (ReLU, ELU, and BRU). Multilayer perceptrons, stacked auto-encoders, and convolutional networks are used to test supervised and unsupervised learning based on the MNIST and CIFAR-10/100 datasets. Comparisons of learning performance, quantified using loss and error measurements, demonstrate that bionodal networks both train faster than their ReLU and ELU counterparts and result in the best generalised models even in the absence of formal regularisation. These results therefore suggest that revisiting the detailed properties of biological neurones and their circuitry might prove invaluable in the field of deep learning for the future.
Tasks
Published	2018-03-19
URL	http://arxiv.org/abs/1804.11237v2
PDF	http://arxiv.org/pdf/1804.11237v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-improved-by-biological
Repo	https://github.com/takyamamoto/BRU_chainer
Framework	none
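
For reference, the two baseline activations named in this abstract have standard closed forms, sketched below in plain Python. The BRU function itself is not defined in the abstract, so it is deliberately left out rather than guessed; `alpha` is the usual ELU scale parameter, with a default of 1.0 chosen here as an assumption.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, and a smooth
    exponential curve that saturates at -alpha for very negative x."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


# Compare the two on a few sample inputs.
for x in (-2.0, -0.5, 0.0, 1.5):
    print(x, relu(x), elu(x))
```

Unlike ReLU, ELU keeps a non-zero output (and gradient) for negative inputs, which is why it is often listed as a ReLU variant.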

A Span Selection Model for Semantic Role Labeling


Title	A Span Selection Model for Semantic Role Labeling
Authors	Hiroki Ouchi, Hiroyuki Shindo, Yuji Matsumoto
Abstract	We present a simple and accurate span-based model for semantic role labeling (SRL). Our model directly takes into account all possible argument spans and scores them for each label. At decoding time, we greedily select higher scoring labeled spans. One advantage of our model is that it allows us to design and use span-level features that are difficult to use in token-based BIO tagging approaches. Experimental results demonstrate that our ensemble model achieves state-of-the-art results, 87.4 F1 and 87.0 F1 on the CoNLL-2005 and 2012 datasets, respectively.
Tasks	Semantic Role Labeling
Published	2018-10-04
URL	http://arxiv.org/abs/1810.02245v1
PDF	http://arxiv.org/pdf/1810.02245v1.pdf
PWC	https://paperswithcode.com/paper/a-span-selection-model-for-semantic-role
Repo	https://github.com/asadovsky/nn
Framework	tf
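
The decoding step in this abstract ("we greedily select higher scoring labeled spans") can be sketched as follows. This is an illustration under stated assumptions, not the authors' decoder: it assumes spans are given as `(start, end, label, score)` with inclusive token indices, and that a selected span may not overlap a previously selected one. The abstract does not spell out the exact constraints, and the scores below are made up.

```python
def greedy_select(scored_spans):
    """Greedily keep the highest-scoring labeled spans that do not overlap.

    scored_spans: iterable of (start, end, label, score), end inclusive.
    Returns the kept spans sorted by position.
    """
    selected = []
    # Visit candidates from highest to lowest score.
    for start, end, label, score in sorted(
        scored_spans, key=lambda span: span[3], reverse=True
    ):
        # Two inclusive intervals overlap when each starts before the other ends.
        overlaps = any(start <= e and s <= end for s, e, _, _ in selected)
        if not overlaps:
            selected.append((start, end, label, score))
    return sorted(selected)


# Hypothetical scored candidates for one predicate.
candidates = [
    (0, 1, "A0", 0.9),
    (1, 3, "A1", 0.8),
    (2, 3, "A1", 0.7),
    (4, 4, "AM-TMP", 0.3),
]
decoded = greedy_select(candidates)
print(decoded)
```

Here the span `(1, 3, "A1")` is dropped because it overlaps the higher-scoring `(0, 1, "A0")`, and the shorter `(2, 3, "A1")` is kept instead.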

Constituency Parsing with a Self-Attentive Encoder


Title	Constituency Parsing with a Self-Attentive Encoder
Authors	Nikita Kitaev, Dan Klein
Abstract	We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. The use of attention makes explicit the manner in which information is propagated between different locations in the sentence, which we use to both analyze our model and propose potential improvements. For example, we find that separating positional and content information in the encoder can lead to improved parsing accuracy. Additionally, we evaluate different approaches for lexical representation. Our parser achieves new state-of-the-art results for single models trained on the Penn Treebank: 93.55 F1 without the use of any external data, and 95.13 F1 when using pre-trained word representations. Our parser also outperforms the previous best-published accuracy figures on 8 of the 9 languages in the SPMRL dataset.
Tasks	Constituency Parsing
Published	2018-05-02
URL	http://arxiv.org/abs/1805.01052v1
PDF	http://arxiv.org/pdf/1805.01052v1.pdf
PWC	https://paperswithcode.com/paper/constituency-parsing-with-a-self-attentive
Repo	https://github.com/asadovsky/nn
Framework	tf

Online Second Order Methods for Non-Convex Stochastic Optimizations


Title	Online Second Order Methods for Non-Convex Stochastic Optimizations
Authors	Xi-Lin Li
Abstract	This paper proposes a family of online second order methods for possibly non-convex stochastic optimizations based on the theory of preconditioned stochastic gradient descent (PSGD), which can be regarded as an enhanced stochastic Newton method with the ability to handle gradient noise and non-convexity simultaneously. We have improved the implementations of the original PSGD in several ways, e.g., new forms of preconditioners, more accurate Hessian vector product calculations, and better numerical stability with vanishing or ill-conditioned Hessian. We have also unveiled the relationship between feature normalization and PSGD with Kronecker product preconditioners, which explains the excellent performance of Kronecker product preconditioners in deep neural network learning. A software package (https://github.com/lixilinx/psgd_tf) implemented in Tensorflow is provided to compare variations of stochastic gradient descent (SGD) and PSGD with five different preconditioners on a wide range of benchmark problems with commonly used neural network architectures, e.g., convolutional and recurrent neural networks. Experimental results clearly demonstrate the advantages of PSGD in terms of generalization performance and convergence speed.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09383v3
PDF	http://arxiv.org/pdf/1803.09383v3.pdf
PWC	https://paperswithcode.com/paper/online-second-order-methods-for-non-convex
Repo	https://github.com/lixilinx/psgd_tf
Framework	tf
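
The abstract mentions Hessian vector product calculations without giving details. As background only, the sketch below shows the generic finite-difference approximation H(x)v ≈ (∇f(x + εv) − ∇f(x)) / ε on a toy quadratic, where the exact answer is known to be Av. The function names, the matrix `A`, and the step size `eps` are assumptions for illustration; this is not the PSGD package's implementation, which is in Tensorflow.

```python
def grad_quadratic(A, x):
    """Gradient of f(x) = 0.5 * x^T A x for a symmetric matrix A, i.e. A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def hvp_finite_difference(grad_fn, x, v, eps=1e-6):
    """Approximate the Hessian-vector product H(x) v using two gradient calls."""
    shifted = [x_i + eps * v_i for x_i, v_i in zip(x, v)]
    g_shifted = grad_fn(shifted)
    g_base = grad_fn(x)
    return [(a - b) / eps for a, b in zip(g_shifted, g_base)]


# Hypothetical symmetric matrix; for this f the Hessian is A everywhere.
A = [[2.0, 1.0],
     [1.0, 3.0]]
x = [0.5, -1.0]
v = [1.0, 2.0]

approx = hvp_finite_difference(lambda point: grad_quadratic(A, point), x, v)
exact = grad_quadratic(A, v)  # A v
print(approx, exact)
```

The approximation never forms the Hessian explicitly, so its cost is two gradient evaluations regardless of dimension; its accuracy depends on the choice of `eps`, which is one reason exact (automatic-differentiation) products can be preferable.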

Predicting Semantic Relations using Global Graph Properties


Title	Predicting Semantic Relations using Global Graph Properties
Authors	Yuval Pinter, Jacob Eisenstein
Abstract	Semantic graphs, such as WordNet, are resources which curate natural language on two distinguishable layers. On the local level, individual relations between synsets (semantic building blocks) such as hypernymy and meronymy enhance our understanding of the words used to express their meanings. Globally, analysis of graph-theoretic properties of the entire net sheds light on the structure of human language as a whole. In this paper, we combine global and local properties of semantic graphs through the framework of Max-Margin Markov Graph Models (M3GM), a novel extension of Exponential Random Graph Model (ERGM) that scales to large multi-relational graphs. We demonstrate how such global modeling improves performance on the local task of predicting semantic relations between synsets, yielding new state-of-the-art results on the WN18RR dataset, a challenging version of WordNet link prediction in which “easy” reciprocal cases are removed. In addition, the M3GM model identifies multirelational motifs that are characteristic of well-formed lexical semantic ontologies.
Tasks	Link Prediction
Published	2018-08-27
URL	http://arxiv.org/abs/1808.08644v1
PDF	http://arxiv.org/pdf/1808.08644v1.pdf
PWC	https://paperswithcode.com/paper/predicting-semantic-relations-using-global
Repo	https://github.com/thukg/KG-Reading-List
Framework	none

Loosely-Coupled Semi-Direct Monocular SLAM


Title	Loosely-Coupled Semi-Direct Monocular SLAM
Authors	Seong Hun Lee, Javier Civera
Abstract	We propose a novel semi-direct approach for monocular simultaneous localization and mapping (SLAM) that combines the complementary strengths of direct and feature-based methods. The proposed pipeline loosely couples direct odometry and feature-based SLAM to perform three levels of parallel optimizations: (1) photometric bundle adjustment (BA) that jointly optimizes the local structure and motion, (2) geometric BA that refines keyframe poses and associated feature map points, and (3) pose graph optimization to achieve global map consistency in the presence of loop closures. This is achieved in real-time by limiting the feature-based operations to marginalized keyframes from the direct odometry module. Exhaustive evaluation on two benchmark datasets demonstrates that our system outperforms the state-of-the-art monocular odometry and SLAM systems in terms of overall accuracy and robustness.
Tasks	Simultaneous Localization and Mapping
Published	2018-07-26
URL	http://arxiv.org/abs/1807.10073v3
PDF	http://arxiv.org/pdf/1807.10073v3.pdf
PWC	https://paperswithcode.com/paper/loosely-coupled-semi-direct-monocular-slam
Repo	https://github.com/sunghoon031/LCSD_SLAM
Framework	none

Heron Inference for Bayesian Graphical Models


Title	Heron Inference for Bayesian Graphical Models
Authors	Daniel Rugeles, Zhen Hai, Gao Cong, Manoranjan Dash
Abstract	Bayesian graphical models have been shown to be a powerful tool for discovering uncertainty and causal structure from real-world data in many application fields. Current inference methods primarily follow different kinds of trade-offs between computational complexity and predictive accuracy. At one end of the spectrum, variational inference approaches perform well in computational efficiency, while at the other end, Gibbs sampling approaches are known to be relatively accurate for prediction in practice. In this paper, we extend an existing Gibbs sampling method, and propose a new deterministic Heron inference (Heron) for a family of Bayesian graphical models. In addition to its support for nontrivial distributability, Heron lets us easily assess the convergence status and also largely improves the running efficiency. We evaluate Heron against the standard collapsed Gibbs sampler and a state-of-the-art state augmentation method in inference for well-known graphical models. Experimental results using publicly available real-life data have demonstrated that Heron significantly outperforms the baseline methods for inferring Bayesian graphical models.
Tasks
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06526v1
PDF	http://arxiv.org/pdf/1802.06526v1.pdf
PWC	https://paperswithcode.com/paper/heron-inference-for-bayesian-graphical-models
Repo	https://github.com/danrugeles/Heron
Framework	none