Paper Group AWR 79
Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping
Title | Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping |
Authors | Akira Taniguchi, Yoshinobu Hagiwara, Tadahiro Taniguchi, Tetsunari Inamura |
Abstract | We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on an unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, our original algorithm had limited estimation accuracy owing to the influence of the early stages of learning, and increased computational complexity with added training data. Therefore, we introduce techniques such as fixed-lag rejuvenation to reduce the calculation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm exceeds the original algorithm and is comparable to batch learning. In addition, the calculation time of the proposed algorithm does not depend on the amount of training data and becomes constant for each step of the scalable algorithm. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots. |
Tasks | |
Published | 2018-03-09 |
URL | https://arxiv.org/abs/1803.03481v3 |
https://arxiv.org/pdf/1803.03481v3.pdf | |
PWC | https://paperswithcode.com/paper/improved-and-scalable-online-learning-of |
Repo | https://github.com/a-taniguchi/SpCoSLAM2 |
Framework | none |
Images & Recipes: Retrieval in the cooking context
Title | Images & Recipes: Retrieval in the cooking context |
Authors | Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Matthieu Cord |
Abstract | Recent advances in machine learning have allowed new use cases to emerge, such as its application to domains like cooking, which gave rise to computational cuisine. In this paper, we tackle the picture-recipe alignment problem, with large-scale retrieval as the target application (finding a recipe given a picture, and vice versa). Our approach is validated on the Recipe1M dataset, composed of one million image-recipe pairs and additional class information, for which we achieve state-of-the-art results. |
Tasks | |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.00900v1 |
http://arxiv.org/pdf/1805.00900v1.pdf | |
PWC | https://paperswithcode.com/paper/images-recipes-retrieval-in-the-cooking |
Repo | https://github.com/Cadene/recipe1m.bootstrap.pytorch |
Framework | pytorch |
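The retrieval task in the abstract above (finding a recipe given a picture, and vice versa) can be illustrated with a minimal sketch. It assumes that images and recipes have already been mapped into a shared embedding space and that cosine similarity is the ranking score; the embeddings, ids, and function names below are hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return candidate ids ordered from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda cid: cosine(query, candidates[cid]),
        reverse=True,
    )


# Hypothetical 3-dimensional embeddings (assumption: a trained model produced them).
image_embedding = [0.9, 0.1, 0.0]
recipe_embeddings = {
    "tomato_soup": [1.0, 0.0, 0.0],
    "apple_pie": [0.0, 1.0, 0.0],
    "green_salad": [0.0, 0.0, 1.0],
}

ranking = rank_by_similarity(image_embedding, recipe_embeddings)
print(ranking)
```

The same function covers the reverse direction: pass a recipe embedding as the query and a dictionary of image embeddings as the candidates.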
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Title | No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling |
Authors | Xin Wang, Wenhu Chen, Yuan-Fang Wang, William Yang Wang |
Abstract | Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams is still a little-tapped problem. Different from captions, stories have more expressive language styles and contain many imaginary concepts that do not appear in the images. Thus it poses challenges to behavioral cloning algorithms. Furthermore, due to the limitations of automatic metrics on evaluating story quality, reinforcement learning methods with hand-crafted rewards also face difficulties in gaining an overall performance boost. Therefore, we propose an Adversarial REward Learning (AREL) framework to learn an implicit reward function from human demonstrations, and then optimize policy search with the learned reward function. Though automatic evaluation indicates a slight performance boost over state-of-the-art (SOTA) methods in cloning expert behaviors, human evaluation shows that our approach achieves significant improvement in generating more human-like stories than SOTA systems. |
Tasks | Image Captioning, Visual Storytelling |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09160v2 |
http://arxiv.org/pdf/1804.09160v2.pdf | |
PWC | https://paperswithcode.com/paper/no-metrics-are-perfect-adversarial-reward |
Repo | https://github.com/littlekobe/AREL |
Framework | pytorch |
Extending Pretrained Segmentation Networks with Additional Anatomical Structures
Title | Extending Pretrained Segmentation Networks with Additional Anatomical Structures |
Authors | Firat Ozdemir, Orcun Goksel |
Abstract | Comprehensive surgical planning requires complex patient-specific anatomical models. For instance, functional musculoskeletal simulations necessitate all relevant structures to be segmented, which could be performed in real-time using deep neural networks given sufficient annotated samples. Such large datasets of multiple structure annotations are costly to procure and are often unavailable in practice. Nevertheless, annotations from different studies and centers can be readily available, or become available in the future in an incremental fashion. We propose a class-incremental segmentation framework for extending a deep network trained for some anatomical structure to yet another structure using a small incremental annotation set. Through distilling knowledge from the current state of the framework, we bypass the need for a full retraining. This is a meta-method to extend any choice of desired deep segmentation network with only a minor addition per structure, which makes it suitable for lifelong class-incremental learning and applicable also for future deep neural network architectures. We evaluated our methods on a public knee dataset of 100 MR volumes. By varying the incremental annotation ratio, we show how our proposed method can retain the previous anatomical structure segmentation performance superior to the conventional finetuning approach. In addition, our framework inherently exploits transferable knowledge from previously trained structures to incremental tasks, demonstrated by superior results compared to non-incremental training. With the presented method, new anatomical structures can be learned without catastrophic forgetting of older structures and without extensive increase of memory and complexity. |
Tasks | |
Published | 2018-11-12 |
URL | https://arxiv.org/abs/1811.04634v2 |
https://arxiv.org/pdf/1811.04634v2.pdf | |
PWC | https://paperswithcode.com/paper/extending-pretrained-segmentation-networks |
Repo | https://github.com/firatozdemir/LwfSeg-AeiSeg |
Framework | tf |
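The idea of distilling knowledge from the current state of a network while adding a new structure can be sketched as a per-pixel loss with two terms. This is a generic distillation sketch under stated assumptions, not the authors' loss: it assumes a frozen copy of the old network supplies soft targets for the old classes, a KL-divergence term keeps the extended network close to them, and a binary cross-entropy term supervises the new structure. All function names and the `distill_weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def binary_cross_entropy(prob, target, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 target."""
    return -(target * math.log(prob + eps) + (1 - target) * math.log(1 - prob + eps))


def incremental_pixel_loss(frozen_old_logits, extended_old_logits,
                           new_prob, new_target, distill_weight=1.0):
    """Supervised loss on the new structure plus distillation on old classes."""
    # Distillation: keep the extended network's old-class output close to the
    # frozen old network's output, so old structures are not forgotten.
    distill = kl_divergence(softmax(frozen_old_logits), softmax(extended_old_logits))
    # Supervision: only the new structure has ground-truth labels here.
    supervised = binary_cross_entropy(new_prob, new_target)
    return supervised + distill_weight * distill


unchanged = incremental_pixel_loss([1.0, 2.0], [1.0, 2.0], new_prob=0.9, new_target=1)
drifted = incremental_pixel_loss([1.0, 2.0], [2.0, 1.0], new_prob=0.9, new_target=1)
print(unchanged, drifted)
```

When the extended network reproduces the old outputs exactly, the distillation term vanishes and only the new-structure loss remains; any drift on the old classes is penalized.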
Temporal Regularization in Markov Decision Process
Title | Temporal Regularization in Markov Decision Process |
Authors | Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup |
Abstract | Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games. |
Tasks | Atari Games |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00429v2 |
http://arxiv.org/pdf/1811.00429v2.pdf | |
PWC | https://paperswithcode.com/paper/temporal-regularization-in-markov-decision |
Repo | https://github.com/pierthodo/temporal_regularization |
Framework | tf |
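Temporal regularization, as described in the abstract above, exploits smoothness of value estimates along a trajectory. A minimal sketch follows, assuming one simple form: the TD(0) bootstrap target mixes the next state's value with the previous state's value through a coefficient `beta`. The mixing form, function names, and default parameters are illustrative assumptions, not the paper's exact operator.

```python
def regularized_td_target(reward, v_next, v_prev, gamma=0.99, beta=0.2):
    """TD target whose bootstrap term is smoothed toward the previous state's value.

    beta = 0 recovers the standard TD(0) target: reward + gamma * v_next.
    """
    return reward + gamma * ((1.0 - beta) * v_next + beta * v_prev)


def td_update(values, prev_state, state, next_state, reward,
              alpha=0.1, gamma=0.99, beta=0.2):
    """Apply one tabular TD update to values[state], in place."""
    target = regularized_td_target(
        reward, values[next_state], values[prev_state], gamma, beta
    )
    values[state] += alpha * (target - values[state])
    return values


# Hypothetical three-state trajectory: s0 -> s1 -> s2.
values = {"s0": 4.0, "s1": 0.0, "s2": 2.0}
td_update(values, "s0", "s1", "s2", reward=1.0, alpha=0.5, gamma=0.5, beta=0.5)
print(values["s1"])
```

Larger `beta` leans more on past estimates, which can reduce variance at the cost of additional bias, the trade-off the abstract refers to.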
Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples
Title | Incremental and Iterative Learning of Answer Set Programs from Mutually Distinct Examples |
Authors | Arindam Mitra, Chitta Baral |
Abstract | Over the years the Artificial Intelligence (AI) community has produced several datasets which have given the machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms that aimed at learning logic programs, namely the Inductive Logic Programming algorithms, have often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for the algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV. This paper is under consideration for acceptance in TPLP. |
Tasks | Handwritten Digit Recognition, Question Answering |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07966v2 |
http://arxiv.org/pdf/1802.07966v2.pdf | |
PWC | https://paperswithcode.com/paper/incremental-and-iterative-learning-of-answer |
Repo | https://github.com/ari9dam/ILPME |
Framework | none |
DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification
Title | DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification |
Authors | Wentao Zhu, Chaochun Liu, Wei Fan, Xiaohui Xie |
Abstract | In this work, we present a fully automated lung computed tomography (CT) cancer diagnosis system, DeepLung. DeepLung consists of two components, nodule detection (identifying the locations of candidate nodules) and classification (classifying candidate nodules into benign or malignant). Considering the 3D nature of lung CT data and the compactness of dual path networks (DPN), two deep 3D DPN are designed for nodule detection and classification respectively. Specifically, a 3D Faster Regions with Convolutional Neural Net (R-CNN) is designed for nodule detection with 3D dual path blocks and a U-net-like encoder-decoder structure to effectively learn nodule features. For nodule classification, gradient boosting machine (GBM) with 3D dual path network features is proposed. The nodule classification subnetwork was validated on a public dataset from LIDC-IDRI, on which it achieved better performance than state-of-the-art approaches and surpassed the performance of experienced doctors based on image modality. Within the DeepLung system, candidate nodules are detected first by the nodule detection subnetwork, and nodule diagnosis is conducted by the classification subnetwork. Extensive experimental results demonstrate that DeepLung has performance comparable to experienced doctors both for the nodule-level and patient-level diagnosis on the LIDC-IDRI dataset. Code: https://github.com/uci-cbcl/DeepLung.git |
Tasks | Computed Tomography (CT), Lung Nodule Classification |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.09555v1 |
http://arxiv.org/pdf/1801.09555v1.pdf | |
PWC | https://paperswithcode.com/paper/deeplung-deep-3d-dual-path-nets-for-automated |
Repo | https://github.com/uci-cbcl/DeepLung |
Framework | pytorch |
A Hierarchical Framework for Relation Extraction with Reinforcement Learning
Title | A Hierarchical Framework for Relation Extraction with Reinforcement Learning |
Authors | Ryuichi Takanobu, Tianyang Zhang, Jiexi Liu, Minlie Huang |
Abstract | Most existing methods determine relation types only after all the entities have been recognized, thus the interaction between relation types and entity mentions is not fully modeled. This paper presents a novel paradigm to deal with relation extraction by regarding the related entities as the arguments of a relation. We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types. The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, so that it is more feasible and natural to deal with overlapping relations. Our model was evaluated on public datasets collected via distant supervision, and results show that it gains better performance than existing methods and is more powerful for extracting overlapping relations. |
Tasks | Entity Extraction, Hierarchical Reinforcement Learning, Relation Extraction |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03925v1 |
http://arxiv.org/pdf/1811.03925v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hierarchical-framework-for-relation |
Repo | https://github.com/truthless11/HRL-RE |
Framework | pytorch |
Deep learning improved by biological activation functions
Title | Deep learning improved by biological activation functions |
Authors | Gardave S Bhumbra |
Abstract | 'Biologically inspired' activation functions, such as the logistic sigmoid, have been instrumental in the historical advancement of machine learning. However in the field of deep learning, they have been largely displaced by rectified linear units (ReLU) or similar functions, such as its exponential linear unit (ELU) variant, to mitigate the effects of vanishing gradients associated with error back-propagation. The logistic sigmoid however does not represent the true input-output relation in neuronal cells under physiological conditions. Here, bionodal root unit (BRU) activation functions are introduced, exhibiting input-output non-linearities that are substantially more biologically plausible since their functional form is based on known biophysical properties of neuronal cells. In order to evaluate the learning performance of BRU activations, deep networks are constructed with identical architectures except differing in their transfer functions (ReLU, ELU, and BRU). Multilayer perceptrons, stacked auto-encoders, and convolutional networks are used to test supervised and unsupervised learning based on the MNIST and CIFAR-10/100 datasets. Comparisons of learning performance, quantified using loss and error measurements, demonstrate that bionodal networks both train faster than their ReLU and ELU counterparts and result in the best generalised models even in the absence of formal regularisation. These results therefore suggest that revisiting the detailed properties of biological neurones and their circuitry might prove invaluable in the field of deep learning for the future. |
Tasks | |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1804.11237v2 |
http://arxiv.org/pdf/1804.11237v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-improved-by-biological |
Repo | https://github.com/takyamamoto/BRU_chainer |
Framework | none |
A Span Selection Model for Semantic Role Labeling
Title | A Span Selection Model for Semantic Role Labeling |
Authors | Hiroki Ouchi, Hiroyuki Shindo, Yuji Matsumoto |
Abstract | We present a simple and accurate span-based model for semantic role labeling (SRL). Our model directly takes into account all possible argument spans and scores them for each label. At decoding time, we greedily select higher scoring labeled spans. One advantage of our model is that it allows us to design and use span-level features that are difficult to use in token-based BIO tagging approaches. Experimental results demonstrate that our ensemble model achieves state-of-the-art results, 87.4 F1 and 87.0 F1 on the CoNLL-2005 and 2012 datasets, respectively. |
Tasks | Semantic Role Labeling |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02245v1 |
http://arxiv.org/pdf/1810.02245v1.pdf | |
PWC | https://paperswithcode.com/paper/a-span-selection-model-for-semantic-role |
Repo | https://github.com/asadovsky/nn |
Framework | tf |
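The decoding step in the abstract above (greedily selecting higher scoring labeled spans) can be sketched as follows. The sketch assumes that candidate spans arrive already scored per label and that the only constraint is that selected spans must not overlap; the paper may apply further constraints. The scores, labels, and function name are hypothetical.

```python
def greedy_select_spans(scored_spans):
    """Pick non-overlapping labeled spans in descending score order.

    scored_spans: list of (score, start, end, label) with inclusive token indices.
    Returns the selected spans sorted by start position.
    """
    selected = []
    for score, start, end, label in sorted(
        scored_spans, key=lambda item: item[0], reverse=True
    ):
        # Two inclusive spans overlap when each starts before the other ends.
        overlaps = any(start <= e and s <= end for _, s, e, _ in selected)
        if not overlaps:
            selected.append((score, start, end, label))
    return sorted(selected, key=lambda item: item[1])


# Hypothetical scores for the sentence "She ate an apple yesterday" (tokens 0..4).
candidates = [
    (0.9, 0, 0, "A0"),      # "She"
    (0.8, 2, 3, "A1"),      # "an apple"
    (0.6, 3, 3, "A1"),      # "apple" (overlaps the higher-scoring span)
    (0.5, 4, 4, "AM-TMP"),  # "yesterday"
    (0.4, 0, 3, "A1"),      # "She ate an apple" (overlaps two selected spans)
]

selected = greedy_select_spans(candidates)
print([(start, end, label) for _, start, end, label in selected])
```

Because whole spans are scored and selected directly, no BIO tag sequence has to be decoded into spans afterwards.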
Constituency Parsing with a Self-Attentive Encoder
Title | Constituency Parsing with a Self-Attentive Encoder |
Authors | Nikita Kitaev, Dan Klein |
Abstract | We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. The use of attention makes explicit the manner in which information is propagated between different locations in the sentence, which we use to both analyze our model and propose potential improvements. For example, we find that separating positional and content information in the encoder can lead to improved parsing accuracy. Additionally, we evaluate different approaches for lexical representation. Our parser achieves new state-of-the-art results for single models trained on the Penn Treebank: 93.55 F1 without the use of any external data, and 95.13 F1 when using pre-trained word representations. Our parser also outperforms the previous best-published accuracy figures on 8 of the 9 languages in the SPMRL dataset. |
Tasks | Constituency Parsing |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.01052v1 |
http://arxiv.org/pdf/1805.01052v1.pdf | |
PWC | https://paperswithcode.com/paper/constituency-parsing-with-a-self-attentive |
Repo | https://github.com/asadovsky/nn |
Framework | tf |
Online Second Order Methods for Non-Convex Stochastic Optimizations
Title | Online Second Order Methods for Non-Convex Stochastic Optimizations |
Authors | Xi-Lin Li |
Abstract | This paper proposes a family of online second order methods for possibly non-convex stochastic optimizations based on the theory of preconditioned stochastic gradient descent (PSGD), which can be regarded as an enhanced stochastic Newton method with the ability to handle gradient noise and non-convexity simultaneously. We have improved the implementations of the original PSGD in several ways, e.g., new forms of preconditioners, more accurate Hessian vector product calculations, and better numerical stability with vanishing or ill-conditioned Hessian, etc. We have also revealed the relationship between feature normalization and PSGD with Kronecker product preconditioners, which explains the excellent performance of Kronecker product preconditioners in deep neural network learning. A software package (https://github.com/lixilinx/psgd_tf) implemented in TensorFlow is provided to compare variations of stochastic gradient descent (SGD) and PSGD with five different preconditioners on a wide range of benchmark problems with commonly used neural network architectures, e.g., convolutional and recurrent neural networks. Experimental results clearly demonstrate the advantages of PSGD in terms of generalization performance and convergence speed. |
Tasks | |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09383v3 |
http://arxiv.org/pdf/1803.09383v3.pdf | |
PWC | https://paperswithcode.com/paper/online-second-order-methods-for-non-convex |
Repo | https://github.com/lixilinx/psgd_tf |
Framework | tf |
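The link between preconditioned gradient descent and Newton's method mentioned in the abstract above can be illustrated on a toy quadratic. This sketch covers only the update step with a fixed, hand-chosen diagonal preconditioner; how PSGD estimates its preconditioner online from noisy gradients is not shown. The function name and the problem are hypothetical.

```python
def preconditioned_step(params, grads, precond_diag, lr=1.0):
    """One update: params <- params - lr * P * grads, with P diagonal."""
    return [p - lr * c * g for p, g, c in zip(params, grads, precond_diag)]


# Toy objective (assumption): f(x) = 0.5 * (2 * x0**2 + 4 * x1**2), minimum at 0.
curvature = [2.0, 4.0]
params = [3.0, -1.0]
grads = [a * x for a, x in zip(curvature, params)]

# Identity preconditioner: plain gradient descent with a small step size.
sgd_result = preconditioned_step(params, grads, [1.0, 1.0], lr=0.1)

# Inverse-curvature preconditioner: a Newton step, exact for a quadratic.
newton_result = preconditioned_step(params, grads, [1.0 / a for a in curvature])

print(sgd_result, newton_result)
```

With the inverse curvature as preconditioner, a single step lands on the minimizer, while the plain gradient step only moves part of the way; this is the sense in which a good preconditioner makes a first-order update behave like a second-order one.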
Predicting Semantic Relations using Global Graph Properties
Title | Predicting Semantic Relations using Global Graph Properties |
Authors | Yuval Pinter, Jacob Eisenstein |
Abstract | Semantic graphs, such as WordNet, are resources which curate natural language on two distinguishable layers. On the local level, individual relations between synsets (semantic building blocks) such as hypernymy and meronymy enhance our understanding of the words used to express their meanings. Globally, analysis of graph-theoretic properties of the entire net sheds light on the structure of human language as a whole. In this paper, we combine global and local properties of semantic graphs through the framework of Max-Margin Markov Graph Models (M3GM), a novel extension of Exponential Random Graph Model (ERGM) that scales to large multi-relational graphs. We demonstrate how such global modeling improves performance on the local task of predicting semantic relations between synsets, yielding new state-of-the-art results on the WN18RR dataset, a challenging version of WordNet link prediction in which “easy” reciprocal cases are removed. In addition, the M3GM model identifies multirelational motifs that are characteristic of well-formed lexical semantic ontologies. |
Tasks | Link Prediction |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08644v1 |
http://arxiv.org/pdf/1808.08644v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-semantic-relations-using-global |
Repo | https://github.com/thukg/KG-Reading-List |
Framework | none |
Loosely-Coupled Semi-Direct Monocular SLAM
Title | Loosely-Coupled Semi-Direct Monocular SLAM |
Authors | Seong Hun Lee, Javier Civera |
Abstract | We propose a novel semi-direct approach for monocular simultaneous localization and mapping (SLAM) that combines the complementary strengths of direct and feature-based methods. The proposed pipeline loosely couples direct odometry and feature-based SLAM to perform three levels of parallel optimizations: (1) photometric bundle adjustment (BA) that jointly optimizes the local structure and motion, (2) geometric BA that refines keyframe poses and associated feature map points, and (3) pose graph optimization to achieve global map consistency in the presence of loop closures. This is achieved in real-time by limiting the feature-based operations to marginalized keyframes from the direct odometry module. Exhaustive evaluation on two benchmark datasets demonstrates that our system outperforms the state-of-the-art monocular odometry and SLAM systems in terms of overall accuracy and robustness. |
Tasks | Simultaneous Localization and Mapping |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10073v3 |
http://arxiv.org/pdf/1807.10073v3.pdf | |
PWC | https://paperswithcode.com/paper/loosely-coupled-semi-direct-monocular-slam |
Repo | https://github.com/sunghoon031/LCSD_SLAM |
Framework | none |
Heron Inference for Bayesian Graphical Models
Title | Heron Inference for Bayesian Graphical Models |
Authors | Daniel Rugeles, Zhen Hai, Gao Cong, Manoranjan Dash |
Abstract | Bayesian graphical models have been shown to be a powerful tool for discovering uncertainty and causal structure from real-world data in many application fields. Current inference methods primarily follow different kinds of trade-offs between computational complexity and predictive accuracy. At one end of the spectrum, variational inference approaches perform well in computational efficiency, while at the other end, Gibbs sampling approaches are known to be relatively accurate for prediction in practice. In this paper, we extend an existing Gibbs sampling method, and propose a new deterministic Heron inference (Heron) for a family of Bayesian graphical models. In addition to the support for nontrivial distributability, one more benefit of Heron is that it is able to not only allow us to easily assess the convergence status but also largely improve the running efficiency. We evaluate Heron against the standard collapsed Gibbs sampler and state-of-the-art state augmentation method in inference for well-known graphical models. Experimental results using publicly available real-life data have demonstrated that Heron significantly outperforms the baseline methods for inferring Bayesian graphical models. |
Tasks | |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.06526v1 |
http://arxiv.org/pdf/1802.06526v1.pdf | |
PWC | https://paperswithcode.com/paper/heron-inference-for-bayesian-graphical-models |
Repo | https://github.com/danrugeles/Heron |
Framework | none |