October 19, 2019

2806 words 14 mins read

Paper Group ANR 136

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering. Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning. KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation. Multi-Step Prediction of Occupancy Grid Maps with Recurrent Neural …

Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering

Title Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Authors Medhini Narasimhan, Alexander G. Schwing
Abstract Question answering is an important task for autonomous agents and virtual assistants alike and was shown to support the disabled in efficiently navigating an overwhelming environment. Many existing methods focus on observation-based questions, ignoring our ability to seamlessly combine observed content with general knowledge. To understand interactions with a knowledge base, a dataset has been introduced recently and keyword matching techniques were shown to yield compelling results despite being vulnerable to misconceptions due to synonyms and homographs. To address this issue, we develop a learning-based approach which goes straight to the facts via a learned embedding space. We demonstrate state-of-the-art results on the challenging recently introduced fact-based visual question answering dataset, outperforming competing methods by more than 5%.
Tasks Factual Visual Question Answering, Question Answering, Visual Question Answering
Published 2018-09-04
URL http://arxiv.org/abs/1809.01124v1
PDF http://arxiv.org/pdf/1809.01124v1.pdf
PWC https://paperswithcode.com/paper/straight-to-the-facts-learning-knowledge-base
Repo
Framework
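The abstract's key idea, retrieving facts "straight" via a learned embedding space rather than keyword matching, can be sketched as nearest-neighbor search over normalized embeddings. This is an illustrative toy, not the paper's trained model; the embeddings here are hand-made stand-ins for learned ones.

```python
import numpy as np

def embed(vec):
    """L2-normalize so dot products become cosine similarities."""
    return vec / np.linalg.norm(vec)

def retrieve_fact(question_emb, fact_embs):
    """Return the index of the fact closest to the question in embedding space."""
    sims = fact_embs @ question_emb  # cosine similarities (rows pre-normalized)
    return int(np.argmax(sims))

# toy example: three "fact" embeddings and a question embedding near fact 1
facts = np.stack([embed(np.array([1.0, 0.0, 0.0])),
                  embed(np.array([0.0, 1.0, 0.0])),
                  embed(np.array([0.0, 0.0, 1.0]))])
q = embed(np.array([0.1, 0.9, 0.1]))
best = retrieve_fact(q, facts)
```

Unlike keyword matching, synonyms and homographs are handled implicitly: semantically similar questions and facts land near each other in the learned space.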

Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning

Title Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning
Authors Akshita Bhandari, Chandramani Singh
Abstract We propose accelerated randomized coordinate descent algorithms for stochastic optimization and online learning. Our algorithms have significantly less per-iteration complexity than the known accelerated gradient algorithms. The proposed algorithms for online learning have better regret performance than the known randomized online coordinate descent algorithms. Furthermore, the proposed algorithms for stochastic optimization exhibit as good convergence rates as the best known randomized coordinate descent algorithms. We also show simulation results to demonstrate performance of the proposed algorithms.
Tasks Stochastic Optimization
Published 2018-06-05
URL http://arxiv.org/abs/1806.01600v2
PDF http://arxiv.org/pdf/1806.01600v2.pdf
PWC https://paperswithcode.com/paper/accelerated-randomized-coordinate-descent
Repo
Framework
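For orientation, the non-accelerated building block the paper starts from can be sketched as plain randomized coordinate descent on a quadratic: at each step one coordinate is picked uniformly at random and minimized exactly. This is a minimal sketch of the base method only, not the accelerated variant the paper proposes.

```python
import numpy as np

def randomized_cd(A, b, iters=2000, seed=0):
    """Randomized coordinate descent for f(x) = 0.5 x'Ax - b'x with A SPD.
    Each iteration updates a single randomly chosen coordinate, so the
    per-iteration cost is one row-vector product instead of a full gradient."""
    rng = np.random.default_rng(seed)
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(n)
        grad_i = A[i] @ x - b[i]   # partial derivative w.r.t. x_i
        x[i] -= grad_i / A[i, i]   # exact minimization along coordinate i
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = randomized_cd(A, b)  # converges to A^{-1} b = [0.2, 0.4]
```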

KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation

Title KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation
Authors Shi Yin, Yi Zhou, Chenguang Li, Shangfei Wang, Jianmin Ji, Xiaoping Chen, Ruili Wang
Abstract We propose KDSL, a new word sense disambiguation (WSD) framework that utilizes knowledge to automatically generate sense-labeled data for supervised learning. First, from WordNet, we automatically construct a semantic knowledge base called DisDict, which provides refined feature words that highlight the differences among word senses, i.e., synsets. Second, we automatically generate new sense-labeled data with DisDict from unlabeled corpora. Third, these generated data, together with manually labeled data and unlabeled data, are fed to a neural framework conducting supervised and unsupervised learning jointly to model the semantic relations among synsets, feature words and their contexts. The experimental results show that KDSL outperforms several representative state-of-the-art methods on various major benchmarks. Interestingly, it performs relatively well even when manually labeled data is unavailable, thus providing a potential solution for similar tasks that lack manual annotations.
Tasks Word Sense Disambiguation
Published 2018-08-28
URL http://arxiv.org/abs/1808.09888v4
PDF http://arxiv.org/pdf/1808.09888v4.pdf
PWC https://paperswithcode.com/paper/kdsl-a-knowledge-driven-supervised-learning
Repo
Framework

Multi-Step Prediction of Occupancy Grid Maps with Recurrent Neural Networks

Title Multi-Step Prediction of Occupancy Grid Maps with Recurrent Neural Networks
Authors Nima Mohajerin, Mohsen Rohani
Abstract We investigate the multi-step prediction of the drivable space, represented by Occupancy Grid Maps (OGMs), for autonomous vehicles. Our motivation is that accurate multi-step prediction of the drivable space can efficiently improve path planning and navigation, resulting in safe, comfortable and optimal paths in autonomous driving. We train a variety of Recurrent Neural Network (RNN) based architectures on OGM sequences from the KITTI dataset. The results demonstrate a significant improvement in prediction accuracy over the state of the art using our proposed difference learning method, which incorporates motion-related features. We remove the egomotion from the OGM sequences by transforming them into a common frame. Although in the transformed sequences the KITTI dataset is heavily biased toward static objects, by learning the difference between subsequent OGMs, our proposed method provides accurate prediction over both static and moving objects.
Tasks Autonomous Driving, Autonomous Vehicles, Prediction Of Occupancy Grid Maps
Published 2018-12-21
URL http://arxiv.org/abs/1812.09395v3
PDF http://arxiv.org/pdf/1812.09395v3.pdf
PWC https://paperswithcode.com/paper/multi-step-prediction-of-occupancy-grid-maps
Repo
Framework
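The difference-learning idea described in the abstract, predicting the change between consecutive grids rather than the next grid directly, can be sketched as follows. The `predict_diff` callable stands in for the paper's RNN; here it is a naive extrapolation of the last observed change.

```python
import numpy as np

def predict_next_ogm(ogm_seq, predict_diff):
    """Difference-learning rollout: the model predicts the change between
    consecutive grids; the next grid is the last observed grid plus that
    change, clipped back to valid occupancy probabilities."""
    last = ogm_seq[-1]
    diff = predict_diff(ogm_seq)   # stand-in for the learned RNN
    return np.clip(last + diff, 0.0, 1.0)

# stand-in "model": extrapolate the last observed change
def naive_diff(seq):
    return seq[-1] - seq[-2]

seq = [np.full((2, 2), 0.2), np.full((2, 2), 0.4)]
nxt = predict_next_ogm(seq, naive_diff)
```

Because static cells have zero difference, the model's capacity is spent on the moving objects, which is consistent with the abstract's claim about handling the static-object bias in the transformed KITTI sequences.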

Latent Topic Conversational Models

Title Latent Topic Conversational Models
Authors Tsung-Hsien Wen, Minh-Thang Luong
Abstract Latent variable models have been a preferred choice in conversational modeling over sequence-to-sequence (seq2seq) models, which tend to generate generic and repetitive responses. Even so, training latent variable models remains difficult. In this paper, we propose the Latent Topic Conversational Model (LTCM), which augments seq2seq with a neural latent topic component to better guide response generation and make training easier. The neural topic component encodes information from the source sentence to build a global “topic” distribution over words, which is then consulted by the seq2seq model at each generation step. We study in detail how the latent representation is learnt in both the vanilla model and LTCM. Our extensive experiments contribute to a better understanding and training of conditional latent models for language. Our results show that by sampling from the learnt latent representations, LTCM can generate diverse and interesting responses. In a subjective human evaluation, the judges also confirm that LTCM is the overall preferred option.
Tasks Latent Variable Models
Published 2018-09-19
URL http://arxiv.org/abs/1809.07070v1
PDF http://arxiv.org/pdf/1809.07070v1.pdf
PWC https://paperswithcode.com/paper/latent-topic-conversational-models
Repo
Framework
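The "consulted at each generation step" mechanism can be illustrated as blending the decoder's word distribution with a global topic distribution. The fixed scalar `gate` is a hypothetical simplification; LTCM learns the interaction between the topic component and the decoder rather than using a fixed convex combination.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def topic_guided_step(decoder_logits, topic_dist, gate=0.3):
    """Mix the seq2seq word distribution with a global topic distribution.
    `gate` is a hypothetical fixed mixing weight for illustration only."""
    p_seq = softmax(decoder_logits)
    p = (1.0 - gate) * p_seq + gate * topic_dist
    return p / p.sum()

logits = np.array([2.0, 1.0, 0.0])
topic = np.array([0.1, 0.1, 0.8])   # topic places mass on word 2
p = topic_guided_step(logits, topic)
```

The effect is that topical words get boosted probability at every step, steering generation away from generic responses.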

Sequential sampling of Gaussian process latent variable models

Title Sequential sampling of Gaussian process latent variable models
Authors Martin Tegner, Benjamin Bloem-Reddy, Stephen Roberts
Abstract We consider the problem of inferring a latent function in a probabilistic model of data. When dependencies of the latent function are specified by a Gaussian process and the data likelihood is complex, efficient computation often involves Markov chain Monte Carlo sampling, with limited applicability to large data sets. We extend some of these techniques to scale efficiently when the problem exhibits a sequential structure. We propose an approximation that enables sequential sampling of both latent variables and associated parameters. We demonstrate strong performance in growing-data settings that would otherwise be infeasible with naive, non-sequential sampling.
Tasks Latent Variable Models
Published 2018-07-13
URL http://arxiv.org/abs/1807.04932v2
PDF http://arxiv.org/pdf/1807.04932v2.pdf
PWC https://paperswithcode.com/paper/sequential-sampling-of-gaussian-process
Repo
Framework

Toward Autonomous Rotation-Aware Unmanned Aerial Grasping

Title Toward Autonomous Rotation-Aware Unmanned Aerial Grasping
Authors Shijie Lin, Jinwang Wang, Wen Yang, Guisong Xia
Abstract Autonomous Unmanned Aerial Manipulators (UAMs) have shown promising potential to transform passive sensing missions into active 3-dimensional interactive missions, but they still suffer from difficulties impeding their wide application, such as target detection and stabilization. This letter presents a vision-based autonomous UAM with a 3DoF robotic arm for rotational grasping, with compensation for center-of-gravity displacement. First, the hardware, software architecture and state estimation methods are detailed. All the mechanical designs are fully provided as open-source hardware for reuse by the community. Then, we analyze the flow distribution generated by rotors and plan the robotic arm’s motion based on this analysis. Next, a novel detection approach called Rotation-SqueezeDet is proposed to enable rotation-aware grasping, which can give the target position and rotation angle in near real-time on a Jetson TX2. Finally, the effectiveness of the proposed scheme is validated in multiple experimental trials, highlighting its applicability to autonomous aerial grasping in GPS-denied environments.
Tasks
Published 2018-11-09
URL http://arxiv.org/abs/1811.03921v1
PDF http://arxiv.org/pdf/1811.03921v1.pdf
PWC https://paperswithcode.com/paper/toward-autonomous-rotation-aware-unmanned
Repo
Framework

Entropy-regularized Optimal Transport Generative Models

Title Entropy-regularized Optimal Transport Generative Models
Authors Dong Liu, Minh Thành Vu, Saikat Chatterjee, Lars K. Rasmussen
Abstract We investigate the use of the entropy-regularized optimal transport (EOT) cost in developing generative models to learn implicit distributions. Two generative models are proposed. One uses the EOT cost directly in a one-shot optimization problem and the other uses the EOT cost iteratively in an adversarial game. The proposed generative models show improved performance over contemporary models for image generation on MNIST.
Tasks Image Generation
Published 2018-11-16
URL http://arxiv.org/abs/1811.06763v1
PDF http://arxiv.org/pdf/1811.06763v1.pdf
PWC https://paperswithcode.com/paper/entropy-regularized-optimal-transport
Repo
Framework
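The EOT cost at the core of both proposed models is typically computed with Sinkhorn iterations, which alternately rescale the rows and columns of a Gibbs kernel to match the two marginals. A minimal sketch on a discrete toy problem (this shows the standard EOT computation, not the paper's generative training loop):

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, iters=500):
    """Entropy-regularized OT: alternately rescale rows and columns of
    K = exp(-C/eps) until the plan's marginals match mu and nu."""
    K = np.exp(-C / eps)
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    P = u[:, None] * K * v[None, :]   # transport plan
    return P, float((P * C).sum())    # plan and transport cost

mu = np.array([0.5, 0.5])
nu = np.array([0.5, 0.5])
C = np.array([[0.0, 1.0], [1.0, 0.0]])  # matched points are free to couple
P, cost = sinkhorn(mu, nu, C)
```

As `eps` shrinks the plan approaches the unregularized optimal transport plan; larger `eps` smooths the plan and makes the iterations converge faster.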

DP-GP-LVM: A Bayesian Non-Parametric Model for Learning Multivariate Dependency Structures

Title DP-GP-LVM: A Bayesian Non-Parametric Model for Learning Multivariate Dependency Structures
Authors Andrew R. Lawrence, Carl Henrik Ek, Neill D. F. Campbell
Abstract We present a non-parametric Bayesian latent variable model capable of learning dependency structures across dimensions in a multivariate setting. Our approach is based on flexible Gaussian process priors for the generative mappings and interchangeable Dirichlet process priors to learn the structure. The introduction of the Dirichlet process as a specific structural prior allows our model to circumvent issues associated with previous Gaussian process latent variable models. Inference is performed by deriving an efficient variational bound on the marginal log-likelihood of the model.
Tasks Latent Variable Models
Published 2018-07-12
URL http://arxiv.org/abs/1807.04833v1
PDF http://arxiv.org/pdf/1807.04833v1.pdf
PWC https://paperswithcode.com/paper/dp-gp-lvm-a-bayesian-non-parametric-model-for
Repo
Framework

DIAG-NRE: A Neural Pattern Diagnosis Framework for Distantly Supervised Neural Relation Extraction

Title DIAG-NRE: A Neural Pattern Diagnosis Framework for Distantly Supervised Neural Relation Extraction
Authors Shun Zheng, Xu Han, Yankai Lin, Peilin Yu, Lu Chen, Ling Huang, Zhiyuan Liu, Wei Xu
Abstract Pattern-based labeling methods have achieved promising results in alleviating the inevitable labeling noise of distantly supervised neural relation extraction. However, these methods require significant expert labor to write relation-specific patterns, which makes them too sophisticated to generalize quickly. To ease the labor-intensive workload of pattern writing and enable quick generalization to new relation types, we propose a neural pattern diagnosis framework, DIAG-NRE, that can automatically summarize and refine high-quality relational patterns from noisy data with human experts in the loop. To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
Tasks Relation Extraction
Published 2018-11-06
URL https://arxiv.org/abs/1811.02166v2
PDF https://arxiv.org/pdf/1811.02166v2.pdf
PWC https://paperswithcode.com/paper/diag-nre-a-deep-pattern-diagnosis-framework
Repo
Framework

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

Title Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation
Authors Cong Fang, Yameng Huang, Zhouchen Lin
Abstract Asynchronous algorithms have attracted much attention recently due to the crucial demand for solving large-scale optimization problems. However, accelerated versions of asynchronous algorithms are rarely studied. In this paper, we propose the “momentum compensation” technique to accelerate asynchronous algorithms for convex problems. Specifically, we first accelerate plain Asynchronous Gradient Descent, which achieves a faster $O(1/\sqrt{\epsilon})$ (vs. $O(1/\epsilon)$) convergence rate for non-strongly convex functions, and $O(\sqrt{\kappa}\log(1/\epsilon))$ (vs. $O(\kappa \log(1/\epsilon))$) for strongly convex functions, to reach an $\epsilon$-approximate minimizer with condition number $\kappa$. We further apply the technique to accelerate modern stochastic asynchronous algorithms such as Asynchronous Stochastic Coordinate Descent and Asynchronous Stochastic Gradient Descent. Both of the resulting practical algorithms are order-wise faster than existing ones. To the best of our knowledge, we are the first to consider accelerated algorithms that allow updating by delayed gradients and the first to propose truly accelerated asynchronous algorithms. Finally, experimental results on a shared-memory system show that acceleration can lead to significant performance gains on ill-conditioned problems.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.09747v1
PDF http://arxiv.org/pdf/1802.09747v1.pdf
PWC https://paperswithcode.com/paper/accelerating-asynchronous-algorithms-for
Repo
Framework
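The synchronous acceleration the paper builds on is Nesterov's method: gradients are evaluated at an extrapolated point rather than the current iterate. A minimal sketch on a quadratic follows; the paper's contribution (momentum compensation for delayed, asynchronous gradients) is not shown here.

```python
import numpy as np

def nesterov_gd(grad, x0, L, iters=1000):
    """Nesterov's accelerated gradient method (synchronous sketch).
    The gradient step is taken at the extrapolated point y, and the
    momentum weight follows the standard t-sequence."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(iters):
        x_new = y - grad(y) / L
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + (t - 1.0) / t_new * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

# toy problem: minimize a separable quadratic with minimizer at [1, 1]
A = np.array([10.0, 1.0])            # per-coordinate curvature
grad = lambda x: A * (x - 1.0)
x = nesterov_gd(grad, np.zeros(2), L=10.0)
```

In the asynchronous setting the extrapolated point is computed from stale iterates, which is what the proposed compensation term corrects for.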

Improving Character-level Japanese-Chinese Neural Machine Translation with Radicals as an Additional Input Feature

Title Improving Character-level Japanese-Chinese Neural Machine Translation with Radicals as an Additional Input Feature
Authors Jinyi Zhang, Tadahiro Matsumoto
Abstract In recent years, Neural Machine Translation (NMT) has been shown to achieve impressive results. While some additional linguistic features of input words improve word-level NMT, no additional character features have been used to improve character-level NMT so far. In this paper, we show that the radicals of Chinese characters (or kanji), as character feature information, can easily provide further improvements in character-level NMT. In experiments on the WAT2016 Japanese-Chinese scientific paper excerpt corpus (ASPEC-JP), we find that the proposed method improves translation quality in two respects: perplexity and BLEU. The character-level NMT model with the radical input feature achieved a state-of-the-art result of 40.61 BLEU points on the test set, which is an improvement of about 8.6 BLEU points over the best system on the WAT2016 Japanese-to-Chinese translation subtask with ASPEC-JP. The improvements over character-level NMT with no additional input feature are up to about 1.5 and 1.4 BLEU points on the development-test set and the test set of the corpus, respectively.
Tasks Machine Translation
Published 2018-05-08
URL http://arxiv.org/abs/1805.02937v1
PDF http://arxiv.org/pdf/1805.02937v1.pdf
PWC https://paperswithcode.com/paper/improving-character-level-japanese-chinese
Repo
Framework
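The radical feature can be pictured as concatenating a radical embedding onto each character embedding before encoding, so characters sharing a radical (e.g. 海 and 河, both with the water radical 氵) share part of their input representation. The embedding dimensions and random vectors below are purely illustrative.

```python
import numpy as np

def char_with_radical(char_emb, radical_emb):
    """Concatenate a character embedding with its radical embedding so the
    NMT encoder sees both; dimensions here are illustrative."""
    return np.concatenate([char_emb, radical_emb])

rng = np.random.default_rng(0)
char_table = {"海": rng.normal(size=8), "河": rng.normal(size=8)}
radical_table = {"氵": rng.normal(size=4)}  # water radical shared by both
x = char_with_radical(char_table["海"], radical_table["氵"])
```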

GANsfer Learning: Combining labelled and unlabelled data for GAN based data augmentation

Title GANsfer Learning: Combining labelled and unlabelled data for GAN based data augmentation
Authors Christopher Bowles, Roger Gunn, Alexander Hammers, Daniel Rueckert
Abstract Medical imaging is a domain which suffers from a paucity of manually annotated data for the training of learning algorithms. Manually delineating pathological regions at a pixel level is a time consuming process, especially in 3D images, and often requires the time of a trained expert. As a result, supervised machine learning solutions must make do with small amounts of labelled data, despite there often being additional unlabelled data available. Whilst of less value than labelled images, these unlabelled images can contain potentially useful information. In this paper we propose combining both labelled and unlabelled data within a GAN framework, before using the resulting network to produce images for use when training a segmentation network. We explore the task of deep grey matter multi-class segmentation in an AD dataset and show that the proposed method leads to a significant improvement in segmentation results, particularly in cases where the amount of labelled data is restricted. We show that this improvement is largely driven by a greater ability to segment the structures known to be the most affected by AD, thereby demonstrating the benefits of exposing the system to more examples of pathological anatomical variation. We also show how a shift in domain of the training data from young and healthy towards older and more pathological examples leads to better segmentations of the latter cases, and that this leads to a significant improvement in the ability for the computed segmentations to stratify cases of AD.
Tasks Data Augmentation
Published 2018-11-26
URL http://arxiv.org/abs/1811.10669v1
PDF http://arxiv.org/pdf/1811.10669v1.pdf
PWC https://paperswithcode.com/paper/gansfer-learning-combining-labelled-and
Repo
Framework

Learning Data-adaptive Nonparametric Kernels

Title Learning Data-adaptive Nonparametric Kernels
Authors Fanghui Liu, Xiaolin Huang, Chen Gong, Jie Yang, Li Li
Abstract Traditional kernels or their combinations are often not sufficiently flexible to fit the data in complicated practical tasks. In this paper, we present a Data-Adaptive Nonparametric Kernel (DANK) learning framework by imposing an adaptive matrix on the kernel/Gram matrix in an entry-wise strategy. Since we do not specify the formulation of the adaptive matrix, each entry in it can be directly and flexibly learned from the data. Therefore, the solution space of the learned kernel is largely expanded, which makes DANK flexible to adapt to the data. Specifically, the proposed kernel learning framework can be seamlessly embedded into support vector machines (SVM) and support vector regression (SVR), which has the capability of enlarging the margin between classes and reducing the model generalization error. Theoretically, we demonstrate that the objective function of our devised model is gradient-Lipschitz continuous. Thereby, the training process for kernel and parameter learning in SVM/SVR can be efficiently optimized in a unified framework. Further, to address the scalability issue in DANK, a decomposition-based scalable approach is developed, of which the effectiveness is demonstrated by both empirical studies and theoretical guarantees. Experimentally, our method outperforms other representative kernel learning based algorithms on various classification and regression benchmark datasets.
Tasks
Published 2018-08-31
URL https://arxiv.org/abs/1808.10724v2
PDF https://arxiv.org/pdf/1808.10724v2.pdf
PWC https://paperswithcode.com/paper/learning-data-adaptive-nonparametric-kernels
Repo
Framework
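The entry-wise adaptation can be sketched as a Hadamard reweighting of the Gram matrix. The `exp(F) * K` form below is one simple way to keep entries positive, chosen here for illustration; in the paper the adaptive matrix is learned jointly with the SVM/SVR objective, whereas here `F` is fixed.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Standard RBF Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def adaptive_kernel(K, F):
    """Data-adaptive kernel in the spirit of DANK: an entry-wise adaptation
    matrix F reweights each Gram entry. F = 0 recovers the base kernel."""
    return np.exp(F) * K

X = np.array([[0.0], [1.0], [2.0]])
K = rbf_gram(X)
F = np.zeros((3, 3))               # no adaptation: recovers K exactly
K_adapt = adaptive_kernel(K, F)
```

Because each entry of `F` is free, the learned kernel is not restricted to any parametric family, which is the source of the flexibility the abstract describes.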

Confounding-Robust Policy Improvement

Title Confounding-Robust Policy Improvement
Authors Nathan Kallus, Angela Zhou
Abstract We study the problem of learning personalized decision policies from observational data while accounting for possible unobserved confounding. Previous approaches, which assume unconfoundedness, i.e., that no unobserved confounders affect both the treatment assignment as well as outcome, can lead to policies that introduce harm rather than benefit when some unobserved confounding is present, as is generally the case with observational data. Instead, since policy value and regret may not be point-identifiable, we study a method that minimizes the worst-case estimated regret of a candidate policy against a baseline policy over an uncertainty set for propensity weights that controls the extent of unobserved confounding. We prove generalization guarantees that ensure our policy will be safe when applied in practice and will in fact obtain the best-possible uniform control on the range of all possible population regrets that agree with the possible extent of confounding. We develop efficient algorithmic solutions to compute this confounding-robust policy. Finally, we assess and compare our methods on synthetic and semi-synthetic data. In particular, we consider a case study on personalizing hormone replacement therapy based on observational data, where we validate our results on a randomized experiment. We demonstrate that hidden confounding can hinder existing policy learning approaches and lead to unwarranted harm, while our robust approach guarantees safety and focuses on well-evidenced improvement, a necessity for making personalized treatment policies learned from observational data reliable in practice.
Tasks Causal Inference
Published 2018-05-22
URL https://arxiv.org/abs/1805.08593v3
PDF https://arxiv.org/pdf/1805.08593v3.pdf
PWC https://paperswithcode.com/paper/confounding-robust-policy-improvement
Repo
Framework