October 19, 2019

Paper Group ANR 221

Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All


Title	Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All
Authors	Minh C. Phan, Aixin Sun, Yi Tay, Jialong Han, Chenliang Li
Abstract	Collective entity disambiguation aims to jointly resolve multiple mentions by linking them to their associated entities in a knowledge base. Previous works are primarily based on the underlying assumption that entities within the same document are highly related. However, the extent to which these mentioned entities are actually connected in reality is rarely studied and therefore raises interesting research questions. For the first time, we show that the semantic relationships between the mentioned entities are in fact less dense than expected. This could be attributed to several reasons such as noise, data sparsity and knowledge base incompleteness. As a remedy, we introduce MINTREE, a new tree-based objective for the entity disambiguation problem. The key intuition behind MINTREE is the concept of coherence relaxation, which utilizes the weight of a minimum spanning tree to measure the coherence between entities. Based on this new objective, we design a novel entity disambiguation algorithm, which we call Pair-Linking. Instead of considering all the given mentions, Pair-Linking iteratively selects a pair with the highest confidence at each step for decision making. Via extensive experiments, we show that our approach is not only more accurate but also surprisingly faster than many state-of-the-art collective linking algorithms.
Tasks	Decision Making, Entity Disambiguation
Published	2018-02-04
URL	http://arxiv.org/abs/1802.01074v3
PDF	http://arxiv.org/pdf/1802.01074v3.pdf
PWC	https://paperswithcode.com/paper/pair-linking-for-collective-entity
Repo
Framework
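
The two ideas in this abstract, scoring coherence by the weight of a minimum spanning tree and committing the most confident pair at each step, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `dist` callback (lower distance standing in for higher confidence), and the candidate dictionary layout are assumptions made here for illustration.

```python
from itertools import combinations


def mst_weight(nodes, dist):
    """Weight of a minimum spanning tree over `nodes` (Kruskal with union-find).

    `dist(a, b)` is a hypothetical symmetric distance between two entities.
    """
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    total = 0.0
    for a, b in sorted(combinations(nodes, 2), key=lambda e: dist(*e)):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two components, so keep it
            parent[root_a] = root_b
            total += dist(a, b)
    return total


def pair_linking(candidates, dist):
    """Greedy sketch: repeatedly commit the closest pair of candidate entities.

    `candidates` maps each mention to its list of candidate entities.
    Mentions that are already linked keep their entity in later rounds.
    """
    assigned = {}
    mentions = list(candidates)
    while len(assigned) < len(mentions):
        best = None
        for m1, m2 in combinations(mentions, 2):
            if m1 in assigned and m2 in assigned:
                continue
            options1 = [assigned[m1]] if m1 in assigned else candidates[m1]
            options2 = [assigned[m2]] if m2 in assigned else candidates[m2]
            for e1 in options1:
                for e2 in options2:
                    d = dist(e1, e2)
                    if best is None or d < best[0]:
                        best = (d, m1, e1, m2, e2)
        if best is None:  # nothing left to pair (e.g. a single mention)
            break
        _, m1, e1, m2, e2 = best
        assigned[m1] = e1
        assigned[m2] = e2
    return assigned
```

The sketch rescans all pairs in every round for clarity; a practical version would presumably keep the pairs in a priority queue, which is one plausible source of the speed advantage the abstract reports.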

Learning and Inferring Movement with Deep Generative Model


Title	Learning and Inferring Movement with Deep Generative Model
Authors	Mingxuan Jing, Xiaojian Ma, Fuchun Sun, Huaping Liu
Abstract	Learning and inferring movement is a very challenging problem due to its high dimensionality and dependence on varied environments or tasks. In this paper, we propose an effective probabilistic method for learning and inference of basic movements. The motion planning problem is formulated as learning on a directed graphical model, and a deep generative model is used to perform learning and inference from demonstrations. An important characteristic of this method is that it flexibly incorporates the task descriptors and context information for long-term planning and it can be combined with dynamic systems for robot control. The experimental validations on robotic approaching path planning tasks show the advantages over the base methods with limited training data.
Tasks	Motion Planning
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07252v2
PDF	http://arxiv.org/pdf/1805.07252v2.pdf
PWC	https://paperswithcode.com/paper/learning-and-inferring-movement-with-deep
Repo
Framework

Multi-Scale Structure-Aware Network for Human Pose Estimation


Title	Multi-Scale Structure-Aware Network for Human Pose Estimation
Authors	Lipeng Ke, Ming-Ching Chang, Honggang Qi, Siwei Lyu
Abstract	We develop a robust multi-scale structure-aware neural network for human pose estimation. This method improves the recent deep conv-deconv hourglass models with four key improvements: (1) multi-scale supervision to strengthen contextual feature learning in matching body keypoints by combining feature heatmaps across scales, (2) a multi-scale regression network at the end to globally optimize the structural matching of the multi-scale features, (3) a structure-aware loss used in the intermediate supervision and at the regression to improve the matching of keypoints and respective neighbors to infer higher-order matching configurations, and (4) a keypoint masking training scheme that can effectively fine-tune our network to robustly localize occluded keypoints via adjacent matches. Our method can effectively improve state-of-the-art pose estimation methods that suffer from difficulties in scale varieties, occlusions, and complex multi-person scenarios. This multi-scale supervision tightly integrates with the regression network to effectively (i) localize keypoints using the ensemble of multi-scale features, and (ii) infer global pose configuration by maximizing structural consistencies across multiple keypoints and scales. The keypoint masking training enhances these advantages to focus learning on hard occlusion samples. Our method achieves the leading position in the MPII challenge leaderboard among the state-of-the-art methods.
Tasks	Pose Estimation
Published	2018-03-27
URL	http://arxiv.org/abs/1803.09894v3
PDF	http://arxiv.org/pdf/1803.09894v3.pdf
PWC	https://paperswithcode.com/paper/multi-scale-structure-aware-network-for-human
Repo
Framework

The Case for Automatic Database Administration using Deep Reinforcement Learning


Title	The Case for Automatic Database Administration using Deep Reinforcement Learning
Authors	Ankur Sharma, Felix Martin Schuhknecht, Jens Dittrich
Abstract	Like any large software system, a full-fledged DBMS offers an overwhelming number of configuration knobs. These range from static initialisation parameters like buffer sizes, degree of concurrency, or level of replication to complex runtime decisions like creating a secondary index on a particular column or reorganising the physical layout of the store. To simplify the configuration, industry-grade DBMSs are usually shipped with various advisory tools that provide recommendations for given workloads and machines. However, reality shows that the actual configuration, tuning, and maintenance is usually still done by a human administrator, relying on intuition and experience. Recent work on deep reinforcement learning has shown very promising results in solving problems that require such a sense of intuition. For instance, it has been applied very successfully in learning how to play complicated games with enormous search spaces. Motivated by these achievements, in this work we explore how deep reinforcement learning can be used to administer a DBMS. First, we will describe how deep reinforcement learning can be used to automatically tune an arbitrary software system like a DBMS by defining a problem environment. Second, we showcase our concept of NoDBA at the concrete example of index selection and evaluate how well it recommends indexes for given workloads.
Tasks
Published	2018-01-17
URL	http://arxiv.org/abs/1801.05643v1
PDF	http://arxiv.org/pdf/1801.05643v1.pdf
PWC	https://paperswithcode.com/paper/the-case-for-automatic-database
Repo
Framework

Design Exploration of Hybrid CMOS-OxRAM Deep Generative Architectures


Title	Design Exploration of Hybrid CMOS-OxRAM Deep Generative Architectures
Authors	Vivek Parmar, Manan Suri
Abstract	Deep Learning and its applications have gained tremendous interest recently in both academia and industry. Restricted Boltzmann Machines (RBMs) offer a key methodology to implement deep learning paradigms. This paper presents a novel approach for realizing hybrid CMOS-OxRAM based deep generative models (DGM). In our proposed hybrid DGM architectures, HfOx based (filamentary-type switching) OxRAM devices are extensively used for realizing multiple computational and non-computational functions such as: (i) synapses (weights), (ii) internal neuron-state storage, (iii) stochastic neuron activation and (iv) programmable signal normalization. To validate the proposed scheme we have simulated two different architectures: (i) Deep Belief Network (DBN) and (ii) Stacked Denoising Autoencoder for classification and reconstruction of hand-written digits from a reduced MNIST dataset of 6000 images. Contrastive-divergence (CD) specially optimized for OxRAM devices was used to drive the synaptic weight update mechanism of each layer in the network. The overall learning rule was based on greedy layer-wise learning with no back propagation, which allows the network to be trained to a good pre-training stage. Performance of the simulated hybrid CMOS-RRAM DGM model matches closely with the software based model for a 2-layer deep network. Top-3 test accuracy achieved by the DBN was 95.5%. MSE of the SDA network was 0.003, lower than the software based approach. Endurance analysis of the simulated architectures shows that for 200 epochs of training (single RBM layer), the maximum number of switching events per OxRAM device was ~ 7000 cycles.
Tasks	Denoising
Published	2018-01-06
URL	http://arxiv.org/abs/1801.02003v1
PDF	http://arxiv.org/pdf/1801.02003v1.pdf
PWC	https://paperswithcode.com/paper/design-exploration-of-hybrid-cmos-oxram-deep
Repo
Framework

Expectation Learning for Adaptive Crossmodal Stimuli Association


Title	Expectation Learning for Adaptive Crossmodal Stimuli Association
Authors	Pablo Barros, German I. Parisi, Di Fu, Xun Liu, Stefan Wermter
Abstract	The human brain is able to learn, generalize, and predict crossmodal stimuli. Learning by expectation fine-tunes crossmodal processing at different levels, thus enhancing our power of generalization and adaptation in highly dynamic environments. In this paper, we propose a deep neural architecture trained by using expectation learning accounting for unsupervised learning tasks. Our learning model exhibits a self-adaptable behavior, setting the first steps towards the development of deep learning architectures for crossmodal stimuli association.
Tasks
Published	2018-01-23
URL	http://arxiv.org/abs/1801.07654v1
PDF	http://arxiv.org/pdf/1801.07654v1.pdf
PWC	https://paperswithcode.com/paper/expectation-learning-for-adaptive-crossmodal
Repo
Framework

Neural-Kernelized Conditional Density Estimation


Title	Neural-Kernelized Conditional Density Estimation
Authors	Hiroaki Sasaki, Aapo Hyvärinen
Abstract	Conditional density estimation is a general framework for solving various problems in machine learning. Among existing methods, non-parametric and/or kernel-based methods are often difficult to use on large datasets, while methods based on neural networks usually make restrictive parametric assumptions on the probability densities. Here, we propose a novel method for estimating the conditional density based on score matching. In contrast to existing methods, we employ scalable neural networks, but do not make explicit parametric assumptions on densities. The key challenge in applying score matching to neural networks is computation of the first- and second-order derivatives of a model for the log-density. We tackle this challenge by developing a new neural-kernelized approach, which can be applied on large datasets with stochastic gradient descent, while the reproducing kernels allow for easy computation of the derivatives needed in score matching. We show that the neural-kernelized function approximator has universal approximation capability and that our method is consistent in conditional density estimation. We numerically demonstrate that our method is useful in high-dimensional conditional density estimation, and compares favourably with existing methods. Finally, we prove that the proposed method has interesting connections to two probabilistically principled frameworks of representation learning: Nonlinear sufficient dimension reduction and nonlinear independent component analysis.
Tasks	Density Estimation, Dimensionality Reduction, Representation Learning
Published	2018-06-05
URL	http://arxiv.org/abs/1806.01754v1
PDF	http://arxiv.org/pdf/1806.01754v1.pdf
PWC	https://paperswithcode.com/paper/neural-kernelized-conditional-density
Repo
Framework
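
For orientation, the standard unconditional score matching objective (Hyvärinen, 2005) is reproduced below, since it makes visible the first- and second-order derivatives of the log-density model that the abstract calls the key challenge. The abstract does not state the paper's conditional objective, so this is only the generic starting point; the symbols q, θ, ψ and d are notation introduced here.

```latex
\[
  J(\theta)
  = \mathbb{E}_{x \sim p}\!\left[
      \sum_{i=1}^{d}
      \left(
        \partial_i \psi_i(x;\theta)
        + \tfrac{1}{2}\,\psi_i(x;\theta)^{2}
      \right)
    \right]
  + \text{const},
  \qquad
  \psi(x;\theta) = \nabla_x \log q(x;\theta).
\]
```

Here \psi_i is a first derivative of the model log-density and \partial_i \psi_i is a second derivative, so a model whose derivatives are available in closed form (as with reproducing kernels) avoids differentiating through a deep network twice.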

Wavelet Decomposition of Gradient Boosting


Title	Wavelet Decomposition of Gradient Boosting
Authors	Shai Dekel, Oren Elisha, Ohad Morgan
Abstract	In this paper we introduce a significant improvement to the popular tree-based Stochastic Gradient Boosting algorithm using a wavelet decomposition of the trees. This approach is based on harmonic analysis and approximation theoretical elements, and as we show through extensive experimentation, our wavelet based method generally outperforms existing methods, particularly in difficult scenarios of class unbalance and mislabeling in the training data.
Tasks
Published	2018-05-07
URL	https://arxiv.org/abs/1805.02642v2
PDF	https://arxiv.org/pdf/1805.02642v2.pdf
PWC	https://paperswithcode.com/paper/wavelet-decomposition-of-gradient-boosting
Repo
Framework

Robust Cross-lingual Hypernymy Detection using Dependency Context


Title	Robust Cross-lingual Hypernymy Detection using Dependency Context
Authors	Shyam Upadhyay, Yogarshi Vyas, Marine Carpuat, Dan Roth
Abstract	Cross-lingual Hypernymy Detection involves determining if a word in one language (“fruit”) is a hypernym of a word in another language (“pomme” i.e. apple in French). The ability to detect hypernymy cross-lingually can aid in solving cross-lingual versions of tasks such as textual entailment and event coreference. We propose BISPARSE-DEP, a family of unsupervised approaches for cross-lingual hypernymy detection, which learns sparse, bilingual word embeddings based on dependency contexts. We show that BISPARSE-DEP can significantly improve performance on this task, compared to approaches based only on lexical context. Our approach is also robust, showing promise for low-resource settings: our dependency-based embeddings can be learned using a parser trained on related languages, with negligible loss in performance. We also crowd-source a challenging dataset for this task on four languages – Russian, French, Arabic, and Chinese. Our embeddings and datasets are publicly available.
Tasks	Natural Language Inference, Word Embeddings
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11291v1
PDF	http://arxiv.org/pdf/1803.11291v1.pdf
PWC	https://paperswithcode.com/paper/robust-cross-lingual-hypernymy-detection
Repo
Framework

Monte Carlo Tree Search with Scalable Simulation Periods for Continuously Running Tasks


Title	Monte Carlo Tree Search with Scalable Simulation Periods for Continuously Running Tasks
Authors	Seydou Ba, Takuya Hiraoka, Takashi Onishi, Toru Nakata, Yoshimasa Tsuruoka
Abstract	Monte Carlo Tree Search (MCTS) is particularly adapted to domains where the potential actions can be represented as a tree of sequential decisions. For an effective action selection, MCTS performs many simulations to build a reliable tree representation of the decision space. As such, a bottleneck to MCTS appears when enough simulations cannot be performed between action selections. This is particularly highlighted in continuously running tasks, for which the time available to perform simulations between actions tends to be limited due to the environment’s state constantly changing. In this paper, we present an approach that takes advantage of the anytime characteristic of MCTS to increase the simulation time when allowed. Our approach is to effectively balance the prospect of selecting an action with the time that can be spared to perform MCTS simulations before the next action selection. For that, we considered the simulation time as a decision variable to be selected alongside an action. We extended the Hierarchical Optimistic Optimization applied to Tree (HOOT) method to adapt our approach to environments with a continuous decision space. We evaluated our approach for environments with a continuous decision space through OpenAI gym’s Pendulum and Continuous Mountain Car environments and for environments with discrete action space through the arcade learning environment (ALE) platform. The evaluation results show that, with variable simulation times, the proposed approach outperforms the conventional MCTS in the evaluated continuous decision space tasks and improves the performance of MCTS in most of the ALE tasks.
Tasks	Atari Games
Published	2018-09-07
URL	http://arxiv.org/abs/1809.02378v1
PDF	http://arxiv.org/pdf/1809.02378v1.pdf
PWC	https://paperswithcode.com/paper/monte-carlo-tree-search-with-scalable
Repo
Framework
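
One way to picture "simulation time as a decision variable selected alongside an action" is to treat every (action, simulation time) pair as a bandit arm and score arms with the standard UCB1 rule. This is a minimal sketch under that assumption, not the paper's selection rule or its HOOT extension; the function name and the `stats` layout are hypothetical.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Pick an arm by UCB1.

    `stats` maps an arm, here a hypothetical (action, sim_time) tuple,
    to (visit_count, total_reward).
    """
    total_visits = sum(n for n, _ in stats.values())
    best_arm, best_score = None, -math.inf
    for arm, (n, reward_sum) in stats.items():
        if n == 0:
            return arm  # try every arm once before exploiting
        # mean reward plus an exploration bonus that shrinks with visits
        score = reward_sum / n + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm
```

With discrete simulation periods this reduces to ordinary arm selection; a continuous range of periods would need a continuous-armed method, which is presumably where HOOT comes in.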

Joint On-line Learning of a Zero-shot Spoken Semantic Parser and a Reinforcement Learning Dialogue Manager


Title	Joint On-line Learning of a Zero-shot Spoken Semantic Parser and a Reinforcement Learning Dialogue Manager
Authors	Matthieu Riou, Bassam Jabaian, Stéphane Huet, Fabrice Lefèvre
Abstract	Despite many recent advances for the design of dialogue systems, a true bottleneck remains the acquisition of data required to train their components. Unlike many other language processing applications, dialogue systems require interactions with users; it is therefore complex to develop them with pre-recorded data. Building on previous works, on-line learning is pursued here as a most convenient way to address the issue. Data collection, annotation and use in learning algorithms are performed in a single process. The main difficulties are then: to bootstrap an initial basic system, and to control the level of additional cost on the user side. Considering that well-performing solutions can be used directly off the shelf for speech recognition and synthesis, the study is focused on learning the spoken language understanding and dialogue management modules only. Several variants of joint learning are investigated and tested with user trials to confirm that the overall on-line learning can be obtained after only a few hundred training dialogues and can overstep an expert-based system.
Tasks	Dialogue Management, Speech Recognition, Spoken Language Understanding
Published	2018-10-01
URL	http://arxiv.org/abs/1810.00924v1
PDF	http://arxiv.org/pdf/1810.00924v1.pdf
PWC	https://paperswithcode.com/paper/joint-on-line-learning-of-a-zero-shot-spoken
Repo
Framework

Model-Preserving Sensitivity Analysis for Families of Gaussian Distributions


Title	Model-Preserving Sensitivity Analysis for Families of Gaussian Distributions
Authors	Christiane Goergen, Manuele Leonelli
Abstract	The accuracy of probability distributions inferred using machine-learning algorithms heavily depends on data availability and quality. In practical applications it is therefore fundamental to investigate the robustness of a statistical model to misspecification of some of its underlying probabilities. In the context of graphical models, investigations of robustness fall under the notion of sensitivity analyses. These analyses consist in varying some of the model’s probabilities or parameters and then assessing how far apart the original and the varied distributions are. However, for Gaussian graphical models, such variations usually make the original graph an incoherent representation of the model’s conditional independence structure. Here we develop an approach to sensitivity analysis which guarantees the original graph remains valid after any probability variation and we quantify the effect of such variations using different measures. To achieve this we take advantage of algebraic techniques to both concisely represent conditional independence and to provide a straightforward way of checking the validity of such relationships. Our methods are demonstrated to be robust and comparable to standard ones, which break the conditional independence structure of the model, using an artificial example and a medical real-world application.
Tasks
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10794v1
PDF	http://arxiv.org/pdf/1809.10794v1.pdf
PWC	https://paperswithcode.com/paper/model-preserving-sensitivity-analysis-for
Repo
Framework

Sliding Line Point Regression for Shape Robust Scene Text Detection


Title	Sliding Line Point Regression for Shape Robust Scene Text Detection
Authors	Yixing Zhu, Jun Du
Abstract	Traditional text detection methods mostly focus on quadrangle text. In this study we propose a novel method named sliding line point regression (SLPR) in order to detect arbitrary-shape text in natural scenes. SLPR regresses multiple points on the edge of a text line and then utilizes these points to sketch the outline of the text. The proposed SLPR can be adapted to many object detection architectures such as Faster R-CNN and R-FCN. Specifically, we first generate the smallest rectangular box including the text with a region proposal network (RPN), then isometrically regress the points on the edge of the text by using vertically and horizontally sliding lines. To make full use of information and reduce redundancy, we calculate the x-coordinate or y-coordinate of a target point from the rectangular box position, and regress only the remaining y-coordinate or x-coordinate. Accordingly, we not only reduce the number of system parameters but also constrain the points, which generates more regular polygons. Our approach achieved competitive results on the traditional ICDAR2015 Incidental Scene Text benchmark and the curved text detection dataset CTW1500.
Tasks	Curved Text Detection, Object Detection, Scene Text Detection
Published	2018-01-30
URL	http://arxiv.org/abs/1801.09969v1
PDF	http://arxiv.org/pdf/1801.09969v1.pdf
PWC	https://paperswithcode.com/paper/sliding-line-point-regression-for-shape
Repo
Framework
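
The coordinate-sharing idea in this abstract (one coordinate of each edge point comes from the box, the other is regressed) can be sketched for the vertical sliding lines as follows. This is an illustration under assumptions, not the authors' code: the equal spacing with both box edges included, the function names, and the point ordering are all choices made here, and the horizontal-line case would be symmetric.

```python
def vertical_line_xs(x_min, x_max, n_lines):
    """Equally spaced x positions of vertical sliding lines across a box.

    Including both box edges is an assumption of this sketch.
    """
    if n_lines < 2:
        raise ValueError("need at least two sliding lines")
    step = (x_max - x_min) / (n_lines - 1)
    return [x_min + i * step for i in range(n_lines)]


def polygon_from_regression(box, top_ys, bottom_ys):
    """Build a text outline from regressed y values only.

    `box` is (x_min, y_min, x_max, y_max). `top_ys` and `bottom_ys` stand in
    for the network's regressed y-coordinates on each sliding line; the
    x-coordinates are computed from the box and never regressed.
    """
    if len(top_ys) != len(bottom_ys):
        raise ValueError("top and bottom must use the same sliding lines")
    xs = vertical_line_xs(box[0], box[2], len(top_ys))
    top = list(zip(xs, top_ys))
    bottom = list(zip(xs, bottom_ys))
    # walk the top edge left to right, then the bottom edge right to left
    return top + bottom[::-1]
```

Fixing the x positions halves the number of regressed values per point and forces the outline points onto a regular grid, which matches the abstract's claim of fewer parameters and more regular polygons.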

A Review on Learning Planning Action Models for Socio-Communicative HRI


Title	A Review on Learning Planning Action Models for Socio-Communicative HRI
Authors	Ankuj Arora, Humbert Fiorino, Damien Pellier, Sylvie Pesty
Abstract	For social robots to be brought more into widespread use in the fields of companionship, care taking and domestic help, they must be capable of demonstrating social intelligence. In order to be acceptable, they must exhibit socio-communicative skills. Classic approaches to programming HRI from observed human-human interactions fail to capture the subtlety of multimodal interactions as well as the key structural differences between robots and humans. The former arises due to a difficulty in quantifying and coding multimodal behaviours, while the latter arises due to a difference in the degrees of freedom between a robot and a human. However, the notion of reverse engineering from multimodal HRI traces to learn the underlying behavioral blueprint of the robot seems an option worth exploring. In this spirit, the entire HRI can be seen as a sequence of exchanges of speech acts between the robot and human, each act treated as an action, bearing in mind that the entire sequence is goal-driven. Thus, this entire interaction can be treated as a sequence of actions propelling the interaction from its initial to goal state, also known as a plan in the domain of AI planning. In the same domain, this action sequence that stems from plan execution can be represented as a trace. AI techniques, such as machine learning, can be used to learn behavioral models (also known as symbolic action models in AI), intended to be reusable for AI planning, from the aforementioned multimodal traces. This article reviews recent machine learning techniques for learning planning action models which can be applied to the field of HRI with the intent of rendering robots as socio-communicative.
Tasks
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09245v1
PDF	http://arxiv.org/pdf/1810.09245v1.pdf
PWC	https://paperswithcode.com/paper/a-review-on-learning-planning-action-models
Repo
Framework

Direct Estimation of Pharmacokinetic Parameters from DCE-MRI using Deep CNN with Forward Physical Model Loss


Title	Direct Estimation of Pharmacokinetic Parameters from DCE-MRI using Deep CNN with Forward Physical Model Loss
Authors	Cagdas Ulas, Giles Tetteh, Michael J. Thrippleton, Paul A. Armitage, Stephen D. Makin, Joanna M. Wardlaw, Mike E. Davies, Bjoern H. Menze
Abstract	Dynamic contrast-enhanced (DCE) MRI is an evolving imaging technique that provides a quantitative measure of pharmacokinetic (PK) parameters in body tissues, in which series of T1-weighted images are collected following the administration of a paramagnetic contrast agent. Unfortunately, in many applications, conventional clinical DCE-MRI suffers from low spatiotemporal resolution and insufficient volume coverage. In this paper, we propose a novel deep learning based approach to directly estimate the PK parameters from undersampled DCE-MRI data. Specifically, we design a custom loss function where we incorporate a forward physical model that relates the PK parameters to corrupted image-time series obtained due to subsampling in k-space. This allows the network to directly exploit the knowledge of true contrast agent kinetics in the training phase, and hence provide more accurate restoration of PK parameters. Experiments on clinical brain DCE datasets demonstrate the efficacy of our approach in terms of fidelity of PK parameter reconstruction and significantly faster parameter inference compared to a model-based iterative reconstruction method.
Tasks	Time Series
Published	2018-04-08
URL	http://arxiv.org/abs/1804.02745v2
PDF	http://arxiv.org/pdf/1804.02745v2.pdf
PWC	https://paperswithcode.com/paper/direct-estimation-of-pharmacokinetic
Repo
Framework