April 1, 2020

Paper Group ANR 488

Exploring Neural Models for Parsing Natural Language into First-Order Logic

Title Exploring Neural Models for Parsing Natural Language into First-Order Logic
Authors Hrituraj Singh, Milan Aggrawal, Balaji Krishnamurthy
Abstract Semantic parsing is the task of obtaining machine-interpretable representations from natural language text. We consider one such formal representation, First-Order Logic (FOL), and explore the capability of neural models in parsing English sentences to FOL. We model FOL parsing as a sequence-to-sequence mapping task in which a natural language sentence is encoded into an intermediate representation using an LSTM, followed by a decoder which sequentially generates the predicates in the corresponding FOL formula. We improve the standard encoder-decoder model by introducing a variable alignment mechanism that enables it to align variables across predicates in the predicted FOL. We further show the effectiveness of predicting the category of FOL entity (Unary, Binary, Variables and Scoped Entities) at each decoder step as an auxiliary task for improving the consistency of the generated FOL. We perform rigorous evaluations and extensive ablations. We also aim to release our code as well as a large-scale FOL dataset along with models to aid further research in logic-based parsing and inference in NLP.
Tasks Semantic Parsing
Published 2020-02-16
URL https://arxiv.org/abs/2002.06544v1
PDF https://arxiv.org/pdf/2002.06544v1.pdf
PWC https://paperswithcode.com/paper/exploring-neural-models-for-parsing-natural
Repo
Framework

Advanced kNN: A Mature Machine Learning Series

Title Advanced kNN: A Mature Machine Learning Series
Authors Muhammad Asim, Muaaz Zakria
Abstract k-nearest neighbour (kNN) is one of the most prominent, simple and basic algorithms used in machine learning and data mining. However, kNN has limited prediction ability, i.e., kNN cannot predict any instance correctly if it does not belong to any of the predefined classes in the training data set. The purpose of this paper is to suggest an Advanced kNN (A-kNN) algorithm that will be able to classify an instance as unknown, after verifying that it does not belong to any of the predefined classes. Performance of kNN and A-kNN is compared on three different data sets, namely the iris plant data set, the BUPA liver disorder data set, and the Alpha Beta detection data set. Results of A-kNN are significantly accurate for detecting unknown instances.
Tasks
Published 2020-03-01
URL https://arxiv.org/abs/2003.00415v1
PDF https://arxiv.org/pdf/2003.00415v1.pdf
PWC https://paperswithcode.com/paper/advanced-knn-a-mature-machine-learning-series
Repo
Framework
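
The idea of letting kNN answer "unknown" can be illustrated with a distance threshold. This is only a minimal sketch and not the A-kNN algorithm from the paper, whose verification step the abstract does not spell out; the function name `knn_with_reject` and the `max_dist` parameter are hypothetical.

```python
import math
from collections import Counter


def knn_with_reject(train, query, k=3, max_dist=1.0):
    """Classify `query` by majority vote, or return "unknown".

    train: list of (feature_tuple, label) pairs.
    max_dist: hypothetical rejection threshold (an assumption, not from the paper).
    """
    # Sort training points by Euclidean distance to the query and keep the k nearest.
    neighbours = sorted((math.dist(x, query), label) for x, label in train)[:k]
    # Reject when even the nearest neighbour lies beyond the threshold.
    if neighbours[0][0] > max_dist:
        return "unknown"
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
print(knn_with_reject(train, (0.2, 0.1)))  # close to class "a"
print(knn_with_reject(train, (20, 20)))    # far from every training point
```

Plain kNN would assign the second query to class "b" regardless of how far away it is; the threshold is what allows a rejection.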

Universal Successor Features for Transfer Reinforcement Learning

Title Universal Successor Features for Transfer Reinforcement Learning
Authors Chen Ma, Dylan R. Ashley, Junfeng Wen, Yoshua Bengio
Abstract Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015), which generalizes over goals and states, has previously been shown to be useful for transfer. However, successor features are believed to be more suitable than values for transfer (Dayan, 1993; Barreto et al., 2017), even though they cannot directly generalize to new goals. In this paper, we propose (1) Universal Successor Features (USFs) to capture the underlying dynamics of the environment while allowing generalization to unseen goals and (2) a flexible end-to-end model of USFs that can be trained by interacting with the environment. We show that learning USFs is compatible with any RL algorithm that learns state values using a temporal difference method. Our experiments in a simple gridworld and with two MuJoCo environments show that USFs can greatly accelerate training when learning multiple tasks and can effectively transfer knowledge to new tasks.
Tasks Transfer Reinforcement Learning
Published 2020-01-05
URL https://arxiv.org/abs/2001.04025v1
PDF https://arxiv.org/pdf/2001.04025v1.pdf
PWC https://paperswithcode.com/paper/universal-successor-features-for-transfer-1
Repo
Framework
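
As background for the abstract above, successor features are usually written as follows. This is a sketch in the standard notation of the successor-feature literature that the abstract cites; the symbols $\phi$, $w$, $\psi$ and the goal-conditioned last line are assumptions made here for illustration and are not quoted from the paper.

```latex
% Assumption: rewards are linear in features \phi with task weights w.
r(s, a, s') = \phi(s, a, s')^\top w

% Successor features: expected discounted sum of future features under policy \pi.
\psi^\pi(s, a) = \mathbb{E}^\pi\Big[ \sum_{t=0}^{\infty} \gamma^t \, \phi(s_t, a_t, s_{t+1}) \,\Big|\, s_0 = s,\ a_0 = a \Big]

% Action values then factor into a dynamics part (\psi) and a task part (w).
Q^\pi(s, a) = \psi^\pi(s, a)^\top w

% A goal-conditioned ("universal") variant would additionally take a goal g as input.
Q(s, a, g) = \psi(s, a, g)^\top w(g)
```

Under this factorisation, changing the task changes only $w$, which is why successor features are considered convenient for transfer.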

Adapting to Unseen Environments through Explicit Representation of Context

Title Adapting to Unseen Environments through Explicit Representation of Context
Authors Cem C. Tutum, Risto Miikkulainen
Abstract In order to deploy autonomous agents to domains such as autonomous driving, infrastructure management, health care, and finance, they must be able to adapt safely to unseen situations. The current approach in constructing such agents is to try to include as much variation into training as possible, and then generalize within the possible variations. This paper proposes a principled approach where a context module is coevolved with a skill module. The context module recognizes the variation and modulates the skill module so that the entire system performs well in unseen situations. The approach is evaluated in a challenging version of the Flappy Bird game where the effects of the actions vary over time. The Context+Skill approach leads to significantly more robust behavior in environments with previously unseen effects. Such a principled generalization ability is essential in deploying autonomous agents in real world tasks, and can serve as a foundation for continual learning as well.
Tasks Autonomous Driving, Continual Learning
Published 2020-02-13
URL https://arxiv.org/abs/2002.05640v1
PDF https://arxiv.org/pdf/2002.05640v1.pdf
PWC https://paperswithcode.com/paper/adapting-to-unseen-environments-through
Repo
Framework

Bayesian optimization of variable-size design space problems

Title Bayesian optimization of variable-size design space problems
Authors Julien Pelamatti, Loic Brevault, Mathieu Balesdent, El-Ghazali Talbi, Yannick Guerin
Abstract Within the framework of complex system design, it is often necessary to solve mixed variable optimization problems, in which the objective and constraint functions can depend simultaneously on continuous and discrete variables. Additionally, complex system design problems occasionally present a variable-size design space. This results in an optimization problem for which the search space varies dynamically (with respect to both number and type of variables) along the optimization process as a function of the values of specific discrete decision variables. Similarly, the number and type of constraints can vary as well. In this paper, two alternative Bayesian Optimization-based approaches are proposed in order to solve this type of optimization problem. The first one consists of a budget allocation strategy that focuses the computational budget on the most promising design sub-spaces. The second approach, instead, is based on the definition of a kernel function that computes the covariance between samples characterized by partially different sets of variables. The results obtained on analytical and engineering-related test cases show a faster and more consistent convergence of both proposed methods with respect to the standard approaches.
Tasks
Published 2020-03-06
URL https://arxiv.org/abs/2003.03300v1
PDF https://arxiv.org/pdf/2003.03300v1.pdf
PWC https://paperswithcode.com/paper/bayesian-optimization-of-variable-size-design
Repo
Framework

SPFCN: Select and Prune the Fully Convolutional Networks for Real-time Parking Slot Detection

Title SPFCN: Select and Prune the Fully Convolutional Networks for Real-time Parking Slot Detection
Authors Zhuoping Yu, Zhong Gao, Hansheng Chen, Yuyao Huang
Abstract For passenger cars equipped with an automatic parking function, convolutional neural networks are employed to detect parking slots on the panoramic surround view, which is an overhead image synthesized from four calibrated fish-eye images. The accuracy is obtained at the price of low speed or expensive computing equipment, both of which are sensitive issues for many car manufacturers. In this paper, the same accuracy is challenged by the proposed parking slot detector, which leverages deep convolutional networks for faster speed and a smaller model while keeping the accuracy by simultaneously training and pruning the network. To achieve the optimal trade-off, we developed a strategy to select the best receptive fields and prune the redundant channels automatically during training. The proposed model is capable of jointly detecting corners and line features of parking slots while running efficiently in real time on an average CPU. Even without any specific computing devices, the model outperforms existing counterparts, at a frame rate of about 30 FPS on a 2.3 GHz CPU core, achieving a parking slot corner localization error of 1.51$\pm$2.14 cm (std. err.) and slot detection accuracy of 98%, generally satisfying the requirements in both speed and accuracy on on-board mobile terminals.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.11337v1
PDF https://arxiv.org/pdf/2003.11337v1.pdf
PWC https://paperswithcode.com/paper/spfcn-select-and-prune-the-fully
Repo
Framework

Consistency Analysis of Replication-Based Probabilistic Key-Value Stores

Title Consistency Analysis of Replication-Based Probabilistic Key-Value Stores
Authors Ramy E. Ali
Abstract Partial quorum systems are widely used in distributed key-value stores due to their latency benefits at the expense of providing weaker consistency guarantees. The probabilistically bounded staleness framework (PBS) studied the latency-consistency trade-off of Dynamo-style partial quorum systems through Monte Carlo event-based simulations. In this paper, we study the latency-consistency trade-off for such systems analytically and derive a closed-form expression for the inconsistency probability. Our approach allows fine-tuning of latency and consistency guarantees in key-value stores, which is intractable using Monte Carlo event-based simulations.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.06098v2
PDF https://arxiv.org/pdf/2002.06098v2.pdf
PWC https://paperswithcode.com/paper/consistency-analysis-of-replication-based
Repo
Framework
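
To make the quorum trade-off concrete, the following sketch computes the chance that a read quorum misses the latest write quorum entirely when both are chosen uniformly at random among `n` replicas. This is the textbook non-intersection probability for partial quorums, not the closed-form expression derived in the paper; it ignores write propagation over time, and the function name `miss_probability` is hypothetical.

```python
from math import comb


def miss_probability(n, r, w):
    """Probability that a random read quorum of size r contains none of the
    w replicas that acknowledged the latest write (n replicas in total).

    Assumption: quorums are sampled uniformly and no further propagation occurs.
    """
    if r + w > n:
        # Strict quorums always overlap in at least one replica.
        return 0.0
    # All r read replicas must fall among the n - w replicas without the write.
    return comb(n - w, r) / comb(n, r)


for r, w in [(1, 1), (1, 2), (2, 2)]:
    print(f"N=3 R={r} W={w}: miss probability = {miss_probability(3, r, w):.3f}")
```

Raising `r` or `w` lowers the miss probability but makes each operation wait for more replicas, which is the latency-consistency tension the abstract refers to.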

Explainable Deep RDFS Reasoner

Title Explainable Deep RDFS Reasoner
Authors Bassem Makni, Ibrahim Abdelaziz, James Hendler
Abstract Recent research efforts aiming to bridge the Neural-Symbolic gap for RDFS reasoning proved empirically that deep learning techniques can be used to learn RDFS inference rules. However, one of their main deficiencies compared to rule-based reasoners is the lack of derivations for the inferred triples (i.e. explainability in AI terms). In this paper, we build on these approaches to provide not only the inferred graph but also explain how these triples were inferred. In the graph words approach, RDF graphs are represented as a sequence of graph words where inference can be achieved through neural machine translation. To achieve explainability in RDFS reasoning, we revisit this approach and introduce a new neural network model that takes the input graph (as a sequence of graph words) as well as the encoding of the inferred triple and outputs the derivation for the inferred triple. We evaluated our justification model on two datasets: a synthetic dataset (the LUBM benchmark) and a real-world dataset (ScholarlyData, about conferences), where the lowest validation accuracy approached 96%.
Tasks Machine Translation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03514v1
PDF https://arxiv.org/pdf/2002.03514v1.pdf
PWC https://paperswithcode.com/paper/explainable-deep-rdfs-reasoner
Repo
Framework

A Strong Baseline for Fashion Retrieval with Person Re-Identification Models

Title A Strong Baseline for Fashion Retrieval with Person Re-Identification Models
Authors Mikolaj Wieczorek, Andrzej Michalowski, Anna Wroblewska, Jacek Dabrowski
Abstract Fashion retrieval is the challenging task of finding an exact match for fashion items contained within an image. Difficulties arise from the fine-grained nature of clothing items, very large intra-class and inter-class variance. Additionally, query and source images for the task usually come from different domains - street photos and catalogue photos respectively. Due to these differences, a significant gap in quality, lighting, contrast, background clutter and item presentation exists between domains. As a result, fashion retrieval is an active field of research both in academia and the industry. Inspired by recent advancements in Person Re-Identification research, we adapt leading ReID models to be used in fashion retrieval tasks. We introduce a simple baseline model for fashion retrieval, significantly outperforming previous state-of-the-art results despite a much simpler architecture. We conduct in-depth experiments on Street2Shop and DeepFashion datasets and validate our results. Finally, we propose a cross-domain (cross-dataset) evaluation method to test the robustness of fashion retrieval models.
Tasks Person Re-Identification
Published 2020-03-09
URL https://arxiv.org/abs/2003.04094v1
PDF https://arxiv.org/pdf/2003.04094v1.pdf
PWC https://paperswithcode.com/paper/a-strong-baseline-for-fashion-retrieval-with
Repo
Framework

Reducing complexity and unidentifiability when modelling human atrial cells

Title Reducing complexity and unidentifiability when modelling human atrial cells
Authors C. Houston, B. Marchand, L. Engelbert, C. D. Cantwell
Abstract Mathematical models of a cellular action potential in cardiac modelling have become increasingly complex, particularly in gating kinetics which control the opening and closing of individual ion channel currents. As cardiac models advance towards use in personalised medicine to inform clinical decision-making, it is critical to understand the uncertainty hidden in parameter estimates from their calibration to experimental data. This study applies approximate Bayesian computation to re-calibrate the gating kinetics of four ion channels in two existing human atrial cell models to their original datasets, providing a measure of uncertainty and indication of potential issues with selecting a single unique value given the available experimental data. Two approaches are investigated to reduce the uncertainty present: re-calibrating the models to a more complete dataset and using a less complex formulation with fewer parameters to constrain. The re-calibrated models are inserted back into the full cell model to study the overall effect on the action potential. The use of more complete datasets does not eliminate uncertainty present in parameter estimates. The less complex model, particularly for the fast sodium current, gave a better fit to experimental data alongside lower parameter uncertainty and improved computational speed.
Tasks Calibration, Decision Making
Published 2020-01-29
URL https://arxiv.org/abs/2001.10954v1
PDF https://arxiv.org/pdf/2001.10954v1.pdf
PWC https://paperswithcode.com/paper/reducing-complexity-and-unidentifiability
Repo
Framework

Neural Inheritance Relation Guided One-Shot Layer Assignment Search

Title Neural Inheritance Relation Guided One-Shot Layer Assignment Search
Authors Rang Meng, Weijie Chen, Di Xie, Yuan Zhang, Shiliang Pu
Abstract Layer assignment is seldom picked out as an independent research topic in neural architecture search. In this paper, for the first time, we systematically investigate the impact of different layer assignments on network performance by building an architecture dataset of layer assignment on CIFAR-100. Through analyzing this dataset, we discover a neural inheritance relation among the networks with different layer assignments, that is, the optimal layer assignments for deeper networks always inherit from those for shallow networks. Inspired by this neural inheritance relation, we propose an efficient one-shot layer assignment search approach via inherited sampling. Specifically, the optimal layer assignment searched in the shallow network can be provided as a strong sampling prior to train and search the deeper ones in the supernet, which greatly reduces the network search space. Comprehensive experiments carried out on CIFAR-100 illustrate the efficiency of our proposed method. Our search results are strongly consistent with the optimal ones directly selected from the architecture dataset. To further confirm the generalization of our proposed method, we also conduct experiments on Tiny-ImageNet and ImageNet. Our searched results are remarkably superior to the handcrafted ones under the unchanged computational budgets. The neural inheritance relation discovered in this paper can provide insights to the universal neural architecture search.
Tasks Neural Architecture Search
Published 2020-02-28
URL https://arxiv.org/abs/2002.12580v1
PDF https://arxiv.org/pdf/2002.12580v1.pdf
PWC https://paperswithcode.com/paper/neural-inheritance-relation-guided-one-shot
Repo
Framework

Compression of descriptor models for mobile applications

Title Compression of descriptor models for mobile applications
Authors Roy Miles, Krystian Mikolajczyk
Abstract Deep neural networks have demonstrated state-of-the-art performance for feature-based image matching through the advent of new large and diverse datasets. However, there has been little work on evaluating the computational cost, model size, and matching accuracy tradeoffs for these models. This paper explicitly addresses these practical metrics by considering the state-of-the-art HardNet model. We observe a significant redundancy in the learned weights, which we exploit through the use of depthwise separable layers and an efficient Tucker decomposition. We demonstrate that a combination of these methods is very effective, but still sacrifices the top-end accuracy. To resolve this, we propose the Convolution-Depthwise-Pointwise (CDP) layer, which provides a means of interpolating between the standard and depthwise separable convolutions. With this proposed layer, we can achieve an 8 times reduction in the number of parameters on the HardNet model, 13 times reduction in the computational complexity, while sacrificing less than 1% on the overall accuracy across the HPatches benchmarks. To further demonstrate the generalisation of this approach, we apply it to the state-of-the-art SuperPoint model, where we can significantly reduce the number of parameters and floating-point operations, with minimal degradation in the matching accuracy.
Tasks
Published 2020-01-09
URL https://arxiv.org/abs/2001.03102v2
PDF https://arxiv.org/pdf/2001.03102v2.pdf
PWC https://paperswithcode.com/paper/compression-of-convolutional-neural-networks
Repo
Framework
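
The saving from depthwise separable layers mentioned in the abstract can be seen by counting weights. The sketch below compares a standard convolution with a depthwise-plus-pointwise pair; it does not implement the CDP layer, biases are ignored, and the layer sizes are illustrative assumptions rather than HardNet's actual configuration.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per input channel) convolution
    followed by a pointwise (1 x 1) convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer size (an assumption, not taken from the paper).
k, c_in, c_out = 3, 128, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}x")
```

The ratio approaches `kernel * kernel` as the number of output channels grows, because the pointwise term then dominates the separable layer's cost.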

Early Forecasting of Text Classification Accuracy and F-Measure with Active Learning

Title Early Forecasting of Text Classification Accuracy and F-Measure with Active Learning
Authors Thomas Orth, Michael Bloodgood
Abstract When creating text classification systems, one of the major bottlenecks is the annotation of training data. Active learning has been proposed to address this bottleneck using stopping methods to minimize the cost of data annotation. An important capability for improving the utility of stopping methods is to effectively forecast the performance of the text classification models. Forecasting can be done through the use of logarithmic models regressed on some portion of the data as learning is progressing. A critical unexplored question is what portion of the data is needed for accurate forecasting. There is a tension, where it is desirable to use less data so that the forecast can be made earlier, which is more useful, versus it being desirable to use more data, so that the forecast can be more accurate. We find that when using active learning it is even more important to generate forecasts earlier so as to make them more useful and not waste annotation effort. We investigate the difference in forecasting difficulty when using accuracy and F-measure as the text classification system performance metrics and we find that F-measure is more difficult to forecast. We conduct experiments on seven text classification datasets in different semantic domains with different characteristics and with three different base machine learning algorithms. We find that forecasting is easiest for decision tree learning, moderate for Support Vector Machines, and most difficult for neural networks.
Tasks Active Learning, Text Classification
Published 2020-01-20
URL https://arxiv.org/abs/2001.10337v1
PDF https://arxiv.org/pdf/2001.10337v1.pdf
PWC https://paperswithcode.com/paper/early-forecasting-of-text-classification
Repo
Framework
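
The logarithmic forecasting models mentioned in the abstract can be sketched as an ordinary least-squares fit of `score = a + b * ln(n)` to the early part of a learning curve. The exact model form, the function names and the synthetic numbers below are assumptions for illustration; the paper's regression setup may differ.

```python
import math


def fit_log_curve(sizes, scores):
    """Least-squares fit of score = a + b * ln(size); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast(a, b, size):
    """Predicted score at a (larger) training-set size."""
    return a + b * math.log(size)


# Synthetic early learning-curve points (hypothetical, not experimental data).
sizes = [100, 200, 400, 800]
scores = [0.5 + 0.05 * math.log(n) for n in sizes]
a, b = fit_log_curve(sizes, scores)
print(f"a={a:.3f}, b={b:.3f}, forecast at 1600 examples: {forecast(a, b, 1600):.3f}")
```

Fitting on fewer points gives an earlier forecast but a noisier estimate of `a` and `b`, which is the tension the abstract describes.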

Kalman Filtering and Expectation Maximization for Multitemporal Spectral Unmixing

Title Kalman Filtering and Expectation Maximization for Multitemporal Spectral Unmixing
Authors Ricardo Augusto Borsoi, Tales Imbiriba, Pau Closas, José Carlos Moreira Bermudez, Cédric Richard
Abstract The recent evolution of hyperspectral imaging technology and the proliferation of new emerging applications press for the processing of multiple temporal hyperspectral images. In this work, we propose a novel spectral unmixing (SU) strategy using physically motivated parametric endmember representations to account for temporal spectral variability. By representing the multitemporal mixing process using a state-space formulation, we are able to exploit the Bayesian filtering machinery to estimate the endmember variability coefficients. Moreover, by assuming that the temporal variability of the abundances is small over short intervals, an efficient implementation of the expectation maximization (EM) algorithm is employed to estimate the abundances and the other model parameters. Simulation results indicate that the proposed strategy outperforms state-of-the-art multitemporal SU algorithms.
Tasks
Published 2020-01-02
URL https://arxiv.org/abs/2001.00425v1
PDF https://arxiv.org/pdf/2001.00425v1.pdf
PWC https://paperswithcode.com/paper/kalman-filtering-and-expectation-maximization
Repo
Framework
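
The Bayesian filtering step referred to in the abstract can be illustrated with a scalar Kalman filter under a random-walk state model. This is a generic textbook sketch, not the paper's multitemporal unmixing model; the function name `kalman_step` and the noise variances are assumptions.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model: x_t = x_{t-1} + process noise, y_t = x_t + measurement noise.
    """
    # Predict: the state estimate carries over, its uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and measurement according to their variances.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


mean, var = 0.0, 1.0
for y in [2.0, 2.1, 1.9]:
    mean, var = kalman_step(mean, var, y, process_var=0.01, measurement_var=1.0)
    print(f"measurement={y:.1f} -> mean={mean:.3f}, var={var:.3f}")
```

In a state-space formulation like the one the abstract describes, the scalar state would be replaced by a vector of coefficients and the variances by covariance matrices.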

GeoConv: Geodesic Guided Convolution for Facial Action Unit Recognition

Title GeoConv: Geodesic Guided Convolution for Facial Action Unit Recognition
Authors Yuedong Chen, Guoxian Song, Zhiwen Shao, Jianfei Cai, Tat-Jen Cham, Jianming Zheng
Abstract Automatic facial action unit (AU) recognition has attracted great attention but still remains a challenging task, as subtle changes of local facial muscles are difficult to thoroughly capture. Most existing AU recognition approaches leverage geometry information in a straightforward 2D or 3D manner, which either ignore 3D manifold information or suffer from high computational costs. In this paper, we propose a novel geodesic guided convolution (GeoConv) for AU recognition by embedding 3D manifold information into 2D convolutions. Specifically, the kernel of GeoConv is weighted by our introduced geodesic weights, which are negatively correlated to geodesic distances on a coarsely reconstructed 3D face model. Moreover, based on GeoConv, we further develop an end-to-end trainable framework named GeoCNN for AU recognition. Extensive experiments on BP4D and DISFA benchmarks show that our approach significantly outperforms the state-of-the-art AU recognition methods.
Tasks Facial Action Unit Detection
Published 2020-03-06
URL https://arxiv.org/abs/2003.03055v1
PDF https://arxiv.org/pdf/2003.03055v1.pdf
PWC https://paperswithcode.com/paper/geoconv-geodesic-guided-convolution-for
Repo
Framework