October 21, 2019


Paper Group AWR 83



ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations

Title ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations
Authors Rafael Izbicki, Ann B. Lee, Taylor Pospisil
Abstract Approximate Bayesian Computation (ABC) is typically used when the likelihood is either unavailable or intractable but where data can be simulated under different parameter settings using a forward model. Despite the recent interest in ABC, high-dimensional data and costly simulations still remain a bottleneck in some applications. There is also no consensus as to how to best assess the performance of such methods without knowing the true posterior. We show how a nonparametric conditional density estimation (CDE) framework, which we refer to as ABC-CDE, helps address three nontrivial challenges in ABC: (i) how to efficiently estimate the posterior distribution with limited simulations and different types of data, (ii) how to tune and compare the performance of ABC and related methods in estimating the posterior itself, rather than just certain properties of the density, and (iii) how to efficiently choose among a large set of summary statistics based on a CDE surrogate loss. We provide theoretical and empirical evidence that justifies ABC-CDE procedures that directly estimate and assess the posterior based on an initial ABC sample, and we describe settings where standard ABC and regression-based approaches are inadequate.
Tasks Density Estimation
Published 2018-05-14
URL http://arxiv.org/abs/1805.05480v2
PDF http://arxiv.org/pdf/1805.05480v2.pdf
PWC https://paperswithcode.com/paper/abc-cde-towards-approximate-bayesian
Repo https://github.com/tpospisi/NNKCDE
Framework none
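The core ABC-CDE idea can be illustrated with a nearest-neighbor kernel CDE: keep the simulations whose summaries fall closest to the observed data, then smooth their parameter values into a posterior density estimate. This is a minimal one-dimensional sketch of that idea, not the NNKCDE implementation in the linked repo; all names and the toy forward model are illustrative.

```python
import math
import random

def nn_kcde(sims, x_obs, k=50, bandwidth=0.2):
    """Estimate the posterior density f(theta | x_obs) from simulated
    (theta, x) pairs: keep the k simulations whose summaries are closest
    to the observed data, then smooth their thetas with a Gaussian kernel.
    Returns a function theta -> density estimate."""
    nearest = sorted(sims, key=lambda s: abs(s[1] - x_obs))[:k]
    thetas = [t for t, _ in nearest]

    def density(theta):
        z = sum(math.exp(-0.5 * ((theta - t) / bandwidth) ** 2) for t in thetas)
        return z / (k * bandwidth * math.sqrt(2 * math.pi))

    return density

# Toy forward model: x = theta + noise; the posterior should peak near x_obs.
random.seed(0)
sims = [(th, th + random.gauss(0, 0.1))
        for th in (random.uniform(-2, 2) for _ in range(5000))]
post = nn_kcde(sims, x_obs=1.0)
```

Because the estimate is a full density rather than a point summary, it can be compared against other posterior estimators with a CDE loss, which is the tuning strategy the paper advocates.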

Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction

Title Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Authors Mingze Xu, Chenyou Fan, John D Paden, Geoffrey C Fox, David J Crandall
Abstract Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical. It is less clear how well these techniques may apply on structured prediction problems where fine-grained output with high precision is required, such as in scientific imaging domains. Here we consider the problem of segmenting echogram radar data collected from the polar ice sheets, which is challenging because segmentation boundaries are often very weak and there is a high degree of noise. We propose a multi-task spatiotemporal neural network that combines 3D ConvNets and Recurrent Neural Networks (RNNs) to estimate ice surface boundaries from sequences of tomographic radar images. We show that our model outperforms the state-of-the-art on this problem by (1) avoiding the need for hand-tuned parameters, (2) extracting multiple surfaces (ice-air and ice-bed) simultaneously, (3) requiring less non-visual metadata, and (4) being about 6 times faster.
Tasks Structured Prediction
Published 2018-01-11
URL http://arxiv.org/abs/1801.03986v2
PDF http://arxiv.org/pdf/1801.03986v2.pdf
PWC https://paperswithcode.com/paper/multi-task-spatiotemporal-neural-networks-for
Repo https://github.com/shyam1692/ice-reconstruction
Framework pytorch

Random directions stochastic approximation with deterministic perturbations

Title Random directions stochastic approximation with deterministic perturbations
Authors Prashanth L A, Shalabh Bhatnagar, Nirav Bhavsar, Michael Fu, Steven I. Marcus
Abstract We introduce deterministic perturbation schemes for the recently proposed random directions stochastic approximation (RDSA) [17], and propose new first-order and second-order algorithms. In the latter case, these are the first second-order algorithms to incorporate deterministic perturbations. We show that the gradient and/or Hessian estimates in the resulting algorithms with deterministic perturbations are asymptotically unbiased, so that the algorithms are provably convergent. Furthermore, we derive convergence rates to establish the superiority of the first-order and second-order algorithms, for the special case of a convex and quadratic optimization problem, respectively. Numerical experiments are used to validate the theoretical results.
Tasks
Published 2018-08-08
URL http://arxiv.org/abs/1808.02871v2
PDF http://arxiv.org/pdf/1808.02871v2.pdf
PWC https://paperswithcode.com/paper/random-directions-stochastic-approximation
Repo https://github.com/prashla/RDSA
Framework none
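The key requirement on a deterministic perturbation sequence is that its empirical second moment equals the identity, so the finite-difference gradient estimate is asymptotically unbiased. The sketch below shows a first-order RDSA-style update in two dimensions with a Hadamard-style ±1 cycle; step sizes and the quadratic objective are illustrative choices, not the paper's exact schemes.

```python
def rdsa_det(f, x0, steps=2000, delta=1e-3, lr=0.1):
    """First-order stochastic-approximation sketch with a deterministic
    +/-1 perturbation cycle. Averaged over the cycle, d d^T equals the
    identity, so the two-sided difference estimates the gradient."""
    cycle = [(1.0, 1.0), (1.0, -1.0)]  # Hadamard-style directions for d = 2
    x = list(x0)
    for n in range(steps):
        d = cycle[n % len(cycle)]
        xp = [xi + delta * di for xi, di in zip(x, d)]
        xm = [xi - delta * di for xi, di in zip(x, d)]
        scale = (f(xp) - f(xm)) / (2 * delta)  # directional derivative estimate
        x = [xi - lr * scale * di for xi, di in zip(x, d)]
    return x

# Convex quadratic with minimum at (1, -2).
quad = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
xstar = rdsa_det(quad, [5.0, 5.0])
```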

Improving Reinforcement Learning Based Image Captioning with Natural Language Prior

Title Improving Reinforcement Learning Based Image Captioning with Natural Language Prior
Authors Tszhang Guo, Shiyu Chang, Mo Yu, Kun Bai
Abstract Recently, Reinforcement Learning (RL) approaches have demonstrated advanced performance in image captioning by directly optimizing the metric used for testing. However, this shaped reward introduces learning biases, which reduce the readability of generated text. In addition, the large sample space makes training unstable and slow. To alleviate these issues, we propose a simple coherent solution that constrains the action space using an n-gram language prior. Quantitative and qualitative evaluations on benchmarks show that RL with the simple add-on module performs favorably against its counterpart in terms of both readability and speed of convergence. Human evaluations show that our model produces more readable and natural captions. The implementation will become publicly available upon the acceptance of the paper.
Tasks Image Captioning
Published 2018-09-13
URL http://arxiv.org/abs/1809.06227v1
PDF http://arxiv.org/pdf/1809.06227v1.pdf
PWC https://paperswithcode.com/paper/improving-reinforcement-learning-based-image
Repo https://github.com/tgGuo15/PriorImageCaption
Framework pytorch
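Constraining the action space with an n-gram prior amounts to masking the sampler's vocabulary to words the prior has actually seen follow the previous word. A toy bigram version of that masking is sketched below; the function names and toy corpus are illustrative, and the paper's module operates on model logits rather than a dictionary of scores.

```python
from collections import defaultdict
import random

def build_bigram_prior(corpus):
    """Collect, for every word, the set of words observed to follow it.
    During RL sampling this set masks the action space so the policy can
    only emit locally fluent continuations."""
    follow = defaultdict(set)
    for sent in corpus:
        toks = sent.split()
        for a, b in zip(toks, toks[1:]):
            follow[a].add(b)
    return follow

def constrained_sample(scores, prev_word, prior, rng):
    """Sample the next word from model scores, restricted to words the
    bigram prior allows after prev_word; fall back to the full vocabulary
    if the prior has never seen prev_word."""
    allowed = prior.get(prev_word) or set(scores)
    candidates = {w: s for w, s in scores.items() if w in allowed}
    words, weights = zip(*candidates.items())
    return rng.choices(words, weights=weights)[0]

corpus = ["a dog runs", "a cat sleeps", "the dog sleeps"]
prior = build_bigram_prior(corpus)
rng = random.Random(0)
scores = {"dog": 0.5, "cat": 0.4, "runs": 0.1}  # toy model scores
nxt = constrained_sample(scores, "a", prior, rng)
```

Shrinking the candidate set this way both improves local fluency and reduces the variance of the RL gradient, which is why convergence speeds up.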

Data-driven discovery of PDEs in complex datasets

Title Data-driven discovery of PDEs in complex datasets
Authors Jens Berg, Kaj Nyström
Abstract Many processes in science and engineering can be described by partial differential equations (PDEs). Traditionally, PDEs are derived by considering first principles of physics to derive the relations between the involved physical quantities of interest. A different approach is to measure the quantities of interest and use deep learning to reverse engineer the PDEs which are describing the physical process. In this paper we use machine learning, and deep learning in particular, to discover PDEs hidden in complex data sets from measurement data. We include examples of data from a known model problem, and real data from weather station measurements. We show how necessary transformations of the input data amount to coordinate transformations in the discovered PDE, and we elaborate on feature and model selection. It is shown that the dynamics of a non-linear, second order PDE can be accurately described by an ordinary differential equation which is automatically discovered by our deep learning algorithm. Even more interestingly, we show that similar results apply in the context of more complex simulations of the Swedish temperature distribution.
Tasks Model Selection
Published 2018-08-31
URL http://arxiv.org/abs/1808.10788v1
PDF http://arxiv.org/pdf/1808.10788v1.pdf
PWC https://paperswithcode.com/paper/data-driven-discovery-of-pdes-in-complex
Repo https://github.com/arnauldnzegha/deep2pde_Berg_Nystrom
Framework none
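The data-driven discovery recipe is: estimate the time derivative on a grid, build a library of candidate spatial terms, and regress one on the other; the surviving coefficients name the PDE. A minimal sketch with plain least squares on a two-term library (the paper uses deep networks and richer libraries; the toy advection data and term set here are assumptions):

```python
import math

def discover_pde(u, dx, dt):
    """Regress the time derivative u_t onto a small library of spatial
    terms {u, u_x} via least squares (2x2 normal equations), returning
    the fitted coefficients (a, b) in u_t ~ a*u + b*u_x."""
    rows_t, rows_x, rows_u = [], [], []
    for n in range(1, len(u) - 1):
        for i in range(1, len(u[0]) - 1):
            rows_t.append((u[n + 1][i] - u[n - 1][i]) / (2 * dt))  # central diff in t
            rows_x.append((u[n][i + 1] - u[n][i - 1]) / (2 * dx))  # central diff in x
            rows_u.append(u[n][i])
    suu = sum(v * v for v in rows_u); sxx = sum(v * v for v in rows_x)
    sux = sum(p * q for p, q in zip(rows_u, rows_x))
    stu = sum(p * q for p, q in zip(rows_t, rows_u))
    stx = sum(p * q for p, q in zip(rows_t, rows_x))
    det = suu * sxx - sux * sux
    a = (stu * sxx - stx * sux) / det
    b = (suu * stx - sux * stu) / det
    return a, b

# Data generated by the advection equation u_t = -2 u_x, u(x, t) = sin(x - 2t);
# the regression should recover a ~ 0 and b ~ -2.
dx, dt, c = 0.05, 0.01, 2.0
grid = [[math.sin(i * dx - c * n * dt) for i in range(200)] for n in range(100)]
a, b = discover_pde(grid, dx, dt)
```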

Memory Replay GANs: learning to generate images from new categories without forgetting

Title Memory Replay GANs: learning to generate images from new categories without forgetting
Authors Chenshen Wu, Luis Herranz, Xialei Liu, Yaxing Wang, Joost van de Weijer, Bogdan Raducanu
Abstract Previous works on sequential learning address the problem of forgetting in discriminative models. In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine tuning renders the network unable to properly generate images from previous categories (i.e. forgetting). Addressing this problem, we propose Memory Replay GANs (MeRGANs), a conditional GAN framework that integrates a memory replay generator. We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results in MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories.
Tasks
Published 2018-09-06
URL https://arxiv.org/abs/1809.02058v3
PDF https://arxiv.org/pdf/1809.02058v3.pdf
PWC https://paperswithcode.com/paper/memory-replay-gans-learning-to-generate
Repo https://github.com/WuChenshen/MeRGAN
Framework tf

AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning

Title AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning
Authors Zheng Xie, XingYu Fu, JinYuan Yu
Abstract In this project, we combine AlphaGo algorithm with Curriculum Learning to crack the game of Gomoku. Modifications like Double Networks Mechanism and Winning Value Decay are implemented to solve the intrinsic asymmetry and short-sight of Gomoku. Our final AI AlphaGomoku, through two days’ training on a single GPU, has reached humans’ playing level.
Tasks
Published 2018-09-27
URL http://arxiv.org/abs/1809.10595v1
PDF http://arxiv.org/pdf/1809.10595v1.pdf
PWC https://paperswithcode.com/paper/alphagomoku-an-alphago-based-gomoku
Repo https://github.com/PolyKen/15_by_15_AlphaGomoku
Framework tf

Joint entity recognition and relation extraction as a multi-head selection problem

Title Joint entity recognition and relation extraction as a multi-head selection problem
Authors Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder
Abstract State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need of any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identify multiple relations for each entity). We present an extensive experimental setup, to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.
Tasks Relation Extraction
Published 2018-04-20
URL http://arxiv.org/abs/1804.07847v3
PDF http://arxiv.org/pdf/1804.07847v3.pdf
PWC https://paperswithcode.com/paper/joint-entity-recognition-and-relation
Repo https://github.com/bekou/multihead_joint_entity_relation_extraction
Framework tf
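The multi-head selection decoder differs from an argmax decoder in one essential way: each token independently keeps every (head, relation) pair whose sigmoid score clears a threshold, so one entity can participate in several relations. A toy sketch of just that decoding step, assuming raw scores are already computed (the names and toy scores are illustrative, not the paper's model):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def select_heads(scores, threshold=0.5):
    """Multi-head selection: for each token, keep every (head, relation)
    pair whose sigmoid score clears the threshold, so a token may take
    part in several relations at once (unlike an argmax decoder)."""
    out = {}
    for token, pairs in scores.items():
        out[token] = [(h, r) for (h, r), s in pairs.items()
                      if sigmoid(s) >= threshold]
    return out

# Toy scores: "Smith" heads both a works_for and a lives_in relation.
scores = {
    "Smith": {("Acme", "works_for"): 3.2,
              ("Boston", "lives_in"): 1.1,
              ("Acme", "lives_in"): -2.0},
}
picked = select_heads(scores)
```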

A Neurodynamic model of Saliency prediction in V1

Title A Neurodynamic model of Saliency prediction in V1
Authors David Berga, Xavier Otazu
Abstract Lateral connections in the primary visual cortex (area V1 or striate cortex) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. Our model’s (named Neurodynamic Saliency WAvelet Model or NSWAM) architecture is based on Penacchio’s neurodynamic model of lateral connections of V1 (defined as a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation and scale). We tested NSWAM saliency predictions using images from eye tracking datasets, showing that it is an improvement with respect to previous models as well as consistent with human psychophysics. Hence, we show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1 (without applying any type of training or optimization and keeping the same parametrization for all the visual processes). This can be useful for the definition of a unified architecture of the primary visual cortex.
Tasks Eye Tracking, Saliency Prediction
Published 2018-11-15
URL https://arxiv.org/abs/1811.06308v7
PDF https://arxiv.org/pdf/1811.06308v7.pdf
PWC https://paperswithcode.com/paper/a-neurodynamic-model-of-saliency-prediction
Repo https://github.com/dberga/NSWAM
Framework none

Visual Dialogue without Vision or Dialogue

Title Visual Dialogue without Vision or Dialogue
Authors Daniela Massiceti, Puneet K. Dokania, N. Siddharth, Philip H. S. Torr
Abstract We characterise some of the quirks and shortcomings in the exploration of Visual Dialogue - a sequential question-answering task where the questions and corresponding answers are related through given visual stimuli. To do so, we develop an embarrassingly simple method based on Canonical Correlation Analysis (CCA) that, on the standard dataset, achieves near state-of-the-art performance on mean rank (MR). In direct contrast to current complex and over-parametrised architectures that are both compute and time intensive, our method ignores the visual stimuli, ignores the sequencing of dialogue, does not need gradients, uses off-the-shelf feature extractors, has at least an order of magnitude fewer parameters, and learns in practically no time. We argue that these results are indicative of issues in current approaches to Visual Dialogue and conduct analyses to highlight implicit dataset biases and effects of over-constrained evaluation metrics. Our code is publicly available.
Tasks Question Answering, Visual Dialog
Published 2018-12-16
URL https://arxiv.org/abs/1812.06417v3
PDF https://arxiv.org/pdf/1812.06417v3.pdf
PWC https://paperswithcode.com/paper/visual-dialogue-without-vision-or-dialogue
Repo https://github.com/danielamassiceti/CCA-visualdialogue
Framework pytorch

A Novel Framework for Online Supervised Learning with Feature Selection

Title A Novel Framework for Online Supervised Learning with Feature Selection
Authors Lizhe Sun, Yangzi Guo, Adrian Barbu
Abstract Current online learning methods suffer from issues such as lower convergence rates and limited capability to recover the support of the true features compared to their offline counterparts. In this paper, we present a novel framework for online learning based on running averages and introduce a series of online versions of some popular existing offline methods such as Elastic Net, Minimax Concave Penalty and Feature Selection with Annealing. We prove the equivalence between our online methods and their offline counterparts and give theoretical true feature recovery and convergence guarantees for some of them. In contrast to the existing online methods, the proposed methods can extract models with any desired sparsity level at any time. Numerical experiments indicate that our new methods enjoy high accuracy of true feature recovery and a fast convergence rate, compared with standard online and offline algorithms. We also show how the running averages framework can be used for model adaptation in the presence of model drift. Finally, we present some applications to large datasets where again the proposed framework shows competitive results compared to popular online and offline algorithms.
Tasks Feature Selection
Published 2018-03-30
URL https://arxiv.org/abs/1803.11521v6
PDF https://arxiv.org/pdf/1803.11521v6.pdf
PWC https://paperswithcode.com/paper/a-novel-framework-for-online-supervised
Repo https://github.com/lizhesun0507/Runningaverageonlinelearning
Framework none
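The running-averages trick is that an offline estimator built from the sufficient statistics X'X/n and X'y/n can be refit at any point in the stream without storing past samples. A minimal sketch for ordinary least squares in two dimensions (the paper's framework covers penalized and annealed variants; class and variable names here are illustrative):

```python
import random

class RunningAverages:
    """Maintain running averages of the sufficient statistics X'X/n and
    X'y/n, so an offline-style estimator can be refit at any time without
    revisiting past samples."""
    def __init__(self, p):
        self.n = 0
        self.xx = [[0.0] * p for _ in range(p)]
        self.xy = [0.0] * p

    def update(self, x, y):
        self.n += 1
        w = 1.0 / self.n  # incremental-mean weight
        for i in range(len(x)):
            self.xy[i] += w * (x[i] * y - self.xy[i])
            for j in range(len(x)):
                self.xx[i][j] += w * (x[i] * x[j] - self.xx[i][j])

    def fit_ols(self):
        # Solve (X'X/n) beta = X'y/n in closed form for p = 2.
        (a, b), (c, d) = self.xx
        det = a * d - b * c
        return [(d * self.xy[0] - b * self.xy[1]) / det,
                (a * self.xy[1] - c * self.xy[0]) / det]

rng = random.Random(1)
model = RunningAverages(2)
for _ in range(5000):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    y = 3.0 * x[0] - 1.5 * x[1] + rng.gauss(0, 0.1)
    model.update(x, y)
beta = model.fit_ols()  # should be close to the true (3.0, -1.5)
```

Because the statistics are averages rather than sums, they stay numerically bounded over arbitrarily long streams, which is also what makes adaptation to model drift (e.g., via exponential weighting) straightforward.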

Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

Title Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Authors Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester
Abstract Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and if those patterns coincide with manually-defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models for the task of morphological tagging on three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation can be found at https://github.com/FredericGodin/ContextualDecomposition-NLP .
Tasks Morphological Tagging
Published 2018-08-28
URL http://arxiv.org/abs/1808.09551v1
PDF http://arxiv.org/pdf/1808.09551v1.pdf
PWC https://paperswithcode.com/paper/explaining-character-aware-neural-networks
Repo https://github.com/FredericGodin/ContextualDecomposition-NLP
Framework pytorch

Automatic segmentation of skin lesions using deep learning

Title Automatic segmentation of skin lesions using deep learning
Authors Joshua Peter Ebenezer, Jagath C. Rajapakse
Abstract This paper summarizes the method used in our submission to Task 1 of the International Skin Imaging Collaboration’s (ISIC) Skin Lesion Analysis Towards Melanoma Detection challenge held in 2018. We used a fully automated method to accurately segment lesion boundaries from dermoscopic images. A U-net deep learning network is trained on publicly available data from ISIC. We introduce the use of intensity, color, and texture enhancement operations as pre-processing steps and morphological operations and contour identification as post-processing steps.
Tasks
Published 2018-07-13
URL http://arxiv.org/abs/1807.04893v1
PDF http://arxiv.org/pdf/1807.04893v1.pdf
PWC https://paperswithcode.com/paper/automatic-segmentation-of-skin-lesions-using
Repo https://github.com/JoshuaEbenezer/deep_segment
Framework none

Differentiating Concepts and Instances for Knowledge Graph Embedding

Title Differentiating Concepts and Instances for Knowledge Graph Embedding
Authors Xin Lv, Lei Hou, Juanzi Li, Zhiyuan Liu
Abstract Concepts, which represent a group of different instances sharing common properties, are essential information in knowledge representation. Most conventional knowledge embedding methods encode both entities (concepts and instances) and relations as vectors in a low dimensional semantic space equally, ignoring the difference between concepts and instances. In this paper, we propose a novel knowledge graph embedding model named TransC by differentiating concepts and instances. Specifically, TransC encodes each concept in knowledge graph as a sphere and each instance as a vector in the same semantic space. We use the relative positions to model the relations between concepts and instances (i.e., instanceOf), and the relations between concepts and sub-concepts (i.e., subClassOf). We evaluate our model on both link prediction and triple classification tasks on the dataset based on YAGO. Experimental results show that TransC outperforms state-of-the-art methods, and captures the semantic transitivity for instanceOf and subClassOf relations. Our codes and datasets can be obtained from https://github.com/davidlvxin/TransC.
Tasks Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction
Published 2018-11-12
URL http://arxiv.org/abs/1811.04588v1
PDF http://arxiv.org/pdf/1811.04588v1.pdf
PWC https://paperswithcode.com/paper/differentiating-concepts-and-instances-for
Repo https://github.com/davidlvxin/TransC
Framework none
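Geometrically, instanceOf holds when the instance vector falls inside the concept's sphere, and subClassOf holds when one sphere is contained in the other; hinge losses on those conditions are what training pushes toward zero. A small sketch of the two scoring functions in that spirit (the exact loss terms and margins in TransC differ; vectors and radii here are made up):

```python
import math

def norm(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def instance_of_loss(inst, center, radius):
    """instanceOf score: zero when the instance vector lies inside the
    concept sphere, otherwise the distance past the boundary."""
    return max(0.0, norm(inst, center) - radius)

def subclass_of_loss(c1, r1, c2, r2):
    """subClassOf score: zero when sphere (c1, r1) is contained in
    sphere (c2, r2), i.e. ||c1 - c2|| + r1 <= r2."""
    return max(0.0, norm(c1, c2) + r1 - r2)

dog = ([0.1, 0.2], 0.3)     # (sphere center, radius) for concept "dog"
animal = ([0.0, 0.0], 1.0)  # a broader concept sphere
rex = [0.2, 0.1]            # an instance vector
l1 = instance_of_loss(rex, *dog)
l2 = subclass_of_loss(*dog, *animal)
```

Sphere containment is transitive by construction, which is how the model captures the semantic transitivity of subClassOf chains.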

Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data

Title Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data
Authors Nazanin Alipourfard, Peter G. Fennell, Kristina Lerman
Abstract We describe a data-driven discovery method that leverages Simpson’s paradox to uncover interesting patterns in behavioral data. Our method systematically disaggregates data to identify subgroups within a population whose behavior deviates significantly from the rest of the population. Given an outcome of interest and a set of covariates, the method follows three steps. First, it disaggregates data into subgroups, by conditioning on a particular covariate, so as to minimize the variation of the outcome within the subgroups. Next, it models the outcome as a linear function of another covariate, both in the subgroups and in the aggregate data. Finally, it compares trends to identify disaggregations that produce subgroups with different behaviors from the aggregate. We illustrate the method by applying it to three real-world behavioral datasets, including Q&A site Stack Exchange and online learning platforms Khan Academy and Duolingo.
Tasks
Published 2018-05-08
URL http://arxiv.org/abs/1805.03094v1
PDF http://arxiv.org/pdf/1805.03094v1.pdf
PWC https://paperswithcode.com/paper/using-simpsons-paradox-to-discover
Repo https://github.com/ninoch/Trend-Simpsons-Paradox
Framework none
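The trend-comparison step reduces to fitting a slope on the aggregate data, fitting slopes per subgroup, and flagging disaggregations where the signs disagree. A compact sketch of that step on a toy dataset (the full method also selects which covariate to condition on; the helper names here are illustrative):

```python
def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def simpsons_pairs(rows):
    """Flag disaggregations where subgroup trends have the opposite sign
    of the aggregate trend -- the signature of Simpson's paradox.
    rows are (group, x, y) triples."""
    agg = slope([x for _, x, _ in rows], [y for _, _, y in rows])
    groups = {}
    for g, x, y in rows:
        groups.setdefault(g, []).append((x, y))
    sub = {g: slope([x for x, _ in pts], [y for _, y in pts])
           for g, pts in groups.items()}
    flipped = [g for g, s in sub.items() if s * agg < 0]
    return agg, sub, flipped

# Two subgroups with negative trends that aggregate to a positive trend.
rows = [("A", 0, 0.0), ("A", 1, -0.5), ("A", 2, -1.0),
        ("B", 5, 9.0), ("B", 6, 8.5), ("B", 7, 8.0)]
agg, sub, flipped = simpsons_pairs(rows)
```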