October 21, 2019


Paper Group AWR 83



ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations

Title ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations
Authors Rafael Izbicki, Ann B. Lee, Taylor Pospisil
Abstract Approximate Bayesian Computation (ABC) is typically used when the likelihood is either unavailable or intractable but where data can be simulated under different parameter settings using a forward model. Despite the recent interest in ABC, high-dimensional data and costly simulations still remain a bottleneck in some applications. There is also no consensus as to how to best assess the performance of such methods without knowing the true posterior. We show how a nonparametric conditional density estimation (CDE) framework, which we refer to as ABC-CDE, helps address three nontrivial challenges in ABC: (i) how to efficiently estimate the posterior distribution with limited simulations and different types of data, (ii) how to tune and compare the performance of ABC and related methods in estimating the posterior itself, rather than just certain properties of the density, and (iii) how to efficiently choose among a large set of summary statistics based on a CDE surrogate loss. We provide theoretical and empirical evidence that justifies ABC-CDE procedures that directly estimate and assess the posterior based on an initial ABC sample, and we describe settings where standard ABC and regression-based approaches are inadequate.
Tasks Density Estimation
Published 2018-05-14
URL http://arxiv.org/abs/1805.05480v2
PDF http://arxiv.org/pdf/1805.05480v2.pdf
PWC https://paperswithcode.com/paper/abc-cde-towards-approximate-bayesian
Repo https://github.com/tpospisi/NNKCDE
Framework none
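The core ABC-CDE idea can be illustrated with a nearest-neighbor kernel CDE: keep the simulations whose summaries fall closest to the observed data, then smooth their parameter values into a posterior density estimate. This is a minimal one-dimensional sketch of that idea, not the NNKCDE implementation in the linked repo; all names and the toy forward model are illustrative.

```python
import math
import random

def nn_kcde(sims, x_obs, k=50, bandwidth=0.2):
    """Estimate the posterior density f(theta | x_obs) from simulated
    (theta, x) pairs: keep the k simulations whose summaries are closest
    to the observed data, then smooth their thetas with a Gaussian kernel.
    Returns a function theta -> density estimate."""
    nearest = sorted(sims, key=lambda s: abs(s[1] - x_obs))[:k]
    thetas = [t for t, _ in nearest]

    def density(theta):
        z = sum(math.exp(-0.5 * ((theta - t) / bandwidth) ** 2) for t in thetas)
        return z / (k * bandwidth * math.sqrt(2 * math.pi))

    return density

# Toy forward model: x = theta + noise; the posterior should peak near x_obs.
random.seed(0)
sims = [(th, th + random.gauss(0, 0.1))
        for th in (random.uniform(-2, 2) for _ in range(5000))]
post = nn_kcde(sims, x_obs=1.0)
```

Because the estimate is a full density rather than a point summary, it can be compared against other posterior estimators with a CDE loss, which is the tuning strategy the paper advocates.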

Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction

Title Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Authors Mingze Xu, Chenyou Fan, John D Paden, Geoffrey C Fox, David J Crandall
Abstract Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical. It is less clear how well these techniques may apply on structured prediction problems where fine-grained output with high precision is required, such as in scientific imaging domains. Here we consider the problem of segmenting echogram radar data collected from the polar ice sheets, which is challenging because segmentation boundaries are often very weak and there is a high degree of noise. We propose a multi-task spatiotemporal neural network that combines 3D ConvNets and Recurrent Neural Networks (RNNs) to estimate ice surface boundaries from sequences of tomographic radar images. We show that our model outperforms the state-of-the-art on this problem by (1) avoiding the need for hand-tuned parameters, (2) extracting multiple surfaces (ice-air and ice-bed) simultaneously, (3) requiring less non-visual metadata, and (4) being about 6 times faster.
Tasks Structured Prediction
Published 2018-01-11
URL http://arxiv.org/abs/1801.03986v2
PDF http://arxiv.org/pdf/1801.03986v2.pdf
PWC https://paperswithcode.com/paper/multi-task-spatiotemporal-neural-networks-for
Repo https://github.com/shyam1692/ice-reconstruction
Framework pytorch

Random directions stochastic approximation with deterministic perturbations

Title Random directions stochastic approximation with deterministic perturbations
Authors Prashanth L A, Shalabh Bhatnagar, Nirav Bhavsar, Michael Fu, Steven I. Marcus
Abstract We introduce deterministic perturbation schemes for the recently proposed random directions stochastic approximation (RDSA) [17], and propose new first-order and second-order algorithms. In the latter case, these are the first second-order algorithms to incorporate deterministic perturbations. We show that the gradient and/or Hessian estimates in the resulting algorithms with deterministic perturbations are asymptotically unbiased, so that the algorithms are provably convergent. Furthermore, we derive convergence rates to establish the superiority of the first-order and second-order algorithms, for the special case of a convex and quadratic optimization problem, respectively. Numerical experiments are used to validate the theoretical results.
Tasks
Published 2018-08-08
URL http://arxiv.org/abs/1808.02871v2
PDF http://arxiv.org/pdf/1808.02871v2.pdf
PWC https://paperswithcode.com/paper/random-directions-stochastic-approximation
Repo https://github.com/prashla/RDSA
Framework none
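The key requirement on a deterministic perturbation sequence is that its empirical second moment equals the identity, so the finite-difference gradient estimate is asymptotically unbiased. The sketch below shows a first-order RDSA-style update in two dimensions with a Hadamard-style ±1 cycle; step sizes and the quadratic objective are illustrative choices, not the paper's exact schemes.

```python
def rdsa_det(f, x0, steps=2000, delta=1e-3, lr=0.1):
    """First-order stochastic-approximation sketch with a deterministic
    +/-1 perturbation cycle. Averaged over the cycle, d d^T equals the
    identity, so the two-sided difference estimates the gradient."""
    cycle = [(1.0, 1.0), (1.0, -1.0)]  # Hadamard-style directions for d = 2
    x = list(x0)
    for n in range(steps):
        d = cycle[n % len(cycle)]
        xp = [xi + delta * di for xi, di in zip(x, d)]
        xm = [xi - delta * di for xi, di in zip(x, d)]
        scale = (f(xp) - f(xm)) / (2 * delta)  # directional derivative estimate
        x = [xi - lr * scale * di for xi, di in zip(x, d)]
    return x

# Convex quadratic with minimum at (1, -2).
quad = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
xstar = rdsa_det(quad, [5.0, 5.0])
```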

Improving Reinforcement Learning Based Image Captioning with Natural Language Prior

Title Improving Reinforcement Learning Based Image Captioning with Natural Language Prior
Authors Tszhang Guo, Shiyu Chang, Mo Yu, Kun Bai
Abstract Recently, Reinforcement Learning (RL) approaches have demonstrated advanced performance in image captioning by directly optimizing the metric used for testing. However, this shaped reward introduces learning biases, which reduce the readability of generated text. In addition, the large sample space makes training unstable and slow. To alleviate these issues, we propose a simple coherent solution that constrains the action space using an n-gram language prior. Quantitative and qualitative evaluations on benchmarks show that RL with the simple add-on module performs favorably against its counterpart in terms of both readability and speed of convergence. Human evaluations show that our model produces more readable and natural captions. The implementation will become publicly available upon the acceptance of the paper.
Tasks Image Captioning
Published 2018-09-13
URL http://arxiv.org/abs/1809.06227v1
PDF http://arxiv.org/pdf/1809.06227v1.pdf
PWC https://paperswithcode.com/paper/improving-reinforcement-learning-based-image
Repo https://github.com/tgGuo15/PriorImageCaption
Framework pytorch
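Constraining the action space with an n-gram prior amounts to masking the sampler's vocabulary to words the prior has actually seen follow the previous word. A toy bigram version of that masking is sketched below; the function names and toy corpus are illustrative, and the paper's module operates on model logits rather than a dictionary of scores.

```python
from collections import defaultdict
import random

def build_bigram_prior(corpus):
    """Collect, for every word, the set of words observed to follow it.
    During RL sampling this set masks the action space so the policy can
    only emit locally fluent continuations."""
    follow = defaultdict(set)
    for sent in corpus:
        toks = sent.split()
        for a, b in zip(toks, toks[1:]):
            follow[a].add(b)
    return follow

def constrained_sample(scores, prev_word, prior, rng):
    """Sample the next word from model scores, restricted to words the
    bigram prior allows after prev_word; fall back to the full vocabulary
    if the prior has never seen prev_word."""
    allowed = prior.get(prev_word) or set(scores)
    candidates = {w: s for w, s in scores.items() if w in allowed}
    words, weights = zip(*candidates.items())
    return rng.choices(words, weights=weights)[0]

corpus = ["a dog runs", "a cat sleeps", "the dog sleeps"]
prior = build_bigram_prior(corpus)
rng = random.Random(0)
scores = {"dog": 0.5, "cat": 0.4, "runs": 0.1}  # toy model scores
nxt = constrained_sample(scores, "a", prior, rng)
```

Shrinking the candidate set this way both improves local fluency and reduces the variance of the RL gradient, which is why convergence speeds up.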

Data-driven discovery of PDEs in complex datasets

Title Data-driven discovery of PDEs in complex datasets
Authors Jens Berg, Kaj Nyström
Abstract Many processes in science and engineering can be described by partial differential equations (PDEs). Traditionally, PDEs are derived by considering first principles of physics to derive the relations between the involved physical quantities of interest. A different approach is to measure the quantities of interest and use deep learning to reverse engineer the PDEs which are describing the physical process. In this paper we use machine learning, and deep learning in particular, to discover PDEs hidden in complex data sets from measurement data. We include examples of data from a known model problem, and real data from weather station measurements. We show how necessary transformations of the input data amount to coordinate transformations in the discovered PDE, and we elaborate on feature and model selection. It is shown that the dynamics of a non-linear, second order PDE can be accurately described by an ordinary differential equation which is automatically discovered by our deep learning algorithm. Even more interestingly, we show that similar results apply in the context of more complex simulations of the Swedish temperature distribution.
Tasks Model Selection
Published 2018-08-31
URL http://arxiv.org/abs/1808.10788v1
PDF http://arxiv.org/pdf/1808.10788v1.pdf
PWC https://paperswithcode.com/paper/data-driven-discovery-of-pdes-in-complex
Repo https://github.com/arnauldnzegha/deep2pde_Berg_Nystrom
Framework none
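The data-driven discovery recipe is: estimate the time derivative on a grid, build a library of candidate spatial terms, and regress one on the other; the surviving coefficients name the PDE. A minimal sketch with plain least squares on a two-term library (the paper uses deep networks and richer libraries; the toy advection data and term set here are assumptions):

```python
import math

def discover_pde(u, dx, dt):
    """Regress the time derivative u_t onto a small library of spatial
    terms {u, u_x} via least squares (2x2 normal equations), returning
    the fitted coefficients (a, b) in u_t ~ a*u + b*u_x."""
    rows_t, rows_x, rows_u = [], [], []
    for n in range(1, len(u) - 1):
        for i in range(1, len(u[0]) - 1):
            rows_t.append((u[n + 1][i] - u[n - 1][i]) / (2 * dt))  # central diff in t
            rows_x.append((u[n][i + 1] - u[n][i - 1]) / (2 * dx))  # central diff in x
            rows_u.append(u[n][i])
    suu = sum(v * v for v in rows_u); sxx = sum(v * v for v in rows_x)
    sux = sum(p * q for p, q in zip(rows_u, rows_x))
    stu = sum(p * q for p, q in zip(rows_t, rows_u))
    stx = sum(p * q for p, q in zip(rows_t, rows_x))
    det = suu * sxx - sux * sux
    a = (stu * sxx - stx * sux) / det
    b = (suu * stx - sux * stu) / det
    return a, b

# Data generated by the advection equation u_t = -2 u_x, u(x, t) = sin(x - 2t);
# the regression should recover a ~ 0 and b ~ -2.
dx, dt, c = 0.05, 0.01, 2.0
grid = [[math.sin(i * dx - c * n * dt) for i in range(200)] for n in range(100)]
a, b = discover_pde(grid, dx, dt)
```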

Memory Replay GANs: learning to generate images from new categories without forgetting

Title Memory Replay GANs: learning to generate images from new categories without forgetting
Authors Chenshen Wu, Luis Herranz, Xialei Liu, Yaxing Wang, Joost van de Weijer, Bogdan Raducanu
Abstract Previous works on sequential learning address the problem of forgetting in discriminative models. In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine tuning renders the network unable to properly generate images from previous categories (i.e. forgetting). Addressing this problem, we propose Memory Replay GANs (MeRGANs), a conditional GAN framework that integrates a memory replay generator. We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results in MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories.
Tasks
Published 2018-09-06
URL https://arxiv.org/abs/1809.02058v3
PDF https://arxiv.org/pdf/1809.02058v3.pdf
PWC https://paperswithcode.com/paper/memory-replay-gans-learning-to-generate
Repo https://github.com/WuChenshen/MeRGAN
Framework tf

AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning

Title AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning
Authors Zheng Xie, XingYu Fu, JinYuan Yu
Abstract In this project, we combine AlphaGo algorithm with Curriculum Learning to crack the game of Gomoku. Modifications like Double Networks Mechanism and Winning Value Decay are implemented to solve the intrinsic asymmetry and short-sight of Gomoku. Our final AI AlphaGomoku, through two days’ training on a single GPU, has reached humans’ playing level.
Tasks
Published 2018-09-27
URL http://arxiv.org/abs/1809.10595v1
PDF http://arxiv.org/pdf/1809.10595v1.pdf
PWC https://paperswithcode.com/paper/alphagomoku-an-alphago-based-gomoku
Repo https://github.com/PolyKen/15_by_15_AlphaGomoku
Framework tf

Joint entity recognition and relation extraction as a multi-head selection problem

Title Joint entity recognition and relation extraction as a multi-head selection problem
Authors Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder
Abstract State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need of any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identify multiple relations for each entity). We present an extensive experimental setup, to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.
Tasks Relation Extraction
Published 2018-04-20
URL http://arxiv.org/abs/1804.07847v3
PDF http://arxiv.org/pdf/1804.07847v3.pdf
PWC https://paperswithcode.com/paper/joint-entity-recognition-and-relation
Repo https://github.com/bekou/multihead_joint_entity_relation_extraction
Framework tf
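The multi-head selection decoder differs from an argmax decoder in one essential way: each token independently keeps every (head, relation) pair whose sigmoid score clears a threshold, so one entity can participate in several relations. A toy sketch of just that decoding step, assuming raw scores are already computed (the names and toy scores are illustrative, not the paper's model):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def select_heads(scores, threshold=0.5):
    """Multi-head selection: for each token, keep every (head, relation)
    pair whose sigmoid score clears the threshold, so a token may take
    part in several relations at once (unlike an argmax decoder)."""
    out = {}
    for token, pairs in scores.items():
        out[token] = [(h, r) for (h, r), s in pairs.items()
                      if sigmoid(s) >= threshold]
    return out

# Toy scores: "Smith" heads both a works_for and a lives_in relation.
scores = {
    "Smith": {("Acme", "works_for"): 3.2,
              ("Boston", "lives_in"): 1.1,
              ("Acme", "lives_in"): -2.0},
}
picked = select_heads(scores)
```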

A Neurodynamic model of Saliency prediction in V1

Title A Neurodynamic model of Saliency prediction in V1
Authors David Berga, Xavier Otazu
Abstract Lateral connections in the primary visual cortex (area V1 or striate cortex) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. Our model’s (named Neurodynamic Saliency WAvelet Model or NSWAM) architecture is based on Penacchio’s neurodynamic model of lateral connections of V1 (defined as a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation and scale). We tested NSWAM saliency predictions using images from eye tracking datasets, showing that it is an improvement with respect to previous models as well as consistent with human psychophysics. Hence, we show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1 (without applying any type of training or optimization and keeping the same parametrization for all the visual processes). This can be useful for the definition of a unified architecture of the primary visual cortex.
Tasks Eye Tracking, Saliency Prediction
Published 2018-11-15
URL https://arxiv.org/abs/1811.06308v7
PDF https://arxiv.org/pdf/1811.06308v7.pdf
PWC https://paperswithcode.com/paper/a-neurodynamic-model-of-saliency-prediction
Repo https://github.com/dberga/NSWAM
Framework none

Visual Dialogue without Vision or Dialogue

Title Visual Dialogue without Vision or Dialogue
Authors Daniela Massiceti, Puneet K. Dokania, N. Siddharth, Philip H. S. Torr
Abstract We characterise some of the quirks and shortcomings in the exploration of Visual Dialogue - a sequential question-answering task where the questions and corresponding answers are related through given visual stimuli. To do so, we develop an embarrassingly simple method based on Canonical Correlation Analysis (CCA) that, on the standard dataset, achieves near state-of-the-art performance on mean rank (MR). In direct contrast to current complex and over-parametrised architectures that are both compute and time intensive, our method ignores the visual stimuli, ignores the sequencing of dialogue, does not need gradients, uses off-the-shelf feature extractors, has at least an order of magnitude fewer parameters, and learns in practically no time. We argue that these results are indicative of issues in current approaches to Visual Dialogue and conduct analyses to highlight implicit dataset biases and effects of over-constrained evaluation metrics. Our code is publicly available.
Tasks Question Answering, Visual Dialog
Published 2018-12-16
URL https://arxiv.org/abs/1812.06417v3
PDF https://arxiv.org/pdf/1812.06417v3.pdf
PWC https://paperswithcode.com/paper/visual-dialogue-without-vision-or-dialogue
Repo https://github.com/danielamassiceti/CCA-visualdialogue
Framework pytorch

A Novel Framework for Online Supervised Learning with Feature Selection

Title A Novel Framework for Online Supervised Learning with Feature Selection
Authors Lizhe Sun, Yangzi Guo, Adrian Barbu
Abstract Current online learning methods suffer from issues such as lower convergence rates and limited capability to recover the support of the true features compared to their offline counterparts. In this paper, we present a novel framework for online learning based on running averages and introduce a series of online versions of some popular existing offline methods such as Elastic Net, Minimax Concave Penalty and Feature Selection with Annealing. We prove the equivalence between our online methods and their offline counterparts and give theoretical true feature recovery and convergence guarantees for some of them. In contrast to the existing online methods, the proposed methods can extract models with any desired sparsity level at any time. Numerical experiments indicate that our new methods enjoy high accuracy of true feature recovery and a fast convergence rate, compared with standard online and offline algorithms. We also show how the running averages framework can be used for model adaptation in the presence of model drift. Finally, we present some applications to large datasets where again the proposed framework shows competitive results compared to popular online and offline algorithms.
Tasks Feature Selection
Published 2018-03-30
URL https://arxiv.org/abs/1803.11521v6
PDF https://arxiv.org/pdf/1803.11521v6.pdf
PWC https://paperswithcode.com/paper/a-novel-framework-for-online-supervised
Repo https://github.com/lizhesun0507/Runningaverageonlinelearning
Framework none
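The running-averages trick is that an offline estimator built from the sufficient statistics X'X/n and X'y/n can be refit at any point in the stream without storing past samples. A minimal sketch for ordinary least squares in two dimensions (the paper's framework covers penalized and annealed variants; class and variable names here are illustrative):

```python
import random

class RunningAverages:
    """Maintain running averages of the sufficient statistics X'X/n and
    X'y/n, so an offline-style estimator can be refit at any time without
    revisiting past samples."""
    def __init__(self, p):
        self.n = 0
        self.xx = [[0.0] * p for _ in range(p)]
        self.xy = [0.0] * p

    def update(self, x, y):
        self.n += 1
        w = 1.0 / self.n  # incremental-mean weight
        for i in range(len(x)):
            self.xy[i] += w * (x[i] * y - self.xy[i])
            for j in range(len(x)):
                self.xx[i][j] += w * (x[i] * x[j] - self.xx[i][j])

    def fit_ols(self):
        # Solve (X'X/n) beta = X'y/n in closed form for p = 2.
        (a, b), (c, d) = self.xx
        det = a * d - b * c
        return [(d * self.xy[0] - b * self.xy[1]) / det,
                (a * self.xy[1] - c * self.xy[0]) / det]

rng = random.Random(1)
model = RunningAverages(2)
for _ in range(5000):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    y = 3.0 * x[0] - 1.5 * x[1] + rng.gauss(0, 0.1)
    model.update(x, y)
beta = model.fit_ols()  # should be close to the true (3.0, -1.5)
```

Because the statistics are averages rather than sums, they stay numerically bounded over arbitrarily long streams, which is also what makes adaptation to model drift (e.g., via exponential weighting) straightforward.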

Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

Title Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Authors Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester
Abstract Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and if those patterns coincide with manually-defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models for the task of morphological tagging on three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation can be found at https://github.com/FredericGodin/ContextualDecomposition-NLP .
Tasks Morphological Tagging
Published 2018-08-28
URL http://arxiv.org/abs/1808.09551v1
PDF http://arxiv.org/pdf/1808.09551v1.pdf
PWC https://paperswithcode.com/paper/explaining-character-aware-neural-networks
Repo https://github.com/FredericGodin/ContextualDecomposition-NLP
Framework pytorch

Automatic segmentation of skin lesions using deep learning

Title Automatic segmentation of skin lesions using deep learning
Authors Joshua Peter Ebenezer, Jagath C. Rajapakse
Abstract This paper summarizes the method used in our submission to Task 1 of the International Skin Imaging Collaboration’s (ISIC) Skin Lesion Analysis Towards Melanoma Detection challenge held in 2018. We used a fully automated method to accurately segment lesion boundaries from dermoscopic images. A U-net deep learning network is trained on publicly available data from ISIC. We introduce the use of intensity, color, and texture enhancement operations as pre-processing steps and morphological operations and contour identification as post-processing steps.
Tasks
Published 2018-07-13
URL http://arxiv.org/abs/1807.04893v1
PDF http://arxiv.org/pdf/1807.04893v1.pdf
PWC https://paperswithcode.com/paper/automatic-segmentation-of-skin-lesions-using
Repo https://github.com/JoshuaEbenezer/deep_segment
Framework none

Differentiating Concepts and Instances for Knowledge Graph Embedding

Title Differentiating Concepts and Instances for Knowledge Graph Embedding
Authors Xin Lv, Lei Hou, Juanzi Li, Zhiyuan Liu
Abstract Concepts, which represent a group of different instances sharing common properties, are essential information in knowledge representation. Most conventional knowledge embedding methods encode both entities (concepts and instances) and relations as vectors in a low dimensional semantic space equally, ignoring the difference between concepts and instances. In this paper, we propose a novel knowledge graph embedding model named TransC by differentiating concepts and instances. Specifically, TransC encodes each concept in knowledge graph as a sphere and each instance as a vector in the same semantic space. We use the relative positions to model the relations between concepts and instances (i.e., instanceOf), and the relations between concepts and sub-concepts (i.e., subClassOf). We evaluate our model on both link prediction and triple classification tasks on the dataset based on YAGO. Experimental results show that TransC outperforms state-of-the-art methods, and captures the semantic transitivity for instanceOf and subClassOf relations. Our codes and datasets can be obtained from https://github.com/davidlvxin/TransC.
Tasks Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction
Published 2018-11-12
URL http://arxiv.org/abs/1811.04588v1
PDF http://arxiv.org/pdf/1811.04588v1.pdf
PWC https://paperswithcode.com/paper/differentiating-concepts-and-instances-for
Repo https://github.com/davidlvxin/TransC
Framework none
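Geometrically, instanceOf holds when the instance vector falls inside the concept's sphere, and subClassOf holds when one sphere is contained in the other; hinge losses on those conditions are what training pushes toward zero. A small sketch of the two scoring functions in that spirit (the exact loss terms and margins in TransC differ; vectors and radii here are made up):

```python
import math

def norm(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def instance_of_loss(inst, center, radius):
    """instanceOf score: zero when the instance vector lies inside the
    concept sphere, otherwise the distance past the boundary."""
    return max(0.0, norm(inst, center) - radius)

def subclass_of_loss(c1, r1, c2, r2):
    """subClassOf score: zero when sphere (c1, r1) is contained in
    sphere (c2, r2), i.e. ||c1 - c2|| + r1 <= r2."""
    return max(0.0, norm(c1, c2) + r1 - r2)

dog = ([0.1, 0.2], 0.3)     # (sphere center, radius) for concept "dog"
animal = ([0.0, 0.0], 1.0)  # a broader concept sphere
rex = [0.2, 0.1]            # an instance vector
l1 = instance_of_loss(rex, *dog)
l2 = subclass_of_loss(*dog, *animal)
```

Sphere containment is transitive by construction, which is how the model captures the semantic transitivity of subClassOf chains.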

Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data

Title Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data
Authors Nazanin Alipourfard, Peter G. Fennell, Kristina Lerman
Abstract We describe a data-driven discovery method that leverages Simpson’s paradox to uncover interesting patterns in behavioral data. Our method systematically disaggregates data to identify subgroups within a population whose behavior deviates significantly from the rest of the population. Given an outcome of interest and a set of covariates, the method follows three steps. First, it disaggregates data into subgroups, by conditioning on a particular covariate, so as to minimize the variation of the outcome within the subgroups. Next, it models the outcome as a linear function of another covariate, both in the subgroups and in the aggregate data. Finally, it compares trends to identify disaggregations that produce subgroups with different behaviors from the aggregate. We illustrate the method by applying it to three real-world behavioral datasets, including Q&A site Stack Exchange and online learning platforms Khan Academy and Duolingo.
Tasks
Published 2018-05-08
URL http://arxiv.org/abs/1805.03094v1
PDF http://arxiv.org/pdf/1805.03094v1.pdf
PWC https://paperswithcode.com/paper/using-simpsons-paradox-to-discover
Repo https://github.com/ninoch/Trend-Simpsons-Paradox
Framework none
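The trend-comparison step reduces to fitting a slope on the aggregate data, fitting slopes per subgroup, and flagging disaggregations where the signs disagree. A compact sketch of that step on a toy dataset (the full method also selects which covariate to condition on; the helper names here are illustrative):

```python
def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def simpsons_pairs(rows):
    """Flag disaggregations where subgroup trends have the opposite sign
    of the aggregate trend -- the signature of Simpson's paradox.
    rows are (group, x, y) triples."""
    agg = slope([x for _, x, _ in rows], [y for _, _, y in rows])
    groups = {}
    for g, x, y in rows:
        groups.setdefault(g, []).append((x, y))
    sub = {g: slope([x for x, _ in pts], [y for _, y in pts])
           for g, pts in groups.items()}
    flipped = [g for g, s in sub.items() if s * agg < 0]
    return agg, sub, flipped

# Two subgroups with negative trends that aggregate to a positive trend.
rows = [("A", 0, 0.0), ("A", 1, -0.5), ("A", 2, -1.0),
        ("B", 5, 9.0), ("B", 6, 8.5), ("B", 7, 8.0)]
agg, sub, flipped = simpsons_pairs(rows)
```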