Paper Group AWR 83
ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations
Title | ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations |
Authors | Rafael Izbicki, Ann B. Lee, Taylor Pospisil |
Abstract | Approximate Bayesian Computation (ABC) is typically used when the likelihood is either unavailable or intractable but where data can be simulated under different parameter settings using a forward model. Despite the recent interest in ABC, high-dimensional data and costly simulations still remain a bottleneck in some applications. There is also no consensus as to how to best assess the performance of such methods without knowing the true posterior. We show how a nonparametric conditional density estimation (CDE) framework, which we refer to as ABC-CDE, helps address three nontrivial challenges in ABC: (i) how to efficiently estimate the posterior distribution with limited simulations and different types of data, (ii) how to tune and compare the performance of ABC and related methods in estimating the posterior itself, rather than just certain properties of the density, and (iii) how to efficiently choose among a large set of summary statistics based on a CDE surrogate loss. We provide theoretical and empirical evidence that justifies ABC-CDE procedures that directly estimate and assess the posterior based on an initial ABC sample, and we describe settings where standard ABC and regression-based approaches are inadequate. |
Tasks | Density Estimation |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05480v2 |
PDF | http://arxiv.org/pdf/1805.05480v2.pdf |
PWC | https://paperswithcode.com/paper/abc-cde-towards-approximate-bayesian |
Repo | https://github.com/tpospisi/NNKCDE |
Framework | none |
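The core simulation loop that ABC-CDE starts from — draw parameters from the prior, simulate data forward, and keep draws whose summary statistic lands near the observation — can be sketched in a few lines. This is a toy Gaussian-mean problem with made-up tolerances and sample sizes, not the paper's estimator:

```python
import random

def abc_rejection(observed_mean, n_keep=200, tol=0.5, prior=(-5.0, 5.0), n_obs=50, seed=0):
    """Minimal ABC rejection sampler: keep parameter draws whose
    simulated summary statistic lands within `tol` of the observed one."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_keep:
        theta = rng.uniform(*prior)                       # draw from the prior
        sims = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        if abs(sum(sims) / n_obs - observed_mean) < tol:  # summary close enough
            accepted.append(theta)
    return accepted

sample = abc_rejection(observed_mean=1.0)
posterior_mean = sum(sample) / len(sample)
```

The accepted draws form the "initial ABC sample" on which a CDE method would then estimate the posterior directly.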
Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Title | Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction |
Authors | Mingze Xu, Chenyou Fan, John D Paden, Geoffrey C Fox, David J Crandall |
Abstract | Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical. It is less clear how well these techniques may apply on structured prediction problems where fine-grained output with high precision is required, such as in scientific imaging domains. Here we consider the problem of segmenting echogram radar data collected from the polar ice sheets, which is challenging because segmentation boundaries are often very weak and there is a high degree of noise. We propose a multi-task spatiotemporal neural network that combines 3D ConvNets and Recurrent Neural Networks (RNNs) to estimate ice surface boundaries from sequences of tomographic radar images. We show that our model outperforms the state-of-the-art on this problem by (1) avoiding the need for hand-tuned parameters, (2) extracting multiple surfaces (ice-air and ice-bed) simultaneously, (3) requiring less non-visual metadata, and (4) being about 6 times faster. |
Tasks | Structured Prediction |
Published | 2018-01-11 |
URL | http://arxiv.org/abs/1801.03986v2 |
PDF | http://arxiv.org/pdf/1801.03986v2.pdf |
PWC | https://paperswithcode.com/paper/multi-task-spatiotemporal-neural-networks-for |
Repo | https://github.com/shyam1692/ice-reconstruction |
Framework | pytorch |
Random directions stochastic approximation with deterministic perturbations
Title | Random directions stochastic approximation with deterministic perturbations |
Authors | Prashanth L A, Shalabh Bhatnagar, Nirav Bhavsar, Michael Fu, Steven I. Marcus |
Abstract | We introduce deterministic perturbation schemes for the recently proposed random directions stochastic approximation (RDSA) [17], and propose new first-order and second-order algorithms. In the latter case, these are the first second-order algorithms to incorporate deterministic perturbations. We show that the gradient and/or Hessian estimates in the resulting algorithms with deterministic perturbations are asymptotically unbiased, so that the algorithms are provably convergent. Furthermore, we derive convergence rates to establish the superiority of the first-order and second-order algorithms, for the special cases of convex and quadratic optimization problems, respectively. Numerical experiments are used to validate the theoretical results. |
Tasks | |
Published | 2018-08-08 |
URL | http://arxiv.org/abs/1808.02871v2 |
PDF | http://arxiv.org/pdf/1808.02871v2.pdf |
PWC | https://paperswithcode.com/paper/random-directions-stochastic-approximation |
Repo | https://github.com/prashla/RDSA |
Framework | none |
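The flavor of a first-order RDSA update is a two-point gradient estimate along a perturbation direction; the deterministic variant replaces random directions with a fixed cycling sequence. The direction sequence, step-size schedule, and objective below are illustrative choices, not the paper's construction:

```python
def f(x):
    # toy quadratic objective with its minimum at the origin
    return x[0] ** 2 + 2.0 * x[1] ** 2

def rdsa_step(x, direction, delta=1e-3):
    """Two-point gradient estimate along a single perturbation direction."""
    xp = [xi + delta * di for xi, di in zip(x, direction)]
    xm = [xi - delta * di for xi, di in zip(x, direction)]
    scale = (f(xp) - f(xm)) / (2.0 * delta)
    return [scale * di for di in direction]

# Deterministic perturbation sequence (illustrative): cycle through fixed
# +/-1 patterns instead of sampling directions at random; averaged over
# the cycle, the estimate points along the true gradient.
directions = [(1.0, 1.0), (1.0, -1.0)]
x = [2.0, -3.0]
for k in range(2000):
    g = rdsa_step(x, directions[k % len(directions)])
    step = 0.1 / (1.0 + 0.01 * k)        # diminishing step size
    x = [xi - step * gi for xi, gi in zip(x, g)]
```

After the loop, `x` sits near the minimizer, illustrating the provable-convergence claim on a convex instance.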
Improving Reinforcement Learning Based Image Captioning with Natural Language Prior
Title | Improving Reinforcement Learning Based Image Captioning with Natural Language Prior |
Authors | Tszhang Guo, Shiyu Chang, Mo Yu, Kun Bai |
Abstract | Recently, Reinforcement Learning (RL) approaches have demonstrated advanced performance in image captioning by directly optimizing the metric used for testing. However, this shaped reward introduces learning biases, which reduce the readability of generated text. In addition, the large sample space makes training unstable and slow. To alleviate these issues, we propose a simple, coherent solution that constrains the action space using an n-gram language prior. Quantitative and qualitative evaluations on benchmarks show that RL with the simple add-on module performs favorably against its counterpart in terms of both readability and speed of convergence. Human evaluation results show that our model's captions are more readable and natural. The implementation will become publicly available upon the acceptance of the paper. |
Tasks | Image Captioning |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.06227v1 |
PDF | http://arxiv.org/pdf/1809.06227v1.pdf |
PWC | https://paperswithcode.com/paper/improving-reinforcement-learning-based-image |
Repo | https://github.com/tgGuo15/PriorImageCaption |
Framework | pytorch |
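The action-space constraint can be pictured as a bigram mask: at each decoding step, only words the n-gram prior has seen following the previous word stay in the candidate set. The corpus, vocabulary, and fallback rule here are our own toy choices, not the paper's language model:

```python
from collections import defaultdict

# Toy bigram prior built from a tiny corpus (illustrative only).
corpus = [
    "a dog runs on grass".split(),
    "a cat sits on grass".split(),
    "a dog sits on sand".split(),
]
bigram = defaultdict(set)
for sent in corpus:
    for prev, nxt in zip(sent, sent[1:]):
        bigram[prev].add(nxt)

def allowed_actions(prev_word, vocabulary):
    """Constrain the RL action space: only words the n-gram prior has
    seen following `prev_word` remain candidates; fall back to the full
    vocabulary for unseen contexts."""
    mask = bigram.get(prev_word, set())
    return [w for w in vocabulary if w in mask] or list(vocabulary)

vocab = ["a", "dog", "cat", "runs", "sits", "on", "grass", "sand"]
```

The RL policy would then sample only from `allowed_actions(prev_word, vocab)`, shrinking the sample space and keeping generations fluent.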
Data-driven discovery of PDEs in complex datasets
Title | Data-driven discovery of PDEs in complex datasets |
Authors | Jens Berg, Kaj Nyström |
Abstract | Many processes in science and engineering can be described by partial differential equations (PDEs). Traditionally, PDEs are derived by considering first principles of physics to derive the relations between the involved physical quantities of interest. A different approach is to measure the quantities of interest and use deep learning to reverse engineer the PDEs that describe the physical process. In this paper we use machine learning, and deep learning in particular, to discover PDEs hidden in complex datasets from measurement data. We include examples of data from a known model problem, and real data from weather station measurements. We show how necessary transformations of the input data amount to coordinate transformations in the discovered PDE, and we elaborate on feature and model selection. It is shown that the dynamics of a non-linear, second order PDE can be accurately described by an ordinary differential equation which is automatically discovered by our deep learning algorithm. Even more interestingly, we show that similar results apply in the context of more complex simulations of the Swedish temperature distribution. |
Tasks | Model Selection |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10788v1 |
PDF | http://arxiv.org/pdf/1808.10788v1.pdf |
PWC | https://paperswithcode.com/paper/data-driven-discovery-of-pdes-in-complex |
Repo | https://github.com/arnauldnzegha/deep2pde_Berg_Nystrom |
Framework | none |
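The discovery step reduces to regressing an estimated time derivative onto a library of candidate terms. Below is a minimal sketch with a two-term library that recovers the hidden dynamics du/dt = -0.5u from synthetic data; the library, dynamics, and step sizes are illustrative, not the paper's weather example:

```python
import math

def solve2(a, b, c, d, e, f):
    """Solve the 2x2 linear system [[a, b], [c, d]] @ [x, y] = [e, f]."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

# Synthetic trajectory generated by the "hidden" dynamics du/dt = -0.5 u
ts = [0.01 * i for i in range(200)]
us = [math.exp(-0.5 * t) for t in ts]

# Centered finite-difference estimate of the time derivative
duts = [(us[i + 1] - us[i - 1]) / (ts[i + 1] - ts[i - 1]) for i in range(1, len(us) - 1)]
feats = [(us[i], us[i] ** 2) for i in range(1, len(us) - 1)]  # candidate library {u, u^2}

# Least squares via the normal equations for du/dt ≈ c1*u + c2*u^2
a = sum(u * u for u, _ in feats)
b = sum(u * u2 for u, u2 in feats)
d = sum(u2 * u2 for _, u2 in feats)
e = sum(u * y for (u, _), y in zip(feats, duts))
f = sum(u2 * y for (_, u2), y in zip(feats, duts))
c1, c2 = solve2(a, b, b, d, e, f)
```

The regression assigns the signal to the correct library term (`c1` near -0.5) and leaves the spurious term (`c2`) near zero — the essence of data-driven equation discovery.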
Memory Replay GANs: learning to generate images from new categories without forgetting
Title | Memory Replay GANs: learning to generate images from new categories without forgetting |
Authors | Chenshen Wu, Luis Herranz, Xialei Liu, Yaxing Wang, Joost van de Weijer, Bogdan Raducanu |
Abstract | Previous works on sequential learning address the problem of forgetting in discriminative models. In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine-tuning renders the network unable to properly generate images from previous categories (i.e. forgetting). Addressing this problem, we propose Memory Replay GANs (MeRGANs), a conditional GAN framework that integrates a memory replay generator. We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results on the MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories. |
Tasks | |
Published | 2018-09-06 |
URL | https://arxiv.org/abs/1809.02058v3 |
PDF | https://arxiv.org/pdf/1809.02058v3.pdf |
PWC | https://paperswithcode.com/paper/memory-replay-gans-learning-to-generate |
Repo | https://github.com/WuChenshen/MeRGAN |
Framework | tf |
AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning
Title | AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning |
Authors | Zheng Xie, XingYu Fu, JinYuan Yu |
Abstract | In this project, we combine the AlphaGo algorithm with Curriculum Learning to crack the game of Gomoku. Modifications like a Double Networks Mechanism and Winning Value Decay are implemented to address the intrinsic asymmetry and short-sightedness of Gomoku. Our final AI, AlphaGomoku, has reached human-level play after two days' training on a single GPU. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10595v1 |
PDF | http://arxiv.org/pdf/1809.10595v1.pdf |
PWC | https://paperswithcode.com/paper/alphagomoku-an-alphago-based-gomoku |
Repo | https://github.com/PolyKen/15_by_15_AlphaGomoku |
Framework | tf |
Joint entity recognition and relation extraction as a multi-head selection problem
Title | Joint entity recognition and relation extraction as a multi-head selection problem |
Authors | Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder |
Abstract | State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need for any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identify multiple relations for each entity). We present an extensive experimental setup, to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them. |
Tasks | Relation Extraction |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07847v3 |
PDF | http://arxiv.org/pdf/1804.07847v3.pdf |
PWC | https://paperswithcode.com/paper/joint-entity-recognition-and-relation |
Repo | https://github.com/bekou/multihead_joint_entity_relation_extraction |
Framework | tf |
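The multi-head selection decoding rule scores every (head, relation) candidate independently with a sigmoid and keeps all pairs above a threshold, so one entity can take part in several relations at once. The entities, relations, and score values below are invented for illustration; the paper learns these scores end to end:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def select_heads(scores, threshold=0.5):
    """Multi-head selection decoding: for each token, keep every
    (head, relation) pair whose independent sigmoid score passes the
    threshold -- so one entity can participate in several relations."""
    selected = {}
    for token, pairs in scores.items():
        selected[token] = [(h, r) for (h, r), z in pairs.items() if sigmoid(z) > threshold]
    return selected

# Toy unnormalized scores (made-up numbers for illustration)
scores = {
    "Smith": {("CEO", "works_as"): 2.3, ("Acme", "works_for"): 1.1, ("Acme", "born_in"): -3.0},
    "Acme": {("London", "based_in"): 0.7},
}
picked = select_heads(scores)
```

Because each pair is thresholded independently rather than via a softmax, "Smith" keeps both its `works_as` and `works_for` relations.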
A Neurodynamic model of Saliency prediction in V1
Title | A Neurodynamic model of Saliency prediction in V1 |
Authors | David Berga, Xavier Otazu |
Abstract | Lateral connections in the primary visual cortex (area V1 or striate cortex) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. Our model's architecture (named Neurodynamic Saliency WAvelet Model, or NSWAM) is based on Penacchio's neurodynamic model of lateral connections of V1 (defined as a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation and scale). We tested NSWAM saliency predictions using images from eye tracking datasets, showing that it is an improvement with respect to previous models as well as consistent with human psychophysics. Hence, we show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1 (without applying any type of training or optimization and keeping the same parametrization for all the visual processes). This can be useful for the definition of a unified architecture of the primary visual cortex. |
Tasks | Eye Tracking, Saliency Prediction |
Published | 2018-11-15 |
URL | https://arxiv.org/abs/1811.06308v7 |
PDF | https://arxiv.org/pdf/1811.06308v7.pdf |
PWC | https://paperswithcode.com/paper/a-neurodynamic-model-of-saliency-prediction |
Repo | https://github.com/dberga/NSWAM |
Framework | none |
Visual Dialogue without Vision or Dialogue
Title | Visual Dialogue without Vision or Dialogue |
Authors | Daniela Massiceti, Puneet K. Dokania, N. Siddharth, Philip H. S. Torr |
Abstract | We characterise some of the quirks and shortcomings in the exploration of Visual Dialogue - a sequential question-answering task where the questions and corresponding answers are related through given visual stimuli. To do so, we develop an embarrassingly simple method based on Canonical Correlation Analysis (CCA) that, on the standard dataset, achieves near state-of-the-art performance on mean rank (MR). In direct contrast to current complex and over-parametrised architectures that are both compute and time intensive, our method ignores the visual stimuli, ignores the sequencing of dialogue, does not need gradients, uses off-the-shelf feature extractors, has at least an order of magnitude fewer parameters, and learns in practically no time. We argue that these results are indicative of issues in current approaches to Visual Dialogue and conduct analyses to highlight implicit dataset biases and effects of over-constrained evaluation metrics. Our code is publicly available. |
Tasks | Question Answering, Visual Dialog |
Published | 2018-12-16 |
URL | https://arxiv.org/abs/1812.06417v3 |
PDF | https://arxiv.org/pdf/1812.06417v3.pdf |
PWC | https://paperswithcode.com/paper/visual-dialogue-without-vision-or-dialogue |
Repo | https://github.com/danielamassiceti/CCA-visualdialogue |
Framework | pytorch |
A Novel Framework for Online Supervised Learning with Feature Selection
Title | A Novel Framework for Online Supervised Learning with Feature Selection |
Authors | Lizhe Sun, Yangzi Guo, Adrian Barbu |
Abstract | Current online learning methods suffer from issues such as lower convergence rates and a limited capability to recover the support of the true features compared to their offline counterparts. In this paper, we present a novel framework for online learning based on running averages and introduce a series of online versions of some popular existing offline methods such as Elastic Net, Minimax Concave Penalty and Feature Selection with Annealing. We prove the equivalence between our online methods and their offline counterparts and give theoretical true feature recovery and convergence guarantees for some of them. In contrast to the existing online methods, the proposed methods can extract models with any desired sparsity level at any time. Numerical experiments indicate that our new methods enjoy high accuracy of true feature recovery and a fast convergence rate, compared with standard online and offline algorithms. We also show how the running averages framework can be used for model adaptation in the presence of model drift. Finally, we present some applications to large datasets where again the proposed framework shows competitive results compared to popular online and offline algorithms. |
Tasks | Feature Selection |
Published | 2018-03-30 |
URL | https://arxiv.org/abs/1803.11521v6 |
PDF | https://arxiv.org/pdf/1803.11521v6.pdf |
PWC | https://paperswithcode.com/paper/a-novel-framework-for-online-supervised |
Repo | https://github.com/lizhesun0507/Runningaverageonlinelearning |
Framework | none |
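The running-averages idea — maintain streaming averages of the sufficient statistics and solve for the model at any point in the stream — can be sketched for plain least squares with two features. The class name, dimensions, and data are ours, not the paper's:

```python
import random

class RunningAverages:
    """Online linear regression via running averages of the sufficient
    statistics (a sketch of the framework's core idea for 2 features)."""
    def __init__(self):
        self.n = 0
        self.sxx = [[0.0, 0.0], [0.0, 0.0]]  # running average of x x^T
        self.sxy = [0.0, 0.0]                # running average of x * y

    def update(self, x, y):
        self.n += 1
        w = 1.0 / self.n
        for i in range(2):
            for j in range(2):
                self.sxx[i][j] += w * (x[i] * x[j] - self.sxx[i][j])
            self.sxy[i] += w * (x[i] * y - self.sxy[i])

    def coefficients(self):
        # Solve the 2x2 normal equations at any time, without revisiting data
        (a, b), (c, d) = self.sxx
        e, f = self.sxy
        det = a * d - b * c
        return [(e * d - b * f) / det, (a * f - e * c) / det]

rng = random.Random(1)
model = RunningAverages()
for _ in range(500):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    y = 3.0 * x[0] - 2.0 * x[1]              # noiseless true model
    model.update(x, y)
beta = model.coefficients()
```

Because the averages summarize the full stream, the model can be extracted (here, solved exactly) at any time — the property the paper exploits to offer any sparsity level on demand.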
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Title | Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules? |
Authors | Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester |
Abstract | Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and if those patterns coincide with manually-defined word segmentations and annotations. To that end, we extend the contextual decomposition technique (Murdoch et al. 2018) to convolutional neural networks which allows us to compare convolutional neural networks and bidirectional long short-term memory networks. We evaluate and compare these models for the task of morphological tagging on three morphologically different languages and show that these models implicitly discover understandable linguistic rules. Our implementation can be found at https://github.com/FredericGodin/ContextualDecomposition-NLP . |
Tasks | Morphological Tagging |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09551v1 |
PDF | http://arxiv.org/pdf/1808.09551v1.pdf |
PWC | https://paperswithcode.com/paper/explaining-character-aware-neural-networks |
Repo | https://github.com/FredericGodin/ContextualDecomposition-NLP |
Framework | pytorch |
Automatic segmentation of skin lesions using deep learning
Title | Automatic segmentation of skin lesions using deep learning |
Authors | Joshua Peter Ebenezer, Jagath C. Rajapakse |
Abstract | This paper summarizes the method used in our submission to Task 1 of the International Skin Imaging Collaboration’s (ISIC) Skin Lesion Analysis Towards Melanoma Detection challenge held in 2018. We used a fully automated method to accurately segment lesion boundaries from dermoscopic images. A U-net deep learning network is trained on publicly available data from ISIC. We introduce the use of intensity, color, and texture enhancement operations as pre-processing steps and morphological operations and contour identification as post-processing steps. |
Tasks | |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.04893v1 |
PDF | http://arxiv.org/pdf/1807.04893v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-segmentation-of-skin-lesions-using |
Repo | https://github.com/JoshuaEbenezer/deep_segment |
Framework | none |
Differentiating Concepts and Instances for Knowledge Graph Embedding
Title | Differentiating Concepts and Instances for Knowledge Graph Embedding |
Authors | Xin Lv, Lei Hou, Juanzi Li, Zhiyuan Liu |
Abstract | Concepts, which represent a group of different instances sharing common properties, are essential information in knowledge representation. Most conventional knowledge embedding methods encode both entities (concepts and instances) and relations as vectors in a low dimensional semantic space equally, ignoring the difference between concepts and instances. In this paper, we propose a novel knowledge graph embedding model named TransC by differentiating concepts and instances. Specifically, TransC encodes each concept in knowledge graph as a sphere and each instance as a vector in the same semantic space. We use the relative positions to model the relations between concepts and instances (i.e., instanceOf), and the relations between concepts and sub-concepts (i.e., subClassOf). We evaluate our model on both link prediction and triple classification tasks on the dataset based on YAGO. Experimental results show that TransC outperforms state-of-the-art methods, and captures the semantic transitivity for instanceOf and subClassOf relations. Our codes and datasets can be obtained from https://github.com/davidlvxin/TransC. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04588v1 |
PDF | http://arxiv.org/pdf/1811.04588v1.pdf |
PWC | https://paperswithcode.com/paper/differentiating-concepts-and-instances-for |
Repo | https://github.com/davidlvxin/TransC |
Framework | none |
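TransC's geometric reading — instances as points, concepts as spheres — makes instanceOf and subClassOf simple containment tests. A sketch with hand-picked 2-D embeddings (the paper learns these; the names and values here are illustrative):

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def instance_of(instance, concept):
    """TransC-style check: an instance vector belongs to a concept if it
    falls inside the concept's sphere (center, radius)."""
    center, radius = concept
    return dist(instance, center) <= radius

def sub_class_of(sub, sup):
    """subClassOf holds when the sub-concept's sphere is contained in
    the super-concept's sphere."""
    (c1, r1), (c2, r2) = sub, sup
    return dist(c1, c2) + r1 <= r2

# Toy 2-D embeddings (illustrative values, not learned ones)
animal = ((0.0, 0.0), 2.0)   # broad concept: big sphere
dog = ((0.5, 0.5), 0.5)      # sub-concept: small sphere inside it
rex = (0.6, 0.4)             # instance: a point inside the dog sphere
```

Sphere containment makes the semantic transitivity automatic: any point inside `dog` is also inside `animal`.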
Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data
Title | Using Simpson’s Paradox to Discover Interesting Patterns in Behavioral Data |
Authors | Nazanin Alipourfard, Peter G. Fennell, Kristina Lerman |
Abstract | We describe a data-driven discovery method that leverages Simpson's paradox to uncover interesting patterns in behavioral data. Our method systematically disaggregates data to identify subgroups within a population whose behavior deviates significantly from the rest of the population. Given an outcome of interest and a set of covariates, the method follows three steps. First, it disaggregates data into subgroups, by conditioning on a particular covariate, so as to minimize the variation of the outcome within the subgroups. Next, it models the outcome as a linear function of another covariate, both in the subgroups and in the aggregate data. Finally, it compares trends to identify disaggregations that produce subgroups with different behaviors from the aggregate. We illustrate the method by applying it to three real-world behavioral datasets, including Q&A site Stack Exchange and online learning platforms Khan Academy and Duolingo. |
Tasks | |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.03094v1 |
PDF | http://arxiv.org/pdf/1805.03094v1.pdf |
PWC | https://paperswithcode.com/paper/using-simpsons-paradox-to-discover |
Repo | https://github.com/ninoch/Trend-Simpsons-Paradox |
Framework | none |
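The three steps — disaggregate, fit trends per subgroup and in aggregate, compare — can be sketched as a sign-reversal test on least-squares slopes. The grouping and data below are toy examples, not the paper's datasets:

```python
def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def simpsons_reversal(groups):
    """Flag a Simpson's-paradox-style pattern: every subgroup trend has
    the opposite sign of the trend in the pooled (aggregate) data."""
    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    agg = slope(all_x, all_y)
    subs = [slope(xs, ys) for xs, ys in groups]
    return all(s * agg < 0 for s in subs), agg, subs

# Two subgroups, each trending down, whose pooled data trends up
g1 = ([0, 1, 2], [5.0, 4.5, 4.0])
g2 = ([4, 5, 6], [8.0, 7.5, 7.0])
reversed_, agg_slope, sub_slopes = simpsons_reversal([g1, g2])
```

When `reversed_` is true, the disaggregation has surfaced subgroups whose behavior contradicts the population-level trend — exactly the patterns the method is designed to find.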