January 31, 2020

3043 words 15 mins read

Paper Group ANR 141

Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems

Title Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems
Authors Nibraas Khan, Joshua Phillips
Abstract An integral function of fully autonomous robots and humans is the ability to focus attention on a few relevant percepts to reach a certain goal while disregarding irrelevant percepts. Humans and animals rely on the interactions between the Pre-Frontal Cortex and the Basal Ganglia to achieve this focus, which is known as working memory. The working memory toolkit (WMtk) was developed based on a computational neuroscience model of this phenomenon with the use of temporal difference learning for autonomous systems. Recent adaptations of the toolkit either utilize abstract task representations to solve non-observable tasks or store past input features to solve partially-observable tasks, but not both. We propose a new model, PONOWMtk, which combines both approaches to solve complex tasks with both Partially-Observable (PO) and Non-Observable (NO) components. The model learns when to store relevant cues in working memory as well as when to switch from one task representation to another based on external feedback. The results of our experiments show that PONOWMtk performs effectively for tasks that exhibit PO properties, NO properties, or both.
Tasks Hierarchical Reinforcement Learning
Published 2019-11-23
URL https://arxiv.org/abs/1911.10425v3
PDF https://arxiv.org/pdf/1911.10425v3.pdf
PWC https://paperswithcode.com/paper/combined-model-for-partially-observable-and
Repo
Framework
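As a rough illustration of the learning rule this line of work builds on, the sketch below runs plain temporal difference learning (TD(0)) on a toy 5-state chain. The chain task, constants, and variable names are illustrative assumptions, not the PONOWMtk code.

```python
import numpy as np

# Minimal TD(0) sketch on a 5-state chain: the agent walks left to
# right and receives reward 1 on reaching the terminal state. This is
# the basic update the working memory toolkit builds on, not the
# paper's model.
n_states, alpha, gamma = 5, 0.1, 0.9
V = np.zeros(n_states)

for episode in range(500):
    s = 0
    while s < n_states - 1:
        s_next = s + 1                      # deterministic chain
        r = 1.0 if s_next == n_states - 1 else 0.0
        # TD(0): move V(s) toward the bootstrapped target r + gamma*V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

# V[s] approaches gamma**(steps-to-goal - 1), e.g. V[0] -> 0.9**3
```

The same update drives the full toolkit; the extensions described in the abstract add learned gating of what enters working memory and when task representations switch.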

Stochastic DCA for minimizing a large sum of DC functions with application to Multi-class Logistic Regression

Title Stochastic DCA for minimizing a large sum of DC functions with application to Multi-class Logistic Regression
Authors Hoai An Le Thi, Hoai Minh Le, Duy Nhat Phan, Bach Tran
Abstract We consider the minimization of a large sum of DC (Difference of Convex) functions, a problem which appears in several different areas, especially in stochastic optimization and machine learning. Two DCA (DC Algorithm) based algorithms are proposed: stochastic DCA and inexact stochastic DCA. We prove that both algorithms converge to a critical point with probability one. Furthermore, we develop our stochastic DCA for solving an important problem in multi-task learning, namely group variable selection in multi-class logistic regression. The corresponding stochastic DCA is very inexpensive: all computations are explicit. Numerical experiments on several benchmark and synthetic datasets illustrate the efficiency of our algorithms and their superiority over existing methods with respect to classification accuracy, sparsity of the solution, and running time.
Tasks Multi-Task Learning, Stochastic Optimization
Published 2019-11-10
URL https://arxiv.org/abs/1911.03992v1
PDF https://arxiv.org/pdf/1911.03992v1.pdf
PWC https://paperswithcode.com/paper/stochastic-dca-for-minimizing-a-large-sum-of
Repo
Framework
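For readers new to DC programming, the deterministic DCA iteration the paper's stochastic variants build on is short: linearize the concave part -h at the current point, then minimize the remaining convex surrogate. The sketch below applies it to the toy problem f(x) = x^4 - x^2 (g = x^4, h = x^2), where the surrogate minimizer is closed-form; it is an illustration, not the authors' implementation.

```python
import numpy as np

# DCA sketch for f = g - h with g(x) = x**4 and h(x) = x**2, both convex.
# Step 1: pick y in the subdifferential of h at x (here just h'(x) = 2x).
# Step 2: x_{k+1} = argmin_x g(x) - y*x, i.e. solve 4x^3 = y.
def dca(x, iters=50):
    for _ in range(iters):
        y = 2.0 * x                  # gradient of h(x) = x^2
        x = np.cbrt(y / 4.0)         # closed-form convex subproblem
    return x

x_star = dca(1.5)
# Converges to a critical point of f, where x**2 = 1/2.
```

The stochastic versions in the paper replace the full sum over component functions with sampled subsets at each iteration, which is what makes the method scale to a large sum of DC functions.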

Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination

Title Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination
Authors Ruixue Liu, Baoyang Chen, Meng Chen, Youzheng Wu, Zhijie Qiu, Xiaodong He
Abstract We present Mappa Mundi, a novel real-time, collaborative, and interactive AI painting system for artistic Mind Map creation. The system consists of a voice-based input interface, an automatic topic expansion module, and an image projection module. The key innovation is to inject Artificial Imagination into painting creation by considering lexical and phonological similarities of language, learning and inheriting the artist’s original painting style, and applying the principles of Dadaism and the impossibility of improvisation. Our system shows that AI and artists can collaborate seamlessly to create imaginative artistic paintings, and Mappa Mundi has been applied in an art exhibition at UCCA, Beijing.
Tasks
Published 2019-05-09
URL https://arxiv.org/abs/1905.03638v2
PDF https://arxiv.org/pdf/1905.03638v2.pdf
PWC https://paperswithcode.com/paper/190503638
Repo
Framework

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures

Title Neural data-to-text generation: A comparison between pipeline and end-to-end architectures
Authors Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, Emiel Krahmer
Abstract Traditionally, most data-to-text applications have been designed using a modular pipeline architecture, in which non-linguistic input data is converted into natural language through several intermediate transformations. In contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between. This study introduces a systematic comparison between neural pipeline and end-to-end data-to-text approaches for the generation of text from RDF triples. Both architectures were implemented making use of state-of-the-art deep learning methods such as encoder-decoder Gated Recurrent Units (GRUs) and the Transformer. Automatic and human evaluations, together with a qualitative analysis, suggest that having explicit intermediate steps in the generation process results in better texts than those generated by end-to-end approaches. Moreover, the pipeline models generalize better to unseen inputs. Data and code are publicly available.
Tasks Data-to-Text Generation, Text Generation
Published 2019-08-23
URL https://arxiv.org/abs/1908.09022v2
PDF https://arxiv.org/pdf/1908.09022v2.pdf
PWC https://paperswithcode.com/paper/neural-data-to-text-generation-a-comparison
Repo
Framework
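The pipeline idea in the abstract can be made concrete with a toy example: RDF triples pass through explicit intermediate steps (content ordering, lexicalization, surface realization) instead of one end-to-end mapping. The tiny templates and triples below are fabricated for illustration and have nothing to do with the paper's neural models.

```python
# Toy data-to-text pipeline: order -> lexicalize -> realize.
triples = [("Alan_Bean", "birthPlace", "Wheeler_Texas"),
           ("Alan_Bean", "occupation", "astronaut")]

def order(ts):
    # Content ordering: decide which fact to verbalize first.
    return sorted(ts, key=lambda t: t[1])

def lexicalize(t):
    # Map each predicate to a sentence template.
    templates = {
        "birthPlace": "{s} was born in {o}.",
        "occupation": "{s} worked as an {o}.",
    }
    return templates[t[1]].format(s=t[0], o=t[2])

def realize(sentences):
    # Surface realization: clean entity names and join sentences.
    return " ".join(s.replace("_", " ") for s in sentences)

text = realize([lexicalize(t) for t in order(triples)])
```

In the paper each of these hand-written stages is replaced by a learned neural module, whereas the end-to-end systems collapse them into a single sequence-to-sequence model.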

A Heuristically Modified FP-Tree for Ontology Learning with Applications in Education

Title A Heuristically Modified FP-Tree for Ontology Learning with Applications in Education
Authors Safwan Shatnawi, Mohamed Medhat Gaber, Mihaela Cocea
Abstract We propose a heuristically modified FP-Tree for ontology learning from text. Unlike previous research, for concept extraction, we use a regular expression parser approach widely adopted in compiler construction, i.e., deterministic finite automata (DFA). Thus, the concepts are extracted from unstructured documents. For ontology learning, we use a frequent pattern mining approach and employ a rule mining heuristic function to enhance its quality. This process does not rely on predefined lexico-syntactic patterns and is thus applicable to different subjects. We employ the ontology in a question-answering system for students’ content-related questions. For validation, we used textbook questions/answers and questions from online course forums. Subject experts rated the quality of the system’s answers on a subset of questions and their ratings were used to identify the most appropriate automatic semantic text similarity metric to use as a validation metric for all answers. Latent Semantic Analysis was identified as the closest to the experts’ ratings. We compared the use of our ontology with the use of Text2Onto for the question-answering system and found that with our ontology 80% of the questions were answered, while with Text2Onto only 28.4% were answered, thanks to the finer grained hierarchy our approach is able to produce.
Tasks Question Answering
Published 2019-10-29
URL https://arxiv.org/abs/1910.13561v1
PDF https://arxiv.org/pdf/1910.13561v1.pdf
PWC https://paperswithcode.com/paper/a-heuristically-modified-fp-tree-for-ontology
Repo
Framework
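The concept-extraction step can be sketched roughly: a regular expression (internally compiled to a DFA-equivalent automaton) pulls candidate multi-word concepts from unstructured text, and frequency counting feeds a later pattern-mining stage. The crude pattern and threshold below are illustrative assumptions, not the paper's grammar.

```python
import re
from collections import Counter

# Candidate concepts: runs of 2-3 lowercase words ending in "tree"
# (a deliberately crude noun-phrase pattern for illustration).
text = ("A binary tree is a tree data structure. A binary search tree "
        "is a binary tree that keeps its keys in sorted order.")

pattern = re.compile(r"\b(?:[a-z]+ ){1,2}tree\b")
candidates = Counter(pattern.findall(text.lower()))

# Keep candidates that recur, a stand-in for the frequent-pattern stage.
frequent = {c for c, n in candidates.items() if n >= 2}
```

A real extractor would use a far richer pattern set; the point is only that the automaton-plus-frequency pipeline needs no predefined lexico-syntactic templates.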

Probabilistic Time of Arrival Localization

Title Probabilistic Time of Arrival Localization
Authors Fernando Perez-Cruz, Pablo M. Olmos, Michael Minyi Zhang, Howard Huang
Abstract In this paper, we take a new approach for time of arrival geo-localization. We show that the main sources of error in metropolitan areas are due to environmental imperfections that bias our solutions, and that we can rely on a probabilistic model to learn and compensate for them. The resulting localization error is validated using measurements from a live LTE cellular network to be less than 10 meters, representing an order-of-magnitude improvement.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06569v1
PDF https://arxiv.org/pdf/1910.06569v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-time-of-arrival-localization
Repo
Framework
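A toy simulation conveys the abstract's core point: time-of-arrival ranges in a city carry a systematic (non-line-of-sight) bias, and explicitly modeling that bias improves the position fit. The brute-force grid search, anchor layout, and noise levels below are illustrative assumptions, not the paper's probabilistic model or LTE measurements.

```python
import numpy as np

# Four anchors at the corners of a 100 m square; ranges to the true
# position carry a shared positive bias plus small noise.
rng = np.random.default_rng(1)
anchors = np.array([[0., 0.], [100., 0.], [0., 100.], [100., 100.]])
truth = np.array([40., 60.])
bias = 30.0                                   # shared NLOS delay, meters
ranges = (np.linalg.norm(anchors - truth, axis=1)
          + bias + rng.normal(0.0, 0.5, size=4))

def fit(model_bias):
    # Least squares over a 0.5 m grid, with an optional closed-form
    # estimate of the common range bias at each candidate point.
    best, best_err = None, np.inf
    for x in np.linspace(0, 100, 201):
        for y in np.linspace(0, 100, 201):
            d = np.hypot(anchors[:, 0] - x, anchors[:, 1] - y)
            b = (ranges - d).mean() if model_bias else 0.0
            err = ((ranges - d - b) ** 2).sum()
            if err < best_err:
                best, best_err = np.array([x, y]), err
    return best

err_plain = np.linalg.norm(fit(False) - truth)
err_bias = np.linalg.norm(fit(True) - truth)
# Learning the bias should cut the localization error substantially.
```

The paper's contribution is a much richer learned model of these environmental biases; this sketch only shows why ignoring them biases the plain least-squares solution.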

Absum: Simple Regularization Method for Reducing Structural Sensitivity of Convolutional Neural Networks

Title Absum: Simple Regularization Method for Reducing Structural Sensitivity of Convolutional Neural Networks
Authors Sekitoshi Kanai, Yasutoshi Ida, Yasuhiro Fujiwara, Masanori Yamada, Shuichi Adachi
Abstract We propose Absum, which is a regularization method for improving adversarial robustness of convolutional neural networks (CNNs). Although CNNs can accurately recognize images, recent studies have shown that the convolution operations in CNNs commonly have structural sensitivity to specific noise composed of Fourier basis functions. By exploiting this sensitivity, these studies proposed a simple black-box adversarial attack: the single Fourier attack. To reduce structural sensitivity, we can use regularization of convolution filter weights since the sensitivity of a linear transform can be assessed by the norm of the weights. However, standard regularization methods can prevent minimization of the loss function because they impose a tight constraint for obtaining high robustness. To solve this problem, Absum imposes a loose constraint; it penalizes the absolute values of the summation of the parameters in the convolution layers. Absum can improve robustness against the single Fourier attack while being as simple and efficient as standard regularization methods (e.g., weight decay and L1 regularization). Our experiments demonstrate that Absum improves robustness against the single Fourier attack more than standard regularization methods. Furthermore, we reveal that CNNs trained with Absum are more robust than those trained with standard regularization methods against transferred attacks (due to decreased common sensitivity) and against high-frequency noise. We also reveal that Absum can improve robustness against gradient-based attacks (projected gradient descent) when used with adversarial training.
Tasks Adversarial Attack
Published 2019-09-19
URL https://arxiv.org/abs/1909.08830v1
PDF https://arxiv.org/pdf/1909.08830v1.pdf
PWC https://paperswithcode.com/paper/absum-simple-regularization-method-for
Repo
Framework
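The penalty itself is simple enough to write down. The sketch below contrasts an L1 penalty with one reading of Absum (penalizing the absolute value of the per-filter weight sum); the filter shapes and the per-filter grouping are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
filters = rng.normal(size=(8, 3, 3, 3))     # (out_ch, in_ch, kH, kW)

def l1_penalty(w):
    # Standard L1: penalize every weight individually.
    return np.abs(w).sum()

def absum_penalty(w):
    # Absum (one reading): sum the weights within each filter first,
    # then take absolute values -- a looser constraint than L1.
    return np.abs(w.sum(axis=(1, 2, 3))).sum()

# By the triangle inequality, |sum w| <= sum |w| for each filter,
# so Absum never penalizes more than L1 does.
assert absum_penalty(filters) <= l1_penalty(filters)
```

A filter whose weights cancel (e.g., a high-pass, zero-mean filter) pays no Absum penalty at all, which is why the constraint interferes less with loss minimization than weight decay or L1.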

Learning as the Unsupervised Alignment of Conceptual Systems

Title Learning as the Unsupervised Alignment of Conceptual Systems
Authors Brett D. Roads, Bradley C. Love
Abstract Concept induction requires the extraction and naming of concepts from noisy perceptual experience. For supervised approaches, as the number of concepts grows, so does the number of required training examples. Philosophers, psychologists, and computer scientists have long recognized that children can learn to label objects without being explicitly taught. In a series of computational experiments, we highlight how information in the environment can be used to build and align conceptual systems. Unlike supervised learning, the learning problem becomes easier the more concepts and systems there are to master. The key insight is that each concept has a unique signature within one conceptual system (e.g., images) that is recapitulated in other systems (e.g., text or audio). As predicted, children’s early concepts form readily aligned systems.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09012v3
PDF https://arxiv.org/pdf/1906.09012v3.pdf
PWC https://paperswithcode.com/paper/learning-as-the-unsupervised-alignment-of
Repo
Framework
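The "unique signature" idea can be demonstrated in a few lines: each concept's pattern of similarities to the other concepts within a system is mirrored across systems, so two systems can be aligned without any labels by matching those patterns. The 2-D embeddings, the hidden correspondence, and the sorted-distance-profile matcher below are all fabricated for illustration, not the paper's method.

```python
import numpy as np

# The same 4 concepts in two "modalities": system B is a rotated and
# secretly permuted copy of system A, so coordinates differ but the
# relational structure is identical.
A = np.array([[0., 0.], [1., 0.], [0., 2.], [5., 5.]])
perm = [2, 0, 3, 1]                      # hidden correspondence
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = A[perm] @ R.T

def dist_matrix(X):
    return np.linalg.norm(X[:, None] - X[None, :], axis=-1)

DA, DB = dist_matrix(A), dist_matrix(B)

# Match each concept in A to the concept in B whose sorted distance
# profile ("signature") is closest -- no labels are used anywhere.
recovered = [int(np.argmin([np.linalg.norm(np.sort(DA[i]) - np.sort(DB[j]))
                            for j in range(4)])) for i in range(4)]
```

Because distances are preserved under rotation, the signatures line up exactly and the hidden permutation is recovered; with noisy real embeddings the paper's experiments show the same alignment gets easier as systems grow.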

Zero-shot Learning of 3D Point Cloud Objects

Title Zero-shot Learning of 3D Point Cloud Objects
Authors Ali Cheraghian, Shafin Rahman, Lars Petersson
Abstract Recent deep learning architectures can recognize instances of 3D point cloud objects of previously seen classes quite well. At the same time, current 3D depth camera technology allows generating/segmenting a large number of 3D point cloud objects from an arbitrary scene, for which there is no previously seen training data. A challenge for a 3D point cloud recognition system is, then, to classify objects from new, unseen classes. This issue can be resolved by adopting a zero-shot learning (ZSL) approach for 3D data, similar to the 2D image version of the same problem. ZSL attempts to classify unseen objects by comparing semantic information (attribute/word vectors) of seen and unseen classes. Here, we adapt several recent 3D point cloud recognition systems to the ZSL setting with some changes to their architectures. To the best of our knowledge, this is the first attempt to classify unseen 3D point cloud objects in the ZSL setting. A standard protocol (which includes the choice of datasets and the seen/unseen split) to evaluate such systems is also proposed. Baseline performances are reported using the new protocol on the investigated models. This investigation poses a new challenge to the 3D point cloud recognition community that may instigate numerous future works.
Tasks Zero-Shot Learning
Published 2019-02-27
URL http://arxiv.org/abs/1902.10272v1
PDF http://arxiv.org/pdf/1902.10272v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-learning-of-3d-point-cloud-objects
Repo
Framework
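The ZSL decision rule described in the abstract is easy to sketch: project an input into a semantic (word-vector) space and label it with the nearest unseen-class embedding. The 3-dimensional "word vectors" and the projected feature below are made-up stand-ins; a real system would use a learned point-cloud encoder and pretrained word embeddings.

```python
import numpy as np

# Toy unseen-class embeddings in a 3-dim "semantic space".
unseen_classes = {
    "bathtub":  np.array([0.9, 0.1, 0.0]),
    "keyboard": np.array([0.0, 0.8, 0.6]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def classify(projected_feature):
    # Nearest unseen-class embedding by cosine similarity.
    return max(unseen_classes,
               key=lambda c: cosine(projected_feature, unseen_classes[c]))

# A point-cloud feature that landed near the "keyboard" embedding:
label = classify(np.array([0.1, 0.7, 0.5]))
```

The hard part, which the paper investigates, is learning the projection from raw 3D point clouds into that semantic space so unseen classes land near the right word vectors.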

Text Generation with Exemplar-based Adaptive Decoding

Title Text Generation with Exemplar-based Adaptive Decoding
Authors Hao Peng, Ankur P. Parikh, Manaal Faruqui, Bhuwan Dhingra, Dipanjan Das
Abstract We propose a novel conditioned text generation model. It draws inspiration from traditional template-based text generation techniques, where the source provides the content (i.e., what to say), and the template influences how to say it. Building on the successful encoder-decoder paradigm, it first encodes the content representation from the given input text; to produce the output, it retrieves exemplar text from the training data as “soft templates,” which are then used to construct an exemplar-specific decoder. We evaluate the proposed model on abstractive text summarization and data-to-text generation. Empirical results show that this model achieves strong performance and outperforms comparable baselines.
Tasks Abstractive Text Summarization, Data-to-Text Generation, Text Generation, Text Summarization
Published 2019-04-09
URL http://arxiv.org/abs/1904.04428v2
PDF http://arxiv.org/pdf/1904.04428v2.pdf
PWC https://paperswithcode.com/paper/text-generation-with-exemplar-based-adaptive
Repo
Framework
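The exemplar-retrieval step the abstract describes can be sketched with a crude stand-in: given an input, fetch the most similar training example and use its target text as a "soft template." Bag-of-words cosine similarity replaces the paper's learned encoder here, and the tiny training pairs are fabricated for illustration.

```python
import numpy as np

# Toy (source, target) training pairs in a data-to-text style.
train = [
    ("aria hotel | city | las vegas",
     "The Aria hotel is located in Las Vegas."),
    ("blue spice | food | french",
     "Blue Spice serves French food."),
]

def bow(text, vocab):
    v = np.zeros(len(vocab))
    for tok in text.split():
        if tok in vocab:
            v[vocab[tok]] += 1
    return v

def retrieve_exemplar(source):
    tokens = {tok for s, _ in train for tok in s.split()}
    vocab = {t: i for i, t in enumerate(sorted(tokens | set(source.split())))}
    q = bow(source, vocab)
    sims = [q @ bow(s, vocab)
            / (np.linalg.norm(q) * np.linalg.norm(bow(s, vocab)) + 1e-9)
            for s, _ in train]
    return train[int(np.argmax(sims))][1]   # exemplar's target text

template = retrieve_exemplar("bellagio hotel | city | las vegas")
```

In the actual model the retrieved exemplar does not merely get copied: it parameterizes an exemplar-specific decoder that adapts the template to the new content.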

Constructing Information-Lossless Biological Knowledge Graphs from Conditional Statements

Title Constructing Information-Lossless Biological Knowledge Graphs from Conditional Statements
Authors Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
Abstract Conditions are essential in the statements of biological literature. Without the conditions (e.g., environment, equipment) that were precisely specified, the facts (e.g., observations) in the statements may no longer be valid. One biological statement has one or multiple fact(s) and/or condition(s). Their subject and object can be either a concept or a concept’s attribute. Existing information extraction methods consider neither the role of the condition in the biological statement nor the role of the attribute in the subject/object. In this work, we design a new tag schema and propose a deep sequence tagging framework to structure conditional statements into fact and condition tuples from biological text. Experiments demonstrate that our method yields an information-lossless structure of the literature.
Tasks Knowledge Graphs
Published 2019-06-26
URL https://arxiv.org/abs/1907.00720v1
PDF https://arxiv.org/pdf/1907.00720v1.pdf
PWC https://paperswithcode.com/paper/constructing-information-lossless-biological
Repo
Framework

Sample Variance Decay in Randomly Initialized ReLU Networks

Title Sample Variance Decay in Randomly Initialized ReLU Networks
Authors Kyle Luther, H. Sebastian Seung
Abstract Before training a neural net, a classic rule of thumb is to randomly initialize the weights so the variance of activations is preserved across layers. This is traditionally interpreted using the total variance due to randomness in both weights \emph{and} samples. Alternatively, one can interpret the rule of thumb as preservation of the variance over samples for a fixed network. The two interpretations differ little for a shallow net, but the difference is shown to grow with depth for a deep ReLU net by decomposing the total variance into the network-averaged sum of the sample variance and square of the sample mean. We demonstrate that even when the total variance is preserved, the sample variance decays in the later layers through an analytical calculation in the limit of infinite network width, and numerical simulations for finite width. We show that Batch Normalization eliminates this decay and provide empirical evidence that preserving the sample variance instead of only the total variance at initialization time can have an impact on the training dynamics of a deep network.
Tasks
Published 2019-02-13
URL https://arxiv.org/abs/1902.04942v2
PDF https://arxiv.org/pdf/1902.04942v2.pdf
PWC https://paperswithcode.com/paper/variance-preserving-initialization-schemes
Repo
Framework
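The decomposition in the abstract is easy to probe numerically: with He ("variance preserving") initialization the total variance of activations stays roughly constant across layers, yet for a fixed batch the variance over samples decays with depth as activations become correlated. The widths, depth, and batch size below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth, n_samples = 300, 20, 200

# A batch of iid Gaussian inputs pushed through a deep ReLU net
# with He-initialized weights (variance 2/width).
h = rng.normal(size=(n_samples, width))
sample_var = []
for _ in range(depth):
    W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))
    h = np.maximum(h @ W, 0.0)
    # Sample variance: variance over the batch, averaged over units.
    sample_var.append(h.var(axis=0).mean())

# The per-network sample variance shrinks as depth grows, even though
# the total (weights + samples) variance is preserved by design.
```

This matches the paper's analytical result in the infinite-width limit: preserving total variance at initialization does not prevent the sample variance from decaying in later layers.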

Sequence-Aware Factorization Machines for Temporal Predictive Analytics

Title Sequence-Aware Factorization Machines for Temporal Predictive Analytics
Authors Tong Chen, Hongzhi Yin, Quoc Viet Hung Nguyen, Wen-Chih Peng, Xue Li, Xiaofang Zhou
Abstract In various web applications like targeted advertising and recommender systems, the available categorical features (e.g., product type) are often of great importance but sparse. As a widely adopted solution, models based on Factorization Machines (FMs) are capable of modelling high-order interactions among features for effective sparse predictive analytics. As the volume of web-scale data grows exponentially over time, sparse predictive analytics inevitably involves dynamic and sequential features. However, existing FM-based models assume no temporal orders in the data, and are unable to capture the sequential dependencies or patterns within the dynamic features, impeding the performance and adaptivity of these methods. Hence, in this paper, we propose a novel Sequence-Aware Factorization Machine (SeqFM) for temporal predictive analytics, which models feature interactions by fully investigating the effect of sequential dependencies. As static features (e.g., user gender) and dynamic features (e.g., user interacted items) express different semantics, we innovatively devise a multi-view self-attention scheme that separately models the effect of static features, dynamic features and the mutual interactions between static and dynamic features in three different views. In SeqFM, we further map the learned representations of feature interactions to the desired output with a shared residual network. To showcase the versatility and generalizability of SeqFM, we test SeqFM in three popular application scenarios for FM-based models, namely ranking, classification and regression tasks. Extensive experimental results on six large-scale datasets demonstrate the superior effectiveness and efficiency of SeqFM.
Tasks Recommendation Systems
Published 2019-11-07
URL https://arxiv.org/abs/1911.02752v2
PDF https://arxiv.org/pdf/1911.02752v2.pdf
PWC https://paperswithcode.com/paper/sequence-aware-factorization-machines-for
Repo
Framework
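As background for readers unfamiliar with the FM family SeqFM extends, the sketch below computes a plain second-order Factorization Machine, where pairwise feature interactions are modeled via dot products of latent factors and evaluated in O(kn) with the classic sum-of-squares identity. Dimensions and values are arbitrary; this is not the SeqFM model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, k = 6, 3
x = rng.normal(size=n_features)
w0, w = 0.1, rng.normal(size=n_features)
V = rng.normal(size=(n_features, k))        # latent factors per feature

# Naive O(n^2 k) pairwise interaction sum: sum_{i<j} <v_i, v_j> x_i x_j.
naive = sum((V[i] @ V[j]) * x[i] * x[j]
            for i in range(n_features) for j in range(i + 1, n_features))

# Fast O(n k) form: 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2].
fast = 0.5 * (((V.T @ x) ** 2).sum() - ((V ** 2).T @ (x ** 2)).sum())

y = w0 + w @ x + fast                       # full FM prediction
assert np.isclose(naive, fast)
```

SeqFM keeps this interaction machinery but, per the abstract, replaces the order-agnostic pairwise term with multi-view self-attention over static and dynamic (sequential) feature views.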

Multi-Stage Pathological Image Classification using Semantic Segmentation

Title Multi-Stage Pathological Image Classification using Semantic Segmentation
Authors Shusuke Takahama, Yusuke Kurose, Yusuke Mukuta, Hiroyuki Abe, Masashi Fukayama, Akihiko Yoshizawa, Masanobu Kitagawa, Tatsuya Harada
Abstract Histopathological image analysis is an essential process for the discovery of diseases such as cancer. However, it is challenging to train CNNs on whole slide images (WSIs) of gigapixel resolution given the available memory capacity. Most previous works divide high-resolution WSIs into small image patches and separately input them into the model to classify each patch as tumor or normal tissue. However, patch-based classification uses only patch-scale local information and ignores the relationship between neighboring patches. If we consider the relationship of neighboring patches and global features, we can improve the classification performance. In this paper, we propose a new model structure combining the patch-based classification model and a whole slide-scale segmentation model in order to improve the prediction performance of automatic pathological diagnosis. We extract patch features from the classification model and input them into the segmentation model to obtain a whole-slide tumor probability heatmap. The classification model considers patch-scale local features, and the segmentation model can take global information into account. We also propose a new optimization method that retains gradient information and trains the model partially for end-to-end learning with limited GPU memory capacity. We apply our method to tumor/normal prediction on WSIs, and the classification performance is improved compared with the conventional patch-based method.
Tasks Image Classification, Semantic Segmentation
Published 2019-10-10
URL https://arxiv.org/abs/1910.04473v1
PDF https://arxiv.org/pdf/1910.04473v1.pdf
PWC https://paperswithcode.com/paper/multi-stage-pathological-image-classification
Repo
Framework
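The patch-to-heatmap pipeline the paper builds on can be sketched in miniature: tile a large image into patches, score each patch with a classifier, and assemble the scores into a slide-level probability map that a segmentation model could then refine with global context. The mean-intensity "classifier" and all sizes below are fabricated stand-ins, not a CNN and not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)
slide = rng.random((512, 512))              # stand-in for a gigapixel WSI
patch = 64

def patch_score(p):
    # Hypothetical "tumor probability": mean intensity as a dummy model.
    return float(p.mean())

n = 512 // patch
heatmap = np.array([[patch_score(slide[i * patch:(i + 1) * patch,
                                       j * patch:(j + 1) * patch])
                     for j in range(n)] for i in range(n)])
# heatmap is an 8x8 grid of per-patch probabilities covering the slide.
```

The paper's contribution is what happens next: patch features (not just scores) feed a segmentation model over this grid, so neighboring patches and global structure inform the final prediction.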

Finite size corrections for neural network Gaussian processes

Title Finite size corrections for neural network Gaussian processes
Authors Joseph M. Antognini
Abstract There has been a recent surge of interest in modeling neural networks (NNs) as Gaussian processes. In the limit of a NN of infinite width the NN becomes equivalent to a Gaussian process. Here we demonstrate that for an ensemble of large, finite, fully connected networks with a single hidden layer the distribution of outputs at initialization is well described by a Gaussian perturbed by the fourth Hermite polynomial for weights drawn from a symmetric distribution. We show that the scale of the perturbation is inversely proportional to the number of units in the NN and that higher order terms decay more rapidly, thereby recovering the Edgeworth expansion. We conclude by observing that understanding how this perturbation changes under training would reveal the regimes in which the Gaussian process framework is valid to model NN behavior.
Tasks Gaussian Processes
Published 2019-08-27
URL https://arxiv.org/abs/1908.10030v1
PDF https://arxiv.org/pdf/1908.10030v1.pdf
PWC https://paperswithcode.com/paper/finite-size-corrections-for-neural-network
Repo
Framework
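The finite-width deviation from Gaussianity is easy to probe numerically: sample an ensemble of random one-hidden-layer ReLU nets at a fixed input and measure the excess kurtosis of the outputs, which should shrink roughly like 1/width. Using kurtosis as the probe, and all the sizes below, are illustrative choices rather than the paper's exact setup (which works with the fourth Hermite polynomial correction).

```python
import numpy as np

rng = np.random.default_rng(0)

def output_excess_kurtosis(width, n_nets=5000, d=10):
    x = rng.normal(size=d)                       # one fixed input
    W = rng.normal(size=(n_nets, width, d)) / np.sqrt(d)
    v = rng.normal(size=(n_nets, width)) / np.sqrt(width)
    # One scalar output per random net: v . relu(W x).
    out = np.einsum('nw,nw->n', v, np.maximum(W @ x, 0.0))
    z = (out - out.mean()) / out.std()
    return float((z ** 4).mean() - 3.0)          # 0 for a Gaussian

k_narrow = output_excess_kurtosis(width=4)
k_wide = output_excess_kurtosis(width=128)
# Wider nets are closer to Gaussian: |k_wide| should be much smaller.
```

Each output is a sum of `width` iid terms, so the central limit theorem drives the ensemble toward a Gaussian at rate governed by the width, which is exactly the perturbative picture the paper quantifies.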