April 2, 2020

Paper Group ANR 138

Convolutional Spectral Kernel Learning

Title Convolutional Spectral Kernel Learning
Authors Jian Li, Yong Liu, Weiping Wang
Abstract Recently, non-stationary spectral kernels have drawn much attention, owing to their powerful feature representation ability in revealing long-range correlations and input-dependent characteristics. However, non-stationary spectral kernels are still shallow models, and thus struggle to learn both hierarchical features and local interdependence. In this paper, to obtain hierarchical and local knowledge, we build an interpretable convolutional spectral kernel network (CSKN) based on the inverse Fourier transform, where we introduce deep architectures and convolutional filters into non-stationary spectral kernel representations. Moreover, based on Rademacher complexity, we derive the generalization error bounds and introduce two regularizers to improve the performance. Combining the regularizers and recent advancements on random initialization, we finally complete the learning framework of CSKN. Extensive experimental results on real-world datasets validate the effectiveness of the learning framework and coincide with our theoretical findings.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12744v1
PDF https://arxiv.org/pdf/2002.12744v1.pdf
PWC https://paperswithcode.com/paper/convolutional-spectral-kernel-learning
Repo
Framework

Online Continual Learning on Sequences

Title Online Continual Learning on Sequences
Authors German I. Parisi, Vincenzo Lomonaco
Abstract Online continual learning (OCL) refers to the ability of a system to learn over time from a continuous stream of data without having to revisit previously encountered training samples. Learning continually in a single data pass is crucial for agents and robots that operate in changing environments and are required to acquire, fine-tune, and transfer increasingly complex representations from non-i.i.d. input distributions. Machine learning models that address OCL must alleviate catastrophic forgetting, in which hidden representations are disrupted or completely overwritten when learning from streams of novel input. In this chapter, we summarize and discuss recent deep learning models that address OCL on sequential input through the use (and combination) of synaptic regularization, structural plasticity, and experience replay. Different implementations of replay have been proposed that alleviate catastrophic forgetting in connectionist architectures via the re-occurrence of (latent representations of) input sequences and that functionally resemble mechanisms of hippocampal replay in the mammalian brain. Empirical evidence shows that architectures endowed with experience replay typically outperform architectures without it in (online) incremental learning tasks.
Tasks Continual Learning
Published 2020-03-20
URL https://arxiv.org/abs/2003.09114v1
PDF https://arxiv.org/pdf/2003.09114v1.pdf
PWC https://paperswithcode.com/paper/online-continual-learning-on-sequences
Repo
Framework

Triple Memory Networks: a Brain-Inspired Method for Continual Learning

Title Triple Memory Networks: a Brain-Inspired Method for Continual Learning
Authors Liyuan Wang, Bo Lei, Qian Li, Hang Su, Jun Zhu, Yi Zhong
Abstract Continual acquisition of novel experience without interfering with previously learned knowledge, i.e. continual learning, is critical for artificial neural networks, but limited by catastrophic forgetting. A neural network adjusts its parameters when learning a new task, but then fails to perform the old tasks well. By contrast, the brain has a powerful ability to continually learn new experience without catastrophic interference. The underlying neural mechanisms are possibly attributable to the interplay of a hippocampus-dependent memory system and a neocortex-dependent memory system, mediated by the prefrontal cortex. Specifically, the two memory systems develop specialized mechanisms to consolidate information in more specific forms and more generalized forms, respectively, and complement the two forms of information in the interplay. Inspired by this brain strategy, we propose a novel approach named triple memory networks (TMNs) for continual learning. TMNs model the interplay of hippocampus, prefrontal cortex and sensory cortex (a neocortex region) as a triple-network architecture of generative adversarial networks (GAN). The input information is encoded as a specific representation of the data distributions in a generator, or as generalized knowledge of solving tasks in a discriminator and a classifier, with appropriate brain-inspired algorithms implemented to alleviate catastrophic forgetting in each module. In particular, the generator replays generated data of the learned tasks to the discriminator and the classifier, both of which are implemented with a weight consolidation regularizer to complement the information lost in the generation process. TMNs achieve new state-of-the-art performance on a variety of class-incremental learning benchmarks on MNIST, SVHN, CIFAR-10 and ImageNet-50, compared with strong baseline methods.
Tasks Continual Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.03143v1
PDF https://arxiv.org/pdf/2003.03143v1.pdf
PWC https://paperswithcode.com/paper/triple-memory-networks-a-brain-inspired
Repo
Framework

An ASP semantics for Constraints involving Conditional Aggregates

Title An ASP semantics for Constraints involving Conditional Aggregates
Authors Pedro Cabalar, Jorge Fandinno, Torsten Schaub, Philipp Wanko
Abstract We elaborate upon the formal foundations of hybrid Answer Set Programming (ASP) and extend its underlying logical framework with aggregate functions over constraint values and variables. This is achieved by introducing the construct of conditional expressions, which allow for considering two alternatives while evaluating constraints. Which alternative is considered is interpretation-dependent and chosen according to an associated condition. We put some emphasis on logic programs with linear constraints and show how common ASP aggregates can be regarded as particular cases of so-called conditional linear constraints. Finally, we introduce a polynomial-size, modular and faithful translation from our framework into regular (condition-free) Constraint ASP, outlining an implementation of conditional aggregates on top of existing hybrid ASP solvers.
Tasks
Published 2020-02-17
URL https://arxiv.org/abs/2002.06911v2
PDF https://arxiv.org/pdf/2002.06911v2.pdf
PWC https://paperswithcode.com/paper/an-asp-semantics-for-constraints-involving
Repo
Framework

Prototype Refinement Network for Few-Shot Segmentation

Title Prototype Refinement Network for Few-Shot Segmentation
Authors Jinlu Liu, Yongqiang Qin
Abstract Few-shot segmentation aims to segment new classes given only a few annotated images. It is more challenging than traditional semantic segmentation tasks that segment pre-defined classes with abundant annotated data. In this paper, we propose Prototype Refinement Network (PRNet) to attack the challenge of few-shot segmentation. PRNet learns to bidirectionally extract prototypes from both support and query images, which is different from existing methods. To extract representative prototypes of the new classes, we use adaptation and fusion for prototype refinement. The adaptation of PRNet is implemented by fine-tuning on the support set. Furthermore, prototype fusion is adopted to fuse support prototypes with query prototypes, incorporating the knowledge from both sides. Refined in this way, the prototypes become more discriminative in low-data regimes. Experiments on PASCAL-$5^i$ and COCO-$20^i$ demonstrate the superiority of our method. On COCO-$20^i$ in particular, PRNet outperforms previous methods by a large margin of 13.1% in the 1-shot setting and 17.4% in the 5-shot setting.
Tasks Semantic Segmentation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03579v1
PDF https://arxiv.org/pdf/2002.03579v1.pdf
PWC https://paperswithcode.com/paper/prototype-refinement-network-for-few-shot
Repo
Framework

Directions for Explainable Knowledge-Enabled Systems

Title Directions for Explainable Knowledge-Enabled Systems
Authors Shruthi Chari, Daniel M. Gruen, Oshani Seneviratne, Deborah L. McGuinness
Abstract Interest in the field of Explainable Artificial Intelligence has been growing for decades and has accelerated recently. As Artificial Intelligence models have become more complex, and often more opaque, with the incorporation of complex machine learning techniques, explainability has become more critical. Recently, researchers have been investigating and tackling explainability with a user-centric focus, looking for explanations to consider trustworthiness, comprehensibility, explicit provenance, and context-awareness. In this chapter, we leverage our survey of explanation literature in Artificial Intelligence and closely related fields and use these past efforts to generate a set of explanation types that we feel reflect the expanded needs of explanation for today’s artificial intelligence applications. We define each type and provide an example question that would motivate the need for this style of explanation. We believe this set of explanation types will help future system designers in their generation and prioritization of requirements and further help generate explanations that are better aligned to users’ and situational needs.
Tasks
Published 2020-03-17
URL https://arxiv.org/abs/2003.07523v1
PDF https://arxiv.org/pdf/2003.07523v1.pdf
PWC https://paperswithcode.com/paper/directions-for-explainable-knowledge-enabled
Repo
Framework

DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction

Title DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction
Authors Leila Saadatifard, Aryan Mobiny, Pavel Govyadinov, Hien Nguyen, David Mayerich
Abstract Maps of brain microarchitecture are important for understanding neurological function and behavior, including alterations caused by chronic conditions such as neurodegenerative disease. Techniques such as knife-edge scanning microscopy (KESM) provide the potential for whole organ imaging at sub-cellular resolution. However, multi-terabyte data sizes make manual annotation impractical and automatic segmentation challenging. Densely packed cells combined with interconnected microvascular networks are a challenge for current segmentation algorithms. The massive size of high-throughput microscopy data necessitates fast and largely unsupervised algorithms. In this paper, we investigate a fully-convolutional, deep, and densely-connected encoder-decoder for pixel-wise semantic segmentation. The excessive memory complexity often encountered with deep and dense networks is mitigated using skip connections, resulting in fewer parameters and enabling a significant performance increase over prior architectures. The proposed network provides superior performance for semantic segmentation problems applied to open-source benchmarks. We finally demonstrate our network for cellular and microvascular segmentation, enabling quantitative metrics for organ-scale neurovascular analysis.
Tasks Semantic Segmentation
Published 2020-02-04
URL https://arxiv.org/abs/2002.01568v1
PDF https://arxiv.org/pdf/2002.01568v1.pdf
PWC https://paperswithcode.com/paper/dvnet-a-memory-efficient-three-dimensional
Repo
Framework

ImagineNet: Restyling Apps Using Neural Style Transfer

Title ImagineNet: Restyling Apps Using Neural Style Transfer
Authors Michael H. Fischer, Richard R. Yang, Monica S. Lam
Abstract This paper presents ImagineNet, a tool that uses a novel neural style transfer model to enable end-users and app developers to restyle GUIs using an image of their choice. Former neural style transfer techniques are inadequate for this application because they produce GUIs that are illegible and hence nonfunctional. We propose a neural solution by adding a new loss term to the original formulation, which minimizes the squared error in the uncentered cross-covariance of features from different levels in a CNN between the style and output images. ImagineNet retains the details of GUIs, while transferring the colors and textures of the art. We presented GUIs restyled with ImagineNet as well as other style transfer techniques to 50 evaluators and all preferred those of ImagineNet. We show how ImagineNet can be used to restyle (1) the graphical assets of an app, (2) an app with user-supplied content, and (3) an app with dynamically generated GUIs.
Tasks Style Transfer
Published 2020-01-14
URL https://arxiv.org/abs/2001.04932v2
PDF https://arxiv.org/pdf/2001.04932v2.pdf
PWC https://paperswithcode.com/paper/imaginenet-restyling-apps-using-neural-style
Repo
Framework

Integrating Physics-Based Modeling with Machine Learning: A Survey

Title Integrating Physics-Based Modeling with Machine Learning: A Survey
Authors Jared Willard, Xiaowei Jia, Shaoming Xu, Michael Steinbach, Vipin Kumar
Abstract In this manuscript, we provide a structured and comprehensive overview of techniques to integrate machine learning with physics-based modeling. First, we provide a summary of application areas for which these approaches have been applied. Then, we describe classes of methodologies used to construct physics-guided machine learning models and hybrid physics-machine learning frameworks from a machine learning standpoint. With this foundation, we then provide a systematic organization of these existing techniques and discuss ideas for future research.
Tasks
Published 2020-03-10
URL https://arxiv.org/abs/2003.04919v2
PDF https://arxiv.org/pdf/2003.04919v2.pdf
PWC https://paperswithcode.com/paper/integrating-physics-based-modeling-with
Repo
Framework

Session-based Suggestion of Topics for Geographic Exploratory Search

Title Session-based Suggestion of Topics for Geographic Exploratory Search
Authors Noemi Mauro, Liliana Ardissono
Abstract Exploratory information search can challenge users in the formulation of efficacious search queries. Moreover, complex information spaces, such as those managed by Geographical Information Systems, can disorient people, making it difficult to find relevant data. In order to address these issues, we developed a session-based suggestion model that proposes concepts as a “you might also be interested in” function, by taking the user’s previous queries into account. Our model can be applied to incrementally generate suggestions in interactive search. It can be used for query expansion, and in general to guide users in the exploration of possibly complex spaces of data categories. Our model is based on a concept co-occurrence graph that describes how frequently concepts are searched together in search sessions. Starting from an ontological domain representation, we generated the graph by analyzing the query log of a major search engine. Moreover, we identified clusters of ontology concepts which frequently co-occur in the sessions of the log via community detection on the graph. The evaluation of our model provided satisfactory accuracy results.
Tasks Community Detection
Published 2020-03-25
URL https://arxiv.org/abs/2003.11314v1
PDF https://arxiv.org/pdf/2003.11314v1.pdf
PWC https://paperswithcode.com/paper/session-based-suggestion-of-topics-for
Repo
Framework
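
The concept co-occurrence graph mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sessions, the concept names, and the helper names `build_cooccurrence_graph` and `suggest`. The paper's actual graph construction, ontology, and community detection step are not reproduced.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(sessions):
    """Count how often each unordered pair of concepts appears in the same session."""
    edge_weights = Counter()
    for session in sessions:
        # Deduplicate within a session so a repeated concept is counted once.
        concepts = sorted(set(session))
        for a, b in combinations(concepts, 2):
            edge_weights[(a, b)] += 1
    return edge_weights


def suggest(edge_weights, concept, top_k=3):
    """Rank concepts by co-occurrence count with `concept` (hypothetical heuristic)."""
    scores = Counter()
    for (a, b), weight in edge_weights.items():
        if a == concept:
            scores[b] += weight
        elif b == concept:
            scores[a] += weight
    return [c for c, _ in scores.most_common(top_k)]


# Toy query sessions; the concept names are made up for illustration.
sessions = [
    ["hospitals", "pharmacies"],
    ["hospitals", "pharmacies", "schools"],
    ["schools", "parks"],
]
graph = build_cooccurrence_graph(sessions)
top_for_hospitals = suggest(graph, "hospitals", top_k=1)
```

Edges are keyed by alphabetically sorted pairs so that each undirected edge is stored once; a real system would likely normalize the counts before ranking.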

TTTTTackling WinoGrande Schemas

Title TTTTTackling WinoGrande Schemas
Authors Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin
Abstract We applied the T5 sequence-to-sequence model to tackle the AI2 WinoGrande Challenge by decomposing each example into two input text strings, each containing a hypothesis, and using the probabilities assigned to the “entailment” token as a score of the hypothesis. Our first (and only) submission to the official leaderboard yielded 0.7673 AUC on March 13, 2020, which is the best known result at this time and beats the previous state of the art by over five points.
Tasks
Published 2020-03-18
URL https://arxiv.org/abs/2003.08380v1
PDF https://arxiv.org/pdf/2003.08380v1.pdf
PWC https://paperswithcode.com/paper/tttttackling-winogrande-schemas
Repo
Framework
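
The decomposition step described in the abstract above (one example becomes two hypothesis strings, and the higher-scoring one wins) might look roughly like the sketch below. The blank marker `_`, the function names, and the `toy_score` stand-in are assumptions for illustration; the exact input template fed to T5 and the real entailment probabilities are not shown here.

```python
def decompose(sentence, option1, option2, blank="_"):
    """Return two candidate strings, one per option substituted for the blank."""
    if sentence.count(blank) != 1:
        raise ValueError("expected exactly one blank in the sentence")
    return sentence.replace(blank, option1), sentence.replace(blank, option2)


def pick_option(sentence, option1, option2, entailment_score):
    """Choose the option whose hypothesis receives the higher score.

    `entailment_score` is a stand-in for a model call that returns the
    probability assigned to the "entailment" token; it is an assumption here.
    """
    h1, h2 = decompose(sentence, option1, option2)
    return option1 if entailment_score(h1) >= entailment_score(h2) else option2


# Dummy scorer for demonstration only: a fixed lookup, not a model.
def toy_score(text):
    return 0.9 if "trophy is too big" in text else 0.1


# Toy sentence in the Winograd style, not taken from the WinoGrande dataset.
example = "The trophy does not fit in the suitcase because the _ is too big."
hyps = decompose(example, "trophy", "suitcase")
choice = pick_option(example, "trophy", "suitcase", toy_score)
```

Passing the scorer in as a callable keeps the decomposition logic independent of whichever model supplies the scores.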

AI Trust in business processes: The need for process-aware explanations

Title AI Trust in business processes: The need for process-aware explanations
Authors Steve T. K. Jan, Vatche Ishakian, Vinod Muthusamy
Abstract Business processes underpin a large number of enterprise operations including processing loan applications, managing invoices, and insurance claims. There is a large opportunity for infusing AI to reduce cost or provide better customer experience, and the business process management (BPM) literature is rich in machine learning solutions including unsupervised learning to gain insights on clusters of process traces, classification models to predict the outcomes, duration, or paths of partial process traces, extracting business process from documents, and models to recommend how to optimize a business process or navigate decision points. More recently, deep learning models including those from the NLP domain have been applied to process predictions. Unfortunately, very few of these innovations have been applied and adopted by enterprise companies. We assert that a large reason for the lack of adoption of AI models in BPM is that business users are risk-averse and do not implicitly trust AI models. There has, unfortunately, been little attention paid to explaining model predictions to business users with process context. We challenge the BPM community to build on the AI interpretability literature, and the AI Trust community to understand
Tasks
Published 2020-01-21
URL https://arxiv.org/abs/2001.07537v1
PDF https://arxiv.org/pdf/2001.07537v1.pdf
PWC https://paperswithcode.com/paper/ai-trust-in-business-processes-the-need-for
Repo
Framework

Automatic Discourse Segmentation: an evaluation in French

Title Automatic Discourse Segmentation: an evaluation in French
Authors Rémy Saksik, Alejandro Molina-Villegas, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno
Abstract In this article, we describe some discursive segmentation methods as well as a preliminary evaluation of the segmentation quality. Although our experiments were carried out on documents in French, we have developed three discursive segmentation models based solely on resources simultaneously available in several languages: marker lists and statistical POS labeling. We have also carried out automatic evaluations of these systems against the Annodis corpus, which is a manually annotated reference. The results obtained are very encouraging.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.04095v1
PDF https://arxiv.org/pdf/2002.04095v1.pdf
PWC https://paperswithcode.com/paper/automatic-discourse-segmentation-an
Repo
Framework

Learning Accurate Integer Transformer Machine-Translation Models

Title Learning Accurate Integer Transformer Machine-Translation Models
Authors Ephrem Wu
Abstract We describe a method for training accurate Transformer machine-translation models to run inference using 8-bit integer (INT8) hardware matrix multipliers, as opposed to the more costly single-precision floating-point (FP32) hardware. Unlike previous work, which converted only 85 Transformer matrix multiplications to INT8, leaving 48 out of 133 of them in FP32 because of unacceptable accuracy loss, we convert them all to INT8 without compromising accuracy. Tested on the newstest2014 English-to-German translation task, our INT8 Transformer Base and Transformer Big models yield BLEU scores that are 99.3% to 100% relative to those of the corresponding FP32 models. Our approach converts all matrix-multiplication tensors from an existing FP32 model into INT8 tensors by automatically making range-precision trade-offs during training. To demonstrate the robustness of this approach, we also include results from INT6 Transformer models.
Tasks Machine Translation
Published 2020-01-03
URL https://arxiv.org/abs/2001.00926v1
PDF https://arxiv.org/pdf/2001.00926v1.pdf
PWC https://paperswithcode.com/paper/learning-accurate-integer-transformer-machine
Repo
Framework
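
To make the INT8 idea in the abstract above concrete, here is a minimal sketch of symmetric per-tensor quantization followed by an integer dot product. This is a generic post-hoc scheme chosen for illustration, not the paper's method, which learns range-precision trade-offs during training. The function names and toy vectors are assumptions.

```python
def quantize(values, num_bits=8):
    """Symmetric per-tensor quantization: map floats to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def int_dot(qa, scale_a, qb, scale_b):
    """Integer dot product with a single floating-point rescale at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # accumulates in integers
    return acc * scale_a * scale_b


# Toy vectors standing in for one row and one column of a matrix product.
a = [0.5, -1.0, 0.25]
b = [1.0, 0.5, -2.0]
qa, sa = quantize(a)
qb, sb = quantize(b)
approx = int_dot(qa, sa, qb, sb)
exact = sum(x * y for x, y in zip(a, b))
```

The multiply-accumulate runs entirely on integers, and only the final rescale uses floating point. Lowering `num_bits` to 6 shrinks the integer range and makes the approximation coarser.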

Self-concordant analysis of Frank-Wolfe algorithms

Title Self-concordant analysis of Frank-Wolfe algorithms
Authors Pavel Dvurechensky, Shimrit Shtern, Mathias Staudigl, Petr Ostroukhov, Kamil Safin
Abstract Projection-free optimization via different variants of the Frank-Wolfe (FW) method has become one of the cornerstones in optimization for machine learning since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k), k being the iteration counter. If the problem can be represented by a local linear minimization oracle, we are the first to propose an FW method with linear convergence rate without assuming either strong convexity or a Lipschitz continuous gradient.
Tasks Quantum State Tomography
Published 2020-02-11
URL https://arxiv.org/abs/2002.04320v2
PDF https://arxiv.org/pdf/2002.04320v2.pdf
PWC https://paperswithcode.com/paper/self-concordant-analysis-of-frank-wolfe
Repo
Framework
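
For readers unfamiliar with the Frank-Wolfe template that the abstract above builds on, the following is a minimal sketch of the classic method over the probability simplex with the standard step size 2/(k+2). It does not implement the paper's adaptive step size for self-concordant functions. The toy quadratic objective, the `target` vector, and the function name are assumptions for illustration.

```python
def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Classic Frank-Wolfe over the probability simplex with step 2/(k+2)."""
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle on the simplex:
        # the vertex with the smallest gradient entry.
        i = min(range(len(x)), key=lambda j: g[j])
        gamma = 2.0 / (k + 2)
        # Convex combination of the current point and the chosen vertex,
        # so the iterate stays feasible without any projection.
        x = [(1 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Toy objective f(x) = 0.5 * ||x - target||^2, with target inside the simplex.
target = [0.2, 0.3, 0.5]


def grad(x):
    return [xj - tj for xj, tj in zip(x, target)]


x_star = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0])
```

The toy objective has a Lipschitz continuous gradient, which is exactly the assumption the paper drops. The sketch therefore only shows the projection-free update, not the self-concordant setting.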