Paper Group ANR 138
Convolutional Spectral Kernel Learning. Online Continual Learning on Sequences. Triple Memory Networks: a Brain-Inspired Method for Continual Learning. An ASP semantics for Constraints involving Conditional Aggregates. Prototype Refinement Network for Few-Shot Segmentation. Directions for Explainable Knowledge-Enabled Systems. DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction. ImagineNet: Restyling Apps Using Neural Style Transfer. Integrating Physics-Based Modeling with Machine Learning: A Survey. Session-based Suggestion of Topics for Geographic Exploratory Search. TTTTTackling WinoGrande Schemas. AI Trust in business processes: The need for process-aware explanations. Automatic Discourse Segmentation: an evaluation in French. Learning Accurate Integer Transformer Machine-Translation Models. Self-concordant analysis of Frank-Wolfe algorithms.
Convolutional Spectral Kernel Learning
Title | Convolutional Spectral Kernel Learning |
Authors | Jian Li, Yong Liu, Weiping Wang |
Abstract | Recently, non-stationary spectral kernels have drawn much attention, owing to their powerful feature representation ability in revealing long-range correlations and input-dependent characteristics. However, non-stationary spectral kernels are still shallow models and are therefore deficient in learning both hierarchical features and local interdependence. In this paper, to obtain hierarchical and local knowledge, we build an interpretable convolutional spectral kernel network (\texttt{CSKN}) based on the inverse Fourier transform, where we introduce deep architectures and convolutional filters into non-stationary spectral kernel representations. Moreover, based on Rademacher complexity, we derive generalization error bounds and introduce two regularizers to improve performance. Combining the regularizers with recent advances in random initialization, we complete the learning framework of \texttt{CSKN}. Extensive experimental results on real-world datasets validate the effectiveness of the learning framework and coincide with our theoretical findings. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12744v1 |
https://arxiv.org/pdf/2002.12744v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-spectral-kernel-learning |
Repo | |
Framework | |
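The abstract's core construction, spectral features derived from the inverse Fourier transform and stacked into a deep network, can be illustrated with random Fourier features. A minimal sketch assuming stationary features and Gaussian frequency draws; the input-dependent (non-stationary) frequencies and convolutional filters of CSKN itself are omitted:

```python
import numpy as np

def spectral_layer(X, W, b):
    """Random Fourier feature map: phi(x) = sqrt(2/D) * cos(x @ W + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 8))  # 16 inputs with 8 features

# Stacking two random-feature layers gives a hierarchical feature map;
# CSKN additionally makes the frequencies input-dependent and structures
# W as convolutional filters, both omitted in this sketch.
W1, b1 = rng.normal(size=(8, 64)), rng.uniform(0, 2 * np.pi, 64)
W2, b2 = rng.normal(size=(64, 32)), rng.uniform(0, 2 * np.pi, 32)
Phi = spectral_layer(spectral_layer(X, W1, b1), W2, b2)
K = Phi @ Phi.T  # approximate kernel matrix induced by the deep feature map
```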
Online Continual Learning on Sequences
Title | Online Continual Learning on Sequences |
Authors | German I. Parisi, Vincenzo Lomonaco |
Abstract | Online continual learning (OCL) refers to the ability of a system to learn over time from a continuous stream of data without having to revisit previously encountered training samples. Learning continually in a single data pass is crucial for agents and robots operating in changing environments and required to acquire, fine-tune, and transfer increasingly complex representations from non-i.i.d. input distributions. Machine learning models that address OCL must alleviate \textit{catastrophic forgetting}, in which hidden representations are disrupted or completely overwritten when learning from streams of novel input. In this chapter, we summarize and discuss recent deep learning models that address OCL on sequential input through the use (and combination) of synaptic regularization, structural plasticity, and experience replay. Different implementations of replay have been proposed that alleviate catastrophic forgetting in connectionist architectures via the re-occurrence of (latent representations of) input sequences, and that functionally resemble mechanisms of hippocampal replay in the mammalian brain. Empirical evidence shows that architectures endowed with experience replay typically outperform architectures without it in (online) incremental learning tasks. |
Tasks | Continual Learning |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09114v1 |
https://arxiv.org/pdf/2003.09114v1.pdf | |
PWC | https://paperswithcode.com/paper/online-continual-learning-on-sequences |
Repo | |
Framework | |
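Experience replay, which the chapter singles out as the most consistently effective ingredient, is often implemented with a small rehearsal buffer. A minimal sketch using reservoir sampling, a common choice in the OCL literature (not tied to any single surveyed model):

```python
import random

class ReservoirReplayBuffer:
    """Fixed-capacity replay buffer that keeps a uniform sample of the stream."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0

    def add(self, sample):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            # Reservoir sampling: each stream item survives with equal probability.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = sample

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))
```

In the surveyed models, mini-batches drawn from such a buffer are interleaved with the incoming stream to approximate i.i.d. training in a single pass.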
Triple Memory Networks: a Brain-Inspired Method for Continual Learning
Title | Triple Memory Networks: a Brain-Inspired Method for Continual Learning |
Authors | Liyuan Wang, Bo Lei, Qian Li, Hang Su, Jun Zhu, Yi Zhong |
Abstract | Continual acquisition of novel experience without interfering with previously learned knowledge, i.e. continual learning, is critical for artificial neural networks, but is limited by catastrophic forgetting: a neural network adjusts its parameters when learning a new task, but then fails to perform the old tasks well. By contrast, the brain has a powerful ability to continually learn new experience without catastrophic interference. The underlying neural mechanisms are possibly attributable to the interplay of the hippocampus-dependent and neocortex-dependent memory systems, mediated by the prefrontal cortex. Specifically, the two memory systems develop specialized mechanisms to consolidate information in more specific and more generalized forms, respectively, and complement the two forms of information in their interplay. Inspired by this brain strategy, we propose a novel approach named triple memory networks (TMNs) for continual learning. TMNs model the interplay of the hippocampus, prefrontal cortex, and sensory cortex (a neocortex region) as a triple-network architecture of generative adversarial networks (GANs). The input information is encoded as a specific representation of the data distributions in a generator, or as generalized knowledge for solving tasks in a discriminator and a classifier, and appropriate brain-inspired algorithms are implemented to alleviate catastrophic forgetting in each module. In particular, the generator replays generated data of the learned tasks to the discriminator and the classifier, both of which are implemented with a weight consolidation regularizer to complement the information lost in the generation process. TMNs achieve new state-of-the-art performance on a variety of class-incremental learning benchmarks on MNIST, SVHN, CIFAR-10, and ImageNet-50, compared with strong baseline methods. |
Tasks | Continual Learning |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03143v1 |
https://arxiv.org/pdf/2003.03143v1.pdf | |
PWC | https://paperswithcode.com/paper/triple-memory-networks-a-brain-inspired |
Repo | |
Framework | |
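The abstract mentions a weight consolidation regularizer applied to the discriminator and classifier. A minimal EWC-style sketch of such a penalty, assuming per-parameter importance weights have already been estimated; the exact regularizer used in TMNs may differ:

```python
import torch

def consolidation_penalty(model, old_params, importance, lam=1.0):
    """EWC-style term: lam/2 * sum_i F_i * (theta_i - theta*_i)^2.
    `old_params` and `importance` are dicts keyed by parameter name,
    holding the previous task's weights and their importance estimates."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (importance[name] * (p - old_params[name]).pow(2)).sum()
    return 0.5 * lam * loss
```

The penalty would simply be added to each module's task loss, e.g. `loss = task_loss + consolidation_penalty(model, old_params, fisher)`, where `fisher` is a hypothetical dict of Fisher-information estimates.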
An ASP semantics for Constraints involving Conditional Aggregates
Title | An ASP semantics for Constraints involving Conditional Aggregates |
Authors | Pedro Cabalar, Jorge Fandinno, Torsten Schaub, Philipp Wanko |
Abstract | We elaborate upon the formal foundations of hybrid Answer Set Programming (ASP) and extend its underlying logical framework with aggregate functions over constraint values and variables. This is achieved by introducing the construct of conditional expressions, which allow for considering two alternatives while evaluating constraints. Which alternative is considered is interpretation-dependent and chosen according to an associated condition. We put some emphasis on logic programs with linear constraints and show how common ASP aggregates can be regarded as particular cases of so-called conditional linear constraints. Finally, we introduce a polynomial-size, modular and faithful translation from our framework into regular (condition-free) Constraint ASP, outlining an implementation of conditional aggregates on top of existing hybrid ASP solvers. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06911v2 |
https://arxiv.org/pdf/2002.06911v2.pdf | |
PWC | https://paperswithcode.com/paper/an-asp-semantics-for-constraints-involving |
Repo | |
Framework | |
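The key construct, a conditional expression denoting one of two alternatives depending on an interpretation-dependent condition, can be mimicked outside ASP. A toy Python sketch, assuming an interpretation is a dict from variables to values; the syntax and evaluation rule here are illustrative, not the paper's formal definitions:

```python
def eval_conditional(nu, then_term, cond, else_term):
    """(s' | c | s'') denotes nu(s') when condition c holds under the
    interpretation nu, and nu(s'') otherwise."""
    return then_term(nu) if cond(nu) else else_term(nu)

# Example: sum_v (v | v > 0 | 0) realizes a "sum of positive values"
# aggregate as a plain sum of conditional expressions.
nu = {"x": 3, "y": -2, "z": 5}
total = sum(eval_conditional(nu,
                             lambda m, k=k: m[k],      # s': the value itself
                             lambda m, k=k: m[k] > 0,  # c: the condition
                             lambda m: 0)              # s'': neutral element
            for k in nu)
print(total)  # 8
```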
Prototype Refinement Network for Few-Shot Segmentation
Title | Prototype Refinement Network for Few-Shot Segmentation |
Authors | Jinlu Liu, Yongqiang Qin |
Abstract | Few-shot segmentation aims to segment new classes given only a few annotated images. It is more challenging than traditional semantic segmentation tasks, which segment pre-defined classes with abundant annotated data. In this paper, we propose the Prototype Refinement Network (PRNet) to address the challenge of few-shot segmentation. Unlike existing methods, PRNet learns to bidirectionally extract prototypes from both support and query images. To extract representative prototypes of the new classes, we use adaptation and fusion for prototype refinement. Adaptation is implemented by fine-tuning PRNet on the support set. Furthermore, prototype fusion is adopted to fuse support prototypes with query prototypes, incorporating the knowledge from both sides. Refined in this way, the prototypes become more discriminative in low-data regimes. Experiments on PASCAL-$5^i$ and COCO-$20^i$ demonstrate the superiority of our method. In particular, on COCO-$20^i$, PRNet significantly outperforms previous methods by a large margin of 13.1% in the 1-shot setting and 17.4% in the 5-shot setting. |
Tasks | Semantic Segmentation |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03579v1 |
https://arxiv.org/pdf/2002.03579v1.pdf | |
PWC | https://paperswithcode.com/paper/prototype-refinement-network-for-few-shot |
Repo | |
Framework | |
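Prototype extraction in few-shot segmentation is commonly implemented with masked average pooling, and the abstract describes fusing support and query prototypes. A minimal PyTorch sketch; the convex-combination `fuse` rule is an assumption, not PRNet's exact fusion:

```python
import torch
import torch.nn.functional as F

def masked_average_prototype(features, mask):
    """Masked average pooling: average a (C, H, W) feature map over the
    foreground pixels of a binary (H0, W0) mask to get a class prototype."""
    mask = F.interpolate(mask[None, None].float(), size=features.shape[-2:],
                         mode="nearest")[0, 0]
    return (features * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1.0)

def fuse(p_support, p_query, alpha=0.5):
    """Hypothetical prototype fusion: blend support and query prototypes."""
    return alpha * p_support + (1 - alpha) * p_query
```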
Directions for Explainable Knowledge-Enabled Systems
Title | Directions for Explainable Knowledge-Enabled Systems |
Authors | Shruthi Chari, Daniel M. Gruen, Oshani Seneviratne, Deborah L. McGuinness |
Abstract | Interest in the field of Explainable Artificial Intelligence has been growing for decades and has accelerated recently. As Artificial Intelligence models have become more complex, and often more opaque, with the incorporation of complex machine learning techniques, explainability has become more critical. Recently, researchers have been investigating and tackling explainability with a user-centric focus, looking for explanations to consider trustworthiness, comprehensibility, explicit provenance, and context-awareness. In this chapter, we leverage our survey of explanation literature in Artificial Intelligence and closely related fields and use these past efforts to generate a set of explanation types that we feel reflect the expanded needs of explanation for today’s artificial intelligence applications. We define each type and provide an example question that would motivate the need for this style of explanation. We believe this set of explanation types will help future system designers in their generation and prioritization of requirements and further help generate explanations that are better aligned to users’ and situational needs. |
Tasks | |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07523v1 |
https://arxiv.org/pdf/2003.07523v1.pdf | |
PWC | https://paperswithcode.com/paper/directions-for-explainable-knowledge-enabled |
Repo | |
Framework | |
DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction
Title | DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction |
Authors | Leila Saadatifard, Aryan Mobiny, Pavel Govyadinov, Hien Nguyen, David Mayerich |
Abstract | Maps of brain microarchitecture are important for understanding neurological function and behavior, including alterations caused by chronic conditions such as neurodegenerative disease. Techniques such as knife-edge scanning microscopy (KESM) provide the potential for whole organ imaging at sub-cellular resolution. However, multi-terabyte data sizes make manual annotation impractical and automatic segmentation challenging. Densely packed cells combined with interconnected microvascular networks are a challenge for current segmentation algorithms. The massive size of high-throughput microscopy data necessitates fast and largely unsupervised algorithms. In this paper, we investigate a fully-convolutional, deep, and densely-connected encoder-decoder for pixel-wise semantic segmentation. The excessive memory complexity often encountered with deep and dense networks is mitigated using skip connections, resulting in fewer parameters and enabling a significant performance increase over prior architectures. The proposed network provides superior performance for semantic segmentation problems applied to open-source benchmarks. We finally demonstrate our network for cellular and microvascular segmentation, enabling quantitative metrics for organ-scale neurovascular analysis. |
Tasks | Semantic Segmentation |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01568v1 |
https://arxiv.org/pdf/2002.01568v1.pdf | |
PWC | https://paperswithcode.com/paper/dvnet-a-memory-efficient-three-dimensional |
Repo | |
Framework | |
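A densely connected encoder-decoder concatenates each layer's output with all earlier feature maps. A minimal PyTorch sketch of one 3D dense block; the growth rate, depth, and the memory-saving skip-connection layout of DVNet itself are assumptions:

```python
import torch
import torch.nn as nn

class DenseBlock3D(nn.Module):
    """DenseNet-style 3D block: layer i consumes the concatenation of the
    input and all previous layers' outputs along the channel axis."""
    def __init__(self, in_ch, growth=8, layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv3d(in_ch + i * growth, growth, kernel_size=3, padding=1)
            for i in range(layers))

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return torch.cat(feats, dim=1)  # in_ch + layers * growth channels
```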
ImagineNet: Restyling Apps Using Neural Style Transfer
Title | ImagineNet: Restyling Apps Using Neural Style Transfer |
Authors | Michael H. Fischer, Richard R. Yang, Monica S. Lam |
Abstract | This paper presents ImagineNet, a tool that uses a novel neural style transfer model to enable end-users and app developers to restyle GUIs using an image of their choice. Prior neural style transfer techniques are inadequate for this application because they produce GUIs that are illegible and hence nonfunctional. We propose a neural solution by adding a new loss term to the original formulation, which minimizes the squared error in the uncentered cross-covariance of features from different levels in a CNN between the style and output images. ImagineNet retains the details of GUIs while transferring the colors and textures of the art. We presented GUIs restyled with ImagineNet, as well as with other style transfer techniques, to 50 evaluators, and all of them preferred those of ImagineNet. We show how ImagineNet can be used to restyle (1) the graphical assets of an app, (2) an app with user-supplied content, and (3) an app with dynamically generated GUIs. |
Tasks | Style Transfer |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04932v2 |
https://arxiv.org/pdf/2001.04932v2.pdf | |
PWC | https://paperswithcode.com/paper/imaginenet-restyling-apps-using-neural-style |
Repo | |
Framework | |
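The new loss term can be written down directly from the abstract: the squared error between uncentered cross-covariances of features from two CNN levels, computed for the style image and for the output image. A hedged PyTorch sketch, assuming the paired feature maps have been resized to a common spatial resolution and that levels are paired in a fixed order:

```python
import torch

def uncentered_cross_cov(f1, f2):
    """Uncentered cross-covariance of two (C, H, W) feature maps sharing
    spatial size (assumed resized beforehand): a (C1, C2) matrix."""
    a, b = f1.flatten(1), f2.flatten(1)  # (C1, N) and (C2, N)
    return a @ b.t() / a.shape[1]

def structure_loss(style_pairs, output_pairs):
    """Squared error between style and output cross-covariances, summed
    over (hypothetical) pairs of CNN levels."""
    loss = torch.zeros(())
    for (s1, s2), (o1, o2) in zip(style_pairs, output_pairs):
        loss = loss + (uncentered_cross_cov(s1, s2)
                       - uncentered_cross_cov(o1, o2)).pow(2).sum()
    return loss
```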
Integrating Physics-Based Modeling with Machine Learning: A Survey
Title | Integrating Physics-Based Modeling with Machine Learning: A Survey |
Authors | Jared Willard, Xiaowei Jia, Shaoming Xu, Michael Steinbach, Vipin Kumar |
Abstract | In this manuscript, we provide a structured and comprehensive overview of techniques to integrate machine learning with physics-based modeling. First, we provide a summary of application areas for which these approaches have been applied. Then, we describe classes of methodologies used to construct physics-guided machine learning models and hybrid physics-machine learning frameworks from a machine learning standpoint. With this foundation, we then provide a systematic organization of these existing techniques and discuss ideas for future research. |
Tasks | |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04919v2 |
https://arxiv.org/pdf/2003.04919v2.pdf | |
PWC | https://paperswithcode.com/paper/integrating-physics-based-modeling-with |
Repo | |
Framework | |
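A recurring pattern in this line of work is to augment the data-fit objective with a physics-consistency penalty. A minimal sketch of such a physics-guided loss; the residual function and weighting below are illustrative assumptions, not a specific method from the survey:

```python
import torch

def physics_guided_loss(y_pred, y_true, physics_residual, lam=0.1):
    """Data-fit MSE plus a penalty on violations of a physical constraint."""
    data_loss = torch.mean((y_pred - y_true) ** 2)
    phys_loss = torch.mean(physics_residual(y_pred) ** 2)
    return data_loss + lam * phys_loss

# Toy example: predicted (kinetic, potential) energies should sum to a
# known constant total -- a made-up conservation constraint.
residual = lambda y: y.sum(dim=1) - 1.0
y_pred = torch.tensor([[0.6, 0.5], [0.4, 0.6]])
y_true = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
loss = physics_guided_loss(y_pred, y_true, residual)
```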
Session-based Suggestion of Topics for Geographic Exploratory Search
Title | Session-based Suggestion of Topics for Geographic Exploratory Search |
Authors | Noemi Mauro, Liliana Ardissono |
Abstract | Exploratory information search can challenge users in the formulation of efficacious search queries. Moreover, complex information spaces, such as those managed by Geographical Information Systems, can disorient people, making it difficult to find relevant data. In order to address these issues, we developed a session-based suggestion model that proposes concepts as a “you might also be interested in” function, by taking the user’s previous queries into account. Our model can be applied to incrementally generate suggestions in interactive search. It can be used for query expansion, and in general to guide users in the exploration of possibly complex spaces of data categories. Our model is based on a concept co-occurrence graph that describes how frequently concepts are searched together in search sessions. Starting from an ontological domain representation, we generated the graph by analyzing the query log of a major search engine. Moreover, we identified clusters of ontology concepts which frequently co-occur in the sessions of the log via community detection on the graph. The evaluation of our model provided satisfactory accuracy results. |
Tasks | Community Detection |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11314v1 |
https://arxiv.org/pdf/2003.11314v1.pdf | |
PWC | https://paperswithcode.com/paper/session-based-suggestion-of-topics-for |
Repo | |
Framework | |
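The backbone of the model, a concept co-occurrence graph whose edge weights count how often two concepts appear in the same session, followed by community detection, is straightforward to prototype with networkx. A sketch with made-up session data; the paper builds the graph from an ontological representation and a major search engine's query log:

```python
import itertools
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Illustrative sessions: each is the set of concepts searched together.
sessions = [["lake", "beach", "camping"], ["lake", "fishing"],
            ["beach", "camping"], ["museum", "restaurant"]]

G = nx.Graph()
for session in sessions:
    for a, b in itertools.combinations(sorted(set(session)), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1  # concepts co-occurred in another session
        else:
            G.add_edge(a, b, weight=1)

# Clusters of frequently co-occurring concepts via community detection.
communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
```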
TTTTTackling WinoGrande Schemas
Title | TTTTTackling WinoGrande Schemas |
Authors | Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin |
Abstract | We applied the T5 sequence-to-sequence model to tackle the AI2 WinoGrande Challenge by decomposing each example into two input text strings, each containing a hypothesis, and using the probabilities assigned to the “entailment” token as a score of the hypothesis. Our first (and only) submission to the official leaderboard yielded 0.7673 AUC on March 13, 2020, which is the best known result at this time and beats the previous state of the art by over five points. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08380v1 |
https://arxiv.org/pdf/2003.08380v1.pdf | |
PWC | https://paperswithcode.com/paper/tttttackling-winogrande-schemas |
Repo | |
Framework | |
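The scoring recipe in the abstract — feed each candidate hypothesis to T5 and compare the probability assigned to the "entailment" token — can be sketched with the Hugging Face transformers library. The checkpoint, input template, and example sentences below are assumptions, not the authors' exact setup:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

def entailment_score(text):
    """Probability mass T5 puts on the first piece of 'entailment'."""
    inputs = tok(text, return_tensors="pt")
    target = tok("entailment", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(**inputs, labels=target).logits  # (1, T, vocab)
    probs = torch.softmax(logits[0, 0], dim=-1)
    return probs[target[0, 0]].item()

# Each WinoGrande example becomes two inputs, one per candidate filler;
# the candidate whose hypothesis scores higher wins.
s1 = entailment_score("mnli hypothesis: the trophy is too large. "
                      "premise: the trophy does not fit in the suitcase.")
s2 = entailment_score("mnli hypothesis: the suitcase is too large. "
                      "premise: the trophy does not fit in the suitcase.")
prediction = 1 if s1 >= s2 else 2
```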
AI Trust in business processes: The need for process-aware explanations
Title | AI Trust in business processes: The need for process-aware explanations |
Authors | Steve T. K. Jan, Vatche Ishakian, Vinod Muthusamy |
Abstract | Business processes underpin a large number of enterprise operations, including processing loan applications, managing invoices, and handling insurance claims. There is a large opportunity for infusing AI to reduce cost or provide better customer experience, and the business process management (BPM) literature is rich in machine learning solutions, including unsupervised learning to gain insights on clusters of process traces, classification models to predict the outcomes, duration, or paths of partial process traces, extraction of business processes from documents, and models to recommend how to optimize a business process or navigate decision points. More recently, deep learning models, including those from the NLP domain, have been applied to process predictions. Unfortunately, very few of these innovations have been applied and adopted by enterprise companies. We assert that a large reason for the lack of adoption of AI models in BPM is that business users are risk-averse and do not implicitly trust AI models. There has, unfortunately, been little attention paid to explaining model predictions to business users with process context. We challenge the BPM community to build on the AI interpretability literature, and the AI Trust community to understand |
Tasks | |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07537v1 |
https://arxiv.org/pdf/2001.07537v1.pdf | |
PWC | https://paperswithcode.com/paper/ai-trust-in-business-processes-the-need-for |
Repo | |
Framework | |
Automatic Discourse Segmentation: an evaluation in French
Title | Automatic Discourse Segmentation: an evaluation in French |
Authors | Rémy Saksik, Alejandro Molina-Villegas, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno |
Abstract | In this article, we describe several discourse segmentation methods as well as a preliminary evaluation of their segmentation quality. Although our experiments were carried out on documents in French, we developed three discourse segmentation models based solely on resources simultaneously available in several languages: marker lists and statistical POS tagging. We also carried out automatic evaluations of these systems against the Annodis corpus, a manually annotated reference. The results obtained are very encouraging. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04095v1 |
https://arxiv.org/pdf/2002.04095v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-discourse-segmentation-an |
Repo | |
Framework | |
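A marker-list segmenter of the kind evaluated here splits text at occurrences of discourse connectives. A toy Python sketch with a deliberately tiny, illustrative French marker list; the evaluated systems additionally use POS tagging, which is omitted:

```python
import re

# Illustrative marker list; real systems use much larger inventories.
MARKERS = ["mais", "parce que", "cependant", "donc", "ensuite"]
pattern = re.compile(r"\b(" + "|".join(map(re.escape, MARKERS)) + r")\b",
                     flags=re.IGNORECASE)

def segment(text):
    """Return discourse segments, cutting before each marker occurrence."""
    cuts = [m.start() for m in pattern.finditer(text)]
    bounds = [0] + cuts + [len(text)]
    return [text[i:j].strip() for i, j in zip(bounds, bounds[1:])
            if text[i:j].strip()]

print(segment("Il pleuvait, mais nous sommes sortis parce que "
              "le match commençait."))
```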
Learning Accurate Integer Transformer Machine-Translation Models
Title | Learning Accurate Integer Transformer Machine-Translation Models |
Authors | Ephrem Wu |
Abstract | We describe a method for training accurate Transformer machine-translation models to run inference using 8-bit integer (INT8) hardware matrix multipliers, as opposed to the more costly single-precision floating-point (FP32) hardware. Unlike previous work, which converted only 85 Transformer matrix multiplications to INT8, leaving 48 out of 133 of them in FP32 because of unacceptable accuracy loss, we convert them all to INT8 without compromising accuracy. Tested on the newstest2014 English-to-German translation task, our INT8 Transformer Base and Transformer Big models yield BLEU scores that are 99.3% to 100% relative to those of the corresponding FP32 models. Our approach converts all matrix-multiplication tensors from an existing FP32 model into INT8 tensors by automatically making range-precision trade-offs during training. To demonstrate the robustness of this approach, we also include results from INT6 Transformer models. |
Tasks | Machine Translation |
Published | 2020-01-03 |
URL | https://arxiv.org/abs/2001.00926v1 |
https://arxiv.org/pdf/2001.00926v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-accurate-integer-transformer-machine |
Repo | |
Framework | |
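The essence of INT8 inference is to quantize both operands of each matrix multiplication, accumulate in integers, and rescale the result. A minimal numpy sketch with static per-tensor scales; the paper instead learns the range-precision trade-offs during training:

```python
import numpy as np

def quantize_int8(x, scale):
    """Symmetric linear quantization: round(x / scale), clipped to [-127, 127]."""
    return np.clip(np.rint(x / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
A, B = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))
sa, sb = np.abs(A).max() / 127, np.abs(B).max() / 127  # per-tensor scales

Aq, Bq = quantize_int8(A, sa), quantize_int8(B, sb)
C_int = Aq.astype(np.int32) @ Bq.astype(np.int32)  # integer accumulate
C = C_int * (sa * sb)                              # dequantize the product
print(np.abs(C - A @ B).max())                     # small quantization error
```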
Self-concordant analysis of Frank-Wolfe algorithms
Title | Self-concordant analysis of Frank-Wolfe algorithms |
Authors | Pavel Dvurechensky, Shimrit Shtern, Mathias Staudigl, Petr Ostroukhov, Kamil Safin |
Abstract | Projection-free optimization via different variants of the Frank-Wolfe (FW) method has become one of the cornerstones in optimization for machine learning since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k), k being the iteration counter. If the problem can be represented by a local linear minimization oracle, we are the first to propose a FW method with linear convergence rate without assuming neither strong convexity nor a Lipschitz continuous gradient. |
Tasks | Quantum State Tomography |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04320v2 |
https://arxiv.org/pdf/2002.04320v2.pdf | |
PWC | https://paperswithcode.com/paper/self-concordant-analysis-of-frank-wolfe |
Repo | |
Framework | |
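For reference, the generic Frank-Wolfe iteration the paper builds on looks as follows. A minimal sketch with the classic 2/(k+2) step size; the paper's contribution, an adaptive step derived from self-concordance, is not reproduced here:

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, steps=200):
    """Generic Frank-Wolfe loop: lmo(g) returns argmin over the feasible
    set of <g, s>; the paper's adaptive SC step would replace `gamma`."""
    x = x0
    for k in range(steps):
        s = lmo(grad(x))          # call the linear minimization oracle
        gamma = 2.0 / (k + 2.0)   # classic diminishing step size
        x = x + gamma * (s - x)   # convex combination stays feasible
    return x

# Example: minimize ||x - b||^2 over the probability simplex, whose LMO
# returns the vertex (basis vector) with the smallest gradient entry.
b = np.array([0.2, 0.9, -0.1])
x_star = frank_wolfe(lambda x: 2 * (x - b),
                     lambda g: np.eye(len(g))[np.argmin(g)],
                     np.ones(3) / 3)
```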