January 31, 2020

2975 words 14 mins read

Paper Group ANR 193

Continual Learning via Online Leverage Score Sampling. Parallel Iterative Edit Models for Local Sequence Transduction. Quantitative $W_1$ Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential and State-Dependent Noise. Approaching Small Molecule Prioritization as a Cross-Modal Information Retrieval Task through Coordinated Rep …

Continual Learning via Online Leverage Score Sampling

Title Continual Learning via Online Leverage Score Sampling
Authors Dan Teng, Sakyasingha Dasgupta
Abstract In order to mimic the human ability to continually acquire and transfer knowledge across various tasks, a learning system needs the capability for continual learning, effectively utilizing previously acquired skills. As such, the key challenge is to transfer and generalize the knowledge learned from one task to other tasks, avoiding forgetting and interference of previous knowledge and improving overall performance. In this paper, within the continual learning paradigm, we introduce a method that continuously forgets the less useful data samples and keeps beneficial information for training on subsequent tasks, in an online manner. The method uses statistical leverage score information to measure the importance of the data samples in every task and adopts the frequent directions approach to enable a continual or lifelong learning property. This effectively maintains a constant training size across all tasks. We first provide mathematical intuition for the method and then demonstrate its effectiveness in avoiding catastrophic forgetting, as well as its computational efficiency, on continual learning of classification tasks, compared with existing state-of-the-art techniques.
Tasks Continual Learning
Published 2019-08-01
URL https://arxiv.org/abs/1908.00355v1
PDF https://arxiv.org/pdf/1908.00355v1.pdf
PWC https://paperswithcode.com/paper/continual-learning-via-online-leverage-score
Repo
Framework
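
For readers unfamiliar with statistical leverage scores, the sketch below illustrates the core quantity the abstract relies on: the leverage score of each row of the data matrix, computed from a thin SVD, and importance sampling proportional to it. This is a minimal batch toy, not the paper's method; the actual approach is online and uses frequent directions to keep memory constant, which this version does not implement, and every function name and size here is invented for illustration.

```python
import numpy as np

def leverage_scores(X):
    """Statistical leverage scores of the rows of X (n x d, n >= d).

    The leverage score of row i is the squared Euclidean norm of the
    i-th row of U, where X = U @ diag(s) @ Vt is the thin SVD of X.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return np.sum(U ** 2, axis=1)

def sample_by_leverage(X, y, k, rng=None):
    """Keep k rows of (X, y), sampled with probability proportional to
    their leverage scores; a batch toy stand-in for the paper's online,
    frequent-directions-based variant."""
    rng = np.random.default_rng(rng)
    scores = leverage_scores(X)
    probs = scores / scores.sum()
    idx = rng.choice(len(X), size=k, replace=False, p=probs)
    return X[idx], y[idx]

# Toy usage: keep a constant-size training buffer across two "tasks".
rng = np.random.default_rng(0)
X1, y1 = rng.normal(size=(500, 20)), rng.integers(0, 2, 500)
X2, y2 = rng.normal(size=(500, 20)), rng.integers(0, 2, 500)
Xb, yb = sample_by_leverage(X1, y1, k=200)
Xb, yb = sample_by_leverage(np.vstack([Xb, X2]), np.concatenate([yb, y2]), k=200)
print(Xb.shape)  # (200, 20)
```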

Parallel Iterative Edit Models for Local Sequence Transduction

Title Parallel Iterative Edit Models for Local Sequence Transduction
Authors Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, Vihari Piratla
Abstract We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like Grammatical error correction (GEC). Recent approaches are based on the popular encoder-decoder (ED) model for sequence-to-sequence learning. The ED model auto-regressively captures full dependency among output tokens but is slow due to sequential decoding. The PIE model does parallel decoding, giving up the advantage of modelling full dependency in the output, yet it achieves accuracy competitive with the ED model for four reasons: 1. predicting edits instead of tokens, 2. labeling sequences instead of generating sequences, 3. iteratively refining predictions to capture dependencies, and 4. factorizing logits over edits and their token argument to harness pre-trained language models like BERT. Experiments on tasks spanning GEC, OCR correction and spell correction demonstrate that the PIE model is an accurate and significantly faster alternative for local sequence transduction.
Tasks Grammatical Error Correction, Optical Character Recognition
Published 2019-10-07
URL https://arxiv.org/abs/1910.02893v1
PDF https://arxiv.org/pdf/1910.02893v1.pdf
PWC https://paperswithcode.com/paper/parallel-iterative-edit-models-for-local
Repo
Framework
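
The key idea in the abstract is to label each source token with an edit rather than generate the output token by token. The sketch below applies one round of such edit labels to a sentence; the edit vocabulary shown is illustrative only (the paper's edit space and its BERT-based factorization are richer), and a real PIE model would predict these labels in parallel and iterate.

```python
def apply_edits(tokens, edits):
    """Apply per-token edit labels to a source sentence.

    Edit labels here are illustrative, not the paper's exact edit space:
      ("KEEP",), ("DELETE",), ("REPLACE", w), ("APPEND", w)
    """
    out = []
    for tok, edit in zip(tokens, edits):
        op = edit[0]
        if op == "KEEP":
            out.append(tok)
        elif op == "DELETE":
            pass
        elif op == "REPLACE":
            out.append(edit[1])
        elif op == "APPEND":
            out.extend([tok, edit[1]])
    return out

# One refinement pass: the model predicts these labels in parallel, and the
# corrected output can be fed back for further iterations.
src = "He go to school yesterday".split()
edits = [("KEEP",), ("REPLACE", "went"), ("KEEP",), ("KEEP",), ("KEEP",)]
print(" ".join(apply_edits(src, edits)))  # He went to school yesterday
```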

Quantitative $W_1$ Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential and State-Dependent Noise

Title Quantitative $W_1$ Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential and State-Dependent Noise
Authors Xiang Cheng, Dong Yin, Peter L. Bartlett, Michael I. Jordan
Abstract We prove quantitative convergence rates at which discrete Langevin-like processes converge to the invariant distribution of a related stochastic differential equation. We study the setup where the additive noise can be non-Gaussian and state-dependent and the potential function can be non-convex. We show that the key properties of these processes depend on the potential function and the second moment of the additive noise. We apply our theoretical findings to studying the convergence of Stochastic Gradient Descent (SGD) for non-convex problems and corroborate them with experiments using SGD to train deep neural networks on the CIFAR-10 dataset.
Tasks
Published 2019-07-07
URL https://arxiv.org/abs/1907.03215v3
PDF https://arxiv.org/pdf/1907.03215v3.pdf
PWC https://paperswithcode.com/paper/quantitative-w_1-convergence-of-langevin-like
Repo
Framework
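
As a rough illustration of the setting, the display below writes SGD as a discrete Langevin-like iteration with state-dependent gradient noise and pairs it with a commonly associated SDE; the symbols (step size \eta, potential f, noise covariance \Sigma) are generic and are not claimed to match the paper's exact notation or scaling.

```latex
% SGD with a state-dependent, possibly non-Gaussian gradient noise term:
% g_k(x) = \nabla f(x) + \zeta_k(x), with E[\zeta_k(x)] = 0 and Cov[\zeta_k(x)] = \Sigma(x).
x_{k+1} = x_k - \eta \nabla f(x_k) - \eta\, \zeta_k(x_k)

% A Langevin-like SDE commonly associated with this iteration; the paper bounds
% the Wasserstein-1 (W_1) distance between the discrete iterates and the
% invariant distribution of such a process.
dX_t = -\nabla f(X_t)\, dt + \sqrt{\eta}\, \Sigma(X_t)^{1/2}\, dW_t
```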

Approaching Small Molecule Prioritization as a Cross-Modal Information Retrieval Task through Coordinated Representation Learning

Title Approaching Small Molecule Prioritization as a Cross-Modal Information Retrieval Task through Coordinated Representation Learning
Authors Samuel G. Finlayson, Matthew B. A. McDermott, Alex V. Pickering, Scott L. Lipnick, William Yuan, Isaac S. Kohane
Abstract Modeling the relationship between chemical structure and molecular activity is a key task in drug development and precision medicine. In this paper, we utilize a novel deep learning architecture to jointly train coordinated embeddings of chemical structures and transcriptional signatures. We do so by training neural networks in a coordinated manner such that learned chemical representations correlate most highly with the encodings of the transcriptional patterns they induce. We then test this approach by using held-out gene expression signatures as queries into embedding space to recover their corresponding compounds. We evaluate these embeddings’ utility for small molecule prioritization on this new benchmark task. Our method outperforms a series of baselines, successfully generalizing to unseen transcriptional experiments, but still struggles to generalize to entirely unseen chemical structures.
Tasks Cross-Modal Information Retrieval, Information Retrieval, Representation Learning
Published 2019-11-22
URL https://arxiv.org/abs/1911.10241v1
PDF https://arxiv.org/pdf/1911.10241v1.pdf
PWC https://paperswithcode.com/paper/approaching-small-molecule-prioritization-as
Repo
Framework
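
The sketch below shows one way to coordinate two encoders so that a compound's embedding lands near the embedding of the transcriptional signature it induces, and retrieval becomes nearest-neighbour search in the shared space. The input dimensions, architectures, and the symmetric InfoNCE-style loss are assumptions for illustration; the paper's coordination objective and networks may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical input sizes: 2048-bit chemical fingerprints, 978-gene signatures.
chem_enc = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(), nn.Linear(512, 128))
expr_enc = nn.Sequential(nn.Linear(978, 512), nn.ReLU(), nn.Linear(512, 128))

def coordination_loss(chem, expr, temperature=0.1):
    """Symmetric InfoNCE-style loss pulling each compound embedding toward the
    embedding of the signature it induces (a generic coordinated-representation
    objective, not necessarily the paper's exact loss)."""
    c = F.normalize(chem_enc(chem), dim=-1)
    e = F.normalize(expr_enc(expr), dim=-1)
    logits = c @ e.t() / temperature        # pairwise cosine similarities
    targets = torch.arange(len(c))          # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Retrieval: embed a held-out expression signature and rank compounds by
# cosine similarity of their embeddings in the shared 128-d space.
```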

A Survey on Adversarial Information Retrieval on the Web

Title A Survey on Adversarial Information Retrieval on the Web
Authors Saad Farooq
Abstract This survey paper discusses different forms of malicious techniques that can affect how an information retrieval model retrieves documents for a query, along with remedies for them.
Tasks Information Retrieval
Published 2019-11-21
URL https://arxiv.org/abs/1911.11060v3
PDF https://arxiv.org/pdf/1911.11060v3.pdf
PWC https://paperswithcode.com/paper/a-survey-on-adversarial-information-retrieval
Repo
Framework

Example-Guided Scene Image Synthesis using Masked Spatial-Channel Attention and Patch-Based Self-Supervision

Title Example-Guided Scene Image Synthesis using Masked Spatial-Channel Attention and Patch-Based Self-Supervision
Authors Haitian Zheng, Haofu Liao, Lele Chen, Wei Xiong, Tianlang Chen, Jiebo Luo
Abstract Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplary image. In this task, the additional exemplary image provides style guidance that controls the appearance of the synthesized output. Despite this controllability advantage, previous models are designed for datasets with specific and roughly aligned objects. In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically unaligned with the given label map. To this end, we first propose a new Masked Spatial-Channel Attention (MSCA) module which models the correspondence between two unstructured scenes via cross-attention. Next, we propose an end-to-end network for joint global and local feature alignment and synthesis. In addition, we propose a novel patch-based self-supervision scheme to enable training. Experiments on the large-scale COCO-stuff dataset show significant improvements over existing methods. Moreover, our approach provides interpretability and can be readily extended to other tasks, including style and spatial interpolation or extrapolation, as well as other content manipulation.
Tasks Image Generation
Published 2019-11-27
URL https://arxiv.org/abs/1911.12362v1
PDF https://arxiv.org/pdf/1911.12362v1.pdf
PWC https://paperswithcode.com/paper/example-guided-scene-image-synthesis-using
Repo
Framework
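
At the heart of the MSCA module described above is cross-attention between two unstructured scenes. The snippet below implements only that generic core, aligning exemplar features to the query layout; the masking, channel attention, multi-scale design, and patch-based self-supervision are not reproduced, and the shapes are arbitrary.

```python
import torch
import torch.nn.functional as F

def cross_attention(query_feat, exemplar_feat):
    """Generic cross-attention between two feature maps.

    query_feat:    (B, C, H, W) features from the label-map branch
    exemplar_feat: (B, C, H, W) features from the exemplar image
    Returns exemplar features re-assembled onto the query layout. This is only
    the attention core; the paper's MSCA module adds masking and channel attention.
    """
    B, C, H, W = query_feat.shape
    q = query_feat.flatten(2).transpose(1, 2)      # (B, HW, C)
    k = exemplar_feat.flatten(2).transpose(1, 2)   # (B, HW, C)
    attn = F.softmax(q @ k.transpose(1, 2) / C ** 0.5, dim=-1)  # (B, HW, HW)
    out = attn @ k                                 # (B, HW, C)
    return out.transpose(1, 2).reshape(B, C, H, W)

x = torch.randn(1, 64, 32, 32)
y = torch.randn(1, 64, 32, 32)
print(cross_attention(x, y).shape)  # torch.Size([1, 64, 32, 32])
```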

Autoencoder-Based Incremental Class Learning without Retraining on Old Data

Title Autoencoder-Based Incremental Class Learning without Retraining on Old Data
Authors Euntae Choi, Kyungmi Lee, Kiyoung Choi
Abstract Incremental class learning, a scenario in the continual learning context where classes and their training data are observed sequentially and disjointly, faces the problem widely known as catastrophic forgetting. In this work, we propose a novel incremental class learning method that can significantly reduce memory overhead compared to previous approaches. Departing from the conventional softmax-based classification scheme, our model is based on an autoencoder that extracts prototypes for given inputs, so that no change in its output units is required. It stores only the mean of the prototypes per class to perform metric-based classification, unlike rehearsal approaches, which rely on a large memory or a generative model. To mitigate catastrophic forgetting, regularization methods are applied to our model when a new task is encountered. We evaluate our method by experimenting on CIFAR-100 and CUB-200-2011 and show that its performance is comparable to the state-of-the-art method with much lower additional memory cost.
Tasks Continual Learning
Published 2019-07-18
URL https://arxiv.org/abs/1907.07872v1
PDF https://arxiv.org/pdf/1907.07872v1.pdf
PWC https://paperswithcode.com/paper/autoencoder-based-incremental-class-learning
Repo
Framework
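
The classification side of the method is metric-based: only one mean prototype per class is stored, so adding a class never requires revisiting old data. The sketch below shows that mechanism with a stand-in encoder; the paper obtains prototypes from an autoencoder and adds regularization against forgetting, neither of which is modelled here.

```python
import numpy as np

class PrototypeClassifier:
    """Metric-based classifier that stores only one mean prototype per class,
    in the spirit of the scheme described above (the encoder here is a
    stand-in for the paper's autoencoder-based prototype extractor)."""

    def __init__(self, encoder):
        self.encoder = encoder          # maps raw inputs to feature vectors
        self.prototypes = {}            # class label -> mean feature vector

    def add_class(self, label, inputs):
        feats = self.encoder(inputs)
        self.prototypes[label] = feats.mean(axis=0)

    def predict(self, inputs):
        feats = self.encoder(inputs)
        labels = list(self.prototypes)
        protos = np.stack([self.prototypes[l] for l in labels])      # (K, D)
        dists = np.linalg.norm(feats[:, None, :] - protos[None], axis=-1)
        return [labels[i] for i in dists.argmin(axis=1)]

# Toy usage with an identity "encoder": classes are added one at a time
# without retraining on, or storing, earlier classes' raw data.
clf = PrototypeClassifier(encoder=lambda x: np.asarray(x, dtype=float))
clf.add_class("a", np.random.randn(50, 8) + 2.0)
clf.add_class("b", np.random.randn(50, 8) - 2.0)
print(clf.predict(np.random.randn(3, 8) + 2.0))
```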

Robust Principal Component Analysis for Background Estimation of Particle Image Velocimetry Data

Title Robust Principal Component Analysis for Background Estimation of Particle Image Velocimetry Data
Authors Ahmadreza Baghaie
Abstract Particle Image Velocimetry (PIV) data processing procedures are adversely affected by light reflections and backgrounds, as well as by defects in the models and sticky particles that occlude the inner walls of the boundaries. In this paper, a novel approach is proposed for decomposing PIV data into background/foreground components, greatly reducing the effects of such artifacts. This is achieved by applying Robust Principal Component Analysis (RPCA) to the data matrix generated by aggregating the vectorized PIV frames. It is assumed that the data matrix can be decomposed into two statistically different components: a low-rank component depicting the still background and a sparse component representing the moving particles within the imaged geometry. Formulating these assumptions as an optimization problem, the Augmented Lagrange Multiplier (ALM) method is used to decompose the data matrix into the low-rank and sparse components. Experiments and comparisons with the state of the art using several PIV image sequences reveal the superiority of the proposed approach for background removal of PIV data.
Tasks
Published 2019-08-16
URL https://arxiv.org/abs/1908.06047v1
PDF https://arxiv.org/pdf/1908.06047v1.pdf
PWC https://paperswithcode.com/paper/robust-principal-component-analysis-for
Repo
Framework
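
The decomposition described above is the classic principal component pursuit problem, minimizing ||L||_* + lambda ||S||_1 subject to D = L + S, solved with an augmented Lagrange multiplier scheme. The snippet below is a simplified textbook inexact-ALM loop (singular value thresholding for L, soft thresholding for S); the default parameters are conventional choices for illustration rather than values taken from the paper.

```python
import numpy as np

def soft_threshold(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_alm(D, lam=None, tol=1e-7, max_iter=500):
    """Robust PCA by principal component pursuit with an (inexact) augmented
    Lagrange multiplier scheme: D ~ L (low-rank background) + S (sparse particles).
    A simplified textbook version, not the paper's tuned implementation."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(D, 2)
    Y = D / max(norm2, np.abs(D).max() / lam)     # dual variable initialization
    mu, rho = 1.25 / norm2, 1.5
    S = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding.
        U, s, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: elementwise soft thresholding.
        S = soft_threshold(D - L + Y / mu, lam / mu)
        R = D - L - S
        Y += mu * R
        mu *= rho
        if np.linalg.norm(R) / np.linalg.norm(D) < tol:
            break
    return L, S

# Each PIV frame would be vectorized into a column of D before calling rpca_alm.
```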

Temporarily Unavailable: Memory Inhibition in Cognitive and Computer Science

Title Temporarily Unavailable: Memory Inhibition in Cognitive and Computer Science
Authors Tobias Tempel, Claudia Niederée, Christian Jilek, Andrea Ceroni, Heiko Maus, Yannick Runge, Christian Frings
Abstract Inhibition is one of the core concepts in Cognitive Psychology. The idea of inhibitory mechanisms actively weakening representations in the human mind has inspired a great number of studies in various research domains. In contrast, Computer Science has only recently begun to consider inhibition as a second basic processing quality beside activation. Here, we review psychological research on inhibition in memory and link the gained insights with current efforts in Computer Science to incorporate inhibitory principles for optimizing information retrieval in Personal Information Management. Four common aspects guide this review in both domains: 1. the purpose of inhibition to increase processing efficiency; 2. its relation to activation; 3. its links to contexts; 4. its temporariness. In summary, the concept of inhibition has already been used in Computer Science to enhance software in various ways. Yet, we also identify areas for promising future developments of inhibitory mechanisms, particularly context inhibition.
Tasks Information Retrieval
Published 2019-11-15
URL https://arxiv.org/abs/1912.00760v1
PDF https://arxiv.org/pdf/1912.00760v1.pdf
PWC https://paperswithcode.com/paper/temporarily-unavailable-memory-inhibition-in
Repo
Framework

Neural Turbo Equalization: Deep Learning for Fiber-Optic Nonlinearity Compensation

Title Neural Turbo Equalization: Deep Learning for Fiber-Optic Nonlinearity Compensation
Authors Toshiaki Koike-Akino, Ye Wang, David S. Millar, Keisuke Kojima, Kieran Parsons
Abstract Recently, data-driven approaches motivated by modern deep learning have been applied to optical communications in place of traditional model-based counterparts. The application of deep neural networks (DNN) allows flexible statistical analysis of complicated fiber-optic systems without relying on any specific physical model. Due to the inherent nonlinearity in DNNs, various DNN-based equalizers have shown significant potential to mitigate fiber nonlinearity. In this paper, we propose a turbo equalization (TEQ) based on DNNs as a new alternative framework to deal with nonlinear fiber impairments for future coherent optical communications. The proposed DNN-TEQ is constructed with nested deep residual networks (ResNet) to train extrinsic likelihoods given soft-information feedback from channel decoding. Through extrinsic information transfer (EXIT) analysis, we verify that our DNN-TEQ can accelerate decoding convergence, achieving a significant gain in achievable throughput of 0.61 b/s/Hz. We also demonstrate that optimizing irregular low-density parity-check (LDPC) codes to match the EXIT chart of the DNN-TEQ can improve achievable rates by up to 0.12 b/s/Hz.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.10131v1
PDF https://arxiv.org/pdf/1911.10131v1.pdf
PWC https://paperswithcode.com/paper/neural-turbo-equalization-deep-learning-for
Repo
Framework
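
To make the turbo-equalization structure concrete, the skeleton below shows the iterative exchange of soft information between a small neural equalizer and a decoder. Everything here is a placeholder: the network is a toy MLP rather than the paper's nested ResNets, the decoder is a dummy callable standing in for a soft-in/soft-out LDPC decoder, and the window size and LLR handling are simplified.

```python
import torch
import torch.nn as nn

class NeuralEqualizer(nn.Module):
    """Toy stand-in for a neural equalizer: maps a window of received samples
    plus the decoder's a-priori LLR to a refined bit LLR."""
    def __init__(self, window=11, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * window + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, rx_window, prior_llr):
        # rx_window: (B, 2*window) real/imag parts; prior_llr: (B, 1)
        return self.net(torch.cat([rx_window, prior_llr], dim=-1))

def turbo_iterations(equalizer, decoder, rx_windows, n_iter=3):
    """Generic turbo loop: equalizer and decoder exchange extrinsic LLRs.
    `decoder` is a hypothetical soft-in/soft-out channel decoder returning LLRs."""
    llr = torch.zeros(len(rx_windows), 1)          # start with no prior
    for _ in range(n_iter):
        eq_llr = equalizer(rx_windows, llr)        # equalize with current prior
        llr = decoder(eq_llr) - eq_llr             # feed back extrinsic part only
    return llr

# Toy usage with a placeholder decoder.
eq = NeuralEqualizer(window=11)
rx = torch.randn(32, 2 * 11)
dummy_decoder = lambda llr: 1.5 * llr
print(turbo_iterations(eq, dummy_decoder, rx).shape)  # torch.Size([32, 1])
```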

DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images

Title DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images
Authors Yi Zhou, Boyang Wang, Xiaodong He, Shanshan Cui, Fan Zhu, Li Liu, Ling Shao
Abstract Diabetic retinopathy (DR) is a complication of diabetes that severely affects the eyes. It can be graded into five levels of severity according to international protocol. However, optimizing a grading model for strong generalizability requires a large amount of balanced training data, which is difficult to collect, particularly for the high severity levels. Typical data augmentation methods, including random flipping and rotation, cannot generate data with high diversity. In this paper, we propose a diabetic retinopathy generative adversarial network (DR-GAN) to synthesize high-resolution fundus images which can be manipulated with arbitrary grading and lesion information. Thus, large-scale generated data can be used for more meaningful augmentation to train a DR grading and lesion segmentation model. The proposed retina generator is conditioned on structural and lesion masks, as well as on adaptive grading vectors sampled from the latent grading space, which can be adopted to control the synthesized grading severity. Moreover, a multi-scale spatial and channel attention module is devised to improve the ability to synthesize details. Multi-scale discriminators are designed to operate from large to small receptive fields, and joint adversarial losses are adopted to optimize the whole network in an end-to-end manner. With extensive experiments on the EyePACS dataset from Kaggle, as well as on our private dataset (SKA, to be released once we obtain official permission), we validate the effectiveness of our method, which can both synthesize highly realistic (1280 x 1280) controllable fundus images and contribute to the DR grading task.
Tasks Data Augmentation, Lesion Segmentation
Published 2019-12-10
URL https://arxiv.org/abs/1912.04670v1
PDF https://arxiv.org/pdf/1912.04670v1.pdf
PWC https://paperswithcode.com/paper/dr-gan-conditional-generative-adversarial
Repo
Framework
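
The generator described above is conditioned on structural/lesion masks plus a grading vector sampled from a latent grading space. The sketch below shows only that conditioning pattern at toy resolution; the layer sizes, channel counts, and upsampling path are invented for the example, and the paper's multi-scale attention modules and discriminators are omitted.

```python
import torch
import torch.nn as nn

class ConditionalRetinaGenerator(nn.Module):
    """Minimal conditional generator: masks enter as image-like channels, the
    grading vector and noise enter as a projected seed. Hypothetical sizes; the
    paper's DR-GAN is multi-scale with attention and works at 1280 x 1280."""
    def __init__(self, mask_channels=2, grade_dim=5, noise_dim=64, base=32):
        super().__init__()
        self.seed = nn.Linear(grade_dim + noise_dim, base * 8 * 8)
        self.net = nn.Sequential(
            nn.Conv2d(base + mask_channels, base, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(base, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, masks, grade, noise):
        # masks: (B, mask_channels, H, W); grade: (B, grade_dim); noise: (B, noise_dim)
        B = masks.size(0)
        seed = self.seed(torch.cat([grade, noise], dim=1)).view(B, -1, 8, 8)
        seed = nn.functional.interpolate(seed, size=masks.shape[-2:])
        return self.net(torch.cat([seed, masks], dim=1))

g = ConditionalRetinaGenerator()
img = g(torch.rand(1, 2, 16, 16), torch.rand(1, 5), torch.randn(1, 64))
print(img.shape)  # torch.Size([1, 3, 32, 32])
```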

Hierarchical Mixtures of Generators for Adversarial Learning

Title Hierarchical Mixtures of Generators for Adversarial Learning
Authors Alper Ahmetoğlu, Ethem Alpaydın
Abstract Generative adversarial networks (GANs) are deep neural networks that allow us to sample from an arbitrary probability distribution without explicitly estimating the distribution. There is a generator that takes a latent vector as input and transforms it into a valid sample from the distribution. There is also a discriminator that is trained to discriminate such fake samples from true samples of the distribution; at the same time, the generator is trained to generate fakes that the discriminator cannot tell apart from the true samples. Instead of learning a global generator, a recent approach involves training multiple generators, each responsible for one part of the distribution. In this work, we review such approaches and propose the hierarchical mixture of generators, inspired by the hierarchical mixture of experts model, which learns a tree structure implementing a hierarchical clustering with soft splits in the decision nodes and local generators in the leaves. Since the generators are combined softly, the whole model is continuous and can be trained using gradient-based optimization, just like the original GAN model. Our experiments on five image data sets, namely MNIST, FashionMNIST, UTZap50K, Oxford Flowers, and CelebA, show that our proposed model generates samples of high quality and diversity in terms of popular GAN evaluation metrics. The learned hierarchical structure also leads to knowledge extraction.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.02069v1
PDF https://arxiv.org/pdf/1911.02069v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-mixtures-of-generators-for
Repo
Framework
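
A minimal sketch of the hierarchical-mixture idea: internal nodes produce soft splits of the latent vector, leaves are small generators, and each leaf's output is weighted by the product of gate probabilities along its path, so the whole tree stays differentiable. The MLP leaves and toy dimensions are assumptions; the paper's leaves are image generators trained adversarially.

```python
import torch
import torch.nn as nn

class HierarchicalMixtureGenerator(nn.Module):
    """Depth-d soft binary tree over 2**d leaf generators. Each internal node
    produces a soft split probability from the latent vector; leaf outputs are
    blended by the product of gate probabilities along the path."""
    def __init__(self, latent_dim=16, out_dim=64, depth=2):
        super().__init__()
        self.depth = depth
        n_leaves = 2 ** depth
        self.gates = nn.ModuleList(
            [nn.Linear(latent_dim, 1) for _ in range(n_leaves - 1)])   # internal nodes
        self.leaves = nn.ModuleList(
            [nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))
             for _ in range(n_leaves)])

    def forward(self, z):
        # Path probability of each leaf = product of soft splits along its path.
        probs = torch.ones(z.size(0), 1)
        for level in range(self.depth):
            start = 2 ** level - 1          # heap-ordered complete binary tree
            g = torch.sigmoid(torch.cat(
                [self.gates[start + i](z) for i in range(2 ** level)], dim=1))
            probs = torch.stack([probs * g, probs * (1 - g)], dim=2).flatten(1)
        outs = torch.stack([leaf(z) for leaf in self.leaves], dim=1)   # (B, L, out_dim)
        return (probs.unsqueeze(-1) * outs).sum(dim=1)

hmog = HierarchicalMixtureGenerator()
print(hmog(torch.randn(4, 16)).shape)  # torch.Size([4, 64])
```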

Model reconstruction from temporal data for coupled oscillator networks

Title Model reconstruction from temporal data for coupled oscillator networks
Authors Mark J Panaggio, Maria-Veronica Ciocanel, Lauren Lazarus, Chad M Topaz, Bin Xu
Abstract In a complex system, the interactions between individual agents often lead to emergent collective behavior like spontaneous synchronization, swarming, and pattern formation. The topology of the network of interactions can have a dramatic influence over those dynamics. In many studies, researchers start with a specific model for both the intrinsic dynamics of each agent and the interaction network, and attempt to learn about the dynamics that can be observed in the model. Here we consider the inverse problem: given the dynamics of a system, can one learn about the underlying network? We investigate arbitrary networks of coupled phase-oscillators whose dynamics are characterized by synchronization. We demonstrate that, given sufficient observational data on the transient evolution of each oscillator, one can use machine learning methods to reconstruct the interaction network and simultaneously identify the parameters of a model for the intrinsic dynamics of the oscillators and their coupling.
Tasks
Published 2019-05-04
URL https://arxiv.org/abs/1905.01408v1
PDF https://arxiv.org/pdf/1905.01408v1.pdf
PWC https://paperswithcode.com/paper/model-reconstruction-from-temporal-data-for
Repo
Framework
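
For a phase-oscillator network of Kuramoto type, the reconstruction problem the abstract describes becomes linear in the unknown frequencies and coupling weights once transient phase trajectories are observed. The sketch below simulates such a network and recovers its parameters by plain least squares; this regression stand-in illustrates the inverse problem but is not the paper's specific machine learning pipeline.

```python
import numpy as np

def simulate_kuramoto(A, omega, dt=0.01, steps=2000, rng=None):
    """Simulate d(theta_i)/dt = omega_i + sum_j A_ij sin(theta_j - theta_i) with Euler steps."""
    rng = np.random.default_rng(rng)
    n = len(omega)
    theta = np.empty((steps, n))
    theta[0] = rng.uniform(0, 2 * np.pi, n)
    for t in range(steps - 1):
        diff = theta[t][None, :] - theta[t][:, None]        # theta_j - theta_i
        theta[t + 1] = theta[t] + dt * (omega + (A * np.sin(diff)).sum(axis=1))
    return theta

def reconstruct(theta, dt=0.01):
    """Recover natural frequencies and coupling weights by least squares:
    each d(theta_i)/dt is linear in (omega_i, A_i.) given the observed phases."""
    dtheta = np.diff(theta, axis=0) / dt                    # finite-difference velocities
    T, n = dtheta.shape
    omega_hat, A_hat = np.zeros(n), np.zeros((n, n))
    for i in range(n):
        feats = np.column_stack([np.ones(T), np.sin(theta[:-1] - theta[:-1, [i]])])
        coef, *_ = np.linalg.lstsq(feats, dtheta[:, i], rcond=None)
        omega_hat[i], A_hat[i] = coef[0], coef[1:]
    return omega_hat, A_hat

rng = np.random.default_rng(1)
A_true = (rng.random((5, 5)) < 0.4) * 0.3
np.fill_diagonal(A_true, 0.0)
omega_true = rng.uniform(0.5, 1.5, 5)
theta = simulate_kuramoto(A_true, omega_true, rng=2)
omega_hat, A_hat = reconstruct(theta)
print(np.round(A_hat, 2))
```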

Link Prediction via Graph Attention Network

Title Link Prediction via Graph Attention Network
Authors Weiwei Gu, Fei Gao, Xiaodan Lou, Jiang Zhang
Abstract Link prediction aims to infer missing links or predict future ones based on currently observed partial networks; it is a fundamental problem in network science with tremendous real-world applications. However, conventional link prediction approaches neither achieve high prediction accuracy nor are capable of revealing the hidden information behind links. To address this problem, we generalize the latest techniques in deep learning on graphs and present a new link prediction model - DeepLinker. Instead of learning node representations from node label information, DeepLinker uses the links as supervised information. Experiments on five graphs show that DeepLinker can not only achieve state-of-the-art link prediction accuracy, but also acquire efficient node representations and node centrality rankings as byproducts. Although the representations are obtained without any supervised node label information, they still perform well on node ranking and node classification tasks.
Tasks Information Retrieval, Language Modelling, Link Prediction, Node Classification
Published 2019-10-10
URL https://arxiv.org/abs/1910.04807v3
PDF https://arxiv.org/pdf/1910.04807v3.pdf
PWC https://paperswithcode.com/paper/link-prediction-via-deep-learning
Repo
Framework
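
The sketch below illustrates "links as supervised information": candidate edges are scored from node embeddings and trained against observed edges with negative sampling. A plain embedding table stands in for DeepLinker's graph attention encoder, and the edge list, sizes, and training loop are invented for the example.

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """Score a candidate edge (u, v) from node embeddings. DeepLinker computes
    node representations with graph attention layers; a plain embedding table
    stands in for that encoder here."""
    def __init__(self, n_nodes, dim=32):
        super().__init__()
        self.emb = nn.Embedding(n_nodes, dim)

    def forward(self, u, v):
        return (self.emb(u) * self.emb(v)).sum(dim=-1)   # dot-product link score

# Training skeleton: observed edges are positives, random pairs are negatives.
n_nodes = 100
model = EdgeScorer(n_nodes)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
pos = torch.randint(0, n_nodes, (256, 2))                # placeholder edge list
for _ in range(10):
    neg = torch.randint(0, n_nodes, pos.shape)           # negative sampling
    scores = torch.cat([model(pos[:, 0], pos[:, 1]), model(neg[:, 0], neg[:, 1])])
    labels = torch.cat([torch.ones(len(pos)), torch.zeros(len(neg))])
    loss = nn.functional.binary_cross_entropy_with_logits(scores, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```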

Learning Real Estate Automated Valuation Models from Heterogeneous Data Sources

Title Learning Real Estate Automated Valuation Models from Heterogeneous Data Sources
Authors Francesco Bergadano, Roberto Bertilone, Daniela Paolotti, Giancarlo Ruffo
Abstract Real estate appraisal is a complex and important task that can be made more precise and faster with the help of automated valuation tools. Usually the value of a property is determined by taking into account both structural and geographical characteristics. However, while geographical information is easily found, obtaining significant structural information requires the intervention of a real estate expert, a professional appraiser. In this paper we propose a Web data acquisition methodology and a Machine Learning model that can be used to automatically evaluate real estate properties. This method uses data from previous appraisal documents, from the advertised prices of similar properties found via Web crawling, and from open data describing the characteristics of the corresponding geographical area. We describe a case study, applicable to the whole Italian territory and initially trained on a data set of individual homes located in the city of Turin, and analyze its predictive performance and practical applicability.
Tasks
Published 2019-09-02
URL https://arxiv.org/abs/1909.00704v1
PDF https://arxiv.org/pdf/1909.00704v1.pdf
PWC https://paperswithcode.com/paper/learning-real-estate-automated-valuation
Repo
Framework
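
As a rough sketch of an automated valuation model over heterogeneous sources, the pipeline below mixes numeric features (e.g. from appraisal documents and crawled listings) with categorical open-data features and fits a gradient-boosted regressor. All column names are hypothetical; the paper's actual feature set, model choice, and Turin case-study data are not reproduced.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical feature names merging the three sources described above:
# appraisal documents, crawled listings of similar properties, and open geo data.
numeric = ["floor_area_m2", "rooms", "listing_price_per_m2", "area_income_index"]
categorical = ["energy_class", "neighbourhood"]

model = Pipeline([
    ("prep", ColumnTransformer([
        ("num", "passthrough", numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ])),
    ("reg", GradientBoostingRegressor()),
])

# df would be a DataFrame with one row per appraised property and
# `appraised_value` as the target:
# model.fit(df[numeric + categorical], df["appraised_value"])
# predictions = model.predict(new_properties[numeric + categorical])
```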