April 2, 2020

Paper Group ANR 148

Image denoising via K-SVD with primal-dual active set algorithm

Title Image denoising via K-SVD with primal-dual active set algorithm
Authors Quan Xiao, Canhong Wen, Zirui Yan
Abstract The K-SVD algorithm has been successfully applied to image denoising for many years, but its bottleneck in speed and accuracy still needs attention. For the sparse coding stage in K-SVD, which involves an $\ell_{0}$ constraint, prevailing methods usually seek approximate solutions greedily but are less effective once the noise level is high. The alternative $\ell_{1}$ optimization has proved to be more powerful than $\ell_{0}$; however, its time consumption prevents it from being implemented. In this paper, we propose a new K-SVD framework called K-SVD$_P$ by applying the primal-dual active set (PDAS) algorithm to it. Unlike K-SVD based on greedy algorithms, the K-SVD$_P$ algorithm develops a selection strategy motivated by the KKT (Karush-Kuhn-Tucker) conditions and yields an efficient update in the sparse coding stage. Since the K-SVD$_P$ algorithm iteratively seeks an equivalent solution to the dual problem with a simple explicit expression in this denoising problem, speed and quality of denoising can be achieved simultaneously. Experiments demonstrate that the denoising performance of K-SVD$_P$ is comparable with state-of-the-art methods.
Tasks Denoising, Image Denoising
Published 2020-01-19
URL https://arxiv.org/abs/2001.06780v1
PDF https://arxiv.org/pdf/2001.06780v1.pdf
PWC https://paperswithcode.com/paper/image-denoising-via-k-svd-with-primal-dual

XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning

Title XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning
Authors Sung Whan Yoon, Do-Yeon Kim, Jun Seo, Jaekyun Moon
Abstract Learning novel concepts while preserving prior knowledge is a long-standing challenge in machine learning. The challenge gets greater when a novel task is given with only a few labeled examples, a problem known as incremental few-shot learning. We propose XtarNet, which learns to extract task-adaptive representation (TAR) for facilitating incremental few-shot learning. The method utilizes a backbone network pretrained on a set of base categories while also employing additional modules that are meta-trained across episodes. Given a new task, the novel feature extracted from the meta-trained modules is mixed with the base feature obtained from the pretrained model. The process of combining two different features provides TAR and is also controlled by meta-trained modules. The TAR contains effective information for classifying both novel and base categories. The base and novel classifiers quickly adapt to a given task by utilizing the TAR. Experiments on standard image datasets indicate that XtarNet achieves state-of-the-art incremental few-shot learning performance. The concept of TAR can also be used in conjunction with existing incremental few-shot learning methods; extensive simulation results in fact show that applying TAR enhances the known methods significantly.
Tasks Few-Shot Learning
Published 2020-03-19
URL https://arxiv.org/abs/2003.08561v1
PDF https://arxiv.org/pdf/2003.08561v1.pdf
PWC https://paperswithcode.com/paper/xtarnet-learning-to-extract-task-adaptive

Joint Reasoning for Multi-Faceted Commonsense Knowledge

Title Joint Reasoning for Multi-Faceted Commonsense Knowledge
Authors Yohan Chalier, Simon Razniewski, Gerhard Weikum
Abstract Commonsense knowledge (CSK) supports a variety of AI applications, from visual understanding to chatbots. Prior works on acquiring CSK, such as ConceptNet, have compiled statements that associate concepts, like everyday objects or activities, with properties that hold for most or some instances of the concept. Each concept is treated in isolation from other concepts, and the only quantitative measure (or ranking) of properties is a confidence score that the statement is valid. This paper aims to overcome these limitations by introducing a multi-faceted model of CSK statements and methods for joint reasoning over sets of inter-related statements. Our model captures four different dimensions of CSK statements: plausibility, typicality, remarkability and salience, with scoring and ranking along each dimension. For example, hyenas drinking water is typical but not salient, whereas hyenas eating carcasses is salient. For reasoning and ranking, we develop a method with soft constraints to couple the inference over concepts that are related in a taxonomic hierarchy. The reasoning is cast as an integer linear program (ILP), and we leverage the theory of reduction costs of a relaxed LP to compute informative rankings. This methodology is applied to several large CSK collections. Our evaluation shows that we can consolidate these inputs into much cleaner and more expressive knowledge. Results are available at https://dice.mpi-inf.mpg.de.
Published 2020-01-13
URL https://arxiv.org/abs/2001.04170v1
PDF https://arxiv.org/pdf/2001.04170v1.pdf
PWC https://paperswithcode.com/paper/joint-reasoning-for-multi-faceted-commonsense

Inflammatory Bowel Disease Biomarkers of Human Gut Microbiota Selected via Ensemble Feature Selection Methods

Title Inflammatory Bowel Disease Biomarkers of Human Gut Microbiota Selected via Ensemble Feature Selection Methods
Authors Hilal Hacilar, O. Ufuk Nalbantoglu, Oya Aran, Burcu Bakir-Gungor
Abstract The tremendous boost in next-generation sequencing and in omics technologies makes it possible to characterize the human gut microbiome (the collective genomes of the microbial community that resides in our gastrointestinal tract). While some of these microorganisms are considered essential regulators of our immune system, others can cause diseases such as Inflammatory Bowel Diseases (IBD), diabetes, and cancer. IBD is a gut-related disorder in which deviations from the healthy gut microbiome are considered to be associated with the disease. Although existing studies attempt to unveil the composition of the gut microbiome in relation to IBD, a comprehensive picture is far from complete. Due to the complexity of metagenomic studies, applications of state-of-the-art machine learning techniques have become popular for addressing a wide range of questions in the field of metagenomic data analysis. In this regard, using an IBD-associated metagenomics dataset, this study utilizes both supervised and unsupervised machine learning algorithms i) to generate a classification model that aids IBD diagnosis, ii) to discover IBD-associated biomarkers, and iii) to find subgroups of IBD patients using k-means and hierarchical clustering. To deal with the high dimensionality of features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), min redundancy max relevance (mRMR) and Extreme Gradient Boosting (XGBoost). In our experiments with 10-fold cross-validation, XGBoost had a considerable effect in terms of minimizing the microbiota used for the diagnosis of IBD and thus reducing the cost and time. We observed that, compared to the single classifiers, ensemble methods such as kNN and logitboost resulted in better performance measures for the classification of IBD.
Tasks Feature Selection
Published 2020-01-08
URL https://arxiv.org/abs/2001.03019v1
PDF https://arxiv.org/pdf/2001.03019v1.pdf
PWC https://paperswithcode.com/paper/inflammatory-bowel-disease-biomarkers-of

Training a U-Net based on a random mode-coupling matrix model to recover acoustic interference striations

Title Training a U-Net based on a random mode-coupling matrix model to recover acoustic interference striations
Authors Xiaolei Li, Wenhua Song, Dazhi Gao, Wei Gao, Haozhong Wan
Abstract A U-Net is trained to recover acoustic interference striations (AISs) from distorted ones. A random mode-coupling matrix model is introduced to generate a large number of training data quickly, which are used to train the U-Net. The performance of AIS recovery of the U-Net is tested in range-dependent waveguides with nonlinear internal waves (NLIWs). Although the random mode-coupling matrix model is not an accurate physical model, the test results show that the U-Net successfully recovers AISs under different signal-to-noise ratios (SNRs) and different amplitudes and widths of NLIWs for different shapes.
Published 2020-03-24
URL https://arxiv.org/abs/2003.10661v1
PDF https://arxiv.org/pdf/2003.10661v1.pdf
PWC https://paperswithcode.com/paper/training-a-u-net-based-on-a-random-mode

Depth Edge Guided CNNs for Sparse Depth Upsampling

Title Depth Edge Guided CNNs for Sparse Depth Upsampling
Authors Yi Guo, Ji Liu
Abstract Guided sparse depth upsampling aims to upsample an irregularly sampled sparse depth map when an aligned high-resolution color image is given as guidance. Many neural networks have been designed for this task. However, they often ignore the structural difference between the depth and the color image, resulting in obvious artifacts such as texture copy and depth blur in the upsampled depth. Inspired by the normalized convolution operation, we propose a guided convolutional layer to recover dense depth from a sparse and irregular depth image with a depth edge image as guidance. Our novel guided network can prevent the depth value from crossing the depth edge to facilitate upsampling. We further design a convolution network based on the proposed convolutional layer to combine the advantages of different algorithms and achieve better performance. We conduct comprehensive experiments to verify our method on real-world indoor and synthetic outdoor datasets. Our method produces strong results. It outperforms state-of-the-art methods on the Virtual KITTI dataset and the Middlebury dataset. It also presents strong generalization capability under different 3D point densities and various lighting and weather conditions.
Published 2020-03-23
URL https://arxiv.org/abs/2003.10138v1
PDF https://arxiv.org/pdf/2003.10138v1.pdf
PWC https://paperswithcode.com/paper/depth-edge-guided-cnns-for-sparse-depth
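
The abstract above names the normalized convolution operation as its inspiration. A minimal 1-D sketch of plain normalized convolution (not the paper's guided layer) is given here, assuming NumPy; the function name `normalized_convolution`, the box kernel, and the toy depth values are all hypothetical choices made for illustration. The idea is to convolve the masked samples and divide by the convolved mask, so that missing samples do not drag the average toward zero.

```python
import numpy as np

def normalized_convolution(values, mask, kernel, eps=1e-8):
    """Fill in sparse samples as conv(values * mask) / conv(mask)."""
    values = np.asarray(values, dtype=float)
    mask = np.asarray(mask, dtype=float)
    # Numerator: kernel-weighted sum of the observed samples only.
    weighted = np.convolve(values * mask, kernel, mode="same")
    # Denominator: total kernel weight that fell on observed samples.
    support = np.convolve(mask, kernel, mode="same")
    out = np.zeros_like(values)
    valid = support > eps  # positions with at least one observed neighbour
    out[valid] = weighted[valid] / support[valid]
    return out, valid

# Toy example: a constant depth of 2.0 observed at every third position.
mask = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0], dtype=float)
sparse_depth = np.where(mask > 0, 2.0, 0.0)  # unobserved entries are ignored
kernel = np.ones(5)                          # hypothetical box kernel
dense_depth, valid = normalized_convolution(sparse_depth, mask, kernel)
```

In this toy case every position has an observed neighbour within the kernel window, so the constant depth is recovered everywhere. A guided variant would additionally modulate the weights so that values do not leak across depth edges, which this sketch does not attempt.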

Stochastic Latent Residual Video Prediction

Title Stochastic Latent Residual Video Prediction
Authors Jean-Yves Franceschi, Edouard Delasalles, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari
Abstract Designing video prediction models that account for the inherent uncertainty of the future is challenging. Most works in the literature are based on stochastic image-autoregressive recurrent networks, which raises several performance and applicability issues. An alternative is to use fully latent temporal models which untie frame synthesis and temporal dynamics. However, no such model for stochastic video prediction has been proposed in the literature yet, due to design and training difficulties. In this paper, we overcome these difficulties by introducing a novel stochastic temporal model whose dynamics are governed in a latent space by a residual update rule. This first-order scheme is motivated by discretization schemes of differential equations. It naturally models video dynamics as it allows our simpler, more interpretable, latent model to outperform prior state-of-the-art methods on challenging datasets.
Tasks Video Prediction
Published 2020-02-21
URL https://arxiv.org/abs/2002.09219v1
PDF https://arxiv.org/pdf/2002.09219v1.pdf
PWC https://paperswithcode.com/paper/stochastic-latent-residual-video-prediction-1
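
The residual update rule mentioned in the abstract can be illustrated with a small sketch, assuming NumPy. It shows a generic first-order (Euler-like) recursion of the form `y[t+1] = y[t] + f(y[t], z[t+1])`; the names `rollout` and `toy_step`, the step size, and the linear dynamics are hypothetical stand-ins, not the authors' learned model.

```python
import numpy as np

def rollout(y0, step_fn, noises):
    """Unroll a residual latent recursion: y_next = y + step_fn(y, z)."""
    states = [np.asarray(y0, dtype=float)]
    for z in noises:
        y = states[-1]
        states.append(y + step_fn(y, z))  # residual (first-order) update
    return np.stack(states)

def toy_step(y, z, dt=0.1):
    # Hypothetical residual: one Euler step of dy/dt = -y, plus a noise term z.
    return dt * (-y) + z

rng = np.random.default_rng(0)
noises = 0.01 * rng.standard_normal((20, 2))  # stand-in for sampled latent noise
trajectory = rollout(np.array([1.0, -1.0]), toy_step, noises)

# With the noise switched off, the recursion reduces to plain Euler integration.
deterministic = rollout(np.array([1.0]), toy_step, np.zeros((3, 1)))
```

In a learned model, `step_fn` would be a neural network and each latent state would be decoded into a frame separately, which is what keeps frame synthesis untied from the temporal dynamics.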

Nonlinear Time Series Classification Using Bispectrum-based Deep Convolutional Neural Networks

Title Nonlinear Time Series Classification Using Bispectrum-based Deep Convolutional Neural Networks
Authors Paul A. Parker, Scott H. Holan, Nalini Ravishanker
Abstract Time series classification using novel techniques has experienced a recent resurgence and growing interest from statisticians, subject-domain scientists, and decision makers in business and industry. This is primarily due to the ever-increasing amount of big and complex data produced as a result of technological advances. A motivating example is that of Google Trends data, which exhibit highly nonlinear behavior. Although a rich literature exists for addressing this problem, existing approaches mostly rely on first- and second-order properties of the time series, since they typically assume linearity of the underlying process. Often, these are inadequate for effective classification of nonlinear time series data such as Google Trends data. Given these methodological deficiencies and the abundance of nonlinear time series that persist among real-world phenomena, we introduce an approach that merges higher order spectral analysis (HOSA) with deep convolutional neural networks (CNNs) for classifying time series. The effectiveness of our approach is illustrated using simulated data and two motivating industry examples that involve Google Trends data and electronic device energy consumption data.
Tasks Time Series, Time Series Classification
Published 2020-03-04
URL https://arxiv.org/abs/2003.02353v1
PDF https://arxiv.org/pdf/2003.02353v1.pdf
PWC https://paperswithcode.com/paper/nonlinear-time-series-classification-using

Bayesian Nonparametric Space Partitions: A Survey

Title Bayesian Nonparametric Space Partitions: A Survey
Authors Xuhui Fan, Bin Li, Ling Luo, Scott A. Sisson
Abstract Bayesian nonparametric space partition (BNSP) models provide a variety of strategies for partitioning a $D$-dimensional space into a set of blocks. In this way, data points lying in the same block share certain kinds of homogeneity. BNSP models can be applied to various areas, such as regression/classification trees, random feature construction, relational modeling, etc. In this survey, we investigate the current progress of BNSP research through the following three perspectives: models, which review various strategies for generating the partitions in the space and discuss their theoretical foundation, 'self-consistency'; applications, which cover the current mainstream usages of BNSP models and their potential future practises; and challenges, which identify the current unsolved problems and valuable future research topics. As there has been no comprehensive review of the BNSP literature before, we hope that this survey can induce further exploration and exploitation on this topic.
Published 2020-02-26
URL https://arxiv.org/abs/2002.11394v1
PDF https://arxiv.org/pdf/2002.11394v1.pdf
PWC https://paperswithcode.com/paper/bayesian-nonparametric-space-partitions-a

Toward Generalized Clustering through an One-Dimensional Approach

Title Toward Generalized Clustering through an One-Dimensional Approach
Authors Luciano da F. Costa
Abstract After generalizing the concept of clusters to incorporate clusters that are linked to other clusters through some relatively narrow bridges, an approach for detecting patches of separation between these clusters is developed based on an agglomerative clustering, more specifically the single-linkage, applied to one-dimensional slices obtained from respective feature spaces. The potential of this method is illustrated with respect to the analyses of clusterless uniform and normal distributions of points, as well as a one-dimensional clustering model characterized by two intervals with high density of points separated by a less dense interstice. This partial clustering method is then considered as a means of feature selection and cluster identification, and two simple but potentially effective respective methods are described and illustrated with respect to some hypothetical situations.
Tasks Feature Selection
Published 2020-01-01
URL https://arxiv.org/abs/2001.02741v1
PDF https://arxiv.org/pdf/2001.02741v1.pdf
PWC https://paperswithcode.com/paper/toward-generalized-clustering-through-an-one
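
The one-dimensional single-linkage step described in the abstract has a particularly simple form: for points on a line, cutting a single-linkage hierarchy at a distance threshold is equivalent to sorting the points and splitting wherever the gap between neighbours exceeds that threshold. The sketch below illustrates only this fact; the function name `single_linkage_1d`, the `max_gap` parameter, and the sample values are hypothetical and not taken from the paper.

```python
def single_linkage_1d(points, max_gap):
    """Single-linkage clusters of 1-D points at distance threshold max_gap."""
    if not points:
        return []
    ordered = sorted(points)
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            clusters.append([cur])    # gap too wide: a separation starts here
        else:
            clusters[-1].append(cur)  # close enough: same cluster as neighbour
    return clusters

# Hypothetical slice of a feature space projected onto one axis.
clusters = single_linkage_1d([2.1, 0.1, 5.0, 0.35, 2.0, 0.2], max_gap=0.5)
```

The wide gaps found this way play the role of the separation patches between clusters; how slices are chosen and how results from different slices are combined is left to the paper.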

Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs

Title Communication Contention Aware Scheduling of Multiple Deep Learning Training Jobs
Authors Qiang Wang, Shaohuai Shi, Canhui Wang, Xiaowen Chu
Abstract Distributed Deep Learning (DDL) has rapidly grown in popularity since it helps boost the training performance on high-performance GPU clusters. Efficient job scheduling is indispensable to maximize the overall performance of the cluster when training multiple jobs simultaneously. However, existing schedulers do not consider the communication contention of multiple communication tasks from different distributed training jobs, which could deteriorate the system performance and prolong the job completion time. In this paper, we first establish a new DDL job scheduling framework which organizes DDL jobs as Directed Acyclic Graphs (DAGs) and considers communication contention between nodes. We then propose an efficient algorithm, LWF-$\kappa$, to balance the GPU utilization and consolidate the allocated GPUs for each job. When scheduling those communication tasks, we observe that neither avoiding all the contention nor blindly accepting it is optimal for minimizing the job completion time. We thus propose a provable algorithm, AdaDUAL, to efficiently schedule those communication tasks. Based on AdaDUAL, we finally propose Ada-SRSF for the DDL job scheduling problem. Simulations on a 64-GPU cluster connected with 10 Gbps Ethernet show that LWF-$\kappa$ achieves up to $1.59\times$ improvement over the classical first-fit algorithms. More importantly, Ada-SRSF reduces the average job completion time by $20.1\%$ and $36.7\%$, as compared to the SRSF(1) scheme (avoiding all the contention) and the SRSF(2) scheme (blindly accepting all of two-way communication contention) respectively.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10105v1
PDF https://arxiv.org/pdf/2002.10105v1.pdf
PWC https://paperswithcode.com/paper/communication-contention-aware-scheduling-of

Reinforcement Learning with Probabilistically Complete Exploration

Title Reinforcement Learning with Probabilistically Complete Exploration
Authors Philippe Morere, Gilad Francis, Tom Blau, Fabio Ramos
Abstract Balancing exploration and exploitation remains a key challenge in reinforcement learning (RL). State-of-the-art RL algorithms suffer from high sample complexity, particularly in the sparse reward case, where they can do no better than to explore in all directions until the first positive rewards are found. To mitigate this, we propose Rapidly Randomly-exploring Reinforcement Learning (R3L). We formulate exploration as a search problem and leverage widely-used planning algorithms such as Rapidly-exploring Random Tree (RRT) to find initial solutions. These solutions are used as demonstrations to initialize a policy, then refined by a generic RL algorithm, leading to faster and more stable convergence. We provide theoretical guarantees of R3L exploration finding successful solutions, as well as bounds for its sampling complexity. We experimentally demonstrate the method outperforms classic and intrinsic exploration techniques, requiring only a fraction of exploration samples and achieving better asymptotic performance.
Published 2020-01-20
URL https://arxiv.org/abs/2001.06940v1
PDF https://arxiv.org/pdf/2001.06940v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-with-probabilistically-1

Model-Driven Beamforming Neural Networks

Title Model-Driven Beamforming Neural Networks
Authors Wenchao Xia, Gan Zheng, Kai-Kit Wong, Hongbo Zhu
Abstract Beamforming is evidently a core technology in recent generations of mobile communication networks. Nevertheless, an iterative process is typically required to optimize the parameters, making it ill-placed for real-time implementation due to high complexity and computational delay. Heuristic solutions such as zero-forcing (ZF) are simpler but at the expense of performance loss. Alternatively, deep learning (DL) is well understood to be a generalizing technique that can deliver promising results for a wide range of applications at much lower complexity if it is sufficiently trained. As a consequence, DL may present itself as an attractive solution to beamforming. To exploit DL, this article introduces general data- and model-driven beamforming neural networks (BNNs), presents various possible learning strategies, and also discusses complexity reduction for the DL-based BNNs. We also offer enhancement methods such as training-set augmentation and transfer learning in order to improve the generality of BNNs, accompanied by computer simulation results and testbed results showing the performance of such BNN solutions.
Tasks Transfer Learning
Published 2020-01-15
URL https://arxiv.org/abs/2001.05277v1
PDF https://arxiv.org/pdf/2001.05277v1.pdf
PWC https://paperswithcode.com/paper/model-driven-beamforming-neural-networks

Dual Stochastic Natural Gradient Descent

Title Dual Stochastic Natural Gradient Descent
Authors Borja Sánchez-López, Jesús Cerquides
Abstract Although theoretically appealing, Stochastic Natural Gradient Descent (SNGD) is computationally expensive, has been shown to be highly sensitive to the learning rate, and is not guaranteed to converge. Convergent Stochastic Natural Gradient Descent (CSNGD) aims at solving the last two problems. However, the computational expense of CSNGD is still unacceptable when the number of parameters is large. In this paper we introduce Dual Stochastic Natural Gradient Descent (DSNGD), where we take advantage of dually flat manifolds to obtain a robust alternative to SNGD which is also computationally feasible.
Published 2020-01-19
URL https://arxiv.org/abs/2001.06744v1
PDF https://arxiv.org/pdf/2001.06744v1.pdf
PWC https://paperswithcode.com/paper/dual-stochastic-natural-gradient-descent

Explainable Deep Modeling of Tabular Data using TableGraphNet

Title Explainable Deep Modeling of Tabular Data using TableGraphNet
Authors Gabriel Terejanu, Jawad Chowdhury, Rezaur Rashid, Asif Chowdhury
Abstract The vast majority of research on explainability focuses on post-explainability rather than explainable modeling. Namely, an explanation model is derived to explain a complex black box model built with the sole purpose of achieving the highest performance possible. In part, this trend might be driven by the misconception that there is a trade-off between explainability and accuracy. Furthermore, the consequential work on Shapley values, grounded in game theory, has also contributed to a new wave of post-explainability research on better approximations for various machine learning models, including deep learning models. We propose a new architecture that inherently produces explainable predictions in the form of additive feature attributions. Our approach learns a graph representation for each record in the dataset. Attribute-centric features are then derived from the graph and fed into a contribution deep set model to produce the final predictions. We show that our explainable model attains the same level of performance as black box models. Finally, we provide an augmented model training approach that leverages the missingness property and yields high levels of consistency (as required for the Shapley values) without loss of accuracy.
Published 2020-02-12
URL https://arxiv.org/abs/2002.05205v1
PDF https://arxiv.org/pdf/2002.05205v1.pdf
PWC https://paperswithcode.com/paper/explainable-deep-modeling-of-tabular-data
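
To make the notion of additive feature attributions concrete, here is a minimal sketch that computes exact Shapley values for a tiny model by enumerating feature subsets. It is not the TableGraphNet architecture; the toy model, the zero baseline used for absent features, and the names `shapley_values` and `toy_value` are assumptions made for illustration. The point is that the attributions plus the baseline prediction add up to the model output.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding feature i.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical record and model: absent features are replaced by a baseline of 0.
x = [1.0, 2.0, 3.0]

def toy_value(present):
    v = [x[j] if j in present else 0.0 for j in range(len(x))]
    return 2.0 * v[0] + v[1] * v[2]  # one additive term, one interaction term

phi = shapley_values(toy_value, len(x))
base = toy_value(frozenset())
prediction = toy_value(frozenset(range(len(x))))
```

Here the additive term is credited entirely to the first feature, and the interaction term is split evenly between the other two. Exact enumeration is exponential in the number of features, which is why approximations, or models that are additive by construction, are of interest.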