October 16, 2019

3239 words 16 mins read

Paper Group ANR 1095

A Practical Algorithm for Distributed Clustering and Outlier Detection. Uniform Convergence of Gradients for Non-Convex Learning and Optimization. Vendor-independent soft tissue lesion detection using weakly supervised and unsupervised adversarial domain adaptation. A Tensor Factorization Method for 3D Super-Resolution with Application to Dental CT …

A Practical Algorithm for Distributed Clustering and Outlier Detection

Title A Practical Algorithm for Distributed Clustering and Outlier Detection
Authors Jiecao Chen, Erfan Sadeqi Azer, Qin Zhang
Abstract We study the classic $k$-means/median clustering problems, which are fundamental in unsupervised learning, in the setting where data are partitioned across multiple sites and where we are allowed to discard a small portion of the data by labeling them as outliers. We propose a simple approach based on constructing a small summary of the original dataset. The proposed method is time- and communication-efficient, has good approximation guarantees, and can identify the global outliers effectively. To the best of our knowledge, this is the first practical algorithm with theoretical guarantees for distributed clustering with outliers. Our experiments on both real and synthetic data demonstrate the clear superiority of our algorithm over all the baseline algorithms in almost all metrics.
Tasks Outlier Detection
Published 2018-05-24
URL http://arxiv.org/abs/1805.09495v2
PDF http://arxiv.org/pdf/1805.09495v2.pdf
PWC https://paperswithcode.com/paper/a-practical-algorithm-for-distributed
Repo
Framework
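The listing above does not include the paper's pseudocode, so the following is only a rough sketch of the general summary-and-merge pattern it describes: each site compresses its data into weighted representatives, a coordinator clusters the union of the summaries, and the representatives farthest from any center are flagged as global outliers. scikit-learn's KMeans stands in for the paper's subroutines, and all sizes and thresholds are made up.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_summary(X, summary_size):
    """Compress one site's data into weighted representative points."""
    km = KMeans(n_clusters=summary_size, n_init=5).fit(X)
    weights = np.bincount(km.labels_, minlength=summary_size)
    return km.cluster_centers_, weights

def central_cluster_with_outliers(summaries, k, n_outliers):
    reps = np.vstack([r for r, _ in summaries])
    w = np.concatenate([w for _, w in summaries]).astype(float)
    km = KMeans(n_clusters=k, n_init=10).fit(reps, sample_weight=w)
    dist = np.min(np.linalg.norm(reps[:, None, :] - km.cluster_centers_[None, :, :], axis=2), axis=1)
    outlier_reps = np.argsort(dist)[-n_outliers:]   # representatives that look like global outliers
    return km.cluster_centers_, outlier_reps

# toy run: three sites sharing two clusters, with a few outliers injected at one site
rng = np.random.default_rng(0)
sites = [np.vstack([rng.normal(c, 0.3, size=(100, 2)) for c in (0.0, 5.0)]) for _ in range(3)]
sites[0] = np.vstack([sites[0], rng.normal(20.0, 1.0, size=(3, 2))])
centers, outliers = central_cluster_with_outliers([local_summary(X, 10) for X in sites], k=2, n_outliers=3)
```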

Uniform Convergence of Gradients for Non-Convex Learning and Optimization

Title Uniform Convergence of Gradients for Non-Convex Learning and Optimization
Authors Dylan J. Foster, Ayush Sekhari, Karthik Sridharan
Abstract We investigate 1) the rate at which refined properties of the empirical risk—in particular, gradients—converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization. Our analysis follows the tradition of norm-based capacity control. We propose vector-valued Rademacher complexities as a simple, composable, and user-friendly tool to derive dimension-free uniform convergence bounds for gradients in non-convex learning problems. As an application of our techniques, we give a new analysis of batch gradient descent methods for non-convex generalized linear models and non-convex robust regression, showing how to use any algorithm that finds approximate stationary points to obtain optimal sample complexity, even when dimension is high or possibly infinite and multiple passes over the dataset are allowed. Moving to non-smooth models, we show—in contrast to the smooth case—that even for a single ReLU it is not possible to obtain dimension-independent convergence rates for gradients in the worst case. On the positive side, it is still possible to obtain dimension-independent rates under a new type of distributional assumption.
Tasks
Published 2018-10-25
URL http://arxiv.org/abs/1810.11059v2
PDF http://arxiv.org/pdf/1810.11059v2.pdf
PWC https://paperswithcode.com/paper/uniform-convergence-of-gradients-for-non
Repo
Framework
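To make the "approximate stationary point" notion concrete, here is a small numerical illustration (not the paper's analysis): batch gradient descent on a non-convex generalized linear model with a sigmoid link and squared loss, stopped once the empirical gradient norm falls below a tolerance. The model, step size, and tolerance are arbitrary choices for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def empirical_grad(w, X, y):
    # gradient of the empirical risk (1/n) * sum_i 0.5 * (sigmoid(x_i . w) - y_i)^2
    p = sigmoid(X @ w)
    return X.T @ ((p - y) * p * (1 - p)) / len(y)

rng = np.random.default_rng(1)
n, d = 2000, 20
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = sigmoid(X @ w_star) + 0.05 * rng.normal(size=n)

w, lr, tol = np.zeros(d), 0.5, 1e-4
for _ in range(10000):
    g = empirical_grad(w, X, y)
    if np.linalg.norm(g) < tol:       # approximate stationary point reached
        break
    w -= lr * g
```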

Vendor-independent soft tissue lesion detection using weakly supervised and unsupervised adversarial domain adaptation

Title Vendor-independent soft tissue lesion detection using weakly supervised and unsupervised adversarial domain adaptation
Authors Joris van Vugt, Elena Marchiori, Ritse Mann, Albert Gubern-Mérida, Nikita Moriakov, Jonas Teuwen
Abstract Computer-aided detection aims to improve breast cancer screening programs by helping radiologists to evaluate digital mammography (DM) exams. DM exams are generated by devices from different vendors, with diverse characteristics between and even within vendors. Physical properties of these devices and postprocessing of the images can greatly influence the resulting mammogram. As a result, a deep learning model trained on data from one vendor cannot readily be applied to data from another vendor. This paper investigates the use of tailored transfer learning methods based on adversarial learning to tackle this problem. We consider a database of DM exams (mostly bilateral and two views) generated by Hologic and Siemens vendors. We analyze two transfer learning settings: 1) unsupervised transfer, where Hologic data with pixel-level soft tissue lesion annotations and unlabelled Siemens data are used to annotate images in the latter set; 2) weakly supervised transfer, where exam-level labels for images from the Siemens device are available. We propose tailored variants of recent state-of-the-art methods for transfer learning which take into account the class imbalance and incorporate knowledge provided by the annotations at exam level. Results of experiments indicate the beneficial effect of transfer learning in both transfer settings. Notably, at 0.02 false positives per image, we achieve a sensitivity of 0.37, compared to 0.30 for a baseline with no transfer. Results indicate that using exam-level annotations gives an additional increase in sensitivity.
Tasks Domain Adaptation, Transfer Learning
Published 2018-08-14
URL http://arxiv.org/abs/1808.04909v1
PDF http://arxiv.org/pdf/1808.04909v1.pdf
PWC https://paperswithcode.com/paper/vendor-independent-soft-tissue-lesion
Repo
Framework
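The detection networks themselves are not described in this listing; the sketch below shows a generic adversarial domain-adaptation mechanism (DANN-style gradient reversal) in PyTorch, with made-up layer sizes and a toy two-class label head, to illustrate how source-vendor and target-vendor features can be pushed to be indistinguishable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None       # reverse the gradient into the feature net

feature_net = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU())
label_head = nn.Linear(128, 2)                    # toy lesion / no-lesion head (source data only)
domain_head = nn.Linear(128, 2)                   # source vendor vs. target vendor

def dann_loss(x_src, y_src, x_tgt, lam=0.1):
    f_src, f_tgt = feature_net(x_src), feature_net(x_tgt)
    cls_loss = F.cross_entropy(label_head(f_src), y_src)
    f_all = torch.cat([f_src, f_tgt])
    d_lab = torch.cat([torch.zeros(len(x_src)), torch.ones(len(x_tgt))]).long()
    dom_loss = F.cross_entropy(domain_head(GradReverse.apply(f_all, lam)), d_lab)
    return cls_loss + dom_loss

x_s, y_s = torch.randn(4, 1, 32, 32), torch.randint(0, 2, (4,))   # labelled source-vendor batch
x_t = torch.randn(4, 1, 32, 32)                                   # unlabelled target-vendor batch
dann_loss(x_s, y_s, x_t).backward()
```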

A Tensor Factorization Method for 3D Super-Resolution with Application to Dental CT

Title A Tensor Factorization Method for 3D Super-Resolution with Application to Dental CT
Authors Janka Hatvani, Adrian Basarab, Jean-Yves Tourneret, Miklós Gyöngy, Denis Kouamé
Abstract Available super-resolution techniques for 3D images are either computationally inefficient prior-knowledge-based iterative techniques or deep learning methods which require a large database of known low- and high-resolution image pairs. A recently introduced tensor-factorization-based approach offers a fast solution without the use of known image pairs or strict prior assumptions. In this article, this factorization framework is investigated for single-image resolution enhancement with an off-line estimate of the system point spread function. The technique is applied to 3D cone beam computed tomography for dental image resolution enhancement. To demonstrate the efficiency of our method, it is compared to a recent state-of-the-art iterative technique using low-rank and total variation regularizations. In contrast to this comparative technique, the proposed reconstruction technique gives a 2-order-of-magnitude improvement in running time – 2 minutes compared to 2 hours for a dental volume of 282$\times$266$\times$392 voxels. Furthermore, it also offers slightly improved quantitative results (peak signal-to-noise ratio, segmentation quality). Another advantage of the presented technique is its low number of hyperparameters. As demonstrated in this paper, the framework is not sensitive to small changes of its parameters, which makes it easy to use.
Tasks Super-Resolution
Published 2018-07-26
URL http://arxiv.org/abs/1807.10027v1
PDF http://arxiv.org/pdf/1807.10027v1.pdf
PWC https://paperswithcode.com/paper/a-tensor-factorization-method-for-3d-super
Repo
Framework
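As a toy illustration of the general idea only (the paper's PSF-based factorization is not reproduced here), the sketch below factorizes a 3D volume with a CP decomposition and recomposes a larger volume from upsampled mode-wise factors. It assumes the tensorly and scipy packages; the rank and scale factor are arbitrary.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from scipy.ndimage import zoom

def cp_upsample(volume, rank=10, scale=2):
    """Factor the volume, upsample each 1D mode factor, and recompose a larger volume."""
    weights, factors = parafac(tl.tensor(volume), rank=rank, init="random", random_state=0)
    up_factors = [zoom(f, (scale, 1), order=3) for f in factors]    # upsample factors, not voxels
    return tl.cp_to_tensor((weights, up_factors))

lowres = np.random.rand(32, 30, 28)                 # stand-in for a low-resolution CT block
highres = cp_upsample(lowres, rank=10, scale=2)     # shape (64, 60, 56)
```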

Learning From Weights: A Cost-Sensitive Approach For Ad Retrieval

Title Learning From Weights: A Cost-Sensitive Approach For Ad Retrieval
Authors Nikit Begwani, Shrutendra Harsola, Rahul Agrawal
Abstract Retrieval models such as CLSM are trained on click-through data, which treats each clicked query-document pair as equivalent. While training on click-through data is reasonable, this paper argues that it is sub-optimal because of its noisy and long-tail nature (especially for sponsored search). In this paper, we discuss the impact of incorporating or disregarding the long-tail pairs in the training set. We also propose a weighting-based strategy with which we can learn semantic representations for tail pairs without compromising the quality of retrieval. We conducted our experiments on Bing sponsored search and on Amazon product recommendation to demonstrate that the methodology is domain agnostic. Online A/B testing on live search engine traffic showed improvements in clicks (11.8% higher CTR) as well as in quality (8.2% lower bounce rate) when compared to the unweighted model. We also conduct experiments on Amazon product recommendation data, where we see slight improvements in NDCG scores calculated by retrieving among co-purchased products.
Tasks Product Recommendation
Published 2018-11-30
URL http://arxiv.org/abs/1811.12776v2
PDF http://arxiv.org/pdf/1811.12776v2.pdf
PWC https://paperswithcode.com/paper/learning-from-weights-a-cost-sensitive
Repo
Framework
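A hedged sketch of the cost-sensitive ingredient (per-pair weights in the retrieval loss), not the CLSM pipeline itself: the weights, the triplet-style hinge, and the margin below are illustrative stand-ins for whatever weighting scheme is derived from click statistics.

```python
import torch
import torch.nn.functional as F

def weighted_pair_loss(query_vec, doc_vec, neg_doc_vec, pair_weight, margin=0.2):
    """Weighted triplet-style loss over (query, clicked doc, sampled negative doc) embeddings."""
    pos = F.cosine_similarity(query_vec, doc_vec)
    neg = F.cosine_similarity(query_vec, neg_doc_vec)
    per_pair = F.relu(margin - pos + neg)            # hinge on the similarity gap
    return (pair_weight * per_pair).mean()

# toy usage: weights might reflect click counts or confidence in each clicked pair
q, d_pos, d_neg = torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 64)
w = torch.tensor([3.0, 1.0, 0.2, 1.5, 0.2, 0.2, 2.0, 0.5])   # hypothetical per-pair weights
loss = weighted_pair_loss(q, d_pos, d_neg, w)
```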

Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices

Title Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices
Authors Yuxin Chen, Chen Cheng, Jianqing Fan
Abstract This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix $\mathbf{M}^{\star}\in \mathbb{R}^{n\times n}$, yet only a randomly perturbed version $\mathbf{M}$ is observed. The noise matrix $\mathbf{M}-\mathbf{M}^{\star}$ is composed of zero-mean independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise, for example, when we have two independent samples for each entry of $\mathbf{M}^{\star}$ and arrange them into an {\em asymmetric} data matrix $\mathbf{M}$. The aim is to estimate the leading eigenvalue and eigenvector of $\mathbf{M}^{\star}$. We demonstrate that the leading eigenvalue of the data matrix $\mathbf{M}$ can be $O(\sqrt{n})$ times more accurate — up to some log factor — than its (unadjusted) leading singular value in eigenvalue estimation. Further, the perturbation of any linear form of the leading eigenvector of $\mathbf{M}$ — say, entrywise eigenvector perturbation — is provably well-controlled. This eigen-decomposition approach is fully adaptive to heteroscedasticity of noise without the need of careful bias correction or any prior knowledge about the noise variance. We also provide partial theory for the more general rank-$r$ case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigen-decomposition could sometimes be beneficial.
Tasks
Published 2018-11-30
URL https://arxiv.org/abs/1811.12804v5
PDF https://arxiv.org/pdf/1811.12804v5.pdf
PWC https://paperswithcode.com/paper/asymmetry-helps-eigenvalue-and-eigenvector
Repo
Framework
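A quick numerical check of the takeaway message (illustration only, not the paper's theory): for a rank-1 symmetric ground truth observed under independent asymmetric noise, the leading eigenvalue of the perturbed matrix typically estimates the true eigenvalue far better than the leading singular value does. The dimension and noise level below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, lam_true = 1000, 5.0
u = rng.normal(size=n)
u /= np.linalg.norm(u)
M_star = lam_true * np.outer(u, u)                               # rank-1 symmetric ground truth
M = M_star + rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))     # independent, asymmetric noise

eig_est = np.max(np.real(np.linalg.eigvals(M)))                  # leading eigenvalue of asymmetric M
svd_est = np.linalg.svd(M, compute_uv=False)[0]                  # leading (unadjusted) singular value
print(abs(eig_est - lam_true), abs(svd_est - lam_true))          # eigenvalue error is usually much smaller
```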

Graph Learning-Convolutional Networks

Title Graph Learning-Convolutional Networks
Authors Bo Jiang, Ziyan Zhang, Doudou Lin, Jin Tang
Abstract Recently, graph Convolutional Neural Networks (graph CNNs) have been widely used for graph data representation and semi-supervised learning tasks. However, existing graph CNNs generally use a fixed graph, which may not be optimal for semi-supervised learning tasks. In this paper, we propose a novel Graph Learning-Convolutional Network (GLCN) for graph data representation and semi-supervised learning. The aim of GLCN is to learn an optimal graph structure that best serves graph CNNs for semi-supervised learning by integrating graph learning and graph convolution in a unified network architecture. The main advantage is that in GLCN both the given labels and the estimated labels are incorporated, and thus can provide useful ‘weakly’ supervised information to refine (or learn) the graph construction and to facilitate the graph convolution operation for unknown label estimation. Experimental results on seven benchmarks demonstrate that GLCN significantly outperforms state-of-the-art traditional fixed-structure graph CNNs.
Tasks graph construction
Published 2018-11-25
URL http://arxiv.org/abs/1811.09971v1
PDF http://arxiv.org/pdf/1811.09971v1.pdf
PWC https://paperswithcode.com/paper/graph-learning-convolutional-networks
Repo
Framework
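A minimal sketch of the joint graph-learning-plus-convolution idea (not the exact GLCN formulation): pairwise affinities of projected node features are normalized into a dense, differentiable adjacency matrix, which then drives a GCN-style propagation. Layer sizes are invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphLearnConv(nn.Module):
    def __init__(self, in_dim, embed_dim, out_dim):
        super().__init__()
        self.embed = nn.Linear(in_dim, embed_dim)    # used only to score candidate edges
        self.transform = nn.Linear(in_dim, out_dim)  # feature transform for propagation

    def forward(self, x):                            # x: (num_nodes, in_dim)
        e = self.embed(x)
        affinity = e @ e.t()                         # learned pairwise affinities
        A = F.softmax(affinity, dim=1)               # row-normalized, differentiable "graph"
        return F.relu(A @ self.transform(x))         # GCN-style propagation over the learned graph

x = torch.randn(100, 16)                             # 100 nodes with 16 input features
layer = GraphLearnConv(16, 32, 8)
out = layer(x)                                       # (100, 8) node representations
```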

Explicit Spatiotemporal Joint Relation Learning for Tracking Human Pose

Title Explicit Spatiotemporal Joint Relation Learning for Tracking Human Pose
Authors Xiao Sun, Chuankang Li, Stephen Lin
Abstract We present a method for human pose tracking that is based on learning spatiotemporal relationships among joints. Beyond generating the heatmap of a joint in a given frame, our system also learns to predict the offset of the joint from a neighboring joint in the frame. Additionally, it is trained to predict the displacement of the joint from its position in the previous frame, in a manner that can account for possibly changing joint appearance, unlike optical flow. These relational cues in the spatial domain and temporal domain are inferred in a robust manner by attending only to relevant areas in the video frames. By explicitly learning and exploiting these joint relationships, our system achieves state-of-the-art performance on standard benchmarks for various pose tracking tasks including 3D body pose tracking in RGB video, 3D hand pose tracking in depth sequences, and 3D hand gesture tracking in RGB video.
Tasks Optical Flow Estimation, Pose Estimation, Pose Tracking
Published 2018-11-17
URL http://arxiv.org/abs/1811.07123v3
PDF http://arxiv.org/pdf/1811.07123v3.pdf
PWC https://paperswithcode.com/paper/explicit-pose-deformation-learning-for
Repo
Framework
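The abstract describes three cues per joint: a heatmap, a spatial offset from a neighbouring joint, and a temporal displacement from the previous frame. The toy decoder below (not the paper's network or fusion rule) simply averages the three position estimates to show how the cues combine; the equal weighting is an assumption made for illustration.

```python
import numpy as np

def decode_joint(heatmap, neighbour_pos, offset_field, prev_pos, displacement_field):
    """Combine the three cues into one joint position (toy equal-weight fusion)."""
    h, w = heatmap.shape
    # cue 1: heatmap argmax
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    direct = np.array([y, x], dtype=float)
    # cue 2: neighbouring joint plus the spatial offset predicted at its location
    ny, nx = np.clip(neighbour_pos.astype(int), 0, [h - 1, w - 1])
    spatial = neighbour_pos + offset_field[:, ny, nx]
    # cue 3: previous-frame position plus the predicted temporal displacement
    py, px = np.clip(prev_pos.astype(int), 0, [h - 1, w - 1])
    temporal = prev_pos + displacement_field[:, py, px]
    return (direct + spatial + temporal) / 3.0

H = W = 64
pos = decode_joint(np.random.rand(H, W), np.array([20.0, 22.0]),
                   np.random.randn(2, H, W), np.array([30.0, 30.0]),
                   np.random.randn(2, H, W))
```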

On Matching Pursuit and Coordinate Descent

Title On Matching Pursuit and Coordinate Descent
Authors Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi
Abstract Two popular examples of first-order optimization methods over linear spaces are coordinate descent and matching pursuit algorithms, with their randomized variants. While the former targets the optimization by moving along coordinates, the latter considers a generalized notion of directions. Exploiting the connection between the two algorithms, we present a unified analysis of both, providing affine invariant sublinear $\mathcal{O}(1/t)$ rates on smooth objectives and linear convergence on strongly convex objectives. As a byproduct of our affine invariant analysis of matching pursuit, our rates for steepest coordinate descent are the tightest known. Furthermore, we show the first accelerated convergence rate $\mathcal{O}(1/t^2)$ for matching pursuit and steepest coordinate descent on convex objectives.
Tasks
Published 2018-03-26
URL https://arxiv.org/abs/1803.09539v7
PDF https://arxiv.org/pdf/1803.09539v7.pdf
PWC https://paperswithcode.com/paper/on-matching-pursuit-and-coordinate-descent
Repo
Framework
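To make the connection concrete, here is a side-by-side toy implementation of the two schemes on the least-squares objective $f(x) = \frac{1}{2}\lVert Ax - b\rVert^2$ (illustration only, not the paper's affine invariant analysis): steepest (Gauss-Southwell) coordinate descent picks the coordinate with the largest gradient entry, while matching pursuit picks the dictionary atom most correlated with the residual.

```python
import numpy as np

def steepest_coordinate_descent(A, b, steps=500):
    """Gauss-Southwell coordinate descent on f(x) = 0.5 * ||Ax - b||^2."""
    x = np.zeros(A.shape[1])
    L = np.sum(A ** 2, axis=0)                  # per-coordinate curvature
    for _ in range(steps):
        g = A.T @ (A @ x - b)
        i = np.argmax(np.abs(g))                # steepest coordinate
        x[i] -= g[i] / L[i]                     # exact minimization along coordinate i
    return x

def matching_pursuit(D, b, steps=500):
    """Matching pursuit of b over the columns (atoms) of D."""
    x, r = np.zeros(D.shape[1]), b.copy()
    for _ in range(steps):
        corr = D.T @ r
        i = np.argmax(np.abs(corr))             # atom most aligned with the residual
        step = corr[i] / (D[:, i] @ D[:, i])
        x[i] += step
        r -= step * D[:, i]
    return x

rng = np.random.default_rng(3)
A, b = rng.normal(size=(50, 80)), rng.normal(size=50)
x_cd, x_mp = steepest_coordinate_descent(A, b), matching_pursuit(A, b)
```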

Random Language Model

Title Random Language Model
Authors E. DeGiuli
Abstract Many complex generative systems use languages to create structured objects. We consider a model of random languages, defined by weighted context-free grammars. As the distribution of grammar weights broadens, a transition is found from a random phase, in which sentences are indistinguishable from noise, to an organized phase in which nontrivial information is carried. This marks the emergence of deep structure in the language, and can be understood by a competition between energy and entropy.
Tasks Language Modelling
Published 2018-09-04
URL http://arxiv.org/abs/1809.01201v2
PDF http://arxiv.org/pdf/1809.01201v2.pdf
PWC https://paperswithcode.com/paper/random-language-model-a-path-to-principled
Repo
Framework
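A small sketch of the model class being studied (a random weighted context-free grammar with binary branching and terminal emission rules), not the paper's statistical-physics analysis: rule weights are drawn at random and sentences are sampled top-down from the start symbol. The grammar sizes, the exponential weight distribution, and the emission probability are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
NONTERMINALS, TERMINALS = 5, 8
W_branch = rng.exponential(size=(NONTERMINALS, NONTERMINALS, NONTERMINALS))  # A -> B C weights
W_emit = rng.exponential(size=(NONTERMINALS, TERMINALS))                     # A -> terminal weights

def expand(symbol, depth=0, max_depth=12, p_stop=0.4):
    # emit a terminal with probability p_stop (and always at max depth) so sampling terminates
    if depth >= max_depth or rng.random() < p_stop:
        p = W_emit[symbol] / W_emit[symbol].sum()
        return [int(rng.choice(TERMINALS, p=p))]
    p = W_branch[symbol].ravel() / W_branch[symbol].sum()
    left, right = np.unravel_index(rng.choice(NONTERMINALS ** 2, p=p),
                                   (NONTERMINALS, NONTERMINALS))
    return expand(int(left), depth + 1) + expand(int(right), depth + 1)

sentence = expand(0)   # a "sentence": a list of terminal symbol ids derived from the start symbol
```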

An amplitudes-perturbation data augmentation method in convolutional neural networks for EEG decoding

Title An amplitudes-perturbation data augmentation method in convolutional neural networks for EEG decoding
Authors Xian-Rui Zhang, Meng-Ying Lei, Yang Li
Abstract A Brain-Computer Interface (BCI) system provides a pathway between humans and the outside world by analyzing brain signals which contain potential neural information. Electroencephalography (EEG) is one of the most commonly used brain signals, and EEG recognition is an important part of a BCI system. Recently, convolutional neural networks (ConvNets) have become the new cutting-edge tools for tackling the problem of EEG recognition. However, training an effective deep learning model requires a large amount of data, which limits its application to EEG datasets with a small number of samples. To address data insufficiency in deep learning for EEG decoding, we propose a novel data augmentation method that adds perturbations to the amplitudes of EEG signals after transforming them to the frequency domain. In experiments, we compare the performance of state-of-the-art models before and after data augmentation on BCI Competition IV dataset 2a and our local dataset. The results show that our data augmentation technique can effectively improve the accuracy of EEG recognition.
Tasks Data Augmentation, EEG, Eeg Decoding
Published 2018-11-06
URL http://arxiv.org/abs/1811.02353v1
PDF http://arxiv.org/pdf/1811.02353v1.pdf
PWC https://paperswithcode.com/paper/an-amplitudes-perturbation-data-augmentation
Repo
Framework
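A minimal sketch of the augmentation step as described, with assumed details the abstract does not specify (multiplicative Gaussian noise on the magnitudes, phases left untouched):

```python
import numpy as np

def amplitude_perturb(eeg, noise_std=0.05, rng=None):
    """eeg: array of shape (channels, samples). Returns a perturbed copy with the same label."""
    rng = rng or np.random.default_rng()
    spectrum = np.fft.rfft(eeg, axis=-1)
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)
    magnitude *= 1.0 + noise_std * rng.normal(size=magnitude.shape)   # perturb amplitudes only
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=eeg.shape[-1], axis=-1)

trial = np.random.randn(22, 1000)        # e.g. 22 channels, 1000 time samples
augmented = amplitude_perturb(trial)     # an extra training example for the ConvNet
```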

A Recurrent CNN for Automatic Detection and Classification of Coronary Artery Plaque and Stenosis in Coronary CT Angiography

Title A Recurrent CNN for Automatic Detection and Classification of Coronary Artery Plaque and Stenosis in Coronary CT Angiography
Authors Majd Zreik, Robbert W. van Hamersvelt, Jelmer M. Wolterink, Tim Leiner, Max A. Viergever, Ivana Isgum
Abstract Various types of atherosclerotic plaque and varying grades of stenosis can lead to different management of patients with coronary artery disease. Therefore, it is crucial to detect and classify the type of coronary artery plaque, as well as to detect and determine the degree of coronary artery stenosis. This study includes retrospectively collected, clinically obtained coronary CT angiography (CCTA) scans of 163 patients. To perform automatic analysis for coronary artery plaque and stenosis classification, a multi-task recurrent convolutional neural network is applied to multi-planar reformatted (MPR) images of the coronary arteries. First, a 3D convolutional neural network is utilized to extract features along the coronary artery. Subsequently, the extracted features are aggregated by a recurrent neural network that performs two simultaneous multi-class classification tasks. In the first task, the network detects and characterizes the type of the coronary artery plaque (no plaque, non-calcified, mixed, calcified). In the second task, the network detects and determines the anatomical significance of the coronary artery stenosis (no stenosis; non-significant, i.e. <50% luminal narrowing; significant, i.e. >50% luminal narrowing). For detection and classification of coronary plaque, the method achieved an accuracy of 0.77. For detection and classification of stenosis, the method achieved an accuracy of 0.80. The results demonstrate that automatic detection and classification of coronary artery plaque and stenosis are feasible. This may enable automated triage of patients into those without coronary plaque and those with coronary plaque and stenosis in need of further cardiovascular workup.
Tasks
Published 2018-04-12
URL http://arxiv.org/abs/1804.04360v4
PDF http://arxiv.org/pdf/1804.04360v4.pdf
PWC https://paperswithcode.com/paper/a-recurrent-cnn-for-automatic-detection-and
Repo
Framework
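A structural sketch of the two-stage design described above, with made-up layer sizes (the published architecture is not reproduced here): a small 3D CNN encodes cubes sampled along the artery centerline, a GRU aggregates them, and two heads output the plaque-type and stenosis-significance classes.

```python
import torch
import torch.nn as nn

class ArterySequenceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.rnn = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
        self.plaque_head = nn.Linear(32, 4)    # no plaque / non-calcified / mixed / calcified
        self.stenosis_head = nn.Linear(32, 3)  # none / non-significant / significant

    def forward(self, cubes):                  # cubes: (batch, seq, 1, D, H, W)
        b, t = cubes.shape[:2]
        feats = self.cnn(cubes.flatten(0, 1)).view(b, t, -1)
        _, h = self.rnn(feats)
        return self.plaque_head(h[-1]), self.stenosis_head(h[-1])

model = ArterySequenceNet()
dummy = torch.randn(2, 10, 1, 16, 16, 16)      # 2 arteries, 10 cubes along each centerline
plaque_logits, stenosis_logits = model(dummy)
```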

Omega: An Architecture for AI Unification

Title Omega: An Architecture for AI Unification
Authors Eray Özkural
Abstract We introduce the open-ended, modular, self-improving Omega AI unification architecture which is a refinement of Solomonoff’s Alpha architecture, as considered from first principles. The architecture embodies several crucial principles of general intelligence including diversity of representations, diversity of data types, integrated memory, modularity, and higher-order cognition. We retain the basic design of a fundamental algorithmic substrate called an “AI kernel” for problem solving and basic cognitive functions like memory, and a larger, modular architecture that re-uses the kernel in many ways. Omega includes eight representation languages and six classes of neural networks, which are briefly introduced. The architecture is intended to initially address data science automation, hence it includes many problem solving methods for statistical tasks. We review the broad software architecture, higher-order cognition, self-improvement, modular neural architectures, intelligent agents, the process and memory hierarchy, hardware abstraction, peer-to-peer computing, and data abstraction facility.
Tasks
Published 2018-05-16
URL http://arxiv.org/abs/1805.12069v1
PDF http://arxiv.org/pdf/1805.12069v1.pdf
PWC https://paperswithcode.com/paper/omega-an-architecture-for-ai-unification
Repo
Framework

Evolutionary Innovations and Where to Find Them: Routes to Open-Ended Evolution in Natural and Artificial Systems

Title Evolutionary Innovations and Where to Find Them: Routes to Open-Ended Evolution in Natural and Artificial Systems
Authors Tim Taylor
Abstract This paper presents a high-level conceptual framework to help orient the discussion and implementation of open-endedness in evolutionary systems. Drawing upon earlier work by Banzhaf et al., three different kinds of open-endedness are identified: exploratory, expansive, and transformational. These are characterised in terms of their relationship to the search space of phenotypic behaviours. A formalism is introduced to describe three key processes required for an evolutionary process: the generation of a phenotype from a genetic description, the evaluation of that phenotype, and the reproduction with variation of individuals according to their evaluation. The distinction is made between intrinsic and extrinsic implementations of these processes. A discussion then investigates how various interactions between these processes, and their modes of implementation, can lead to open-endedness. However, an important contribution of the paper is the demonstration that these considerations relate to exploratory open-endedness only. Conditions for the implementation of the more interesting kinds of open-endedness - expansive and transformational - are also discussed, emphasizing factors such as multiple domains of behaviour, transdomain bridges, and non-additive compositional systems. These factors relate not to the generic evolutionary properties of individuals and populations, but rather to the nature of the building blocks out of which individual organisms are constructed, and the laws and properties of the environment in which they exist. The paper ends with suggestions of how the framework can be used to categorise and compare the open-ended evolutionary potential of different systems, how it might guide the design of systems with greater capacity for open-ended evolution, and how it might be further improved.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01883v4
PDF http://arxiv.org/pdf/1806.01883v4.pdf
PWC https://paperswithcode.com/paper/evolutionary-innovations-and-where-to-find
Repo
Framework
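The three processes of the formalism (generation of a phenotype from a genetic description, its evaluation, and reproduction with variation) can be written as a bare-bones loop. The sketch below is schematic only; the bit-string genotype and the placeholder phenotype map and fitness are assumptions for illustration, not anything from the paper.

```python
import random

def generate(genotype):
    """Generation: map a genotype to a phenotype (placeholder: the expressed trait value)."""
    return sum(genotype)

def evaluate(phenotype):
    """Evaluation: score a phenotype (placeholder: bigger is better)."""
    return phenotype

def reproduce(population, scores, mutation_rate=0.05):
    """Reproduction with variation: fitness-proportional selection plus bit-flip mutation."""
    parents = random.choices(population, weights=scores, k=len(population))
    return [[1 - g if random.random() < mutation_rate else g for g in p] for p in parents]

population = [[random.randint(0, 1) for _ in range(32)] for _ in range(50)]
for _ in range(100):
    scores = [evaluate(generate(g)) for g in population]
    population = reproduce(population, scores)
```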

Tree-CNN: A Hierarchical Deep Convolutional Neural Network for Incremental Learning

Title Tree-CNN: A Hierarchical Deep Convolutional Neural Network for Incremental Learning
Authors Deboleena Roy, Priyadarshini Panda, Kaushik Roy
Abstract Over the past decade, Deep Convolutional Neural Networks (DCNNs) have shown remarkable performance in most computer vision tasks. These tasks traditionally use a fixed dataset, and the model, once trained, is deployed as is. Adding new information to such a model presents a challenge due to complex training issues, such as “catastrophic forgetting”, and sensitivity to hyper-parameter tuning. However, in this modern world, data is constantly evolving, and our deep learning models are required to adapt to these changes. In this paper, we propose an adaptive hierarchical network structure composed of DCNNs that can grow and learn as new data becomes available. The network grows in a tree-like fashion to accommodate new classes of data, while preserving the ability to distinguish the previously trained classes. The network organizes the incrementally available data into feature-driven super-classes and improves upon existing hierarchical CNN models by adding the capability of self-growth. The proposed hierarchical model, when compared against fine-tuning a deep network, achieves significant reduction of training effort, while maintaining competitive accuracy on CIFAR-10 and CIFAR-100.
Tasks Object Recognition
Published 2018-02-15
URL https://arxiv.org/abs/1802.05800v3
PDF https://arxiv.org/pdf/1802.05800v3.pdf
PWC https://paperswithcode.com/paper/tree-cnn-a-hierarchical-deep-convolutional
Repo
Framework
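A conceptual sketch of the growth mechanism only (not the published Tree-CNN): a new class is attached under the most similar existing super-class node, or opens a fresh branch when nothing is similar enough. The feature vectors, the cosine-similarity routing, and the threshold are stand-ins for the paper's learned DCNN features and growth criteria.

```python
import numpy as np

class TreeNode:
    def __init__(self):
        self.children = {}      # super-class name -> TreeNode
        self.prototypes = {}    # class name -> normalized mean feature vector

    def add_class(self, name, feats, sim_threshold=0.7):
        proto = feats.mean(axis=0)
        proto = proto / np.linalg.norm(proto)
        best, best_sim = None, -1.0
        for key, child in self.children.items():
            ref = np.mean(list(child.prototypes.values()), axis=0)
            sim = float(proto @ (ref / np.linalg.norm(ref)))
            if sim > best_sim:
                best, best_sim = key, sim
        if best is not None and best_sim >= sim_threshold:
            self.children[best].prototypes[name] = proto          # grow an existing branch
        else:
            node = TreeNode()
            node.prototypes[name] = proto
            self.children[f"super_{len(self.children)}"] = node   # grow the tree with a new branch

rng = np.random.default_rng(5)
root = TreeNode()
for cls in ["cat", "dog", "truck"]:                               # hypothetical incremental classes
    root.add_class(cls, rng.normal(size=(50, 128)))               # stand-in DCNN feature vectors
```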