May 6, 2019

3618 words 17 mins read

Paper Group ANR 285

Compressive K-means. Should Terminology Principles be re-examined?. The Univariate Flagging Algorithm (UFA): a Fully-Automated Approach for Identifying Optimal Thresholds in Data. Modeling the Sequence of Brain Volumes by Local Mesh Models for Brain Decoding. Online Rules for Control of False Discovery Rate and False Discovery Exceedance. Dynamical …

Compressive K-means

Title Compressive K-means
Authors Nicolas Keriven, Nicolas Tremblay, Yann Traonmilin, Rémi Gribonval
Abstract The Lloyd-Max algorithm is a classical approach to perform K-means clustering. Unfortunately, its cost becomes prohibitive as the training dataset grows large. We propose a compressive version of K-means (CKM) that estimates cluster centers from a sketch, i.e. from a drastically compressed representation of the training dataset. We demonstrate empirically that CKM performs similarly to Lloyd-Max, for a sketch size proportional to the number of centroids times the ambient dimension, and independent of the size of the original dataset. Given the sketch, the computational complexity of CKM is also independent of the size of the dataset. Unlike Lloyd-Max which requires several replicates, we further demonstrate that CKM is almost insensitive to initialization. For a large dataset of 10^7 data points, we show that CKM can run two orders of magnitude faster than five replicates of Lloyd-Max, with similar clustering performance on artificial data. Finally, CKM achieves lower classification errors on handwritten digit classification.
Tasks
Published 2016-10-27
URL http://arxiv.org/abs/1610.08738v4
PDF http://arxiv.org/pdf/1610.08738v4.pdf
PWC https://paperswithcode.com/paper/compressive-k-means
Repo
Framework
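
The sketching idea can be illustrated in a few lines of Python. The toy sketch below compresses the dataset into an average of random Fourier features and then fits centroids whose own sketch matches it. It is only a rough illustration under simple assumptions (uniform cluster weights, a generic L-BFGS solver rather than the greedy decoder used for CKM, and an ad-hoc frequency scale), not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def sketch(X, Omega):
    # Empirical average of random Fourier features exp(i * Omega @ x),
    # stored as stacked real and imaginary parts.
    proj = X @ Omega.T                                       # (n_points, m)
    return np.r_[np.cos(proj).mean(axis=0), np.sin(proj).mean(axis=0)]

def compressive_kmeans(X, k, m_factor=5):
    n, d = X.shape
    m = m_factor * k * d                                     # sketch size ~ k * d
    Omega = rng.normal(scale=1.0 / X.std(), size=(m, d))     # random frequencies (ad-hoc scale)
    z = sketch(X, Omega)                                     # after this, the dataset is no longer needed

    def objective(flat_centroids):
        # Distance between the data sketch and the sketch of a uniform
        # mixture of Diracs placed at the candidate centroids.
        return np.sum((z - sketch(flat_centroids.reshape(k, d), Omega)) ** 2)

    init = X[rng.choice(n, size=k, replace=False)].ravel()   # random init from data points
    return minimize(objective, init, method="L-BFGS-B").x.reshape(k, d)

# toy usage: two well-separated Gaussian clusters
X = np.r_[rng.normal(-3, 1, (500, 2)), rng.normal(3, 1, (500, 2))]
print(compressive_kmeans(X, k=2))
```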

Should Terminology Principles be re-examined?

Title Should Terminology Principles be re-examined?
Authors Christophe Roche
Abstract Operationalization of terminology for IT applications has revived the Wusterian approach. The conceptual dimension once more prevails after taking a back seat to specialised lexicography. This is demonstrated by the emergence of ontology in terminology. While the Terminology Principles as defined in Felber's manual and the ISO standards remain at the core of traditional terminology, their computational implementation raises some issues. In this article, while reiterating their importance, we re-examine these Principles from a dual perspective: that of logic in the mathematical sense of the term and that of epistemology as in the theory of knowledge. We thus clarify and describe some of them so as to take into account advances in knowledge engineering (ontology) and formal systems (logic). The notion of ontoterminology, a terminology whose conceptual system is a formal ontology, results from this approach.
Tasks
Published 2016-09-16
URL http://arxiv.org/abs/1609.05170v1
PDF http://arxiv.org/pdf/1609.05170v1.pdf
PWC https://paperswithcode.com/paper/should-terminology-principles-be-re-examined
Repo
Framework

The Univariate Flagging Algorithm (UFA): a Fully-Automated Approach for Identifying Optimal Thresholds in Data

Title The Univariate Flagging Algorithm (UFA): a Fully-Automated Approach for Identifying Optimal Thresholds in Data
Authors Mallory Sheth, Roy Welsch, Natasha Markuzon
Abstract In many data classification problems, there is no linear relationship between an explanatory variable and the dependent variable. Instead, there may be ranges of the input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that optimizes the difference between the separated areas while providing the maximum support. We evaluate its performance using three examples and demonstrate that thresholds identified by the algorithm align well with visual inspection and subject matter expertise. We also introduce two classification approaches that use UFA and show that the performance attained on unseen test data is equal to or better than that of more traditional classifiers. We demonstrate that the proposed algorithm is robust against missing data and noise, is scalable, and is easy to interpret and visualize. It is also well suited for problems where the incidence of the target is low.
Tasks
Published 2016-04-12
URL http://arxiv.org/abs/1604.03248v1
PDF http://arxiv.org/pdf/1604.03248v1.pdf
PWC https://paperswithcode.com/paper/the-univariate-flagging-algorithm-ufa-a-fully
Repo
Framework
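
As a rough reading of the idea, the sketch below scans candidate cut points on a single variable and scores each by the difference in outcome rate between the two sides, weighted by the support of the smaller side. The scoring rule and the min_support parameter are assumptions made for illustration, not the exact UFA criterion.

```python
import numpy as np

def univariate_threshold(x, y, min_support=20):
    """Scan candidate thresholds on a single variable and return the cut point
    that best separates the outcome rates on the two sides."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_t, best_score = None, -np.inf
    for i in range(min_support, len(x) - min_support):
        left, right = y[:i], y[i:]
        # difference of outcome rates, weighted by the support of the smaller side
        score = abs(left.mean() - right.mean()) * min(len(left), len(right))
        if score > best_score:
            best_t, best_score = (x[i - 1] + x[i]) / 2.0, score
    return best_t, best_score

# toy example: the outcome is much more likely above x = 0.7
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 2000)
y = (rng.uniform(0, 1, 2000) < np.where(x > 0.7, 0.6, 0.1)).astype(int)
print(univariate_threshold(x, y))
```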

Modeling the Sequence of Brain Volumes by Local Mesh Models for Brain Decoding

Title Modeling the Sequence of Brain Volumes by Local Mesh Models for Brain Decoding
Authors Itir Onal, Mete Ozay, Eda Mizrak, Ilke Oztekin, Fatos T. Yarman Vural
Abstract We represent the sequence of fMRI (Functional Magnetic Resonance Imaging) brain volumes recorded during a cognitive stimulus by a graph which consists of a set of local meshes. The corresponding cognitive process, encoded in the brain, is then represented by these meshes, each of which is estimated assuming a linear relationship among the voxel time series in a predefined locality. First, we define the concept of locality in two neighborhood systems, namely the spatial and functional neighborhoods. Then, we construct spatially and functionally local meshes around each voxel, called the seed voxel, by connecting it either to its spatial or functional p-nearest neighbors. The mesh formed around a voxel is a directed sub-graph with a star topology, where the direction of the edges is taken towards the seed voxel at the center of the mesh. We represent the time series recorded at each seed voxel as a linear combination of the time series of its p-nearest neighbors in the mesh. The relationships between a seed voxel and its neighbors are represented by the edge weights of each mesh, and are estimated by solving a linear regression equation. The estimated mesh edge weights lead to a better representation of information in the brain for encoding and decoding of cognitive tasks. We test our model on visual object recognition and emotional memory retrieval experiments using Support Vector Machines that are trained using the mesh edge weights as features. In the experimental analysis, we observe that the edge weights of the spatial and functional meshes perform better than state-of-the-art brain decoding models.
Tasks Brain Decoding, Object Recognition, Time Series
Published 2016-03-03
URL http://arxiv.org/abs/1603.01067v1
PDF http://arxiv.org/pdf/1603.01067v1.pdf
PWC https://paperswithcode.com/paper/modeling-the-sequence-of-brain-volumes-by
Repo
Framework
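
A minimal sketch of the mesh-weight features, assuming functional neighbourhoods chosen by correlation and ordinary least squares for the edge weights; the spatial-neighbourhood variant and the per-measurement estimation used in the paper are omitted:

```python
import numpy as np

def mesh_edge_weights(ts, p=5):
    """ts: (n_voxels, n_timepoints) voxel time series.
    For each seed voxel, pick its p most correlated ('functional') neighbours,
    regress the seed's series on theirs, and keep the coefficients as edge weights."""
    n_voxels = ts.shape[0]
    corr = np.corrcoef(ts)
    weights = np.zeros((n_voxels, p))
    for seed in range(n_voxels):
        order = np.argsort(-np.abs(corr[seed]))
        neighbours = [v for v in order if v != seed][:p]      # p-nearest functional neighbours
        A = ts[neighbours].T                                  # (n_timepoints, p)
        b = ts[seed]                                          # (n_timepoints,)
        weights[seed] = np.linalg.lstsq(A, b, rcond=None)[0]  # star-mesh edge weights
    return weights.ravel()                                    # concatenated feature vector

# toy usage: 30 voxels, 100 time points of synthetic data
rng = np.random.default_rng(0)
print(mesh_edge_weights(rng.normal(size=(30, 100))).shape)
```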

Online Rules for Control of False Discovery Rate and False Discovery Exceedance

Title Online Rules for Control of False Discovery Rate and False Discovery Exceedance
Authors Adel Javanmard, Andrea Montanari
Abstract Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. Given a set of null hypotheses $\mathcal{H}(n) = (H_1,\dotsc, H_n)$, Benjamini and Hochberg introduced the false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and proposed a testing procedure that controls FDR below a pre-assigned significance level. Nowadays FDR is the criterion of choice for large-scale multiple hypothesis testing. In this paper we consider the problem of controlling FDR in an “online manner”. Concretely, we consider an ordered –possibly infinite– sequence of null hypotheses $\mathcal{H} = (H_1,H_2,H_3,\dots )$ where, at each step $i$, the statistician must decide whether to reject hypothesis $H_i$ having access only to the previous decisions. This model was introduced by Foster and Stine. We study a class of “generalized alpha-investing” procedures and prove that any rule in this class controls online FDR, provided $p$-values corresponding to true nulls are independent of the other $p$-values. (Earlier work only established mFDR control.) Next, we obtain conditions under which generalized alpha-investing controls FDR in the presence of general $p$-value dependencies. Finally, we develop a modified set of procedures that also allow control of the false discovery exceedance (the tail of the proportion of false discoveries). Numerical simulations and analytical results indicate that online procedures do not incur a large loss in statistical power with respect to offline approaches, such as Benjamini-Hochberg.
Tasks
Published 2016-03-29
URL http://arxiv.org/abs/1603.09000v3
PDF http://arxiv.org/pdf/1603.09000v3.pdf
PWC https://paperswithcode.com/paper/online-rules-for-control-of-false-discovery
Repo
Framework
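
For intuition, here is a toy rule in the spirit of the alpha-investing family: each test spends part of the current wealth, and every rejection earns wealth back. The specific spending and payout choices below are illustrative defaults, not the generalized alpha-investing rules analysed in the paper.

```python
def alpha_investing(p_values, alpha=0.05, w0=None):
    """Toy alpha-investing rule: spend part of the current wealth on each test
    and earn wealth back on every rejection."""
    wealth = alpha / 2 if w0 is None else w0
    decisions = []
    for p in p_values:
        alpha_i = wealth / 2                     # spend half of the remaining wealth
        reject = p <= alpha_i / (1 + alpha_i)
        if reject:
            wealth += alpha                      # payout for a discovery
        else:
            wealth -= alpha_i                    # pay for a failed test
        decisions.append(reject)
    return decisions

# toy usage: a stream of p-values arriving one at a time
print(alpha_investing([1e-4, 2e-3, 0.4, 0.01, 0.7, 3e-5]))
```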

Dynamical Kinds and their Discovery

Title Dynamical Kinds and their Discovery
Authors Benjamin C. Jantzen
Abstract We demonstrate the possibility of classifying causal systems into kinds that share a common structure without first constructing an explicit dynamical model or using prior knowledge of the system dynamics. The algorithmic ability to determine whether arbitrary systems are governed by causal relations of the same form offers significant practical applications in the development and validation of dynamical models. It is also of theoretical interest as an essential stage in the scientific inference of laws from empirical data. The algorithm presented is based on the dynamical symmetry approach to dynamical kinds. A dynamical symmetry with respect to time is an intervention on one or more variables of a system that commutes with the time evolution of the system. A dynamical kind is a class of systems sharing a set of dynamical symmetries. The algorithm presented classifies deterministic, time-dependent causal systems by directly comparing their exhibited symmetries. Using simulated, noisy data from a variety of nonlinear systems, we show that this algorithm correctly sorts systems into dynamical kinds. It is robust under significant sampling error, is immune to violations of normality in sampling error, and fails gracefully with increasing dynamical similarity. The algorithm we demonstrate is the first to address this aspect of automated scientific discovery.
Tasks
Published 2016-12-15
URL http://arxiv.org/abs/1612.04933v1
PDF http://arxiv.org/pdf/1612.04933v1.pdf
PWC https://paperswithcode.com/paper/dynamical-kinds-and-their-discovery
Repo
Framework
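
The notion of a dynamical symmetry can be made concrete with a small numerical test: an intervention is a symmetry when applying it before or after time evolution gives the same state. The sketch below checks scaling interventions on two toy one-dimensional systems; the candidate interventions, tolerance, and example systems are choices made for illustration, and unlike the paper's algorithm it works from known equations rather than noisy empirical data.

```python
from scipy.integrate import solve_ivp

def evolve(f, x0, t=1.0):
    """Integrate dx/dt = f(t, x) from x0 for time t."""
    return solve_ivp(f, (0.0, t), [x0], rtol=1e-8).y[0, -1]

def commutes(f, scale, x0=0.5, t=1.0, tol=1e-3):
    """Is the intervention x -> scale * x a dynamical symmetry of f,
    i.e. does it commute with time evolution?"""
    return abs(evolve(f, scale * x0, t) - scale * evolve(f, x0, t)) < tol

exponential = lambda t, x: 0.7 * x              # linear growth
logistic    = lambda t, x: 0.7 * x * (1 - x)    # saturating growth

scales = [0.5, 1.5, 2.0]
sym_exp = {s for s in scales if commutes(exponential, s)}
sym_log = {s for s in scales if commutes(logistic, s)}
# systems of the same dynamical kind share their set of symmetries
print(sym_exp, sym_log, sym_exp == sym_log)
```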

STORE: Sparse Tensor Response Regression and Neuroimaging Analysis

Title STORE: Sparse Tensor Response Regression and Neuroimaging Analysis
Authors Will Wei Sun, Lexin Li
Abstract Motivated by applications in neuroimaging analysis, we propose a new regression model, Sparse TensOr REsponse regression (STORE), with a tensor response and a vector predictor. STORE embeds two key sparse structures: element-wise sparsity and low-rankness. It can handle both a non-symmetric and a symmetric tensor response, and thus is applicable to both structural and functional neuroimaging data. We formulate the parameter estimation as a non-convex optimization problem, and develop an efficient alternating updating algorithm. We establish a non-asymptotic estimation error bound for the actual estimator obtained from the proposed algorithm. This error bound reveals an interesting interaction between the computational efficiency and the statistical rate of convergence. When the distribution of the error tensor is Gaussian, we further obtain a fast estimation error rate which allows the tensor dimension to grow exponentially with the sample size. We illustrate the efficacy of our model through intensive simulations and an analysis of the Autism spectrum disorder neuroimaging data.
Tasks
Published 2016-09-15
URL http://arxiv.org/abs/1609.04523v3
PDF http://arxiv.org/pdf/1609.04523v3.pdf
PWC https://paperswithcode.com/paper/store-sparse-tensor-response-regression-and
Repo
Framework
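
To make the sparse low-rank structure concrete, here is a deliberately simplified special case: a rank-1, matrix-valued response Y_i ≈ (x_i · w) u v^T fitted by alternating least squares with soft-thresholding on the factors. This is only a sketch of the general idea; the actual STORE estimator handles higher-order tensors and general rank, and uses a different truncation scheme with accompanying theory.

```python
import numpy as np

def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def rank1_sparse_response_regression(X, Y, lam=0.05, n_iter=50):
    """Alternating updates for Y_i ~ (x_i . w) * u v^T with sparse u, v."""
    n, d = X.shape
    p, q = Y.shape[1], Y.shape[2]
    u, v, w = np.ones(p), np.ones(q), np.zeros(d)
    for _ in range(n_iter):
        # update the regression coefficients w given the rank-1 factor u v^T
        s = np.einsum('p,ipq,q->i', u, Y, v)          # <Y_i, u v^T>
        c = (u @ u) * (v @ v)
        w = np.linalg.lstsq(X, s / c, rcond=None)[0]
        t = X @ w
        # update the sparse factors u and v by thresholded least squares
        u = soft(np.einsum('i,ipq,q->p', t, Y, v) / ((v @ v) * (t @ t) + 1e-12), lam)
        v = soft(np.einsum('i,ipq,p->q', t, Y, u) / ((u @ u) * (t @ t) + 1e-12), lam)
    return u, v, w

# toy data with a sparse rank-1 signal
rng = np.random.default_rng(0)
n, d, p, q = 200, 3, 10, 12
u0 = np.r_[np.ones(3), np.zeros(p - 3)]
v0 = np.r_[np.ones(2), np.zeros(q - 2)]
w0 = np.array([1.0, -1.0, 0.0])
X = rng.normal(size=(n, d))
Y = np.einsum('i,p,q->ipq', X @ w0, u0, v0) + 0.1 * rng.normal(size=(n, p, q))
u, v, w = rank1_sparse_response_regression(X, Y)
print(np.nonzero(u)[0], np.nonzero(v)[0])   # recovered supports of u and v
```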

Cancerous Nuclei Detection and Scoring in Breast Cancer Histopathological Images

Title Cancerous Nuclei Detection and Scoring in Breast Cancer Histopathological Images
Authors Pegah Faridi, Habibollah Danyali, Mohammad Sadegh Helfroush, Mojgan Akbarzadeh Jahromi
Abstract Early detection and prognosis of breast cancer are feasible by utilizing histopathological grading of biopsy specimens. This research is focused on the detection and grading of nuclear pleomorphism in histopathological images of breast cancer. The proposed method consists of three internal steps. First, color unmixing of the H&E stain is performed in the preprocessing step. Second, nuclei boundaries are extracted around the centers of cancerous nuclei, which are detected by applying morphological operations and a Difference of Gaussians filter to the preprocessed image. Finally, segmented nuclei are scored to accomplish one parameter of the Nottingham grading system for breast cancer. In this approach, the nuclei area, chromatin density, contour regularity, and nucleoli presence are the features used for nuclear pleomorphism scoring. Experimental results showed that the proposed algorithm, with an accuracy of 86.6%, makes a significant advance in detecting cancerous nuclei compared to existing methods in the related literature.
Tasks
Published 2016-12-05
URL http://arxiv.org/abs/1612.01237v1
PDF http://arxiv.org/pdf/1612.01237v1.pdf
PWC https://paperswithcode.com/paper/cancerous-nuclei-detection-and-scoring-in
Repo
Framework
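
A rough sketch of the detection stage using scikit-image is given below: H&E colour unmixing, a morphological top-hat, Difference-of-Gaussians filtering, and peak picking on the hematoxylin channel. All parameter values are guesses, and the pleomorphism-scoring stage (nuclei area, chromatin density, contour regularity, nucleoli presence) is not reproduced.

```python
from skimage import io
from skimage.color import rgb2hed
from skimage.filters import difference_of_gaussians
from skimage.morphology import white_tophat, disk
from skimage.feature import peak_local_max

def detect_nuclei_centres(rgb_image, blob_sigma=(2, 8), min_distance=10):
    """Detection stage only: H&E colour unmixing, top-hat enhancement,
    Difference-of-Gaussians band-pass filtering, and peak picking on the
    hematoxylin (nuclei-rich) channel."""
    hed = rgb2hed(rgb_image)                       # unmix the H&E stains
    hematoxylin = hed[..., 0]
    enhanced = white_tophat(hematoxylin, disk(8))  # emphasise small bright structures
    blobs = difference_of_gaussians(enhanced, *blob_sigma)
    return peak_local_max(blobs, min_distance=min_distance, threshold_rel=0.2)

# usage (the file path is a placeholder):
# centres = detect_nuclei_centres(io.imread("biopsy_patch.png"))
# print(centres.shape)   # (n_detected_nuclei, 2) row/col coordinates
```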

Learning Multi-Scale Deep Features for High-Resolution Satellite Image Classification

Title Learning Multi-Scale Deep Features for High-Resolution Satellite Image Classification
Authors Qingshan Liu, Renlong Hang, Huihui Song, Zhi Li
Abstract In this paper, we propose a multi-scale deep feature learning method for high-resolution satellite image classification. Specifically, we first warp the original satellite image into multiple different scales. The images at each scale are employed to train a deep convolutional neural network (DCNN). However, simultaneously training multiple DCNNs is time-consuming. To address this issue, we explore a DCNN with spatial pyramid pooling (SPP-net). Since the different SPP-nets have the same number of parameters and share identical initial values, only the fully-connected layers need to be fine-tuned for each network, which greatly accelerates the training process. Then, the multi-scale satellite images are fed into their corresponding SPP-nets to extract multi-scale deep features. Finally, a multiple kernel learning method is developed to automatically learn the optimal combination of such features. Experiments on two difficult datasets show that the proposed method achieves favorable performance compared to other state-of-the-art methods.
Tasks Image Classification
Published 2016-11-11
URL http://arxiv.org/abs/1611.03591v1
PDF http://arxiv.org/pdf/1611.03591v1.pdf
PWC https://paperswithcode.com/paper/learning-multi-scale-deep-features-for-high
Repo
Framework
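
The mechanism that lets one network handle inputs at several scales is spatial pyramid pooling, which maps feature maps of any spatial size to a fixed-length vector. Below is a minimal PyTorch sketch with arbitrarily chosen pyramid levels; the full SPP-net training and the multiple kernel learning combination are not reproduced.

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Pool a (batch, channels, H, W) feature map into a fixed-length vector,
    regardless of H and W, by max-pooling over 1x1, 2x2 and 4x4 grids."""
    pooled = [F.adaptive_max_pool2d(feature_map, output_size=l).flatten(1)
              for l in levels]
    return torch.cat(pooled, dim=1)          # (batch, channels * sum(l * l))

# feature maps from two different image scales yield vectors of the same length
small = torch.randn(1, 256, 7, 7)
large = torch.randn(1, 256, 13, 13)
print(spatial_pyramid_pool(small).shape, spatial_pyramid_pool(large).shape)
```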

Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots

Title Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots
Authors Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
Abstract A key requirement for the current generation of artificial decision-makers is that they should adapt well to changes in unexpected situations. This paper addresses the situation in which an AI for aerial dog fighting, with tunable parameters that govern its behavior, must optimize behavior with respect to an objective function that is evaluated and learned through simulations. Bayesian optimization with a Gaussian Process surrogate is used as the method for investigating the objective function. One key benefit is that during optimization, the Gaussian Process learns a global estimate of the true objective function, with predicted outcomes and a statistical measure of confidence in areas that haven’t been investigated yet. Having a model of the objective function is important for being able to understand possible outcomes in the decision space; for example, this is crucial for training and providing feedback to human pilots. However, standard Bayesian optimization does not perform consistently or provide an accurate Gaussian Process surrogate function for highly volatile objective functions. We treat these problems by introducing a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This technique gives the AI the ability to learn optimum behaviors in a highly uncertain environment. More importantly, it not only improves the reliability of the optimization, but also creates a better model of the entire objective surface. With this improved model the agent is equipped to predict performance in unexplored scenarios more accurately and efficiently.
Tasks
Published 2016-12-13
URL http://arxiv.org/abs/1612.04315v1
PDF http://arxiv.org/pdf/1612.04315v1.pdf
PWC https://paperswithcode.com/paper/towards-adaptive-training-of-agent-based
Repo
Framework
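
A toy reading of the approach: Bayesian optimization with a Gaussian Process surrogate in which each selected setting is simulated several times (repeat sampling) and several candidates are proposed per iteration (multi-point sampling). The objective, kernel, acquisition rule, and the particular way repeats and proposals are combined below are illustrative assumptions, not the authors' Hybrid Repeat/Multi-point Sampling scheme.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def noisy_objective(x, rng):
    # stand-in for a volatile, noisy simulation score (e.g. dog-fight outcome)
    return np.sin(3 * x) * (1 - x) + rng.normal(scale=0.3)

def bayes_opt(n_iter=15, n_points=3, n_repeats=4, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0, 2, 200).reshape(-1, 1)      # 1-D tunable parameter
    gp = GaussianProcessRegressor(Matern(nu=2.5) + WhiteKernel(), normalize_y=True)
    X, y = [], []
    for x0 in rng.uniform(0, 2, 3):                   # random initial design
        X.append([x0])
        y.append(np.mean([noisy_objective(x0, rng) for _ in range(n_repeats)]))
    for _ in range(n_iter):
        gp.fit(np.array(X), np.array(y))
        mu, sd = gp.predict(grid, return_std=True)
        ucb = mu + 2.0 * sd                           # upper-confidence-bound acquisition
        for idx in np.argsort(-ucb)[:n_points]:       # multi-point: several candidates per iteration
            x_new = grid[idx, 0]
            # repeat sampling: average several noisy runs at the same setting
            y_new = np.mean([noisy_objective(x_new, rng) for _ in range(n_repeats)])
            X.append([x_new])
            y.append(y_new)
    gp.fit(np.array(X), np.array(y))
    return grid[np.argmax(gp.predict(grid)), 0]       # best predicted parameter setting

print(bayes_opt())
```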

Utilitarian Distributed Constraint Optimization Problems

Title Utilitarian Distributed Constraint Optimization Problems
Authors Julien Savaux, Julien Vion, Sylvain Piechowiak, René Mandiau, Toshihiro Matsui, Katsutoshi Hirayama, Makoto Yokoo, Shakre Elmane, Marius Silaghi
Abstract Privacy has been a major motivation for distributed problem optimization. However, even though several methods have been proposed to evaluate it, none of them is widely used. The Distributed Constraint Optimization Problem (DCOP) is a fundamental model used to approach various families of distributed problems. As privacy loss does not occur when a solution is accepted, but when it is proposed, privacy requirements cannot be interpreted as a criterion of the objective function of the DCOP. Here we approach the problem by letting both the optimized costs found in DCOPs and the privacy requirements guide the agents’ exploration of the search space. We introduce the Utilitarian Distributed Constraint Optimization Problem (UDCOP), where the costs and the privacy requirements are used as parameters of a heuristic modifying the search process. Common stochastic algorithms for decentralized constraint optimization problems are evaluated here according to how well they preserve privacy. Further, we propose extensions in which these solvers modify their search process to take their privacy requirements into account, succeeding in significantly reducing their privacy loss without significant degradation of the solution quality.
Tasks
Published 2016-04-22
URL http://arxiv.org/abs/1604.06787v1
PDF http://arxiv.org/pdf/1604.06787v1.pdf
PWC https://paperswithcode.com/paper/utilitarian-distributed-constraint
Repo
Framework

Finding Proofs in Tarskian Geometry

Title Finding Proofs in Tarskian Geometry
Authors Michael Beeson, Larry Wos
Abstract We report on a project to use a theorem prover to find proofs of the theorems in Tarskian geometry. These theorems start with fundamental properties of betweenness, proceed through the derivations of several famous theorems due to Gupta, and end with the derivation from Tarski’s axioms of Hilbert’s 1899 axioms for geometry. They include the four challenge problems left unsolved by Quaife, who two decades ago found some Otter proofs in Tarskian geometry (solving challenges issued in Wos’s 1998 book). There are 212 theorems in this collection. We were able to find Otter proofs of all these theorems. We developed a methodology for the automated preparation and checking of the input files for those theorems, to ensure that no human error has corrupted the formal development of an entire theory as embodied in two hundred input files and proofs. We distinguish between proofs that were found completely mechanically (without reference to the steps of a book proof) and proofs that were constructed by some technique that involved a human knowing the steps of a book proof. Proofs of length 40–100, roughly speaking, are difficult exercises for a human, and proofs of 100–250 steps belong in a Ph.D. thesis or publication. 29 of the proofs in our collection are longer than 40 steps, and ten are longer than 90 steps. We were able to derive completely mechanically all but 26 of the 183 theorems that have “short” proofs (40 or fewer deduction steps). We found proofs of the rest, as well as the 29 “hard” theorems, using a method that requires consulting the book proof at the outset. Our “subformula strategy” enabled us to prove four of the 29 hard theorems completely mechanically. These are Ph.D.-level proofs, of length up to 108.
Tasks
Published 2016-06-22
URL http://arxiv.org/abs/1606.07095v1
PDF http://arxiv.org/pdf/1606.07095v1.pdf
PWC https://paperswithcode.com/paper/finding-proofs-in-tarskian-geometry
Repo
Framework

A Three Spatial Dimension Wave Latent Force Model for Describing Excitation Sources and Electric Potentials Produced by Deep Brain Stimulation

Title A Three Spatial Dimension Wave Latent Force Model for Describing Excitation Sources and Electric Potentials Produced by Deep Brain Stimulation
Authors Pablo A. Alvarado, Mauricio A. Álvarez, Álvaro A. Orozco
Abstract Deep brain stimulation (DBS) is a surgical treatment for Parkinson’s Disease. Static models based on the quasi-static approximation are common approaches for DBS modeling. While this simplification has been validated for bioelectric sources, its application to rapid stimulation pulses, which contain more high-frequency power, may not be appropriate, as DBS therapeutic results depend on stimulus parameters such as frequency and pulse width, which are related to time variations of the electric field. We propose an alternative hybrid approach based on probabilistic models and differential equations, using Gaussian processes and the wave equation. Our model avoids the quasi-static approximation; moreover, it is able to describe the dynamic behavior of DBS. Therefore, the proposed model may be used to obtain a more realistic description of the phenomenon. The proposed model can also solve inverse problems, i.e. recover the corresponding source of excitation given the electric potential distribution. The electric potential produced by a time-varying source was predicted using the proposed model. For static sources, the electric potential produced by different electrode configurations was modeled. Four different sources of excitation were recovered by solving the inverse problem. We compare our outcomes with the electric potential obtained by solving Poisson’s equation using the Finite Element Method (FEM). Our approach is able to take into account time variations of the source and the produced field. Also, the inverse problem can be addressed using the proposed model. The electric potential calculated with the proposed model is close to the potential obtained by solving Poisson’s equation using FEM.
Tasks Gaussian Processes
Published 2016-08-17
URL http://arxiv.org/abs/1608.04972v1
PDF http://arxiv.org/pdf/1608.04972v1.pdf
PWC https://paperswithcode.com/paper/a-three-spatial-dimension-wave-latent-force
Repo
Framework

Deep learning trends for focal brain pathology segmentation in MRI

Title Deep learning trends for focal brain pathology segmentation in MRI
Authors Mohammad Havaei, Nicolas Guizard, Hugo Larochelle, Pierre-Marc Jodoin
Abstract Segmentation of focal (localized) brain pathologies, such as brain tumors and brain lesions caused by multiple sclerosis and ischemic strokes, is necessary for medical diagnosis, surgical planning and monitoring disease development, as well as for other applications such as tractography. Over the years, attempts have been made to automate this process for both clinical and research reasons. In this regard, machine learning methods have long been a focus of attention. Over the past two years, the medical imaging field has seen a rise in the use of a particular branch of machine learning commonly known as deep learning. In the non-medical computer vision world, deep learning based methods have obtained state-of-the-art results on many datasets. Recent studies in computer-aided diagnostics have shown deep learning methods (especially convolutional neural networks, CNNs) to yield promising results. In this chapter, we provide a survey of CNN methods applied to medical imaging with a focus on brain pathology segmentation. In particular, we discuss their characteristic peculiarities and the specific configurations and adjustments that are best suited to segmenting medical images. We also underline the intrinsic differences deep learning methods have with other machine learning methods.
Tasks Medical Diagnosis
Published 2016-07-18
URL http://arxiv.org/abs/1607.05258v3
PDF http://arxiv.org/pdf/1607.05258v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-trends-for-focal-brain
Repo
Framework

Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation

Title Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation
Authors Konstantinos Benidis, Ying Sun, Prabhu Babu, Daniel P. Palomar
Abstract The problem of estimating sparse eigenvectors of a symmetric matrix has attracted a lot of attention in many applications, especially those with high-dimensional data sets. While classical eigenvectors can be obtained as the solution of a maximization problem, existing approaches formulate this problem by adding a penalty term to the objective function that encourages a sparse solution. However, the resulting methods achieve sparsity at the expense of sacrificing the orthogonality property. In this paper, we develop a new method to estimate dominant sparse eigenvectors without trading off their orthogonality. The problem is highly non-convex and hard to handle. We apply the MM framework, in which we iteratively maximize a tight lower bound (surrogate function) of the objective function over the Stiefel manifold. The inner maximization problem turns out to be a rectangular Procrustes problem, which has a closed-form solution. In addition, we propose a method to improve covariance estimation when the underlying eigenvectors are known to be sparse. We use the eigenvalue decomposition of the covariance matrix to formulate an optimization problem in which we impose sparsity on the corresponding eigenvectors. Numerical experiments show that the proposed eigenvector extraction algorithm matches or outperforms existing algorithms in terms of support recovery and explained variance, while the covariance estimation algorithms significantly improve on the sample covariance estimator.
Tasks
Published 2016-02-12
URL http://arxiv.org/abs/1602.03992v1
PDF http://arxiv.org/pdf/1602.03992v1.pdf
PWC https://paperswithcode.com/paper/orthogonal-sparse-pca-and-covariance
Repo
Framework
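
The inner step, a rectangular Procrustes problem, has a closed form worth spelling out: the orthonormal maximizer of tr(X^T M) is U V^T from the thin SVD of M. The toy loop below uses that step as an MM update for the plain (non-sparse) leading-subspace problem; the sparsity penalty the paper folds into its surrogate, and the covariance estimation part, are omitted.

```python
import numpy as np

def procrustes_max(M):
    """Closed form of max tr(X^T M) over orthonormal X (X^T X = I):
    X* = U V^T, where M = U diag(s) V^T is the thin SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def leading_subspace_mm(S, k, n_iter=200):
    """MM-style iteration for max tr(X^T S X) on the Stiefel manifold:
    at the current iterate X_t, the linear minorizer tr(X^T S X_t) is
    maximized in closed form by a Procrustes step."""
    rng = np.random.default_rng(0)
    X = np.linalg.qr(rng.normal(size=(S.shape[0], k)))[0]
    for _ in range(n_iter):
        X = procrustes_max(S @ X)
    return X

# toy check against the top eigenvalues of a sample covariance matrix
rng = np.random.default_rng(1)
data = rng.normal(size=(500, 8)) * np.array([3.0, 2.5, 2.0, 1, 1, 1, 1, 1])
S = np.cov(data, rowvar=False)
X = leading_subspace_mm(S, k=3)
print(np.trace(X.T @ S @ X), np.linalg.eigvalsh(S)[::-1][:3].sum())  # should match closely
```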