Paper Group ANR 794
Automatic segmentation method of pelvic floor levator hiatus in ultrasound using a self-normalising neural network
Title | Automatic segmentation method of pelvic floor levator hiatus in ultrasound using a self-normalising neural network |
Authors | Ester Bonmati, Yipeng Hu, Nikhil Sindhwani, Hans Peter Dietz, Jan D’hooge, Dean Barratt, Jan Deprest, Tom Vercauteren |
Abstract | Segmentation of the levator hiatus in ultrasound allows the extraction of biometrics that are important for pelvic floor disorder assessment. In this work, we present a fully automatic method using a convolutional neural network (CNN) to outline the levator hiatus in a 2D image extracted from a 3D ultrasound volume. In particular, our method uses a recently developed scaled exponential linear unit (SELU) as a nonlinear self-normalising activation function, which is applied here for the first time in medical imaging with a CNN. SELU has important advantages such as being parameter-free and mini-batch independent, which may help to overcome memory constraints during training. A dataset with 91 images from 35 patients during Valsalva, contraction and rest, all labelled by three operators, is used for training and evaluation in a leave-one-patient-out cross-validation. Results show a median Dice similarity coefficient of 0.90 with an interquartile range of 0.08, with equivalent performance to the three operators (with a Williams’ index of 1.03), and outperforming a U-Net architecture without the need for batch normalisation. We conclude that the proposed fully automatic method achieved equivalent accuracy in segmenting the pelvic floor levator hiatus compared to a previous semi-automatic approach. |
Tasks | |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06452v1 |
PDF | http://arxiv.org/pdf/1712.06452v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-segmentation-method-of-pelvic-floor |
Repo | |
Framework | |
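The abstract of this entry names two quantities that are easy to state in code: the SELU activation and the Dice similarity coefficient. The sketch below is only an illustration, not the authors' implementation. The two constants are the standard published SELU values, and the function names `selu` and `dice` are hypothetical helpers chosen for this example.

```python
import math

# Standard SELU constants (assumed here; they are not given on this page).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit applied to a single value."""
    if x > 0:
        return SELU_SCALE * x
    # Negative side: a scaled exponential that saturates at -SCALE * ALPHA.
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def dice(pred, truth) -> float:
    """Dice similarity coefficient between two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed).
    return 1.0 if total == 0 else 2.0 * intersection / total
```

Because SELU has no learned parameters and does not depend on batch statistics, it can be applied element by element as above, which is the property the abstract contrasts with batch normalisation.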
Deep vs. Diverse Architectures for Classification Problems
Title | Deep vs. Diverse Architectures for Classification Problems |
Authors | Colleen M. Farrelly |
Abstract | This study compares various superlearner and deep learning architectures (machine-learning-based and neural-network-based) for classification problems across several simulated and industrial datasets to assess performance and computational efficiency, as both methods have nice theoretical convergence properties. Superlearner formulations outperform other methods at small to moderate sample sizes (500-2500) on nonlinear and mixed linear/nonlinear predictor relationship datasets, while deep neural networks perform well on linear predictor relationship datasets of all sizes. This suggests faster convergence of the superlearner compared to deep neural network architectures on many messy classification problems for real-world data. Superlearners also yield interpretable models, allowing users to examine important signals in the data; in addition, they offer flexible formulation, where users can retain good performance with low-computational-cost base algorithms. K-nearest-neighbor (KNN) regression demonstrates improvements using the superlearner framework, as well; KNN superlearners consistently outperform deep architectures and KNN regression, suggesting that superlearners may be better able to capture local and global geometric features through utilizing a variety of algorithms to probe the data space. |
Tasks | |
Published | 2017-08-21 |
URL | http://arxiv.org/abs/1708.06347v1 |
PDF | http://arxiv.org/pdf/1708.06347v1.pdf |
PWC | https://paperswithcode.com/paper/deep-vs-diverse-architectures-for |
Repo | |
Framework | |
Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach
Title | Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach |
Authors | Roel Dobbe, David Fridovich-Keil, Claire Tomlin |
Abstract | Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available, and instead we learn local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem. Our main contribution is an information theoretic framework based on rate distortion theory which facilitates analysis of how well the resulting fully decentralized policies are able to reconstruct the optimal solution. Moreover, this framework provides a natural extension that addresses which nodes an agent should communicate with to improve the performance of its individual policy. |
Tasks | |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06334v2 |
PDF | http://arxiv.org/pdf/1707.06334v2.pdf |
PWC | https://paperswithcode.com/paper/fully-decentralized-policies-for-multi-agent |
Repo | |
Framework | |
Super-Resolution with Deep Adaptive Image Resampling
Title | Super-Resolution with Deep Adaptive Image Resampling |
Authors | Xu Jia, Hong Chang, Tinne Tuytelaars |
Abstract | Deep learning based methods have recently pushed the state-of-the-art on the problem of Single Image Super-Resolution (SISR). In this work, we revisit the more traditional interpolation-based methods that were popular before, now with the help of deep learning. In particular, we propose to use a Convolutional Neural Network (CNN) to estimate spatially variant interpolation kernels and apply the estimated kernels adaptively to each position in the image. The whole model is trained in an end-to-end manner. We explore two ways to improve the results for the case of large upscaling factors, and propose a recursive extension of our basic model. This achieves results that are on par with state-of-the-art methods. We visualize the estimated adaptive interpolation kernels to gain more insight into the effectiveness of the proposed method. We also extend the method to the task of joint image filtering and again achieve state-of-the-art performance. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06463v1 |
PDF | http://arxiv.org/pdf/1712.06463v1.pdf |
PWC | https://paperswithcode.com/paper/super-resolution-with-deep-adaptive-image |
Repo | |
Framework | |
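The core idea in this entry's abstract, applying a different interpolation kernel at each image position, can be sketched as follows. This is a simplified illustration under stated assumptions: the kernels are supplied by hand rather than estimated by a CNN, the upscaling step is left out, kernels are fixed at 3x3, and borders are handled by clamping. The function name `apply_adaptive_kernels` is hypothetical.

```python
def apply_adaptive_kernels(image, kernels):
    """Filter `image` with a separate 3x3 kernel at every pixel.

    image:   H x W nested list of floats.
    kernels: H x W nested list, each item a 3x3 nested list of weights.
    Border pixels reuse the nearest valid pixel (clamping, an assumption).
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            kernel = kernels[y][x]  # spatially variant: one kernel per pixel
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += kernel[dy + 1][dx + 1] * image[yy][xx]
            output[y][x] = acc
    return output
```

With an identity kernel at every position the image is returned unchanged; replacing the kernel at a single pixel changes only that output value, which is what distinguishes this from an ordinary convolution with one shared kernel.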
Skin lesion segmentation based on preprocessing, thresholding and neural networks
Title | Skin lesion segmentation based on preprocessing, thresholding and neural networks |
Authors | Juana M. Gutiérrez-Arriola, Marta Gómez-Álvarez, Victor Osma-Ruiz, Nicolás Sáenz-Lechón, Rubén Fraile |
Abstract | This abstract describes the segmentation system used to participate in the ISIC 2017 challenge: Skin Lesion Analysis Towards Melanoma Detection. Several preprocessing techniques have been tested for three color representations (RGB, YCbCr and HSV) of 392 images. The results have been used to choose the best preprocessing for each channel. In each case, a neural network is trained to predict the Jaccard Index based on object characteristics. The system includes black frame and reference circle detection algorithms, but no special treatment is done for hair removal. Segmentation is performed in two steps. First, the best channel to be segmented is chosen by selecting the best neural network output. If this output does not predict a Jaccard Index over 0.5, a more aggressive preprocessing is performed using open and close morphological operations, and the segmentation of the channel that obtains the best output from the neural networks is selected as the lesion. |
Tasks | Lesion Segmentation |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04845v1 |
PDF | http://arxiv.org/pdf/1703.04845v1.pdf |
PWC | https://paperswithcode.com/paper/skin-lesion-segmentation-based-on |
Repo | |
Framework | |
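The two-step selection rule described in this entry's abstract can be sketched in a few lines. This is a hedged illustration only: the predicted scores would come from the trained neural networks, which are not reproduced here, and the names `jaccard` and `select_segmentation` as well as the `(score, mask)` pair layout are assumptions made for the example.

```python
def jaccard(pred, truth) -> float:
    """Jaccard index (intersection over union) of two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks are treated as identical (a convention, assumed).
    return 1.0 if union == 0 else intersection / union


def select_segmentation(candidates, fallback_candidates, threshold=0.5):
    """Pick a segmentation using predicted Jaccard scores.

    candidates, fallback_candidates: lists of (predicted_score, mask) pairs,
    one per color channel. The fallback list stands for the channels after
    the more aggressive morphological preprocessing.
    """
    best_score, best_mask = max(candidates, key=lambda item: item[0])
    if best_score > threshold:
        return best_mask
    # Step two: no channel is predicted to exceed the threshold.
    return max(fallback_candidates, key=lambda item: item[0])[1]
```

In a fuller version the fallback candidates would be computed lazily, only when the first step fails, to avoid running the extra preprocessing on every image.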
A Decidable Very Expressive Description Logic for Databases (Extended Version)
Title | A Decidable Very Expressive Description Logic for Databases (Extended Version) |
Authors | Alessandro Artale, Enrico Franconi, Rafael Peñaloza, Francesco Sportelli |
Abstract | We introduce $\mathcal{DLR}^+$, an extension of the n-ary propositionally closed description logic $\mathcal{DLR}$ to deal with attribute-labelled tuples (generalising the positional notation), projections of relations, and global and local objectification of relations, able to express inclusion, functional, key, and external uniqueness dependencies. The logic is equipped with both TBox and ABox axioms. We show how a simple syntactic restriction on the appearance of projections sharing common attributes in a $\mathcal{DLR}^+$ knowledge base makes reasoning in the language decidable with the same computational complexity as $\mathcal{DLR}$. The obtained $\mathcal{DLR}^\pm$ n-ary description logic is able to encode more thoroughly conceptual data models such as EER, UML, and ORM. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08468v1 |
PDF | http://arxiv.org/pdf/1707.08468v1.pdf |
PWC | https://paperswithcode.com/paper/a-decidable-very-expressive-description-logic |
Repo | |
Framework | |
Detection for 5G-NOMA: An Online Adaptive Machine Learning Approach
Title | Detection for 5G-NOMA: An Online Adaptive Machine Learning Approach |
Authors | Daniyal Amir Awan, Renato L. G. Cavalcante, Masahiro Yukawa, Slawomir Stanczak |
Abstract | Non-orthogonal multiple access (NOMA) has emerged as a promising radio access technique for enabling the performance enhancements promised by the fifth-generation (5G) networks in terms of connectivity, low latency, and high spectrum efficiency. In the NOMA uplink, successive interference cancellation (SIC) based detection with device clustering has been suggested. In the case of multiple receive antennas, SIC can be combined with the minimum mean-squared error (MMSE) beamforming. However, there exists a tradeoff between the NOMA cluster size and the incurred SIC error. Larger clusters lead to larger errors but they are desirable from the spectrum efficiency and connectivity point of view. We propose a novel online learning based detection for the NOMA uplink. In particular, we design an online adaptive filter in the sum space of linear and Gaussian reproducing kernel Hilbert spaces (RKHSs). Such a sum space design is robust against variations of a dynamic wireless network that can deteriorate the performance of a purely nonlinear adaptive filter. We demonstrate by simulations that the proposed method outperforms the MMSE-SIC based detection for large cluster sizes. |
Tasks | |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00355v2 |
PDF | http://arxiv.org/pdf/1711.00355v2.pdf |
PWC | https://paperswithcode.com/paper/detection-for-5g-noma-an-online-adaptive |
Repo | |
Framework | |
Cross-language Framework for Word Recognition and Spotting of Indic Scripts
Title | Cross-language Framework for Word Recognition and Spotting of Indic Scripts |
Authors | Ayan Kumar Bhunia, Partha Pratim Roy, Akash Mohta, Umapada Pal |
Abstract | Handwritten word recognition and spotting of low-resource scripts are difficult because sufficient training data is not available and collecting data for such scripts is often expensive. This paper presents a novel cross-language platform for handwritten word recognition and spotting for such low-resource scripts, where training is performed with a sufficiently large dataset of an available script (considered as the source script) and testing is done on other scripts (considered as target scripts). Training with one source script and testing with another script to obtain a reasonable result is not easy in the handwriting domain due to the complex nature of handwriting variability among scripts. It is also difficult to map between source and target characters when they appear in cursive word images. The proposed Indic cross-language framework exploits a large dataset for training and uses it for recognizing and spotting text of other target scripts where a sufficient amount of training data is not available. Since Indic scripts are mostly written in three zones, namely upper, middle and lower, we employ zone-wise character (or component) mapping for efficient learning. The performance of our cross-language framework depends on the extent of similarity between the source and target scripts. Hence, we devise an entropy-based script similarity score using source-to-target character mapping that indicates the feasibility of cross-language transcription. We have tested our approach on three Indic scripts, namely Bangla, Devanagari and Gurumukhi, and the corresponding results are reported. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.06908v2 |
PDF | http://arxiv.org/pdf/1712.06908v2.pdf |
PWC | https://paperswithcode.com/paper/cross-language-framework-for-word-recognition |
Repo | |
Framework | |
Characterization of Hemodynamic Signal by Learning Multi-View Relationships
Title | Characterization of Hemodynamic Signal by Learning Multi-View Relationships |
Authors | Eric Lei, Kyle Miller, Michael R. Pinsky, Artur Dubrawski |
Abstract | Multi-view data are increasingly prevalent in practice. It is often relevant to analyze the relationships between pairs of views by multi-view component analysis techniques such as Canonical Correlation Analysis (CCA). However, data may easily exhibit nonlinear relations, which CCA cannot reveal. We aim to investigate the usefulness of nonlinear multi-view relations to characterize multi-view data in an explainable manner. To address this challenge, we propose a method to characterize globally nonlinear multi-view relationships as a mixture of linear relationships. A clustering method, it identifies partitions of observations that exhibit the same relationships and learns those relationships simultaneously. It defines cluster variables by multi-view rather than spatial relationships, unlike almost all other clustering methods. Furthermore, we introduce a supervised classification method that builds on our clustering method by employing multi-view relationships as discriminative factors. The value of these methods resides in their capability to find useful structure in the data that single-view or current multi-view methods may struggle to find. We demonstrate the potential utility of the proposed approach using an application in clinical informatics to detect and characterize slow bleeding in patients whose central venous pressure (CVP) is monitored at the bedside. Presently, CVP is considered an insensitive measure of a subject’s intravascular volume status or its change. However, we reason that features of CVP during inspiration and expiration should be informative in early identification of emerging changes of patient status. We empirically show how the proposed method can help discover and analyze multiple-to-multiple correlations, which could be nonlinear or vary throughout the population, by finding explainable structure of operational interest to practitioners. |
Tasks | |
Published | 2017-09-17 |
URL | https://arxiv.org/abs/1709.05602v2 |
PDF | https://arxiv.org/pdf/1709.05602v2.pdf |
PWC | https://paperswithcode.com/paper/learning-mixtures-of-multi-output-regression |
Repo | |
Framework | |
Indoor Localization Using Visible Light Via Fusion Of Multiple Classifiers
Title | Indoor Localization Using Visible Light Via Fusion Of Multiple Classifiers |
Authors | Xiansheng Guo, Sihua Shao, Nirwan Ansari, Abdallah Khreishah |
Abstract | A multiple-classifier fusion localization technique using received signal strengths (RSSs) of visible light is proposed, in which the system transmits different intensity-modulated sinusoidal signals from LEDs and the signals are received by a photodiode (PD) placed at various grid points. First, we obtain approximate RSS fingerprints by capturing the peaks of the power spectral density (PSD) of the received signals at each given grid point. Unlike existing RSS-based algorithms, several representative machine learning approaches are adopted to train multiple classifiers based on these RSS fingerprints. The multiple-classifier localization estimators outperform the classical RSS-based LED localization approaches in accuracy and robustness. To further improve the localization performance, two robust fusion localization algorithms, namely grid independent least square (GI-LS) and grid dependent least square (GD-LS), are proposed to combine the outputs of these classifiers. We also use a singular value decomposition (SVD) based LS (LS-SVD) method to mitigate the numerical stability problem when the prediction matrix is singular. Experiments conducted on intensity modulated direct detection (IM/DD) systems have demonstrated the effectiveness of the proposed algorithms. The experimental results show that the probability of having a mean square positioning error (MSPE) of less than 5 cm achieved by GD-LS is improved by 93.03% and 93.15%, respectively, compared to the RSS ratio (RSSR) and RSS matching methods with an FFT length of 2000. |
Tasks | |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02184v2 |
PDF | http://arxiv.org/pdf/1703.02184v2.pdf |
PWC | https://paperswithcode.com/paper/indoor-localization-using-visible-light-via |
Repo | |
Framework | |
Multi-modal Face Pose Estimation with Multi-task Manifold Deep Learning
Title | Multi-modal Face Pose Estimation with Multi-task Manifold Deep Learning |
Authors | Chaoqun Hong, Jun Yu |
Abstract | Human face pose estimation aims at estimating the gazing direction or head posture from 2D images. It provides important information for applications such as communicative gestures and saliency detection, and it has attracted plenty of attention recently. However, it is challenging because of complex backgrounds, various orientations and face appearance visibility. Therefore, a descriptive representation of face images and a mapping from it to poses are critical. In this paper, we make use of multi-modal data and propose a novel face pose estimation method that uses a novel deep learning framework named Multi-task Manifold Deep Learning ($M^2DL$). It is based on feature extraction with improved deep neural networks and a multi-modal mapping relationship learned with multi-task learning. In the proposed deep learning based framework, Manifold Regularized Convolutional Layers (MRCL) improve traditional convolutional layers by learning the relationship among outputs of neurons. Besides, in the proposed mapping relationship learning method, different modalities of face representations are naturally combined to learn the mapping function from face images to poses. In this way, the computed mapping model with multiple tasks is improved. Experimental results on three challenging benchmark datasets, DPOSE, HPID and BKHPD, demonstrate the outstanding performance of $M^2DL$. |
Tasks | Multi-Task Learning, Pose Estimation, Saliency Detection |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06467v1 |
PDF | http://arxiv.org/pdf/1712.06467v1.pdf |
PWC | https://paperswithcode.com/paper/multi-modal-face-pose-estimation-with-multi |
Repo | |
Framework | |
Classification on Large Networks: A Quantitative Bound via Motifs and Graphons
Title | Classification on Large Networks: A Quantitative Bound via Motifs and Graphons |
Authors | Andreas Haupt, Mohammad Khatami, Thomas Schultz, Ngoc Mai Tran |
Abstract | When each data point is a large graph, graph statistics such as densities of certain subgraphs (motifs) can be used as feature vectors for machine learning. While intuitive, motif counts are expensive to compute and difficult to work with theoretically. Via graphon theory, we give an explicit quantitative bound for the ability of motif homomorphisms to distinguish large networks under both generative and sampling noise. Furthermore, we give similar bounds for the graph spectrum and connect it to homomorphism densities of cycles. This results in an easily computable classifier on graph data with theoretical performance guarantee. Our method yields competitive results on classification tasks for the autoimmune disease Lupus Erythematosus. |
Tasks | |
Published | 2017-10-24 |
URL | http://arxiv.org/abs/1710.08878v1 |
PDF | http://arxiv.org/pdf/1710.08878v1.pdf |
PWC | https://paperswithcode.com/paper/classification-on-large-networks-a |
Repo | |
Framework | |
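The link between the graph spectrum and homomorphism densities of cycles mentioned in this entry's abstract rests on a standard identity: the number of homomorphisms from the k-cycle into a graph equals the trace of the k-th power of its adjacency matrix, which is the sum of the k-th powers of its eigenvalues. The sketch below computes the resulting density for a small graph. It illustrates the identity only, not the paper's classifier or bounds, and the function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
        for i in range(n)
    ]


def cycle_homomorphism_density(adjacency, k: int) -> float:
    """Homomorphism density of the k-cycle in a simple graph.

    Uses hom(C_k, G) = trace(A^k), normalised by n^k, the number of all
    maps from the k cycle vertices into the n graph vertices.
    """
    if k < 3:
        raise ValueError("cycles need at least 3 vertices")
    n = len(adjacency)
    power = adjacency
    for _ in range(k - 1):
        power = mat_mul(power, adjacency)
    closed_walks = sum(power[i][i] for i in range(n))
    return closed_walks / n**k
```

For the triangle graph, the eigenvalues are 2, -1 and -1, so the 3-cycle count is 8 - 1 - 1 = 6 and the density is 6/27. Repeated matrix multiplication is used here for clarity; an eigenvalue routine would give the same numbers for all k at once.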
A Memristor-Based Optimization Framework for AI Applications
Title | A Memristor-Based Optimization Framework for AI Applications |
Authors | Sijia Liu, Yanzhi Wang, Makan Fardad, Pramod K. Varshney |
Abstract | Memristors have recently received significant attention as ubiquitous device-level components for building a novel generation of computing systems. These devices have many promising features, such as non-volatility, low power consumption, high density, and excellent scalability. The ability to control and modify biasing voltages at the two terminals of memristors makes them promising candidates to perform matrix-vector multiplications and solve systems of linear equations. In this article, we discuss how networks of memristors arranged in crossbar arrays can be used for efficiently solving optimization and machine learning problems. We introduce a new memristor-based optimization framework that combines the computational merit of memristor crossbars with the advantages of an operator splitting method, alternating direction method of multipliers (ADMM). Here, ADMM helps in splitting a complex optimization problem into subproblems that involve the solution of systems of linear equations. The capability of this framework is shown by applying it to linear programming, quadratic programming, and sparse optimization. In addition to ADMM, implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars is discussed. The memristor-based PI method can further be applied to principal component analysis (PCA). The use of memristor crossbars yields a significant speed-up in computation, and thus, we believe, has the potential to advance optimization and machine learning research in artificial intelligence (AI). |
Tasks | |
Published | 2017-10-18 |
URL | http://arxiv.org/abs/1710.08882v1 |
PDF | http://arxiv.org/pdf/1710.08882v1.pdf |
PWC | https://paperswithcode.com/paper/a-memristor-based-optimization-framework-for |
Repo | |
Framework | |
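The power iteration (PI) method named in this entry's abstract is built from repeated matrix-vector products, the operation a memristor crossbar is said to accelerate. The sketch below is plain software power iteration, not the customized crossbar version from the paper. The function names, the fixed iteration count and the caller-supplied start vector are assumptions made for this example.

```python
import math


def mat_vec(matrix, vector):
    """Matrix-vector product; the step a crossbar array would carry out."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def power_iteration(matrix, start, iterations: int = 200):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix.

    `start` must not be orthogonal to the dominant eigenvector.
    """
    v = list(start)
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector was mapped to zero")
        v = [x / norm for x in w]  # renormalise to avoid overflow
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v
```

Applied to a covariance matrix, the returned vector is the first principal component, which is the connection to PCA that the abstract points out.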
Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework
Title | Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework |
Authors | Risheng Liu, Xin Fan, Shichao Cheng, Xiangyu Wang, Zhongxuan Luo |
Abstract | Deep learning models have achieved great success in many real-world applications. However, most existing networks are designed in a heuristic manner and thus lack rigorous mathematical principles and derivations. Several recent studies build deep structures by unrolling a particular optimization model that involves task information. Unfortunately, due to the dynamic nature of network parameters, the resulting deep propagation networks do not possess the nice convergence property of the original optimization scheme. This paper provides a novel proximal unrolling framework to establish deep models by integrating experimentally verified network architectures and rich cues of the tasks. More importantly, we prove in theory that 1) the propagation generated by our unrolled deep model globally converges to a critical point of a given variational energy, and 2) the proposed framework is still able to learn priors from training data to generate a convergent propagation even when task information is only partially available. Indeed, these theoretical results are the best we can ask for, unless stronger assumptions are enforced. Extensive experiments on various real-world applications verify the theoretical convergence and demonstrate the effectiveness of the designed deep models. |
Tasks | |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07653v2 |
PDF | http://arxiv.org/pdf/1711.07653v2.pdf |
PWC | https://paperswithcode.com/paper/proximal-alternating-direction-network-a |
Repo | |
Framework | |
Dependencies: Formalising Semantic Catenae for Information Retrieval
Title | Dependencies: Formalising Semantic Catenae for Information Retrieval |
Authors | Christina Lioma |
Abstract | Building machines that can understand text like humans is an AI-complete problem. A great deal of research has already gone into this, with astounding results, allowing everyday people to talk with their telephones, or have their reading materials analysed and classified by computers. A prerequisite for processing text semantics, common to the above examples, is having some computational representation of text as an abstract object. Operations on this representation practically correspond to making semantic inferences, and by extension simulating the understanding of text. The complexity and granularity of semantic processing that can be realised is constrained by the mathematical and computational robustness, expressiveness, and rigour of the tools used. This dissertation contributes a series of such tools, diverse in their mathematical formulation, but common in their application to model semantic inferences when machines process text. These tools are principally expressed in nine distinct models that capture aspects of semantic dependence in highly interpretable and non-complex ways. This dissertation further reflects on present and future problems with the current research paradigm in this area, and makes recommendations on how to overcome them. The amalgamation of the body of work presented in this dissertation advances the complexity and granularity of semantic inferences that can be made automatically by machines. |
Tasks | Information Retrieval |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03742v1 |
PDF | http://arxiv.org/pdf/1709.03742v1.pdf |
PWC | https://paperswithcode.com/paper/dependencies-formalising-semantic-catenae-for |
Repo | |
Framework | |