October 19, 2019

2927 words 14 mins read

Paper Group ANR 150

Matrix Completion and Performance Guarantees for Single Individual Haplotyping. k-Space Deep Learning for Reference-free EPI Ghost Correction. Scalable and Robust Community Detection with Randomized Sketching. Hallucinating Point Cloud into 3D Sculptural Object. Discrete minimax estimation with trees. A refinement of Bennett’s inequality with appli …

Matrix Completion and Performance Guarantees for Single Individual Haplotyping

Title Matrix Completion and Performance Guarantees for Single Individual Haplotyping
Authors Somsubhra Barik, Haris Vikalo
Abstract Single individual haplotyping is an NP-hard problem that emerges when attempting to reconstruct an organism’s inherited genetic variations using data typically generated by high-throughput DNA sequencing platforms. Genomes of diploid organisms, including humans, are organized into homologous pairs of chromosomes that differ from each other in a relatively small number of variant positions. Haplotypes are ordered sequences of the nucleotides in the variant positions of the chromosomes in a homologous pair; for diploids, haplotypes associated with a pair of chromosomes may be conveniently represented by means of complementary binary sequences. In this paper, we consider a binary matrix factorization formulation of the single individual haplotyping problem and efficiently solve it by means of alternating minimization. We analyze the convergence properties of the alternating minimization algorithm and establish theoretical bounds for the achievable haplotype reconstruction error. The proposed technique is shown to outperform existing methods when applied to synthetic as well as real-world Fosmid-based HapMap NA12878 datasets.
Tasks Matrix Completion
Published 2018-06-13
URL http://arxiv.org/abs/1806.08647v2
PDF http://arxiv.org/pdf/1806.08647v2.pdf
PWC https://paperswithcode.com/paper/matrix-completion-and-performance-guarantees
Repo
Framework
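
The alternating updates described in the abstract reduce, in the simplest rank-one view of the problem, to majority votes over the observed entries. Below is a minimal numpy sketch of that idea on synthetic data; it illustrates the alternating-minimization scheme, not the authors' implementation, and the sizes, error rate, and sampling density are invented for the toy example.

```python
import numpy as np

# Hypothetical toy setup: m reads over n variant sites, entries in {-1, +1},
# observed only on a random mask (a partially observed read matrix).
rng = np.random.default_rng(0)
m, n = 200, 50
h_true = rng.choice([-1, 1], size=n)          # ground-truth haplotype
s_true = rng.choice([-1, 1], size=m)          # which chromosome each read covers
R = np.outer(s_true, h_true).astype(float)    # ideal rank-1 read matrix
mask = rng.random((m, n)) < 0.3               # ~30% of entries observed
R[rng.random((m, n)) < 0.05] *= -1            # sprinkle in sequencing errors

def alternating_minimization(R, mask, iters=20):
    """Alternate between read-assignment and haplotype updates,
    each a majority vote over the observed entries."""
    h = rng.choice([-1.0, 1.0], size=R.shape[1])
    for _ in range(iters):
        s = np.sign((mask * R) @ h)           # fix h, update read memberships
        s[s == 0] = 1
        h = np.sign((mask * R).T @ s)         # fix s, update haplotype
        h[h == 0] = 1
    return s, h

s_hat, h_hat = alternating_minimization(R, mask)
# Haplotypes are recoverable only up to a global sign flip.
err = min(np.mean(h_hat != h_true), np.mean(-h_hat != h_true))
print(f"haplotype reconstruction error: {err:.2%}")
```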

k-Space Deep Learning for Reference-free EPI Ghost Correction

Title k-Space Deep Learning for Reference-free EPI Ghost Correction
Authors Juyoung Lee, Yoseob Han, Jae-Kyun Ryu, Jang-Yeon Park, Jong Chul Ye
Abstract Nyquist ghost artifacts in EPI originate from the phase mismatch between even and odd echoes. However, conventional correction methods using reference scans often produce erroneous results, especially in high-field MRI, due to non-linear and time-varying local magnetic field changes. Recently, it was shown that the ghost correction problem can be reformulated as a k-space interpolation problem that can be solved using structured low-rank Hankel matrix approaches. Another recent work showed that data-driven Hankel matrix decomposition can be reformulated to exhibit a structure similar to a deep convolutional neural network. By synergistically combining these findings, we propose a k-space deep learning approach that immediately corrects the phase mismatch without a reference scan in both accelerated and non-accelerated EPI acquisitions. To take advantage of the redundancy between the even and odd phase-encoding directions, the k-space data is divided into two channels configured with even and odd phase encodings. The redundancies between coils are also exploited by stacking the multi-coil k-space data into additional input channels. Our k-space ghost correction network is then trained to learn the interpolation kernel that estimates the missing virtual k-space data. For accelerated EPI data, the same neural network is trained to directly estimate the interpolation kernels for k-space data missing from both ghosting and subsampling. Reconstruction results using 3T and 7T in-vivo data showed that the proposed method outperformed existing methods in image quality while computing much faster. The proposed k-space deep learning for EPI ghost correction is highly robust and fast, can be combined with acceleration, and can therefore serve as a promising correction tool for high-field MRI without changing the current acquisition protocol.
Tasks Matrix Completion
Published 2018-06-01
URL https://arxiv.org/abs/1806.00153v3
PDF https://arxiv.org/pdf/1806.00153v3.pdf
PWC https://paperswithcode.com/paper/k-space-deep-learning-for-reference-free-epi
Repo
Framework
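
The input formatting the abstract describes, even/odd phase-encoding channels plus coil stacking, can be sketched in a few lines of numpy. This is only the data-preparation step, not the trained interpolation network, and the array sizes are hypothetical.

```python
import numpy as np

# Hypothetical multi-coil EPI k-space block: (coils, phase-encodes, readout).
n_coil, n_pe, n_ro = 8, 64, 128
kspace = (np.random.randn(n_coil, n_pe, n_ro)
          + 1j * np.random.randn(n_coil, n_pe, n_ro))

def split_even_odd(kspace):
    """Zero-fill the complementary lines so even and odd echoes become
    two aligned 'virtual' k-space channels, one pair per coil."""
    even = np.zeros_like(kspace)
    odd = np.zeros_like(kspace)
    even[:, 0::2, :] = kspace[:, 0::2, :]
    odd[:, 1::2, :] = kspace[:, 1::2, :]
    # Stack along the channel axis: 2 * n_coil input channels for the CNN.
    return np.concatenate([even, odd], axis=0)

channels = split_even_odd(kspace)
print(channels.shape)  # (16, 64, 128) -> network input channels
```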

Scalable and Robust Community Detection with Randomized Sketching

Title Scalable and Robust Community Detection with Randomized Sketching
Authors Mostafa Rahmani, Andre Beckus, Adel Karimian, George Atia
Abstract This paper explores and analyzes the unsupervised clustering of large partially observed graphs. We propose a scalable and provable randomized framework for clustering graphs generated from the stochastic block model. The clustering is first applied to a sub-matrix of the graph’s adjacency matrix associated with a reduced graph sketch constructed using random sampling. Then, the clusters of the full graph are inferred based on the clusters extracted from the sketch using a correlation-based retrieval step. Uniform random node sampling is shown to improve the computational complexity over clustering of the full graph when the cluster sizes are balanced. A new random degree-based node sampling algorithm is presented which significantly improves upon the performance of the clustering algorithm even when clusters are unbalanced. This algorithm improves the phase transitions for matrix-decomposition-based clustering with regard to computational complexity and minimum cluster size, which are shown to be nearly dimension-free in the low inter-cluster connectivity regime. A third sampling technique is shown to improve balance by randomly sampling nodes based on spatial distribution. We provide analysis and numerical results using a convex clustering algorithm based on matrix completion.
Tasks Community Detection, Matrix Completion
Published 2018-05-25
URL https://arxiv.org/abs/1805.10927v3
PDF https://arxiv.org/pdf/1805.10927v3.pdf
PWC https://paperswithcode.com/paper/randomized-robust-matrix-completion-for-the
Repo
Framework
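
The sample-cluster-retrieve pipeline in the abstract can be illustrated on a toy stochastic block model. The sketch below uses plain spectral clustering on the sketch as a stand-in for the paper's convex matrix-completion-based algorithm; all sizes and probabilities are invented.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stochastic block model: 2 communities, dense within, sparse across.
sizes, p_in, p_out = [300, 300], 0.1, 0.01
n = sum(sizes)
labels_true = np.repeat([0, 1], sizes)
P = np.where(labels_true[:, None] == labels_true[None, :], p_in, p_out)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T                  # symmetric, no self-loops

# 1) Sketch: uniformly sample a small subset of nodes.
m = 120
idx = rng.choice(n, size=m, replace=False)
A_sketch = A[np.ix_(idx, idx)]

# 2) Cluster the sketch (here: spectral embedding + k-means).
vals, vecs = np.linalg.eigh(A_sketch)
emb = vecs[:, -2:]                              # top-2 eigenvectors
sketch_labels = KMeans(n_clusters=2, n_init=10).fit_predict(emb)

# 3) Retrieval: assign every node to the cluster whose sampled members
#    it is most connected to (a correlation-style vote).
votes = np.stack([A[:, idx[sketch_labels == k]].sum(axis=1) for k in range(2)],
                 axis=1)
labels_hat = votes.argmax(axis=1)
acc = max(np.mean(labels_hat == labels_true), np.mean(labels_hat != labels_true))
print(f"clustering accuracy: {acc:.2%}")
```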

Hallucinating Point Cloud into 3D Sculptural Object

Title Hallucinating Point Cloud into 3D Sculptural Object
Authors Chun-Liang Li, Eunsu Kang, Songwei Ge, Lingyao Zhang, Austin Dill, Manzil Zaheer, Barnabas Poczos
Abstract Our team of artists and machine learning researchers designed a creative algorithm that can generate authentic sculptural artworks. These artworks do not mimic any given forms and cannot be easily categorized into the dataset categories. Our approach extends DeepDream from images to 3D point clouds. The proposed algorithm, Amalgamated DeepDream (ADD), leverages the properties of point clouds to create objects with better quality than the naive extension. ADD shows promise for the creativity of machines: the kind of creativity that pushes artists to explore novel methods or materials and to create new genres instead of creating variations of existing forms or styles within one genre (for example, from Realism to Abstract Expressionism, or to Minimalism). Lastly, we present sculptures that are 3D printed from the point clouds created by ADD.
Tasks
Published 2018-11-13
URL http://arxiv.org/abs/1811.05389v3
PDF http://arxiv.org/pdf/1811.05389v3.pdf
PWC https://paperswithcode.com/paper/hallucinating-point-cloud-into-3d-sculptural
Repo
Framework
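
To make the DeepDream-to-point-clouds idea concrete, here is a heavily simplified sketch: gradient ascent on point coordinates to maximize the activations of a stand-in point-cloud network. The network below is a toy placeholder, not the authors' ADD model or its training setup.

```python
import torch
import torch.nn as nn

# Stand-in point-cloud encoder (NOT the authors' model): a tiny PointNet-style
# network with per-point MLPs and global max pooling.
class TinyPointNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128))
    def forward(self, pts):                     # pts: (N, 3)
        return self.mlp(pts).max(dim=0).values  # global feature: (128,)

model = TinyPointNet()
points = torch.randn(1024, 3, requires_grad=True)  # start from a random cloud
opt = torch.optim.Adam([points], lr=0.01)

# DeepDream-style update: ascend the norm of the activations so the cloud
# drifts toward shapes the network responds to strongly.
for step in range(200):
    opt.zero_grad()
    loss = -model(points).norm()                # maximize activation energy
    loss.backward()
    opt.step()

print(points.detach().shape)  # hallucinated cloud, ready for meshing/printing
```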

Discrete minimax estimation with trees

Title Discrete minimax estimation with trees
Authors Luc Devroye, Tommy Reddad
Abstract We propose a simple recursive data-based partitioning scheme which produces piecewise-constant or piecewise-linear density estimates on intervals, and show how this scheme can determine the optimal $L_1$ minimax rate for some discrete nonparametric classes.
Tasks
Published 2018-12-14
URL https://arxiv.org/abs/1812.06063v3
PDF https://arxiv.org/pdf/1812.06063v3.pdf
PWC https://paperswithcode.com/paper/discrete-minimax-estimation-with-trees
Repo
Framework
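
A toy version of a recursive data-based partitioning scheme: split each interval at the sample median until cells hold few points, and assign each cell a constant density. This sketch is piecewise-constant only and makes no attempt at the paper's minimax-optimal tuning.

```python
import numpy as np

def recursive_partition(x, lo, hi, n_total, min_pts=25):
    """Split at the sample median until cells hold few points; each cell
    gets the constant density (cell count) / (n_total * cell width)."""
    inside = x[(x >= lo) & (x < hi)]
    if len(inside) <= min_pts or hi - lo <= 1e-6:
        return [(lo, hi, len(inside) / (n_total * (hi - lo)))]
    mid = float(np.median(inside))
    if mid <= lo or mid >= hi:                  # degenerate split: stop here
        return [(lo, hi, len(inside) / (n_total * (hi - lo)))]
    return (recursive_partition(x, lo, mid, n_total, min_pts)
            + recursive_partition(x, mid, hi, n_total, min_pts))

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)                   # hypothetical sample
pieces = recursive_partition(x, x.min(), x.max() + 1e-9, len(x))
print(len(pieces), "cells; total mass =",
      sum((hi - lo) * d for lo, hi, d in pieces))
```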

A refinement of Bennett’s inequality with applications to portfolio optimization

Title A refinement of Bennett’s inequality with applications to portfolio optimization
Authors Tony Jebara
Abstract A refinement of Bennett’s inequality is introduced which is strictly tighter than the classical bound. The new bound establishes the convergence of the average of independent random variables to its expected value. It also carefully exploits information about the potentially heterogeneous mean, variance, and ceiling of each random variable. The bound is strictly sharper in the homogeneous setting and very often significantly sharper in the heterogeneous setting. The improved convergence rates are obtained by leveraging Lambert’s W function. We apply the new bound in a portfolio optimization setting to allocate a budget across investments with heterogeneous returns.
Tasks Portfolio Optimization
Published 2018-04-16
URL http://arxiv.org/abs/1804.05454v1
PDF http://arxiv.org/pdf/1804.05454v1.pdf
PWC https://paperswithcode.com/paper/a-refinement-of-bennetts-inequality-with
Repo
Framework
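
For reference, the classical Bennett inequality that the paper refines can be stated as follows; the refinement itself, which leverages the Lambert W function, is developed in the paper.

```latex
% Classical Bennett inequality (the bound the paper tightens).
% X_1,\dots,X_n independent, \mathbb{E}X_i = 0, X_i \le b almost surely,
% \sigma^2 = \sum_i \mathrm{Var}(X_i):
\Pr\!\left[\sum_{i=1}^{n} X_i \ge t\right]
  \le \exp\!\left(-\frac{\sigma^2}{b^2}\,
      h\!\left(\frac{bt}{\sigma^2}\right)\right),
\qquad h(u) = (1+u)\log(1+u) - u .
```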

Infrastructure for the representation and electronic exchange of design knowledge

Title Infrastructure for the representation and electronic exchange of design knowledge
Authors Laurent Buzon, Abdelaziz Bouras, Yacine Ouzrout
Abstract This paper develops the concept of knowledge and its exchange using Semantic Web technologies. It points out that knowledge is more than information because it embodies meaning, that is to say semantics and context. These characteristics influence our approach to representing and treating knowledge. In order to be adopted, the developed system needs to be simple and to use standards. The goal of the paper is to find standards to model knowledge and exchange it with other people. Therefore, we propose to model knowledge using UML models for a graphical representation and to exchange it with XML to ensure portability at low cost. We introduce the concept of ontology for organizing knowledge and for facilitating knowledge exchange. The proposals have been tested by implementing an application on the design knowledge of a pen.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1810.13191v1
PDF http://arxiv.org/pdf/1810.13191v1.pdf
PWC https://paperswithcode.com/paper/infrastructure-for-the-representation-and
Repo
Framework
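
To make the exchange idea concrete, here is a hypothetical Python sketch that serializes a fragment of design knowledge about a pen as XML; all element and attribute names are invented for illustration and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML serialization of a design-knowledge fragment (the paper's
# pen example); the vocabulary below is invented for illustration.
knowledge = ET.Element("DesignKnowledge", subject="pen")
fact = ET.SubElement(knowledge, "Fact", id="ink-viscosity")
ET.SubElement(fact, "Statement").text = "Ink viscosity must allow capillary flow."
context = ET.SubElement(fact, "Context")      # context carries part of the meaning
ET.SubElement(context, "Phase").text = "detailed design"
ET.SubElement(context, "Author").text = "fluid-dynamics team"

# Serialize for exchange; XML keeps the transfer portable and low-cost.
payload = ET.tostring(knowledge, encoding="unicode")
print(payload)
received = ET.fromstring(payload)             # the receiving side parses it back
print(received.find("Fact/Statement").text)
```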

Training Set Debugging Using Trusted Items

Title Training Set Debugging Using Trusted Items
Authors Xuezhou Zhang, Xiaojin Zhu, Stephen J. Wright
Abstract Training set bugs are flaws in the data that adversely affect machine learning. The training set is usually too large for manual inspection, but one may have the resources to verify a few trusted items. The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning. Specifically, our approach seeks the smallest set of changes to the training set labels such that the model learned from this corrected training set predicts the labels of the trusted items correctly. We flag the items whose labels are changed as potential bugs, which can then be checked for veracity by human experts. Finding the bugs in this way is a challenging combinatorial bilevel optimization problem, but it can be relaxed into a continuous optimization problem. Experiments on toy and real data demonstrate that our approach can identify training set bugs effectively and suggest appropriate changes to the labels. Our algorithm is a step toward trustworthy machine learning.
Tasks Bilevel Optimization
Published 2018-01-24
URL http://arxiv.org/abs/1801.08019v1
PDF http://arxiv.org/pdf/1801.08019v1.pdf
PWC https://paperswithcode.com/paper/training-set-debugging-using-trusted-items
Repo
Framework
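
One plausible way to write down the bilevel program and its relaxation described above (the notation is ours, not necessarily the paper's):

```latex
% Bilevel training-set debugging (notation ours). Find the smallest set of
% label changes \tilde{y} so that the retrained model fits the trusted set:
\min_{\tilde{y}} \; \sum_{i=1}^{n} \mathbf{1}\{\tilde{y}_i \neq y_i\}
\quad \text{s.t.} \quad
\hat{\theta} \in \arg\min_{\theta} \sum_{i=1}^{n}
    \ell\big(f_\theta(x_i), \tilde{y}_i\big),
\qquad
f_{\hat{\theta}}(x_j) = y_j \;\; \text{for all trusted } j .
% Relaxation: replace the 0/1 count with a convex surrogate on continuous
% label variables and the hard trusted-set constraint with a loss term.
```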

Spline-Based Probability Calibration

Title Spline-Based Probability Calibration
Authors Brian Lucena
Abstract In many classification problems it is desirable to output well-calibrated probabilities on the different classes. We propose a robust, non-parametric method of calibrating probabilities called SplineCalib that utilizes smoothing splines to determine a calibration function. We demonstrate how applying certain transformations as part of the calibration process can improve performance on problems in deep learning and other domains where the scores tend to be “overconfident”. We adapt the approach to multi-class problems and find that better calibration can improve accuracy as well as log-loss by better resolving uncertain cases. Finally, we present a cross-validated approach to calibration which conserves data. Significant improvements to log-loss and accuracy are shown on several different problems. We also introduce the ml-insights python package which contains an implementation of the SplineCalib algorithm.
Tasks Calibration
Published 2018-09-20
URL http://arxiv.org/abs/1809.07751v1
PDF http://arxiv.org/pdf/1809.07751v1.pdf
PWC https://paperswithcode.com/paper/spline-based-probability-calibration
Repo
Framework
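
A SplineCalib-like recipe can be sketched by fitting a logistic model on a spline basis of the raw score. The code below is an illustration with an invented overconfident-score simulation, not the ml-insights implementation, and it omits the smoothing penalty and cross-validation the paper describes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def spline_basis(s, knots):
    """Cubic truncated-power spline basis of the raw score s."""
    cols = [s, s**2, s**3] + [np.maximum(s - k, 0.0) ** 3 for k in knots]
    return np.column_stack(cols)

# Hypothetical held-out scores from an overconfident classifier, plus labels.
rng = np.random.default_rng(0)
p_true = rng.uniform(0.02, 0.98, 5000)
y = (rng.random(5000) < p_true).astype(int)
logit = np.log(p_true / (1 - p_true))
scores = 1 / (1 + np.exp(-2.0 * logit))        # artificially overconfident

# Fit a logistic model on the spline basis of the score.
knots = np.quantile(scores, np.linspace(0.1, 0.9, 7))
calibrator = LogisticRegression(C=1.0, max_iter=1000)
calibrator.fit(spline_basis(scores, knots), y)

calibrated = calibrator.predict_proba(spline_basis(scores, knots))[:, 1]
print("mean |calibrated - true p|:", np.abs(calibrated - p_true).mean().round(4))
```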

Neural Based Statement Classification for Biased Language

Title Neural Based Statement Classification for Biased Language
Authors Christoph Hube, Besnik Fetahu
Abstract Biased language commonly occurs around topics of a controversial nature, thus stirring disagreement between the different parties involved in a discussion. This is because the understanding and use of language, specifically of phrases, is cohesive within particular groups but does not hold across groups. In collaborative environments or environments where impartial language is desired (e.g. Wikipedia, news media), statements and the language therein should represent the involved parties equally and be neutrally phrased. Biased language is introduced through the presence of inflammatory words or phrases, or statements that may be incorrect or one-sided, thus violating such consensus. In this work, we focus on the specific case of phrasing bias, which may be introduced through specific inflammatory words or phrases in a statement. For this purpose, we propose an approach that relies on recurrent neural networks to capture the inter-dependencies between words in a phrase that introduce bias. We perform a thorough experimental evaluation, in which we show the advantages of a neural approach over competitors that rely on word lexicons and other hand-crafted features in detecting biased language. We are able to distinguish biased statements with a precision of P=0.92, significantly outperforming baseline models with an improvement of over 30%. Finally, we release the largest corpus of statements annotated for biased language.
Tasks
Published 2018-11-14
URL http://arxiv.org/abs/1811.05740v1
PDF http://arxiv.org/pdf/1811.05740v1.pdf
PWC https://paperswithcode.com/paper/neural-based-statement-classification-for
Repo
Framework
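
A generic sketch of the recurrent classifier family the abstract refers to: embed tokens, run a bidirectional GRU over the statement, and classify from pooled states. This is an illustrative architecture, not the authors' exact model or hyperparameters.

```python
import torch
import torch.nn as nn

class BiasRNN(nn.Module):
    """Minimal recurrent classifier for statement-level bias detection."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)     # biased vs. neutral
    def forward(self, token_ids):               # (batch, seq_len)
        h, _ = self.rnn(self.emb(token_ids))    # (batch, seq_len, 2*hidden)
        return self.out(h.mean(dim=1))          # pool over the sequence

model = BiasRNN(vocab_size=20000)
batch = torch.randint(1, 20000, (8, 30))        # 8 dummy statements, 30 tokens
logits = model(batch)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
loss.backward()
print(logits.shape)                             # (8, 2)
```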

Building Disease Detection Algorithms with Very Small Numbers of Positive Samples

Title Building Disease Detection Algorithms with Very Small Numbers of Positive Samples
Authors Ken C. L. Wong, Alexandros Karargyris, Tanveer Syeda-Mahmood, Mehdi Moradi
Abstract Although deep learning can provide promising results in medical image analysis, the lack of very large annotated datasets confines its full potential. Furthermore, limited positive samples also create unbalanced datasets, which limit the true positive rates of trained models. As unbalanced datasets are mostly unavoidable, it is greatly beneficial if we can extract useful knowledge from negative samples to improve classification accuracy on limited positive samples. To this end, we propose a new strategy for building medical image analysis pipelines that target disease detection. We train a discriminative segmentation model only on normal images to provide a source of knowledge to be transferred to a disease detection classifier. We show that, using the feature maps of a trained segmentation network, deviations from normal anatomy can be learned by a two-class classification network on an extremely unbalanced training dataset with as few as one positive for every 17 negative samples. We demonstrate that even though the segmentation network is trained only on normal cardiac computed tomography images, the resulting feature maps can be used to detect pericardial effusion and cardiac septal defects with two-class convolutional classification networks.
Tasks
Published 2018-05-07
URL http://arxiv.org/abs/1805.02730v1
PDF http://arxiv.org/pdf/1805.02730v1.pdf
PWC https://paperswithcode.com/paper/building-disease-detection-algorithms-with
Repo
Framework
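
The pipeline described above can be sketched as a frozen feature extractor feeding a small two-class head, with the loss reweighted for the 1:17 class imbalance. The encoder below is a stand-in for the trained segmentation network; shapes and data are dummies.

```python
import torch
import torch.nn as nn

# Stand-in for a segmentation network trained on normal anatomy only; in the
# paper's pipeline its feature maps, not its masks, feed the classifier.
seg_encoder = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)
for p in seg_encoder.parameters():
    p.requires_grad = False                     # frozen knowledge source

classifier = nn.Sequential(                     # small two-class head
    nn.AdaptiveAvgPool2d(8), nn.Flatten(),
    nn.Linear(64 * 8 * 8, 64), nn.ReLU(), nn.Linear(64, 2),
)

# ~1 positive per 17 negatives, as in the abstract: reweight the loss.
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 17.0]))

images = torch.randn(18, 1, 128, 128)           # dummy CT slices
labels = torch.cat([torch.zeros(17), torch.ones(1)]).long()
loss = criterion(classifier(seg_encoder(images)), labels)
loss.backward()
print(loss.item())
```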

Towards Semantically Enhanced Data Understanding

Title Towards Semantically Enhanced Data Understanding
Authors Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel
Abstract In the field of machine learning, data understanding is the practice of getting initial insights into unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning of the data. Usually, the documentation is kept separate from the data in various external documents, diagrams, spreadsheets and tools, which causes considerable look-up overhead. Moreover, other supporting applications are not able to consume and utilize such unstructured data. That is why we propose a methodology that uses a single semantic model to interlink data with its documentation. Hence, data scientists are able to directly look up the connected information about the data by simply following links. Equally, they can browse the documentation, which always refers back to the data. Furthermore, the model can be used by other approaches providing additional support, such as searching, comparing, integrating or visualizing data. To showcase our approach we also demonstrate an early prototype.
Tasks
Published 2018-06-13
URL http://arxiv.org/abs/1806.04952v1
PDF http://arxiv.org/pdf/1806.04952v1.pdf
PWC https://paperswithcode.com/paper/towards-semantically-enhanced-data
Repo
Framework
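
A minimal sketch of a single semantic model interlinking a dataset column with its documentation, using rdflib; the vocabulary and URIs below are invented for illustration.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDFS

# Hypothetical vocabulary for interlinking a dataset column with its
# documentation; all URIs are invented for this example.
EX = Namespace("http://example.org/data-understanding/")
g = Graph()

column = URIRef(EX["sales.csv/column/revenue"])
doc = URIRef(EX["wiki/revenue-definition"])
g.add((column, RDFS.label, Literal("revenue")))
g.add((column, RDFS.comment, Literal("Monthly gross revenue in EUR.")))
g.add((column, RDFS.seeAlso, doc))              # follow the link to the docs

# A data scientist (or another tool) can look up everything known about
# the column by following links from its URI.
for _, p, o in g.triples((column, None, None)):
    print(p, "->", o)
```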

Learning Beam Search Policies via Imitation Learning

Title Learning Beam Search Policies via Imitation Learning
Authors Renato Negrinho, Matthew R. Gormley, Geoffrey J. Gordon
Abstract Beam search is widely used for approximate decoding in structured prediction problems. Models often use a beam at test time but ignore its existence at train time, and therefore do not explicitly learn how to use the beam. We develop a unifying meta-algorithm for learning beam search policies using imitation learning. In our setting, the beam is part of the model, and not just an artifact of approximate decoding. Our meta-algorithm captures existing learning algorithms and suggests new ones. It also lets us show novel no-regret guarantees for learning beam search policies.
Tasks Imitation Learning, Structured Prediction
Published 2018-11-01
URL https://arxiv.org/abs/1811.00512v2
PDF https://arxiv.org/pdf/1811.00512v2.pdf
PWC https://paperswithcode.com/paper/learning-beam-search-policies-via-imitation
Repo
Framework
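
A generic beam search skeleton makes the "beam as part of the model" point concrete: the scoring function plays the role of the learned policy, so changing how it is trained changes the search itself. The expansion and scoring functions below are toy stand-ins.

```python
import heapq
from typing import Callable, List, Tuple

def beam_search(start: str,
                expand: Callable[[str], List[str]],
                score: Callable[[List[str]], float],
                beam_width: int = 3,
                max_steps: int = 5) -> List[str]:
    """Generic beam search where `score` plays the role of the learned
    policy: it ranks partial hypotheses, so training it (e.g., by imitating
    an oracle that keeps the gold prefix on the beam) changes the search."""
    beam: List[Tuple[float, List[str]]] = [(score([start]), [start])]
    for _ in range(max_steps):
        candidates = [(score(path + [nxt]), path + [nxt])
                      for _, path in beam for nxt in expand(path[-1])]
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])[1]

# Toy usage with hypothetical expansion and scoring functions.
expand = lambda tok: [tok + "a", tok + "b"]
score = lambda path: -len(path[-1]) + path[-1].count("a")  # prefers 'a's
print(beam_search("", expand, score))
```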

An Analysis of the t-SNE Algorithm for Data Visualization

Title An Analysis of the t-SNE Algorithm for Data Visualization
Authors Sanjeev Arora, Wei Hu, Pravesh K. Kothari
Abstract A first line of attack in exploratory data analysis is data visualization, i.e., generating a 2-dimensional representation of data that makes clusters of similar points visually identifiable. Standard Johnson-Lindenstrauss dimensionality reduction does not produce data visualizations. The t-SNE heuristic of van der Maaten and Hinton, which is based on non-convex optimization, has become the de facto standard for visualization in a wide range of applications. This work gives a formal framework for the problem of data visualization - finding a 2-dimensional embedding of clusterable data that correctly separates individual clusters to make them visually identifiable. We then give a rigorous analysis of the performance of t-SNE under a natural, deterministic condition on the “ground-truth” clusters (similar to conditions assumed in earlier analyses of clustering) in the underlying data. These are the first provable guarantees on t-SNE for constructing good data visualizations. We show that our deterministic condition is satisfied by considerably general probabilistic generative models for clusterable data such as mixtures of well-separated log-concave distributions. Finally, we give theoretical evidence that t-SNE provably succeeds in partially recovering cluster structure even when the above deterministic condition is not met.
Tasks Dimensionality Reduction
Published 2018-03-05
URL http://arxiv.org/abs/1803.01768v2
PDF http://arxiv.org/pdf/1803.01768v2.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-the-t-sne-algorithm-for-data
Repo
Framework
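
The setting analyzed above is easy to reproduce empirically: run t-SNE on a mixture of well-separated Gaussians and inspect the 2-D embedding. A minimal sketch with sklearn (cluster sizes and separation invented):

```python
import numpy as np
from sklearn.manifold import TSNE

# Clusterable data of the kind the analysis covers: a mixture of
# well-separated Gaussians in high dimension.
rng = np.random.default_rng(0)
d, n_per = 50, 100
centers = rng.standard_normal((3, d)) * 10.0    # well-separated means
X = np.vstack([c + rng.standard_normal((n_per, d)) for c in centers])
labels = np.repeat([0, 1, 2], n_per)

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# A crude visual-separability check: intra-cluster spread in the 2-D embedding.
for k in range(3):
    spread = np.linalg.norm(emb[labels == k] - emb[labels == k].mean(0),
                            axis=1).mean()
    print(f"cluster {k}: mean intra-cluster spread {spread:.1f}")
```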

Deep Back Projection for Sparse-View CT Reconstruction

Title Deep Back Projection for Sparse-View CT Reconstruction
Authors Dong Hye Ye, Gregery T. Buzzard, Max Ruby, Charles A. Bouman
Abstract Filtered back projection (FBP) is a classical method for image reconstruction from sinogram CT data. FBP is computationally efficient but produces lower quality reconstructions than more sophisticated iterative methods, particularly when the number of views is lower than the number required by the Nyquist rate. In this paper, we use a deep convolutional neural network (CNN) to produce high-quality reconstructions directly from sinogram data. A primary novelty of our approach is that we first back project each view separately to form a stack of back projections and then feed this stack as input into the convolutional neural network. These single-view back projections convert the encoding of sinogram data into the appropriate spatial location, which can then be leveraged by the spatial invariance of the CNN to learn the reconstruction effectively. We demonstrate the benefit of our CNN based back projection on simulated sparse-view CT data over classical FBP.
Tasks Image Reconstruction
Published 2018-07-06
URL http://arxiv.org/abs/1807.02370v1
PDF http://arxiv.org/pdf/1807.02370v1.pdf
PWC https://paperswithcode.com/paper/deep-back-projection-for-sparse-view-ct
Repo
Framework
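
The paper's key input construction, a stack of single-view back projections, can be sketched for parallel-beam geometry in a few lines: smear each projection across the image and rotate it to its view angle. This is a toy illustration of the input formatting, not the reconstruction CNN.

```python
import numpy as np
from scipy.ndimage import rotate

def backprojection_stack(sinogram, angles, size):
    """Back project each view separately (parallel-beam): smear one
    projection across the image, rotate to its view angle, and stack.
    The stack, not its sum, is the CNN input in the approach above."""
    stack = np.empty((len(angles), size, size))
    for i, theta in enumerate(angles):
        smear = np.tile(sinogram[i], (size, 1))           # constant along rays
        stack[i] = rotate(smear, theta, reshape=False, order=1)
    return stack

# Hypothetical sparse-view sinogram: 16 views of a 'size'-pixel phantom.
size, n_views = 64, 16
angles = np.linspace(0.0, 180.0, n_views, endpoint=False)
sinogram = np.random.rand(n_views, size)                  # dummy projections
stack = backprojection_stack(sinogram, angles, size)
print(stack.shape)   # (16, 64, 64) -> channels for the reconstruction CNN
# Summing the stack recovers plain (unfiltered) back projection:
fbp_like = stack.sum(axis=0)
```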