July 27, 2019

3181 words 15 mins read

Paper Group ANR 512

Boosted nonparametric hazards with time-dependent covariates. SEGMENT3D: A Web-based Application for Collaborative Segmentation of 3D images used in the Shoot Apical Meristem. Image Processing in Floriculture Using a robotic Mobile Platform. Consistent Alignment of Word Embedding Models. Classification in biological networks with hypergraphlet kern …

Boosted nonparametric hazards with time-dependent covariates

Title Boosted nonparametric hazards with time-dependent covariates
Authors Donald K. K. Lee, Ningyuan Chen, Hemant Ishwaran
Abstract Given functional data from a survival process with time-dependent covariates, we derive a smooth convex representation for its nonparametric log-likelihood functional and obtain its functional gradient. From this we devise a generic gradient boosting procedure for estimating the hazard function nonparametrically. An illustrative implementation of the procedure using regression trees is described to show how to recover the unknown hazard. We show that the generic estimator is consistent if the model is correctly specified; alternatively an oracle inequality can be demonstrated for tree-based models. To avoid overfitting, boosting employs several regularization devices. One of them is step-size restriction, but the rationale for this is somewhat mysterious from the viewpoint of consistency. Our work brings some clarity to this issue by revealing that step-size restriction is a mechanism for preventing the curvature of the risk from derailing convergence.
Tasks
Published 2017-01-27
URL https://arxiv.org/abs/1701.07926v6
PDF https://arxiv.org/pdf/1701.07926v6.pdf
PWC https://paperswithcode.com/paper/boosted-nonparametric-hazards-with-time
Repo
Framework
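
The abstract's key ingredients (functional gradient boosting with regression-tree base learners and a restricted step size) can be illustrated generically. Below is a minimal sketch in which squared error stands in for the paper's nonparametric survival log-likelihood so the loop stays self-contained; the step-size parameter `nu` plays the role discussed in the abstract.

```python
# Minimal sketch of generic gradient boosting with step-size restriction.
# The paper boosts the gradient of a nonparametric survival log-likelihood;
# squared error stands in here so the loop is self-contained.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_rounds=100, nu=0.1, depth=2):
    """Fit an additive tree ensemble with a restricted step size nu."""
    F = np.zeros(len(y))           # current fitted function values
    trees = []
    for _ in range(n_rounds):
        residual = y - F           # negative gradient of 0.5 * (y - F)^2
        tree = DecisionTreeRegressor(max_depth=depth).fit(X, residual)
        F += nu * tree.predict(X)  # small step: keeps risk curvature in check
        trees.append(tree)
    return trees

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=200)
ensemble = boost(X, y)
```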

SEGMENT3D: A Web-based Application for Collaborative Segmentation of 3D images used in the Shoot Apical Meristem

Title SEGMENT3D: A Web-based Application for Collaborative Segmentation of 3D images used in the Shoot Apical Meristem
Authors Thiago V. Spina, Johannes Stegmaier, Alexandre X. Falcão, Elliot Meyerowitz, Alexandre Cunha
Abstract The quantitative analysis of 3D confocal microscopy images of the shoot apical meristem helps in understanding the growth process of some plants. Cell segmentation in these images is crucial for computational plant analysis, and many automated methods have been proposed. However, variations in signal intensity across the image reduce the effectiveness of those approaches, with no easy way for user correction. We propose a web-based collaborative 3D image segmentation application, SEGMENT3D, to leverage automatic segmentation results. The image is divided into 3D tiles that can be either segmented interactively from scratch or corrected from a pre-existing segmentation. Individual segmentation results per tile are then automatically merged via consensus analysis and stitched to complete the segmentation for the entire image stack. SEGMENT3D is a comprehensive application that can be applied to other 3D imaging modalities and general objects. It also provides an easy way to create supervised data to advance segmentation using machine learning models.
Tasks Cell Segmentation, Semantic Segmentation
Published 2017-10-26
URL http://arxiv.org/abs/1710.09933v1
PDF http://arxiv.org/pdf/1710.09933v1.pdf
PWC https://paperswithcode.com/paper/segment3d-a-web-based-application-for
Repo
Framework
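
A toy sketch of the tile-then-stitch workflow described above: split a 3D stack into blocks, segment each independently, and reassemble. The consensus merging of overlapping contributions is not reproduced here, and the thresholding segmenter is a stand-in.

```python
# Split a 3D volume into tiles, "segment" each independently, and stitch the
# results back together. Consensus analysis across contributors is omitted.
import numpy as np

def split_into_tiles(volume, tile=(32, 32, 32)):
    tiles = {}
    for z in range(0, volume.shape[0], tile[0]):
        for y in range(0, volume.shape[1], tile[1]):
            for x in range(0, volume.shape[2], tile[2]):
                tiles[(z, y, x)] = volume[z:z+tile[0], y:y+tile[1], x:x+tile[2]]
    return tiles

def stitch(tiles, shape):
    out = np.zeros(shape, dtype=np.uint8)
    for (z, y, x), seg in tiles.items():
        out[z:z+seg.shape[0], y:y+seg.shape[1], x:x+seg.shape[2]] = seg
    return out

volume = np.random.rand(64, 64, 64)
tiles = split_into_tiles(volume)
segmented = {k: (v > 0.5).astype(np.uint8) for k, v in tiles.items()}  # stand-in segmenter
result = stitch(segmented, volume.shape)
```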

Image Processing in Floriculture Using a Robotic Mobile Platform

Title Image Processing in Floriculture Using a Robotic Mobile Platform
Authors Juan Garcia-Torres, Diana Caro-Prieto
Abstract Colombia has a privileged geographical location that makes it a cornerstone and equidistant point to all regional markets. The country has great ecological diversity and is one of the largest suppliers of flowers for the US. Colombian flower companies have innovated in the marketing process, using methods that meet all conditions set by final consumers. This article develops a monitoring system for floriculture industries, implemented on a robotic platform that can be programmed in different programming languages. The robot takes the necessary environment information from its camera. The monitoring algorithm was developed with the image processing toolbox in Matlab. It acquires images through the camera and preprocesses them with noise filtering, color enhancement, and dimension adjustment to increase processing speed. The image is then segmented by color, and from the binarized image, morphological operations (erosion and dilation) extract relevant features such as centroid, perimeter, and area. The data obtained from the image processing help the robot automatically identify targets, orient itself, and move toward them. The results also yield a quality diagnostic for each scanned object.
Tasks
Published 2017-06-26
URL http://arxiv.org/abs/1706.08436v1
PDF http://arxiv.org/pdf/1706.08436v1.pdf
PWC https://paperswithcode.com/paper/image-processing-in-floriculture-using-a
Repo
Framework
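
The pipeline above is described concretely enough to sketch. The paper used Matlab's image processing toolbox; OpenCV stands in below, and the input filename and color range are assumptions.

```python
# Hedged sketch of the described pipeline: denoise, segment by color, clean
# with erosion/dilation, then extract centroid, perimeter and area.
import cv2
import numpy as np

img = cv2.imread("flower.jpg")                      # hypothetical input file
img = cv2.resize(img, (320, 240))                   # shrink to speed up processing
img = cv2.GaussianBlur(img, (5, 5), 0)              # noise filtering
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (25, 50, 50), (35, 255, 255))  # assumed yellow-ish range
kernel = np.ones((3, 3), np.uint8)
mask = cv2.dilate(cv2.erode(mask, kernel), kernel)  # morphological cleanup

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    m = cv2.moments(c)
    if m["m00"] > 0:
        cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]   # centroid
        print(cx, cy, cv2.arcLength(c, True), cv2.contourArea(c))
```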

Consistent Alignment of Word Embedding Models

Title Consistent Alignment of Word Embedding Models
Authors Cem Safak Sahin, Rajmonda S. Caceres, Brandon Oselio, William M. Campbell
Abstract Word embedding models offer continuous vector representations that can capture rich contextual semantics based on their word co-occurrence patterns. While these word vectors can provide very effective features used in many NLP tasks such as clustering similar words and inferring learning relationships, many challenges and open research questions remain. In this paper, we propose a solution that aligns variations of the same model (or different models) in a joint low-dimensional latent space leveraging carefully generated synthetic data points. This generative process is inspired by the observation that a variety of linguistic relationships is captured by simple linear operations in embedded space. We demonstrate that our approach can lead to substantial improvements in recovering embeddings of local neighborhoods.
Tasks
Published 2017-02-24
URL http://arxiv.org/abs/1702.07680v1
PDF http://arxiv.org/pdf/1702.07680v1.pdf
PWC https://paperswithcode.com/paper/consistent-alignment-of-word-embedding-models
Repo
Framework
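
As a point of reference for the alignment task, the sketch below uses orthogonal Procrustes to align two embedding matrices over a shared vocabulary. This is a standard baseline, not the paper's synthetic-data method.

```python
# Orthogonal Procrustes alignment of two embedding matrices -- a common
# baseline for this task, not the paper's generative approach.
import numpy as np

def procrustes_align(A, B):
    """Find the orthogonal W minimizing ||A @ W - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

rng = np.random.default_rng(1)
B = rng.normal(size=(1000, 50))                   # "target" model's vectors
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))    # random orthogonal map
A = B @ Q + 0.01 * rng.normal(size=B.shape)       # rotated, noisy variant
W = procrustes_align(A, B)
print(np.linalg.norm(A @ W - B) / np.linalg.norm(B))  # small relative residual
```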

Classification in biological networks with hypergraphlet kernels

Title Classification in biological networks with hypergraphlet kernels
Authors Jose Lugo-Martinez, Predrag Radivojac
Abstract Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins, drugs) and edges represent relational ties among these objects (binds-to, interacts-with, regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, often suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. In this paper, we present a hypergraph-based approach for modeling physical systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs in a semi-supervised setting. We introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of small simple hypergraphs, referred to as hypergraphlets, rooted at a vertex of interest. We extensively evaluate this method and show its potential use in a positive-unlabeled setting to estimate the number of missing and false positive links in protein-protein interaction networks.
Tasks Link Prediction
Published 2017-03-14
URL http://arxiv.org/abs/1703.04823v1
PDF http://arxiv.org/pdf/1703.04823v1.pdf
PWC https://paperswithcode.com/paper/classification-in-biological-networks-with
Repo
Framework
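
A drastically simplified version of the kernel idea: represent each vertex by counts of small rooted, labeled patterns among its incident hyperedges, and compare vertices by the dot product of those count vectors. Real hypergraphlets enumerate all non-isomorphic rooted sub-hypergraphs up to a fixed size, which this toy does not attempt.

```python
# Toy rooted-pattern kernel on a labeled hypergraph. Each vertex gets a
# count vector of (edge size, sorted neighbor labels) patterns.
from collections import Counter

def rooted_features(vertex, hyperedges, labels):
    """Count patterns in hyperedges incident to the root vertex."""
    feats = Counter()
    for edge in hyperedges:
        if vertex in edge:
            neigh = tuple(sorted(labels[u] for u in edge if u != vertex))
            feats[(len(edge), neigh)] += 1
    return feats

def kernel(f, g):
    return sum(f[k] * g[k] for k in f if k in g)

edges = [{0, 1, 2}, {1, 2, 3}, {0, 3}]
labels = {0: "A", 1: "B", 2: "B", 3: "A"}
print(kernel(rooted_features(0, edges, labels), rooted_features(3, edges, labels)))
```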

A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series

Title A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series
Authors Stanislas Chambon, Mathieu Galtier, Pierrick Arnal, Gilles Wainrib, Alexandre Gramfort
Abstract Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns to each 30s of signal a sleep stage, based on the visual inspection of signals such as electroencephalograms (EEG), electrooculograms (EOG), electrocardiograms (ECG) and electromyograms (EMG). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting hand-crafted features, that exploits all multivariate and multimodal Polysomnography (PSG) signals (EEG, EMG and EOG), and that can exploit the temporal context of each 30s window of data. For each modality the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights on the spatio-temporal distribution of the signal of interest: a good trade-off for optimal classification performance measured with balanced accuracy is to use 6 EEG with 2 EOG (left and right) and 3 EMG chin channels. Also, exploiting one minute of data before and after each data segment offers the strongest improvement when a limited number of channels is available. Like sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver state-of-the-art classification performance with a small computational cost.
Tasks EEG, Time Series
Published 2017-07-05
URL http://arxiv.org/abs/1707.03321v2
PDF http://arxiv.org/pdf/1707.03321v2.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-architecture-for-temporal
Repo
Framework
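
A rough PyTorch sketch of the architectural shape described above: a learned linear spatial filter across sensors, temporal convolutions, and a softmax classifier. All layer sizes are illustrative, not the paper's configuration.

```python
# Illustrative shape only: linear spatial filtering over channels, temporal
# convolutions, then a linear head (softmax is applied inside the loss).
import torch
import torch.nn as nn

class SleepStager(nn.Module):
    def __init__(self, n_channels=8, n_classes=5):
        super().__init__()
        self.spatial = nn.Conv1d(n_channels, 8, kernel_size=1)  # mixes sensors
        self.temporal = nn.Sequential(
            nn.Conv1d(8, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=16, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, channels, time)
        z = self.temporal(self.spatial(x))
        return self.head(z.squeeze(-1))

model = SleepStager()
logits = model(torch.randn(4, 8, 3000))      # four 30 s windows at 100 Hz
```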

Building effective deep neural network architectures one feature at a time

Title Building effective deep neural network architectures one feature at a time
Authors Martin Mundt, Tobias Weis, Kishore Konda, Visvanathan Ramesh
Abstract Successful training of convolutional neural networks is often associated with sufficiently deep architectures composed of large numbers of features. These networks typically rely on a variety of regularization and pruning techniques to converge to less redundant states. We introduce a novel bottom-up approach to expand representations in fixed-depth architectures. These architectures start from just a single feature per layer and greedily increase the width of individual layers to attain the effective representational capacity needed for a specific task. While network growth can rely on a family of metrics, we propose a computationally efficient version based on feature time evolution and demonstrate its potency in determining feature importance and a network's effective capacity. We demonstrate how automatically expanded architectures converge to similar topologies that benefit from fewer parameters or improved accuracy, and exhibit systematic correspondence in representational complexity with the specified task. In contrast to conventional design patterns with a typically monotonic increase in the number of features with increased depth, we observe that CNNs perform better when more learnable parameters sit in intermediate layers, with falloffs toward earlier and later layers.
Tasks Feature Importance
Published 2017-05-18
URL http://arxiv.org/abs/1705.06778v2
PDF http://arxiv.org/pdf/1705.06778v2.pdf
PWC https://paperswithcode.com/paper/building-effective-deep-neural-network
Repo
Framework
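
The bottom-up expansion can be written schematically as a greedy loop: start every layer at one feature and widen while a growth metric improves. The paper's metric is based on feature time evolution; a placeholder scoring callable stands in below.

```python
# Schematic of the bottom-up expansion loop. train_and_score is a stand-in
# for training the architecture and evaluating the paper's growth metric.
def expand_architecture(widths, train_and_score, max_width=64, tol=1e-3):
    """Greedily grow per-layer widths; train_and_score(widths) -> score."""
    best = train_and_score(widths)
    improved = True
    while improved:
        improved = False
        for i in range(len(widths)):
            if widths[i] >= max_width:
                continue
            trial = widths.copy()
            trial[i] += 1                      # add one feature to layer i
            score = train_and_score(trial)
            if score > best + tol:
                widths, best, improved = trial, score, True
    return widths, best

# Hypothetical scoring stub: favors moderately wide intermediate layers.
demo = lambda w: -sum((x - 8) ** 2 for x in w)
print(expand_architecture([1, 1, 1], demo, max_width=16))
```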

Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)

Title Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)
Authors Maikel Leemans, Wil M. P. van der Aalst, Mark G. J. van den Brand
Abstract This extended paper presents 1) a novel hierarchy and recursion extension to the process tree model; and 2) the first recursion-aware process model discovery technique that leverages hierarchical information in event logs, typically available for software systems. This technique allows us to analyze the operational processes of software systems under real-life conditions at multiple levels of granularity. The work can be positioned in between reverse engineering and process mining. An implementation of the proposed approach is available as a ProM plugin. Experimental results based on real-life (software) event logs demonstrate the feasibility and usefulness of the approach and show its considerable potential to speed up discovery by exploiting the available hierarchy.
Tasks
Published 2017-10-17
URL http://arxiv.org/abs/1710.09323v1
PDF http://arxiv.org/pdf/1710.09323v1.pdf
PWC https://paperswithcode.com/paper/recursion-aware-modeling-and-discovery-for
Repo
Framework
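
The hierarchical information the technique exploits is the kind found in software event logs, where enter/exit events allow a call hierarchy (including recursion) to be rebuilt. The sketch below assumes that event format; it is not the ProM plugin's implementation.

```python
# Rebuild a call hierarchy from enter/exit events -- the hierarchical
# structure this style of discovery technique leverages.
def build_call_tree(events):
    """events: list of ('enter'|'exit', name). Returns nested call dicts."""
    root, stack = [], []
    children = root
    for kind, name in events:
        if kind == "enter":
            node = {"call": name, "children": []}
            children.append(node)
            stack.append(children)             # remember the caller's level
            children = node["children"]
        else:                                  # 'exit': pop back up one level
            children = stack.pop()
    return root

log = [("enter", "main"), ("enter", "parse"), ("exit", "parse"),
       ("enter", "run"), ("enter", "run"), ("exit", "run"),  # recursive call
       ("exit", "run"), ("exit", "main")]
print(build_call_tree(log))
```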

Can string kernels pass the test of time in Native Language Identification?

Title Can string kernels pass the test of time in Native Language Identification?
Authors Radu Tudor Ionescu, Marius Popescu
Abstract We describe a machine learning approach for the 2017 shared task on Native Language Identification (NLI). The proposed approach combines several kernels using multiple kernel learning. While most of our kernels are based on character p-grams (also known as n-grams) extracted from essays or speech transcripts, we also use a kernel based on i-vectors, a low-dimensional representation of audio recordings, provided by the shared task organizers. For the learning stage, we choose Kernel Discriminant Analysis (KDA) over Kernel Ridge Regression (KRR), because the former classifier obtains better results than the latter one on the development set. In our previous work, we have used a similar machine learning approach to achieve state-of-the-art NLI results. The goal of this paper is to demonstrate that our shallow and simple approach based on string kernels (with minor improvements) can pass the test of time and reach state-of-the-art performance in the 2017 NLI shared task, despite the recent advances in natural language processing. We participated in all three tracks, in which the competitors were allowed to use only the essays (essay track), only the speech transcripts (speech track), or both (fusion track). Using only the data provided by the organizers for training our models, we have reached a macro F1 score of 86.95% in the closed essay track, a macro F1 score of 87.55% in the closed speech track, and a macro F1 score of 93.19% in the closed fusion track. With these scores, our team (UnibucKernel) ranked in the first group of teams in all three tracks, while attaining the best scores in the speech and the fusion tracks.
Tasks Language Identification, Native Language Identification
Published 2017-07-26
URL http://arxiv.org/abs/1707.08349v2
PDF http://arxiv.org/pdf/1707.08349v2.pdf
PWC https://paperswithcode.com/paper/can-string-kernels-pass-the-test-of-time-in
Repo
Framework
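
A minimal character p-gram (spectrum) kernel of the kind the approach builds on; the actual system combines several normalized kernels via multiple kernel learning and classifies with KDA, which this sketch omits.

```python
# Character p-gram (spectrum) kernel: count shared p-grams between two
# texts, weighted by their frequencies.
from collections import Counter

def pgrams(text, p):
    return Counter(text[i:i+p] for i in range(len(text) - p + 1))

def string_kernel(s, t, p_range=(3, 4, 5)):
    k = 0
    for p in p_range:
        a, b = pgrams(s, p), pgrams(t, p)
        k += sum(a[g] * b[g] for g in a if g in b)
    return k

print(string_kernel("the quick brown fox", "the quiet brown cat"))
```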

A Sample Complexity Measure with Applications to Learning Optimal Auctions

Title A Sample Complexity Measure with Applications to Learning Optimal Auctions
Authors Vasilis Syrgkanis
Abstract We introduce a new sample complexity measure, which we refer to as split-sample growth rate. For any hypothesis class $H$ and any sample $S$ of size $m$, the split-sample growth rate $\hat{\tau}_H(m)$ counts how many different hypotheses empirical risk minimization can output on any sub-sample of $S$ of size $m/2$. We show that the expected generalization error is upper bounded by $O\left(\sqrt{\frac{\log(\hat{\tau}_H(2m))}{m}}\right)$. Our result is enabled by a strengthening of the Rademacher complexity analysis of the expected generalization error. We show that this sample complexity measure greatly simplifies the analysis of the sample complexity of optimal auction design for many auction classes studied in the literature: their sample complexity can be derived solely by noticing that, in these auction classes, ERM on any sample or sub-sample will pick parameters that are equal to one of the points in the sample.
Tasks
Published 2017-04-09
URL http://arxiv.org/abs/1704.02598v2
PDF http://arxiv.org/pdf/1704.02598v2.pdf
PWC https://paperswithcode.com/paper/a-sample-complexity-measure-with-applications
Repo
Framework
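
The measure is easy to probe empirically for the auction example the abstract mentions. For single-bidder reserve pricing, ERM on any sub-sample picks a reserve equal to one of the sample points, so the number of distinct ERM outputs over all size-m/2 sub-samples stays small. A toy check, with illustrative values:

```python
# Empirical split-sample growth for one-bidder reserve pricing: ERM always
# outputs a reserve equal to some observed value, so the count of distinct
# outputs over all size-m/2 sub-samples is at most m.
from itertools import combinations

def erm_reserve(values):
    """Revenue-maximizing reserve; always one of the observed values."""
    return max(values, key=lambda r: r * sum(v >= r for v in values))

sample = [0.3, 0.9, 0.5, 0.8, 0.2, 0.7]
half = len(sample) // 2
outputs = {erm_reserve(sub) for sub in combinations(sample, half)}
n_subs = sum(1 for _ in combinations(sample, half))
print(len(outputs), "distinct ERM outputs over", n_subs, "sub-samples")
```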

German in Flux: Detecting Metaphoric Change via Word Entropy

Title German in Flux: Detecting Metaphoric Change via Word Entropy
Authors Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole
Abstract This paper explores the information-theoretic measure entropy to detect metaphoric change, transferring ideas from hypernym detection to research on language change. We also build the first diachronic test set for German as a standard for metaphoric change annotation. Our model shows high performance, is unsupervised, language-independent and generalizable to other processes of semantic change.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.04971v1
PDF http://arxiv.org/pdf/1706.04971v1.pdf
PWC https://paperswithcode.com/paper/german-in-flux-detecting-metaphoric-change
Repo
Framework
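
The entropy measure itself is straightforward: the Shannon entropy of a word's context distribution, tracked across time periods. The sketch below reduces corpus handling to a token list and uses an assumed window size.

```python
# Shannon entropy of a target word's context distribution, the quantity
# the paper tracks over time to flag metaphoric change.
import math
from collections import Counter

def word_entropy(tokens, target, window=2):
    contexts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            contexts.update(t for t in tokens[lo:hi] if t != target)
    total = sum(contexts.values())
    return -sum((c / total) * math.log2(c / total) for c in contexts.values())

corpus = "the head of the table met the head of state at the head".split()
print(word_entropy(corpus, "head"))
```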

Instrument-Armed Bandits

Title Instrument-Armed Bandits
Authors Nathan Kallus
Abstract We extend the classic multi-armed bandit (MAB) model to the setting of noncompliance, where the arm pull is a mere instrument and the treatment applied may differ from it, which gives rise to the instrument-armed bandit (IAB) problem. The IAB setting is relevant whenever the experimental units are human, since free will, ethics, and the law may prohibit unrestricted or forced application of treatment. In particular, the setting is relevant in bandit models of dynamic clinical trials and other controlled trials on human interventions. Nonetheless, the setting has not been fully investigated in the bandit literature. We show that there are various and divergent notions of regret in this setting, all of which coincide only in the classic MAB setting. We characterize the behavior of these regrets and analyze standard MAB algorithms. We argue for a particular kind of regret that captures the causal effect of treatments but show that standard MAB algorithms cannot achieve sublinear control on this regret. Instead, we develop new algorithms for the IAB problem, prove new regret bounds for them, and compare them to standard MAB algorithms in numerical examples.
Tasks
Published 2017-05-21
URL http://arxiv.org/abs/1705.07377v1
PDF http://arxiv.org/pdf/1705.07377v1.pdf
PWC https://paperswithcode.com/paper/instrument-armed-bandits
Repo
Framework
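
The noncompliance mechanism is easy to simulate: the pulled arm is only an instrument, and the applied treatment follows it with some compliance probability. The sketch below (with illustrative parameters) shows how naive per-instrument reward means get pulled toward each other, hinting at why the usual regret notions diverge here.

```python
# Simulate noncompliance: the instrument (pulled arm) and the applied
# treatment can differ, biasing per-instrument reward estimates.
import random

def pull(instrument, compliance=0.7, means=(0.3, 0.6)):
    """Return (treatment actually applied, observed reward)."""
    treatment = instrument if random.random() < compliance else 1 - instrument
    reward = 1.0 if random.random() < means[treatment] else 0.0
    return treatment, reward

random.seed(0)
rewards = {0: [], 1: []}
for t in range(10000):
    arm = random.randrange(2)          # uniform exploration for illustration
    _, r = pull(arm)
    rewards[arm].append(r)
# Means land near 0.39 and 0.51 rather than the true 0.3 and 0.6.
print([sum(v) / len(v) for v in rewards.values()])
```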

Efficient Parallel Connected Components Labeling with a Coarse-to-fine Strategy

Title Efficient Parallel Connected Components Labeling with a Coarse-to-fine Strategy
Authors Jun Chen, Keisuke Nonaka, Ryosuke Watanabe, Hiroshi Sankoh, Houari Sabirin, Sei Naito
Abstract This paper proposes a new parallel approach to connected components labeling on a 2D binary image, implemented with CUDA. We employ the following strategies to accelerate neighborhood exploration after dividing an input image into independent blocks. In the local labeling stage, a coarse-labeling algorithm, including row-column connection and label-equivalence list unification, is applied first to clean up an initialized local label map; a refinement algorithm is then introduced to merge separated sub-regions belonging to a single component. In the block merge stage, we scan only the pixels located on the boundary of each block instead of solving the connectivity of all the pixels. With the proposed method, the length of label-equivalence lists is compressed and the number of memory accesses is reduced, improving the efficiency of connected components labeling. Experimental results show that our method outperforms the other approaches by 29% to 80% on average.
Tasks
Published 2017-12-28
URL http://arxiv.org/abs/1712.09789v2
PDF http://arxiv.org/pdf/1712.09789v2.pdf
PWC https://paperswithcode.com/paper/efficient-parallel-connected-components
Repo
Framework
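
A CPU sketch of the coarse-to-fine idea (the paper's version is a CUDA implementation): label each block independently, then merge labels only across block boundaries with union-find instead of revisiting every pixel.

```python
# Block-wise connected components labeling: local labels per block, then a
# merge stage that scans only block-boundary pixels.
import numpy as np
from scipy.ndimage import label

def blockwise_label(img, bs=4):
    labels = np.zeros_like(img, dtype=int)
    offset = 0
    for r in range(0, img.shape[0], bs):
        for c in range(0, img.shape[1], bs):
            blk, n = label(img[r:r+bs, c:c+bs])   # local labeling stage
            blk[blk > 0] += offset                # make labels globally unique
            labels[r:r+bs, c:c+bs] = blk
            offset += n
    parent = list(range(offset + 1))              # union-find over labels
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]; x = parent[x]
        return x
    # Merge stage: only pixels on block boundaries are examined.
    for r in range(bs, img.shape[0], bs):
        for c in range(img.shape[1]):
            if labels[r, c] and labels[r - 1, c]:
                parent[find(labels[r, c])] = find(labels[r - 1, c])
    for c in range(bs, img.shape[1], bs):
        for r in range(img.shape[0]):
            if labels[r, c] and labels[r, c - 1]:
                parent[find(labels[r, c])] = find(labels[r, c - 1])
    return np.vectorize(lambda x: find(x) if x else 0)(labels)

img = (np.random.default_rng(2).random((8, 8)) > 0.5).astype(int)
print(blockwise_label(img))
```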

Blind Image Deblurring via Reweighted Graph Total Variation

Title Blind Image Deblurring via Reweighted Graph Total Variation
Authors Yuanchao Bai, Gene Cheung, Xianming Liu, Wen Gao
Abstract Blind image deblurring, i.e., deblurring without knowledge of the blur kernel, is a highly ill-posed problem. The problem can be solved in two parts: i) estimate a blur kernel from the blurry image, and ii) given estimated blur kernel, de-convolve blurry input to restore the target image. In this paper, by interpreting an image patch as a signal on a weighted graph, we first argue that a skeleton image—a proxy that retains the strong gradients of the target but smooths out the details—can be used to accurately estimate the blur kernel and has a unique bi-modal edge weight distribution. We then design a reweighted graph total variation (RGTV) prior that can efficiently promote bi-modal edge weight distribution given a blurry patch. However, minimizing a blind image deblurring objective with RGTV results in a non-convex non-differentiable optimization problem. We propose a fast algorithm that solves for the skeleton image and the blur kernel alternately. Finally with the computed blur kernel, recent non-blind image deblurring algorithms can be applied to restore the target image. Experimental results show that our algorithm can robustly estimate the blur kernel with large kernel size, and the reconstructed sharp image is competitive against the state-of-the-art methods.
Tasks Blind Image Deblurring, Deblurring
Published 2017-12-24
URL http://arxiv.org/abs/1712.08877v1
PDF http://arxiv.org/pdf/1712.08877v1.pdf
PWC https://paperswithcode.com/paper/blind-image-deblurring-via-reweighted-graph
Repo
Framework
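
The role of reweighting can be shown on a 1D signal: with edge weights derived from the signal itself, a sharp step costs almost nothing while a slow ramp is penalized, which is how such a prior favors the piecewise-constant "skeleton". The weight form and sigma below are illustrative, not the paper's exact RGTV construction.

```python
# Graph total variation on a chain graph, with optional signal-dependent
# edge weights. Illustrative weight form; not the paper's exact RGTV.
import numpy as np

def graph_tv(x, sigma=0.1, reweighted=True):
    diffs = np.abs(np.diff(x))                        # |x_i - x_j| on chain edges
    if reweighted:
        w = np.exp(-(diffs ** 2) / (2 * sigma ** 2))  # weights from the signal
        return np.sum(w * diffs)
    return np.sum(diffs)

ramp = np.linspace(0, 1, 50)              # many small gradients
step = np.repeat([0.0, 1.0], 25)          # one sharp edge
# The reweighted cost of the step is near zero; the ramp stays expensive.
print(graph_tv(ramp), graph_tv(step))
```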

Upper Bounds on the Runtime of the Univariate Marginal Distribution Algorithm on OneMax

Title Upper Bounds on the Runtime of the Univariate Marginal Distribution Algorithm on OneMax
Authors Carsten Witt
Abstract A runtime analysis of the Univariate Marginal Distribution Algorithm (UMDA) is presented on the OneMax function for wide ranges of its parameters $\mu$ and $\lambda$. If $\mu\ge c\log n$ for some constant $c>0$ and $\lambda=(1+\Theta(1))\mu$, a general bound $O(\mu n)$ on the expected runtime is obtained. This bound crucially assumes that all marginal probabilities of the algorithm are confined to the interval $[1/n,1-1/n]$. If $\mu\ge c'\sqrt{n}\log n$ for a constant $c'>0$ and $\lambda=(1+\Theta(1))\mu$, the behavior of the algorithm changes and the bound on the expected runtime becomes $O(\mu\sqrt{n})$, which typically even holds if the borders on the marginal probabilities are omitted. The results supplement the recently derived lower bound $\Omega(\mu\sqrt{n}+n\log n)$ by Krejca and Witt (FOGA 2017) and turn out to be tight for the two very different values $\mu=c\log n$ and $\mu=c'\sqrt{n}\log n$. They also improve the previously best known upper bound $O(n\log n\log\log n)$ by Dang and Lehre (GECCO 2015).
Tasks
Published 2017-03-31
URL http://arxiv.org/abs/1704.00026v4
PDF http://arxiv.org/pdf/1704.00026v4.pdf
PWC https://paperswithcode.com/paper/upper-bounds-on-the-runtime-of-the-univariate
Repo
Framework
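
The analyzed algorithm itself is short enough to state directly: UMDA on OneMax with the marginal probabilities clipped to the borders $[1/n, 1-1/n]$ assumed in the first bound. A direct sketch with illustrative population sizes:

```python
# UMDA on OneMax with marginal-probability borders, as in the analysis.
import numpy as np

def umda_onemax(n=50, mu=20, lam=40, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                         # marginal probabilities
    for gen in range(10000):
        pop = rng.random((lam, n)) < p          # sample lambda bit strings
        fitness = pop.sum(axis=1)               # OneMax: number of ones
        if fitness.max() == n:
            return gen                          # optimum found
        best = pop[np.argsort(-fitness)[:mu]]   # select the mu fittest
        p = np.clip(best.mean(axis=0), 1/n, 1 - 1/n)  # update with borders
    return None

print(umda_onemax())
```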