May 7, 2019

3044 words 15 mins read

Paper Group ANR 20



A new selection strategy for selective cluster ensemble based on Diversity and Independency

Title A new selection strategy for selective cluster ensemble based on Diversity and Independency
Authors Muhammad Yousefnezhad, Ali Reihanian, Daoqiang Zhang, Behrouz Minaei-Bidgoli
Abstract This research introduces a new strategy for cluster ensemble selection based on Independency and Diversity metrics. In recent years, Diversity and Quality, two metrics used in the evaluation procedure, have been employed to select basic clustering results in cluster ensemble selection. Although quality can improve the final results of a cluster ensemble, it cannot control the procedure that generates the basic results, which leaves a gap in predicting the accuracy of the generated basic results. Instead of quality, this paper introduces Independency as a supplementary method to be used in conjunction with Diversity. To this end, the paper uses a heuristic metric, based on the procedure of converting code to a graph in software testing, to calculate the Independency of two basic clustering algorithms. Moreover, a new modeling language, which we call the “Clustering Algorithms Independency Language” (CAIL), is introduced to generate graphs that depict the Independency of algorithms. In addition, Uniformity, a new similarity metric, is introduced for evaluating the diversity of basic results. As evidence, our experimental results on a variety of standard data sets show that the proposed framework dramatically improves the accuracy of final results in comparison with other cluster ensemble methods.
Tasks
Published 2016-10-09
URL http://arxiv.org/abs/1610.02649v1
PDF http://arxiv.org/pdf/1610.02649v1.pdf
PWC https://paperswithcode.com/paper/a-new-selection-strategy-for-selective
Repo
Framework
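
The Uniformity and Independency metrics are not fully specified in the abstract, so the sketch below only illustrates the general shape of diversity-based ensemble selection, using 1 - NMI between base clusterings as a stand-in diversity measure; the greedy selection loop and all names are assumptions of this sketch, not the paper's method.

```python
# A minimal sketch of diversity-based selection for a cluster ensemble.
# Diversity between two base clusterings is measured here as 1 - NMI,
# a common proxy in the ensemble-selection literature (an assumption;
# the paper uses its own Uniformity metric).
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def diversity(labels_a, labels_b):
    """Diversity of two base clusterings: 1 - NMI (0 = identical)."""
    return 1.0 - normalized_mutual_info_score(labels_a, labels_b)

def select_diverse_subset(base_results, k):
    """Greedily pick k base clusterings that maximize mutual diversity."""
    chosen = [0]  # seed with the first result (arbitrary choice)
    while len(chosen) < k:
        best, best_score = None, -1.0
        for i in range(len(base_results)):
            if i in chosen:
                continue
            # worst-case diversity against everything already chosen
            score = min(diversity(base_results[i], base_results[j]) for j in chosen)
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen
```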

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

Title Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
Authors Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Zhijiang Zhang, Xiang Bai
Abstract The object skeleton is a useful cue for object detection, complementary to the object contour, as it provides a structural representation describing the relationship among object parts. However, object skeleton extraction in natural images is a very challenging problem, as it requires the extractor to capture both local and global image context in order to determine the intrinsic scale of each skeleton pixel. Existing methods rely on per-pixel multi-scale feature computation, which makes modeling difficult and computation slow. In this paper, we present a fully convolutional network with multiple scale-associated side outputs to address this problem. Observing the relationship between the receptive field sizes of the network’s sequential stages and the skeleton scales they can capture, we introduce a scale-associated side output at each stage. We impose supervision on different stages by guiding the scale-associated side outputs toward ground-truth skeletons of different scales. The responses of the multiple scale-associated side outputs are then fused in a scale-specific way to effectively localize skeleton pixels at multiple scales. Our method achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors.
Tasks Object Detection
Published 2016-03-31
URL http://arxiv.org/abs/1603.09446v2
PDF http://arxiv.org/pdf/1603.09446v2.pdf
PWC https://paperswithcode.com/paper/object-skeleton-extraction-in-natural-images
Repo
Framework
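
As an illustration of the fused scale-associated side outputs the abstract describes, here is a minimal PyTorch sketch: each backbone stage gets a 1x1 convolution head predicting skeleton maps for the scales its receptive field can capture, and the upsampled responses are fused by a learned 1x1 convolution. Stage channel widths and the number of scales are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch of scale-associated side outputs, assuming a backbone
# with four stages of the given channel widths (an assumption).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAssociatedSideOutputs(nn.Module):
    def __init__(self, stage_channels=(64, 128, 256, 512), n_scales=4):
        super().__init__()
        # One side-output head per stage; stage s predicts s+1 scale maps
        # plus a background channel (deeper stages see larger scales).
        self.heads = nn.ModuleList([
            nn.Conv2d(c, s + 2, kernel_size=1)
            for s, c in enumerate(stage_channels)
        ])
        total = sum(s + 2 for s in range(len(stage_channels)))
        self.fuse = nn.Conv2d(total, n_scales + 1, kernel_size=1)

    def forward(self, stage_feats, out_size):
        # Upsample every side output to the input resolution, then fuse.
        side = [F.interpolate(h(f), size=out_size, mode="bilinear",
                              align_corners=False)
                for h, f in zip(self.heads, stage_feats)]
        return self.fuse(torch.cat(side, dim=1)), side
```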

Tighter bounds lead to improved classifiers

Title Tighter bounds lead to improved classifiers
Authors Nicolas Le Roux
Abstract The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.
Tasks
Published 2016-06-29
URL http://arxiv.org/abs/1606.09202v2
PDF http://arxiv.org/pdf/1606.09202v2.pdf
PWC https://paperswithcode.com/paper/tighter-bounds-lead-to-improved-classifiers
Repo
Framework
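
For context on why the log-loss is an upper bound that becomes loose far from the decision boundary, here is the standard derivation for the binary case (textbook material, not taken from the paper itself):

```latex
% Binary case: a misclassified example must have p_\theta(y \mid x) \le 1/2,
% so the 0-1 loss is bounded by a scaled log-loss:
\[
  \mathbf{1}\left[\hat{y}(x) \neq y\right]
    \;\le\; \frac{-\log p_\theta(y \mid x)}{\log 2}.
\]
% The bound is tight near p_\theta(y \mid x) = 1/2 but diverges as
% p_\theta(y \mid x) \to 0, which is exactly the over-weighting of
% examples far from the decision boundary that motivates re-tightening
% the bound during optimization.
```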

Fuzzy Sets Across the Natural Language Generation Pipeline

Title Fuzzy Sets Across the Natural Language Generation Pipeline
Authors A. Ramos-Soto, A. Bugarín, S. Barro
Abstract We explore the implications of using fuzzy techniques (mainly those commonly used in the linguistic description/summarization of data discipline) from a natural language generation perspective. To this end, we provide an extensive discussion of some general convergence points and explore the relationship between the different tasks involved in the standard NLG system pipeline architecture and the most common fuzzy approaches used in the linguistic summarization/description of data, such as fuzzy quantified statements, evaluation criteria, and aggregation operators. Each individual discussion is illustrated with a related use case. Recent work on the cross-fertilization of the two research fields is also referenced. This paper encompasses general ideas that emerged as part of the PhD thesis “Application of fuzzy sets in data-to-text systems”. It does not present a specific application or a formal approach, but rather discusses current high-level issues and potential usages of fuzzy sets (focused on linguistic summarization of data) in natural language generation.
Tasks Text Generation
Published 2016-05-17
URL http://arxiv.org/abs/1605.05303v1
PDF http://arxiv.org/pdf/1605.05303v1.pdf
PWC https://paperswithcode.com/paper/fuzzy-sets-across-the-natural-language
Repo
Framework
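
As one concrete instance of the fuzzy quantified statements the abstract mentions, here is a minimal Python sketch following Zadeh's classic evaluation scheme (truth degree = quantifier applied to the average membership); the membership functions and the "warm days" example are illustrative assumptions.

```python
# Evaluating a fuzzy quantified statement such as "most days were warm",
# as used in linguistic data summarization. Membership shapes are assumed.
import numpy as np

def warm(temp_c):
    """Membership of 'warm': 0 below 15 C, 1 above 25 C, linear between."""
    return np.clip((temp_c - 15.0) / 10.0, 0.0, 1.0)

def most(proportion):
    """Fuzzy quantifier 'most': 0 below 0.3, 1 above 0.8, linear between."""
    return np.clip((proportion - 0.3) / 0.5, 0.0, 1.0)

temps = np.array([18.0, 22.5, 26.0, 30.0, 14.0, 24.0])
truth = most(warm(temps).mean())  # truth degree of "most days were warm"
print(f"'Most days were warm' holds to degree {truth:.2f}")
```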

Ischemic Stroke Identification Based on EEG and EOG using 1D Convolutional Neural Network and Batch Normalization

Title Ischemic Stroke Identification Based on EEG and EOG using 1D Convolutional Neural Network and Batch Normalization
Authors Endang Purnama Giri, Mohamad Ivan Fanany, Aniati Murni Arymurthy
Abstract In 2015, stroke was the leading cause of death in Indonesia, and the majority of strokes are ischemic. The standard tool for diagnosing stroke is the CT scan. In developing countries like Indonesia, CT scanners are scarce and still relatively expensive; given this limited availability, EEG is another device with the potential to diagnose stroke there. Ischemic stroke is caused by an obstruction that lowers the cerebral blood flow (CBF) of a stroke patient below that of a normal person (control), so the EEG signal slows down. In this study, we examine the ability of a 1D Convolutional Neural Network (1DCNN) to construct a classification model that can distinguish the EEG and EOG data of stroke patients from those of controls. To accelerate training, we use Batch Normalization. With data from 62 subjects, in a leave-one-out scenario with five repetitions of measurement, we obtain an average accuracy of 0.86 (F-score 0.861) after only 200 epochs. This result is better than all of the shallow and popular classifiers used as comparators (best accuracy 0.69, F-score 0.72). Our study used only 24 handcrafted features obtained with a simple feature extraction process.
Tasks EEG
Published 2016-10-06
URL http://arxiv.org/abs/1610.01757v1
PDF http://arxiv.org/pdf/1610.01757v1.pdf
PWC https://paperswithcode.com/paper/ischemic-stroke-identification-based-on-eeg
Repo
Framework
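
A minimal PyTorch sketch of a 1D CNN with Batch Normalization for two-class EEG/EOG classification follows; only the 24-feature input length and the use of batch normalization come from the abstract, while the layer sizes and kernel widths are assumptions.

```python
# A sketch of a two-class 1D CNN with batch normalization; the exact
# architecture of the paper is not reproduced here.
import torch
import torch.nn as nn

class Stroke1DCNN(nn.Module):
    def __init__(self, in_channels=1, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=3, padding=1),
            nn.BatchNorm1d(16),       # batch norm to accelerate training
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the feature sequence
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):             # x: (batch, channels, seq_len)
        return self.classifier(self.features(x).squeeze(-1))

model = Stroke1DCNN()
logits = model(torch.randn(8, 1, 24))  # 8 samples of 24 handcrafted features
```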

Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking

Title Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
Authors Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg
Abstract Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets. Code and supplementary material are available at http://www.cvl.isy.liu.se/research/objrec/visualtracking/decontrack/index.html .
Tasks Visual Tracking
Published 2016-09-20
URL http://arxiv.org/abs/1609.06118v1
PDF http://arxiv.org/pdf/1609.06118v1.pdf
PWC https://paperswithcode.com/paper/adaptive-decontamination-of-the-training-set
Repo
Framework
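
The unified formulation can be pictured as alternating minimization of one loss over the appearance model and the per-sample quality weights. The sketch below uses weighted ridge regression and an entropy-regularized weight update as stand-ins; the paper's actual loss and regularizers differ.

```python
# A hedged sketch of joint training-set decontamination: alternate between
# fitting the model with current sample weights and re-weighting samples by
# how well the model explains them (low residual -> high weight).
import numpy as np

def joint_decontaminate(X, y, n_iters=10, lam=1e-2, mu=5.0):
    n, d = X.shape
    alpha = np.full(n, 1.0 / n)                    # sample quality weights
    for _ in range(n_iters):
        # 1) weighted ridge regression with the current weights
        A = (X * alpha[:, None]).T @ X + lam * np.eye(d)
        w = np.linalg.solve(A, (X * alpha[:, None]).T @ y)
        # 2) entropy-regularized weight update: alpha_i ∝ exp(-residual/mu),
        #    the closed-form minimizer of sum_i alpha_i r_i + mu sum_i alpha_i log alpha_i
        resid = (X @ w - y) ** 2
        alpha = np.exp(-resid / mu)
        alpha /= alpha.sum()
    return w, alpha
```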

Learning Binary Features Online from Motion Dynamics for Incremental Loop-Closure Detection and Place Recognition

Title Learning Binary Features Online from Motion Dynamics for Incremental Loop-Closure Detection and Place Recognition
Authors Guangcong Zhang, Mason J. Lilly, Patricio A. Vela
Abstract This paper proposes a simple yet effective approach to learning visual features online for improving loop-closure detection and place recognition, based on bag-of-words frameworks. The approach learns a codeword in a bag-of-words model from a pair of matched features from two consecutive frames, such that the codeword has temporally-derived perspective invariance to camera motion. The learning algorithm is efficient: the binary descriptor is generated from the mean image patch, and the mask is learned based on discriminative projection by minimizing the intra-class distances among the learned feature and the two original features. A codeword for bag-of-words models is generated by packaging the learned descriptor and mask, with a masked Hamming distance defined to measure the distance between two codewords. The geometric properties of the learned codewords are then mathematically justified. In addition, hypothesis constraints are imposed through temporal consistency in matched codewords, which improves precision. The approach, integrated in an incremental bag-of-words system, is validated on multiple benchmark data sets and compared to state-of-the-art methods. Experiments demonstrate improved precision/recall, outperforming the state of the art with little loss in runtime.
Tasks Loop Closure Detection
Published 2016-01-15
URL http://arxiv.org/abs/1601.03821v2
PDF http://arxiv.org/pdf/1601.03821v2.pdf
PWC https://paperswithcode.com/paper/learning-binary-features-online-from-motion
Repo
Framework
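
A minimal sketch of the masked Hamming distance between two codewords, under one plausible reading (bits excluded by either mask do not contribute to the distance), with descriptors and masks packed into uint8 arrays:

```python
# Masked Hamming distance between (descriptor, mask) codeword pairs.
# The exact masking convention of the paper is an assumption here.
import numpy as np

def masked_hamming(desc_a, mask_a, desc_b, mask_b):
    """Hamming distance counted only over bits that both masks keep."""
    keep = mask_a & mask_b          # bits valid in both codewords
    diff = (desc_a ^ desc_b) & keep # differing bits among the kept ones
    return int(np.unpackbits(diff).sum())

rng = np.random.default_rng(0)
d1 = rng.integers(0, 256, 32, dtype=np.uint8)  # 256-bit descriptors
d2 = rng.integers(0, 256, 32, dtype=np.uint8)
m1 = rng.integers(0, 256, 32, dtype=np.uint8)  # learned masks
m2 = rng.integers(0, 256, 32, dtype=np.uint8)
print(masked_hamming(d1, m1, d2, m2))
```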

A Neural Network Model to Classify Liver Cancer Patients Using Data Expansion and Compression

Title A Neural Network Model to Classify Liver Cancer Patients Using Data Expansion and Compression
Authors Ashkan Zeinalzadeh, Tom Wenska, Gordon Okimoto
Abstract We develop a neural network model to classify liver cancer patients into high-risk and low-risk groups using genomic data. Our approach provides a novel technique to classify big data sets using neural network models. We preprocess the data before training the neural network models. We first expand the data using wavelet analysis. We then compress the wavelet coefficients by mapping them onto a new scaled orthonormal coordinate system. Then the data is used to train a neural network model that enables us to classify cancer patients into two different classes of high-risk and low-risk patients. We use the leave-one-out approach to build a neural network model. This neural network model enables us to classify a patient using genomic data as a high-risk or low-risk patient without any information about the survival time of the patient. The results from genomic data analysis are compared with survival time analysis. It is shown that the expansion and compression of data using wavelet analysis and singular value decomposition (SVD) is essential to train the neural network model.
Tasks
Published 2016-11-23
URL http://arxiv.org/abs/1611.07588v2
PDF http://arxiv.org/pdf/1611.07588v2.pdf
PWC https://paperswithcode.com/paper/a-neural-network-model-to-classify-liver
Repo
Framework
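
The expand-then-compress preprocessing can be sketched as a discrete wavelet transform followed by projection onto scaled singular vectors. The wavelet family, the rank, and the function name below are illustrative assumptions, not the authors' exact pipeline.

```python
# A hedged sketch of wavelet expansion followed by SVD compression, as a
# preprocessing step before training a classifier.
import numpy as np
import pywt

def expand_compress(X, wavelet="db4", rank=32):
    # 1) expansion: concatenate the DWT coefficients of each sample
    coeffs = [np.concatenate(pywt.wavedec(row, wavelet)) for row in X]
    C = np.vstack(coeffs)
    # 2) compression: map onto a scaled orthonormal coordinate system
    #    given by the top right-singular vectors of the coefficient matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return C @ Vt[:rank].T / s[:rank]  # coordinates in the new basis
```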

Quantum Perceptron Models

Title Quantum Perceptron Models
Authors Nathan Wiebe, Ashish Kapoor, Krysta M Svore
Abstract We demonstrate how quantum computation can provide non-trivial improvements in the computational and statistical complexity of the perceptron model. We develop two quantum algorithms for perceptron learning. The first algorithm exploits quantum information processing to determine a separating hyperplane using a number of steps sublinear in the number of data points $N$, namely $O(\sqrt{N})$. The second algorithm illustrates how the classical mistake bound of $O(\frac{1}{\gamma^2})$ can be further improved to $O(\frac{1}{\sqrt{\gamma}})$ through quantum means, where $\gamma$ denotes the margin. Such improvements are achieved through the application of quantum amplitude amplification to the version space interpretation of the perceptron model.
Tasks
Published 2016-02-15
URL http://arxiv.org/abs/1602.04799v1
PDF http://arxiv.org/pdf/1602.04799v1.pdf
PWC https://paperswithcode.com/paper/quantum-perceptron-models
Repo
Framework
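
For reference, here is the classical online perceptron whose $O(\frac{1}{\gamma^2})$ mistake bound the second quantum algorithm improves; the quantum routines themselves (amplitude amplification over data points or over the version space) are not sketched here.

```python
# The classical online perceptron, for reference: on unit-norm data with
# margin gamma, the number of updates is bounded by 1/gamma^2.
import numpy as np

def perceptron(X, y, epochs=100):
    """X: (N, d) unit-norm points, y: labels in {-1, +1}."""
    w, mistakes = np.zeros(X.shape[1]), 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:   # misclassified: make an update
                w += yi * xi
                mistakes += 1
    return w, mistakes
```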

Estimating the class prior and posterior from noisy positives and unlabeled data

Title Estimating the class prior and posterior from noisy positives and unlabeled data
Authors Shantanu Jain, Martha White, Predrag Radivojac
Abstract We develop a classification algorithm for estimating posterior distributions from positive-unlabeled data that is robust to noise in the positive labels and effective for high-dimensional data. In recent years, several algorithms have been proposed to learn from positive-unlabeled data; however, many of these contributions remain theoretical, performing poorly on real high-dimensional data that is typically contaminated with noise. We build on this previous work to develop two practical classification algorithms that explicitly model the noise in the positive labels and utilize univariate transforms built on discriminative classifiers. We prove that these univariate transforms preserve the class prior, enabling estimation in the univariate space and avoiding kernel density estimation for high-dimensional data. The theoretical development and both the parametric and nonparametric algorithms proposed here constitute an important step towards widespread use of robust classification algorithms for positive-unlabeled data.
Tasks Density Estimation
Published 2016-06-28
URL http://arxiv.org/abs/1606.08561v2
PDF http://arxiv.org/pdf/1606.08561v2.pdf
PWC https://paperswithcode.com/paper/estimating-the-class-prior-and-posterior-from
Repo
Framework
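
In the same spirit as the paper's univariate transforms, the sketch below uses a discriminative classifier's score as a one-dimensional transform of the data and then estimates the class prior with the classic Elkan-Noto estimator. That estimator assumes clean positive labels, whereas the paper's estimators additionally model noise in the positives, so this is only a simplified stand-in.

```python
# A hedged sketch of class-prior estimation from positive-unlabeled data,
# using a classifier score as the univariate transform.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_prior(X_pos, X_unl):
    X = np.vstack([X_pos, X_unl])
    s = np.r_[np.ones(len(X_pos)), np.zeros(len(X_unl))]  # labeled vs not
    clf = LogisticRegression(max_iter=1000).fit(X, s)
    c = clf.predict_proba(X_pos)[:, 1].mean()         # P(labeled | positive)
    return clf.predict_proba(X_unl)[:, 1].mean() / c  # prior of positives
```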

Deep learning in color: towards automated quark/gluon jet discrimination

Title Deep learning in color: towards automated quark/gluon jet discrimination
Authors Patrick T. Komiske, Eric M. Metodiev, Matthew D. Schwartz
Abstract Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. To establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green and blue intensities given by the transverse momentum in charged particles, transverse momentum in neutral particles, and pixel-level charged particle counts. Overall, the deep networks match or outperform traditional jet variables. We also find that, while various simulations produce different quark and gluon jets, the neural networks are surprisingly insensitive to these differences, similar to traditional observables. This suggests that the networks can extract robust physical information from imperfect simulations.
Tasks
Published 2016-12-05
URL http://arxiv.org/abs/1612.01551v3
PDF http://arxiv.org/pdf/1612.01551v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-in-color-towards-automated
Repo
Framework
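
The three-channel jet-image construction described in the abstract can be sketched directly: red = transverse momentum in charged particles, green = transverse momentum in neutral particles, blue = charged-particle counts, each binned on an eta-phi grid. The grid size and extent below are illustrative assumptions.

```python
# Building a "color" jet image from per-particle kinematics, assuming
# eta/phi are already centered on the jet axis.
import numpy as np

def jet_image(eta, phi, pt, is_charged, bins=33, extent=0.8):
    rng = [[-extent, extent], [-extent, extent]]
    ch = is_charged.astype(bool)
    red, _, _ = np.histogram2d(eta[ch], phi[ch], bins=bins, range=rng,
                               weights=pt[ch])        # charged pT
    green, _, _ = np.histogram2d(eta[~ch], phi[~ch], bins=bins, range=rng,
                                 weights=pt[~ch])     # neutral pT
    blue, _, _ = np.histogram2d(eta[ch], phi[ch], bins=bins, range=rng)
    return np.stack([red, green, blue])               # (3, bins, bins)
```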

Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation

Title Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation
Authors Antonio Jimeno Yepes
Abstract Word sense disambiguation helps identify the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus, ambiguities appear, and highly effective disambiguation methods are required. Supervised learning methods are one of the approaches used to perform disambiguation: features extracted from the context of an ambiguous word are used to identify its proper sense. The type of features used has an impact on machine learning methods and thus affects disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word, and we have also explored more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and also make it possible to use recurrent neural network classifiers based on Long Short-Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state-of-the-art performance, with a macro accuracy of 95.97 on the MSH WSD data set.
Tasks Word Embeddings, Word Sense Disambiguation
Published 2016-04-09
URL http://arxiv.org/abs/1604.02506v3
PDF http://arxiv.org/pdf/1604.02506v3.pdf
PWC https://paperswithcode.com/paper/word-embeddings-and-recurrent-neural-networks
Repo
Framework
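
The best-performing combination in the abstract, unigrams plus word embeddings fed to an SVM, can be sketched as follows; the embedding lookup stands in for MEDLINE-trained vectors, and all names here are assumptions of this sketch.

```python
# Unigram counts concatenated with averaged word embeddings, classified by
# a linear SVM; embeddings is assumed to be a dict mapping word -> vector.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

def featurize(contexts, embeddings, dim):
    vec = CountVectorizer()
    unigrams = vec.fit_transform(contexts).toarray()
    avg_emb = np.vstack([
        np.mean([embeddings.get(w, np.zeros(dim)) for w in c.split()], axis=0)
        for c in contexts
    ])
    return np.hstack([unigrams, avg_emb]), vec

# contexts: context windows around the ambiguous word
# senses: the correct sense label for each context
# X, vec = featurize(contexts, embeddings, dim=200)
# clf = LinearSVC().fit(X, senses)
```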

Semi-supervised Discovery of Informative Tweets During the Emerging Disasters

Title Semi-supervised Discovery of Informative Tweets During the Emerging Disasters
Authors Shanshan Zhang, Slobodan Vucetic
Abstract The first objective towards the effective use of microblogging services such as Twitter for situational awareness during emerging disasters is the discovery of disaster-related postings. Given the wide range of possible disasters, using a pre-selected set of disaster-related keywords for the discovery is suboptimal. An alternative that we focus on in this work is to train a classifier using a small set of labeled postings that become available as a disaster emerges. Our hypothesis is that utilizing large quantities of historical microblogs could improve the quality of classification, compared to training a classifier only on the labeled data. We propose to use unlabeled microblogs to cluster words into a limited number of clusters and to use the word clusters as features for classification. To evaluate the proposed semi-supervised approach, we used Twitter data from 6 different disasters. Our results indicate that when the number of labeled tweets is 100 or less, the proposed approach is superior to standard classification based on the bag-of-words feature representation. Our results also reveal that the choice of the unlabeled corpus, the choice of word clustering algorithm, and the choice of hyperparameters can have a significant impact on classification accuracy.
Tasks
Published 2016-10-12
URL http://arxiv.org/abs/1610.03750v1
PDF http://arxiv.org/pdf/1610.03750v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-discovery-of-informative
Repo
Framework
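
The word-cluster features can be sketched as k-means over word vectors derived from a large unlabeled corpus (one of several clustering choices the paper compares), with each tweet represented by its counts over cluster IDs instead of raw words:

```python
# Word clusters from unlabeled data as features for tweet classification.
# The use of k-means on word embeddings is one choice among those compared.
import numpy as np
from sklearn.cluster import KMeans

def build_word_clusters(vocab, word_vectors, n_clusters=100):
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(word_vectors)
    return dict(zip(vocab, km.labels_))        # word -> cluster id

def tweet_features(tweet, word_to_cluster, n_clusters=100):
    x = np.zeros(n_clusters)
    for w in tweet.lower().split():
        if w in word_to_cluster:
            x[word_to_cluster[w]] += 1         # count words per cluster
    return x
```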

Compressed Online Dictionary Learning for Fast fMRI Decomposition

Title Compressed Online Dictionary Learning for Fast fMRI Decomposition
Authors Arthur Mensch, Gaël Varoquaux, Bertrand Thirion
Abstract We present a method for fast resting-state fMRI spatial decompositions of very large datasets, based on reducing the temporal dimension before applying dictionary learning to concatenated individual records from groups of subjects. Introducing a measure of correspondence between spatial decompositions of resting-state fMRI, we demonstrate that time-reduced dictionary learning produces results as reliable as non-reduced decompositions. We also show that this reduction significantly improves computational scalability.
Tasks Dictionary Learning
Published 2016-02-08
URL http://arxiv.org/abs/1602.02701v1
PDF http://arxiv.org/pdf/1602.02701v1.pdf
PWC https://paperswithcode.com/paper/compressed-online-dictionary-learning-for
Repo
Framework
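
The time-reduction idea can be sketched as a randomized SVD along the temporal axis followed by online dictionary learning on the reduced matrix; the component count, the rank, and the choice of randomized SVD are assumptions of this sketch.

```python
# A hedged sketch of time-reduced dictionary learning for fMRI data.
import numpy as np
from sklearn.utils.extmath import randomized_svd
from sklearn.decomposition import MiniBatchDictionaryLearning

def time_reduced_dictionary(X, n_components=40, rank=200):
    # X: (n_timepoints, n_voxels), subjects concatenated along time
    U, s, Vt = randomized_svd(X, n_components=rank, random_state=0)
    X_red = s[:, None] * Vt   # (rank, n_voxels): time dimension reduced
    dico = MiniBatchDictionaryLearning(n_components=n_components,
                                       random_state=0)
    dico.fit(X_red)
    return dico.components_   # spatial maps, one row per component
```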

Fast Randomized Semi-Supervised Clustering

Title Fast Randomized Semi-Supervised Clustering
Authors Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová
Abstract We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be achieved from $O(n)$ randomly chosen measurements, where $n$ is the number of items in the dataset. Our algorithm is therefore efficient both in terms of time and space complexities. We also investigate numerically the performance of the algorithm on synthetic and real world data.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06422v3
PDF http://arxiv.org/pdf/1605.06422v3.pdf
PWC https://paperswithcode.com/paper/fast-randomized-semi-supervised-clustering
Repo
Framework
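
The power iteration of the non-backtracking operator can be sketched as message passing on the directed edges of the comparison graph; this toy version uses only the comparison signs and ignores the partially-labeled (semi-supervised) part of the algorithm.

```python
# A hedged sketch of non-backtracking power iteration for two clusters.
# A message on directed edge (i -> j) aggregates the signed messages
# arriving at i from all neighbors except j (no backtracking).
import numpy as np

def nb_power_iteration(edges, signs, n, n_iters=30):
    # edges: undirected pairs (i, j); signs[k]: +1 same cluster, -1 different
    nbrs = {i: [] for i in range(n)}
    for k, (i, j) in enumerate(edges):
        nbrs[i].append((j, signs[k]))
        nbrs[j].append((i, signs[k]))
    msg = {(i, j): np.random.randn() for i in nbrs for j, _ in nbrs[i]}
    for _ in range(n_iters):
        new = {}
        for (i, j) in msg:
            new[(i, j)] = sum(s * msg[(k, i)] for k, s in nbrs[i] if k != j)
        norm = np.sqrt(sum(v * v for v in new.values())) or 1.0
        msg = {e: v / norm for e, v in new.items()}
    # node estimate: sign of the total signed message arriving at each node
    return np.array([np.sign(sum(s * msg[(k, i)] for k, s in nbrs[i]))
                     for i in range(n)])
```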