Paper Group ANR 550
Learning to Rank Scientific Documents from the Crowd; Efficient Multiple Incremental Computation for Kernel Ridge Regression with Bayesian Uncertainty Modeling; Using Fuzzy Logic to Leverage HTML Markup for Web Page Representation; What makes ImageNet good for transfer learning?; Note on the equivalence of hierarchical variational models and auxiliary deep generative models; Vector Quantization for Machine Vision; Vocabulary Manipulation for Neural Machine Translation; Deep Learning Markov Random Field for Semantic Segmentation; A Novel Progressive Learning Technique for Multi-class Classification; UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation; Improving Vertebra Segmentation through Joint Vertebra-Rib Atlases; Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning; Kernel regression, minimax rates and effective dimensionality: beyond the regular case; Novelty Detection in MultiClass Scenarios with Incomplete Set of Class Labels; Multiple protein feature prediction with statistical relational learning
Learning to Rank Scientific Documents from the Crowd
Title | Learning to Rank Scientific Documents from the Crowd |
Authors | Jesse M Lingeman, Hong Yu |
Abstract | Finding related published articles is an important task in any science, but with the explosion of new work in the biomedical domain it has become especially challenging. Most existing methodologies use text similarity metrics to identify whether two articles are related or not. However, biomedical knowledge discovery is hypothesis-driven, and the most related articles may not be the ones with the highest text similarity. In this study, we first develop an innovative crowd-sourcing approach to build an expert-annotated document-ranking corpus. Using this corpus as the gold standard, we then evaluate approaches that use text similarity to rank the relatedness of articles. Finally, we develop and evaluate a new supervised model to automatically rank related scientific articles. Our results show that authors’ rankings differ significantly from rankings by text-similarity-based models. By training a learning-to-rank model on a subset of the annotated corpus, we found that the best supervised learning-to-rank model (SVM-Rank) significantly surpassed state-of-the-art baseline systems. |
Tasks | Document Ranking, Learning-To-Rank |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.01400v1 |
http://arxiv.org/pdf/1611.01400v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-rank-scientific-documents-from |
Repo | |
Framework | |
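The entry above reports SVM-Rank as the best supervised learning-to-rank model. As a rough illustration of the underlying pairwise idea only, the sketch below reduces ranking to classifying pairwise feature differences with a linear SVM (a common RankSVM-style construction); the toy features and relevance labels are invented for the example and this is not the authors' exact configuration.

```python
# Pairwise learning-to-rank sketch in the spirit of RankSVM.
# Illustrative reduction only; feature names and data are hypothetical.
import numpy as np
from sklearn.svm import LinearSVC

def make_pairs(X, y):
    """Turn per-document relevance scores into pairwise difference examples."""
    Xp, yp = [], []
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                Xp.append(X[i] - X[j]); yp.append(1)
                Xp.append(X[j] - X[i]); yp.append(-1)
    return np.array(Xp), np.array(yp)

# Toy data: 5 candidate articles, 3 similarity features each, expert ranks as labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([3, 1, 2, 0, 4])          # higher = more related, per annotators

Xp, yp = make_pairs(X, y)
model = LinearSVC(C=1.0).fit(Xp, yp)   # learned weight vector induces a ranking function
scores = X @ model.coef_.ravel()
print(np.argsort(-scores))             # candidate articles sorted by predicted relatedness
```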
Efficient Multiple Incremental Computation for Kernel Ridge Regression with Bayesian Uncertainty Modeling
Title | Efficient Multiple Incremental Computation for Kernel Ridge Regression with Bayesian Uncertainty Modeling |
Authors | Bo-Wei Chen, Nik Nailah Binti Abdullah, Sangoh Park |
Abstract | This study presents an efficient incremental/decremental approach for big streams based on Kernel Ridge Regression (KRR), a data-analysis technique frequently used in cloud centers. To avoid reanalyzing the whole dataset whenever sensors receive new training data, typical incremental KRR uses a single-instance mechanism to update an existing system. However, this inevitably incurs redundant computation, limiting its applicability to big streams. To this end, the proposed mechanism supports incremental/decremental processing for both single and multiple samples (i.e., batch processing). Large-scale data can be divided into batches and processed by a machine without sacrificing accuracy. Moreover, incremental/decremental analyses in empirical and intrinsic space are also proposed in this study to handle data with either a large number of samples or high feature dimensions, whereas typical methods focus on only one type. Finally, we extend the proposed mechanism to statistical Kernelized Bayesian Regression, so that uncertainty modeling with incremental/decremental computation becomes applicable. Experimental results show that computational time is significantly reduced relative to both the original nonincremental design and the typical single-instance incremental method, while the accuracy of the proposed method remains the same as the baselines; that is, the system improves efficiency without sacrificing accuracy. These findings indicate that the proposed method is well suited to variable streaming data analysis. |
Tasks | |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00621v3 |
http://arxiv.org/pdf/1608.00621v3.pdf | |
PWC | https://paperswithcode.com/paper/efficient-multiple-incremental-computation |
Repo | |
Framework | |
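To make the batch-incremental idea concrete, here is a minimal sketch of updating a dual-space KRR solution when a batch of new samples arrives, using a block-matrix (Schur complement) inverse instead of re-inverting the full regularized kernel matrix. This is a generic illustration under standard KRR assumptions, not the paper's exact update rule; the RBF kernel, regularization value, and toy data are placeholders.

```python
# Batch-incremental KRR in the dual (empirical) space via a block inverse.
# Generic sketch only, not the authors' algorithm.
import numpy as np

def rbf(A, B, gamma=0.5):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

lam = 0.1
rng = np.random.default_rng(1)
X_old, y_old = rng.normal(size=(50, 4)), rng.normal(size=50)
X_new, y_new = rng.normal(size=(10, 4)), rng.normal(size=10)

A_inv = np.linalg.inv(rbf(X_old, X_old) + lam * np.eye(len(X_old)))

# Fold in the new batch with the Schur complement S instead of a full re-inversion.
B = rbf(X_old, X_new)
D = rbf(X_new, X_new) + lam * np.eye(len(X_new))
S_inv = np.linalg.inv(D - B.T @ A_inv @ B)
top_left = A_inv + A_inv @ B @ S_inv @ B.T @ A_inv
new_inv = np.block([[top_left, -A_inv @ B @ S_inv],
                    [-S_inv @ B.T @ A_inv, S_inv]])

alpha = new_inv @ np.concatenate([y_old, y_new])   # updated dual coefficients

# Sanity check against the naive recomputation on the full dataset.
X_all = np.vstack([X_old, X_new])
direct = np.linalg.solve(rbf(X_all, X_all) + lam * np.eye(len(X_all)),
                         np.concatenate([y_old, y_new]))
print(np.allclose(alpha, direct))
```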
Using Fuzzy Logic to Leverage HTML Markup for Web Page Representation
Title | Using Fuzzy Logic to Leverage HTML Markup for Web Page Representation |
Authors | Alberto P. García-Plaza, Víctor Fresno, Raquel Martínez, Arkaitz Zubiaga |
Abstract | The selection of a suitable document representation approach plays a crucial role in the performance of a document clustering task. Being able to pick out representative words within a document can lead to substantial improvements in document clustering. In the case of web documents, the HTML markup that defines the layout of the content provides additional structural information that can be further exploited to identify representative words. In this paper we introduce a fuzzy term weighting approach that makes the most of the HTML structure for document clustering. We set forth and build on the hypothesis that a good representation can take advantage of how humans skim through documents to extract the most representative words. The authors of web pages make use of HTML tags to convey the most important message of a web page through page elements that attract the readers’ attention, such as page titles or emphasized elements. We define a set of criteria to exploit the information provided by these page elements, and introduce a fuzzy combination of these criteria that we evaluate within the context of a web page clustering task. Our proposed approach, called Abstract Fuzzy Combination of Criteria (AFCC), can adapt to datasets whose features are distributed differently, achieving good results compared to other similar fuzzy-logic-based approaches and TF-IDF across different datasets. |
Tasks | |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04429v1 |
http://arxiv.org/pdf/1606.04429v1.pdf | |
PWC | https://paperswithcode.com/paper/using-fuzzy-logic-to-leverage-html-markup-for |
Repo | |
Framework | |
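As a toy illustration of combining HTML-derived criteria, the sketch below computes per-criterion term memberships (title, emphasized text, body) and merges them with a fuzzy OR. The criteria, membership functions, and combination rule are simplified placeholders and do not reproduce the AFCC method described above.

```python
# Toy fuzzy combination of HTML criteria (probabilistic-sum OR); not AFCC itself.
from collections import Counter

def memberships(tokens):
    """Normalize term frequencies into [0, 1] memberships for one criterion."""
    counts = Counter(tokens)
    peak = max(counts.values()) if counts else 1
    return {t: c / peak for t, c in counts.items()}

def fuzzy_or(a, b):
    return a + b - a * b

title = "fuzzy web page clustering".split()
emphasized = "fuzzy representation representation".split()
body = "web page clustering with fuzzy term weighting for clustering".split()

criteria = [memberships(t) for t in (title, emphasized, body)]
vocab = set().union(*[c.keys() for c in criteria])

weights = {}
for term in vocab:
    w = 0.0
    for c in criteria:
        w = fuzzy_or(w, c.get(term, 0.0))   # a term scores high if any criterion favors it
    weights[term] = w

for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} {w:.2f}")
```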
What makes ImageNet good for transfer learning?
Title | What makes ImageNet good for transfer learning? |
Authors | Minyoung Huh, Pulkit Agrawal, Alexei A. Efros |
Abstract | The tremendous success of ImageNet-trained deep features on a wide range of transfer tasks begs the question: what are the properties of the ImageNet dataset that are critical for learning good, general-purpose features? This work provides an empirical investigation of various facets of this question: Is more pre-training data always better? How does feature quality depend on the number of training examples per class? Does adding more object classes improve performance? For the same data budget, how should the data be split into classes? Is fine-grained recognition necessary for learning good features? Given the same number of training classes, is it better to have coarse classes or fine-grained classes? Which is better: more classes or more examples per class? To answer these and related questions, we pre-trained CNN features on various subsets of the ImageNet dataset and evaluated transfer performance on PASCAL detection, PASCAL action classification, and SUN scene classification tasks. Our overall findings suggest that most changes in the choice of pre-training data that were long thought to be critical do not significantly affect transfer performance. |
Tasks | Action Classification, Scene Classification, Transfer Learning |
Published | 2016-08-30 |
URL | http://arxiv.org/abs/1608.08614v2 |
http://arxiv.org/pdf/1608.08614v2.pdf | |
PWC | https://paperswithcode.com/paper/what-makes-imagenet-good-for-transfer |
Repo | |
Framework | |
Note on the equivalence of hierarchical variational models and auxiliary deep generative models
Title | Note on the equivalence of hierarchical variational models and auxiliary deep generative models |
Authors | Niko Brümmer |
Abstract | This note compares two recently published machine learning methods for constructing flexible, but tractable families of variational hidden-variable posteriors. The first method, called “hierarchical variational models” enriches the inference model with an extra variable, while the other, called “auxiliary deep generative models”, enriches the generative model instead. We conclude that the two methods are mathematically equivalent. |
Tasks | |
Published | 2016-03-08 |
URL | http://arxiv.org/abs/1603.02443v2 |
http://arxiv.org/pdf/1603.02443v2.pdf | |
PWC | https://paperswithcode.com/paper/note-on-the-equivalence-of-hierarchical |
Repo | |
Framework | |
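The claimed equivalence can be seen by writing the two variational bounds side by side (notation adapted for this summary, not taken verbatim from the note). In hierarchical variational models the inference model is enriched with an auxiliary variable $\lambda$ and a recombination distribution $r(\lambda\mid z,x)$:

$$\log p(x) \;\ge\; \mathbb{E}_{q(\lambda\mid x)\,q(z\mid\lambda,x)}\bigl[\log p(x,z) + \log r(\lambda\mid z,x) - \log q(z\mid\lambda,x) - \log q(\lambda\mid x)\bigr],$$

while in auxiliary deep generative models the generative model is enriched with an auxiliary variable $a$ via $p(a\mid x,z)$:

$$\log p(x) \;\ge\; \mathbb{E}_{q(a\mid x)\,q(z\mid a,x)}\bigl[\log p(x,z) + \log p(a\mid x,z) - \log q(z\mid a,x) - \log q(a\mid x)\bigr].$$

Renaming $a \leftrightarrow \lambda$ and identifying $p(a\mid x,z)$ with $r(\lambda\mid z,x)$ makes the two objectives identical, which is the sense in which the two constructions are mathematically equivalent.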
Vector Quantization for Machine Vision
Title | Vector Quantization for Machine Vision |
Authors | Vincenzo Liguori |
Abstract | This paper shows how to reduce the computational cost of a variety of common machine vision tasks by operating directly in the compressed domain, particularly in the context of hardware acceleration. Pyramid Vector Quantization (PVQ) is the compression technique of choice, and its properties are exploited to simplify Support Vector Machines (SVMs), Convolutional Neural Networks (CNNs), Histogram of Oriented Gradients (HOG) features, interest-point matching and other algorithms. |
Tasks | Quantization |
Published | 2016-03-30 |
URL | http://arxiv.org/abs/1603.09037v1 |
http://arxiv.org/pdf/1603.09037v1.pdf | |
PWC | https://paperswithcode.com/paper/vector-quantization-for-machine-vision |
Repo | |
Framework | |
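For intuition, PVQ quantizes a vector onto the "pyramid" $S(N,K)=\{\,y \in \mathbb{Z}^N : \sum_i |y_i| = K\,\}$. The sketch below is a simple greedy encoder for that projection only; it is an illustrative assumption of one common encoding strategy and omits the hardware-oriented details and the SVM/CNN/HOG integrations discussed in the paper.

```python
# Greedy projection onto the PVQ pyramid S(N, K); illustration only.
import numpy as np

def pvq_quantize(x, K):
    """Return an integer vector y with sum(|y|) == K approximating x's direction."""
    x = np.asarray(x, dtype=float)
    l1 = np.sum(np.abs(x))
    if l1 == 0:
        y = np.zeros(x.shape, dtype=int)
        y[0] = K
        return y
    scaled = K * x / l1
    y = np.trunc(scaled).astype(int)          # truncation never overshoots the K pulses
    remaining = K - np.sum(np.abs(y))
    resid = np.abs(scaled) - np.abs(y)
    for idx in np.argsort(-resid)[:remaining]:  # spend leftover pulses on largest residuals
        y[idx] += int(np.sign(x[idx])) if x[idx] != 0 else 1
    return y

x = np.array([0.7, -0.2, 0.05, -0.45])
y = pvq_quantize(x, K=8)
print(y, np.sum(np.abs(y)))    # -> [ 4 -1  0 -3] 8
```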
Vocabulary Manipulation for Neural Machine Translation
Title | Vocabulary Manipulation for Neural Machine Translation |
Authors | Haitao Mi, Zhiguo Wang, Abe Ittycheriah |
Abstract | In order to capture rich language phenomena, neural machine translation models have to use a large vocabulary, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small subset of the full output vocabulary. For each sentence or batch, we only predict the target words in its sentence-level or batch-level vocabulary. Thus, we reduce both the computing time and the memory usage. Our method simply takes into account the translation options of each word or phrase in the source sentence, and picks a very small target vocabulary for each sentence based on a word-to-word translation model or a bilingual phrase library learned from a traditional machine translation model. Experimental results on the large-scale English-to-French task show that our method improves translation performance by 1 BLEU point over the large-vocabulary neural machine translation system of Jean et al. (2015). |
Tasks | Machine Translation |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.03209v1 |
http://arxiv.org/pdf/1605.03209v1.pdf | |
PWC | https://paperswithcode.com/paper/vocabulary-manipulation-for-neural-machine |
Repo | |
Framework | |
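A minimal sketch of the sentence-level vocabulary idea: gather the top translation options of every source word (plus a small always-included list) and restrict the decoder's softmax to that set. The lexicon below is a made-up stand-in for the word-to-word translation model or bilingual phrase library the paper actually uses.

```python
# Sentence-level target-vocabulary selection; the lexicon is a hypothetical toy table.
lexicon = {
    "the":    [("le", 0.5), ("la", 0.4), ("les", 0.1)],
    "cat":    [("chat", 0.9), ("chatte", 0.1)],
    "sleeps": [("dort", 0.8), ("dormir", 0.2)],
}
always_include = ["<eos>", "<unk>", ",", "."]

def sentence_vocab(src_tokens, top_k=2):
    vocab = set(always_include)
    for w in src_tokens:
        for tgt, _prob in sorted(lexicon.get(w, []), key=lambda p: -p[1])[:top_k]:
            vocab.add(tgt)
    return sorted(vocab)

print(sentence_vocab("the cat sleeps".split()))
# The decoder's output projection / softmax is then computed only over this
# handful of candidates instead of the full target vocabulary.
```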
Deep Learning Markov Random Field for Semantic Segmentation
Title | Deep Learning Markov Random Field for Semantic Segmentation |
Authors | Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang |
Abstract | Semantic segmentation tasks can be well modeled by Markov Random Fields (MRFs). This paper addresses semantic segmentation by incorporating high-order relations and a mixture of label contexts into the MRF. Unlike previous works that optimized MRFs with iterative algorithms, we solve the MRF by proposing a Convolutional Neural Network (CNN), namely the Deep Parsing Network (DPN), which enables deterministic end-to-end computation in a single forward pass. Specifically, DPN extends a contemporary CNN to model unary terms, and additional layers are devised to approximate the mean field (MF) algorithm for pairwise terms. It has several appealing properties. First, different from recent works that required many iterations of MF during back-propagation, DPN is able to achieve high performance by approximating one iteration of MF. Second, DPN represents various types of pairwise terms, making many existing models its special cases. Furthermore, pairwise terms in DPN provide a unified framework to encode rich contextual information in high-dimensional data, such as images and videos. Third, DPN makes MF easier to parallelize and speed up, thus enabling efficient inference. DPN is thoroughly evaluated on standard semantic image/video segmentation benchmarks, where a single DPN model yields state-of-the-art segmentation accuracies on the PASCAL VOC 2012, Cityscapes and CamVid datasets. |
Tasks | Semantic Segmentation, Video Semantic Segmentation |
Published | 2016-06-23 |
URL | http://arxiv.org/abs/1606.07230v2 |
http://arxiv.org/pdf/1606.07230v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-markov-random-field-for |
Repo | |
Framework | |
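For reference, the pairwise MRF being approximated has the generic form

$$E(\mathbf{l}) = \sum_i \phi_i(l_i) + \sum_{i,j} \psi_{ij}(l_i, l_j),$$

with unary terms $\phi_i$ (modeled by the contemporary CNN) and pairwise terms $\psi_{ij}$. A single mean-field update of the factorized posterior is

$$q_i(l) \;\propto\; \exp\Big(-\phi_i(l) - \sum_{j \ne i} \sum_{l'} q_j(l')\, \psi_{ij}(l, l')\Big),$$

and DPN's additional layers approximate one such iteration in a feed-forward pass. These are the standard MRF and mean-field formulas, not the paper's exact parameterization, which also includes the mixture of label contexts and richer high-order pairwise terms.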
A Novel Progressive Learning Technique for Multi-class Classification
Title | A Novel Progressive Learning Technique for Multi-class Classification |
Authors | Rajasekar Venkatesan, Meng Joo Er |
Abstract | In this paper, a progressive learning technique for multi-class classification is proposed. This newly developed learning technique is independent of constraints on the number of classes and can learn new classes while retaining the knowledge of previous classes. Whenever a new class (non-native to the knowledge learnt thus far) is encountered, the neural network structure is remodeled automatically by adding new neurons and interconnections, and the parameters are calculated in such a way that the knowledge learnt thus far is retained. This technique is suitable for real-world applications where the number of classes is often unknown and online learning from real-time data is required. The consistency and the complexity of the progressive learning technique are analyzed. Several standard datasets are used to evaluate the performance of the developed technique, and a comparative study shows that the developed technique is superior. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00085v2 |
http://arxiv.org/pdf/1609.00085v2.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-progressive-learning-technique-for |
Repo | |
Framework | |
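The core mechanism is "grow the output layer when a label outside the known set appears". The sketch below shows that remodeling step with a deliberately simple perceptron-style update; the paper's actual method computes the new parameters analytically so that previously learnt knowledge is preserved, which this toy code does not attempt to reproduce.

```python
# Schematic "grow the outputs on a new class" classifier; update rule is a placeholder.
import numpy as np

class ProgressiveLinearClassifier:
    """Keeps one output weight row per class and adds a row when a new class appears."""

    def __init__(self, n_features, lr=0.1):
        self.W = np.zeros((0, n_features))
        self.classes = []
        self.lr = lr

    def partial_fit(self, x, label):
        if label not in self.classes:        # non-native class: remodel the output layer
            self.classes.append(label)
            self.W = np.vstack([self.W, np.zeros((1, self.W.shape[1]))])
        y = self.classes.index(label)
        pred = int(np.argmax(self.W @ x))
        if pred != y:                        # perceptron-style correction (illustrative only)
            self.W[y] += self.lr * x
            self.W[pred] -= self.lr * x

    def predict(self, x):
        return self.classes[int(np.argmax(self.W @ x))]

clf = ProgressiveLinearClassifier(n_features=2)
for x, c in [([1, 0], "a"), ([0, 1], "b"), ([1, 1], "c"), ([1, 0], "a")]:
    clf.partial_fit(np.array(x, dtype=float), c)
print(clf.predict(np.array([0.9, 0.1])))     # -> "a"
```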
UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation
Title | UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation |
Authors | Chunlei Zhang, Fahimeh Bahmaninezhad, Shivesh Ranjan, Chengzhu Yu, Navid Shokouhi, John H. L. Hansen |
Abstract | This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) to the 2016 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE). We developed several UBM and DNN i-vector based speaker recognition systems with different data sets and feature representations. Given that the emphasis of NIST SRE 2016 is on language mismatch between training and enrollment/test data, so-called domain mismatch, our system development focused on: (1) using unlabeled in-domain data to centralize data and alleviate the domain mismatch problem, (2) finding the best data set for training LDA/PLDA, (3) using a newly proposed dimensionality-reduction technique that incorporates unlabeled in-domain data before PLDA training, (4) unsupervised speaker clustering of unlabeled data, used alone or together with previous SREs for PLDA training, and (5) score calibration using only unlabeled data, and using a combination of unlabeled and development (Dev) data, as separate experiments. |
Tasks | Calibration, Dimensionality Reduction, Speaker Recognition |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07651v1 |
http://arxiv.org/pdf/1610.07651v1.pdf | |
PWC | https://paperswithcode.com/paper/utd-crss-systems-for-2016-nist-speaker |
Repo | |
Framework | |
Improving Vertebra Segmentation through Joint Vertebra-Rib Atlases
Title | Improving Vertebra Segmentation through Joint Vertebra-Rib Atlases |
Authors | Yinong Wang, Jianhua Yao, Holger R. Roth, Joseph E. Burns, Ronald M. Summers |
Abstract | Accurate spine segmentation allows for improved identification and quantitative characterization of abnormalities of the vertebra, such as vertebral fractures. However, in existing automated vertebra segmentation methods on computed tomography (CT) images, leakage into nearby bones such as ribs occurs due to the close proximity of these visibly intense structures in a 3D CT volume. To reduce this error, we propose the use of joint vertebra-rib atlases to improve the segmentation of vertebrae via multi-atlas joint label fusion. Segmentation was performed and evaluated, on a per-vertebra basis, on CTs containing 106 thoracic and lumbar vertebrae from 10 pathological and traumatic spine patients. Vertebra-only atlases produced errors where the segmentation leaked into the ribs. The use of joint vertebra-rib atlases produced a statistically significant increase in the Dice coefficient from 92.5 $\pm$ 3.1% to 93.8 $\pm$ 2.1% for the left and right transverse processes and a decrease in the mean and max surface distance from 0.75 $\pm$ 0.60mm and 8.63 $\pm$ 4.44mm to 0.30 $\pm$ 0.27mm and 3.65 $\pm$ 2.87mm, respectively. |
Tasks | Computed Tomography (CT) |
Published | 2016-02-01 |
URL | http://arxiv.org/abs/1602.00585v1 |
http://arxiv.org/pdf/1602.00585v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-vertebra-segmentation-through-joint |
Repo | |
Framework | |
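For orientation, the fusion step in multi-atlas segmentation combines several atlas label maps that have been registered to the target scan. The sketch below uses plain per-voxel majority voting as a stand-in; the paper relies on joint label fusion, which weights atlases by their correlated errors, so this only illustrates where the joint vertebra-rib atlases enter the pipeline.

```python
# Simplified multi-atlas label fusion by per-voxel majority vote; not joint label fusion.
import numpy as np

def majority_vote(atlas_labels):
    """atlas_labels: array of shape (n_atlases, *volume_shape) with integer labels."""
    atlas_labels = np.asarray(atlas_labels)
    n_labels = atlas_labels.max() + 1
    votes = np.stack([(atlas_labels == l).sum(axis=0) for l in range(n_labels)])
    return np.argmax(votes, axis=0)

# Three toy 4x4 label maps (0 = background, 1 = vertebra, 2 = rib), already "registered".
rng = np.random.default_rng(0)
atlases = rng.integers(0, 3, size=(3, 4, 4))
print(majority_vote(atlases))
```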
Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning
Title | Non-Deterministic Policy Improvement Stabilizes Approximated Reinforcement Learning |
Authors | Wendelin Böhmer, Rong Guo, Klaus Obermayer |
Abstract | This paper investigates a type of instability that is linked to the greedy policy improvement in approximated reinforcement learning. We show empirically that non-deterministic policy improvement can stabilize methods like LSPI by controlling the improvements’ stochasticity. Additionally we show that a suitable representation of the value function also stabilizes the solution to some degree. The presented approach is simple and should also be easily transferable to more sophisticated algorithms like deep reinforcement learning. |
Tasks | |
Published | 2016-12-22 |
URL | http://arxiv.org/abs/1612.07548v1 |
http://arxiv.org/pdf/1612.07548v1.pdf | |
PWC | https://paperswithcode.com/paper/non-deterministic-policy-improvement |
Repo | |
Framework | |
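One simple way to make the policy-improvement step non-deterministic is to replace the greedy argmax over approximated Q-values with a softmax (Boltzmann) distribution whose temperature controls the stochasticity. The sketch below illustrates that generic idea; it is not claimed to be the paper's exact improvement operator.

```python
# Stochastic (softmax) policy improvement from approximated Q-values; generic illustration.
import numpy as np

def soft_improvement(q_values, temperature=1.0):
    """Return action probabilities instead of a greedy argmax policy."""
    z = q_values / temperature
    z -= z.max()                       # numerical stability
    p = np.exp(z)
    return p / p.sum()

q = np.array([1.0, 1.05, 0.2])         # two near-tied actions under approximation error
print(soft_improvement(q, temperature=0.5))   # probability mass spread over the near-ties
print(soft_improvement(q, temperature=0.01))  # approaches the greedy (deterministic) policy
```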
Kernel regression, minimax rates and effective dimensionality: beyond the regular case
Title | Kernel regression, minimax rates and effective dimensionality: beyond the regular case |
Authors | Gilles Blanchard, Nicole Mücke |
Abstract | We investigate whether kernel regularization methods can achieve minimax convergence rates over a source condition regularity assumption for the target function. These questions have been considered in past literature, but only under specific assumptions about the decay, typically polynomial, of the spectrum of the kernel mapping covariance operator. In the perspective of distribution-free results, we investigate this issue under much weaker assumptions on the eigenvalue decay, allowing for more complex behavior that can reflect different structure of the data at different scales. |
Tasks | |
Published | 2016-11-12 |
URL | http://arxiv.org/abs/1611.03979v1 |
http://arxiv.org/pdf/1611.03979v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-regression-minimax-rates-and-effective |
Repo | |
Framework | |
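Two standard objects behind such results are the source condition and the effective dimensionality of the kernel (notation generic, not copied from the paper): the target satisfies $f^{*} = T^{r} g$ with $\|g\|_{\mathcal{H}} \le R$ for the kernel covariance operator $T$ and some $r > 0$, and

$$\mathcal{N}(\lambda) \;=\; \operatorname{Tr}\bigl(T(T+\lambda I)^{-1}\bigr) \;=\; \sum_{i\ge 1} \frac{\mu_i}{\mu_i+\lambda}.$$

Under polynomial eigenvalue decay $\mu_i \asymp i^{-b}$ with $b > 1$ (the "regular" case the title alludes to) one has $\mathcal{N}(\lambda) \asymp \lambda^{-1/b}$; the paper's contribution is to relax this polynomial-decay assumption.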
Novelty Detection in MultiClass Scenarios with Incomplete Set of Class Labels
Title | Novelty Detection in MultiClass Scenarios with Incomplete Set of Class Labels |
Authors | Nomi Vinokurov, Daphna Weinshall |
Abstract | We address the problem of novelty detection in multiclass scenarios where some class labels are missing from the training set. Our method is based on the initial assignment of confidence values, which measure the affinity between a new test point and each known class; we first compare the values of the two top elements in this vector of confidence values. At the heart of our method lies the training of an ensemble of classifiers, each trained to discriminate known from novel classes based on some partition of the training data into presumed-known and presumed-novel classes. Our final novelty score is derived from the output of this ensemble of classifiers. We evaluated our method on two image datasets containing a relatively large number of classes, the Caltech-256 and CIFAR-100 datasets, and compared it to alternative methods that represent commonly used approaches: the one-class SVM, novelty based on k-NN, novelty based on maximal confidence, and the recent KNFST method. The results show a very clear and marked advantage for our method over all alternative methods, in an experimental setup where class labels are missing during training. |
Tasks | |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06242v2 |
http://arxiv.org/pdf/1604.06242v2.pdf | |
PWC | https://paperswithcode.com/paper/novelty-detection-in-multiclass-scenarios |
Repo | |
Framework | |
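The sketch below mimics only the partition-and-ensemble mechanics described above: repeatedly split the known classes into presumed-known and presumed-novel, train a binary classifier on each split, and average the "looks novel" outputs into a score. The toy Gaussian classes and logistic-regression base learner are placeholders, and the confidence-vector machinery of the actual method is omitted.

```python
# Ensemble of presumed-known vs presumed-novel splits; simplified stand-in only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
classes = {c: rng.normal(loc=3 * c, scale=1.0, size=(40, 2)) for c in range(4)}

def novelty_score(x, n_splits=10):
    score = 0.0
    for _ in range(n_splits):
        labels = list(classes)
        rng.shuffle(labels)                              # random partition of known classes
        known, novel = labels[: len(labels) // 2], labels[len(labels) // 2 :]
        X = np.vstack([classes[c] for c in known] + [classes[c] for c in novel])
        y = np.array([0] * sum(len(classes[c]) for c in known)
                     + [1] * sum(len(classes[c]) for c in novel))
        clf = LogisticRegression(max_iter=200).fit(X, y)
        score += clf.predict_proba(x.reshape(1, -1))[0, 1]
    return score / n_splits                              # averaged "presumed novel" output

for point in (np.array([1.5, 1.5]), np.array([20.0, 20.0])):
    print(point, novelty_score(point))
```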
Multiple protein feature prediction with statistical relational learning
Title | Multiple protein feature prediction with statistical relational learning |
Authors | Luca Masera |
Abstract | High-throughput sequencing techniques have had a strong impact on modern biology, widening the gap between sequenced and annotated data. Automatic annotation tools are therefore of the foremost importance to guide biologists’ experiments. However, most of the state-of-the-art methods rely on annotation transfer, offering reliable predictions only in homology settings. In this work we present a novel approach to protein feature prediction, which exploits Semantic Based Regularization to inject prior knowledge into the learning process. The experimental results conducted on the yeast genome show that the introduction of the constraints positively impacts the overall prediction quality. |
Tasks | Relational Reasoning |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08391v1 |
http://arxiv.org/pdf/1609.08391v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-protein-feature-prediction-with |
Repo | |
Framework | |