Paper Group ANR 383
A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences. Scale-aware Pixel-wise Object Proposal Networks. Face Recognition Using Scattering Convolutional Network. The Landscape of Empirical Risk for Non-convex Losses. Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks …
A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences
Title | A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences |
Authors | Asis Roy, Sourangshu Bhattacharya, Kalyan Guin |
Abstract | Tests for esophageal cancer can be expensive, uncomfortable and can have side effects. For many patients, we can predict non-existence of disease with 100% certainty, just using demographics, lifestyle, and medical history information. Our objective is to devise a general methodology for customizing tests using user preferences so that expensive or uncomfortable tests can be avoided. We propose to use classifiers trained from electronic health records (EHR) for selection of tests. The key idea is to design classifiers with 100% false normal rates, possibly at the cost of higher false abnormals. We compare Naive Bayes classification (NB), Random Forests (RF), Support Vector Machines (SVM) and Logistic Regression (LR), and find kernel logistic regression to be most suitable for the task. We propose an algorithm for finding the best probability threshold for kernel LR, based on test set accuracy. Using the proposed algorithm, we describe schemes for selecting tests, which appear as features in the automatic classification algorithm, using user preferences on cost and discomfort. We test our methodology with EHRs collected for more than 3000 patients, as part of a project carried out by a reputed hospital in Mumbai, India. Kernel SVM and kernel LR with a polynomial kernel of degree 3 yield an accuracy of 99.8% and a sensitivity of 100%, without the MP features, i.e. using only clinical tests. We demonstrate our test selection algorithm using two case studies, one using the cost of clinical tests and the other using “discomfort” values for clinical tests. We compute the test sets corresponding to the lowest false abnormals for each criterion described above, using exhaustive enumeration of 15 clinical tests. The sets turn out to be different, substantiating our claim that one can customize test sets based on user preferences. |
Tasks | |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.01712v1 |
PDF | http://arxiv.org/pdf/1610.01712v1.pdf |
PWC | https://paperswithcode.com/paper/a-methodology-for-customizing-clinical-tests |
Repo | |
Framework | |
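The selection idea at the heart of this abstract, a probability threshold chosen so that no abnormal case is ever called normal, is easy to sketch. Below is a minimal, hedged illustration with scikit-learn: `PolynomialFeatures` stands in for the degree-3 polynomial kernel and `make_classification` stands in for the EHR data, so nothing here reproduces the paper's actual algorithm or features.

```python
# Hedged sketch of the threshold-search idea: pick the largest probability
# threshold that still yields zero false normals (100% sensitivity).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

def zero_false_normal_threshold(y_true, p_abnormal):
    """Largest threshold on P(abnormal) that misses no abnormal case."""
    best = 0.0
    for t in np.unique(p_abnormal):
        pred_abnormal = p_abnormal >= t
        if not np.any(~pred_abnormal & (y_true == 1)):  # no false normals
            best = max(best, t)
    return best

# Synthetic stand-in for the EHR data: 15 "clinical test" features.
X, y = make_classification(n_samples=3000, n_features=15,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), PolynomialFeatures(degree=3),
                    LogisticRegression(max_iter=2000))
clf.fit(X_tr, y_tr)

t = zero_false_normal_threshold(y_tr, clf.predict_proba(X_tr)[:, 1])
pred = clf.predict_proba(X_te)[:, 1] >= t
print(f"threshold={t:.3f}  false normals={np.sum(~pred & (y_te == 1))}")
```

Choosing the threshold on a held-out set rather than the training data would hedge against the zero-false-normal guarantee failing on new patients.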
Scale-aware Pixel-wise Object Proposal Networks
Title | Scale-aware Pixel-wise Object Proposal Networks |
Authors | Zequn Jie, Xiaodan Liang, Jiashi Feng, Wen Feng Lu, Eng Hock Francis Tay, Shuicheng Yan |
Abstract | Object proposal is essential for current state-of-the-art object detection pipelines. However, the existing proposal methods generally fail to produce results with satisfying localization accuracy. The case is even worse for small objects, which are nevertheless quite common in practice. In this paper we propose a novel Scale-aware Pixel-wise Object Proposal (SPOP) network to tackle these challenges. The SPOP network can generate proposals with a high recall rate and average best overlap (ABO), even for small objects. In particular, in order to improve the localization accuracy, a fully convolutional network is employed which predicts locations of object proposals for each pixel. The produced ensemble of pixel-wise object proposals enhances the chance of hitting the object significantly without incurring heavy extra computational cost. To solve the challenge of localizing objects at small scale, two localization networks which are specialized for localizing objects with different scales are introduced, following the divide-and-conquer philosophy. Location outputs of these two networks are then adaptively combined to generate the final proposals by a large-/small-size weighting network. Extensive evaluations on PASCAL VOC 2007 show the SPOP network is superior to the state-of-the-art models. The high-quality proposals from the SPOP network also significantly improve the mean average precision (mAP) of object detection with the Fast-RCNN framework. Finally, the SPOP network (trained on PASCAL VOC) shows strong generalization performance when tested on the ILSVRC 2013 validation set. |
Tasks | Object Detection |
Published | 2016-01-19 |
URL | http://arxiv.org/abs/1601.04798v3 |
PDF | http://arxiv.org/pdf/1601.04798v3.pdf |
PWC | https://paperswithcode.com/paper/scale-aware-pixel-wise-object-proposal |
Repo | |
Framework | |
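The adaptive combination step described in the abstract can be sketched compactly. The following PyTorch fragment illustrates only the weighting idea (a per-pixel mix of two specialized localizers); the actual SPOP architecture, feature backbone, and training losses are not shown, and all layer sizes are invented.

```python
# Minimal sketch of the large-/small-size adaptive weighting idea in SPOP.
import torch
import torch.nn as nn

class AdaptiveProposalHead(nn.Module):
    def __init__(self, in_ch=256):
        super().__init__()
        # Two specialized localizers: each predicts a 4-d box per pixel.
        self.large = nn.Conv2d(in_ch, 4, kernel_size=1)
        self.small = nn.Conv2d(in_ch, 4, kernel_size=1)
        # Weighting network: per-pixel mixing coefficient in [0, 1].
        self.weight = nn.Sequential(nn.Conv2d(in_ch, 1, kernel_size=1),
                                    nn.Sigmoid())

    def forward(self, feat):
        w = self.weight(feat)                        # (N, 1, H, W)
        boxes = w * self.large(feat) + (1 - w) * self.small(feat)
        return boxes                                 # (N, 4, H, W): box per pixel

feat = torch.randn(1, 256, 32, 32)   # fully convolutional backbone features
print(AdaptiveProposalHead()(feat).shape)
```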
Face Recognition Using Scattering Convolutional Network
Title | Face Recognition Using Scattering Convolutional Network |
Authors | Shervin Minaee, Amirali Abdolrashidi, Yao Wang |
Abstract | Face recognition has been an active research area in the past few decades. In general, face recognition can be very challenging due to variations in viewpoint, illumination, facial expression, etc. Therefore, it is essential to extract features which are invariant to some or all of these variations. Here a new image representation, called the scattering transform/network, is used to extract features from faces. The scattering transform is a kind of convolutional network which provides a powerful multi-layer representation for signals. After extraction of scattering features, PCA is applied to reduce the dimensionality of the data, and then a multi-class support vector machine is used to perform recognition. The proposed algorithm has been tested on three face datasets and achieved a very high recognition rate. |
Tasks | Face Recognition |
Published | 2016-07-30 |
URL | http://arxiv.org/abs/1608.00059v2 |
PDF | http://arxiv.org/pdf/1608.00059v2.pdf |
PWC | https://paperswithcode.com/paper/face-recognition-using-scattering |
Repo | |
Framework | |
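The recognition pipeline in this abstract (scattering features, then PCA, then a multi-class SVM) can be sketched as below, assuming the third-party `kymatio` package for the scattering transform. The random arrays stand in for face images, and the dataset sizes are invented.

```python
# Pipeline sketch: scattering features -> PCA -> multi-class linear SVM.
import numpy as np
from kymatio.numpy import Scattering2D   # assumes the kymatio package
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.random((40, 32, 32)).astype(np.float32)  # 40 stand-in "faces"
labels = np.repeat(np.arange(10), 4)                  # 10 subjects, 4 each

scattering = Scattering2D(J=2, shape=(32, 32))        # 2-layer scattering net
feats = scattering(images).reshape(len(images), -1)   # flatten coefficients

clf = make_pipeline(PCA(n_components=20), SVC(kernel="linear"))
clf.fit(feats, labels)
print(clf.score(feats, labels))
```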
The Landscape of Empirical Risk for Non-convex Losses
Title | The Landscape of Empirical Risk for Non-convex Losses |
Authors | Song Mei, Yu Bai, Andrea Montanari |
Abstract | Most high-dimensional estimation and prediction methods propose to minimize a cost function (empirical risk) that is written as a sum of losses associated to each data point. In this paper we focus on the case of non-convex losses, which is practically important but still poorly understood. Classical empirical process theory implies uniform convergence of the empirical risk to the population risk. While uniform convergence implies consistency of the resulting M-estimator, it does not ensure that the latter can be computed efficiently. In order to capture the complexity of computing M-estimators, we propose to study the landscape of the empirical risk, namely its stationary points and their properties. We establish uniform convergence of the gradient and Hessian of the empirical risk to their population counterparts, as soon as the number of samples becomes larger than the number of unknown parameters (modulo logarithmic factors). Consequently, good properties of the population risk can be carried to the empirical risk, and we can establish one-to-one correspondence of their stationary points. We demonstrate that in several problems such as non-convex binary classification, robust regression, and Gaussian mixture model, this result implies a complete characterization of the landscape of the empirical risk, and of the convergence properties of descent algorithms. We extend our analysis to the very high-dimensional setting in which the number of parameters exceeds the number of samples, and provide a characterization of the empirical risk landscape under a nearly information-theoretically minimal condition. Namely, if the number of samples exceeds the sparsity of the unknown parameters vector (modulo logarithmic factors), then a suitable uniform convergence result takes place. We apply this result to non-convex binary classification and robust regression in very high-dimension. |
Tasks | |
Published | 2016-07-22 |
URL | http://arxiv.org/abs/1607.06534v3 |
PDF | http://arxiv.org/pdf/1607.06534v3.pdf |
PWC | https://paperswithcode.com/paper/the-landscape-of-empirical-risk-for-non |
Repo | |
Framework | |
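A quick numerical illustration of the paper's central phenomenon: for a non-convex loss, the empirical gradient converges to the population gradient uniformly over parameter values as the sample size grows. The sketch below uses squared loss with a sigmoid link as the non-convex binary-classification example and approximates the population gradient with a very large sample; it is a demonstration of the statement, not the paper's proof technique.

```python
# Empirical vs. (approximate) population gradient for a non-convex loss.
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_star = np.ones(d) / np.sqrt(d)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def grad(theta, X, y):
    """Gradient of mean((y - sigmoid(X @ theta))**2) w.r.t. theta."""
    p = sigmoid(X @ theta)
    return (-2.0 * (y - p) * p * (1.0 - p)) @ X / len(y)

def sample(n):
    X = rng.standard_normal((n, d))
    y = (rng.random(n) < sigmoid(X @ theta_star)).astype(float)
    return X, y

X_pop, y_pop = sample(1_000_000)          # large-sample proxy for population
thetas = rng.standard_normal((50, d))     # random points in the landscape
pop_grads = [grad(t, X_pop, y_pop) for t in thetas]

for n in (100, 1_000, 10_000):
    X, y = sample(n)
    dev = max(np.linalg.norm(grad(t, X, y) - g)
              for t, g in zip(thetas, pop_grads))
    print(f"n={n:6d}  sup gradient deviation over 50 points: {dev:.4f}")
```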
Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks and Next-Step Conditioning
Title | Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks and Next-Step Conditioning |
Authors | Akosua Busia, Jasmine Collins, Navdeep Jaitly |
Abstract | Recently developed deep learning techniques have significantly improved the accuracy of various speech and image recognition systems. In this paper we adapt some of these techniques for protein secondary structure prediction. We first train a series of deep neural networks to predict eight-class secondary structure labels given a protein’s amino acid sequence information and find that using recent methods for regularization, such as dropout and weight-norm constraining, leads to measurable gains in accuracy. We then adapt recent convolutional neural network architectures (Inception, ResNet, and DenseNet with Batch Normalization) to the problem of protein structure prediction. These convolutional architectures make heavy use of multi-scale filter layers that simultaneously compute features on several scales, and use residual connections to prevent underfitting. Using a carefully modified version of these architectures, we achieve state-of-the-art performance of 70.0% per amino acid accuracy on the public CB513 benchmark dataset. Finally, we explore additions from sequence-to-sequence learning, altering the model to make its predictions conditioned on both the protein’s amino acid sequence and its past secondary structure labels. We introduce a new method of ensembling such a conditional model with our convolutional model, an approach which reaches 70.6% Q8 accuracy on CB513. We argue that these results can be further refined for larger boosts in prediction accuracy through more sophisticated attempts to control overfitting of conditional models. We aim to release the code for these experiments as part of the TensorFlow repository. |
Tasks | Protein Secondary Structure Prediction |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.01503v1 |
PDF | http://arxiv.org/pdf/1611.01503v1.pdf |
PWC | https://paperswithcode.com/paper/protein-secondary-structure-prediction-using-1 |
Repo | |
Framework | |
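The multi-scale filter layers the abstract mentions compute features at several receptive-field widths in parallel. Here is a toy PyTorch sketch of such a block for per-residue 8-class prediction; the paper's actual architectures, regularization, and next-step conditioning are not reproduced, and the input and channel sizes are invented.

```python
# Toy multi-scale (Inception-style) 1-D convolution block for per-residue
# secondary structure prediction.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_ch, ch_per_scale=32, kernel_sizes=(3, 7, 11)):
        super().__init__()
        # Parallel convolutions compute features at several scales at once.
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, ch_per_scale, k, padding=k // 2)
            for k in kernel_sizes)

    def forward(self, x):                      # x: (batch, in_ch, length)
        return torch.relu(torch.cat([b(x) for b in self.branches], dim=1))

# 42-d per-residue input (e.g. one-hot + profile), 8 output classes.
model = nn.Sequential(MultiScaleBlock(42), MultiScaleBlock(96),
                      nn.Conv1d(96, 8, kernel_size=1))
seq = torch.randn(2, 42, 700)                  # two length-700 proteins
print(model(seq).shape)                        # (2, 8, 700): logits per residue
```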
Inherent Trade-Offs in the Fair Determination of Risk Scores
Title | Inherent Trade-Offs in the Fair Determination of Risk Scores |
Authors | Jon Kleinberg, Sendhil Mullainathan, Manish Raghavan |
Abstract | Recent discussion in the public sphere about algorithmic classification has involved tension between competing notions of what it means for a probabilistic classification to be fair to different groups. We formalize three fairness conditions that lie at the heart of these debates, and we prove that except in highly constrained special cases, there is no method that can satisfy these three conditions simultaneously. Moreover, even satisfying all three conditions approximately requires that the data lie in an approximate version of one of the constrained special cases identified by our theorem. These results suggest some of the ways in which key notions of fairness are incompatible with each other, and hence provide a framework for thinking about the trade-offs between them. |
Tasks | |
Published | 2016-09-19 |
URL | http://arxiv.org/abs/1609.05807v2 |
PDF | http://arxiv.org/pdf/1609.05807v2.pdf |
PWC | https://paperswithcode.com/paper/inherent-trade-offs-in-the-fair-determination |
Repo | |
Framework | |
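The three conditions in question are roughly: scores are calibrated within each group, and the mean score among positive (respectively negative) instances is equal across groups. The toy check below builds a calibrated score with unequal group base rates and measures each quantity, showing numerically why the balance conditions then fail. It illustrates the tension; it is not the paper's proof.

```python
# Measure calibration and the two balance conditions on a toy population.
import numpy as np

def fairness_report(scores, labels, groups):
    for g in np.unique(groups):
        s, y = scores[groups == g], labels[groups == g]
        print(f"group {g}: calibration gap {np.mean(y) - np.mean(s):+.3f}  "
              f"mean score | y=1: {s[y == 1].mean():.3f}  "
              f"mean score | y=0: {s[y == 0].mean():.3f}")

rng = np.random.default_rng(0)
n = 10_000
groups = rng.integers(0, 2, n)
base = np.where(groups == 0, 0.3, 0.6)          # unequal base rates
scores = np.clip(base + 0.2 * rng.standard_normal(n), 0.01, 0.99)
labels = (rng.random(n) < scores).astype(int)   # scores are calibrated
fairness_report(scores, labels, groups)
# With calibrated scores and unequal base rates, the mean scores among
# y=1 (and among y=0) cases differ across groups: the balance conditions fail.
```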
An assessment of orthographic similarity measures for several African languages
Title | An assessment of orthographic similarity measures for several African languages |
Authors | C. Maria Keet |
Abstract | Natural Language Interfaces and tools such as spellcheckers and Web search in one’s own language are known to be useful in ICT-mediated communication. Most languages in Southern Africa are under-resourced, however. Therefore, it would be very useful if both the generic and the few language-specific NLP tools could be reused or easily adapted across languages. This depends on the notion, and extent, of similarity between the languages. We assess this from the angle of orthography and corpora. Twelve versions of the Universal Declaration of Human Rights (UDHR) are examined, revealing clusters of languages that are more or less amenable to cross-language adaptation of NLP tools; these clusters do not match Guthrie zones. To examine the generalisability of these results, we zoom in on isiZulu both quantitatively and qualitatively with four other corpora and texts in different genres. The results show that the UDHR is orthographically a typical text document. The results also provide insight into the usability of typical measures such as lexical diversity and genre, showing that the same statistic may mean different things in different documents. While NLTK for Python can be used for basic analyses of text, it, and similar NLP tools, will need considerable customization. |
Tasks | |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03065v1 |
PDF | http://arxiv.org/pdf/1608.03065v1.pdf |
PWC | https://paperswithcode.com/paper/an-assessment-of-orthographic-similarity |
Repo | |
Framework | |
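Two of the measures discussed, lexical diversity and character-level orthographic overlap, are simple to compute. The sketch below uses plain Python rather than NLTK, and the two sentence fragments are illustrative samples, not the paper's corpora.

```python
# Lexical diversity (type/token ratio) and a crude character-bigram
# overlap as an orthographic-similarity signal between two texts.
from collections import Counter

def lexical_diversity(text):
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens)

def char_bigrams(text):
    t = "".join(text.lower().split())
    return Counter(t[i:i + 2] for i in range(len(t) - 1))

def bigram_overlap(a, b):
    """Jaccard overlap of the character-bigram inventories of two texts."""
    ba, bb = set(char_bigrams(a)), set(char_bigrams(b))
    return len(ba & bb) / len(ba | bb)

zu = "bonke abantu bazalwa bekhululekile belingana ngesithunzi nangamalungelo"
en = "all human beings are born free and equal in dignity and rights"
print(lexical_diversity(zu), lexical_diversity(en))
print(bigram_overlap(zu, en))
```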
Automatic 3D modelling of craniofacial form
Title | Automatic 3D modelling of craniofacial form |
Authors | Nick Pears, Christian Duncan |
Abstract | Three-dimensional models of craniofacial variation over the general population are useful for assessing pre- and post-operative head shape when treating various craniofacial conditions, such as craniosynostosis. We present a new method of automatically building both sagittal profile models and full 3D surface models of the human head using a range of techniques in 3D surface image analysis; in particular, automatic facial landmarking using supervised machine learning, global and local symmetry plane detection using a variant of trimmed iterative closest points, locally-affine template warping (for full 3D models) and a novel pose normalisation using robust iterative ellipse fitting. The PCA-based models built using the new pose normalisation are more compact than those using Generalised Procrustes Analysis and we demonstrate their utility in a clinical case study. |
Tasks | |
Published | 2016-01-21 |
URL | http://arxiv.org/abs/1601.05593v1 |
PDF | http://arxiv.org/pdf/1601.05593v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-3d-modelling-of-craniofacial-form |
Repo | |
Framework | |
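Robust iterative ellipse fitting, used here for pose normalisation, can be approximated by fitting an algebraic conic by least squares and repeatedly trimming the worst-fitting points. The sketch below follows that generic recipe; the paper's exact formulation, constraints, and robustness scheme are not shown.

```python
# Robust iterative conic/ellipse fitting by least squares with trimming.
import numpy as np

def fit_conic(P):
    """Least-squares conic ax^2+bxy+cy^2+dx+ey+f=0 through 2-D points P."""
    x, y = P[:, 0], P[:, 1]
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(D)
    return Vt[-1]                      # null-space direction = conic coeffs

def robust_ellipse(P, n_iter=5, keep=0.8):
    pts = P
    for _ in range(n_iter):
        c = fit_conic(pts)
        x, y = pts[:, 0], pts[:, 1]
        r = np.abs(c @ np.stack([x * x, x * y, y * y, x, y, np.ones_like(x)]))
        pts = pts[np.argsort(r)[: int(keep * len(pts))]]  # trim outliers
    return fit_conic(pts)

# Noisy ellipse with a few gross outliers.
t = np.linspace(0, 2 * np.pi, 200)
P = np.column_stack([3 * np.cos(t), 2 * np.sin(t)])
P += 0.05 * np.random.default_rng(0).standard_normal(P.shape)
P[:10] += 5.0
print(robust_ellipse(P))
```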
Flint Water Crisis: Data-Driven Risk Assessment Via Residential Water Testing
Title | Flint Water Crisis: Data-Driven Risk Assessment Via Residential Water Testing |
Authors | Jacob Abernethy, Cyrus Anderson, Chengyu Dai, Arya Farahi, Linh Nguyen, Adam Rauh, Eric Schwartz, Wenbo Shen, Guangsha Shi, Jonathan Stroud, Xinyu Tan, Jared Webb, Sheng Yang |
Abstract | Recovery from the Flint Water Crisis has been hindered by uncertainty in both the water testing process and the causes of contamination. In this work, we develop an ensemble of predictive models to assess the risk of lead contamination in individual homes and neighborhoods. To train these models, we utilize a wide range of data sources, including voluntary residential water tests, historical records, and city infrastructure data. Additionally, we use our models to identify the most prominent factors that contribute to a high risk of lead contamination. In this analysis, we find that lead service lines are not the only factor that is predictive of the risk of lead contamination of water. These results could be used to guide the long-term recovery efforts in Flint, minimize the immediate damages, and improve resource-allocation decisions for similar water infrastructure crises. |
Tasks | |
Published | 2016-09-30 |
URL | http://arxiv.org/abs/1610.00580v1 |
PDF | http://arxiv.org/pdf/1610.00580v1.pdf |
PWC | https://paperswithcode.com/paper/flint-water-crisis-data-driven-risk |
Repo | |
Framework | |
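The modelling recipe, an ensemble classifier over per-home records that is then used both to rank homes by risk and to inspect feature importances, might look like the following scikit-learn sketch. The feature names, synthetic data, and model choice are all invented placeholders, not the study's.

```python
# Hedged sketch: ensemble risk model over per-home features.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5000
homes = pd.DataFrame({
    "year_built": rng.integers(1900, 2000, n),
    "has_lead_service_line": rng.integers(0, 2, n),
    "assessed_value": rng.normal(50_000, 15_000, n),
})
# Synthetic labels: older homes and lead lines raise risk, but not alone.
p = 1 / (1 + np.exp((homes.year_built - 1945) / 15 - homes.has_lead_service_line))
tested_positive = rng.random(n) < 0.3 * p

features = ["year_built", "has_lead_service_line", "assessed_value"]
model = GradientBoostingClassifier().fit(homes[features], tested_positive)
homes["risk"] = model.predict_proba(homes[features])[:, 1]
print(homes.nlargest(5, "risk"))                 # homes to test first
print(dict(zip(features, model.feature_importances_)))
```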
A Hybrid POMDP-BDI Agent Architecture with Online Stochastic Planning and Plan Caching
Title | A Hybrid POMDP-BDI Agent Architecture with Online Stochastic Planning and Plan Caching |
Authors | Gavin Rens, Deshendran Moodley |
Abstract | This article presents an agent architecture for controlling an autonomous agent in stochastic environments. The architecture combines the partially observable Markov decision process (POMDP) model with the belief-desire-intention (BDI) framework. The Hybrid POMDP-BDI agent architecture takes the best features from the two approaches, that is, the online generation of reward-maximizing courses of action from POMDP theory, and sophisticated multiple-goal management from BDI theory. We introduce the advances made since the introduction of the basic architecture, including (i) the ability to pursue multiple goals simultaneously and (ii) a plan library for storing both pre-written plans and recently generated plans for future reuse. A version of the architecture without the plan library is implemented and evaluated using simulations. The results of the simulation experiments indicate that the approach is feasible. |
Tasks | |
Published | 2016-07-03 |
URL | http://arxiv.org/abs/1607.00656v1 |
PDF | http://arxiv.org/pdf/1607.00656v1.pdf |
PWC | https://paperswithcode.com/paper/a-hybrid-pomdp-bdi-agent-architecture-with |
Repo | |
Framework | |
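The plan-library mechanism can be sketched as a cache keyed by a discretized belief and a goal, with a fall-back to online planning on a miss. Everything below, including the `plan_online` stub and the belief discretization, is a hypothetical illustration of the caching idea rather than the architecture's actual interfaces.

```python
# Plan caching: reuse a stored plan for a (discretized belief, goal) pair,
# falling back to the online POMDP planner on a miss.

def discretize(belief, bins=10):
    """Bucket a belief (state -> probability dict) into a hashable key."""
    return tuple(sorted((s, round(p * bins) / bins) for s, p in belief.items()))

class PlanLibrary:
    def __init__(self, planner):
        self.planner = planner          # online reward-maximizing planner
        self.cache = {}                 # (belief key, goal) -> plan

    def get_plan(self, belief, goal):
        key = (discretize(belief), goal)
        if key not in self.cache:
            self.cache[key] = self.planner(belief, goal)   # cache miss
        return self.cache[key]

def plan_online(belief, goal):          # stub standing in for POMDP planning
    return [f"act-towards-{goal}"]

lib = PlanLibrary(plan_online)
print(lib.get_plan({"s1": 0.72, "s2": 0.28}, "recharge"))  # planned, cached
print(lib.get_plan({"s1": 0.69, "s2": 0.31}, "recharge"))  # cache hit
```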
How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
Title | How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions |
Authors | Arun Tejasvi Chaganty, Percy Liang |
Abstract | How much is 131 million US dollars? To help readers put such numbers in context, we propose a new task of automatically generating short descriptions known as perspectives, e.g. “$131 million is about the cost to employ everyone in Texas over a lunch period”. First, we collect a dataset of numeric mentions in news articles, where each mention is labeled with a set of rated perspectives. We then propose a system to generate these descriptions consisting of two steps: formula construction and description generation. In construction, we compose formulas from numeric facts in a knowledge base and rank them based on familiarity, numeric proximity and semantic compatibility. In generation, we convert a formula into natural language using a sequence-to-sequence recurrent neural network. Our system obtains a 15.2% F1 improvement over a non-compositional baseline at formula construction and a 12.5 BLEU point improvement over a baseline at description generation. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00070v1 |
PDF | http://arxiv.org/pdf/1609.00070v1.pdf |
PWC | https://paperswithcode.com/paper/how-much-is-131-million-dollars-putting |
Repo | |
Framework | |
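The formula-construction step can be sketched as composing products of knowledge-base facts and ranking them by numeric proximity to the mention. The toy below scores candidates by absolute log-ratio only; familiarity and semantic compatibility, along with the seq2seq generation step, are omitted, and the facts are illustrative values, not the paper's knowledge base.

```python
# Compose candidate formulas from facts and rank by numeric proximity.
import itertools, math

facts = {"cost of a big mac": 5.0,
         "population of texas": 28_000_000,
         "median daily wage": 120.0}

def candidates(mention_value):
    """Products of up to two facts, scored by |log(formula / mention)|."""
    out = []
    for r in (1, 2):
        for combo in itertools.combinations(facts.items(), r):
            names, vals = zip(*combo)
            value = math.prod(vals)
            score = abs(math.log(value / mention_value))
            out.append((score, " x ".join(names), value))
    return sorted(out)

for score, formula, value in candidates(131_000_000)[:3]:
    print(f"{formula} = {value:,.0f}  (proximity score {score:.2f})")
```

On these toy facts the top-ranked candidate is "cost of a big mac x population of texas", about $140M, which mirrors the compositional flavor of the paper's example.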
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingual Media Monitoring
Title | The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingual Media Monitoring |
Authors | Normunds Gruzitis, Guntis Barzdins |
Abstract | In the era of Big Data and Deep Learning, there is a common view that machine learning approaches are the only way to achieve robust and scalable information extraction and summarization. It has recently been proposed that the CNL approach could be scaled up, building on the concept of embedded CNL and thus allowing for CNL-based information extraction from, e.g., normative or medical texts that are rather controlled by nature but still exceed the boundaries of CNL. Although it is arguable whether CNL can be exploited to approach robust wide-coverage semantic parsing for use cases like media monitoring, its potential becomes much more obvious in the opposite direction: generation of story highlights from the summarized AMR graphs, which is the focus of this position paper. |
Tasks | Abstractive Text Summarization, Semantic Parsing |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.05994v1 |
PDF | http://arxiv.org/pdf/1606.05994v1.pdf |
PWC | https://paperswithcode.com/paper/the-role-of-cnl-and-amr-in-scalable |
Repo | |
Framework | |
Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification
Title | Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification |
Authors | Hà Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino |
Abstract | This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds. Our formulation provides in particular a rich representation of image features by exploiting their non-linear correlations. Theoretically, we provide a finite-dimensional approximation of the Log-Hilbert-Schmidt (Log-HS) distance between covariance operators that is scalable to large datasets, while maintaining an effective discriminating capability. This allows us to efficiently approximate any continuous shift-invariant kernel defined using the Log-HS distance. At the same time, we prove that the Log-HS inner product between covariance operators is only approximable by its finite-dimensional counterpart in a very limited scenario. Consequently, kernels defined using the Log-HS inner product, such as polynomial kernels, are not scalable in the same way as shift-invariant kernels. Computationally, we apply the approximate Log-HS distance formulation to covariance operators of both handcrafted and convolutional features, exploiting both the expressiveness of these features and the power of the covariance representation. Empirically, we tested our framework on the task of image classification on twelve challenging datasets. In almost all cases, the results obtained outperform other state-of-the-art methods, demonstrating the competitiveness and potential of our framework. |
Tasks | Image Classification, Object Recognition |
Published | 2016-09-29 |
URL | http://arxiv.org/abs/1609.09251v1 |
PDF | http://arxiv.org/pdf/1609.09251v1.pdf |
PWC | https://paperswithcode.com/paper/kernel-methods-on-approximate-infinite |
Repo | |
Framework | |
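For empirical covariance matrices, a Log-Euclidean computation (matrix-log the regularized covariances, then take a Frobenius distance) gives the flavor of the finite-dimensional approximation discussed in the abstract. The sketch below is exactly that simplified finite-dimensional analogue, not the paper's Log-HS operator formulation.

```python
# Log-Euclidean style distance between regularized covariance matrices.
import numpy as np
from scipy.linalg import logm

def log_cov_distance(F1, F2, gamma=1e-3):
    """F1, F2: (n_samples, n_features) feature matrices from two images."""
    def log_cov(F):
        C = np.cov(F, rowvar=False) + gamma * np.eye(F.shape[1])
        return np.real(logm(C))       # discard numerical imaginary noise
    return np.linalg.norm(log_cov(F1) - log_cov(F2), ord="fro")

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 8))            # per-pixel features, image A
B = A @ np.diag(np.linspace(1.0, 2.0, 8))    # rescaled variant, image B
print(log_cov_distance(A, A), log_cov_distance(A, B))
```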
A Data-Driven Approach to Estimating the Number of Clusters in Hierarchical Clustering
Title | A Data-Driven Approach to Estimating the Number of Clusters in Hierarchical Clustering |
Authors | Antoine Zambelli |
Abstract | We propose two new methods for estimating the number of clusters in a hierarchical clustering framework, with the goal of creating a fully automated process. The methods are completely data-driven, require no input from the researcher, and are thus fully automated. They are easy to implement and not computationally intensive. We analyze performance on several simulated data sets and the Biobase Gene Expression Set, comparing our methods to the established Gap statistic and Elbow methods and outperforming both in multi-cluster scenarios. |
Tasks | |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04700v1 |
PDF | http://arxiv.org/pdf/1608.04700v1.pdf |
PWC | https://paperswithcode.com/paper/a-data-driven-approach-to-estimating-the |
Repo | |
Framework | |
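For orientation, here is what the baseline setting looks like: hierarchical clustering with SciPy and a number-of-clusters estimate taken at the largest jump (the "elbow") in merge distances. This sketches one of the baselines the paper compares against, not the paper's own two estimators, which the abstract does not specify.

```python
# Elbow-style cluster-count estimate from hierarchical merge distances.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in (0, 3, 6)])  # 3 blobs

Z = linkage(X, method="ward")
merge_heights = Z[:, 2]                       # distance at each merge
jumps = np.diff(merge_heights)
k = len(X) - (np.argmax(jumps) + 1)           # clusters just before big jump
print("estimated clusters:", k)
print(np.bincount(fcluster(Z, t=k, criterion="maxclust"))[1:])
```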
Deep Adaptive Network: An Efficient Deep Neural Network with Sparse Binary Connections
Title | Deep Adaptive Network: An Efficient Deep Neural Network with Sparse Binary Connections |
Authors | Xichuan Zhou, Shengli Li, Kai Qin, Kunping Li, Fang Tang, Shengdong Hu, Shujun Liu, Zhi Lin |
Abstract | Deep neural networks are state-of-the-art models for understanding the content of images, video and raw input data. However, implementing a deep neural network in embedded systems is a challenging task, because a typical deep neural network, such as a Deep Belief Network using 128x128 images as input, could exhaust gigabytes of memory and create bandwidth and computation bottlenecks. To address this challenge, this paper presents a hardware-oriented deep learning algorithm, named the Deep Adaptive Network, which attempts to exploit the sparsity in the neural connections. The proposed method adaptively reduces the weights associated with negligible features to zero, leading to a sparse feedforward network architecture. Furthermore, since the small proportion of important weights is significantly larger than zero, they can be robustly thresholded and represented using single-bit integers (-1 and +1), leading to implementations of deep neural networks with sparse and binary connections. Our experiments showed that, for the application of recognizing MNIST handwritten digits, the features extracted by a two-layer Deep Adaptive Network with about 25% of the important connections reserved achieved 97.2% classification accuracy, which was almost the same as the standard Deep Belief Network (97.3%). Furthermore, for efficient hardware implementations, the sparse-and-binary-weighted deep neural network could save about 99.3% of memory and 99.9% of computation units without significant loss of classification accuracy for pattern recognition applications. |
Tasks | |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06154v1 |
PDF | http://arxiv.org/pdf/1604.06154v1.pdf |
PWC | https://paperswithcode.com/paper/deep-adaptive-network-an-efficient-deep |
Repo | |
Framework | |
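The two post-processing steps the abstract describes, zeroing the weights tied to negligible features and thresholding the survivors to single-bit values, can be sketched in a few lines of NumPy. The adaptive training that produces the sparsity is the paper's actual contribution and is not shown; the shared scale factor below is an assumption for illustration.

```python
# Sparsify a weight matrix, then binarize survivors to +/-1 with one scale.
import numpy as np

def sparsify_and_binarize(W, keep=0.25):
    """Keep the largest-magnitude `keep` fraction of weights as +/-1."""
    cutoff = np.quantile(np.abs(W), 1 - keep)
    mask = np.abs(W) >= cutoff                      # sparse connections
    scale = np.abs(W[mask]).mean()                  # shared magnitude (assumed)
    return mask * np.sign(W) * scale, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 64))
Wb, mask = sparsify_and_binarize(W)
print(f"kept {mask.mean():.0%} of connections; "
      f"distinct nonzero values: {np.unique(Wb[mask]).size} (one +/- scale)")
```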