October 19, 2019

3218 words 16 mins read

Paper Group ANR 144

PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems. Gender Effect on Face Recognition for a Large Longitudinal Database. Feature selection in functional data classification with recursive maxima hunting. A Novel Co-design Peta-scale Heterogeneous Cluster for Deep Learning Training. Technique for designing a domain ontolog …

PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Title PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
Authors Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santambrogio, Markus Weimer, Matteo Interlandi
Abstract Machine Learning models are often composed of pipelines of transformations. While this design allows single model components to be executed efficiently at training time, prediction serving has different requirements, such as low latency, high throughput and graceful performance degradation under heavy load. Current prediction serving systems consider models as black boxes, whereby prediction-time-specific optimizations are ignored in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white-box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL is able to introduce performance improvements over different dimensions; compared to state-of-the-art approaches, PRETZEL is on average able to reduce 99th percentile latency by 5.5x while reducing memory footprint by 25x and increasing throughput by 4.7x.
Tasks
Published 2018-10-14
URL http://arxiv.org/abs/1810.06115v1
PDF http://arxiv.org/pdf/1810.06115v1.pdf
PWC https://paperswithcode.com/paper/pretzel-opening-the-black-box-of-machine
Repo
Framework

Gender Effect on Face Recognition for a Large Longitudinal Database

Title Gender Effect on Face Recognition for a Large Longitudinal Database
Authors Caroline Werther, Morgan Ferguson, Kevin Park, Troy Kling, Cuixian Chen, Yishi Wang
Abstract Aging and gender variation can affect face recognition performance dramatically. While most face recognition studies focus on variation in pose, illumination and expression, it is important to consider the influence of gender and how to design an effective matching framework. In this paper, we address these problems on a very large longitudinal database, MORPH-II, which contains 55,134 face images of 13,617 individuals. First, we consider four comprehensive experiments with different combinations of gender distribution and subset size, including: 1) an equal gender distribution; 2) a large, highly unbalanced gender distribution; 3) different gender combinations, such as male only, female only, or mixed gender; and 4) the effect of subset size in terms of the number of individuals. Second, we consider eight nearest-neighbor distance metrics as well as a Support Vector Machine (SVM) as classifiers, and test the effect of the different classifiers. Last, we consider different fusion techniques for an effective matching framework to improve recognition performance.
Tasks Face Recognition
Published 2018-11-08
URL http://arxiv.org/abs/1811.03680v1
PDF http://arxiv.org/pdf/1811.03680v1.pdf
PWC https://paperswithcode.com/paper/gender-effect-on-face-recognition-for-a-large
Repo
Framework
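
The fusion step the abstract mentions can be illustrated as a score-level combination of nearest-neighbor distance metrics. The sketch below is a minimal assumption-laden example, not the paper's exact framework: the L1/L2 metric pair, the min-max normalisation and the sum rule are illustrative choices.

```python
import numpy as np

def l1(a, b):
    return np.abs(a - b).sum()

def l2(a, b):
    return np.sqrt(((a - b) ** 2).sum())

def fused_nn_match(probe, gallery, metrics=(l1, l2)):
    """Score-level fusion sketch: min-max normalise each metric's distances
    to the gallery and sum them; the best match is the smallest fused score."""
    total = np.zeros(len(gallery))
    for m in metrics:
        d = np.array([m(probe, g) for g in gallery])
        total += (d - d.min()) / (d.max() - d.min() + 1e-12)
    return int(np.argmin(total))
```

Normalising before summing keeps one metric's scale from dominating the fused score, which is the usual motivation for min-max (or z-score) normalisation in score-level fusion.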

Feature selection in functional data classification with recursive maxima hunting

Title Feature selection in functional data classification with recursive maxima hunting
Authors José L. Torrecilla, Alberto Suárez
Abstract Dimensionality reduction is one of the key issues in the design of effective machine learning methods for automatic induction. In this work, we introduce recursive maxima hunting (RMH) for variable selection in classification problems with functional data. In this context, variable selection techniques are especially attractive because they reduce the dimensionality, facilitate the interpretation and can improve the accuracy of the predictive models. The method, which is a recursive extension of maxima hunting (MH), performs variable selection by identifying the maxima of a relevance function, which measures the strength of the correlation of the predictor functional variable with the class label. At each stage, the information associated with the selected variable is removed by subtracting the conditional expectation of the process. The results of an extensive empirical evaluation are used to illustrate that, in the problems investigated, RMH has comparable or higher predictive accuracy than the standard dimensionality reduction techniques, such as PCA and PLS, and state-of-the-art feature selection methods for functional data, such as maxima hunting.
Tasks Dimensionality Reduction, Feature Selection
Published 2018-06-07
URL http://arxiv.org/abs/1806.02922v1
PDF http://arxiv.org/pdf/1806.02922v1.pdf
PWC https://paperswithcode.com/paper/feature-selection-in-functional-data
Repo
Framework
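
The select-then-deflate loop of RMH can be sketched compactly. In this toy version, squared Pearson correlation stands in for the paper's relevance function and a through-the-origin linear predictor stands in for the Gaussian-process conditional expectation; both substitutions are simplifying assumptions.

```python
import numpy as np

def relevance(X, y):
    # squared Pearson correlation of each candidate variable with the label
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).mean(axis=0) / (Xc.std(axis=0) * yc.std() + 1e-12)
    return r ** 2

def recursive_maxima_hunting(X, y, n_select=3):
    """Toy RMH loop: pick the maximally relevant variable, then remove the
    information it carries by subtracting its best linear predictor from
    every remaining variable before the next round."""
    X = X.astype(float).copy()
    selected = []
    for _ in range(n_select):
        t_star = int(np.argmax(relevance(X, y)))
        selected.append(t_star)
        x_star = X[:, t_star]
        beta = (X * x_star[:, None]).mean(axis=0) / ((x_star ** 2).mean() + 1e-12)
        X = X - x_star[:, None] * beta[None, :]
    return selected
```

After deflation the selected variable is exactly zeroed out, so it cannot be selected again; subsequent rounds pick maxima of the residual relevance.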

A Novel Co-design Peta-scale Heterogeneous Cluster for Deep Learning Training

Title A Novel Co-design Peta-scale Heterogeneous Cluster for Deep Learning Training
Authors Xin Chen, Hua Zhou, Yuxiang Gao, Yu Zhu
Abstract Large-scale deep Convolutional Neural Networks (CNNs) increasingly demand computing power, and access to a powerful computing platform is key for researchers advancing deep learning (DL). On the other hand, commodity GPU cards of new generations, the commonly used accelerators, are increasingly expensive. Consequently, it is important to design an affordable distributed heterogeneous system that provides powerful computational capacity, and to develop well-suited software that utilizes that capacity efficiently. In this paper, we present our co-designed distributed system, including a peta-scale GPU cluster called “Manoa”. Based on the properties and topology of Manoa, we first propose and implement a job server framework, named “MiMatrix”. The central node of MiMatrix, referred to as the job server, undertakes all controlling, scheduling, monitoring and I/O tasks, without transferring weight data for AllReduce processing in each iteration. MiMatrix therefore intrinsically avoids the central-node bandwidth bottleneck of the parameter server framework widely used in distributed DL tasks. We also propose a new AllReduce algorithm, GPUDirect RDMA-Aware AllReduce (GDRAA), in which both computation and handshake messages are O(1) and the number of synchronizations per iteration is two, a theoretical minimum. Owing to the dedicated co-designed system, MiMatrix makes efficient use of Manoa's computational capacity and bandwidth. We benchmark ResNet-50 and ResNet-101 on the ImageNet-1K dataset on Manoa; some of the results demonstrate state-of-the-art performance.
Tasks
Published 2018-02-07
URL http://arxiv.org/abs/1802.02326v3
PDF http://arxiv.org/pdf/1802.02326v3.pdf
PWC https://paperswithcode.com/paper/a-novel-co-design-peta-scale-heterogeneous
Repo
Framework

Technique for designing a domain ontology

Title Technique for designing a domain ontology
Authors A. V. Palagin, N. G. Petrenko, K. S. Malakhov
Abstract The article describes a technique for designing a domain ontology, presents a flowchart of the design algorithm, and considers an example of constructing a fragment of an ontology for the subject area of Computer Science.
Tasks
Published 2018-02-17
URL http://arxiv.org/abs/1802.06769v1
PDF http://arxiv.org/pdf/1802.06769v1.pdf
PWC https://paperswithcode.com/paper/technique-for-designing-a-domain-ontology
Repo
Framework

Why Interpretability in Machine Learning? An Answer Using Distributed Detection and Data Fusion Theory

Title Why Interpretability in Machine Learning? An Answer Using Distributed Detection and Data Fusion Theory
Authors Kush R. Varshney, Prashant Khanduri, Pranay Sharma, Shan Zhang, Pramod K. Varshney
Abstract As artificial intelligence is increasingly affecting all parts of society and life, there is growing recognition that human interpretability of machine learning models is important. It is often argued that accuracy or other similar generalization performance metrics must be sacrificed in order to gain interpretability. Such arguments, however, fail to acknowledge that the overall decision-making system is composed of two entities: the learned model and a human who fuses together model outputs with his or her own information. As such, the relevant performance criteria should be for the entire system, not just for the machine learning component. In this work, we characterize the performance of such two-node tandem data fusion systems using the theory of distributed detection. In doing so, we work in the population setting and model interpretable learned models as multi-level quantizers. We prove that under our abstraction, the overall system of a human with an interpretable classifier outperforms one with a black box classifier.
Tasks Decision Making
Published 2018-06-25
URL http://arxiv.org/abs/1806.09710v1
PDF http://arxiv.org/pdf/1806.09710v1.pdf
PWC https://paperswithcode.com/paper/why-interpretability-in-machine-learning-an
Repo
Framework
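
The paper's abstraction of an interpretable classifier as a multi-level quantizer can be sketched as follows. The uniform bins and the averaging fusion rule below are assumptions of this sketch, not the paper's (optimal) fusion analysis; they only illustrate the two-node tandem structure of quantized model output plus human evidence.

```python
import numpy as np

def quantize(score, levels=4):
    """An 'interpretable' classifier abstracted as a multi-level quantizer
    of its continuous score in [0, 1], with uniform bin edges."""
    edges = np.linspace(0.0, 1.0, levels + 1)[1:-1]
    return int(np.digitize(score, edges))

def fused_decision(model_score, human_score, levels=4):
    """Toy two-node tandem fusion: the human sees only the quantized model
    output, averages it with their own evidence, and thresholds the result
    (an assumed rule for illustration)."""
    q = quantize(model_score, levels) / (levels - 1)
    return int((q + human_score) / 2.0 > 0.5)
```

The point of the abstraction is that the human fuses a coarse but legible model output with independent information, so system-level performance, not the quantizer's accuracy alone, is the right criterion.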

Learning Features and Abstract Actions for Computing Generalized Plans

Title Learning Features and Abstract Actions for Computing Generalized Plans
Authors Blai Bonet, Guillem Francès, Hector Geffner
Abstract Generalized planning is concerned with the computation of plans that solve not one but multiple instances of a planning domain. Recently, it has been shown that generalized plans can be expressed as mappings of feature values into actions, and that they can often be computed with fully observable non-deterministic (FOND) planners. The actions in such plans, however, are not the actions in the instances themselves, which are not necessarily common to other instances, but abstract actions that are defined on a set of common features. The formulation assumes that the features and the abstract actions are given. In this work, we address this limitation by showing how to learn them automatically. The resulting account of generalized planning combines learning and planning in a novel way: a learner, based on a Max SAT formulation, yields the features and abstract actions from sampled state transitions, and a FOND planner uses this information, suitably transformed, to produce the general plans. Correctness guarantees are given and experimental results on several domains are reported.
Tasks
Published 2018-11-17
URL http://arxiv.org/abs/1811.07231v1
PDF http://arxiv.org/pdf/1811.07231v1.pdf
PWC https://paperswithcode.com/paper/learning-features-and-abstract-actions-for
Repo
Framework

Changing Observations in Epistemic Temporal Logic

Title Changing Observations in Epistemic Temporal Logic
Authors Aurèle Barrière, Bastien Maubert, Aniello Murano, Sasha Rubin
Abstract We study dynamic changes of agents’ observational power in logics of knowledge and time. We consider CTLK, the extension of CTL with knowledge operators, and enrich it with a new operator that models a change in an agent’s way of observing the system. We extend the classic semantics of knowledge for perfect-recall agents to account for changes of observation, and we show that this new operator strictly increases the expressivity of CTLK. We reduce the model-checking problem for our logic to that for CTLK, which is known to be decidable. This provides a solution to the model-checking problem for our logic, but its complexity is not optimal. Indeed we provide a direct decision procedure with better complexity.
Tasks
Published 2018-05-17
URL http://arxiv.org/abs/1805.06881v2
PDF http://arxiv.org/pdf/1805.06881v2.pdf
PWC https://paperswithcode.com/paper/changing-observations-in-epistemic-temporal
Repo
Framework

Deep Mask For X-ray Based Heart Disease Classification

Title Deep Mask For X-ray Based Heart Disease Classification
Authors Xupeng Chen, Binbin Shi
Abstract We build a deep learning model to detect and classify heart disease using X-ray images. We collect data from several hospitals and public datasets. After preprocessing, we obtain 3,026 images covering the disease types VSD, ASD and TOF, plus normal controls. The main problem we have to solve is enabling the network to accurately learn the characteristics of the heart, ensuring the reliability of the network while increasing accuracy. By drawing on doctors' diagnostic experience, labeling the images and using tools to extract masks of the heart region, we train a U-Net to generate a mask that directs more attention to the heart. This forces the model to focus on the characteristics of the heart region and yields more reliable results.
Tasks
Published 2018-08-19
URL http://arxiv.org/abs/1808.08277v1
PDF http://arxiv.org/pdf/1808.08277v1.pdf
PWC https://paperswithcode.com/paper/deep-mask-for-x-ray-based-heart-disease
Repo
Framework

Nuclei Detection Using Mixture Density Networks

Title Nuclei Detection Using Mixture Density Networks
Authors Navid Alemi Koohababni, Mostafa Jahanifar, Ali Gooya, Nasir Rajpoot
Abstract Nuclei detection is an important task in the histology domain, as it is a main step toward further analyses such as cell counting, cell segmentation and the study of cell connections. It is a challenging task due to the complex texture of histology images, variation in shape, and touching cells. To tackle these hurdles, many approaches have been proposed in the literature, among which deep learning methods perform best. Hence, in this paper, we propose a novel framework for nuclei detection based on Mixture Density Networks (MDNs). These networks are suitable for mapping a single input to several possible outputs, and we utilize this property to detect multiple seeds in a single image patch. A new modified form of a cost function is proposed for training and handling patches with missing nuclei. The probability maps of the nuclei in the individual patches are then combined to generate the final image-wide result. The experimental results show state-of-the-art performance on a complex colorectal adenocarcinoma dataset.
Tasks Cell Segmentation
Published 2018-08-22
URL http://arxiv.org/abs/1808.08279v1
PDF http://arxiv.org/pdf/1808.08279v1.pdf
PWC https://paperswithcode.com/paper/nuclei-detection-using-mixture-density
Repo
Framework
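
The core of a mixture density network is its loss: the negative log-likelihood of the target under a Gaussian mixture whose parameters the network emits. The sketch below shows that loss for a scalar target; it is schematic only, and does not include the paper's modified cost for patches with no nuclei.

```python
import numpy as np

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of a scalar target y under the Gaussian
    mixture an MDN head would emit: mixture weights pi, means mu and
    scales sigma, each of shape (K,)."""
    comp = pi * np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # small constant guards against log(0) when y is far from all components
    return -np.log(np.sum(comp) + 1e-12)
```

Because the mixture has several modes, a single patch can assign high likelihood to several distinct seed locations at once, which is the property the paper exploits for detecting multiple nuclei per patch.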

Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data

Title Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data
Authors Sebastian Bodenstedt, Martin Wagner, Lars Mündermann, Hannes Kenngott, Beat Müller-Stich, Michael Breucha, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel
Abstract Purpose The course of surgical procedures is often unpredictable, making it difficult to estimate the duration of procedures beforehand. A context-aware method that analyses the workflow of an intervention online and automatically predicts the remaining duration would alleviate these problems. As a basis for such an estimate, information regarding the current state of the intervention is required. Methods Today, the operating room contains a diverse range of sensors. During laparoscopic interventions, the endoscopic video stream is an ideal source of such information. Extracting quantitative information from the video is challenging, though, due to its high dimensionality. Other surgical devices (e.g. insufflator, lights, etc.) provide data streams which are, in contrast to the video stream, more compact and easier to quantify. Whether such streams offer sufficient information for estimating the duration of surgery is, however, uncertain. Here, we propose and compare methods, based on convolutional neural networks, for continuously predicting the duration of laparoscopic interventions from unlabeled data, such as endoscopic images and surgical device streams. Results The methods are evaluated on 80 laparoscopic interventions of various types, for which surgical device data and the endoscopic video are available. The combined method performs best, with an overall average error of 37% and an average halftime error of 28%. Conclusion In this paper, we present, to our knowledge, the first approach for online procedure duration prediction using unlabeled endoscopic video data and surgical device data in a laparoscopic setting. We also show that a method incorporating both vision and device data performs better than methods based only on vision, while methods based only on tool usage and surgical device data perform poorly, showing the importance of the visual channel.
Tasks
Published 2018-11-08
URL http://arxiv.org/abs/1811.03384v2
PDF http://arxiv.org/pdf/1811.03384v2.pdf
PWC https://paperswithcode.com/paper/prediction-of-laparoscopic-procedure-duration
Repo
Framework

Quantum-inspired Complex Word Embedding

Title Quantum-inspired Complex Word Embedding
Authors Qiuchi Li, Sagar Uprety, Benyou Wang, Dawei Song
Abstract A challenging task for word embeddings is to capture the emergent meaning or polarity of a combination of individual words. For example, existing approaches in word embeddings will assign high probabilities to the words “Penguin” and “Fly” if they frequently co-occur, but they fail to capture the fact that the two occur in an opposite sense - penguins do not fly. We hypothesize that humans do not associate a single polarity or sentiment with each word. A word's contribution to the overall polarity of a combination depends on the other words it is combined with. This is analogous to the behavior of microscopic particles, which exist in all possible states at the same time and interfere with each other to give rise to new states depending on their relative phases. We make use of the Hilbert space representation of such particles in Quantum Mechanics, ascribing to each word a complex-valued relative phase, and investigate two such quantum-inspired models to derive the meaning of a combination of words. The proposed models achieve better performance than state-of-the-art non-quantum models on a binary sentence classification task.
Tasks Sentence Classification, Word Embeddings
Published 2018-05-29
URL http://arxiv.org/abs/1805.11351v1
PDF http://arxiv.org/pdf/1805.11351v1.pdf
PWC https://paperswithcode.com/paper/quantum-inspired-complex-word-embedding
Repo
Framework
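
The interference mechanism the abstract appeals to can be shown in a few lines. This is a toy illustration of complex-valued embeddings, not the paper's models: amplitudes carry co-occurrence strength and phases carry the extra 'polarity' degree of freedom the paper hypothesises.

```python
import numpy as np

def word_state(amplitudes, phases):
    """A toy complex word embedding: dimension j is r_j * exp(i * theta_j)."""
    return np.asarray(amplitudes, dtype=float) * np.exp(1j * np.asarray(phases, dtype=float))

def compose(states):
    """Superpose word states and 'measure': the squared modulus of the sum
    depends on the relative phases (constructive vs destructive interference)."""
    return np.abs(np.sum(states, axis=0)) ** 2
```

With equal amplitudes, two in-phase words reinforce each other while opposite phases cancel, which is the kind of mechanism the abstract invokes for a pair like “Penguin” and “Fly”.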

Deep Domain Adaptation under Deep Label Scarcity

Title Deep Domain Adaptation under Deep Label Scarcity
Authors Amar Prakash Azad, Dinesh Garg, Priyanka Agrawal, Arun Kumar
Abstract The goal of Domain Adaptation (DA) is to leverage labeled examples from a source domain so as to infer an accurate model in a target domain where labels are unavailable or scarce at best. A state-of-the-art approach to DA is due to (Ganin et al. 2016), known as DANN, which attempts to induce a common representation of source and target domains via adversarial training. This approach requires a large number of labeled examples from the source domain to infer a good model for the target domain. However, in many situations obtaining labels in the source domain is expensive, which deteriorates the performance of DANN and limits its applicability in such scenarios. In this paper, we propose a novel approach to overcome this limitation. We first establish that DANN reduces the original DA problem to a semi-supervised learning problem over the space of common representations. Next, we propose a learning approach, namely TransDANN, that amalgamates adversarial learning and transductive learning to mitigate the detrimental impact of limited source labels and yields improved performance. Experimental results (on both text and images) show a significant boost in the performance of TransDANN over DANN under such scenarios. We also provide theoretical justification for the performance boost.
Tasks Domain Adaptation
Published 2018-09-20
URL http://arxiv.org/abs/1809.08097v1
PDF http://arxiv.org/pdf/1809.08097v1.pdf
PWC https://paperswithcode.com/paper/deep-domain-adaptation-under-deep-label
Repo
Framework

Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions

Title Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions
Authors Milo Honegger
Abstract From self-driving vehicles and back-flipping robots to virtual assistants who book our next appointment at the hair salon or at that restaurant for dinner - machine learning systems are becoming increasingly ubiquitous. The main reason for this is that these methods boast remarkable predictive capabilities. However, most of these models remain black boxes, meaning that it is very challenging for humans to follow and understand their intricate inner workings. Consequently, interpretability has suffered under this ever-increasing complexity of machine learning models. Especially with regard to new regulations, such as the General Data Protection Regulation (GDPR), the necessity for plausibility and verifiability of predictions made by these black boxes is indispensable. Driven by the needs of industry and practice, the research community has recognised this interpretability problem and focussed on developing a growing number of so-called explanation methods over the past few years. These methods explain individual predictions made by black box machine learning models and help to recover some of the lost interpretability. With the proliferation of these explanation methods, however, it is often unclear which explanation method offers higher explanation quality or is generally better suited to the situation at hand. In this thesis, we thus propose an axiomatic framework that allows comparing the quality of different explanation methods. Through experimental validation, we find that the developed framework is useful for assessing the explanation quality of different explanation methods and reaches conclusions consistent with independent research.
Tasks
Published 2018-08-15
URL http://arxiv.org/abs/1808.05054v1
PDF http://arxiv.org/pdf/1808.05054v1.pdf
PWC https://paperswithcode.com/paper/shedding-light-on-black-box-machine-learning
Repo
Framework

Analysis of KNN Information Estimators for Smooth Distributions

Title Analysis of KNN Information Estimators for Smooth Distributions
Authors Puning Zhao, Lifeng Lai
Abstract The KSG mutual information estimator, which is based on the distance of each sample to its k-th nearest neighbor, is widely used to estimate mutual information between two continuous random variables. Existing work has analyzed the convergence rate of this estimator for random variables whose densities are bounded away from zero on their support. In practice, however, the KSG estimator also performs well for a much broader class of distributions, including not only those with bounded support and densities bounded away from zero, but also those with bounded support and densities approaching zero, and those with unbounded support. In this paper, we analyze the convergence rate of the error of the KSG estimator for smooth distributions whose density support can be either bounded or unbounded. As the KSG mutual information estimator can be viewed as an adaptive recombination of KL entropy estimators, our analysis also provides a convergence analysis of the KL entropy estimator for a broad class of distributions.
Tasks
Published 2018-10-27
URL https://arxiv.org/abs/1810.11571v3
PDF https://arxiv.org/pdf/1810.11571v3.pdf
PWC https://paperswithcode.com/paper/analysis-of-knn-information-estimators-for
Repo
Framework
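
The estimator the abstract analyzes admits a compact sketch: for each sample, take the max-norm distance to its k-th nearest neighbor in the joint space, count the marginal neighbors within that radius, and average digamma terms. This is the standard first variant of the Kraskov–Stögbauer–Grassberger (KSG) estimator, with a brute-force O(n²) neighbor search and a hand-rolled digamma to stay dependency-free.

```python
import numpy as np

def digamma(x):
    # digamma via the recurrence psi(x) = psi(x+1) - 1/x, then an
    # asymptotic series once x >= 6
    r = 0.0
    while x < 6:
        r -= 1.0 / x
        x += 1
    f = 1.0 / (x * x)
    return r + np.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def ksg_mi(x, y, k=3):
    """KSG estimate of I(X;Y) for 1-D sample arrays x, y."""
    n = len(x)
    acc = 0.0
    for i in range(n):
        dx = np.abs(x - x[i])
        dy = np.abs(y - y[i])
        dz = np.maximum(dx, dy)              # max-norm in the joint space
        eps = np.sort(dz)[k]                 # k-th neighbor (index 0 is self)
        nx = np.count_nonzero(dx < eps) - 1  # marginal counts, self excluded
        ny = np.count_nonzero(dy < eps) - 1
        acc += digamma(nx + 1) + digamma(ny + 1)
    return digamma(k) + digamma(n) - acc / n
```

For bivariate Gaussians with correlation rho, the true mutual information is -0.5 * log(1 - rho**2), which makes a convenient sanity check for the estimator.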