January 26, 2020

Paper Group ANR 1468

A Machine Learning framework for Sleeping Cell Detection in a Smart-city IoT Telecommunications Infrastructure

Title A Machine Learning framework for Sleeping Cell Detection in a Smart-city IoT Telecommunications Infrastructure
Authors Orestes Manzanilla-Salazar, Filippo Malandra, Hakim Mellah, Constant Wette, Brunilde Sanso
Abstract The smooth operation of widely deployed Internet of Things (IoT) applications will depend on, among other things, effective infrastructure failure detection. Access failures in wireless network Base Stations (BSs) produce a phenomenon called “sleeping cells”, which can render a cell catatonic without triggering any alarms or provoking immediate effects on cell performance, making them difficult to discover. To detect this kind of failure, we propose a Machine Learning (ML) framework based on the use of Key Performance Indicator (KPI) statistics from the BS under study, as well as those of the neighboring BSs with propensity to have their performance affected by the failure. A simple way to define neighbors is to use adjacency in Voronoi diagrams. In this paper, we propose a much more realistic approach based on the nature of radio-propagation and the way devices choose the BS to which they send access requests. We gather data from large-scale simulators that use real location data for BSs and IoT devices and pose the detection problem as a supervised binary classification problem. We measure the effects on the detection performance of the size of time aggregations of the data, the level of traffic and the parameters of the neighborhood definition. The Extra Trees and Naive Bayes classifiers achieve Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) scores of 0.996 and 0.993, respectively, with False Positive Rate (FPR) under 5%. The proposed framework holds potential for other pattern recognition tasks in smart-city wireless infrastructures that would enable the monitoring, prediction and improvement of the Quality of Service (QoS) experienced by IoT applications.
Tasks
Published 2019-10-02
URL https://arxiv.org/abs/1910.01092v2
PDF https://arxiv.org/pdf/1910.01092v2.pdf
PWC https://paperswithcode.com/paper/a-machine-learning-framework-for-sleeping
Repo
Framework
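
The abstract above reports results as ROC AUC and False Positive Rate. As a reminder of what those two numbers measure, here is a minimal sketch in plain Python. The function names `roc_auc` and `false_positive_rate`, the toy labels and scores, and the 0.45 threshold are all illustrative assumptions; none of this is taken from the paper or its data.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def false_positive_rate(y_true, y_score, threshold):
    """Fraction of true negatives whose score reaches the alarm threshold."""
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    return sum(s >= threshold for s in neg) / len(neg)


# Hypothetical toy data: 1 = sleeping cell, 0 = healthy cell.
y_true = [1, 1, 0, 0, 0]
y_score = [0.9, 0.4, 0.5, 0.2, 0.1]

print(roc_auc(y_true, y_score))
print(false_positive_rate(y_true, y_score, threshold=0.45))
```

AUC summarizes ranking quality over all thresholds, while FPR is tied to one chosen operating point, which is why the abstract quotes both.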

Deep Learning Multidimensional Projections

Title Deep Learning Multidimensional Projections
Authors Mateus Espadoto, Nina S. T. Hirata, Alexandru C. Telea
Abstract Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ability to visually separate distinct data clusters. However, such methods are computationally expensive for large datasets, suffer from stability problems, and cannot directly handle out-of-sample data. We propose a learning approach to construct such projections. We train a deep neural network based on a collection of samples from a given data universe, and their corresponding projections, and next use the network to infer projections of data from the same, or similar, universes. Our approach generates projections with similar characteristics as the learned ones, is computationally two to three orders of magnitude faster than SNE-class methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique. We demonstrate our proposal on several real-world high dimensional datasets from machine learning.
Tasks Dimensionality Reduction
Published 2019-02-21
URL http://arxiv.org/abs/1902.07958v1
PDF http://arxiv.org/pdf/1902.07958v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-multidimensional-projections
Repo
Framework

Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

Title Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations
Authors Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn
Abstract Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning; however, it poses the problem of ‘truth inference’, as individual workers cannot be wholly trusted to provide reliable annotations. Research into models of annotation aggregation attempts to infer a latent ‘true’ annotation, which has been shown to improve the utility of crowd-sourced data. However, existing techniques beat simple baselines only in low redundancy settings, where the number of annotations per instance is low ($\le 3$), or in situations where workers are unreliable and produce low quality annotations (e.g., through spamming, random, or adversarial behaviours). As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated ($\ge 5$ annotations per instance), and the vast majority of workers produce high quality outputs. In these settings, the majority vote heuristic performs very well, and most truth inference models underperform this simple baseline. We propose a novel technique, based on a Bayesian graphical model with conjugate priors, and simple iterative expectation-maximisation inference. Our technique produces competitive performance to the state-of-the-art benchmark methods, and is the only method that significantly outperforms the majority vote heuristic at one-sided level 0.025, shown by significance tests. Moreover, our technique is simple, is implemented in only 50 lines of code, and trains in seconds.
Tasks
Published 2019-02-24
URL http://arxiv.org/abs/1902.08918v1
PDF http://arxiv.org/pdf/1902.08918v1.pdf
PWC https://paperswithcode.com/paper/truth-inference-at-scale-a-bayesian-model-for
Repo
Framework
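
The majority vote heuristic that the abstract above uses as its baseline can be sketched in a few lines. This is only the baseline, not the authors' Bayesian model; the function name `majority_vote`, the item ids, and the labels are hypothetical.

```python
from collections import Counter


def majority_vote(annotations):
    """Pick the most frequent label for each item.

    `annotations` maps an item id to the list of labels given by workers.
    """
    return {
        item: Counter(labels).most_common(1)[0][0]
        for item, labels in annotations.items()
    }


# Hypothetical, highly redundant annotations (5 workers per item).
crowd_labels = {
    "img_1": ["cat", "cat", "dog", "cat", "cat"],
    "img_2": ["dog", "cat", "dog", "dog", "dog"],
}
consensus = majority_vote(crowd_labels)
print(consensus)
```

With five or more mostly reliable annotations per item, a single dissenting worker rarely flips the result, which is consistent with the abstract's remark that this heuristic is hard to beat in high-redundancy settings.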

On the relationship between multitask neural networks and multitask Gaussian Processes

Title On the relationship between multitask neural networks and multitask Gaussian Processes
Authors Karthikeyan K, Shubham Kumar Bharti, Piyush Rai
Abstract Despite the effectiveness of multitask deep neural network (MTDNN), there is a limited theoretical understanding on how the information is shared across different tasks in MTDNN. In this work, we establish a formal connection between MTDNN with infinitely-wide hidden layers and multitask Gaussian Process (GP). We derive multitask GP kernels corresponding to both single-layer and deep multitask Bayesian neural networks (MTBNN) and show that information among different tasks is shared primarily due to correlation across last layer weights of MTBNN and shared hyper-parameters, which is contrary to the popular hypothesis that information is shared because of shared intermediate layer weights. Our construction enables using multitask GP to perform efficient Bayesian inference for the equivalent MTDNN with infinitely-wide hidden layers. Prior work on the connection between deep neural networks and GP for single task settings can be seen as special cases of our construction. We also present an adaptive multitask neural network architecture that corresponds to a multitask GP with more flexible kernels, such as Linear Model of Coregionalization (LMC) and Cross-Coregionalization (CC) kernels. We provide experimental results to further illustrate these ideas on synthetic and real datasets.
Tasks Bayesian Inference, Gaussian Processes
Published 2019-12-12
URL https://arxiv.org/abs/1912.05723v1
PDF https://arxiv.org/pdf/1912.05723v1.pdf
PWC https://paperswithcode.com/paper/on-the-relationship-between-multitask-neural
Repo
Framework

Stationary Points of Shallow Neural Networks with Quadratic Activation Function

Title Stationary Points of Shallow Neural Networks with Quadratic Activation Function
Authors David Gamarnik, Eren C. Kızıldağ, Ilias Zadik
Abstract We consider the problem of learning shallow neural networks with quadratic activations and planted weight matrix $W^*\in\mathbb{R}^{m\times d}$, where $m$ is the width of the hidden layer and $d\leqslant m$ is the dimension of data having centered i.i.d. coordinates with finite fourth moment. We establish that the landscape of the population risk $\mathcal{L}(W)$ admits an energy barrier separating rank-deficient $W$: if $W\in\mathbb{R}^{m\times d}$ with ${\rm rank}(W)<d$, then $\mathcal{L}(W)$ is bounded away from zero by an amount we quantify. We then establish that all full-rank stationary points of $\mathcal{L}(\cdot)$ are necessarily global optimum. These two results propose a simple explanation for the success of gradient descent in training such networks, when properly initialized: gradient descent algorithm finds a global optimum due to the absence of spurious stationary points within the set of full-rank matrices. We then show that if $W^*\in\mathbb{R}^{m\times d}$ has centered i.i.d. entries with unit variance, finite fourth moment; and is sufficiently wide, that is $m>Cd^2$ for a large $C$, then it is easy to construct a full rank matrix $W$ with population risk below the energy barrier, starting from which gradient descent is guaranteed to converge to a global optimum. Our final focus is on sample complexity: we identify a simple necessary and sufficient geometric condition, not retrospective in manner, on the training data under which any minimizer of the empirical loss has necessarily zero generalization error. We show that as soon as $n\geqslant n^*=d(d+1)/2$, random data enjoys this geometric condition almost surely. At the same time we show that if $n<n^*$, then when the data has centered i.i.d. coordinates, there always exists a matrix $W$ with zero empirical risk, but with population risk bounded away from zero by the same amount as rank deficient matrices.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01599v2
PDF https://arxiv.org/pdf/1912.01599v2.pdf
PWC https://paperswithcode.com/paper/stationary-points-of-shallow-neural-networks
Repo
Framework
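
One way to read the threshold $n^*=d(d+1)/2$ in the abstract above: if the network output is the sum of squared hidden pre-activations, $\|Wx\|^2 = x^\top (W^\top W) x$, then it depends on $W$ only through the symmetric $d\times d$ matrix $W^\top W$, which has $d(d+1)/2$ free entries. The sketch below checks that identity on a small example. The output form (no output-layer weights) is an assumption not spelled out in the abstract, and the matrix, input, and helper names are illustrative.

```python
def n_star(d):
    """Number of free entries of a symmetric d x d matrix."""
    return d * (d + 1) // 2


def quadratic_net(W, x):
    """Sum of squared pre-activations: sum_j (w_j . x)^2 (assumed output form)."""
    return sum(sum(w_i * x_i for w_i, x_i in zip(row, x)) ** 2 for row in W)


def gram(W):
    """Return W^T W as a d x d list of lists."""
    d = len(W[0])
    return [[sum(row[a] * row[b] for row in W) for b in range(d)] for a in range(d)]


def quadratic_form(M, x):
    """Return x^T M x."""
    d = len(x)
    return sum(x[a] * M[a][b] * x[b] for a in range(d) for b in range(d))


# Hypothetical example with m = 3 hidden units and d = 2 inputs.
W = [[1, 2], [3, 4], [0, -1]]
x = [1, -2]

print(quadratic_net(W, x), quadratic_form(gram(W), x))
print(n_star(2))
```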

Combining crowd-sourcing and deep learning to explore the meso-scale organization of shallow convection

Title Combining crowd-sourcing and deep learning to explore the meso-scale organization of shallow convection
Authors Stephan Rasp, Hauke Schulz, Sandrine Bony, Bjorn Stevens
Abstract Humans excel at detecting interesting patterns in images, for example those taken from satellites. This kind of anecdotal evidence can lead to the discovery of new phenomena. However, it is often difficult to gather enough data of subjective features for significant analysis. This paper presents an example of how crowd-sourcing and deep learning can be combined to explore satellite imagery at scale. In particular, the focus is on the organization of shallow cumulus convection in the trade wind regions. Shallow clouds play a large role in the Earth’s radiation balance yet are poorly represented in climate models. For this project four subjective patterns of organization were defined: Sugar, Flower, Fish and Gravel. On cloud labeling days at two institutes, 67 scientists screened 10,000 satellite images on a crowd-sourcing platform and classified almost 50,000 mesoscale cloud clusters. This dataset is then used as a training dataset for deep learning algorithms that make it possible to automate the pattern detection and create global climatologies of the four patterns. Analysis of the geographical distribution and large-scale environmental conditions indicates that the four patterns have some overlap with established modes of organization, such as open and closed cellular convection, but also differ in important ways. The results and dataset from this project suggest promising research questions. Further, this study illustrates that crowd-sourcing and deep learning complement each other well for the exploration of image datasets.
Tasks
Published 2019-06-05
URL https://arxiv.org/abs/1906.01906v2
PDF https://arxiv.org/pdf/1906.01906v2.pdf
PWC https://paperswithcode.com/paper/combining-crowd-sourcing-and-deep-learning-to
Repo
Framework

Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning

Title Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning
Authors Ying Shi, Wei Wei, Zhiming Zheng
Abstract Zero-shot learning (ZSL) aims to recognize novel object categories using the semantic representation of categories, and the key idea is to explore the knowledge of how the novel class is semantically related to the familiar classes. Some typical models learn the proper embedding between the image feature space and the semantic space, whilst it is important to learn discriminative features and comprise the coarse-to-fine image feature and semantic information. In this paper, we propose a discriminative embedding autoencoder with a regressor feedback model for ZSL. The encoder learns a mapping from the image feature space to the discriminative embedding space, which regulates both inter-class and intra-class distances between the learned features by a margin, making the learned features discriminative for object recognition. The regressor feedback learns to map the reconstructed samples back to the discriminative embedding and the semantic embedding, assisting the decoder to improve the quality of the samples and provide a generalization to the unseen classes. The proposed model is validated extensively on four benchmark datasets: SUN, CUB, AWA1, and AWA2. The experimental results show that our proposed model outperforms the state-of-the-art models, and significant improvements are achieved especially in generalized zero-shot learning (GZSL).
Tasks Object Recognition, Zero-Shot Learning
Published 2019-07-18
URL https://arxiv.org/abs/1907.08070v1
PDF https://arxiv.org/pdf/1907.08070v1.pdf
PWC https://paperswithcode.com/paper/discriminative-embedding-autoencoder-with-a
Repo
Framework

MortonNet: Self-Supervised Learning of Local Features in 3D Point Clouds

Title MortonNet: Self-Supervised Learning of Local Features in 3D Point Clouds
Authors Ali Thabet, Humam Alwassel, Bernard Ghanem
Abstract We present a self-supervised task on point clouds, in order to learn meaningful point-wise features that encode local structure around each point. Our self-supervised network, named MortonNet, operates directly on unstructured/unordered point clouds. Using a multi-layer RNN, MortonNet predicts the next point in a point sequence created by a popular and fast Space Filling Curve, the Morton-order curve. The final RNN state (coined Morton feature) is versatile and can be used in generic 3D tasks on point clouds. In fact, we show how Morton features can be used to significantly improve performance (+3% for 2 popular semantic segmentation algorithms) in the task of semantic segmentation of point clouds on the challenging and large-scale S3DIS dataset. We also show how MortonNet trained on S3DIS transfers well to another large-scale dataset, vKITTI, leading to an improvement over state-of-the-art of 3.8%. Finally, we use Morton features to train a much simpler and more stable model for part segmentation in ShapeNet. Our results show how our self-supervised task results in features that are useful for 3D segmentation tasks, and generalize well to other datasets.
Tasks Semantic Segmentation
Published 2019-03-30
URL http://arxiv.org/abs/1904.00230v1
PDF http://arxiv.org/pdf/1904.00230v1.pdf
PWC https://paperswithcode.com/paper/mortonnet-self-supervised-learning-of-local
Repo
Framework
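
The Morton-order (Z-order) curve mentioned in the abstract above turns 3D coordinates into a single integer by interleaving their bits, so that sorting by that integer gives a point sequence. A minimal sketch follows, assuming the points have already been quantized to non-negative integer grid coordinates; the function name `morton_code`, the 10-bit width, and the sample points are illustrative and not taken from the MortonNet implementation.

```python
def morton_code(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one integer.

    Bit i of x lands at position 3*i, of y at 3*i + 1, of z at 3*i + 2.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


# Hypothetical quantized points; a real point cloud would first be
# scaled and rounded onto an integer grid.
points = [(1, 1, 1), (0, 0, 0), (2, 0, 0), (0, 1, 0)]
ordered = sorted(points, key=lambda p: morton_code(*p))
print(ordered)
```

Points that are close on the curve tend to be close in space, which is what makes the resulting sequence a reasonable input for a next-point prediction task.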

SemEval-2015 Task 3: Answer Selection in Community Question Answering

Title SemEval-2015 Task 3: Answer Selection in Community Question Answering
Authors Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree
Abstract Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts. In this context, we organized SemEval-2015 Task 3 on “Answer Selection in cQA”, which included two subtasks: (a) classifying answers as “good”, “bad”, or “potentially relevant” with respect to the question, and (b) answering a YES/NO question with “yes”, “no”, or “unsure”, based on the list of all answers. We set subtask A for Arabic and English on two relatively different cQA domains, i.e., the Qatar Living website for English, and a Quran-related website for Arabic. We used crowdsourcing on Amazon Mechanical Turk to label a large English training dataset, which we released to the research community. Thirteen teams participated in the challenge with a total of 61 submissions: 24 primary and 37 contrastive. The best systems achieved an official score (macro-averaged F1) of 57.19 and 63.7 for the English subtasks A and B, and 78.55 for the Arabic subtask A.
Tasks Answer Selection, Community Question Answering, Question Answering
Published 2019-11-26
URL https://arxiv.org/abs/1911.11403v1
PDF https://arxiv.org/pdf/1911.11403v1.pdf
PWC https://paperswithcode.com/paper/semeval-2015-task-3-answer-selection-in-1
Repo
Framework

An End-to-End Framework for Cold Question Routing in Community Question Answering Services

Title An End-to-End Framework for Cold Question Routing in Community Question Answering Services
Authors Jiankai Sun, Jie Zhao, Huan Sun, Srinivasan Parthasarathy
Abstract Routing newly posted questions (a.k.a. cold questions) to potential answerers with suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task. The existing methods either focus only on embedding the graph structural information and are less effective for newly posted questions, or adopt manually engineered feature vectors that are not as representative as the graph embedding methods. Therefore, we propose to address the challenge of leveraging heterogeneous graph and textual information for cold question routing by designing an end-to-end framework that jointly learns CQA node embeddings and finds the best answerers for cold questions. We conducted extensive experiments to confirm the usefulness of incorporating the textual information from question tags and demonstrate that an end-to-end framework can achieve promising performance on routing newly posted questions asked by both existing users and newly registered users.
Tasks Community Question Answering, Graph Embedding, Question Answering
Published 2019-11-22
URL https://arxiv.org/abs/1911.11017v1
PDF https://arxiv.org/pdf/1911.11017v1.pdf
PWC https://paperswithcode.com/paper/an-end-to-end-framework-for-cold-question
Repo
Framework

Machine Learning Techniques for Biomedical Image Segmentation: An Overview of Technical Aspects and Introduction to State-of-Art Applications

Title Machine Learning Techniques for Biomedical Image Segmentation: An Overview of Technical Aspects and Introduction to State-of-Art Applications
Authors Hyunseok Seo, Masoud Badiei Khuzani, Varun Vasudevan, Charles Huang, Hongyi Ren, Ruoxiu Xiao, Xiao Jia, Lei Xing
Abstract In recent years, significant progress has been made in developing more accurate and efficient machine learning algorithms for segmentation of medical and natural images. In this review article, we highlight the imperative role of machine learning algorithms in enabling efficient and accurate segmentation in the field of medical imaging. We specifically focus on several key studies pertaining to the application of machine learning methods to biomedical image segmentation. We review classical machine learning algorithms such as Markov random fields, k-means clustering, random forest, etc. Although such classical learning models are often less accurate compared to the deep learning techniques, they are often more sample efficient and have a less complex structure. We also review different deep learning architectures, such as the artificial neural networks (ANNs), the convolutional neural networks (CNNs), and the recurrent neural networks (RNNs), and present the segmentation results attained by those learning models that were published in the past three years. We highlight the successes and limitations of each machine learning paradigm. In addition, we discuss several challenges related to the training of different machine learning models, and we present some heuristics to address those challenges.
Tasks Semantic Segmentation
Published 2019-11-06
URL https://arxiv.org/abs/1911.02521v1
PDF https://arxiv.org/pdf/1911.02521v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-techniques-for-biomedical
Repo
Framework

Collective Mobile Sequential Recommendation: A Recommender System for Multiple Taxicabs

Title Collective Mobile Sequential Recommendation: A Recommender System for Multiple Taxicabs
Authors Tongwen Wu, Zizhen Zhang, Yanzhi Li, Jiahai Wang
Abstract Mobile sequential recommendation was originally designed to find a promising route for a single taxicab. Directly applying it for multiple taxicabs may cause an excessive overlap of recommended routes. The multi-taxicab recommendation problem is challenging and has been less studied. In this paper, we first formalize a collective mobile sequential recommendation problem based on a classic mathematical model, which characterizes time-varying influence among competing taxicabs. Next, we propose a new evaluation metric for a collection of taxicab routes aimed to minimize the sum of potential travel time. We then develop an efficient algorithm to calculate the metric and design a greedy recommendation method to approximate the solution. Finally, numerical experiments show the superiority of our methods. In trace-driven simulation, the set of routes recommended by our method significantly outperforms those obtained by conventional methods.
Tasks Recommendation Systems
Published 2019-06-22
URL https://arxiv.org/abs/1906.09372v1
PDF https://arxiv.org/pdf/1906.09372v1.pdf
PWC https://paperswithcode.com/paper/collective-mobile-sequential-recommendation-a
Repo
Framework

Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero and NE Syrtis

Title Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero and NE Syrtis
Authors Murat Dundar, Bethany L. Ehlmann, Ellen K. Leask
Abstract A hierarchical Bayesian classifier is trained at pixel scale with spectral data from the CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) imagery. Its utility in detecting rare phases is demonstrated with new geologic discoveries near the Mars-2020 rover landing site. Akaganeite is found in sediments on the Jezero crater floor and in fluvial deposits at NE Syrtis. Jarosite and silica are found on the Jezero crater floor while chlorite-smectite and Al phyllosilicates are found in the Jezero crater walls. These detections point to a multi-stage, multi-chemistry history of water in Jezero crater and the surrounding region and provide new information for guiding the Mars-2020 rover’s landed exploration. In particular, the akaganeite, silica, and jarosite in the floor deposits suggest either a later episode of salty, Fe-rich waters that post-date Jezero delta or groundwater alteration of portions of the Jezero sedimentary sequence.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02387v1
PDF https://arxiv.org/pdf/1909.02387v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-driven-new-geologic
Repo
Framework

Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks

Title Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks
Authors Tian Ding, Dawei Li, Ruoyu Sun
Abstract Does over-parameterization eliminate sub-optimal local minima for neural network problems? On one hand, existing positive results do not prove the claim, but often weaker claims. On the other hand, existing negative results have strong assumptions on the activation functions and/or data samples, causing a large gap with positive results. It was unclear before whether there is a clean answer of “yes” or “no”. In this paper, we answer this question with a strong negative result. In particular, we prove that for deep and over-parameterized networks, sub-optimal local minima exist for generic input data samples and generic nonlinear activation. This is the setting widely studied in the global landscape of over-parameterized networks, thus our result corrects a possible misconception that “over-parameterization eliminates sub-optimal local-min”. Our construction is based on fundamental optimization analysis, and thus rather principled.
Tasks
Published 2019-11-04
URL https://arxiv.org/abs/1911.01413v2
PDF https://arxiv.org/pdf/1911.01413v2.pdf
PWC https://paperswithcode.com/paper/sub-optimal-local-minima-exist-for-almost-all
Repo
Framework

Estimating Buildings’ Parameters over Time Including Prior Knowledge

Title Estimating Buildings’ Parameters over Time Including Prior Knowledge
Authors Nilavra Pathak, James Foulds, Nirmalya Roy, Nilanjan Banerjee, Ryan Robucci
Abstract Modeling buildings’ heat dynamics is a complex process which depends on various factors including weather, building thermal capacity, insulation preservation, and residents’ behavior. Gray-box models offer a causal inference of those dynamics expressed in a few parameters specific to built environments. These parameters can provide compelling insights into the characteristics of building artifacts and have various applications such as forecasting HVAC usage, indoor temperature control monitoring of built environments, etc. In this paper, we present a systematic study of modeling buildings’ thermal characteristics and thus derive the parameters of built conditions with a Bayesian approach. We build a Bayesian state-space model that can adapt and incorporate buildings’ thermal equations and propose a generalized solution that can easily adapt prior knowledge regarding the parameters. We show that a faster approximate approach using variational inference for parameter estimation can provide parameters similar to those of a more time-consuming Markov Chain Monte Carlo (MCMC) approach. We perform extensive evaluations on two datasets to understand the generative process and show that the Bayesian approach is more interpretable. We further study the effects of prior selection for the model parameters and transfer learning, where we learn parameters from one season and use them to fit the model in the other. We perform extensive evaluations on controlled and real data traces to enumerate buildings’ parameters within a 95% credible interval.
Tasks Causal Inference, Transfer Learning
Published 2019-01-09
URL http://arxiv.org/abs/1901.07469v3
PDF http://arxiv.org/pdf/1901.07469v3.pdf
PWC https://paperswithcode.com/paper/estimating-buildings-parameters-over-time
Repo
Framework