Paper Group ANR 1468
A Machine Learning framework for Sleeping Cell Detection in a Smart-city IoT Telecommunications Infrastructure. Deep Learning Multidimensional Projections. Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations. On the relationship between multitask neural networks and multitask Gaussian Processes. Stationary …
A Machine Learning framework for Sleeping Cell Detection in a Smart-city IoT Telecommunications Infrastructure
Title | A Machine Learning framework for Sleeping Cell Detection in a Smart-city IoT Telecommunications Infrastructure |
Authors | Orestes Manzanilla-Salazar, Filippo Malandra, Hakim Mellah, Constant Wette, Brunilde Sanso |
Abstract | The smooth operation of largely deployed Internet of Things (IoT) applications will depend on, among other things, effective infrastructure failure detection. Access failures in wireless network Base Stations (BSs) produce a phenomenon called “sleeping cells”, which can render a cell catatonic without triggering any alarms or provoking immediate effects on cell performance, making them difficult to discover. To detect this kind of failure, we propose a Machine Learning (ML) framework based on the use of Key Performance Indicator (KPI) statistics from the BS under study, as well as those of the neighboring BSs with a propensity to have their performance affected by the failure. A simple way to define neighbors is to use adjacency in Voronoi diagrams. In this paper, we propose a much more realistic approach based on the nature of radio propagation and the way devices choose the BS to which they send access requests. We gather data from large-scale simulators that use real location data for BSs and IoT devices and pose the detection problem as a supervised binary classification problem. We measure the effects of the size of the time aggregations of the data, the level of traffic, and the parameters of the neighborhood definition on detection performance. The Extra Trees and Naive Bayes classifiers achieve Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) scores of 0.996 and 0.993, respectively, with a False Positive Rate (FPR) under 5%. The proposed framework holds potential for other pattern recognition tasks in smart-city wireless infrastructures that would enable the monitoring, prediction and improvement of the Quality of Service (QoS) experienced by IoT applications. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01092v2 |
https://arxiv.org/pdf/1910.01092v2.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-framework-for-sleeping |
Repo | |
Framework | |
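The framework's final step poses sleeping-cell detection as supervised binary classification scored by ROC AUC. Below is a minimal sketch of that step with scikit-learn's Extra Trees and Naive Bayes classifiers; the synthetic KPI matrix, failure rate, and feature count are placeholders, not the authors' simulated smart-city data.

```python
# Hedged sketch: sleeping-cell detection as binary classification on KPI statistics.
# Features and labels below are synthetic stand-ins for the simulator output.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_kpis = 5000, 12                     # rows: (BS, time window); columns: KPI stats
X = rng.normal(size=(n_samples, n_kpis))
y = rng.binomial(1, 0.1, size=n_samples)         # 1 = sleeping cell
X[y == 1, :4] -= 1.0                             # failures depress a few KPIs

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

for name, clf in [("Extra Trees", ExtraTreesClassifier(n_estimators=200, random_state=0)),
                  ("Naive Bayes", GaussianNB())]:
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]
    print(name, "ROC AUC:", round(roc_auc_score(y_te, scores), 3))
```

In practice the feature vector for each time aggregation would concatenate KPI statistics of the target BS and of its propagation-defined neighbors, as the abstract describes.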
Deep Learning Multidimensional Projections
Title | Deep Learning Multidimensional Projections |
Authors | Mateus Espadoto, Nina S. T. Hirata, Alexandru C. Telea |
Abstract | Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ability to visually separate distinct data clusters. However, such methods are computationally expensive for large datasets, suffer from stability problems, and cannot directly handle out-of-sample data. We propose a learning approach to construct such projections. We train a deep neural network based on a collection of samples from a given data universe, and their corresponding projections, and next use the network to infer projections of data from the same, or similar, universes. Our approach generates projections with similar characteristics as the learned ones, is computationally two to three orders of magnitude faster than SNE-class methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique. We demonstrate our proposal on several real-world high dimensional datasets from machine learning. |
Tasks | Dimensionality Reduction |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.07958v1 |
http://arxiv.org/pdf/1902.07958v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-multidimensional-projections |
Repo | |
Framework | |
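A minimal sketch of the learn-a-projection idea described above: fit t-SNE on a training sample, then train a regressor to reproduce the 2-D coordinates from the features so that new points can be projected without rerunning t-SNE. The dataset, the sklearn MLPRegressor stand-in (the paper trains a deeper network), and the layer sizes are assumptions.

```python
# Hedged sketch: learn a neural surrogate for a t-SNE projection, then apply it
# to out-of-sample data that t-SNE itself never saw.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

X, _ = load_digits(return_X_y=True)
X_train, X_new = X[:1200], X[1200:]            # X_new plays the role of out-of-sample data

Y_train = TSNE(n_components=2, random_state=0).fit_transform(X_train)

proj_net = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=500, random_state=0)
proj_net.fit(X_train, Y_train)                 # learn the feature -> 2-D coordinate mapping
Y_new = proj_net.predict(X_new)                # fast projection of unseen points
print(Y_new.shape)
```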
Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations
Title | Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations |
Authors | Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn |
Abstract | Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning, however it poses the problem of 'truth inference', as individual workers cannot be wholly trusted to provide reliable annotations. Research into models of annotation aggregation attempts to infer a latent 'true' annotation, which has been shown to improve the utility of crowd-sourced data. However, existing techniques beat simple baselines only in low redundancy settings, where the number of annotations per instance is low ($\le 3$), or in situations where workers are unreliable and produce low quality annotations (e.g., through spamming, random, or adversarial behaviours). As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated ($\ge 5$ annotations per instance), and the vast majority of workers produce high quality outputs. In these settings, the majority vote heuristic performs very well, and most truth inference models underperform this simple baseline. We propose a novel technique, based on a Bayesian graphical model with conjugate priors, and simple iterative expectation-maximisation inference. Our technique produces performance competitive with the state-of-the-art benchmark methods, and is the only method that significantly outperforms the majority vote heuristic at one-sided level 0.025, as shown by significance tests. Moreover, our technique is simple, is implemented in only 50 lines of code, and trains in seconds. |
Tasks | |
Published | 2019-02-24 |
URL | http://arxiv.org/abs/1902.08918v1 |
http://arxiv.org/pdf/1902.08918v1.pdf | |
PWC | https://paperswithcode.com/paper/truth-inference-at-scale-a-bayesian-model-for |
Repo | |
Framework | |
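For context, here is a minimal sketch of the majority-vote heuristic the paper benchmarks against (not the authors' Bayesian model); the annotation dictionary is a toy stand-in for a highly redundant crowd-sourced dataset.

```python
# Hedged sketch: the majority-vote baseline for truth inference.
from collections import Counter

annotations = {
    "item1": {"w1": "A", "w2": "A", "w3": "B", "w4": "A", "w5": "A"},
    "item2": {"w1": "B", "w2": "B", "w3": "B", "w4": "A", "w5": "B"},
}

def majority_vote(item_labels):
    """Return the most frequent label for one item (ties broken arbitrarily)."""
    return Counter(item_labels.values()).most_common(1)[0][0]

truth = {item: majority_vote(labels) for item, labels in annotations.items()}
print(truth)  # {'item1': 'A', 'item2': 'B'}
```

The paper's point is that with five or more annotations per instance and mostly reliable workers, this baseline is already hard to beat.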
On the relationship between multitask neural networks and multitask Gaussian Processes
Title | On the relationship between multitask neural networks and multitask Gaussian Processes |
Authors | Karthikeyan K, Shubham Kumar Bharti, Piyush Rai |
Abstract | Despite the effectiveness of multitask deep neural networks (MTDNNs), there is limited theoretical understanding of how information is shared across different tasks in an MTDNN. In this work, we establish a formal connection between MTDNNs with infinitely-wide hidden layers and multitask Gaussian Processes (GPs). We derive multitask GP kernels corresponding to both single-layer and deep multitask Bayesian neural networks (MTBNNs) and show that information among different tasks is shared primarily due to correlation across the last-layer weights of the MTBNN and shared hyper-parameters, which is contrary to the popular hypothesis that information is shared because of shared intermediate-layer weights. Our construction enables using multitask GPs to perform efficient Bayesian inference for the equivalent MTDNN with infinitely-wide hidden layers. Prior work on the connection between deep neural networks and GPs for single-task settings can be seen as special cases of our construction. We also present an adaptive multitask neural network architecture that corresponds to a multitask GP with more flexible kernels, such as Linear Model of Coregionalization (LMC) and Cross-Coregionalization (CC) kernels. We provide experimental results to further illustrate these ideas on synthetic and real datasets. |
Tasks | Bayesian Inference, Gaussian Processes |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.05723v1 |
https://arxiv.org/pdf/1912.05723v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-relationship-between-multitask-neural |
Repo | |
Framework | |
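One of the kernel families mentioned above, the Linear Model of Coregionalization (LMC), combines base kernels on the inputs with task-mixing matrices: K((x,t),(x',t')) = sum_q B_q[t,t'] k_q(x,x') with B_q = A_q A_q^T positive semidefinite. Below is a hedged numpy sketch of that covariance; the RBF base kernels, ranks, and data are illustrative, not the paper's construction.

```python
# Hedged sketch of an LMC multitask kernel evaluated on a small batch of inputs.
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def lmc_kernel(X1, t1, X2, t2, A_list, lengthscales):
    """Covariance between inputs X1 (task indices t1) and X2 (task indices t2)."""
    K = np.zeros((len(X1), len(X2)))
    for A, ls in zip(A_list, lengthscales):
        B = A @ A.T                                   # coregionalization matrix
        K += B[np.ix_(t1, t2)] * rbf(X1, X2, ls)      # task mixing times input kernel
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))                           # 6 inputs, 3 features
t = np.array([0, 0, 1, 1, 2, 2])                      # task index of each input
A_list = [rng.normal(size=(3, 2)) for _ in range(2)]  # 3 tasks, rank-2 mixing, Q = 2
K = lmc_kernel(X, t, X, t, A_list, lengthscales=[1.0, 2.0])
print(K.shape)                                        # (6, 6) multitask covariance
```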
Stationary Points of Shallow Neural Networks with Quadratic Activation Function
Title | Stationary Points of Shallow Neural Networks with Quadratic Activation Function |
Authors | David Gamarnik, Eren C. Kızıldağ, Ilias Zadik |
Abstract | We consider the problem of learning shallow neural networks with quadratic activations and planted weight matrix $W^*\in\mathbb{R}^{m\times d}$, where $m$ is the width of the hidden layer and $d\leqslant m$ is the dimension of data having centered i.i.d. coordinates with finite fourth moment. We establish that the landscape of the population risk $\mathcal{L}(W)$ admits an energy barrier separating rank-deficient $W$: if $W\in\mathbb{R}^{m\times d}$ with ${\rm rank}(W)<d$, then $\mathcal{L}(W)$ is bounded away from zero by an amount we quantify. We then establish that all full-rank stationary points of $\mathcal{L}(\cdot)$ are necessarily global optima. These two results suggest a simple explanation for the success of gradient descent in training such networks, when properly initialized: the gradient descent algorithm finds a global optimum due to the absence of spurious stationary points within the set of full-rank matrices. We then show that if $W^*\in\mathbb{R}^{m\times d}$ has centered i.i.d. entries with unit variance and finite fourth moment, and is sufficiently wide, that is $m>Cd^2$ for a large $C$, then it is easy to construct a full-rank matrix $W$ with population risk below the energy barrier, starting from which gradient descent is guaranteed to converge to a global optimum. Our final focus is on sample complexity: we identify a simple necessary and sufficient geometric condition, not retrospective in manner, on the training data under which any minimizer of the empirical loss has necessarily zero generalization error. We show that as soon as $n\geqslant n^*=d(d+1)/2$, random data enjoys this geometric condition almost surely. At the same time we show that if $n<n^*$, then when the data has centered i.i.d. coordinates, there always exists a matrix $W$ with zero empirical risk, but with population risk bounded away from zero by the same amount as rank-deficient matrices. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01599v2 |
https://arxiv.org/pdf/1912.01599v2.pdf | |
PWC | https://paperswithcode.com/paper/stationary-points-of-shallow-neural-networks |
Repo | |
Framework | |
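The model studied above is f_W(x) = sum_j (w_j^T x)^2 = ||Wx||^2 with a planted W*. A hedged numpy sketch of gradient descent on the empirical squared loss in that setting follows; the sizes, initialization scale, step size, and iteration count are illustrative and may need tuning, and the code only demonstrates the setup rather than the paper's guarantees.

```python
# Hedged sketch: gradient descent on a quadratic-activation network with planted weights.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 30, 200                           # input dim, hidden width, samples
W_star = rng.normal(size=(m, d)) / np.sqrt(m)  # planted weight matrix
X = rng.normal(size=(n, d))
y = np.sum((X @ W_star.T) ** 2, axis=1)        # labels ||W* x||^2

def risk(W):
    """Empirical squared loss of the quadratic network f_W(x) = ||W x||^2."""
    return np.mean((np.sum((X @ W.T) ** 2, axis=1) - y) ** 2)

W = rng.normal(size=(m, d)) / np.sqrt(m)       # full-rank random initialization
lr = 2e-3
print("initial empirical risk:", risk(W))
for _ in range(10000):
    resid = np.sum((X @ W.T) ** 2, axis=1) - y
    grad = (4.0 / n) * W @ (X.T * resid) @ X   # gradient of the mean squared loss
    W -= lr * grad
print("final empirical risk:  ", risk(W))
```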
Combining crowd-sourcing and deep learning to explore the meso-scale organization of shallow convection
Title | Combining crowd-sourcing and deep learning to explore the meso-scale organization of shallow convection |
Authors | Stephan Rasp, Hauke Schulz, Sandrine Bony, Bjorn Stevens |
Abstract | Humans excel at detecting interesting patterns in images, for example those taken from satellites. This kind of anecdotal evidence can lead to the discovery of new phenomena. However, it is often difficult to gather enough data of subjective features for significant analysis. This paper presents an example of how crowd-sourcing and deep learning can be combined to explore satellite imagery at scale. In particular, the focus is on the organization of shallow cumulus convection in the trade wind regions. Shallow clouds play a large role in the Earth’s radiation balance yet are poorly represented in climate models. For this project four subjective patterns of organization were defined: Sugar, Flower, Fish and Gravel. On cloud labeling days at two institutes, 67 scientists screened 10,000 satellite images on a crowd-sourcing platform and classified almost 50,000 mesoscale cloud clusters. This dataset is then used as a training dataset for deep learning algorithms that make it possible to automate the pattern detection and create global climatologies of the four patterns. Analysis of the geographical distribution and large-scale environmental conditions indicates that the four patterns have some overlap with established modes of organization, such as open and closed cellular convection, but also differ in important ways. The results and dataset from this project suggest promising research questions. Further, this study illustrates that crowd-sourcing and deep learning complement each other well for the exploration of image datasets. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01906v2 |
https://arxiv.org/pdf/1906.01906v2.pdf | |
PWC | https://paperswithcode.com/paper/combining-crowd-sourcing-and-deep-learning-to |
Repo | |
Framework | |
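A hedged sketch of the automation step: a small CNN mapping image patches to the four crowd-labelled patterns (Sugar, Flower, Fish, Gravel). The architecture and the random tensors standing in for satellite images and crowd labels are assumptions, and this classification-style stand-in is simpler than the models the authors actually trained on the crowd-sourced dataset.

```python
# Hedged PyTorch sketch: a tiny CNN trained on crowd-labelled cloud patterns.
import torch
import torch.nn as nn

classes = ["Sugar", "Flower", "Fish", "Gravel"]

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, len(classes)),
)

images = torch.randn(8, 3, 128, 128)          # stand-in for satellite image patches
labels = torch.randint(0, len(classes), (8,)) # stand-in for crowd-sourced labels

loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()                               # one illustrative training step
print("batch loss:", float(loss))
```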
Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning
Title | Discriminative Embedding Autoencoder with a Regressor Feedback for Zero-Shot Learning |
Authors | Ying Shi, Wei Wei, Zhiming Zheng |
Abstract | Zero-shot learning (ZSL) aims to recognize novel object categories using the semantic representation of categories, and the key idea is to explore the knowledge of how the novel class is semantically related to the familiar classes. Some typical models are to learn the proper embedding between the image feature space and the semantic space, whilst it is important to learn discriminative features and comprise the coarse-to-fine image feature and semantic information. In this paper, we propose a discriminative embedding autoencoder with a regressor feedback model for ZSL. The encoder learns a mapping from the image feature space to the discriminative embedding space, which regulates both inter-class and intra-class distances between the learned features by a margin, making the learned features discriminative for object recognition. The regressor feedback learns to map the reconstructed samples back to the discriminative embedding and the semantic embedding, assisting the decoder to improve the quality of the samples and provide a generalization to the unseen classes. The proposed model is validated extensively on four benchmark datasets: SUN, CUB, AWA1, AWA2; the experimental results show that our proposed model outperforms the state-of-the-art models, and especially in generalized zero-shot learning (GZSL), significant improvements are achieved. |
Tasks | Object Recognition, Zero-Shot Learning |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08070v1 |
https://arxiv.org/pdf/1907.08070v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-embedding-autoencoder-with-a |
Repo | |
Framework | |
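For orientation, here is a minimal sketch of a generic embedding-based ZSL baseline (not the authors' discriminative autoencoder): learn a map from image features to class attribute vectors on seen classes, then label an unseen-class sample by its nearest class attribute. All dimensions and the synthetic data are assumptions.

```python
# Hedged sketch of embedding-based zero-shot classification with synthetic data.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
feat_dim, attr_dim = 64, 16
seen_attrs = rng.normal(size=(4, attr_dim))
# Unseen classes share attributes with seen ones (the premise of ZSL): build them
# as mixtures of seen-class attribute vectors plus a small novel component.
unseen_attrs = np.array([0.7 * seen_attrs[0] + 0.3 * seen_attrs[1],
                         0.6 * seen_attrs[2] + 0.4 * seen_attrs[3]])
unseen_attrs += 0.1 * rng.normal(size=unseen_attrs.shape)

W_true = rng.normal(size=(attr_dim, feat_dim))            # hidden attribute -> feature map
X_seen = np.vstack([seen_attrs[c] @ W_true + 0.1 * rng.normal(size=(50, feat_dim))
                    for c in range(4)])
Y_seen = np.repeat(seen_attrs, 50, axis=0)

embed = Ridge(alpha=1.0).fit(X_seen, Y_seen)              # image feature -> attribute map

x_test = unseen_attrs[0] @ W_true + 0.1 * rng.normal(size=feat_dim)  # unseen-class sample
sims = cosine_similarity(embed.predict(x_test[None]), unseen_attrs)
print("predicted unseen class:", int(np.argmax(sims)))    # 0 expected
```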
MortonNet: Self-Supervised Learning of Local Features in 3D Point Clouds
Title | MortonNet: Self-Supervised Learning of Local Features in 3D Point Clouds |
Authors | Ali Thabet, Humam Alwassel, Bernard Ghanem |
Abstract | We present a self-supervised task on point clouds, in order to learn meaningful point-wise features that encode local structure around each point. Our self-supervised network, named MortonNet, operates directly on unstructured/unordered point clouds. Using a multi-layer RNN, MortonNet predicts the next point in a point sequence created by a popular and fast Space Filling Curve, the Morton-order curve. The final RNN state (coined Morton feature) is versatile and can be used in generic 3D tasks on point clouds. In fact, we show how Morton features can be used to significantly improve performance (+3% for 2 popular semantic segmentation algorithms) in the task of semantic segmentation of point clouds on the challenging and large-scale S3DIS dataset. We also show how MortonNet trained on S3DIS transfers well to another large-scale dataset, vKITTI, leading to an improvement over state-of-the-art of 3.8%. Finally, we use Morton features to train a much simpler and more stable model for part segmentation in ShapeNet. Our results show how our self-supervised task results in features that are useful for 3D segmentation tasks, and generalize well to other datasets. |
Tasks | Semantic Segmentation |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00230v1 |
http://arxiv.org/pdf/1904.00230v1.pdf | |
PWC | https://paperswithcode.com/paper/mortonnet-self-supervised-learning-of-local |
Repo | |
Framework | |
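The Morton-order (Z-order) curve that defines MortonNet's point sequences can be computed by quantizing coordinates to a grid and interleaving their bits. A hedged sketch follows; the grid resolution and the random point cloud are illustrative.

```python
# Hedged sketch: Morton (Z-order) codes for a 3D point cloud.
import numpy as np

def part1by2(n):
    """Spread the 10 low bits of n so they occupy every third bit position."""
    n &= 0x000003FF
    n = (n | (n << 16)) & 0xFF0000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton_code(x, y, z):
    """Interleave three 10-bit integer coordinates into one Morton code."""
    return (part1by2(z) << 2) | (part1by2(y) << 1) | part1by2(x)

points = np.random.default_rng(0).uniform(size=(1000, 3))        # toy point cloud
grid = ((points - points.min(0)) / np.ptp(points, axis=0) * 1023).astype(int)
codes = np.array([morton_code(*p) for p in grid])
order = np.argsort(codes)               # walk the space-filling curve
print(points[order][:5])
```

Sorting points by these codes yields sequences in which consecutive points tend to be spatially close, which is the kind of sequence the RNN is trained to continue.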
SemEval-2015 Task 3: Answer Selection in Community Question Answering
Title | SemEval-2015 Task 3: Answer Selection in Community Question Answering |
Authors | Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree |
Abstract | Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts. In this context, we organized SemEval-2015 Task 3 on “Answer Selection in cQA”, which included two subtasks: (a) classifying answers as “good”, “bad”, or “potentially relevant” with respect to the question, and (b) answering a YES/NO question with “yes”, “no”, or “unsure”, based on the list of all answers. We set subtask A for Arabic and English on two relatively different cQA domains, i.e., the Qatar Living website for English, and a Quran-related website for Arabic. We used crowdsourcing on Amazon Mechanical Turk to label a large English training dataset, which we released to the research community. Thirteen teams participated in the challenge with a total of 61 submissions: 24 primary and 37 contrastive. The best systems achieved an official score (macro-averaged F1) of 57.19 and 63.7 for the English subtasks A and B, and 78.55 for the Arabic subtask A. |
Tasks | Answer Selection, Community Question Answering, Question Answering |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11403v1 |
https://arxiv.org/pdf/1911.11403v1.pdf | |
PWC | https://paperswithcode.com/paper/semeval-2015-task-3-answer-selection-in-1 |
Repo | |
Framework | |
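A small sketch of the official metric, macro-averaged F1 over the three answer classes; the toy gold and predicted labels are stand-ins for system output on the cQA test set.

```python
# Hedged sketch: macro-averaged F1 over the "good" / "bad" / "potential" classes.
from sklearn.metrics import f1_score

gold = ["good", "bad", "potential", "good", "bad", "good", "potential", "bad"]
pred = ["good", "bad", "good",      "good", "bad", "bad",  "potential", "bad"]

macro_f1 = f1_score(gold, pred, average="macro")   # unweighted mean of per-class F1
print(f"macro-F1: {100 * macro_f1:.2f}")
```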
An End-to-End Framework for Cold Question Routing in Community Question Answering Services
Title | An End-to-End Framework for Cold Question Routing in Community Question Answering Services |
Authors | Jiankai Sun, Jie Zhao, Huan Sun, Srinivasan Parthasarathy |
Abstract | Routing newly posted questions (a.k.a. cold questions) to potential answerers with suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task. The existing methods either focus only on embedding the graph structural information and are less effective for newly posted questions, or adopt manually engineered feature vectors that are not as representative as the graph embedding methods. Therefore, we propose to address the challenge of leveraging heterogeneous graph and textual information for cold question routing by designing an end-to-end framework that jointly learns CQA node embeddings and finds the best answerers for cold questions. We conducted extensive experiments to confirm the usefulness of incorporating the textual information from question tags and demonstrate that an end-to-end framework can achieve promising performance on routing newly posted questions asked by both existing users and newly registered users. |
Tasks | Community Question Answering, Graph Embedding, Question Answering |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.11017v1 |
https://arxiv.org/pdf/1911.11017v1.pdf | |
PWC | https://paperswithcode.com/paper/an-end-to-end-framework-for-cold-question |
Repo | |
Framework | |
Machine Learning Techniques for Biomedical Image Segmentation: An Overview of Technical Aspects and Introduction to State-of-Art Applications
Title | Machine Learning Techniques for Biomedical Image Segmentation: An Overview of Technical Aspects and Introduction to State-of-Art Applications |
Authors | Hyunseok Seo, Masoud Badiei Khuzani, Varun Vasudevan, Charles Huang, Hongyi Ren, Ruoxiu Xiao, Xiao Jia, Lei Xing |
Abstract | In recent years, significant progress has been made in developing more accurate and efficient machine learning algorithms for segmentation of medical and natural images. In this review article, we highlight the imperative role of machine learning algorithms in enabling efficient and accurate segmentation in the field of medical imaging. We specifically focus on several key studies pertaining to the application of machine learning methods to biomedical image segmentation. We review classical machine learning algorithms such as Markov random fields, k-means clustering, random forest, etc. Although such classical learning models are often less accurate compared to the deep learning techniques, they are often more sample efficient and have a less complex structure. We also review different deep learning architectures, such as the artificial neural networks (ANNs), the convolutional neural networks (CNNs), and the recurrent neural networks (RNNs), and present the segmentation results attained by those learning models that were published in the past three years. We highlight the successes and limitations of each machine learning paradigm. In addition, we discuss several challenges related to the training of different machine learning models, and we present some heuristics to address those challenges. |
Tasks | Semantic Segmentation |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02521v1 |
https://arxiv.org/pdf/1911.02521v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-techniques-for-biomedical |
Repo | |
Framework | |
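As a concrete instance of the classical methods the review covers, here is a minimal k-means segmentation sketch that clusters per-pixel intensities into regions. The synthetic "scan" and the choice of three clusters are illustrative.

```python
# Hedged sketch: classical k-means clustering as a simple unsupervised segmentation.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0                       # bright "organ" region
image[24:40, 24:40] = 2.0                       # brighter "lesion" region
image += 0.2 * rng.normal(size=image.shape)     # acquisition noise

pixels = image.reshape(-1, 1)                   # one intensity feature per pixel
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(pixels)
segmentation = labels.reshape(image.shape)      # per-pixel region assignment
print(np.unique(segmentation, return_counts=True))
```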
Collective Mobile Sequential Recommendation: A Recommender System for Multiple Taxicabs
Title | Collective Mobile Sequential Recommendation: A Recommender System for Multiple Taxicabs |
Authors | Tongwen Wu, Zizhen Zhang, Yanzhi Li, Jiahai Wang |
Abstract | Mobile sequential recommendation was originally designed to find a promising route for a single taxicab. Directly applying it for multiple taxicabs may cause an excessive overlap of recommended routes. The multi-taxicab recommendation problem is challenging and has been less studied. In this paper, we first formalize a collective mobile sequential recommendation problem based on a classic mathematical model, which characterizes time-varying influence among competing taxicabs. Next, we propose a new evaluation metric for a collection of taxicab routes aimed to minimize the sum of potential travel time. We then develop an efficient algorithm to calculate the metric and design a greedy recommendation method to approximate the solution. Finally, numerical experiments show the superiority of our methods. In trace-driven simulation, the set of routes recommended by our method significantly outperforms those obtained by conventional methods. |
Tasks | Recommendation Systems |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09372v1 |
https://arxiv.org/pdf/1906.09372v1.pdf | |
PWC | https://paperswithcode.com/paper/collective-mobile-sequential-recommendation-a |
Repo | |
Framework | |
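A hedged toy sketch of the collective idea: greedily extend each cab's route with the pickup point that adds the least expected travel time, removing chosen points so that recommended routes do not overlap. The travel-time matrix, pickup probabilities, and scoring rule are illustrative and are not the paper's evaluation metric.

```python
# Hedged sketch: greedy, non-overlapping route construction for several taxicabs.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_cabs, route_len = 20, 3, 4
travel = rng.uniform(1, 10, size=(n_points, n_points))   # minutes between pickup points
np.fill_diagonal(travel, 0)
p_pickup = rng.uniform(0.1, 0.9, size=n_points)          # pickup probability per point

free = set(range(n_points))
starts = rng.choice(n_points, size=n_cabs, replace=False)
routes = [[int(s)] for s in starts]
free -= {int(s) for s in starts}

for _ in range(route_len - 1):
    for route in routes:                                  # round-robin over cabs
        here = route[-1]
        # expected wasted driving time if the cab goes to j and misses a pickup there
        cost = {j: travel[here, j] * (1 - p_pickup[j]) for j in free}
        nxt = min(cost, key=cost.get)
        route.append(nxt)
        free.discard(nxt)

print(routes)
```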
Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero and NE Syrtis
Title | Machine-Learning-Driven New Geologic Discoveries at Mars Rover Landing Sites: Jezero and NE Syrtis |
Authors | Murat Dundar, Bethany L. Ehlmann, Ellen K. Leask |
Abstract | A hierarchical Bayesian classifier is trained at pixel scale with spectral data from the CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) imagery. Its utility in detecting rare phases is demonstrated with new geologic discoveries near the Mars-2020 rover landing site. Akaganeite is found in sediments on the Jezero crater floor and in fluvial deposits at NE Syrtis. Jarosite and silica are found on the Jezero crater floor while chlorite-smectite and Al phyllosilicates are found in the Jezero crater walls. These detections point to a multi-stage, multi-chemistry history of water in Jezero crater and the surrounding region and provide new information for guiding the Mars-2020 rover’s landed exploration. In particular, the akaganeite, silica, and jarosite in the floor deposits suggest either a later episode of salty, Fe-rich waters that post-date Jezero delta or groundwater alteration of portions of the Jezero sedimentary sequence. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02387v1 |
https://arxiv.org/pdf/1909.02387v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-driven-new-geologic |
Repo | |
Framework | |
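A hedged stand-in for pixel-scale spectral classification: a Gaussian Naive Bayes model over per-pixel spectra. The authors train a hierarchical Bayesian classifier on CRISM imagery; the synthetic spectra, mineral labels, and band count here are assumptions.

```python
# Hedged sketch: per-pixel spectral classification of synthetic CRISM-like data.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n_bands, n_pixels = 240, 3000                     # CRISM-like number of spectral bands
minerals = ["background", "akaganeite", "jarosite", "silica"]

y = rng.integers(0, len(minerals), size=n_pixels)
X = rng.normal(size=(n_pixels, n_bands))
X += y[:, None] * 0.05                            # weak class-dependent spectral shift

clf = GaussianNB().fit(X[:2000], y[:2000])
acc = clf.score(X[2000:], y[2000:])
print("held-out pixel accuracy:", round(acc, 3))
```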
Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks
Title | Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks |
Authors | Tian Ding, Dawei Li, Ruoyu Sun |
Abstract | Does over-parameterization eliminate sub-optimal local minima for neural network problems? On one hand, existing positive results do not prove the claim, but often weaker claims. On the other hand, existing negative results have strong assumptions on the activation functions and/or data samples, causing a large gap with positive results. It was unclear before whether there is a clean answer of “yes” or “no”. In this paper, we answer this question with a strong negative result. In particular, we prove that for deep and over-parameterized networks, sub-optimal local minima exist for generic input data samples and generic nonlinear activation. This is the setting widely studied in the global landscape of over-parameterized networks, thus our result corrects a possible misconception that “over-parameterization eliminates sub-optimal local-min”. Our construction is based on fundamental optimization analysis, and thus rather principled. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01413v2 |
https://arxiv.org/pdf/1911.01413v2.pdf | |
PWC | https://paperswithcode.com/paper/sub-optimal-local-minima-exist-for-almost-all |
Repo | |
Framework | |
Estimating Buildings’ Parameters over Time Including Prior Knowledge
Title | Estimating Buildings’ Parameters over Time Including Prior Knowledge |
Authors | Nilavra Pathak, James Foulds, Nirmalya Roy, Nilanjan Banerjee, Ryan Robucci |
Abstract | Modeling buildings’ heat dynamics is a complex process which depends on various factors including weather, building thermal capacity, insulation preservation, and residents’ behavior. Gray-box models offer a causal inference of those dynamics expressed in few parameters specific to built environments. These parameters can provide compelling insights into the characteristics of building artifacts and have various applications such as forecasting HVAC usage, indoor temperature control, monitoring of built environments, etc. In this paper, we present a systematic study of modeling buildings’ thermal characteristics and thus derive the parameters of built conditions with a Bayesian approach. We build a Bayesian state-space model that can adapt and incorporate buildings’ thermal equations and propose a generalized solution that can easily incorporate prior knowledge regarding the parameters. We show that a faster approximate approach using variational inference for parameter estimation can provide similar parameters as that of a more time-consuming Markov Chain Monte Carlo (MCMC) approach. We perform extensive evaluations on two datasets to understand the generative process and show that the Bayesian approach is more interpretable. We further study the effects of prior selection for the model parameters and transfer learning, where we learn parameters from one season and use them to fit the model in the other. We perform extensive evaluations on controlled and real data traces to estimate buildings’ parameters within a 95% credible interval. |
Tasks | Causal Inference, Transfer Learning |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.07469v3 |
http://arxiv.org/pdf/1901.07469v3.pdf | |
PWC | https://paperswithcode.com/paper/estimating-buildings-parameters-over-time |
Repo | |
Framework | |
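A hedged sketch of the kind of gray-box thermal model the paper builds on: a first-order RC update T[t+1] = T[t] + (dt/C)((T_out[t] - T[t])/R + Q[t]), simulated from known R and C and then recovered by plain least squares. The Bayesian state-space treatment with priors, which is the paper's contribution, is not shown, and all values and optimizer settings are illustrative (rescaling the parameters would help the optimizer in practice).

```python
# Hedged sketch: simulate a first-order RC building model and fit R, C by least squares.
import numpy as np
from scipy.optimize import minimize

dt, steps = 300.0, 500                       # 5-minute steps
rng = np.random.default_rng(0)
T_out = 5 + 3 * np.sin(np.arange(steps) * dt / 86400 * 2 * np.pi)   # outdoor temp (C)
Q = rng.uniform(0, 1500, size=steps)         # heater power (W)

def simulate(R, C, T0=20.0):
    """Indoor temperature trajectory of the RC model for given resistance and capacity."""
    T = np.empty(steps)
    T[0] = T0
    for t in range(steps - 1):
        T[t + 1] = T[t] + dt / C * ((T_out[t] - T[t]) / R + Q[t])
    return T

T_obs = simulate(R=0.01, C=1e7) + 0.05 * rng.normal(size=steps)      # noisy indoor temp

def loss(params):
    R, C = params
    return np.mean((simulate(R, C) - T_obs) ** 2)

fit = minimize(loss, x0=[0.02, 5e6], method="Nelder-Mead")
print("estimated R, C:", fit.x)              # true values were 0.01, 1e7
```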