Paper Group ANR 734
Concept Drift Detection and Adaptation with Weak Supervision on Streaming Unlabeled Data
Title | Concept Drift Detection and Adaptation with Weak Supervision on Streaming Unlabeled Data |
Authors | Abhijit Suprem |
Abstract | Concept drift in learning and classification occurs when the statistical properties of either the data features or target change over time; evidence of drift has appeared in search data, medical research, malware, web data, and video. Drift adaptation has not yet been addressed in high dimensional, noisy, low-context data such as streaming text, video, or images due to the unique challenges these domains present. We present a two-fold approach to deal with concept drift in these domains: a density-based clustering approach to deal with virtual concept drift (change in statistical properties of features) and a weak-supervision step to deal with real concept drift (change in statistical properties of target). Our density-based clustering avoids problems posed by the curse of dimensionality to create an evolving ‘map’ of the live data space, thereby addressing virtual drift in features. Our weak-supervision step leverages high-confidence labels (oracle or heuristic labels) to generate weighted training sets to generalize and update existing deep learners to adapt to changing decision boundaries (real drift) and create new deep learners for unseen regions of the data space. Our results show that our two-fold approach performs well with >90% precision in 2018, four years after initial deployment in 2014, without any human intervention. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01064v1 |
https://arxiv.org/pdf/1910.01064v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-drift-detection-and-adaptation-with-1 |
Repo | |
Framework | |
Byzantine Fault Tolerant Distributed Linear Regression
Title | Byzantine Fault Tolerant Distributed Linear Regression |
Authors | Nirupam Gupta, Nitin H. Vaidya |
Abstract | This paper considers the problem of Byzantine fault tolerance in distributed linear regression in a multi-agent system. However, the proposed algorithms are given for a more general class of distributed optimization problems, of which distributed linear regression is a special case. The system comprises a server and multiple agents, where each agent holds a certain number of data points and responses that satisfy a (possibly noisy) linear relationship. The objective of the server is to determine this relationship, given that some of the agents in the system (up to a known number) are Byzantine faulty (i.e., actively adversarial). We show that the server can achieve this objective, in a deterministic manner, by robustifying the original distributed gradient descent method using norm-based filters, namely ‘norm filtering’ and ‘norm-cap filtering’, incurring an additional log-linear computation cost in each iteration. The proposed algorithms improve upon the existing methods on three levels: i) no assumptions are required on the probability distribution of the data points, ii) the system can be partially asynchronous, and iii) the computational overhead (needed to handle Byzantine faulty agents) is log-linear in the number of agents and linear in the dimension of the data points. The proposed algorithms differ from each other in the assumptions made for their correctness and the gradient filter they use. |
Tasks | Distributed Optimization |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08752v2 |
http://arxiv.org/pdf/1903.08752v2.pdf | |
PWC | https://paperswithcode.com/paper/byzantine-fault-tolerant-distributed-linear |
Repo | |
Framework | |
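The ‘norm filtering’ idea above can be sketched as follows: before averaging, the server discards the f largest-norm gradients, and the sort gives the log-linear per-iteration overhead the abstract mentions. The function name and exact filter form are illustrative assumptions, not the paper's code:

```python
import math

def norm_filter_aggregate(gradients, f):
    """Aggregate agent gradients, discarding the f largest-norm ones.

    A sketch of the 'norm filtering' idea: sorting by norm costs
    O(n log n) per iteration, and the surviving gradients are averaged
    as in ordinary distributed gradient descent.
    """
    norms = [math.sqrt(sum(g_i * g_i for g_i in g)) for g in gradients]
    # Keep the n - f gradients with the smallest norms.
    keep = sorted(range(len(gradients)), key=lambda i: norms[i])[: len(gradients) - f]
    dim = len(gradients[0])
    return [sum(gradients[i][d] for i in keep) / len(keep) for d in range(dim)]
```

With two honest agents and one adversary reporting a huge gradient, `norm_filter_aggregate([[1.0, 0.0], [0.0, 1.0], [100.0, 100.0]], f=1)` drops the outlier and averages the rest.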
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics
Title | Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics |
Authors | Debjani Saha, Candice Schumann, Duncan C. McElfresh, John P. Dickerson, Michelle L. Mazurek, Michael Carl Tschantz |
Abstract | Bias in machine learning has manifested injustice in several areas, such as medicine, hiring, and criminal justice. In response, computer scientists have developed myriad definitions of fairness to correct this bias in fielded algorithms. While some definitions are based on established legal and ethical norms, others are largely mathematical. It is unclear whether the general public agrees with these fairness definitions, and perhaps more importantly, whether they understand these definitions. We take initial steps toward bridging this gap between ML researchers and the public by addressing the question: does a lay audience understand a basic definition of ML fairness? We develop a metric to measure comprehension of three such definitions: demographic parity, equal opportunity, and equalized odds. We evaluate this metric using an online survey, and investigate the relationship between comprehension and sentiment, demographics, and the definition itself. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/2001.00089v2 |
https://arxiv.org/pdf/2001.00089v2.pdf | |
PWC | https://paperswithcode.com/paper/human-comprehension-of-fairness-in-machine |
Repo | |
Framework | |
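The three definitions studied above have standard formal statements that fit in a few lines of code. The function names below are my own, and each metric assumes binary predictions, binary labels, and exactly two groups with every conditioning event non-empty:

```python
def positive_rate(preds, cond):
    """Fraction of positive predictions among examples where cond holds."""
    sel = [p for p, c in zip(preds, cond) if c]
    return sum(sel) / len(sel)

def demographic_parity_gap(preds, groups):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)|: zero iff demographic parity holds."""
    return abs(positive_rate(preds, [g == 0 for g in groups])
               - positive_rate(preds, [g == 1 for g in groups]))

def equal_opportunity_gap(preds, labels, groups):
    """Gap in true-positive rates between the two groups."""
    return abs(
        positive_rate(preds, [g == 0 and y == 1 for g, y in zip(groups, labels)])
        - positive_rate(preds, [g == 1 and y == 1 for g, y in zip(groups, labels)]))

def equalized_odds_gap(preds, labels, groups):
    """Max of the TPR gap and FPR gap: zero iff equalized odds holds on this sample."""
    tpr_gap = equal_opportunity_gap(preds, labels, groups)
    fpr_gap = abs(
        positive_rate(preds, [g == 0 and y == 0 for g, y in zip(groups, labels)])
        - positive_rate(preds, [g == 1 and y == 0 for g, y in zip(groups, labels)]))
    return max(tpr_gap, fpr_gap)
```

Demographic parity compares raw positive-prediction rates, equal opportunity conditions on the true label being positive, and equalized odds additionally constrains the false-positive rates.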
Deep Ranking Based Cost-sensitive Multi-label Learning for Distant Supervision Relation Extraction
Title | Deep Ranking Based Cost-sensitive Multi-label Learning for Distant Supervision Relation Extraction |
Authors | Hai Ye, Zhunchen Luo |
Abstract | A knowledge base provides a potential way to improve the intelligence of information retrieval (IR) systems, since its numerous relations between entities can help an IR system conduct inference from one entity to another. Relation extraction is one of the fundamental techniques for constructing a knowledge base. Distant supervision is a semi-supervised learning method for relation extraction which learns from labeled and unlabeled data. However, this approach suffers from the problem of relation overlapping, in which one entity tuple may have multiple relation facts. We believe that relation types can have latent connections, which we call class ties, and that these can be exploited to enhance relation extraction. However, this property between relation classes has not been fully explored before. In this paper, to exploit class ties between relations to improve relation extraction, we propose a general ranking-based multi-label learning framework combined with convolutional neural networks, in which ranking-based loss functions with a regularization technique are introduced to learn the latent connections between relations. Furthermore, to deal with the problem of class imbalance in distant supervision relation extraction, we adopt cost-sensitive learning to rescale the costs from the positive and negative labels. Extensive experiments on a widely used dataset show the effectiveness of our model in exploiting class ties and relieving the class imbalance problem. |
Tasks | Information Retrieval, Multi-Label Learning, Relation Extraction |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11521v1 |
https://arxiv.org/pdf/1907.11521v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-ranking-based-cost-sensitive-multi-label |
Repo | |
Framework | |
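A generic pairwise ranking loss with cost-sensitive rescaling, in the spirit of the abstract above, can be sketched as follows. This is a common textbook form (hinge-based pairwise ranking), not the paper's exact loss, and the `neg_cost` parameter is an illustrative stand-in for its cost rescaling:

```python
def ranked_margin_loss(scores, positives, margin=1.0, neg_cost=1.0):
    """Pairwise ranking loss for multi-label outputs.

    Every positive label should outscore every negative label by
    `margin`; `neg_cost` rescales the penalty contributed by negative
    labels, a simple form of cost-sensitive learning under class
    imbalance.
    """
    pos = [scores[i] for i in positives]
    neg = [s for i, s in enumerate(scores) if i not in positives]
    loss = 0.0
    for sp in pos:
        for sn in neg:
            # Hinge penalty whenever the positive fails to clear the margin.
            loss += neg_cost * max(0.0, margin - (sp - sn))
    return loss
```

When a positive label already outscores all negatives by the margin, the loss is zero; raising `neg_cost` makes violations involving negative labels more expensive.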
Kernels on fuzzy sets: an overview
Title | Kernels on fuzzy sets: an overview |
Authors | Jorge Guevara, Roberto Hirata Jr, Stéphane Canu |
Abstract | This paper introduces the concept of kernels on fuzzy sets as a similarity measure for $[0,1]$-valued functions, a.k.a. \emph{membership functions of fuzzy sets}. We define the following classes of kernels: the cross product, the intersection, the non-singleton and the distance-based kernels on fuzzy sets. These kernels are applicable to machine learning and data science tasks where uncertainty in the data has an ontic or epistemic interpretation. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12991v1 |
https://arxiv.org/pdf/1907.12991v1.pdf | |
PWC | https://paperswithcode.com/paper/kernels-on-fuzzy-sets-an-overview |
Repo | |
Framework | |
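Two of the kernel classes named above, the intersection and cross product kernels, have simple closed forms on finite fuzzy sets. The encodings below (membership value lists, `(element, membership)` pairs) are illustrative assumptions for the finite case:

```python
def intersection_kernel(mu_x, mu_y):
    """Intersection kernel on two fuzzy sets over the same finite domain:
    the sum of pointwise minima of the two membership functions."""
    return sum(min(a, b) for a, b in zip(mu_x, mu_y))

def cross_product_kernel(X, Y, k):
    """Cross product kernel: sum over all pairs of k(x, y) * mu_X(x) * mu_Y(y),
    where X and Y are lists of (element, membership) pairs and k is an
    ordinary kernel on the elements."""
    return sum(k(x, y) * mx * my for x, mx in X for y, my in Y)
```

For example, with the linear kernel `k(x, y) = x * y`, the cross product kernel reduces to the product of the membership-weighted means of the two fuzzy sets, scaled by their cardinalities.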
E-MIIM: An Ensemble Learning based Context-Aware Mobile Telephony Model for Intelligent Interruption Management
Title | E-MIIM: An Ensemble Learning based Context-Aware Mobile Telephony Model for Intelligent Interruption Management |
Authors | Iqbal H. Sarker, A. S. M. Kayes, Md Hasan Furhad, Mohammad Mainul Islam, Md Shohidul Islam |
Abstract | Nowadays, mobile telephony interruptions in our daily life activities are common because of inappropriate ringing notifications of incoming phone calls in different contexts. Such interruptions may disrupt the attention not only of the mobile phone owner but also of the surrounding people. A decision tree is the most popular machine learning classification technique and is used in the existing context-aware mobile intelligent interruption management (MIIM) model to overcome such issues. However, a single decision-tree-based context-aware model may cause an overfitting problem and thus decrease the prediction accuracy of the inferred model. Therefore, in this paper, we propose an ensemble machine learning based context-aware mobile telephony model for the purpose of intelligent interruption management by taking into account multi-dimensional contexts, and name it “E-MIIM”. The experimental results on individuals’ real-life mobile telephony datasets show that our E-MIIM model is more effective and outperforms the existing MIIM model for predicting and managing an individual’s mobile telephony interruptions based on their relevant contextual information. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1909.11029v1 |
https://arxiv.org/pdf/1909.11029v1.pdf | |
PWC | https://paperswithcode.com/paper/e-miim-an-ensemble-learning-based-context |
Repo | |
Framework | |
Co-Attention Based Neural Network for Source-Dependent Essay Scoring
Title | Co-Attention Based Neural Network for Source-Dependent Essay Scoring |
Authors | Haoran Zhang, Diane Litman |
Abstract | This paper presents an investigation of using a co-attention based neural network for source-dependent essay scoring. We use a co-attention mechanism to help the model learn the importance of each part of the essay more accurately. Also, this paper shows that the co-attention based neural network model provides reliable score prediction of source-dependent responses. We evaluate our model on two source-dependent response corpora. Results show that our model outperforms the baseline on both corpora. We also show, with examples, that the model’s attention is similar to expert opinions. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.01993v1 |
https://arxiv.org/pdf/1908.01993v1.pdf | |
PWC | https://paperswithcode.com/paper/co-attention-based-neural-network-for-source-2 |
Repo | |
Framework | |
CNN-based RGB-D Salient Object Detection: Learn, Select and Fuse
Title | CNN-based RGB-D Salient Object Detection: Learn, Select and Fuse |
Authors | Hao Chen, Youfu Li |
Abstract | The goal of this work is to present a systematic solution for RGB-D salient object detection, which addresses the following three aspects with a unified framework: modal-specific representation learning, complementary cue selection and cross-modal complement fusion. To learn discriminative modal-specific features, we propose a hierarchical cross-modal distillation scheme, in which the well-learned source modality provides supervisory signals to facilitate the learning process for the new modality. To better extract the complementary cues, we formulate a residual function to incorporate complements from the paired modality adaptively. Furthermore, a top-down fusion structure is constructed for sufficient cross-modal interactions and cross-level transmissions. The experimental results demonstrate the effectiveness of the proposed cross-modal distillation scheme in zero-shot saliency detection and pre-training on a new modality, as well as the advantages in selecting and fusing cross-modal/cross-level complements. |
Tasks | Object Detection, Representation Learning, Saliency Detection, Salient Object Detection |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09309v1 |
https://arxiv.org/pdf/1909.09309v1.pdf | |
PWC | https://paperswithcode.com/paper/cnn-based-rgb-d-salient-object-detection |
Repo | |
Framework | |
Automated Brain Tumour Segmentation Using Deep Fully Residual Convolutional Neural Networks
Title | Automated Brain Tumour Segmentation Using Deep Fully Residual Convolutional Neural Networks |
Authors | Indrajit Mazumdar |
Abstract | Automated brain tumour segmentation has the potential of making a massive improvement in disease diagnosis, surgery, monitoring and surveillance. However, this task is extremely challenging. Here, we describe our automated segmentation method using 2D CNNs that are based on U-Net. To deal with class imbalance effectively, we have formulated a novel weighted Dice loss function. We found that increasing the depth of the ‘U’ shape beyond a certain level results in a decrease in performance, so it is essential to choose an optimum depth. We also found that 3D contextual information cannot be captured by a single 2D network that is trained with patches extracted from multiple views, whereas an ensemble of three 2D networks trained on multiple views can effectively capture the information and deliver much better performance. We obtained Dice scores of 0.79 for enhancing tumour, 0.90 for whole tumour, and 0.82 for tumour core on the BraTS 2018 validation set. Our method, using 2D networks, requires far less time and memory and is much simpler and easier to implement than the state-of-the-art methods that use 3D networks; still, it manages to achieve comparable performance. |
Tasks | |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04250v2 |
https://arxiv.org/pdf/1908.04250v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-brain-tumour-segmentation-using |
Repo | |
Framework | |
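The abstract above does not spell out its weighting, but a common class-weighted soft Dice loss of the kind it describes, which rescales each class's contribution to counter class imbalance, looks like this (the exact form in the paper may differ):

```python
def weighted_dice_loss(probs, targets, weights, eps=1e-6):
    """Class-weighted soft Dice loss.

    probs/targets: per-class lists of flattened values (predicted
    probabilities and 0/1 ground truth); weights: one scalar per class.
    Rare classes can be given larger weights so their Dice term
    dominates the average.
    """
    loss = 0.0
    for p, t, w in zip(probs, targets, weights):
        inter = sum(pi * ti for pi, ti in zip(p, t))
        denom = sum(p) + sum(t)
        # Soft Dice coefficient, smoothed by eps to avoid division by zero.
        dice = (2.0 * inter + eps) / (denom + eps)
        loss += w * (1.0 - dice)
    return loss / sum(weights)
```

A perfect prediction gives a loss near 0; a prediction with no overlap against a non-empty target gives a loss near 1 for that class.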
Exploring Reciprocal Attention for Salient Object Detection by Cooperative Learning
Title | Exploring Reciprocal Attention for Salient Object Detection by Cooperative Learning |
Authors | Changqun Xia, Jia Li, Jinming Su, Yonghong Tian |
Abstract | Typically, objects with the same semantics are not always prominent in images containing different backgrounds. Motivated by this observation that accurate salient object detection is related to both foreground and background, we propose a novel cooperative attention mechanism that jointly considers reciprocal relationships between background and foreground for efficient salient object detection. Concretely, we first aggregate the features at each side-output of a traditional dilated FCN to extract the initial foreground and background local responses respectively. Then, taking these responses as input, a reciprocal attention module adaptively models the nonlocal dependencies between any two pixels of the foreground and background features, which are then aggregated with local features in a mutually reinforced way so as to enhance each branch to generate more discriminative foreground and background saliency maps. Besides, cooperative losses are specifically designed to guide the multi-task learning of the foreground and background branches, which encourages our network to obtain more complementary predictions with clear boundaries. Finally, a simple but effective fusion strategy is utilized to produce the final saliency map. Comprehensive experimental results on five benchmark datasets demonstrate that our proposed method performs favorably against the state-of-the-art approaches in terms of all compared evaluation metrics. |
Tasks | Multi-Task Learning, Object Detection, Salient Object Detection |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08269v1 |
https://arxiv.org/pdf/1909.08269v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-reciprocal-attention-for-salient |
Repo | |
Framework | |
Masking Salient Object Detection, a Mask Region-based Convolutional Neural Network Analysis for Segmentation of Salient Objects
Title | Masking Salient Object Detection, a Mask Region-based Convolutional Neural Network Analysis for Segmentation of Salient Objects |
Authors | Bruno A. Krinski, Daniel V. Ruiz, Guilherme Z. Machado, Eduardo Todt |
Abstract | In this paper, we propose a broad comparison between Fully Convolutional Networks (FCNs) and Mask Region-based Convolutional Neural Networks (Mask-RCNNs) applied in the Salient Object Detection (SOD) context. Studies in the SOD literature usually explore architectures based on FCNs to detect salient regions and objects in visual scenes. However, despite the promising results achieved, FCNs have shown issues in some challenging scenarios. Fairly recent studies in the SOD literature have proposed the use of a Mask-RCNN approach to overcome such issues. However, there is no extensive comparison between the two networks in the SOD literature endorsing the effectiveness of Mask-RCNNs over FCNs when segmenting salient objects. Aiming to effectively show the superiority of Mask-RCNNs over FCNs in the SOD context, we compare two variations of Mask-RCNNs with two variations of FCNs on eight datasets widely used in the literature, using four metrics. Our findings show that in this context Mask-RCNNs achieved an improvement in F-measure of up to 47% over FCNs. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08038v1 |
https://arxiv.org/pdf/1909.08038v1.pdf | |
PWC | https://paperswithcode.com/paper/masking-salient-object-detection-a-mask |
Repo | |
Framework | |
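The F-measure reported above is conventionally computed in the SOD literature with β² = 0.3, weighting precision more than recall. Assuming that convention (the abstract itself does not state the β value), a minimal sketch over binary masks:

```python
def precision_recall(pred_mask, gt_mask):
    """Precision and recall of a binary predicted mask against ground truth,
    both given as flat sequences of 0/1 values."""
    tp = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    pred_pos = sum(1 for p in pred_mask if p)
    gt_pos = sum(1 for g in gt_mask if g)
    return (tp / pred_pos if pred_pos else 0.0,
            tp / gt_pos if gt_pos else 0.0)

def f_measure(precision, recall, beta_sq=0.3):
    """F-beta measure with beta^2 = 0.3, the usual SOD convention."""
    if precision == 0 and recall == 0:
        return 0.0
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

With `beta_sq=0.3`, a mask with precision 1.0 and recall 1.0 scores exactly 1.0, and precision errors cost more than recall errors.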
Edge-guided Non-local Fully Convolutional Network for Salient Object Detection
Title | Edge-guided Non-local Fully Convolutional Network for Salient Object Detection |
Authors | Zhengzheng Tu, Yan Ma, Chenglong Li, Jin Tang, Bin Luo |
Abstract | Fully Convolutional Neural Networks (FCNs) have been widely applied to salient object detection recently by virtue of high-level semantic feature extraction, but existing FCN-based methods still suffer from continuous striding and pooling operations leading to loss of spatial structure and blurred edges. To maintain the clear edge structure of salient objects, we propose a novel Edge-guided Non-local FCN (ENFNet) which performs edge-guided feature learning for accurate salient object detection. Specifically, we extract hierarchical global and local information in the FCN to incorporate non-local features for effective feature representations. To preserve good boundaries of salient objects, we propose a guidance block to embed edge prior knowledge into hierarchical feature maps. The guidance block performs not only feature-wise manipulation but also spatial-wise transformation for effective edge embeddings. Our model is trained on the MSRA-B dataset and tested on five popular benchmark datasets. Compared with the state-of-the-art methods, the proposed method achieves the best performance on all datasets. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02460v2 |
https://arxiv.org/pdf/1908.02460v2.pdf | |
PWC | https://paperswithcode.com/paper/edge-guided-non-local-fully-convolutional |
Repo | |
Framework | |
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
Title | What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning |
Authors | Jaejun Lee, Raphael Tang, Jimmy Lin |
Abstract | Pretrained transformer-based language models have achieved state of the art across countless tasks in natural language processing. These models are highly expressive, comprising at least a hundred million parameters and a dozen layers. Recent evidence suggests that only a few of the final layers need to be fine-tuned for high quality on downstream tasks. Naturally, a subsequent research question is, “how many of the last layers do we need to fine-tune?” In this paper, we precisely answer this question. We examine two recent pretrained language models, BERT and RoBERTa, across standard tasks in textual entailment, semantic similarity, sentiment analysis, and linguistic acceptability. We vary the number of final layers that are fine-tuned, then study the resulting change in task-specific effectiveness. We show that only a fourth of the final layers need to be fine-tuned to achieve 90% of the original quality. Surprisingly, we also find that fine-tuning all layers does not always help. |
Tasks | Linguistic Acceptability, Natural Language Inference, Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03090v1 |
https://arxiv.org/pdf/1911.03090v1.pdf | |
PWC | https://paperswithcode.com/paper/what-would-elsa-do-freezing-layers-during |
Repo | |
Framework | |
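A framework-agnostic sketch of the fine-tuning recipe studied above: freeze everything except the last k layers. In a real framework this means clearing each frozen parameter's gradient flag (e.g. `requires_grad = False` in PyTorch); here the layer names and dict-of-flags encoding are illustrative:

```python
def freeze_all_but_last(layer_names, k):
    """Return a {layer_name: trainable?} plan that fine-tunes only the
    last k layers of an ordered layer list, leaving the rest frozen."""
    n = len(layer_names)
    return {name: i >= n - k for i, name in enumerate(layer_names)}
```

For a 12-block transformer with an embedding layer and a task head, `freeze_all_but_last(layers, 3)` leaves only the top two blocks and the head trainable, which is roughly the "fourth of the final layers" regime the abstract finds sufficient for 90% of full fine-tuning quality.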
Characterizing Inter-Layer Functional Mappings of Deep Learning Models
Title | Characterizing Inter-Layer Functional Mappings of Deep Learning Models |
Authors | Donald Waagen, Katie Rainey, Jamie Gantert, David Gray, Megan King, M. Shane Thompson, Jonathan Barton, Will Waldron, Samantha Livingston, Don Hulsey |
Abstract | Deep learning architectures have demonstrated state-of-the-art performance for object classification and have become ubiquitous in commercial products. These methods are often applied without understanding (a) the difficulty of a classification task given the input data, and (b) how a specific deep learning architecture transforms that data. To answer (a) and (b), we illustrate the utility of a multivariate nonparametric estimator of class separation, the Henze-Penrose (HP) statistic, in the original as well as layer-induced representations. Given an $N$-class problem, our contribution defines the $C(N,2)$ combinations of HP statistics as a sample from a distribution of class-pair separations. This allows us to characterize the distributional change to class separation induced at each layer of the model. Fisher permutation tests are used to detect statistically significant changes within a model. By comparing the HP statistic distributions between layers, one can statistically characterize: layer adaptation during training, the contribution of each layer to the classification task, and the presence or absence of consistency between training and validation data. This is demonstrated for a simple deep neural network using CIFAR10 with random-labels, CIFAR10, and MNIST datasets. |
Tasks | Object Classification |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04223v2 |
https://arxiv.org/pdf/1907.04223v2.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-inter-layer-functional |
Repo | |
Framework | |
Lidar-based Object Classification with Explicit Occlusion Modeling
Title | Lidar-based Object Classification with Explicit Occlusion Modeling |
Authors | Xiaoxiang Zhang, Hao Fu, Bin Dai |
Abstract | LIDAR is one of the most important sensors for Unmanned Ground Vehicles (UGVs). Object detection and classification based on lidar point clouds is a key technology for UGVs. In object detection and classification, the mutual occlusion between neighboring objects is an important factor affecting accuracy. In this paper, we consider occlusion as an intrinsic property of the point cloud data. We propose a novel approach that explicitly models the occlusion. The occlusion property is then taken into account in the subsequent classification step. We perform experiments on the KITTI dataset. Experimental results indicate that by utilizing the occlusion property that we modeled, the classifier obtains much better performance. |
Tasks | Object Classification, Object Detection |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04057v2 |
https://arxiv.org/pdf/1907.04057v2.pdf | |
PWC | https://paperswithcode.com/paper/lidar-based-object-classification-with |
Repo | |
Framework | |