Paper Group ANR 924
Limitations of Pinned AUC for Measuring Unintended Bias. C2AE: Class Conditioned Auto-Encoder for Open-set Recognition. A Review on Automatic License Plate Recognition System. Exploiting weak ties in trust-based recommender systems using regular equivalence. A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates. Maximum Expected Hit …
Limitations of Pinned AUC for Measuring Unintended Bias
Title | Limitations of Pinned AUC for Measuring Unintended Bias |
Authors | Daniel Borkan, Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, Lucy Vasserman |
Abstract | This report examines the Pinned AUC metric introduced in prior work and highlights some of its limitations. Pinned AUC provides a threshold-agnostic measure of unintended bias in a classification model, inspired by the ROC-AUC metric. However, as we highlight in this report, there are ways that the metric can obscure different kinds of unintended biases when the underlying class distributions on which bias is being measured are not carefully controlled. |
Tasks | |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.02088v1 |
PDF | http://arxiv.org/pdf/1903.02088v1.pdf |
PWC | https://paperswithcode.com/paper/limitations-of-pinned-auc-for-measuring |
Repo | |
Framework | |
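Although this report critiques the metric, the computation it targets is simple: for each identity subgroup, Pinned AUC is the ROC-AUC evaluated on a "pinned" set that combines the subgroup's examples with an equal-sized sample from the full dataset. A minimal sketch under that reading follows; the function name and sampling details are illustrative, not taken from the report.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pinned_auc(labels, scores, subgroup_mask, rng=None):
    """ROC-AUC on a 'pinned' set: subgroup examples plus an equal-sized
    random sample from the full dataset (labels/scores: 1-D numpy arrays)."""
    rng = np.random.default_rng(rng)
    sub_idx = np.flatnonzero(subgroup_mask)
    # Sample as many background examples as there are subgroup examples.
    bg_idx = rng.choice(len(labels), size=len(sub_idx), replace=False)
    idx = np.concatenate([sub_idx, bg_idx])
    return roc_auc_score(labels[idx], scores[idx])

# Usage: compare pinned_auc across identity subgroups to probe for
# unintended bias, e.g. mask = (identity == "group_a").
```

Comparing this value across subgroups is exactly where the report's caveat applies: if the class distributions of the pinned sets are not controlled, gaps between subgroups can reflect distribution differences rather than unintended bias.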
C2AE: Class Conditioned Auto-Encoder for Open-set Recognition
Title | C2AE: Class Conditioned Auto-Encoder for Open-set Recognition |
Authors | Poojan Oza, Vishal M Patel |
Abstract | Models trained for classification often assume that all testing classes are known while training. As a result, when presented with an unknown class during testing, this closed-set assumption forces the model to classify it as one of the known classes. However, in a real world scenario, classification models are likely to encounter such examples. Hence, identifying those examples as unknown becomes critical to model performance. A potential solution to overcome this problem lies in a class of learning problems known as open-set recognition. It refers to the problem of identifying the unknown classes during testing, while maintaining performance on the known classes. In this paper, we propose an open-set recognition algorithm using class conditioned auto-encoders with a novel training and testing methodology. In contrast to previous methods, the training procedure is divided into two sub-tasks: 1. closed-set classification and 2. open-set identification (i.e. identifying a class as known or unknown). The encoder learns the first task following the closed-set classification training pipeline, whereas the decoder learns the second task by reconstructing the input conditioned on class identity. Furthermore, we model reconstruction errors using the Extreme Value Theory of statistical modeling to find the threshold for identifying known/unknown class samples. Experiments performed on multiple image classification datasets show that the proposed method performs significantly better than the state of the art. |
Tasks | Image Classification, Open Set Learning |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01198v1 |
PDF | http://arxiv.org/pdf/1904.01198v1.pdf |
PWC | https://paperswithcode.com/paper/c2ae-class-conditioned-auto-encoder-for-open |
Repo | |
Framework | |
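The key operational step named in the abstract, fitting an Extreme Value Theory model to reconstruction errors and thresholding on it, can be sketched independently of the network. Below is a minimal illustration assuming reconstruction errors from correctly conditioned known-class samples are available; the Weibull tail fit and the tail_fraction/quantile parameters are assumptions, not the paper's exact EVT procedure.

```python
import numpy as np
from scipy.stats import weibull_min

def evt_threshold(known_errors, tail_fraction=0.1, quantile=0.99):
    """Fit an extreme-value model to the largest reconstruction errors of
    known-class samples and return a known/unknown decision threshold."""
    errors = np.sort(np.asarray(known_errors))
    tail = errors[int((1.0 - tail_fraction) * len(errors)):]  # upper tail only
    shape, loc, scale = weibull_min.fit(tail)                  # EVT tail fit
    return weibull_min.ppf(quantile, shape, loc=loc, scale=scale)

def open_set_decision(error, threshold):
    """Samples whose conditioned reconstruction error exceeds the
    threshold are flagged as unknown (open-set rejection)."""
    return "unknown" if error > threshold else "known"
```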
A Review on Automatic License Plate Recognition System
Title | A Review on Automatic License Plate Recognition System |
Authors | Satadal Saha |
Abstract | Automatic License Plate Recognition (ALPR) is a challenging problem for the research community due to its potential applicability in diverse geographical conditions across the globe with varying license plate parameters. Any ALPR system includes three main modules, viz. localization of the license plate, segmentation of the characters therein and recognition of the segmented characters. In real life applications where the images are captured over days and nights in an outdoor environment with varying lighting and weather conditions, varying pollution levels and wind turbulence, localization, segmentation and recognition become challenging tasks. The tasks become more complex if the license plate is not in conformity with the standards laid down by the corresponding Motor Vehicles Department in terms of various features, e.g. area and aspect ratio of the license plate, background color, foreground color, shape, number of lines, font face/size of characters, spacing between characters etc. Besides, license plates are often dirty, broken, scratched, bent, or tilted. All these add to the challenges in developing an effective ALPR system. |
Tasks | License Plate Recognition |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09385v1 |
PDF | http://arxiv.org/pdf/1902.09385v1.pdf |
PWC | https://paperswithcode.com/paper/a-review-on-automatic-license-plate |
Repo | |
Framework | |
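The review's three-module decomposition (plate localization, character segmentation, character recognition) maps naturally onto a processing pipeline. A minimal OpenCV sketch of that structure is below; the area/aspect-ratio limits and the recognize_char stub are illustrative assumptions, not values taken from the review.

```python
import cv2

def localize_plates(gray, min_area=2000, aspect=(2.0, 6.0)):
    """Candidate plate regions via edges plus contour geometry."""
    edges = cv2.Canny(gray, 100, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    plates = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h >= min_area and aspect[0] <= w / float(h) <= aspect[1]:
            plates.append(gray[y:y + h, x:x + w])
    return plates

def segment_characters(plate):
    """Binarize the plate and split it into per-character crops."""
    _, binary = cv2.threshold(plate, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = sorted(cv2.boundingRect(c) for c in contours)  # left to right
    return [binary[y:y + h, x:x + w] for x, y, w, h in boxes if h > 10]

def recognize_char(char_img):
    """Placeholder for the recognition module (template matching, a CNN
    classifier, or an OCR engine would go here)."""
    raise NotImplementedError
```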
Exploiting weak ties in trust-based recommender systems using regular equivalence
Title | Exploiting weak ties in trust-based recommender systems using regular equivalence |
Authors | Tomislav Duricic, Emanuel Lacic, Dominik Kowald, Elisabeth Lex |
Abstract | User-based Collaborative Filtering (CF) is one of the most popular approaches to create recommender systems. CF, however, suffers from data sparsity and the cold-start problem since users often rate only a small fraction of available items. One solution is to incorporate additional information into the recommendation process such as explicit trust scores that are assigned by users to others or implicit trust relationships that result from social connections between users. Such relationships typically form a very sparse trust network, which can be utilized to generate recommendations for users based on people they trust. In our work, we explore the use of regular equivalence applied to a trust network to generate a similarity matrix that is used for selecting the k-nearest neighbors used for item recommendation. Two vertices in a network are regularly equivalent if their neighbors are themselves equivalent. By using the iterative approach of calculating regular equivalence, we can study the impact of strong and weak ties on item recommendation. We evaluate our approach on cold-start users on a dataset crawled from Epinions and find that by using weak ties in addition to strong ties, we can improve the performance of a trust-based recommender in terms of recommendation accuracy. |
Tasks | Recommendation Systems |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1907.11620v1 |
PDF | https://arxiv.org/pdf/1907.11620v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-weak-ties-in-trust-based |
Repo | |
Framework | |
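The regular-equivalence similarity the authors iterate can be written as sigma = alpha * A * sigma * A^T + I over the trust network's adjacency matrix A: two users are similar if the users they trust are themselves similar. A small numpy sketch under that reading follows; the alpha value and iteration count are illustrative, not the paper's settings.

```python
import numpy as np

def regular_equivalence(A, alpha=0.5, n_iter=3):
    """Iterative regular-equivalence similarity on a trust network:
    sigma <- alpha * A @ sigma @ A.T + I, starting from the identity."""
    n = A.shape[0]
    sigma = np.eye(n)
    for _ in range(n_iter):
        sigma = alpha * A @ sigma @ A.T + np.eye(n)
        sigma /= np.abs(sigma).max()  # keep values bounded; ranking unaffected
    return sigma

def top_k_neighbors(sigma, user, k=10):
    """The k most similar users, whose rated items can then be aggregated
    into recommendations for the target (e.g. cold-start) user."""
    order = np.argsort(-sigma[user])
    return [int(v) for v in order if v != user][:k]
```

With few iterations the similarity is dominated by direct, strong ties; more iterations let weak, longer-range ties contribute, which is the effect the paper studies for cold-start users.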
A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates
Title | A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates |
Authors | Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang |
Abstract | We study the algorithmic problem of estimating the mean of a heavy-tailed random vector in $\mathbb{R}^d$, given $n$ i.i.d. samples. The goal is to design an efficient estimator that attains the optimal sub-Gaussian error bound, only assuming that the random vector has bounded mean and covariance. Polynomial-time solutions to this problem are known but have high runtime due to their use of semi-definite programming (SDP). Conceptually, it remains open whether convex relaxation is truly necessary for this problem. In this work, we show that it is possible to go beyond SDP and achieve better computational efficiency. In particular, we provide a spectral algorithm that achieves the optimal statistical performance and runs in time $\widetilde O\left(n^2 d \right)$, improving upon the previous fastest runtime $\widetilde O\left(n^{3.5}+ n^2d\right)$ by Cherapanamjeri et al. (COLT '19). Our algorithm is spectral in that it only requires (approximate) eigenvector computations, which can be implemented very efficiently by, for example, power iteration or the Lanczos method. At the core of our algorithm is a novel connection between the furthest hyperplane problem introduced by Karnin et al. (COLT '12) and a structural lemma on heavy-tailed distributions by Lugosi and Mendelson (Ann. Stat. '19). This allows us to iteratively reduce the estimation error at a geometric rate using only the information derived from the top singular vector of the data matrix, leading to a significantly faster running time. |
Tasks | |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04468v2 |
PDF | https://arxiv.org/pdf/1908.04468v2.pdf |
PWC | https://paperswithcode.com/paper/a-fast-spectral-algorithm-for-mean-estimation |
Repo | |
Framework | |
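The central loop described in the abstract, forming bucketed (median-of-means style) estimates and repeatedly moving the current estimate along the top singular direction of the centered bucket means, can be illustrated as follows. This is a simplified sketch of that idea only, with illustrative bucket counts and a naive step rule; it is not the authors' exact procedure and carries none of its guarantees.

```python
import numpy as np

def spectral_mean_estimate(X, n_buckets=30, n_iter=20, rng=0):
    """Illustrative robust mean estimation: median-of-means buckets plus
    descent along the top singular direction of the centered bucket means."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    buckets = np.array_split(rng.permutation(n), n_buckets)
    Z = np.stack([X[b].mean(axis=0) for b in buckets])  # bucket means

    mu = np.median(Z, axis=0)            # coordinate-wise median to start
    for _ in range(n_iter):
        M = Z - mu
        # Top right singular vector; the paper's quoted runtime relies on
        # approximate methods such as power iteration or Lanczos instead.
        v = np.linalg.svd(M, full_matrices=False)[2][0]
        step = np.median(M @ v)          # robust step size along v
        mu = mu + step * v
    return mu
```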
Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards
Title | Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards |
Authors | Falcon Z. Dai, Matthew R. Walter |
Abstract | We propose a new complexity measure for Markov decision processes (MDPs), the maximum expected hitting cost (MEHC). This measure tightens the closely related notion of diameter [JOA10] by accounting for the reward structure. We show that this parameter replaces diameter in the upper bound on the optimal value span of an extended MDP, thus refining the associated upper bounds on the regret of several UCRL2-like algorithms. Furthermore, we show that potential-based reward shaping [NHR99] can induce equivalent reward functions with varying informativeness, as measured by MEHC. We further establish that shaping can reduce or increase MEHC by at most a factor of two in a large class of MDPs with finite MEHC and unsaturated optimal average rewards. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.02114v2 |
PDF | https://arxiv.org/pdf/1907.02114v2.pdf |
PWC | https://paperswithcode.com/paper/maximum-expected-hitting-cost-of-a-markov |
Repo | |
Framework | |
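The potential-based reward shaping [NHR99] that the paper analyzes through the MEHC lens has a simple closed form: r'(s, a, s') = r(s, a, s') + gamma * Phi(s') - Phi(s) for any potential function Phi over states, which leaves the optimal policies unchanged. A minimal sketch follows; the example potential is an arbitrary illustrative choice.

```python
def shaped_reward(r, s, s_next, potential, gamma=0.99):
    """Potential-based reward shaping [NHR99]:
    r'(s, a, s') = r(s, a, s') + gamma * Phi(s') - Phi(s).
    The shaped MDP has the same optimal policies, but, as the paper
    measures via MEHC, it can be more or less informative to learn from."""
    return r + gamma * potential(s_next) - potential(s)

# Example with a hypothetical potential (negative distance to a goal state):
# potential = lambda s: -abs(goal - s)
# r_shaped = shaped_reward(r, s, s_next, potential)
```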
Using Arabic Tweets to Understand Drug Selling Behaviors
Title | Using Arabic Tweets to Understand Drug Selling Behaviors |
Authors | Wesam Alruwaili, Bradley Protano, Tejasvi Sirigiriraju, Hamed Alhoori |
Abstract | Twitter is a popular platform for e-commerce in the Arab region including the sale of illegal goods and services. Social media platforms present multiple opportunities to mine information about behaviors pertaining to both illicit and pharmaceutical drugs and likewise to legal prescription drugs sold without a prescription, i.e., illegally. Recognized as a public health risk, the sale and use of illegal drugs, counterfeit versions of legal drugs, and legal drugs sold without a prescription constitute a widespread problem that is reflected in and facilitated by social media. Twitter provides a crucial resource for monitoring legal and illegal drug sales in order to support the larger goal of finding ways to protect patient safety. We collected our dataset using Arabic keywords. We then categorized the data using four machine learning classifiers. Based on a comparison of the respective results, we assessed the accuracy of each classifier in predicting two important considerations in analysing the extent to which drugs are available on social media: references to drugs for sale and the legality/illegality of the drugs thus advertised. For predicting tweets selling drugs, the Support Vector Machine classifier yielded the highest accuracy rate (96%), whereas for predicting the legality of the advertised drugs, the Naive Bayes classifier yielded the highest accuracy rate (85%). |
Tasks | |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1911.01275v1 |
PDF | https://arxiv.org/pdf/1911.01275v1.pdf |
PWC | https://paperswithcode.com/paper/using-arabic-tweets-to-understand-drug |
Repo | |
Framework | |
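The setup described, categorizing collected tweets with several standard classifiers of which SVM and Naive Bayes performed best on the two tasks, corresponds to a conventional text-classification pipeline. A hedged scikit-learn sketch is below; the character n-gram TF-IDF features are an assumption, since the abstract does not specify the preprocessing applied to the Arabic text.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB

def build_pipeline(task):
    """SVM for the 'is this tweet selling drugs?' task,
    Naive Bayes for the 'is the advertised drug legal?' task."""
    clf = LinearSVC() if task == "selling" else MultinomialNB()
    return Pipeline([
        ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
        ("clf", clf),
    ])

# tweets, labels = load_annotated_arabic_tweets()   # hypothetical loader
# model = build_pipeline("selling").fit(tweets, labels)
```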
Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation
Title | Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation |
Authors | Thomas Drugman, Baris Bozkurt, Thierry Dutoit |
Abstract | Complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives similar glottal estimates as obtained with the ZZT method. However, as complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally in our tests on a large corpus of real expressive speech, we show that the proposed method has the potential to be used for voice quality analysis. |
Tasks | |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.12843v1 |
PDF | https://arxiv.org/pdf/1912.12843v1.pdf |
PWC | https://paperswithcode.com/paper/causal-anticausal-decomposition-of-speech |
Repo | |
Framework | |
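The causal-anticausal split itself is a standard complex-cepstrum operation: take the FFT of a windowed speech frame, form log magnitude plus unwrapped phase, inverse-FFT to the complex cepstrum, and treat the positive-quefrency part as the causal (minimum-phase, vocal-tract) component and the negative-quefrency part as the anticausal (maximum-phase, glottal open-phase) component. A numpy sketch under those assumptions follows; the specific windowing criteria that the paper shows to be critical are not reproduced here.

```python
import numpy as np

def complex_cepstrum(frame):
    """Complex cepstrum of one windowed speech frame."""
    spectrum = np.fft.fft(frame)
    log_spectrum = np.log(np.abs(spectrum) + 1e-12) \
        + 1j * np.unwrap(np.angle(spectrum))
    return np.real(np.fft.ifft(log_spectrum))

def causal_anticausal_split(frame):
    """Separate positive (causal) and negative (anticausal) quefrencies,
    then map each component back to the time domain homomorphically."""
    cc = complex_cepstrum(frame)
    n = len(cc)
    causal, anticausal = np.zeros(n), np.zeros(n)
    causal[: n // 2] = cc[: n // 2]      # DC + positive quefrencies
    anticausal[n // 2:] = cc[n // 2:]    # negative quefrencies
    to_time = lambda c: np.real(np.fft.ifft(np.exp(np.fft.fft(c))))
    return to_time(causal), to_time(anticausal)
```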
Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Title | Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction |
Authors | Vishal Jain, William Fedus, Hugo Larochelle, Doina Precup, Marc G. Bellemare |
Abstract | Text-based games are a natural challenge domain for deep reinforcement learning algorithms. Their state and action spaces are combinatorially large, their reward function is sparse, and they are partially observable: the agent is informed of the consequences of its actions through textual feedback. In this paper we emphasize this latter point and consider the design of a deep reinforcement learning agent that can play from feedback alone. Our design recognizes and takes advantage of the structural characteristics of text-based games. We first propose a contextualisation mechanism, based on accumulated reward, which simplifies the learning problem and mitigates partial observability. We then study different methods that rely on the notion that most actions are ineffectual in any given situation, following Zahavy et al.'s idea of an admissible action. We evaluate these techniques in a series of text-based games of increasing difficulty based on the TextWorld framework, as well as the iconic game Zork. Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games. |
Tasks | |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12511v1 |
PDF | https://arxiv.org/pdf/1911.12511v1.pdf |
PWC | https://paperswithcode.com/paper/algorithmic-improvements-for-deep |
Repo | |
Framework | |
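The contextualisation mechanism is described only at a high level: condition the agent on its accumulated reward so that textually similar observations occurring at different stages of the game are disambiguated. One plausible minimal reading is sketched below; the score bucketing and the way the context is appended to the observation are assumptions, not the paper's design.

```python
def contextualise(observation_text, cumulative_reward, bucket=10):
    """Augment the textual observation with a coarse progress context
    derived from the accumulated score, mitigating partial observability
    for states that look identical in text."""
    context = int(cumulative_reward) // bucket
    return f"[score-bucket {context}] {observation_text}"

# Illustrative agent loop:
# obs, total, done = env.reset(), 0, False
# while not done:
#     action = agent.act(contextualise(obs, total))
#     obs, reward, done, _ = env.step(action)
#     total += reward
```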
On Arrhythmia Detection by Deep Learning and Multidimensional Representation
Title | On Arrhythmia Detection by Deep Learning and Multidimensional Representation |
Authors | K. S. Rajput, S. Wibowo, C. Hao, M. Majmudar |
Abstract | An electrocardiogram (ECG) is a time-series signal that is represented by one-dimensional (1-D) data. A higher-dimensional representation contains more information that is accessible for feature extraction. Hidden variables such as frequency relations and segment morphology are not directly accessible in the time domain. In this paper, 1-D time series data is converted into a multi-dimensional representation in the form of multichannel 2-D images. Following that, deep learning is used to train a deep neural network based classifier to detect arrhythmias. The results of simulations on the testing database demonstrate the effectiveness of the proposed methodology by showing an outstanding classification performance compared to other existing methods and hand-crafted annotations made by certified cardiologists. |
Tasks | Arrhythmia Detection, Electrocardiography (ECG), Time Series |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00138v4 |
PDF | http://arxiv.org/pdf/1904.00138v4.pdf |
PWC | https://paperswithcode.com/paper/on-arrhythmia-detection-by-deep-learning-and |
Repo | |
Framework | |
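The abstract does not name the specific multichannel 2-D representation, so the sketch below simply illustrates one common way of lifting a 1-D ECG beat into image-like channels (a Gramian Angular Summation Field plus a log-spectrogram); treat the channel choices and sizes as assumptions rather than the paper's method.

```python
import numpy as np
from scipy.signal import spectrogram, resample

def ecg_to_channels(beat, size=64, fs=360):
    """Stack two 2-D views of a 1-D ECG beat into a (2, size, size) array."""
    x = resample(beat, size)
    x = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1  # scale to [-1, 1]

    # Channel 1: GASF, encodes pairwise temporal correlations of the beat.
    phi = np.arccos(np.clip(x, -1, 1))
    gasf = np.cos(phi[:, None] + phi[None, :])

    # Channel 2: log-spectrogram, encodes frequency content over time.
    _, _, sxx = spectrogram(beat, fs=fs, nperseg=min(64, len(beat)))
    spec = np.log(sxx + 1e-12)
    spec = resample(resample(spec, size, axis=0), size, axis=1)

    return np.stack([gasf, spec])
```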
Fast Machine Learning with Byzantine Workers and Servers
Title | Fast Machine Learning with Byzantine Workers and Servers |
Authors | El Mahdi El Mhamdi, Rachid Guerraoui, Arsany Guirguis |
Abstract | Machine Learning (ML) solutions are nowadays distributed and are prone to various types of component failures, which can be encompassed in so-called Byzantine behavior. This paper introduces LiuBei, a Byzantine-resilient ML algorithm that does not trust any individual component in the network (neither workers nor servers), nor does it induce additional communication rounds (on average), compared to standard non-Byzantine resilient algorithms. LiuBei builds upon gradient aggregation rules (GARs) to tolerate a minority of Byzantine workers. Besides, LiuBei replicates the parameter server on multiple machines instead of trusting it. We introduce a novel filtering mechanism that enables workers to filter out replies from Byzantine server replicas without requiring communication with all servers. Such a filtering mechanism is based on network synchrony, Lipschitz continuity of the loss function, and the GAR used to aggregate workers’ gradients. We also introduce a protocol, scatter/gather, to bound drifts between models on correct servers with a small number of communication messages. We theoretically prove that LiuBei achieves Byzantine resilience to both servers and workers and guarantees convergence. We build LiuBei using TensorFlow, and we show that LiuBei tolerates Byzantine behavior with an accuracy loss of around 5% and around 24% convergence overhead compared to vanilla TensorFlow. We moreover show that the throughput gain of LiuBei compared to another state-of-the-art Byzantine-resilient ML algorithm (that assumes network asynchrony) is 70%. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07537v1 |
PDF | https://arxiv.org/pdf/1911.07537v1.pdf |
PWC | https://paperswithcode.com/paper/fast-machine-learning-with-byzantine-workers-1 |
Repo | |
Framework | |
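LiuBei builds on gradient aggregation rules (GARs) that tolerate a minority of Byzantine workers; the abstract does not name which GAR it uses, so the sketch below shows a generic, well-known Byzantine-resilient rule (coordinate-wise median) purely to illustrate the interface such a rule has on the server side.

```python
import numpy as np

def coordinate_wise_median(worker_gradients):
    """A classic Byzantine-resilient gradient aggregation rule (GAR):
    take the median of each coordinate across workers, so a minority of
    arbitrarily corrupted gradients cannot drag the aggregate arbitrarily
    far. This is a generic GAR, not LiuBei's specific rule."""
    return np.median(np.stack(worker_gradients), axis=0)

# Server-side step (illustrative):
# grads = [recv_gradient(w) for w in workers]   # some may be Byzantine
# params -= learning_rate * coordinate_wise_median(grads)
```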
Classification-Specific Parts for Improving Fine-Grained Visual Categorization
Title | Classification-Specific Parts for Improving Fine-Grained Visual Categorization |
Authors | Dimitri Korsch, Paul Bodesheim, Joachim Denzler |
Abstract | Fine-grained visual categorization is a classification task for distinguishing categories with high intra-class and small inter-class variance. While global approaches aim at using the whole image for performing the classification, part-based solutions gather additional local information in terms of attentions or parts. We propose a novel classification-specific part estimation that uses an initial prediction as well as back-propagation of feature importance via gradient computations in order to estimate relevant image regions. The subsequently detected parts are then not only selected by a-posteriori classification knowledge, but also have an intrinsic spatial extent that is determined automatically. This is in contrast to most part-based approaches and even to available ground-truth part annotations, which only provide point coordinates and no additional scale information. We show in our experiments on various widely-used fine-grained datasets the effectiveness of the mentioned part selection method in conjunction with the extracted part features. |
Tasks | Feature Importance, Fine-Grained Visual Categorization |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07075v1 |
PDF | https://arxiv.org/pdf/1909.07075v1.pdf |
PWC | https://paperswithcode.com/paper/classification-specific-parts-for-improving |
Repo | |
Framework | |
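The part-estimation signal the authors describe, back-propagating the importance of an initial prediction onto image regions, is in the family of gradient-times-activation saliency maps. A hedged PyTorch sketch of that family follows; it is a generic Grad-CAM-style map, not the paper's exact estimator, and the model/layer arguments are placeholders.

```python
import torch

def feature_importance_map(model, feature_layer, image):
    """Gradient-times-activation importance over an intermediate feature
    map: high values mark regions that drove the initial prediction and
    are therefore candidates for classification-specific parts."""
    feats = {}
    handle = feature_layer.register_forward_hook(
        lambda m, i, o: feats.update(a=o))
    logits = model(image.unsqueeze(0))        # image: (C, H, W) tensor
    handle.remove()

    score = logits[0, logits.argmax()]        # initial prediction
    grads = torch.autograd.grad(score, feats["a"])[0]
    importance = (grads * feats["a"]).sum(dim=1)[0]  # sum over channels
    return torch.relu(importance)             # (H, W) saliency map

# Peaks of the map can then be grown into part regions whose local
# features are pooled alongside the global feature for classification.
```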
SciLens: Evaluating the Quality of Scientific News Articles Using Social Media and Scientific Literature Indicators
Title | SciLens: Evaluating the Quality of Scientific News Articles Using Social Media and Scientific Literature Indicators |
Authors | Panayiotis Smeros, Carlos Castillo, Karl Aberer |
Abstract | This paper describes, develops, and validates SciLens, a method to evaluate the quality of scientific news articles. The starting point for our work is a set of structured methodologies that define a series of quality aspects for manually evaluating news. Based on these aspects, we describe a series of indicators of news quality. According to our experiments, these indicators help non-experts evaluate the quality of a scientific news article more accurately, compared to non-experts who do not have access to these indicators. Furthermore, SciLens can also be used to produce a completely automated quality score for an article, which agrees more with expert evaluators than manual evaluations done by non-experts. One of the main elements of SciLens is the focus on both content and context of articles, where context is provided by (1) explicit and implicit references in the article to scientific literature, and (2) reactions in social media referencing the article. We show that both contextual elements can be valuable sources of information for determining article quality. The validation of SciLens, done through a combination of expert and non-expert annotation, demonstrates its effectiveness for both semi-automatic and automatic quality evaluation of scientific news. |
Tasks | |
Published | 2019-03-13 |
URL | http://arxiv.org/abs/1903.05538v2 |
PDF | http://arxiv.org/pdf/1903.05538v2.pdf |
PWC | https://paperswithcode.com/paper/scilens-evaluating-the-quality-of-scientific |
Repo | |
Framework | |
Evaluating the Effectiveness of Common Technical Trading Models
Title | Evaluating the Effectiveness of Common Technical Trading Models |
Authors | Joseph Attia |
Abstract | How effective are the most common trading models? The answer may help investors realize the upsides of using each model, act as a segue for investors into more complex financial analysis and machine learning, and increase financial literacy amongst students. Creating original versions of popular models, like linear regression, K-Nearest Neighbor, and moving average crossovers, we can test how each model performs on the most popular stocks and largest indexes. With the results for each, we can compare the models and understand which model reliably increases performance. The trials showed that while all three models reduced losses on stocks with strong overall downward trends, the two machine learning models did not work as well to increase profits. Moving average crossovers outperformed a continuous investment every time, although they did result in a more volatile investment as well. Furthermore, once the program that implements the moving average crossover is finished, what are the optimal periods to use? A massive test consisting of 169,880 trials showed the best periods to use to increase investment performance (5, 10) and to decrease volatility (33, 44). In addition, the data showed numerous trends, such as that a smaller short-SMA period is accompanied by higher performance. Plotting volatility against performance shows that the high-risk, high-reward saying holds true: for investments, as volatility increases, so does performance. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.10407v1 |
PDF | https://arxiv.org/pdf/1907.10407v1.pdf |
PWC | https://paperswithcode.com/paper/evaluating-the-effectiveness-of-common |
Repo | |
Framework | |
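The moving average crossover the paper tests repeatedly, including the (5, 10) and (33, 44) period pairs it reports, can be expressed in a few lines of pandas. The sketch below backtests a long/flat crossover strategy on a daily price series, with the signal shifted by one day to avoid look-ahead; it is a generic implementation, not the author's program.

```python
import pandas as pd

def sma_crossover_returns(prices, short=5, long=10):
    """Long when the short SMA is above the long SMA, flat otherwise.
    Returns the strategy's cumulative return alongside buy-and-hold."""
    short_sma = prices.rolling(short).mean()
    long_sma = prices.rolling(long).mean()
    position = (short_sma > long_sma).astype(int).shift(1).fillna(0)

    daily = prices.pct_change().fillna(0)
    strategy = (1 + position * daily).cumprod()
    buy_and_hold = (1 + daily).cumprod()
    return strategy, buy_and_hold

# prices = pd.Series(...)                     # e.g. daily closing prices
# strat, hold = sma_crossover_returns(prices, short=5, long=10)
```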
Knowledge distillation for semi-supervised domain adaptation
Title | Knowledge distillation for semi-supervised domain adaptation |
Authors | Mauricio Orbes-Arteaga, Jorge Cardoso, Lauge Sørensen, Christian Igel, Sebastien Ourselin, Marc Modat, Mads Nielsen, Akshay Pai |
Abstract | In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources compared to the performance on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice that aim to train networks in such a way that the features generated are domain-agnostic. However, these methods require careful dataset-specific selection of hyperparameters such as the complexity of the discriminator in order to achieve a reasonable performance. We propose to use knowledge distillation (KD) – an efficient way of transferring knowledge between different DNNs – for semi-supervised domain adaptation of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not a part of the training set. Compared with both the baseline DNN (trained on the source domain only and without any adaptation to the target domain) and with using ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH dice scores. |
Tasks | Domain Adaptation |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.07355v1 |
PDF | https://arxiv.org/pdf/1908.07355v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-distillation-for-semi-supervised |
Repo | |
Framework | |
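The knowledge-distillation objective the method relies on is the standard one: match the student's softened predictions to the teacher's soft targets, applied per pixel/voxel for WMH segmentation. A minimal PyTorch sketch of that loss is below, with the temperature and mixing weight as illustrative hyperparameters; the paper's exact distillation setup may differ.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels=None,
                      temperature=2.0, alpha=0.5):
    """Soft-target KD loss, optionally mixed with supervised cross-entropy
    when labels exist (e.g. annotated source-domain scans)."""
    t = temperature
    soft_targets = F.softmax(teacher_logits / t, dim=1)
    log_student = F.log_softmax(student_logits / t, dim=1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * t * t
    if labels is None:                       # unlabeled target-domain data
        return kd
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```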