Paper Group ANR 1398
CN-CELEB: a challenging Chinese speaker recognition dataset
Title | CN-CELEB: a challenging Chinese speaker recognition dataset |
Authors | Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang |
Abstract | Recently, researchers set an ambitious goal of conducting speaker recognition in unconstrained conditions, where variations in ambient noise, channel and emotion can be arbitrary. However, most publicly available datasets are collected under constrained environments, i.e., with little noise and limited channel variation. These datasets tend to deliver over-optimistic performance and do not meet the requirements of research on speaker recognition in unconstrained conditions. In this paper, we present CN-Celeb, a large-scale speaker recognition dataset collected 'in the wild'. This dataset contains more than 130,000 utterances from 1,000 Chinese celebrities, and covers 11 different real-world genres. Experiments conducted with two state-of-the-art speaker recognition approaches (i-vector and x-vector) show that the performance on CN-Celeb is far inferior to that obtained on VoxCeleb, a widely used speaker recognition dataset. This result demonstrates that in real-life conditions, the performance of existing techniques might be much worse than previously thought. Our database is free for researchers and can be downloaded from http://project.cslt.org. |
Tasks | Speaker Recognition |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1911.01799v1 |
https://arxiv.org/pdf/1911.01799v1.pdf | |
PWC | https://paperswithcode.com/paper/cn-celeb-a-challenging-chinese-speaker |
Repo | |
Framework | |
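Speaker verification experiments like those in the abstract are usually compared via the equal error rate (EER), the operating point where false-accept and false-reject rates coincide. A minimal sketch of a threshold-sweep EER estimate (the score lists below are illustrative, not from CN-Celeb or VoxCeleb):

```python
def eer(target_scores, impostor_scores):
    """Equal error rate: sweep every observed score as a threshold and
    return the error rate where FAR and FRR are closest (no interpolation)."""
    best = None
    for t in sorted(target_scores + impostor_scores):
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if best is None or abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2)
    return best[1]

# Illustrative trial scores (higher = more likely same speaker):
targets = [2.0, 1.8, 1.5, 1.2, 0.4]
impostors = [0.9, 0.6, 0.3, 0.1, -0.2]
print(eer(targets, impostors))
```

A harder dataset such as CN-Celeb shows up as a higher EER for the same i-vector or x-vector system.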
Deep Learning in Video Multi-Object Tracking: A Survey
Title | Deep Learning in Video Multi-Object Tracking: A Survey |
Authors | Gioele Ciaparrone, Francisco Luque Sánchez, Siham Tabik, Luigi Troiano, Roberto Tagliaferri, Francisco Herrera |
Abstract | The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions. |
Tasks | Multi-Object Tracking, Multiple Object Tracking, Object Tracking |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.12740v4 |
https://arxiv.org/pdf/1907.12740v4.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-in-video-multi-object-tracking |
Repo | |
Framework | |
Neural Language Model for Automated Classification of Electronic Medical Records at the Emergency Room. The Significant Benefit of Unsupervised Generative Pre-training
Title | Neural Language Model for Automated Classification of Electronic Medical Records at the Emergency Room. The Significant Benefit of Unsupervised Generative Pre-training |
Authors | Binbin Xu, Cédric Gil-Jardiné, Frantz Thiessard, Eric Tellier, Marta Avalos, Emmanuel Lagarde |
Abstract | In order to build a national injury surveillance system based on emergency room (ER) visits, we are developing a coding system to classify their causes from free-text clinical notes. Supervised learning techniques have shown good results in this area but require a large number of annotated samples. New levels of performance have recently been achieved by neural language models (NLMs) based on the Transformer architecture that incorporate an unsupervised generative pre-training step. Our hypothesis is that methods involving a generative self-supervised pre-training step can significantly reduce the number of annotated samples required for supervised fine-tuning. In this case study, we assessed whether we could predict from free-text clinical notes whether a visit was the consequence of a traumatic or non-traumatic event. Using fully re-trained GPT-2 models (without OpenAI pre-trained weights), we compared two scenarios: Scenario A (26 study cases of different training data sizes) consisted of training the GPT-2 on the trauma/non-trauma labeled clinical notes (up to 161 930). Scenario B (19 study cases) consisted of a first self-supervised pre-training step with unlabeled notes (up to 151 930), followed by a supervised fine-tuning step with labeled notes (up to 10 000). Results showed that Scenario A needed to process >6 000 notes to achieve good performance (AUC>0.95), whereas Scenario B needed only 600 notes, a gain of a factor of 10. At the final case of both scenarios, with 16 times more data (161 930 vs. 10 000), the gain of Scenario A over Scenario B is only an improvement of 0.89% in AUC and 2.12% in F1 score. To conclude, it is possible to adapt a multi-purpose NLM such as GPT-2 into a powerful tool for classifying free-text notes with only a very small number of labeled samples. |
Tasks | Language Modelling |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.01136v4 |
https://arxiv.org/pdf/1909.01136v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-language-model-for-automated |
Repo | |
Framework | |
On Distance and Kernel Measures of Conditional Independence
Title | On Distance and Kernel Measures of Conditional Independence |
Authors | Tianhong Sheng, Bharath K. Sriperumbudur |
Abstract | Measuring conditional independence is one of the important tasks in statistical inference and is fundamental in causal discovery, feature selection, dimensionality reduction, Bayesian network learning, and others. In this work, we explore the connection between conditional independence measures induced by distances on a metric space and those induced by reproducing kernels associated with a reproducing kernel Hilbert space (RKHS). For certain distance and kernel pairs, we show the distance-based conditional independence measures to be equivalent to the kernel-based ones. On the other hand, we also show that some kernel conditional independence measures popular in machine learning, based on the Hilbert-Schmidt norm of a certain cross-conditional covariance operator, do not have a simple distance representation, except in some limiting cases. This paper therefore shows that distance and kernel measures of conditional independence are not quite equivalent, unlike in the case of joint independence, as shown by Sejdinovic et al. (2013). |
Tasks | Causal Discovery, Dimensionality Reduction, Feature Selection |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.01103v1 |
https://arxiv.org/pdf/1912.01103v1.pdf | |
PWC | https://paperswithcode.com/paper/on-distance-and-kernel-measures-of |
Repo | |
Framework | |
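For intuition about the kernel measures the paper discusses, here is a sketch of the simpler *unconditional* analogue: a biased empirical HSIC estimate with Gaussian kernels, in pure Python (the conditional measures in the paper involve a cross-conditional covariance operator and are more involved; the bandwidth and data below are illustrative):

```python
import math

def gaussian_kernel_matrix(xs, sigma=1.0):
    """Pairwise Gaussian kernel values k(a, b) = exp(-(a-b)^2 / (2 sigma^2))."""
    return [[math.exp(-(a - b) ** 2 / (2 * sigma ** 2)) for b in xs] for a in xs]

def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, H = I - (1/n) 11^T."""
    n = len(xs)
    K = gaussian_kernel_matrix(xs, sigma)
    L = gaussian_kernel_matrix(ys, sigma)

    def center(M):  # double-center a kernel matrix: H M H
        row = [sum(r) / n for r in M]
        col = [sum(M[i][j] for i in range(n)) / n for j in range(n)]
        tot = sum(row) / n
        return [[M[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]

    Kc, Lc = center(K), center(L)
    # trace(Kc Lc) / n^2; zero (in expectation) under independence
    return sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n)) / n ** 2

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
# A dependent pair scores higher than a pair with a constant (uninformative) signal:
print(hsic(xs, xs) > hsic(xs, [1.0] * 5))
```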
Facility Location Problem in Differential Privacy Model Revisited
Title | Facility Location Problem in Differential Privacy Model Revisited |
Authors | Yunus Esencayi, Marco Gaboardi, Shi Li, Di Wang |
Abstract | In this paper we study the uncapacitated facility location problem in the model of differential privacy (DP) with uniform facility cost. Specifically, we first show that, under the hierarchically well-separated tree (HST) metrics and the super-set output setting that was introduced by Gupta et al., there is an $\epsilon$-DP algorithm that achieves an $O(\frac{1}{\epsilon})$ (expected multiplicative) approximation ratio; this implies an $O(\frac{\log n}{\epsilon})$ approximation ratio for the general metric case, where $n$ is the size of the input metric. These bounds improve the best-known results of Gupta et al. In particular, our approximation ratio for HST metrics is independent of $n$, and the ratio for general metrics is independent of the aspect ratio of the input metric. On the negative side, we show that the approximation ratio of any $\epsilon$-DP algorithm is lower bounded by $\Omega(\frac{1}{\sqrt{\epsilon}})$, even for instances on HST metrics with uniform facility cost, under the super-set output setting. The lower bound shows that the dependence of the approximation ratio for HST metrics on $\epsilon$ cannot be removed or greatly improved. Our novel methods and techniques for both the upper and lower bound may find additional applications. |
Tasks | |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.12050v1 |
https://arxiv.org/pdf/1910.12050v1.pdf | |
PWC | https://paperswithcode.com/paper/facility-location-problem-in-differential |
Repo | |
Framework | |
Movable-Object-Aware Visual SLAM via Weakly Supervised Semantic Segmentation
Title | Movable-Object-Aware Visual SLAM via Weakly Supervised Semantic Segmentation |
Authors | Ting Sun, Yuxiang Sun, Ming Liu, Dit-Yan Yeung |
Abstract | Moving objects can greatly jeopardize the performance of a visual simultaneous localization and mapping (vSLAM) system that relies on the static-world assumption. Motion removal has proven successful in addressing this problem. Two main streams of solutions are based either on geometric constraints or on deep semantic segmentation neural networks. The former rely on the static-majority assumption, while the latter require labor-intensive pixel-wise annotations. In this paper we propose to adopt a novel weakly-supervised semantic segmentation method. The segmentation mask is obtained from a CNN pre-trained with image-level class labels only. Thus, we leverage the power of deep semantic segmentation CNNs while avoiding the need for expensive annotations for training. We integrate our motion removal approach with the ORB-SLAM2 system. Experimental results on the TUM RGB-D and the KITTI stereo datasets demonstrate our superiority over the state-of-the-art. |
Tasks | Semantic Segmentation, Simultaneous Localization and Mapping, Weakly-Supervised Semantic Segmentation |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03629v2 |
https://arxiv.org/pdf/1906.03629v2.pdf | |
PWC | https://paperswithcode.com/paper/movable-object-aware-visual-slam-via-weakly |
Repo | |
Framework | |
Bayesian optimization with local search
Title | Bayesian optimization with local search |
Authors | Yuzhou Gao, Tengchao Yu, Jinglai Li |
Abstract | Global optimization finds applications in a wide range of real-world problems. The multi-start methods are a popular class of global optimization techniques, based on the idea of conducting local searches from multiple starting points, with the starting points determined sequentially according to some prescribed rules. In this work we propose a new multi-start algorithm in which the starting points are determined within a Bayesian optimization framework. Specifically, the method can be understood as constructing a new function by conducting local searches of the original objective function, where the new function attains the same global optima as the original one. Bayesian optimization is then applied to find the global optima of this new, local-search-based function. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09159v1 |
https://arxiv.org/pdf/1911.09159v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-optimization-with-local-search |
Repo | |
Framework | |
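A toy sketch of the multi-start idea: each start point is mapped to the local optimum a derivative-free descent reaches from it, and an outer loop chooses among start points. For illustration the Bayesian-optimization outer loop is replaced by a coarse grid of start points; the objective and all numbers are illustrative, not from the paper:

```python
def local_search(f, x0, step=0.5, tol=1e-6):
    """Derivative-free hill descent: try +-step, halve the step on failure."""
    x = x0
    while step > tol:
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step /= 2
    return x

def f(x):  # multi-modal toy objective with local minima near x = -2 and x = +2
    return (x * x - 4) ** 2 + x

# The paper's "new function" maps a start point to its local optimum; a Bayesian
# optimizer would pick start points adaptively, here we just use a grid:
starts = [-4.0, -1.0, 0.0, 1.0, 4.0]
best_x = min((local_search(f, s) for s in starts), key=f)
print(best_x)
```

The linear tilt `+ x` makes the basin near x = -2 the global one, which the multi-start loop finds even from starts that fall into the other basin.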
On Node Features for Graph Neural Networks
Title | On Node Features for Graph Neural Networks |
Authors | Chi Thang Duong, Thanh Dat Hoang, Ha The Hien Dang, Quoc Viet Hung Nguyen, Karl Aberer |
Abstract | A graph neural network (GNN) is a deep model for graph representation learning. One advantage of graph neural networks is their ability to incorporate node features into the learning process. However, this prevents graph neural networks from being applied to featureless graphs. In this paper, we first analyze the effects of node features on the performance of graph neural networks. We show that GNNs work well if there is a strong correlation between node features and node labels. Based on these results, we propose new feature initialization methods that allow graph neural networks to be applied to non-attributed graphs. Our experimental results show that the artificial features are highly competitive with real features. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08795v1 |
https://arxiv.org/pdf/1911.08795v1.pdf | |
PWC | https://paperswithcode.com/paper/on-node-features-for-graph-neural-networks |
Repo | |
Framework | |
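Two common artificial feature initializations for a featureless graph are degree features and one-hot identity features; a minimal sketch over an adjacency list (the specific initializations the paper proposes may differ, this only illustrates the idea):

```python
def degree_features(adj):
    """One scalar feature per node: its degree (adj maps node -> neighbor list)."""
    return {v: [float(len(nbrs))] for v, nbrs in adj.items()}

def identity_features(adj):
    """One-hot identity features; only viable for small graphs with fixed node sets."""
    nodes = sorted(adj)
    return {v: [1.0 if u == v else 0.0 for u in nodes] for v in nodes}

# Tiny illustrative graph: a path 1 - 0 - 2 - 3
graph = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
print(degree_features(graph))
```

Either dictionary can be stacked into the node-feature matrix a GNN layer consumes in place of real attributes.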
Improving classification performance by feature space transformations and model selection
Title | Improving classification performance by feature space transformations and model selection |
Authors | Jose Ortiz-Bejar, Eric S. Tellez, Mario Graff |
Abstract | Improving the performance of classifiers is the realm of feature mapping, prototype selection, and kernel function transformations; these techniques aim to reduce the complexity, and also to improve the accuracy, of models. In particular, our objective is to combine them to transform the data's distribution into a more convenient one, such that simple algorithms, such as Naïve Bayes or k-Nearest Neighbors, can produce competitive classifiers. In this paper, we introduce a family of classifiers based on feature mapping and kernel functions, orchestrated by a model selection scheme that excels in performance. We provide an extensive experimental comparison of our methods with sixteen popular classifiers on more than thirty benchmarks supporting our claims. In addition to their competitive performance, our statistical tests also found that our methods differ from one another, supporting our claim of a compelling family of classifiers. |
Tasks | Model Selection |
Published | 2019-07-14 |
URL | https://arxiv.org/abs/1907.06258v3 |
https://arxiv.org/pdf/1907.06258v3.pdf | |
PWC | https://paperswithcode.com/paper/feature-space-transformations-and-model |
Repo | |
Framework | |
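The core trick — map samples into a kernel-similarity space and let a simple classifier work there — can be sketched in a few lines. The prototypes, bandwidth, and data below are illustrative, not the paper's actual pipeline:

```python
import math

def rbf_map(x, prototypes, gamma=1.0):
    """Map a scalar sample to its RBF similarities against chosen prototypes."""
    return [math.exp(-gamma * (x - p) ** 2) for p in prototypes]

def nn_predict(x, train, prototypes):
    """1-nearest-neighbour classification in the mapped feature space."""
    fx = rbf_map(x, prototypes)

    def dist(item):
        fy = rbf_map(item[0], prototypes)
        return sum((a - b) ** 2 for a, b in zip(fx, fy))

    return min(train, key=dist)[1]

train = [(0.0, "a"), (0.2, "a"), (2.0, "b"), (2.3, "b")]
prototypes = [0.0, 2.0]  # illustrative prototype selection
print(nn_predict(0.1, train, prototypes), nn_predict(2.1, train, prototypes))
```

A model selection scheme, as in the paper, would then search over the mapping, kernel parameters, and classifier jointly.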
Software defect prediction with zero-inflated Poisson models
Title | Software defect prediction with zero-inflated Poisson models |
Authors | Daniel Rodriguez, Javier Dolado, Javier Tuya, Dietmar Pfahl |
Abstract | In this work we apply several Poisson and zero-inflated models to software defect prediction. We apply different functions from several R packages such as pscl, MASS, R2Jags and the recent glmmTMB. We test the functions using the Equinox dataset. The results show that zero-inflated models, fitted either by maximum likelihood estimation or with a Bayesian approach, are slightly better than the other models, using the AIC as the selection criterion. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13717v1 |
https://arxiv.org/pdf/1910.13717v1.pdf | |
PWC | https://paperswithcode.com/paper/software-defect-prediction-with-zero-inflated |
Repo | |
Framework | |
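The zero-inflated Poisson distribution underlying these models mixes a point mass at zero with an ordinary Poisson: with probability pi the count is a structural zero, otherwise it is drawn from Poisson(lam). A pure-Python sketch of the pmf (the R packages in the abstract fit lam and pi by MLE or Bayesian methods; the parameters here are illustrative):

```python
import math

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf: pi * 1{k=0} + (1 - pi) * Poisson(lam) pmf."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi + (1 - pi) * poisson if k == 0 else (1 - pi) * poisson

# Excess mass at zero compared with the plain Poisson's exp(-lam):
print(zip_pmf(0, lam=2.0, pi=0.3))
```

This excess zero mass is exactly what defect-count data (many defect-free modules) exhibits, which is why ZIP models edge out plain Poisson fits on AIC.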
Revisiting Classical Bagging with Modern Transfer Learning for On-the-fly Disaster Damage Detector
Title | Revisiting Classical Bagging with Modern Transfer Learning for On-the-fly Disaster Damage Detector |
Authors | Junghoon Seo, Seungwon Lee, Beomsu Kim, Taegyun Jeon |
Abstract | Automatic post-disaster damage detection using aerial imagery is crucial for quick assessment of damage caused by a disaster and for development of a recovery plan. The main problem preventing us from creating an applicable model in practice is that the damaged (positive) examples we are trying to detect are much harder to obtain than undamaged (negative) examples, especially in a short time. In this paper, we revisit the classical bootstrap aggregating approach in the context of modern transfer learning for data-efficient disaster damage detection. Unlike previous classical ensemble learning articles, our work points out the effectiveness of simple bagging in deep transfer learning, which has been underestimated in the context of imbalanced classification. Benchmark results on the AIST Building Change Detection dataset show that our approach significantly outperforms existing methodologies, including the recently proposed disentanglement learning. |
Tasks | Transfer Learning |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01911v1 |
https://arxiv.org/pdf/1910.01911v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-classical-bagging-with-modern |
Repo | |
Framework | |
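The bootstrap-aggregating step itself is simple to sketch: resample the training set with replacement, fit one base learner per resample, and combine by majority vote. The paper's base learners are fine-tuned deep networks; here a 1-D decision stump stands in, and the data is illustrative:

```python
import random

def fit_stump(data):
    """Pick the threshold minimising training error (predict True at or above it)."""
    best = None
    for t, _ in data:
        err = sum((x >= t) != y for x, y in data)
        if best is None or err < best[0]:
            best = (err, t)
    return best[1]

def bagged_predict(x, data, n_models=5, seed=0):
    """Bagging: train each stump on a bootstrap resample, then majority-vote."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_models):
        boot = [rng.choice(data) for _ in data]  # sample with replacement
        votes += x >= fit_stump(boot)
    return votes > n_models // 2

data = [(0.1, False), (0.4, False), (0.6, True), (0.9, True)]
print(bagged_predict(0.95, data), bagged_predict(0.05, data))
```

Resampling perturbs which positives each model sees, which is what makes bagging useful when positives are scarce and classes are imbalanced.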
Decision Procedures for Guarded Logics
Title | Decision Procedures for Guarded Logics |
Authors | Kevin Kappelmann |
Abstract | An important class of decidable first-order logic fragments are those satisfying a guardedness condition, such as the guarded fragment (GF). Usually, decidability for these logics is closely linked to the tree-like model property - the fact that satisfying models can be taken to have tree-like form. Decision procedures for the guarded fragment based on the tree-like model property are difficult to implement. An alternative approach, based on restricting first-order resolution, has been proposed, and this shows more promise from the point of view of implementation. In this work, we connect the tree-like model property of the guarded fragment with the resolution-based approach. We derive efficient resolution-based rewriting algorithms that solve the Quantifier-Free Query Answering Problem under Guarded Tuple Generating Dependencies (GTGDs) and Disjunctive Guarded Tuple Generating Dependencies (DisGTGDs). The Query Answering Problem for these classes subsumes many cases of GF satisfiability. Our algorithms, in addition to making the connection to the tree-like model property clear, give a natural account of the selection and ordering strategies used by resolution procedures for the guarded fragment. We also believe that our rewriting algorithm for the special case of GTGDs may prove itself valuable in practice as it does not require any Skolemisation step and its theoretical runtime outperforms those of known GF resolution procedures in case of fixed dependencies. Moreover, we show a novel normalisation procedure for the widely used chase procedure in case of (disjunctive) GTGDs, which could be useful for future studies. |
Tasks | |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03679v1 |
https://arxiv.org/pdf/1911.03679v1.pdf | |
PWC | https://paperswithcode.com/paper/decision-procedures-for-guarded-logics |
Repo | |
Framework | |
HCFContext: Smartphone Context Inference via Sequential History-based Collaborative Filtering
Title | HCFContext: Smartphone Context Inference via Sequential History-based Collaborative Filtering |
Authors | Vidyasagar Sadhu, Saman Zonouz, Vincent Sritapan, Dario Pompili |
Abstract | Mobile context determination is an important step for many context-aware services such as location-based services, enterprise policy enforcement, building or room occupancy detection for power or HVAC operation, etc. Especially in enterprise scenarios where policies (e.g., attending a confidential meeting only when the user is in “Location X”) are defined based on mobile context, it is paramount to verify the accuracy of the mobile context. To this end, two stochastic models based on the theory of Hidden Markov Models (HMMs) are proposed to obtain mobile context: a personalized model (HPContext) and a collaborative-filtering model (HCFContext). The former predicts the current context using the sequential history of the user’s past context observations; the latter enhances HPContext with collaborative-filtering features, enabling it to predict the primary user’s current context from the context observations of users related to the primary user, e.g., team colleagues at a company, gym friends, family members, etc. Each of the proposed models can also be used to enhance or complement the context obtained from sensors. Furthermore, since privacy is a concern in collaborative filtering, a privacy-preserving method is proposed to derive HCFContext model parameters based on the concepts of homomorphic encryption. Finally, these models are thoroughly validated on a real-life dataset. |
Tasks | |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09631v2 |
http://arxiv.org/pdf/1904.09631v2.pdf | |
PWC | https://paperswithcode.com/paper/hcfcontext-smartphone-context-inference-via |
Repo | |
Framework | |
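The HMM machinery behind such context prediction boils down to the forward recursion: predict the hidden context with the transition matrix, reweight by the observation likelihood, and normalise. A tiny sketch with two contexts (all probabilities here are made up for illustration, not the paper's learned parameters):

```python
def forward_step(prior, transition, emission, obs):
    """One forward-filtering step of an HMM: predict, update, normalise."""
    n = len(prior)
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    updated = [predicted[j] * emission[j][obs] for j in range(n)]
    z = sum(updated)
    return [u / z for u in updated]

states = ["home", "office"]
T = [[0.7, 0.3], [0.2, 0.8]]   # P(next context | current context)
E = [[0.9, 0.1], [0.2, 0.8]]   # P(sensor reading | context); reading 0 = home-like
belief = [0.5, 0.5]
for obs in [0, 0, 1]:          # two home-like readings, then an office-like one
    belief = forward_step(belief, T, E, obs)
print(states[belief.index(max(belief))])
```

HCFContext would additionally fold in related users' observation sequences when forming the belief.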
ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation
Title | ESNet: An Efficient Symmetric Network for Real-time Semantic Segmentation |
Authors | Yu Wang, Quan Zhou, Xiaofu Wu |
Abstract | The recent years have witnessed great advances in semantic segmentation using deep convolutional neural networks (DCNNs). However, the large number of convolutional layers and feature channels makes semantic segmentation a computationally heavy task, which is a disadvantage in resource-limited scenarios. In this paper, we design an efficient symmetric network, called ESNet, to address this problem. The whole network has a nearly symmetric architecture, mainly composed of a series of factorized convolution units (FCUs) and their parallel counterparts (PFCUs). On one hand, the FCU adopts a widely-used 1D factorized convolution in residual layers. On the other hand, the parallel version employs a transform-split-transform-merge strategy in the design of the residual module, where the split branches adopt dilated convolutions with different rates to enlarge the receptive field. Our model has nearly 1.6M parameters and runs at over 62 FPS on a single GTX 1080Ti GPU. The experiments demonstrate that our approach achieves state-of-the-art results in terms of the speed-accuracy trade-off for real-time semantic segmentation on the CityScapes dataset. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09826v1 |
https://arxiv.org/pdf/1906.09826v1.pdf | |
PWC | https://paperswithcode.com/paper/esnet-an-efficient-symmetric-network-for-real |
Repo | |
Framework | |
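The parameter saving from 1D factorization is easy to quantify: replacing one k×k convolution with a k×1 followed by a 1×k cuts the weight count from k² to 2k per channel pair. A quick sketch of the arithmetic (channel width illustrative, biases ignored):

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a single 2-D convolution layer, biases ignored."""
    return k_h * k_w * c_in * c_out

c = 64  # illustrative channel width
standard = conv_params(3, 3, c, c)                       # one 3x3 convolution
factorized = conv_params(3, 1, c, c) + conv_params(1, 3, c, c)  # 3x1 then 1x3
print(standard, factorized, factorized / standard)
```

For 3×3 kernels the factorized pair keeps two thirds of the parameters (and FLOPs scale the same way), which is how ESNet-style designs stay under 2M parameters.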
Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing
Title | Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing |
Authors | Andrea Galassi, Marco Lippi, Paolo Torroni |
Abstract | Attention is an increasingly popular mechanism used in a wide range of neural architectures. Because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures for natural language processing, with a focus on architectures designed to work with vector representation of the textual data. We discuss the dimensions along which proposals differ, the possible uses of attention, and chart the major research activities and open challenges in the area. |
Tasks | |
Published | 2019-02-04 |
URL | http://arxiv.org/abs/1902.02181v1 |
http://arxiv.org/pdf/1902.02181v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-please-a-critical-review-of-neural |
Repo | |
Framework | |
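The common core of the attention architectures this survey unifies is scaled dot-product attention: weights = softmax(QKᵀ/√d), output = weighted sum of values. A minimal pure-Python sketch (a generic formulation, not any specific paper's variant):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
print(attention(Q, K, V))  # output leans toward the value whose key matches the query
```

The dimensions the survey charts (self- vs. cross-attention, compatibility functions, multi-head variants) are all modifications of this template.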