Paper Group ANR 93
Improving Question Answering by Commonsense-Based Pre-Training
Title | Improving Question Answering by Commonsense-Based Pre-Training |
Authors | Wanjun Zhong, Duyu Tang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin |
Abstract | Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge. We believe the main reason is the lack of commonsense connections between concepts. To remedy this, we provide a simple and effective method that leverages an external commonsense knowledge base such as ConceptNet. We pre-train direct and indirect relational functions between concepts, and show that these pre-trained functions can be easily added to existing neural network models. Results show that incorporating the commonsense-based functions improves the baseline on three question answering tasks that require commonsense reasoning. Further analysis shows that our system discovers and leverages useful evidence from an external commonsense knowledge base, which is missing in existing neural network models and helps derive the correct answer. |
Tasks | Question Answering |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.03568v3 |
http://arxiv.org/pdf/1809.03568v3.pdf | |
PWC | https://paperswithcode.com/paper/improving-question-answering-by-commonsense |
Repo | |
Framework | |
LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models
Title | LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models |
Authors | Ajaya Adhikari, D. M. J Tax, Riccardo Satta, Matthias Fath |
Abstract | As machine learning models become more accurate, they typically become more complex and uninterpretable by humans. The black-box character of these models holds back their acceptance in practice, especially in high-risk domains where the consequences of failure could be catastrophic, such as health-care or defense. Providing understandable and useful explanations behind ML models or predictions can increase the trust of the user. Example-based reasoning, which entails leveraging previous experience with analogous tasks to make a decision, is a well-known strategy for problem solving and justification. This work presents a new explanation extraction method called LEAFAGE, for a prediction made by any black-box ML model. The explanation consists of the visualization of similar examples from the training set and the importance of each feature. Moreover, these explanations are contrastive, which aims to take the expectations of the user into account. LEAFAGE is evaluated in terms of fidelity to the underlying black-box model and usefulness to the user. The results showed that LEAFAGE performs overall better than the current state-of-the-art method LIME in terms of fidelity, on ML models with a non-linear decision boundary. A user study was conducted which focused on revealing the differences between example-based and feature importance-based explanations. It showed that example-based explanations performed significantly better than feature importance-based explanations in terms of perceived transparency, information sufficiency, competence and confidence. Counter-intuitively, when the gained knowledge of the participants was tested, it showed that they learned less about the black-box model after seeing a feature importance-based explanation than after seeing no explanation at all. The participants found the feature importance-based explanation vague and hard to generalize to other instances. |
Tasks | Feature Importance |
Published | 2018-12-21 |
URL | https://arxiv.org/abs/1812.09044v3 |
https://arxiv.org/pdf/1812.09044v3.pdf | |
PWC | https://paperswithcode.com/paper/example-and-feature-importance-based |
Repo | |
Framework | |
A matching based clustering algorithm for categorical data
Title | A matching based clustering algorithm for categorical data |
Authors | Ruben A. Gevorgyan, Yenok B. Hakobyan |
Abstract | Cluster analysis is one of the essential tasks in data mining and knowledge discovery. Each type of data poses unique challenges in achieving relatively efficient partitioning of the data into homogeneous groups. While the algorithms for numeric data are relatively well studied in the literature, there are still challenges to address in the case of categorical data. The main issue is the unordered structure of categorical data, which makes the implementation of the standard concepts of clustering algorithms difficult. For instance, the assessment of distance between objects and the selection of representatives are not as straightforward for categorical data as for continuous data. Therefore, this paper presents a new framework for partitioning categorical data, which does not use the distance measure as a key concept. The Matching based clustering algorithm is designed based on the similarity matrix and a framework for updating the latter using the feature importance criteria. The experimental results show this algorithm can serve as an alternative to existing ones and can be an efficient knowledge discovery tool. |
Tasks | Feature Importance |
Published | 2018-12-09 |
URL | http://arxiv.org/abs/1812.03469v1 |
http://arxiv.org/pdf/1812.03469v1.pdf | |
PWC | https://paperswithcode.com/paper/a-matching-based-clustering-algorithm-for |
Repo | |
Framework | |
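To make the idea of a similarity matrix over categorical records concrete, here is a minimal sketch. It uses the plain simple matching coefficient (the fraction of attributes on which two records agree), which is a standard measure and is swapped in here for illustration; it is not the authors' algorithm, and the feature-importance update step mentioned in the abstract is not reproduced. The function names and the toy records are hypothetical.

```python
def simple_matching_similarity(a, b):
    """Fraction of attribute positions on which two categorical records agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("records must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(records):
    """Pairwise simple-matching similarities; no distance measure is involved."""
    n = len(records)
    return [
        [simple_matching_similarity(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy, made-up categorical data: (color, size, shape)
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
matrix = similarity_matrix(records)
for row in matrix:
    print(["%.2f" % value for value in row])
```

A clustering procedure built on such a matrix would group records with high mutual similarity; how the matrix is updated between iterations is specific to the paper and left out of this sketch.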
Machine learning for Internet of Things data analysis: A survey
Title | Machine learning for Internet of Things data analysis: A survey |
Authors | Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Mohammadamin Barekatain, Peyman Adibi, Payam Barnaghi, Amit P. Sheth |
Abstract | Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected devices technology, referred to as Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information. The potential and challenges of machine learning for IoT data analytics will also be discussed. A use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for a more detailed exploration. |
Tasks | |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06305v1 |
http://arxiv.org/pdf/1802.06305v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-for-internet-of-things-data |
Repo | |
Framework | |
Catalog of quasars from the Kilo-Degree Survey Data Release 3
Title | Catalog of quasars from the Kilo-Degree Survey Data Release 3 |
Authors | S. Nakoneczny, M. Bilicki, A. Solarz, A. Pollo, N. Maddox, C. Spiniello, M. Brescia, N. R. Napolitano |
Abstract | We present a catalog of quasars selected from broad-band photometric ugri data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are identified by the random forest (RF) supervised machine learning model, trained on SDSS DR14 spectroscopic data. We first cleaned the input KiDS data from entries with excessively noisy, missing or otherwise problematic measurements. Applying a feature importance analysis, we then tune the algorithm and identify in the KiDS multiband catalog the 17 most useful features for the classification, namely magnitudes, colors, magnitude ratios, and the stellarity index. We used the t-SNE algorithm to map the multi-dimensional photometric data onto 2D planes and compare the coverage of the training and inference sets. We limited the inference set to r<22 to avoid extrapolation beyond the feature space covered by training, as the SDSS spectroscopic sample is considerably shallower than KiDS. This gives 3.4 million objects in the final inference sample, from which the random forest identified 190,000 quasar candidates. Accuracy of 97%, purity of 91%, and completeness of 87%, as derived from a test set extracted from SDSS and not used in the training, are confirmed by comparison with external spectroscopic and photometric QSO catalogs overlapping with the KiDS footprint. The robustness of our results is strengthened by number counts of the quasar candidates in the r band, as well as by their mid-infrared colors available from WISE. An analysis of parallaxes and proper motions of our QSO candidates found also in Gaia DR2 suggests that a probability cut of p(QSO)>0.8 is optimal for purity, whereas p(QSO)>0.7 is preferable for better completeness. Our study presents the first comprehensive quasar selection from deep high-quality KiDS data and will serve as the basis for versatile studies of the QSO population detected by this survey. |
Tasks | Feature Importance |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03084v2 |
http://arxiv.org/pdf/1812.03084v2.pdf | |
PWC | https://paperswithcode.com/paper/catalog-of-quasars-from-the-kilo-degree |
Repo | |
Framework | |
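The trade-off between purity and completeness at different probability cuts, as described in the abstract, can be illustrated with a small sketch. It assumes the usual definitions: purity is the fraction of selected candidates that are true quasars (precision), and completeness is the fraction of all true quasars that are selected (recall). The probabilities and labels below are made up for illustration and are unrelated to the KiDS catalog; the function name is hypothetical.

```python
def purity_completeness(probs, labels, threshold):
    """Purity and completeness of the candidates selected with p > threshold.

    probs  -- classifier probabilities p(QSO) per object
    labels -- 1 for a true quasar, 0 otherwise
    """
    selected = [label for p, label in zip(probs, labels) if p > threshold]
    true_positives = sum(selected)
    total_positives = sum(labels)
    purity = true_positives / len(selected) if selected else float("nan")
    completeness = (
        true_positives / total_positives if total_positives else float("nan")
    )
    return purity, completeness


# Toy, made-up classifier outputs and ground-truth labels
probs = [0.95, 0.85, 0.75, 0.75, 0.60, 0.30, 0.10, 0.90]
labels = [1, 1, 1, 0, 1, 0, 0, 1]

for cut in (0.7, 0.8):
    purity, completeness = purity_completeness(probs, labels, cut)
    print(f"p(QSO) > {cut}: purity={purity:.2f}, completeness={completeness:.2f}")
```

On this toy data, raising the cut removes a contaminant but also drops a true quasar, which mirrors the qualitative behavior the abstract reports for the two cuts.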
Negative results for approximation using single layer and multilayer feedforward neural networks
Title | Negative results for approximation using single layer and multilayer feedforward neural networks |
Authors | J. M. Almira, P. E. Lopez-de-Teruel, D. J. Romero-Lopez, F. Voigtlaender |
Abstract | We prove a negative result for the approximation of functions defined on compact subsets of $\mathbb{R}^d$ (where $d \geq 2$) using single layer feedforward neural networks with arbitrary continuous activation function. In a nutshell, this result claims the existence of target functions which are as difficult to approximate using these neural networks as one may want. We also demonstrate an analogous result (for general $d \in \mathbb{N}$) for neural networks with an arbitrary number of layers, for activation functions which are either rational functions, or continuous splines with finitely many pieces. |
Tasks | |
Published | 2018-10-23 |
URL | https://arxiv.org/abs/1810.10032v3 |
https://arxiv.org/pdf/1810.10032v3.pdf | |
PWC | https://paperswithcode.com/paper/some-negative-results-for-single-layer-and |
Repo | |
Framework | |
Privacy-Protective-GAN for Face De-identification
Title | Privacy-Protective-GAN for Face De-identification |
Authors | Yifan Wu, Fan Yang, Haibin Ling |
Abstract | Face de-identification has become increasingly important as image sources grow explosively and become easily accessible. Advances in face recognition techniques also raise people’s concerns about privacy leakage. The mainstream pipelines of face de-identification are mostly based on the k-same framework, which has been criticized for low effectiveness and poor visual quality. In this paper, we propose a new framework called Privacy-Protective-GAN (PP-GAN) that adapts GAN with novel verificator and regulator modules specially designed for the face de-identification problem to ensure generating de-identified output with retained structure similarity according to a single input. We evaluate the proposed approach in terms of privacy protection, utility preservation, and structure similarity. Our approach not only outperforms existing face de-identification techniques but also provides a practical framework of adapting GAN with priors of domain knowledge. |
Tasks | Face Recognition |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08906v1 |
http://arxiv.org/pdf/1806.08906v1.pdf | |
PWC | https://paperswithcode.com/paper/privacy-protective-gan-for-face-de |
Repo | |
Framework | |
seq2graph: Discovering Dynamic Dependencies from Multivariate Time Series with Multi-level Attention
Title | seq2graph: Discovering Dynamic Dependencies from Multivariate Time Series with Multi-level Attention |
Authors | Xuan-Hong Dang, Syed Yousaf Shah, Petros Zerfos |
Abstract | Discovering temporal lagged and inter-dependencies in multivariate time series data is an important task. However, in many real-world applications, such as commercial cloud management, manufacturing predictive maintenance, and portfolio performance analysis, such dependencies can be non-linear and time-variant, which makes it more challenging to extract such dependencies through traditional methods such as Granger causality or clustering. In this work, we present a novel deep learning model that uses multiple layers of customized gated recurrent units (GRUs) for discovering both time lagged behaviors as well as inter-timeseries dependencies in the form of directed weighted graphs. We introduce a key component of Dual-purpose recurrent neural network that decodes information in the temporal domain to discover lagged dependencies within each time series, and encodes them into a set of vectors which, collected from all component time series, form the informative inputs to discover inter-dependencies. Though the discovery of the two types of dependencies is separated at different hierarchical levels, they are tightly connected and jointly trained in an end-to-end manner. With this joint training, learning of one type of dependency immediately impacts the learning of the other one, leading to overall accurate dependency discovery. We empirically test our model on synthetic time series data in which the exact form of (non-linear) dependencies is known. We also evaluate its performance on two real-world applications: (i) performance monitoring data from a commercial cloud provider, which exhibit highly dynamic, non-linear, and volatile behavior, and (ii) sensor data from a manufacturing plant. We further show how our approach is able to capture these dependency behaviors via intuitive and interpretable dependency graphs and use them to generate highly accurate forecasts. |
Tasks | Time Series |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.04448v1 |
http://arxiv.org/pdf/1812.04448v1.pdf | |
PWC | https://paperswithcode.com/paper/seq2graph-discovering-dynamic-dependencies |
Repo | |
Framework | |
IoU is not submodular
Title | IoU is not submodular |
Authors | Tanguy Kerdoncuff, Rémi Emonet |
Abstract | This short article aims to demonstrate that the Intersection over Union (or Jaccard index) is not a submodular function. This mistake has been made in an article that is cited and used as a foundation in another article. The Intersection over Union is widely used in machine learning as a cost function, especially for imbalanced data and semantic segmentation. |
Tasks | Semantic Segmentation |
Published | 2018-09-03 |
URL | http://arxiv.org/abs/1809.00593v1 |
http://arxiv.org/pdf/1809.00593v1.pdf | |
PWC | https://paperswithcode.com/paper/iou-is-not-submodular |
Repo | |
Framework | |
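The claim in the abstract can be checked numerically on a toy example. The sketch below assumes one particular reading: IoU is treated as a set function of the predicted set S for a fixed ground-truth set G, f(S) = |S ∩ G| / |S ∪ G|. The article itself may use a different formulation (for example, a loss defined over mispredicted elements), so this is only an illustration, and all names are hypothetical. Submodularity requires f(A ∪ {x}) - f(A) ≥ f(B ∪ {x}) - f(B) for every A ⊆ B and x ∉ B; the code searches by brute force for a triple that breaks this inequality, using exact fractions to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations


def iou(pred, truth):
    """Intersection over Union of two finite sets, as an exact fraction."""
    union = pred | truth
    if not union:
        return Fraction(1)  # convention for two empty sets (assumption)
    return Fraction(len(pred & truth), len(union))


def find_submodularity_violation(universe, truth):
    """Return (A, B, x, gain_A, gain_B) with A <= B, x not in B and
    gain_A < gain_B, or None if the diminishing-returns inequality holds."""
    items = sorted(universe)
    subsets = [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]
    for b in subsets:
        for a in subsets:
            if not a <= b:
                continue
            for x in universe - b:
                gain_a = iou(a | {x}, truth) - iou(a, truth)
                gain_b = iou(b | {x}, truth) - iou(b, truth)
                if gain_a < gain_b:
                    return a, b, x, gain_a, gain_b
    return None


universe = frozenset(range(1, 6))   # toy ground set {1, ..., 5}
truth = frozenset({1, 2, 3})        # toy ground-truth set G
violation = find_submodularity_violation(universe, truth)
print(violation)
```

One violation that can be verified by hand under this reading: with G = {1, 2, 3}, A = {1}, B = {1, 4} and x = 5, the gain for A is 1/4 - 1/3 = -1/12 while the gain for B is 1/5 - 1/4 = -1/20, so adding x hurts the smaller set more than the larger one.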
DUNet: A deformable network for retinal vessel segmentation
Title | DUNet: A deformable network for retinal vessel segmentation |
Authors | Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su |
Abstract | Automatic segmentation of retinal vessels in fundus images plays an important role in the diagnosis of some diseases such as diabetes and hypertension. In this paper, we propose Deformable U-Net (DUNet), which exploits the retinal vessels’ local features with a U-shape architecture, in an end-to-end manner for retinal vessel segmentation. Inspired by the recently introduced deformable convolutional networks, we integrate the deformable convolution into the proposed network. The DUNet, with upsampling operators to increase the output resolution, is designed to extract context information and enable precise localization by combining low-level feature maps with high-level ones. Furthermore, DUNet captures the retinal vessels at various shapes and scales by adaptively adjusting the receptive fields according to vessels’ scales and shapes. Three public datasets, DRIVE, STARE and CHASE_DB1, are used to train and test our model. Detailed comparisons of the proposed network with the deformable neural network and U-Net are provided in our study. Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively. Moreover, to show the generalization ability of the DUNet, we used another two retinal vessel data sets, one named WIDE and the other a synthetic data set with diverse styles named SYNTHE, for qualitative and quantitative analysis and comparison with other methods. Results indicate that DUNet outperforms other state-of-the-art methods. |
Tasks | Retinal Vessel Segmentation |
Published | 2018-11-03 |
URL | http://arxiv.org/abs/1811.01206v1 |
http://arxiv.org/pdf/1811.01206v1.pdf | |
PWC | https://paperswithcode.com/paper/dunet-a-deformable-network-for-retinal-vessel |
Repo | |
Framework | |
Performance evaluation and hyperparameter tuning of statistical and machine-learning models using spatial data
Title | Performance evaluation and hyperparameter tuning of statistical and machine-learning models using spatial data |
Authors | Patrick Schratz, Jannes Muenchow, Eugenia Iturritxa, Jakob Richter, Alexander Brenning |
Abstract | Machine-learning algorithms have gained popularity in recent years in the field of ecological modeling due to their promising results in predictive performance of classification problems. While the application of such algorithms has been highly simplified in recent years due to their well-documented integration in commonly used statistical programming languages such as R, there are several practical challenges in the field of ecological modeling related to unbiased performance estimation, optimization of algorithms using hyperparameter tuning and spatial autocorrelation. We address these issues in the comparison of several widely used machine-learning algorithms such as Boosted Regression Trees (BRT), k-Nearest Neighbor (WKNN), Random Forest (RF) and Support Vector Machine (SVM) to traditional parametric algorithms such as logistic regression (GLM) and semi-parametric ones like generalized additive models (GAM). Different nested cross-validation methods, including hyperparameter tuning methods, are used to evaluate model performances with the aim of obtaining bias-reduced performance estimates. As a case study, the spatial distribution of the forest disease Diplodia sapinea in the Basque Country in Spain is investigated using common environmental variables such as temperature, precipitation, soil or lithology as predictors. Results show that GAM and RF (mean AUROC estimates 0.708 and 0.699) outperform all other methods in predictive accuracy. The effect of hyperparameter tuning saturates at around 50 iterations for this data set. The AUROC differences between the bias-reduced (spatial cross-validation) and overoptimistic (non-spatial cross-validation) performance estimates of the GAM and RF are 0.167 (24%) and 0.213 (30%), respectively. It is recommended to also use spatial partitioning for cross-validation hyperparameter tuning of spatial data. |
Tasks | |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.11266v1 |
http://arxiv.org/pdf/1803.11266v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-evaluation-and-hyperparameter |
Repo | |
Framework | |
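The difference between spatial and non-spatial cross-validation comes down to how observations are assigned to folds. The sketch below shows one simple way to build spatially disjoint folds, assuming a square-grid blocking scheme: points are binned into grid cells, and whole cells are assigned to folds, so nearby points never end up on both sides of a train/test split. This grid scheme is a stand-in chosen for brevity and is not necessarily the partitioning used in the paper; the function names, `cell_size` parameter and coordinates are hypothetical.

```python
import math
from collections import defaultdict


def spatial_block_folds(coords, n_folds, cell_size):
    """Assign point indices to folds so that each grid cell stays in one fold."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Deterministic round-robin assignment of whole cells to folds
    for k, cell in enumerate(sorted(cells)):
        folds[k % n_folds].extend(cells[cell])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy, made-up coordinates
coords = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (1.7, 0.4), (0.3, 1.6), (1.2, 1.9)]
folds = spatial_block_folds(coords, n_folds=2, cell_size=1.0)
for train, test in train_test_splits(folds):
    print("train:", train, "test:", test)
```

For nested cross-validation with hyperparameter tuning, the same blocking would be applied again inside each training set, so that the inner tuning loop is also evaluated on spatially separated data, in line with the recommendation in the abstract.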
A Systematic Comparison of Deep Learning Architectures in an Autonomous Vehicle
Title | A Systematic Comparison of Deep Learning Architectures in an Autonomous Vehicle |
Authors | Michael Teti, William Edward Hahn, Shawn Martin, Christopher Teti, Elan Barenholtz |
Abstract | Self-driving technology is advancing rapidly — albeit with significant challenges and limitations. This progress is largely due to recent developments in deep learning algorithms. To date, however, there has been no systematic comparison of how different deep learning architectures perform at such tasks, or an attempt to determine a correlation between classification performance and performance in an actual vehicle, a potentially critical factor in developing self-driving systems. Here, we introduce the first controlled comparison of multiple deep-learning architectures in an end-to-end autonomous driving task across multiple testing conditions. We compared performance, under identical driving conditions, across seven architectures including a fully-connected network, a simple 2 layer CNN, AlexNet, VGG-16, Inception-V3, ResNet, and an LSTM by assessing the number of laps each model was able to successfully complete without crashing while traversing an indoor racetrack. We compared performance across models when the conditions exactly matched those in training as well as when the local environment and track were configured differently and objects that were not included in the training dataset were placed on the track in various positions. In addition, we considered performance using several different data types for training and testing including single grayscale and color frames, and multiple grayscale frames stacked together in sequence. With the exception of a fully-connected network, all models performed reasonably well (around or above 80%) and most very well (~95%) on at least one input type but with considerable variation across models and inputs. Overall, AlexNet, operating on single color frames as input, achieved the best level of performance (100% success rate in phase one and 55% in phase two) while VGG-16 performed well most consistently across image types. |
Tasks | Autonomous Driving |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09386v2 |
http://arxiv.org/pdf/1803.09386v2.pdf | |
PWC | https://paperswithcode.com/paper/a-systematic-comparison-of-deep-learning |
Repo | |
Framework | |
Novel View Synthesis for Large-scale Scene using Adversarial Loss
Title | Novel View Synthesis for Large-scale Scene using Adversarial Loss |
Authors | Xiaochuan Yin, Henglai Wei, Penghong Lin, Xiangwei Wang, Qijun Chen |
Abstract | Novel view synthesis aims to synthesize new images from different viewpoints of given images. Most previous works focus on generating novel views of certain objects with a fixed background. However, for some applications, such as virtual reality or robotic manipulations, large changes in background may occur due to the egomotion of the camera. Generated images of a large-scale environment from novel views may be distorted if the structure of the environment is not considered. In this work, we propose a novel fully convolutional network that can take advantage of the structural information explicitly by incorporating the inverse depth features. The inverse depth features are obtained from CNNs trained with sparse labeled depth values. This framework can easily fuse multiple images from different viewpoints. To fill the missing textures in the generated image, adversarial loss is applied, which can also improve the overall image quality. Our method is evaluated on the KITTI dataset. The results show that our method can generate novel views of large-scale scenes without distortion. The effectiveness of our approach is demonstrated through qualitative and quantitative evaluation. |
Tasks | Novel View Synthesis |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07064v1 |
http://arxiv.org/pdf/1802.07064v1.pdf | |
PWC | https://paperswithcode.com/paper/novel-view-synthesis-for-large-scale-scene |
Repo | |
Framework | |
Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time
Title | Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time |
Authors | Asish Ghoshal, Jean Honorio |
Abstract | MAP perturbation models have emerged as a powerful framework for inference in structured prediction. Such models provide a way to efficiently sample from the Gibbs distribution and facilitate predictions that are robust to random noise. In this paper, we propose a provably polynomial time randomized algorithm for learning the parameters of perturbed MAP predictors. Our approach is based on minimizing a novel Rademacher-based generalization bound on the expected loss of a perturbed MAP predictor, which can be computed in polynomial time. We obtain conditions under which our randomized learning algorithm can guarantee generalization to unseen examples. |
Tasks | Structured Prediction |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08196v1 |
http://arxiv.org/pdf/1805.08196v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-maximum-a-posteriori-perturbation |
Repo | |
Framework | |
Multi-view Models for Political Ideology Detection of News Articles
Title | Multi-view Models for Political Ideology Detection of News Articles |
Authors | Vivek Kulkarni, Junting Ye, Steven Skiena, William Yang Wang |
Abstract | A news article’s title, content and link structure often reveal its political ideology. However, most existing works on automatic political ideology detection only leverage textual cues. Drawing inspiration from recent advances in neural inference, we propose a novel attention-based multi-view model to leverage cues from all of the above views to identify the ideology evinced by a news article. Our model draws on advances in representation learning in natural language processing and network science to capture cues from both textual content and the network structure of news articles. We empirically evaluate our model against a battery of baselines and show that our model outperforms the state of the art by 10 percentage points in F1 score. |
Tasks | Representation Learning |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03485v1 |
http://arxiv.org/pdf/1809.03485v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-models-for-political-ideology |
Repo | |
Framework | |