October 20, 2019

Paper Group ANR 93

Improving Question Answering by Commonsense-Based Pre-Training. LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models. A matching based clustering algorithm for categorical data. Machine learning for Internet of Things data analysis: A survey. Catalog of quasars from the Kilo-Degree Survey Data Release 3. Negative …

Improving Question Answering by Commonsense-Based Pre-Training


Title	Improving Question Answering by Commonsense-Based Pre-Training
Authors	Wanjun Zhong, Duyu Tang, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
Abstract	Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge. We believe the main reason is the lack of commonsense connections between concepts. To remedy this, we provide a simple and effective method that leverages an external commonsense knowledge base such as ConceptNet. We pre-train direct and indirect relational functions between concepts, and show that these pre-trained functions could be easily added to existing neural network models. Results show that incorporating the commonsense-based functions improves the baseline on three question answering tasks that require commonsense reasoning. Further analysis shows that our system discovers and leverages useful evidence from an external commonsense knowledge base, which is missing in existing neural network models and helps derive the correct answer.
Tasks	Question Answering
Published	2018-09-05
URL	http://arxiv.org/abs/1809.03568v3
PDF	http://arxiv.org/pdf/1809.03568v3.pdf
PWC	https://paperswithcode.com/paper/improving-question-answering-by-commonsense
Repo
Framework

LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models


Title	LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models
Authors	Ajaya Adhikari, D. M. J Tax, Riccardo Satta, Matthias Fath
Abstract	As machine learning models become more accurate, they typically become more complex and uninterpretable by humans. The black-box character of these models holds back their acceptance in practice, especially in high-risk domains where the consequences of failure could be catastrophic, such as health-care or defense. Providing understandable and useful explanations behind ML models or predictions can increase the trust of the user. Example-based reasoning, which entails leveraging previous experience with analogous tasks to make a decision, is a well known strategy for problem solving and justification. This work presents a new explanation extraction method called LEAFAGE, for a prediction made by any black-box ML model. The explanation consists of the visualization of similar examples from the training set and the importance of each feature. Moreover, these explanations are contrastive, which aims to take the expectations of the user into account. LEAFAGE is evaluated in terms of fidelity to the underlying black-box model and usefulness to the user. The results showed that LEAFAGE performs overall better than the current state-of-the-art method LIME in terms of fidelity, on ML models with a non-linear decision boundary. A user-study was conducted which focused on revealing the differences between example-based and feature importance-based explanations. It showed that example-based explanations performed significantly better than feature importance-based explanations, in terms of perceived transparency, information sufficiency, competence and confidence. Counter-intuitively, when the gained knowledge of the participants was tested, it showed that they learned less about the black-box model after seeing a feature importance-based explanation than seeing no explanation at all. The participants found feature importance-based explanations vague and hard to generalize to other instances.
Tasks	Feature Importance
Published	2018-12-21
URL	https://arxiv.org/abs/1812.09044v3
PDF	https://arxiv.org/pdf/1812.09044v3.pdf
PWC	https://paperswithcode.com/paper/example-and-feature-importance-based
Repo
Framework

A matching based clustering algorithm for categorical data


Title	A matching based clustering algorithm for categorical data
Authors	Ruben A. Gevorgyan, Yenok B. Hakobyan
Abstract	Cluster analysis is one of the essential tasks in data mining and knowledge discovery. Each type of data poses unique challenges in achieving relatively efficient partitioning of the data into homogeneous groups. While the algorithms for numeric data are relatively well studied in the literature, there are still challenges to address in the case of categorical data. The main issue is the unordered structure of categorical data, which makes the implementation of the standard concepts of clustering algorithms difficult. For instance, the assessment of distance between objects and the selection of representatives are not as straightforward for categorical data as for continuous data. Therefore, this paper presents a new framework for partitioning categorical data, which does not use the distance measure as a key concept. The Matching based clustering algorithm is designed based on the similarity matrix and a framework for updating the latter using the feature importance criteria. The experimental results show this algorithm can serve as an alternative to existing ones and can be an efficient knowledge discovery tool.
Tasks	Feature Importance
Published	2018-12-09
URL	http://arxiv.org/abs/1812.03469v1
PDF	http://arxiv.org/pdf/1812.03469v1.pdf
PWC	https://paperswithcode.com/paper/a-matching-based-clustering-algorithm-for
Repo
Framework
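
The abstract above builds on a similarity matrix over categorical records rather than on a distance measure. As a rough illustration only, the sketch below computes a simple-matching similarity matrix: the fraction of attributes on which two records agree. The function names and the toy records are assumptions made for this example, and the sketch does not reproduce the paper's algorithm or its feature-importance update step.

```python
from typing import List, Sequence


def simple_matching_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Return the fraction of attributes on which two categorical records agree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        raise ValueError("records must have at least one attribute")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def similarity_matrix(records: Sequence[Sequence[str]]) -> List[List[float]]:
    """Build the full pairwise similarity matrix for a list of records."""
    n = len(records)
    return [
        [simple_matching_similarity(records[i], records[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: (color, size, shape)
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
sim = similarity_matrix(records)
for row in sim:
    print([round(v, 3) for v in row])
```

No ordering of the category values is needed here, which is the property that makes matching-style similarities a natural fit for categorical data.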

Machine learning for Internet of Things data analysis: A survey


Title	Machine learning for Internet of Things data analysis: A survey
Authors	Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Mohammadamin Barekatain, Peyman Adibi, Payam Barnaghi, Amit P. Sheth
Abstract	Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected devices technology, referred to as Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information. The potential and challenges of machine learning for IoT data analytics will also be discussed. A use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for a more detailed exploration.
Tasks
Published	2018-02-17
URL	http://arxiv.org/abs/1802.06305v1
PDF	http://arxiv.org/pdf/1802.06305v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-for-internet-of-things-data
Repo
Framework

Catalog of quasars from the Kilo-Degree Survey Data Release 3


Title	Catalog of quasars from the Kilo-Degree Survey Data Release 3
Authors	S. Nakoneczny, M. Bilicki, A. Solarz, A. Pollo, N. Maddox, C. Spiniello, M. Brescia, N. R. Napolitano
Abstract	We present a catalog of quasars selected from broad-band photometric ugri data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are identified by the random forest (RF) supervised machine learning model, trained on SDSS DR14 spectroscopic data. We first cleaned the input KiDS data from entries with excessively noisy, missing or otherwise problematic measurements. Applying a feature importance analysis, we then tune the algorithm and identify in the KiDS multiband catalog the 17 most useful features for the classification, namely magnitudes, colors, magnitude ratios, and the stellarity index. We used the t-SNE algorithm to map the multi-dimensional photometric data onto 2D planes and compare the coverage of the training and inference sets. We limited the inference set to r<22 to avoid extrapolation beyond the feature space covered by training, as the SDSS spectroscopic sample is considerably shallower than KiDS. This gives 3.4 million objects in the final inference sample, from which the random forest identified 190,000 quasar candidates. Accuracy of 97%, purity of 91%, and completeness of 87%, as derived from a test set extracted from SDSS and not used in the training, are confirmed by comparison with external spectroscopic and photometric QSO catalogs overlapping with the KiDS footprint. The robustness of our results is strengthened by number counts of the quasar candidates in the r band, as well as by their mid-infrared colors available from WISE. An analysis of parallaxes and proper motions of our QSO candidates found also in Gaia DR2 suggests that a probability cut of p(QSO)>0.8 is optimal for purity, whereas p(QSO)>0.7 is preferable for better completeness. Our study presents the first comprehensive quasar selection from deep high-quality KiDS data and will serve as the basis for versatile studies of the QSO population detected by this survey.
Tasks	Feature Importance
Published	2018-12-07
URL	http://arxiv.org/abs/1812.03084v2
PDF	http://arxiv.org/pdf/1812.03084v2.pdf
PWC	https://paperswithcode.com/paper/catalog-of-quasars-from-the-kilo-degree
Repo
Framework
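
The abstract above reports accuracy, purity, and completeness and discusses probability cuts such as p(QSO)>0.8. The sketch below shows how these quantities are commonly defined for a binary classifier, with purity corresponding to precision and completeness to recall. The function names, the confusion counts, and the probabilities are made up for illustration and are not taken from the catalog.

```python
from typing import Dict, List


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, purity (precision) and completeness (recall)."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "purity": tp / (tp + fp),        # share of selected objects that are true quasars
        "completeness": tp / (tp + fn),  # share of true quasars that were selected
    }


def select_candidates(probabilities: List[float], threshold: float) -> List[int]:
    """Return indices of objects whose quasar probability exceeds the cut."""
    return [i for i, p in enumerate(probabilities) if p > threshold]


# Hypothetical confusion counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=890)
print(metrics)

# Hypothetical classifier outputs: a stricter cut keeps fewer candidates.
p_qso = [0.95, 0.75, 0.60, 0.85]
strict = select_candidates(p_qso, 0.8)
loose = select_candidates(p_qso, 0.7)
print(strict, loose)
```

Raising the threshold typically trades completeness for purity, which matches the abstract's distinction between the 0.8 and 0.7 cuts.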

Negative results for approximation using single layer and multilayer feedforward neural networks


Title	Negative results for approximation using single layer and multilayer feedforward neural networks
Authors	J. M. Almira, P. E. Lopez-de-Teruel, D. J. Romero-Lopez, F. Voigtlaender
Abstract	We prove a negative result for the approximation of functions defined on compact subsets of $\mathbb{R}^d$ (where $d \geq 2$) using single layer feedforward neural networks with arbitrary continuous activation function. In a nutshell, this result claims the existence of target functions which are as difficult to approximate using these neural networks as one may want. We also demonstrate an analogous result (for general $d \in \mathbb{N}$) for neural networks with an arbitrary number of layers, for activation functions which are either rational functions, or continuous splines with finitely many pieces.
Tasks
Published	2018-10-23
URL	https://arxiv.org/abs/1810.10032v3
PDF	https://arxiv.org/pdf/1810.10032v3.pdf
PWC	https://paperswithcode.com/paper/some-negative-results-for-single-layer-and
Repo
Framework

Privacy-Protective-GAN for Face De-identification


Title	Privacy-Protective-GAN for Face De-identification
Authors	Yifan Wu, Fan Yang, Haibin Ling
Abstract	Face de-identification has become increasingly important as image sources are growing explosively and are easily accessible. The advance of new face recognition techniques also raises people’s concern regarding privacy leakage. The mainstream pipelines of face de-identification are mostly based on the k-same framework, which has been criticized for low effectiveness and poor visual quality. In this paper, we propose a new framework called Privacy-Protective-GAN (PP-GAN) that adapts GAN with novel verificator and regulator modules specially designed for the face de-identification problem to ensure generating de-identified output with retained structure similarity according to a single input. We evaluate the proposed approach in terms of privacy protection, utility preservation, and structure similarity. Our approach not only outperforms existing face de-identification techniques but also provides a practical framework of adapting GAN with priors of domain knowledge.
Tasks	Face Recognition
Published	2018-06-23
URL	http://arxiv.org/abs/1806.08906v1
PDF	http://arxiv.org/pdf/1806.08906v1.pdf
PWC	https://paperswithcode.com/paper/privacy-protective-gan-for-face-de
Repo
Framework

seq2graph: Discovering Dynamic Dependencies from Multivariate Time Series with Multi-level Attention


Title	seq2graph: Discovering Dynamic Dependencies from Multivariate Time Series with Multi-level Attention
Authors	Xuan-Hong Dang, Syed Yousaf Shah, Petros Zerfos
Abstract	Discovering temporal lagged and inter-dependencies in multivariate time series data is an important task. However, in many real-world applications, such as commercial cloud management, manufacturing predictive maintenance, and portfolio performance analysis, such dependencies can be non-linear and time-variant, which makes it more challenging to extract them through traditional methods such as Granger causality or clustering. In this work, we present a novel deep learning model that uses multiple layers of customized gated recurrent units (GRUs) for discovering both time lagged behaviors as well as inter-timeseries dependencies in the form of directed weighted graphs. We introduce a key component of Dual-purpose recurrent neural network that decodes information in the temporal domain to discover lagged dependencies within each time series, and encodes them into a set of vectors which, collected from all component time series, form the informative inputs to discover inter-dependencies. Though the discovery of the two types of dependencies is separated at different hierarchical levels, they are tightly connected and jointly trained in an end-to-end manner. With this joint training, learning of one type of dependency immediately impacts the learning of the other one, leading to overall accurate dependencies discovery. We empirically test our model on synthetic time series data in which the exact form of (non-linear) dependencies is known. We also evaluate its performance on two real-world applications: (i) performance monitoring data from a commercial cloud provider, which exhibit highly dynamic, non-linear, and volatile behavior, and (ii) sensor data from a manufacturing plant. We further show how our approach is able to capture these dependency behaviors via intuitive and interpretable dependency graphs and use them to generate highly accurate forecasts.
Tasks	Time Series
Published	2018-12-07
URL	http://arxiv.org/abs/1812.04448v1
PDF	http://arxiv.org/pdf/1812.04448v1.pdf
PWC	https://paperswithcode.com/paper/seq2graph-discovering-dynamic-dependencies
Repo
Framework

IoU is not submodular


Title	IoU is not submodular
Authors	Tanguy Kerdoncuff, Rémi Emonet
Abstract	This short article aims to demonstrate that the Intersection over Union (or Jaccard index) is not a submodular function. This mistake has been made in an article which is cited and used as a foundation in another article. The Intersection over Union is widely used in machine learning as a cost function, especially for imbalanced data and semantic segmentation.
Tasks	Semantic Segmentation
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00593v1
PDF	http://arxiv.org/pdf/1809.00593v1.pdf
PWC	https://paperswithcode.com/paper/iou-is-not-submodular
Repo
Framework
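
To make the claim in the abstract above concrete, the sketch below checks the diminishing-returns condition on a tiny example. A set function f is submodular if, for all A ⊆ B and x not in B, f(A ∪ {x}) − f(A) ≥ f(B ∪ {x}) − f(B). The example treats IoU as a function of the predicted set with the ground-truth set held fixed; that formulation, the specific sets, and the function name are assumptions of this illustration and may differ from the construction used in the article.

```python
from fractions import Fraction


def iou(pred: frozenset, truth: frozenset) -> Fraction:
    """Intersection over Union (Jaccard index) of two finite sets, as an exact fraction."""
    union = pred | truth
    if not union:
        return Fraction(1)  # convention for two empty sets
    return Fraction(len(pred & truth), len(union))


truth = frozenset({1, 2})  # fixed ground-truth set
a = frozenset({1})         # smaller prediction
b = frozenset({1, 4})      # larger prediction, a is a subset of b
x = 3                      # element added to both, not in b

gain_a = iou(a | {x}, truth) - iou(a, truth)  # 1/3 - 1/2
gain_b = iou(b | {x}, truth) - iou(b, truth)  # 1/4 - 1/3

print("gain on A:", gain_a)
print("gain on B:", gain_b)
print("diminishing returns holds:", gain_a >= gain_b)
```

Adding a false positive hurts the smaller prediction more than the larger one, so the marginal gain increases from A to B and the submodularity inequality fails for this choice of sets.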

DUNet: A deformable network for retinal vessel segmentation


Title	DUNet: A deformable network for retinal vessel segmentation
Authors	Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su
Abstract	Automatic segmentation of retinal vessels in fundus images plays an important role in the diagnosis of some diseases such as diabetes and hypertension. In this paper, we propose Deformable U-Net (DUNet), which exploits the retinal vessels’ local features with a U-shape architecture, in an end-to-end manner for retinal vessel segmentation. Inspired by the recently introduced deformable convolutional networks, we integrate the deformable convolution into the proposed network. The DUNet, with upsampling operators to increase the output resolution, is designed to extract context information and enable precise localization by combining low-level feature maps with high-level ones. Furthermore, DUNet captures the retinal vessels at various shapes and scales by adaptively adjusting the receptive fields according to vessels’ scales and shapes. Three public datasets, DRIVE, STARE and CHASE_DB1, are used to train and test our model. Detailed comparisons between the proposed network, the deformable neural network, and U-Net are provided in our study. Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively. Moreover, to show the generalization ability of the DUNet, we used another two retinal vessel data sets, one named WIDE and the other a synthetic data set with diverse styles named SYNTHE, for qualitative and quantitative analysis and comparison with other methods. Results indicate that DUNet outperforms other state-of-the-art methods.
Tasks	Retinal Vessel Segmentation
Published	2018-11-03
URL	http://arxiv.org/abs/1811.01206v1
PDF	http://arxiv.org/pdf/1811.01206v1.pdf
PWC	https://paperswithcode.com/paper/dunet-a-deformable-network-for-retinal-vessel
Repo
Framework

Performance evaluation and hyperparameter tuning of statistical and machine-learning models using spatial data


Title	Performance evaluation and hyperparameter tuning of statistical and machine-learning models using spatial data
Authors	Patrick Schratz, Jannes Muenchow, Eugenia Iturritxa, Jakob Richter, Alexander Brenning
Abstract	Machine-learning algorithms have gained popularity in recent years in the field of ecological modeling due to their promising results in predictive performance of classification problems. While the application of such algorithms has been highly simplified in recent years due to their well-documented integration in commonly used statistical programming languages such as R, there are several practical challenges in the field of ecological modeling related to unbiased performance estimation, optimization of algorithms using hyperparameter tuning and spatial autocorrelation. We address these issues in the comparison of several widely used machine-learning algorithms such as Boosted Regression Trees (BRT), k-Nearest Neighbor (WKNN), Random Forest (RF) and Support Vector Machine (SVM) to traditional parametric algorithms such as logistic regression (GLM) and semi-parametric ones like generalized additive models (GAM). Different nested cross-validation methods including hyperparameter tuning methods are used to evaluate model performances with the aim of obtaining bias-reduced performance estimates. As a case study, the spatial distribution of forest disease Diplodia sapinea in the Basque Country in Spain is investigated using common environmental variables such as temperature, precipitation, soil or lithology as predictors. Results show that GAM and RF (mean AUROC estimates 0.708 and 0.699) outperform all other methods in predictive accuracy. The effect of hyperparameter tuning saturates at around 50 iterations for this data set. The AUROC differences between the bias-reduced (spatial cross-validation) and overoptimistic (non-spatial cross-validation) performance estimates of the GAM and RF are 0.167 (24%) and 0.213 (30%), respectively. It is recommended to also use spatial partitioning for cross-validation hyperparameter tuning of spatial data.
Tasks
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11266v1
PDF	http://arxiv.org/pdf/1803.11266v1.pdf
PWC	https://paperswithcode.com/paper/performance-evaluation-and-hyperparameter
Repo
Framework
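
The abstract above contrasts spatial and non-spatial cross-validation. One simple way to obtain spatially separated folds is to assign whole grid blocks, rather than individual points, to folds, so that nearby observations do not end up split across training and test sets. The sketch below assumes this grid-block scheme purely for illustration; the function name, block size, and coordinates are hypothetical, and the paper itself may partition the data differently.

```python
from typing import List, Sequence, Tuple


def spatial_block_folds(
    coords: Sequence[Tuple[float, float]], block_size: float, n_folds: int
) -> List[int]:
    """Assign each point to a fold based on the grid block that contains it."""
    blocks = [(int(x // block_size), int(y // block_size)) for x, y in coords]
    # Distribute the distinct blocks over the folds in a fixed, reproducible order.
    block_to_fold = {b: i % n_folds for i, b in enumerate(sorted(set(blocks)))}
    return [block_to_fold[b] for b in blocks]


def train_test_indices(folds: Sequence[int], test_fold: int) -> Tuple[List[int], List[int]]:
    """Split point indices into training and test sets for one fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


# Hypothetical coordinates; points 0-1 and 2-3 are close neighbours.
coords = [(0.1, 0.2), (0.3, 0.4), (1.2, 0.1), (1.4, 0.3), (0.2, 1.5), (1.6, 1.7)]
folds = spatial_block_folds(coords, block_size=1.0, n_folds=2)
train_idx, test_idx = train_test_indices(folds, test_fold=1)
print(folds, train_idx, test_idx)
```

With random (non-spatial) fold assignment, neighbouring points can land on both sides of the split, which is one reason such estimates tend to look overoptimistic when the data are spatially autocorrelated.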

A Systematic Comparison of Deep Learning Architectures in an Autonomous Vehicle


Title	A Systematic Comparison of Deep Learning Architectures in an Autonomous Vehicle
Authors	Michael Teti, William Edward Hahn, Shawn Martin, Christopher Teti, Elan Barenholtz
Abstract	Self-driving technology is advancing rapidly — albeit with significant challenges and limitations. This progress is largely due to recent developments in deep learning algorithms. To date, however, there has been no systematic comparison of how different deep learning architectures perform at such tasks, or an attempt to determine a correlation between classification performance and performance in an actual vehicle, a potentially critical factor in developing self-driving systems. Here, we introduce the first controlled comparison of multiple deep-learning architectures in an end-to-end autonomous driving task across multiple testing conditions. We compared performance, under identical driving conditions, across seven architectures including a fully-connected network, a simple 2 layer CNN, AlexNet, VGG-16, Inception-V3, ResNet, and an LSTM by assessing the number of laps each model was able to successfully complete without crashing while traversing an indoor racetrack. We compared performance across models when the conditions exactly matched those in training as well as when the local environment and track were configured differently and objects that were not included in the training dataset were placed on the track in various positions. In addition, we considered performance using several different data types for training and testing including single grayscale and color frames, and multiple grayscale frames stacked together in sequence. With the exception of a fully-connected network, all models performed reasonably well (around or above 80%) and most very well (~95%) on at least one input type but with considerable variation across models and inputs. Overall, AlexNet, operating on single color frames as input, achieved the best level of performance (100% success rate in phase one and 55% in phase two) while VGG-16 performed well most consistently across image types.
Tasks	Autonomous Driving
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09386v2
PDF	http://arxiv.org/pdf/1803.09386v2.pdf
PWC	https://paperswithcode.com/paper/a-systematic-comparison-of-deep-learning
Repo
Framework

Novel View Synthesis for Large-scale Scene using Adversarial Loss


Title	Novel View Synthesis for Large-scale Scene using Adversarial Loss
Authors	Xiaochuan Yin, Henglai Wei, Penghong Lin, Xiangwei Wang, Qijun Chen
Abstract	Novel view synthesis aims to synthesize new images from different viewpoints of given images. Most previous works focus on generating novel views of certain objects with a fixed background. However, for some applications, such as virtual reality or robotic manipulations, large changes in background may occur due to the egomotion of the camera. Generated images of a large-scale environment from novel views may be distorted if the structure of the environment is not considered. In this work, we propose a novel fully convolutional network that can take advantage of the structural information explicitly by incorporating the inverse depth features. The inverse depth features are obtained from CNNs trained with sparse labeled depth values. This framework can easily fuse multiple images from different viewpoints. To fill the missing textures in the generated image, adversarial loss is applied, which can also improve the overall image quality. Our method is evaluated on the KITTI dataset. The results show that our method can generate novel views of large-scale scenes without distortion. The effectiveness of our approach is demonstrated through qualitative and quantitative evaluation.
Tasks	Novel View Synthesis
Published	2018-02-20
URL	http://arxiv.org/abs/1802.07064v1
PDF	http://arxiv.org/pdf/1802.07064v1.pdf
PWC	https://paperswithcode.com/paper/novel-view-synthesis-for-large-scale-scene
Repo
Framework

Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time


Title	Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time
Authors	Asish Ghoshal, Jean Honorio
Abstract	MAP perturbation models have emerged as a powerful framework for inference in structured prediction. Such models provide a way to efficiently sample from the Gibbs distribution and facilitate predictions that are robust to random noise. In this paper, we propose a provably polynomial time randomized algorithm for learning the parameters of perturbed MAP predictors. Our approach is based on minimizing a novel Rademacher-based generalization bound on the expected loss of a perturbed MAP predictor, which can be computed in polynomial time. We obtain conditions under which our randomized learning algorithm can guarantee generalization to unseen examples.
Tasks	Structured Prediction
Published	2018-05-21
URL	http://arxiv.org/abs/1805.08196v1
PDF	http://arxiv.org/pdf/1805.08196v1.pdf
PWC	https://paperswithcode.com/paper/learning-maximum-a-posteriori-perturbation
Repo
Framework
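
The abstract above notes that MAP perturbation models give a way to sample from the Gibbs distribution. In the simplest unstructured case this is the Gumbel-max trick: adding independent Gumbel noise to every score and taking the argmax yields a sample from the softmax (Gibbs) distribution over those scores. The sketch below illustrates only that basic fact on three configurations; the function names and scores are assumptions for this example, and it does not implement the paper's learning algorithm or its generalization bound.

```python
import math
import random
from typing import List


def gumbel_noise(rng: random.Random) -> float:
    """Draw one standard Gumbel sample via the inverse-CDF transform."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def perturbed_map_sample(scores: List[float], rng: random.Random) -> int:
    """Perturb every score with Gumbel noise and return the argmax index."""
    perturbed = [s + gumbel_noise(rng) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def gibbs_probabilities(scores: List[float]) -> List[float]:
    """Exact Gibbs (softmax) probabilities, computed stably."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


rng = random.Random(0)
scores = [1.0, 0.0, -1.0]  # hypothetical scores of three configurations
n_samples = 20000
counts = [0] * len(scores)
for _ in range(n_samples):
    counts[perturbed_map_sample(scores, rng)] += 1

empirical = [c / n_samples for c in counts]
exact = gibbs_probabilities(scores)
print("empirical:", [round(p, 3) for p in empirical])
print("exact:    ", [round(p, 3) for p in exact])
```

Perturbing every configuration independently is exponentially expensive in structured problems, so this exact scheme is mainly useful as a reference point for understanding perturbation-based approaches.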

Multi-view Models for Political Ideology Detection of News Articles


Title	Multi-view Models for Political Ideology Detection of News Articles
Authors	Vivek Kulkarni, Junting Ye, Steven Skiena, William Yang Wang
Abstract	A news article’s title, content and link structure often reveal its political ideology. However, most existing works on automatic political ideology detection only leverage textual cues. Drawing inspiration from recent advances in neural inference, we propose a novel attention based multi-view model to leverage cues from all of the above views to identify the ideology evinced by a news article. Our model draws on advances in representation learning in natural language processing and network science to capture cues from both textual content and the network structure of news articles. We empirically evaluate our model against a battery of baselines and show that our model outperforms the state of the art by 10 percentage points in F1 score.
Tasks	Representation Learning
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03485v1
PDF	http://arxiv.org/pdf/1809.03485v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-models-for-political-ideology
Repo
Framework