October 17, 2019


Paper Group ANR 894


A Framework in CRM Customer Lifecycle: Identify Downward Trend and Potential Issues Detection


Title	A Framework in CRM Customer Lifecycle: Identify Downward Trend and Potential Issues Detection
Authors	Kun Hu, Zhe Li, Ying Liu, Luyin Cheng, Qi Yang, Yan Li
Abstract	Customer retention is one of the primary goals in the area of customer relationship management. A mass of work exists in which machine learning models or business rules are established to predict churn. However, targeting users at an early stage when they start to show a downward trend is a better strategy. In downward trend prediction, the reasons why customers show a downward trend are of great interest in the industry, as they help the business to understand the pain points that customers suffer and to take early action to prevent them from churning. A commonly used method is to collect feedback from customers by either aggressively reaching out to them or by passively hearing from them. However, it is believed that there are a large number of customers who have unpleasant experiences and never speak out. In the literature, there is limited research work that provides a comprehensive and scientific approach to identify these “silent sufferers”. In this study, we propose a novel two-part framework: developing the downward prediction process and establishing the methodology to identify the reasons why customers are in the downward trend. In the first prediction part, we focus on predicting the downward trend, which is an earlier stage of the customer lifecycle compared to churn. In the second part, we propose an approach to figuring out the cause (of the downward trend) based on a causal inference method and semi-supervised learning. The proposed approach is capable of identifying potential silent sufferers. We take bad shopping experiences as inputs to develop the framework and validate it via a marketing A/B test in the real world. The test readout demonstrates the effectiveness of the framework by driving an 88.5% incremental lift in purchase volume.
Tasks	Causal Inference
Published	2018-02-25
URL	http://arxiv.org/abs/1802.08974v1
PDF	http://arxiv.org/pdf/1802.08974v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-in-crm-customer-lifecycle
Repo
Framework
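
The abstract above reports its A/B test result as an incremental lift. As a minimal sketch, assuming lift means the relative difference between the treatment and control metric (the function name and the numbers are illustrative, not taken from the paper):

```python
def incremental_lift(treatment_metric: float, control_metric: float) -> float:
    """Relative lift of the treatment group over the control group."""
    if control_metric <= 0:
        raise ValueError("control metric must be positive")
    return (treatment_metric - control_metric) / control_metric


# Illustrative purchase volumes only; these are not the paper's data.
print(f"{incremental_lift(150.0, 100.0):.1%}")  # prints 50.0%
```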

Distributed Nonparametric Regression under Communication Constraints


Title	Distributed Nonparametric Regression under Communication Constraints
Authors	Yuancheng Zhu, John Lafferty
Abstract	This paper studies the problem of nonparametric estimation of a smooth function with data distributed across multiple machines. We assume an independent sample from a white noise model is collected at each machine, and an estimator of the underlying true function needs to be constructed at a central machine. We place limits on the number of bits that each machine can use to transmit information to the central machine. Our results give both asymptotic lower bounds and matching upper bounds on the statistical risk under various settings. We identify three regimes, depending on the relationship among the number of machines, the size of the data available at each machine, and the communication budget. When the communication budget is small, the statistical risk depends solely on this communication bottleneck, regardless of the sample size. In the regime where the communication budget is large, the classic minimax risk in the non-distributed estimation setting is recovered. In an intermediate regime, the statistical risk depends on both the sample size and the communication budget.
Tasks
Published	2018-03-04
URL	http://arxiv.org/abs/1803.01302v2
PDF	http://arxiv.org/pdf/1803.01302v2.pdf
PWC	https://paperswithcode.com/paper/distributed-nonparametric-regression-under
Repo
Framework
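
To make the communication constraint in the abstract above concrete, here is a generic sketch in which each machine sends a local estimate quantized to a fixed number of bits and the central machine averages the messages. This is not the estimator analysed in the paper; the uniform quantizer, the value range and the function names are assumptions for illustration.

```python
def quantize(value, bits, low=-1.0, high=1.0):
    """Round value to the nearest of 2**bits evenly spaced levels in [low, high]."""
    if bits < 1:
        raise ValueError("need at least one bit")
    levels = 2 ** bits
    clipped = min(max(value, low), high)
    step = (high - low) / (levels - 1)
    index = round((clipped - low) / step)
    return low + index * step


def central_estimate(local_estimates, bits):
    """Average the quantized messages received from all machines."""
    messages = [quantize(v, bits) for v in local_estimates]
    return sum(messages) / len(messages)


# With a tiny budget the quantization error dominates; with more bits it fades.
print(central_estimate([0.2, 0.4], bits=1))
print(central_estimate([0.2, 0.4], bits=16))
```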

Interpretable Latent Spaces for Learning from Demonstration


Title	Interpretable Latent Spaces for Learning from Demonstration
Authors	Yordan Hristov, Alex Lascarides, Subramanian Ramamoorthy
Abstract	Effective human-robot interaction, such as in robot learning from human demonstration, requires the learning agent to be able to ground abstract concepts (such as those contained within instructions) in a corresponding high-dimensional sensory input stream from the world. Models such as deep neural networks, with high capacity through their large parameter spaces, can be used to compress the high-dimensional sensory data to lower dimensional representations. These low-dimensional representations facilitate symbol grounding, but may not guarantee that the representation would be human-interpretable. We propose a method which utilises the grouping of user-defined symbols and their corresponding sensory observations in order to align the learnt compressed latent representation with the semantic notions contained in the abstract labels. We demonstrate this through experiments with both simulated and real-world object data, showing that such alignment can be achieved in a process of physical symbol grounding.
Tasks
Published	2018-07-17
URL	http://arxiv.org/abs/1807.06583v2
PDF	http://arxiv.org/pdf/1807.06583v2.pdf
PWC	https://paperswithcode.com/paper/interpretable-latent-spaces-for-learning-from
Repo
Framework

RARD II: The 94 Million Related-Article Recommendation Dataset


Title	RARD II: The 94 Million Related-Article Recommendation Dataset
Authors	Joeran Beel, Barry Smyth, Andrew Collins
Abstract	The main contribution of this paper is to introduce and describe a new recommender-systems dataset (RARD II). It is based on data from Mr. DLib, a recommender-system as-a-service in the digital library and reference-management-software domain. As such, RARD II complements datasets from other domains such as books, movies, and music. The dataset encompasses 94m recommendations, delivered in the two years from September 2016 to September 2018. The dataset covers an item-space of 24m unique items. RARD II provides a range of rich recommendation data, beyond conventional ratings. For example, in addition to the usual (implicit) ratings matrices, RARD II includes the original recommendation logs, which provide a unique insight into many aspects of the algorithms that generated the recommendations. The logs enable researchers to conduct various analyses about a real-world recommender system. This includes the evaluation of meta-learning approaches for predicting algorithm performance. In this paper, we summarise the key features of this dataset release, describe how it was generated and discuss some of its unique features. Compared to its predecessor RARD, RARD II contains 64% more recommendations, 187% more features (algorithms, parameters, and statistics), 50% more clicks, 140% more documents, and one additional service partner (JabRef).
Tasks	Meta-Learning, Recommendation Systems
Published	2018-07-18
URL	https://arxiv.org/abs/1807.06918v3
PDF	https://arxiv.org/pdf/1807.06918v3.pdf
PWC	https://paperswithcode.com/paper/rard-ii-the-2nd-related-article
Repo
Framework

Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications


Title	Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications
Authors	Xiao Fu, Kejun Huang, Nicholas D. Sidiropoulos, Wing-Kin Ma
Abstract	Nonnegative matrix factorization (NMF) has become a workhorse for signal and data analytics, triggered by its model parsimony and interpretability. Perhaps a bit surprisingly, the understanding of its model identifiability (the major reason behind the interpretability in many applications such as topic mining and hyperspectral imaging) had been rather limited until recent years. Beginning from the 2010s, the identifiability research of NMF has progressed considerably: many interesting and important results have been discovered by the signal processing (SP) and machine learning (ML) communities. NMF identifiability has a great impact on many aspects in practice, such as ill-posed formulation avoidance and performance-guaranteed algorithm design. On the other hand, there is no tutorial paper that introduces NMF from an identifiability viewpoint. In this paper, we aim at filling this gap by offering a comprehensive and deep tutorial on model identifiability of NMF as well as the connections to algorithms and applications. This tutorial will help researchers and graduate students grasp the essence and insights of NMF, thereby avoiding typical “pitfalls” that are often due to unidentifiable NMF formulations. This paper will also help practitioners pick/design suitable factorization tools for their own problems.
Tasks
Published	2018-03-03
URL	http://arxiv.org/abs/1803.01257v4
PDF	http://arxiv.org/pdf/1803.01257v4.pdf
PWC	https://paperswithcode.com/paper/nonnegative-matrix-factorization-for-signal
Repo
Framework
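
Identifiability in the NMF abstract above concerns whether the factors are unique. A small sketch of why uniqueness can only hold up to trivial ambiguities: rescaling the columns of W and the rows of H by a positive diagonal matrix leaves the product unchanged. The matrices below are made-up illustrative values.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Illustrative nonnegative factors (not from the paper).
W = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]]
H = [[1.0, 2.0], [4.0, 0.5]]

# A positive diagonal scaling and its inverse.
D = [[2.0, 0.0], [0.0, 0.5]]
D_inv = [[0.5, 0.0], [0.0, 2.0]]

X = matmul(W, H)
W2 = matmul(W, D)      # rescaled columns of W, still nonnegative
H2 = matmul(D_inv, H)  # inversely rescaled rows of H, still nonnegative
X2 = matmul(W2, H2)

print(X == X2, W == W2)  # same product, different factors
```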

Reconstruction Loss Minimized FCN for Single Image Dehazing


Title	Reconstruction Loss Minimized FCN for Single Image Dehazing
Authors	Shirsendu Sukanta Halder, Sanchayan Santra, Bhabatosh Chanda
Abstract	Haze and fog reduce the visibility of outdoor scenes as a veil-like semi-transparent layer appears over the objects. As a result, images captured under such conditions lack contrast. Image dehazing methods try to alleviate this problem by recovering a clear version of the image. In this paper, we propose a Fully Convolutional Neural Network based model to recover the clear scene radiance by estimating the environmental illumination and the scene transmittance jointly from a hazy image. The method uses a relaxed haze imaging model to allow for situations with non-uniform illumination. We have trained the network by minimizing a custom-defined loss that measures the error of reconstructing the hazy image in three different ways. Additionally, we use a multilevel approach to determine the scene transmittance and the environmental illumination in order to reduce the dependence of the estimate on image scale. Evaluations show that our model performs well compared to the existing state-of-the-art methods. They also verify the potential of our model in diverse situations and various lighting conditions.
Tasks	Image Dehazing, Single Image Dehazing
Published	2018-11-27
URL	http://arxiv.org/abs/1811.10788v1
PDF	http://arxiv.org/pdf/1811.10788v1.pdf
PWC	https://paperswithcode.com/paper/reconstruction-loss-minimized-fcn-for-single
Repo
Framework

An Improved Text Sentiment Classification Model Using TF-IDF and Next Word Negation


Title	An Improved Text Sentiment Classification Model Using TF-IDF and Next Word Negation
Authors	Bijoyan Das, Sarit Chakraborty
Abstract	With the rapid growth of text sentiment analysis, the demand for automatic classification of electronic documents has increased by leaps and bounds. The paradigm of text classification or text mining has been the subject of many research works in recent times. In this paper we propose a technique for text sentiment classification using term frequency-inverse document frequency (TF-IDF) along with Next Word Negation (NWN). We have also compared the performances of the binary bag-of-words model, the TF-IDF model and the TF-IDF with next word negation (TF-IDF-NWN) model for text classification. Our proposed model is then applied to three different text mining algorithms, and we found that the Linear Support Vector Machine (LSVM) is the most appropriate to work with our proposed model. The achieved results show a significant increase in accuracy compared to earlier methods.
Tasks	Sentiment Analysis, Text Classification
Published	2018-06-17
URL	http://arxiv.org/abs/1806.06407v1
PDF	http://arxiv.org/pdf/1806.06407v1.pdf
PWC	https://paperswithcode.com/paper/an-improved-text-sentiment-classification
Repo
Framework
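
The abstract above combines TF-IDF weighting with Next Word Negation. A minimal sketch of the idea, assuming negation is handled by dropping the negation word and prefixing the following token with `not_`; the negation list, the prefix and the plain `tf * log(N / df)` weighting are assumptions, not details confirmed by the paper.

```python
import math
from collections import Counter

NEGATIONS = {"not", "no", "never"}  # assumed negation words


def negate_next_word(tokens):
    """Drop each negation word and mark the token that follows it."""
    out = []
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        out.append("not_" + tok if negate else tok)
        negate = False
    return out  # a trailing negation with no following word is dropped


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


docs = [negate_next_word(s.split()) for s in ["this is not good", "this is good"]]
print(docs[0])  # ['this', 'is', 'not_good']
print(tf_idf(docs)[0])
```

After negation handling, `good` and `not_good` become distinct features, so a downstream classifier such as a linear SVM can weight them separately.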

Gated Fusion Network for Single Image Dehazing


Title	Gated Fusion Network for Single Image Dehazing
Authors	Wenqi Ren, Lin Ma, Jiawei Zhang, Jinshan Pan, Xiaochun Cao, Wei Liu, Ming-Hsuan Yang
Abstract	In this paper, we propose an efficient algorithm to directly restore a clear image from a hazy input. The proposed algorithm hinges on an end-to-end trainable neural network that consists of an encoder and a decoder. The encoder is exploited to capture the context of the derived input images, while the decoder is employed to estimate the contribution of each input to the final dehazed result using the learned representations attributed to the encoder. The constructed network adopts a novel fusion-based strategy which derives three inputs from an original hazy image by applying White Balance (WB), Contrast Enhancing (CE), and Gamma Correction (GC). We compute pixel-wise confidence maps based on the appearance differences between these different inputs to blend the information of the derived inputs and preserve the regions with pleasant visibility. The final dehazed image is yielded by gating the important features of the derived inputs. To train the network, we introduce a multi-scale approach such that the halo artifacts can be avoided. Extensive experimental results on both synthetic and real-world images demonstrate that the proposed algorithm performs favorably against the state-of-the-art algorithms.
Tasks	Image Dehazing, Single Image Dehazing
Published	2018-03-31
URL	http://arxiv.org/abs/1804.00213v1
PDF	http://arxiv.org/pdf/1804.00213v1.pdf
PWC	https://paperswithcode.com/paper/gated-fusion-network-for-single-image
Repo
Framework
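
The fusion step described in the abstract above blends inputs derived from the hazy image using pixel-wise confidence maps. A toy per-pixel sketch, assuming the confidences are already given (in the paper they come from the network) and showing only gamma correction as one derived input; all values and function names are illustrative.

```python
def gamma_correct(pixel, gamma):
    """Gamma-correct a pixel intensity in [0, 1]."""
    return pixel ** gamma


def blend(derived_pixels, confidences):
    """Confidence-weighted sum of the derived inputs at one pixel."""
    if len(derived_pixels) != len(confidences):
        raise ValueError("one confidence per derived input is required")
    return sum(c * p for p, c in zip(derived_pixels, confidences))


# Three derived intensities for one pixel (stand-ins for WB, CE and GC outputs).
derived = [0.2, 0.4, gamma_correct(0.36, 0.5)]
confidence = [0.5, 0.25, 0.25]  # assumed confidence values for this pixel
print(blend(derived, confidence))
```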

A Cascaded Convolutional Neural Network for Single Image Dehazing


Title	A Cascaded Convolutional Neural Network for Single Image Dehazing
Authors	Chongyi Li, Jichang Guo, Fatih Porikli, Huazhu Fu, Yanwei Pang
Abstract	Images captured in outdoor scenes usually suffer from low contrast and limited visibility due to suspended atmospheric particles, which directly affects the quality of photos. Although numerous image dehazing methods have been proposed, effective hazy image restoration remains a challenging problem. Existing learning-based methods usually predict the medium transmission by Convolutional Neural Networks (CNNs), but ignore the key global atmospheric light. Different from previous learning-based methods, we propose a flexible cascaded CNN for single hazy image restoration, which considers the medium transmission and global atmospheric light jointly by two task-driven subnetworks. Specifically, the medium transmission estimation subnetwork is inspired by the densely connected CNN while the global atmospheric light estimation subnetwork is a light-weight CNN. Besides, these two subnetworks are cascaded by sharing the common features. Finally, with the estimated model parameters, the haze-free image is obtained by the atmospheric scattering model inversion, which achieves more accurate and effective restoration performance. Qualitative and quantitative experimental results on synthetic and real-world hazy images demonstrate that the proposed method effectively removes haze from such images, and outperforms several state-of-the-art dehazing methods.
Tasks	Image Dehazing, Image Restoration, Single Image Dehazing
Published	2018-03-21
URL	http://arxiv.org/abs/1803.07955v1
PDF	http://arxiv.org/pdf/1803.07955v1.pdf
PWC	https://paperswithcode.com/paper/a-cascaded-convolutional-neural-network-for
Repo
Framework
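
The abstract above recovers the haze-free image by inverting the atmospheric scattering model. A per-pixel sketch of that inversion, assuming the commonly used form I = J·t + A·(1 − t), where t is the medium transmission and A the global atmospheric light; the lower bound `t_min` on the transmission is an added assumption to avoid division by near-zero values.

```python
def add_haze(clear, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t)."""
    return clear * transmission + airlight * (1.0 - transmission)


def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A."""
    return (hazy - airlight) / max(transmission, t_min) + airlight


# Round trip on one illustrative pixel intensity.
hazy = add_haze(clear=0.2, transmission=0.5, airlight=0.9)
print(hazy, remove_haze(hazy, transmission=0.5, airlight=0.9))
```

In a learned pipeline the transmission and atmospheric light passed to `remove_haze` would be the network's estimates rather than known values.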

Constructionist Steps Towards an Autonomously Empathetic System


Title	Constructionist Steps Towards an Autonomously Empathetic System
Authors	Trevor Buteau, Damian Lyons
Abstract	Prior efforts to create an autonomous computer system capable of predicting what a human being is thinking or feeling from facial expression data have been largely based on outdated, inaccurate models of how emotions work that rely on many scientifically questionable assumptions. In our research, we are creating an empathetic system that incorporates the latest provable scientific understanding of emotions: that they are constructs of the human mind, rather than universal expressions of distinct internal states. Thus, our system uses a user-dependent method of analysis and relies heavily on contextual information to make predictions about what subjects are experiencing. Our system’s accuracy and therefore usefulness are built on provable ground truths that prohibit the drawing of inaccurate conclusions that other systems could too easily make.
Tasks
Published	2018-08-02
URL	http://arxiv.org/abs/1808.00981v1
PDF	http://arxiv.org/pdf/1808.00981v1.pdf
PWC	https://paperswithcode.com/paper/constructionist-steps-towards-an-autonomously
Repo
Framework

C2MSNet: A Novel approach for single image haze removal


Title	C2MSNet: A Novel approach for single image haze removal
Authors	Akshay Dudhane, Subrahmanyam Murala
Abstract	Degradation of image quality due to the presence of haze is a very common phenomenon. Existing methods such as DehazeNet [3] and MSCNN [11] tackled the drawbacks of hand-crafted haze-relevant features. However, these methods have the problem of color distortion in gloomy (poor illumination) environments. In this paper, a cardinal (red, green and blue) color fusion network for single image haze removal is proposed. In the first stage, the network fuses color information present in hazy images and generates multi-channel depth maps. The second stage estimates the scene transmission map from the generated dark channels using a multi-channel multi-scale convolutional neural network (McMs-CNN) to recover the original scene. To train the proposed network, we have used two standard datasets, namely ImageNet [5] and D-HAZY [1]. Performance evaluation of the proposed approach has been carried out using the structural similarity index (SSIM), mean square error (MSE) and peak signal-to-noise ratio (PSNR). Performance analysis shows that the proposed approach outperforms the existing state-of-the-art methods for single image dehazing.
Tasks	Image Dehazing, Single Image Dehazing, Single Image Haze Removal
Published	2018-01-25
URL	http://arxiv.org/abs/1801.08406v1
PDF	http://arxiv.org/pdf/1801.08406v1.pdf
PWC	https://paperswithcode.com/paper/c2msnet-a-novel-approach-for-single-image
Repo
Framework
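
Two of the evaluation metrics named in the abstract above, MSE and PSNR, are simple enough to sketch directly. The version below works on flat lists of pixel values and assumes an 8-bit peak value of 255 by default; SSIM is omitted because it needs windowed statistics.

```python
import math


def mse(reference, estimate):
    """Mean square error between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Illustrative pixel values only.
print(mse([10, 20, 30], [12, 18, 30]), psnr([10, 20, 30], [12, 18, 30]))
```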

Generation of Synthetic Electronic Medical Record Text


Title	Generation of Synthetic Electronic Medical Record Text
Authors	Jiaqi Guan, Runzhe Li, Sheng Yu, Xuegong Zhang
Abstract	Machine learning (ML) and Natural Language Processing (NLP) have achieved remarkable success in many fields and have brought new opportunities and high expectations in the analysis of medical data. The most common type of medical data is the massive free-text electronic medical records (EMR). It is widely regarded that mining such massive data can bring up important information for improving medical practices as well as for possible new discoveries on complex diseases. However, the free EMR texts lack consistent standards, are rich in private information, and are limited in availability. Also, as they are accumulated from everyday practices, it is often hard to have a balanced number of samples for the types of diseases under study. These problems hinder the development of ML and NLP methods for EMR data analysis. To tackle these problems, we developed a model to generate synthetic text of EMRs called Medical Text Generative Adversarial Network or mtGAN. It is based on the GAN framework and is trained by the REINFORCE algorithm. It takes disease features as inputs and generates synthetic texts as EMRs for the corresponding diseases. We evaluate the model at the micro level, macro level and application level on a Chinese EMR text dataset. The results show that the method has a good capacity to fit real data and can generate realistic and diverse EMR samples. This provides a novel way to avoid potential leakage of patient privacy while still supplying sufficient well-controlled cohort data for developing downstream ML and NLP methods. It can also be used as a data augmentation method to assist studies based on real EMR data.
Tasks	Data Augmentation
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02793v1
PDF	http://arxiv.org/pdf/1812.02793v1.pdf
PWC	https://paperswithcode.com/paper/generation-of-synthetic-electronic-medical
Repo
Framework

Learning from Web Data: the Benefit of Unsupervised Object Localization


Title	Learning from Web Data: the Benefit of Unsupervised Object Localization
Authors	Xiaoxiao Sun, Liang Zheng, Yu-Kun Lai, Jufeng Yang
Abstract	Annotating a large number of training images is very time-consuming. Against this background, this paper focuses on learning from easy-to-acquire web data and utilizes the learned model for fine-grained image classification in labeled datasets. Currently, the performance gain from training with web data is incremental, as the common saying goes: “better than nothing, but not by much”. Conventionally, the community looks to correcting the noisy web labels to select informative samples. In this work, we first systematically study the built-in gap between the web and standard datasets, i.e. different data distributions between the two kinds of data. Then, in addition to using web labels, we present an unsupervised object localization method, which provides critical insights into the object density and scale in web images. Specifically, we design two constraints on web data to substantially reduce the difference of data distributions for the web and standard datasets. First, we present a method to control the scale, localization and number of objects in the detected region. Second, we propose to select the regions containing objects that are consistent with the web tag. Based on the two constraints, we are able to process web images to reduce the gap, and the processed web data is used to better assist the standard dataset to train CNNs. Experiments on several fine-grained image classification datasets confirm that our method performs favorably against the state-of-the-art methods.
Tasks	Fine-Grained Image Classification, Image Classification, Object Localization, Unsupervised Object Localization
Published	2018-12-21
URL	http://arxiv.org/abs/1812.09232v1
PDF	http://arxiv.org/pdf/1812.09232v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-web-data-the-benefit-of
Repo
Framework

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification


Title	Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Authors	Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang
Abstract	Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine. Compared with MRC on a single passage, multi-passage MRC is more challenging, since we are likely to get multiple confusing answer candidates from different passages. To address this problem, we propose an end-to-end neural model that enables those answer candidates from different passages to verify each other based on their content representations. Specifically, we jointly train three modules that can predict the final answer based on three factors: the answer boundary, the answer content and the cross-passage answer verification. The experimental results show that our method outperforms the baseline by a large margin and achieves state-of-the-art performance on the English MS-MARCO dataset and the Chinese DuReader dataset, both of which are designed for MRC in real-world settings.
Tasks	Machine Reading Comprehension, Question Answering, Reading Comprehension
Published	2018-05-06
URL	http://arxiv.org/abs/1805.02220v2
PDF	http://arxiv.org/pdf/1805.02220v2.pdf
PWC	https://paperswithcode.com/paper/multi-passage-machine-reading-comprehension
Repo
Framework

A machine learning model for identifying cyclic alternating patterns in the sleeping brain


Title	A machine learning model for identifying cyclic alternating patterns in the sleeping brain
Authors	Aditya Chindhade, Abhijeet Alshi, Aakash Bhatia, Kedar Dabhadkar, Pranav Sivadas Menon
Abstract	Electroencephalography (EEG) is a method to record the electrical signals in the brain. Recognizing the EEG patterns in the sleeping brain gives insights into the understanding of sleeping disorders. The dataset under consideration contains EEG data points associated with various physiological conditions. This study attempts to generalize the detection of particular patterns associated with the Non-Rapid Eye Movement (NREM) sleep cycle of the brain using a machine learning model. The proposed model uses additional feature engineering to incorporate sequential information for training a classifier to predict the occurrence of Cyclic Alternating Pattern (CAP) sequences in the sleep cycle, which are often associated with sleep disorders.
Tasks	EEG, Feature Engineering
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08750v1
PDF	http://arxiv.org/pdf/1804.08750v1.pdf
PWC	https://paperswithcode.com/paper/a-machine-learning-model-for-identifying
Repo
Framework
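
The abstract above mentions feature engineering that incorporates sequential information but does not say which features are used. One common choice, shown here purely as an assumption, is to append lagged values to each sample so that a standard classifier sees a short history of the signal.

```python
def add_context_features(samples, lags=2):
    """For each time step, return [current, previous, ..., lags-th previous]."""
    if lags < 1:
        raise ValueError("lags must be at least 1")
    rows = []
    for i in range(lags, len(samples)):
        history = [samples[i - k] for k in range(1, lags + 1)]
        rows.append([samples[i]] + history)
    return rows


# Illustrative signal values, not EEG data from the paper.
print(add_context_features([1, 2, 3, 4], lags=2))  # [[3, 2, 1], [4, 3, 2]]
```

The first `lags` samples are skipped because they lack a full history; padding them instead would be an equally reasonable design choice.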