October 18, 2019

Paper Group ANR 564

EENMF: An End-to-End Neural Matching Framework for E-Commerce Sponsored Search


Title	EENMF: An End-to-End Neural Matching Framework for E-Commerce Sponsored Search
Authors	Wenjin Wu, Guojun Liu, Hui Ye, Chenshuang Zhang, Tianshu Wu, Daorui Xiao, Wei Lin, Xiaoyu Zhu
Abstract	E-commerce sponsored search contributes an important share of an e-commerce company's revenue. For reasons of effectiveness and efficiency, a large-scale sponsored search system commonly adopts a multi-stage architecture. We name these stages ad retrieval, ad pre-ranking and ad ranking. Ad retrieval and ad pre-ranking are collectively referred to as ad matching in this paper. We propose an end-to-end neural matching framework (EENMF) to model two tasks: vector-based ad retrieval and neural-network-based ad pre-ranking. Under the deep matching framework, vector-based ad retrieval harnesses the user's recent behavior sequence to retrieve relevant ad candidates without the constraint of keyword bidding. Simultaneously, the deep model is employed to perform the global pre-ranking of ad candidates from multiple retrieval paths effectively and efficiently. In addition, the proposed model optimizes the pointwise cross-entropy loss, which is consistent with the objective of the prediction models in the ranking stage. We conduct extensive evaluation to validate the performance of the proposed framework. On the real traffic of a large-scale e-commerce sponsored search system, the proposed approach significantly outperforms the baseline.
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01190v4
PDF	http://arxiv.org/pdf/1812.01190v4.pdf
PWC	https://paperswithcode.com/paper/eenmf-an-end-to-end-neural-matching-framework
Repo
Framework

Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning


Title	Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning
Authors	Zecheng He, Aswin Raghavan, Guangyuan Hu, Sek Chai, Ruby Lee
Abstract	Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of the anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and have unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection. In order to address this problem, we propose Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model using real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives and short (<360ms) latency.
Tasks	Anomaly Detection
Published	2018-06-18
URL	https://arxiv.org/abs/1806.06496v3
PDF	https://arxiv.org/pdf/1806.06496v3.pdf
PWC	https://paperswithcode.com/paper/detecting-zero-day-controller-hijacking
Repo
Framework
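The abstract above describes flagging anomalies by testing whether the distribution of reconstruction errors deviates from normal behavior. A minimal sketch of that idea follows. The abstract does not name the statistical test, so the two-sample Kolmogorov-Smirnov statistic used here is an assumption chosen for illustration, and the function names, threshold, and sample error values are all hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, observed):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    ref = sorted(reference)
    obs = sorted(observed)
    gap = 0.0
    for x in ref + obs:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_obs = bisect_right(obs, x) / len(obs)
        gap = max(gap, abs(cdf_ref - cdf_obs))
    return gap


def is_anomalous(reference_errors, window_errors, threshold=0.5):
    """Flag a window whose error distribution drifts from the reference.
    The threshold is a hypothetical value, not one from the paper."""
    return ks_statistic(reference_errors, window_errors) > threshold


# Hypothetical reconstruction errors from a model trained on normal readings.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.08, 0.12]
quiet_window = [0.11, 0.09, 0.12, 0.10]
attack_window = [0.45, 0.52, 0.48, 0.60]

print(is_anomalous(normal_errors, quiet_window))
print(is_anomalous(normal_errors, attack_window))
```

Comparing whole distributions rather than single error values is what lets a test of this kind tolerate occasional large errors in otherwise normal windows.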

Saliency Map Estimation for Omni-Directional Image Considering Prior Distributions


Title	Saliency Map Estimation for Omni-Directional Image Considering Prior Distributions
Authors	Tatsuya Suzuki, Takao Yamanaka
Abstract	In recent years, deep learning techniques have been applied to the estimation of saliency maps, which represent probability density functions of fixations when people look at images. Although methods of saliency-map estimation have been actively studied for 2-dimensional planar images, methods for omni-directional images to be utilized in virtual environments had not been studied until a competition of saliency-map estimation for omni-directional images was held at ICME2017. In this paper, novel methods for estimating saliency maps for omni-directional images are proposed, considering the properties of prior distributions for fixations in planar images and omni-directional images.
Tasks
Published	2018-07-17
URL	http://arxiv.org/abs/1807.06329v1
PDF	http://arxiv.org/pdf/1807.06329v1.pdf
PWC	https://paperswithcode.com/paper/saliency-map-estimation-for-omni-directional
Repo
Framework

Controlling Decoding for More Abstractive Summaries with Copy-Based Networks


Title	Controlling Decoding for More Abstractive Summaries with Copy-Based Networks
Authors	Noah Weber, Leena Shekhar, Niranjan Balasubramanian, Kyunghyun Cho
Abstract	Attention-based neural abstractive summarization systems equipped with copy mechanisms have shown promising results. Despite this success, it has been noticed that such a system generates a summary by mostly, if not entirely, copying over phrases, sentences, and sometimes multiple consecutive sentences from an input paragraph, effectively performing extractive summarization. In this paper, we verify this behavior using the latest neural abstractive summarization system - a pointer-generator network. We propose a simple baseline method that allows us to control the amount of copying without retraining. Experiments indicate that the method provides a strong baseline for abstractive systems looking to obtain high ROUGE scores while minimizing overlap with the source article, substantially reducing the n-gram overlap with the original article while keeping within 2 points of the original model’s ROUGE score.
Tasks	Abstractive Text Summarization
Published	2018-03-19
URL	http://arxiv.org/abs/1803.07038v2
PDF	http://arxiv.org/pdf/1803.07038v2.pdf
PWC	https://paperswithcode.com/paper/controlling-decoding-for-more-abstractive
Repo
Framework

PACT: Parameterized Clipping Activation for Quantized Neural Networks


Title	PACT: Parameterized Clipping Activation for Quantized Neural Networks
Authors	Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan
Abstract	Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed - but most of these techniques focused on quantizing weights, which are relatively smaller in size compared to activations. This paper proposes a novel quantization scheme for activations during training - that enables neural networks to work well with ultra low precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inferencing performance due to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories.
Tasks	Quantization
Published	2018-05-16
URL	http://arxiv.org/abs/1805.06085v2
PDF	http://arxiv.org/pdf/1805.06085v2.pdf
PWC	https://paperswithcode.com/paper/pact-parameterized-clipping-activation-for
Repo
Framework
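The PACT abstract above describes an activation clipping parameter $\alpha$ that sets the quantization scale. A minimal sketch of the forward computation follows, assuming the commonly stated form: clip to $[0, \alpha]$, then round onto $2^k - 1$ uniform steps. The exact formulas are not given in the abstract, so treat them, the function names, and the sample values as assumptions.

```python
def pact_clip(x, alpha):
    """Clip an activation to the range [0, alpha]."""
    return min(max(x, 0.0), alpha)


def pact_quantize(x, alpha, bits):
    """Clip to [0, alpha], then snap to one of 2**bits - 1 uniform steps.
    Note: Python's round() rounds exact halves to the nearest even integer."""
    levels = 2 ** bits - 1
    scale = levels / alpha
    return round(pact_clip(x, alpha) * scale) / scale


def grad_wrt_alpha(x, alpha):
    """Gradient of the clipped output with respect to alpha, assuming a
    straight-through estimator through the rounding step."""
    return 1.0 if x >= alpha else 0.0


# Hypothetical values: alpha = 6.0 and 2-bit activations give steps 0, 2, 4, 6.
for x in (-1.0, 2.9, 7.0):
    print(x, pact_quantize(x, alpha=6.0, bits=2), grad_wrt_alpha(x, alpha=6.0))
```

Only activations at or above the clipping level contribute gradient to $\alpha$ in this sketch, which is what would let training trade clipping error against quantization step size.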


Simple Attention-Based Representation Learning for Ranking Short Social Media Posts


Title	Simple Attention-Based Representation Learning for Ranking Short Social Media Posts
Authors	Peng Shi, Jinfeng Rao, Jimmy Lin
Abstract	This paper explores the problem of ranking short social media posts with respect to user queries using neural networks. Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic “soft” matches between query and post tokens. Extensive experiments on datasets from the TREC Microblog Tracks show that our simple models not only achieve better effectiveness than existing approaches that are far more complex or exploit a more diverse set of relevance signals, but are also much faster. Implementations of our samCNN (Simple Attention-based Matching CNN) models are shared with the community to support future work.
Tasks	Representation Learning
Published	2018-11-02
URL	https://arxiv.org/abs/1811.01013v2
PDF	https://arxiv.org/pdf/1811.01013v2.pdf
PWC	https://paperswithcode.com/paper/simple-attention-based-representation
Repo
Framework

A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control


Title	A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control
Authors	Yuan Jin, Mark Carman, Ye Zhu, Yong Xiang
Abstract	Online crowdsourcing provides a scalable and inexpensive means to collect knowledge (e.g. labels) about various types of data items (e.g. text, audio, video). However, it is also known to result in large variance in the quality of recorded responses which often cannot be directly used for training machine learning systems. To resolve this issue, a lot of work has been conducted to control the response quality such that low-quality responses cannot adversely affect the performance of the machine learning systems. Such work is referred to as the quality control for crowdsourcing. Past quality control research can be divided into two major branches: quality control mechanism design and statistical models. The first branch focuses on designing measures, thresholds, interfaces and workflows for payment, gamification, question assignment and other mechanisms that influence workers’ behaviour. The second branch focuses on developing statistical models to perform effective aggregation of responses to infer correct responses. The two branches are connected as statistical models (i) provide parameter estimates to support the measure and threshold calculation, and (ii) encode modelling assumptions used to derive (theoretical) performance guarantees for the mechanisms. There are surveys regarding each branch but they lack technical details about the other branch. Our survey is the first to bridge the two branches by providing technical details on how they work together under frameworks that systematically unify crowdsourcing aspects modelled by both of them to determine the response quality. We are also the first to provide taxonomies of quality control papers based on the proposed frameworks. Finally, we specify the current limitations and the corresponding future directions for the quality control research.
Tasks
Published	2018-12-05
URL	http://arxiv.org/abs/1812.02736v1
PDF	http://arxiv.org/pdf/1812.02736v1.pdf
PWC	https://paperswithcode.com/paper/a-technical-survey-on-statistical-modelling
Repo
Framework

Scaling associative classification for very large datasets


Title	Scaling associative classification for very large datasets
Authors	Luca Venturini, Elena Baralis, Paolo Garza
Abstract	Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of large-domain categorical features are a difficult challenge for any classifier. Most off-the-shelf solutions cannot cope with this problem. In this work we introduce DAC, a Distributed Associative Classifier. DAC exploits ensemble learning to distribute the training of an associative classifier among parallel workers and improve the final quality of the model. Furthermore, it adopts several novel techniques to reach high scalability without sacrificing quality, among which a preventive pruning of classification rules in the extraction phase based on Gini impurity. We ran experiments on Apache Spark, on a real large-scale dataset with more than 4 billion records and 800 million distinct categories. The results showed that DAC improves on a state-of-the-art solution in both prediction quality and execution time. Since the generated model is human-readable, it can not only classify new records, but also allow understanding both the logic behind the prediction and the properties of the model, becoming a useful aid for decision makers.
Tasks
Published	2018-05-10
URL	http://arxiv.org/abs/1805.03887v1
PDF	http://arxiv.org/pdf/1805.03887v1.pdf
PWC	https://paperswithcode.com/paper/scaling-associative-classification-for-very
Repo
Framework
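The DAC abstract above mentions pruning classification rules based on Gini impurity. A minimal sketch of the impurity computation follows. The abstract does not specify the pruning criterion, so the threshold rule in `keep_rule` is a hypothetical stand-in, as are the function names and the sample counts.

```python
def gini_impurity(class_counts):
    """Gini impurity of a class distribution: 1 - sum(p_i ** 2)."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def keep_rule(class_counts, max_impurity=0.3):
    """Hypothetical pruning rule: keep a candidate classification rule only
    if the records it covers are pure enough. The threshold is illustrative."""
    return gini_impurity(class_counts) <= max_impurity


# Class counts among the records matched by three hypothetical rules.
print(gini_impurity([5, 5]), keep_rule([5, 5]))
print(gini_impurity([10, 0]), keep_rule([10, 0]))
print(keep_rule([9, 1]))
```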

Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations


Title	Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations
Authors	Chen Liang, Jianbo Ye, Han Zhao, Bart Pursel, C. Lee Giles
Abstract	Strict partial order is a mathematical structure commonly seen in relational data. One obstacle to extracting this type of relation at scale is the lack of large-scale labels for building effective data-driven solutions. We develop an active learning framework for mining such relations subject to a strict order. Our approach incorporates relational reasoning not only in finding new unlabeled pairs whose labels can be deduced from an existing label set, but also in devising new query strategies that consider the relational structure of labels. Our experiments on concept prerequisite relations show that our proposed framework can substantially improve classification performance with the same query budget compared to other baseline approaches.
Tasks	Active Learning, Relational Reasoning
Published	2018-01-19
URL	http://arxiv.org/abs/1801.06481v1
PDF	http://arxiv.org/pdf/1801.06481v1.pdf
PWC	https://paperswithcode.com/paper/active-learning-of-strict-partial-orders-a
Repo
Framework
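The abstract above relies on deducing labels for unlabeled pairs from an existing label set. A minimal sketch of that deduction follows, using only the defining properties of a strict partial order (transitivity, asymmetry, irreflexivity). It covers deductions from known positive pairs only, and the function names and concept names are hypothetical.

```python
def transitive_closure(pairs):
    """Return all pairs implied by transitivity: (a, b) and (b, c) give (a, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def deduce_label(known_positive, a, b):
    """Return True or False when the label of (a, b) follows from the known
    positive pairs, or None when an oracle query is still needed."""
    closure = transitive_closure(known_positive)
    if (a, b) in closure:
        return True   # follows by transitivity
    if a == b or (b, a) in closure:
        return False  # ruled out by irreflexivity or asymmetry
    return None


# Hypothetical labels: (x, y) means "x is a prerequisite of y".
known = {("algebra", "calculus"), ("calculus", "differential equations")}
print(deduce_label(known, "algebra", "differential equations"))
print(deduce_label(known, "differential equations", "algebra"))
print(deduce_label(known, "algebra", "statistics"))
```

Every pair resolved this way is a query the learner does not have to spend from its budget.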

Model-Based Clustering and Classification of Functional Data


Title	Model-Based Clustering and Classification of Functional Data
Authors	Faicel Chamroukhi, Hien D. Nguyen
Abstract	The problem of complex data analysis is a central topic of modern statistical science and learning systems and is becoming of broader interest with the increasing prevalence of high-dimensional data. The challenge is to develop statistical models and autonomous algorithms that are able to acquire knowledge from raw data for exploratory analysis, which can be achieved through clustering techniques, or to make predictions of future data via classification (i.e., discriminant analysis) techniques. Latent data models, including mixture model-based approaches, are among the most popular and successful approaches in both the unsupervised context (i.e., clustering) and the supervised one (i.e., classification or discrimination). Although traditionally tools of multivariate analysis, they are growing in popularity when considered in the framework of functional data analysis (FDA). FDA is the data analysis paradigm in which the individual data units are functions (e.g., curves, surfaces), rather than simple vectors. In many areas of application, the analyzed data are indeed often available in the form of discretized values of functions or curves (e.g., time series, waveforms) and surfaces (e.g., 2d-images, spatio-temporal data). This functional aspect of the data adds difficulties compared to the case of a classical multivariate (non-functional) data analysis. We review and present approaches for model-based clustering and classification of functional data. We derive well-established statistical models along with efficient algorithmic tools to address problems regarding the clustering and the classification of these high-dimensional data, including their heterogeneity, missing information, and dynamical hidden structure. The presented models and algorithms are illustrated on real-world functional data analysis problems from several application areas.
Tasks	Time Series
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00276v2
PDF	http://arxiv.org/pdf/1803.00276v2.pdf
PWC	https://paperswithcode.com/paper/model-based-clustering-and-classification-of
Repo
Framework

Forward-Backward Reinforcement Learning


Title	Forward-Backward Reinforcement Learning
Authors	Ashley D. Edwards, Laura Downs, James C. Davidson
Abstract	Goals for reinforcement learning problems are typically defined through hand-specified rewards. To design such problems, developers of learning algorithms must inherently be aware of what the task goals are, yet we often require agents to discover them on their own without any supervision beyond these sparse rewards. While much of the power of reinforcement learning derives from the concept that agents can learn with little guidance, this requirement greatly burdens the training process. If we relax this one restriction and endow the agent with knowledge of the reward function, and in particular of the goal, we can leverage backwards induction to accelerate training. To achieve this, we propose training a model to learn to take imagined reversal steps from known goal states. Rather than training an agent exclusively to determine how to reach a goal while moving forwards in time, our approach travels backwards to jointly predict how we got there. We evaluate our work in Gridworld and Towers of Hanoi and empirically demonstrate that it yields better performance than standard DDQN.
Tasks
Published	2018-03-27
URL	http://arxiv.org/abs/1803.10227v1
PDF	http://arxiv.org/pdf/1803.10227v1.pdf
PWC	https://paperswithcode.com/paper/forward-backward-reinforcement-learning
Repo
Framework

Subset selection in sparse matrices


Title	Subset selection in sparse matrices
Authors	Alberto Del Pia, Santanu S. Dey, Robert Weismantel
Abstract	In subset selection we search for the best linear predictor that involves a small subset of variables. From a computational complexity viewpoint, subset selection is NP-hard and few classes are known to be solvable in polynomial time. Using mainly tools from discrete geometry, we show that some sparsity conditions on the original data matrix allow us to solve the problem in polynomial time.
Tasks
Published	2018-10-05
URL	https://arxiv.org/abs/1810.02757v2
PDF	https://arxiv.org/pdf/1810.02757v2.pdf
PWC	https://paperswithcode.com/paper/subset-selection-in-sparse-matrices
Repo
Framework

Comparative Study of ECO and CFNet Trackers in Noisy Environment


Title	Comparative Study of ECO and CFNet Trackers in Noisy Environment
Authors	Mustansar Fiaz, Sajid Javed, Arif Mahmood, Soon Ki Jung
Abstract	Object tracking is one of the most challenging tasks and has attracted significant attention from computer vision researchers in the past two decades. Recent deep learning based trackers have shown good performance on various tracking challenges. A tracking method should track objects in sequential frames accurately under challenges such as deformation, low resolution, occlusion, scale and light variations. Most trackers achieve good performance on specific challenges rather than on all tracking problems, hence there is a lack of general-purpose tracking algorithms that can perform well in all conditions. Moreover, the performance of tracking techniques has not been evaluated in noisy environments. Visual object tracking has real-world applications, and there is a good chance that noise may be added during image acquisition in surveillance cameras. We aim to study the robustness of two state-of-the-art trackers, Efficient Convolutional Operators (ECO) and Correlation Filter Network (CFNet), in the presence of noise. Our study demonstrates that the performance of these trackers degrades as the noise level increases, which demonstrates the need to design more robust tracking algorithms.
Tasks	Object Tracking, Visual Object Tracking
Published	2018-01-29
URL	http://arxiv.org/abs/1801.09360v1
PDF	http://arxiv.org/pdf/1801.09360v1.pdf
PWC	https://paperswithcode.com/paper/comparative-study-of-eco-and-cfnet-trackers
Repo
Framework

Automatic Stroke Lesions Segmentation in Diffusion-Weighted MRI


Title	Automatic Stroke Lesions Segmentation in Diffusion-Weighted MRI
Authors	Noranart Vesdapunt, Nongluk Covavisaruch
Abstract	Diffusion-Weighted Magnetic Resonance Imaging (DWI) is widely used for early detection of cerebral infarcts caused by ischemic stroke. Manual segmentation by a radiologist is a common clinical process; nonetheless, cerebral infarct segmentation is challenging because of low resolution and uncertain boundaries. Many segmentation techniques have been proposed and validated against manual segmentation as the gold standard. In order to reduce human error in research operation and clinical process, we adopt a semi-automatic segmentation as the gold standard, using Fluid-Attenuated Inversion-Recovery (FLAIR) Magnetic Resonance Images (MRI) from the same patient under a controlled environment. Extensive testing is performed on popular segmentation algorithms including the Otsu method, Fuzzy C-means, Hill-climbing based segmentation, and Growcut. The selected segmentation techniques are validated on accuracy, sensitivity, and specificity using leave-one-out cross-validation, first to determine the feasibility of each technique and then to maximize accuracy on the training set. Our experimental results demonstrate the effectiveness of the selected methods.
Tasks
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10385v1
PDF	http://arxiv.org/pdf/1803.10385v1.pdf
PWC	https://paperswithcode.com/paper/automatic-stroke-lesions-segmentation-in
Repo
Framework
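Among the algorithms the abstract above lists, the Otsu method is the simplest to show in a few lines: it picks the intensity threshold that maximizes the between-class variance of the two resulting pixel groups. A minimal sketch of that standard algorithm follows. It is not the authors' pipeline, and the pixel intensities are invented for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance, where
    pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # intensity sum of the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical intensities: dark background and a bright lesion-like region.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]
print(t, mask)
```

When several thresholds tie, this sketch keeps the lowest one, because it updates only on a strictly larger variance.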

Resource Constrained Deep Reinforcement Learning


Title	Resource Constrained Deep Reinforcement Learning
Authors	Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar
Abstract	In urban environments, supply resources have to be constantly matched to the “right” locations (where customer demand is present) so as to improve quality of life. For instance, ambulances have to be matched to base stations regularly so as to reduce response time for emergency incidents in EMS (Emergency Management Systems); vehicles (cars, bikes, scooters etc.) have to be matched to docking stations so as to reduce lost demand in shared mobility systems. Such problem domains are challenging owing to the demand uncertainty, combinatorial action spaces (due to allocation) and constraints on allocation of resources (e.g., total resources, minimum and maximum number of resources at locations and regions). Existing systems typically employ myopic and greedy optimization approaches to optimize allocation of supply resources to locations. Such approaches typically are unable to handle surges or variances in demand patterns well. Recent research has demonstrated the ability of Deep RL methods in adapting well to highly uncertain environments. However, existing Deep RL methods are unable to handle combinatorial action spaces and constraints on allocation of resources. To that end, we have developed three approaches on top of the well known actor critic approach, DDPG (Deep Deterministic Policy Gradient) that are able to handle constraints on resource allocation. More importantly, we demonstrate that they are able to outperform leading approaches on simulators validated on semi-real and real data sets.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00600v1
PDF	http://arxiv.org/pdf/1812.00600v1.pdf
PWC	https://paperswithcode.com/paper/resource-constrained-deep-reinforcement
Repo
Framework