Paper Group ANR 564
EENMF: An End-to-End Neural Matching Framework for E-Commerce Sponsored Search
Title | EENMF: An End-to-End Neural Matching Framework for E-Commerce Sponsored Search |
Authors | Wenjin Wu, Guojun Liu, Hui Ye, Chenshuang Zhang, Tianshu Wu, Daorui Xiao, Wei Lin, Xiaoyu Zhu |
Abstract | E-commerce sponsored search contributes an important part of an e-commerce company's revenue. In consideration of effectiveness and efficiency, a large-scale sponsored search system commonly adopts a multi-stage architecture; we refer to these stages as ad retrieval, ad pre-ranking, and ad ranking, and collectively refer to ad retrieval and ad pre-ranking as ad matching in this paper. We propose an end-to-end neural matching framework (EENMF) to model two tasks: vector-based ad retrieval and neural-network-based ad pre-ranking. Under this deep matching framework, vector-based ad retrieval harnesses the user's recent behavior sequence to retrieve relevant ad candidates without the constraint of keyword bidding. Simultaneously, the deep model performs global pre-ranking of ad candidates from multiple retrieval paths both effectively and efficiently. In addition, the proposed model optimizes a pointwise cross-entropy loss, which is consistent with the objective of the prediction models in the ranking stage. We conduct extensive evaluations to validate the performance of the proposed framework. On the real traffic of a large-scale e-commerce sponsored search, the proposed approach significantly outperforms the baseline. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01190v4 |
http://arxiv.org/pdf/1812.01190v4.pdf | |
PWC | https://paperswithcode.com/paper/eenmf-an-end-to-end-neural-matching-framework |
Repo | |
Framework | |
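To make the matching setup above concrete, here is a minimal two-tower sketch in PyTorch: a user vector built from the recent behavior sequence is scored against an ad vector, and the model is trained with a pointwise cross-entropy (logistic) loss as the abstract describes. The architecture details (shared embeddings, mean pooling, linear projections) are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical two-tower matching sketch; not the paper's exact architecture.
import torch
import torch.nn as nn

class TwoTowerMatcher(nn.Module):
    def __init__(self, num_items, dim=64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)   # shared by behaviors and ads
        self.user_proj = nn.Linear(dim, dim)
        self.ad_proj = nn.Linear(dim, dim)

    def user_vector(self, behavior_seq):               # (batch, seq_len) item ids
        h = self.item_emb(behavior_seq).mean(dim=1)    # pool recent behaviors
        return self.user_proj(h)

    def forward(self, behavior_seq, ad_ids):
        u = self.user_vector(behavior_seq)             # (batch, dim)
        a = self.ad_proj(self.item_emb(ad_ids))        # (batch, dim)
        return (u * a).sum(dim=1)                      # dot-product relevance logit

model = TwoTowerMatcher(num_items=10000)
loss_fn = nn.BCEWithLogitsLoss()                       # pointwise cross-entropy
behaviors = torch.randint(0, 10000, (32, 20))
ads = torch.randint(0, 10000, (32,))
clicks = torch.randint(0, 2, (32,)).float()            # 1 = clicked, 0 = not
loss = loss_fn(model(behaviors, ads), clicks)
loss.backward()
```

At serving time, the retrieval stage would index the precomputed ad vectors in an approximate nearest-neighbor structure and probe it with the user vector, which is what frees retrieval from the keyword-bidding constraint.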
Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning
Title | Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning |
Authors | Zecheng He, Aswin Raghavan, Guangyuan Hu, Sek Chai, Ruby Lee |
Abstract | Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and exhibit unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection. To address this problem, we propose the Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily on these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model on real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives, and short (<360ms) latency. |
Tasks | Anomaly Detection |
Published | 2018-06-18 |
URL | https://arxiv.org/abs/1806.06496v3 |
https://arxiv.org/pdf/1806.06496v3.pdf | |
PWC | https://paperswithcode.com/paper/detecting-zero-day-controller-hijacking |
Repo | |
Framework | |
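The RED idea lends itself to a short sketch: collect per-window reconstruction errors from a model trained on normal HPC readings, then flag deviations of the live error distribution with a two-sample statistical test. The stand-in "model" and the Kolmogorov-Smirnov test below are illustrative assumptions; the paper specifies its own temporal model and test.

```python
# Hedged sketch of the RED idea: compare reconstruction-error distributions.
import numpy as np
from scipy.stats import ks_2samp

def reconstruction_errors(model, windows):
    """Per-window mean squared reconstruction error."""
    recon = model(windows)                      # any autoencoder-style model
    return ((windows - recon) ** 2).mean(axis=(1, 2))

# Stand-in model: always "reconstructs" the training mean, so normal windows
# get small errors and shifted windows get large ones.
model = lambda x: np.full_like(x, 1.0)

normal_hpc = np.random.normal(1.0, 0.1, (500, 16, 8))   # (windows, time, counters)
test_hpc = np.random.normal(1.4, 0.3, (100, 16, 8))     # shifted = "anomalous"

baseline_err = reconstruction_errors(model, normal_hpc)  # from legitimate runs
test_err = reconstruction_errors(model, test_hpc)        # from live readings

stat, p_value = ks_2samp(baseline_err, test_err)          # two-sample KS test
print("anomalous" if p_value < 0.01 else "normal", f"(KS p={p_value:.3g})")
```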
Saliency Map Estimation for Omni-Directional Image Considering Prior Distributions
Title | Saliency Map Estimation for Omni-Directional Image Considering Prior Distributions |
Authors | Tatsuya Suzuki, Takao Yamanaka |
Abstract | In recent years, deep learning techniques have been applied to the estimation of saliency maps, which represent probability density functions of fixations when people look at images. Although methods of saliency-map estimation have been actively studied for 2-dimensional planar images, methods for omni-directional images to be utilized in virtual environments had not been studied until a competition on saliency-map estimation for omni-directional images was held at ICME2017. In this paper, novel methods for estimating saliency maps for omni-directional images are proposed, considering the properties of the prior distributions of fixations in planar images and omni-directional images. |
Tasks | |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06329v1 |
http://arxiv.org/pdf/1807.06329v1.pdf | |
PWC | https://paperswithcode.com/paper/saliency-map-estimation-for-omni-directional |
Repo | |
Framework | |
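A simple way to picture "considering prior distributions" is to modulate a raw saliency map with a latitude prior, since fixations in equirectangular omni-directional images tend to concentrate near the equator. The Gaussian equator prior and pointwise fusion below are illustrative assumptions, not necessarily the authors' exact formulation.

```python
# Hedged sketch: fuse a raw saliency map with an equator-bias prior for an
# equirectangular omni-directional image.
import numpy as np

def equator_prior(height, width, sigma=0.2):
    """Gaussian prior over latitude, peaked at the equator (center row)."""
    lat = np.linspace(-0.5, 0.5, height)                 # normalized latitude
    prior = np.exp(-lat ** 2 / (2 * sigma ** 2))
    return np.tile(prior[:, None], (1, width))

def apply_prior(raw_saliency, prior):
    fused = raw_saliency * prior                         # pointwise modulation
    return fused / fused.sum()                           # renormalize to a pdf

raw = np.random.rand(256, 512)                           # stand-in network output
fused = apply_prior(raw, equator_prior(256, 512))
assert abs(fused.sum() - 1.0) < 1e-9                     # still a valid density
```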
Controlling Decoding for More Abstractive Summaries with Copy-Based Networks
Title | Controlling Decoding for More Abstractive Summaries with Copy-Based Networks |
Authors | Noah Weber, Leena Shekhar, Niranjan Balasubramanian, Kyunghyun Cho |
Abstract | Attention-based neural abstractive summarization systems equipped with copy mechanisms have shown promising results. Despite this success, it has been noticed that such a system generates a summary by mostly, if not entirely, copying over phrases, sentences, and sometimes multiple consecutive sentences from the input paragraph, effectively performing extractive summarization. In this paper, we verify this behavior using a state-of-the-art neural abstractive summarization system, the pointer-generator network. We propose a simple baseline method that allows us to control the amount of copying without retraining. Experiments indicate that the method provides a strong baseline for abstractive systems looking to obtain high ROUGE scores while minimizing overlap with the source article, substantially reducing the n-gram overlap with the original article while staying within 2 points of the original model's ROUGE score. |
Tasks | Abstractive Text Summarization |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.07038v2 |
http://arxiv.org/pdf/1803.07038v2.pdf | |
PWC | https://paperswithcode.com/paper/controlling-decoding-for-more-abstractive |
Repo | |
Framework | |
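A pointer-generator emits the mixture p(w) = p_gen * P_vocab(w) + (1 - p_gen) * P_copy(w), so one decode-time way to curb copying without retraining is to re-weight the generation gate p_gen. The odds-boost rule below is a hypothetical illustration of such control, not the paper's exact method.

```python
# Hedged sketch: rescale the pointer-generator gate at decode time to reduce copying.
import numpy as np

def mixture(p_vocab, p_copy, p_gen):
    """Standard pointer-generator output distribution."""
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy

def controlled_mixture(p_vocab, p_copy, p_gen, boost=2.0):
    """Re-weight the generator gate to discourage copying (illustrative rule)."""
    odds = (p_gen / (1.0 - p_gen)) * boost        # scale the generate/copy odds
    p_gen_new = odds / (1.0 + odds)
    return mixture(p_vocab, p_copy, p_gen_new)

# Toy distributions over the same 3 symbols (in reality the copy distribution
# lives over source positions).
vocab = np.array([0.1, 0.2, 0.7])
copy = np.array([0.8, 0.1, 0.1])
print(mixture(vocab, copy, p_gen=0.3))            # copy-heavy output
print(controlled_mixture(vocab, copy, p_gen=0.3)) # shifted toward generation
```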
PACT: Parameterized Clipping Activation for Quantized Neural Networks
Title | PACT: Parameterized Clipping Activation for Quantized Neural Networks |
Authors | Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan |
Abstract | Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed, but most of these techniques have focused on quantizing weights, which are relatively small in size compared to activations. This paper proposes a novel quantization scheme for activations during training that enables neural networks to work well with ultra-low-precision weights and activations without any significant accuracy degradation. The technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy than published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4 bits of precision while still achieving accuracy comparable to full-precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inference performance, owing to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories. |
Tasks | Quantization |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06085v2 |
http://arxiv.org/pdf/1805.06085v2.pdf | |
PWC | https://paperswithcode.com/paper/pact-parameterized-clipping-activation-for |
Repo | |
Framework | |
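The clipping-plus-quantization idea is compact enough to sketch: activations are clipped to a learnable level $\alpha$ (written in a form that keeps a gradient path to $\alpha$), linearly quantized to k bits, and trained with a straight-through estimator for the rounding step. A minimal PyTorch sketch, with initialization and bit-width as illustrative choices:

```python
import torch
import torch.nn as nn

class PACT(nn.Module):
    def __init__(self, k_bits=4, alpha_init=6.0):
        super().__init__()
        self.k = k_bits
        self.alpha = nn.Parameter(torch.tensor(alpha_init))  # learned clip level

    def forward(self, x):
        # Equals clamp(x, 0, alpha), but this form keeps a gradient path to alpha:
        # d y / d alpha = 1 where x >= alpha and 0 elsewhere.
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)
        scale = (2 ** self.k - 1) / self.alpha
        y_q = torch.round(y * scale) / scale          # k-bit uniform quantization
        return y + (y_q - y).detach()                 # straight-through estimator

act = PACT(k_bits=4, alpha_init=6.0)
x = torch.linspace(-2.0, 8.0, steps=8, requires_grad=True)
act(x).sum().backward()                               # gradients flow to x and alpha
print(act.alpha.grad)                                 # nonzero: some inputs exceed alpha
```

Because $\alpha$ receives gradient only from inputs above the clip level, training can shrink it to a tight quantization range without starving the quantizer of resolution.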
Simple Attention-Based Representation Learning for Ranking Short Social Media Posts
Title | Simple Attention-Based Representation Learning for Ranking Short Social Media Posts |
Authors | Peng Shi, Jinfeng Rao, Jimmy Lin |
Abstract | This paper explores the problem of ranking short social media posts with respect to user queries using neural networks. Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic “soft” matches between query and post tokens. Extensive experiments on datasets from the TREC Microblog Tracks show that our simple models not only achieve better effectiveness than existing approaches that are far more complex or exploit a more diverse set of relevance signals, but are also much faster. Implementations of our samCNN (Simple Attention-based Matching CNN) models are shared with the community to support future work. |
Tasks | Representation Learning |
Published | 2018-11-02 |
URL | https://arxiv.org/abs/1811.01013v2 |
https://arxiv.org/pdf/1811.01013v2.pdf | |
PWC | https://paperswithcode.com/paper/simple-attention-based-representation |
Repo | |
Framework | |
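The attention-based "soft" matching at the heart of such models can be sketched in a few lines: each query token attends over the post tokens, and the aggregated alignment quality is the relevance score. The pooling and scoring choices here are illustrative; the actual samCNN adds convolutional feature extractors on top.

```python
# Hedged sketch of attention-based soft matching between query and post tokens.
import torch
import torch.nn.functional as F

def soft_match_score(query_emb, post_emb):
    """query_emb: (q_len, dim), post_emb: (p_len, dim) -> scalar relevance."""
    sim = query_emb @ post_emb.T                      # (q_len, p_len) similarities
    attn = F.softmax(sim, dim=1)                      # each query token attends to post
    matched = attn @ post_emb                         # soft-aligned post vectors
    per_token = F.cosine_similarity(query_emb, matched, dim=1)
    return per_token.mean()                           # aggregate match quality

q = torch.randn(5, 50)                                # 5 query tokens, 50-d embeddings
p = torch.randn(30, 50)                               # 30 post tokens
print(soft_match_score(q, p))
```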
A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control
Title | A Technical Survey on Statistical Modelling and Design Methods for Crowdsourcing Quality Control |
Authors | Yuan Jin, Mark Carman, Ye Zhu, Yong Xiang |
Abstract | Online crowdsourcing provides a scalable and inexpensive means to collect knowledge (e.g. labels) about various types of data items (e.g. text, audio, video). However, it is also known to result in large variance in the quality of recorded responses, which often cannot be directly used for training machine learning systems. To resolve this issue, much work has been conducted to control response quality so that low-quality responses cannot adversely affect the performance of machine learning systems. Such work is referred to as quality control for crowdsourcing. Past quality control research can be divided into two major branches: quality control mechanism design and statistical models. The first branch focuses on designing measures, thresholds, interfaces and workflows for payment, gamification, question assignment and other mechanisms that influence workers' behaviour. The second branch focuses on developing statistical models to perform effective aggregation of responses to infer correct responses. The two branches are connected, as statistical models (i) provide parameter estimates to support the measure and threshold calculation, and (ii) encode modelling assumptions used to derive (theoretical) performance guarantees for the mechanisms. There are surveys regarding each branch, but they lack technical details about the other branch. Our survey is the first to bridge the two branches by providing technical details on how they work together under frameworks that systematically unify the crowdsourcing aspects modelled by both of them to determine response quality. We are also the first to provide taxonomies of quality control papers based on the proposed frameworks. Finally, we specify the current limitations and corresponding future directions for quality control research. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.02736v1 |
http://arxiv.org/pdf/1812.02736v1.pdf | |
PWC | https://paperswithcode.com/paper/a-technical-survey-on-statistical-modelling |
Repo | |
Framework | |
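As a toy illustration of the statistical-model branch (aggregating responses to infer correct answers), the sketch below contrasts plain majority voting with one round of accuracy-weighted voting; the models the survey covers, such as confusion-matrix-based approaches, are considerably richer.

```python
# Hedged sketch: majority vote vs. one round of accuracy-weighted aggregation.
import numpy as np
from collections import Counter

responses = {                                   # item -> {worker: label}
    "q1": {"w1": "A", "w2": "A", "w3": "B"},
    "q2": {"w1": "B", "w2": "A", "w3": "B"},
    "q3": {"w1": "A", "w2": "B", "w3": "B"},
}

def majority(resp):
    return {item: Counter(lbls.values()).most_common(1)[0][0]
            for item, lbls in resp.items()}

est = majority(responses)                       # initial answer estimates
workers = {w for lbls in responses.values() for w in lbls}
acc = {w: np.mean([lbls[w] == est[i] for i, lbls in responses.items() if w in lbls])
       for w in workers}                        # agreement with current estimates

def weighted_majority(resp, acc):
    out = {}
    for item, lbls in resp.items():
        scores = Counter()
        for w, lbl in lbls.items():
            scores[lbl] += acc[w]               # weight each vote by worker accuracy
        out[item] = scores.most_common(1)[0][0]
    return out

print(weighted_majority(responses, acc))
```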
Scaling associative classification for very large datasets
Title | Scaling associative classification for very large datasets |
Authors | Luca Venturini, Elena Baralis, Paolo Garza |
Abstract | Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of large-domain categorical features are a difficult challenge for any classifier, and most off-the-shelf solutions cannot cope with this problem. In this work we introduce DAC, a Distributed Associative Classifier. DAC exploits ensemble learning to distribute the training of an associative classifier among parallel workers and improve the final quality of the model. Furthermore, it adopts several novel techniques to reach high scalability without sacrificing quality, among them a preventive pruning of classification rules, based on Gini impurity, in the extraction phase. We ran experiments on Apache Spark, on a real large-scale dataset with more than 4 billion records and 800 million distinct categories. The results show that DAC improves on a state-of-the-art solution in both prediction quality and execution time. Since the generated model is human-readable, it can not only classify new records, but also help users understand both the logic behind the predictions and the properties of the model, making it a useful aid for decision makers. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03887v1 |
http://arxiv.org/pdf/1805.03887v1.pdf | |
PWC | https://paperswithcode.com/paper/scaling-associative-classification-for-very |
Repo | |
Framework | |
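The Gini-based preventive pruning can be sketched in a few lines: a candidate rule is kept only if the class labels of the records it covers are sufficiently pure. The rule representation and threshold below are illustrative assumptions, not DAC's actual data structures.

```python
# Hedged sketch: prune classification rules whose covered records are class-impure.
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def prune_rules(rules, X, y, max_gini=0.3):
    """rules: list of (name, predicate) pairs; keep only class-pure rules."""
    kept = []
    for name, pred in rules:
        mask = np.array([pred(row) for row in X])
        covered = y[mask]
        if covered.size and gini(covered) <= max_gini:
            kept.append(name)
    return kept

X = np.array([[0, 1], [0, 2], [1, 1], [1, 2]])
y = np.array(["spam", "spam", "ham", "spam"])
rules = [("f0 == 0", lambda r: r[0] == 0),   # covers two "spam": pure, kept
         ("f0 == 1", lambda r: r[0] == 1)]   # covers "ham" and "spam": pruned
print(prune_rules(rules, X, y))              # ['f0 == 0']
```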
Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations
Title | Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations |
Authors | Chen Liang, Jianbo Ye, Han Zhao, Bart Pursel, C. Lee Giles |
Abstract | A strict partial order is a mathematical structure commonly seen in relational data. One obstacle to extracting relations of this type at scale is the lack of large-scale labels for building effective data-driven solutions. We develop an active learning framework for mining such relations subject to a strict order. Our approach incorporates relational reasoning not only in finding new unlabeled pairs whose labels can be deduced from an existing label set, but also in devising new query strategies that consider the relational structure of labels. Our experiments on concept prerequisite relations show that the proposed framework can substantially improve classification performance for the same query budget, compared to other baseline approaches. |
Tasks | Active Learning, Relational Reasoning |
Published | 2018-01-19 |
URL | http://arxiv.org/abs/1801.06481v1 |
http://arxiv.org/pdf/1801.06481v1.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-of-strict-partial-orders-a |
Repo | |
Framework | |
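The deduction step the abstract mentions follows directly from the axioms of a strict partial order: any pair in the transitive closure of the labeled set needs no query, and asymmetry yields free negative labels. A minimal sketch, with concept names purely illustrative:

```python
# Hedged sketch: deduce new labels from a strict partial order by transitivity.
def transitive_closure(pairs):
    """pairs: set of (a, b) meaning a < b. Returns all deducible (a, b)."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        new = {(a, d) for (a, b) in closed for (c, d) in closed if b == c}
        if not new <= closed:
            closed |= new
            changed = True
    return closed

labeled = {("limits", "derivatives"), ("derivatives", "integrals")}
deduced = transitive_closure(labeled)
print(("limits", "integrals") in deduced)   # True: no need to query this pair
# Asymmetry gives negatives for free: (b, a) is False for every deduced (a, b).
```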
Model-Based Clustering and Classification of Functional Data
Title | Model-Based Clustering and Classification of Functional Data |
Authors | Faicel Chamroukhi, Hien D. Nguyen |
Abstract | The problem of complex data analysis is a central topic of modern statistical science and learning systems, and is becoming of broader interest with the increasing prevalence of high-dimensional data. The challenge is to develop statistical models and autonomous algorithms that can acquire knowledge from raw data, either for exploratory analysis, which can be achieved through clustering techniques, or to make predictions about future data via classification (i.e., discriminant analysis) techniques. Latent data models, including mixture-model-based approaches, are among the most popular and successful approaches in both the unsupervised context (i.e., clustering) and the supervised one (i.e., classification or discrimination). Although traditionally tools of multivariate analysis, they are growing in popularity when considered in the framework of functional data analysis (FDA). FDA is the data analysis paradigm in which the individual data units are functions (e.g., curves, surfaces) rather than simple vectors. In many areas of application, the analyzed data are indeed often available in the form of discretized values of functions or curves (e.g., time series, waveforms) and surfaces (e.g., 2d-images, spatio-temporal data). This functional aspect of the data adds difficulties beyond those of classical multivariate (non-functional) data analysis. We review and present approaches for model-based clustering and classification of functional data. We derive well-established statistical models along with efficient algorithmic tools to address problems regarding the clustering and classification of these high-dimensional data, including their heterogeneity, missing information, and dynamical hidden structure. The presented models and algorithms are illustrated on real-world functional data analysis problems from several application areas. |
Tasks | Time Series |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00276v2 |
http://arxiv.org/pdf/1803.00276v2.pdf | |
PWC | https://paperswithcode.com/paper/model-based-clustering-and-classification-of |
Repo | |
Framework | |
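One standard model-based route for functional data, in the spirit of the approaches reviewed here, is to project each discretized curve onto a finite basis and fit a mixture model to the coefficient vectors. The polynomial basis and Gaussian mixture below are illustrative choices, not the review's specific models.

```python
# Hedged sketch: basis-coefficient representation + mixture-model clustering.
import numpy as np
from sklearn.mixture import GaussianMixture

t = np.linspace(0, 1, 50)
curves = np.vstack(
    [np.sin(2 * np.pi * t) + 0.1 * np.random.randn(50) for _ in range(20)]
    + [t ** 2 + 0.1 * np.random.randn(50) for _ in range(20)]
)

coeffs = np.array([np.polyfit(t, c, deg=5) for c in curves])  # basis coefficients
gmm = GaussianMixture(n_components=2, random_state=0).fit(coeffs)
labels = gmm.predict(coeffs)
print(labels[:20], labels[20:])              # the two curve families separate
```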
Forward-Backward Reinforcement Learning
Title | Forward-Backward Reinforcement Learning |
Authors | Ashley D. Edwards, Laura Downs, James C. Davidson |
Abstract | Goals for reinforcement learning problems are typically defined through hand-specified rewards. To design such problems, developers of learning algorithms must inherently be aware of what the task goals are, yet we often require agents to discover them on their own without any supervision beyond these sparse rewards. While much of the power of reinforcement learning derives from the concept that agents can learn with little guidance, this requirement greatly burdens the training process. If we relax this one restriction and endow the agent with knowledge of the reward function, and in particular of the goal, we can leverage backwards induction to accelerate training. To achieve this, we propose training a model to learn to take imagined reversal steps from known goal states. Rather than training an agent exclusively to determine how to reach a goal while moving forwards in time, our approach travels backwards to jointly predict how we got there. We evaluate our work in Gridworld and Towers of Hanoi and empirically demonstrate that it yields better performance than standard DDQN. |
Tasks | |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10227v1 |
http://arxiv.org/pdf/1803.10227v1.pdf | |
PWC | https://paperswithcode.com/paper/forward-backward-reinforcement-learning |
Repo | |
Framework | |
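The core mechanism is easy to sketch: a backward model proposes predecessor states and actions starting from a known goal, and the resulting forward-direction transitions augment the replay buffer of a standard learner such as DDQN. The deterministic toy chain below is purely illustrative, not the paper's environments.

```python
# Hedged sketch: imagine reversed steps from the goal to seed a replay buffer.
def imagine_backwards(goal_state, backward_model, reward_fn, steps=5):
    """Walk backwards from the goal, emitting forward-direction (s, a, r, s') tuples."""
    transitions, s_next = [], goal_state
    for _ in range(steps):
        s_prev, action = backward_model(s_next)   # predicted predecessor and action
        transitions.append((s_prev, action, reward_fn(s_next), s_next))
        s_next = s_prev
    return transitions

# Toy 1-D chain: "right" moves +1, so the predecessor of s under "right" is s - 1.
backward_model = lambda s: (s - 1, "right")
reward_fn = lambda s: 1.0 if s == 10 else 0.0     # sparse reward at the goal

replay_buffer = imagine_backwards(10, backward_model, reward_fn)
print(replay_buffer)   # imagined transitions leading into the goal, ready for DDQN
```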
Subset selection in sparse matrices
Title | Subset selection in sparse matrices |
Authors | Alberto Del Pia, Santanu S. Dey, Robert Weismantel |
Abstract | In subset selection we search for the best linear predictor that involves a small subset of variables. From a computational complexity viewpoint, subset selection is NP-hard and few classes are known to be solvable in polynomial time. Using mainly tools from discrete geometry, we show that some sparsity conditions on the original data matrix allow us to solve the problem in polynomial time. |
Tasks | |
Published | 2018-10-05 |
URL | https://arxiv.org/abs/1810.02757v2 |
https://arxiv.org/pdf/1810.02757v2.pdf | |
PWC | https://paperswithcode.com/paper/subset-selection-in-sparse-matrices |
Repo | |
Framework | |
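For context, the problem itself is easy to state in code: among all k-subsets of columns, find the one minimizing least-squares error. The exhaustive search below is exponential in general, which is exactly why the polynomial-time sparse cases identified in the paper are notable.

```python
# Brute-force illustration of best-subset selection (not the paper's algorithm).
import itertools
import numpy as np

def best_subset(X, y, k):
    best, best_rss = None, np.inf
    for cols in itertools.combinations(range(X.shape[1]), k):
        beta, rss, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = rss[0] if len(rss) else ((X[:, cols] @ beta - y) ** 2).sum()
        if rss < best_rss:
            best, best_rss = cols, rss
    return best, best_rss

X = np.random.randn(100, 8)
y = 3 * X[:, 2] - 2 * X[:, 5] + 0.1 * np.random.randn(100)
print(best_subset(X, y, k=2))                 # recovers columns (2, 5)
```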
Comparative Study of ECO and CFNet Trackers in Noisy Environment
Title | Comparative Study of ECO and CFNet Trackers in Noisy Environment |
Authors | Mustansar Fiaz, Sajid Javed, Arif Mahmood, Soon Ki Jung |
Abstract | Object tracking is one of the most challenging tasks in computer vision and has attracted significant attention from researchers over the past two decades. Recent deep learning based trackers have shown good performance on various tracking challenges. A tracking method should track objects accurately across sequential frames under challenges such as deformation, low resolution, occlusion, and scale and illumination variations. Most trackers achieve good performance on specific challenges rather than on all tracking problems; hence there is a lack of general-purpose tracking algorithms that perform well in all conditions. Moreover, the performance of tracking techniques has not been evaluated in noisy environments. Visual object tracking has real-world applications, and there is a good chance that noise is introduced during image acquisition in surveillance cameras. We study the robustness of two state-of-the-art trackers in the presence of noise: Efficient Convolution Operators (ECO) and Correlation Filter Network (CFNet). Our study demonstrates that the performance of these trackers degrades as the noise level increases, which demonstrates the need to design more robust tracking algorithms. |
Tasks | Object Tracking, Visual Object Tracking |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09360v1 |
http://arxiv.org/pdf/1801.09360v1.pdf | |
PWC | https://paperswithcode.com/paper/comparative-study-of-eco-and-cfnet-trackers |
Repo | |
Framework | |
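The evaluation protocol suggested by the abstract can be sketched simply: corrupt every frame with Gaussian noise of increasing strength before it reaches the tracker, and record the tracking metrics at each level. The noise model and levels below are illustrative assumptions.

```python
# Hedged sketch: corrupt frames with increasing Gaussian noise before tracking.
import numpy as np

def add_gaussian_noise(frame, sigma):
    """frame: uint8 image array; sigma in pixel-intensity units."""
    noisy = frame.astype(np.float32) + np.random.normal(0, sigma, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

frame = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)
for sigma in (0, 5, 15, 30):                     # increasing noise levels
    noisy = add_gaussian_noise(frame, sigma)
    # tracker.update(noisy)  <- run ECO / CFNet here and record overlap/precision
    print(sigma, float(np.abs(noisy.astype(int) - frame.astype(int)).mean()))
```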
Automatic Stroke Lesions Segmentation in Diffusion-Weighted MRI
Title | Automatic Stroke Lesions Segmentation in Diffusion-Weighted MRI |
Authors | Noranart Vesdapunt, Nongluk Covavisaruch |
Abstract | Diffusion-Weighted Magnetic Resonance Imaging (DWI) is widely used for early detection of cerebral infarcts caused by ischemic stroke. In common clinical practice, manual segmentation is done by a radiologist; nonetheless, the challenges of cerebral infarct segmentation stem from low resolution and uncertain boundaries. Many segmentation techniques have been proposed and validated against manual segmentation as the gold standard. In order to reduce human error in research operation and clinical practice, we adopt a semi-automatic segmentation as the gold standard, using Fluid-Attenuated Inversion-Recovery (FLAIR) Magnetic Resonance Images (MRI) from the same patients under a controlled environment. Extensive testing is performed on popular segmentation algorithms, including Otsu's method, Fuzzy C-means, hill-climbing based segmentation, and GrowCut. The selected segmentation techniques are validated by accuracy, sensitivity, and specificity, using leave-one-out cross-validation first to determine the potential of each technique and then to maximize the accuracy on the training set. Our experimental results demonstrate the effectiveness of the selected methods. |
Tasks | |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10385v1 |
http://arxiv.org/pdf/1803.10385v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-stroke-lesions-segmentation-in |
Repo | |
Framework | |
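As an illustration of one of the evaluated techniques, the sketch below applies Otsu's global threshold to a synthetic DWI-like slice and scores it with the accuracy, sensitivity, and specificity measures the abstract lists; the data and the single-threshold pipeline are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: Otsu thresholding of a synthetic slice, scored against a mask.
import numpy as np
from skimage.filters import threshold_otsu

def evaluate(pred, truth):
    tp = np.sum(pred & truth); tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth); fn = np.sum(~pred & truth)
    return {"accuracy": (tp + tn) / truth.size,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

slice_ = np.random.normal(100, 10, (128, 128))
slice_[40:60, 40:60] += 80                        # synthetic bright "lesion"
truth = np.zeros_like(slice_, dtype=bool)
truth[40:60, 40:60] = True

pred = slice_ > threshold_otsu(slice_)            # Otsu's global threshold
print(evaluate(pred, truth))
```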
Resource Constrained Deep Reinforcement Learning
Title | Resource Constrained Deep Reinforcement Learning |
Authors | Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar |
Abstract | In urban environments, supply resources have to be constantly matched to the “right” locations (where customer demand is present) so as to improve quality of life. For instance, ambulances have to be matched to base stations regularly so as to reduce response times for emergency incidents in EMS (Emergency Management Systems); vehicles (cars, bikes, scooters, etc.) have to be matched to docking stations so as to reduce lost demand in shared mobility systems. Such problem domains are challenging owing to demand uncertainty, combinatorial action spaces (due to allocation), and constraints on the allocation of resources (e.g., total resources, minimum and maximum numbers of resources at locations and in regions). Existing systems typically employ myopic and greedy optimization approaches to allocate supply resources to locations, and such approaches are typically unable to handle surges or variance in demand patterns well. Recent research has demonstrated the ability of deep RL methods to adapt well to highly uncertain environments. However, existing deep RL methods are unable to handle combinatorial action spaces and constraints on the allocation of resources. To that end, we have developed three approaches, built on top of the well-known actor-critic approach DDPG (Deep Deterministic Policy Gradient), that are able to handle constraints on resource allocation. More importantly, we demonstrate that they outperform leading approaches on simulators validated on semi-real and real data sets. |
Tasks | |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00600v1 |
http://arxiv.org/pdf/1812.00600v1.pdf | |
PWC | https://paperswithcode.com/paper/resource-constrained-deep-reinforcement |
Repo | |
Framework | |
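One common way to make an off-the-shelf actor such as DDPG respect allocation constraints is to project its raw action onto the feasible set (a fixed total and per-location bounds). The clip-and-redistribute heuristic below is an illustrative assumption, not necessarily one of the paper's three approaches.

```python
# Hedged sketch: project a raw continuous action onto allocation constraints.
import numpy as np

def project_allocation(raw, total, lo, hi, iters=50):
    """Map a raw action to an allocation with lo <= x_i <= hi and sum(x) ~= total."""
    x = np.clip(raw, lo, hi).astype(float)
    for _ in range(iters):
        x += (total - x.sum()) / len(x)          # spread the residual evenly
        x = np.clip(x, lo, hi)                   # re-impose the box constraints
    return x

raw_action = np.array([9.0, -2.0, 4.0, 1.0])     # unconstrained actor output
alloc = project_allocation(raw_action, total=14, lo=1, hi=6)
print(alloc, alloc.sum())                        # feasible allocation, sum ~= 14
```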