Gradient boosting learning for fraudulent publisher detection in online advertising

Publisher: Emerald Publishing
Copyright: © Emerald Publishing Limited
ISSN: 2514-9288
DOI: 10.1108/dta-04-2020-0093

Abstract

Analysis of publisher behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user click data with missing values poses a challenge in analyzing the conduct of publishers. The presence of high-cardinality categorical attributes, each with many possible values, further aggravates the issue.

Design/methodology/approach: In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning publisher behavior from raw user click data and to classify fraudulent publishers effectively.

Findings: The results demonstrate that GTB effectively classified fraudulent publishers and exhibited significantly improved performance compared to other learning methods in terms of average precision (60.5%), recall (57.8%) and f-measure (59.1%).

Originality/value: The experiments were conducted using a publicly available multiclass raw user click dataset and eight other imbalanced datasets to test GTB's generalization behavior; training and testing were done using 10-fold cross-validation. The performance of GTB was evaluated using average precision, recall and f-measure, and was also compared with eleven other state-of-the-art individual and ensemble classification models.
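As a quick sanity check on the reported figures, the f-measure can be recomputed from the stated average precision and recall. The sketch below assumes the f-measure is the harmonic mean of the two averaged values (the standard F1 definition); the abstract does not say whether the authors computed it this way or averaged per-class scores instead, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract, in percent.
avg_precision = 60.5
avg_recall = 57.8

# Assumption: F1 is taken over the averaged precision and recall.
f1 = f_measure(avg_precision, avg_recall)
print(f"F-measure: {f1:.1f}%")  # prints "F-measure: 59.1%"
```

Under this assumption the result agrees with the 59.1% stated in the abstract.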

Journal

Data Technologies and Applications, Emerald Publishing

Published: Apr 12, 2021

Keywords: Pay-per-click; Fraudulent publishers; Click fraud; Gradient tree boosting; Online advertising
