Clustering short temporal behaviour sequences for customer
segmentation using LDA
R&D Department, Flytxt, Trivandrum, India
Department of Electrical Engineering, Indian
Institute of Technology Delhi, New Delhi, India
Jobin Wilson, R&D Department, Flytxt,
Customer segmentation based on temporal variation of subscriber preferences is useful for
communication service providers (CSPs) in applications such as targeted campaign design, churn
prediction, and fraud detection. Traditional clustering algorithms are inadequate in this context, as
a multidimensional feature vector represents a subscriber profile at an instant of time, and
grouping of subscribers needs to consider variation of subscriber preferences across time.
Clustering in this case usually requires complex multivariate time series analysis‐based models.
Because conventional time series clustering models are limited in scalability and in their ability
to accurately represent temporal behaviour sequences (TBS) of users, which may be short, noisy,
and non‐stationary, we propose a latent Dirichlet allocation (LDA) based model to represent
the temporal behaviour of mobile subscribers as compact and interpretable profiles. Our model
makes use of the structural regularity within the observable data corresponding to a large number
of user profiles and relaxes the strict temporal ordering of user preferences in TBS clustering. We
use mean‐shift clustering to segment subscribers based on their discovered profiles. Further, we
mine segment‐specific association rules from the discovered TBS clusters, to aid marketers in
designing intelligent campaigns that match segment preferences. Our experiments on real‐world
data collected from a popular Asian communication service provider gave encouraging results.
association rule mining, LDA, marketing campaign optimization, mobile subscriber segmentation,
temporal behaviour clustering
Customer segmentation that considers temporal variation of user preferences, followed by segment‐specific association rule mining, is a powerful
technique for marketers to improve campaign targeting and conversions. Clustering techniques have conventionally been used for targeted
promotions and personalization, to maximize the relevance of marketing campaigns to subscribers while maintaining profitability for marketers
(Ben Schafer, Konstan, & Riedl, 1999; Schlee, 2013). Key challenges in this context are identifying advertisements that match user preferences
and identifying potential target segments for a given marketing campaign (Goyal & Lakshmanan, 2012). Studies such as
Mobasher, Dai, Luo, and Nakagawa (2001) and Sandvig, Mobasher, and Burke (2007) attempt to discover association rules from customer
data and utilize them to provide personalization.
Conventional approaches generally ignore customers' time‐variant preferences, which are crucial in optimally matching users to marketing
campaigns in certain domains. For instance, conventional user profiling techniques become inadequate if we need to group people on the basis of
their navigation patterns on a website (identified from server logs) to provide customized offers, or to identify whether a new user is likely to
make a specific purchase based on her navigation pattern on an e‐commerce website. This calls for an improved representation of user profiles, so that
they are no longer a static vector of preferences but a sequence of such preference vectors varying over time, represented by a temporal behaviour
sequence (TBS).
In this paper, we devise a customer segmentation procedure based on temporal variations of user preferences, using TBS clustering, and utilize
the discovered clusters to learn association rules specific to each cluster, to improve campaign targeting.
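
To make the TBS representation concrete, the following is a minimal sketch of how a sequence of preference vectors might be turned into an unordered bag of behaviour tokens, which is one way to read the relaxation of strict temporal ordering mentioned in the abstract. The feature names (`voice`, `data`, `sms`), the helper names `to_token` and `tbs_to_bag`, and the dominant-feature discretisation are all illustrative assumptions and are not taken from the paper; the encoding actually used by the model may differ.

```python
from collections import Counter
from typing import Dict, Sequence

# ASSUMPTION: each preference vector is reduced to the name of its dominant
# feature. This discretisation is hypothetical, chosen only for illustration.
def to_token(preferences: Dict[str, float]) -> str:
    # Sorting the names first makes ties resolve alphabetically.
    return max(sorted(preferences), key=lambda name: preferences[name])

def tbs_to_bag(tbs: Sequence[Dict[str, float]]) -> Counter:
    """Turn a temporal behaviour sequence into an unordered bag of tokens."""
    return Counter(to_token(step) for step in tbs)

# Two hypothetical subscribers with the same behaviours in a different order.
subscriber_a = [
    {"voice": 0.7, "data": 0.2, "sms": 0.1},
    {"voice": 0.1, "data": 0.8, "sms": 0.1},
    {"voice": 0.2, "data": 0.6, "sms": 0.2},
]
subscriber_b = [subscriber_a[1], subscriber_a[2], subscriber_a[0]]

bag_a = tbs_to_bag(subscriber_a)
bag_b = tbs_to_bag(subscriber_b)

print(dict(bag_a))     # token counts for subscriber_a
print(bag_a == bag_b)  # ordering is discarded, so the two bags coincide
```

Under this assumed encoding, each subscriber's bag plays the role of a document in a topic model, so subscribers whose behaviours occur in a different order but with the same frequencies receive the same representation.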
Received: 13 March 2017 Revised: 2 August 2017 Accepted: 8 September 2017
Expert Systems. 2018;35:e12250.
Copyright © 2017 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/exsy