An efficient algorithm for mining top-k on-shelf high utility itemsets

An efficient algorithm for mining top-k on-shelf high utility itemsets High on-shelf utility itemset (HOU) mining is an emerging data mining task which consists of discovering sets of items generating a high profit in transaction databases. The task of HOU mining is more difficult than traditional high utility itemset (HUI) mining, because it also considers the shelf time of items, and items having negative unit profits. HOU mining can be used to discover more useful and interesting patterns in real-life applications than traditional HUI mining. Several algorithms have been proposed for this task. However, a major drawback of these algorithms is that it is difficult for users to find a suitable value for the minimum utility threshold parameter. If the threshold is set too high, not enough patterns are found. And if the threshold is set too low, too many patterns will be found and the algorithm may use an excessive amount of time and memory. To address this issue, we propose to address the problem of top-k on-shelf high utility itemset mining, where the user directly specifies k, the desired number of patterns to be output instead of specifying a minimum utility threshold value. An efficient algorithm named KOSHU (fast top-K on-shelf high utility itemset miner) is proposed to mine the top-k HOUs efficiently, while considering on-shelf time periods of items, and items having positive and/or negative unit profits. KOSHU introduces three novel strategies, named efficient estimated co-occurrence maximum period rate pruning, period utility pruning and concurrence existing of a pair 2-itemset pruning to reduce the search space. KOSHU also incorporates several novel optimizations and a faster method for constructing utility-lists. An extensive performance study on real-life and synthetic datasets shows that the proposed algorithm is efficient both in terms of runtime and memory consumption and has excellent scalability. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Knowledge and Information Systems Springer Journals

An efficient algorithm for mining top-k on-shelf high utility itemsets

Loading next page...
 
/lp/springer_journal/an-efficient-algorithm-for-mining-top-k-on-shelf-high-utility-itemsets-76MGupmgrN
Publisher
Springer London
Copyright
Copyright © 2017 by Springer-Verlag London
Subject
Computer Science; Information Systems and Communication Service; IT in Business
ISSN
0219-1377
eISSN
0219-3116
D.O.I.
10.1007/s10115-016-1020-2
Publisher site
See Article on Publisher Site

Abstract

High on-shelf utility itemset (HOU) mining is an emerging data mining task which consists of discovering sets of items generating a high profit in transaction databases. The task of HOU mining is more difficult than traditional high utility itemset (HUI) mining, because it also considers the shelf time of items, and items having negative unit profits. HOU mining can be used to discover more useful and interesting patterns in real-life applications than traditional HUI mining. Several algorithms have been proposed for this task. However, a major drawback of these algorithms is that it is difficult for users to find a suitable value for the minimum utility threshold parameter. If the threshold is set too high, not enough patterns are found. And if the threshold is set too low, too many patterns will be found and the algorithm may use an excessive amount of time and memory. To address this issue, we propose to address the problem of top-k on-shelf high utility itemset mining, where the user directly specifies k, the desired number of patterns to be output instead of specifying a minimum utility threshold value. An efficient algorithm named KOSHU (fast top-K on-shelf high utility itemset miner) is proposed to mine the top-k HOUs efficiently, while considering on-shelf time periods of items, and items having positive and/or negative unit profits. KOSHU introduces three novel strategies, named efficient estimated co-occurrence maximum period rate pruning, period utility pruning and concurrence existing of a pair 2-itemset pruning to reduce the search space. KOSHU also incorporates several novel optimizations and a faster method for constructing utility-lists. An extensive performance study on real-life and synthetic datasets shows that the proposed algorithm is efficient both in terms of runtime and memory consumption and has excellent scalability.

Journal

Knowledge and Information SystemsSpringer Journals

Published: Jan 10, 2017

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off