High utility pattern mining based on historical data table over data streams

2021 
Efficiently mining high utility itemsets over data streams is one of the most challenging problems in the data mining literature. Because they generate many redundant candidates, existing algorithms scale poorly on massive data streams. In addition, existing work has paid little attention to the reference value of historical data and has rarely adopted a distributed architecture. This paper proposes HUMHDT, an efficient algorithm for mining high utility patterns over a data stream using the sliding window technique. We use historical data to prune redundant candidates generated while mining the current data stream. By referring to historical data, the algorithm can also discover potential items more effectively. Furthermore, we propose a distributed system architecture that constructs and updates the historical data table without affecting the data stream mining algorithm, and that optimizes the current data stream mining algorithm through the historical data table. Experimental results show that HUMHDT is more efficient than state-of-the-art algorithms in terms of execution time and memory consumption.
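To make the problem setting concrete, the following is a minimal sketch of high utility itemset mining over a sliding window. It is not HUMHDT: it enumerates every candidate by brute force, uses no historical data table, and runs on a single machine. The names `PROFIT`, `itemset_utility`, `high_utility_itemsets`, and `mine_stream`, the toy transactions, and the threshold are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations

# Hypothetical unit profit per item (illustrative values, not from the paper).
PROFIT = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, window):
    """Sum the utility of `itemset` over the transactions containing all its items."""
    total = 0
    for tx in window:  # each tx maps item -> purchased quantity
        if all(item in tx for item in itemset):
            total += sum(tx[item] * PROFIT[item] for item in itemset)
    return total


def high_utility_itemsets(window, min_util):
    """Brute-force enumeration of all itemsets; no candidate pruning is done here."""
    items = sorted({item for tx in window for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, window)
            if utility >= min_util:
                result[itemset] = utility
    return result


def mine_stream(stream, window_size, min_util):
    """Yield the high utility itemsets of each full sliding window."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    for tx in stream:
        window.append(tx)
        if len(window) == window_size:
            yield high_utility_itemsets(window, min_util)


stream = [
    {"a": 1, "b": 2},
    {"b": 1, "c": 3},
    {"a": 2, "c": 1},
]
results = list(mine_stream(stream, window_size=2, min_util=9))
print(results)
```

In the first window, the itemset `("a", "b")` reaches the threshold although neither `("a",)` nor `("b",)` does. Utility is not anti-monotone, so frequency-style pruning does not apply directly, which is one reason candidate generation is costly in this setting.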