Article type: Research Article
Authors: Wu, Jimmy Ming-Tai [a], [*] | Li, Ranran [a] | Wu, Mu-En [b] | Lin, Jerry Chun-Wei [c]
Affiliations: [a] Department of Information Management, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan | [b] Department of Information and Finance Management, National Taipei University of Technology, Taipei, Taiwan | [c] Department of Computer Science, Western Norway University of Applied Sciences, Bergen, Norway
Correspondence: [*] Corresponding author: Jimmy Ming-Tai Wu, Department of Information Management, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan. E-mail: [email protected].
Abstract: Frequent itemset mining (FIM) and high-utility itemset mining (HUIM) are among the most widely studied problems in data mining. Many algorithms have been proposed to reveal the relationships among utility, frequency, and items in transaction databases. Although these algorithms can mine frequent or high-utility itemsets quickly, they consider only frequency or only utility as the criterion for an itemset, whereas other factors (e.g., distance, price) could also be valuable for decision-making. A skyline framework was therefore introduced to mine skyline frequent-utility patterns (SFUPs), which better support user decision-making, and several algorithms have since been proposed for it. However, the Internet of Things (IoT), the mobile Internet, and the traditional Internet generate massive amounts of data every day, and these state-of-the-art standalone algorithms cannot meet the challenge of finding interesting patterns in data at this scale. Big-data systems use distributed architectures, in the form of cloud computing, to filter and process such data and extract useful information. This paper proposes a novel parallel algorithm on Hadoop, designed as a three-stage iterative algorithm based on MapReduce. MapReduce divides the mining task over the whole large dataset into multiple independent sub-tasks, so that frequent and high-utility patterns are found in parallel. Extensive experiments show that the algorithm can handle large datasets and performs well on Hadoop clusters.
Keywords: Data mining, skyline frequent-utility patterns (SFUPs), cloud computing, Hadoop, MapReduce
DOI: 10.3233/IDA-220756
Journal: Intelligent Data Analysis, vol. 27, no. 5, pp. 1359-1377, 2023
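Illustrative note: the abstract describes skyline frequent-utility patterns only informally. The sketch below shows one common reading of the idea on a single machine: an itemset is kept when no other itemset is at least as good in both frequency and utility. It is not the paper's three-stage MapReduce algorithm. The toy database, the function names, and the exact dominance rule (standard Pareto dominance) are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility of that
# item in that transaction (e.g. quantity * unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2},
    {"a": 4, "c": 6},
    {"a": 5, "b": 2, "c": 6},
    {"b": 3},
]


def frequency_and_utility(transactions):
    """Return {itemset: (frequency, utility)} by brute-force enumeration."""
    stats = {}
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                freq, util = stats.get(itemset, (0, 0))
                gained = sum(transaction[item] for item in itemset)
                stats[itemset] = (freq + 1, util + gained)
    return stats


def dominates(p, q):
    """Assumed rule: p dominates q if it is no worse in both measures
    and the two (frequency, utility) pairs are not identical."""
    return p[0] >= q[0] and p[1] >= q[1] and p != q


def skyline(stats):
    """Keep only the itemsets whose (frequency, utility) pair is not dominated."""
    values = list(stats.values())
    return {
        itemset: pair
        for itemset, pair in stats.items()
        if not any(dominates(other, pair) for other in values)
    }


if __name__ == "__main__":
    for itemset, (freq, util) in sorted(skyline(frequency_and_utility(TRANSACTIONS)).items()):
        print(itemset, "frequency =", freq, "utility =", util)
```

On this toy input the itemsets `('a',)` and `('a', 'c')` remain: the first has the highest frequency, the second the highest utility. Brute-force enumeration is exponential in transaction length, which is the kind of cost that motivates splitting the work into independent sub-tasks as the abstract describes.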