Article type: Research Article
Authors: Wang, Jinfeng [a,b,*] | Huang, Shuaihui [a] | Jiang, Fajian [a] | Zheng, Zhishen [a] | Ou, Jianbin [a] | Chen, Hao [a] | Chen, Runjian [c] | Wang, Wenzhong [d,*]
Affiliations: [a] College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China | [b] Guangzhou Key Laboratory of Smart Agriculture, Guangzhou, China | [c] Guangdong Electronic Certification Authority Co., LTD, Guangzhou, China | [d] College of Economics and Management, South China Agricultural University, Guangzhou, China
Correspondence: [*] Corresponding author. Jinfeng Wang, South China Agricultural University, E-mail: [email protected] and Wenzhong Wang, E-mail: [email protected].
Abstract: The fuzzy integral is an effective information fusion tool in data mining. It is well suited to modeling combinations of features and has been applied successfully to classification problems. However, as the number of features grows, the time and space complexity of the fuzzy integral grow exponentially, which limits its wider use. This article proposes an efficient fuzzy integral, the Parallel and Sparse Frame Based Fuzzy Integral (PSFI), which reduces the time and space complexity of fuzzy integral computation by combining the distributed parallel computing framework Spark with sparse storage. To address the efficiency limitations of Python, Cython is also introduced. The algorithm is packaged as a library to realize a more efficient PSFI. The experiments verify the impact of the number of parallel nodes on the algorithm's performance, test the performance of PSFI in classification, and apply PSFI to regression problems and imbalanced big data classification. The results show that, as computing resources increase, PSFI reduces the variable storage requirements of datasets with many features by a factor of thousands. Furthermore, PSFI is shown to achieve higher prediction accuracy than the classic fuzzy integral running on a single processor.
Keywords: Parallel computing, sparse storage, fuzzy integral, fuzzy measure
DOI: 10.3233/JIFS-210372
Journal: Journal of Intelligent & Fuzzy Systems, vol. 41, no. 2, pp. 3137-3159, 2021
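Illustrative note: the abstract's point about exponential growth can be made concrete with a small sketch. A fuzzy measure over n features assigns a value to every non-empty subset of features, so a dense table holds 2^n - 1 entries, while evaluating the integral for a single sample touches only n of them. The Python sketch below assumes the Choquet integral as the fuzzy integral and a plain dictionary as the sparse store; the abstract does not specify either, so this is not the PSFI implementation, and the names `dense_measure_size`, `choquet_integral`, and `default` are hypothetical.

```python
from typing import Dict, FrozenSet, Sequence


def dense_measure_size(n_features: int) -> int:
    """Number of non-empty feature subsets a dense fuzzy measure must store."""
    return 2 ** n_features - 1


def choquet_integral(
    values: Sequence[float],
    measure: Dict[FrozenSet[int], float],
    default: float = 0.0,
) -> float:
    """Choquet integral of non-negative values w.r.t. a sparsely stored measure.

    `measure` maps a frozenset of feature indices to its measure value.
    Subsets missing from the dict fall back to `default`; this fallback is
    an assumption made for illustration only.
    """
    # Visit features in ascending order of their values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0
    for rank, idx in enumerate(order):
        # All features whose value is at least values[idx].
        subset = frozenset(order[rank:])
        total += (values[idx] - previous) * measure.get(subset, default)
        previous = values[idx]
    return total


# Two features: only the subsets that carry information are stored.
mu = {frozenset({0}): 0.3, frozenset({1}): 0.5, frozenset({0, 1}): 1.0}
result = choquet_integral([0.2, 0.6], mu)
print(result)
print(dense_measure_size(20))
```

With 20 features a dense table would already need over a million entries, whereas the dictionary above grows only with the subsets actually stored, which is one plausible reading of why sparse storage helps.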