Article type: Research Article
Authors: Xie, Wenhao [a],* | Lei, Lin [b] | Liu, Xiangyi [a] | Liu, Yuan [c]
Affiliations: [a] School of Science, Xi’an Shiyou University, Xi’an, Shaanxi, China | [b] School of Computing, Xi’an Shiyou University, Xi’an, Shaanxi, China | [c] College of Petroleum Engineering, Xi’an Shiyou University, Xi’an, Shaanxi, China
Correspondence: [*] Corresponding author. Wenhao Xie, School of Science, Xi’an Shiyou University, Xi’an, Shaanxi, 710065, China. Tel.: +86 13891893690; Fax: +86 02988382421; E-mail: [email protected].
Abstract: Clustering is an essential unsupervised technique when category information is not available. Although the K-means and Max-min distance K-means clustering algorithms are widely used, they have disadvantages such as dependence on the initial centers and sensitivity to outliers caused by using only distance as the clustering criterion. To overcome these problems, this paper proposes the SMM-K-means algorithm, which overcomes the dependence on the initial cluster centers and the initial number of clusters, as well as the sensitivity to outliers. First, the initial value K of the optimal cluster number is determined by the elbow method, and K-means is used for initial clustering. A new inter-cluster separation measure is then constructed based on the idea of q-nearest neighbors; it comprehensively considers both the separation between clusters and the distribution compactness of the clusters themselves. Finally, the two sample points with the highest degree of separation are passed to the Max-min distance K-means algorithm as new initial centers for clustering. Because the cluster centers are determined deterministically, the complicated iterative calculation is eliminated, and the inter-cluster separation measure overcomes the sensitivity of the clustering results to noise points and isolated points, giving the method good applicability and generalization. In addition, the algorithm is not limited by the shape and size of the clusters and is therefore more flexible. The experimental results show that the SMM-K-means algorithm achieves higher CH index values, indicating better clustering quality and stability.
Keywords: K-means algorithm, max-min distance K-means algorithm, elbow method, inter-cluster separation measure, CH index
DOI: 10.3233/JIFS-231747
Journal: Journal of Intelligent & Fuzzy Systems, vol. 46, no. 4, pp. 7839-7857, 2024
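As a rough illustration of the pipeline outlined in the abstract, the following Python sketch shows classic max-min distance (farthest-first) center selection followed by plain K-means, with the sum of squared errors that an elbow plot would use. It is not the authors' SMM-K-means: the q-nearest-neighbor separation measure is not defined in the abstract, so the two starting centers are simply passed in as a hypothetical `seeds` argument. All function names and the toy data are assumptions made for this example.

```python
import math

def max_min_centers(points, k, seeds):
    """Extend `seeds` to k centers by max-min distance selection.

    `seeds` stands in for the two most-separated points that the paper
    obtains from its separation measure (not reproduced here).
    """
    centers = [tuple(s) for s in seeds]
    while len(centers) < k:
        # Pick the point whose distance to its nearest center is largest.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(tuple(nxt))
    return centers

def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[j].append(p)
    return clusters

def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        clusters = assign(points, centers)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def sse(centers, clusters):
    """Within-cluster sum of squared errors, the quantity an elbow plot tracks."""
    return sum(math.dist(p, c) ** 2 for c, cl in zip(centers, clusters) for p in cl)

# Toy data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (0, 20), (1, 20), (0, 21)]
seeds = [(0, 0), (10, 11)]  # hypothetical stand-ins for the paper's two seed points

sse_by_k = {}
for k in range(2, 5):
    init = max_min_centers(points, k, seeds)
    centers_k, clusters_k = kmeans(points, init)
    sse_by_k[k] = sse(centers_k, clusters_k)
    print(k, round(sse_by_k[k], 3))

init3 = max_min_centers(points, 3, seeds)
centers, clusters = kmeans(points, init3)
```

On this toy data the error drops sharply between K = 2 and K = 3, which is the kind of bend the elbow method looks for. Replacing `seeds` with points chosen by a separation measure is where the approach described in the abstract would differ from this sketch.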