Towards Obtaining Upper Bound on Sensitivity Computation Process for Cluster Validity Measures

Mishra, Sumit; Mondal, Samrat; Saha, Sriparna

doi:10.3233/FI-2018-1749

Article type: Research Article

Authors: Mishra, Sumit^{*, †} | Mondal, Samrat | Saha, Sriparna

Affiliations: Department of Computer Science & Engineering, Indian Institute of Technology Patna, Patna, Bihar – 801103, India.

Correspondence: [†] Address for correspondence: Department of Computer Science & Engineering, Indian Institute of Information Technology Guwahati, Guwahati, Assam – 781015, India.

Note: [*] Also affiliated at: Department of Computer Science & Engineering, Indian Institute of Information Technology Guwahati, Guwahati, Assam – 781015, India.

Abstract: Cluster validity indices have been proposed in the literature to measure the goodness of a clustering result. A validity measure yields a value indicating how good or bad the obtained clustering result is compared with the actual clustering result. Validity measures are not arbitrarily generated, however: a validity measure should satisfy certain important properties. Even so, there are cases in which a validity measure satisfies these properties and still fails to differentiate two clustering results correctly. For this reason, sensitivity has been introduced as a property of a validity measure that captures the differences between two clustering results. Computing sensitivity is expensive, because it requires exploring all possible combinations of clustering results, and the number of such combinations grows exponentially. To compute sensitivity efficiently, it is therefore necessary first to obtain an upper bound on the number of combinations that suffices to compute the sensitivity value. In this paper, we obtain such an upper bound on the number of possible combinations of clustering results. For this purpose, we propose a generic approach suitable for various validity measures and a specific approach applicable to two validity measures. We also show that this upper bound is sufficient to compute the sensitivity of various validity measures, and that it is much smaller than the total number of possible combinations of clustering results.

Keywords: Clustering algorithm, sensitivity, cluster validity measure

Journal: Fundamenta Informaticae, vol. 163, no. 4, pp. 351-374, 2018

Received September 2017

August 2018

Published: 03 November 2018
