Article type: Research Article
Authors: Zhongguo, Yang [a, b] | Hongqi, Li [a, b, *] | Liping, Zhu [a, b] | Qiang, Liu [a, b] | Ali, Sikandar [a, b]
Affiliations: [a] Key Lab of Petroleum Data Mining, China University of Petroleum, Beijing, China | [b] Department of Computer, China University of Petroleum, Beijing, China
Correspondence: [*] Corresponding author. Li Hongqi, Key Lab of Petroleum Data Mining of Beijing, China University of Petroleum, Beijing 102249, China. Tel.: +81 10 80116740; E-mail: [email protected].
Abstract: The k-nearest-neighbor (k-NN) classifier is an important algorithm. In practice, k is usually chosen by cross-validation. We propose a new method for selecting the neighborhood size based on a profile of the data set, on the premise that the distribution of a data set and its intrinsic characteristics are the fundamental factors in the choice of k. A local complexity value is computed for each example, and a complexity profile is constructed by sorting these values so as to capture the inner structure of the data set. A feature vector is then built by combining the local complexity profile with some statistical features of the data set. A historical meta-data set is constructed with these feature vectors as attributes and the optimum k of each data set, determined by ten-fold cross-validation, as the label. A prediction model is trained on the historical meta-data set and used to predict the optimum k for a new data set. Extensive experiments are conducted to verify the proposed method. The results show that the local complexity features reflect the inner structure of a data set and can help find the optimum k for k-NN across different domains.
Keywords: k-NN classifier, data sets, local complexity profile, optimum k
DOI: 10.3233/JIFS-161062
Journal: Journal of Intelligent & Fuzzy Systems, vol. 33, no. 1, pp. 55-65, 2017
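The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not define the local complexity measure, so the version below assumes one: the share of an example's m nearest neighbours that carry a different label. The function names, the parameters m and bins, and the candidate k values are all hypothetical and not taken from the paper. The sketch covers the sorted complexity profile and the cross-validated label for the meta-data set; the prediction model trained on that meta-data set is omitted.

```python
import math
from collections import Counter


def nearest_indices(points, i, m):
    """Indices of the m points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:m]


def local_complexity(points, labels, m=3):
    """Assumed measure: fraction of the m nearest neighbours with a different label."""
    values = []
    for i in range(len(points)):
        nbrs = nearest_indices(points, i, m)
        values.append(sum(labels[j] != labels[i] for j in nbrs) / m)
    return values


def complexity_profile(points, labels, m=3, bins=5):
    """Sort the local complexities and sample them at evenly spaced ranks,
    giving a fixed-length vector regardless of data set size."""
    ordered = sorted(local_complexity(points, labels, m))
    last = len(ordered) - 1
    return [ordered[round(q * last / (bins - 1))] for q in range(bins)]


def knn_predict(train_x, train_y, query, k):
    """Plain majority-vote k-NN; ties go to the label seen first (nearest)."""
    order = sorted(range(len(train_x)), key=lambda j: math.dist(query, train_x[j]))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]


def best_k_by_cv(points, labels, candidates=(1, 3, 5), folds=10):
    """Label for the meta-data set: the candidate k with the highest
    cross-validated accuracy (the first candidate wins ties)."""
    n = len(points)
    best_k, best_acc = None, -1.0
    for k in candidates:
        correct = 0
        for f in range(folds):
            test = [i for i in range(n) if i % folds == f]
            train = [i for i in range(n) if i % folds != f]
            tx = [points[i] for i in train]
            ty = [labels[i] for i in train]
            correct += sum(knn_predict(tx, ty, points[i], k) == labels[i] for i in test)
        acc = correct / n
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k
```

In this sketch, one row of the meta-data set would pair `complexity_profile(points, labels)` (plus any statistical features) with `best_k_by_cv(points, labels)`. Sampling the sorted values at fixed ranks is one simple way to give data sets of different sizes feature vectors of equal length.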