Issue title: Selected papers from the ISCA International Conference on Software Engineering and Data Engineering, and the ISCA International Conference on Computer Applications in Industry and Engineering, 2015, and Invited Papers
Guest editors: Takaaki Goto and Narayan C. Debnath
Article type: Research Article
Authors: Ding, Qin [a],[*] | Boykin, Robert [b]
Affiliations: [a] Department of Computer Science, East Carolina University, Greenville, NC 27858, USA | [b] Department of Computer Science, University of South Carolina, Columbia, SC 29208, USA
Correspondence: [*] Corresponding author: Qin Ding, Department of Computer Science, East Carolina University, Greenville, NC 27858, USA. Tel.: +1 252 328 9686; Fax: +1 252 328 0715; E-mail:[email protected]
Abstract: Within the field of data mining and machine learning, the K-Nearest Neighbor algorithm is a classic algorithm which simply yet elegantly classifies data based upon its similarity to other data. While it follows that the accuracy increases as more data are provided, large data sets are difficult to process serially. It is therefore ideal to perform these tasks in parallel or distributed mode. In this paper, we propose a framework for distributed nearest neighbor classification. A custom K-Nearest Neighbor algorithm was developed using Hadoop, an environment for developing and deploying applications in parallel on a cluster. The algorithm was implemented on a cluster and then tested for accuracy and time of execution. It was observed that the accuracy depends on the provided k-value and on the data set, which is to be expected for the K-Nearest Neighbor process. The time of execution was found to increase logarithmically as the file size, and thus the amount of data the algorithm must parse, increases exponentially.
Keywords: Data mining, distributed data mining, classification, K-Nearest Neighbor, Hadoop
DOI: 10.3233/JCM-160676
Journal: Journal of Computational Methods in Sciences and Engineering, vol. 17, no. S1, pp. S11-S19, 2017
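The abstract describes distributing K-Nearest Neighbor classification across a cluster. As a rough illustration of that general idea only, the sketch below simulates a map step and a reduce step in plain Python. It is not the authors' implementation and uses no Hadoop API; the function names (`map_local_neighbors`, `reduce_vote`, `classify`), the Euclidean distance measure, the majority vote, and the toy data are all assumptions made for this example.

```python
import heapq
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_local_neighbors(split, query, k):
    """Map step (hypothetical): the k closest (distance, label) pairs within one data split."""
    scored = ((euclidean(features, query), label) for features, label in split)
    return heapq.nsmallest(k, scored)


def reduce_vote(candidate_lists, k):
    """Reduce step (hypothetical): merge per-split candidates, keep the global k nearest, vote."""
    merged = heapq.nsmallest(k, (c for lst in candidate_lists for c in lst))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def classify(splits, query, k):
    """Run the map step on every split, then a single reduce step."""
    return reduce_vote([map_local_neighbors(s, query, k) for s in splits], k)


# Toy training data, pre-partitioned into two splits to stand in for cluster nodes.
splits = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a")],
    [((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((0.9, 1.1), "a")],
]
print(classify(splits, (1.0, 0.9), k=3))  # prints "a"
```

Each split only needs to return its own k best candidates, because the global k nearest neighbors must be among them; this keeps the data passed to the reduce step small regardless of split size.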