Issue title: Mathematical Modelling in Computational and Life Sciences
Guest editors: Ahmed Farouk
Article type: Research Article
Authors: Ali, Munwar [a],[*] | Jung, Low Tang [b] | Hosam, Osama [c],[d] | Wagan, Asif Ali [e] | Shah, Rehan Ali [f] | Khayyat, Mashael [g]
Affiliations: [a] Department of IT, Shaheed Benazir Bhutto University, Shaheed Benazirabad, Sindh, Pakistan | [b] Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Malaysia | [c] The College of Computer Science and Engineering in Yanbu, Taibah University, Medina, Saudi Arabia | [d] Informatics Research Institute, The City for Scientific Research and Technology Applications, Alexandria, Egypt | [e] Department of Computer Science, SMIU, Karachi, Pakistan | [f] Department of Computer Systems Engineering, Faculty of Engineering, The Islamia University Bahawalpur, Pakistan | [g] Department of Information Systems and Technology, Faculty of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia
Correspondence: [*] Corresponding author. Munwar Ali, Department of IT, Shaheed Benazir Bhutto University, Shaheed Benazirabad, Sindh, Pakistan. E-mail: [email protected]
Abstract: The k-NN algorithm is an instance-based learning algorithm that is widely used in data mining applications. Its core engine is the distance/similarity function, and its performance varies with the choice of that function. The traditional distance/similarity functions used in k-NN do not handle mixed-mode strings well, that is, cases where one string contains multiple substrings or words. For example, a string may be the two-word “Employee Name”, the one-word “Name”, or a longer string such as “Name of Employee”. This ambiguity affects different distance/similarity functions and makes it difficult to find the perfect match between words. To improve perfect-match calculation in the traditional k-NN algorithm, a new similarity distance metric, named word-distance (w-distance), is developed. A perfect match helps to identify the exact required value. The proposed w-distance is a hybrid of distance and similarity in nature because it handles the dissimilarity and similarity features of strings at the same time. The simulation results showed that w-distance has a better impact on the performance of the k-NN algorithm than the Euclidean distance and the cosine similarity.
Keywords: k-NN algorithm, distance/similarity metric, text match, data mining, cosine similarity
DOI: 10.3233/JIFS-179552
Journal: Journal of Intelligent & Fuzzy Systems, vol. 38, no. 3, pp. 2661-2672, 2020
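The ambiguity described in the abstract can be illustrated with a minimal sketch of the two baseline metrics it names, Euclidean distance and cosine similarity, applied to the abstract's own example strings. The bag-of-words representation and the helper names (`bag_of_words`, `cosine_similarity`, `euclidean_distance`) are assumptions made for illustration only; the sketch does not implement the proposed w-distance, whose definition is not given here.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-split word counts (an assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance between the word-count vectors of two strings."""
    va, vb = bag_of_words(a), bag_of_words(b)
    vocab = va.keys() | vb.keys()
    return math.sqrt(sum((va[w] - vb[w]) ** 2 for w in vocab))


if __name__ == "__main__":
    query = "Name"
    for candidate in ["Name", "Employee Name", "Name of Employee"]:
        print(
            f"{candidate!r}: "
            f"cosine={cosine_similarity(query, candidate):.3f}, "
            f"euclidean={euclidean_distance(query, candidate):.3f}"
        )
```

Under this representation, “Name” matches itself exactly, while “Employee Name” and “Name of Employee” receive intermediate scores that depend only on how many extra words each string carries. This is the kind of mixed-length comparison the abstract says traditional functions handle imperfectly.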