Article type: Research Article
Authors: S, Pradeepa [a]; * | Gaspar, Niveda [a] | Shanmuganathan, Vimal [b]; * | P, Subbulakshmi [c] | Alkhayyat, Ahmed [d] | M, Kaliappan [b]
Affiliations: [a] Department of CSE, SASTRA Deemed University, Thanjavur, Tamil Nadu, India | [b] Department of Artificial Intelligence and Data Science, Deep Learning Lab, Ramco Institute of Technology, Rajapalayam, Tamil Nadu, India | [c] School of Computer Science and Engineering, VIT Chennai Campus, Tamil Nadu, India | [d] Department of Computer Technical Engineering, College of Technical Engineering, The Islamic University, Najaf, Iraq
Correspondence: [*] Corresponding authors: Pradeepa S, Department of CSE, SASTRA Deemed University, Thanjavur, Tamil Nadu, India. E-mail: [email protected]. Vimal Shanmuganathan, Deep Learning Lab, Department of Artificial Intelligence and Data Science, Ramco Institute of Technology, Rajapalayam, Tamil Nadu, India. E-mail: [email protected].
Abstract: A promoter is a short stretch of DNA (100–1,000 bp) where RNA polymerase begins transcribing a gene. A DNA (deoxyribonucleic acid) base pair is the fundamental unit of DNA structure: two complementary nucleotide bases paired within the double helix. The four DNA nucleotide bases are adenine (A), thymine (T), cytosine (C), and guanine (G). Base pairs are the building blocks of the DNA molecule, and their complementary pairing is central to the storage and transmission of genetic information in all living organisms. A promoter normally lies on the 5′ side of the transcription initiation site, immediately upstream of it. Numerous human disorders, notably diabetes, cancer, and Huntington’s disease, have been shown to have DNA promoters as their root cause. The scientific community has long been interested in extracting crucial information about protein-coding genes, and finding promoters is the first step in finding genes in DNA sequences. Consequently, promoter identification has emerged as an intriguing challenge that has attracted numerous researchers in bioinformatics. We propose Gaussian Decision Boundary Estimation in machine learning models to detect transcription start sites (promoters) in the DNA sequences of a common bacterium, Escherichia coli. The best features are identified through a score-based function that selects the nucleotides directly responsible for promoter recognition, in order to maximise the models’ performance. The Gaussian Decision Boundary Estimation based support vector machine model is trained on these features and finds the best hyperplane separating the data into classes. In this study, promoter regions were identified with an accuracy of 99.9%, which is higher than that of other state-of-the-art algorithms.
Another major emphasis of this paper is the comparison of machine learning classification models, in order to identify the model that most accurately predicts promoters in DNA sequences. The study provides analysis to support further biological research as well as precision medicine.
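As a rough illustration of the kind of pipeline the abstract outlines (score-based selection of nucleotide positions, followed by a Gaussian-kernel classifier), the sketch below scores each sequence position against the class label, keeps the top-scoring positions, and evaluates a Gaussian (RBF) kernel on one-hot encodings of those positions. Everything specific here is an assumption made for illustration: the chi-square score, the function names, the `gamma` value, and the four toy sequences are hypothetical and do not come from the paper. The paper's Gaussian Decision Boundary Estimation procedure is not reproduced, and no SVM is trained.

```python
from collections import Counter
from math import exp

BASES = "ACGT"  # fixed order used for one-hot encoding


def position_scores(seqs, labels):
    """Chi-square score of each sequence position against the class label.

    Assumption: chi-square stands in for the paper's unspecified
    score-based function.
    """
    n = len(seqs)
    class_counts = Counter(labels)
    scores = []
    for pos in range(len(seqs[0])):
        base_counts = Counter(s[pos] for s in seqs)
        joint = Counter((s[pos], y) for s, y in zip(seqs, labels))
        chi2 = 0.0
        for base, n_base in base_counts.items():
            for cls, n_cls in class_counts.items():
                expected = n_base * n_cls / n
                observed = joint[(base, cls)]
                chi2 += (observed - expected) ** 2 / expected
        scores.append(chi2)
    return scores


def top_positions(scores, k):
    """Indices of the k highest-scoring positions, in sequence order."""
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sorted(ranked[:k])


def one_hot(seq, positions):
    """One-hot encode the selected positions (4 values per nucleotide)."""
    vec = []
    for p in positions:
        vec.extend(1.0 if seq[p] == b else 0.0 for b in BASES)
    return vec


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)


# Hypothetical toy data: label 1 = promoter, 0 = non-promoter.
seqs = ["TAGC", "TACG", "GCGC", "GCCG"]
labels = [1, 1, 0, 0]

scores = position_scores(seqs, labels)
selected = top_positions(scores, k=2)
encoded = [one_hot(s, selected) for s in seqs]

same_class = rbf_kernel(encoded[0], encoded[1])
cross_class = rbf_kernel(encoded[0], encoded[2])
```

In this toy set only the first two positions vary with the label, so they are the ones selected, and the kernel similarity between the two promoter sequences is higher than between a promoter and a non-promoter. In practice, a library SVM with a Gaussian kernel would be trained on the encoded vectors rather than inspecting kernel values by hand.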
Keywords: Promoter, DNA, bioinformatics, machine learning, Gaussian decision boundary estimation
DOI: 10.3233/IDT-230283
Journal: Intelligent Decision Technologies, vol. 18, no. 1, pp. 613-631, 2024