Natural language processing (NLP) aided qualitative method in health research
Issue title: The Need for Innovations in Healthcare Systems Using Patient Experience and Advancing Information Technology
Guest editors: Varadraj P. Gurupur, Thomas T.H. Wan, Rama Raju Rudraraju and Shrirang A. Kulkarni
Article type: Research Article
Authors: Cheligeer, Cheligeer [a, b] | Yang, Lin [c, d, e] | Nandi, Tannistha [f] | Doktorchik, Chelsea [b, e] | Quan, Hude [b, e] | Zeng, Yong [a] | Singh, Shaminder [e, g, *]
Affiliations: [a] Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada | [b] Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada | [c] Department of Cancer Epidemiology and Prevention Research, Alberta Health Services, Calgary, AB, Canada | [d] Department of Cancer, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada | [e] Department of Community Health Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada | [f] Department of Information Technologies, Research Computing Services, University of Calgary, Calgary, AB, Canada | [g] School of Nursing and Midwifery, Faculty of Health, Community and Education, Mount Royal University, Calgary, AB, Canada
Correspondence: [*] Corresponding author: Shaminder Singh, E-mail: [email protected].
Abstract: Qualitative data analysis is performed frequently in healthcare settings and is a time-consuming, skilled analytic task. Translating qualitative research findings into clinical settings can take years, by which time the knowledge is sometimes obsolete because the health context is dynamic. Artificial intelligence (AI)-based qualitative data analysis might offer rapid, real-time analysis of text-based data, thereby empowering qualitative researchers to expedite their analysis and facilitate timely use of the research findings. We tested an AI-based method to complement the manual analysis of text-based data from the verbatim transcripts of interviews with seven mall managers. First, we converted the text data into a machine-calculable format and employed a BERT model to extract sentence-level features. Second, we implemented TF-IDF-based keyword mining techniques to extract the main candidate themes from the interview transcripts to support text-based analysis, including 1) a primary cluster detection algorithm and 2) a keyword extraction algorithm. The extracted core themes provide qualitative researchers with a more comprehensive overview of the qualitative data. Most of the sentences clustered into meaningful short topics or were sentences carrying independent and clear information. The extracted topics and clustered sentences reduced qualitative researchers’ workload by condensing, identifying, and naming meaningful concepts. This method, which combines contextualized word embeddings, unsupervised clustering, and keyword extraction techniques, can significantly reduce the overall workload and time that conventional qualitative research methods require.
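The keyword extraction step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sentences have already been embedded and clustered, treats each cluster as one document, and ranks terms with a smoothed TF-IDF score. The function names, the stopword list, and the two toy clusters are all hypothetical and are not data from the study.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are",
             "we", "our", "for", "on", "at", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def top_keywords(clusters, k=3):
    """Treat each cluster of sentences as one document and rank its terms by TF-IDF."""
    docs = {label: Counter(tokenize(" ".join(sentences)))
            for label, sentences in clusters.items()}
    n_docs = len(docs)
    # Document frequency: the number of clusters in which each term appears.
    df = Counter(term for counts in docs.values() for term in counts)
    result = {}
    for label, counts in docs.items():
        total = sum(counts.values())
        scores = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically so ties are deterministic.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[label] = ranked[:k]
    return result


# Toy clusters (assumed output of an earlier embedding + clustering step).
clusters = {
    0: ["Security staff patrol the mall every night.",
        "We hired more security staff for the parking lot."],
    1: ["Cleaning crews disinfect the food court.",
        "The food court is cleaned after every lunch rush."],
}
keywords = top_keywords(clusters, k=2)
print(keywords)
```

Terms that occur in every cluster (here, "every") receive a lower weight, so the top-ranked terms tend to be those that distinguish one cluster from the others, which is what makes them usable as candidate theme labels.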
Keywords: Qualitative health research, grounded theory, natural language processing, machine learning, artificial intelligence, text clustering, keyword extraction
DOI: 10.3233/JID-220013
Journal: Journal of Integrated Design and Process Science, vol. 27, no. 1, pp. 41-58, 2023