
Importance of Task Selection for Connected Speech Analysis in Patients with Alzheimer’s Disease from an Ethnically Diverse Sample


Features of linguistic impairment in Alzheimer’s disease (AD) are primarily derived from English-speaking patients. Little is known regarding such deficits in linguistically diverse speakers with AD. We aimed to detail linguistic profiles (speech rate, dysfluencies, syntactic, lexical, morphological, semantic) from two connected speech tasks, Frog Story and picture description, in Bengali-speaking AD patients. The Frog Story detected group differences on all six linguistic levels, compared to only three with picture description. Critically, Frog Story captured the language-specific differences between the groups. Careful consideration should be given to the choice of connected speech tasks for dementia diagnosis in linguistically diverse populations.


Alzheimer’s disease (AD) is characterized by hallmark changes in memory and language [1]. Recent research on the linguistic profile of connected speech in non-English speaking communities indicates that the profile of impairment is not comparable across languages, and certainly not comparable to the impairments deemed characteristic of language breakdown in English [2, 3]. The manifestation of linguistic impairments depends on the structure of the language system [4], and this principle has implications for symptoms of language breakdown in AD.

Recently, we reported that Bengali-speaking AD patients produced fewer pronouns in novel story telling [3], in direct contrast with the overuse of pronouns by English-speaking AD patients [5, 6]. Similarly, Kavé & Levy (2003) reported that Hebrew-speaking AD patients produced a similar proportion of inflected words compared to controls in Cookie Theft picture description [7], whereas a group difference on this measure is typically found in English-speaking AD patients [5, 8]. Thus, profiling of linguistic features is implicitly linked to the specific language structure [8–10]. To elicit these language-specific features, the type of connected speech task used is critical. Research with English speakers has consistently shown that the type of task used (e.g., picture description, story narrative, semi-structured interview) impacts the breadth of linguistic features captured [9–11], with implications for the accuracy of diagnosis.

To date, research comparing linguistic profiles across commonly used connected speech tasks in AD patients who speak South-Asian languages, such as Hindi, Urdu, Bengali, Punjabi, Tamil, and Marathi, has not been published. Petti et al.’s (2020) systematic review of automatic AD detection from speech and language noted the dearth of research on non-European languages in AD and highlighted the urgency of investigating these languages in future studies [10, 12].

The aim of the current study is to detail the linguistic profile from two connected speech tasks, story narrative and picture description, from Bengali speakers with AD. Bengali is a highly inflected pro-drop East Indo-Aryan language [13]. It is currently the seventh most spoken language in the world, with over 265 million speakers. It is the national language of Bangladesh and an official language of three states in India (West Bengal, Tripura, and Assam), and there is a substantial Bengali diaspora in Western and Middle Eastern countries. This research therefore fills a significant gap in the literature for profiling linguistic impairments in ethnically diverse AD populations.


Participants and background assessments

Participants were six right-handed Bengali speakers with a clinical diagnosis of probable AD dementia based on the NINCDS-ADRDA criteria [14], and eight age-, gender-, education-, and language-matched healthy control participants (HC) (Table 1). They were recruited from the Duttanagar Mental Health Centre, Kolkata, eastern India. All participants were native speakers of Bengali and were Bengali-English sequential bilinguals. They were living in a predominantly Bengali-speaking context (i.e., using Bengali at home and at work). At the time of the study, they were living with their families in the urban metropolis of Kolkata. Prior to the onset of AD, they were professionally engaged in the education, business, agriculture, accounting, or engineering sectors. Exclusion criteria for both groups included a known history of alcohol or drug abuse, other neurological or psychiatric illness, and less than ten years of education. Participants underwent a battery of tests to profile general cognitive functioning and activities of daily living (Table 1). All HC performed within the normal range on the test battery. Except for AD07, who had moderate dementia (i.e., Clinical Dementia Rating, CDR, global score of 2), all AD participants had mild dementia (i.e., CDR global score of 1). This study was carried out with ethical clearance from the University of Reading (2017-035-AB).

Table 1

Demographic characteristics and neuropsychological data on the various background measures for each individual with Alzheimer’s disease (AD) as well as Mean and SD of AD and healthy controls (HC) groups. This table is adapted from Bose et al. 2021 [3]

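| Measure | AD01 | AD03 | AD04 | AD06 | AD07 | AD09 | AD Mean | AD SD | HC Mean | HC SD | HC range | z | p | Effect size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Demographic information | | | | | | | | | | | | | | |
| Age at the time of study (y) | 67 | 76 | 78 | 51 | 71 | 56 | 66.5 | 10.89 | 71.7 | 4.2 | 67–78 | -0.65 | 0.52 | -0.17 |
| Education (y) | 15 | 14 | 10 | 15 | 17 | 17 | 14.7 | 2.58 | 16.1 | 1.2 | 15–18 | -1.09 | 0.28 | -0.29 |
| Duration of symptoms (mo) | 36 | 36 | 24 | 12 | 30 | 48 | 31.0 | 12.25 | | | | | | |
| Age at the onset of symptoms (y) | 64 | 73 | 76 | 50 | 68.5 | 52 | 63.9 | 10.82 | | | | | | |
| General cognitive functioning | | | | | | | | | | | | | | |
| Bengali Mini-Mental State Examination, BMSE (/30) | 22 | 20 | 20 | 22 | 14 | 16 | 19.0 | 3.29 | 30.0 | 0 | 30–30 | -3.44 | 0.00 | -0.92 |
| ACE-III, Bengali adapted (/100) | 49 | 40 | 45 | 73 | 27 | 31 | 44.2 | 16.38 | 92.7 | 2.3 | 89–96 | -3.10 | 0.00 | -0.83 |
| Attention (/18) | 11 | 10 | 11 | 13 | 7 | 8 | 10.0 | 2.19 | 17.7 | 0.7 | 16–18 | -3.23 | 0.00 | -0.86 |
| Memory (/26) | 10 | 9 | 12 | 16 | 3 | 4 | 9.0 | 4.90 | 25.3 | 0.7 | 24–26 | -3.15 | 0.00 | -0.84 |
| Fluency (/14) | 4 | 1 | 0 | 9 | 1 | 1 | 2.7 | 3.39 | 8.0 | 1.0 | 7–10 | -2.29 | 0.02 | -0.61 |
| Language (/26) | 16 | 12 | 15 | 24 | 9 | 15 | 15.2 | 5.04 | 25.9 | 0.3 | 25–26 | -3.31 | 0.00 | -0.89 |
| Visuoconstructional (/16) | 9 | 8 | 7 | 11 | 7 | 3 | 7.5 | 2.66 | 15.8 | 0.4 | 15–16 | -3.23 | 0.00 | -0.86 |
| Clinical Dementia Rating (CDR) | 1 | 1 | 1 | 1 | 2 | 1 | 1.2 | 0.41 | 0.0 | 0 | 0–0 | -3.53 | 0.00 | -0.94 |
| Instrumental Activities of Daily Living Scale in Elderly (IADL-EDR) (% impairment) | 20 | 50 | CNT¹ | 11 | 81 | 36 | 39.6 | 27.56 | 0.0 | 0 | 0–0 | -3.34 | 0.00 | -0.93 |

BMSE [15]; ACE-III [16]; CDR [17] (CDR score of 0 = no dementia, 0.5 = questionable dementia, 1.0 = mild dementia, 2.0 = moderate dementia, 3 = severe dementia); IADL-EDR [18] (a score > 16 is in the impaired range, with a higher value representing a higher level of impairment). ¹CNT, could not be tested.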

Experimental tasks

Two connected speech tasks were elicited in Bengali: 1) Story telling using the wordless picture book “Frog, Where Are You?” [19]; 2) Picnic Scene Picture Description from the Western Aphasia Battery-Revised [20]. For both tasks, participants were encouraged to speak in sentences. Other than occasional prompts and generic encouragement, tester interruptions were kept to a minimum. For the Frog Story, participants were given a brief background about the story and were told that the main characters of the story are a boy, his dog, and a frog. Before describing the story based on the pictures, participants looked through the book once. For the Picnic scene, participants were given the Bengali equivalent of the instruction “Tell me everything you see going on in this picture”. Sessions were recorded using an Olympus WS-833 digital voice recorder for subsequent verbatim orthographic transcription. The Frog Story data have been previously published in Bose et al. (2021) to develop a language-specific linguistic profile for Bengali speakers [3].

Quantitative analysis of narrative speech and variables

To capture the multidimensional nature of connected speech, measures for this study were in keeping with the recommendations from recent reviews for linguistic levels that are essential for characterizing AD speech [21, 22]. They aimed at quantifying six different linguistic levels of production: 1) speech rate; 2) structural and syntactic measures; 3) lexical measures; 4) morphological and inflectional measures; 5) semantic measures; and 6) measure of spontaneity and fluency disruptions [3, 5, 6, 21–24]. The Quantitative Production Analysis (QPA) [25] and the Correct Information Unit (CIU) [26] analyses were implemented to calculate a set of count and proportional measures for each sample. The QPA scheme was augmented to capture specific linguistic features of Bengali (e.g., verbal and nominal morphology, proportion of postposition). Supplementary Table 1 provides the full definition of all the variables along with the individual level data. To keep the comparisons between the tasks succinct, we focused on the proportional measures. To ensure reliability, transcriptions and coding were reviewed and agreed upon by multiple authors (AB, MD, NSD). Details on transcription, along with definition and description for the full range of variables can be found in Bose et al. (2021) [3].
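To make the proportional measures concrete, the following minimal sketch derives three of them (words per minute, CIU%, and CIUs per minute) from raw counts. It assumes the conventional definitions: words divided by speaking time in minutes, CIUs as a percentage of total words, and CIUs divided by speaking time in minutes. The exact definitions used in this study are given in Supplementary Table 1 and may differ. The class name `SpeechSample` and the example counts are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Raw counts for one transcribed connected speech sample (hypothetical)."""

    total_words: int
    ciu_count: int
    duration_seconds: float

    def _minutes(self) -> float:
        return self.duration_seconds / 60

    def words_per_minute(self) -> float:
        """Speech rate: total words divided by speaking time in minutes."""
        return self.total_words / self._minutes()

    def ciu_percent(self) -> float:
        """Idea density: CIUs as a percentage of total words."""
        return 100 * self.ciu_count / self.total_words

    def cius_per_minute(self) -> float:
        """Idea efficiency: CIUs divided by speaking time in minutes."""
        return self.ciu_count / self._minutes()


# Illustrative values only.
sample = SpeechSample(total_words=300, ciu_count=180, duration_seconds=240)
print(sample.words_per_minute())  # 75.0
print(sample.ciu_percent())       # 60.0
print(sample.cius_per_minute())   # 45.0
```

Because each measure is normalized by sample length or duration, samples of very different lengths, such as a long story narrative and a short picture description, can be compared on the same scale.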

Statistical analysis

The novelty of these data in a language that has not been investigated before necessitates the capture of both group and individual level performance. We approached the analyses in two ways: group and case-series analyses. We report the comparative pattern of performance across the two tasks between groups. For the group comparisons, the non-parametric equivalent of the independent samples t-test (Mann-Whitney U test) was used for the selected variables. Given that the findings might be informative for an under-researched clinical population and for future larger-scale studies [27, 28], we report exact p-values and effect sizes so that readers can appreciate the strength of these effects. It has been suggested that over-correction of the alpha level risks increasing type II errors (i.e., failing to detect true effects), especially for under-represented clinical populations [27, 29]. In addition, we implemented Crawford and colleagues’ single-subject statistical method of comparing a single case to a small control group (at least five) to identify differences between each AD participant and controls (e.g., [30, 31]). To facilitate understanding of individual variation and to capture the heterogeneity of the AD population, we report the number of participants within the AD group who showed a significant difference from the controls based on Crawford et al.’s single-subject analysis methods (see Table 2).
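The single-case comparison can be sketched as follows. This assumes the modified t-test commonly associated with Crawford and colleagues, in which the case is treated as a sample of one and the control standard deviation is inflated by √((n + 1)/n), giving n - 1 degrees of freedom. Whether the study used exactly this variant is an assumption. The function name `crawford_howell_t` is hypothetical, and the worked example uses the rounded ACE-III values from Table 1 purely for illustration.

```python
import math


def crawford_howell_t(case_score, control_mean, control_sd, n_controls):
    """Return (t, degrees of freedom) for one case versus a small control sample.

    t = (case - control_mean) / (control_sd * sqrt((n + 1) / n)), df = n - 1.
    The test is undefined when the controls show no variance (SD of 0).
    """
    if n_controls < 2:
        raise ValueError("at least two controls are required")
    if control_sd <= 0:
        raise ValueError("control SD must be positive")
    standard_error = control_sd * math.sqrt((n_controls + 1) / n_controls)
    return (case_score - control_mean) / standard_error, n_controls - 1


# Two-tailed critical value of t at alpha = 0.05 with 7 degrees of freedom.
T_CRITICAL_DF7 = 2.365

# AD01 ACE-III score versus the HC group (rounded values from Table 1).
t_value, df = crawford_howell_t(
    case_score=49, control_mean=92.7, control_sd=2.3, n_controls=8
)
print(f"t({df}) = {t_value:.2f}")                  # t(7) = -17.91
print("differs from controls:", abs(t_value) > T_CRITICAL_DF7)
```

The explicit check on the control SD matters for measures where the controls are at ceiling or floor (for example, an SD of 0), because the statistic cannot be computed in that case.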

Table 2

Summary of the key findings across the six linguistic levels of speech and language production for both connected speech tasks (Frog Story and Picnic Picture Description), and information on the proportion of AD individuals who showed similar results to the group differences. Grey shading indicates significant group difference.

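| Variables | Frog Story: AD mean (SD) | Frog Story: HC mean (SD) | Frog Story: p | Frog Story: effect size | Frog Story: direction of effect for AD | Frog Story: # (%) of AD patients showing significant difference | Picnic (WAB-R): AD mean (SD) | Picnic (WAB-R): HC mean (SD) | Picnic (WAB-R): p | Picnic (WAB-R): effect size | Picnic (WAB-R): direction of effect for AD | Picnic (WAB-R): # (%) of AD patients showing significant difference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Speech rate | | | | | | | | | | | | |
| Total number of words | 322.00 (133.43) | 466.00 (211.98) | 0.16 | -0.38 | | | 103.00 (42.21) | 91.43 (41.54) | 0.73 | -0.10 | | |
| Words per minute | 60.07 (29.52) | 135.92 (31.89) | 0.00 | -0.79 | decreased | 5 (83%) | 77.26 (24.94) | 76.93 (4.61) | 0.73 | -0.12 | | |
| Structural and syntactic measures | | | | | | | | | | | | |
| Proportion of words in sentences | 0.86 (0.05) | 0.80 (0.15) | 0.05 | -0.52 | | | 0.99 (0.02) | 0.94 (0.07) | 0.14 | -0.44 | | |
| Mean sentence length | 4.26 (0.64) | 7.68 (0.82) | 0.00 | -0.83 | shorter | 6 (100%) | 4.76 (0.70) | 5.11 (0.68) | 0.63 | -0.16 | | |
| Proportion of well-formed sentences | 0.79 (0.13) | 0.95 (0.06) | 0.01 | -0.68 | lesser | 2 (33%) | 0.58 (0.18) | 0.90 (0.17) | 0.01 | -0.68 | lesser | 4 (67%) |
| Embedding index | 0.03 (0.05) | 0.60 (0.22) | 0.00 | -0.83 | lower | 6 (100%) | | | | | | |
| Lexical measures | | | | | | | | | | | | |
| Proportion of open class words | 0.81 (0.03) | 0.76 (0.04) | 0.03 | -0.57 | increased | | 0.83 (0.06) | 0.82 (0.03) | 0.73 | -0.10 | | |
| Proportion of closed class words | 0. | | | | | | | | | | | |
| Proportion of noun, N (N/all NW) | 0.33 (0.04) | 0.33 (0.03) | 0.48 | -0.19 | | | 0.34 (0.06) | 0.37 (0.05) | 0.63 | -0.14 | | |
| Proportion of pronoun, P (P/all NW) (50%) | | | | | | | | | | | | |
| Proportion of pronoun to noun (P/P+N) (67%) | | | | | | | | | | | | |
| Proportion of verb, V (V/all NW) | | | | | | | | | | | | |
| Proportion of postposition, PP (PP/NW) | | | | | | | | | | | | |
| Number of reduplication | 0.50 (0.55) | 3.00 (2.78) | 0.05 | -0.53 | decreased | 3 (50%) | 1.17 (0.90) | 1.43 (1.40) | 0.95 | -0.02 | | |
| Morphological and inflectional measures | | | | | | | | | | | | |
| Noun inflections | | | | | | | | | | | | |
| Noun inflection index | 0.980. | | | | | | | | | | | |
| Proportion of inflected nouns | 60.95 (14.39) | 58.05 (10.72) | 0.80 | -0.07 | | | 26.81 (8.59) | 26.94 (9.86) | 0.95 | -0.02 | | |
| Proportion of noun with 1 inflection | 0.82 (0.06) | 0.85 (0.09) | 0.56 | -0.16 | | | 0.84 (0.18) | 0.92 (0.15) | 0.30 | -0.33 | | |
| Proportion of noun with 2 or more inflections | 0. | | | | | | | | | | | |
| Proportion of definiteness markers in % | 60.38 (19.95) | 27.09 (12.07) | 0.01 | -0.66 | increased | 5 (83%) | 27.78 (27.22) | 19.10 (13.85) | 0.95 | -0.02 | | |
| Proportion of case markers in % | 39.16 (17.18) | 72.44 (12.56) | 0.01 | -0.72 | decreased | 5 (83%) | 72.22 (27.22) | 83.26 (14.16) | 0.30 | -0.33 | | |
| Verb inflections | | | | | | | | | | | | |
| Verb inflection index | 1. | | | | | | | | | | | |
| Verb complexity score | 1.99 (0.01) | 1.99 (0.04) | 0.92 | -0.03 | | | 3.32 (0.13) | 3.26 (0.55) | 0.84 | -0.06 | | |
| Semantic measures | | | | | | | | | | | | |
| Number of CIU | 135.67 (29.65) | 161.63 (5.71) | 0.01 | -0.70 | fewer | 4 (67%) | 65.83 (21.98) | 74.86 (26.96) | 0.63 | -0.16 | | |
| CIU% (Idea density) | 62.48 (12.44) | 90.87 (5.54) | 0.00 | -0.83 | decreased | 6 (100%) | 67.43 (13.58) | 84.85 (8.55) | 0.02 | -0.67 | decreased | 3 (50%) |
| CIUs per minute (Idea efficiency) | 41.23 (12.34) | 98.24 (15.93) | 0.00 | -0.83 | decreased | 6 (100%) | 49.86 (48.77) | 65.17 (6.51) | 0.01 | -0.67 | decreased | 3 (50%) |
| Measures of spontaneity and fluency disruptions | | | | | | | | | | | | |
| Repetition | 2.83 (2.56) | 0.75 (1.04) | 0.11 | -0.43 | | | 2.33 (2.39) | 0.00 (0.00) | 0.02 | -0.67 | greater | 3 (50%) |
| Revisions | 8.50 (4.59) | 2.25 (2.55) | 0.01 | -0.72 | greater | 3 (50%) | 1.83 (1.64) | 0.14 (0.38) | 0.23 | 0.34 | | |
| Total count of disruptions of fluency (repetition, revision, reformulations) | 11.33 (5.96) | 3.13 (2.90) | 0.01 | -0.71 | greater | 3 (50%) | | | | | | (83%) |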

NW, Narrative Words; CIU, Correct Information Unit.


Between-group comparisons of the profile of linguistic impairments across the two tasks revealed that, for the Frog Story, AD patients showed significant differences from the controls in all six linguistic levels: speech rate, syntactic, lexical, morphological and inflectional, semantic, and spontaneity and fluency measures. In contrast, the picture description task captured differences in only three levels: syntactic, semantic, and spontaneity and fluency measures (see Table 2). With the picture description task, even within the linguistic levels that showed differences, fewer variables differed between the two groups. For example, for the Frog Story, three variables within the syntactic measures were significantly different (i.e., mean sentence length, proportion of well-formed sentences, embedding index) versus only one variable in the picture description (i.e., proportion of well-formed sentences). Furthermore, for significant findings, the effect sizes were stronger for the Frog Story than for picture description. Individual level analyses revealed that, in contrast to picture description, Frog Story resulted in a higher number of AD patients showing significant differences from the control group (see Table 2).

Compared to controls, AD patients’ Frog Story narrative was characterized by a slower rate of speech with increased dysfluencies that were marked by increased reformulation attempts. Their sentences had a smaller mean length, were less well-formed, and were grammatically simpler, with a lower embedding index. The lexical distribution of the production indicated an increased proportion of open-class words with a corresponding decrease in closed-class words, a decreased proportion of pronouns, and a decreased number of reduplications. AD patients produced a similar proportion of inflected lexical items compared to controls. Nouns and verbs that were inflected were inflected correctly without obvious errors. However, they defaulted to simpler types and forms of inflections, as noted by a decrease in case markers and an increase in definiteness markers for nouns. Semantically, their production had a lower number of CIUs, which resulted in lower idea density and efficiency. Unlike Frog Story, picture description was unable to capture the language-specific differences between the groups, showing no difference in pronoun usage, number of reduplications, or the quality of noun inflections.


The key finding of this study is that complex narrative tasks that entail the integration of characters and events within a temporal framework, such as the Frog Story task, capture more differences between Bengali-speaking AD patients and controls than single picture description. Compared to a picture description task, learned or novel story retelling tasks enable speakers to generate a rich and extended language output. For reasons of simplicity and resource constraints, picture description tasks have been most commonly used in connected speech analysis research and clinical practice [32]. However, this study shows that picture description is limiting in terms of the richness, length, and quality of the speech produced.

In the Frog Story, all six linguistic levels (speech rate, dysfluencies, syntactic, lexical, morphological, and semantic) showed significant differences between AD patients and controls, whilst only three linguistic levels showed group differences using the picture description task. Moreover, even for the linguistic levels that showed differences in both tasks, such as syntactic measures, Frog Story resulted in broader data capture, with three variables revealing significant differences (mean sentence length, proportion of well-formed sentences, embedding index) versus only one variable with picture description (i.e., proportion of well-formed sentences). These findings can be attributed to the fact that picture description often encourages listing of items in the picture, as speakers do not need to generate complex and long sentences to describe the image (see Table 2) [33, 34]. Overall, the Frog Story was more sensitive in detecting differences at several different linguistic levels, whereas picture description was most useful in evaluating semantic impairments.

Furthermore, amongst these observed differences, Frog Story captured several language-specific features of Bengali that were not evident in picture description. For example, in Frog Story, the AD patients produced a lower proportion of pronouns in Bengali, which is in direct contrast with the overuse of pronouns by English-speaking AD patients consistently reported in the literature (e.g., [5, 6]). This differential performance in pronoun usage is driven by the pro-drop nature of Bengali, as it allows dropping of the subject nouns [3]. However, a lower proportion of pronouns was not observed in the picture description in Bengali. Similarly, the AD patients defaulted to simpler noun inflections despite being able to produce an equivalent proportion of noun inflections, a pattern that was only evident in Frog Story.

Recruiting large clinical samples remains a perennial difficulty for researchers. This study had six participants with AD. A larger sample of AD participants would be desirable, although such a number is not unusual in clinical studies, particularly where participants belong to an underrepresented group. The methodology was selected to mitigate challenges of generalization. As such, statistical analysis captured findings at both the group and individual levels, offering a comprehensive, detailed, and nuanced approach to the profiling of linguistic impairments in a language that has not yet been studied in depth in neurological impairments. Future research must consider recruitment strategies for these underserved populations to develop larger samples with varying severity and impairment profiles.

These findings highlight the need for researchers and clinicians to pool resources to identify, characterize, and analyze the linguistic features of connected speech among individuals with dementia who speak different languages. Currently, our understanding of linguistic breakdown in dementia in diverse languages is limited, as the vast majority of studies have been conducted with English-speaking participants [22, 23]. Furthermore, studies undertaking linguistic research in under-explored languages should employ a range of tasks and variables to consistently and reliably capture variables that differentiate patients and controls. Moreover, these tasks and variables must be sensitive in capturing language-specific differences. Clinical assessments limited to single picture descriptions would delay identification of early signs of dementia, which in turn would lead to delayed diagnosis and delayed access to pharmacological and non-pharmacological interventions, leading to poorer outcomes for patients.


The first author (AB) was supported by the Centre of Literacy and Multilingualism (CELM) pump priming grant from the University of Reading. The third author (NSD) was supported by the British Academy International Visiting Fellowship Grant (VF1\103620; Visiting Fellowships Programme 2018). We are grateful to the funding bodies for supporting the work. We thank Ms. Athira M. Padmakumar for help with editing and formatting the manuscript. We are indebted to all our participants for their enthusiasm and time for participation in this research.

Authors’ disclosures available online (




References

[1] Snowdon DA, Kemper SJ, Mortimer JA, Greiner LH, Wekstein DR, Markesbery WR (1996) Linguistic ability in early life and cognitive function and Alzheimer’s disease in late life: Findings from the Nun Study. JAMA 275, 528–532.
[2] Auclair-Ouellet N (2015) Inflectional morphology in primary progressive aphasia and Alzheimer’s disease: A systematic review. J Neurolinguistics 34, 41–64.
[3] Bose A, Dash NS, Ahmed S, Dutta M, Dutt A, Nandi R, Cheng Y, D. Mello TM (2021) Connected speech characteristics of Bengali speakers with Alzheimer’s disease: Evidence for language-specific diagnostic markers. Front Aging Neurosci 13, 707628.
[4] Paradis M (1988) Recent developments in the study of agrammatism: Their import for the assessment of bilingual aphasia. J Neurolinguistics 3, 127–160.
[5] Ahmed S, Haigh A-MF, De Jager CA, Garrard P (2013) Connected speech as a marker of disease progression in autopsy-proven Alzheimer’s disease. Brain 136, 3727–3737.
[6] Fraser KC, Meltzer JA, Rudzicz F (2016) Linguistic features identify Alzheimer’s disease in narrative speech. J Alzheimers Dis 49, 407–422.
[7] Kavé G, Levy Y (2003) Morphology in picture descriptions provided by persons with Alzheimer’s disease. J Speech Lang Hear Res 46, 341–352.
[8] Sajjadi SA, Patterson K, Tomek M, Nestor PJ (2012) Abnormalities of connected speech in semantic dementia vs Alzheimer’s disease. Aphasiology 26, 847–866.
[9] Clarke N, Barrick TR, Garrard P (2021) A comparison of connected speech tasks for detecting early Alzheimer’s disease and mild cognitive impairment using natural language processing and machine learning. Front Comput Sci 3, 634360.
[10] Petti U, Baker S, Korhonen A (2020) A systematic literature review of automatic Alzheimer’s disease detection from speech and language. J Am Med Inform Assoc 27, 1784–1797.
[11] Lavoie M, Black SE, Tang-Wai DF, Graham NL, Stewart S, Leonard C, Rochon E (2021) Description of connected speech across different elicitation tasks in the logopenic variant of primary progressive aphasia. Int J Lang Commun Disord 56, 1074–1085.
[12] Mueller KD, Hermann B, Mecollari J, Turkstra LS (2018) Connected speech and language in mild cognitive impairment and Alzheimer’s disease: A review of picture description tasks. J Clin Exp Neuropsychol 40, 917–939.
[13] Thompson H (2010) Bengali: A Comprehensive Grammar. Taylor and Francis, London.
[14] McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR Jr, Kawas CH, Klunk WE, Koroshetz WJ, Manly JJ, Mayeux R (2011) The diagnosis of dementia due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement 7, 263–269.
[15] Das SK, Banerjee TK, Mukherjee CS, Bose P, Biswas A, Hazra A, Dutt A, Das S, Chaudhuri A, Raut D (2006) An urban community-based study of cognitive function among non-demented elderly population in India. Neurol Asia 11, 37–48.
[16] Hsieh S, Schubert S, Hoon C, Mioshi E, Hodges JR (2013) Validation of the Addenbrooke’s Cognitive Examination III in frontotemporal dementia and Alzheimer’s disease. Dement Geriatr Cogn Disord 36, 242–250.
[17] Morris JC (1993) The Clinical Dementia Rating (CDR): Current version and scoring rules. Neurology 43, 2412–2414.
[18] Mathuranath P, George A, Cherian PJ, Mathew R, Sarma PS (2005) Instrumental activities of daily living scale for dementia screening in elderly people. Int Psychogeriatr 17, 461–474.
[19] Mayer M (1969) Frog, where are you? Penguin Books, New York, NY.
[20] Kertesz A (2007) Western Aphasia Battery-Revised.
[21] Slegers A, Filiou R-P, Montembeault M, Brambati SM (2018) Connected speech features from picture description in Alzheimer’s disease: A systematic review. J Alzheimers Dis 65, 519–542.
[22] Filiou R-P, Bier N, Slegers A, Houzé B, Belchior P, Brambati SM (2020) Connected speech assessment in the early detection of Alzheimer’s disease and mild cognitive impairment: A scoping review. Aphasiology 34, 723–755.
[23] Boschi V, Catricala E, Consonni M, Chesi C, Moro A, Cappa SF (2017) Connected speech in neurodegenerative language disorders: A review. Front Psychol 8, 269.
[24] Wilson SM, Henry ML, Besbris M, Ogar JM, Dronkers NF, Jarrold W, Miller BL, Gorno-Tempini ML (2010) Connected speech production in three variants of primary progressive aphasia. Brain 133, 2069–2088.
[25] Berndt RS (2000) Quantitative production analysis: A training manual for the analysis of aphasic sentence production. Psychology Press.
[26] Nicholas M, Obler LK, Albert ML, Helm-Estabrooks N (1985) Empty speech in Alzheimer’s disease and fluent aphasia. J Speech Lang Hear Res 28, 405–410.
[27] Feise RJ (2002) Do multiple outcome measures require p-value adjustment? BMC Med Res Methodol 2, 8.
[28] Perneger TV (1998) What’s wrong with Bonferroni adjustments. BMJ 316, 1236–1238.
[29] Streiner DL, Norman GR (2011) Correction for multiple testing: Is there a resolution? Chest 140, 16–18.
[30] Crawford JR, Garthwaite PH (2002) Investigation of the single case in neuropsychology: Confidence limits on the abnormality of test scores and test score differences. Neuropsychologia 40, 1196–1208.
[31] Crawford JR, Garthwaite PH, Porter S (2010) Point and interval estimates of effect sizes for the case-controls design in neuropsychology: Rationale, methods, implementations, and proposed reporting standards. Cogn Neuropsychol 27, 245–260.
[32] de la Fuente Garcia S, Ritchie CW, Luz S (2020) Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer’s disease: A systematic review. J Alzheimers Dis 78, 1547–1574.
[33] Stark BC (2019) A comparison of three discourse elicitation methods in aphasia and age-matched adults: Implications for language assessment and outcome. Am J Speech Lang Pathol 28, 1067–1083.
[34] Wright HH, Capilouto GJ (2009) Manipulating task instructions to change narrative discourse performance. Aphasiology 23, 1295–1308.