Molecular Gene Expression Testing to Identify Alzheimer’s Disease with High Accuracy from Fingerstick Blood

Abstract

Background:

There is no molecular test for Alzheimer’s disease (AD) using self-collected samples, nor is there a definitive molecular test for AD. We demonstrate an accurate and potentially definitive TempO-Seq® gene expression test for AD using fingerstick blood spotted and dried on filter paper, a sample that can be collected in any doctor’s office or can be self-collected.

Objective:

Demonstrate the feasibility of developing an accurate test for the classification of persons with AD from a minimally invasive sample of fingerstick blood spotted on filter paper which can be obtained in any doctor’s office or self-collected to address health disparities.

Methods:

Fingerstick blood samples from patients clinically diagnosed with AD, Parkinson’s disease (PD), or asymptomatic controls were spotted onto filter paper in the doctor’s office, dried, and shipped to BioSpyder for testing. Three independent patient cohorts were used for training/retraining and testing/retesting AD and PD classification algorithms.

Results:

After initially identifying a 770 gene classification signature, a minimum set of 68 genes was identified providing classification test areas under the ROC curve of 0.9 for classifying patients as having AD, and 0.94 for classifying patients as having PD.

Conclusions:

These data demonstrate the potential to develop a screening and/or definitive, minimally invasive, molecular diagnostic test for AD and PD using dried fingerstick blood spot samples that are collected in a doctor’s office or clinic, or self-collected, and thus can address health disparities. Whether the test can classify patients with AD earlier than is possible with cognitive testing remains to be determined.

INTRODUCTION

Alzheimer’s disease (AD) is the most common form of dementia, affecting more than 55 million people worldwide with more than 10 million new cases/year.1 It is estimated that three-fourths of those with AD worldwide have not been diagnosed, and on average it can take over 2 years for a diagnosis of AD when one is made.2 In the United States, 6.5 million people live with AD, with a risk of developing AD of 1 in 5 for women and 1 in 10 for men.3 The prevalence of AD is higher among non-Hispanic Black and Hispanic Americans, populations that experience health disparities reducing access to healthcare. This may result in under-diagnosis, particularly at earlier stages of disease.

Estimates of AD prevalence are based on cognitive testing because there is no FDA approved definitive molecular test. By “definitive” we mean a test that, if positive for AD, does not require further testing other than to stratify patients and assess their progression. Currently, a definitive diagnosis of AD requires a battery of tests including clinical phenotype of AD and biomarker evidence of AD pathology. Thus, when a patient presents with clinical manifestations of possible AD, a definitive diagnosis is based on assessment of patient history, neurological exam, objective cognitive/functional assessment to establish a clinical phenotype commonly associated with AD, scans (MRI, CT, PET) to rule out hemorrhages, strokes, etc., a positive immunoassay for tau and amyloid-β (Aβ), and may include positive FDG PET/CT and Aβ PET scans.4 This series of tests requires access to a medical specialist and numerous laboratory visits. While immunoassays measuring Aβ and tau proteins from cerebrospinal fluid or plasma have received FDA clearance as in vitro diagnostic devices, they are not sufficient alone to diagnose AD.5–10 Cognitively unimpaired subjects can have a positive tau or amyloid immunoassay test and be classified as at-risk for progression to a dementia but may never develop clinical manifestations of AD within their lifetime.4 Thus, the intended use of these tests is for patients already being evaluated for AD.6

Not only is there no definitive molecular test for AD, but there is no molecular test that uses a minimally invasive sample permitting collection in any doctor’s office or self-collection that can be used to screen persons to identify those who should seek further diagnosis for AD. Furthermore, the number of AD patients greatly outweighs the number of neurologists, making it important to have a test that can be carried out on a sample that can be collected in any doctor’s office or clinic, to identify those who should be seen for further diagnosis.11,12 Finally, the lack of access that many patients have to any doctor, much less a neurologist, leads to health disparities, creating a need for a classification test that can use a self-collected sample without having to see any doctor first. We report a classifier test for AD that uses a sample that can be collected in any doctor’s office or clinic, or that can be self-collected, to inform subjects that they should see a neurologist for diagnosis. While there is the potential for this test to provide a definitive diagnosis, it has clinical utility even if it is not definitive because of how easy it is to collect the sample, and it has utility as a research use assay generally in the field of dementia research.

There are many reports of gene expression signatures for AD and PD that have utilized RNA extracted from whole blood or from white cells or were derived bioinformatically from databases of such samples.13–28 All these used blood obtained by venipuncture. While most did not correlate their results to biomarker assays, one did, demonstrating a good correlation (classifying 24 of 28 positive patients) to the CSF amyloid biomarker test.18 Another reported an accuracy of 74–77% predicting AD in patients with mild cognitive impairment (MCI) two years prior to diagnosis of AD.21 Thus, there is good evidence that a whole blood test can be used to classify patients with AD.

TempO-Seq® is an assay platform that uses crude sample lysates to measure the expression of focused sets of genes up to the whole transcriptome.29 Seeking to develop a minimally invasive test, we pursued the use of dried blood spots on filter paper prepared from a fingerstick as the assay sample. Others have profiled RNA extracted from blood spotted on filter paper, but not from AD patients.30 Because the TempO-Seq assay does not require RNA extraction, we chose to directly test punches of dried blood collected on filter paper. The TempO-Seq assay has also been shown to provide the same quality data from highly degraded RNA (RIN 3.0) as from high quality RNA samples, a feature we believed would be beneficial for profiling RNA within the spotted blood samples.29 This TempO-Seq Dried Blood Spot (TempO-SeqDBS) assay was used to determine whether patients clinically diagnosed with AD could be identified and differentiated from asymptomatic controls and patients clinically diagnosed with PD.

MATERIALS AND METHODS

We tested cohorts of patient samples to identify a gene signature and algorithm that classified patients with AD and differentiated them from controls and PD. While subtypes of AD based on gene expression profiling have been described, our objective was to identify a test that could classify patients as having AD regardless of disease subtype.31 Thus, we used an approach that identified an AD classifier signature that was in common to all patients in the training set. We subsequently retrained and retested the algorithm on independent cohorts of samples to confirm the ability of the test to classify patients.

Patients being seen at the Neurology Center of Southern California with a clinical diagnosis of AD or PD, and asymptomatic controls, were consented under an IRB-approved protocol (Solutions IRB, protocol ID 0417). The clinical diagnosis of AD followed the clinical guidance published in 2011 by the National Institute on Aging and Alzheimer’s Association (NIA-AA), and involved a thorough medical evaluation, including a review of the patient’s medical history, physical examination, cognitive testing using, e.g., the Mini-Mental State Examination or Montreal Cognitive Assessment, neurological evaluation including MRI/CT imaging to exclude other causes of the cognitive functional loss, and lab testing to exclude metabolic diseases that could account for the cognitive function loss. The clinical diagnosis of PD followed the guidance published by the UK Parkinson’s Disease Society Brain Bank for PD, and was based on medical history, review of symptoms, and neurological and physical exam, identifying the presence of characteristic motor symptoms of bradykinesia, rigidity, postural instability, and resting tremor.

Blood was obtained by a fingerstick, and 2–3 drops were spotted on Whatman filter paper to generate a spot with a diameter of ∼1.5 cm, which was air dried, de-identified, and then shipped to BioSpyder and tested. Two areas, each 1.6 mm in diameter, were collected using a hand punch and placed into a microplate well, with four replicate wells per patient sample (providing 4 within-patient biological replicates). Samples were lysed by heating at 95°C for 10 min in 2 μL of Denaturation Buffer covered with 10 μL of mineral oil. The assay was initiated by adding 2 μL of an Annealing Mix containing detector oligos from the commercial TempO-Seq Whole Transcriptome Whole Blood assay v2.1 (measuring all ∼21,000 genes) and incubated at 70°C for 10 min, followed by a temperature ramp to 45°C over 50 min, then held at 45°C for 16 h before being cooled to 25°C. Next, 20 μL of Whole Blood Nuclease Mix was added to the aqueous layer (achieved by adding 20 μL nuclease buffer and then transferring the supernatant into a fresh microplate well and adding concentrated nuclease mix), and samples were incubated for 1.5 h at 37°C before adding 24 μL of Ligation Mix to each well, incubating 1 h at 37°C, then incubating for 15 min at 80°C, before dropping the temperature to 25°C. From each well, 10 μL containing the sample-specific ligated products was transferred for amplification into TempO-Seq PCR Primer Plates containing sample-specific indexed universal forward and reverse PCR primers and PCR mix. After 30 cycles, the ligated detector oligos for each sample were uniquely indexed and sequencing primers incorporated into the PCR product, to prepare a sequenceable product. From each sample well, 5 μL of the PCR product was pooled into a sequencing library, purified using the Macherey-Nagel NucleoSpin Gel and PCR Cleanup Kit (catalog number 740609.50), and quantified by absorbance (using the 260/230 and 260/280 ratios) and Qubit fluorescence. The filter paper TempO-Seq processing kits with the reagents described above are now commercially available, for use with the content of any TempO-Seq assay.

Sequencing was carried out on an Illumina NextSeq to an average depth of ∼6 M reads/sample. Demultiplexing was carried out on the sequencer to correctly associate individual FASTQ files (and ultimately counts) with each indexed sample. Commercial TempO-SeqR™ software was then used for alignment to generate count tables and evaluate the sequencing run. The quality control metric for analysis was repeatability between replicates, determined using a Pearson correlation: any of the four replicates with a correlation of R < 0.9 to the remaining replicates was removed from analysis as an outlier, and at least 3 replicates had to remain for each sample. We calculated the NSig80 (the number of probes that capture 80% of the signal) and the NCov10 (the number of probes receiving fewer than 10 reads) for each sample, and performed analysis to assure that distributions of these metrics were consistent between the classes of controls, AD, and PD.
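The replicate filter and the two per-sample metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the TempO-SeqR implementation: the log2(count + 1) transform, the use of the median correlation to the other replicates as the outlier criterion, and all function names are hypothetical choices made for this example.

```python
import math
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def keep_replicates(replicates, min_r=0.9, min_keep=3):
    """Return indices of replicates that pass the correlation filter.

    A replicate is dropped when its median Pearson r against the other
    replicates, computed on log2(count + 1) values, is below min_r.
    (Median and log2(count + 1) are assumptions of this sketch.)
    Returns None when fewer than min_keep replicates survive.
    """
    logged = [[math.log2(c + 1) for c in rep] for rep in replicates]
    r_to_others = {i: [] for i in range(len(logged))}
    for i, j in combinations(range(len(logged)), 2):
        r = pearson(logged[i], logged[j])
        r_to_others[i].append(r)
        r_to_others[j].append(r)
    kept = [i for i, rs in r_to_others.items() if median(rs) >= min_r]
    return kept if len(kept) >= min_keep else None


def nsig80(counts, fraction=0.8):
    """Number of top-ranked probes needed to capture `fraction` of all reads."""
    total = sum(counts)
    running = 0
    for n_probes, c in enumerate(sorted(counts, reverse=True), start=1):
        running += c
        if running >= fraction * total:
            return n_probes
    return len(counts)


def probes_below(counts, threshold=10):
    """Number of probes with fewer than `threshold` reads (NCov10 as described)."""
    return sum(1 for c in counts if c < threshold)


# Toy example: three concordant replicates and one scrambled replicate.
base = [10, 100, 1000, 50, 5, 500, 20, 200]
replicates = [
    base,
    [11, 105, 980, 52, 6, 510, 19, 205],
    [9, 98, 1020, 47, 5, 495, 22, 190],
    list(reversed(base)),
]
kept = keep_replicates(replicates)
```

In the toy data the fourth replicate correlates poorly with the other three and is dropped, leaving the three replicates the filter requires.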

RESULTS

Components of whole blood have previously been shown to interfere with certain steps in the TempO-Seq protocol, an interference we were successful in overcoming to establish the TempO-SeqDBS protocol.32 We discovered that after spotting on filter paper and drying, interferences were eliminated by the TempO-SeqDBS protocol, permitting gene expression to be profiled. The reproducibility of whole transcriptome data measured by the TempO-SeqDBS assay from different replicate regions within the spot tested for each donor is shown as Pearson correlations (Fig. 1). The correlations were calculated after log transforming the data, for all possible pairwise comparisons between the replicates for each donor sample, using the expression of all genes as a variable so that significance values could be calculated for the overall comparison rather than at a gene level. The coefficients reported in Fig. 1 were all significant (p < 0.05). Assay repeatability between fingerstick samples collected and tested from the same subject on different days over a period of 400 days is depicted in Fig. 2. In this case, all the replicates for the same time point were averaged and log transformed, the values were correlated, and the Pearson correlation for each comparison was plotted.

Fig. 1

Histogram showing the distribution of the Pearson r correlation between replicates as a measure of reproducibility. Four different areas of the filter paper spotted with finger stick blood from each donor were tested as replicates. The Pearson correlations between all the possible pairwise comparisons between the replicates for each donor sample were calculated after log transforming the data, and the results depicted as a histogram of the number of samples grouped into each Pearson coefficient value.

Fig. 2

Pearson r correlation of log-transformed counts to quantify reproducibility of gene expression measurement from the same donor, different days. Samples from the same donor were spotted and tested on different days as controls over a period of 400 days. The replicates for the same time point were averaged and the average log transformed values were correlated using the Pearson correlation, comparing days 0, 7, 15, 24, 30, and 400. The Pearson correlations are indicated as the values of the heatmap.

The TempO-SeqDBS whole transcriptome assay was used to test AD, PD, and control samples to identify signatures of differentially expressed genes to be subsequently used to identify classification algorithms for AD and PD. A first cohort (cohort A) of samples was profiled using the whole transcriptome assay. The majority (24 controls, 28 AD, and 27 PD) passed the quality control replicate sample metric (see Methods). A set of genes was identified that provided a differentiating signature by performing pairwise comparisons between the three classes of patients. To account for the presence of multiple dementia subtypes in our dataset, we contrasted different “classes” of samples (AD versus control, PD versus control, and AD versus PD) by testing hundreds of random subsets and performing differential gene expression analyses (DESeq2). The genes that proved to be differentially expressed in at least 33% of the tests (n = 770 differentially expressed genes across the control versus AD, control versus PD, and AD versus PD contrasts, Table 1) were used as a signature differentiating the three classes of patients.
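The "differentially expressed in at least 33% of the tests" selection can be sketched as a simple frequency count over the random-subset results. This is an illustrative sketch only: `de_test` is a hypothetical callable standing in for a DESeq2 contrast (which is run in R), and the number of tests, the 80% subset fraction, and the seed are assumptions, not values from the study.

```python
import random
from collections import Counter


def recurrent_genes(de_calls, min_fraction=0.33):
    """Genes flagged as differentially expressed in at least min_fraction of tests.

    de_calls: list of sets, one set of flagged gene names per random-subset test.
    """
    hits = Counter(gene for call in de_calls for gene in call)
    n_tests = len(de_calls)
    return {gene for gene, n in hits.items() if n / n_tests >= min_fraction}


def subsample_de_calls(group_a, group_b, de_test, n_tests=200, fraction=0.8, seed=0):
    """Run de_test on repeated random subsets of two sample groups.

    de_test is a hypothetical stand-in for a DESeq2 contrast: it takes two
    lists of samples and returns the set of gene names it calls significant.
    """
    rng = random.Random(seed)
    calls = []
    for _ in range(n_tests):
        sub_a = rng.sample(group_a, max(1, int(fraction * len(group_a))))
        sub_b = rng.sample(group_b, max(1, int(fraction * len(group_b))))
        calls.append(de_test(sub_a, sub_b))
    return calls


# Toy example with precomputed calls from four tests:
# A is flagged in 4/4 tests, B in 2/4, C in only 1/4 (below the 33% cut-off).
calls = [{"A", "B"}, {"A"}, {"A", "C"}, {"A", "B"}]
signature = recurrent_genes(calls)
```

Requiring a gene to recur across many random subsets favors genes that separate the classes regardless of which patients happen to be drawn, which matches the stated aim of a signature common to all subtypes.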

Table 1

List of the genes within the 770 and 68 gene signatures

770 Gene signature
FCRL2, H1F0, SERPINA1, FOLR3, NPIPA1, CSF2RB, JCHAIN, BLVRB, MYZAP, ADTRP, RUNX3, CD6, ZDHHC2,
JUNB, GNAI2, BTG2, OGFRL1, CYP4F3, FCGR2A, RPL4, SH2D1B, SLC6A8, FCMR, FCGR3B, GYPC, RGCC,
CRISPLD2, S100A12, DUSP1, MPIG6B, FMN1, MMP25, KCNJ15, BCL2A1, ODC1, RHOG, IER2, IGHA2, ABCA7,
EIF1AY, HLA-DRA, CXCR1, IGLV3-21, AMD1, RAB11FIP1, TUBB1, FYN, RPL13, HEBP1, CCL4, RAB31, RPL19,
BSG, OST4, RPL3, HAGH, CD8A, HLA-DRB1, HIST1H4K, MAP4K1, ARHGAP15, TREM1, FYB1, PDZK1IP1, UBB,
UTY, SEPTIN7, RBM38, WNK1, HLA-DRB4, ALDOA, CD3D, ADM, NFAM1, HNRNPC, PROK2, FGL2, KCNE3,
TRBC1, FAM126B, FOS, CD79A, SRSF5, USP48, CD22, SELENBP1, CXCR2, IDS, BST2, GIMAP5, S100A6,
HLA-DRB5, TNFSF13B, TCL1A, OAS3, OGT, MXD1, VPS8, SHISA4, PVALB, NBPF9, CCNI, TAP2, HBA2,
VMP1, GNA13, IFI30, GZMB, ATP6V0C, XPO6, ANXA2, ZNF417, AIF1, PLCB2, MGAT4A, RPS4X, OR1S2,
NAGK, SLC25A39, SIRPB1, SERPING1, GNLY, VCAN, CST3, FCN1, SIRPA, TPCN2, KLF3, FXYD5, ARHGEF40,
HAL, TMCO1, CEP85L, CA1, TRDC, RPL12, LCP1, SNAP23, HLA-C, TMCC3, TIMP2, C6orf62, PAIP2,
NBPF12, PLXNB2, RPL23, RGS18, BCL2, YWHAZ, UBN1, SMCHD1, RNF182, PGGHG, FAM153A, SLC25A37, RPS2,
KLF4, FPR1, ALAS2, PARP1, CXCL8, TNFAIP8, TAF1D, AZU1, OSTF1, CD163, PRDX6, RNF213, VCL,
DENND4A, ZBTB1, GCA, PFN1, MAPK1, LRP1, IGHV3-23, NQO2, CASP1, TRGC1, UCP2, PSMF1, TENM1,
CAVIN2, MUC20, VSIR, IFIT1, MKRN1, TSPAN5, IL1R2, LMBRD1, UBA52, RPL36, NKG7, SMIM5, TCP11L2,
APEX1, ANXA11, UBALD2, USP12, VIM, B2M, ALDH2, CTSD, ITGB3, RAC2, CORO1B, CD33, AP1S2,
MMP9, DCUN1D1, SLC35A2, GABARAPL2, CD3E, IFI6, NEAT1, DGKA, NPRL3, MSN, CNOT1, INKA2, WASHC4,
RALGPS2, AC011462.1, WLS, S100B, USP4, NLRC5, FOSB, CTSB, ALOX5AP, RNF130, TSC22D3, SPTB, KRT5,
STRADB, NBEAL2, RPS23, NCF2, DDX3Y, FTH1, GBP5, FCGR3A, NKX3-1, NUP50, RPL28, TMEM154, BCL2L1,
SORL1, SOCS3, HNRNPH1, HCK, IFIT2, MPEG1, RASSF2, RNASET2, IER5, TOMM7, ZNF728, IL6R, AGO4,
SOX6, YPEL3, ISG15, AC124319.1, CD53, PABPC1, RPL9, MT2A, HLA-DQB1, MRC2, BNIP3L, LRRK2, SLC6A6,
CCL3L1, N4BP2L2, HIST1H4E, TMSB10, SLC11A1, PHC2, GPX1, RPL18, CD3G, EVI2A, BCL6, PHOSPHO1, ZDHHC18,
AHSP, TRIM25, CD68, ITGAM, EMC3, KDM5C, PECAM1, VPS13B, SH3BGRL, ADGRG3, PTGS2, DCAF12, CEACAM4,
HK3, EPB42, MAP2K3, STAT1, UBE2L6, OCLN, NACA2, XK, ORAI2, ADGRE2, AGTRAP, BZW1, ABCB10,
EMP3, RPL35A, ELF1, CD58, PLEKHG2, EPB41, MTRNR2L9, HLA-DRB3, WASHC1, S100A8, C9orf78, CD74, TYMP,
HIST1H2BO, DICER1, CDC34, CARD8, TMEM176B, TM9SF2, CDC42, LGALS3, STEAP4, LITAF, KDM4C, STMP1, DEFA1,
MAX, CD226, TUBA1A, ATP6V0E1, LPIN2, TAGLN2, MICAL2, FCGR1B, HBZ, RPL23A, IGHG1, SH3BGRL3, LEF1,
SLFN5, RPS27, IGF2R, ZFAND6, TRIM22, PPP3R1, GATA1, GIMAP7, GMPR, SLC16A3, FBXO7, PCED1A, CTDSP1,
S100P, FBXL5, CA4, MXI1, CTSS, AGAP6, S100A4, DDT, VTI1B, MTRNR2L8, SEC62, UBXN6, SDCBP,
LTA4H, LILRA3, PIM2, MSR1, RPS6, FTL, APOBEC3A, MYL9, MED18, APLP2, NRGN, TMCC2, SRPRA,
VNN2, CRIP1, IRF1, SERF2, RPL21, KLF1, SLC25A28, RNF149, P2RX5-TAX1BP3, GLRX5, ABI3, HDGF, LAPTM5,
CD247, CR1, RGS2, NPM1, THNSL2, MX1, ADGRE5, PINK1, IFI44, ABCC4, LFNG, KAT2B, FAM210B,
DGLUCY, LAP3, PPM1F, RPL26, HNRNPD, IL10RA, SLC38A5, AATK, TBC1D10C, HSP90AB1, KRT14, ASAH1, MEFV,
LYL1, PTGS1, GZMH, PYCARD, SRP14, LILRA4, CDR1, AQP9, M6PR, SIGLEC10, HIST1H1C, S100A9, LYZ,
EIF4G2, TCEA1, MYL12B, CISD2, ZFC3H1, DENND2D, STX7, YBX1, RPL22, ANXA1, IGLC2, PGM5, HIST1H1E,
GADD45B, C15orf54, CMTM6, RPIA, SCPEP1, CYP27A1, IFIT1B, NBPF14, FCER1G, ZNF141, LAMP2, JUN, TMSB4X,
MME, CSF1R, SRRM2, SPOPL, TYROBP, SOD2, GNAQ, PSMB9, ESPN, HVCN1, CMTM1, TRANK1, NCOA4,
HBA1, GNAS, SLC35C2, JAK1, SMG1, NIBAN3, IGHD, HSPB1, RESF1, ISCA1, TRGC2, LY6E, PIM1,
EIF2S3B, IFITM1, PI3, TAPBP, RAB18, DGAT2, SLC7A5, RIOK3, FAM153B, EIF3L, OSBP2, P2RY13, HLA-A,
ATP5MG, ARF6, FPR2, RGS10, BTNL3, MT1L, RAB2B, MGME1, HSPA6, RAI2, STAT6, POTEF, CDC42SE1,
TSC22D4, JUND, UBE2O, CFL1, NBPF8, HSPA8, CALM2, SLC44A2, GIMAP4, RPS8, IQGAP1, EFHD2, IFITM3,
NBPF26, SELL, ARHGAP9, DDX39B, LCK, RNF10, ADSS, ACOX1, TFDP1, ARPC3, YY1AP1, OTUB1, PPDPF,
PPM1A, SRGN, CD46, CMPK2, HNRNPA1, ORMDL3, GNS, PLSCR1, MDGA1, PPP1CB, PTBP1, GANC, MDM4,
ITGAX, VAMP3, ATF6B, OGA, SLC38A1, IGSF6, COTL1, LILRB3, IVNS1ABP, SRSF7, NOSIP, CD83, CD52,
PGD, RGPD1, ACTB, ITGB1, FCAR, NMI, IFITM2, CRTAP, BCL11B, TXNIP, KIF20B, GUK1, TMEM123,
ATG16L2, SPOCK2, TPST1, UHMK1, MALAT1, ACTN1, MANBA, SULF2, F13A1, ZYX, PILRA, GSPT1, PRR13,
MX2, PITHD1, CD37, SRGAP2, MYO1F, PPBP, CD300E, PNISR, FBXW7, NTSR1, GID4, RPS16, GUCD1,
VNN3, GZMA, NAGA, POTEJ, CD5, GPR146, TRIM58, KLRC2, IFIT5, SP110, CCR2, DEFA1B, PSAP,
ABHD18, TPT1, TMEM30A, NPIPB11, DDX5, PIP4K2A, HBG1, ACAP2, IGHM, CPSF1, HIST1H2BD, ANK1, ARL6IP1,
EVI2B, PADI4, TMOD1, EIF2AK1, FAM104A, RPS27A, CHCHD2, ZNF83, SIRPB2, HIPK3, ICAM3, APAF1, IGF2BP2,
AOAH, MTRNR2L6, WSB1, MARCH8, SPARC, CD36, MBNL1, SMARCA2, FAM117A, FOXO3, RHOH, SIGLEC9,
NT5C3A, FBXL13, CFD, OPTN, ARHGDIB, LILRA5, ACSL1, TMEM164, S100A11, TUBB2A, BTF3, R3HDM4,
FCER2, NINJ2, RFK, ITGB2, TLN1, TANK, PRF1, SRSF2, RPS4Y1, DMTN, AKR1E2, RBM39,
RPS3, RSRP1, TCF7, FECH, RHOQ, SEC14L1, INSIG1, CD8B, FCGR2C, RNF11, MPP1, RBBP4,
HLA-E, PTBP3, UBE2H, UIMC1, FGFR1OP2, NCF4, ITGA2B, SF3B1, DPM2, ITGAL, TRIM10, RBM33,
C4orf3, PRPF4B, TIMM10, TRGV4, SMIM1, SAT1, SLC15A4, CCL4L2, MGAM, HEMGN, CYBB, VENTX,
ELOB, LST1, FLNA, MNDA, KDM5D, RAF1, EGR1, PAK2, SH3KBP1, FCGR1A, RPS20, NUSAP1,
ADIPOR1, SLC2A3, GBP2, LDHB, FOXO4, DEFA3, MYL4, HMOX1, STK17B, NPIPB4, ACTG1, LGALS2,
TMEM50A, CSF3R, TNFRSF10C, PRKAR1A, GTF2IP1, RPS15A, HBM, HCAR3, ARRDC3, FKBP8, MS4A1, NBPF19,
ZEB2, OAZ1, THEMIS2, CCR7, PLEK2, TCIRG1, YBX3, CHI3L1, RSAD2, PXN, UBE2W, TENT5C

68 Gene signature
ACTB, ALAS2, ARHGAP9, ARHGDIB, B2M, BNIP3L, C9orf78, CRIP1, CSF3R, CTDSP1,
DCAF12, DDX5, DEFA3, FCGR3A, FCMR, FPR1, FTL, GABARAPL2, GADD45B, GNAS,
GNLY, GPX1, GYPC, HBA1, HBG1, HDGF, HLA-A, HLA-C, IFIT1, IGF2R,
IGLV3-21, ITGB2, LST1, LYZ, MAP2K3, MGAM, MMP25, MNDA, MTRNR2L9, MYO1F,
NCF2, NEAT1, NKG7, NPRL3, PDZK1IP1, PGGHG, PIP4K2A, PRF1, PROK2, PSMF1,
RFK, RNASET2, S100A6, S100A9, S100P, SEC62, SLC2A3, SORL1, TENT5C, TOMM7,
TRBC1, TRIM22, TRIM58, UBB, VSIR, YBX1, YBX3, YWHAZ

Before proceeding to identify classification algorithms, the identified signature genes were grouped into pathways to explore/establish the translational significance of the differentially expressed signature genes. We performed an over-representation analysis in well-annotated biological processes (p value threshold of 0.05). The predominant pathways were immune response pathways (Fig. 3), with the three most significantly differentially expressed pathways related to the involvement of neutrophils in the immune and inflammatory response.
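For readers unfamiliar with over-representation analysis, the p value for a single pathway is commonly computed as the upper tail of a hypergeometric distribution. The sketch below illustrates that calculation only; the text does not state which statistical test or software was used, so the choice of test and the parameter names are assumptions.

```python
from math import comb


def ora_pvalue(n_background, n_in_pathway, n_signature, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap).

    n_background: genes measured in total
    n_in_pathway: background genes annotated to the pathway
    n_signature:  genes in the signature being tested
    n_overlap:    signature genes that fall in the pathway
    """
    max_overlap = min(n_in_pathway, n_signature)
    total = comb(n_background, n_signature)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_signature - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: 10 background genes, 5 in the pathway, and a 5-gene
# signature that overlaps the pathway completely.
p = ora_pvalue(10, 5, 5, 5)
```

A pathway is then reported as over-represented when its p value (usually after multiple-testing adjustment) falls below the chosen threshold, 0.05 in this study.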

Fig. 3

Gene expression pathway analysis. The differentially expressed genes in the AD classifier signature were used to carry out analysis of what molecular pathways were differentially expressed. Gene set enrichment was performed using the Gene Ontology database for biological processes with a p value threshold of 0.05. The 20 most significant identified pathways are depicted, ranked by log adjusted p value for each.

We used different machine learning approaches, together with the 770 gene signature, to build algorithms that differentiated control, AD, and PD. The methods used were k-nearest neighbors, random forest, support vector machine (SVM) with a linear, polynomial, or radial basis function (rbf) kernel, and Extreme Gradient Boosting (XGBoost); for all of these models, hyper-parameters were tuned using a random grid search approach. Randomly sampled subsets accounting for 80% of the initial dataset were used to train these classifiers, which were then tested on the remaining 20% of the samples. For each classifier an area under the receiver operating characteristic (ROC) curve (ROC-AUC) was calculated for classification of both AD and PD. The SVM model (kernel = linear, gamma = 0.1, c = 0.001) resulted in the best performance for the classification of both AD (ROC-AUCAD = 0.81) and PD (ROC-AUCPD = 0.88), as shown in Fig. 4. This model was used without further retraining to classify an independent cohort of 22 AD patients (cohort B), 18 of whom were predicted as AD while the remaining 4 were predicted as PD. The two cohorts were pooled and the same SVM model was retrained using 80% of the samples and tested with the remaining 20%. The AUC for AD increased to 0.86, and the AUC for PD was 0.87 (Fig. 5). The model misclassified one control as AD, one AD as control, and one PD as AD, thus providing an accuracy of 97% across all 106 samples. Thus, although the average age was 67 for controls, 78 for AD, and 68 for PD, and some patients had co-morbidities, these factors did not appear to have significantly affected classification or the 97% accuracy of the test.
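A minimal sketch of the 80/20 train/test evaluation described above, using scikit-learn. The data are synthetic stand-ins rather than study data, and the sample count, feature count, class shifts, and random seeds are assumptions chosen for illustration; only the linear kernel and C = 0.001 are taken from the text.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data: 90 samples x 20 "genes", three classes.
rng = np.random.default_rng(0)
labels = np.repeat(["control", "AD", "PD"], 30)
X = rng.normal(size=(90, 20))
X[labels == "AD", :5] += 2.0    # shift a few features for the AD class
X[labels == "PD", 5:10] += 2.0  # and a different few for the PD class

# 80/20 split, stratified so every class appears in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=0
)

# Linear-kernel SVM with the C value reported in the text.
# gamma is not used by scikit-learn's linear kernel, so it is omitted.
clf = SVC(kernel="linear", C=0.001, decision_function_shape="ovr")
clf.fit(X_train, y_train)

# One-vs-rest ROC-AUC per class, computed from the decision scores.
scores = clf.decision_function(X_test)
auc = {
    str(cls): roc_auc_score((y_test == cls).astype(int), scores[:, i])
    for i, cls in enumerate(clf.classes_)
}
```

Each entry of `auc` treats one class as "positive" and the other two as "negative", which is one common way to obtain separate AD and PD ROC curves from a three-class model.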

Fig. 4

ROC Curve for classification of Cohort A samples as AD and PD. The ROC curves are shown for the classification of patient samples as AD or PD using the best-performing model, the Support Vector Machine (SVM) model (kernel = linear, gamma = 0.1, c = 0.001), and the initially identified 770 gene signature. The model was trained using 80% each of the control, AD, and PD samples in cohort A. As a test for classification accuracy, the trained model was used to classify the remaining 20% of samples. The resulting ROC-AUC values were 0.81 for AD and 0.88 for PD.

Fig. 5

ROC Curve for Classification of the combined Cohort A and B samples. The ROC curves for the classification of patient samples using the Support Vector Machine (SVM) model (kernel = linear, gamma = 0.1, c = 0.001) and the initially identified 770 gene signature are depicted after pooling together Cohorts A and B to provide a sample set of 106 samples, training with 80% of the samples, and testing with the remaining 20%. For this combined cohort, the ROC-AUC was 0.86 for AD and 0.87 for PD.

The 770 gene signature was further investigated in an effort to identify the smallest subset that could reliably classify patients using the SVM model. Briefly, 100 random training sets were selected from the original datasets A, B, and C, each consisting of 80% of the total samples. SVM models were trained for each random dataset and tested on the remaining 20%. For each of the 770 genes a weight score was calculated at each iteration, and these scores were summed to obtain a cumulative value. These values were sorted in descending order and ROC curves were calculated for the AD classifications using the genes with the 5 highest cumulative scores. This analysis was repeated using gene sets spanning between 6 and 100 of the top scorers. The same approach was used to select the genes with the highest weight for the classification of PD patients. A steady maximum level of AUC was observed when the first 47 of the top scoring genes were selected for both the AD and PD classifications, with 13 of those genes being in common between AD and PD. The final signature consisted of a total of 68 genes. This signature set of 68 genes (see Table 1) was used to recalculate the ROC-AUC for the classification analysis. Figure 6 shows the ROC curves for datasets A, B, and C, providing an ROC-AUCAD = 0.9 and ROC-AUCPD = 0.94, outperforming the 770 gene classifier. Across all 106 samples, two were misclassified, providing an accuracy using the 68 gene signature of 98%.
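The ranking step described above can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the text does not say how the per-iteration weight score was defined, so the sketch sums absolute weights (for example, linear SVM coefficients), and the gene names and weights are hypothetical.

```python
from collections import defaultdict


def rank_by_cumulative_weight(weight_runs):
    """Rank genes by summed absolute weight across training iterations.

    weight_runs: list of dicts mapping gene name -> weight for one iteration
    (for a linear SVM these could be the fitted coefficients).
    Ties are broken alphabetically so the ranking is deterministic.
    """
    totals = defaultdict(float)
    for run in weight_runs:
        for gene, weight in run.items():
            totals[gene] += abs(weight)
    return sorted(totals, key=lambda g: (-totals[g], g))


def combined_signature(ranked_ad, ranked_pd, k):
    """Union of the top-k genes from the AD and PD rankings."""
    return set(ranked_ad[:k]) | set(ranked_pd[:k])


# Toy example: three iterations over four hypothetical genes.
ad_runs = [
    {"g1": 0.9, "g2": -0.4, "g3": 0.1, "g4": 0.0},
    {"g1": 0.8, "g2": -0.5, "g3": 0.2, "g4": 0.1},
    {"g1": 1.0, "g2": -0.3, "g3": 0.1, "g4": 0.0},
]
pd_runs = [
    {"g1": 0.1, "g2": 0.6, "g3": -0.9, "g4": 0.0},
    {"g1": 0.0, "g2": 0.7, "g3": -0.8, "g4": 0.1},
    {"g1": 0.2, "g2": 0.5, "g3": -1.0, "g4": 0.0},
]
ranked_ad = rank_by_cumulative_weight(ad_runs)
ranked_pd = rank_by_cumulative_weight(pd_runs)
signature = combined_signature(ranked_ad, ranked_pd, k=2)
```

In practice k would be increased step by step, with the ROC-AUC recomputed at each step, until the AUC stops improving; genes shared by the two top-k lists are counted once in the combined signature.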

Fig. 6

ROC Curve for classification of Cohort A, B, and C samples as AD and PD, 68 gene signature. The ROC curves are shown for the classification of patient samples as AD or PD using the Support Vector Machine (SVM) model (kernel = linear, gamma = 0.1, c = 0.001) and minimal 68 gene signature. The model was trained using 80% each of the control, AD, and PD samples in cohorts A, B, and C. As a test for classification accuracy, the trained model was used to classify the remaining 20% of samples. The resulting ROC-AUC values were 0.9 for AD and 0.94 for PD.

Having established the classification algorithm for AD based on cognitive/functional assessment of the subjects, we assessed how this correlated to Aβ PET scores to begin assessing whether the TempO-SeqDBS AD classifier might be definitive. Because Aβ PET scans were not reimbursable until after the samples were collected (reimbursement was approved October 2023), the set of patients for which scans were available was limited. There were Aβ PET scores available for 12 of the patients clinically diagnosed with AD. Nine patients were scored as positive for plaques in excess of those typical for their age, supporting a diagnosis of AD, while three patients were scored as negative, not significantly different from the degree of plaque formation for a person their age. All twelve AD patients, whether their Aβ PET scores were positive or negative, were classified as AD by the TempO-SeqDBS test. In addition, one patient diagnosed with PD had both a positive Aβ PET scan and a positive DaTscan (used to distinguish between PD, where it is positive for loss of dopamine, and essential tremor), and was classified as PD by the TempO-SeqDBS assay.

DISCUSSION

These TempO-SeqDBS test results demonstrate not only the feasibility of identifying gene expression signatures from fingerstick whole blood spotted on filter paper, but also signatures for AD and PD that enabled patients to be classified as AD or PD. These data indicate the feasibility of implementing a test for AD (and PD) that uses a minimally invasive sample that can not only be collected in any doctor’s office or clinic, but also can be self-collected, and thus, address health disparities, whether a definitive test or a screen that identifies patients who should be seen for further assessment.

The ROC-AUC values of 0.9 and 0.94 for AD and PD, respectively (Fig. 6) for the combined Cohorts A, B, and C, were excellent, indicating strong putatively diagnostic performance. While we acknowledge the risk of overfitting data from small sample sizes, the consistent results for the additional independent AD patient Cohort B of 22 samples, tested separately after retraining the 770 gene signature with all the Cohort A samples, strengthened the validity and generalizability of our findings. Two of the Cohort A, B, and C AD samples were classified as PD, resulting in 98% accuracy calling AD. Thus, the assay provides highly accurate classification whether based on the AUC of 0.9 for AD (Fig. 6) or this final measure of 98% accuracy. It also accurately classified PD (AUC = 0.94, Fig. 6), but the number of PD samples was more limiting, and this will be the focus of a follow-up investigation.

Demonstrating that patients with a clinical diagnosis of AD can be classified from fingerstick blood using a gene expression assay represents the first step towards developing a screening test that informs subjects to see a neurologist for diagnosis earlier than they might otherwise have sought diagnosis, and, potentially, a definitive diagnostic test. With the recent lifting of the curb on reimbursement for Aβ PET scans for AD patients (October 2023), a positive Aβ PET scan compared to non-AD individuals of the same age may also be required for definitive diagnosis. However, in the meantime, Aβ PET scans serve as a useful benchmark test, though they are not themselves definitive.

While Aβ PET scores were only available for 12 (24%) of the 50 patients clinically diagnosed with AD during the sample collection period (as patients had to pay for these scans themselves), 9 were Aβ PET positive and 3 were negative. That all 9 patients with positive Aβ PET scores were also classified as AD by the TempO-SeqDBS test suggests that this classifier may be specific for AD compared to other dementias. That it is not just correlated with positive Aβ PET scores is supported by the observation that one patient clinically diagnosed with PD had both a positive Aβ PET scan and a positive DaTscan, and this patient was classified by the TempO-SeqDBS test as PD, not AD. A much larger cohort of AD and PD samples with associated Aβ PET scores and DaTscan data is necessary to confirm this conclusion.
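To illustrate why a larger cohort is needed, the following sketch computes an exact (Clopper-Pearson) two-sided 95% confidence interval for the 9 of 9 concordance between positive Aβ PET scores and AD classification. This calculation is an illustration added here, not an analysis reported by the study; the helper names are hypothetical, and the two-sided convention (alpha/2 in each tail) is an assumption.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the solution lies at larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound: the p at which P(X >= k) = alpha/2, i.e. P(X <= k-1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# 9 of 9 Abeta PET-positive patients were classified as AD (numbers from the text).
lower, upper = clopper_pearson(9, 9)
print(f"9/9 concordant: 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the lower bound is about 0.66, so a perfect 9 of 9 result is statistically compatible with a true concordance rate as low as roughly two thirds.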

Additionally, the classification of all three patients who were clinically diagnosed with AD but had Aβ PET scores that were not indicative of AD requires further investigation. Research demonstrating that a blood-based gene expression assay can classify AD patients as much as two years before diagnosis supports the possibility that the TempO-SeqDBS test can identify patients with AD before they become Aβ PET positive.21 To determine how early in the progression of AD the TempO-SeqDBS test can classify patients, longitudinal samples need to be collected and tested from individuals before they develop dementia (both those with and without MCI). This will allow us to determine whether classification can occur before patients are biomarker or Aβ PET positive, and whether all patients classified as AD by the TempO-SeqDBS test will eventually become biomarker and Aβ PET positive.

While we do not have data addressing whether changes in gene expression were connected to disease progression or to disease response, the three most significantly differentially expressed pathways in AD blood relate to the involvement of neutrophils in the immune and inflammatory response, consistent with the literature.32–35 Among other significant pathways were the cytokine-mediated signaling pathway, cellular response to cytokine stimulus, and antigen receptor-mediated signaling pathway, all consistent with the literature on immune cell function in AD.36,37 Whole blood single-cell gene expression studies are planned to address which cells contribute to the classifier, and to pursue the association of specific cells with AD.

Once validated, the TempO-SeqDBS test could be used in several scenarios, particularly because there is a shortage of neurologists, with availability varying widely by a patient’s geographic location, which reduces access to care, increases health disparities, and worsens patient outcomes.11,12 One potential application of the TempO-SeqDBS AD test is as a screening tool for patients who present to their general practitioner (GP) with concerns about early-stage AD or PD. The test results could then be used to identify individuals who require further evaluation. Another potential use is self-testing, which could directly reduce health disparities and would likewise help identify individuals who should seek further evaluation. Thus, even without establishing that the test is definitive, using the TempO-SeqDBS AD test for screening would allow neurologists to see patients earlier in the course of their disease, allowing for earlier intervention and potentially greater benefit from therapy. With the recent approval of donanemab and lecanemab for the treatment of early-stage AD, earlier diagnosis can lead to significant benefits for patients. The TempO-SeqDBS AD test could help patients realize these benefits by eliminating the delay caused by limited access to medical specialists, even if the diagnosis itself cannot be made any sooner than with traditional cognitive testing and biomarker analysis conducted by a neurologist. There is also the potential that the TempO-SeqDBS test could be definitive. To reach this conclusion, it will be necessary to correlate classification by the TempO-SeqDBS assay with biomarker tests and Aβ PET scan data, demonstrating that patients classified as AD either are, or over time become, biomarker and Aβ PET scan positive.

AUTHOR CONTRIBUTIONS

Bruce E. Seligmann (Conceptualization; Funding acquisition; Project administration; Supervision; Visualization; Writing – original draft; Writing – review & editing); Salvatore Camiolo (Data curation; Formal analysis; Software; Validation; Writing – review & editing); Monica Hernandez (Investigation; Methodology; Writing – review & editing); Joanne M. Yeakley (Methodology; Writing – review & editing); Gregory Sahagian (Resources; Visualization; Writing – review & editing); Joel McComb (Project administration; Resources; Writing – review & editing).

ACKNOWLEDGMENTS

We thank Joelle McComb, Gail Ramirez and April Tenorio for accruing subjects and obtaining the blood samples.

FUNDING

Funding for this work was provided in part by the National Institutes of Health, National Institute on Aging, grant 1R43 AG065039.

CONFLICT OF INTEREST

Bruce Seligmann, Salvatore Camiolo, Monica Hernandez, Joanne Yeakley, and Joel McComb are current employees of BioSpyder Technologies, Inc., which has commercialized the TempO-SeqDBS assay and may commercialize a test for Alzheimer’s or other dementias based on this platform and the data presented. Greg Sahagian is CEO of the Neurology Center of Southern California which received funding to provide samples.

DATA AVAILABILITY

The data supporting the findings of this study are openly available in ENA (https://www.ebi.ac.uk/ena/) with the study accession number PRJEB71651.

REFERENCES

1. World Health Organization. Dementia - WHO 2022. https://www.who.int/news-room/fact-sheets/detail/dementia (2022, accessed 1 September 2022).

2. Gauthier G, Rosa-Neto P, Aorais JJ, et al. World Alzheimer Report 2021: Journey through the diagnosis of dementia. London: Alzheimer’s Disease International (2021).

3. Alzheimer’s disease facts and figures. Alzheimers Dement (2022); 18: 700–789.

4. Dubois B, Villain M and Frisoni GB. Clinical diagnosis of Alzheimer’s disease: recommendations of the international working group. Lancet Neurol (2021); 20: 484–496.

5. Thijssen EH, Verberk IMW, Kindermans J, et al. Differential diagnostic performance of a panel of plasma biomarkers for different types of dementia. Alzheimers Dement (Amst) (2022); 14: e12285.

6. Hansson O, Seibyl J, Stomrud E, et al. CSF biomarkers of Alzheimer’s disease concord with amyloid-β PET and predict clinical progression: A study of fully automated immunoassays in BioFINDER and ADNI cohorts. Alzheimers Dement (2018); 12: 1470–1481.

7. Gobom J, Parnetti L, Rosa-Neto P, et al. Validation of the LUMIPULSE automated immunoassay for the measurement of core AD biomarkers in cerebrospinal fluid. Clin Chem Lab Med (2022); 60: 207–219.

8. Jiao B, Liu H, Guo L, et al. Performance of plasma amyloid β, total tau, and neurofilament light chain in the identification of probable Alzheimer’s disease in south China. Front Aging Neurosci (2021); 13: 749649.

9. Shaw LM, Hansson O, Manuilova E, et al. Method comparison study of the Elecsys® β-amyloid (1-42) CSF assay versus comparator assays and LC-MS/MS. Clin Biochem (2019); 72: 7–14.

10. Willemse EAJ, Van Maurik IS, Tijms BM, et al. Diagnostic performance of Elecsys immunoassays for cerebrospinal fluid Alzheimer’s disease biomarkers in a nonacademic, multicenter memory clinic cohort: the ABIDE project. Alzheimers Dement (Amst) (2018); 10: 563–572.

11. Majersik JJ, Ahmed A, Chen I-H A, et al. A shortage of neurologists – we must act now: A report from the AAN 2019 transforming leaders program. Neurology (2021); 96: 1122–1134.

12. Lin CC, Callaghan BV, Burke JF, et al. Geographic variation in neurologist density and neurologic care in the United States. Neurology (2021); 96: e309–e321.

13. Li H, Honf G, Lin M, et al. Identification of molecular alterations in leukocytes from gene expression profiles of peripheral whole blood of Alzheimer’s disease. Sci Rep (2017); 7: 14027.

14. Tang R and Liu H. Identification of temporal characteristic networks of peripheral blood changes in Alzheimer’s disease based on weighted gene co-expression network analysis. Front Aging Neurosci (2019); 11: 83.

15. Rahman R, Islam T, Zaman T, et al. Identification of molecular signatures and pathways to identify novel therapeutic targets in Alzheimer’s disease: Insights from a systems biomedicine perspective. Genomics (2020); 112: 1290–1299.

16. Patel H, Iniesta R, Stahl D, et al. Working towards a blood-derived gene expression biomarker specific for Alzheimer’s disease. J Alzheimers Dis (2020); 74: 545–561.

17. Han G, Wang J, Seng F, et al. Characteristic transformation of blood transcriptome in Alzheimer’s disease. J Alzheimers Dis (2013); 35: 373–386.

18. Rye PD, Booig BB, Grave G, et al. A novel blood test for the early detection of Alzheimer’s disease. J Alzheimers Dis (2011); 23: 121–129.

19. Voyle N, Keohane A, Newhouse S, et al. A pathway based classification method for analyzing gene expression for Alzheimer’s disease diagnosis. J Alzheimers Dis (2016); 49: 659–669.

20. Milanesi E, Dobre JM, Cucos CA, et al. Whole blood expression patterns of inflammation and redox genes in mild Alzheimer’s disease. J Inflamm Res (2021); 14: 6085–6102.

21. Roed L, Grave G, Lindahl T, et al. Prediction of mild cognitive impairment that evolves into Alzheimer’s disease dementia within two years using a gene expression signature in blood: A pilot study. J Alzheimers Dis (2013); 35: 611–621.

22. Lunnon K, Ibrahim Z, Proitsi P, et al. Mitochondrial dysfunction and immune activation are detectable in early Alzheimer’s disease blood. J Alzheimers Dis (2012); 30: 685–710.

23. Lee T and Lee H. Prediction of Alzheimer’s disease using blood gene expression data. Sci Rep (2020); 10: 3485.

24. Shigemizu D, Mori T, Akiyama S, et al. Identification of potential blood biomarkers for early diagnosis of Alzheimer’s disease through RNA sequencing analysis. Alzheimers Res Ther (2020); 12: 87.

25. Leandro GS, Evangelista AF, Lobo RR, et al. Changes in expression profiles revealed by transcriptomic analysis in peripheral blood mononuclear cells of Alzheimer’s disease patients. J Alzheimers Dis (2018); 66: 1483–1495.

26. Santiago JA, Bottero V and Potashkin JA. Evaluation of RNA blood biomarkers in the Parkinson’s disease biomarkers program. Front Aging Neurosci (2018); 10: 157.

27. Gao A. Identification of blood-based biomarkers for early stage Parkinson’s disease. medRxiv (2020); doi: https://doi.org/10.1101/2020.10.22.20217893. Posted October 27, 2020.

28. Henderson AR, Wang A, Meechoovet B, et al. DNA methylation and expression profiles of whole blood in Parkinson’s disease. Front Genet (2021); 12: 640266.

29. Yeakley JM, Shepard PJ, Goyena DE, et al. A trichostatin A expression signature identified by TempO-Seq targeted whole transcriptome profiling. PLoS One (2017); 12: e0178302.

30. McDade TW, Ross K, Fried R, et al. Genome-wide profiling of RNA from dried blood spots: convergence with bioinformatic results derived from whole venous blood and peripheral blood mononuclear cells. Biodemography Soc Biol (2016); 62: 182–197.