Using Extracellular miRNA Signatures to Identify Patients with LRRK2-Related Parkinson’s Disease

Background: Mutations in the Leucine Rich Repeat Kinase 2 gene are highly relevant in both sporadic and familial cases of Parkinson’s disease. Specific therapies are entering clinical trials but patient stratification remains challenging. Dysregulated microRNA expression levels have been proposed as biomarker candidates in sporadic Parkinson’s disease. Objective: In this proof-of-concept study we evaluate the potential of extracellular miRNA signatures to identify LRRK2-driven molecular patterns in Parkinson’s disease. Methods: We measured expression levels of 91 miRNAs via RT-qPCR in ten individuals with sporadic Parkinson’s disease, ten LRRK2 mutation carriers and eleven healthy controls using both plasma and cerebrospinal fluid. We compared miRNA signatures using heatmaps and t-tests. Next, we applied group sorting algorithms and tested sensitivity and specificity of their group predictions. Results: miR-29c-3p was differentially expressed between LRRK2 mutation carriers and sporadic cases, with miR-425-5p trending towards significance. Individuals clustered in principal component analysis along mutation status. Group affiliation was predicted with high accuracy in the prediction models (sensitivity up to 89%, specificity up to 70%). miR-128-3p, miR-29c-3p, miR-223-3p, and miR-424-5p were identified as promising discriminators among all analyses. Conclusions: LRRK2 mutation status impacts the extracellular miRNA signature measured in plasma and separates mutation carriers from sporadic Parkinson’s disease patients. Monitoring LRRK2 miRNA signatures could be an interesting approach to test drug efficacy of LRRK2-targeting therapies. In light of the small sample size, the suggested approach needs to be validated in larger cohorts.


INTRODUCTION
The majority of familial Parkinson's disease (fPD) cases is caused by mutations in the Leucine Rich Repeat Kinase 2 (LRRK2) gene [1,2], with the G2019S mutation being the most common [3]. Despite its apparent role in PD, LRRK2 expression levels in the CNS are relatively low compared to peripheral tissues such as the lung and the kidneys, as well as blood monocytes [4,5]. Increased kinase activity is thought to be the cause of the disturbed cellular homeostasis observable in cell models of PD [6]. Most importantly, LRRK2 was shown to play a role in sporadic cases, with non-coding variants in the LRRK2 gene increasing the risk of PD [7]. Further, sporadic PD (sPD) and fPD cases show overlapping clinical features [8]. Studying fPD cases therefore offers the opportunity to better understand the molecular pathogenesis of both fPD and sPD cases.
Targeting multiple molecular pathways in selected PD patients with an individually formulated combination of drugs could be a potential strategy for disease modification [9]. However, this strategy necessitates not only the discovery of particular druggable pathways, but also the careful selection of patients who are most likely to benefit from a given treatment, ideally at an early or even asymptomatic stage in the course of their disease. While aberrant LRRK2 activity represents such a molecular target and LRRK2 inhibitors are approaching clinical testing [10,11], pre-selecting patients as well as monitoring target engagement and drug efficacy remain challenging. There are no established biomarkers available that can aid in the early detection of disease onset before significant neurodegeneration occurs.
Recently, microRNAs (miRNAs), small noncoding RNA molecules that are involved in the posttranscriptional regulation of gene expression [12], have been intensively studied as potential biomarkers for neurodegenerative diseases, including PD [13,14]. So far, most biomarker studies using miRNAs have focused on the differentiation of sPD patients from healthy controls (HC), while in the present proof-of-concept study we aimed to test how well extracellular miRNAs can be used to reliably differentiate between sPD patients and LRRK2 mutation carriers (LRRK2 MC). Since the distinction between fPD and sPD is possible by using more stringent methods, such as sequencing, in the long run we aim to identify individuals with relevant involvement of LRRK2-dependent pathways among the sPD population. However, in a first step we decided to define what constitutes a molecular LRRK2 fingerprint by comparing the extracellular miRNA signatures of fPD and sPD cases. Given that LRRK2 was described to change the cellular miRNAome [15], we hypothesized that mutations in the LRRK2 gene also alter the extracellular miRNA signature in LRRK2 MC.

Experimental design
In a first step, we quantified the expression levels of 91 extracellular miRNAs from a cohort of ten sPD patients, ten LRRK2 MC, and eleven HC in both plasma and CSF using RT-qPCR (Fig. 1). After processing of raw Ct values and calculation of log2 fold change values (log2fc), as described in more detail in the respective method section, we performed t-tests to identify single miRNAs with the potential of discriminating between groups. We further applied group sorting algorithms such as principal component analysis (PCA), least absolute shrinkage and selection operator (LASSO) and random forest (RF) using the calibrated Ct values of all miRNAs that could be reliably detected throughout the cohort. The ability to differentiate between sPD patients and LRRK2 MC with and without PD was assessed for each sorting algorithm by evaluating their sensitivity and specificity. Finally, we selected those miRNAs that had a significant impact on the performance of each model.

Fig. 1. Study Design Overview. The experimental cohort comprised 31 individuals categorized into three groups: individuals carrying a LRRK2 mutation, sporadic PD patients and healthy controls. A total of 91 miRNAs, selected by reviewing relevant literature, were quantified in CSF and plasma samples using reverse transcription and qPCR. Subsequently, the obtained raw Ct values were calibrated and fc and log2fc values were calculated. miRNAs that were expressed only in subgroups were excluded. Group differences were then analyzed using t-tests, PCA, Random Forest and LASSO regression. Created with BioRender.com.

Ethics approval and consent to participate
The study was approved by the Hertie Institute for Clinical Brain Research Biobank and the ethics committee of the medical faculty of the University of Tübingen and University Clinic Tübingen (project ID: 199/2011B01). All participants gave written, informed consent.

Cohort design
For each LRRK2 MC, one sPD patient and one HC were added to the study cohort. To obtain homogeneous groups, individuals were selected based on age, gender, and disease duration. As the histopathological complexity of PD likely increases over time, patients with long disease durations might no longer display pathologies specific to mutation status but rather show features common to all PD patients; disease durations were therefore kept as short as possible. The group of LRRK2 MC initially included a PD patient carrying two LRRK2 variants (N1437S, S1647). Given that these polymorphisms were not reported to be pathogenic, we finally decided to exclude this patient from the analysis. The final cohort therefore comprised three groups (Table 1): ten LRRK2 MC (fPD patients: n = 6 (LRRK2 G2019S: n = 3; LRRK2 G2019S/G1819: n = 1; LRRK2 R1441C: n = 1; LRRK2 I2020T: n = 1), asymptomatic G2019S mutation carriers: n = 4), ten sPD patients and eleven HCs. After performing PCA as described in more detail in the respective method section, the data set of one of the asymptomatic LRRK2 MC appeared as a technical or biological outlier and was removed from the plasma dataset before further analysis (Supplementary Figure 1). Clinical features are reported in Table 1.

miRNA selection and qPCR panel design
We scanned the literature for miRNAs that were previously reported to be dysregulated in the context of LRRK2 mutations, functionally relevant to LRRK2 or generally dysregulated in the context of PD. Next, we selected a total of 91 miRNAs to be included in our customized qPCR panel (Supplementary Table 1). Selection was based on a set of criteria, including the number of relevant studies reporting dysregulation and a preference for miRNAs assessed in CSF and plasma samples over miRNAs reported in brain tissue or animal models. Additionally, miRNAs associated with LRRK2-linked PD were given priority over those associated with sporadic PD or other neurodegenerative diseases. For quality control purposes, we included a set of spike-in controls. Those included UniSpike 2 and 4 from the QIAGEN RNA Spike-In Kit (Cat. No.: 339390, QIAGEN, Hilden, Germany), which were used to monitor RNA isolation efficacy, UniSpike 6 (included in the QIAGEN miRCURY LNA RT Kit) to verify the efficacy of reverse transcription (RT) and UniSpike 3 (pre-applied primers on the custom qPCR plates) for inter-plate calibration. Additionally, a blank spot containing no primer was included and functioned as a negative control.

Collection and storage of biofluids from study participants
Blood and CSF samples were collected from individuals according to standardized protocols. Within 90 min after collection from the individual, the collected material was brought from the medical facilities to the Neuro-Biobank Tuebingen and immediately processed. CSF and EDTA plasma were centrifuged at 2000 × g for 10 min. Samples were then transferred to a 15 ml Falcon tube, mixed thoroughly by vortexing, and aliquoted to cryotubes. Finally, both CSF and plasma cryotubes were frozen and stored at -80 °C in the biobank. Before RNA isolation, samples were slowly thawed on ice.

RNA isolation
Extracellular RNA was isolated from plasma and CSF using the QIAGEN miRNeasy Serum/Plasma Advanced Kit (Cat. No.: 217204, QIAGEN) and following the vendor's instructions with slight modifications. Briefly, 200 µl of plasma or 400 µl of CSF were used as a starting volume and subsequent reaction volumes were adjusted accordingly. After addition of RPL buffer and incubation for three minutes, we added 1 µl of the spike-in mix provided in the QIAGEN RNA Spike-In Kit (Cat. No.: 339390, QIAGEN) to the tube, in order to monitor the efficacy of the RNA isolation. Finally, RNA elution volume was reduced to 7 µl of RNase-free water.

Reverse transcription
The RT reaction was performed using the QIAGEN miRCURY LNA RT Kit (Cat. No.: 339340, QIAGEN). The reaction mix included 2 µl of 5x miRCURY SYBR® Green RT Reaction Buffer, 1 µl of 10x miRCURY RT Enzyme Mix, 4.5 µl of RNase-free water, 0.5 µl of UniSpike 6 RNA spike-in and 2 µl of template RNA per sample. After brief centrifugation, tubes were incubated at 42 °C for 60 min, followed by 5 min of incubation at 95 °C, which inactivated the RT reaction. Finally, samples were diluted 1:40 using RNase-free water.

qPCR
qPCR was performed using QIAGEN miRCURY LNA miRNA Custom PCR Panels (Cat. No.: 339330, QIAGEN) along with the QIAGEN miRCURY LNA SYBR Green PCR Kit (Cat. No.: 339347, QIAGEN). The reaction mix included 515 µl of 2x miRCURY SYBR® Green Master Mix, 115 µl of RNase-free water and 400 µl of the cDNA template per sample. After pipetting 10 µl into each reaction well, which already contained the dried primers, the plates were sealed and vortexed for 1 min. After brief centrifugation, plates were incubated for 10 min at room temperature to allow primers to dissolve. Finally, plates were vortexed for 1 min and briefly centrifuged. Amplification and quantification were performed using a LightCycler® 480 instrument (Roche, Basel, Switzerland). The cycling program included a 2-min heat activation step at 95 °C, followed by 45 cycles of denaturation at 95 °C for 10 s and annealing at 56 °C for 60 s. To ensure specificity of amplifications, melting points of the amplified products were analyzed (Supplementary Tables 2 and 3).

Processing of raw data
Before applying statistical analysis, raw data were processed and Ct and fold-change values (fc) were calculated (Supplementary Figure 2). First, raw data from the qPCR were converted using the LC480Conversion software to make them readable for LinRegPCR (Version 11.0, [16]), which was used for calculating Ct values. Further processing and analysis of data was performed in R (Version 4.2.2) using RStudio (Version 2023.03.0+386) [17]. To account for variations between plates, the Ct values were normalized using an interplate calibration factor. The calibration factor for each plate was calculated by subtracting the mean Ct value of UniSpike 3 reactions from all plates from the mean Ct value of all UniSpike 3 reactions in a given plate. All Ct values of the respective plate were then calibrated by subtracting the respective calibration factor from all Ct values in that plate. After calibration, Ct values > 40 were considered unspecific and excluded from the dataset. If a miRNA was not specifically amplified in more than one individual, it was removed completely from the analysis, after confirming that missing values occurred equally over all groups (Supplementary Figure 3). This resulted in 58 remaining miRNAs for the plasma dataset and eleven miRNAs for the CSF dataset.
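The interplate calibration described above can be illustrated with a minimal sketch. The study performed this step in R; the Python below is only a stand-in, and the function name, the data layout, and the example Ct values are hypothetical. It assumes the overall UniSpike 3 mean is taken over all spike reactions pooled across plates.

```python
from statistics import mean

def calibrate_plates(plates, max_ct=40.0):
    """Apply an interplate calibration factor to every Ct value.

    plates maps a plate name to {"spike": [UniSpike 3 Ct values],
    "assays": {miRNA name: raw Ct}} (hypothetical layout).
    """
    # Mean Ct of the calibrator over all plates (pooled reactions).
    overall = mean(ct for p in plates.values() for ct in p["spike"])
    calibrated = {}
    for name, plate in plates.items():
        # Factor = plate mean of the calibrator minus the overall mean.
        factor = mean(plate["spike"]) - overall
        calibrated[name] = {}
        for mirna, ct in plate["assays"].items():
            value = ct - factor
            # Calibrated Ct values above the cut-off count as unspecific.
            calibrated[name][mirna] = value if value <= max_ct else None
    return calibrated

# Illustrative values only, not data from the study.
plates = {
    "plate_A": {"spike": [20.0, 20.2], "assays": {"miR-a": 30.0, "miR-b": 39.0}},
    "plate_B": {"spike": [21.0, 21.2], "assays": {"miR-a": 31.0, "miR-b": 41.0}},
}
calibrated = calibrate_plates(plates)
```

In this toy example the two plates differ by one cycle in their calibrator, so the same miRNA ends up at the same calibrated Ct on both plates, and the value that remains above 40 after calibration is marked as missing.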
Next, the fc and log2fc values were calculated using the $2^{-\Delta\Delta Ct}$ method (Supplementary Figure 4) [18]. In a first step, for each individual we independently calculated the mean Ct by adding all Ct values of every quantified miRNA and dividing the sum by the number of quantified miRNAs. We then computed ΔCt for each miRNA and individual by subtracting the mean Ct of the respective individual from the Ct of every miRNA quantified in that individual. This resulted in multiple ΔCt miR-n values per individual, where n represents the specific miRNA. The entire group of healthy controls was used as a reference group. Subsequently, for each miRNA we calculated the mean of ΔCt miR-n, only using ΔCt miR-n from the HCs (mean ΔCt miR-n-healthy). Next, ΔΔCt for each miRNA and individual was determined by subtracting the mean ΔCt miR-n-healthy from the individual's ΔCt miR-n. Finally, the fold change was obtained using the following formula: fold change = $2^{-\Delta\Delta Ct}$. To achieve a linear scale with symmetry around zero, log2fc was used for the heatmaps and t-tests. Data from the HCs were used only as a reference for calculating the fold change values. Since the focus of this study was on the identification of LRRK2-driven molecular patterns and the discrimination between fPD and sPD cases, data from the HCs were not analyzed further in the subsequent analyses.
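As an illustration of the fold-change calculation described above (normalization to each individual's mean Ct, followed by referencing to the healthy controls), the following sketch computes log2fc values. It is not the authors' R code; the function name, the dictionary layout, and the example values are assumptions made for this example.

```python
from statistics import mean

def log2_fold_changes(ct, controls):
    """Return log2 fold changes per individual and miRNA.

    ct maps individual -> {miRNA: calibrated Ct}; controls is the set of
    healthy-control identifiers used as the reference group.
    """
    # Step 1: delta Ct = Ct minus the individual's mean Ct over all miRNAs.
    delta = {}
    for person, values in ct.items():
        person_mean = mean(values.values())
        delta[person] = {m: v - person_mean for m, v in values.items()}
    # Step 2: mean delta Ct of each miRNA among the healthy controls.
    mirnas = {m for values in ct.values() for m in values}
    reference = {
        m: mean(delta[p][m] for p in controls if m in delta[p]) for m in mirnas
    }
    # Step 3: delta-delta Ct; log2fc is its negative, fc is 2 ** log2fc.
    return {
        person: {m: -(d - reference[m]) for m, d in values.items()}
        for person, values in delta.items()
    }

# Illustrative values only, not data from the study.
ct = {
    "HC1": {"miR-a": 30.0, "miR-b": 32.0},
    "HC2": {"miR-a": 31.0, "miR-b": 33.0},
    "P1": {"miR-a": 29.0, "miR-b": 33.0},
}
log2fc = log2_fold_changes(ct, controls={"HC1", "HC2"})
fold_change_a = 2 ** log2fc["P1"]["miR-a"]
```

Here miR-a in individual P1 sits one cycle lower than in the controls after normalization, which corresponds to a log2fc of 1 and a two-fold higher relative level.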

Data visualization
Heatmaps visualizing either the log2fc or calibrated Ct values were created using the pheatmap package in R [19]. Log2fc and Ct values were scaled row-wise to z-scores using the scale function while ignoring missing values. After confirming normality of the data by assessing QQ-plots, an unpaired t-test using the log2fc values was performed for each miRNA, comparing the sPD group with the LRRK2 MC. The p-value threshold to determine significance was corrected for multiple testing by dividing it by the number of tests (Bonferroni correction). The distribution of p-values was examined and visualized via histograms. For miRNAs that the t-test revealed to be significantly differentially expressed between the groups, ROC analysis was performed. PCA was performed on calibrated Ct values using the built-in prcomp function of R. As the function does not accept missing values, they were imputed by setting them to the mean Ct value from all individuals. Two-dimensional plots were generated by graphing PC1 and PC2 values.
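Two of the steps above are simple enough to show as a short sketch: dividing the significance threshold by the number of tests, and replacing missing Ct values by a mean before PCA. The helper names are hypothetical, and the sketch assumes that the imputation mean is taken per miRNA over the individuals with an observed value, which the text does not state explicitly.

```python
from statistics import mean

def bonferroni_threshold(alpha, n_tests):
    """Significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests

def impute_with_mean(values):
    """Replace missing Ct values (None) by the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# With 20 tests and alpha = 0.05 the threshold is 0.0025.
threshold = bonferroni_threshold(0.05, 20)

# Illustrative Ct values for one miRNA across three individuals.
imputed = impute_with_mean([30.0, None, 32.0])
```

With 20 tests this gives a threshold of 0.0025, the value applied to the 20 multiplied variables in the Results.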

Least absolute shrinkage and selection operator regression models
The caret package [20] was used for building the LASSO regression models. For missing values, the mean of the Ct values of all patients was used. We selected sensitivity and specificity in differentiating between the sPD and LRRK2 MC group, determined by Leave-One-Out-Cross-Validation (LOOCV) [21], as our experimental read-outs.
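The LOOCV read-out can be sketched as follows. This is not the LASSO model built with caret: to keep the example self-contained, a nearest-class-mean classifier on a single made-up feature is swapped in as a stand-in, and all function names and values are hypothetical. Only the leave-one-out loop and the sensitivity and specificity calculation mirror the procedure named in the text.

```python
from statistics import mean

def loocv_predictions(features, labels, fit, predict):
    """Predict each sample with a model trained on all other samples."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, features[i]))
    return predictions

def sensitivity_specificity(labels, predictions, positive):
    """Sensitivity and specificity with `positive` as the target class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Stand-in classifier (NOT LASSO): assign the class with the nearest mean.
def fit_nearest_mean(xs, ys):
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}

def predict_nearest_mean(model, x):
    return min(model, key=lambda c: abs(x - model[c]))

# Illustrative one-feature data, not values from the study.
features = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9, 1.1]
labels = ["MC", "MC", "MC", "sPD", "sPD", "sPD", "sPD"]
predicted = loocv_predictions(features, labels, fit_nearest_mean, predict_nearest_mean)
sens, spec = sensitivity_specificity(labels, predicted, positive="MC")
```

In this toy data set the last sPD sample lies among the mutation carriers and is misclassified when left out, giving a sensitivity of 3/3 and a specificity of 3/4.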

Random forest prediction model
For building a RF model that predicts group affiliation to either the sPD or the LRRK2 MC group, the randomForest package [22] was used. To obtain mean values of sensitivity and specificity along with 95% confidence intervals, 100 models were constructed with each model containing 200 trees. For the RF model we did not use LOOCV, as sensitivity and specificity of the classification is already tested on out-of-bag samples, reducing the risk of overfitting. Along with this, we generated proximity heatmaps describing similarity of patient pairs based on the number of trees that classify those two patients in the same terminal node. Missing values were computed using the rfImpute function. The Gini coefficient's mean decrease, which describes each decision node in terms of classification accuracy, was used to evaluate the importance of each variable. High purity of a node results in a low Gini coefficient. Consequently, variables that are more crucial for distinguishing between groups exhibit a greater mean decrease of the Gini coefficient compared to others.
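To make the importance measure more concrete, the sketch below computes the Gini impurity of a node (the quantity the text calls the Gini coefficient) and the decrease achieved by one split. It illustrates the principle only and does not reproduce the internals of the randomForest package, which accumulates such decreases over all splits and trees; the function names and labels are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 0 for a pure node, 0.5 for a 50/50 node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

# Illustrative node with four mutation carriers and four sPD patients.
parent = ["MC"] * 4 + ["sPD"] * 4
perfect_split = gini_decrease(parent, ["MC"] * 4, ["sPD"] * 4)
useless_split = gini_decrease(parent, ["MC", "MC", "sPD", "sPD"], ["MC", "MC", "sPD", "sPD"])
```

A split that separates the groups perfectly removes all impurity, while a split that leaves both children as mixed as the parent yields no decrease. A variable that often produces splits of the first kind therefore accumulates a high mean decrease.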

Integration of CSF and plasma data sets
To integrate the information from both the plasma and CSF datasets, combined variables were computed. First, Pearson's correlation analysis was performed between the Ct values of the eleven miRNAs reliably detectable in CSF (Ct CSF) and the Ct values of the corresponding miRNAs detected in plasma (Ct plasma). A two-tailed p-value cut-off was applied, and alpha was set to 0.05. All significant combinations were used to form new variables by multiplying the corresponding Ct values (Ct multiplied). This generated 20 new variables, which were subsequently used to create a heatmap, perform t-tests and PCA and generate LASSO and RF models.
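The construction of the combined variables can be sketched in a few lines: a Pearson correlation between plasma and CSF Ct values, followed by element-wise multiplication for pairs that are retained. The significance test for the correlation is omitted here for brevity, and the function names and example values are hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def multiplied_variable(ct_plasma, ct_csf):
    """Combine two Ct vectors into one variable by per-individual products."""
    return [p * c for p, c in zip(ct_plasma, ct_csf)]

# Illustrative Ct values for three individuals, not data from the study.
ct_plasma = [28.0, 29.0, 30.0]
ct_csf = [33.0, 35.0, 37.0]
r = pearson_r(ct_plasma, ct_csf)
ct_multiplied = multiplied_variable(ct_plasma, ct_csf)
```

Because each new variable is a product of two Ct values, its magnitude lies in the hundreds to around a thousand, which is the scale of the group means reported for Ct multiplied in the Results.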

Identification of discriminatory miRNAs
In a final step, we wanted to select those miRNAs that most accurately discriminated between groups. This feature selection was performed for each of the analyses using the following criteria: 1) p-values resulting from the unpaired t-test were sorted in ascending order and the top five miRNAs with the smallest p-value were selected; 2) the loading scores of PC1 were assessed and the top five miRNAs were selected; 3) the five most influential miRNAs from the RF model were selected based on the mean decrease of the Gini coefficient; 4) the LASSO regression model automatically selects discriminatory features through shrinkage and elimination of less relevant variables by introducing penalties; in this case, three miRNAs were selected. As the CSF data set alone had proven not to discriminate efficiently between groups, these selection steps were only applied to the Ct plasma and Ct multiplied data sets.
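The selection logic above, picking the top-ranked features per analysis and then looking for features named by more than one analysis, can be sketched as follows. The feature names and p-values are placeholders invented for this example, and the helper names are hypothetical.

```python
from collections import Counter

def top_n_smallest(p_values, n=5):
    """Names of the n features with the smallest p-values, in ascending order."""
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n]]

def shared_features(selections):
    """Features selected by more than one analysis, sorted by name."""
    counts = Counter(name for chosen in selections.values() for name in set(chosen))
    return sorted(name for name, count in counts.items() if count > 1)

# Placeholder p-values, not results from the study.
p_values = {
    "miR-A": 0.20, "miR-B": 0.01, "miR-C": 0.04, "miR-D": 0.50,
    "miR-E": 0.03, "miR-F": 0.30, "miR-G": 0.60,
}
selections = {
    "t_test": top_n_smallest(p_values),
    "lasso": ["miR-B", "miR-X", "miR-Y"],
    "random_forest": ["miR-X", "miR-C", "miR-Z", "miR-W", "miR-V"],
}
overlap = shared_features(selections)
```

The same ranking helper would work for the random forest importance scores or PC1 loadings after sorting in descending order of absolute value, which is left out here.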

Target prediction enrichment analysis
We used the miRDB database [23] to predict gene targets of selected miRNAs. Next, for each miRNA we selected the top 100 genes sorted by Target score and performed Gene Ontology (GO) enrichment analysis in R using the ontology class "biological processes", the database org.Hs.eg.db and the clusterProfiler package [24].

Detection of miRNA expression patterns in plasma and CSF
In plasma, a total of 58 miRNAs passed our selection criteria. Twenty-three miRNAs showed stable expression in 29/30 individuals while the remaining 35 miRNAs were detected in all 30 individuals. When visualizing log2fc values using a heatmap, the group of LRRK2 MC, which included both symptomatic and asymptomatic mutation carriers, showed similar expression patterns and could be distinguished from the sPD group (Fig. 2A). When using the calibrated Ct plasma data, the expression patterns became less apparent but were still noticeable (Supplementary Figure 5A). In contrast, in CSF only eleven miRNAs survived our selection process, with five miRNAs detected in all 31 individuals and six miRNAs in 30/31 individuals. Expression levels in CSF were generally lower compared to plasma. In CSF, no clear clustering was observable (Fig. 2B). Heatmaps based on calibrated Ct CSF also did not show any clear group-specific clusters (Supplementary Figure 5B).

Detecting patient clusters using plasma miRNA expression levels and PCA
PCA explained 82% of the overall variance in the plasma dataset. PC1 (77.4% of variance) explained the majority of the variance and appeared to discriminate groups, while PC2 only explained 4.6% of the overall variance (Fig. 3A). Interestingly, all LRRK2 MC clustered together while the various LRRK2 mutations could not be distinguished (Fig. 3A). PC1 discriminated LRRK2 MC from sPD with a sensitivity of 100% (9/9) and a specificity of 70% (7/10). PC1 and PC2 combined reached the same sensitivity with a specificity of 80% (8/10) (Supplementary Figure 7A). Next, we performed PCAs including the healthy controls (Fig. 4). While clustering was less clear when comparing HC to sPD, clear clustering was observable when comparing HC and LRRK2 MC (Fig. 4A, B). The clearest clustering was achieved when comparing LRRK2 MC with sPD (Fig. 4C). When all three groups were included in the PCA, clustering became less clear, yet LRRK2 MC remained notably separated from sPD and HCs (Fig. 4D). The PCA of the CSF data explained 84.2% of the overall variance, with neither PC1 (75.4% of variance) nor PC2 (8.8% of variance) able to differentiate between the groups (Fig. 3B).

miRNA expression levels in CSF correlate to plasma levels
A total of 20 miRNA combinations from Ct plasma and Ct CSF values showed significant correlation (Fig. 3C) with R-values ranging from 0.36 to 0.51 (Supplementary Table 5). Significantly correlating combinations were used to obtain new variables by multiplication (Ct multiplied).

Multiplied Ct values discriminate groups in t-test and PCA
When creating a heatmap based on the Ct multiplied data, no clear group-specific clusters were noticeable (Supplementary Figure 8A). In the t-test, ten miRNA combinations reached a p-value below 0.05 (Supplementary Table 6). The combination of miR-29c-3p from plasma and miR-27a-3p from CSF passed the adjusted p-value threshold of 0.0025 (LRRK2 MC: 930.8 (SD: ±61.6); sPD: 813.6 (SD: ±55.9), t(18) = 4.2, p = 0.0007). PCA for Ct multiplied discriminated the groups while explaining 90.2% of the total variance (PC1: 83.2%, PC2: 7%) (Supplementary Figure 8B). Groups were separated based on PC1 with a sensitivity of 100% (9/9) and a specificity of 80% (8/10). Combining PC1 and PC2 did not improve sensitivity or specificity (Supplementary Figure 7B).

Fig. 4. Plasma PCA plots comparing all groups. A) sPD and HC. B) LRRK2 MC and HC. C) LRRK2 MC and sPD. D) All groups. Arrows point towards the respective mean of PC1 and PC2. In A, B, and C, clear separation can be observed as indicated by arrows pointing towards opposing directions. When including all groups in D, clustering is less apparent but still

LASSO regression and random forest models predict group membership
LASSO regression and the RF model were performed using the Ct plasma and the Ct multiplied dataset (Fig. 5A). Sensitivity of the LASSO model was 88.9% (8/9) for both Ct plasma and Ct multiplied. Specificity was 70.0% (7/10) (Ct plasma) and 80.0% (8/10) (Ct multiplied), respectively (Fig. 5B). Predictions from the RF models using Ct plasma had a mean sensitivity of 84.2% (95% CI: 83.1%-85.3%) and a mean specificity of 70.1% (95% CI: 69.9%-70.2%). When using Ct multiplied, mean sensitivity was 73.6% (95% CI: 72.5%-74.6%) and mean specificity was 80% (95% CI: 80%-80%) (Fig. 5B). When comparing the two models using an unpaired t-test, sensitivity was significantly better when using Ct plasma (t(98) = -13.8, p < 0.001), while specificity was higher when using Ct multiplied (t(98) = 99, p < 0.001). Classifications performed by the RF model were further analyzed by looking into proximity scores of patient pairs. The resulting heatmap for the Ct plasma dataset indicated that patients within the same group exhibit greater proximity to each other than patients from different groups (Fig. 5C). This becomes even more apparent in the model using Ct multiplied (Fig. 5D). Next, mean decrease Gini scores were calculated for both of these RF models (Fig. 5E, F) to assess variable influence on classification accuracy. Interestingly, in both RF models, miR-223-3p had the highest score, indicating a great impact on the model performance. This miRNA had already been observed in the plasma t-test, where it indicated a potential for group discrimination with an uncorrected p-value of p = 0.051.

Fig. 5. Group prediction with LASSO and Random Forest. A) Overview of performed analyses and respective read-outs. B) Sensitivity and specificity values acquired by LASSO regression and Random Forest, using Ct plasma and Ct multiplied. C) Heatmaps display the proximity scores calculated in RF for all sPD patients (red) and LRRK2 MC (blue) using Ct plasma or D) Ct multiplied. Groups clearly separate along mutation status. E) Mean Decrease Gini scores were calculated after building the respective RF models for both the Ct plasma and F) the Ct multiplied data set. Values were sorted in descending order and the top ten miRNAs are displayed. High decrease of the Gini coefficient translates to a high impact on the performance of the respective RF model. G) From each analysis, most influential or discriminatory miRNAs were extracted. Chord diagram displays the overlap of miRNAs selected from the different analyses. Numbers in brackets display set size and grey connecting lines indicate miRNA overlaps. The table displays miRNAs that were identified as relevant from each model or test. miRNAs highlighted in bold were identified in more than one analysis. Of note, miR-223-3p (italic) was extracted from both the Ct plasma and the Ct multiplied RF models.

Selection of discriminatory miRNAs identified by multiple tests or models
Finally, for each test or model, we identified those miRNAs that showed the greatest potential for group separation or prediction (Table 2). From the RF models, PCA and the t-tests, we extracted five miRNAs each, while the LASSO model identified three miRNAs. Interestingly, of the selected miRNAs, three were identified by more than one method: 1) miR-29c-3p was identified in t-tests and LASSO, 2) miR-128-3p in t-test and RF, and 3) miR-424-5p in LASSO and RF (Fig. 5G, Supplementary Figure 9). GO analysis was performed on these three miRNAs and on miR-223-3p, and we identified associated biological functions and processes (number of significant annotations identified: miR-223-3p: 25, miR-29c-3p: 60, miR-128-3p: 0, miR-424-5p: 3, see Supplementary Figure 10). Interestingly, the annotations for miR-223-3p included neuron death, regulation of neuron death and neuron apoptotic process.

DISCUSSION
In this proof-of-concept study, we examined the extracellular miRNA signatures in plasma and CSF derived from LRRK2 MC and sPD patients for LRRK2-dependent patterns. We discovered that plasma miRNA expression levels can distinguish sPD from LRRK2 MC. MiRNAs have been extensively studied in PD, but to the best of our knowledge, this is the first study to use machine learning and a broad data set of miRNA expression levels to distinguish genetic from sporadic PD.
Our results show that PCA separated LRRK2 MC from most individuals with sPD without differentiating between the various LRRK2 mutations. Separation was also observable in PCA comparing LRRK2 MC and HC as well as in an analysis comparing all three groups. This indicates that the LRRK2 mutation status has a measurable effect on the extracellular miRNA signature in LRRK2 MC. Some sPD individuals clustered in proximity to the LRRK2 MC individuals. We therefore hypothesize that this method could be used to identify patients who would benefit from a LRRK2-targeted therapy. However, this question needs to be addressed using larger cohorts in the context of LRRK2-inhibitor trials. Regular assessment of the identified miRNA signatures could help monitor treatment efficacy in therapies targeting LRRK2, where a shift in miRNA expression levels could indicate target engagement.
miR-29c-3p has been reported to play a role in disease modulation by mediating neuroinflammation and has been found to be involved in apoptotic processes [25]. It has further been proposed as a PD-specific marker in several studies, but contradicting findings on expression levels compared to controls exist: studies report either downregulation [26,27] or upregulation [28] in serum of sPD patients compared to HCs. One study found upregulation in serum of sPD patients, but no dysregulation in LRRK2-associated PD [29].
miR-223-3p has been reported to be upregulated in midbrain dopamine neurons [30] and serum of PD patients [31]. It was shown to be involved in the modulation of inflammasome activity [32]. LRRK2 has long been suspected to play a role in inflammation and, e.g., was shown to affect microglial activation and pro-inflammatory cytokine production [33]. Further, the GO annotations identified for miR-223-3p included processes specific to neurodegeneration, underlining a possible influence in neurodegenerative disorders such as PD.
miR-128-3p has previously been identified as a potential treatment target due to its ability to protect neurons from apoptosis. The upregulation in the sPD group could therefore be reflective of a compensatory mechanism that might not be relevant in individuals carrying a LRRK2 mutation. miR-424-5p was shown to be increased in the forebrain of PD patients, while being associated with FOXO1 [34]. FOXO1 activity is induced by LRRK2 [35], which links altered miR-424-5p levels to LRRK2 activity.
The low RNA abundance in CSF compared to plasma made miRNA detection difficult, resulting in the exclusion of many miRNAs from CSF analyses. This reduced the complexity of miRNA signatures we could assess in CSF, which may explain the poor performance of CSF-based analyses. An alternative explanation could be that LRRK2 protein expression levels are known to be low in the CNS [4,5] despite their apparent relevance in the pathophysiology of PD. In contrast, peripheral organs such as the lung and the kidney display higher expression levels, which could also explain the increased accuracy of models based on Ct plasma miRNA profiles in identifying LRRK2 MC. We have further analyzed the relation of miRNA expression levels in CSF and plasma and found that for most miRNAs, these two biofluids seem to display very different signatures. However, through correlation analysis we have identified a subset of miRNAs whose expression levels in the CNS and the periphery seem related. We combined the correlated miRNA data sets by multiplication in order to test whether Ct multiplied would provide RF models with significantly higher sensitivity or specificity, which was not the case. While RF models based on Ct multiplied still performed well and slightly differently from RF models based on Ct plasma, there seems to be no clear benefit from adding CSF data to the model for the classification problem discussed here.
In summary, this proof-of-concept study showed promising results and deepens our understanding of LRRK2-associated PD, but it has limitations we want to address. First, the number of individuals included in the present study is relatively small. When utilizing group sorting algorithms for classification, larger sample sizes improve robustness and replicability. Ideally, the data set should be divided into a training and a testing data set to avoid overfitting. Additionally, a completely independent data set should then be used to replicate the predictions. While we did perform cross-validation using LOOCV or out-of-bag samples, this cannot fully replace validation in a testing or replication cohort. The results therefore have to be considered with caution until replicated in larger cohorts. Further, we selected miRNAs to be included in our qPCR panel based on the literature and therefore only considered miRNAs that have already been reported as dysregulated in PD or other neurodegenerative diseases. This may have introduced a bias and prevented the discovery of novel contributing miRNAs. Finally, we found no evident advantage to employing multi-layered or composite readouts over standard t-tests. While we still believe that in diseases as complex as PD basing classifications on multiple variables is an interesting approach, this concept still has to be improved and repeated in larger and more complex cohorts.

Conclusion
In conclusion, in this proof-of-concept study we showed that LRRK2 mutation status impacts the extracellular miRNA signature measured in plasma and shows promise to separate LRRK2 MC from sPD. Monitoring changes of the extracellular miRNA signatures upon, e.g., LRRK2 inhibition could be used to study drug efficacy or target engagement. We further hypothesize that multi-layered approaches could be applied to identify sporadic PD patients with a relevant role of LRRK2-related pathways, but this needs to be thoroughly assessed in larger cohorts.

Fig. 2. Visualizing miRNA signatures and group differences. A) Heatmap based on log2fc values from the plasma dataset, normalized per row. Rows represent different miRNAs while columns represent patients. Grey cells represent missing values. Column sorting function was inactivated to not interfere with group arrangement. The similarity indicated by the column dendrograms is not proportional between groups. B) Heatmap based on log2fc values from CSF dataset, normalized per row. This heatmap includes a smaller number of miRNAs (represented in rows) as they were excluded from the CSF dataset due to missing values. C) Histogram showing the distribution of uncorrected p-values obtained from the plasma and the D) CSF datasets after performing unpaired t-tests using log2fc values. E) Scatterplots display log2fc values from sPD patients and LRRK2 MC. Five miRNAs with the lowest p-value in the t-tests were selected. Thick line indicates mean while the box indicates the standard deviation.

Fig. 3. PCA graphs and correlation of plasma and CSF data. A) PCA graph from plasma and B) CSF Ct values. Ellipses indicate 95% confidence interval. Numbers indicate LRRK2 mutation (1: G2019S, 1*: G2019S + G1819, 2: R1441C, 3: I2020T). In the plasma PCA graph, group ellipses overlap while the blue LRRK2 MC ellipse trends to only include individuals of the LRRK2 MC group. Based on PCA, the LRRK2 MC group could be interpreted as a subpopulation of all PD patients. C) Correlation matrix presenting statistically significant (p < 0.05) correlations between Ct plasma and Ct CSF values of the eleven miRNAs included in the CSF dataset.

The researcher performing the experiments was blinded to the group annotation.

Table 2
miRNAs selected as most influential by the different models.