Parkinson’s disease is a complex and heterogeneous condition, and there are many gaps in the medical community’s scientific and practical understanding of the disease. Closing these gaps relies on objective data about symptoms and signs, collected over long durations. Smartphones contain sensor devices which can be used to remotely capture behavioral signals. From these signals, computational algorithms can distill metrics of symptom severity and progression. This brief review introduces the main concepts of the discipline, addressing the experimental, hardware and software logistics, and computational analysis. The article finishes with an exploration of future prospects for the technology.
Parkinson’s disease is a heterogeneous condition, with different individuals experiencing different combinations of symptoms and different rates of symptom progression. As of 2020, we do not have a clear understanding of how to detect the condition in its early stages, and we do not understand the ultimate causal factors which lead to someone developing the condition in the first place. Clinimetric tools for measuring symptoms and progression are therefore required to advance multiple applications, such as prodromal symptom detection, real-time symptom fluctuation monitoring, intra-day tracking of symptom treatment effectiveness, observational and longitudinal data collection, and assessment of treatment effectiveness in clinical trials.
A major difficulty for these applications is that traditional, subjective, in-clinic measurements of symptom severity (quantitative rating scales such as the MDS-UPDRS and PDQ-39) are of limited use for most practical clinical applications. Today’s smartphones, owned by a sizable majority of the population, come equipped with various commodity “sensor” devices for recording continuous physical measurements, including movement, sound, location, and touch (Fig. 1). This continuous stream of sensor data can be used to measure individual behaviors which are partially caused by the underlying disease process. For example, accelerometry recordings of leg movements may be used to infer changing patterns of stepping during walking related to bradykinesia, and patterns of tapping on the smartphone touch screen can be related to rigidity (see Table 2). Starting in 2013, a handful of novel academic studies making use of smartphones for Parkinson’s disease symptom measurement were conducted (see Table 1), which kick-started the research discipline across both academia and industry.
| Sensor | Physical measurement | Symptoms (body placement) |
| --- | --- | --- |
| Accelerometer | Sum of dynamic and gravitational acceleration | Gait impairment (hip); sit-stand transitions (hip); balance disturbance (hip) |
| Gyroscope | Rate of rotation (spin) | Gait impairment (hip); balance disturbance (hip) |
| Magnetometer | Geomagnetic field strength (direction) | Turning problems (hip) |
| Barometer | Ambient air pressure (altitude) | Stair climbing difficulties (hip) |
| Microphone | Ambient sound waves | Voice and speech impairment (hand) |
| Touch screen | Finger location on screen | Dexterity impairments (tapping) |
| Thermometer | Ambient air temperature | Heat/cold intolerance |
| GPS | Outside location (latitude, longitude) | Mobility disability |
| Start year | Operating system/hardware | Custom software application | Participating users (N) | Study duration | Study design | Recruitment |
| --- | --- | --- | --- | --- | --- | --- |
| 2013 | Android | HopkinsPD | 20 | 3 months | Remote observational | In-clinic |
| 2013 | Android | HopkinsPD | 522 | 5 years | Remote observational | In-clinic |
| 2014 | Apple iOS | Bradyapp | 26 | N/A | In-clinic observational | In-clinic |
| 2014 | Android | HopkinsPD | 457 | 6 months | Remote observational | Remote |
| 2014 | Android | Roche Proprietary | 79 | 24 weeks | RCT, non-primary endpoint | In-clinic |
| 2015 | Apple iOS | mPower | 898 | 6 months | Remote observational | Remote |
The driving interest behind this extremely active research field is the development of experimental methods, computer software, and mathematical and statistical algorithms to convert commodity smartphones into tools for making high-quality, rapid measurements of Parkinson’s disease symptom severity (see Table 2). The widespread availability of low-cost smartphones might allow measurements on large patient populations for the clinical and research applications mentioned above. This concise review lays out the state of the art in this field for a non-technical, clinical audience.
EXPERIMENTAL LOGISTICS OF SMARTPHONE-BASED MEASUREMENT
For sensor-based data recording, the body placement of the smartphone and its social and physical environment, in relation to the subject’s behavior, are critical factors. For example, if the goal is to measure gait impairment, then the smartphone must make good mechanical contact near the trunk or lower limbs. An ideal wear location in this case is a tight-fitting pocket on the thigh. Clearly, this is not feasible for many individuals. Similarly, to record speech in naturalistic settings it is necessary, in some legal jurisdictions, to gain consent from other conversational participants, and the acoustic environment should be largely free of extraneous background noise. These constraints may be difficult to satisfy in practice. As a result, smartphone-based symptom measurement requires specialized experimental design and planning.
Active (structured) testing
One practical approach to addressing experimental difficulties is to structure the measurement of symptoms using specialized app software, instructing participants to perform prescribed actions. This imposes specific behavioral controls that eliminate many confounding effects due to unknown participant actions within the environment. These software-guided actions are designed to elicit specific symptoms according to a certain testing protocol. For example, a gait impairment test would require participants to wear the smartphone in a particular location and walk in a straight line for a time, a task that might be difficult to perform with symptoms such as freezing of gait. Another test, for dexterity impairment, might require sequential tapping of on-screen buttons, where tremor and rigidity might affect tapping strength and speed.
Passive (unstructured) testing
While active testing reduces many behavioral and environmental confounds, it is not a naturalistic activity, so it introduces a burden on participants, however small. Typical active tests require participants to find suitable locations and times of day. This burden can cause participants to lose interest and willingness to contribute to studies, and so to be lost to attrition. An alternative is to record sensor data opportunistically, when the smartphone is being worn, so that no explicit interaction with the smartphone is required. The hope is that, if the smartphone is worn for a substantial portion of the day, symptoms such as gait impairment or tremor severity can be monitored without the need to interrupt the subject at all. However, because this approach lacks behavioral control, it is prone to substantial contextual confounding.
FROM SENSOR DATA TO SYMPTOM MEASUREMENT
Sensor data captured on the smartphone is not readily interpretable in clinical terms, nor can it be directly explained in terms of the relationship to the underlying disease process. Further processing is necessary. Typically, this processing requires mathematical and statistical algorithms which are too complex to be carried out on the smartphone itself. Thus, the sensor data is encrypted and transmitted to a remote server where the processing takes place. This processing “pipeline” has evolved over the years as the field has matured.
The first step is quality control (QC). In the case of active testing, participants do not always adhere to the test protocol, which invalidates the test sensor data. Automated QC algorithms “clean” the sensor data to improve the reliability of the recording [3, 4]. In the case of passive testing, the sensor data is segmented into contiguous intervals representing the behavior of interest. For instance, gait impairment testing requires isolating sufficiently long intervals of intentional walking.
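To make the segmentation idea concrete, the following is a minimal sketch in Python. It assumes that some upstream detector has already produced a per-second flag saying whether walking was detected; that detector, the function name `contiguous_intervals` and the minimum bout length are all hypothetical choices for illustration, not the method used in the cited studies.

```python
from typing import List, Tuple


def contiguous_intervals(flags: List[bool], min_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of
    True values that is at least min_length samples long."""
    intervals = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a run of the behavior of interest begins
        elif not flag and start is not None:
            if i - start >= min_length:
                intervals.append((start, i))
            start = None  # the run has ended
    # Close a run that continues to the end of the recording.
    if start is not None and len(flags) - start >= min_length:
        intervals.append((start, len(flags)))
    return intervals


# Hypothetical per-second "walking detected" flags.
walking = [False, True, True, False, True, True, True, True, False, True]
bouts = contiguous_intervals(walking, min_length=3)
print(bouts)
```

Only the four-second run is kept here; the shorter runs are discarded as too brief to analyze. In practice the minimum length would be chosen to suit the gait metric being computed.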
The next step is preprocessing and feature extraction. The cleaned sensor data is standardized (for example, accelerometry data is transformed into a body-centric coordinate frame) and then summarized down to a few metrics. For example, with voice recordings, typical preprocessing standardizes the amplitude to compensate for distance of the patient’s mouth to the microphone, followed by extracting measures of loudness across a range of frequencies.
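As a rough illustration of the voice example, the sketch below peak-normalizes a short audio frame and then sums spectral energy within frequency bands. Band energy is used here only as a crude stand-in for loudness, and the function names, the band edges and the naive discrete Fourier transform are assumptions made for the sake of a self-contained example; production pipelines would use a fast Fourier transform and perceptually motivated features.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def peak_normalize(signal: Sequence[float]) -> List[float]:
    """Scale the signal so that its largest absolute sample equals 1."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        return list(signal)  # silent frame: nothing to scale
    return [x / peak for x in signal]


def band_energies(signal: Sequence[float], sample_rate: float,
                  bands: Sequence[Tuple[float, float]]) -> List[float]:
    """Sum of squared DFT magnitudes in each (low_hz, high_hz) band.

    Uses a naive O(n^2) DFT, so it is only suitable for short frames."""
    n = len(signal)
    energies = [0.0] * len(bands)
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2
        for b, (low, high) in enumerate(bands):
            if low <= freq < high:
                energies[b] += power
    return energies


# A quiet synthetic 100 Hz tone, standing in for a recorded voice frame.
sample_rate = 800
frame = [0.2 * math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(200)]
normalized = peak_normalize(frame)
low_band, high_band = band_energies(normalized, sample_rate, [(0, 200), (200, 400)])
```

After normalization the features no longer depend on how quietly the tone was recorded, which is the purpose of the amplitude standardization described above.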
The final step involves clinical prediction. Here, the extracted features are analyzed together to produce a clinically-interpretable metric of some aspect of the disease process. For example, one clinical application is tracking intraday symptom response to dopaminergic drug treatment, in which case the prediction is the ON/OFF status of the participant on, say, 4–8 hourly intervals. Another example is detecting the overnight presence or absence of a prodromal symptom such as sleep disturbance. An even more “continuous” metric might be medication-induced gait performance fluctuations, measured on a minute-by-minute basis.
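One simple way to picture the prediction step is a logistic model that combines several features into a single ON/OFF call for each interval. The sketch below is purely illustrative: the feature names, weights and bias are hypothetical and have not been fitted to any data, whereas a real model would be trained on labelled recordings and validated clinically.

```python
import math
from typing import Dict

# Hypothetical weights, for illustration only.
WEIGHTS = {"tap_rate_hz": 1.2, "step_regularity": 2.0, "tremor_power": -1.5}
BIAS = -3.0


def on_probability(features: Dict[str, float]) -> float:
    """Logistic combination of the features: estimated probability of ON."""
    score = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def on_off_label(features: Dict[str, float], threshold: float = 0.5) -> str:
    """Turn the probability into a discrete ON/OFF status."""
    return "ON" if on_probability(features) >= threshold else "OFF"


# Hypothetical features summarizing two different hours of the day.
morning = {"tap_rate_hz": 3.0, "step_regularity": 0.8, "tremor_power": 0.2}
evening = {"tap_rate_hz": 1.0, "step_regularity": 0.3, "tremor_power": 1.0}
labels = [on_off_label(morning), on_off_label(evening)]
```

Reporting the probability alongside the label lets the clinician see how confident the algorithm is, rather than only the final call.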
Validating predictive algorithms such as those above often requires some kind of labelling of the data. Examples of such labelling include expert decisions (a diagnostic or periodic symptom assessment applying some clinical criteria to observations of the patient’s patterns of movement or behavior) and non-expert annotations, such as assessment of other behavior (walking indoors versus outdoors) in order to establish the context of behaviors as discussed above. Labelling requires care. There are the usual ‘clinimetric’ issues of sufficient training, inter-rater variability, test-retest reliability and others. In addition, there are experimental considerations. For example, to validate predictions of medication effectiveness, it is necessary to collect information on treatment schedule adherence on an intra-day basis. Participants in clinical trials may find it a substantial effort to record this information over the long term and may therefore be somewhat unreliable or patchy witnesses. This is one reason why any smartphone-based remote symptom monitoring technology may be limited by the ability to label the data for validation purposes.
Given the problems with contextual confounding of smartphone usage, there is a substantial need to disambiguate sensor data. Current analytical approaches to smartphone sensor data processing make little use of simultaneous measurements across multiple sensors. As an example, GPS speed might be a useful differentiator for separating bicycling from walking when measuring gait impairment. Novel algorithmic processing pipelines will need to be developed.
Similarly, as smartphone hardware becomes more capable, new, non-classical metrics of Parkinson’s disease may become accessible. In particular, future smartphones may come equipped with photoplethysmographic sensors to measure peripheral blood flow and oxygenation, which may allow measurement of pulse to quantify autonomic nervous system dysfunction. Coupled with standard tests such as sit-stand transitions, it may be possible to isolate conditions such as orthostatic hypotension, which is common in Parkinson’s disease.
One of the major difficulties with current smartphone-based analysis algorithms is that they are all associational. That is, they make predictions by associating the sensor data with clinical outcomes or other meaningful symptom labels. We would, in particular, like to ensure that these associations are not conflated with causation, to avoid making predictions which merely reflect spurious correlations. One approach to avoiding this pitfall is to use analytical methods from causal inference. These methods, which seek to identify cause-effect relationships free from confounding, are widely used in disciplines such as epidemiology, but have only more recently started to be incorporated into computational analysis.
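The GPS example can be sketched as follows: estimate average speed from consecutive location fixes and flag intervals that are too fast to be walking. The 2.5 m/s cut-off and the function names are hypothetical, chosen only for illustration; a practical pipeline would combine this with accelerometry and would need to handle GPS noise.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_speed_m_per_s(fixes: List[Tuple[float, float, float]]) -> float:
    """Average speed over time-ordered fixes of (time_s, lat_deg, lon_deg)."""
    elapsed = fixes[-1][0] - fixes[0][0]
    if elapsed <= 0:
        raise ValueError("fixes must span a positive duration")
    distance = sum(haversine_m(a[1], a[2], b[1], b[2])
                   for a, b in zip(fixes, fixes[1:]))
    return distance / elapsed


WALKING_SPEED_LIMIT = 2.5  # m/s; a hypothetical cut-off, not a validated one


def plausible_walking(fixes: List[Tuple[float, float, float]],
                      limit: float = WALKING_SPEED_LIMIT) -> bool:
    """True if the average GPS speed is slow enough to be walking."""
    return mean_speed_m_per_s(fixes) <= limit


# Two hypothetical tracks covering the same ~111 m at different speeds.
slow_track = [(0.0, 0.0, 0.0), (100.0, 0.001, 0.0)]
fast_track = [(0.0, 0.0, 0.0), (20.0, 0.001, 0.0)]
```

Gait features computed from accelerometry during an interval like `fast_track` could then be excluded, since the movement is more likely to be cycling or vehicle travel.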
While an individual’s smartphone sensor data is fairly obscure and thus difficult to identify, there are some streams of data which are highly sensitive, such as GPS location, and thus require encryption and special security measures. If these kinds of sensitive data are to be used en masse, it will likely be necessary to restrict computational processing to the smartphone. Alternatively, new advances in computational security such as differential privacy may be required.
Finally, the current evidence (as of 2020) for the practical effectiveness of smartphone-based monitoring does not reach the usual standard for regulatory approval, such as that of the randomized controlled trial. All existing studies of smartphone-based testing use retrospective observational, not prospective interventional, data. Testing whether smartphone-based symptom monitoring is effective and leads to improved outcomes for patients could, for example, be carried out by performing a novel diagnostic trial.
Over the next decade, smartphones are predicted to become ever more widely available. With expected increases in computation and storage capacity, the smartphone could also process the sensor data itself, without the delay of moving it over the internet. We can also expect that the skills and knowledge described in this review will become more widespread in the clinical research communities. As these technologies mature and become more reliable, as understanding of the function and limitations of these analytical tools improves, and as they enter diagnostic clinical trials for diverse populations, we will likely see the emergence of fully-approved smartphone software tools for remote symptom monitoring. These tools will likely become a common “building-block” for special applications adapted to particular users’ circumstances.
To illustrate a future application scenario: for a user engaged in testing out a change in treatment regime, a symptom monitoring algorithm may run in the background on their smartphone, collecting sensor data and processing it to provide real-time feedback on symptom changes. An annotated, visual representation of symptom changes would be displayed on a daily chart, which would be automatically shared over the internet with the responsible clinical staff. In conjunction with the clinician, referring to the chart as evidence of effectiveness, the change in treatment would either be acted upon or rejected. This kind of remote, objective evidence may bring substantial, near-term improvement to the medical science and practice of Parkinson’s disease.
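For readers unfamiliar with differential privacy, the following minimal sketch shows the Laplace mechanism applied to a mean, one of the basic constructions in that literature. The scenario (daily distance travelled, derived from GPS), the clipping bounds and the function names are hypothetical, and the number of participants is assumed to be public.

```python
import random
from typing import Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample from Laplace(0, scale): the difference of two independent
    exponential variables with mean `scale` is Laplace distributed."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values: Sequence[float], lower: float, upper: float,
                 epsilon: float, rng: random.Random) -> float:
    """Epsilon-differentially private mean via the Laplace mechanism.

    Values are clipped to [lower, upper] so that changing one participant's
    value can shift the mean by at most (upper - lower) / n."""
    if not values or epsilon <= 0 or upper <= lower:
        raise ValueError("need data, epsilon > 0 and upper > lower")
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical daily distances travelled (meters), one per participant.
daily_distance_m = [1200.0, 3400.0, 800.0, 2600.0, 1900.0]
rng = random.Random(0)  # seeded only to make the sketch repeatable
noisy_mean = private_mean(daily_distance_m, 0.0, 5000.0, epsilon=1.0, rng=rng)
```

Smaller values of `epsilon` give stronger privacy but noisier results, so with few participants the released mean may be of little use; this trade-off is one reason such methods are better suited to large-scale data.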
Research reported in this publication was partly supported by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under Award Number P50NS108676. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.
CONFLICT OF INTEREST
The author reports that there are no conflicts of interest.
[1] Jankovic J, King Tan E (2020) Parkinson’s disease: Etiopathogenesis and treatment. J Neurol Neurosurg Psychiatry 91, 795–808.
[2] Eysenbach G (2005) The law of attrition. J Med Internet Res 7, e11.
[3] Badawy R, Raykov YP, Evers LJW, Bloem BR, Faber MJ, Zhan A, Claes K, Little MA (2018) Automated quality control for sensor based symptom measurement performed outside the lab. Sensors 18, 1215.
[4] Poorjam AH, Raykov YP, Badawy R, Jensen JR, Christensen MG, Little MA (2019) Quality control of voice recordings in remote Parkinson’s disease monitoring using the infinite hidden Markov model. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 805–809.
[5] Raykov YP, Evers LJW, Badawy R, Bloem BR, Heskes TM, Meinders M, Claes K, Little MA (2020) Probabilistic modelling of gait for robust passive monitoring in daily life. IEEE J Biomed Health Inform, doi: 10.1109/JBHI.2020.3037857.
[6] Arora S, Venkataraman V, Zhan A, Donohue S, Biglan K, Dorsey ER, Little MA (2015) Detecting and monitoring the symptoms of Parkinson’s disease using smartphones: A pilot study. Parkinsonism Relat Disord 21, 650–653.
[7] Zhan A, Mohan S, Tarolli C, Schneider RB, Adams JL, Sharma S, Elson MJ, Spear KL, Glidden AM, Little MA, Terzis A, Dorsey ER, Saria S (2018) Using smartphones and machine learning to quantify Parkinson disease severity: The mobile Parkinson disease score. JAMA Neurol 75, 876–880.
[8] Arora S, Baig F, Lo C, Barber TR, Lawton MA, Zhan A, Rolinski M, Ruffmann C, Klein JC, Rumbold J, Louvel A, Zaiwalla Z, Lennox G, Quinnell T, Dennis G, Wade-Martins R, Ben-Shlomo Y, Little MA, Hu MT (2018) Smartphone motor testing to distinguish idiopathic REM sleep behavior disorder, controls, and PD. Neurology 91, e1528–e1538.
[9] Pearl J (2000) Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge, UK.
[10] Little MA, Badawy R (2019) Causal bootstrapping. arXiv:1910.09648.
[11] Dwork C (2008) Differential privacy: A survey of results. In Theory and Applications of Models of Computation, TAMC 2008, Agarwal M, Du D, Duan Z, Li A, eds. Springer, Berlin, Heidelberg.
[12] Rodger M, Ramsay T, Fergusson D (2012) Diagnostic randomized controlled trials: The final frontier. Trials 13, 137.
[13] Printy BP, Renken LM, Herrmann JP, Lee I, Johnson B, Knight E, Varga G, Whitmer D (2014) Smartphone application for classification of motor impairment severity in Parkinson’s disease. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 2686–2689.
[14] Lipsmeier F, Fernandez-Garcia I, Wolf D, Kilchenmann T, Scotland A, Schjodt-Eriksen J, Cheng WY, Siebourg-Polster J, Jin L, Soto J, Verselis L, Facklam MM, Boess F, Koller M, Grundman M, Little MA, Monsch A, Postuma R, Gosh A, Kremer T, Taylor K, Czech C, Gossens C, Lindemann M (2017) Successful passive monitoring of early-stage Parkinson’s disease patient mobility in a Phase I RG7935/PRX002 clinical trial with smartphone sensors. Mov Disord 32, S358–S359.
[15] Bot BM, Suver C, Neto EC, Kellen M, Klein A, Bare C, Doerr M, Pratap A, Wilbanks J, Dorsey ER, Friend SH, Trister AD (2016) The mPower study, Parkinson disease mobile data collected using ResearchKit. Sci Data 3, 160011.