
A novel method for automatic classification of Parkinson gait severity using front-view video analysis



Gait impairment is a cardinal symptom of Parkinson’s disease (PD).


This paper introduces a novel computer-vision framework for automatic classification of the severity of gait impairment using front-view motion analysis.


Four hundred and fifty-six videos were recorded from 19 PD patients using an RGB camera during clinical gait assessment. Gait performance in each video was rated by a neurologist using the unified Parkinson’s disease rating scale for gait examination (UPDRS-gait). The proposed algorithm detects and tracks the silhouette of the test subject in the video to generate a height signal. Gait features were extracted from the height signal. Feature analysis was performed using the Kruskal-Wallis rank test. A support vector machine was trained using the features to classify the severity levels according to UPDRS-gait in 10-fold cross-validation.


Features significantly (p < 0.05) differentiated between the median-ranks of UPDRS-gait levels. The SVM classified the levels with a promising area under the ROC curve of 80.88%.


Findings support the feasibility of this model for Parkinson’s gait assessment in the home environment.


Parkinson’s disease (PD) deteriorates motor function, and gait symptoms develop over time. These symptoms include short shuffling steps, postural instability, slow walking, etc. [1]. Parkinsonian gait is clinically examined using the unified Parkinson’s disease rating scale part-III item-29 (UPDRS-gait) [2]. The examination requires a patient to walk back and forth on a 10-meter gait platform. A doctor rates the walk on a scale of ‘0’ to ‘4’ using UPDRS-gait: ‘0’ represents a healthy walk; ‘1’ represents a slow walk with shuffling steps; ‘2’ represents a walk with short-shuffling steps and festination; ‘3’ represents a severe gait disturbance that requires assistance for walking; ‘4’ indicates total inability to walk even with assistance.

Clinical examination of PD has several limitations: it consumes extensive time and resources of healthcare systems [3], it requires patients to be physically able to visit clinics for regular assessment, and a doctor’s subjective evaluation of symptoms is prone to human error. A solution is to employ vision-based telemonitoring tools that enable continuous monitoring of patients in their home environment.

Figure 1.

At-home gait assessment based on front-view video analysis.


State-of-the-art vision-based methods of gait analysis use Kinect sensors [4]. However, Kinect sensors are not as commonplace as the RGB cameras in smartphones, laptops, and tablets. Importantly, these devices allow gait recordings to be transmitted to a server where the videos can be processed and the results presented to a caregiver. The prescription can then be transmitted back to the patient’s device (Fig. 1a). This feedback loop of computerized gait assessment improves interactivity between patients and caregivers, allowing timely treatment at home that is not possible through conventional, in-person assessment of Parkinson’s disease.

A recent study [5] used an RGB camera to record the side view of test subjects, obtaining visual separation of the legs for estimating step length. The study reported a strong correlation between falls and step length. Similarly, another study [6] used an RGB camera and a side view to estimate the silhouette of a walking person, suggesting that gait analysis can be performed without a lab or the physical attachment of sensors or markers to patients.

However, a disadvantage of side-view assessment is that a large room is needed for recording (Fig. 1b). Alternatively, recording from the front confines the back-and-forth movement of the subject within the camera field of view, which allows recording gait in compact spaces such as corridors. This is important because studies suggest that freezing and falling are less likely to occur during corridor walks, since corridors provide visual cues that assist patients in planning their movement [7].

We propose a machine learning model for estimating Parkinsonian gait symptoms using front-view video analysis. The method follows the UPDRS protocol and requires only a compact space. The algorithm uses the varying height of the subject across a sequence of video frames to extract features representing gait symptoms. A support vector machine (SVM) was trained using these features to score the severity of gait impairment based on the UPDRS-gait.


2.1 Data acquisition

Data were acquired between 2002 and 2003 at five clinics in Sweden in a study entitled ‘Duodopa Infusion: Randomized Efficacy and Quality-of-life Trial’ [8]. In the study, gait examinations of 24 patients (19 males and 5 females) were videotaped. The patients were aged between 50 and 75 and had a mean total-UPDRS score of 50.45 on a scale between ‘0’ (healthy) and ‘108’ (total disability).

Gait examination was conducted in a 10-meter-long corridor. Patients were seated at one end of the corridor, and a camera was positioned at the other end. Patients were asked to rise from the chair, walk straight toward the camera, turn, and walk back to the chair. The gait was recorded, and the video was transmitted to a server accessed by a neurologist, who watched the video and rated the walking performance based on the UPDRS-gait.

Each patient was examined and videotaped 17 times throughout the day, with a rest of half an hour before each examination. Videos of patients with a total inability to walk (rated ‘4’) and of those who required assistance (rated ‘3’) were not used for the analysis because of the interference of nursing assistants in the videos. Also, some patients dropped out of the study. The videos were recorded at 25 frames per second and a resolution of 352 × 288 pixels. Written informed consent was obtained from all patients.

Since multiple videos were recorded of each individual, 456 videos of reasonable quality (no blur, shadows, highlights, or occlusion) were randomly selected from the database to avoid subjective bias in model development and to balance the sample distribution, such that classes ‘0’, ‘1’, and ‘2’ consisted of 152 samples each. These videos were used for method validation and analysis.

2.2 Method description

The block diagram of the algorithm is shown (Fig. 2). In the first step, the test subject was identified in the video using a human detector based on the histogram of oriented gradients (HOG) [9]. HOG returns a bounding-box that confines the height and width of the subject in a video frame. In the second step, a height signal was produced from the varying height of the bounding-box across a sequence of video frames. The signal was height-adjusted and normalized. Features were extracted from the height signal for training an SVM to score UPDRS-gait. The steps are described below.

Figure 2.

Block diagram of the gait algorithm.


Figure 3.

Human detection in an image sequence. Shi in pixels is shown in the top-left corner of the bounding box that tends to increase with each forward step towards the camera.


2.2.1 Human detection

Human detection using HOG [9] is based on the idea that the appearance of a local object in an image can be characterized by the distribution of gradients of pixel intensities. A significant intensity difference across pixels indicates an edge. The algorithm divides an image into connected regions called cells and computes a local 1-D histogram of gradient orientations over the pixels in each cell. The histogram is contrast-normalized using the Gaussian-weighted statistics of pixel intensities across larger regions of the image, referred to as blocks (Eq. (1)).


x' = (x − μ) / σ    (1)

Where x is a pixel intensity, μ is the mean pixel intensity in a block, and σ is the standard deviation of pixel intensities in that block. The normalized histograms of the cells of a block are termed HOG descriptors and are collectively used as features for training an SVM to detect human presence in an image. The method was previously tested on the MIT pedestrian database [10], consisting of 509 training and 200 test images of walking pedestrians, as well as the INRIA database [9], consisting of 1805 test images of human poses; it detected humans in both databases with a zero miss rate. In our study, the cell size was 6 × 6 pixels, and the block size was 3 × 3 cells. The method detected the walking subjects in our video recordings with 100% accuracy.
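The cell-histogram step can be sketched as follows. This is a simplified, pure-Python illustration of one orientation histogram for a single cell, not the authors' implementation; the bin count and the central-difference gradient are illustrative choices.

```python
import math

def cell_histogram(cell, bins=9):
    """Accumulate a 1-D histogram of gradient orientations for one cell.

    `cell` is a 2-D list of pixel intensities. Gradients are taken with
    simple central differences; each interior pixel votes for an unsigned
    orientation bin (0-180 degrees) weighted by its gradient magnitude.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(ang / (180.0 / bins)), bins - 1)
            hist[b] += mag
    return hist

# A vertical edge: intensity changes only along x, so all gradient votes
# land in the 0-degree bin.
cell = [[0, 0, 10, 10]] * 4
print(cell_histogram(cell))
```

In the full HOG pipeline such histograms are computed for every cell, block-normalized as in Eq. (1), and concatenated into the descriptor vector.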

2.2.2 Height signal

The HOG algorithm returns a bounding-box with the height Sh and width Sw of the human silhouette in an image. Sh increases when the subject walks closer to the camera and decreases when the subject walks away (Fig. 3). Sh remains constant when the left and right legs are adjacently positioned during the mid-swing and mid-stance phases, and increases when the legs are positioned apart during terminal-swing and terminal-stance (Fig. 5b). Sampling Sh over the video-frame sequence i = 1 to n total frames generates a height signal Shi.

2.2.3 Signal pre-processing

For accurate estimation of gait symptoms, the method must be robust to the varying heights of people, since gait attributes are affected by height; for instance, a tall person’s stride is generally longer than a short person’s. To account for height variation, Shi was scaled using a human body model [11], in which face height fh is proportional to total height Sh. A face detector [12] was used to compute fh, and Shi was height-adjusted by dividing it by fh in each video frame. Also, recordings made with the camera placed closer to the gait platform produce a higher Sh than recordings made with the camera placed farther away. To accommodate varying camera positions, Shi was normalized between 0 and 1 using Eq. (2).

Shi = (Shi − min(Shi)) / (max(Shi) − min(Shi)), for i = 1, …, n    (2)
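A minimal sketch of the height adjustment and the min-max normalization of Eq. (2), assuming per-frame bounding-box heights and face heights are already available (the function name and inputs are illustrative):

```python
def normalize_height_signal(sh, fh):
    """Height-adjust and min-max normalize a per-frame height signal.

    `sh` holds the bounding-box height (pixels) per frame and `fh` the
    face height (pixels) per frame; dividing by face height compensates
    for the subject's stature, and min-max scaling compensates for the
    camera distance.
    """
    adjusted = [s / f for s, f in zip(sh, fh)]
    lo, hi = min(adjusted), max(adjusted)
    return [(a - lo) / (hi - lo) for a in adjusted]

signal = normalize_height_signal([100, 120, 140, 160], [20, 20, 20, 20])
print(signal)  # [0.0, 0.333…, 0.667…, 1.0]
```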

The normalized signal Shi of representative videos rated ‘0’ (healthy), ‘1’ (mildly-impaired), and ‘2’ (moderately-impaired) is shown (Fig. 4a). The completion time of gait, i.e., the time taken to walk forward from the initial position, turn, and walk back to the initial position, was lowest for healthy gait, higher for mildly-impaired gait, and highest for moderately-impaired gait. Short-shuffling steps were noticeable in the impaired gait signals, i.e., the signals showed smaller amplitude changes compared with healthy gait. Turning time was lowest in healthy gait. Importantly, the healthy gait signal showed quick and smooth progress compared with the impaired gait signals.

Figure 4.

The height signal Shi is shown for representative samples rated ‘0’ (healthy), ‘1’ (mildly-impaired), and ‘2’ (moderately-impaired). 4a shows normalized Shi of a complete walk, i.e., walking forward from the initial position, turning and walking back to the initial position. 4b shows quantized Shi of a forward walk.


To remove signal aberrations, Shi was smoothed using a moving-average filter and quantized using the Lloyd algorithm [13]. The algorithm approximates a continuous set of values within a signal partition and maps them to one discrete weighted-average centroid of the points in that partition. A partition size of five points was selected for quantization. Gait events were approximated using the increasing values of the quantized signal Shi that represent a forward walk. To do this, Shi was split between the forward and backward walks at the maximum height value, which represents the position where the subject is closest to the camera. The forward-walk Shi of the representative videos is shown (Fig. 4b).
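The smoothing and splitting steps can be sketched as below. For brevity, this sketch uses a plain centered moving average and splits at the signal maximum; the Lloyd quantization step is omitted.

```python
def moving_average(x, w=5):
    """Smooth the signal with a centered moving-average window of width w."""
    half = w // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def split_walk(x):
    """Split the signal at its maximum value (the frame where the subject
    is closest to the camera) into forward and backward walk segments."""
    peak = x.index(max(x))
    return x[:peak + 1], x[peak:]

smooth = moving_average([0.0, 0.2, 0.1, 0.6, 1.0, 0.7, 0.3, 0.0], w=3)
forward, backward = split_walk(smooth)
```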

2.2.4 Feature extraction

A stride has two phases: swing and stance. As discussed above, Sh remains constant during mid-swing and mid-stance and increases during terminal-swing and terminal-stance. Strides were approximated using Shi to compute features representing the level-1 symptoms of slow walking and short-shuffling steps, and the level-2 symptom of gait festination. First, the stance time (ST) was computed using Eq. (3).


ST(i) = Ts(i+1) − Ts(i)    (3)

Where Ts(i) is the timestamp in Shi that represents the initial contact of the front foot with the ground, and Ts(i+1) is the point of amplitude increase in Shi that represents the terminal stance (Fig. 5b). Next, the swing time (SW) was computed using Eq. (4).


SW(i) = Ts(i+2) − Ts(i+1)    (4)

Where Ts(i+1) is the timestamp in Shi representing the initial swing, and Ts(i+2) is the point of amplitude increase in Shi that represents the terminal swing (Fig. 5b). Finally, the stride time was computed using Eq. (5).

ST(j) = ST(i) + SW(i), for i = 1, 3, 5, …, n − 2 and j = 1, …, N total strides    (5)
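The stride timing above reduces to simple differences over gait-event timestamps. A sketch with hypothetical timestamps follows; the paper overloads ST for both stance and stride time, so distinct names are used here.

```python
def stride_times(ts):
    """Stance, swing, and stride durations from gait-event timestamps.

    `ts` lists the times (seconds) at which the quantized height signal
    steps up; consecutive intervals alternate between stance and swing
    phases, and one stride spans a stance-swing pair.
    """
    stance = [ts[i + 1] - ts[i] for i in range(0, len(ts) - 2, 2)]
    swing = [ts[i + 2] - ts[i + 1] for i in range(0, len(ts) - 2, 2)]
    stride = [st + sw for st, sw in zip(stance, swing)]
    s_avg = sum(stride) / len(stride)  # average stride time, feature Savg
    return stance, swing, stride, s_avg

# Hypothetical event timestamps (seconds) covering two strides.
stance, swing, stride, s_avg = stride_times([0.0, 0.6, 1.1, 1.7, 2.2])
```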

Figure 5.

Representation of gait events in the quantized height signal.


To estimate short steps indicating level-1 impairment, the average stride time Savg was computed using Eq. (6); a low Savg indicates an overall short step length.

Savg = (1/N) Σ_{j=1}^{N} ST(j)    (6)


Detrended fluctuation analysis (DFA) and entropy E were used to estimate step shuffling. DFA determines signal self-affinity using long-range correlations. This was done by integrating signal Shi using Eq. (7).

y(k) = Σ_{i=1}^{k} [Shi − Savg], for k = 1, …, N    (7)

Where y(k) is the integrated signal. y(k) was divided into boxes of equal length l; we kept l = 5. For each box, a least-squares line was fitted and the y-coordinates of the fitted line yn(k) were computed. The fluctuation F(l) was measured over the total of L points using Eq. (8).

F(l) = √( (1/L) Σ_{k=1}^{L} [y(k) − yn(k)]² ), for box sizes l, 2l, 3l, …    (8)

The self-similarity exponent α was computed as the slope of the log-log plot of F(l) against l. α equals 1 if the boxes are similar, and is less than or greater than 1 otherwise [14].
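The DFA computation above can be sketched in pure Python as follows; the box sizes here are illustrative (the paper fixes l = 5), and the white-noise usage example is only a sanity check, since uncorrelated noise is known to give α near 0.5.

```python
import math
import random

def dfa_alpha(x, box_sizes=(4, 8, 16, 32)):
    """Estimate the DFA self-similarity exponent alpha of signal x.

    The signal is integrated about its mean, detrended with a
    least-squares line inside boxes of each size, the RMS fluctuation is
    measured per size, and alpha is the slope of the log-log fit of
    fluctuation against box size.
    """
    mean = sum(x) / len(x)
    y, run = [], 0.0
    for v in x:
        run += v - mean
        y.append(run)

    def detrend_residuals(box):
        # Residuals of the least-squares line over indices 0..len(box)-1.
        n = len(box)
        mx, my = (n - 1) / 2.0, sum(box) / n
        den = sum((i - mx) ** 2 for i in range(n))
        slope = sum((i - mx) * (box[i] - my) for i in range(n)) / den
        return [box[i] - (my + slope * (i - mx)) for i in range(n)]

    points = []
    for l in box_sizes:
        sq, cnt = 0.0, 0
        for s in range(0, len(y) - l + 1, l):
            res = detrend_residuals(y[s:s + l])
            sq += sum(r * r for r in res)
            cnt += l
        points.append((math.log(l), math.log(math.sqrt(sq / cnt))))

    # alpha = slope of log F(l) versus log l.
    mx = sum(p for p, _ in points) / len(points)
    my = sum(q for _, q in points) / len(points)
    return (sum((p - mx) * (q - my) for p, q in points) /
            sum((p - mx) ** 2 for p, _ in points))

random.seed(1)
noise = [random.random() for _ in range(1024)]
print(round(dfa_alpha(noise), 2))  # near 0.5 for uncorrelated noise
```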

The entropy E of Shi was computed as the Shannon entropy using Eq. (9).

E = −Σ_i p_i log₂ p_i    (9)

Where p_i is the probability of the ith amplitude value of Shi; a higher E indicates a more irregular signal.


Spectral centroid variability in Shi was computed to estimate gait festination, indicating level-2 impairment. Quick, abrupt, short steps accompanied by imbalance characterize gait festination. This means that a level-2 signal should show higher randomness as well as sharp shifts in signal values, i.e., weaker frequency centroids across the signal compared with healthy gait. Spectral centroids were computed for a total of N boxes of size n = 5 using Eq. (10).

C_i = Σ f_i x_i / Σ x_i, for i = 1, …, N    (10)

Where f_i is the frequency in Hertz and x_i are the spectral values in the ith box. The centroid variability for estimating abrupt short steps was computed as the mean absolute difference between consecutive centroids:

AΔC = (1/(N − 1)) Σ_{i=1}^{N−1} |C_{i+1} − C_i|    (11)


Slow walking was estimated by computing the time T between the valley and the peak of the signal Shi. A total of five features, 1) Savg, 2) α, 3) E, 4) AΔC, and 5) T, representing UPDRS-gait symptoms were used for training an SVM to classify UPDRS-gait.
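The spectral-centroid variability feature can be sketched as follows; the use of plain DFT magnitudes as the spectral values and the 25 Hz frame rate are assumptions made for illustration.

```python
import cmath

def spectral_centroid(box, fs=25.0):
    """Spectral centroid of one signal box.

    DFT magnitudes serve as the spectral values x_i, and f_i are the
    corresponding frequencies in Hz; fs is the video frame rate.
    """
    n = len(box)
    mags, freqs = [], []
    for k in range(n // 2 + 1):  # non-negative frequencies only
        s = sum(box[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        mags.append(abs(s))
        freqs.append(k * fs / n)
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0

def centroid_variability(signal, box=5, fs=25.0):
    """Mean absolute difference between consecutive box centroids."""
    cs = [spectral_centroid(signal[i:i + box], fs)
          for i in range(0, len(signal) - box + 1, box)]
    return sum(abs(cs[i + 1] - cs[i]) for i in range(len(cs) - 1)) / (len(cs) - 1)

# A flat signal has all spectral energy at 0 Hz in every box, so its
# centroid variability is zero.
print(centroid_variability([1.0] * 20))
```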

2.2.5 Feature analysis

A non-parametric one-way analysis of variance of the features across severity levels was performed using the Kruskal-Wallis test [15]. For each feature, the test ranked the feature values from smallest to largest. Level mean-ranks were compared to test the null hypothesis that the independent samples belong to indistinguishable continuous distributions. Statistical significance (p < 0.05; 95% CI) was computed to identify whether the features truly represent UPDRS-gait symptoms and discriminate the severity levels based on mean-ranks. Results are given in Section 3.
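For reference, the Kruskal-Wallis H statistic reduces to a comparison of pooled mean ranks. A minimal sketch (the tie-correction factor is omitted for brevity):

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic across k independent samples.

    All observations are pooled and ranked (tied values receive the
    average rank); H measures how far the per-group mean ranks deviate
    from the grand mean rank.
    """
    pooled = sorted((v, g) for g, grp in enumerate(groups) for v in grp)
    rank_sum = [0.0] * len(groups)
    i, n = 0, len(pooled)
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sum[pooled[k][1]] += avg_rank
        i = j
    return (12.0 / (n * (n + 1)) *
            sum(rs ** 2 / len(g) for rs, g in zip(rank_sum, groups)) -
            3.0 * (n + 1))

# Well-separated groups give a large H (the chi-square critical value
# for df = 2 at p = 0.05 is about 5.99).
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # ≈ 7.2
```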


2.2.6 Classification

An SVM was chosen for its ability to find optimal margins between class boundaries in a high-dimensional feature space [16]. An SVM uses a kernel function to map features into a higher dimension implicitly, through inner products between pairs of feature vectors, which is computationally inexpensive compared to computing the actual feature coordinates in a high-dimensional space. We used a PUK kernel k(Vi, Vj) [17], a modified form of the Pearson VII Gaussian function given in Eq. (12).


k(Vi, Vj) = 1 / [1 + (2 √(‖Vi − Vj‖²) √(2^(1/ω) − 1) / σ)²]^ω    (12)

Where Vi and Vj are training feature vectors, σ adjusts the half-width of the peak of the Gaussian curve, and ω controls the tailing factor of the peak. A feature matrix of 5 features × 456 samples was used to train the SVM to classify UPDRS-gait levels ‘0’, ‘1’, and ‘2’. For this multi-class classification problem, a one-vs-all approach was used: the SVM was trained to discriminate between the samples of one class and the samples of the other two classes. Hence, three models were developed to classify ‘0’, ‘1’, and ‘2’ separately. Training performance was optimized by tuning σ and ω.
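A sketch of the PUK kernel, following the Pearson VII universal kernel form of [17]; the default parameter values are the paper's tuned settings.

```python
import math

def puk_kernel(vi, vj, sigma=1.0, omega=0.2):
    """Pearson VII universal kernel (PUK) between two feature vectors.

    sigma controls the half-width and omega the tailing factor of the
    Pearson VII peak; omega=0.2, sigma=1.0 are the paper's tuned values.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vi, vj)))
    scaled = (2.0 * dist * math.sqrt(2.0 ** (1.0 / omega) - 1.0)) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega

print(puk_kernel([0.1, 0.5], [0.1, 0.5]))  # identical vectors give 1.0
```

Like the RBF kernel, PUK equals 1 for identical inputs and decays with distance, but ω lets the tail fall off more slowly than a Gaussian.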

To avoid biased generalization, the data were stratified using 10-fold cross-validation, i.e., the models were trained and tested in 10 iterations. In each iteration, 90% of randomly selected samples were used for training and 10% for testing the model. Samples used for testing once were not reused for testing in other iterations. Prediction accuracy was computed for each iteration, and the results were averaged over the ten iterations. The overall performance was evaluated using confusion matrices and ROC curves. Results are given in Section 3.


3 Results

A comparison between the feature mean-ranks of the UPDRS-gait levels is shown (Fig. 6). Feature Savg estimates short steps, which is a level-1 symptom. The test confirmed that the mean-ranks of Savg were lowest in level 1 and significantly different (p-value = 6.37 × 10⁻¹¹) from the mean-ranks of level 2. However, the mean-ranks of levels 0 and 1 were not significantly different.

Figure 6.

Feature analysis using the Kruskal-Wallis mean-rank test. Bars bordered with circles represent the UPDRS-gait severity level under which a symptom is examined.


Features α and E estimate step shuffling, which is a level-1 symptom. The level-1 mean-ranks of α were significantly higher (p-value = 1.13 × 10⁻⁶) than the mean-ranks of levels 0 and 2; however, the mean-ranks of levels 0 and 2 were not significantly different. Also, the mean-ranks of feature E were highest in level 1, although not significantly (p-value = 0.085).

Feature AΔC estimates gait festination, which is a level-2 symptom. Results affirmed that the AΔC mean-ranks in level 2 were significantly (p-value = 1.20 × 10⁻⁸) lower than the mean-ranks of levels 0 and 1. Also, feature T discriminated between the mean-ranks of the three levels with statistical significance (p-value = 7.86 × 10⁻⁴¹), suggesting that walking speed reduces with the severity of gait impairment.

The SVM model trained using these features, with tuned parameters (ω = 0.2; σ = 1.0), predicted the UPDRS-gait scores with an average accuracy of 70.83% (Fig. 7). Reasonable true-positive rates were produced for classes ‘0’ (74.3%), ‘1’ (64.5%), and ‘2’ (73.7%). The average area under the ROC curves of 80.88% was promising. Moreover, the ROC curves of classes ‘0’ and ‘2’ bowed upward, supporting the model’s ability to classify these classes with high accuracy.

Figure 7.

Classification performance of the SVM model.



4 Discussion

We introduced a new method of Parkinson’s gait assessment using front-view video analysis. The method computes the varying height of the human silhouette across video frames and quantizes the height signal to estimate temporal gait features. Important features significantly (p < 0.05) represented the gait symptoms of short-shuffling steps and festination that are clinically observed by a doctor to rate the mild, moderate, and severe stages of Parkinson’s gait. Moreover, the SVM model correctly predicted the UPDRS-gait scores with a high average area under the ROC curves.

Recent work based on Kinect sensors [18] supports that front-view analysis saves space for gait assessment compared with side-view analysis, which requires a large room. However, Kinect sensors are not commonplace. By contrast, our methodology uses an RGB camera available in everyday devices, facilitating gait assessment in narrow corridors at home with no specialised equipment. Moreover, the algorithm is a low-cost alternative to motion-capture systems for PD gait assessment, such as [19], which require advanced equipment and a controlled environment.

In conclusion, the proposed SVM model and features accurately characterized the severity of gait impairment according to UPDRS standards without requiring complicated lab settings or the physical attachment of body markers and sensors. The promising accuracy obtained in classifying UPDRS-gait severity levels and, importantly, the significant ability of the features to characterize severity suggest that the model can be used for clinical evaluation in non-laboratory settings, can support the tracking of gait symptoms, and can help in treatment interventions.

Future work includes optimizing the framework by incorporating biomechanics such as leg joints, joint angles, and hand movements, made possible by recording videos at higher frame rate and resolution. The study could be expanded to examine gait problems in other neurological disorders such as Huntington’s disease and neuropathy, or rehabilitation after lower-limb surgery. Also, deep learning could be used for model development by recording a larger dataset of gait videos for training the classifier. We plan to integrate the proposed method into a test battery system [20] that allows telemonitoring of patients’ activities of daily living to enable an overall PD assessment.

Conflict of interest

None to report.



References

[1] Nieuwboer A. Freezing of gait: problem analysis and rehabilitation strategies. Parkinsonism & Related Disorders. (2006) Oct; 12: S53–4. doi: 10.1016/j.parkreldis.2006.05.016.

[2] Fahn SR. Unified Parkinson’s disease rating scale. Recent Development in Parkinson’s Disease. (1987); 2: 293–304.

[3] Afentou N, Jarl J, Gerdtham UG, Saha S. Economic evaluation of interventions in Parkinson’s disease: a systematic literature review. Movement Disorders Clinical Practice. (2019) Apr; 6(4): 282–90. doi: 10.1002/mdc3.12755.

[4] Tan D, Pua YH, Balakrishnan S, Scully A, Bower KJ, Prakash KM, Tan EK, Chew JS, Poh E, Tan SB, Clark RA. Automated analysis of gait and modified timed up and go using the Microsoft Kinect in people with Parkinson’s disease: associations with physical outcome measures. Medical & Biological Engineering & Computing. (2019) Feb; 57(2): 369–77. doi: 10.1007/s11517-018-1868-2.

[5] Stricker M, Hinde D, Rolland A, Salzman N, Watson A, Almonroeder TG. Quantifying step length using two-dimensional video in individuals with Parkinson’s disease. Physiotherapy Theory and Practice. (2019) Mar: 1–4. doi: 10.1080/09593985.2019.1594472.

[6] Verlekar TT, De Vroey H, Claeys K, Hallez H, Soares LD, Correia PL. Estimation and validation of temporal gait features using a markerless 2D video system. Computer Methods and Programs in Biomedicine. (2019) Jul; 175: 45–51. doi: 10.1016/j.cmpb.2019.04.002.

[7] Nonnekes J, Snijders AH, Nutt JG, Deuschl G, Giladi N, Bloem BR. Freezing of gait: a practical approach to management. The Lancet Neurology. (2015) Jul; 14(7): 768–78. doi: 10.1016/S1474-4422(15)00041-1.

[8] Nyholm D, Remahl AN, Dizdar N, Constantinescu R, Holmberg B, Jansson R, Aquilonius SM, Askmark H. Duodenal levodopa infusion monotherapy vs oral polypharmacy in advanced Parkinson disease. Neurology. (2005) Jan; 64(2): 216–23. doi: 10.1212/01.WNL.0000149637.70961.4C.

[9] Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). IEEE. Vol. 1, (2005); pp. 886–893. doi: 10.1109/CVPR.2005.177.

[10] Papageorgiou C, Poggio T. A trainable system for object detection. International Journal of Computer Vision. (2000) Jun; 38(1): 15–33. doi: 10.1023/A:1008162616689.

[11] Tafazzoli F, Safabakhsh R. Model-based human gait recognition using leg and arm movements. Engineering Applications of Artificial Intelligence. (2010) Dec; 23(8): 1237–46. doi: 10.1016/j.engappai.2010.07.004.

[12] Turk MA, Pentland AP. Face recognition using eigenfaces. In: Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE Computer Society. (1991); pp. 586–587.

[13] Lloyd S. Least squares quantization in PCM. IEEE Transactions on Information Theory. (1982) Mar; 28(2): 129–37. doi: 10.1109/TIT.1982.1056489.

[14] Kantelhardt JW, Koscielny-Bunde E, Rego HH, Havlin S, Bunde A. Detecting long-range correlations with detrended fluctuation analysis. Physica A: Statistical Mechanics and its Applications. (2001) Jun; 295(3–4): 441–54. doi: 10.1016/S0378-4371(01)00144-3.

[15] Breslow N. A generalized Kruskal-Wallis test for comparing K samples subject to unequal patterns of censorship. Biometrika. (1970) Dec; 57(3): 579–94. doi: 10.1093/biomet/57.3.579.

[16] Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC. Estimating the support of a high-dimensional distribution. Neural Computation. (2001) Jul; 13(7): 1443–71. doi: 10.1162/089976601750264965.

[17] Üstün B, Melssen WJ, Buydens LM. Facilitating the application of support vector regression by using a universal Pearson VII function based kernel. Chemometrics and Intelligent Laboratory Systems. (2006) Mar; 81(1): 29–40. doi: 10.1016/j.chemolab.2005.09.003.

[18] Sheshadri MG, Okade M. Kinect based frontal gait recognition using skeleton and depth derived features. In: 2020 National Conference on Communications (NCC). IEEE. (2020); pp. 1–5. doi: 10.1109/NCC48643.2020.9056001.

[19] Pistacchi M, Gioulis M, Sanson F, De Giovannini E, Filippi G, Rossetto F, Marsala SZ. Gait analysis and clinical correlations in early Parkinson’s disease. Functional Neurology. (2017) Jan; 32(1): 28. doi: 10.11138/FNeur/2017.32.1.028.

[20] Westin J, Dougherty M, Nyholm D, Groth T. A home environment test battery for status assessment in patients with advanced Parkinson’s disease. Computer Methods and Programs in Biomedicine. (2010) Apr; 98(1): 27–35. doi: 10.1016/j.cmpb.2009.08.001.