Distance-limited walk tests post-stroke: A systematic review of measurement properties

BACKGROUND: Improving walking capacity is a key objective of post-stroke rehabilitation. Evidence describing the quality and protocols of standardized tools for assessing walking capacity can facilitate their implementation. OBJECTIVE: To synthesize existing literature describing test protocols and measurement properties of distance-limited walk tests in people post-stroke. METHODS: Electronic database searches were completed in 2017. Records were screened and appraised for quality. RESULTS: Data were extracted from 43 eligible articles. Among the 12 walk tests identified, the 10-metre walk test (10mWT) at a comfortable pace was most commonly evaluated. Sixty-three unique protocols at comfortable and fast paces were identified. Walking pace and walkway surface, but not walkway length, influenced walking speed. Intraclass correlation coefficients for test-retest reliability ranged from 0.80 to 0.99 across walk tests. Measurement error values ranged from 0.04 to 0.40 and from 0.06 to 0.20 for the 10mWT at comfortable and fast paces, respectively. Across walk tests, performance was most frequently correlated with measures of strength, balance, and physical activity (r = 0.26–0.8, p < 0.05). CONCLUSIONS: The 10mWT has the most evidence of reliability and validity. Studies with improved quality of reporting that include people with severe walking deficits in the acute and subacute phases of recovery are needed.


Introduction
Use of standardized assessment tools is considered a best practice in stroke rehabilitation to evaluate the magnitude of gait deficit, monitor response to therapeutic intervention, educate, and set patient-centered goals (Moore et al., 2009; Otterman et al., 2017; Potter et al., 2011; Teasell et al., 2020). Distance-limited walk tests, such as the 10-metre walk test (10 mWT; Wade, 1992), have been recommended for assessing gait speed after stroke (Otterman et al., 2017; Sullivan et al., 2013; Teasell et al., 2020). Gait speed is an important outcome of stroke rehabilitation as it is essential for community ambulation (Potter et al., 2011; Salbach et al., 2014), associated with motor function, balance (Ahmed et al., 2003; Kwong et al., 2016), walking function (Ahmed et al., 2003), and health-related quality of life (Khanittanuphong & Tipchatyotin, 2017), and a predictor of survival (Studenski et al., 2011). However, clinical use of measures of gait speed is inconsistent and variable across settings (Agyenkwa et al., 2020; Braun et al., 2018; Salbach et al., 2011; Van Peppen et al., 2008). Knowledge translation research, guided by models, theories and frameworks, is needed to overcome barriers to gait speed measurement in clinical practice.
The knowledge-to-action (KTA) framework (Graham et al., 2006) is a knowledge translation framework used to guide the process of translating research into practice. Specifically, the knowledge creation funnel in the KTA framework describes the filtering process required to develop knowledge products or tools for end-users. At the base of the funnel, first-generation knowledge refers to the various individual sources of information on a topic, such as research articles and reports, that are of variable quality and time-consuming to acquire. Second-generation knowledge, or knowledge synthesis, is described as an essential precursor to the development of third-generation, user-friendly knowledge tools such as evidence-based algorithms, guides, and guidelines. Physical therapists (PTs) report that evidence supporting the measurement properties of standardized tools positively influences their decision to adopt them in clinical practice (Jette et al., 2009; McGlynn & Cott, 2007; Pattison et al., 2015). Therefore, the synthesis and critical appraisal of the measurement literature on distance-limited walk tests is necessary to inform the development of knowledge translation strategies designed to facilitate their use among PTs. Such a synthesis for time-limited walk tests has been reported (Salbach et al., 2017). The objective of this study was to synthesize research evidence of the reliability, measurement error, construct validity, and sensitivity to change for distance-limited walk tests in people with stroke. A secondary objective was to determine the influence of walk test protocol elements on test performance.

Overview
A systematic review was conducted in two phases guided by a review protocol developed by the research team. The PRISMA checklist (Liberati et al., 2009) was used to guide reporting. The title and abstract screening form, full-text screening form, critical appraisal form, and data extraction form and guide were piloted and refined before reviewers used them. All reviewers involved with study selection and appraisal completed orientation and training with the study coordinator.

Search methods
An initial search was conducted in July 2013 using methods previously described (Salbach et al., 2017) and updated in 2017 due to advancements in literature search methodology (Garner et al., 2016). In collaboration with an academic health sciences librarian, we designed a new Medline search strategy that was peer-reviewed by a second librarian (McGowan et al., 2016) before being translated for use with other databases. The updated search included Ovid MEDLINE: Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Ovid MEDLINE® Daily and Ovid MEDLINE®, Ovid Embase, EBSCO CINAHL, EBSCO SPORTDiscus, and The Cochrane Library from inception to August 16, 2017. The new search strategy captured all articles that were included in the original unpublished review. See Supplemental Digital Content 1A for search strategies. A manual search of reference lists and authors' personal libraries was also conducted.
Records identified in the updated search were imported into EndNote™ software (version X7.7) and duplicate citations were removed using the Bramer method (Bramer et al., 2016). All unique records from the updated search were compared to records found in the original unpublished search, and duplicates, previously screened for eligibility, were removed. The final set of records was uploaded to Covidence™ (https://www.covidence.org) for screening.

Selection criteria
Studies were considered eligible if: (1) participants included adults (18+ years) post-stroke; (2) the study reported on reliability, measurement error, construct validity, or sensitivity to change, or the effect of a walk test protocol element (e.g., walkway length, practice trials) on performance of distance-limited walk tests (for construct validity, studies reporting associations between walk test performance and other variables, regardless of whether this was framed as validity testing, were included); (3) the study reported the timed, acceleration, and deceleration distances to enable test replication; (4) walk tests were performed separately and were not embedded within another test; and (5) the report was written in English, French or Spanish. Studies were excluded if: (1) the percentage of participants with stroke was below 80%; (2) the walk test was completed on a treadmill; (3) instrumented timing methods (e.g., GaitRite mat, footswitches) were used; or (4) the study was a conference proceeding, dissertation, case report/series or limited to abstract form.
To ensure the feasibility of the review, inclusion of studies examining construct validity was limited to those reporting unadjusted correlations and associated p-values or confidence intervals between walk test performance and measures of motor function, aerobic capacity, balance, balance self-efficacy, strength (including force, torque and power), walking, stairs, sit-to-stand, mobility, physical activity, participation, health-related quality of life, or discharge destination, as these constructs are considered important rehabilitation outcomes (Lang et al., 2011; Otterman et al., 2017). Among studies examining predictive validity, only those reporting the ability of a distance-limited walk test to predict VO2peak or VO2max, physical activity, discharge destination, or health-related quality of life were included. Among studies reporting reliability, only those reporting an intraclass correlation coefficient (ICC) were included. Among studies reporting measurement error, only those reporting minimal detectable change (MDC) and/or standard error of measurement (SEM) were included.
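For readers less familiar with these indices, SEM and MDC are commonly derived from the ICC and the between-person standard deviation (SD) as SEM = SD × √(1 − ICC) and MDC = z × √2 × SEM. The Python sketch below assumes these conventional formulas, which are not spelled out in this review; the function names and input values are hypothetical and are not data from any included study.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if sd < 0:
        raise ValueError("sd must be non-negative")
    return sd * math.sqrt(1.0 - icc)


def mdc_from_sem(sem: float, z: float = 1.645) -> float:
    """Minimal detectable change, assuming MDC = z * sqrt(2) * SEM.

    z = 1.645 gives MDC90; z = 1.96 gives MDC95.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs (m/s), chosen only to illustrate the arithmetic.
example_sem = sem_from_icc(sd=0.20, icc=0.96)
example_mdc90 = mdc_from_sem(example_sem)
example_mdc95 = mdc_from_sem(example_sem, z=1.96)
```

Under these assumptions, a change in walking speed smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.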

Study selection
Three reviewers screened titles and abstracts independently and in duplicate, and classified studies as potentially relevant or not relevant to the review. Full texts of potentially relevant records were uploaded to Covidence™ and screened by one of six reviewers to determine eligibility. A second reviewer was consulted to resolve uncertainty regarding the eligibility of a study.

Data extraction
A single reviewer independently extracted data on general study information, study characteristics, participant characteristics, walk test protocol and results from included studies. To ensure data accuracy and completeness, another reviewer randomly selected and verified data from 30% of included articles.
Discrepancies were resolved through discussion. Data on participant characteristics (i.e., age, time since stroke onset, sex, type of stroke, side of stroke, walking speed, use of walking aids/orthoses), walk test characteristics (i.e., name, walkway distances, pace, location, timing method, trials, rest interval, scoring, evaluator position/qualifications/training, instructions), and measurement properties, were collected.
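Because the timed distance, number of trials, and scoring rule were extracted for each protocol, it may help to see how these elements combine into a single walking speed score. The following Python sketch is illustrative only: the function names and trial times are hypothetical, and the two scoring rules shown (mean of trials and fastest trial) are examples of the kinds of rules reported by the reviewed protocols rather than a recommended procedure.

```python
from statistics import mean


def walking_speed(timed_distance_m: float, time_s: float) -> float:
    """Walking speed in m/s over the timed portion of the walkway.

    Acceleration and deceleration distances are excluded by definition,
    so only the timed distance enters the calculation.
    """
    if timed_distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return timed_distance_m / time_s


def score_trials(timed_distance_m: float, times_s: list[float],
                 scoring: str = "mean") -> float:
    """Combine several test trials into one score (m/s)."""
    if not times_s:
        raise ValueError("at least one trial time is required")
    speeds = [walking_speed(timed_distance_m, t) for t in times_s]
    if scoring == "mean":
        return mean(speeds)
    if scoring == "max":
        return max(speeds)
    raise ValueError("scoring must be 'mean' or 'max'")


# Hypothetical example: three trials over a 10 m timed distance.
trial_times = [12.5, 10.0, 8.0]
mean_speed = score_trials(10.0, trial_times, scoring="mean")
fastest_speed = score_trials(10.0, trial_times, scoring="max")
```

The same set of trials yields different scores under different scoring rules, which is one reason the scoring rule is recorded as part of the protocol.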

Method of quality assessment
The methodological quality of included studies was assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) Risk of Bias Checklist (Mokkink et al., 2018). The tool classifies each measurement property as very good, adequate, doubtful, or inadequate based on the lowest score reported on the corresponding checklist. The research team adapted the checklists and developed a checklist for assessing sensitivity to change based on the format of the COSMIN checklists (see Supplemental Digital Content 1B). Additionally, operational definitions were developed to optimize scoring consistency. For example, for reliability and measurement error, we defined a retest time interval over which patient stability would be assumed for three recovery phases post-stroke as: ≤ 1 day (acute), ≤ 5 days (subacute) and ≤ 3 weeks (chronic) based on results from longitudinal studies of walking (Jørgensen et al., 1995; Richards & Olney, 1996) and research team consensus (Salbach et al., 2017). A single author assessed the methodological quality of included studies, and a second author, not involved in the quality appraisal, was consulted to resolve uncertainty. COSMIN checklists were applied to studies reporting specific measurement properties, not to properties (i.e., MDC) computed using abstracted data.
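The two scoring rules described above (taking the lowest item rating as the overall rating, and the phase-specific retest intervals over which stability is assumed) can be sketched in Python as follows. This is a minimal illustration, not the COSMIN tool itself: the function names, the data structures, and the conversion of 3 weeks to 21 days are assumptions made for the example.

```python
# Ratings ordered from worst to best, as named in the text.
RATINGS = ["inadequate", "doubtful", "adequate", "very good"]

# Maximum retest interval (days) over which stability is assumed.
# 3 weeks is expressed as 21 days for this illustration.
MAX_RETEST_DAYS = {"acute": 1, "subacute": 5, "chronic": 21}


def overall_rating(item_ratings: list[str]) -> str:
    """Return the lowest rating among the checklist items."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings, key=RATINGS.index)


def stability_assumed(phase: str, retest_interval_days: float) -> bool:
    """True if the retest interval falls within the limit for the phase."""
    return retest_interval_days <= MAX_RETEST_DAYS[phase]


# Hypothetical study: three items rated, retested 5 days apart (subacute).
example_rating = overall_rating(["very good", "adequate", "doubtful"])
example_stable = stability_assumed("subacute", 5)
```

Here a single doubtful item determines the overall rating, which is why incomplete reporting of one protocol element can lower the rating of an otherwise well-conducted study.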

Appraisal of study methodology
Figures 2, 3, and 4 summarize critical appraisal results for articles assessing reliability and measurement error, construct validity, and sensitivity to change, respectively. All 11 articles evaluating reliability were rated as very good or adequate. The most prevalent issue was sub-optimal reporting of statistical methods (n = 4; 36%). Of the seven articles reporting on measurement error, all were rated as very good or adequate. The most prevalent issue was sub-optimal reporting of similar testing conditions (n = 2, 29%). Of the 33 articles reporting on construct validity, the number rated as very good, adequate, doubtful, and inadequate was 13 (39%), 3 (9%), 13 (39%), and 4 (12%), respectively. The most prevalent issue was other methodological flaws (n = 15; 45%), including insufficient descriptions of walk test evaluator position, qualifications, or training received. All 3 articles that evaluated sensitivity to change were rated as very good.

Influence of walk test protocol elements on test performance
Effect of walkway length and walking pace
One study (Ng et al., 2012) of 25 participants with chronic stroke did not find significant differences in performance on the 5 m, 8 m, or 10 m walk tests at comfortable or fast pace, indicating these walkway lengths yield similar speeds. Performance at a comfortable pace (mean 0.76–0.79 metres per second (m/s)) was significantly slower than performance at a fast pace (mean 0.97–1.00 m/s) for each walkway length.

Effect of walkway surface
One study examined the effect of walkway surface on performance of the 6-metre walk test at comfortable (6 mCWT) and fast (6 mFWT) paces among 24 people with subacute stroke (Stephens & Goldie, 1999). Participants walked significantly faster on parquetry (hardwood) than on carpet, with a mean difference of 0.05 m/s and 0.03 m/s for the 6 mCWT and 6 mFWT, respectively.

Discussion
This novel review provides a comprehensive synthesis of existing literature on measurement properties for distance-limited walk tests in people with stroke. The results are extensive, which makes it challenging to understand how they might inform the selection of a distance-limited walk test to measure gait speed post-stroke in clinical practice. We therefore offer the following framework, which integrates the systematic review findings, to guide decision-making.

D.K.-Y. Cheng et al. / Distance-limited walk tests post-stroke
The extensive evidence presented in this review can help guide the selection of a distance-limited walk test for clinical use post-stroke based on principles of measurement and generalizability, the influence of protocol elements on performance, and available resources (e.g., space). The first measurement principle guiding selection is an understanding that reliability is a prerequisite of validity (Streiner et al., 2014). One must first choose a walk test protocol that has demonstrated excellent reliability indicated by not only the ICC value, but also the lower limit of the 95% CI, and, secondarily, evidence of construct validity in the 'population of interest'. Walking speed is a temporal-distance parameter of gait, not an abstract concept. Validity evidence increases our understanding of how strongly gait speed relates to impairments, activity limitations and participation restrictions (World Health Organization, 2001), and helps us to appreciate its relevance to human functioning, rehabilitation outcomes, and patient-centered goals.
The second principle guiding the selection of a distance-limited walk test for clinical use post-stroke relates to the generalizability of evidence to a particular clinical population (also known as external validity). If one's clinical practice involves communication and/or program evaluation of walk test performance across acute care, and inpatient and outpatient rehabilitation settings (i.e., the care continuum), then ideally one will choose a distance-limited walk test with evidence of excellent reliability and validity in people with acute, subacute, and chronic stroke. If clinical use of walk test performance is limited to a single practice setting, one could select a test that is reliable and valid among patients seen in that setting alone. For clinical practice along the care continuum post-stroke, review findings reveal that the 10 mCWT is the only test with evidence of excellent reliability and construct validity in people with acute, subacute, and chronic stroke. Comfortable gait speed measured using the 10 mCWT consistently relates to balance and strength impairments, and mobility/walking limitations across settings; and participation in activities of daily living, physical activity, and other meaningful activities relevant to the outpatient setting (Lang et al., 2011) in people with chronic stroke. If one's clinical practice is limited to treating people within 6 months post-stroke (acute and subacute phases), then the 5 mCWT is an excellent alternative, particularly for settings that cannot accommodate the 10 mCWT walkway length, given the evidence from this review of excellent reliability of the 5 mCWT and associations between 5 mCWT performance and important physical rehabilitation outcomes, such as motor function and basic mobility.
Once reliability and validity evidence in the population of interest and available space have been considered, a tertiary measurement consideration is sensitivity to change, defined as the ability of a measure to detect change in the construct of interest (Cohen, 1977). Effect size/standardized response mean (SRM) estimates of sensitivity to change were large for the 5 mCWT and medium-to-large for the 10 mCWT in people with acute and subacute stroke (Ahmed et al., 2003; English et al., 2006; Salbach et al., 2001), reflecting the ability of both tests to capture change in walking capacity when individuals are likely participating in rehabilitation (Hall et al., 2018).
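To make the effect size and SRM indices concrete, the Python sketch below computes both from paired baseline and follow-up walking speeds. It assumes the commonly used definitions (effect size = mean change / SD of baseline scores; SRM = mean change / SD of change scores) and Cohen's conventional benchmarks of 0.2, 0.5, and 0.8 for small, medium, and large values; the included studies may have used other variants. The function names and the data are hypothetical.

```python
from statistics import mean, stdev


def _changes(baseline: list[float], follow_up: list[float]) -> list[float]:
    """Paired change scores (follow-up minus baseline)."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores from at least two people")
    return [f - b for b, f in zip(baseline, follow_up)]


def effect_size(baseline: list[float], follow_up: list[float]) -> float:
    """Mean change divided by the SD of baseline scores (assumed definition)."""
    return mean(_changes(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline: list[float],
                               follow_up: list[float]) -> float:
    """Mean change divided by the SD of change scores (assumed definition)."""
    changes = _changes(baseline, follow_up)
    return mean(changes) / stdev(changes)


def cohen_label(value: float) -> str:
    """Label a magnitude using Cohen's conventional benchmarks."""
    magnitude = abs(value)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "trivial"


# Hypothetical walking speeds (m/s) for four people, before and after rehab.
baseline_speeds = [0.20, 0.30, 0.40, 0.50]
follow_up_speeds = [0.30, 0.50, 0.50, 0.70]
es = effect_size(baseline_speeds, follow_up_speeds)
srm = standardized_response_mean(baseline_speeds, follow_up_speeds)
```

The two indices differ only in the denominator, so they can lead to different labels for the same data when change scores are much more or less variable than baseline scores.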
Generally, the walk test protocol selected for clinical practice, including instructions, acceleration/deceleration and timed distances, timing method, and allowance for evaluator assistance and use of mobility devices, should be identical to the one used in the reliability study supporting its use. Interestingly, review findings support an excellent level of reliability based on the ICC point estimate and lower 95% CI limit of diverse walk test protocols that did not allow physical assistance to walk (English et al., 2007; Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Isho & Usuda, 2016; Lam et al., 2010; Peters et al., 2014; Stephens & Goldie, 1999). These protocols included walkways of 3 m (Peters et al., 2014), 5 m (English et al., 2007), 6 m (Lam et al., 2010; Stephens & Goldie, 1999), and 10 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Isho & Usuda, 2016) traversed at a comfortable pace, and 6 m (Stephens & Goldie, 1999) and 10 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012) walked at a fast pace; acceleration/deceleration distances of 2.0 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Lam et al., 2010; Peters et al., 2014; Stephens & Goldie, 1999) or 2.5 m (English et al., 2007; Isho & Usuda, 2016); 0 or 1 practice trial and 1 test trial (English et al., 2007; Isho & Usuda, 2016; Lam et al., 2010; Stephens & Goldie, 1999), as well as the mean of 2 or 3 trials (Faria et al., 2012; Flansbjer et al., 2005) or the maximum of 3 trials (Faria et al., 2012); and individuals with variable levels of plantar flexor tone (Hiengkaew et al., 2012) and community ambulation (Peters et al., 2014). It appears that, regardless of the protocol, any standardized distance-limited test of walking speed is highly reliable in people with stroke who do not require assistance to walk.
However, it is important that selected walk tests be compared to tests with the same testing distance and protocol, as results from only one study of people with chronic stroke (Lam et al., 2010) showed that walkway distance did not affect walking speed. One cannot assume that these results apply to people with acute and subacute stroke, populations that are often seen in rehabilitation settings with less stable walking capacity compared to people with chronic stroke (Christensen et al., 2008; Schepers et al., 2006).
Excellent reliability based on ICC magnitude alone was also observed for a small number of walk test protocols (i.e., 5 mCWT, 10 mCWT, and 10 mFWT) in studies of very good or adequate quality that allowed the evaluator to provide physical assistance (Cheng et al., 2020; Høyer et al., 2014), with lower 95% CI limits in the acceptable range (Cheng et al., 2020). These findings are extremely relevant to acute and inpatient rehabilitation settings in which a substantial proportion of people post-stroke require assistance to walk (Hall et al., 2018). Healthcare professionals in these settings should consider adopting a protocol that allows the evaluator to provide physical assistance at the waist (Cheng et al., 2020; Høyer et al., 2014), but not to advance the lower extremity (Cheng et al., 2020). In fact, in people with acute and subacute stroke walking at slow speeds (e.g., mean ∼0.25 m/s), the reliability of walk test protocols evaluated is excellent and MDC90 values are small (0.07 or 0.12 m/s) (Høyer et al., 2014).
This review revealed gaps in the literature. Evidence for the reliability of the 5 mF-, 7 mC-, 7 mF-, 8 mC-, and 12 mCWT, for measurement error of the 5 mF-, 6 mF-, 7 mC-, 7 mF-, 8 mC-, and 12 mCWT, and for the construct validity of the 3 mC-, 6 mF-, and 8 mFWT, ideally across the care continuum, was lacking. Despite recommendations for the use of the 10 mCWT in clinical (Otterman et al., 2017; Sullivan et al., 2013; Teasell et al., 2020) and research settings, and its popularity in research studies (Salbach et al., 2014), there was limited research evaluating test-retest reliability and measurement error of this test in people with acute or subacute stroke. Furthermore, while some guidelines promote the 6-metre walk test for neurologic populations (Moore et al., 2018), our review found that evidence for reliability of this test was limited to the subacute stage, and the precision of the estimates is unknown because CIs were not reported (Lam et al., 2010; Stephens & Goldie, 1999). The vast majority of studies included in this review had limited applicability to rehabilitation settings as they enrolled people who walked faster than 0.4 m/s. Studies targeting people who walk slowly and may require assistance to walk, deficits commonly seen in acute care and inpatient rehabilitation settings (Hall et al., 2018), are needed.
This review has some limitations. Due to the extensive literature in this area and finite resources, we were unable to include evidence of validity for all constructs or studies of minimal clinically important change, or to update the search beyond August 2017. More recent publications may address some of the gaps we identified. Although only one reviewer completed full-text screening, data extraction and critical appraisal, extensive training and verification of data were undertaken. The review was comprehensive given the large number of databases searched and inclusion of any study reporting associations with gait speed for evidence of validity.

Conclusions
The 10 mCWT is the only measure demonstrating excellent reliability and construct validity across the care continuum post-stroke, and sensitivity to change in people with acute and subacute stroke. The 5 mCWT demonstrates excellent reliability, construct validity, and sensitivity to change in acute and subacute phases of stroke recovery. Despite wide variations, the majority of protocols for distance-limited tests have excellent reliability, and evidence of validity indicated by associations with important physical rehabilitation outcomes, even in people who require assistance to walk. Review findings provide guidance for future research and improved quality of reporting.
PT is an author on 1 article included in this review. Otherwise, the authors declare no conflicts of interest. Funding for this project is provided by the Canadian Institutes of Health Research, Heart & Stroke Foundation, Canadian Partnership for Stroke Recovery, and Canadian Frailty Network. NMS holds the Toronto Rehabilitation Institute Chair at the University of Toronto.