Distance-limited walk tests post-stroke: A systematic review of measurement properties

BACKGROUND: Improving walking capacity is a key objective of post-stroke rehabilitation. Evidence describing the quality and protocols of standardized tools for assessing walking capacity can facilitate their implementation. OBJECTIVE: To synthesize existing literature describing test protocols and measurement properties of distance-limited walk tests in people post-stroke. METHODS: Electronic database searches were completed in 2017. Records were screened and appraised for quality. RESULTS: Data were extracted from 43 eligible articles. Among the 12 walk tests identified, the 10-metre walk test (10mWT) at a comfortable pace was most commonly evaluated. Sixty-three unique protocols at comfortable and fast paces were identified. Walking pace and walkway surface, but not walkway length, influenced walking speed. Intraclass correlation coefficients for test-retest reliability ranged from 0.80–0.99 across walk tests. Measurement error values ranged from 0.04–0.40 and 0.06–0.20 for the 10mWT at comfortable and fast paces, respectively. Across walk tests, performance was most frequently correlated with measures of strength, balance, and physical activity (r = 0.26–0.8, p < 0.05). CONCLUSIONS: The 10mWT has the most evidence of reliability and validity. Findings indicate that studies that include people with severe walking deficits, in acute and subacute phases of recovery, with improved quality of reporting, are needed.

The knowledge-to-action (KTA) framework (Graham et al., 2006) is a knowledge translation framework used to guide the process of translating research into practice. Specifically, the knowledge creation funnel in the KTA framework describes the filtering process required to develop knowledge products or tools for end-users. At the base of the funnel, first-generation knowledge refers to the various individual sources of information on a topic, such as research articles and reports, that are of variable quality and time-consuming to acquire. Second-generation knowledge, or knowledge synthesis, is described as an essential precursor to the development of third-generation, user-friendly knowledge tools such as evidence-based algorithms, guides, and guidelines. Physical therapists (PTs) report that evidence supporting the measurement properties of standardized tools positively influences their decision to adopt them in clinical practice (Jette et al., 2009; McGlynn & Cott, 2007; Pattison et al., 2015). Therefore, the synthesis and critical appraisal of the measurement literature on distance-limited walk tests is necessary to inform the development of knowledge translation strategies designed to facilitate their use among PTs. Such a synthesis for time-limited walk tests has been reported (Salbach et al., 2017). The objective of this study was to synthesize research evidence of the reliability, measurement error, construct validity, and sensitivity to change for distance-limited walk tests in people with stroke. A secondary objective was to determine the influence of walk test protocol elements on test performance.

Overview
A systematic review was conducted in two phases guided by a review protocol developed by the research team. The PRISMA checklist (Liberati et al., 2009) was used to guide reporting. The title-and-abstract and full-text screening forms, the critical appraisal form, and the data extraction form and guide were piloted and refined prior to use by reviewers. All reviewers involved with study selection and appraisal completed orientation and training with the study coordinator.

Search methods
An initial search was conducted in July 2013 using methods previously described (Salbach et al., 2017) and updated in 2017 due to advancements in literature search methodology (Garner et al., 2016). In collaboration with an academic health sciences librarian, we designed a new Medline search strategy that was peer-reviewed by a second librarian (McGowan et al., 2016) before being translated for use with other databases. The updated search included Ovid MEDLINE: Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Ovid MEDLINE® Daily and Ovid MEDLINE®, Ovid Embase, EBSCO CINAHL, EBSCO SportDiscus, and The Cochrane Library from inception to August 16, 2017. The new search strategy captured all articles that were included in the original unpublished review. See Supplemental Digital Content 1A for search strategies. A manual search of reference lists and authors' personal libraries was also conducted.
Records identified in the updated search were imported into EndNote™ software (version X7.7) and duplicate citations were removed using the Bramer method (Bramer et al., 2016). All unique records from the updated search were compared to records found in the original unpublished search, and duplicates, previously screened for eligibility, were removed. The final set of records was uploaded to Covidence™ (https://www.covidence.org) for screening.

Selection criteria
Studies were considered eligible if: (1) participants included adults (18+ years) post-stroke; (2) the study reported on reliability, measurement error, construct validity, or sensitivity to change, or the effect of a walk test protocol element (e.g., walkway length, practice trials, etc.) on performance of distance-limited walk tests (for construct validity, studies reporting associations between walk test performance and other variables, regardless of whether this was framed as validity testing, were included); (3) the study reported the timed, acceleration, and deceleration distances to enable test replication; (4) walk tests were performed separately and were not embedded within another test; and (5) the report was written in English, French or Spanish. Studies were excluded if: (1) the percentage of participants with stroke was below 80%; (2) the walk test was completed on a treadmill; (3) instrumented timing methods (e.g., GaitRite mat, footswitches) were used; or (4) the study was a conference proceeding, dissertation, case report/series or limited to abstract form.
To ensure the feasibility of the review, inclusion of studies examining construct validity was limited to those reporting unadjusted correlations and associated p-values or confidence intervals between walk test performance and measures of motor function, aerobic capacity, balance, balance self-efficacy, strength (including force, torque and power), walking, stairs, sit-to-stand, mobility, physical activity, participation, health-related quality of life, or discharge destination, as these constructs are considered important rehabilitation outcomes (Lang et al., 2011; Otterman et al., 2017). Among studies examining predictive validity, only those reporting the ability of a distance-limited walk test to predict VO2peak or VO2max, physical activity, discharge destination, or health-related quality of life were included. Among studies reporting reliability, only those reporting an intraclass correlation coefficient (ICC) were included. Among studies reporting measurement error, only those reporting minimal detectable change (MDC) and/or standard error of measurement (SEM) were included.

Study selection
Three reviewers screened titles and abstracts independently and in duplicate, and classified studies as potentially relevant or not relevant to the review. Full texts of potentially relevant records were uploaded to Covidence™ and screened by one of six reviewers to determine eligibility. A second reviewer was consulted to resolve uncertainty regarding the eligibility of a study.

Data extraction
A single reviewer independently extracted data on general study information, study characteristics, participant characteristics, walk test protocol and results from included studies. To ensure data accuracy and completeness, another reviewer randomly selected and verified data from 30% of included articles.
Discrepancies were resolved through discussion. Data on participant characteristics (i.e., age, time since stroke onset, sex, type of stroke, side of stroke, walking speed, use of walking aids/orthoses), walk test characteristics (i.e., name, walkway distances, pace, location, timing method, trials, rest interval, scoring, evaluator position/qualifications/training, instructions), and measurement properties were collected.

Method of quality assessment
The methodological quality of included studies was assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) Risk of Bias Checklist (Mokkink et al., 2018). The tool classifies each measurement property as very good, adequate, doubtful, or inadequate based on the lowest score reported on the corresponding checklist. The research team adapted the checklists and developed a checklist for assessing sensitivity to change based on the format of the COSMIN checklists (see Supplemental Digital Content 1B). Additionally, operational definitions were developed to optimize scoring consistency. For example, for reliability and measurement error, we defined the retest time interval over which patient stability would be assumed for three recovery phases post-stroke as ≤ 1 day (acute), ≤ 5 days (subacute) and ≤ 3 weeks (chronic), based on results from longitudinal studies of walking (Jørgensen et al., 1995; Richards & Olney, 1996) and research team consensus (Salbach et al., 2017). A single author assessed the methodological quality of included studies, and a second author, not involved in the quality appraisal, was consulted to resolve uncertainty. COSMIN checklists were applied to studies reporting specific measurement properties, not to properties (i.e., MDC) computed using abstracted data.
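The "lowest score" rule and the retest-interval definitions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the review team's tooling: the function names and the example item ratings are hypothetical, and 3 weeks is taken as 21 days.

```python
# Ratings ordered from best to worst, as listed in the text.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]

# Maximum retest interval (days) over which stability is assumed, per phase.
# Assumption: "3 weeks" is expressed here as 21 days.
MAX_RETEST_DAYS = {"acute": 1, "subacute": 5, "chronic": 21}


def overall_rating(item_ratings):
    """Return the lowest (worst) rating among the checklist item ratings."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # The rating furthest along RATING_ORDER is the worst one.
    return max(item_ratings, key=RATING_ORDER.index)


def within_stable_interval(phase, retest_days):
    """Check whether a retest interval falls within the assumed stable window."""
    return retest_days <= MAX_RETEST_DAYS[phase]


# Hypothetical item ratings for one study (not data from the review).
example_items = ["very good", "adequate", "very good", "doubtful"]
print(overall_rating(example_items))
print(within_stable_interval("subacute", 7))
```

Under this rule a single weak item determines the overall rating, which is why a study with mostly "very good" items can still be classified as doubtful.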

Data synthesis and analysis
ICC values and associated 95% confidence intervals (CIs) were extracted when reported. The 95% CI is interpreted as the interval that will capture the true ICC value of the population 95% of the time when repeated random samples are drawn from the population (Shrout & Fleiss, 1979). ICC values used to estimate reliability were interpreted as excellent (ICC ≥ 0.75), acceptable (ICC > 0.40 to < 0.75) or poor (ICC ≤ 0.40) (Andresen, 2000). MDC at the 90% confidence level (MDC90) was computed for studies reporting test-retest reliability estimates and the standard deviation (SD) of baseline scores using the following equations: SEM = SD × √(1 − ICC) (Beaton et al., 2001) and MDC90 = 1.645 × SEM × √2 (Beaton et al., 2001). Constructs measured to evaluate validity were classified using the International Classification of Functioning, Disability and Health (ICF) (World Health Organization, 2001). We interpreted correlation coefficients as strong (≥ 0.70), moderate (0.50 to 0.69), weak (0.30 to 0.49) or negligible (< 0.30) (Landis & Koch, 1977). Effect size and standardized response mean values used to estimate sensitivity to change were interpreted as small (0.2), moderate (0.5), and large (≥ 0.8) (Cohen, 1977). For those studies evaluating torque at multiple points, only peak torque measured using isokinetic dynamometers was reported. Results for reliability, measurement error, validity, and sensitivity to change were presented by time post-stroke classified as acute (< 1 month), subacute (1-6 months), or chronic (> 6 months) (Hatem et al., 2016) using range/interquartile range (or mean/median values if range was not presented). To facilitate comparison between studies, frequency data were converted to percentages, results were converted to a common metric unit, and values were rounded to a consistent decimal place.
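As a worked illustration of the SEM and MDC90 equations and the interpretation thresholds above, the following Python sketch may be helpful. The function names and the example values (SD = 0.20 m/s, ICC = 0.95) are hypothetical and are not taken from any included study.

```python
import math


def sem(sd_baseline, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this sketch assumes an ICC between 0 and 1")
    return sd_baseline * math.sqrt(1.0 - icc)


def mdc90(sd_baseline, icc):
    """Minimal detectable change at 90% confidence: 1.645 x SEM x sqrt(2)."""
    return 1.645 * sem(sd_baseline, icc) * math.sqrt(2.0)


def interpret_icc(icc):
    """Classify reliability using the thresholds stated in the text."""
    if icc >= 0.75:
        return "excellent"
    if icc > 0.40:
        return "acceptable"
    return "poor"


def interpret_correlation(r):
    """Classify the magnitude of a correlation coefficient (sign ignored)."""
    size = abs(r)
    if size >= 0.70:
        return "strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "weak"
    return "negligible"


# Hypothetical inputs: baseline SD of 0.20 m/s and a test-retest ICC of 0.95.
print(round(sem(0.20, 0.95), 3), round(mdc90(0.20, 0.95), 3))
print(interpret_icc(0.95), interpret_correlation(-0.55))
```

With these hypothetical inputs the SEM is roughly 0.04 m/s and the MDC90 roughly 0.10 m/s, so a change in gait speed smaller than about 0.10 m/s could not be distinguished from measurement error at the 90% confidence level.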

Appraisal of study methodology
Figures 2, 3, and 4 summarize critical appraisal results for articles assessing reliability and measurement error, construct validity, and sensitivity to change, respectively. All 11 articles evaluating reliability were rated as very good or adequate. The most prevalent issue was sub-optimal reporting of statistical methods (n = 4; 36%). Of the seven articles reporting on measurement error, all were rated as very good or adequate. The most prevalent issue was sub-optimal reporting of similar testing conditions (n = 2; 29%). Of the 33 articles reporting on construct validity, the number rated as very good, adequate, doubtful, and inadequate was 13 (39%), 3 (9%), 13 (39%), and 4 (12%), respectively. The most prevalent issue was other methodological flaws (n = 15; 45%), including insufficient descriptions of walk test evaluator position, qualifications, or training received. All 3 articles that evaluated sensitivity to change were rated as very good.

Participant and walk test characteristics
The number of articles describing people with acute, subacute and chronic stroke was 2 (5%), 6 (14%) and 21 (49%), respectively. Fourteen studies (33%) included participants in different phases. There were 43 evaluations of walk tests at a comfortable pace and 20 at a fast pace. The position of the evaluator was reported for 15 walk test protocols (24%) as beside (9 protocols (Alzahrani et al., 2009; Cheng et al., 2020; Fulk et al., 2008   2009; Salbach et al., 2001; Stephens & Goldie, 1999)), behind (5 protocols (English et al., 2006; English et al., 2007; Høyer et al., 2014; Ng et al., 2012)), and beside or behind as needed (1 protocol (Salbach et al., 2013)). Use of assistive devices was reported in 28 articles (65%). Eight protocols (13%) allowed physical assistance to walk. In 25 of 30 articles (83%) that named the walk test administered, the convention was to name the walk test according to the timed distance (e.g., for the 10-metre walk test, the time taken to walk 10 m is documented). Supplemental Digital Content 2 includes summaries of participant characteristics across articles and details of the 63 unique protocols for 12 walk tests.
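To make the naming convention concrete, the sketch below computes walking speed from the timed distance only and, separately, the total walkway length once acceleration and deceleration distances are added. The function names and numbers are hypothetical and serve only as an illustration.

```python
def walking_speed(timed_distance_m, time_s):
    """Walking speed (m/s) over the timed portion of the walkway only."""
    if timed_distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return timed_distance_m / time_s


def total_walkway_length(timed_distance_m, acceleration_m, deceleration_m):
    """Space needed for the test: acceleration + timed + deceleration distance."""
    return acceleration_m + timed_distance_m + deceleration_m


# Hypothetical 10-metre walk test: 10 m timed, 2 m to accelerate, 2 m to decelerate.
speed = walking_speed(10.0, 12.5)
length = total_walkway_length(10.0, 2.0, 2.0)
print(f"speed = {speed:.2f} m/s, walkway needed = {length:.1f} m")
```

The distinction matters for space planning: in this hypothetical example a test named for a 10 m timed distance would require 14 m of walkway.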

Effect of walkway length and walking pace
One study (Ng et al., 2012) of 25 participants with chronic stroke did not find significant differences in performance on the 5 m, 8 m, or 10 m walk tests at comfortable or fast pace, indicating these walkway lengths yield similar speeds. Performance at a comfortable pace (mean 0.76-0.79 metres per second (m/s)) was significantly slower than performance at a fast pace (mean 0.97-1.00 m/s) for each walkway length.

Effect of walkway surface
In one study, the effects of walkway surface on 6 mCWT and 6 mFWT performance among 24 people with subacute stroke were examined (Stephens & Goldie, 1999). Participants walked significantly faster on parquetry (hardwood) than on carpet, with a mean difference of 0.05 m/s and 0.03 m/s for the 6 mCWT and 6 mFWT, respectively.

Sensitivity to change
Table 4 presents estimates of sensitivity to change reported in 3 articles. Large effect size (ES) and standardized response mean (SRM) values were observed for the 5 mCWT, and medium ES and large SRM values for the 5 mFWT, 10 mCWT, and 10 mFWT in people with acute and subacute stroke (Ahmed et al., 2003; English et al., 2006; Salbach et al., 2001).
Table 5 summarizes reliability, measurement error, sensitivity to change, and construct validity findings by walk test and recovery phase post-stroke.
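For readers less familiar with these indices, the sketch below shows how ES and SRM are commonly computed from paired baseline and follow-up scores. It assumes the conventional definitions (ES = mean change divided by the SD of baseline scores; SRM = mean change divided by the SD of change scores), which may differ in detail from the computations in the included articles. The gait speeds are hypothetical, and the "trivial" label for values below 0.2 is an assumption not used in this review.

```python
from statistics import mean, stdev


def _changes(baseline, follow_up):
    """Paired change scores (follow-up minus baseline)."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need two paired lists with at least 2 values each")
    return [f - b for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of baseline scores."""
    return mean(_changes(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    changes = _changes(baseline, follow_up)
    return mean(changes) / stdev(changes)


def interpret_magnitude(value):
    """Label magnitude using the 0.2 / 0.5 / 0.8 benchmarks cited in the text."""
    size = abs(value)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "trivial"  # assumed label; the review does not name this range


# Hypothetical comfortable gait speeds (m/s) at admission and discharge.
admission = [0.30, 0.45, 0.50, 0.65, 0.80]
discharge = [0.50, 0.60, 0.75, 0.80, 1.05]
es = effect_size(admission, discharge)
srm = standardized_response_mean(admission, discharge)
print(interpret_magnitude(es), interpret_magnitude(srm))
```

Because the two indices use different denominators, the same data can yield a medium ES and a large SRM, as reported for several tests above.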

Discussion
This novel review provides a comprehensive synthesis of existing literature on measurement properties for distance-limited walk tests in people with stroke. The results are extensive, which makes it challenging to understand how they might inform the selection of a distance-limited walk test to measure gait speed post-stroke in clinical practice. We therefore offer the following framework to guide decision-making that integrates systematic review findings.
The extensive evidence presented in this review can help guide the selection of a distance-limited walk test for clinical use post-stroke based on principles of measurement and generalizability, the influence of protocol elements on performance, and available resources (e.g., space). The first measurement principle guiding selection is an understanding that reliability is a prerequisite of validity (Streiner et al., 2014). One must first choose a walk test protocol that has demonstrated excellent reliability indicated by not only the ICC value, but also the lower limit of the 95% CI, and, secondarily, evidence of construct validity in the 'population of interest'. Walking speed is a temporal-distance parameter of gait, not an abstract concept. Validity evidence increases our understanding of how strongly gait speed relates to impairments, activity limitations and participation restrictions (World Health Organization, 2001), and helps us to appreciate its relevance to human functioning, rehabilitation outcomes, and patient-centered goals.

D.K.-Y. Cheng et al. / Distance-limited walk tests post-stroke
The second principle guiding the selection of a distance-limited walk test for clinical use post-stroke relates to the generalizability of evidence to a particular clinical population (also known as external validity). If one's clinical practice involves communication and/or program evaluation of walk test performance across acute care, and inpatient and outpatient rehabilitation settings (i.e., the care continuum), then ideally one will choose a distance-limited walk test with evidence of excellent reliability and validity in people with acute, subacute, and chronic stroke. If clinical use of walk test performance is limited to a single practice setting, one could select a test that is reliable and valid among patients seen in that setting alone. For clinical practice along the care continuum post-stroke, review findings reveal that the 10 mCWT is the only test with evidence of excellent reliability and construct validity in people with acute, subacute, and chronic stroke. Comfortable gait speed measured using the 10 mCWT consistently relates to balance and strength impairments, and mobility/walking limitations across settings; and participation in activities of daily living, physical activity, and other meaningful activities relevant to the outpatient setting (Lang et al., 2011) in people with chronic stroke. If one's clinical practice is limited to treating people within 6 months post-stroke (acute and subacute phases), then the 5 mCWT is an excellent alternative, particularly for settings that cannot accommodate the 10 mCWT walkway length, given the evidence from this review of excellent reliability of the 5 mCWT and associations between 5 mCWT performance and important physical rehabilitation outcomes, such as motor function and basic mobility. Once reliability and validity evidence in the population of interest and available space have been considered, a tertiary measurement consideration is sensitivity to change, defined as the ability of a measure to detect change in the construct of interest (Cohen, 1977). Effect size/SRM estimates of sensitivity to change were large for the 5 mCWT and medium-to-large for the 10 mCWT in people with acute and subacute stroke (Ahmed et al., 2003; English et al., 2006; Salbach et al., 2001), reflecting the ability of both tests to capture change in walking capacity when individuals are likely participating in rehabilitation (Hall et al., 2018).
Generally, the walk test protocol selected for clinical practice, including instructions, acceleration/deceleration and timed distances, timing method, allowance for evaluator assistance and use of mobility devices, should be identical to the one used in the reliability study supporting its use. Interestingly, review findings support an excellent level of reliability, based on the ICC point estimate and lower 95% CI limit, of diverse walk test protocols that did not allow physical assistance to walk (English et al., 2007; Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Isho & Usuda, 2016; Lam et al., 2010; Peters et al., 2014; Stephens & Goldie, 1999). These protocols included walkways of 3 m (Peters et al., 2014), 5 m (English et al., 2007), 6 m (Lam et al., 2010; Stephens & Goldie, 1999), and 10 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Isho & Usuda, 2016) traversed at a comfortable pace, and 6 m (Stephens & Goldie, 1999) and 10 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012) walked at a fast pace; acceleration/deceleration distances of 2.0 m (Faria et al., 2012; Flansbjer et al., 2005; Hiengkaew et al., 2012; Lam et al., 2010; Peters et al., 2014; Stephens & Goldie, 1999) or 2.5 m (English et al., 2007; Isho & Usuda, 2016); 0 or 1 practice trial and 1 test trial (English et al., 2007; Isho & Usuda, 2016; Lam et al., 2010; Stephens & Goldie, 1999), as well as the mean of 2 or 3 trials (Faria et al., 2012; Flansbjer et al., 2005) or the maximum of 3 trials (Faria et al., 2012); and individuals with variable levels of plantar flexor tone (Hiengkaew et al., 2012) and community ambulation (Peters et al., 2014). It appears that, regardless of the protocol, any standardized distance-limited test to evaluate walking speed in people with stroke not requiring assistance is highly reliable. However, it is important that selected walk tests be compared to tests with the same testing distance and protocol, as results from only one study of people with chronic stroke (Lam et al., 2010) showed that walkway distance did not affect walking speed. One cannot assume that these results apply to people with acute and subacute stroke, populations that are often seen in rehabilitation settings with less stable walking capacity compared to people with chronic stroke (Christensen et al., 2008; Schepers et al., 2006).
Excellent reliability based on ICC magnitude alone was also observed for a small number of walk test protocols (i.e., 5 mCWT, 10 mCWT, and 10 mFWT) in studies of very good or adequate quality that allowed the evaluator to provide physical assistance (Cheng et al., 2020; Fulk & Echternach, 2008; Høyer et al., 2014), with lower 95% CI limits in the acceptable range (Cheng et al., 2020; Fulk & Echternach, 2008). These findings are extremely relevant to acute and inpatient rehabilitation settings in which a substantial proportion of people post-stroke require assistance to walk (Hall et al., 2018). Healthcare professionals in these settings should consider adopting a protocol that allows the evaluator to provide physical assistance at the waist (Cheng et al., 2020; Høyer et al., 2014), but not to advance the lower extremity (Cheng et al., 2020). In fact, in people with acute and subacute stroke walking at slow speeds (e.g., mean ∼0.25 m/s), the reliability of walk test protocols evaluated is excellent and MDC90 values are small (0.07 or 0.12 m/s) (Fulk & Echternach, 2008; Høyer et al., 2014).
This review revealed gaps in the literature. Evidence for the reliability of the 5 mF-, 7 mC-, 7 mF-, 8 mC-, and 12 mCWT, for measurement error of the 5 mF-, 6 mF-, 7 mC-, 7 mF-, 8 mC-, and 12 mCWT, and for the construct validity of the 3 mC-, 6 mF-, and 8 mFWT, ideally across the care continuum, was lacking. Despite recommendations for the use of the 10 mCWT in clinical (Otterman et al., 2017; Sullivan et al., 2013; Teasell et al., 2020) and research (Kwakkel et al., 2017) settings, and its popularity in research studies (Salbach et al., 2014), there was limited research evaluating test-retest reliability and measurement error of this test in people with acute or subacute stroke. Furthermore, while some guidelines promote the 6-metre walk test for neurologic populations (Moore et al., 2018), our review found that evidence for reliability of this test was limited to the subacute stage, and the precision of the estimates is unknown because CIs were not reported (Lam et al., 2010; Stephens & Goldie, 1999). The vast majority of studies included in this review had limited applicability to rehabilitation settings as they enrolled people who walked faster than 0.4 m/s. Studies targeting people who walk slowly and may require assistance to walk, deficits commonly seen in acute care and inpatient rehabilitation settings (Hall et al., 2018), are needed.
This review has some limitations. Due to the extensive literature in this area and finite resources, we were unable to include evidence of validity for all constructs or studies of minimal clinically important change, or to provide a more current review. More recent publications may address some of the gaps we identified. Although only one reviewer completed full text screening, data extraction and critical appraisal, extensive training and verification of data were undertaken. The review was comprehensive given the large number of databases searched and inclusion of any study reporting associations with gait speed for evidence of validity.

Conclusions
The 10 mCWT is the only measure demonstrating excellent reliability and construct validity across the care continuum post-stroke, and sensitivity to change in people with acute and subacute stroke. The 5 mCWT demonstrates excellent reliability, construct validity, and sensitivity to change in acute and subacute phases of stroke recovery. Despite wide variations, the majority of protocols for distance-limited tests have excellent reliability, and evidence of validity indicated by associations with important physical rehabilitation outcomes, even in people who require assistance to walk. Review findings provide guidance for future research and improved quality of reporting.
review. PT is an author on 1 article included in this review. Otherwise, the authors declare no conflicts of interest. Funding for this project is provided by the Canadian Institutes of Health Research, Heart & Stroke Foundation, Canadian Partnership for Stroke Recovery, and Canadian Frailty Network. NMS holds the Toronto Rehabilitation Institute Chair at the University of Toronto.