How many will I need for this study?
Good research studies are planned [1], and at some point in the process investigators ask the question, “How many participants do we need?” Researchers should consider this in relation to their specific question, and Eng [2] stated that it should be settled before the study begins. The decision can be based on several considerations, such as the resources available, the practical issues of conducting the research and time constraints [3]. The sample size can also be estimated formally, a process that allows the researcher to achieve a result that is both statistically significant and clinically important [4]. Sample size estimations are often maligned and misunderstood. They are common in the health care disciplines, but less so elsewhere [3]. They have their supporters [4, 5], their critics [6, 7], and alternatives have been offered [8, 9]. However, their use is unlikely to disappear [3], and for a randomised controlled trial they are essential [10]. This paper will attempt to illustrate how sample size estimates can be used in certain situations and offer the calculations in their simplest form.
So what does a researcher have to do? The determination of sample size has been defined as “the mathematical process of deciding, before a study begins, how many subjects should be studied” [11]. It is the minimum number of participants needed to identify a significant difference, provided one exists [4]. Sample size estimation shares its origins with hypothesis testing [12]. To conduct a sample size estimate, the researcher must set several parameters, drawing on their expertise in the subject area. These are: the significance level (α), the power of the study (1 − β), the difference to be detected and the variability of the measure under observation. In addition, they have to decide whether they are doing a one-sided or a two-sided test. In this paper all tests are two-sided.
Two parameters, significance and power, are often set by convention. The significance level (α) of a study is usually set at 0.05. It corresponds to the probability of making a type I error, rejecting the null hypothesis when it is true. Controlling type I errors helps prevent ineffective treatments from being adopted [13]. The power of a study is usually set at 0.8. Power is the probability of rejecting the null hypothesis when it is false; its complement, β, is the probability of a type II error, failing to reject the null hypothesis when it is false [14]. Biau et al. [13] referred to type I errors as false positives and type II errors as false negatives, and emphasised the importance of sample size estimates because the sample size determines the risk of false negative results. The researcher also needs to estimate the minimum expected difference between the two groups being investigated. This is a subjective parameter, based on clinical judgement and the expertise of the researchers [2]. Lastly, the variability of the outcome measure under investigation must be determined. For a continuous (interval/ratio) variable, a standard deviation is required. An exact value is unlikely to be available before the study [13], so it is usually estimated from preliminary data or from a review of the literature [2].
The actual calculations for sample sizes can be mystifying and frightening when first encountered (Equation 1) [13]. However, investigators have produced friendlier methods, presented as simplified equations, tables [15] or nomograms [16, 17]. Perhaps Lehr [18] offers the easiest way to estimate the number of subjects per group at α = 0.05 and power = 80% (Equation 2), where s is the standard deviation and d is the difference to be detected. The author suggested the formula was easy to commit to memory. If another level of power or significance is required, it is simply a matter of substituting a different number into the equation (Table 1).
n = \frac{2(z_{1-\alpha/2} + z_{1-\beta})^2 \sigma^2}{d^2} + \frac{z_{1-\alpha/2}^2}{4}   (1)

where n is the number of participants per group, σ is the standard deviation, d is the difference to be detected, and z_{1-\alpha/2} and z_{1-\beta} are the standard normal values corresponding to the chosen significance level and power.
n = \frac{16 s^2}{d^2}   (2)
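For readers who prefer to compute these directly, the following is a minimal Python sketch of Equations 1 and 2; the function names are illustrative only and the sketch assumes the SciPy library is available for the standard normal values.

import math
from scipy.stats import norm

def n_per_group(sd, diff, alpha=0.05, power=0.80):
    # Equation 1: participants per group for comparing two means (two-sided test)
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 when alpha = 0.05
    z_beta = norm.ppf(power)            # 0.84 when power = 0.80
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / diff ** 2 + z_alpha ** 2 / 4
    return math.ceil(n)                 # always round up

def n_per_group_lehr(sd, diff):
    # Equation 2: Lehr's quick estimate at alpha = 0.05 and power = 80%
    return math.ceil(16 * sd ** 2 / diff ** 2)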
Example:
Campbell et al. [15] offered an example in which clinicians were looking for a 5 mmHg change in blood pressure, with a standard deviation of 17 mmHg. The calculation set power to 0.80 and significance to 0.05. Equation 1 gives a value of 183 participants per group after rounding up (Equation 3). The Lehr [18] equation gives a value of 185 participants per group after rounding up (Equation 4). It is always a good idea to round a sample size estimate up, as rounding down would technically leave the study short of participants. The estimates produced by the two equations are similar.
n = \frac{2(1.96 + 0.84)^2 \times 17^2}{5^2} + \frac{1.96^2}{4} = 182.2 \rightarrow 183   (3)
n = \frac{16 \times 17^2}{5^2} = 184.96 \rightarrow 185   (4)
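As a check on the arithmetic, the same numbers can be run through a few lines of standalone Python; this is a sketch using only the values stated in the example above.

import math
sd, diff = 17.0, 5.0                    # standard deviation and difference, mmHg
z_alpha, z_beta = 1.96, 0.84            # alpha = 0.05 (two-sided), power = 0.80
eq3 = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / diff ** 2 + z_alpha ** 2 / 4
eq4 = 16 * sd ** 2 / diff ** 2          # Lehr's approximation
print(math.ceil(eq3), math.ceil(eq4))   # prints: 183 185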
Not all studies seek to compare two groups or two treatments. Sometimes the objective is to describe a given characteristic in a specific population. In this type of design, estimating the sample size is important because it determines how precise the estimate can be. If a researcher is prepared to declare an acceptable margin of error, equal to half the width of a 95% confidence interval, the sample size can be estimated using Equation 5.
n = \frac{4\sigma^2}{m^2}   (5)

where σ is the standard deviation, m is the acceptable margin of error (half the width of the 95% confidence interval), and the 4 comes from rounding the 95% standard normal value (1.96) up to 2 and squaring it.
Again using the data from Campbell et al. [15], if a margin of error of 5 mmHg is considered acceptable, Equation 6 estimates that 47 participants are needed.
n = \frac{4 \times 17^2}{5^2} = 46.24 \rightarrow 47   (6)
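The precision-based estimate can be checked in the same way; this is a minimal sketch using the values above.

import math
sd, margin = 17.0, 5.0              # standard deviation and margin of error, mmHg
n = 4 * sd ** 2 / margin ** 2       # Equation 5: margin = half-width of the 95% CI
print(math.ceil(n))                 # prints: 47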
Sample size calculations have been an important development in statistics over the last 30 years [9, 19]. Their use has been beneficial, increasing the median sample size of studies roughly 100-fold [9]. More participants increase power and reduce the chance of failing to detect an effect when one exists [4]. The use of sample size estimates has been criticised on philosophical grounds [6] and because the variability measures they rely on are often determined inaccurately; if the variability is underestimated, the study will be underpowered [13]. Nevertheless, studies such as randomised controlled trials demand their use, and they do focus attention on correct planning. When they are used, the estimates should be carried out on the primary outcome measure, which is decided a priori [9]. Lastly, put time and effort into determining the variability of the outcome measure; getting it wrong will drastically affect the study [3].
References
1. Lenth R (2001) Some practical guidelines for effective sample size determination. The American Statistician 55: 187-193.
2. Eng J (2003) Sample size estimation: How many individuals should be studied? Radiology 227: 309-313.
3. Norman G, Monteiro S (2012) Sample size calculations: Should the emperor’s clothes be off the peg or made to measure? BMJ 345: e5278.
4. Burmeister E, Aitken LM (2012) Sample size: How many is enough? Australian Critical Care 25: 271-274.
5. Sim J, Lewis M (2012) The size of a pilot study for a clinical trial should be calculated in relation to considerations of precision and efficiency. J Clin Epidemiol 65: 301-308.
6. Bacchetti P (2010) Current sample size conventions: Flaws, harms, and alternatives. BMC Medicine 8: 17.
7. Bacchetti P, Wolf LE, Segal MR, McCulloch CE (2005) Ethics and sample size. Am J Epidemiol 161: 105-110.
8. Bacchetti P, McCulloch CE, Segal MR (2008) Simple, defensible sample sizes based on cost efficiency. Biometrics 64(2): 577-594.
9. Bland JM (2009) The tyranny of power: Is there a better way to calculate sample size? BMJ 339: 1133-1135.
10. Schulz KF, Altman DG, Moher D (2010) CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. BMC Medicine 8: 18.
11. Porta M, Last J, editors (2008) A Dictionary of Epidemiology. New York: Oxford University Press.
12. Descoteaux J (2007) Statistical power: An historical introduction. Tutorials in Quantitative Methods for Psychology 3(2): 28-34.
13. Biau D, Kerneis S, Porcher R (2008) The importance of sample size in the planning and interpretation of medical research. Clin Orthop Relat Res 466: 2282-2288.
14. Everitt BS (1995) Dictionary of Statistics in Medical Sciences. Cambridge: Cambridge University Press.
15. Campbell MJ, Julious SA, Altman DG (1995) Estimating sample sizes for binary, ordered categorical and continuous outcomes in two group comparisons. BMJ 311: 1145-1148.
16. Altman DG (1991) Practical Statistics for Medical Research. London: Chapman & Hall.
17. Whitley E, Ball J (2002) Statistics review 4: Sample size calculations. Critical Care 6: 335-341.
18. Lehr R (1992) Sixteen s-squared over d-squared: A relation for crude sample size estimates. Statistics in Medicine 11: 1099-1102.
19. Corty EW, Corty RW (2011) Setting the sample size to ensure narrow confidence intervals for precise estimation of population values. Nursing Research 60(2): 148-153.
Figures and Tables
Table 1. Values to substitute for 16 in Lehr’s equation (Equation 2) at other levels of power and significance

Power | Alpha level
      | 0.01 | 0.05 | 0.10
0.80  | 23.5 | 16   | 12.5
0.90  | 30   | 21   | 17.5
0.95  | 36   | 26   | 22
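The entries in Table 1 are, to within rounding, the quantity 2(z_{1-α/2} + z_{1-β})², which replaces the 16 in Lehr’s equation; the table rounds them up slightly for ease of use. A short Python sketch (assuming SciPy is available) regenerates the underlying values.

from scipy.stats import norm

# Multipliers for Lehr's equation: 2 * (z_alpha + z_beta)^2
for power in (0.80, 0.90, 0.95):
    row = [round(2 * (norm.ppf(1 - alpha / 2) + norm.ppf(power)) ** 2, 1)
           for alpha in (0.01, 0.05, 0.10)]
    print(power, row)
# prints: 0.8 [23.4, 15.7, 12.4]
#         0.9 [29.8, 21.0, 17.1]
#         0.95 [35.6, 26.0, 21.6]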