
How many will I need for this study?

Good research studies are planned [1], and at some point in the process investigators ask, “How many participants do we need?” Researchers should consider this in relation to their specific question. Eng [2] stated that it should be settled before beginning the study. The decision can rest on several considerations, such as resources, the practical issues of conducting the research and time constraints [3]. The sample size can also be estimated formally, a process that gives the researcher a reasonable chance of obtaining a result that is both statistically significant and clinically important [4]. Sample size estimations are often maligned and misunderstood. They are common in health care disciplines, but not in other disciplines [3]. They have their supporters [4, 5] and their critics [6, 7], and alternatives have been offered [8, 9]. However, their use is unlikely to disappear [3], and they are essential when conducting a randomised controlled trial [10]. This paper attempts to illustrate how sample size estimates can be used in certain situations and offers the calculations in their simplest form.

So what does a researcher have to do? The determination of sample size has been defined as “The mathematical process of deciding, before a study begins, how many subjects should be studied” [11]. It is the minimum number of participants needed to identify a significant difference, provided it exists [4]. Sample size estimation shares its origins with hypothesis testing [12]. To conduct a sample size estimate, the researcher must decide on some parameters using their expertise in the subject area. These are: the significance level (α), the power of the study (1 − β), the difference to be detected and the variability of the measure under observation. In addition, they have to decide whether they are doing a one-sided or a two-sided test. In this paper all tests are two-sided.

Two parameters, significance and power, are often set by convention. The significance level (α) of the study is mostly set at 0.05. It corresponds to the probability of making a type I error, that is, rejecting the null hypothesis when it is true. Controlling type I errors limits the adoption of ineffective treatments [13]. The power of a study is usually set at 0.8. Power is the probability of correctly rejecting the null hypothesis when it is false; it equals 1 − β, where β is the probability of a type II error (failing to reject a false null hypothesis) [14]. Biau et al. [13] referred to type I errors as false positives and type II errors as false negatives. They also emphasised the importance of sample size estimates, as the sample size determines the risk of false negative results. The researcher also needs to estimate the minimum expected difference between the two groups being investigated. This is a subjective parameter, and it will be based on clinical judgement and the expertise of the researchers [2]. Lastly, the variability of the outcome measure under investigation must be determined. For a continuous variable (interval/ratio), a standard deviation is required. Its true value is unlikely to be available prior to the study [13], so it is usually estimated from preliminary data or from a review of the literature [2].
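The conventional choices of α and power enter the sample size formulas as quantiles of the standard normal distribution. The following is a minimal sketch, assuming a two-sided test and using only Python's standard library; the function name `z_values` is illustrative and does not come from the cited sources.

```python
from statistics import NormalDist


def z_values(alpha=0.05, power=0.80):
    """Return the standard normal quantiles for a two-sided alpha and for power."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = std_normal.inv_cdf(power)           # power = 1 - beta
    return z_alpha, z_beta


z_alpha, z_beta = z_values()
print(round(z_alpha, 2), round(z_beta, 2))  # 1.96 0.84
```

Changing `alpha` or `power` shows how a stricter significance level or a higher power leads to larger quantiles, and therefore to larger sample sizes.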

The actual calculations for sample sizes can be mystifying and frightening when first encountered (Equation 1) [13]. However, investigators have produced friendlier methods, which are presented as equations, tables [15] or nomograms [16, 17]. Perhaps Lehr [18] offers the easiest way to estimate the number of subjects per group at p < 0.05 and power = 80% (Equation 2), where s is the standard deviation and d is the difference to be detected. The author suggested the formula was easy to commit to memory. If another level of power or significance is required, it is simply a matter of substituting a different number for 16 in the equation (Table 1).

$$n = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^{2}}{d^{2}} + 0.25 \times Z_{1-\alpha/2}^{2} \tag{1}$$
where $Z_{1-\alpha/2} = 1.96$ at P = 0.05, $Z_{1-\beta} = 0.84$ at a power of 0.8, and $d$ is the targeted standardised difference between means (expected mean difference/standard deviation).
$$n = \frac{16 \times s^{2}}{d^{2}} \tag{2}$$
where $s$ is the standard deviation and $d$ is the difference to be detected.

Example:

Campbell et al. [15] offered an example where clinicians were looking for a 5 mmHg change in blood pressure, using a standard deviation of 17 mmHg. The calculation set power to 0.80 and significance to 0.05. Equation 1 gives a value of 183 participants per group after rounding up (Equation 3). The Lehr [18] equation gives a value of 185 participants per group after rounding up (Equation 4). It is always a good idea to round a sample size estimate up, as rounding down leaves the study short of the estimated number of participants. The estimates produced by the two equations are similar.

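$$n = \left\lceil \frac{2 \times (1.96 + 0.84)^{2}}{(5/17)^{2}} + 0.25 \times 1.96^{2} \right\rceil = 183 \tag{3}$$
$$n = \left\lceil \frac{16 \times 17^{2}}{5^{2}} \right\rceil = 185 \tag{4}$$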
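The two calculations in this example can be checked with a short script. This is a sketch under the assumptions stated in the text (two-sided test, Z values of 1.96 and 0.84); the function names are illustrative rather than taken from the cited papers.

```python
import math


def n_per_group_eq1(diff, sd, z_alpha=1.96, z_beta=0.84):
    """Participants per group from Equation 1, rounded up."""
    d = diff / sd  # standardised difference
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + 0.25 * z_alpha ** 2
    return math.ceil(n)


def n_per_group_lehr(diff, sd, coefficient=16):
    """Participants per group from Lehr's approximation (Equation 2), rounded up."""
    return math.ceil(coefficient * sd ** 2 / diff ** 2)


# Blood pressure example: 5 mmHg difference, standard deviation 17 mmHg
print(n_per_group_eq1(5, 17))   # 183
print(n_per_group_lehr(5, 17))  # 185
```

The `coefficient` argument can be replaced by another value from Table 1 when a different power or significance level is wanted.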

Not all studies seek to compare two groups or two treatments. Sometimes the objective is to describe a given characteristic in a specific population. In this type of design, estimating the sample size is important because it determines how precise the estimate can be. If a researcher is prepared to declare an acceptable margin of error, the sample size can be estimated using Equation 5. The margin of error is equal to half the width of a 95% confidence interval.

$$n = \frac{4 s^{2}}{d^{2}} \tag{5}$$
where $s$ is the standard deviation and $d$ is the declared acceptable margin of error.

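Again using the data from Campbell et al. [15] (standard deviation 17 mmHg), suppose a margin of error of 5 mmHg is judged acceptable, that is, a 95% confidence interval extending 5 mmHg either side of the estimate. Equation 6 estimates that 47 participants are needed, after rounding up.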

$$n = \left\lceil \frac{4 \times 17^{2}}{5^{2}} \right\rceil = 47 \tag{6}$$
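The precision-based estimate is equally easy to script. The sketch below assumes the formula of Equation 5, in which the factor 4 is a rounded form of 1.96² (about 3.84) for a 95% confidence interval; `n_for_margin` is a hypothetical helper name.

```python
import math


def n_for_margin(sd, margin):
    """Participants needed for a given 95% margin of error (Equation 5), rounded up."""
    return math.ceil(4 * sd ** 2 / margin ** 2)


print(n_for_margin(17, 5))    # 47
print(n_for_margin(17, 2.5))  # 185
```

Halving the acceptable margin of error roughly quadruples the required sample, which is worth bearing in mind when deciding how precise the estimate really needs to be.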

Sample size calculations have been an important development in statistics over the last 30 years [9, 19]. Their use has been beneficial, increasing the median sample size in studies 100-fold [9]. More participants increase power and reduce the chance of not detecting an effect when an effect does exist [4]. Sample size estimates have been criticised philosophically [6], and for the inaccurate determination of the variability measures used. If the variability is underestimated, the study will be underpowered [13]. Nevertheless, carrying out studies such as randomised controlled trials demands their use, and they do focus attention on correct planning. When they are used, the estimates should be carried out on the primary outcome measure, which is decided a priori [9]. Lastly, put time and effort into deciding the variability of the outcome measure, because getting it wrong will drastically affect the study [3].
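The effect of underestimating variability can be illustrated numerically. The sketch below relies on a normal approximation to the two-group comparison of means, which is an assumption made here for illustration and not a method given in the cited sources. The helper name `approx_power` and the larger standard deviation of 20 mmHg are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, diff, sd, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation; ignores the negligible chance of rejecting
    the null hypothesis in the wrong direction.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_group)  # SE of the difference in means
    return std_normal.cdf(abs(diff) / standard_error - z_alpha)


planned = approx_power(185, diff=5, sd=17)  # SD assumed at the planning stage
actual = approx_power(185, diff=5, sd=20)   # hypothetical larger true SD
print(round(planned, 2), round(actual, 2))
```

With these inputs the planned power is about 0.81, but it falls to about 0.67 if the true standard deviation turns out to be 20 mmHg rather than 17 mmHg.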

References

[1] Lenth R (2001). Some practical guidelines for effective sample size determination. The American Statistician 55: 187-193.

[2] Eng J (2003). Sample size estimation: How many individuals should be studied? Radiology 227: 309-313.

[3] Norman G, Monteiro S (2012). Sample size calculations: Should the emperor’s clothes be off the peg or made to measure? BMJ 345: e5278.

[4] Burmeister E, Aitken LM (2012). Sample size: How many is enough? Australian Critical Care 25: 271-274.

[5] Sim J, Lewis M (2012). The size of a pilot study for a clinical trial should be calculated in relation to considerations of precision and efficiency. J Clin Epidemiol 65: 301-308.

[6] Bacchetti P (2010). Current sample size conventions: Flaws, harms, and alternatives. BMC Medicine 8: 17.

[7] Bacchetti P, Wolf LE, Segal MR, McCulloch CE (2005). Ethics and sample size. Am J Epidemiol 161: 105-110.

[8] Bacchetti P, McCulloch CE, Segal MR (2008). Simple, defensible sample sizes based on cost efficiency. Biometrics 64(2): 577-594.

[9] Bland JM (2009). The tyranny of power: Is there a better way to calculate sample size? BMJ 339: 1133-1135.

[10] Schulz KF, Altman DG, Moher D (2010). CONSORT statement: Updated guidelines for reporting parallel group randomized trials. BMC Medicine 8: 18.

[11] Porta M, Last J, editors (2008). A dictionary of epidemiology. New York: OUP.

[12] Descoteaux J (2007). Statistical power: An historical introduction. Trends in Quantitative Methods for Psychology 3(2): 28-34.

[13] Biau D, Kerneis S, Porcher R (2008). The importance of sample size in the planning and interpretation of medical research. Clin Orthop Relat Res 466: 2282-2288.

[14] Everitt BS (1995). Dictionary of Statistics in Medical Sciences. Cambridge: Cambridge University Press.

[15] Campbell MJ, Julious SA, Altman D (1995). Estimating sample sizes for binary, ordered categorical and continuous outcomes in two group comparisons. BMJ 311: 1145-1148.

[16] Altman DG (1991). Practical statistics for medical research. London: Chapman & Hall.

[17] Whitley E, Ball J (2002). Statistics review: Sample size calculations. Critical Care 6: 335-341.

[18] Lehr R (1992). Sixteen s-squared over d-squared: A relation for crude sample size estimates. Statistics in Medicine 11: 1099-1102.

[19] Corty EW, Corty RW (2011). Setting the sample size to ensure narrow confidence intervals for precise estimation of population values. Nursing Research 60(2): 148-153.

Figures and Tables

Table 1

Coefficients to substitute for 16 for different significance and power specifications [18]

Power    Alpha = 0.01    Alpha = 0.05    Alpha = 0.10
0.80     23.5            16              12.5
0.90     30              21              17.5
0.95     36              26              22
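The coefficients in Table 1 correspond to the leading term of Equation 1, 2 × (Z1−α/2 + Z1−β)², written for an unstandardised difference. As a rough check, the sketch below recomputes them from the standard normal distribution; the function name `lehr_coefficient` is illustrative, and the computed values are close to, but not identical with, the rounded figures in the table.

```python
from statistics import NormalDist


def lehr_coefficient(alpha, power):
    """Unrounded coefficient 2 * (z_alpha + z_beta)^2 for a two-sided test."""
    z = NormalDist().inv_cdf
    return 2 * (z(1 - alpha / 2) + z(power)) ** 2


for power in (0.80, 0.90, 0.95):
    row = [round(lehr_coefficient(alpha, power), 1) for alpha in (0.01, 0.05, 0.10)]
    print(power, row)
```

For example, the unrounded coefficient for α = 0.05 and power = 0.80 is about 15.7, which Lehr's rule rounds up to the more memorable 16.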