TY - JOUR
T1 - A myriad of methods
T2 - Calculated sample size for two proportions was dependent on the choice of sample size formula and software
AU - Bell, Melanie L.
AU - Teixeira-Pinto, Armando
AU - McKenzie, Joanne E.
AU - Olivier, Jake
N1 - Funding Information:
A.T.-P. was supported by the Australian National Health and Medical Research Council Grant 402764 to the Screening and Test Evaluation Program.
PY - 2014/5
Y1 - 2014/5
N2 - Objectives: Several methods exist to calculate sample size for the difference of proportions (risk difference). Researchers are often unaware that different formulae exist, that these rest on different underlying assumptions, and that the choice of formula affects the calculated sample size. The aim of this study was to discuss and compare different sample size formulae for the risk difference. Study Design and Setting: Four sample size formulae were used to calculate sample size for nine scenarios. Software documentation for SAS, Stata, G*Power, PASS, StatXact, and several R libraries was searched for default assumptions. Each package was used to calculate sample size for two scenarios. Results: We demonstrate that, for a given set of parameters, the calculated sample size can vary by as much as 60% depending on the formula used. Varying software and assumptions yielded discrepancies of 78% and 7% between the smallest and largest calculated sizes, respectively. Discrepancies were most pronounced when powering for large risk differences. Default assumptions varied considerably between software packages and were not clearly documented. Conclusion: Researchers should be aware of the assumptions underlying the power calculations made by different statistical software packages. Assumptions should be stated explicitly in grant proposals and manuscripts and should match the proposed analyses.
AB - Objectives: Several methods exist to calculate sample size for the difference of proportions (risk difference). Researchers are often unaware that different formulae exist, that these rest on different underlying assumptions, and that the choice of formula affects the calculated sample size. The aim of this study was to discuss and compare different sample size formulae for the risk difference. Study Design and Setting: Four sample size formulae were used to calculate sample size for nine scenarios. Software documentation for SAS, Stata, G*Power, PASS, StatXact, and several R libraries was searched for default assumptions. Each package was used to calculate sample size for two scenarios. Results: We demonstrate that, for a given set of parameters, the calculated sample size can vary by as much as 60% depending on the formula used. Varying software and assumptions yielded discrepancies of 78% and 7% between the smallest and largest calculated sizes, respectively. Discrepancies were most pronounced when powering for large risk differences. Default assumptions varied considerably between software packages and were not clearly documented. Conclusion: Researchers should be aware of the assumptions underlying the power calculations made by different statistical software packages. Assumptions should be stated explicitly in grant proposals and manuscripts and should match the proposed analyses.
KW - Binary
KW - Continuity correction
KW - Difference in proportions
KW - Power
KW - Risk difference
KW - Sample size
KW - Statistical software
UR - http://www.scopus.com/inward/record.url?scp=84897479934&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84897479934&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2013.10.008
DO - 10.1016/j.jclinepi.2013.10.008
M3 - Article
C2 - 24439070
AN - SCOPUS:84897479934
SN - 0895-4356
VL - 67
SP - 601
EP - 605
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 5
ER -