Power difference in a χ² test vs generalized linear mixed model in the presence of missing data – a simulation study



Abstract

Background: Longitudinal randomized controlled trials (RCTs) often aim to test and measure the effect of treatment between arms at a single time point. A two-sample χ² test is a common statistical approach when outcome data are binary. However, this test uses only complete outcomes. Missing responses are common in longitudinal RCTs, and analyzing only complete data may reduce power and bias the estimates. Generalized linear mixed models (GLMMs) with a random intercept can be used to test and estimate the treatment effect, which may increase power and reduce bias.

Methods: We simulated longitudinal binary RCT data to compare the performance of a complete case χ² test to a GLMM in terms of power, type I error, relative bias, and coverage under different missing data mechanisms (missing completely at random and missing at random). We considered how the baseline probability of the event, within-subject correlation, and dropout rates under the various missing data mechanisms affected each performance measure.

Results: When outcome data were missing completely at random, both the χ² test and the GLMM produced unbiased estimates; however, the GLMM returned an absolute power gain of up to 12.0% compared with the χ² test. When outcome data were missing at random, the GLMM yielded an absolute power gain of up to 42.7%, and its estimates were unbiased or less biased than those of the χ² test.

Conclusions: Investigators wishing to test for a treatment effect between arms in longitudinal RCTs with binary outcome data in the presence of missing data should use a GLMM rather than a complete case χ² test to gain power and produce minimally biased estimates.
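The simulation design described in the Methods can be illustrated with a small sketch. It is not the authors' code: the function names, the two-visit layout, the dropout rule, and all parameter values are assumptions chosen for illustration. It generates binary outcomes from a random-intercept logistic model, removes final-visit outcomes with a probability that depends on the observed interim outcome (missing at random; equal dropout probabilities give missing completely at random), and estimates the rejection rate of the complete case χ² test only. Fitting the GLMM is left out because it needs a mixed-model library.

```python
import math
import random


def chi2_2x2_pvalue(a, b, c, d):
    """Pearson chi-square p-value (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 1.0  # degenerate table: treat as no evidence against the null
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def complete_case_rejects(rng, n_per_arm, p_control, odds_ratio, sd_intercept,
                          drop_if_event, drop_if_no_event, alpha=0.05):
    """Simulate one two-visit trial; test arms at the final visit on completers."""
    # p_control is the control-arm event probability for a subject whose
    # random intercept is zero (a conditional, not marginal, probability).
    base = math.log(p_control / (1.0 - p_control))
    table = []
    for arm in (0, 1):
        events = non_events = 0
        shift = arm * math.log(odds_ratio)
        for _ in range(n_per_arm):
            u = rng.gauss(0.0, sd_intercept)  # induces within-subject correlation
            p = expit(base + shift + u)
            y_interim = rng.random() < p
            y_final = rng.random() < p
            # Dropout depends only on the observed interim outcome (MAR).
            p_drop = drop_if_event if y_interim else drop_if_no_event
            if rng.random() < p_drop:
                continue  # final outcome missing, so the subject is excluded
            if y_final:
                events += 1
            else:
                non_events += 1
        table.append((events, non_events))
    (a, b), (c, d) = table
    return chi2_2x2_pvalue(a, b, c, d) < alpha


def rejection_rate(n_sims, seed, **design):
    """Share of simulated trials in which the complete case test rejects."""
    rng = random.Random(seed)
    hits = sum(complete_case_rejects(rng, **design) for _ in range(n_sims))
    return hits / n_sims


# Hypothetical design values, not taken from the study.
design = dict(n_per_arm=100, p_control=0.3, sd_intercept=1.0,
              drop_if_event=0.4, drop_if_no_event=0.1)
type_i = rejection_rate(300, seed=1, odds_ratio=1.0, **design)  # null: no effect
power = rejection_rate(300, seed=1, odds_ratio=3.0, **design)   # true effect
print(f"type I error ~ {type_i:.3f}, power ~ {power:.3f}")
```

Setting `odds_ratio=1.0` estimates type I error, and any other value estimates power. A comparison like the one in the study would fit a random-intercept logistic model to both visits of every simulated trial and record how often it rejects under the same design.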
Date made available: 2020
