12 Additional Considerations
12.1 Minimizing Sample Size
Replace categorical outcomes with continuous variables, e.g. hypertension diagnosis \(\rightarrow\) mean BP. A continuous outcome often requires a smaller N.
Substitute a more precise outcome e.g.
- assay with smaller SD
- make measurements in duplicate or triplicate and analyze their mean; averaging \(k\) replicates divides the measurement-error SD by \(\sqrt{k}\), so SD(mean) \(<\) SD(x)
Reduce variability with an alternate study design.
- Paired studies where each patient is own control
- Matching cases and controls for age, gender, severity
- Matched samples reduce variability, SD, and sample size
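To illustrate how replicate measurements can shrink the required N, here is a minimal sketch in R. It assumes the outcome SD splits into a between-subject part and a measurement-error part; the values `sd_between = 0.8`, `sd_error = 0.6`, and the effect size of 0.2 are hypothetical and chosen only so that a single reading has SD = 1. Averaging replicates reduces only the measurement-error component.

```r
# Hypothetical variance components (assumptions, not taken from these notes)
sd_between <- 0.8   # true between-subject SD
sd_error   <- 0.6   # measurement (assay) error SD of a single reading

# SD of a subject's mean of k replicate readings:
# only the error variance is divided by k
sd_of_mean <- function(k) sqrt(sd_between^2 + sd_error^2 / k)

k <- 1:3
n_per_group <- sapply(k, function(kk) {
  power.t.test(delta = 0.2, sd = sd_of_mean(kk),
               sig.level = 0.05, power = 0.8)$n
})

replicates <- data.frame(k = k,
                         sd = sd_of_mean(k),
                         n_per_group = ceiling(n_per_group))
print(replicates)
```

The gain levels off quickly: once measurement error is small relative to between-subject variability, additional replicates buy little.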
12.1.1 Paired Study Design
- Revise asthma protocol with both treatments on same subject
- Now a one-sample t-test on differences d
- \(H_0\): d = 0
- \(H_1\): d \(\neq\) 0
- \(sd_d\) = SD of the difference, here 1 L. Since this quantity is rarely found in the literature, you may need to estimate it from \(\sqrt{\sigma_1^2 + \sigma_2^2 - 2 \cdot \rho \cdot \sigma_1 \cdot \sigma_2}\), where \(\sigma_i\) = standard deviation of the \(i^{th}\) measurement, and \(\rho\) is the Pearson correlation between them.
power.t.test(n = NULL, delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8, alternative = "two.sided",
type = "one.sample")
One-sample t test power calculation
n = 198.2
delta = 0.2
sd = 1
sig.level = 0.05
power = 0.8
alternative = two.sided
Compare the required sample size from the paired design with the unpaired. Is a paired design appropriate?
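One way to approach this comparison is sketched below. It assumes both measurements have the same SD (`sigma = 1` L) and tries several values of the within-subject correlation `rho`; these values and the helper name `sd_diff` are illustrative assumptions. The helper uses the standard formula for the SD of a difference, \(\sqrt{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}\).

```r
# SD of the paired difference from the two SDs and their correlation
sd_diff <- function(sd1, sd2, rho) sqrt(sd1^2 + sd2^2 - 2 * rho * sd1 * sd2)

sigma <- 1     # assumed SD of each measurement (L)
delta <- 0.2   # difference to detect (L), as in the call above

# Unpaired design: subjects needed per group
n_unpaired <- power.t.test(delta = delta, sd = sigma,
                           sig.level = 0.05, power = 0.8)$n

# Paired design: subjects needed, for several assumed correlations
rho <- c(0, 0.25, 0.5, 0.75)
n_paired <- sapply(rho, function(r) {
  power.t.test(delta = delta, sd = sd_diff(sigma, sigma, r),
               sig.level = 0.05, power = 0.8, type = "paired")$n
})

comparison <- data.frame(rho = rho,
                         sd_d = sd_diff(sigma, sigma, rho),
                         n_paired = ceiling(n_paired),
                         n_unpaired_total = 2 * ceiling(n_unpaired))
print(comparison)
```

Under these assumptions, `rho = 0.5` gives \(sd_d\) = 1 and reproduces the n = 198.2 shown above. The advantage of pairing grows with the correlation and largely disappears as `rho` approaches 0.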
12.2 Alternate Strategies
More than 2 groups or predictors:
- Use an ANOVA sample size calculator
- Regression-based (F-test) power calculators
- Choose the most important pair-wise comparisons
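As a rough sketch of the first and third options, base R's `power.anova.test` handles a balanced one-way ANOVA. The three group means and the within-group SD below are hypothetical, and the Bonferroni adjustment for the pairwise comparison is one possible choice rather than a requirement.

```r
# Hypothetical group means and within-group SD (assumptions for illustration)
group_means <- c(0, 0.2, 0.4)
within_sd   <- 1

# Option 1: overall F-test in a balanced one-way ANOVA
anova_res <- power.anova.test(groups = length(group_means),
                              between.var = var(group_means),
                              within.var = within_sd^2,
                              sig.level = 0.05, power = 0.8)
n_anova <- ceiling(anova_res$n)   # subjects per group

# Option 3: power the most important pairwise comparison (largest difference),
# here with a Bonferroni-adjusted alpha for 3 possible pairs
pair_res <- power.t.test(delta = max(group_means) - min(group_means),
                         sd = within_sd,
                         sig.level = 0.05 / 3, power = 0.8)
n_pair <- ceiling(pair_res$n)     # subjects per group

cat("ANOVA n per group:   ", n_anova, "\n")
cat("Pairwise n per group:", n_pair, "\n")
```

The two answers address different hypotheses (any difference among groups versus one specific contrast), so they need not agree.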
Although equal group sizes maximize study power, unequal allocation is sometimes more practical:
- In a cohort study, it may be cheaper or easier to recruit controls
- In a case-control study, disease may be rare, but controls abundant
- For 1:k randomization or a 1:k case-control design:
- Perform the sample size calculation for 2 equal groups, giving n per group (n = # cases under 1:1 allocation)
- \(n' = \frac{n \cdot (k+1)}{2 \cdot k}\) = # of cases in the 1:k study design, with \(k \cdot n'\) controls
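The adjustment can be tabulated in a few lines. The starting value `n = 100` per group is hypothetical, and `cases_needed` is an illustrative helper name.

```r
# Cases needed in a 1:k design, given n per group from the equal-allocation calculation
cases_needed <- function(n, k) n * (k + 1) / (2 * k)

n <- 100    # hypothetical per-group n from a 1:1 calculation
k <- 1:4    # controls per case

cases    <- ceiling(cases_needed(n, k))
controls <- k * cases

allocation <- data.frame(k = k, cases = cases, controls = controls,
                         total = cases + controls)
print(allocation)
```

The number of cases falls as k increases but can never drop below n/2, while the total sample size keeps growing, so large values of k give diminishing returns.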
Computer Simulations:
- Monte Carlo simulation studies: simulate thousands of realistic datasets with different sample and effect sizes
- Bootstrap resampling: treat the pilot data as the population and sample from it with replacement; in each sample, some observations may appear many times and others not at all. Create and analyze 1000-2000 such data sets to test the proposed analysis plan.
- Data simulations are not a casual undertaking. Creating realistic data takes effort and may lead to independent, publishable studies
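A deliberately simple sketch of both ideas follows, using a two-sample t-test on normally distributed outcomes. The distributions, effect size, sample sizes, and the function names `sim_power` and `boot_power` are assumptions for illustration; the "pilot" data here are themselves simulated as a stand-in for real pilot data. A realistic simulation would mimic the actual data structure and analysis plan.

```r
set.seed(1)  # for reproducibility

# Monte Carlo: simulate many datasets under an assumed model and
# record how often the planned test rejects H0
sim_power <- function(n, delta, sd = 1, n_sim = 2000, alpha = 0.05) {
  p_values <- replicate(n_sim, {
    control   <- rnorm(n, mean = 0,     sd = sd)
    treatment <- rnorm(n, mean = delta, sd = sd)
    t.test(treatment, control, var.equal = TRUE)$p.value
  })
  mean(p_values < alpha)
}

# Bootstrap: resample with replacement from pilot data instead of a model
boot_power <- function(pilot_trt, pilot_ctl, n, n_boot = 1000, alpha = 0.05) {
  p_values <- replicate(n_boot, {
    trt <- sample(pilot_trt, n, replace = TRUE)
    ctl <- sample(pilot_ctl, n, replace = TRUE)
    t.test(trt, ctl)$p.value
  })
  mean(p_values < alpha)
}

# Simulated power vs. the analytic answer for the same scenario
est      <- sim_power(n = 100, delta = 0.4)
analytic <- power.t.test(n = 100, delta = 0.4, sd = 1)$power

# Hypothetical stand-in for pilot data (replace with real observations)
pilot_ctl <- rnorm(30, mean = 0,   sd = 1)
pilot_trt <- rnorm(30, mean = 0.4, sd = 1)
boot_est  <- boot_power(pilot_trt, pilot_ctl, n = 100)

cat("Simulated power:", est, " Analytic power:", round(analytic, 3), "\n")
cat("Bootstrap power estimate:", boot_est, "\n")
```

In a simple case like this the simulation should agree closely with `power.t.test`, which is a useful check before moving to designs with no closed-form calculation. The bootstrap estimate inherits the sampling error of the pilot data, so it can differ noticeably from both.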
12.3 Homework Exercise
In the most recent Canadian Health Measures Survey from Statistics Canada, the prevalence of obesity in Canadian children was 13.0%, and their mean BMI z-score was 0.27 (CI = –0.39 to 1.23). If you need an SD for the latter, you may use the fact that z-scores are designed to have an SD of 1. Additional data may be found in CMAJ, 2016.
Based on your clinical experience, how do you think Manitoba children fare in comparison? Design a study to test your hypothesis, and describe your sample size calculation. The description should be brief enough for the Methods section of a manuscript (2-3 lines) but should still provide enough information for a reviewer to reproduce it. Vary the effect size to cover several plausible scenarios. If you’re feeling enthusiastic, create a power curve (plot of effect size vs sample size).
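A power curve can be sketched along the following lines. The range of effect sizes, the two-sample design, and 80% power are placeholder assumptions to be replaced by your own design choices; the SD of 1 follows the z-score convention mentioned above.

```r
# Hypothetical range of effect sizes (differences in mean z-score)
effect <- seq(0.1, 0.5, by = 0.05)

# Required n per group for each effect size (two-sample t-test, 80% power)
n_required <- sapply(effect, function(d) {
  ceiling(power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.8)$n)
})

plot(effect, n_required, type = "b",
     xlab = "Effect size (difference in mean z-score)",
     ylab = "Required n per group",
     main = "Sample size vs. effect size (power = 0.8)")
```

Because the required n scales with the inverse square of the effect size, halving the effect roughly quadruples the sample size.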
You could choose either a continuous or categorical variable as your primary outcome, which will determine your design and sample size calculation. You only need to choose one or the other to hand in. However, I would also like you to identify an example of the other type and briefly explain how you would go about developing a power calculation had you gone that route (one line will suffice).
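For the categorical route, base R's `power.prop.test` covers a comparison of two proportions. The sketch below takes the 13.0% national prevalence from above as one group and uses several hypothetical values for the other; the comparison values and the two-group design are assumptions, not a recommended answer.

```r
p_reference <- 0.13                  # national obesity prevalence quoted above
p_alternative <- c(0.16, 0.18, 0.20) # hypothetical prevalences to detect

# Required n per group for each scenario (two-sided, 80% power)
n_prop <- sapply(p_alternative, function(p2) {
  power.prop.test(p1 = p_reference, p2 = p2,
                  sig.level = 0.05, power = 0.8)$n
})

scenarios <- data.frame(p1 = p_reference, p2 = p_alternative,
                        n_per_group = ceiling(n_prop))
print(scenarios)
```

Comparing these numbers with a continuous-outcome calculation for a similar scenario shows how much the choice of outcome type affects the required sample size.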