12 Additional Considerations

12.1 Minimizing Sample Size

Replace categorical outcomes with continuous variables, e.g. hypertension diagnosis \(\rightarrow\) mean BP; a continuous outcome often requires a smaller N
Substitute a more precise outcome e.g.
- assay with smaller SD
- make measurements in duplicate or triplicate; the SD of the mean of \(k\) independent readings is SD(x)/\(\sqrt{k}\), so SD(mean) \(<\) SD(x)
Reduce variability with alternate study design.
- Paired studies where each patient is own control
- Matching cases and controls for age, gender, severity
- Matched samples reduce variability, SD, and sample size
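
The gain from replicate measurements can be sketched in R. All numbers below are hypothetical and chosen only for illustration. The sketch assumes that averaging replicates shrinks only the measurement-error part of the variance, so the subject-to-subject part is unchanged and the benefit levels off as replicates are added.

```r
# Hypothetical variance components (assumptions, not from a real study)
sd_between <- 0.8   # true subject-to-subject SD
sd_error   <- 0.6   # measurement-error SD of a single reading
delta      <- 0.2   # difference to detect

# SD of a subject's mean of k replicate readings
sd_of_mean <- function(k) sqrt(sd_between^2 + sd_error^2 / k)

# Per-group n for a two-sample t-test at 80% power, for k = 1, 2, 3 replicates
n_per_group <- sapply(1:3, function(k) {
  power.t.test(delta = delta, sd = sd_of_mean(k), sig.level = 0.05,
               power = 0.8, type = "two.sample")$n
})

data.frame(replicates  = 1:3,
           sd          = sd_of_mean(1:3),
           n_per_group = ceiling(n_per_group))
```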

12.1.1 Paired Study Design

Revise asthma protocol with both treatments on same subject
Now a one-sample t-test on differences d
- \(H_0\): d = 0
- \(H_1\): d \(\neq\) 0
- \(sd_d\) = SD of the difference, here 1 L. Since this quantity is rarely found in the literature, you may need to estimate it from \(\sqrt{\sigma_1^2 + \sigma_2^2 - 2 \cdot \rho \cdot \sigma_1 \cdot \sigma_2}\), where \(\sigma_i\) = standard deviation of the \(i^{th}\) measurement, and \(\rho\) is the Pearson correlation between them.
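
As a small illustration of the SD of a difference, the helper below (`sd_diff`, a name introduced here) applies the standard identity Var(X1 - X2) = Var(X1) + Var(X2) - 2 Cov(X1, X2). The input values are hypothetical.

```r
# SD of the within-subject difference from the two measurement SDs
# and their Pearson correlation: sqrt(s1^2 + s2^2 - 2 * rho * s1 * s2)
sd_diff <- function(sd1, sd2, rho) {
  sqrt(sd1^2 + sd2^2 - 2 * rho * sd1 * sd2)
}

# Hypothetical inputs: both measurements have SD = 1
sd_diff(1, 1, rho = 0)    # uncorrelated: sqrt(2)
sd_diff(1, 1, rho = 0.5)  # moderate correlation: 1
sd_diff(1, 1, rho = 0.8)  # strong correlation: smaller still
```

The stronger the correlation between the two measurements on the same subject, the smaller the SD of the difference, and the more a paired design helps.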

power.t.test(n = NULL, delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8, alternative = "two.sided", 
    type = "one.sample")


     One-sample t test power calculation 

              n = 198.2
          delta = 0.2
             sd = 1
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

Compare the required sample size from the paired design with the unpaired. Is a paired design appropriate?
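
One way to set up the comparison is sketched below. It assumes the unpaired version of the protocol uses the same delta and sd as the paired call above; substitute the values from your own unpaired calculation if they differ.

```r
# Paired design: one-sample t-test on within-subject differences
paired <- power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8,
                       type = "one.sample", alternative = "two.sided")

# Unpaired design (assumed to use the same delta and SD): two independent groups
unpaired <- power.t.test(delta = 0.2, sd = 1, sig.level = 0.05, power = 0.8,
                         type = "two.sample", alternative = "two.sided")

# Total subjects: one group for the paired design, two groups for the unpaired
c(paired_subjects = ceiling(paired$n),
  unpaired_total  = 2 * ceiling(unpaired$n))
```

The comparison is only fair if the SD of the within-subject difference really is as small as assumed, which depends on how strongly the two measurements are correlated.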

12.2 Alternate Strategies

More than 2 groups or predictors:

Use an ANOVA sample size calculator
Regression-based (F-test) power calculators
Choose the most important pair-wise comparisons
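
For the ANOVA route, base R provides `power.anova.test` in the stats package. The sketch below uses hypothetical group means and within-group SD; it assumes a balanced one-way design.

```r
# Hypothetical three-group study (values are assumptions for illustration)
group_means <- c(0, 0.2, 0.4)   # anticipated group means
within_sd   <- 1                # anticipated within-group SD

res <- power.anova.test(groups      = length(group_means),
                        between.var = var(group_means),
                        within.var  = within_sd^2,
                        sig.level   = 0.05,
                        power       = 0.8)

ceiling(res$n)   # subjects needed per group
```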

Although equal group sizes maximize study power:

In a cohort study, it may be cheaper or easier to recruit controls
In a case-control study, disease may be rare, but controls abundant
For 1:k randomization or case-control design
- Perform the sample size calculation for 2 equal groups, giving n per group
- \(n' = \frac{n \cdot (k+1)}{2 \cdot k}\) = # of cases in the 1:k study design, with \(k \cdot n'\) controls
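
The 1:k adjustment is simple arithmetic and can be tabulated in R. The per-group n of 100 below is hypothetical, and `cases_1_to_k` is a helper name introduced here.

```r
# Cases needed in a 1:k design, given n per group from an equal-allocation calculation
cases_1_to_k <- function(n, k) n * (k + 1) / (2 * k)

n_equal <- 100    # hypothetical per-group n from the 1:1 calculation
k       <- 1:4    # controls per case

cases    <- ceiling(cases_1_to_k(n_equal, k))
controls <- k * cases

data.frame(k, cases, controls, total = cases + controls)
# k = 1: 100 cases, 100 controls
# k = 2:  75 cases, 150 controls
# k = 3:  67 cases, 201 controls
# k = 4:  63 cases, 252 controls
```

The number of cases can never fall below n/2 however large k becomes, so each additional control per case buys less.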

Computer Simulations:

Monte Carlo simulation studies: simulate thousands of realistic datasets with different sample and effect sizes
Bootstrap resampling: treat the pilot data as the population and sample from it with replacement; in each sample, some observations may appear many times and others not at all. Create and analyze 1000-2000 different data sets to test the proposed analysis plan.
Data simulations are not a casual undertaking. Creating realistic data takes effort and may lead to independent, publishable studies
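
A minimal Monte Carlo sketch for the simplest case is given below. It assumes normally distributed paired differences with the delta and SD used in the paired example; a realistic simulation would replace `rnorm` with a data-generating model suited to the study. The function name `sim_power` and the seed are arbitrary choices.

```r
set.seed(2024)  # arbitrary seed so the run is reproducible

# Estimate power by simulation for a paired design:
# differences ~ Normal(mean = delta, sd = sd_d), analysed with a one-sample t-test
sim_power <- function(n, delta, sd_d, n_sims = 2000, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    d <- rnorm(n, mean = delta, sd = sd_d)   # one simulated study
    t.test(d, mu = 0)$p.value                # the planned analysis
  })
  mean(p_values < alpha)   # proportion of simulated studies that reject H0
}

est <- sim_power(n = 199, delta = 0.2, sd_d = 1)
est   # should land near the 0.8 targeted by the analytic calculation
```

Repeating the call over a grid of sample sizes and effect sizes gives a simulated power curve, and any analysis method can be swapped in for the t-test.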

12.3 Homework Exercise

In the most recent Canadian Health Measures Survey from Statistics Canada, the prevalence of obesity in Canadian children was 13.0%, and their mean BMI z-score was 0.27 (CI = –0.39 to 1.23). If you need an SD for the latter, you may use the fact that z-scores are designed to have an SD = 1. Additional data may be found in CMAJ, 2016.

Based on your clinical experience, how do you think Manitoba children fare in comparison? Design a study to test your hypothesis, and describe your sample size calculation. The description should be brief enough for the Methods section of a manuscript (2-3 lines) and should always provide enough information for a reviewer to reproduce. Vary the effect size to cover several plausible scenarios. If you’re feeling enthusiastic, create a power curve (plot of effect size vs sample size).

You could choose either a continuous or categorical variable as your primary outcome, which will determine your design and sample size calculation. You only need to choose one or the other to hand in. However, I would also like you to identify an example of the other type and briefly explain how you would go about developing a power calculation had you gone that route (one line will suffice).